Series comparison

-[Qemu-devel] [PULL 00/45] target-arm queue
+[PULL 00/26] target-arm queue
-As promised, another pullreq... This one's mostly RTH's patches.
+Hi; here's a target-arm pullreq. Mostly this is RTH's FEAT_RME
 series; there are also a handful of bug fixes including some
 which aren't arm-specific but which it's convenient to include
 here.
 thanks
 -- PMM
-The following changes since commit 784c2e4f232adf5ef47a84a262ec72a07d068d6a:
+The following changes since commit b455ce4c2f300c8ba47cba7232dd03261368a4cb:
-  Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging (2018-10-19 15:30:40 +0100)
+  Merge tag 'q800-for-8.1-pull-request' of https://github.com/vivier/qemu-m68k into staging (2023-06-22 10:18:32 +0200)
 are available in the Git repository at:
-  https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20181019
+  https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20230623
-for you to fetch changes up to 88c9add25e7120e8622796c81ad3f3fb7f8d40e7:
+for you to fetch changes up to 497fad38979c16b6412388927401e577eba43d26:
-  target/arm: Only flush tlb if ASID changes (2018-10-19 17:38:48 +0100)
+  pc-bios/keymaps: Use the official xkb name for Arabic layout, not the legacy synonym (2023-06-23 11:46:02 +0100)
 ----------------------------------------------------------------
 target-arm queue:
- * ssi-sd: Make devices picking up backends unavailable with -device
+ * Add (experimental) support for FEAT_RME
- * Add support for VCPU event states
+ * host-utils: Avoid using __builtin_subcll on buggy versions of Apple Clang
- * Move towards making ID registers the source of truth for
+ * target/arm: Restructure has_vfp_d32 test
-   whether a guest CPU implements a feature, rather than having
+ * hw/arm/sbsa-ref: add ITS support in SBSA GIC
-   parallel ID registers and feature bit flags
+ * target/arm: Fix sve predicate store, 8 <= VQ <= 15
- * Implement various HCR hypervisor trap/config bits
+ * pc-bios/keymaps: Use the official xkb name for Arabic layout, not the legacy synonym
  * Get IL bit correct for v7 syndrome values
  * Report correct syndrome for FP/SIMD traps to Hyp mode
  * hw/arm/boot: Increase compliance with kernel arm64 boot protocol
  * Refactor A32 Neon to use generic vector infrastructure
  * Fix a bug in A32 VLD2 "(multiple 2-element structures)" insn
  * net: cadence_gem: Report features correctly in ID register
  * Avoid some unnecessary TLB flushes on TTBR register writes
 ----------------------------------------------------------------
-Dongjiu Geng (1):
+Peter Maydell (2):
-      target/arm: Add support for VCPU event states
+      host-utils: Avoid using __builtin_subcll on buggy versions of Apple Clang
       pc-bios/keymaps: Use the official xkb name for Arabic layout, not the legacy synonym
-Edgar E. Iglesias (2):
+Richard Henderson (23):
-      net: cadence_gem: Announce availability of priority queues
+      target/arm: Add isar_feature_aa64_rme
-      net: cadence_gem: Announce 64bit addressing support
+      target/arm: Update SCR and HCR for RME
       target/arm: SCR_EL3.NS may be RES1
       target/arm: Add RME cpregs
       target/arm: Introduce ARMSecuritySpace
       include/exec/memattrs: Add two bits of space to MemTxAttrs
       target/arm: Adjust the order of Phys and Stage2 ARMMMUIdx
       target/arm: Introduce ARMMMUIdx_Phys_{Realm,Root}
       target/arm: Remove __attribute__((nonnull)) from ptw.c
       target/arm: Pipe ARMSecuritySpace through ptw.c
       target/arm: NSTable is RES0 for the RME EL3 regime
       target/arm: Handle Block and Page bits for security space
       target/arm: Handle no-execute for Realm and Root regimes
       target/arm: Use get_phys_addr_with_struct in S1_ptw_translate
       target/arm: Move s1_is_el0 into S1Translate
       target/arm: Use get_phys_addr_with_struct for stage2
       target/arm: Add GPC syndrome
       target/arm: Implement GPC exceptions
       target/arm: Implement the granule protection check
       target/arm: Add cpu properties for enabling FEAT_RME
       docs/system/arm: Document FEAT_RME
       target/arm: Restructure has_vfp_d32 test
       target/arm: Fix sve predicate store, 8 <= VQ <= 15
-Markus Armbruster (1):
+Shashi Mallela (1):
-      ssi-sd: Make devices picking up backends unavailable with -device
+      hw/arm/sbsa-ref: add ITS support in SBSA GIC
-Peter Maydell (10):
+ docs/system/arm/cpu-features.rst |  23 ++
-      target/arm: Improve debug logging of AArch32 exception return
+ docs/system/arm/emulation.rst    |   1 +
-      target/arm: Make switch_mode() file-local
+ docs/system/arm/sbsa.rst         |  14 +
-      target/arm: Implement HCR.FB
+ include/exec/memattrs.h          |   9 +-
-      target/arm: Implement HCR.DC
+ include/qemu/compiler.h          |  13 +
-      target/arm: ISR_EL1 bits track virtual interrupts if IMO/FMO set
+ include/qemu/host-utils.h        |   2 +-
-      target/arm: Implement HCR.VI and VF
+ target/arm/cpu.h                 | 151 ++++++++---
-      target/arm: Implement HCR.PTW
+ target/arm/internals.h           |  27 ++
-      target/arm: New utility function to extract EC from syndrome
+ target/arm/syndrome.h            |  10 +
-      target/arm: Get IL bit correct for v7 syndrome values
+ hw/arm/sbsa-ref.c                |  33 ++-
-      target/arm: Report correct syndrome for FP/SIMD traps to Hyp mode
+ target/arm/cpu.c                 |  32 ++-
+ target/arm/helper.c              | 162 ++++++++++-
-Richard Henderson (30):
+ target/arm/ptw.c                 | 570 +++++++++++++++++++++++++++++++--------
-      target/arm: Move some system registers into a substructure
+ target/arm/tcg/cpu64.c           |  53 ++++
-      target/arm: V8M should not imply V7VE
+ target/arm/tcg/tlb_helper.c      |  96 ++++++-
-      target/arm: Convert v8 extensions from feature bits to isar tests
+ target/arm/tcg/translate-sve.c   |   2 +-
-      target/arm: Convert division from feature bits to isar0 tests
+ pc-bios/keymaps/meson.build      |   2 +-
-      target/arm: Convert jazelle from feature bit to isar1 test
+files changed, 1034 insertions(+), 166 deletions(-)
       target/arm: Convert t32ee from feature bit to isar3 test
       target/arm: Convert sve from feature bit to aa64pfr0 test
       target/arm: Convert v8.2-fp16 from feature bit to aa64pfr0 test
       target/arm: Hoist address increment for vector memory ops
       target/arm: Don't call tcg_clear_temp_count
       target/arm: Use tcg_gen_gvec_dup_i64 for LD[1-4]R
       target/arm: Promote consecutive memory ops for aa64
       target/arm: Mark some arrays const
       target/arm: Use gvec for NEON VDUP
       target/arm: Use gvec for NEON VMOV, VMVN, VBIC & VORR (immediate)
       target/arm: Use gvec for NEON_3R_LOGIC insns
       target/arm: Use gvec for NEON_3R_VADD_VSUB insns
       target/arm: Use gvec for NEON_2RM_VMN, NEON_2RM_VNEG
       target/arm: Use gvec for NEON_3R_VMUL
       target/arm: Use gvec for VSHR, VSHL
       target/arm: Use gvec for VSRA
       target/arm: Use gvec for VSRI, VSLI
       target/arm: Use gvec for NEON_3R_VML
       target/arm: Use gvec for NEON_3R_VTST_VCEQ, NEON_3R_VCGT, NEON_3R_VCGE
       target/arm: Use gvec for NEON VLD all lanes
       target/arm: Reorg NEON VLD/VST all elements
       target/arm: Promote consecutive memory ops for aa32
       target/arm: Reorg NEON VLD/VST single element to one lane
       target/arm: Remove writefn from TTBR0_EL3
       target/arm: Only flush tlb if ASID changes
 Stewart Hildebrand (1):
       hw/arm/boot: Increase compliance with kernel arm64 boot protocol
  target/arm/cpu.h            |  227 ++++++-
  target/arm/internals.h      |   45 +-
  target/arm/kvm_arm.h        |   24 +
  target/arm/translate.h      |   21 +
  hw/arm/boot.c               |   18 +
  hw/intc/armv7m_nvic.c       |   12 +-
  hw/net/cadence_gem.c        |    9 +-
  hw/sd/ssi-sd.c              |    2 +
  linux-user/aarch64/signal.c |    4 +-
  linux-user/elfload.c        |   60 +-
  linux-user/syscall.c        |   10 +-
  target/arm/cpu.c            |  242 ++++----
  target/arm/cpu64.c          |  148 +++--
  target/arm/helper.c         |  397 ++++++++----
  target/arm/kvm.c            |   60 ++
  target/arm/kvm32.c          |   13 +
  target/arm/kvm64.c          |   15 +-
  target/arm/machine.c        |   28 +-
  target/arm/op_helper.c      |    2 +-
  target/arm/translate-a64.c  |  715 ++++-----------------
  target/arm/translate.c      | 1451 ++++++++++++++++++++++++++++---------------
 files changed, 2021 insertions(+), 1482 deletions(-)

-[Qemu-devel] [PULL 01/45] ssi-sd: Make devices picking up backends unavailable with -device
+Deleted patch
-From: Markus Armbruster <armbru@redhat.com>
-Device models aren't supposed to go on fishing expeditions for
-backends.  They should expose suitable properties for the user to set.
-For onboard devices, board code sets them.
-Device ssi-sd picks up its block backend in its init() method with
-drive_get_next() instead.  This mistake is already marked FIXME since
-commit af9e40a.
-Unset user_creatable to remove the mistake from our external
-interface.  Since the SSI bus doesn't support hotplug, only -device
-can be affected.  Only certain ARM machines have ssi-sd and provide an
-SSI bus for it; this patch breaks -device ssi-sd for these machines.
-No actual use of -device ssi-sd is known.
-Signed-off-by: Markus Armbruster <armbru@redhat.com>
-Acked-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
-Acked-by: Thomas Huth <thuth@redhat.com>
-Message-id: 20181009060835.4608-1-armbru@redhat.com
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
----
- hw/sd/ssi-sd.c | 2 ++
-file changed, 2 insertions(+)
-diff --git a/hw/sd/ssi-sd.c b/hw/sd/ssi-sd.c
-index XXXXXXX..XXXXXXX 100644
---- a/hw/sd/ssi-sd.c
-+++ b/hw/sd/ssi-sd.c
-@@ -XXX,XX +XXX,XX @@ static void ssi_sd_class_init(ObjectClass *klass, void *data)
-     k->cs_polarity = SSI_CS_LOW;
-     dc->vmsd = &vmstate_ssi_sd;
-     dc->reset = ssi_sd_reset;
-+    /* Reason: init() method uses drive_get_next() */
-+    dc->user_creatable = false;
- }
- static const TypeInfo ssi_sd_info = {
---
-.19.1

-[Qemu-devel] [PULL 07/45] target/arm: Convert jazelle from feature bit to isar1 test
+[PULL 01/26] target/arm: Add isar_feature_aa64_rme
 From: Richard Henderson <richard.henderson@linaro.org>
-Having V6 alone imply jazelle was wrong for cortex-m0.
+Add the missing field for ID_AA64PFR0, and the predicate.
-Change to an assertion for V6 & !M.
+Disable it if EL3 is forced off by the board or command-line.
-This was harmless, because the only place we tested ARM_FEATURE_JAZELLE
+Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
-was for 'bxj' in disas_arm(), which is unreachable for M-profile cores.
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20181016223115.24100-6-richard.henderson@linaro.org
+Message-id: 20230620124418.805717-2-richard.henderson@linaro.org
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/cpu.h       |  6 +++++-
+ target/arm/cpu.h | 6 ++++++
- target/arm/cpu.c       | 17 ++++++++++++++---
+ target/arm/cpu.c | 4 ++++
- target/arm/translate.c |  2 +-
+files changed, 10 insertions(+)
 files changed, 20 insertions(+), 5 deletions(-)
 diff --git a/target/arm/cpu.h b/target/arm/cpu.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu.h
 +++ b/target/arm/cpu.h
-@@ -XXX,XX +XXX,XX @@ enum arm_features {
+@@ -XXX,XX +XXX,XX @@ FIELD(ID_AA64PFR0, SEL2, 36, 4)
-     ARM_FEATURE_PMU, /* has PMU support */
+ FIELD(ID_AA64PFR0, MPAM, 40, 4)
-     ARM_FEATURE_VBAR, /* has cp15 VBAR */
+ FIELD(ID_AA64PFR0, AMU, 44, 4)
-     ARM_FEATURE_M_SECURITY, /* M profile Security Extension */
+ FIELD(ID_AA64PFR0, DIT, 48, 4)
--    ARM_FEATURE_JAZELLE, /* has (trivial) Jazelle implementation */
++FIELD(ID_AA64PFR0, RME, 52, 4)
-     ARM_FEATURE_SVE, /* has Scalable Vector Extension */
+ FIELD(ID_AA64PFR0, CSV2, 56, 4)
-     ARM_FEATURE_V8_FP16, /* implements v8.2 half-precision float */
+ FIELD(ID_AA64PFR0, CSV3, 60, 4)
-     ARM_FEATURE_M_MAIN, /* M profile Main Extension */
-@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_arm_div(const ARMISARegisters *id)
+@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa64_sel2(const ARMISARegisters *id)
-     return FIELD_EX32(id->id_isar0, ID_ISAR0, DIVIDE) > 1;
+     return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, SEL2) != 0;
  }
-+static inline bool isar_feature_jazelle(const ARMISARegisters *id)
++static inline bool isar_feature_aa64_rme(const ARMISARegisters *id)
 +{
-+    return FIELD_EX32(id->id_isar1, ID_ISAR1, JAZELLE) != 0;
++    return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, RME) != 0;
 +}
 +
- static inline bool isar_feature_aa32_aes(const ARMISARegisters *id)
+ static inline bool isar_feature_aa64_vh(const ARMISARegisters *id)
  {
-     return FIELD_EX32(id->id_isar5, ID_ISAR5, AES) != 0;
+     return FIELD_EX64(id->id_aa64mmfr1, ID_AA64MMFR1, VH) != 0;
 diff --git a/target/arm/cpu.c b/target/arm/cpu.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu.c
 +++ b/target/arm/cpu.c
 @@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
+         cpu->isar.id_dfr0 = FIELD_DP32(cpu->isar.id_dfr0, ID_DFR0, COPSDBG, 0);
+         cpu->isar.id_aa64pfr0 = FIELD_DP64(cpu->isar.id_aa64pfr0,
+                                            ID_AA64PFR0, EL3, 0);
++
++        /* Disable the realm management extension, which requires EL3. */
++        cpu->isar.id_aa64pfr0 = FIELD_DP64(cpu->isar.id_aa64pfr0,
++                                           ID_AA64PFR0, RME, 0);
      }
-     if (arm_feature(env, ARM_FEATURE_V6)) {
-         set_feature(env, ARM_FEATURE_V5);
+     if (!cpu->has_el2) {
 -        set_feature(env, ARM_FEATURE_JAZELLE);
          if (!arm_feature(env, ARM_FEATURE_M)) {
 +            assert(cpu_isar_feature(jazelle, cpu));
              set_feature(env, ARM_FEATURE_AUXCR);
          }
      }
@@ -XXX,XX +XXX,XX @@ static void arm926_initfn(Object *obj)
      set_feature(&cpu->env, ARM_FEATURE_VFP);
      set_feature(&cpu->env, ARM_FEATURE_DUMMY_C15_REGS);
      set_feature(&cpu->env, ARM_FEATURE_CACHE_TEST_CLEAN);
 -    set_feature(&cpu->env, ARM_FEATURE_JAZELLE);
      cpu->midr = 0x41069265;
      cpu->reset_fpsid = 0x41011090;
      cpu->ctr = 0x1dd20d2;
      cpu->reset_sctlr = 0x00090078;
 +
 +    /*
 +     * ARMv5 does not have the ID_ISAR registers, but we can still
 +     * set the field to indicate Jazelle support within QEMU.
 +     */
 +    cpu->isar.id_isar1 = FIELD_DP32(cpu->isar.id_isar1, ID_ISAR1, JAZELLE, 1);
  }
  static void arm946_initfn(Object *obj)
@@ -XXX,XX +XXX,XX @@ static void arm1026_initfn(Object *obj)
      set_feature(&cpu->env, ARM_FEATURE_AUXCR);
      set_feature(&cpu->env, ARM_FEATURE_DUMMY_C15_REGS);
      set_feature(&cpu->env, ARM_FEATURE_CACHE_TEST_CLEAN);
 -    set_feature(&cpu->env, ARM_FEATURE_JAZELLE);
      cpu->midr = 0x4106a262;
      cpu->reset_fpsid = 0x410110a0;
      cpu->ctr = 0x1dd20d2;
      cpu->reset_sctlr = 0x00090078;
      cpu->reset_auxcr = 1;
 +
 +    /*
 +     * ARMv5 does not have the ID_ISAR registers, but we can still
 +     * set the field to indicate Jazelle support within QEMU.
 +     */
 +    cpu->isar.id_isar1 = FIELD_DP32(cpu->isar.id_isar1, ID_ISAR1, JAZELLE, 1);
 +
      {
          /* The 1026 had an IFAR at c6,c0,0,1 rather than the ARMv6 c6,c0,0,2 */
          ARMCPRegInfo ifar = {
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@
  #define ENABLE_ARCH_5     arm_dc_feature(s, ARM_FEATURE_V5)
  /* currently all emulated v5 cores are also v5TE, so don't bother */
  #define ENABLE_ARCH_5TE   arm_dc_feature(s, ARM_FEATURE_V5)
 -#define ENABLE_ARCH_5J    arm_dc_feature(s, ARM_FEATURE_JAZELLE)
 +#define ENABLE_ARCH_5J    dc_isar_feature(jazelle, s)
  #define ENABLE_ARCH_6     arm_dc_feature(s, ARM_FEATURE_V6)
  #define ENABLE_ARCH_6K    arm_dc_feature(s, ARM_FEATURE_V6K)
  #define ENABLE_ARCH_6T2   arm_dc_feature(s, ARM_FEATURE_THUMB2)
 --
-.19.1
+.34.1
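
Note on the ID-register test in patch 01/26: the sketch below shows, in standalone C, what reading and clearing the 4-bit RME field at bit 52 of ID_AA64PFR0 amounts to. It is not QEMU code and does not use the real FIELD_EX64/FIELD_DP64 macros; the helpers field_ex64, field_dp64, has_rme and disable_rme are hypothetical stand-ins written for illustration. Only the field position (52, 4) and the "!= 0" test are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Field position taken from the patch: FIELD(ID_AA64PFR0, RME, 52, 4). */
#define ID_AA64PFR0_RME_SHIFT  52
#define ID_AA64PFR0_RME_LENGTH 4

/* Hypothetical stand-in for FIELD_EX64: extract 'length' bits at 'shift'. */
static inline uint64_t field_ex64(uint64_t reg, unsigned shift, unsigned length)
{
    assert(shift < 64 && length > 0 && length <= 64 - shift);
    return (reg >> shift) & (~0ULL >> (64 - length));
}

/* Hypothetical stand-in for FIELD_DP64: return 'reg' with the field replaced. */
static inline uint64_t field_dp64(uint64_t reg, unsigned shift, unsigned length,
                                  uint64_t val)
{
    assert(shift < 64 && length > 0 && length <= 64 - shift);
    uint64_t mask = (~0ULL >> (64 - length)) << shift;
    return (reg & ~mask) | ((val << shift) & mask);
}

/* Same shape as isar_feature_aa64_rme(): any non-zero field value counts. */
static inline bool has_rme(uint64_t id_aa64pfr0)
{
    return field_ex64(id_aa64pfr0, ID_AA64PFR0_RME_SHIFT,
                      ID_AA64PFR0_RME_LENGTH) != 0;
}

/* Same shape as the realize-time step: clear RME when EL3 is forced off. */
static inline uint64_t disable_rme(uint64_t id_aa64pfr0)
{
    return field_dp64(id_aa64pfr0, ID_AA64PFR0_RME_SHIFT,
                      ID_AA64PFR0_RME_LENGTH, 0);
}
```

Clearing the field in the ID register, rather than tracking a separate flag, means every later has_rme()-style test sees the feature as absent without further bookkeeping.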

-[Qemu-devel] [PULL 10/45] target/arm: Convert v8.2-fp16 from feature bit to aa64pfr0 test
+[PULL 02/26] target/arm: Update SCR and HCR for RME
 From: Richard Henderson <richard.henderson@linaro.org>
-Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+Define the missing SCR and HCR bits, allow SCR_NSE and {SCR,HCR}_GPF
 to be set, and invalidate TLBs when NSE changes.
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20181016223115.24100-9-richard.henderson@linaro.org
+Message-id: 20230620124418.805717-3-richard.henderson@linaro.org
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/cpu.h           | 17 +++++++++++++++-
+ target/arm/cpu.h    |  5 +++--
- linux-user/elfload.c       |  6 +-----
+ target/arm/helper.c | 10 ++++++++--
- target/arm/cpu64.c         | 16 ++++++++-------
+files changed, 11 insertions(+), 4 deletions(-)
  target/arm/helper.c        |  2 +-
  target/arm/translate-a64.c | 40 +++++++++++++++++++-------------------
  target/arm/translate.c     |  6 +++---
 files changed, 50 insertions(+), 37 deletions(-)
 diff --git a/target/arm/cpu.h b/target/arm/cpu.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu.h
 +++ b/target/arm/cpu.h
-@@ -XXX,XX +XXX,XX @@ enum arm_features {
+@@ -XXX,XX +XXX,XX @@ static inline void xpsr_write(CPUARMState *env, uint32_t val, uint32_t mask)
-     ARM_FEATURE_PMU, /* has PMU support */
+ #define HCR_TERR      (1ULL << 36)
-     ARM_FEATURE_VBAR, /* has cp15 VBAR */
+ #define HCR_TEA       (1ULL << 37)
-     ARM_FEATURE_M_SECURITY, /* M profile Security Extension */
+ #define HCR_MIOCNCE   (1ULL << 38)
--    ARM_FEATURE_V8_FP16, /* implements v8.2 half-precision float */
+-/* RES0 bit 39 */
-     ARM_FEATURE_M_MAIN, /* M profile Main Extension */
++#define HCR_TME       (1ULL << 39)
- };
+ #define HCR_APK       (1ULL << 40)
+ #define HCR_API       (1ULL << 41)
-@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa32_dp(const ARMISARegisters *id)
+ #define HCR_NV        (1ULL << 42)
-     return FIELD_EX32(id->id_isar6, ID_ISAR6, DP) != 0;
+@@ -XXX,XX +XXX,XX @@ static inline void xpsr_write(CPUARMState *env, uint32_t val, uint32_t mask)
- }
+ #define HCR_NV2       (1ULL << 45)
+ #define HCR_FWB       (1ULL << 46)
-+static inline bool isar_feature_aa32_fp16_arith(const ARMISARegisters *id)
+ #define HCR_FIEN      (1ULL << 47)
-+{
+-/* RES0 bit 48 */
-+    /*
++#define HCR_GPF       (1ULL << 48)
-+     * This is a placeholder for use by VCMA until the rest of
+ #define HCR_TID4      (1ULL << 49)
-+     * the ARMv8.2-FP16 extension is implemented for aa32 mode.
+ #define HCR_TICAB     (1ULL << 50)
-+     * At which point we can properly set and check MVFR1.FPHP.
+ #define HCR_AMVOFFEN  (1ULL << 51)
-+     */
+@@ -XXX,XX +XXX,XX @@ static inline void xpsr_write(CPUARMState *env, uint32_t val, uint32_t mask)
-+    return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, FP) == 1;
+ #define SCR_TRNDR             (1ULL << 40)
-+}
+ #define SCR_ENTP2             (1ULL << 41)
-+
+ #define SCR_GPF               (1ULL << 48)
- /*
++#define SCR_NSE               (1ULL << 62)
-  * 64-bit feature tests via id registers.
-  */
+ #define HSTR_TTEE (1 << 16)
-@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa64_fcma(const ARMISARegisters *id)
+ #define HSTR_TJDBX (1 << 17)
      return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, FCMA) != 0;
  }
 +static inline bool isar_feature_aa64_fp16(const ARMISARegisters *id)
 +{
 +    /* We always set the AdvSIMD and FP fields identically wrt FP16.  */
 +    return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, FP) == 1;
 +}
 +
  static inline bool isar_feature_aa64_sve(const ARMISARegisters *id)
  {
      return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, SVE) != 0;
 diff --git a/linux-user/elfload.c b/linux-user/elfload.c
 index XXXXXXX..XXXXXXX 100644
 --- a/linux-user/elfload.c
 +++ b/linux-user/elfload.c
@@ -XXX,XX +XXX,XX @@ static uint32_t get_elf_hwcap(void)
      hwcaps |= ARM_HWCAP_A64_ASIMD;
      /* probe for the extra features */
 -#define GET_FEATURE(feat, hwcap) \
 -    do { if (arm_feature(&cpu->env, feat)) { hwcaps |= hwcap; } } while (0)
  #define GET_FEATURE_ID(feat, hwcap) \
      do { if (cpu_isar_feature(feat, cpu)) { hwcaps |= hwcap; } } while (0)
@@ -XXX,XX +XXX,XX @@ static uint32_t get_elf_hwcap(void)
      GET_FEATURE_ID(aa64_sha3, ARM_HWCAP_A64_SHA3);
      GET_FEATURE_ID(aa64_sm3, ARM_HWCAP_A64_SM3);
      GET_FEATURE_ID(aa64_sm4, ARM_HWCAP_A64_SM4);
 -    GET_FEATURE(ARM_FEATURE_V8_FP16,
 -                ARM_HWCAP_A64_FPHP | ARM_HWCAP_A64_ASIMDHP);
 +    GET_FEATURE_ID(aa64_fp16, ARM_HWCAP_A64_FPHP | ARM_HWCAP_A64_ASIMDHP);
      GET_FEATURE_ID(aa64_atomics, ARM_HWCAP_A64_ATOMICS);
      GET_FEATURE_ID(aa64_rdm, ARM_HWCAP_A64_ASIMDRDM);
      GET_FEATURE_ID(aa64_dp, ARM_HWCAP_A64_ASIMDDP);
      GET_FEATURE_ID(aa64_fcma, ARM_HWCAP_A64_FCMA);
      GET_FEATURE_ID(aa64_sve, ARM_HWCAP_A64_SVE);
 -#undef GET_FEATURE
  #undef GET_FEATURE_ID
      return hwcaps;
 diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu64.c
 +++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
          t = cpu->isar.id_aa64pfr0;
          t = FIELD_DP64(t, ID_AA64PFR0, SVE, 1);
 +        t = FIELD_DP64(t, ID_AA64PFR0, FP, 1);
 +        t = FIELD_DP64(t, ID_AA64PFR0, ADVSIMD, 1);
          cpu->isar.id_aa64pfr0 = t;
          /* Replicate the same data to the 32-bit id registers.  */
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
          u = FIELD_DP32(u, ID_ISAR6, DP, 1);
          cpu->isar.id_isar6 = u;
 -#ifdef CONFIG_USER_ONLY
 -        /* We don't set these in system emulation mode for the moment,
 -         * since we don't correctly set the ID registers to advertise them,
 -         * and in some cases they're only available in AArch64 and not AArch32,
 -         * whereas the architecture requires them to be present in both if
 -         * present in either.
 +        /*
 +         * FIXME: We do not yet support ARMv8.2-fp16 for AArch32,
 +         * so do not set MVFR1.FPHP.  Strictly speaking this is not legal,
 +         * but it is also not legal to enable SVE without support for FP16,
 +         * and enabling SVE in system mode is more useful in the short term.
           */
 -        set_feature(&cpu->env, ARM_FEATURE_V8_FP16);
 +
 +#ifdef CONFIG_USER_ONLY
          /* For usermode -cpu max we can use a larger and more efficient DCZ
           * blocksize since we don't have to follow what the hardware does.
           */
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.c
 +++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ void HELPER(vfp_set_fpscr)(CPUARMState *env, uint32_t val)
+@@ -XXX,XX +XXX,XX @@ static void scr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
-     uint32_t changed;
+         if (cpu_isar_feature(aa64_fgt, cpu)) {
+             valid_mask |= SCR_FGTEN;
-     /* When ARMv8.2-FP16 is not supported, FZ16 is RES0.  */
+         }
--    if (!arm_feature(env, ARM_FEATURE_V8_FP16)) {
++        if (cpu_isar_feature(aa64_rme, cpu)) {
-+    if (!cpu_isar_feature(aa64_fp16, arm_env_get_cpu(env))) {
++            valid_mask |= SCR_NSE | SCR_GPF;
-         val &= ~FPCR_FZ16;
++        }
      } else {
          valid_mask &= ~(SCR_RW | SCR_ST);
          if (cpu_isar_feature(aa32_ras, cpu)) {
@@ -XXX,XX +XXX,XX @@ static void scr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
      env->cp15.scr_el3 = value;
      /*
 -     * If SCR_EL3.NS changes, i.e. arm_is_secure_below_el3, then
 +     * If SCR_EL3.{NS,NSE} changes, i.e. change of security state,
       * we must invalidate all TLBs below EL3.
       */
 -    if (changed & SCR_NS) {
 +    if (changed & (SCR_NS | SCR_NSE)) {
          tlb_flush_by_mmuidx(env_cpu(env), (ARMMMUIdxBit_E10_0 |
                                             ARMMMUIdxBit_E20_0 |
                                             ARMMMUIdxBit_E10_1 |
@@ -XXX,XX +XXX,XX @@ static void do_hcr_write(CPUARMState *env, uint64_t value, uint64_t valid_mask)
          if (cpu_isar_feature(aa64_fwb, cpu)) {
              valid_mask |= HCR_FWB;
          }
 +        if (cpu_isar_feature(aa64_rme, cpu)) {
 +            valid_mask |= HCR_GPF;
 +        }
      }
-diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
+     if (cpu_isar_feature(any_evt, cpu)) {
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-a64.c
 +++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void disas_fp_compare(DisasContext *s, uint32_t insn)
          break;
      case 3:
          size = MO_16;
 -        if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
 +        if (dc_isar_feature(aa64_fp16, s)) {
              break;
          }
          /* fallthru */
@@ -XXX,XX +XXX,XX @@ static void disas_fp_ccomp(DisasContext *s, uint32_t insn)
          break;
      case 3:
          size = MO_16;
 -        if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
 +        if (dc_isar_feature(aa64_fp16, s)) {
              break;
          }
          /* fallthru */
@@ -XXX,XX +XXX,XX @@ static void disas_fp_csel(DisasContext *s, uint32_t insn)
          break;
      case 3:
          sz = MO_16;
 -        if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
 +        if (dc_isar_feature(aa64_fp16, s)) {
              break;
          }
          /* fallthru */
@@ -XXX,XX +XXX,XX @@ static void disas_fp_1src(DisasContext *s, uint32_t insn)
              handle_fp_1src_double(s, opcode, rd, rn);
              break;
          case 3:
 -            if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
 +            if (!dc_isar_feature(aa64_fp16, s)) {
                  unallocated_encoding(s);
                  return;
              }
@@ -XXX,XX +XXX,XX @@ static void disas_fp_2src(DisasContext *s, uint32_t insn)
          handle_fp_2src_double(s, opcode, rd, rn, rm);
          break;
      case 3:
 -        if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
 +        if (!dc_isar_feature(aa64_fp16, s)) {
              unallocated_encoding(s);
              return;
          }
@@ -XXX,XX +XXX,XX @@ static void disas_fp_3src(DisasContext *s, uint32_t insn)
          handle_fp_3src_double(s, o0, o1, rd, rn, rm, ra);
          break;
      case 3:
 -        if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
 +        if (!dc_isar_feature(aa64_fp16, s)) {
              unallocated_encoding(s);
              return;
          }
@@ -XXX,XX +XXX,XX @@ static void disas_fp_imm(DisasContext *s, uint32_t insn)
          break;
      case 3:
          sz = MO_16;
 -        if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
 +        if (dc_isar_feature(aa64_fp16, s)) {
              break;
          }
          /* fallthru */
@@ -XXX,XX +XXX,XX @@ static void disas_fp_fixed_conv(DisasContext *s, uint32_t insn)
      case 1: /* float64 */
          break;
      case 3: /* float16 */
 -        if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
 +        if (dc_isar_feature(aa64_fp16, s)) {
              break;
          }
          /* fallthru */
@@ -XXX,XX +XXX,XX @@ static void disas_fp_int_conv(DisasContext *s, uint32_t insn)
              break;
          case 0x6: /* 16-bit float, 32-bit int */
          case 0xe: /* 16-bit float, 64-bit int */
 -            if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
 +            if (dc_isar_feature(aa64_fp16, s)) {
                  break;
              }
              /* fallthru */
@@ -XXX,XX +XXX,XX @@ static void disas_fp_int_conv(DisasContext *s, uint32_t insn)
          case 1: /* float64 */
              break;
          case 3: /* float16 */
 -            if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
 +            if (dc_isar_feature(aa64_fp16, s)) {
                  break;
              }
              /* fallthru */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_across_lanes(DisasContext *s, uint32_t insn)
           */
          is_min = extract32(size, 1, 1);
          is_fp = true;
 -        if (!is_u && arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
 +        if (!is_u && dc_isar_feature(aa64_fp16, s)) {
              size = 1;
          } else if (!is_u || !is_q || extract32(size, 0, 1)) {
              unallocated_encoding(s);
@@ -XXX,XX +XXX,XX @@ static void disas_simd_mod_imm(DisasContext *s, uint32_t insn)
      if (o2 != 0 || ((cmode == 0xf) && is_neg && !is_q)) {
          /* Check for FMOV (vector, immediate) - half-precision */
 -        if (!(arm_dc_feature(s, ARM_FEATURE_V8_FP16) && o2 && cmode == 0xf)) {
 +        if (!(dc_isar_feature(aa64_fp16, s) && o2 && cmode == 0xf)) {
              unallocated_encoding(s);
              return;
          }
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn)
      case 0x2f: /* FMINP */
          /* FP op, size[0] is 32 or 64 bit*/
          if (!u) {
 -            if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
 +            if (!dc_isar_feature(aa64_fp16, s)) {
                  unallocated_encoding(s);
                  return;
              } else {
@@ -XXX,XX +XXX,XX @@ static void handle_simd_shift_intfp_conv(DisasContext *s, bool is_scalar,
          size = MO_32;
      } else if (immh & 2) {
          size = MO_16;
 -        if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
 +        if (!dc_isar_feature(aa64_fp16, s)) {
              unallocated_encoding(s);
              return;
          }
@@ -XXX,XX +XXX,XX @@ static void handle_simd_shift_fpint_conv(DisasContext *s, bool is_scalar,
          size = MO_32;
      } else if (immh & 0x2) {
          size = MO_16;
 -        if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
 +        if (!dc_isar_feature(aa64_fp16, s)) {
              unallocated_encoding(s);
              return;
          }
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_same_fp16(DisasContext *s,
          return;
      }
 -    if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
 +    if (!dc_isar_feature(aa64_fp16, s)) {
          unallocated_encoding(s);
      }
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
      TCGv_ptr fpst;
      bool pairwise = false;
 -    if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
 +    if (!dc_isar_feature(aa64_fp16, s)) {
          unallocated_encoding(s);
          return;
      }
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn)
      case 0x1c: /* FCADD, #90 */
      case 0x1e: /* FCADD, #270 */
          if (size == 0
 -            || (size == 1 && !arm_dc_feature(s, ARM_FEATURE_V8_FP16))
 +            || (size == 1 && !dc_isar_feature(aa64_fp16, s))
              || (size == 3 && !is_q)) {
              unallocated_encoding(s);
              return;
@@ -XXX,XX +XXX,XX @@ static void disas_simd_two_reg_misc_fp16(DisasContext *s, uint32_t insn)
      bool need_fpst = true;
      int rmode;
 -    if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
 +    if (!dc_isar_feature(aa64_fp16, s)) {
          unallocated_encoding(s);
          return;
      }
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
          }
          break;
      }
 -    if (is_fp16 && !arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
 +    if (is_fp16 && !dc_isar_feature(aa64_fp16, s)) {
          unallocated_encoding(s);
          return;
      }
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
          int size = extract32(insn, 20, 1);
          data = extract32(insn, 23, 2); /* rot */
          if (!dc_isar_feature(aa32_vcma, s)
 -            || (!size && !arm_dc_feature(s, ARM_FEATURE_V8_FP16))) {
 +            || (!size && !dc_isar_feature(aa32_fp16_arith, s))) {
              return 1;
          }
          fn_gvec_ptr = size ? gen_helper_gvec_fcmlas : gen_helper_gvec_fcmlah;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
          int size = extract32(insn, 20, 1);
          data = extract32(insn, 24, 1); /* rot */
          if (!dc_isar_feature(aa32_vcma, s)
 -            || (!size && !arm_dc_feature(s, ARM_FEATURE_V8_FP16))) {
 +            || (!size && !dc_isar_feature(aa32_fp16_arith, s))) {
              return 1;
          }
          fn_gvec_ptr = size ? gen_helper_gvec_fcadds : gen_helper_gvec_fcaddh;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
              return 1;
          }
          if (size == 0) {
 -            if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
 +            if (!dc_isar_feature(aa32_fp16_arith, s)) {
                  return 1;
              }
              /* For fp16, rm is just Vm, and index is M.  */
 --
-.19.1
+.34.1
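
The hunks above replace `arm_dc_feature(s, ARM_FEATURE_V8_FP16)` with `dc_isar_feature(...)`, which tests ID register contents instead of a separate feature flag. As a rough standalone sketch of what such a test looks like, the following uses the ID_AA64PFR0.SVE field (bits [35:32], as declared by `FIELD(ID_AA64PFR0, SVE, 32, 4)` in the older series). The names `field_extract64`, `DemoISARegisters` and `demo_isar_feature_aa64_sve` are hypothetical stand-ins, not QEMU's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Extract `length` bits of `value`, starting at bit `start`. */
static inline uint64_t field_extract64(uint64_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 64 - start);
    return (value >> start) & (~0ULL >> (64 - length));
}

/* Hypothetical stand-in for the ID-register substructure. */
typedef struct {
    uint64_t id_aa64pfr0;
} DemoISARegisters;

/* A nonzero ID_AA64PFR0.SVE field (bits [35:32]) means SVE is present. */
static inline bool demo_isar_feature_aa64_sve(const DemoISARegisters *id)
{
    return field_extract64(id->id_aa64pfr0, 32, 4) != 0;
}
```

With this shape the ID register value is the only state consulted, so there is no second flag that could disagree with it.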

-[Qemu-devel] [PULL 45/45] target/arm: Only flush tlb if ASID changes
+[PULL 03/26] target/arm: SCR_EL3.NS may be RES1
 From: Richard Henderson <richard.henderson@linaro.org>
-Since QEMU does not implement ASIDs, changes to the ASID must flush the
+With RME, SEL2 must also be present to support secure state.
-tlb.  However, if the ASID does not change there is no reason to flush.
+The NS bit is RES1 if SEL2 is not present.
-In testing a boot of the Ubuntu installer to the first menu, this reduces
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 the number of flushes by 30%, or nearly 600k instances.
 Reviewed-by: Aaron Lindsay <aaron@os.amperecomputing.com>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Message-id: 20230620124418.805717-4-richard.henderson@linaro.org
 Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
 Message-id: 20181019015617.22583-3-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/helper.c | 8 +++-----
+ target/arm/helper.c | 3 +++
-file changed, 3 insertions(+), 5 deletions(-)
+file changed, 3 insertions(+)
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.c
 +++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ static void vmsa_tcr_el1_write(CPUARMState *env, const ARMCPRegInfo *ri,
+@@ -XXX,XX +XXX,XX @@ static void scr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
- static void vmsa_ttbr_write(CPUARMState *env, const ARMCPRegInfo *ri,
+         }
-                             uint64_t value)
+         if (cpu_isar_feature(aa64_sel2, cpu)) {
- {
+             valid_mask |= SCR_EEL2;
--    /* 64 bit accesses to the TTBRs can change the ASID and so we
++        } else if (cpu_isar_feature(aa64_rme, cpu)) {
--     * must flush the TLB.
++            /* With RME and without SEL2, NS is RES1 (R_GSWWH, I_DJJQJ). */
--     */
++            value |= SCR_NS;
--    if (cpreg_field_is_64bit(ri)) {
+         }
-+    /* If the ASID changes (with a 64-bit write), we must flush the TLB.  */
+         if (cpu_isar_feature(aa64_mte, cpu)) {
-+    if (cpreg_field_is_64bit(ri) &&
+             valid_mask |= SCR_ATA;
 +        extract64(raw_read(env, ri) ^ value, 48, 16) != 0) {
          ARMCPU *cpu = arm_env_get_cpu(env);
 -
          tlb_flush(CPU(cpu));
      }
      raw_write(env, ri, value);
 --
-.19.1
+.34.1
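
For the older patch in this pair ("Only flush tlb if ASID changes"), the check `extract64(raw_read(env, ri) ^ value, 48, 16) != 0` can be illustrated in isolation. This is a minimal sketch, assuming the ASID sits in bits [63:48] of a 64-bit TTBR value as the hunk implies; `demo_ttbr_asid_changed` is a hypothetical name used only here.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * XOR leaves a 1 in every bit position where the old and new values
 * differ; shifting right by 48 keeps only bits [63:48], the ASID field.
 * A nonzero result means the ASID changed and a TLB flush is needed.
 */
static inline bool demo_ttbr_asid_changed(uint64_t old_ttbr, uint64_t new_ttbr)
{
    return ((old_ttbr ^ new_ttbr) >> 48) != 0;
}
```

A write that only changes the table base address in the low bits leaves the result false, which is the case the patch stops flushing for.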

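For the newer patch in this pair ("SCR_EL3.NS may be RES1"), the added branch can be sketched on its own. This is only an illustration under stated assumptions: the `DEMO_` constants use the architectural bit positions (NS is bit 0, EEL2 is bit 18), and `demo_scr_fixup` with its boolean parameters is a hypothetical stand-in for the `cpu_isar_feature()` tests in `scr_write()`.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_SCR_NS   (1ULL << 0)   /* SCR_EL3.NS */
#define DEMO_SCR_EEL2 (1ULL << 18)  /* SCR_EL3.EEL2 */

/*
 * With SEL2, EEL2 becomes a writable bit.  With RME but without SEL2,
 * there is no secure state below EL3, so NS is forced to 1 (RES1).
 */
static inline uint64_t demo_scr_fixup(uint64_t value, uint64_t *valid_mask,
                                      bool has_sel2, bool has_rme)
{
    if (has_sel2) {
        *valid_mask |= DEMO_SCR_EEL2;
    } else if (has_rme) {
        value |= DEMO_SCR_NS;
    }
    return value;
}
```
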
-[Qemu-devel] [PULL 09/45] target/arm: Convert sve from feature bit to aa64pfr0 test
+[PULL 04/26] target/arm: Add RME cpregs
 From: Richard Henderson <richard.henderson@linaro.org>
-Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+This includes GPCCR, GPTBR, MFAR, the TLB flush insns PAALL, PAALLOS,
 RPALOS, RPAOS, and the cache flush insns CIPAPA and CIGDPAPA.
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20181016223115.24100-8-richard.henderson@linaro.org
+Message-id: 20230620124418.805717-5-richard.henderson@linaro.org
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/cpu.h            | 16 +++++++++++++++-
+ target/arm/cpu.h    | 19 ++++++++++
- linux-user/aarch64/signal.c |  4 ++--
+ target/arm/helper.c | 84 +++++++++++++++++++++++++++++++++++++++++++++
- linux-user/elfload.c        |  2 +-
+files changed, 103 insertions(+)
  linux-user/syscall.c        | 10 ++++++----
  target/arm/cpu64.c          |  5 ++++-
  target/arm/helper.c         |  9 ++++++---
  target/arm/machine.c        |  3 +--
  target/arm/translate-a64.c  |  4 ++--
 files changed, 37 insertions(+), 16 deletions(-)
 diff --git a/target/arm/cpu.h b/target/arm/cpu.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu.h
 +++ b/target/arm/cpu.h
-@@ -XXX,XX +XXX,XX @@ FIELD(ID_AA64ISAR1, FRINTTS, 32, 4)
+@@ -XXX,XX +XXX,XX @@ typedef struct CPUArchState {
- FIELD(ID_AA64ISAR1, SB, 36, 4)
+         uint64_t fgt_read[2]; /* HFGRTR, HDFGRTR */
- FIELD(ID_AA64ISAR1, SPECRES, 40, 4)
+         uint64_t fgt_write[2]; /* HFGWTR, HDFGWTR */
+         uint64_t fgt_exec[1]; /* HFGITR */
-+FIELD(ID_AA64PFR0, EL0, 0, 4)
++
-+FIELD(ID_AA64PFR0, EL1, 4, 4)
++        /* RME registers */
-+FIELD(ID_AA64PFR0, EL2, 8, 4)
++        uint64_t gpccr_el3;
-+FIELD(ID_AA64PFR0, EL3, 12, 4)
++        uint64_t gptbr_el3;
-+FIELD(ID_AA64PFR0, FP, 16, 4)
++        uint64_t mfar_el3;
-+FIELD(ID_AA64PFR0, ADVSIMD, 20, 4)
+     } cp15;
-+FIELD(ID_AA64PFR0, GIC, 24, 4)
-+FIELD(ID_AA64PFR0, RAS, 28, 4)
+     struct {
-+FIELD(ID_AA64PFR0, SVE, 32, 4)
+@@ -XXX,XX +XXX,XX @@ struct ArchCPU {
      uint64_t reset_cbar;
      uint32_t reset_auxcr;
      bool reset_hivecs;
 +    uint8_t reset_l0gptsz;
      /*
       * Intermediate values used during property parsing.
@@ -XXX,XX +XXX,XX @@ FIELD(MVFR1, SIMDFMAC, 28, 4)
  FIELD(MVFR2, SIMDMISC, 0, 4)
  FIELD(MVFR2, FPMISC, 4, 4)
 +FIELD(GPCCR, PPS, 0, 3)
 +FIELD(GPCCR, IRGN, 8, 2)
 +FIELD(GPCCR, ORGN, 10, 2)
 +FIELD(GPCCR, SH, 12, 2)
 +FIELD(GPCCR, PGS, 14, 2)
 +FIELD(GPCCR, GPC, 16, 1)
 +FIELD(GPCCR, GPCP, 17, 1)
 +FIELD(GPCCR, L0GPTSZ, 20, 4)
 +
 +FIELD(MFAR, FPA, 12, 40)
 +FIELD(MFAR, NSE, 62, 1)
 +FIELD(MFAR, NS, 63, 1)
 +
  QEMU_BUILD_BUG_ON(ARRAY_SIZE(((ARMCPU *)0)->ccsidr) <= R_V7M_CSSELR_INDEX_MASK);
  /* If adding a feature bit which corresponds to a Linux ELF
-@@ -XXX,XX +XXX,XX @@ enum arm_features {
-     ARM_FEATURE_PMU, /* has PMU support */
-     ARM_FEATURE_VBAR, /* has cp15 VBAR */
-     ARM_FEATURE_M_SECURITY, /* M profile Security Extension */
--    ARM_FEATURE_SVE, /* has Scalable Vector Extension */
-     ARM_FEATURE_V8_FP16, /* implements v8.2 half-precision float */
-     ARM_FEATURE_M_MAIN, /* M profile Main Extension */
- };
-@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa64_fcma(const ARMISARegisters *id)
-     return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, FCMA) != 0;
- }
-+static inline bool isar_feature_aa64_sve(const ARMISARegisters *id)
-+{
-+    return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, SVE) != 0;
-+}
-+
- /*
-  * Forward to the above feature tests given an ARMCPU pointer.
-  */
-diff --git a/linux-user/aarch64/signal.c b/linux-user/aarch64/signal.c
-index XXXXXXX..XXXXXXX 100644
---- a/linux-user/aarch64/signal.c
-+++ b/linux-user/aarch64/signal.c
-@@ -XXX,XX +XXX,XX @@ static int target_restore_sigframe(CPUARMState *env,
-             break;
-         case TARGET_SVE_MAGIC:
--            if (arm_feature(env, ARM_FEATURE_SVE)) {
-+            if (cpu_isar_feature(aa64_sve, arm_env_get_cpu(env))) {
-                 vq = (env->vfp.zcr_el[1] & 0xf) + 1;
-                 sve_size = QEMU_ALIGN_UP(TARGET_SVE_SIG_CONTEXT_SIZE(vq), 16);
-                 if (!sve && size == sve_size) {
-@@ -XXX,XX +XXX,XX @@ static void target_setup_frame(int usig, struct target_sigaction *ka,
-                                       &layout);
-     /* SVE state needs saving only if it exists.  */
--    if (arm_feature(env, ARM_FEATURE_SVE)) {
-+    if (cpu_isar_feature(aa64_sve, arm_env_get_cpu(env))) {
-         vq = (env->vfp.zcr_el[1] & 0xf) + 1;
-         sve_size = QEMU_ALIGN_UP(TARGET_SVE_SIG_CONTEXT_SIZE(vq), 16);
-         sve_ofs = alloc_sigframe_space(sve_size, &layout);
-diff --git a/linux-user/elfload.c b/linux-user/elfload.c
-index XXXXXXX..XXXXXXX 100644
---- a/linux-user/elfload.c
-+++ b/linux-user/elfload.c
-@@ -XXX,XX +XXX,XX @@ static uint32_t get_elf_hwcap(void)
-     GET_FEATURE_ID(aa64_rdm, ARM_HWCAP_A64_ASIMDRDM);
-     GET_FEATURE_ID(aa64_dp, ARM_HWCAP_A64_ASIMDDP);
-     GET_FEATURE_ID(aa64_fcma, ARM_HWCAP_A64_FCMA);
--    GET_FEATURE(ARM_FEATURE_SVE, ARM_HWCAP_A64_SVE);
-+    GET_FEATURE_ID(aa64_sve, ARM_HWCAP_A64_SVE);
- #undef GET_FEATURE
- #undef GET_FEATURE_ID
-diff --git a/linux-user/syscall.c b/linux-user/syscall.c
-index XXXXXXX..XXXXXXX 100644
---- a/linux-user/syscall.c
-+++ b/linux-user/syscall.c
-@@ -XXX,XX +XXX,XX @@ static abi_long do_syscall1(void *cpu_env, int num, abi_long arg1,
-              * even though the current architectural maximum is VQ=16.
-              */
-             ret = -TARGET_EINVAL;
--            if (arm_feature(cpu_env, ARM_FEATURE_SVE)
-+            if (cpu_isar_feature(aa64_sve, arm_env_get_cpu(cpu_env))
-                 && arg2 >= 0 && arg2 <= 512 * 16 && !(arg2 & 15)) {
-                 CPUARMState *env = cpu_env;
-                 ARMCPU *cpu = arm_env_get_cpu(env);
-@@ -XXX,XX +XXX,XX @@ static abi_long do_syscall1(void *cpu_env, int num, abi_long arg1,
-             return ret;
-         case TARGET_PR_SVE_GET_VL:
-             ret = -TARGET_EINVAL;
--            if (arm_feature(cpu_env, ARM_FEATURE_SVE)) {
--                CPUARMState *env = cpu_env;
--                ret = ((env->vfp.zcr_el[1] & 0xf) + 1) * 16;
-+            {
-+                ARMCPU *cpu = arm_env_get_cpu(cpu_env);
-+                if (cpu_isar_feature(aa64_sve, cpu)) {
-+                    ret = ((cpu->env.vfp.zcr_el[1] & 0xf) + 1) * 16;
-+                }
-             }
-             return ret;
- #endif /* AARCH64 */
-diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu64.c
-+++ b/target/arm/cpu64.c
-@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
-         t = FIELD_DP64(t, ID_AA64ISAR1, FCMA, 1);
-         cpu->isar.id_aa64isar1 = t;
-+        t = cpu->isar.id_aa64pfr0;
-+        t = FIELD_DP64(t, ID_AA64PFR0, SVE, 1);
-+        cpu->isar.id_aa64pfr0 = t;
-+
-         /* Replicate the same data to the 32-bit id registers.  */
-         u = cpu->isar.id_isar5;
-         u = FIELD_DP32(u, ID_ISAR5, AES, 2); /* AES + PMULL */
-@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
-          * present in either.
-          */
-         set_feature(&cpu->env, ARM_FEATURE_V8_FP16);
--        set_feature(&cpu->env, ARM_FEATURE_SVE);
-         /* For usermode -cpu max we can use a larger and more efficient DCZ
-          * blocksize since we don't have to follow what the hardware does.
-          */
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.c
 +++ b/target/arm/helper.c
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo sme_reginfo[] = {
+       .access = PL2_RW, .accessfn = access_esm,
+       .type = ARM_CP_CONST, .resetvalue = 0 },
+ };
++
++static void tlbi_aa64_paall_write(CPUARMState *env, const ARMCPRegInfo *ri,
++                                  uint64_t value)
++{
++    CPUState *cs = env_cpu(env);
++
++    tlb_flush(cs);
++}
++
++static void gpccr_write(CPUARMState *env, const ARMCPRegInfo *ri,
++                        uint64_t value)
++{
++    /* L0GPTSZ is RO; other bits not mentioned are RES0. */
++    uint64_t rw_mask = R_GPCCR_PPS_MASK | R_GPCCR_IRGN_MASK |
++        R_GPCCR_ORGN_MASK | R_GPCCR_SH_MASK | R_GPCCR_PGS_MASK |
++        R_GPCCR_GPC_MASK | R_GPCCR_GPCP_MASK;
++
++    env->cp15.gpccr_el3 = (value & rw_mask) | (env->cp15.gpccr_el3 & ~rw_mask);
++}
++
++static void gpccr_reset(CPUARMState *env, const ARMCPRegInfo *ri)
++{
++    env->cp15.gpccr_el3 = FIELD_DP64(0, GPCCR, L0GPTSZ,
++                                     env_archcpu(env)->reset_l0gptsz);
++}
++
++static void tlbi_aa64_paallos_write(CPUARMState *env, const ARMCPRegInfo *ri,
++                                    uint64_t value)
++{
++    CPUState *cs = env_cpu(env);
++
++    tlb_flush_all_cpus_synced(cs);
++}
++
++static const ARMCPRegInfo rme_reginfo[] = {
++    { .name = "GPCCR_EL3", .state = ARM_CP_STATE_AA64,
++      .opc0 = 3, .opc1 = 6, .crn = 2, .crm = 1, .opc2 = 6,
++      .access = PL3_RW, .writefn = gpccr_write, .resetfn = gpccr_reset,
++      .fieldoffset = offsetof(CPUARMState, cp15.gpccr_el3) },
++    { .name = "GPTBR_EL3", .state = ARM_CP_STATE_AA64,
++      .opc0 = 3, .opc1 = 6, .crn = 2, .crm = 1, .opc2 = 4,
++      .access = PL3_RW, .fieldoffset = offsetof(CPUARMState, cp15.gptbr_el3) },
++    { .name = "MFAR_EL3", .state = ARM_CP_STATE_AA64,
++      .opc0 = 3, .opc1 = 6, .crn = 6, .crm = 0, .opc2 = 5,
++      .access = PL3_RW, .fieldoffset = offsetof(CPUARMState, cp15.mfar_el3) },
++    { .name = "TLBI_PAALL", .state = ARM_CP_STATE_AA64,
++      .opc0 = 1, .opc1 = 6, .crn = 8, .crm = 7, .opc2 = 4,
++      .access = PL3_W, .type = ARM_CP_NO_RAW,
++      .writefn = tlbi_aa64_paall_write },
++    { .name = "TLBI_PAALLOS", .state = ARM_CP_STATE_AA64,
++      .opc0 = 1, .opc1 = 6, .crn = 8, .crm = 1, .opc2 = 4,
++      .access = PL3_W, .type = ARM_CP_NO_RAW,
++      .writefn = tlbi_aa64_paallos_write },
++    /*
++     * QEMU does not have a way to invalidate by physical address, thus
++     * invalidating a range of physical addresses is accomplished by
++     * flushing all tlb entries in the outer shareable domain,
++     * just like PAALLOS.
++     */
++    { .name = "TLBI_RPALOS", .state = ARM_CP_STATE_AA64,
++      .opc0 = 1, .opc1 = 6, .crn = 8, .crm = 4, .opc2 = 7,
++      .access = PL3_W, .type = ARM_CP_NO_RAW,
++      .writefn = tlbi_aa64_paallos_write },
++    { .name = "TLBI_RPAOS", .state = ARM_CP_STATE_AA64,
++      .opc0 = 1, .opc1 = 6, .crn = 8, .crm = 4, .opc2 = 3,
++      .access = PL3_W, .type = ARM_CP_NO_RAW,
++      .writefn = tlbi_aa64_paallos_write },
++    { .name = "DC_CIPAPA", .state = ARM_CP_STATE_AA64,
++      .opc0 = 1, .opc1 = 6, .crn = 7, .crm = 14, .opc2 = 1,
++      .access = PL3_W, .type = ARM_CP_NOP },
++};
++
++static const ARMCPRegInfo rme_mte_reginfo[] = {
++    { .name = "DC_CIGDPAPA", .state = ARM_CP_STATE_AA64,
++      .opc0 = 1, .opc1 = 6, .crn = 7, .crm = 14, .opc2 = 5,
++      .access = PL3_W, .type = ARM_CP_NOP },
++};
+ #endif /* TARGET_AARCH64 */
+ static void define_pmu_regs(ARMCPU *cpu)
 @@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
-         define_one_arm_cp_reg(cpu, &sctlr);
+     if (cpu_isar_feature(aa64_fgt, cpu)) {
          define_arm_cp_regs(cpu, fgt_reginfo);
      }
--    if (arm_feature(env, ARM_FEATURE_SVE)) {
-+    if (cpu_isar_feature(aa64_sve, cpu)) {
-         define_one_arm_cp_reg(cpu, &zcr_el1_reginfo);
-         if (arm_feature(env, ARM_FEATURE_EL2)) {
-             define_one_arm_cp_reg(cpu, &zcr_el2_reginfo);
-@@ -XXX,XX +XXX,XX @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
-     uint32_t flags;
-     if (is_a64(env)) {
-+        ARMCPU *cpu = arm_env_get_cpu(env);
 +
-         *pc = env->pc;
++    if (cpu_isar_feature(aa64_rme, cpu)) {
-         flags = ARM_TBFLAG_AARCH64_STATE_MASK;
++        define_arm_cp_regs(cpu, rme_reginfo);
-         /* Get control bits for tagged addresses */
++        if (cpu_isar_feature(aa64_mte, cpu)) {
-         flags |= (arm_regime_tbi0(env, mmu_idx) << ARM_TBFLAG_TBI0_SHIFT);
++            define_arm_cp_regs(cpu, rme_mte_reginfo);
-         flags |= (arm_regime_tbi1(env, mmu_idx) << ARM_TBFLAG_TBI1_SHIFT);
++        }
++    }
--        if (arm_feature(env, ARM_FEATURE_SVE)) {
+ #endif
-+        if (cpu_isar_feature(aa64_sve, cpu)) {
-             int sve_el = sve_exception_el(env, current_el);
+     if (cpu_isar_feature(any_predinv, cpu)) {
              uint32_t zcr_len;
@@ -XXX,XX +XXX,XX @@ void aarch64_sve_narrow_vq(CPUARMState *env, unsigned vq)
  void aarch64_sve_change_el(CPUARMState *env, int old_el,
                             int new_el, bool el0_a64)
  {
 +    ARMCPU *cpu = arm_env_get_cpu(env);
      int old_len, new_len;
      bool old_a64, new_a64;
      /* Nothing to do if no SVE.  */
 -    if (!arm_feature(env, ARM_FEATURE_SVE)) {
 +    if (!cpu_isar_feature(aa64_sve, cpu)) {
          return;
      }
 diff --git a/target/arm/machine.c b/target/arm/machine.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/machine.c
 +++ b/target/arm/machine.c
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_iwmmxt = {
  static bool sve_needed(void *opaque)
  {
      ARMCPU *cpu = opaque;
 -    CPUARMState *env = &cpu->env;
 -    return arm_feature(env, ARM_FEATURE_SVE);
 +    return cpu_isar_feature(aa64_sve, cpu);
  }
  /* The first two words of each Zreg is stored in VFP state.  */
 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-a64.c
 +++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ void aarch64_cpu_dump_state(CPUState *cs, FILE *f,
      cpu_fprintf(f, "     FPCR=%08x FPSR=%08x\n",
                  vfp_get_fpcr(env), vfp_get_fpsr(env));
 -    if (arm_feature(env, ARM_FEATURE_SVE) && sve_exception_el(env, el) == 0) {
 +    if (cpu_isar_feature(aa64_sve, cpu) && sve_exception_el(env, el) == 0) {
          int j, zcr_len = sve_zcr_len_for_el(env, el);
          for (i = 0; i <= FFR_PRED_NUM; i++) {
@@ -XXX,XX +XXX,XX @@ static void disas_a64_insn(CPUARMState *env, DisasContext *s)
          unallocated_encoding(s);
          break;
      case 0x2:
 -        if (!arm_dc_feature(s, ARM_FEATURE_SVE) || !disas_sve(s, insn)) {
 +        if (!dc_isar_feature(aa64_sve, s) || !disas_sve(s, insn)) {
              unallocated_encoding(s);
          }
          break;
 --
-.19.1
+.34.1
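
The `gpccr_write()` and `gpccr_reset()` functions in the RME cpregs patch keep L0GPTSZ read-only by merging only the writable bits of a new value into the old one. A minimal standalone sketch of that masking follows, using the (shift, length) pairs from the `FIELD(GPCCR, ...)` list; the `DEMO_` macros and function names are hypothetical and stand in for the `R_GPCCR_*_MASK` constants and the `CPUARMState` plumbing.

```c
#include <stdint.h>

/* Build a field mask from a (shift, length) pair. */
#define DEMO_MASK(shift, len) (((1ULL << (len)) - 1) << (shift))

#define DEMO_GPCCR_PPS     DEMO_MASK(0, 3)
#define DEMO_GPCCR_IRGN    DEMO_MASK(8, 2)
#define DEMO_GPCCR_ORGN    DEMO_MASK(10, 2)
#define DEMO_GPCCR_SH      DEMO_MASK(12, 2)
#define DEMO_GPCCR_PGS     DEMO_MASK(14, 2)
#define DEMO_GPCCR_GPC     DEMO_MASK(16, 1)
#define DEMO_GPCCR_GPCP    DEMO_MASK(17, 1)
#define DEMO_GPCCR_L0GPTSZ DEMO_MASK(20, 4)

/* Take writable bits from `value`; keep everything else from `old`. */
static inline uint64_t demo_gpccr_write(uint64_t old, uint64_t value)
{
    const uint64_t rw_mask = DEMO_GPCCR_PPS | DEMO_GPCCR_IRGN |
        DEMO_GPCCR_ORGN | DEMO_GPCCR_SH | DEMO_GPCCR_PGS |
        DEMO_GPCCR_GPC | DEMO_GPCCR_GPCP;

    return (value & rw_mask) | (old & ~rw_mask);
}

/* Reset value: all zero except the read-only L0GPTSZ field. */
static inline uint64_t demo_gpccr_reset(uint8_t l0gptsz)
{
    return ((uint64_t)l0gptsz << 20) & DEMO_GPCCR_L0GPTSZ;
}
```

Because the unwritable bits always come from the old value, a guest write of all-ones cannot disturb L0GPTSZ or set RES0 bits.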

-[Qemu-devel] [PULL 03/45] target/arm: Move some system registers into a substructure
+[PULL 05/26] target/arm: Introduce ARMSecuritySpace
 From: Richard Henderson <richard.henderson@linaro.org>
-Create struct ARMISARegisters, to be accessed during translation.
+Introduce both the enumeration and functions to retrieve
 the current state, and state outside of EL3.
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20181016223115.24100-2-richard.henderson@linaro.org
+Message-id: 20230620124418.805717-6-richard.henderson@linaro.org
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/cpu.h      |  32 ++++----
+ target/arm/cpu.h    | 89 ++++++++++++++++++++++++++++++++++-----------
- hw/intc/armv7m_nvic.c |  12 +--
+ target/arm/helper.c | 60 ++++++++++++++++++++++++++++++
- target/arm/cpu.c      | 178 +++++++++++++++++++++---------------------
+files changed, 127 insertions(+), 22 deletions(-)
  target/arm/cpu64.c    |  70 ++++++++---------
  target/arm/helper.c   |  28 +++----
 files changed, 162 insertions(+), 158 deletions(-)
 diff --git a/target/arm/cpu.h b/target/arm/cpu.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu.h
 +++ b/target/arm/cpu.h
-@@ -XXX,XX +XXX,XX @@ struct ARMCPU {
+@@ -XXX,XX +XXX,XX @@ static inline int arm_feature(CPUARMState *env, int feature)
-      * ARMv7AR ARM Architecture Reference Manual. A reset_ prefix
-      * is used for reset values of non-constant registers; no reset_
+ void arm_cpu_finalize_features(ARMCPU *cpu, Error **errp);
-      * prefix means a constant register.
-+     * Some of these registers are split out into a substructure that
+-#if !defined(CONFIG_USER_ONLY)
-+     * is shared with the translators to control the ISA.
+ /*
-      */
++ * ARM v9 security states.
-+    struct ARMISARegisters {
++ * The ordering of the enumeration corresponds to the low 2 bits
-+        uint32_t id_isar0;
++ * of the GPI value, and (except for Root) the concat of NSE:NS.
-+        uint32_t id_isar1;
++ */
-+        uint32_t id_isar2;
++
-+        uint32_t id_isar3;
++typedef enum ARMSecuritySpace {
-+        uint32_t id_isar4;
++    ARMSS_Secure     = 0,
-+        uint32_t id_isar5;
++    ARMSS_NonSecure  = 1,
-+        uint32_t id_isar6;
++    ARMSS_Root       = 2,
-+        uint32_t mvfr0;
++    ARMSS_Realm      = 3,
-+        uint32_t mvfr1;
++} ARMSecuritySpace;
-+        uint32_t mvfr2;
++
-+        uint64_t id_aa64isar0;
++/* Return true if @space is secure, in the pre-v9 sense. */
-+        uint64_t id_aa64isar1;
++static inline bool arm_space_is_secure(ARMSecuritySpace space)
-+        uint64_t id_aa64pfr0;
++{
-+        uint64_t id_aa64pfr1;
++    return space == ARMSS_Secure || space == ARMSS_Root;
-+    } isar;
++}
-     uint32_t midr;
++
-     uint32_t revidr;
++/* Return the ARMSecuritySpace for @secure, assuming !RME or EL[0-2]. */
-     uint32_t reset_fpsid;
++static inline ARMSecuritySpace arm_secure_to_space(bool secure)
--    uint32_t mvfr0;
++{
--    uint32_t mvfr1;
++    return secure ? ARMSS_Secure : ARMSS_NonSecure;
--    uint32_t mvfr2;
++}
-     uint32_t ctr;
++
-     uint32_t reset_sctlr;
++#if !defined(CONFIG_USER_ONLY)
-     uint32_t id_pfr0;
++/**
-@@ -XXX,XX +XXX,XX @@ struct ARMCPU {
++ * arm_security_space_below_el3:
-     uint32_t id_mmfr2;
++ * @env: cpu context
-     uint32_t id_mmfr3;
++ *
-     uint32_t id_mmfr4;
++ * Return the security space of exception levels below EL3, following
--    uint32_t id_isar0;
++ * an exception return to those levels.  Unlike arm_security_space,
--    uint32_t id_isar1;
++ * this doesn't care about the current EL.
--    uint32_t id_isar2;
++ */
--    uint32_t id_isar3;
++ARMSecuritySpace arm_security_space_below_el3(CPUARMState *env);
--    uint32_t id_isar4;
++
--    uint32_t id_isar5;
++/**
--    uint32_t id_isar6;
++ * arm_is_secure_below_el3:
--    uint64_t id_aa64pfr0;
++ * @env: cpu context
--    uint64_t id_aa64pfr1;
++ *
-     uint64_t id_aa64dfr0;
+  * Return true if exception levels below EL3 are in secure state,
-     uint64_t id_aa64dfr1;
+- * or would be following an exception return to that level.
-     uint64_t id_aa64afr0;
+- * Unlike arm_is_secure() (which is always a question about the
-     uint64_t id_aa64afr1;
+- * _current_ state of the CPU) this doesn't care about the current
--    uint64_t id_aa64isar0;
+- * EL or mode.
--    uint64_t id_aa64isar1;
++ * or would be following an exception return to those levels.
-     uint64_t id_aa64mmfr0;
+  */
-     uint64_t id_aa64mmfr1;
+ static inline bool arm_is_secure_below_el3(CPUARMState *env)
-     uint32_t dbgdidr;
+ {
-diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
+-    assert(!arm_feature(env, ARM_FEATURE_M));
-index XXXXXXX..XXXXXXX 100644
+-    if (arm_feature(env, ARM_FEATURE_EL3)) {
---- a/hw/intc/armv7m_nvic.c
+-        return !(env->cp15.scr_el3 & SCR_NS);
-+++ b/hw/intc/armv7m_nvic.c
+-    } else {
-@@ -XXX,XX +XXX,XX @@ static uint32_t nvic_readl(NVICState *s, uint32_t offset, MemTxAttrs attrs)
+-        /* If EL3 is not supported then the secure state is implementation
-     case 0xd5c: /* MMFR3.  */
+-         * defined, in which case QEMU defaults to non-secure.
-         return cpu->id_mmfr3;
+-         */
-     case 0xd60: /* ISAR0.  */
+-        return false;
--        return cpu->id_isar0;
+-    }
-+        return cpu->isar.id_isar0;
++    ARMSecuritySpace ss = arm_security_space_below_el3(env);
-     case 0xd64: /* ISAR1.  */
++    return ss == ARMSS_Secure;
--        return cpu->id_isar1;
+ }
-+        return cpu->isar.id_isar1;
-     case 0xd68: /* ISAR2.  */
+ /* Return true if the CPU is AArch64 EL3 or AArch32 Mon */
--        return cpu->id_isar2;
+@@ -XXX,XX +XXX,XX @@ static inline bool arm_is_el3_or_mon(CPUARMState *env)
-+        return cpu->isar.id_isar2;
+     return false;
-     case 0xd6c: /* ISAR3.  */
+ }
--        return cpu->id_isar3;
-+        return cpu->isar.id_isar3;
+-/* Return true if the processor is in secure state */
-     case 0xd70: /* ISAR4.  */
++/**
--        return cpu->id_isar4;
++ * arm_security_space:
-+        return cpu->isar.id_isar4;
++ * @env: cpu context
-     case 0xd74: /* ISAR5.  */
++ *
--        return cpu->id_isar5;
++ * Return the current security space of the cpu.
-+        return cpu->isar.id_isar5;
++ */
-     case 0xd78: /* CLIDR */
++ARMSecuritySpace arm_security_space(CPUARMState *env);
-         return cpu->clidr;
++
-     case 0xd7c: /* CTR */
++/**
-diff --git a/target/arm/cpu.c b/target/arm/cpu.c
++ * arm_is_secure:
-index XXXXXXX..XXXXXXX 100644
++ * @env: cpu context
---- a/target/arm/cpu.c
++ *
-+++ b/target/arm/cpu.c
++ * Return true if the processor is in secure state.
-@@ -XXX,XX +XXX,XX @@ static void arm_cpu_reset(CPUState *s)
++ */
-     g_hash_table_foreach(cpu->cp_regs, cp_reg_check_reset, cpu);
+ static inline bool arm_is_secure(CPUARMState *env)
+ {
-     env->vfp.xregs[ARM_VFP_FPSID] = cpu->reset_fpsid;
+-    if (arm_feature(env, ARM_FEATURE_M)) {
--    env->vfp.xregs[ARM_VFP_MVFR0] = cpu->mvfr0;
+-        return env->v7m.secure;
--    env->vfp.xregs[ARM_VFP_MVFR1] = cpu->mvfr1;
+-    }
--    env->vfp.xregs[ARM_VFP_MVFR2] = cpu->mvfr2;
+-    if (arm_is_el3_or_mon(env)) {
-+    env->vfp.xregs[ARM_VFP_MVFR0] = cpu->isar.mvfr0;
+-        return true;
-+    env->vfp.xregs[ARM_VFP_MVFR1] = cpu->isar.mvfr1;
+-    }
-+    env->vfp.xregs[ARM_VFP_MVFR2] = cpu->isar.mvfr2;
+-    return arm_is_secure_below_el3(env);
++    return arm_space_is_secure(arm_security_space(env));
-     cpu->power_state = cpu->start_powered_off ? PSCI_OFF : PSCI_ON;
+ }
-     s->halted = cpu->start_powered_off;
-@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
+ /*
-          * registers as well. These are id_pfr1[7:4] and id_aa64pfr0[15:12].
+@@ -XXX,XX +XXX,XX @@ static inline bool arm_is_el2_enabled(CPUARMState *env)
-          */
+ }
-         cpu->id_pfr1 &= ~0xf0;
--        cpu->id_aa64pfr0 &= ~0xf000;
+ #else
-+        cpu->isar.id_aa64pfr0 &= ~0xf000;
++static inline ARMSecuritySpace arm_security_space_below_el3(CPUARMState *env)
-     }
++{
++    return ARMSS_NonSecure;
-     if (!cpu->has_el2) {
++}
-@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
++
-          * registers if we don't have EL2. These are id_pfr1[15:12] and
+ static inline bool arm_is_secure_below_el3(CPUARMState *env)
-          * id_aa64pfr0_el1[11:8].
+ {
-          */
+     return false;
--        cpu->id_aa64pfr0 &= ~0xf00;
+ }
-+        cpu->isar.id_aa64pfr0 &= ~0xf00;
-         cpu->id_pfr1 &= ~0xf000;
++static inline ARMSecuritySpace arm_security_space(CPUARMState *env)
-     }
++{
++    return ARMSS_NonSecure;
-@@ -XXX,XX +XXX,XX @@ static void arm1136_r2_initfn(Object *obj)
++}
-     set_feature(&cpu->env, ARM_FEATURE_CACHE_BLOCK_OPS);
++
-     cpu->midr = 0x4107b362;
+ static inline bool arm_is_secure(CPUARMState *env)
-     cpu->reset_fpsid = 0x410120b4;
+ {
--    cpu->mvfr0 = 0x11111111;
+     return false;
 -    cpu->mvfr1 = 0x00000000;
 +    cpu->isar.mvfr0 = 0x11111111;
 +    cpu->isar.mvfr1 = 0x00000000;
      cpu->ctr = 0x1dd20d2;
      cpu->reset_sctlr = 0x00050078;
      cpu->id_pfr0 = 0x111;
@@ -XXX,XX +XXX,XX @@ static void arm1136_r2_initfn(Object *obj)
      cpu->id_mmfr0 = 0x01130003;
      cpu->id_mmfr1 = 0x10030302;
      cpu->id_mmfr2 = 0x01222110;
 -    cpu->id_isar0 = 0x00140011;
 -    cpu->id_isar1 = 0x12002111;
 -    cpu->id_isar2 = 0x11231111;
 -    cpu->id_isar3 = 0x01102131;
 -    cpu->id_isar4 = 0x141;
 +    cpu->isar.id_isar0 = 0x00140011;
 +    cpu->isar.id_isar1 = 0x12002111;
 +    cpu->isar.id_isar2 = 0x11231111;
 +    cpu->isar.id_isar3 = 0x01102131;
 +    cpu->isar.id_isar4 = 0x141;
      cpu->reset_auxcr = 7;
  }
@@ -XXX,XX +XXX,XX @@ static void arm1136_initfn(Object *obj)
      set_feature(&cpu->env, ARM_FEATURE_CACHE_BLOCK_OPS);
      cpu->midr = 0x4117b363;
      cpu->reset_fpsid = 0x410120b4;
 -    cpu->mvfr0 = 0x11111111;
 -    cpu->mvfr1 = 0x00000000;
 +    cpu->isar.mvfr0 = 0x11111111;
 +    cpu->isar.mvfr1 = 0x00000000;
      cpu->ctr = 0x1dd20d2;
      cpu->reset_sctlr = 0x00050078;
      cpu->id_pfr0 = 0x111;
@@ -XXX,XX +XXX,XX @@ static void arm1136_initfn(Object *obj)
      cpu->id_mmfr0 = 0x01130003;
      cpu->id_mmfr1 = 0x10030302;
      cpu->id_mmfr2 = 0x01222110;
 -    cpu->id_isar0 = 0x00140011;
 -    cpu->id_isar1 = 0x12002111;
 -    cpu->id_isar2 = 0x11231111;
 -    cpu->id_isar3 = 0x01102131;
 -    cpu->id_isar4 = 0x141;
 +    cpu->isar.id_isar0 = 0x00140011;
 +    cpu->isar.id_isar1 = 0x12002111;
 +    cpu->isar.id_isar2 = 0x11231111;
 +    cpu->isar.id_isar3 = 0x01102131;
 +    cpu->isar.id_isar4 = 0x141;
      cpu->reset_auxcr = 7;
  }
@@ -XXX,XX +XXX,XX @@ static void arm1176_initfn(Object *obj)
      set_feature(&cpu->env, ARM_FEATURE_EL3);
      cpu->midr = 0x410fb767;
      cpu->reset_fpsid = 0x410120b5;
 -    cpu->mvfr0 = 0x11111111;
 -    cpu->mvfr1 = 0x00000000;
 +    cpu->isar.mvfr0 = 0x11111111;
 +    cpu->isar.mvfr1 = 0x00000000;
      cpu->ctr = 0x1dd20d2;
      cpu->reset_sctlr = 0x00050078;
      cpu->id_pfr0 = 0x111;
@@ -XXX,XX +XXX,XX @@ static void arm1176_initfn(Object *obj)
      cpu->id_mmfr0 = 0x01130003;
      cpu->id_mmfr1 = 0x10030302;
      cpu->id_mmfr2 = 0x01222100;
 -    cpu->id_isar0 = 0x0140011;
 -    cpu->id_isar1 = 0x12002111;
 -    cpu->id_isar2 = 0x11231121;
 -    cpu->id_isar3 = 0x01102131;
 -    cpu->id_isar4 = 0x01141;
 +    cpu->isar.id_isar0 = 0x0140011;
 +    cpu->isar.id_isar1 = 0x12002111;
 +    cpu->isar.id_isar2 = 0x11231121;
 +    cpu->isar.id_isar3 = 0x01102131;
 +    cpu->isar.id_isar4 = 0x01141;
      cpu->reset_auxcr = 7;
  }
@@ -XXX,XX +XXX,XX @@ static void arm11mpcore_initfn(Object *obj)
      set_feature(&cpu->env, ARM_FEATURE_DUMMY_C15_REGS);
      cpu->midr = 0x410fb022;
      cpu->reset_fpsid = 0x410120b4;
 -    cpu->mvfr0 = 0x11111111;
 -    cpu->mvfr1 = 0x00000000;
 +    cpu->isar.mvfr0 = 0x11111111;
 +    cpu->isar.mvfr1 = 0x00000000;
      cpu->ctr = 0x1d192992; /* 32K icache 32K dcache */
      cpu->id_pfr0 = 0x111;
      cpu->id_pfr1 = 0x1;
@@ -XXX,XX +XXX,XX @@ static void arm11mpcore_initfn(Object *obj)
      cpu->id_mmfr0 = 0x01100103;
      cpu->id_mmfr1 = 0x10020302;
      cpu->id_mmfr2 = 0x01222000;
 -    cpu->id_isar0 = 0x00100011;
 -    cpu->id_isar1 = 0x12002111;
 -    cpu->id_isar2 = 0x11221011;
 -    cpu->id_isar3 = 0x01102131;
 -    cpu->id_isar4 = 0x141;
 +    cpu->isar.id_isar0 = 0x00100011;
 +    cpu->isar.id_isar1 = 0x12002111;
 +    cpu->isar.id_isar2 = 0x11221011;
 +    cpu->isar.id_isar3 = 0x01102131;
 +    cpu->isar.id_isar4 = 0x141;
      cpu->reset_auxcr = 1;
  }
@@ -XXX,XX +XXX,XX @@ static void cortex_m3_initfn(Object *obj)
      cpu->id_mmfr1 = 0x00000000;
      cpu->id_mmfr2 = 0x00000000;
      cpu->id_mmfr3 = 0x00000000;
 -    cpu->id_isar0 = 0x01141110;
 -    cpu->id_isar1 = 0x02111000;
 -    cpu->id_isar2 = 0x21112231;
 -    cpu->id_isar3 = 0x01111110;
 -    cpu->id_isar4 = 0x01310102;
 -    cpu->id_isar5 = 0x00000000;
 -    cpu->id_isar6 = 0x00000000;
 +    cpu->isar.id_isar0 = 0x01141110;
 +    cpu->isar.id_isar1 = 0x02111000;
 +    cpu->isar.id_isar2 = 0x21112231;
 +    cpu->isar.id_isar3 = 0x01111110;
 +    cpu->isar.id_isar4 = 0x01310102;
 +    cpu->isar.id_isar5 = 0x00000000;
 +    cpu->isar.id_isar6 = 0x00000000;
  }
  static void cortex_m4_initfn(Object *obj)
@@ -XXX,XX +XXX,XX @@ static void cortex_m4_initfn(Object *obj)
      cpu->id_mmfr1 = 0x00000000;
      cpu->id_mmfr2 = 0x00000000;
      cpu->id_mmfr3 = 0x00000000;
 -    cpu->id_isar0 = 0x01141110;
 -    cpu->id_isar1 = 0x02111000;
 -    cpu->id_isar2 = 0x21112231;
 -    cpu->id_isar3 = 0x01111110;
 -    cpu->id_isar4 = 0x01310102;
 -    cpu->id_isar5 = 0x00000000;
 -    cpu->id_isar6 = 0x00000000;
 +    cpu->isar.id_isar0 = 0x01141110;
 +    cpu->isar.id_isar1 = 0x02111000;
 +    cpu->isar.id_isar2 = 0x21112231;
 +    cpu->isar.id_isar3 = 0x01111110;
 +    cpu->isar.id_isar4 = 0x01310102;
 +    cpu->isar.id_isar5 = 0x00000000;
 +    cpu->isar.id_isar6 = 0x00000000;
  }
  static void cortex_m33_initfn(Object *obj)
@@ -XXX,XX +XXX,XX @@ static void cortex_m33_initfn(Object *obj)
      cpu->id_mmfr1 = 0x00000000;
      cpu->id_mmfr2 = 0x01000000;
      cpu->id_mmfr3 = 0x00000000;
 -    cpu->id_isar0 = 0x01101110;
 -    cpu->id_isar1 = 0x02212000;
 -    cpu->id_isar2 = 0x20232232;
 -    cpu->id_isar3 = 0x01111131;
 -    cpu->id_isar4 = 0x01310132;
 -    cpu->id_isar5 = 0x00000000;
 -    cpu->id_isar6 = 0x00000000;
 +    cpu->isar.id_isar0 = 0x01101110;
 +    cpu->isar.id_isar1 = 0x02212000;
 +    cpu->isar.id_isar2 = 0x20232232;
 +    cpu->isar.id_isar3 = 0x01111131;
 +    cpu->isar.id_isar4 = 0x01310132;
 +    cpu->isar.id_isar5 = 0x00000000;
 +    cpu->isar.id_isar6 = 0x00000000;
      cpu->clidr = 0x00000000;
      cpu->ctr = 0x8000c000;
  }
@@ -XXX,XX +XXX,XX @@ static void cortex_r5_initfn(Object *obj)
      cpu->id_mmfr1 = 0x00000000;
      cpu->id_mmfr2 = 0x01200000;
      cpu->id_mmfr3 = 0x0211;
 -    cpu->id_isar0 = 0x02101111;
 -    cpu->id_isar1 = 0x13112111;
 -    cpu->id_isar2 = 0x21232141;
 -    cpu->id_isar3 = 0x01112131;
 -    cpu->id_isar4 = 0x0010142;
 -    cpu->id_isar5 = 0x0;
 -    cpu->id_isar6 = 0x0;
 +    cpu->isar.id_isar0 = 0x02101111;
 +    cpu->isar.id_isar1 = 0x13112111;
 +    cpu->isar.id_isar2 = 0x21232141;
 +    cpu->isar.id_isar3 = 0x01112131;
 +    cpu->isar.id_isar4 = 0x0010142;
 +    cpu->isar.id_isar5 = 0x0;
 +    cpu->isar.id_isar6 = 0x0;
      cpu->mp_is_up = true;
      cpu->pmsav7_dregion = 16;
      define_arm_cp_regs(cpu, cortexr5_cp_reginfo);
@@ -XXX,XX +XXX,XX @@ static void cortex_a8_initfn(Object *obj)
      set_feature(&cpu->env, ARM_FEATURE_EL3);
      cpu->midr = 0x410fc080;
      cpu->reset_fpsid = 0x410330c0;
 -    cpu->mvfr0 = 0x11110222;
 -    cpu->mvfr1 = 0x00011111;
 +    cpu->isar.mvfr0 = 0x11110222;
 +    cpu->isar.mvfr1 = 0x00011111;
      cpu->ctr = 0x82048004;
      cpu->reset_sctlr = 0x00c50078;
      cpu->id_pfr0 = 0x1031;
@@ -XXX,XX +XXX,XX @@ static void cortex_a8_initfn(Object *obj)
      cpu->id_mmfr1 = 0x20000000;
      cpu->id_mmfr2 = 0x01202000;
      cpu->id_mmfr3 = 0x11;
 -    cpu->id_isar0 = 0x00101111;
 -    cpu->id_isar1 = 0x12112111;
 -    cpu->id_isar2 = 0x21232031;
 -    cpu->id_isar3 = 0x11112131;
 -    cpu->id_isar4 = 0x00111142;
 +    cpu->isar.id_isar0 = 0x00101111;
 +    cpu->isar.id_isar1 = 0x12112111;
 +    cpu->isar.id_isar2 = 0x21232031;
 +    cpu->isar.id_isar3 = 0x11112131;
 +    cpu->isar.id_isar4 = 0x00111142;
      cpu->dbgdidr = 0x15141000;
      cpu->clidr = (1 << 27) | (2 << 24) | 3;
      cpu->ccsidr[0] = 0xe007e01a; /* 16k L1 dcache. */
@@ -XXX,XX +XXX,XX @@ static void cortex_a9_initfn(Object *obj)
      set_feature(&cpu->env, ARM_FEATURE_CBAR);
      cpu->midr = 0x410fc090;
      cpu->reset_fpsid = 0x41033090;
 -    cpu->mvfr0 = 0x11110222;
 -    cpu->mvfr1 = 0x01111111;
 +    cpu->isar.mvfr0 = 0x11110222;
 +    cpu->isar.mvfr1 = 0x01111111;
      cpu->ctr = 0x80038003;
      cpu->reset_sctlr = 0x00c50078;
      cpu->id_pfr0 = 0x1031;
@@ -XXX,XX +XXX,XX @@ static void cortex_a9_initfn(Object *obj)
      cpu->id_mmfr1 = 0x20000000;
      cpu->id_mmfr2 = 0x01230000;
      cpu->id_mmfr3 = 0x00002111;
 -    cpu->id_isar0 = 0x00101111;
 -    cpu->id_isar1 = 0x13112111;
 -    cpu->id_isar2 = 0x21232041;
 -    cpu->id_isar3 = 0x11112131;
 -    cpu->id_isar4 = 0x00111142;
 +    cpu->isar.id_isar0 = 0x00101111;
 +    cpu->isar.id_isar1 = 0x13112111;
 +    cpu->isar.id_isar2 = 0x21232041;
 +    cpu->isar.id_isar3 = 0x11112131;
 +    cpu->isar.id_isar4 = 0x00111142;
      cpu->dbgdidr = 0x35141000;
      cpu->clidr = (1 << 27) | (1 << 24) | 3;
      cpu->ccsidr[0] = 0xe00fe019; /* 16k L1 dcache. */
@@ -XXX,XX +XXX,XX @@ static void cortex_a7_initfn(Object *obj)
      cpu->kvm_target = QEMU_KVM_ARM_TARGET_CORTEX_A7;
      cpu->midr = 0x410fc075;
      cpu->reset_fpsid = 0x41023075;
 -    cpu->mvfr0 = 0x10110222;
 -    cpu->mvfr1 = 0x11111111;
 +    cpu->isar.mvfr0 = 0x10110222;
 +    cpu->isar.mvfr1 = 0x11111111;
      cpu->ctr = 0x84448003;
      cpu->reset_sctlr = 0x00c50078;
      cpu->id_pfr0 = 0x00001131;
@@ -XXX,XX +XXX,XX @@ static void cortex_a7_initfn(Object *obj)
      /* a7_mpcore_r0p5_trm, page 4-4 gives 0x01101110; but
       * table 4-41 gives 0x02101110, which includes the arm div insns.
       */
 -    cpu->id_isar0 = 0x02101110;
 -    cpu->id_isar1 = 0x13112111;
 -    cpu->id_isar2 = 0x21232041;
 -    cpu->id_isar3 = 0x11112131;
 -    cpu->id_isar4 = 0x10011142;
 +    cpu->isar.id_isar0 = 0x02101110;
 +    cpu->isar.id_isar1 = 0x13112111;
 +    cpu->isar.id_isar2 = 0x21232041;
 +    cpu->isar.id_isar3 = 0x11112131;
 +    cpu->isar.id_isar4 = 0x10011142;
      cpu->dbgdidr = 0x3515f005;
      cpu->clidr = 0x0a200023;
      cpu->ccsidr[0] = 0x701fe00a; /* 32K L1 dcache */
@@ -XXX,XX +XXX,XX @@ static void cortex_a15_initfn(Object *obj)
      cpu->kvm_target = QEMU_KVM_ARM_TARGET_CORTEX_A15;
      cpu->midr = 0x412fc0f1;
      cpu->reset_fpsid = 0x410430f0;
 -    cpu->mvfr0 = 0x10110222;
 -    cpu->mvfr1 = 0x11111111;
 +    cpu->isar.mvfr0 = 0x10110222;
 +    cpu->isar.mvfr1 = 0x11111111;
      cpu->ctr = 0x8444c004;
      cpu->reset_sctlr = 0x00c50078;
      cpu->id_pfr0 = 0x00001131;
@@ -XXX,XX +XXX,XX @@ static void cortex_a15_initfn(Object *obj)
      cpu->id_mmfr1 = 0x20000000;
      cpu->id_mmfr2 = 0x01240000;
      cpu->id_mmfr3 = 0x02102211;
 -    cpu->id_isar0 = 0x02101110;
 -    cpu->id_isar1 = 0x13112111;
 -    cpu->id_isar2 = 0x21232041;
 -    cpu->id_isar3 = 0x11112131;
 -    cpu->id_isar4 = 0x10011142;
 +    cpu->isar.id_isar0 = 0x02101110;
 +    cpu->isar.id_isar1 = 0x13112111;
 +    cpu->isar.id_isar2 = 0x21232041;
 +    cpu->isar.id_isar3 = 0x11112131;
 +    cpu->isar.id_isar4 = 0x10011142;
      cpu->dbgdidr = 0x3515f021;
      cpu->clidr = 0x0a200023;
      cpu->ccsidr[0] = 0x701fe00a; /* 32K L1 dcache */
 diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu64.c
 +++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_a57_initfn(Object *obj)
      cpu->midr = 0x411fd070;
      cpu->revidr = 0x00000000;
      cpu->reset_fpsid = 0x41034070;
 -    cpu->mvfr0 = 0x10110222;
 -    cpu->mvfr1 = 0x12111111;
 -    cpu->mvfr2 = 0x00000043;
 +    cpu->isar.mvfr0 = 0x10110222;
 +    cpu->isar.mvfr1 = 0x12111111;
 +    cpu->isar.mvfr2 = 0x00000043;
      cpu->ctr = 0x8444c004;
      cpu->reset_sctlr = 0x00c50838;
      cpu->id_pfr0 = 0x00000131;
@@ -XXX,XX +XXX,XX @@ static void aarch64_a57_initfn(Object *obj)
      cpu->id_mmfr1 = 0x40000000;
      cpu->id_mmfr2 = 0x01260000;
      cpu->id_mmfr3 = 0x02102211;
 -    cpu->id_isar0 = 0x02101110;
 -    cpu->id_isar1 = 0x13112111;
 -    cpu->id_isar2 = 0x21232042;
 -    cpu->id_isar3 = 0x01112131;
 -    cpu->id_isar4 = 0x00011142;
 -    cpu->id_isar5 = 0x00011121;
 -    cpu->id_isar6 = 0;
 -    cpu->id_aa64pfr0 = 0x00002222;
 +    cpu->isar.id_isar0 = 0x02101110;
 +    cpu->isar.id_isar1 = 0x13112111;
 +    cpu->isar.id_isar2 = 0x21232042;
 +    cpu->isar.id_isar3 = 0x01112131;
 +    cpu->isar.id_isar4 = 0x00011142;
 +    cpu->isar.id_isar5 = 0x00011121;
 +    cpu->isar.id_isar6 = 0;
 +    cpu->isar.id_aa64pfr0 = 0x00002222;
      cpu->id_aa64dfr0 = 0x10305106;
      cpu->pmceid0 = 0x00000000;
      cpu->pmceid1 = 0x00000000;
 -    cpu->id_aa64isar0 = 0x00011120;
 +    cpu->isar.id_aa64isar0 = 0x00011120;
      cpu->id_aa64mmfr0 = 0x00001124;
      cpu->dbgdidr = 0x3516d000;
      cpu->clidr = 0x0a200023;
@@ -XXX,XX +XXX,XX @@ static void aarch64_a53_initfn(Object *obj)
      cpu->midr = 0x410fd034;
      cpu->revidr = 0x00000000;
      cpu->reset_fpsid = 0x41034070;
 -    cpu->mvfr0 = 0x10110222;
 -    cpu->mvfr1 = 0x12111111;
 -    cpu->mvfr2 = 0x00000043;
 +    cpu->isar.mvfr0 = 0x10110222;
 +    cpu->isar.mvfr1 = 0x12111111;
 +    cpu->isar.mvfr2 = 0x00000043;
      cpu->ctr = 0x84448004; /* L1Ip = VIPT */
      cpu->reset_sctlr = 0x00c50838;
      cpu->id_pfr0 = 0x00000131;
@@ -XXX,XX +XXX,XX @@ static void aarch64_a53_initfn(Object *obj)
      cpu->id_mmfr1 = 0x40000000;
      cpu->id_mmfr2 = 0x01260000;
      cpu->id_mmfr3 = 0x02102211;
 -    cpu->id_isar0 = 0x02101110;
 -    cpu->id_isar1 = 0x13112111;
 -    cpu->id_isar2 = 0x21232042;
 -    cpu->id_isar3 = 0x01112131;
 -    cpu->id_isar4 = 0x00011142;
 -    cpu->id_isar5 = 0x00011121;
 -    cpu->id_isar6 = 0;
 -    cpu->id_aa64pfr0 = 0x00002222;
 +    cpu->isar.id_isar0 = 0x02101110;
 +    cpu->isar.id_isar1 = 0x13112111;
 +    cpu->isar.id_isar2 = 0x21232042;
 +    cpu->isar.id_isar3 = 0x01112131;
 +    cpu->isar.id_isar4 = 0x00011142;
 +    cpu->isar.id_isar5 = 0x00011121;
 +    cpu->isar.id_isar6 = 0;
 +    cpu->isar.id_aa64pfr0 = 0x00002222;
      cpu->id_aa64dfr0 = 0x10305106;
 -    cpu->id_aa64isar0 = 0x00011120;
 +    cpu->isar.id_aa64isar0 = 0x00011120;
      cpu->id_aa64mmfr0 = 0x00001122; /* 40 bit physical addr */
      cpu->dbgdidr = 0x3516d000;
      cpu->clidr = 0x0a200023;
@@ -XXX,XX +XXX,XX @@ static void aarch64_a72_initfn(Object *obj)
      cpu->midr = 0x410fd083;
      cpu->revidr = 0x00000000;
      cpu->reset_fpsid = 0x41034080;
 -    cpu->mvfr0 = 0x10110222;
 -    cpu->mvfr1 = 0x12111111;
 -    cpu->mvfr2 = 0x00000043;
 +    cpu->isar.mvfr0 = 0x10110222;
 +    cpu->isar.mvfr1 = 0x12111111;
 +    cpu->isar.mvfr2 = 0x00000043;
      cpu->ctr = 0x8444c004;
      cpu->reset_sctlr = 0x00c50838;
      cpu->id_pfr0 = 0x00000131;
@@ -XXX,XX +XXX,XX @@ static void aarch64_a72_initfn(Object *obj)
      cpu->id_mmfr1 = 0x40000000;
      cpu->id_mmfr2 = 0x01260000;
      cpu->id_mmfr3 = 0x02102211;
 -    cpu->id_isar0 = 0x02101110;
 -    cpu->id_isar1 = 0x13112111;
 -    cpu->id_isar2 = 0x21232042;
 -    cpu->id_isar3 = 0x01112131;
 -    cpu->id_isar4 = 0x00011142;
 -    cpu->id_isar5 = 0x00011121;
 -    cpu->id_aa64pfr0 = 0x00002222;
 +    cpu->isar.id_isar0 = 0x02101110;
 +    cpu->isar.id_isar1 = 0x13112111;
 +    cpu->isar.id_isar2 = 0x21232042;
 +    cpu->isar.id_isar3 = 0x01112131;
 +    cpu->isar.id_isar4 = 0x00011142;
 +    cpu->isar.id_isar5 = 0x00011121;
 +    cpu->isar.id_aa64pfr0 = 0x00002222;
      cpu->id_aa64dfr0 = 0x10305106;
      cpu->pmceid0 = 0x00000000;
      cpu->pmceid1 = 0x00000000;
 -    cpu->id_aa64isar0 = 0x00011120;
 +    cpu->isar.id_aa64isar0 = 0x00011120;
      cpu->id_aa64mmfr0 = 0x00001124;
      cpu->dbgdidr = 0x3516d000;
      cpu->clidr = 0x0a200023;
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.c
 +++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ static uint64_t id_pfr1_read(CPUARMState *env, const ARMCPRegInfo *ri)
+@@ -XXX,XX +XXX,XX @@ void aarch64_sve_change_el(CPUARMState *env, int old_el,
- static uint64_t id_aa64pfr0_read(CPUARMState *env, const ARMCPRegInfo *ri)
+     }
- {
+ }
-     ARMCPU *cpu = arm_env_get_cpu(env);
+ #endif
--    uint64_t pfr0 = cpu->id_aa64pfr0;
++
-+    uint64_t pfr0 = cpu->isar.id_aa64pfr0;
++#ifndef CONFIG_USER_ONLY
++ARMSecuritySpace arm_security_space(CPUARMState *env)
-     if (env->gicv3state) {
++{
-         pfr0 |= 1 << 24;
++    if (arm_feature(env, ARM_FEATURE_M)) {
-@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
++        return arm_secure_to_space(env->v7m.secure);
-             { .name = "ID_ISAR0", .state = ARM_CP_STATE_BOTH,
++    }
-               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 2, .opc2 = 0,
++
-               .access = PL1_R, .type = ARM_CP_CONST,
++    /*
--              .resetvalue = cpu->id_isar0 },
++     * If EL3 is not supported then the secure state is implementation
-+              .resetvalue = cpu->isar.id_isar0 },
++     * defined, in which case QEMU defaults to non-secure.
-             { .name = "ID_ISAR1", .state = ARM_CP_STATE_BOTH,
++     */
-               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 2, .opc2 = 1,
++    if (!arm_feature(env, ARM_FEATURE_EL3)) {
-               .access = PL1_R, .type = ARM_CP_CONST,
++        return ARMSS_NonSecure;
--              .resetvalue = cpu->id_isar1 },
++    }
-+              .resetvalue = cpu->isar.id_isar1 },
++
-             { .name = "ID_ISAR2", .state = ARM_CP_STATE_BOTH,
++    /* Check for AArch64 EL3 or AArch32 Mon. */
-               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 2, .opc2 = 2,
++    if (is_a64(env)) {
-               .access = PL1_R, .type = ARM_CP_CONST,
++        if (extract32(env->pstate, 2, 2) == 3) {
--              .resetvalue = cpu->id_isar2 },
++            if (cpu_isar_feature(aa64_rme, env_archcpu(env))) {
-+              .resetvalue = cpu->isar.id_isar2 },
++                return ARMSS_Root;
-             { .name = "ID_ISAR3", .state = ARM_CP_STATE_BOTH,
++            } else {
-               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 2, .opc2 = 3,
++                return ARMSS_Secure;
-               .access = PL1_R, .type = ARM_CP_CONST,
++            }
--              .resetvalue = cpu->id_isar3 },
++        }
-+              .resetvalue = cpu->isar.id_isar3 },
++    } else {
-             { .name = "ID_ISAR4", .state = ARM_CP_STATE_BOTH,
++        if ((env->uncached_cpsr & CPSR_M) == ARM_CPU_MODE_MON) {
-               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 2, .opc2 = 4,
++            return ARMSS_Secure;
-               .access = PL1_R, .type = ARM_CP_CONST,
++        }
--              .resetvalue = cpu->id_isar4 },
++    }
-+              .resetvalue = cpu->isar.id_isar4 },
++
-             { .name = "ID_ISAR5", .state = ARM_CP_STATE_BOTH,
++    return arm_security_space_below_el3(env);
-               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 2, .opc2 = 5,
++}
-               .access = PL1_R, .type = ARM_CP_CONST,
++
--              .resetvalue = cpu->id_isar5 },
++ARMSecuritySpace arm_security_space_below_el3(CPUARMState *env)
-+              .resetvalue = cpu->isar.id_isar5 },
++{
-             { .name = "ID_MMFR4", .state = ARM_CP_STATE_BOTH,
++    assert(!arm_feature(env, ARM_FEATURE_M));
-               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 2, .opc2 = 6,
++
-               .access = PL1_R, .type = ARM_CP_CONST,
++    /*
-@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
++     * If EL3 is not supported then the secure state is implementation
-             { .name = "ID_ISAR6", .state = ARM_CP_STATE_BOTH,
++     * defined, in which case QEMU defaults to non-secure.
-               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 2, .opc2 = 7,
++     */
-               .access = PL1_R, .type = ARM_CP_CONST,
++    if (!arm_feature(env, ARM_FEATURE_EL3)) {
--              .resetvalue = cpu->id_isar6 },
++        return ARMSS_NonSecure;
-+              .resetvalue = cpu->isar.id_isar6 },
++    }
-             REGINFO_SENTINEL
++
-         };
++    /*
-         define_arm_cp_regs(cpu, v6_idregs);
++     * Note NSE cannot be set without RME, and NSE & !NS is Reserved.
-@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
++     * Ignoring NSE when !NS retains consistency without having to
-             { .name = "ID_AA64PFR1_EL1", .state = ARM_CP_STATE_AA64,
++     * modify other predicates.
-               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 4, .opc2 = 1,
++     */
-               .access = PL1_R, .type = ARM_CP_CONST,
++    if (!(env->cp15.scr_el3 & SCR_NS)) {
--              .resetvalue = cpu->id_aa64pfr1},
++        return ARMSS_Secure;
-+              .resetvalue = cpu->isar.id_aa64pfr1},
++    } else if (env->cp15.scr_el3 & SCR_NSE) {
-             { .name = "ID_AA64PFR2_EL1_RESERVED", .state = ARM_CP_STATE_AA64,
++        return ARMSS_Realm;
-               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 4, .opc2 = 2,
++    } else {
-               .access = PL1_R, .type = ARM_CP_CONST,
++        return ARMSS_NonSecure;
-@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
++    }
-             { .name = "ID_AA64ISAR0_EL1", .state = ARM_CP_STATE_AA64,
++}
-               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 6, .opc2 = 0,
++#endif /* !CONFIG_USER_ONLY */
                .access = PL1_R, .type = ARM_CP_CONST,
 -              .resetvalue = cpu->id_aa64isar0 },
 +              .resetvalue = cpu->isar.id_aa64isar0 },
              { .name = "ID_AA64ISAR1_EL1", .state = ARM_CP_STATE_AA64,
                .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 6, .opc2 = 1,
                .access = PL1_R, .type = ARM_CP_CONST,
 -              .resetvalue = cpu->id_aa64isar1 },
 +              .resetvalue = cpu->isar.id_aa64isar1 },
              { .name = "ID_AA64ISAR2_EL1_RESERVED", .state = ARM_CP_STATE_AA64,
                .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 6, .opc2 = 2,
                .access = PL1_R, .type = ARM_CP_CONST,
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
              { .name = "MVFR0_EL1", .state = ARM_CP_STATE_AA64,
                .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 3, .opc2 = 0,
                .access = PL1_R, .type = ARM_CP_CONST,
 -              .resetvalue = cpu->mvfr0 },
 +              .resetvalue = cpu->isar.mvfr0 },
              { .name = "MVFR1_EL1", .state = ARM_CP_STATE_AA64,
                .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 3, .opc2 = 1,
                .access = PL1_R, .type = ARM_CP_CONST,
 -              .resetvalue = cpu->mvfr1 },
 +              .resetvalue = cpu->isar.mvfr1 },
              { .name = "MVFR2_EL1", .state = ARM_CP_STATE_AA64,
                .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 3, .opc2 = 2,
                .access = PL1_R, .type = ARM_CP_CONST,
 -              .resetvalue = cpu->mvfr2 },
 +              .resetvalue = cpu->isar.mvfr2 },
              { .name = "MVFR3_EL1_RESERVED", .state = ARM_CP_STATE_AA64,
                .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 3, .opc2 = 3,
                .access = PL1_R, .type = ARM_CP_CONST,
 --
-.19.1
+.34.1
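
For readers comparing the two series, the arm_security_space_below_el3() hunk above reduces to a three-way decision on SCR_EL3.NS and SCR_EL3.NSE. What follows is a minimal standalone sketch, not the QEMU code: the Demo* names are hypothetical, the enumerator values are inferred from the build-time checks in patch 08 of the new series (Secure is 0, then NonSecure, Root, Realm), and the bit positions of NS and NSE are assumptions of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for ARMSecuritySpace; values inferred from patch 08. */
typedef enum DemoSpace {
    DEMO_SS_SECURE    = 0,
    DEMO_SS_NONSECURE = 1,
    DEMO_SS_ROOT      = 2,
    DEMO_SS_REALM     = 3,
} DemoSpace;

/* Assumed positions of SCR_EL3.NS and SCR_EL3.NSE. */
#define DEMO_SCR_NS  (UINT64_C(1) << 0)
#define DEMO_SCR_NSE (UINT64_C(1) << 62)

/* Security space for exception levels below EL3. */
static DemoSpace demo_space_below_el3(bool have_el3, uint64_t scr_el3)
{
    if (!have_el3) {
        /* Without EL3 the patch defaults to non-secure. */
        return DEMO_SS_NONSECURE;
    }
    if (!(scr_el3 & DEMO_SCR_NS)) {
        /* NSE is ignored while NS is clear (NSE && !NS is reserved). */
        return DEMO_SS_SECURE;
    }
    return (scr_el3 & DEMO_SCR_NSE) ? DEMO_SS_REALM : DEMO_SS_NONSECURE;
}

/* Exercise each branch once. */
static void demo_self_check(void)
{
    assert(demo_space_below_el3(false, 0) == DEMO_SS_NONSECURE);
    assert(demo_space_below_el3(true, 0) == DEMO_SS_SECURE);
    assert(demo_space_below_el3(true, DEMO_SCR_NSE) == DEMO_SS_SECURE);
    assert(demo_space_below_el3(true, DEMO_SCR_NS) == DEMO_SS_NONSECURE);
    assert(demo_space_below_el3(true, DEMO_SCR_NS | DEMO_SCR_NSE) == DEMO_SS_REALM);
}
```

Root never appears here because, per the hunk, it is only returned when the CPU is actually executing at AArch64 EL3 with RME present.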

-[Qemu-devel] [PULL 40/45] target/arm: Promote consecutive memory ops for aa32
+[PULL 06/26] include/exec/memattrs: Add two bits of space to MemTxAttrs
 From: Richard Henderson <richard.henderson@linaro.org>
-For a sequence of loads or stores from a single register,
+We will need 2 bits to represent ARMSecuritySpace.
 little-endian operations can be promoted to an 8-byte op.
 This can reduce the number of operations by a factor of 8.
+Do not attempt to replace or widen secure, even though it
+logically overlaps the new field -- there are uses within
+e.g. hw/block/pflash_cfi01.c, which don't know anything
+specific about ARM.
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20181011205206.3552-20-richard.henderson@linaro.org
+Message-id: 20230620124418.805717-7-richard.henderson@linaro.org
 Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate.c | 10 ++++++++++
+ include/exec/memattrs.h | 9 ++++++++-
-file changed, 10 insertions(+)
+file changed, 8 insertions(+), 1 deletion(-)
-diff --git a/target/arm/translate.c b/target/arm/translate.c
+diff --git a/include/exec/memattrs.h b/include/exec/memattrs.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.c
+--- a/include/exec/memattrs.h
-+++ b/target/arm/translate.c
++++ b/include/exec/memattrs.h
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ typedef struct MemTxAttrs {
-         if (size == 3 && (interleave | spacing) != 1) {
+      * "didn't specify" if necessary.
-             return 1;
+      */
-         }
+     unsigned int unspecified:1;
-+        /* For our purposes, bytes are always little-endian.  */
+-    /* ARM/AMBA: TrustZone Secure access
-+        if (size == 0) {
++    /*
-+            endian = MO_LE;
++     * ARM/AMBA: TrustZone Secure access
-+        }
+      * x86: System Management Mode access
-+        /* Consecutive little-endian elements from a single register
+      */
-+         * can be promoted to a larger little-endian operation.
+     unsigned int secure:1;
-+         */
++    /*
-+        if (interleave == 1 && endian == MO_LE) {
++     * ARM: ARMSecuritySpace.  This partially overlaps secure, but it is
-+            size = 3;
++     * easier to have both fields to assist code that does not understand
-+        }
++     * ARMv9 RME, or has no specific knowledge of ARM at all (e.g. pflash).
-         tmp64 = tcg_temp_new_i64();
++     */
-         addr = tcg_temp_new_i32();
++    unsigned int space:2;
-         tmp2 = tcg_const_i32(1 << size);
+     /* Memory access is usermode (unprivileged) */
      unsigned int user:1;
      /*
 --
-.19.1
+.34.1
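
To illustrate why the new field in the memattrs.h hunk above is two bits wide and why "secure" is kept alongside it, here is a small sketch. DemoTxAttrs is a hypothetical cut-down stand-in for MemTxAttrs, and the rule that derives "secure" from "space" (Secure and Root count as secure) is an assumption of this sketch rather than something this patch states.

```c
#include <assert.h>

/* Hypothetical cut-down stand-in for MemTxAttrs. */
typedef struct DemoTxAttrs {
    unsigned int unspecified:1;
    unsigned int secure:1;   /* still read by ARM-agnostic code such as pflash */
    unsigned int space:2;    /* two bits: exactly four encodable spaces */
    unsigned int user:1;
} DemoTxAttrs;

enum {
    DEMO_SPACE_SECURE    = 0,
    DEMO_SPACE_NONSECURE = 1,
    DEMO_SPACE_ROOT      = 2,
    DEMO_SPACE_REALM     = 3,
};

/*
 * Fill both fields, so that code which only looks at 'secure' still
 * sees a sensible answer.  The mapping used here is an assumption.
 */
static DemoTxAttrs demo_attrs_for_space(unsigned int space)
{
    DemoTxAttrs attrs = { 0 };

    assert(space <= DEMO_SPACE_REALM);
    attrs.space = space;
    attrs.secure = (space == DEMO_SPACE_SECURE || space == DEMO_SPACE_ROOT);
    return attrs;
}
```

A device model that predates RME can keep testing attrs.secure unchanged, while RME-aware code reads attrs.space.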

-[Qemu-devel] [PULL 08/45] target/arm: Convert t32ee from feature bit to isar3 test
+[PULL 07/26] target/arm: Adjust the order of Phys and Stage2 ARMMMUIdx
 From: Richard Henderson <richard.henderson@linaro.org>
-Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+It will be helpful for ARMMMUIdx_Phys_* to be in the same
 relative order as ARMSecuritySpace enumerators. This requires
 the adjustment to the nstable check. While there, check for being
 in secure state rather than rely on clearing the low bit making
 no change to non-secure state.
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20181016223115.24100-7-richard.henderson@linaro.org
+Message-id: 20230620124418.805717-8-richard.henderson@linaro.org
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/cpu.h     | 6 +++++-
+ target/arm/cpu.h | 12 ++++++------
- linux-user/elfload.c | 2 +-
+ target/arm/ptw.c | 12 +++++-------
- target/arm/cpu.c     | 4 ----
+files changed, 11 insertions(+), 13 deletions(-)
  target/arm/helper.c  | 2 +-
  target/arm/machine.c | 3 +--
 files changed, 8 insertions(+), 9 deletions(-)
 diff --git a/target/arm/cpu.h b/target/arm/cpu.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu.h
 +++ b/target/arm/cpu.h
-@@ -XXX,XX +XXX,XX @@ enum arm_features {
+@@ -XXX,XX +XXX,XX @@ typedef enum ARMMMUIdx {
-     ARM_FEATURE_NEON,
+     ARMMMUIdx_E2        = 6 | ARM_MMU_IDX_A,
-     ARM_FEATURE_M, /* Microcontroller profile.  */
+     ARMMMUIdx_E3        = 7 | ARM_MMU_IDX_A,
-     ARM_FEATURE_OMAPCP, /* OMAP specific CP15 ops handling.  */
--    ARM_FEATURE_THUMB2EE,
+-    /* TLBs with 1-1 mapping to the physical address spaces. */
-     ARM_FEATURE_V7MP,    /* v7 Multiprocessing Extensions */
+-    ARMMMUIdx_Phys_NS   = 8 | ARM_MMU_IDX_A,
-     ARM_FEATURE_V7VE, /* v7 Virtualization Extensions (non-EL2 parts) */
+-    ARMMMUIdx_Phys_S    = 9 | ARM_MMU_IDX_A,
-     ARM_FEATURE_V4T,
+-
-@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_jazelle(const ARMISARegisters *id)
+     /*
-     return FIELD_EX32(id->id_isar1, ID_ISAR1, JAZELLE) != 0;
+      * Used for second stage of an S12 page table walk, or for descriptor
- }
+      * loads during first stage of an S1 page table walk.  Note that both
+      * are in use simultaneously for SecureEL2: the security state for
-+static inline bool isar_feature_t32ee(const ARMISARegisters *id)
+      * the S2 ptw is selected by the NS bit from the S1 ptw.
-+{
+      */
-+    return FIELD_EX32(id->id_isar3, ID_ISAR3, T32EE) != 0;
+-    ARMMMUIdx_Stage2    = 10 | ARM_MMU_IDX_A,
-+}
+-    ARMMMUIdx_Stage2_S  = 11 | ARM_MMU_IDX_A,
 +    ARMMMUIdx_Stage2_S  = 8 | ARM_MMU_IDX_A,
 +    ARMMMUIdx_Stage2    = 9 | ARM_MMU_IDX_A,
 +
- static inline bool isar_feature_aa32_aes(const ARMISARegisters *id)
++    /* TLBs with 1-1 mapping to the physical address spaces. */
- {
++    ARMMMUIdx_Phys_S    = 10 | ARM_MMU_IDX_A,
-     return FIELD_EX32(id->id_isar5, ID_ISAR5, AES) != 0;
++    ARMMMUIdx_Phys_NS   = 11 | ARM_MMU_IDX_A,
-diff --git a/linux-user/elfload.c b/linux-user/elfload.c
      /*
       * These are not allocated TLBs and are used only for AT system
 diff --git a/target/arm/ptw.c b/target/arm/ptw.c
 index XXXXXXX..XXXXXXX 100644
---- a/linux-user/elfload.c
+--- a/target/arm/ptw.c
-+++ b/linux-user/elfload.c
++++ b/target/arm/ptw.c
-@@ -XXX,XX +XXX,XX @@ static uint32_t get_elf_hwcap(void)
+@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
-     GET_FEATURE(ARM_FEATURE_V5, ARM_HWCAP_ARM_EDSP);
+     descaddr |= (address >> (stride * (4 - level))) & indexmask;
-     GET_FEATURE(ARM_FEATURE_VFP, ARM_HWCAP_ARM_VFP);
+     descaddr &= ~7ULL;
-     GET_FEATURE(ARM_FEATURE_IWMMXT, ARM_HWCAP_ARM_IWMMXT);
+     nstable = !regime_is_stage2(mmu_idx) && extract32(tableattrs, 4, 1);
--    GET_FEATURE(ARM_FEATURE_THUMB2EE, ARM_HWCAP_ARM_THUMBEE);
+-    if (nstable) {
-+    GET_FEATURE_ID(t32ee, ARM_HWCAP_ARM_THUMBEE);
++    if (nstable && ptw->in_secure) {
-     GET_FEATURE(ARM_FEATURE_NEON, ARM_HWCAP_ARM_NEON);
+         /*
-     GET_FEATURE(ARM_FEATURE_VFP3, ARM_HWCAP_ARM_VFPv3);
+          * Stage2_S -> Stage2 or Phys_S -> Phys_NS
-     GET_FEATURE(ARM_FEATURE_V6K, ARM_HWCAP_ARM_TLS);
+-         * Assert that the non-secure idx are even, and relative order.
-diff --git a/target/arm/cpu.c b/target/arm/cpu.c
++         * Assert the relative order of the secure/non-secure indexes.
-index XXXXXXX..XXXXXXX 100644
+          */
---- a/target/arm/cpu.c
+-        QEMU_BUILD_BUG_ON((ARMMMUIdx_Phys_NS & 1) != 0);
-+++ b/target/arm/cpu.c
+-        QEMU_BUILD_BUG_ON((ARMMMUIdx_Stage2 & 1) != 0);
-@@ -XXX,XX +XXX,XX @@ static void cortex_a8_initfn(Object *obj)
+-        QEMU_BUILD_BUG_ON(ARMMMUIdx_Phys_NS + 1 != ARMMMUIdx_Phys_S);
-     set_feature(&cpu->env, ARM_FEATURE_V7);
+-        QEMU_BUILD_BUG_ON(ARMMMUIdx_Stage2 + 1 != ARMMMUIdx_Stage2_S);
-     set_feature(&cpu->env, ARM_FEATURE_VFP3);
+-        ptw->in_ptw_idx &= ~1;
-     set_feature(&cpu->env, ARM_FEATURE_NEON);
++        QEMU_BUILD_BUG_ON(ARMMMUIdx_Phys_S + 1 != ARMMMUIdx_Phys_NS);
--    set_feature(&cpu->env, ARM_FEATURE_THUMB2EE);
++        QEMU_BUILD_BUG_ON(ARMMMUIdx_Stage2_S + 1 != ARMMMUIdx_Stage2);
-     set_feature(&cpu->env, ARM_FEATURE_DUMMY_C15_REGS);
++        ptw->in_ptw_idx += 1;
-     set_feature(&cpu->env, ARM_FEATURE_EL3);
+         ptw->in_secure = false;
      cpu->midr = 0x410fc080;
@@ -XXX,XX +XXX,XX @@ static void cortex_a9_initfn(Object *obj)
      set_feature(&cpu->env, ARM_FEATURE_VFP3);
      set_feature(&cpu->env, ARM_FEATURE_VFP_FP16);
      set_feature(&cpu->env, ARM_FEATURE_NEON);
 -    set_feature(&cpu->env, ARM_FEATURE_THUMB2EE);
      set_feature(&cpu->env, ARM_FEATURE_EL3);
      /* Note that A9 supports the MP extensions even for
       * A9UP and single-core A9MP (which are both different
@@ -XXX,XX +XXX,XX @@ static void cortex_a7_initfn(Object *obj)
      set_feature(&cpu->env, ARM_FEATURE_V7VE);
      set_feature(&cpu->env, ARM_FEATURE_VFP4);
      set_feature(&cpu->env, ARM_FEATURE_NEON);
 -    set_feature(&cpu->env, ARM_FEATURE_THUMB2EE);
      set_feature(&cpu->env, ARM_FEATURE_GENERIC_TIMER);
      set_feature(&cpu->env, ARM_FEATURE_DUMMY_C15_REGS);
      set_feature(&cpu->env, ARM_FEATURE_CBAR_RO);
@@ -XXX,XX +XXX,XX @@ static void cortex_a15_initfn(Object *obj)
      set_feature(&cpu->env, ARM_FEATURE_V7VE);
      set_feature(&cpu->env, ARM_FEATURE_VFP4);
      set_feature(&cpu->env, ARM_FEATURE_NEON);
 -    set_feature(&cpu->env, ARM_FEATURE_THUMB2EE);
      set_feature(&cpu->env, ARM_FEATURE_GENERIC_TIMER);
      set_feature(&cpu->env, ARM_FEATURE_DUMMY_C15_REGS);
      set_feature(&cpu->env, ARM_FEATURE_CBAR_RO);
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.c
 +++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
          define_arm_cp_regs(cpu, vmsa_pmsa_cp_reginfo);
          define_arm_cp_regs(cpu, vmsa_cp_reginfo);
      }
--    if (arm_feature(env, ARM_FEATURE_THUMB2EE)) {
+     if (!S1_ptw_translate(env, ptw, descaddr, fi)) {
 +    if (cpu_isar_feature(t32ee, cpu)) {
          define_arm_cp_regs(cpu, t2ee_cp_reginfo);
      }
      if (arm_feature(env, ARM_FEATURE_GENERIC_TIMER)) {
 diff --git a/target/arm/machine.c b/target/arm/machine.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/machine.c
 +++ b/target/arm/machine.c
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_m = {
  static bool thumb2ee_needed(void *opaque)
  {
      ARMCPU *cpu = opaque;
 -    CPUARMState *env = &cpu->env;
 -    return arm_feature(env, ARM_FEATURE_THUMB2EE);
 +    return cpu_isar_feature(t32ee, cpu);
  }
  static const VMStateDescription vmstate_thumb2ee = {
 --
-.19.1
+.34.1
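
The reordering in the cpu.h and ptw.c hunks above can be checked in isolation. In the old layout the non-secure indexes were even and each secure index was the non-secure one plus 1, so "&= ~1" turned a secure index into its non-secure partner and left a non-secure index alone. In the new layout each non-secure index is the secure one plus 1, so the step becomes "+= 1" and has to be guarded by the in_secure test. The sketch below is a standalone illustration with hypothetical DEMO_* names; the numeric values are copied from the new enum with the ARM_MMU_IDX_A tag omitted.

```c
#include <assert.h>

/* New-series index values, without the ARM_MMU_IDX_A tag bits. */
enum {
    DEMO_IDX_STAGE2_S = 8,
    DEMO_IDX_STAGE2   = 9,
    DEMO_IDX_PHYS_S   = 10,
    DEMO_IDX_PHYS_NS  = 11,
};

/* Same relative-order checks as the QEMU_BUILD_BUG_ON lines in the hunk. */
_Static_assert(DEMO_IDX_PHYS_S + 1 == DEMO_IDX_PHYS_NS, "Phys_S precedes Phys_NS");
_Static_assert(DEMO_IDX_STAGE2_S + 1 == DEMO_IDX_STAGE2, "Stage2_S precedes Stage2");

/*
 * Mirror of the nstable step: only a walk that is currently secure
 * is demoted to the matching non-secure index.
 */
static int demo_next_ptw_idx(int idx, int nstable, int *in_secure)
{
    if (nstable && *in_secure) {
        /* Precondition of this sketch: a secure walk uses a secure index. */
        assert(idx == DEMO_IDX_STAGE2_S || idx == DEMO_IDX_PHYS_S);
        idx += 1;
        *in_secure = 0;
    }
    return idx;
}
```

Calling demo_next_ptw_idx() a second time on the result is a no-op, because in_secure has already been cleared; that is the property the old code obtained from masking the low bit.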

-[Qemu-devel] [PULL 06/45] target/arm: Convert division from feature bits to isar0 tests
+[PULL 08/26] target/arm: Introduce ARMMMUIdx_Phys_{Realm,Root}
 From: Richard Henderson <richard.henderson@linaro.org>
-Both arm and thumb2 division are controlled by the same ISAR field,
+With FEAT_RME, there are four physical address spaces.
-which takes care of the arm implies thumb case.  Having M imply
+For now, just define the symbols, and mention them in
-thumb2 division was wrong for cortex-m0, which is v6m and does not
+the same spots as the other Phys indexes in ptw.c.
 have thumb2 at all, much less thumb2 division.
-Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20181016223115.24100-5-richard.henderson@linaro.org
+Message-id: 20230620124418.805717-9-richard.henderson@linaro.org
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/cpu.h       | 12 ++++++++++--
+ target/arm/cpu.h | 23 +++++++++++++++++++++--
- linux-user/elfload.c   |  4 ++--
+ target/arm/ptw.c | 10 ++++++++--
- target/arm/cpu.c       | 10 +---------
+files changed, 29 insertions(+), 4 deletions(-)
  target/arm/translate.c |  4 ++--
 files changed, 15 insertions(+), 15 deletions(-)
 diff --git a/target/arm/cpu.h b/target/arm/cpu.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu.h
 +++ b/target/arm/cpu.h
-@@ -XXX,XX +XXX,XX @@ enum arm_features {
+@@ -XXX,XX +XXX,XX @@ typedef enum ARMMMUIdx {
-     ARM_FEATURE_VFP3,
+     ARMMMUIdx_Stage2    = 9 | ARM_MMU_IDX_A,
-     ARM_FEATURE_VFP_FP16,
-     ARM_FEATURE_NEON,
+     /* TLBs with 1-1 mapping to the physical address spaces. */
--    ARM_FEATURE_THUMB_DIV, /* divide supported in Thumb encoding */
+-    ARMMMUIdx_Phys_S    = 10 | ARM_MMU_IDX_A,
-     ARM_FEATURE_M, /* Microcontroller profile.  */
+-    ARMMMUIdx_Phys_NS   = 11 | ARM_MMU_IDX_A,
-     ARM_FEATURE_OMAPCP, /* OMAP specific CP15 ops handling.  */
++    ARMMMUIdx_Phys_S     = 10 | ARM_MMU_IDX_A,
-     ARM_FEATURE_THUMB2EE,
++    ARMMMUIdx_Phys_NS    = 11 | ARM_MMU_IDX_A,
-@@ -XXX,XX +XXX,XX @@ enum arm_features {
++    ARMMMUIdx_Phys_Root  = 12 | ARM_MMU_IDX_A,
-     ARM_FEATURE_V5,
++    ARMMMUIdx_Phys_Realm = 13 | ARM_MMU_IDX_A,
-     ARM_FEATURE_STRONGARM,
-     ARM_FEATURE_VAPA, /* cp15 VA to PA lookups */
+     /*
--    ARM_FEATURE_ARM_DIV, /* divide supported in ARM encoding */
+      * These are not allocated TLBs and are used only for AT system
-     ARM_FEATURE_VFP4, /* VFPv4 (implies that NEON is v2) */
+@@ -XXX,XX +XXX,XX @@ typedef enum ARMASIdx {
-     ARM_FEATURE_GENERIC_TIMER,
+     ARMASIdx_TagS = 3,
-     ARM_FEATURE_MVFR, /* Media and VFP Feature Registers 0 and 1 */
+ } ARMASIdx;
-@@ -XXX,XX +XXX,XX @@ extern const uint64_t pred_esz_masks[4];
- /*
++static inline ARMMMUIdx arm_space_to_phys(ARMSecuritySpace space)
   * 32-bit feature tests via id registers.
   */
 +static inline bool isar_feature_thumb_div(const ARMISARegisters *id)
 +{
-+    return FIELD_EX32(id->id_isar0, ID_ISAR0, DIVIDE) != 0;
++    /* Assert the relative order of the physical mmu indexes. */
 +    QEMU_BUILD_BUG_ON(ARMSS_Secure != 0);
 +    QEMU_BUILD_BUG_ON(ARMMMUIdx_Phys_NS != ARMMMUIdx_Phys_S + ARMSS_NonSecure);
 +    QEMU_BUILD_BUG_ON(ARMMMUIdx_Phys_Root != ARMMMUIdx_Phys_S + ARMSS_Root);
 +    QEMU_BUILD_BUG_ON(ARMMMUIdx_Phys_Realm != ARMMMUIdx_Phys_S + ARMSS_Realm);
 +
 +    return ARMMMUIdx_Phys_S + space;
 +}
 +
-+static inline bool isar_feature_arm_div(const ARMISARegisters *id)
++static inline ARMSecuritySpace arm_phys_to_space(ARMMMUIdx idx)
 +{
-+    return FIELD_EX32(id->id_isar0, ID_ISAR0, DIVIDE) > 1;
++    assert(idx >= ARMMMUIdx_Phys_S && idx <= ARMMMUIdx_Phys_Realm);
 +    return idx - ARMMMUIdx_Phys_S;
 +}
 +
- static inline bool isar_feature_aa32_aes(const ARMISARegisters *id)
+ static inline bool arm_v7m_csselr_razwi(ARMCPU *cpu)
  {
-     return FIELD_EX32(id->id_isar5, ID_ISAR5, AES) != 0;
+     /* If all the CLIDR.Ctypem bits are 0 there are no caches, and
-diff --git a/linux-user/elfload.c b/linux-user/elfload.c
+diff --git a/target/arm/ptw.c b/target/arm/ptw.c
 index XXXXXXX..XXXXXXX 100644
---- a/linux-user/elfload.c
+--- a/target/arm/ptw.c
-+++ b/linux-user/elfload.c
++++ b/target/arm/ptw.c
-@@ -XXX,XX +XXX,XX @@ static uint32_t get_elf_hwcap(void)
+@@ -XXX,XX +XXX,XX @@ static bool regime_translation_disabled(CPUARMState *env, ARMMMUIdx mmu_idx,
-     GET_FEATURE(ARM_FEATURE_VFP3, ARM_HWCAP_ARM_VFPv3);
+     case ARMMMUIdx_E3:
-     GET_FEATURE(ARM_FEATURE_V6K, ARM_HWCAP_ARM_TLS);
+         break;
-     GET_FEATURE(ARM_FEATURE_VFP4, ARM_HWCAP_ARM_VFPv4);
--    GET_FEATURE(ARM_FEATURE_ARM_DIV, ARM_HWCAP_ARM_IDIVA);
+-    case ARMMMUIdx_Phys_NS:
--    GET_FEATURE(ARM_FEATURE_THUMB_DIV, ARM_HWCAP_ARM_IDIVT);
+     case ARMMMUIdx_Phys_S:
-+    GET_FEATURE_ID(arm_div, ARM_HWCAP_ARM_IDIVA);
++    case ARMMMUIdx_Phys_NS:
-+    GET_FEATURE_ID(thumb_div, ARM_HWCAP_ARM_IDIVT);
++    case ARMMMUIdx_Phys_Root:
-     /* All QEMU's VFPv3 CPUs have 32 registers, see VFP_DREG in translate.c.
++    case ARMMMUIdx_Phys_Realm:
-      * Note that the ARM_HWCAP_ARM_VFPv3D16 bit is always the inverse of
+         /* No translation for physical address spaces. */
-      * ARM_HWCAP_ARM_VFPD32 (and so always clear for QEMU); it is unrelated
+         return true;
-diff --git a/target/arm/cpu.c b/target/arm/cpu.c
-index XXXXXXX..XXXXXXX 100644
+@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_disabled(CPUARMState *env, target_ulong address,
---- a/target/arm/cpu.c
+     switch (mmu_idx) {
-+++ b/target/arm/cpu.c
+     case ARMMMUIdx_Stage2:
-@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
+     case ARMMMUIdx_Stage2_S:
-          * Presence of EL2 itself is ARM_FEATURE_EL2, and of the
+-    case ARMMMUIdx_Phys_NS:
-          * Security Extensions is ARM_FEATURE_EL3.
+     case ARMMMUIdx_Phys_S:
-          */
++    case ARMMMUIdx_Phys_NS:
--        set_feature(env, ARM_FEATURE_ARM_DIV);
++    case ARMMMUIdx_Phys_Root:
-+        assert(cpu_isar_feature(arm_div, cpu));
++    case ARMMMUIdx_Phys_Realm:
-         set_feature(env, ARM_FEATURE_LPAE);
+         break;
-         set_feature(env, ARM_FEATURE_V7);
-     }
+     default:
-@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
+@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_with_struct(CPUARMState *env, S1Translate *ptw,
-     if (arm_feature(env, ARM_FEATURE_V5)) {
+     switch (mmu_idx) {
-         set_feature(env, ARM_FEATURE_V4T);
+     case ARMMMUIdx_Phys_S:
-     }
+     case ARMMMUIdx_Phys_NS:
--    if (arm_feature(env, ARM_FEATURE_M)) {
++    case ARMMMUIdx_Phys_Root:
--        set_feature(env, ARM_FEATURE_THUMB_DIV);
++    case ARMMMUIdx_Phys_Realm:
--    }
+         /* Checking Phys early avoids special casing later vs regime_el. */
--    if (arm_feature(env, ARM_FEATURE_ARM_DIV)) {
+         return get_phys_addr_disabled(env, address, access_type, mmu_idx,
--        set_feature(env, ARM_FEATURE_THUMB_DIV);
+                                       is_secure, result, fi);
 -    }
      if (arm_feature(env, ARM_FEATURE_VFP4)) {
          set_feature(env, ARM_FEATURE_VFP3);
          set_feature(env, ARM_FEATURE_VFP_FP16);
@@ -XXX,XX +XXX,XX @@ static void cortex_r5_initfn(Object *obj)
      ARMCPU *cpu = ARM_CPU(obj);
      set_feature(&cpu->env, ARM_FEATURE_V7);
 -    set_feature(&cpu->env, ARM_FEATURE_THUMB_DIV);
 -    set_feature(&cpu->env, ARM_FEATURE_ARM_DIV);
      set_feature(&cpu->env, ARM_FEATURE_V7MP);
      set_feature(&cpu->env, ARM_FEATURE_PMSA);
      cpu->midr = 0x411fc153; /* r1p3 */
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
                      case 1:
                      case 3:
                          /* SDIV, UDIV */
 -                        if (!arm_dc_feature(s, ARM_FEATURE_ARM_DIV)) {
 +                        if (!dc_isar_feature(arm_div, s)) {
                              goto illegal_op;
                          }
                          if (((insn >> 5) & 7) || (rd != 15)) {
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
              tmp2 = load_reg(s, rm);
              if ((op & 0x50) == 0x10) {
                  /* sdiv, udiv */
 -                if (!arm_dc_feature(s, ARM_FEATURE_THUMB_DIV)) {
 +                if (!dc_isar_feature(thumb_div, s)) {
                      goto illegal_op;
                  }
                  if (op & 0x20)
 --
-.19.1
+.34.1
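
Note on the patch above: the new arm_space_to_phys() and arm_phys_to_space() helpers depend on the four physical MMU indexes being numbered in the same order as the ARMSecuritySpace values, which is what the QEMU_BUILD_BUG_ON lines check. A minimal standalone sketch of that idea follows. The type and constant names are shortened stand-ins rather than the QEMU definitions, the ARM_MMU_IDX_A flag bits are left out, and C11 _Static_assert is used in place of QEMU_BUILD_BUG_ON.

```c
#include <assert.h>

/* Stand-in for ARMSecuritySpace; the values follow from the build-time checks in the patch. */
typedef enum { SS_SECURE = 0, SS_NONSECURE = 1, SS_ROOT = 2, SS_REALM = 3 } Space;

/* Stand-in for the physical ARMMMUIdx entries (flag bits omitted). */
typedef enum { PHYS_S = 10, PHYS_NS = 11, PHYS_ROOT = 12, PHYS_REALM = 13 } PhysIdx;

/* Fail the build if the two enumerations ever fall out of step. */
_Static_assert(SS_SECURE == 0, "Secure must be the base space");
_Static_assert(PHYS_NS == PHYS_S + SS_NONSECURE, "NonSecure index out of order");
_Static_assert(PHYS_ROOT == PHYS_S + SS_ROOT, "Root index out of order");
_Static_assert(PHYS_REALM == PHYS_S + SS_REALM, "Realm index out of order");

/* Map a security space to its physical index by plain offset. */
static inline PhysIdx space_to_phys(Space s)
{
    return (PhysIdx)((int)PHYS_S + (int)s);
}

/* Inverse mapping; only valid for the four physical indexes. */
static inline Space phys_to_space(PhysIdx idx)
{
    assert(idx >= PHYS_S && idx <= PHYS_REALM);
    return (Space)((int)idx - (int)PHYS_S);
}
```

Because the ordering is checked at compile time, both conversions reduce to an addition or subtraction with no lookup table.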

-[Qemu-devel] [PULL 39/45] target/arm: Reorg NEON VLD/VST all elements
+[PULL 09/26] target/arm: Remove __attribute__((nonnull)) from ptw.c
 From: Richard Henderson <richard.henderson@linaro.org>
-Instead of shifts and masks, use direct loads and stores from the neon
+This was added in 7e98e21c098 as part of a reorg in which
-register file.  Mirror the iteration structure of the ARM pseudocode
+one of the arguments had been legally NULL, and this caught
-more closely.  Correct the parameters of the VLD2 A2 insn.
+actual instances.  Now that the reorg is complete, this
 serves little purpose.
-Note that this includes a bugfix for handling of the insn
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-"VLD2 (multiple 2-element structures)" -- we were using an
+Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
 incorrect stride value.
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20181011205206.3552-19-richard.henderson@linaro.org
+Message-id: 20230620124418.805717-10-richard.henderson@linaro.org
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate.c | 170 ++++++++++++++++++-----------------------
+ target/arm/ptw.c | 6 ++----
-file changed, 74 insertions(+), 96 deletions(-)
+file changed, 2 insertions(+), 4 deletions(-)
-diff --git a/target/arm/translate.c b/target/arm/translate.c
+diff --git a/target/arm/ptw.c b/target/arm/ptw.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.c
+--- a/target/arm/ptw.c
-+++ b/target/arm/translate.c
++++ b/target/arm/ptw.c
-@@ -XXX,XX +XXX,XX @@ static TCGv_i32 neon_load_reg(int reg, int pass)
+@@ -XXX,XX +XXX,XX @@ typedef struct S1Translate {
-     return tmp;
+ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
- }
+                                uint64_t address,
+                                MMUAccessType access_type, bool s1_is_el0,
-+static void neon_load_element64(TCGv_i64 var, int reg, int ele, TCGMemOp mop)
+-                               GetPhysAddrResult *result, ARMMMUFaultInfo *fi)
-+{
+-    __attribute__((nonnull));
-+    long offset = neon_element_offset(reg, ele, mop & MO_SIZE);
++                               GetPhysAddrResult *result, ARMMMUFaultInfo *fi);
-+
-+    switch (mop) {
+ static bool get_phys_addr_with_struct(CPUARMState *env, S1Translate *ptw,
-+    case MO_UB:
+                                       target_ulong address,
-+        tcg_gen_ld8u_i64(var, cpu_env, offset);
+                                       MMUAccessType access_type,
-+        break;
+                                       GetPhysAddrResult *result,
-+    case MO_UW:
+-                                      ARMMMUFaultInfo *fi)
-+        tcg_gen_ld16u_i64(var, cpu_env, offset);
+-    __attribute__((nonnull));
-+        break;
++                                      ARMMMUFaultInfo *fi);
-+    case MO_UL:
-+        tcg_gen_ld32u_i64(var, cpu_env, offset);
+ /* This mapping is common between ID_AA64MMFR0.PARANGE and TCR_ELx.{I}PS. */
-+        break;
+ static const uint8_t pamax_map[] = {
 +    case MO_Q:
 +        tcg_gen_ld_i64(var, cpu_env, offset);
 +        break;
 +    default:
 +        g_assert_not_reached();
 +    }
 +}
 +
  static void neon_store_reg(int reg, int pass, TCGv_i32 var)
  {
      tcg_gen_st_i32(var, cpu_env, neon_reg_offset(reg, pass));
      tcg_temp_free_i32(var);
  }
 +static void neon_store_element64(int reg, int ele, TCGMemOp size, TCGv_i64 var)
 +{
 +    long offset = neon_element_offset(reg, ele, size);
 +
 +    switch (size) {
 +    case MO_8:
 +        tcg_gen_st8_i64(var, cpu_env, offset);
 +        break;
 +    case MO_16:
 +        tcg_gen_st16_i64(var, cpu_env, offset);
 +        break;
 +    case MO_32:
 +        tcg_gen_st32_i64(var, cpu_env, offset);
 +        break;
 +    case MO_64:
 +        tcg_gen_st_i64(var, cpu_env, offset);
 +        break;
 +    default:
 +        g_assert_not_reached();
 +    }
 +}
 +
  static inline void neon_load_reg64(TCGv_i64 var, int reg)
  {
      tcg_gen_ld_i64(var, cpu_env, vfp_reg_offset(1, reg));
@@ -XXX,XX +XXX,XX @@ static struct {
      int interleave;
      int spacing;
  } const neon_ls_element_type[11] = {
 -    {4, 4, 1},
 -    {4, 4, 2},
 +    {1, 4, 1},
 +    {1, 4, 2},
      {4, 1, 1},
 -    {4, 2, 1},
 -    {3, 3, 1},
 -    {3, 3, 2},
 +    {2, 2, 2},
 +    {1, 3, 1},
 +    {1, 3, 2},
      {3, 1, 1},
      {1, 1, 1},
 -    {2, 2, 1},
 -    {2, 2, 2},
 +    {1, 2, 1},
 +    {1, 2, 2},
      {2, 1, 1}
  };
@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
      int shift;
      int n;
      int vec_size;
 +    int mmu_idx;
 +    TCGMemOp endian;
      TCGv_i32 addr;
      TCGv_i32 tmp;
      TCGv_i32 tmp2;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
      rn = (insn >> 16) & 0xf;
      rm = insn & 0xf;
      load = (insn & (1 << 21)) != 0;
 +    endian = s->be_data;
 +    mmu_idx = get_mem_index(s);
      if ((insn & (1 << 23)) == 0) {
          /* Load store all elements.  */
          op = (insn >> 8) & 0xf;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
          nregs = neon_ls_element_type[op].nregs;
          interleave = neon_ls_element_type[op].interleave;
          spacing = neon_ls_element_type[op].spacing;
 -        if (size == 3 && (interleave | spacing) != 1)
 +        if (size == 3 && (interleave | spacing) != 1) {
              return 1;
 +        }
 +        tmp64 = tcg_temp_new_i64();
          addr = tcg_temp_new_i32();
 +        tmp2 = tcg_const_i32(1 << size);
          load_reg_var(s, addr, rn);
 -        stride = (1 << size) * interleave;
          for (reg = 0; reg < nregs; reg++) {
 -            if (interleave > 2 || (interleave == 2 && nregs == 2)) {
 -                load_reg_var(s, addr, rn);
 -                tcg_gen_addi_i32(addr, addr, (1 << size) * reg);
 -            } else if (interleave == 2 && nregs == 4 && reg == 2) {
 -                load_reg_var(s, addr, rn);
 -                tcg_gen_addi_i32(addr, addr, 1 << size);
 -            }
 -            if (size == 3) {
 -                tmp64 = tcg_temp_new_i64();
 -                if (load) {
 -                    gen_aa32_ld64(s, tmp64, addr, get_mem_index(s));
 -                    neon_store_reg64(tmp64, rd);
 -                } else {
 -                    neon_load_reg64(tmp64, rd);
 -                    gen_aa32_st64(s, tmp64, addr, get_mem_index(s));
 -                }
 -                tcg_temp_free_i64(tmp64);
 -                tcg_gen_addi_i32(addr, addr, stride);
 -            } else {
 -                for (pass = 0; pass < 2; pass++) {
 -                    if (size == 2) {
 -                        if (load) {
 -                            tmp = tcg_temp_new_i32();
 -                            gen_aa32_ld32u(s, tmp, addr, get_mem_index(s));
 -                            neon_store_reg(rd, pass, tmp);
 -                        } else {
 -                            tmp = neon_load_reg(rd, pass);
 -                            gen_aa32_st32(s, tmp, addr, get_mem_index(s));
 -                            tcg_temp_free_i32(tmp);
 -                        }
 -                        tcg_gen_addi_i32(addr, addr, stride);
 -                    } else if (size == 1) {
 -                        if (load) {
 -                            tmp = tcg_temp_new_i32();
 -                            gen_aa32_ld16u(s, tmp, addr, get_mem_index(s));
 -                            tcg_gen_addi_i32(addr, addr, stride);
 -                            tmp2 = tcg_temp_new_i32();
 -                            gen_aa32_ld16u(s, tmp2, addr, get_mem_index(s));
 -                            tcg_gen_addi_i32(addr, addr, stride);
 -                            tcg_gen_shli_i32(tmp2, tmp2, 16);
 -                            tcg_gen_or_i32(tmp, tmp, tmp2);
 -                            tcg_temp_free_i32(tmp2);
 -                            neon_store_reg(rd, pass, tmp);
 -                        } else {
 -                            tmp = neon_load_reg(rd, pass);
 -                            tmp2 = tcg_temp_new_i32();
 -                            tcg_gen_shri_i32(tmp2, tmp, 16);
 -                            gen_aa32_st16(s, tmp, addr, get_mem_index(s));
 -                            tcg_temp_free_i32(tmp);
 -                            tcg_gen_addi_i32(addr, addr, stride);
 -                            gen_aa32_st16(s, tmp2, addr, get_mem_index(s));
 -                            tcg_temp_free_i32(tmp2);
 -                            tcg_gen_addi_i32(addr, addr, stride);
 -                        }
 -                    } else /* size == 0 */ {
 -                        if (load) {
 -                            tmp2 = NULL;
 -                            for (n = 0; n < 4; n++) {
 -                                tmp = tcg_temp_new_i32();
 -                                gen_aa32_ld8u(s, tmp, addr, get_mem_index(s));
 -                                tcg_gen_addi_i32(addr, addr, stride);
 -                                if (n == 0) {
 -                                    tmp2 = tmp;
 -                                } else {
 -                                    tcg_gen_shli_i32(tmp, tmp, n * 8);
 -                                    tcg_gen_or_i32(tmp2, tmp2, tmp);
 -                                    tcg_temp_free_i32(tmp);
 -                                }
 -                            }
 -                            neon_store_reg(rd, pass, tmp2);
 -                        } else {
 -                            tmp2 = neon_load_reg(rd, pass);
 -                            for (n = 0; n < 4; n++) {
 -                                tmp = tcg_temp_new_i32();
 -                                if (n == 0) {
 -                                    tcg_gen_mov_i32(tmp, tmp2);
 -                                } else {
 -                                    tcg_gen_shri_i32(tmp, tmp2, n * 8);
 -                                }
 -                                gen_aa32_st8(s, tmp, addr, get_mem_index(s));
 -                                tcg_temp_free_i32(tmp);
 -                                tcg_gen_addi_i32(addr, addr, stride);
 -                            }
 -                            tcg_temp_free_i32(tmp2);
 -                        }
 +            for (n = 0; n < 8 >> size; n++) {
 +                int xs;
 +                for (xs = 0; xs < interleave; xs++) {
 +                    int tt = rd + reg + spacing * xs;
 +
 +                    if (load) {
 +                        gen_aa32_ld_i64(s, tmp64, addr, mmu_idx, endian | size);
 +                        neon_store_element64(tt, n, size, tmp64);
 +                    } else {
 +                        neon_load_element64(tmp64, tt, n, size);
 +                        gen_aa32_st_i64(s, tmp64, addr, mmu_idx, endian | size);
                      }
 +                    tcg_gen_add_i32(addr, addr, tmp2);
                  }
              }
 -            rd += spacing;
          }
          tcg_temp_free_i32(addr);
 -        stride = nregs * 8;
 +        tcg_temp_free_i32(tmp2);
 +        tcg_temp_free_i64(tmp64);
 +        stride = nregs * interleave * 8;
      } else {
          size = (insn >> 10) & 3;
          if (size == 3) {
 --
-.19.1
+.34.1
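
Note on the patch above: for readers unfamiliar with the attribute being removed, GCC and Clang accept __attribute__((nonnull)) on a function declaration; with no argument list it applies to every pointer parameter, and the compiler can then diagnose call sites that pass a literal NULL. The sketch below uses a hypothetical helper, first_or_default(), which is not part of ptw.c and only illustrates the declaration form.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not from ptw.c: every pointer parameter is
 * declared non-NULL, mirroring the form removed by the patch.
 */
static int first_or_default(const int *values, size_t count,
                            const int *fallback)
    __attribute__((nonnull));

static int first_or_default(const int *values, size_t count,
                            const int *fallback)
{
    /* Callers promise neither pointer is NULL, so no checks are made here. */
    return count > 0 ? values[0] : *fallback;
}
```

A call such as first_or_default(NULL, 0, &d) is the kind of site the attribute is meant to flag at compile time; whether a warning is actually emitted depends on the compiler and its warning flags.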

-[Qemu-devel] [PULL 23/45] target/arm: Don't call tcg_clear_temp_count
+[PULL 10/26] target/arm: Pipe ARMSecuritySpace through ptw.c
 From: Richard Henderson <richard.henderson@linaro.org>
-This is done generically in translator_loop.
+Add input and output space members to S1Translate.  Set and adjust
 them in S1_ptw_translate, and the various points at which we drop
 secure state.  Initialize the space in get_phys_addr; for now leave
 get_phys_addr_with_secure considering only secure vs non-secure spaces.
-Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com>
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+Message-id: 20230620124418.805717-11-richard.henderson@linaro.org
 Message-id: 20181011205206.3552-3-richard.henderson@linaro.org
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate-a64.c | 1 -
+ target/arm/ptw.c | 86 +++++++++++++++++++++++++++++++++++++++---------
- target/arm/translate.c     | 1 -
+file changed, 71 insertions(+), 15 deletions(-)
 files changed, 2 deletions(-)
-diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
+diff --git a/target/arm/ptw.c b/target/arm/ptw.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-a64.c
+--- a/target/arm/ptw.c
-+++ b/target/arm/translate-a64.c
++++ b/target/arm/ptw.c
-@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_init_disas_context(DisasContextBase *dcbase,
+@@ -XXX,XX +XXX,XX @@
+ typedef struct S1Translate {
- static void aarch64_tr_tb_start(DisasContextBase *db, CPUState *cpu)
+     ARMMMUIdx in_mmu_idx;
      ARMMMUIdx in_ptw_idx;
 +    ARMSecuritySpace in_space;
      bool in_secure;
      bool in_debug;
      bool out_secure;
      bool out_rw;
      bool out_be;
 +    ARMSecuritySpace out_space;
      hwaddr out_virt;
      hwaddr out_phys;
      void *out_host;
@@ -XXX,XX +XXX,XX @@ static bool S2_attrs_are_device(uint64_t hcr, uint8_t attrs)
  static bool S1_ptw_translate(CPUARMState *env, S1Translate *ptw,
                               hwaddr addr, ARMMMUFaultInfo *fi)
  {
--    tcg_clear_temp_count();
++    ARMSecuritySpace space = ptw->in_space;
      bool is_secure = ptw->in_secure;
      ARMMMUIdx mmu_idx = ptw->in_mmu_idx;
      ARMMMUIdx s2_mmu_idx = ptw->in_ptw_idx;
@@ -XXX,XX +XXX,XX @@ static bool S1_ptw_translate(CPUARMState *env, S1Translate *ptw,
                  .in_mmu_idx = s2_mmu_idx,
                  .in_ptw_idx = ptw_idx_for_stage_2(env, s2_mmu_idx),
                  .in_secure = s2_mmu_idx == ARMMMUIdx_Stage2_S,
 +                .in_space = (s2_mmu_idx == ARMMMUIdx_Stage2_S ? ARMSS_Secure
 +                             : space == ARMSS_Realm ? ARMSS_Realm
 +                             : ARMSS_NonSecure),
                  .in_debug = true,
              };
              GetPhysAddrResult s2 = { };
@@ -XXX,XX +XXX,XX @@ static bool S1_ptw_translate(CPUARMState *env, S1Translate *ptw,
              ptw->out_phys = s2.f.phys_addr;
              pte_attrs = s2.cacheattrs.attrs;
              ptw->out_secure = s2.f.attrs.secure;
 +            ptw->out_space = s2.f.attrs.space;
          } else {
              /* Regime is physical. */
              ptw->out_phys = addr;
              pte_attrs = 0;
              ptw->out_secure = s2_mmu_idx == ARMMMUIdx_Phys_S;
 +            ptw->out_space = (s2_mmu_idx == ARMMMUIdx_Phys_S ? ARMSS_Secure
 +                              : space == ARMSS_Realm ? ARMSS_Realm
 +                              : ARMSS_NonSecure);
          }
          ptw->out_host = NULL;
          ptw->out_rw = false;
@@ -XXX,XX +XXX,XX @@ static bool S1_ptw_translate(CPUARMState *env, S1Translate *ptw,
          ptw->out_rw = full->prot & PAGE_WRITE;
          pte_attrs = full->pte_attrs;
          ptw->out_secure = full->attrs.secure;
 +        ptw->out_space = full->attrs.space;
  #else
          g_assert_not_reached();
  #endif
@@ -XXX,XX +XXX,XX @@ static uint32_t arm_ldl_ptw(CPUARMState *env, S1Translate *ptw,
          }
      } else {
          /* Page tables are in MMIO. */
 -        MemTxAttrs attrs = { .secure = ptw->out_secure };
 +        MemTxAttrs attrs = {
 +            .secure = ptw->out_secure,
 +            .space = ptw->out_space,
 +        };
          AddressSpace *as = arm_addressspace(cs, attrs);
          MemTxResult result = MEMTX_OK;
@@ -XXX,XX +XXX,XX @@ static uint64_t arm_ldq_ptw(CPUARMState *env, S1Translate *ptw,
  #endif
      } else {
          /* Page tables are in MMIO. */
 -        MemTxAttrs attrs = { .secure = ptw->out_secure };
 +        MemTxAttrs attrs = {
 +            .secure = ptw->out_secure,
 +            .space = ptw->out_space,
 +        };
          AddressSpace *as = arm_addressspace(cs, attrs);
          MemTxResult result = MEMTX_OK;
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_v6(CPUARMState *env, S1Translate *ptw,
           * regime, because the attribute will already be non-secure.
           */
          result->f.attrs.secure = false;
 +        result->f.attrs.space = ARMSS_NonSecure;
      }
      result->f.phys_addr = phys_addr;
      return false;
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
           * regime, because the attribute will already be non-secure.
           */
          result->f.attrs.secure = false;
 +        result->f.attrs.space = ARMSS_NonSecure;
      }
      if (regime_is_stage2(mmu_idx)) {
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_pmsav8(CPUARMState *env, uint32_t address,
               */
              if (sattrs.ns) {
                  result->f.attrs.secure = false;
 +                result->f.attrs.space = ARMSS_NonSecure;
              } else if (!secure) {
                  /*
                   * NS access to S memory must fault.
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_twostage(CPUARMState *env, S1Translate *ptw,
      bool is_secure = ptw->in_secure;
      bool ret, ipa_secure;
      ARMCacheAttrs cacheattrs1;
 +    ARMSecuritySpace ipa_space;
      bool is_el0;
      uint64_t hcr;
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_twostage(CPUARMState *env, S1Translate *ptw,
      ipa = result->f.phys_addr;
      ipa_secure = result->f.attrs.secure;
 +    ipa_space = result->f.attrs.space;
      is_el0 = ptw->in_mmu_idx == ARMMMUIdx_Stage1_E0;
      ptw->in_mmu_idx = ipa_secure ? ARMMMUIdx_Stage2_S : ARMMMUIdx_Stage2;
      ptw->in_secure = ipa_secure;
 +    ptw->in_space = ipa_space;
      ptw->in_ptw_idx = ptw_idx_for_stage_2(env, ptw->in_mmu_idx);
      /*
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_with_struct(CPUARMState *env, S1Translate *ptw,
      ARMMMUIdx s1_mmu_idx;
      /*
 -     * The page table entries may downgrade secure to non-secure, but
 -     * cannot upgrade an non-secure translation regime's attributes
 -     * to secure.
 +     * The page table entries may downgrade Secure to NonSecure, but
 +     * cannot upgrade a NonSecure translation regime's attributes
 +     * to Secure or Realm.
       */
      result->f.attrs.secure = is_secure;
 +    result->f.attrs.space = ptw->in_space;
      switch (mmu_idx) {
      case ARMMMUIdx_Phys_S:
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_with_struct(CPUARMState *env, S1Translate *ptw,
      default:
          /* Single stage uses physical for ptw. */
 -        ptw->in_ptw_idx = is_secure ? ARMMMUIdx_Phys_S : ARMMMUIdx_Phys_NS;
 +        ptw->in_ptw_idx = arm_space_to_phys(ptw->in_space);
          break;
      }
@@ -XXX,XX +XXX,XX @@ bool get_phys_addr_with_secure(CPUARMState *env, target_ulong address,
      S1Translate ptw = {
          .in_mmu_idx = mmu_idx,
          .in_secure = is_secure,
 +        .in_space = arm_secure_to_space(is_secure),
      };
      return get_phys_addr_with_struct(env, &ptw, address, access_type,
                                       result, fi);
@@ -XXX,XX +XXX,XX @@ bool get_phys_addr(CPUARMState *env, target_ulong address,
                     MMUAccessType access_type, ARMMMUIdx mmu_idx,
                     GetPhysAddrResult *result, ARMMMUFaultInfo *fi)
  {
 -    bool is_secure;
 +    S1Translate ptw = {
 +        .in_mmu_idx = mmu_idx,
 +    };
 +    ARMSecuritySpace ss;
      switch (mmu_idx) {
      case ARMMMUIdx_E10_0:
@@ -XXX,XX +XXX,XX @@ bool get_phys_addr(CPUARMState *env, target_ulong address,
      case ARMMMUIdx_Stage1_E1:
      case ARMMMUIdx_Stage1_E1_PAN:
      case ARMMMUIdx_E2:
 -        is_secure = arm_is_secure_below_el3(env);
 +        ss = arm_security_space_below_el3(env);
          break;
      case ARMMMUIdx_Stage2:
 +        /*
 +         * For Secure EL2, we need this index to be NonSecure;
 +         * otherwise this will already be NonSecure or Realm.
 +         */
 +        ss = arm_security_space_below_el3(env);
 +        if (ss == ARMSS_Secure) {
 +            ss = ARMSS_NonSecure;
 +        }
 +        break;
      case ARMMMUIdx_Phys_NS:
      case ARMMMUIdx_MPrivNegPri:
      case ARMMMUIdx_MUserNegPri:
      case ARMMMUIdx_MPriv:
      case ARMMMUIdx_MUser:
 -        is_secure = false;
 +        ss = ARMSS_NonSecure;
          break;
 -    case ARMMMUIdx_E3:
      case ARMMMUIdx_Stage2_S:
      case ARMMMUIdx_Phys_S:
      case ARMMMUIdx_MSPrivNegPri:
      case ARMMMUIdx_MSUserNegPri:
      case ARMMMUIdx_MSPriv:
      case ARMMMUIdx_MSUser:
 -        is_secure = true;
 +        ss = ARMSS_Secure;
 +        break;
 +    case ARMMMUIdx_E3:
 +        if (arm_feature(env, ARM_FEATURE_AARCH64) &&
 +            cpu_isar_feature(aa64_rme, env_archcpu(env))) {
 +            ss = ARMSS_Root;
 +        } else {
 +            ss = ARMSS_Secure;
 +        }
 +        break;
 +    case ARMMMUIdx_Phys_Root:
 +        ss = ARMSS_Root;
 +        break;
 +    case ARMMMUIdx_Phys_Realm:
 +        ss = ARMSS_Realm;
          break;
      default:
          g_assert_not_reached();
      }
 -    return get_phys_addr_with_secure(env, address, access_type, mmu_idx,
 -                                     is_secure, result, fi);
 +
 +    ptw.in_space = ss;
 +    ptw.in_secure = arm_space_is_secure(ss);
 +    return get_phys_addr_with_struct(env, &ptw, address, access_type,
 +                                     result, fi);
  }
- static void aarch64_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
+ hwaddr arm_cpu_get_phys_page_attrs_debug(CPUState *cs, vaddr addr,
-diff --git a/target/arm/translate.c b/target/arm/translate.c
+@@ -XXX,XX +XXX,XX @@ hwaddr arm_cpu_get_phys_page_attrs_debug(CPUState *cs, vaddr addr,
-index XXXXXXX..XXXXXXX 100644
+ {
---- a/target/arm/translate.c
+     ARMCPU *cpu = ARM_CPU(cs);
-+++ b/target/arm/translate.c
+     CPUARMState *env = &cpu->env;
-@@ -XXX,XX +XXX,XX @@ static void arm_tr_tb_start(DisasContextBase *dcbase, CPUState *cpu)
++    ARMMMUIdx mmu_idx = arm_mmu_idx(env);
-         tcg_gen_movi_i32(tmp, 0);
++    ARMSecuritySpace ss = arm_security_space(env);
-         store_cpu_field(tmp, condexec_bits);
+     S1Translate ptw = {
-     }
+-        .in_mmu_idx = arm_mmu_idx(env),
--    tcg_clear_temp_count();
+-        .in_secure = arm_is_secure(env),
- }
++        .in_mmu_idx = mmu_idx,
++        .in_space = ss,
- static void arm_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
++        .in_secure = arm_space_is_secure(ss),
          .in_debug = true,
      };
      GetPhysAddrResult res = {};
 --
-.19.1
+.34.1
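
Note on the patch above: the interesting cases in the reworked get_phys_addr() switch are ARMMMUIdx_E3 and ARMMMUIdx_Stage2. A simplified standalone model of those two decisions follows. The names are stand-ins, not the QEMU functions, and space_is_secure() reflects an assumption about arm_space_is_secure() (Secure and Root both count as secure), which the diff itself does not show.

```c
#include <stdbool.h>

/* Stand-in for ARMSecuritySpace. */
typedef enum { SS_SECURE = 0, SS_NONSECURE = 1, SS_ROOT = 2, SS_REALM = 3 } Space;

/* Assumed behaviour of arm_space_is_secure(): Secure and Root are secure. */
static inline bool space_is_secure(Space s)
{
    return s == SS_SECURE || s == SS_ROOT;
}

/* EL3 regime: Root when an AArch64 CPU implements RME, otherwise Secure. */
static inline Space el3_space(bool is_aarch64, bool has_rme)
{
    return (is_aarch64 && has_rme) ? SS_ROOT : SS_SECURE;
}

/*
 * Non-secure Stage2 index: under Secure EL2 the index must be NonSecure;
 * otherwise the space below EL3 is already NonSecure or Realm and is kept.
 */
static inline Space stage2_space(Space below_el3)
{
    return below_el3 == SS_SECURE ? SS_NONSECURE : below_el3;
}
```

In the patch the resulting space is stored in ptw.in_space, and ptw.in_secure is derived from it, so the legacy boolean stays consistent with the new field.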

-[Qemu-devel] [PULL 38/45] target/arm: Use gvec for NEON VLD all lanes
+[PULL 11/26] target/arm: NSTable is RES0 for the RME EL3 regime
 From: Richard Henderson <richard.henderson@linaro.org>
+Test in_space instead of in_secure so that we don't
+switch out of Root space.
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20181011205206.3552-18-richard.henderson@linaro.org
+Message-id: 20230620124418.805717-12-richard.henderson@linaro.org
 [PMM: added parens in ?: expression]
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate.c | 81 ++++++++++++++----------------------------
+ target/arm/ptw.c | 28 ++++++++++++++--------------
-file changed, 26 insertions(+), 55 deletions(-)
+file changed, 14 insertions(+), 14 deletions(-)
-diff --git a/target/arm/translate.c b/target/arm/translate.c
+diff --git a/target/arm/ptw.c b/target/arm/ptw.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.c
+--- a/target/arm/ptw.c
-+++ b/target/arm/translate.c
++++ b/target/arm/ptw.c
-@@ -XXX,XX +XXX,XX @@ static void gen_vfp_msr(TCGv_i32 tmp)
+@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
-     tcg_temp_free_i32(tmp);
+ {
- }
+     ARMCPU *cpu = env_archcpu(env);
+     ARMMMUIdx mmu_idx = ptw->in_mmu_idx;
--static void gen_neon_dup_u8(TCGv_i32 var, int shift)
+-    bool is_secure = ptw->in_secure;
--{
+     int32_t level;
--    TCGv_i32 tmp = tcg_temp_new_i32();
+     ARMVAParameters param;
--    if (shift)
+     uint64_t ttbr;
--        tcg_gen_shri_i32(var, var, shift);
+@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
--    tcg_gen_ext8u_i32(var, var);
+     uint64_t descaddrmask;
--    tcg_gen_shli_i32(tmp, var, 8);
+     bool aarch64 = arm_el_is_aa64(env, el);
--    tcg_gen_or_i32(var, var, tmp);
+     uint64_t descriptor, new_descriptor;
--    tcg_gen_shli_i32(tmp, var, 16);
+-    bool nstable;
--    tcg_gen_or_i32(var, var, tmp);
--    tcg_temp_free_i32(tmp);
+     /* TODO: This code does not support shareability levels. */
--}
+     if (aarch64) {
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
          descaddrmask = MAKE_64BIT_MASK(0, 40);
      }
      descaddrmask &= ~indexmask_grainsize;
 -
- static void gen_neon_dup_low16(TCGv_i32 var)
+-    /*
- {
+-     * Secure stage 1 accesses start with the page table in secure memory and
-     TCGv_i32 tmp = tcg_temp_new_i32();
+-     * can be downgraded to non-secure at any step. Non-secure accesses
-@@ -XXX,XX +XXX,XX @@ static void gen_neon_dup_high16(TCGv_i32 var)
+-     * remain non-secure. We implement this by just ORing in the NSTable/NS
-     tcg_temp_free_i32(tmp);
+-     * bits at each step.
- }
+-     * Stage 2 never gets this kind of downgrade.
+-     */
--static TCGv_i32 gen_load_and_replicate(DisasContext *s, TCGv_i32 addr, int size)
+-    tableattrs = is_secure ? 0 : (1 << 4);
--{
++    tableattrs = 0;
--    /* Load a single Neon element and replicate into a 32 bit TCG reg */
--    TCGv_i32 tmp = tcg_temp_new_i32();
+  next_level:
--    switch (size) {
+     descaddr |= (address >> (stride * (4 - level))) & indexmask;
--    case 0:
+     descaddr &= ~7ULL;
--        gen_aa32_ld8u(s, tmp, addr, get_mem_index(s));
+-    nstable = !regime_is_stage2(mmu_idx) && extract32(tableattrs, 4, 1);
--        gen_neon_dup_u8(tmp, 0);
+-    if (nstable && ptw->in_secure) {
 -        break;
 -    case 1:
 -        gen_aa32_ld16u(s, tmp, addr, get_mem_index(s));
 -        gen_neon_dup_low16(tmp);
 -        break;
 -    case 2:
 -        gen_aa32_ld32u(s, tmp, addr, get_mem_index(s));
 -        break;
 -    default: /* Avoid compiler warnings.  */
 -        abort();
 -    }
 -    return tmp;
 -}
 -
  static int handle_vsel(uint32_t insn, uint32_t rd, uint32_t rn, uint32_t rm,
                         uint32_t dp)
  {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
      int load;
      int shift;
      int n;
 +    int vec_size;
      TCGv_i32 addr;
      TCGv_i32 tmp;
      TCGv_i32 tmp2;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
              }
              addr = tcg_temp_new_i32();
              load_reg_var(s, addr, rn);
 -            if (nregs == 1) {
 -                /* VLD1 to all lanes: bit 5 indicates how many Dregs to write */
 -                tmp = gen_load_and_replicate(s, addr, size);
 -                tcg_gen_st_i32(tmp, cpu_env, neon_reg_offset(rd, 0));
 -                tcg_gen_st_i32(tmp, cpu_env, neon_reg_offset(rd, 1));
 -                if (insn & (1 << 5)) {
 -                    tcg_gen_st_i32(tmp, cpu_env, neon_reg_offset(rd + 1, 0));
 -                    tcg_gen_st_i32(tmp, cpu_env, neon_reg_offset(rd + 1, 1));
 -                }
 -                tcg_temp_free_i32(tmp);
 -            } else {
 -                /* VLD2/3/4 to all lanes: bit 5 indicates register stride */
 -                stride = (insn & (1 << 5)) ? 2 : 1;
 -                for (reg = 0; reg < nregs; reg++) {
 -                    tmp = gen_load_and_replicate(s, addr, size);
 -                    tcg_gen_st_i32(tmp, cpu_env, neon_reg_offset(rd, 0));
 -                    tcg_gen_st_i32(tmp, cpu_env, neon_reg_offset(rd, 1));
 -                    tcg_temp_free_i32(tmp);
 -                    tcg_gen_addi_i32(addr, addr, 1 << size);
 -                    rd += stride;
 +
-+            /* VLD1 to all lanes: bit 5 indicates how many Dregs to write.
++    /*
-+             * VLD2/3/4 to all lanes: bit 5 indicates register stride.
++     * Process the NSTable bit from the previous level.  This changes
-+             */
++     * the table address space and the output space from Secure to
-+            stride = (insn & (1 << 5)) ? 2 : 1;
++     * NonSecure.  With RME, the EL3 translation regime does not change
-+            vec_size = nregs == 1 ? stride * 8 : 8;
++     * from Root to NonSecure.
 +     */
 +    if (ptw->in_space == ARMSS_Secure
 +        && !regime_is_stage2(mmu_idx)
 +        && extract32(tableattrs, 4, 1)) {
          /*
           * Stage2_S -> Stage2 or Phys_S -> Phys_NS
           * Assert the relative order of the secure/non-secure indexes.
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
          QEMU_BUILD_BUG_ON(ARMMMUIdx_Stage2_S + 1 != ARMMMUIdx_Stage2);
          ptw->in_ptw_idx += 1;
          ptw->in_secure = false;
 +        ptw->in_space = ARMSS_NonSecure;
      }
 +
-+            tmp = tcg_temp_new_i32();
+     if (!S1_ptw_translate(env, ptw, descaddr, fi)) {
-+            for (reg = 0; reg < nregs; reg++) {
+         goto do_fault;
-+                gen_aa32_ld_i32(s, tmp, addr, get_mem_index(s),
+     }
-+                                s->be_data | size);
+@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
-+                if ((rd & 1) && vec_size == 16) {
+      */
-+                    /* We cannot write 16 bytes at once because the
+     attrs = new_descriptor & (MAKE_64BIT_MASK(2, 10) | MAKE_64BIT_MASK(50, 14));
-+                     * destination is unaligned.
+     if (!regime_is_stage2(mmu_idx)) {
-+                     */
+-        attrs |= nstable << 5; /* NS */
-+                    tcg_gen_gvec_dup_i32(size, neon_reg_offset(rd, 0),
++        attrs |= !ptw->in_secure << 5; /* NS */
-+                                         8, 8, tmp);
+         if (!param.hpd) {
-+                    tcg_gen_gvec_mov(0, neon_reg_offset(rd + 1, 0),
+             attrs |= extract64(tableattrs, 0, 2) << 53;     /* XN, PXN */
-+                                     neon_reg_offset(rd, 0), 8, 8);
+             /*
 +                } else {
 +                    tcg_gen_gvec_dup_i32(size, neon_reg_offset(rd, 0),
 +                                         vec_size, vec_size, tmp);
                  }
 +                tcg_gen_addi_i32(addr, addr, 1 << size);
 +                rd += stride;
              }
 +            tcg_temp_free_i32(tmp);
              tcg_temp_free_i32(addr);
              stride = (1 << size) * nregs;
          } else {
 --
-.19.1
+.34.1

-[Qemu-devel] [PULL 05/45] target/arm: Convert v8 extensions from feature bits to isar tests
+[PULL 12/26] target/arm: Handle Block and Page bits for security space
 From: Richard Henderson <richard.henderson@linaro.org>
-Most of the v8 extensions are self-contained within the ISAR
+With Realm security state, bit 55 of a block or page descriptor during
-registers and are not implied by other feature bits, which
+the stage2 walk becomes the NS bit; during the stage1 walk the bit 5
-makes them the easiest to convert.
+NS bit is RES0.  With Root security state, bit 11 of the block or page
 descriptor during the stage1 walk becomes the NSE bit.
-Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+Rather than collecting an NS bit and applying it later, compute the
 output pa space from the input pa space and unconditionally assign.
 This means that we no longer need to adjust the output space earlier
 for the NSTable bit.
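
 As a reading aid, the rule described above can be sketched in a few lines
 of standalone C. This is only an illustration under stated assumptions, not
 the code from ptw.c: the SecuritySpace enum and the helpers desc_bit(),
 s1_out_space() and s2_out_space() are hypothetical names, and the enum
 values are assumed to be ordered so that (NSE << 1) | NS selects a space
 directly. The Realm stage 1 case follows only the wording of this commit
 message (NS treated as RES0), and every stage 2 case other than Realm is
 left unchanged, which is a simplification.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a security-space enum; the values are an
 * assumption, chosen so that (NSE << 1) | NS maps onto them directly. */
typedef enum {
    SS_SECURE = 0,
    SS_NONSECURE = 1,
    SS_ROOT = 2,
    SS_REALM = 3,
} SecuritySpace;

/* Extract a single bit from a 64-bit block or page descriptor. */
static unsigned desc_bit(uint64_t desc, unsigned pos)
{
    return (unsigned)((desc >> pos) & 1);
}

/* Stage 1: derive the output space from the input space and descriptor. */
static SecuritySpace s1_out_space(SecuritySpace in, uint64_t desc)
{
    unsigned ns = desc_bit(desc, 5);

    switch (in) {
    case SS_ROOT:
        /* Bit 11 is NSE; NSE and NS together select the output space. */
        return (SecuritySpace)((desc_bit(desc, 11) << 1) | ns);
    case SS_SECURE:
        /* NS downgrades Secure to NonSecure. */
        return ns ? SS_NONSECURE : SS_SECURE;
    case SS_REALM:
        /* Per the commit message, bit 5 is RES0 here, so it is ignored. */
        return SS_REALM;
    default:
        /* NonSecure input stays NonSecure. */
        return SS_NONSECURE;
    }
}

/* Stage 2: with a Realm input space, bit 55 acts as the NS bit. */
static SecuritySpace s2_out_space(SecuritySpace in, uint64_t desc)
{
    if (in == SS_REALM && desc_bit(desc, 55)) {
        return SS_NONSECURE;
    }
    return in; /* simplification: other stage 2 cases are not modelled */
}

/* Small usage example exercising each branch of the sketch. */
static void security_space_self_check(void)
{
    assert(s1_out_space(SS_ROOT, 0) == SS_SECURE);
    assert(s1_out_space(SS_ROOT, 1ULL << 11) == SS_ROOT);
    assert(s1_out_space(SS_ROOT, (1ULL << 11) | (1ULL << 5)) == SS_REALM);
    assert(s1_out_space(SS_SECURE, 1ULL << 5) == SS_NONSECURE);
    assert(s1_out_space(SS_REALM, 1ULL << 5) == SS_REALM);
    assert(s2_out_space(SS_REALM, 1ULL << 55) == SS_NONSECURE);
}
```

 Computing the output space in one place, from the input space plus the
 descriptor bits, is what lets the result be assigned unconditionally
 instead of carrying a separate NS flag through the walk.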
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20181016223115.24100-4-richard.henderson@linaro.org
+Message-id: 20230620124418.805717-13-richard.henderson@linaro.org
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
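
 The removed side of this comparison (PULL 05/45) replaces ARM_FEATURE_V8_*
 flag tests with tests on ID register fields, for example treating a nonzero
 AES field as "AES present" and a value above 1 as "PMULL present as well".
 A minimal standalone sketch of that pattern follows. It is an illustration
 only: IsarRegs, field4(), feature_aes() and feature_pmull() are hypothetical
 names standing in for ARMISARegisters, FIELD_EX32() and the
 isar_feature_aa32_*() helpers, and the AES field is assumed to occupy bits
 [7:4] of ID_ISAR5.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cut-down register file holding a single ID register. */
typedef struct {
    uint32_t id_isar5;
} IsarRegs;

/* Extract a 4-bit ID field starting at bit position 'shift'. */
static uint32_t field4(uint32_t reg, unsigned shift)
{
    return (reg >> shift) & 0xf;
}

/* Assumed position of the AES field in ID_ISAR5: bits [7:4]. */
#define ISAR5_AES_SHIFT 4

/* AES instructions are present when the field is nonzero. */
static bool feature_aes(const IsarRegs *id)
{
    return field4(id->id_isar5, ISAR5_AES_SHIFT) != 0;
}

/* PMULL is signalled by the same field holding a value above 1. */
static bool feature_pmull(const IsarRegs *id)
{
    return field4(id->id_isar5, ISAR5_AES_SHIFT) > 1;
}

/* Small usage example covering the three interesting field values. */
static void isar_self_check(void)
{
    IsarRegs none = { .id_isar5 = 0x00 };
    IsarRegs aes_only = { .id_isar5 = 0x10 };
    IsarRegs aes_pmull = { .id_isar5 = 0x20 };

    assert(!feature_aes(&none) && !feature_pmull(&none));
    assert(feature_aes(&aes_only) && !feature_pmull(&aes_only));
    assert(feature_aes(&aes_pmull) && feature_pmull(&aes_pmull));
}
```

 Because both tests read the same field, a CPU model only has to set the ID
 register value once; there is no separate flag that can drift out of sync
 with what the guest sees in the register.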
- target/arm/cpu.h           | 131 +++++++++++++++++++++++++++++++++----
+ target/arm/ptw.c | 89 +++++++++++++++++++++++++++++++++++++++---------
- target/arm/translate.h     |   7 ++
+file changed, 73 insertions(+), 16 deletions(-)
  linux-user/elfload.c       |  46 ++++++++-----
  target/arm/cpu.c           |  27 +++++---
  target/arm/cpu64.c         |  57 +++++++++-------
  target/arm/translate-a64.c | 101 ++++++++++++++--------------
  target/arm/translate.c     |  36 +++++-----
 files changed, 273 insertions(+), 132 deletions(-)
-diff --git a/target/arm/cpu.h b/target/arm/cpu.h
+diff --git a/target/arm/ptw.c b/target/arm/ptw.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.h
+--- a/target/arm/ptw.c
-+++ b/target/arm/cpu.h
++++ b/target/arm/ptw.c
-@@ -XXX,XX +XXX,XX @@ typedef enum ARMPSCIState {
+@@ -XXX,XX +XXX,XX @@ static int get_S2prot(CPUARMState *env, int s2ap, int xn, bool s1_is_el0)
-     PSCI_ON_PENDING = 2
+  * @mmu_idx: MMU index indicating required translation regime
- } ARMPSCIState;
+  * @is_aa64: TRUE if AArch64
+  * @ap:      The 2-bit simple AP (AP[2:1])
-+typedef struct ARMISARegisters ARMISARegisters;
+- * @ns:      NS (non-secure) bit
-+
+  * @xn:      XN (execute-never) bit
- /**
+  * @pxn:     PXN (privileged execute-never) bit
-  * ARMCPU:
++ * @in_pa:   The original input pa space
-  * @env: #CPUARMState
++ * @out_pa:  The output pa space, modified by NSTable, NS, and NSE
-@@ -XXX,XX +XXX,XX @@ enum arm_features {
+  */
-     ARM_FEATURE_LPAE, /* has Large Physical Address Extension */
+ static int get_S1prot(CPUARMState *env, ARMMMUIdx mmu_idx, bool is_aa64,
-     ARM_FEATURE_V8,
+-                      int ap, int ns, int xn, int pxn)
-     ARM_FEATURE_AARCH64, /* supports 64 bit mode */
++                      int ap, int xn, int pxn,
--    ARM_FEATURE_V8_AES, /* implements AES part of v8 Crypto Extensions */
++                      ARMSecuritySpace in_pa, ARMSecuritySpace out_pa)
-     ARM_FEATURE_CBAR, /* has cp15 CBAR */
+ {
-     ARM_FEATURE_CRC, /* ARMv8 CRC instructions */
+     ARMCPU *cpu = env_archcpu(env);
-     ARM_FEATURE_CBAR_RO, /* has cp15 CBAR and it is read-only */
+     bool is_user = regime_is_user(env, mmu_idx);
-     ARM_FEATURE_EL2, /* has EL2 Virtualization support */
+@@ -XXX,XX +XXX,XX @@ static int get_S1prot(CPUARMState *env, ARMMMUIdx mmu_idx, bool is_aa64,
      ARM_FEATURE_EL3, /* has EL3 Secure monitor support */
 -    ARM_FEATURE_V8_SHA1, /* implements SHA1 part of v8 Crypto Extensions */
 -    ARM_FEATURE_V8_SHA256, /* implements SHA256 part of v8 Crypto Extensions */
 -    ARM_FEATURE_V8_PMULL, /* implements PMULL part of v8 Crypto Extensions */
      ARM_FEATURE_THUMB_DSP, /* DSP insns supported in the Thumb encodings */
      ARM_FEATURE_PMU, /* has PMU support */
      ARM_FEATURE_VBAR, /* has cp15 VBAR */
      ARM_FEATURE_M_SECURITY, /* M profile Security Extension */
      ARM_FEATURE_JAZELLE, /* has (trivial) Jazelle implementation */
      ARM_FEATURE_SVE, /* has Scalable Vector Extension */
 -    ARM_FEATURE_V8_SHA512, /* implements SHA512 part of v8 Crypto Extensions */
 -    ARM_FEATURE_V8_SHA3, /* implements SHA3 part of v8 Crypto Extensions */
 -    ARM_FEATURE_V8_SM3, /* implements SM3 part of v8 Crypto Extensions */
 -    ARM_FEATURE_V8_SM4, /* implements SM4 part of v8 Crypto Extensions */
 -    ARM_FEATURE_V8_ATOMICS, /* ARMv8.1-Atomics feature */
 -    ARM_FEATURE_V8_RDM, /* implements v8.1 simd round multiply */
 -    ARM_FEATURE_V8_DOTPROD, /* implements v8.2 simd dot product */
      ARM_FEATURE_V8_FP16, /* implements v8.2 half-precision float */
 -    ARM_FEATURE_V8_FCMA, /* has complex number part of v8.3 extensions.  */
      ARM_FEATURE_M_MAIN, /* M profile Main Extension */
  };
@@ -XXX,XX +XXX,XX @@ static inline uint64_t *aa64_vfp_qreg(CPUARMState *env, unsigned regno)
  /* Shared between translate-sve.c and sve_helper.c.  */
  extern const uint64_t pred_esz_masks[4];
 +/*
 + * 32-bit feature tests via id registers.
 + */
 +static inline bool isar_feature_aa32_aes(const ARMISARegisters *id)
 +{
 +    return FIELD_EX32(id->id_isar5, ID_ISAR5, AES) != 0;
 +}
 +
 +static inline bool isar_feature_aa32_pmull(const ARMISARegisters *id)
 +{
 +    return FIELD_EX32(id->id_isar5, ID_ISAR5, AES) > 1;
 +}
 +
 +static inline bool isar_feature_aa32_sha1(const ARMISARegisters *id)
 +{
 +    return FIELD_EX32(id->id_isar5, ID_ISAR5, SHA1) != 0;
 +}
 +
 +static inline bool isar_feature_aa32_sha2(const ARMISARegisters *id)
 +{
 +    return FIELD_EX32(id->id_isar5, ID_ISAR5, SHA2) != 0;
 +}
 +
 +static inline bool isar_feature_aa32_crc32(const ARMISARegisters *id)
 +{
 +    return FIELD_EX32(id->id_isar5, ID_ISAR5, CRC32) != 0;
 +}
 +
 +static inline bool isar_feature_aa32_rdm(const ARMISARegisters *id)
 +{
 +    return FIELD_EX32(id->id_isar5, ID_ISAR5, RDM) != 0;
 +}
 +
 +static inline bool isar_feature_aa32_vcma(const ARMISARegisters *id)
 +{
 +    return FIELD_EX32(id->id_isar5, ID_ISAR5, VCMA) != 0;
 +}
 +
 +static inline bool isar_feature_aa32_dp(const ARMISARegisters *id)
 +{
 +    return FIELD_EX32(id->id_isar6, ID_ISAR6, DP) != 0;
 +}
 +
 +/*
 + * 64-bit feature tests via id registers.
 + */
 +static inline bool isar_feature_aa64_aes(const ARMISARegisters *id)
 +{
 +    return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, AES) != 0;
 +}
 +
 +static inline bool isar_feature_aa64_pmull(const ARMISARegisters *id)
 +{
 +    return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, AES) > 1;
 +}
 +
 +static inline bool isar_feature_aa64_sha1(const ARMISARegisters *id)
 +{
 +    return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, SHA1) != 0;
 +}
 +
 +static inline bool isar_feature_aa64_sha256(const ARMISARegisters *id)
 +{
 +    return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, SHA2) != 0;
 +}
 +
 +static inline bool isar_feature_aa64_sha512(const ARMISARegisters *id)
 +{
 +    return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, SHA2) > 1;
 +}
 +
 +static inline bool isar_feature_aa64_crc32(const ARMISARegisters *id)
 +{
 +    return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, CRC32) != 0;
 +}
 +
 +static inline bool isar_feature_aa64_atomics(const ARMISARegisters *id)
 +{
 +    return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, ATOMIC) != 0;
 +}
 +
 +static inline bool isar_feature_aa64_rdm(const ARMISARegisters *id)
 +{
 +    return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, RDM) != 0;
 +}
 +
 +static inline bool isar_feature_aa64_sha3(const ARMISARegisters *id)
 +{
 +    return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, SHA3) != 0;
 +}
 +
 +static inline bool isar_feature_aa64_sm3(const ARMISARegisters *id)
 +{
 +    return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, SM3) != 0;
 +}
 +
 +static inline bool isar_feature_aa64_sm4(const ARMISARegisters *id)
 +{
 +    return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, SM4) != 0;
 +}
 +
 +static inline bool isar_feature_aa64_dp(const ARMISARegisters *id)
 +{
 +    return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, DP) != 0;
 +}
 +
 +static inline bool isar_feature_aa64_fcma(const ARMISARegisters *id)
 +{
 +    return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, FCMA) != 0;
 +}
 +
 +/*
 + * Forward to the above feature tests given an ARMCPU pointer.
 + */
 +#define cpu_isar_feature(name, cpu) \
 +    ({ ARMCPU *cpu_ = (cpu); isar_feature_##name(&cpu_->isar); })
 +
  #endif
 diff --git a/target/arm/translate.h b/target/arm/translate.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.h
 +++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@
  /* internal defines */
  typedef struct DisasContext {
      DisasContextBase base;
 +    const ARMISARegisters *isar;
      target_ulong pc;
      target_ulong page_start;
@@ -XXX,XX +XXX,XX @@ static inline TCGv_i32 get_ahp_flag(void)
      return ret;
  }
 +/*
 + * Forward to the isar_feature_* tests given a DisasContext pointer.
 + */
 +#define dc_isar_feature(name, ctx) \
 +    ({ DisasContext *ctx_ = (ctx); isar_feature_##name(ctx_->isar); })
 +
  #endif /* TARGET_ARM_TRANSLATE_H */
 diff --git a/linux-user/elfload.c b/linux-user/elfload.c
 index XXXXXXX..XXXXXXX 100644
 --- a/linux-user/elfload.c
 +++ b/linux-user/elfload.c
@@ -XXX,XX +XXX,XX @@ static uint32_t get_elf_hwcap(void)
      /* probe for the extra features */
  #define GET_FEATURE(feat, hwcap) \
      do { if (arm_feature(&cpu->env, feat)) { hwcaps |= hwcap; } } while (0)
 +
 +#define GET_FEATURE_ID(feat, hwcap) \
 +    do { if (cpu_isar_feature(feat, cpu)) { hwcaps |= hwcap; } } while (0)
 +
      /* EDSP is in v5TE and above, but all our v5 CPUs are v5TE */
      GET_FEATURE(ARM_FEATURE_V5, ARM_HWCAP_ARM_EDSP);
      GET_FEATURE(ARM_FEATURE_VFP, ARM_HWCAP_ARM_VFP);
@@ -XXX,XX +XXX,XX @@ static uint32_t get_elf_hwcap2(void)
      ARMCPU *cpu = ARM_CPU(thread_cpu);
      uint32_t hwcaps = 0;
 -    GET_FEATURE(ARM_FEATURE_V8_AES, ARM_HWCAP2_ARM_AES);
 -    GET_FEATURE(ARM_FEATURE_V8_PMULL, ARM_HWCAP2_ARM_PMULL);
 -    GET_FEATURE(ARM_FEATURE_V8_SHA1, ARM_HWCAP2_ARM_SHA1);
 -    GET_FEATURE(ARM_FEATURE_V8_SHA256, ARM_HWCAP2_ARM_SHA2);
 -    GET_FEATURE(ARM_FEATURE_CRC, ARM_HWCAP2_ARM_CRC32);
 +    GET_FEATURE_ID(aa32_aes, ARM_HWCAP2_ARM_AES);
 +    GET_FEATURE_ID(aa32_pmull, ARM_HWCAP2_ARM_PMULL);
 +    GET_FEATURE_ID(aa32_sha1, ARM_HWCAP2_ARM_SHA1);
 +    GET_FEATURE_ID(aa32_sha2, ARM_HWCAP2_ARM_SHA2);
 +    GET_FEATURE_ID(aa32_crc32, ARM_HWCAP2_ARM_CRC32);
      return hwcaps;
  }
  #undef GET_FEATURE
 +#undef GET_FEATURE_ID
  #else
  /* 64 bit ARM definitions */
@@ -XXX,XX +XXX,XX @@ static uint32_t get_elf_hwcap(void)
      /* probe for the extra features */
  #define GET_FEATURE(feat, hwcap) \
      do { if (arm_feature(&cpu->env, feat)) { hwcaps |= hwcap; } } while (0)
 -    GET_FEATURE(ARM_FEATURE_V8_AES, ARM_HWCAP_A64_AES);
 -    GET_FEATURE(ARM_FEATURE_V8_PMULL, ARM_HWCAP_A64_PMULL);
 -    GET_FEATURE(ARM_FEATURE_V8_SHA1, ARM_HWCAP_A64_SHA1);
 -    GET_FEATURE(ARM_FEATURE_V8_SHA256, ARM_HWCAP_A64_SHA2);
 -    GET_FEATURE(ARM_FEATURE_CRC, ARM_HWCAP_A64_CRC32);
 -    GET_FEATURE(ARM_FEATURE_V8_SHA3, ARM_HWCAP_A64_SHA3);
 -    GET_FEATURE(ARM_FEATURE_V8_SM3, ARM_HWCAP_A64_SM3);
 -    GET_FEATURE(ARM_FEATURE_V8_SM4, ARM_HWCAP_A64_SM4);
 -    GET_FEATURE(ARM_FEATURE_V8_SHA512, ARM_HWCAP_A64_SHA512);
 +#define GET_FEATURE_ID(feat, hwcap) \
 +    do { if (cpu_isar_feature(feat, cpu)) { hwcaps |= hwcap; } } while (0)
 +
 +    GET_FEATURE_ID(aa64_aes, ARM_HWCAP_A64_AES);
 +    GET_FEATURE_ID(aa64_pmull, ARM_HWCAP_A64_PMULL);
 +    GET_FEATURE_ID(aa64_sha1, ARM_HWCAP_A64_SHA1);
 +    GET_FEATURE_ID(aa64_sha256, ARM_HWCAP_A64_SHA2);
 +    GET_FEATURE_ID(aa64_sha512, ARM_HWCAP_A64_SHA512);
 +    GET_FEATURE_ID(aa64_crc32, ARM_HWCAP_A64_CRC32);
 +    GET_FEATURE_ID(aa64_sha3, ARM_HWCAP_A64_SHA3);
 +    GET_FEATURE_ID(aa64_sm3, ARM_HWCAP_A64_SM3);
 +    GET_FEATURE_ID(aa64_sm4, ARM_HWCAP_A64_SM4);
      GET_FEATURE(ARM_FEATURE_V8_FP16,
                  ARM_HWCAP_A64_FPHP | ARM_HWCAP_A64_ASIMDHP);
 -    GET_FEATURE(ARM_FEATURE_V8_ATOMICS, ARM_HWCAP_A64_ATOMICS);
 -    GET_FEATURE(ARM_FEATURE_V8_RDM, ARM_HWCAP_A64_ASIMDRDM);
 -    GET_FEATURE(ARM_FEATURE_V8_DOTPROD, ARM_HWCAP_A64_ASIMDDP);
 -    GET_FEATURE(ARM_FEATURE_V8_FCMA, ARM_HWCAP_A64_FCMA);
 +    GET_FEATURE_ID(aa64_atomics, ARM_HWCAP_A64_ATOMICS);
 +    GET_FEATURE_ID(aa64_rdm, ARM_HWCAP_A64_ASIMDRDM);
 +    GET_FEATURE_ID(aa64_dp, ARM_HWCAP_A64_ASIMDDP);
 +    GET_FEATURE_ID(aa64_fcma, ARM_HWCAP_A64_FCMA);
      GET_FEATURE(ARM_FEATURE_SVE, ARM_HWCAP_A64_SVE);
 +
  #undef GET_FEATURE
 +#undef GET_FEATURE_ID
      return hwcaps;
  }
 diff --git a/target/arm/cpu.c b/target/arm/cpu.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu.c
 +++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void arm_max_initfn(Object *obj)
          cortex_a15_initfn(obj);
  #ifdef CONFIG_USER_ONLY
          /* We don't set these in system emulation mode for the moment,
 -         * since we don't correctly set the ID registers to advertise them,
 +         * since we don't correctly set (all of) the ID registers to
 +         * advertise them.
           */
          set_feature(&cpu->env, ARM_FEATURE_V8);
 -        set_feature(&cpu->env, ARM_FEATURE_V8_AES);
 -        set_feature(&cpu->env, ARM_FEATURE_V8_SHA1);
 -        set_feature(&cpu->env, ARM_FEATURE_V8_SHA256);
 -        set_feature(&cpu->env, ARM_FEATURE_V8_PMULL);
 -        set_feature(&cpu->env, ARM_FEATURE_CRC);
 -        set_feature(&cpu->env, ARM_FEATURE_V8_RDM);
 -        set_feature(&cpu->env, ARM_FEATURE_V8_DOTPROD);
 -        set_feature(&cpu->env, ARM_FEATURE_V8_FCMA);
 +        {
 +            uint32_t t;
 +
 +            t = cpu->isar.id_isar5;
 +            t = FIELD_DP32(t, ID_ISAR5, AES, 2);
 +            t = FIELD_DP32(t, ID_ISAR5, SHA1, 1);
 +            t = FIELD_DP32(t, ID_ISAR5, SHA2, 1);
 +            t = FIELD_DP32(t, ID_ISAR5, CRC32, 1);
 +            t = FIELD_DP32(t, ID_ISAR5, RDM, 1);
 +            t = FIELD_DP32(t, ID_ISAR5, VCMA, 1);
 +            cpu->isar.id_isar5 = t;
 +
 +            t = cpu->isar.id_isar6;
 +            t = FIELD_DP32(t, ID_ISAR6, DP, 1);
 +            cpu->isar.id_isar6 = t;
 +        }
  #endif
      }
  }
 diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu64.c
 +++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_a57_initfn(Object *obj)
      set_feature(&cpu->env, ARM_FEATURE_GENERIC_TIMER);
      set_feature(&cpu->env, ARM_FEATURE_AARCH64);
      set_feature(&cpu->env, ARM_FEATURE_CBAR_RO);
 -    set_feature(&cpu->env, ARM_FEATURE_V8_AES);
 -    set_feature(&cpu->env, ARM_FEATURE_V8_SHA1);
 -    set_feature(&cpu->env, ARM_FEATURE_V8_SHA256);
 -    set_feature(&cpu->env, ARM_FEATURE_V8_PMULL);
 -    set_feature(&cpu->env, ARM_FEATURE_CRC);
      set_feature(&cpu->env, ARM_FEATURE_EL2);
      set_feature(&cpu->env, ARM_FEATURE_EL3);
      set_feature(&cpu->env, ARM_FEATURE_PMU);
@@ -XXX,XX +XXX,XX @@ static void aarch64_a53_initfn(Object *obj)
      set_feature(&cpu->env, ARM_FEATURE_GENERIC_TIMER);
      set_feature(&cpu->env, ARM_FEATURE_AARCH64);
      set_feature(&cpu->env, ARM_FEATURE_CBAR_RO);
 -    set_feature(&cpu->env, ARM_FEATURE_V8_AES);
 -    set_feature(&cpu->env, ARM_FEATURE_V8_SHA1);
 -    set_feature(&cpu->env, ARM_FEATURE_V8_SHA256);
 -    set_feature(&cpu->env, ARM_FEATURE_V8_PMULL);
 -    set_feature(&cpu->env, ARM_FEATURE_CRC);
      set_feature(&cpu->env, ARM_FEATURE_EL2);
      set_feature(&cpu->env, ARM_FEATURE_EL3);
      set_feature(&cpu->env, ARM_FEATURE_PMU);
@@ -XXX,XX +XXX,XX @@ static void aarch64_a72_initfn(Object *obj)
      set_feature(&cpu->env, ARM_FEATURE_GENERIC_TIMER);
      set_feature(&cpu->env, ARM_FEATURE_AARCH64);
      set_feature(&cpu->env, ARM_FEATURE_CBAR_RO);
 -    set_feature(&cpu->env, ARM_FEATURE_V8_AES);
 -    set_feature(&cpu->env, ARM_FEATURE_V8_SHA1);
 -    set_feature(&cpu->env, ARM_FEATURE_V8_SHA256);
 -    set_feature(&cpu->env, ARM_FEATURE_V8_PMULL);
 -    set_feature(&cpu->env, ARM_FEATURE_CRC);
      set_feature(&cpu->env, ARM_FEATURE_EL2);
      set_feature(&cpu->env, ARM_FEATURE_EL3);
      set_feature(&cpu->env, ARM_FEATURE_PMU);
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
      if (kvm_enabled()) {
          kvm_arm_set_cpu_features_from_host(cpu);
      } else {
 +        uint64_t t;
 +        uint32_t u;
          aarch64_a57_initfn(obj);
 +
 +        t = cpu->isar.id_aa64isar0;
 +        t = FIELD_DP64(t, ID_AA64ISAR0, AES, 2); /* AES + PMULL */
 +        t = FIELD_DP64(t, ID_AA64ISAR0, SHA1, 1);
 +        t = FIELD_DP64(t, ID_AA64ISAR0, SHA2, 2); /* SHA512 */
 +        t = FIELD_DP64(t, ID_AA64ISAR0, CRC32, 1);
 +        t = FIELD_DP64(t, ID_AA64ISAR0, ATOMIC, 2);
 +        t = FIELD_DP64(t, ID_AA64ISAR0, RDM, 1);
 +        t = FIELD_DP64(t, ID_AA64ISAR0, SHA3, 1);
 +        t = FIELD_DP64(t, ID_AA64ISAR0, SM3, 1);
 +        t = FIELD_DP64(t, ID_AA64ISAR0, SM4, 1);
 +        t = FIELD_DP64(t, ID_AA64ISAR0, DP, 1);
 +        cpu->isar.id_aa64isar0 = t;
 +
 +        t = cpu->isar.id_aa64isar1;
 +        t = FIELD_DP64(t, ID_AA64ISAR1, FCMA, 1);
 +        cpu->isar.id_aa64isar1 = t;
 +
 +        /* Replicate the same data to the 32-bit id registers.  */
 +        u = cpu->isar.id_isar5;
 +        u = FIELD_DP32(u, ID_ISAR5, AES, 2); /* AES + PMULL */
 +        u = FIELD_DP32(u, ID_ISAR5, SHA1, 1);
 +        u = FIELD_DP32(u, ID_ISAR5, SHA2, 1);
 +        u = FIELD_DP32(u, ID_ISAR5, CRC32, 1);
 +        u = FIELD_DP32(u, ID_ISAR5, RDM, 1);
 +        u = FIELD_DP32(u, ID_ISAR5, VCMA, 1);
 +        cpu->isar.id_isar5 = u;
 +
 +        u = cpu->isar.id_isar6;
 +        u = FIELD_DP32(u, ID_ISAR6, DP, 1);
 +        cpu->isar.id_isar6 = u;
 +
  #ifdef CONFIG_USER_ONLY
          /* We don't set these in system emulation mode for the moment,
           * since we don't correctly set the ID registers to advertise them,
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
           * whereas the architecture requires them to be present in both if
           * present in either.
           */
 -        set_feature(&cpu->env, ARM_FEATURE_V8_SHA512);
 -        set_feature(&cpu->env, ARM_FEATURE_V8_SHA3);
 -        set_feature(&cpu->env, ARM_FEATURE_V8_SM3);
 -        set_feature(&cpu->env, ARM_FEATURE_V8_SM4);
 -        set_feature(&cpu->env, ARM_FEATURE_V8_ATOMICS);
 -        set_feature(&cpu->env, ARM_FEATURE_V8_RDM);
 -        set_feature(&cpu->env, ARM_FEATURE_V8_DOTPROD);
          set_feature(&cpu->env, ARM_FEATURE_V8_FP16);
 -        set_feature(&cpu->env, ARM_FEATURE_V8_FCMA);
          set_feature(&cpu->env, ARM_FEATURE_SVE);
          /* For usermode -cpu max we can use a larger and more efficient DCZ
           * blocksize since we don't have to follow what the hardware does.
 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-a64.c
 +++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_excl(DisasContext *s, uint32_t insn)
          }
          if (rt2 == 31
              && ((rt | rs) & 1) == 0
 -            && arm_dc_feature(s, ARM_FEATURE_V8_ATOMICS)) {
 +            && dc_isar_feature(aa64_atomics, s)) {
              /* CASP / CASPL */
              gen_compare_and_swap_pair(s, rs, rt, rn, size | 2);
              return;
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_excl(DisasContext *s, uint32_t insn)
          }
          if (rt2 == 31
              && ((rt | rs) & 1) == 0
 -            && arm_dc_feature(s, ARM_FEATURE_V8_ATOMICS)) {
 +            && dc_isar_feature(aa64_atomics, s)) {
              /* CASPA / CASPAL */
              gen_compare_and_swap_pair(s, rs, rt, rn, size | 2);
              return;
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_excl(DisasContext *s, uint32_t insn)
      case 0xb: /* CASL */
      case 0xe: /* CASA */
      case 0xf: /* CASAL */
 -        if (rt2 == 31 && arm_dc_feature(s, ARM_FEATURE_V8_ATOMICS)) {
 +        if (rt2 == 31 && dc_isar_feature(aa64_atomics, s)) {
              gen_compare_and_swap(s, rs, rt, rn, size);
              return;
          }
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_atomic(DisasContext *s, uint32_t insn,
      int rs = extract32(insn, 16, 5);
      int rn = extract32(insn, 5, 5);
      int o3_opc = extract32(insn, 12, 4);
 -    int feature = ARM_FEATURE_V8_ATOMICS;
      TCGv_i64 tcg_rn, tcg_rs;
      AtomicThreeOpFn *fn;
 -    if (is_vector) {
 +    if (is_vector || !dc_isar_feature(aa64_atomics, s)) {
          unallocated_encoding(s);
          return;
      }
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_atomic(DisasContext *s, uint32_t insn,
          unallocated_encoding(s);
          return;
      }
 -    if (!arm_dc_feature(s, feature)) {
 -        unallocated_encoding(s);
 -        return;
 -    }
      if (rn == 31) {
          gen_check_sp_alignment(s);
@@ -XXX,XX +XXX,XX @@ static void handle_crc32(DisasContext *s,
      TCGv_i64 tcg_acc, tcg_val;
      TCGv_i32 tcg_bytes;
 -    if (!arm_dc_feature(s, ARM_FEATURE_CRC)
 +    if (!dc_isar_feature(aa64_crc32, s)
          || (sf == 1 && sz != 3)
          || (sf == 0 && sz == 3)) {
          unallocated_encoding(s);
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_same_extra(DisasContext *s,
      bool u = extract32(insn, 29, 1);
      TCGv_i32 ele1, ele2, ele3;
      TCGv_i64 res;
 -    int feature;
 +    bool feature;
      switch (u * 16 + opcode) {
      case 0x10: /* SQRDMLAH (vector) */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_same_extra(DisasContext *s,
              unallocated_encoding(s);
              return;
          }
 -        feature = ARM_FEATURE_V8_RDM;
 +        feature = dc_isar_feature(aa64_rdm, s);
          break;
      default:
          unallocated_encoding(s);
          return;
      }
 -    if (!arm_dc_feature(s, feature)) {
 +    if (!feature) {
          unallocated_encoding(s);
          return;
      }
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_diff(DisasContext *s, uint32_t insn)
              return;
          }
          if (size == 3) {
 -            if (!arm_dc_feature(s, ARM_FEATURE_V8_PMULL)) {
 +            if (!dc_isar_feature(aa64_pmull, s)) {
                  unallocated_encoding(s);
                  return;
              }
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn)
      int size = extract32(insn, 22, 2);
      bool u = extract32(insn, 29, 1);
      bool is_q = extract32(insn, 30, 1);
 -    int feature, rot;
 +    bool feature;
 +    int rot;
      switch (u * 16 + opcode) {
      case 0x10: /* SQRDMLAH (vector) */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn)
              unallocated_encoding(s);
              return;
          }
 -        feature = ARM_FEATURE_V8_RDM;
 +        feature = dc_isar_feature(aa64_rdm, s);
          break;
      case 0x02: /* SDOT (vector) */
      case 0x12: /* UDOT (vector) */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn)
              unallocated_encoding(s);
              return;
          }
 -        feature = ARM_FEATURE_V8_DOTPROD;
 +        feature = dc_isar_feature(aa64_dp, s);
          break;
      case 0x18: /* FCMLA, #0 */
      case 0x19: /* FCMLA, #90 */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn)
              unallocated_encoding(s);
              return;
          }
 -        feature = ARM_FEATURE_V8_FCMA;
 +        feature = dc_isar_feature(aa64_fcma, s);
          break;
      default:
          unallocated_encoding(s);
          return;
      }
 -    if (!arm_dc_feature(s, feature)) {
 +    if (!feature) {
          unallocated_encoding(s);
          return;
      }
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
          break;
      case 0x1d: /* SQRDMLAH */
      case 0x1f: /* SQRDMLSH */
 -        if (!arm_dc_feature(s, ARM_FEATURE_V8_RDM)) {
 +        if (!dc_isar_feature(aa64_rdm, s)) {
              unallocated_encoding(s);
              return;
          }
          break;
      case 0x0e: /* SDOT */
      case 0x1e: /* UDOT */
 -        if (size != MO_32 || !arm_dc_feature(s, ARM_FEATURE_V8_DOTPROD)) {
 +        if (size != MO_32 || !dc_isar_feature(aa64_dp, s)) {
              unallocated_encoding(s);
              return;
          }
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
      case 0x13: /* FCMLA #90 */
      case 0x15: /* FCMLA #180 */
      case 0x17: /* FCMLA #270 */
 -        if (!arm_dc_feature(s, ARM_FEATURE_V8_FCMA)) {
 +        if (!dc_isar_feature(aa64_fcma, s)) {
              unallocated_encoding(s);
              return;
          }
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_aes(DisasContext *s, uint32_t insn)
      TCGv_i32 tcg_decrypt;
      CryptoThreeOpIntFn *genfn;
 -    if (!arm_dc_feature(s, ARM_FEATURE_V8_AES)
 -        || size != 0) {
 +    if (!dc_isar_feature(aa64_aes, s) || size != 0) {
          unallocated_encoding(s);
          return;
      }
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_three_reg_sha(DisasContext *s, uint32_t insn)
      int rd = extract32(insn, 0, 5);
      CryptoThreeOpFn *genfn;
      TCGv_ptr tcg_rd_ptr, tcg_rn_ptr, tcg_rm_ptr;
 -    int feature = ARM_FEATURE_V8_SHA256;
 +    bool feature;
      if (size != 0) {
          unallocated_encoding(s);
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_three_reg_sha(DisasContext *s, uint32_t insn)
      case 2: /* SHA1M */
      case 3: /* SHA1SU0 */
          genfn = NULL;
 -        feature = ARM_FEATURE_V8_SHA1;
 +        feature = dc_isar_feature(aa64_sha1, s);
          break;
      case 4: /* SHA256H */
          genfn = gen_helper_crypto_sha256h;
 +        feature = dc_isar_feature(aa64_sha256, s);
          break;
      case 5: /* SHA256H2 */
          genfn = gen_helper_crypto_sha256h2;
 +        feature = dc_isar_feature(aa64_sha256, s);
          break;
      case 6: /* SHA256SU1 */
          genfn = gen_helper_crypto_sha256su1;
 +        feature = dc_isar_feature(aa64_sha256, s);
          break;
      default:
          unallocated_encoding(s);
          return;
      }
 -    if (!arm_dc_feature(s, feature)) {
 +    if (!feature) {
          unallocated_encoding(s);
          return;
      }
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_two_reg_sha(DisasContext *s, uint32_t insn)
      int rn = extract32(insn, 5, 5);
      int rd = extract32(insn, 0, 5);
      CryptoTwoOpFn *genfn;
 -    int feature;
 +    bool feature;
      TCGv_ptr tcg_rd_ptr, tcg_rn_ptr;
      if (size != 0) {
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_two_reg_sha(DisasContext *s, uint32_t insn)
      switch (opcode) {
      case 0: /* SHA1H */
 -        feature = ARM_FEATURE_V8_SHA1;
 +        feature = dc_isar_feature(aa64_sha1, s);
          genfn = gen_helper_crypto_sha1h;
          break;
      case 1: /* SHA1SU1 */
 -        feature = ARM_FEATURE_V8_SHA1;
 +        feature = dc_isar_feature(aa64_sha1, s);
          genfn = gen_helper_crypto_sha1su1;
          break;
      case 2: /* SHA256SU0 */
 -        feature = ARM_FEATURE_V8_SHA256;
 +        feature = dc_isar_feature(aa64_sha256, s);
          genfn = gen_helper_crypto_sha256su0;
          break;
      default:
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_two_reg_sha(DisasContext *s, uint32_t insn)
          return;
      }
 -    if (!arm_dc_feature(s, feature)) {
 +    if (!feature) {
          unallocated_encoding(s);
          return;
      }
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_three_reg_sha512(DisasContext *s, uint32_t insn)
      int rm = extract32(insn, 16, 5);
      int rn = extract32(insn, 5, 5);
      int rd = extract32(insn, 0, 5);
 -    int feature;
 +    bool feature;
      CryptoThreeOpFn *genfn;
      if (o == 0) {
          switch (opcode) {
          case 0: /* SHA512H */
 -            feature = ARM_FEATURE_V8_SHA512;
 +            feature = dc_isar_feature(aa64_sha512, s);
              genfn = gen_helper_crypto_sha512h;
              break;
          case 1: /* SHA512H2 */
 -            feature = ARM_FEATURE_V8_SHA512;
 +            feature = dc_isar_feature(aa64_sha512, s);
              genfn = gen_helper_crypto_sha512h2;
              break;
          case 2: /* SHA512SU1 */
 -            feature = ARM_FEATURE_V8_SHA512;
 +            feature = dc_isar_feature(aa64_sha512, s);
              genfn = gen_helper_crypto_sha512su1;
              break;
          case 3: /* RAX1 */
 -            feature = ARM_FEATURE_V8_SHA3;
 +            feature = dc_isar_feature(aa64_sha3, s);
              genfn = NULL;
              break;
          }
      } else {
          switch (opcode) {
          case 0: /* SM3PARTW1 */
 -            feature = ARM_FEATURE_V8_SM3;
 +            feature = dc_isar_feature(aa64_sm3, s);
              genfn = gen_helper_crypto_sm3partw1;
              break;
          case 1: /* SM3PARTW2 */
 -            feature = ARM_FEATURE_V8_SM3;
 +            feature = dc_isar_feature(aa64_sm3, s);
              genfn = gen_helper_crypto_sm3partw2;
              break;
          case 2: /* SM4EKEY */
 -            feature = ARM_FEATURE_V8_SM4;
 +            feature = dc_isar_feature(aa64_sm4, s);
              genfn = gen_helper_crypto_sm4ekey;
              break;
          default:
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_three_reg_sha512(DisasContext *s, uint32_t insn)
          }
      }
--    if (!arm_dc_feature(s, feature)) {
+-    if (ns && arm_is_secure(env) && (env->cp15.scr_el3 & SCR_SIF)) {
-+    if (!feature) {
++    if (out_pa == ARMSS_NonSecure && in_pa == ARMSS_Secure &&
-         unallocated_encoding(s);
++        (env->cp15.scr_el3 & SCR_SIF)) {
-         return;
+         return prot_rw;
      }
-@@ -XXX,XX +XXX,XX @@ static void disas_crypto_two_reg_sha512(DisasContext *s, uint32_t insn)
-     int rn = extract32(insn, 5, 5);
+@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
-     int rd = extract32(insn, 0, 5);
+     int32_t stride;
-     TCGv_ptr tcg_rd_ptr, tcg_rn_ptr;
+     int addrsize, inputsize, outputsize;
--    int feature;
+     uint64_t tcr = regime_tcr(env, mmu_idx);
-+    bool feature;
+-    int ap, ns, xn, pxn;
-     CryptoTwoOpFn *genfn;
++    int ap, xn, pxn;
+     uint32_t el = regime_el(env, mmu_idx);
-     switch (opcode) {
+     uint64_t descaddrmask;
-     case 0: /* SHA512SU0 */
+     bool aarch64 = arm_el_is_aa64(env, el);
--        feature = ARM_FEATURE_V8_SHA512;
+     uint64_t descriptor, new_descriptor;
-+        feature = dc_isar_feature(aa64_sha512, s);
++    ARMSecuritySpace out_space;
-         genfn = gen_helper_crypto_sha512su0;
-         break;
+     /* TODO: This code does not support shareability levels. */
-     case 1: /* SM4E */
+     if (aarch64) {
--        feature = ARM_FEATURE_V8_SM4;
+@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
 +        feature = dc_isar_feature(aa64_sm4, s);
          genfn = gen_helper_crypto_sm4e;
          break;
      default:
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_two_reg_sha512(DisasContext *s, uint32_t insn)
          return;
      }
--    if (!arm_dc_feature(s, feature)) {
+     ap = extract32(attrs, 6, 2);
-+    if (!feature) {
++    out_space = ptw->in_space;
-         unallocated_encoding(s);
+     if (regime_is_stage2(mmu_idx)) {
-         return;
+-        ns = mmu_idx == ARMMMUIdx_Stage2;
 +        /*
 +         * R_GYNXY: For stage2 in Realm security state, bit 55 is NS.
 +         * The bit remains ignored for other security states.
 +         */
 +        if (out_space == ARMSS_Realm && extract64(attrs, 55, 1)) {
 +            out_space = ARMSS_NonSecure;
 +        }
          xn = extract64(attrs, 53, 2);
          result->f.prot = get_S2prot(env, ap, xn, s1_is_el0);
      } else {
 -        ns = extract32(attrs, 5, 1);
 +        int nse, ns = extract32(attrs, 5, 1);
 +        switch (out_space) {
 +        case ARMSS_Root:
 +            /*
 +             * R_GVZML: Bit 11 becomes the NSE field in the EL3 regime.
 +             * R_XTYPW: NSE and NS together select the output pa space.
 +             */
 +            nse = extract32(attrs, 11, 1);
 +            out_space = (nse << 1) | ns;
 +            if (out_space == ARMSS_Secure &&
 +                !cpu_isar_feature(aa64_sel2, cpu)) {
 +                out_space = ARMSS_NonSecure;
 +            }
 +            break;
 +        case ARMSS_Secure:
 +            if (ns) {
 +                out_space = ARMSS_NonSecure;
 +            }
 +            break;
 +        case ARMSS_Realm:
 +            switch (mmu_idx) {
 +            case ARMMMUIdx_Stage1_E0:
 +            case ARMMMUIdx_Stage1_E1:
 +            case ARMMMUIdx_Stage1_E1_PAN:
 +                /* I_CZPRF: For Realm EL1&0 stage1, NS bit is RES0. */
 +                break;
 +            case ARMMMUIdx_E2:
 +            case ARMMMUIdx_E20_0:
 +            case ARMMMUIdx_E20_2:
 +            case ARMMMUIdx_E20_2_PAN:
 +                /*
 +                 * R_LYKFZ, R_WGRZN: For Realm EL2 and EL2&1,
 +                 * NS changes the output to non-secure space.
 +                 */
 +                if (ns) {
 +                    out_space = ARMSS_NonSecure;
 +                }
 +                break;
 +            default:
 +                g_assert_not_reached();
 +            }
 +            break;
 +        case ARMSS_NonSecure:
 +            /* R_QRMFF: For NonSecure state, the NS bit is RES0. */
 +            break;
 +        default:
 +            g_assert_not_reached();
 +        }
          xn = extract64(attrs, 54, 1);
          pxn = extract64(attrs, 53, 1);
 -        result->f.prot = get_S1prot(env, mmu_idx, aarch64, ap, ns, xn, pxn);
 +
 +        /*
 +         * Note that we modified ptw->in_space earlier for NSTable, but
 +         * result->f.attrs retains a copy of the original security space.
 +         */
 +        result->f.prot = get_S1prot(env, mmu_idx, aarch64, ap, xn, pxn,
 +                                    result->f.attrs.space, out_space);
      }
-@@ -XXX,XX +XXX,XX @@ static void disas_crypto_four_reg(DisasContext *s, uint32_t insn)
-     int ra = extract32(insn, 10, 5);
+     if (!(result->f.prot & (1 << access_type))) {
-     int rn = extract32(insn, 5, 5);
+@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
-     int rd = extract32(insn, 0, 5);
+         }
 -    int feature;
 +    bool feature;
      switch (op0) {
      case 0: /* EOR3 */
      case 1: /* BCAX */
 -        feature = ARM_FEATURE_V8_SHA3;
 +        feature = dc_isar_feature(aa64_sha3, s);
          break;
      case 2: /* SM3SS1 */
 -        feature = ARM_FEATURE_V8_SM3;
 +        feature = dc_isar_feature(aa64_sm3, s);
          break;
      default:
          unallocated_encoding(s);
          return;
      }
--    if (!arm_dc_feature(s, feature)) {
+-    if (ns) {
-+    if (!feature) {
+-        /*
-         unallocated_encoding(s);
+-         * The NS bit will (as required by the architecture) have no effect if
-         return;
+-         * the CPU doesn't support TZ or this is a non-secure translation
-     }
+-         * regime, because the attribute will already be non-secure.
-@@ -XXX,XX +XXX,XX @@ static void disas_crypto_xar(DisasContext *s, uint32_t insn)
+-         */
-     TCGv_i64 tcg_op1, tcg_op2, tcg_res[2];
+-        result->f.attrs.secure = false;
-     int pass;
+-        result->f.attrs.space = ARMSS_NonSecure;
+-    }
--    if (!arm_dc_feature(s, ARM_FEATURE_V8_SHA3)) {
++    result->f.attrs.space = out_space;
-+    if (!dc_isar_feature(aa64_sha3, s)) {
++    result->f.attrs.secure = arm_space_is_secure(out_space);
-         unallocated_encoding(s);
-         return;
+     if (regime_is_stage2(mmu_idx)) {
-     }
+         result->cacheattrs.is_s2_format = true;
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_three_reg_imm2(DisasContext *s, uint32_t insn)
      TCGv_ptr tcg_rd_ptr, tcg_rn_ptr, tcg_rm_ptr;
      TCGv_i32 tcg_imm2, tcg_opcode;
 -    if (!arm_dc_feature(s, ARM_FEATURE_V8_SM3)) {
 +    if (!dc_isar_feature(aa64_sm3, s)) {
          unallocated_encoding(s);
          return;
      }
@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_init_disas_context(DisasContextBase *dcbase,
      ARMCPU *arm_cpu = arm_env_get_cpu(env);
      int bound;
 +    dc->isar = &arm_cpu->isar;
      dc->pc = dc->base.pc_first;
      dc->condjmp = 0;
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static const uint8_t neon_2rm_sizes[] = {
  static int do_v81_helper(DisasContext *s, gen_helper_gvec_3_ptr *fn,
                           int q, int rd, int rn, int rm)
  {
 -    if (arm_dc_feature(s, ARM_FEATURE_V8_RDM)) {
 +    if (dc_isar_feature(aa32_rdm, s)) {
          int opr_sz = (1 + q) * 8;
          tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd),
                             vfp_reg_offset(1, rn),
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                  return 1;
              }
              if (!u) { /* SHA-1 */
 -                if (!arm_dc_feature(s, ARM_FEATURE_V8_SHA1)) {
 +                if (!dc_isar_feature(aa32_sha1, s)) {
                      return 1;
                  }
                  ptr1 = vfp_reg_ptr(true, rd);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                  gen_helper_crypto_sha1_3reg(ptr1, ptr2, ptr3, tmp4);
                  tcg_temp_free_i32(tmp4);
              } else { /* SHA-256 */
 -                if (!arm_dc_feature(s, ARM_FEATURE_V8_SHA256) || size == 3) {
 +                if (!dc_isar_feature(aa32_sha2, s) || size == 3) {
                      return 1;
                  }
                  ptr1 = vfp_reg_ptr(true, rd);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                  if (op == 14 && size == 2) {
                      TCGv_i64 tcg_rn, tcg_rm, tcg_rd;
 -                    if (!arm_dc_feature(s, ARM_FEATURE_V8_PMULL)) {
 +                    if (!dc_isar_feature(aa32_pmull, s)) {
                          return 1;
                      }
                      tcg_rn = tcg_temp_new_i64();
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                      {
                          NeonGenThreeOpEnvFn *fn;
 -                        if (!arm_dc_feature(s, ARM_FEATURE_V8_RDM)) {
 +                        if (!dc_isar_feature(aa32_rdm, s)) {
                              return 1;
                          }
                          if (u && ((rd | rn) & 1)) {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                      break;
                  }
                  case NEON_2RM_AESE: case NEON_2RM_AESMC:
 -                    if (!arm_dc_feature(s, ARM_FEATURE_V8_AES)
 -                        || ((rm | rd) & 1)) {
 +                    if (!dc_isar_feature(aa32_aes, s) || ((rm | rd) & 1)) {
                          return 1;
                      }
                      ptr1 = vfp_reg_ptr(true, rd);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                      tcg_temp_free_i32(tmp3);
                      break;
                  case NEON_2RM_SHA1H:
 -                    if (!arm_dc_feature(s, ARM_FEATURE_V8_SHA1)
 -                        || ((rm | rd) & 1)) {
 +                    if (!dc_isar_feature(aa32_sha1, s) || ((rm | rd) & 1)) {
                          return 1;
                      }
                      ptr1 = vfp_reg_ptr(true, rd);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                      }
                      /* bit 6 (q): set -> SHA256SU0, cleared -> SHA1SU1 */
                      if (q) {
 -                        if (!arm_dc_feature(s, ARM_FEATURE_V8_SHA256)) {
 +                        if (!dc_isar_feature(aa32_sha2, s)) {
                              return 1;
                          }
 -                    } else if (!arm_dc_feature(s, ARM_FEATURE_V8_SHA1)) {
 +                    } else if (!dc_isar_feature(aa32_sha1, s)) {
                          return 1;
                      }
                      ptr1 = vfp_reg_ptr(true, rd);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
          /* VCMLA -- 1111 110R R.1S .... .... 1000 ...0 .... */
          int size = extract32(insn, 20, 1);
          data = extract32(insn, 23, 2); /* rot */
 -        if (!arm_dc_feature(s, ARM_FEATURE_V8_FCMA)
 +        if (!dc_isar_feature(aa32_vcma, s)
              || (!size && !arm_dc_feature(s, ARM_FEATURE_V8_FP16))) {
              return 1;
          }
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
          /* VCADD -- 1111 110R 1.0S .... .... 1000 ...0 .... */
          int size = extract32(insn, 20, 1);
          data = extract32(insn, 24, 1); /* rot */
 -        if (!arm_dc_feature(s, ARM_FEATURE_V8_FCMA)
 +        if (!dc_isar_feature(aa32_vcma, s)
              || (!size && !arm_dc_feature(s, ARM_FEATURE_V8_FP16))) {
              return 1;
          }
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
      } else if ((insn & 0xfeb00f00) == 0xfc200d00) {
          /* V[US]DOT -- 1111 1100 0.10 .... .... 1101 .Q.U .... */
          bool u = extract32(insn, 4, 1);
 -        if (!arm_dc_feature(s, ARM_FEATURE_V8_DOTPROD)) {
 +        if (!dc_isar_feature(aa32_dp, s)) {
              return 1;
          }
          fn_gvec = u ? gen_helper_gvec_udot_b : gen_helper_gvec_sdot_b;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
          int size = extract32(insn, 23, 1);
          int index;
 -        if (!arm_dc_feature(s, ARM_FEATURE_V8_FCMA)) {
 +        if (!dc_isar_feature(aa32_vcma, s)) {
              return 1;
          }
          if (size == 0) {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
      } else if ((insn & 0xffb00f00) == 0xfe200d00) {
          /* V[US]DOT -- 1111 1110 0.10 .... .... 1101 .Q.U .... */
          int u = extract32(insn, 4, 1);
 -        if (!arm_dc_feature(s, ARM_FEATURE_V8_DOTPROD)) {
 +        if (!dc_isar_feature(aa32_dp, s)) {
              return 1;
          }
          fn_gvec = u ? gen_helper_gvec_udot_idx_b : gen_helper_gvec_sdot_idx_b;
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
               * op1 == 3 is UNPREDICTABLE but handle as UNDEFINED.
               * Bits 8, 10 and 11 should be zero.
               */
 -            if (!arm_dc_feature(s, ARM_FEATURE_CRC) || op1 == 0x3 ||
 -                (c & 0xd) != 0) {
 +            if (!dc_isar_feature(aa32_crc32, s) || op1 == 0x3 || (c & 0xd) != 0) {
                  goto illegal_op;
              }
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
                  case 0x28:
                  case 0x29:
                  case 0x2a:
 -                    if (!arm_dc_feature(s, ARM_FEATURE_CRC)) {
 +                    if (!dc_isar_feature(aa32_crc32, s)) {
                          goto illegal_op;
                      }
                      break;
@@ -XXX,XX +XXX,XX @@ static void arm_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
      CPUARMState *env = cs->env_ptr;
      ARMCPU *cpu = arm_env_get_cpu(env);
 +    dc->isar = &cpu->isar;
      dc->pc = dc->base.pc_first;
      dc->condjmp = 0;
 --
-.19.1
+.34.1

-[Qemu-devel] [PULL 41/45] target/arm: Reorg NEON VLD/VST single element to one lane
+[PULL 13/26] target/arm: Handle no-execute for Realm and Root regimes
 From: Richard Henderson <richard.henderson@linaro.org>
-Instead of shifts and masks, use direct loads and stores from
+While Root and Realm may read and write data from other spaces,
-the neon register file.
+neither may execute from other pa spaces.
+This happens for Stage1 EL3, EL2, EL2&0, and Stage2 EL1&0.
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20181011205206.3552-21-richard.henderson@linaro.org
+Message-id: 20230620124418.805717-14-richard.henderson@linaro.org
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
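The no-execute rule described in the commit message above can be sketched as a small standalone model. This is an illustration only, not the code in target/arm/ptw.c: the `Space` enum, the function name `exec_denied`, and the boolean parameters `realm_el2_regime` and `scr_sif` are hypothetical stand-ins for `ARMSecuritySpace`, the `mmu_idx` switch, and the `SCR_SIF` test in the real patch. The enum values are assumed to follow the `(nse << 1) | ns` encoding used elsewhere in this series.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for ARMSecuritySpace; values assume (NSE << 1) | NS. */
typedef enum {
    SS_SECURE = 0,
    SS_NONSECURE = 1,
    SS_ROOT = 2,
    SS_REALM = 3,
} Space;

/*
 * Return true if instruction fetch must be refused for a stage 1
 * translation whose input space is `in` and whose output space is `out`.
 * `realm_el2_regime` models "mmu_idx is one of the EL2 / EL2&0 regimes";
 * `scr_sif` models the SCR_EL3.SIF bit.
 */
static bool exec_denied(Space in, Space out, bool realm_el2_regime,
                        bool scr_sif)
{
    if (in == out) {
        return false;            /* same space: normal XN/PXN rules apply */
    }
    switch (in) {
    case SS_ROOT:
        return true;             /* Root never executes from non-Root */
    case SS_REALM:
        /* For EL1&0 the fault is raised by stage 2 instead. */
        return realm_el2_regime;
    case SS_SECURE:
        return scr_sif;          /* Secure: only when SIF is set */
    default:
        /* Input NonSecure must have output NonSecure. */
        assert(in != SS_NONSECURE);
        return false;
    }
}
```

When this returns true, the patch returns read/write permissions only (`prot_rw`), so the execute permission is never granted for that mapping.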
- target/arm/translate.c | 92 +++++++++++++++++++++++-------------------
+ target/arm/ptw.c | 52 ++++++++++++++++++++++++++++++++++++++++++------
- 1 file changed, 50 insertions(+), 42 deletions(-)
+ 1 file changed, 46 insertions(+), 6 deletions(-)
-diff --git a/target/arm/translate.c b/target/arm/translate.c
+diff --git a/target/arm/ptw.c b/target/arm/ptw.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.c
+--- a/target/arm/ptw.c
-+++ b/target/arm/translate.c
++++ b/target/arm/ptw.c
-@@ -XXX,XX +XXX,XX @@ static TCGv_i32 neon_load_reg(int reg, int pass)
+@@ -XXX,XX +XXX,XX @@ do_fault:
-     return tmp;
+  * @xn:      XN (execute-never) bits
- }
+  * @s1_is_el0: true if this is S2 of an S1+2 walk for EL0
+  */
-+static void neon_load_element(TCGv_i32 var, int reg, int ele, TCGMemOp mop)
+-static int get_S2prot(CPUARMState *env, int s2ap, int xn, bool s1_is_el0)
-+{
++static int get_S2prot_noexecute(int s2ap)
-+    long offset = neon_element_offset(reg, ele, mop & MO_SIZE);
+ {
-+
+     int prot = 0;
-+    switch (mop) {
-+    case MO_UB:
+@@ -XXX,XX +XXX,XX @@ static int get_S2prot(CPUARMState *env, int s2ap, int xn, bool s1_is_el0)
-+        tcg_gen_ld8u_i32(var, cpu_env, offset);
+     if (s2ap & 2) {
-+        break;
+         prot |= PAGE_WRITE;
-+    case MO_UW:
+     }
-+        tcg_gen_ld16u_i32(var, cpu_env, offset);
++    return prot;
 +        break;
 +    case MO_UL:
 +        tcg_gen_ld_i32(var, cpu_env, offset);
 +        break;
 +    default:
 +        g_assert_not_reached();
 +    }
 +}
 +
- static void neon_load_element64(TCGv_i64 var, int reg, int ele, TCGMemOp mop)
++static int get_S2prot(CPUARMState *env, int s2ap, int xn, bool s1_is_el0)
  {
      long offset = neon_element_offset(reg, ele, mop & MO_SIZE);
@@ -XXX,XX +XXX,XX @@ static void neon_store_reg(int reg, int pass, TCGv_i32 var)
      tcg_temp_free_i32(var);
  }
 +static void neon_store_element(int reg, int ele, TCGMemOp size, TCGv_i32 var)
 +{
-+    long offset = neon_element_offset(reg, ele, size);
++    int prot = get_S2prot_noexecute(s2ap);
-+
-+    switch (size) {
+     if (cpu_isar_feature(any_tts2uxn, env_archcpu(env))) {
-+    case MO_8:
+         switch (xn) {
-+        tcg_gen_st8_i32(var, cpu_env, offset);
+@@ -XXX,XX +XXX,XX @@ static int get_S1prot(CPUARMState *env, ARMMMUIdx mmu_idx, bool is_aa64,
 +        break;
 +    case MO_16:
 +        tcg_gen_st16_i32(var, cpu_env, offset);
 +        break;
 +    case MO_32:
 +        tcg_gen_st_i32(var, cpu_env, offset);
 +        break;
 +    default:
 +        g_assert_not_reached();
 +    }
 +}
 +
  static void neon_store_element64(int reg, int ele, TCGMemOp size, TCGv_i64 var)
  {
      long offset = neon_element_offset(reg, ele, size);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
      int stride;
      int size;
      int reg;
 -    int pass;
      int load;
 -    int shift;
      int n;
      int vec_size;
      int mmu_idx;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
          } else {
              /* Single element.  */
              int idx = (insn >> 4) & 0xf;
 -            pass = (insn >> 7) & 1;
 +            int reg_idx;
              switch (size) {
              case 0:
 -                shift = ((insn >> 5) & 3) * 8;
 +                reg_idx = (insn >> 5) & 7;
                  stride = 1;
                  break;
              case 1:
 -                shift = ((insn >> 6) & 1) * 16;
 +                reg_idx = (insn >> 6) & 3;
                  stride = (insn & (1 << 5)) ? 2 : 1;
                  break;
              case 2:
 -                shift = 0;
 +                reg_idx = (insn >> 7) & 1;
                  stride = (insn & (1 << 6)) ? 2 : 1;
                  break;
              default:
@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
                   */
                  return 1;
              }
 +            tmp = tcg_temp_new_i32();
              addr = tcg_temp_new_i32();
              load_reg_var(s, addr, rn);
              for (reg = 0; reg < nregs; reg++) {
                  if (load) {
 -                    tmp = tcg_temp_new_i32();
 -                    switch (size) {
 -                    case 0:
 -                        gen_aa32_ld8u(s, tmp, addr, get_mem_index(s));
 -                        break;
 -                    case 1:
 -                        gen_aa32_ld16u(s, tmp, addr, get_mem_index(s));
 -                        break;
 -                    case 2:
 -                        gen_aa32_ld32u(s, tmp, addr, get_mem_index(s));
 -                        break;
 -                    default: /* Avoid compiler warnings.  */
 -                        abort();
 -                    }
 -                    if (size != 2) {
 -                        tmp2 = neon_load_reg(rd, pass);
 -                        tcg_gen_deposit_i32(tmp, tmp2, tmp,
 -                                            shift, size ? 16 : 8);
 -                        tcg_temp_free_i32(tmp2);
 -                    }
 -                    neon_store_reg(rd, pass, tmp);
 +                    gen_aa32_ld_i32(s, tmp, addr, get_mem_index(s),
 +                                    s->be_data | size);
 +                    neon_store_element(rd, reg_idx, size, tmp);
                  } else { /* Store */
 -                    tmp = neon_load_reg(rd, pass);
 -                    if (shift)
 -                        tcg_gen_shri_i32(tmp, tmp, shift);
 -                    switch (size) {
 -                    case 0:
 -                        gen_aa32_st8(s, tmp, addr, get_mem_index(s));
 -                        break;
 -                    case 1:
 -                        gen_aa32_st16(s, tmp, addr, get_mem_index(s));
 -                        break;
 -                    case 2:
 -                        gen_aa32_st32(s, tmp, addr, get_mem_index(s));
 -                        break;
 -                    }
 -                    tcg_temp_free_i32(tmp);
 +                    neon_load_element(tmp, rd, reg_idx, size);
 +                    gen_aa32_st_i32(s, tmp, addr, get_mem_index(s),
 +                                    s->be_data | size);
                  }
                  rd += stride;
                  tcg_gen_addi_i32(addr, addr, 1 << size);
              }
              tcg_temp_free_i32(addr);
 +            tcg_temp_free_i32(tmp);
              stride = nregs * (1 << size);
          }
      }
+-    if (out_pa == ARMSS_NonSecure && in_pa == ARMSS_Secure &&
+-        (env->cp15.scr_el3 & SCR_SIF)) {
+-        return prot_rw;
++    if (in_pa != out_pa) {
++        switch (in_pa) {
++        case ARMSS_Root:
++            /*
++             * R_ZWRVD: permission fault for insn fetched from non-Root,
++             * I_WWBFB: SIF has no effect in EL3.
++             */
++            return prot_rw;
++        case ARMSS_Realm:
++            /*
++             * R_PKTDS: permission fault for insn fetched from non-Realm,
++             * for Realm EL2 or EL2&0.  The corresponding fault for EL1&0
++             * happens during any stage2 translation.
++             */
++            switch (mmu_idx) {
++            case ARMMMUIdx_E2:
++            case ARMMMUIdx_E20_0:
++            case ARMMMUIdx_E20_2:
++            case ARMMMUIdx_E20_2_PAN:
++                return prot_rw;
++            default:
++                break;
++            }
++            break;
++        case ARMSS_Secure:
++            if (env->cp15.scr_el3 & SCR_SIF) {
++                return prot_rw;
++            }
++            break;
++        default:
++            /* Input NonSecure must have output NonSecure. */
++            g_assert_not_reached();
++        }
+     }
+     /* TODO have_wxn should be replaced with
+@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
+         /*
+          * R_GYNXY: For stage2 in Realm security state, bit 55 is NS.
+          * The bit remains ignored for other security states.
++         * R_YMCSL: Executing an insn fetched from non-Realm causes
++         * a stage2 permission fault.
+          */
+         if (out_space == ARMSS_Realm && extract64(attrs, 55, 1)) {
+             out_space = ARMSS_NonSecure;
++            result->f.prot = get_S2prot_noexecute(ap);
++        } else {
++            xn = extract64(attrs, 53, 2);
++            result->f.prot = get_S2prot(env, ap, xn, s1_is_el0);
+         }
+-        xn = extract64(attrs, 53, 2);
+-        result->f.prot = get_S2prot(env, ap, xn, s1_is_el0);
+     } else {
+         int nse, ns = extract32(attrs, 5, 1);
+         switch (out_space) {
 --
-.19.1
+.34.1

-[Qemu-devel] [PULL 37/45] target/arm: Use gvec for NEON_3R_VTST_VCEQ, NEON_3R_VCGT, NEON_3R_VCGE
+[PULL 14/26] target/arm: Use get_phys_addr_with_struct in S1_ptw_translate
 From: Richard Henderson <richard.henderson@linaro.org>
-Move cmtst_op expanders from translate-a64.c.
+Do not provide a fast-path for physical addresses,
 as those will need to be validated for GPC.
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20181011205206.3552-17-richard.henderson@linaro.org
+Message-id: 20230620124418.805717-15-richard.henderson@linaro.org
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
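For readers unfamiliar with CMTST, the scalar expander moved by the older patch in this comparison (an AND, a compare against zero, then a negate) can be modelled in plain C. This is a sketch under assumptions: `cmtst32` is a hypothetical name, and it mirrors only the 32-bit `gen_cmtst_i32` sequence, not the vector path. The source comment's test is read here as `(X & Y) != 0`.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Scalar model of CMTST on one 32-bit element:
 * all ones if the operands share any set bit, else zero.
 */
static uint32_t cmtst32(uint32_t a, uint32_t b)
{
    uint32_t d = a & b;     /* corresponds to tcg_gen_and_i32 */
    d = (d != 0);           /* setcond NE against 0 yields 0 or 1 */
    return 0u - d;          /* negate: 1 becomes 0xffffffff */
}
```

For example, `cmtst32(0x18, 0x10)` yields `0xffffffff` because bit 4 is set in both operands, while `cmtst32(0x0f, 0xf0)` yields `0`.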
- target/arm/translate.h     |  2 +
+ target/arm/ptw.c | 44 +++++++++++++++++---------------------------
- target/arm/translate-a64.c | 38 ------------------
+ 1 file changed, 17 insertions(+), 27 deletions(-)
  target/arm/translate.c     | 81 +++++++++++++++++++++++++++-----------
  3 files changed, 60 insertions(+), 61 deletions(-)
-diff --git a/target/arm/translate.h b/target/arm/translate.h
+diff --git a/target/arm/ptw.c b/target/arm/ptw.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.h
+--- a/target/arm/ptw.c
-+++ b/target/arm/translate.h
++++ b/target/arm/ptw.c
-@@ -XXX,XX +XXX,XX @@ extern const GVecGen3 bit_op;
+@@ -XXX,XX +XXX,XX @@ static bool S1_ptw_translate(CPUARMState *env, S1Translate *ptw,
- extern const GVecGen3 bif_op;
+          * From gdbstub, do not use softmmu so that we don't modify the
- extern const GVecGen3 mla_op[4];
+          * state of the cpu at all, including softmmu tlb contents.
- extern const GVecGen3 mls_op[4];
+          */
-+extern const GVecGen3 cmtst_op[4];
+-        if (regime_is_stage2(s2_mmu_idx)) {
- extern const GVecGen2i ssra_op[4];
+-            S1Translate s2ptw = {
- extern const GVecGen2i usra_op[4];
+-                .in_mmu_idx = s2_mmu_idx,
- extern const GVecGen2i sri_op[4];
+-                .in_ptw_idx = ptw_idx_for_stage_2(env, s2_mmu_idx),
- extern const GVecGen2i sli_op[4];
+-                .in_secure = s2_mmu_idx == ARMMMUIdx_Stage2_S,
-+void gen_cmtst_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
+-                .in_space = (s2_mmu_idx == ARMMMUIdx_Stage2_S ? ARMSS_Secure
+-                             : space == ARMSS_Realm ? ARMSS_Realm
- /*
+-                             : ARMSS_NonSecure),
-  * Forward to the isar_feature_* tests given a DisasContext pointer.
+-                .in_debug = true,
-diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
+-            };
-index XXXXXXX..XXXXXXX 100644
+-            GetPhysAddrResult s2 = { };
---- a/target/arm/translate-a64.c
++        S1Translate s2ptw = {
-+++ b/target/arm/translate-a64.c
++            .in_mmu_idx = s2_mmu_idx,
-@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_diff(DisasContext *s, uint32_t insn)
++            .in_ptw_idx = ptw_idx_for_stage_2(env, s2_mmu_idx),
-     }
++            .in_secure = s2_mmu_idx == ARMMMUIdx_Stage2_S,
- }
++            .in_space = (s2_mmu_idx == ARMMMUIdx_Stage2_S ? ARMSS_Secure
++                         : space == ARMSS_Realm ? ARMSS_Realm
--/* CMTST : test is "if (X & Y != 0)". */
++                         : ARMSS_NonSecure),
--static void gen_cmtst_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
++            .in_debug = true,
--{
++        };
--    tcg_gen_and_i32(d, a, b);
++        GetPhysAddrResult s2 = { };
--    tcg_gen_setcondi_i32(TCG_COND_NE, d, d, 0);
--    tcg_gen_neg_i32(d, d);
+-            if (get_phys_addr_lpae(env, &s2ptw, addr, MMU_DATA_LOAD,
--}
+-                                   false, &s2, fi)) {
--
+-                goto fail;
--static void gen_cmtst_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+-            }
--{
+-            ptw->out_phys = s2.f.phys_addr;
--    tcg_gen_and_i64(d, a, b);
+-            pte_attrs = s2.cacheattrs.attrs;
--    tcg_gen_setcondi_i64(TCG_COND_NE, d, d, 0);
+-            ptw->out_secure = s2.f.attrs.secure;
--    tcg_gen_neg_i64(d, d);
+-            ptw->out_space = s2.f.attrs.space;
--}
+-        } else {
--
+-            /* Regime is physical. */
--static void gen_cmtst_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
+-            ptw->out_phys = addr;
--{
+-            pte_attrs = 0;
--    tcg_gen_and_vec(vece, d, a, b);
+-            ptw->out_secure = s2_mmu_idx == ARMMMUIdx_Phys_S;
--    tcg_gen_dupi_vec(vece, a, 0);
+-            ptw->out_space = (s2_mmu_idx == ARMMMUIdx_Phys_S ? ARMSS_Secure
--    tcg_gen_cmp_vec(TCG_COND_NE, vece, d, d, a);
+-                              : space == ARMSS_Realm ? ARMSS_Realm
--}
+-                              : ARMSS_NonSecure);
--
++        if (get_phys_addr_with_struct(env, &s2ptw, addr,
- static void handle_3same_64(DisasContext *s, int opcode, bool u,
++                                      MMU_DATA_LOAD, &s2, fi)) {
-                             TCGv_i64 tcg_rd, TCGv_i64 tcg_rn, TCGv_i64 tcg_rm)
++            goto fail;
  {
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
  /* Integer op subgroup of C3.6.16. */
  static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
  {
 -    static const GVecGen3 cmtst_op[4] = {
 -        { .fni4 = gen_helper_neon_tst_u8,
 -          .fniv = gen_cmtst_vec,
 -          .vece = MO_8 },
 -        { .fni4 = gen_helper_neon_tst_u16,
 -          .fniv = gen_cmtst_vec,
 -          .vece = MO_16 },
 -        { .fni4 = gen_cmtst_i32,
 -          .fniv = gen_cmtst_vec,
 -          .vece = MO_32 },
 -        { .fni8 = gen_cmtst_i64,
 -          .fniv = gen_cmtst_vec,
 -          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -          .vece = MO_64 },
 -    };
 -
      int is_q = extract32(insn, 30, 1);
      int u = extract32(insn, 29, 1);
      int size = extract32(insn, 22, 2);
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ const GVecGen3 mls_op[4] = {
        .vece = MO_64 },
  };
 +/* CMTST : test is "if (X & Y != 0)". */
 +static void gen_cmtst_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
 +{
 +    tcg_gen_and_i32(d, a, b);
 +    tcg_gen_setcondi_i32(TCG_COND_NE, d, d, 0);
 +    tcg_gen_neg_i32(d, d);
 +}
 +
 +void gen_cmtst_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
 +{
 +    tcg_gen_and_i64(d, a, b);
 +    tcg_gen_setcondi_i64(TCG_COND_NE, d, d, 0);
 +    tcg_gen_neg_i64(d, d);
 +}
 +
 +static void gen_cmtst_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
 +{
 +    tcg_gen_and_vec(vece, d, a, b);
 +    tcg_gen_dupi_vec(vece, a, 0);
 +    tcg_gen_cmp_vec(TCG_COND_NE, vece, d, d, a);
 +}
 +
 +const GVecGen3 cmtst_op[4] = {
 +    { .fni4 = gen_helper_neon_tst_u8,
 +      .fniv = gen_cmtst_vec,
 +      .vece = MO_8 },
 +    { .fni4 = gen_helper_neon_tst_u16,
 +      .fniv = gen_cmtst_vec,
 +      .vece = MO_16 },
 +    { .fni4 = gen_cmtst_i32,
 +      .fniv = gen_cmtst_vec,
 +      .vece = MO_32 },
 +    { .fni8 = gen_cmtst_i64,
 +      .fniv = gen_cmtst_vec,
 +      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +      .vece = MO_64 },
 +};
 +
  /* Translate a NEON data processing instruction.  Return nonzero if the
     instruction is invalid.
     We process data in a mixture of 32-bit and 64-bit chunks.
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
              tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, vec_size, vec_size,
                             u ? &mls_op[size] : &mla_op[size]);
              return 0;
 +
 +        case NEON_3R_VTST_VCEQ:
 +            if (u) { /* VCEQ */
 +                tcg_gen_gvec_cmp(TCG_COND_EQ, size, rd_ofs, rn_ofs, rm_ofs,
 +                                 vec_size, vec_size);
 +            } else { /* VTST */
 +                tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,
 +                               vec_size, vec_size, &cmtst_op[size]);
 +            }
 +            return 0;
 +
 +        case NEON_3R_VCGT:
 +            tcg_gen_gvec_cmp(u ? TCG_COND_GTU : TCG_COND_GT, size,
 +                             rd_ofs, rn_ofs, rm_ofs, vec_size, vec_size);
 +            return 0;
 +
 +        case NEON_3R_VCGE:
 +            tcg_gen_gvec_cmp(u ? TCG_COND_GEU : TCG_COND_GE, size,
 +                             rd_ofs, rn_ofs, rm_ofs, vec_size, vec_size);
 +            return 0;
          }
++        ptw->out_phys = s2.f.phys_addr;
-         if (size == 3) {
++        pte_attrs = s2.cacheattrs.attrs;
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+         ptw->out_host = NULL;
-         case NEON_3R_VQSUB:
+         ptw->out_rw = false;
-             GEN_NEON_INTEGER_OP_ENV(qsub);
++        ptw->out_secure = s2.f.attrs.secure;
-             break;
++        ptw->out_space = s2.f.attrs.space;
--        case NEON_3R_VCGT:
+     } else {
--            GEN_NEON_INTEGER_OP(cgt);
+ #ifdef CONFIG_TCG
--            break;
+         CPUTLBEntryFull *full;
 -        case NEON_3R_VCGE:
 -            GEN_NEON_INTEGER_OP(cge);
 -            break;
          case NEON_3R_VSHL:
              GEN_NEON_INTEGER_OP(shl);
              break;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
              tmp2 = neon_load_reg(rd, pass);
              gen_neon_add(size, tmp, tmp2);
              break;
 -        case NEON_3R_VTST_VCEQ:
 -            if (!u) { /* VTST */
 -                switch (size) {
 -                case 0: gen_helper_neon_tst_u8(tmp, tmp, tmp2); break;
 -                case 1: gen_helper_neon_tst_u16(tmp, tmp, tmp2); break;
 -                case 2: gen_helper_neon_tst_u32(tmp, tmp, tmp2); break;
 -                default: abort();
 -                }
 -            } else { /* VCEQ */
 -                switch (size) {
 -                case 0: gen_helper_neon_ceq_u8(tmp, tmp, tmp2); break;
 -                case 1: gen_helper_neon_ceq_u16(tmp, tmp, tmp2); break;
 -                case 2: gen_helper_neon_ceq_u32(tmp, tmp, tmp2); break;
 -                default: abort();
 -                }
 -            }
 -            break;
          case NEON_3R_VMUL:
              /* VMUL.P8; other cases already eliminated.  */
              gen_helper_neon_mul_p8(tmp, tmp, tmp2);
 --
-.19.1
+.34.1
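
Note on the CMTST expanders moved in the patch above: the comment reads "if (X & Y != 0)", but the intended test is (X & Y) != 0, which is what the and/setcond/neg sequence computes. A minimal scalar sketch of one 32-bit lane follows; the name cmtst32 is hypothetical and is not part of QEMU.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of one 32-bit CMTST lane (hypothetical helper, not QEMU code). */
static uint32_t cmtst32(uint32_t a, uint32_t b)
{
    uint32_t d = a & b;   /* mirrors tcg_gen_and_i32 */
    d = (d != 0);         /* mirrors setcond with TCG_COND_NE: result is 0 or 1 */
    return 0u - d;        /* mirrors tcg_gen_neg_i32: 1 becomes 0xffffffff */
}

/* Small usage check: disjoint bits give 0, any shared bit gives all-ones. */
static void check_cmtst32(void)
{
    assert(cmtst32(0x0f, 0xf0) == 0u);
    assert(cmtst32(0x18, 0x10) == 0xffffffffu);
}
```

The negate step is what turns the boolean into the all-ones lane mask that the vector compare path produces directly.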

-[Qemu-devel] [PULL 25/45] target/arm: Promote consecutive memory ops for aa64
+[PULL 15/26] target/arm: Move s1_is_el0 into S1Translate
 From: Richard Henderson <richard.henderson@linaro.org>
-For a sequence of loads or stores from a single register,
+Instead of passing this to get_phys_addr_lpae, stash it
-little-endian operations can be promoted to an 8-byte op.
+in the S1Translate structure.
 This can reduce the number of operations by a factor of 8.
+Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20181011205206.3552-5-richard.henderson@linaro.org
+Message-id: 20230620124418.805717-16-richard.henderson@linaro.org
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate-a64.c | 66 +++++++++++++++++++++++---------------
+ target/arm/ptw.c | 27 ++++++++++++---------------
-file changed, 40 insertions(+), 26 deletions(-)
+file changed, 12 insertions(+), 15 deletions(-)
-diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
+diff --git a/target/arm/ptw.c b/target/arm/ptw.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-a64.c
+--- a/target/arm/ptw.c
-+++ b/target/arm/translate-a64.c
++++ b/target/arm/ptw.c
-@@ -XXX,XX +XXX,XX @@ static void write_vec_element_i32(DisasContext *s, TCGv_i32 tcg_src,
+@@ -XXX,XX +XXX,XX @@ typedef struct S1Translate {
+     ARMSecuritySpace in_space;
- /* Store from vector register to memory */
+     bool in_secure;
- static void do_vec_st(DisasContext *s, int srcidx, int element,
+     bool in_debug;
--                      TCGv_i64 tcg_addr, int size)
++    /*
-+                      TCGv_i64 tcg_addr, int size, TCGMemOp endian)
++     * If this is stage 2 of a stage 1+2 page table walk, then this must
 +     * be true if stage 1 is an EL0 access; otherwise this is ignored.
 +     * Stage 2 is indicated by in_mmu_idx set to ARMMMUIdx_Stage2{,_S}.
 +     */
 +    bool in_s1_is_el0;
      bool out_secure;
      bool out_rw;
      bool out_be;
@@ -XXX,XX +XXX,XX @@ typedef struct S1Translate {
  } S1Translate;
  static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
 -                               uint64_t address,
 -                               MMUAccessType access_type, bool s1_is_el0,
 +                               uint64_t address, MMUAccessType access_type,
                                 GetPhysAddrResult *result, ARMMMUFaultInfo *fi);
  static bool get_phys_addr_with_struct(CPUARMState *env, S1Translate *ptw,
@@ -XXX,XX +XXX,XX @@ static int check_s2_mmu_setup(ARMCPU *cpu, bool is_aa64, uint64_t tcr,
   * @ptw: Current and next stage parameters for the walk.
   * @address: virtual address to get physical address for
   * @access_type: MMU_DATA_LOAD, MMU_DATA_STORE or MMU_INST_FETCH
 - * @s1_is_el0: if @ptw->in_mmu_idx is ARMMMUIdx_Stage2
 - *             (so this is a stage 2 page table walk),
 - *             must be true if this is stage 2 of a stage 1+2
 - *             walk for an EL0 access. If @mmu_idx is anything else,
 - *             @s1_is_el0 is ignored.
   * @result: set on translation success,
   * @fi: set to fault info if the translation fails
   */
  static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
                                 uint64_t address,
 -                               MMUAccessType access_type, bool s1_is_el0,
 +                               MMUAccessType access_type,
                                 GetPhysAddrResult *result, ARMMMUFaultInfo *fi)
  {
--    TCGMemOp memop = s->be_data + size;
+     ARMCPU *cpu = env_archcpu(env);
-     TCGv_i64 tcg_tmp = tcg_temp_new_i64();
+@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
+             result->f.prot = get_S2prot_noexecute(ap);
-     read_vec_element(s, tcg_tmp, srcidx, element, size);
+         } else {
--    tcg_gen_qemu_st_i64(tcg_tmp, tcg_addr, get_mem_index(s), memop);
+             xn = extract64(attrs, 53, 2);
-+    tcg_gen_qemu_st_i64(tcg_tmp, tcg_addr, get_mem_index(s), endian | size);
+-            result->f.prot = get_S2prot(env, ap, xn, s1_is_el0);
++            result->f.prot = get_S2prot(env, ap, xn, ptw->in_s1_is_el0);
-     tcg_temp_free_i64(tcg_tmp);
+         }
- }
+     } else {
+         int nse, ns = extract32(attrs, 5, 1);
- /* Load from memory to vector register */
+@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_twostage(CPUARMState *env, S1Translate *ptw,
- static void do_vec_ld(DisasContext *s, int destidx, int element,
+     bool ret, ipa_secure;
--                      TCGv_i64 tcg_addr, int size)
+     ARMCacheAttrs cacheattrs1;
-+                      TCGv_i64 tcg_addr, int size, TCGMemOp endian)
+     ARMSecuritySpace ipa_space;
- {
+-    bool is_el0;
--    TCGMemOp memop = s->be_data + size;
+     uint64_t hcr;
-     TCGv_i64 tcg_tmp = tcg_temp_new_i64();
+     ret = get_phys_addr_with_struct(env, ptw, address, access_type, result, fi);
--    tcg_gen_qemu_ld_i64(tcg_tmp, tcg_addr, get_mem_index(s), memop);
+@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_twostage(CPUARMState *env, S1Translate *ptw,
-+    tcg_gen_qemu_ld_i64(tcg_tmp, tcg_addr, get_mem_index(s), endian | size);
+     ipa_secure = result->f.attrs.secure;
-     write_vec_element(s, tcg_tmp, destidx, element, size);
+     ipa_space = result->f.attrs.space;
-     tcg_temp_free_i64(tcg_tmp);
+-    is_el0 = ptw->in_mmu_idx == ARMMMUIdx_Stage1_E0;
-@@ -XXX,XX +XXX,XX @@ static void disas_ldst_multiple_struct(DisasContext *s, uint32_t insn)
++    ptw->in_s1_is_el0 = ptw->in_mmu_idx == ARMMMUIdx_Stage1_E0;
-     bool is_postidx = extract32(insn, 23, 1);
+     ptw->in_mmu_idx = ipa_secure ? ARMMMUIdx_Stage2_S : ARMMMUIdx_Stage2;
-     bool is_q = extract32(insn, 30, 1);
+     ptw->in_secure = ipa_secure;
-     TCGv_i64 tcg_addr, tcg_rn, tcg_ebytes;
+     ptw->in_space = ipa_space;
-+    TCGMemOp endian = s->be_data;
+@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_twostage(CPUARMState *env, S1Translate *ptw,
+         ret = get_phys_addr_pmsav8(env, ipa, access_type,
--    int ebytes = 1 << size;
+                                    ptw->in_mmu_idx, is_secure, result, fi);
--    int elements = (is_q ? 128 : 64) / (8 << size);
+     } else {
-+    int ebytes;   /* bytes per element */
+-        ret = get_phys_addr_lpae(env, ptw, ipa, access_type,
-+    int elements; /* elements per vector */
+-                                 is_el0, result, fi);
-     int rpt;    /* num iterations */
++        ret = get_phys_addr_lpae(env, ptw, ipa, access_type, result, fi);
      int selem;  /* structure elements */
      int r;
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_multiple_struct(DisasContext *s, uint32_t insn)
          gen_check_sp_alignment(s);
      }
+     fi->s2addr = ipa;
-+    /* For our purposes, bytes are always little-endian.  */
-+    if (size == 0) {
+@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_with_struct(CPUARMState *env, S1Translate *ptw,
 +        endian = MO_LE;
 +    }
 +
 +    /* Consecutive little-endian elements from a single register
 +     * can be promoted to a larger little-endian operation.
 +     */
 +    if (selem == 1 && endian == MO_LE) {
 +        size = 3;
 +    }
 +    ebytes = 1 << size;
 +    elements = (is_q ? 16 : 8) / ebytes;
 +
      tcg_rn = cpu_reg_sp(s, rn);
      tcg_addr = tcg_temp_new_i64();
      tcg_gen_mov_i64(tcg_addr, tcg_rn);
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_multiple_struct(DisasContext *s, uint32_t insn)
      for (r = 0; r < rpt; r++) {
          int e;
          for (e = 0; e < elements; e++) {
 -            int tt = (rt + r) % 32;
              int xs;
              for (xs = 0; xs < selem; xs++) {
 +                int tt = (rt + r + xs) % 32;
                  if (is_store) {
 -                    do_vec_st(s, tt, e, tcg_addr, size);
 +                    do_vec_st(s, tt, e, tcg_addr, size, endian);
                  } else {
 -                    do_vec_ld(s, tt, e, tcg_addr, size);
 -
 -                    /* For non-quad operations, setting a slice of the low
 -                     * 64 bits of the register clears the high 64 bits (in
 -                     * the ARM ARM pseudocode this is implicit in the fact
 -                     * that 'rval' is a 64 bit wide variable).
 -                     * For quad operations, we might still need to zero the
 -                     * high bits of SVE.  We optimize by noticing that we only
 -                     * need to do this the first time we touch a register.
 -                     */
 -                    if (e == 0 && (r == 0 || xs == selem - 1)) {
 -                        clear_vec_high(s, is_q, tt);
 -                    }
 +                    do_vec_ld(s, tt, e, tcg_addr, size, endian);
                  }
                  tcg_gen_add_i64(tcg_addr, tcg_addr, tcg_ebytes);
 -                tt = (tt + 1) % 32;
              }
          }
      }
-+    if (!is_store) {
+     if (regime_using_lpae_format(env, mmu_idx)) {
-+        /* For non-quad operations, setting a slice of the low
+-        return get_phys_addr_lpae(env, ptw, address, access_type, false,
-+         * 64 bits of the register clears the high 64 bits (in
+-                                  result, fi);
-+         * the ARM ARM pseudocode this is implicit in the fact
++        return get_phys_addr_lpae(env, ptw, address, access_type, result, fi);
-+         * that 'rval' is a 64 bit wide variable).
+     } else if (arm_feature(env, ARM_FEATURE_V7) ||
-+         * For quad operations, we might still need to zero the
+                regime_sctlr(env, mmu_idx) & SCTLR_XP) {
-+         * high bits of SVE.
+         return get_phys_addr_v6(env, ptw, address, access_type, result, fi);
 +         */
 +        for (r = 0; r < rpt * selem; r++) {
 +            int tt = (rt + r) % 32;
 +            clear_vec_high(s, is_q, tt);
 +        }
 +    }
 +
      if (is_postidx) {
          int rm = extract32(insn, 16, 5);
          if (rm == 31) {
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_single_struct(DisasContext *s, uint32_t insn)
          } else {
              /* Load/store one element per register */
              if (is_load) {
 -                do_vec_ld(s, rt, index, tcg_addr, scale);
 +                do_vec_ld(s, rt, index, tcg_addr, scale, s->be_data);
              } else {
 -                do_vec_st(s, rt, index, tcg_addr, scale);
 +                do_vec_st(s, rt, index, tcg_addr, scale, s->be_data);
              }
          }
          tcg_gen_add_i64(tcg_addr, tcg_addr, tcg_ebytes);
 --
-.19.1
+.34.1
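
Note on the "promote consecutive memory ops" change compared above: the reasoning is that little-endian elements loaded one at a time into consecutive lanes of a register produce the same 64-bit value as a single 8-byte little-endian load. A rough sketch of that equivalence, using plain byte arrays rather than TCG ops; load_le and load_elementwise are hypothetical names for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Little-endian load of nbytes (1, 2, 4 or 8) from p. */
static uint64_t load_le(const uint8_t *p, int nbytes)
{
    uint64_t v = 0;
    for (int i = nbytes - 1; i >= 0; i--) {
        v = (v << 8) | p[i];
    }
    return v;
}

/*
 * Fill a 64-bit register image one element at a time, as the unpromoted
 * loop would: element e of size (1 << size) bytes lands in lane e.
 */
static uint64_t load_elementwise(const uint8_t *mem, int size)
{
    int ebytes = 1 << size;
    int elements = 8 / ebytes;
    uint64_t v = 0;

    for (int e = 0; e < elements; e++) {
        v |= load_le(mem + e * ebytes, ebytes) << (8 * ebytes * e);
    }
    return v;
}
```

For every element size the element-wise result matches load_le(mem, 8), which is why the patch can set size = 3 when selem == 1 and the access is little-endian.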

-[Qemu-devel] [PULL 36/45] target/arm: Use gvec for NEON_3R_VML
+[PULL 16/26] target/arm: Use get_phys_addr_with_struct for stage2
 From: Richard Henderson <richard.henderson@linaro.org>
-Move mla_op and mls_op expanders from translate-a64.c.
+This fixes a bug in which we failed to initialize
 the result attributes properly after the memset.
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20181011205206.3552-16-richard.henderson@linaro.org
+Message-id: 20230620124418.805717-17-richard.henderson@linaro.org
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate.h     |   2 +
+ target/arm/ptw.c | 11 +----------
- target/arm/translate-a64.c | 106 -----------------------------
+file changed, 1 insertion(+), 10 deletions(-)
  target/arm/translate.c     | 134 ++++++++++++++++++++++++++++++++-----
 files changed, 120 insertions(+), 122 deletions(-)
-diff --git a/target/arm/translate.h b/target/arm/translate.h
+diff --git a/target/arm/ptw.c b/target/arm/ptw.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.h
+--- a/target/arm/ptw.c
-+++ b/target/arm/translate.h
++++ b/target/arm/ptw.c
-@@ -XXX,XX +XXX,XX @@ static inline TCGv_i32 get_ahp_flag(void)
+@@ -XXX,XX +XXX,XX @@ typedef struct S1Translate {
- extern const GVecGen3 bsl_op;
+     void *out_host;
- extern const GVecGen3 bit_op;
+ } S1Translate;
- extern const GVecGen3 bif_op;
-+extern const GVecGen3 mla_op[4];
+-static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
-+extern const GVecGen3 mls_op[4];
+-                               uint64_t address, MMUAccessType access_type,
- extern const GVecGen2i ssra_op[4];
+-                               GetPhysAddrResult *result, ARMMMUFaultInfo *fi);
  extern const GVecGen2i usra_op[4];
  extern const GVecGen2i sri_op[4];
 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-a64.c
 +++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
      }
  }
 -static void gen_mla8_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
 -{
 -    gen_helper_neon_mul_u8(a, a, b);
 -    gen_helper_neon_add_u8(d, d, a);
 -}
 -
--static void gen_mla16_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
+ static bool get_phys_addr_with_struct(CPUARMState *env, S1Translate *ptw,
--{
+                                       target_ulong address,
--    gen_helper_neon_mul_u16(a, a, b);
+                                       MMUAccessType access_type,
--    gen_helper_neon_add_u16(d, d, a);
+@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_twostage(CPUARMState *env, S1Translate *ptw,
--}
+     cacheattrs1 = result->cacheattrs;
--
+     memset(result, 0, sizeof(*result));
--static void gen_mla32_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
--{
+-    if (arm_feature(env, ARM_FEATURE_PMSA)) {
--    tcg_gen_mul_i32(a, a, b);
+-        ret = get_phys_addr_pmsav8(env, ipa, access_type,
--    tcg_gen_add_i32(d, d, a);
+-                                   ptw->in_mmu_idx, is_secure, result, fi);
--}
+-    } else {
--
+-        ret = get_phys_addr_lpae(env, ptw, ipa, access_type, result, fi);
--static void gen_mla64_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+-    }
--{
++    ret = get_phys_addr_with_struct(env, ptw, ipa, access_type, result, fi);
--    tcg_gen_mul_i64(a, a, b);
+     fi->s2addr = ipa;
--    tcg_gen_add_i64(d, d, a);
--}
+     /* Combine the S1 and S2 perms.  */
 -
 -static void gen_mla_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
 -{
 -    tcg_gen_mul_vec(vece, a, a, b);
 -    tcg_gen_add_vec(vece, d, d, a);
 -}
 -
 -static void gen_mls8_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
 -{
 -    gen_helper_neon_mul_u8(a, a, b);
 -    gen_helper_neon_sub_u8(d, d, a);
 -}
 -
 -static void gen_mls16_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
 -{
 -    gen_helper_neon_mul_u16(a, a, b);
 -    gen_helper_neon_sub_u16(d, d, a);
 -}
 -
 -static void gen_mls32_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
 -{
 -    tcg_gen_mul_i32(a, a, b);
 -    tcg_gen_sub_i32(d, d, a);
 -}
 -
 -static void gen_mls64_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
 -{
 -    tcg_gen_mul_i64(a, a, b);
 -    tcg_gen_sub_i64(d, d, a);
 -}
 -
 -static void gen_mls_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
 -{
 -    tcg_gen_mul_vec(vece, a, a, b);
 -    tcg_gen_sub_vec(vece, d, d, a);
 -}
 -
  /* Integer op subgroup of C3.6.16. */
  static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
  {
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
            .prefer_i64 = TCG_TARGET_REG_BITS == 64,
            .vece = MO_64 },
      };
 -    static const GVecGen3 mla_op[4] = {
 -        { .fni4 = gen_mla8_i32,
 -          .fniv = gen_mla_vec,
 -          .opc = INDEX_op_mul_vec,
 -          .load_dest = true,
 -          .vece = MO_8 },
 -        { .fni4 = gen_mla16_i32,
 -          .fniv = gen_mla_vec,
 -          .opc = INDEX_op_mul_vec,
 -          .load_dest = true,
 -          .vece = MO_16 },
 -        { .fni4 = gen_mla32_i32,
 -          .fniv = gen_mla_vec,
 -          .opc = INDEX_op_mul_vec,
 -          .load_dest = true,
 -          .vece = MO_32 },
 -        { .fni8 = gen_mla64_i64,
 -          .fniv = gen_mla_vec,
 -          .opc = INDEX_op_mul_vec,
 -          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -          .load_dest = true,
 -          .vece = MO_64 },
 -    };
 -    static const GVecGen3 mls_op[4] = {
 -        { .fni4 = gen_mls8_i32,
 -          .fniv = gen_mls_vec,
 -          .opc = INDEX_op_mul_vec,
 -          .load_dest = true,
 -          .vece = MO_8 },
 -        { .fni4 = gen_mls16_i32,
 -          .fniv = gen_mls_vec,
 -          .opc = INDEX_op_mul_vec,
 -          .load_dest = true,
 -          .vece = MO_16 },
 -        { .fni4 = gen_mls32_i32,
 -          .fniv = gen_mls_vec,
 -          .opc = INDEX_op_mul_vec,
 -          .load_dest = true,
 -          .vece = MO_32 },
 -        { .fni8 = gen_mls64_i64,
 -          .fniv = gen_mls_vec,
 -          .opc = INDEX_op_mul_vec,
 -          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -          .load_dest = true,
 -          .vece = MO_64 },
 -    };
      int is_q = extract32(insn, 30, 1);
      int u = extract32(insn, 29, 1);
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_neon_narrow_op(int op, int u, int size,
  #define NEON_3R_VABA 15
  #define NEON_3R_VADD_VSUB 16
  #define NEON_3R_VTST_VCEQ 17
 -#define NEON_3R_VML 18 /* VMLA, VMLAL, VMLS, VMLSL */
 +#define NEON_3R_VML 18 /* VMLA, VMLS */
  #define NEON_3R_VMUL 19
  #define NEON_3R_VPMAX 20
  #define NEON_3R_VPMIN 21
@@ -XXX,XX +XXX,XX @@ const GVecGen2i sli_op[4] = {
        .vece = MO_64 },
  };
 +static void gen_mla8_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
 +{
 +    gen_helper_neon_mul_u8(a, a, b);
 +    gen_helper_neon_add_u8(d, d, a);
 +}
 +
 +static void gen_mls8_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
 +{
 +    gen_helper_neon_mul_u8(a, a, b);
 +    gen_helper_neon_sub_u8(d, d, a);
 +}
 +
 +static void gen_mla16_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
 +{
 +    gen_helper_neon_mul_u16(a, a, b);
 +    gen_helper_neon_add_u16(d, d, a);
 +}
 +
 +static void gen_mls16_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
 +{
 +    gen_helper_neon_mul_u16(a, a, b);
 +    gen_helper_neon_sub_u16(d, d, a);
 +}
 +
 +static void gen_mla32_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
 +{
 +    tcg_gen_mul_i32(a, a, b);
 +    tcg_gen_add_i32(d, d, a);
 +}
 +
 +static void gen_mls32_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
 +{
 +    tcg_gen_mul_i32(a, a, b);
 +    tcg_gen_sub_i32(d, d, a);
 +}
 +
 +static void gen_mla64_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
 +{
 +    tcg_gen_mul_i64(a, a, b);
 +    tcg_gen_add_i64(d, d, a);
 +}
 +
 +static void gen_mls64_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
 +{
 +    tcg_gen_mul_i64(a, a, b);
 +    tcg_gen_sub_i64(d, d, a);
 +}
 +
 +static void gen_mla_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
 +{
 +    tcg_gen_mul_vec(vece, a, a, b);
 +    tcg_gen_add_vec(vece, d, d, a);
 +}
 +
 +static void gen_mls_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
 +{
 +    tcg_gen_mul_vec(vece, a, a, b);
 +    tcg_gen_sub_vec(vece, d, d, a);
 +}
 +
 +/* Note that while NEON does not support VMLA and VMLS as 64-bit ops,
 + * these tables are shared with AArch64 which does support them.
 + */
 +const GVecGen3 mla_op[4] = {
 +    { .fni4 = gen_mla8_i32,
 +      .fniv = gen_mla_vec,
 +      .opc = INDEX_op_mul_vec,
 +      .load_dest = true,
 +      .vece = MO_8 },
 +    { .fni4 = gen_mla16_i32,
 +      .fniv = gen_mla_vec,
 +      .opc = INDEX_op_mul_vec,
 +      .load_dest = true,
 +      .vece = MO_16 },
 +    { .fni4 = gen_mla32_i32,
 +      .fniv = gen_mla_vec,
 +      .opc = INDEX_op_mul_vec,
 +      .load_dest = true,
 +      .vece = MO_32 },
 +    { .fni8 = gen_mla64_i64,
 +      .fniv = gen_mla_vec,
 +      .opc = INDEX_op_mul_vec,
 +      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +      .load_dest = true,
 +      .vece = MO_64 },
 +};
 +
 +const GVecGen3 mls_op[4] = {
 +    { .fni4 = gen_mls8_i32,
 +      .fniv = gen_mls_vec,
 +      .opc = INDEX_op_mul_vec,
 +      .load_dest = true,
 +      .vece = MO_8 },
 +    { .fni4 = gen_mls16_i32,
 +      .fniv = gen_mls_vec,
 +      .opc = INDEX_op_mul_vec,
 +      .load_dest = true,
 +      .vece = MO_16 },
 +    { .fni4 = gen_mls32_i32,
 +      .fniv = gen_mls_vec,
 +      .opc = INDEX_op_mul_vec,
 +      .load_dest = true,
 +      .vece = MO_32 },
 +    { .fni8 = gen_mls64_i64,
 +      .fniv = gen_mls_vec,
 +      .opc = INDEX_op_mul_vec,
 +      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +      .load_dest = true,
 +      .vece = MO_64 },
 +};
 +
  /* Translate a NEON data processing instruction.  Return nonzero if the
     instruction is invalid.
     We process data in a mixture of 32-bit and 64-bit chunks.
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                  return 0;
              }
              break;
 +
 +        case NEON_3R_VML: /* VMLA, VMLS */
 +            tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, vec_size, vec_size,
 +                           u ? &mls_op[size] : &mla_op[size]);
 +            return 0;
          }
 +
          if (size == 3) {
              /* 64-bit element instructions. */
              for (pass = 0; pass < (q ? 2 : 1); pass++) {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                  }
              }
              break;
 -        case NEON_3R_VML: /* VMLA, VMLAL, VMLS,VMLSL */
 -            switch (size) {
 -            case 0: gen_helper_neon_mul_u8(tmp, tmp, tmp2); break;
 -            case 1: gen_helper_neon_mul_u16(tmp, tmp, tmp2); break;
 -            case 2: tcg_gen_mul_i32(tmp, tmp, tmp2); break;
 -            default: abort();
 -            }
 -            tcg_temp_free_i32(tmp2);
 -            tmp2 = neon_load_reg(rd, pass);
 -            if (u) { /* VMLS */
 -                gen_neon_rsb(size, tmp, tmp2);
 -            } else { /* VMLA */
 -                gen_neon_add(size, tmp, tmp2);
 -            }
 -            break;
          case NEON_3R_VMUL:
              /* VMUL.P8; other cases already eliminated.  */
              gen_helper_neon_mul_p8(tmp, tmp, tmp2);
 --
-.19.1
+.34.1
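
Note on the mla_op/mls_op tables compared above: each expander multiplies a by b and then adds the product to (VMLA) or subtracts it from (VMLS) the destination, per lane and modulo the lane width. A minimal sketch of the 8-bit case on four lanes packed in a 32-bit word; mul_acc_u8x4 is a hypothetical name, and it models the arithmetic only, not the QEMU helpers themselves.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Per-lane d = d + a*b (subtract == 0) or d = d - a*b (subtract != 0)
 * on four 8-bit lanes, each result truncated to 8 bits.
 */
static uint32_t mul_acc_u8x4(uint32_t d, uint32_t a, uint32_t b, int subtract)
{
    uint32_t r = 0;

    for (int lane = 0; lane < 4; lane++) {
        int sh = 8 * lane;
        uint8_t prod = (uint8_t)(((a >> sh) & 0xff) * ((b >> sh) & 0xff));
        uint8_t acc = (uint8_t)(d >> sh);

        acc = subtract ? (uint8_t)(acc - prod) : (uint8_t)(acc + prod);
        r |= (uint32_t)acc << sh;
    }
    return r;
}
```

Because the destination is an input as well as an output, the tables set .load_dest = true.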

-[Qemu-devel] [PULL 27/45] target/arm: Use gvec for NEON VDUP
+[PULL 17/26] target/arm: Add GPC syndrome
 From: Richard Henderson <richard.henderson@linaro.org>
-Also introduces neon_element_offset to find the env offset
+The function takes the fields as filled in by
-of a specific element within a neon register.
+the Arm ARM pseudocode for TakeGPCException.
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20181011205206.3552-7-richard.henderson@linaro.org
+Message-id: 20230620124418.805717-18-richard.henderson@linaro.org
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate.c | 63 ++++++++++++++++++++++++------------------
+ target/arm/syndrome.h | 10 ++++++++++
-file changed, 36 insertions(+), 27 deletions(-)
+file changed, 10 insertions(+)
-diff --git a/target/arm/translate.c b/target/arm/translate.c
+diff --git a/target/arm/syndrome.h b/target/arm/syndrome.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.c
+--- a/target/arm/syndrome.h
-+++ b/target/arm/translate.c
++++ b/target/arm/syndrome.h
-@@ -XXX,XX +XXX,XX @@ neon_reg_offset (int reg, int n)
+@@ -XXX,XX +XXX,XX @@ enum arm_exception_class {
-     return vfp_reg_offset(0, sreg);
+     EC_SVEACCESSTRAP          = 0x19,
      EC_ERETTRAP               = 0x1a,
      EC_SMETRAP                = 0x1d,
 +    EC_GPC                    = 0x1e,
      EC_INSNABORT              = 0x20,
      EC_INSNABORT_SAME_EL      = 0x21,
      EC_PCALIGNMENT            = 0x22,
@@ -XXX,XX +XXX,XX @@ static inline uint32_t syn_bxjtrap(int cv, int cond, int rm)
          (cv << 24) | (cond << 20) | rm;
  }
-+/* Return the offset of a 2**SIZE piece of a NEON register, at index ELE,
++static inline uint32_t syn_gpc(int s2ptw, int ind, int gpcsc,
-+ * where 0 is the least significant end of the register.
++                               int cm, int s1ptw, int wnr, int fsc)
 + */
 +static inline long
 +neon_element_offset(int reg, int element, TCGMemOp size)
 +{
-+    int element_size = 1 << size;
++    /* TODO: FEAT_NV2 adds VNCR */
-+    int ofs = element * element_size;
++    return (EC_GPC << ARM_EL_EC_SHIFT) | ARM_EL_IL | (s2ptw << 21)
-+#ifdef HOST_WORDS_BIGENDIAN
++            | (ind << 20) | (gpcsc << 14) | (cm << 8) | (s1ptw << 7)
-+    /* Calculate the offset assuming fully little-endian,
++            | (wnr << 6) | fsc;
 +     * then XOR to account for the order of the 8-byte units.
 +     */
 +    if (element_size < 8) {
 +        ofs ^= 8 - element_size;
 +    }
 +#endif
 +    return neon_reg_offset(reg, 0) + ofs;
 +}
 +
- static TCGv_i32 neon_load_reg(int reg, int pass)
+ static inline uint32_t syn_insn_abort(int same_el, int ea, int s1ptw, int fsc)
  {
-     TCGv_i32 tmp = tcg_temp_new_i32();
+     return (EC_INSNABORT << ARM_EL_EC_SHIFT) | (same_el << ARM_EL_EC_SHIFT)
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
                      tmp = load_reg(s, rd);
                      if (insn & (1 << 23)) {
                          /* VDUP */
 -                        if (size == 0) {
 -                            gen_neon_dup_u8(tmp, 0);
 -                        } else if (size == 1) {
 -                            gen_neon_dup_low16(tmp);
 -                        }
 -                        for (n = 0; n <= pass * 2; n++) {
 -                            tmp2 = tcg_temp_new_i32();
 -                            tcg_gen_mov_i32(tmp2, tmp);
 -                            neon_store_reg(rn, n, tmp2);
 -                        }
 -                        neon_store_reg(rn, n, tmp);
 +                        int vec_size = pass ? 16 : 8;
 +                        tcg_gen_gvec_dup_i32(size, neon_reg_offset(rn, 0),
 +                                             vec_size, vec_size, tmp);
 +                        tcg_temp_free_i32(tmp);
                      } else {
                          /* VMOV */
                          switch (size) {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                  tcg_temp_free_i32(tmp);
              } else if ((insn & 0x380) == 0) {
                  /* VDUP */
 +                int element;
 +                TCGMemOp size;
 +
                  if ((insn & (7 << 16)) == 0 || (q && (rd & 1))) {
                      return 1;
                  }
 -                if (insn & (1 << 19)) {
 -                    tmp = neon_load_reg(rm, 1);
 -                } else {
 -                    tmp = neon_load_reg(rm, 0);
 -                }
                  if (insn & (1 << 16)) {
 -                    gen_neon_dup_u8(tmp, ((insn >> 17) & 3) * 8);
 +                    size = MO_8;
 +                    element = (insn >> 17) & 7;
                  } else if (insn & (1 << 17)) {
 -                    if ((insn >> 18) & 1)
 -                        gen_neon_dup_high16(tmp);
 -                    else
 -                        gen_neon_dup_low16(tmp);
 +                    size = MO_16;
 +                    element = (insn >> 18) & 3;
 +                } else {
 +                    size = MO_32;
 +                    element = (insn >> 19) & 1;
                  }
 -                for (pass = 0; pass < (q ? 4 : 2); pass++) {
 -                    tmp2 = tcg_temp_new_i32();
 -                    tcg_gen_mov_i32(tmp2, tmp);
 -                    neon_store_reg(rd, pass, tmp2);
 -                }
 -                tcg_temp_free_i32(tmp);
 +                tcg_gen_gvec_dup_mem(size, neon_reg_offset(rd, 0),
 +                                     neon_element_offset(rm, element, size),
 +                                     q ? 16 : 8, q ? 16 : 8);
              } else {
                  return 1;
              }
 --
-.19.1
+.34.1

-[Qemu-devel] [PULL 02/45] target/arm: Add support for VCPU event states
+[PULL 18/26] target/arm: Implement GPC exceptions
-From: Dongjiu Geng <gengdongjiu@huawei.com>
+From: Richard Henderson <richard.henderson@linaro.org>
-This patch extends the qemu-kvm state sync logic with support for
+Handle GPC Fault types in arm_deliver_fault, reporting as
-KVM_GET/SET_VCPU_EVENTS, giving access to yet missing SError exception.
+either a GPC exception at EL3, or falling through to insn
-And also it can support the exception state migration.
+or data aborts at various exception levels.
-The SError exception states include SError pending state and ESR value,
-the kvm_put/get_vcpu_events() will be called when set or get system
-registers. When do migration, if source machine has SError pending,
-QEMU will do this migration regardless whether the target machine supports
-to specify guest ESR value, because if target machine does not support that,
-it can also inject the SError with zero ESR value.
-Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
-Reviewed-by: Andrew Jones <drjones@redhat.com>
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Message-id: 1538067351-23931-3-git-send-email-gengdongjiu@huawei.com
+Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
 Message-id: 20230620124418.805717-19-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/cpu.h     |  7 ++++++
+ target/arm/cpu.h            |  1 +
- target/arm/kvm_arm.h | 24 ++++++++++++++++++
+ target/arm/internals.h      | 27 +++++++++++
- target/arm/kvm.c     | 60 ++++++++++++++++++++++++++++++++++++++++++++
+ target/arm/helper.c         |  5 ++
- target/arm/kvm32.c   | 13 ++++++++++
+ target/arm/tcg/tlb_helper.c | 96 +++++++++++++++++++++++++++++++++++--
- target/arm/kvm64.c   | 13 ++++++++++
+files changed, 126 insertions(+), 3 deletions(-)
  target/arm/machine.c | 22 ++++++++++++++++
 files changed, 139 insertions(+)
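
Note: the commit message says a GPC fault is reported either as a GPC exception at EL3 or falls through to an insn or data abort. The decision can be sketched in isolation as below. This is a minimal sketch, not the code from the patch: the GpcFault enum and the report_as_gpc() name are hypothetical stand-ins for ARMGPCF and report_as_gpc_exception(), and scr_gpf stands for the SCR_EL3.GPF bit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the patch's ARMGPCF enum. */
typedef enum {
    GPCF_NONE,
    GPCF_ADDRESS_SIZE,
    GPCF_WALK,
    GPCF_EABT,
    GPCF_FAIL,
} GpcFault;

/*
 * Return true if the fault is taken as a GPC exception to EL3,
 * false if it falls through to the insn/data abort path.
 * scr_gpf is an assumed parameter modelling SCR_EL3.GPF.
 */
bool report_as_gpc(GpcFault gpcf, bool scr_gpf, int current_el)
{
    assert(current_el >= 0 && current_el <= 3);

    switch (gpcf) {
    case GPCF_NONE:
        /* Not a granule protection fault at all. */
        return false;
    case GPCF_ADDRESS_SIZE:
    case GPCF_WALK:
    case GPCF_EABT:
        /* Faults on the GPT itself are always reported as GPC. */
        return true;
    case GPCF_FAIL:
        /*
         * A granule protection failure taken at EL3 is an abort;
         * below EL3 it is a GPC exception only if SCR_EL3.GPF is set.
         */
        return scr_gpf && current_el != 3;
    }
    return false;
}
```

With SCR_EL3.GPF clear, a GPCF_FAIL from EL0-2 takes the abort path, where the patch may still route it to EL2 depending on HCR_EL2.GPF.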
 diff --git a/target/arm/cpu.h b/target/arm/cpu.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu.h
 +++ b/target/arm/cpu.h
-@@ -XXX,XX +XXX,XX @@ typedef struct CPUARMState {
+@@ -XXX,XX +XXX,XX @@
-          */
+ #define EXCP_UNALIGNED      22   /* v7M UNALIGNED UsageFault */
-     } exception;
+ #define EXCP_DIVBYZERO      23   /* v7M DIVBYZERO UsageFault */
+ #define EXCP_VSERR          24
-+    /* Information associated with an SError */
++#define EXCP_GPC            25   /* v9 Granule Protection Check Fault */
-+    struct {
+ /* NB: add new EXCP_ defines to the array in arm_log_exception() too */
-+        uint8_t pending;
-+        uint8_t has_esr;
+ #define ARMV7M_EXCP_RESET   1
-+        uint64_t esr;
+diff --git a/target/arm/internals.h b/target/arm/internals.h
-+    } serror;
+index XXXXXXX..XXXXXXX 100644
-+
+--- a/target/arm/internals.h
-     /* Thumb-2 EE state.  */
++++ b/target/arm/internals.h
-     uint32_t teecr;
+@@ -XXX,XX +XXX,XX @@ typedef enum ARMFaultType {
-     uint32_t teehbr;
+     ARMFault_ICacheMaint,
-diff --git a/target/arm/kvm_arm.h b/target/arm/kvm_arm.h
+     ARMFault_QEMU_NSCExec, /* v8M: NS executing in S&NSC memory */
-index XXXXXXX..XXXXXXX 100644
+     ARMFault_QEMU_SFault, /* v8M: SecureFault INVTRAN, INVEP or AUVIOL */
---- a/target/arm/kvm_arm.h
++    ARMFault_GPCFOnWalk,
-+++ b/target/arm/kvm_arm.h
++    ARMFault_GPCFOnOutput,
-@@ -XXX,XX +XXX,XX @@ bool write_kvmstate_to_list(ARMCPU *cpu);
+ } ARMFaultType;
-  */
- void kvm_arm_reset_vcpu(ARMCPU *cpu);
++typedef enum ARMGPCF {
++    GPCF_None,
-+/**
++    GPCF_AddressSize,
-+ * kvm_arm_init_serror_injection:
++    GPCF_Walk,
-+ * @cs: CPUState
++    GPCF_EABT,
-+ *
++    GPCF_Fail,
-+ * Check whether KVM can set guest SError syndrome.
++} ARMGPCF;
-+ */
++
 +void kvm_arm_init_serror_injection(CPUState *cs);
 +
 +/**
 + * kvm_get_vcpu_events:
 + * @cpu: ARMCPU
 + *
 + * Get VCPU related state from kvm.
 + */
 +int kvm_get_vcpu_events(ARMCPU *cpu);
 +
 +/**
 + * kvm_put_vcpu_events:
 + * @cpu: ARMCPU
 + *
 + * Put VCPU related state to kvm.
 + */
 +int kvm_put_vcpu_events(ARMCPU *cpu);
 +
  #ifdef CONFIG_KVM
  /**
-  * kvm_arm_create_scratch_host_vcpu:
+  * ARMMMUFaultInfo: Information describing an ARM MMU Fault
-diff --git a/target/arm/kvm.c b/target/arm/kvm.c
+  * @type: Type of fault
-index XXXXXXX..XXXXXXX 100644
++ * @gpcf: Subtype of ARMFault_GPCFOn{Walk,Output}.
---- a/target/arm/kvm.c
+  * @level: Table walk level (for translation, access flag and permission faults)
-+++ b/target/arm/kvm.c
+  * @domain: Domain of the fault address (for non-LPAE CPUs only)
-@@ -XXX,XX +XXX,XX @@ const KVMCapabilityInfo kvm_arch_required_capabilities[] = {
+  * @s2addr: Address that caused a fault at stage 2
- };
++ * @paddr: physical address that caused a fault for gpc
++ * @paddr_space: physical address space that caused a fault for gpc
- static bool cap_has_mp_state;
+  * @stage2: True if we faulted at stage 2
-+static bool cap_has_inject_serror_esr;
+  * @s1ptw: True if we faulted at stage 2 while doing a stage 1 page-table walk
+  * @s1ns: True if we faulted on a non-secure IPA while in secure state
- static ARMHostCPUFeatures arm_host_cpu_features;
+@@ -XXX,XX +XXX,XX @@ typedef enum ARMFaultType {
+ typedef struct ARMMMUFaultInfo ARMMMUFaultInfo;
-@@ -XXX,XX +XXX,XX @@ int kvm_arm_vcpu_init(CPUState *cs)
+ struct ARMMMUFaultInfo {
-     return kvm_vcpu_ioctl(cs, KVM_ARM_VCPU_INIT, &init);
+     ARMFaultType type;
 +    ARMGPCF gpcf;
      target_ulong s2addr;
 +    target_ulong paddr;
 +    ARMSecuritySpace paddr_space;
      int level;
      int domain;
      bool stage2;
@@ -XXX,XX +XXX,XX @@ static inline uint32_t arm_fi_to_lfsc(ARMMMUFaultInfo *fi)
      case ARMFault_Exclusive:
          fsc = 0x35;
          break;
 +    case ARMFault_GPCFOnWalk:
 +        assert(fi->level >= -1 && fi->level <= 3);
 +        if (fi->level < 0) {
 +            fsc = 0b100011;
 +        } else {
 +            fsc = 0b100100 | fi->level;
 +        }
 +        break;
 +    case ARMFault_GPCFOnOutput:
 +        fsc = 0b101000;
 +        break;
      default:
          /* Other faults can't occur in a context that requires a
           * long-format status code.
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.c
 +++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ void arm_log_exception(CPUState *cs)
              [EXCP_UNALIGNED] = "v7M UNALIGNED UsageFault",
              [EXCP_DIVBYZERO] = "v7M DIVBYZERO UsageFault",
              [EXCP_VSERR] = "Virtual SERR",
 +            [EXCP_GPC] = "Granule Protection Check",
          };
          if (idx >= 0 && idx < ARRAY_SIZE(excnames)) {
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_do_interrupt_aarch64(CPUState *cs)
      }
      switch (cs->exception_index) {
 +    case EXCP_GPC:
 +        qemu_log_mask(CPU_LOG_INT, "...with MFAR 0x%" PRIx64 "\n",
 +                      env->cp15.mfar_el3);
 +        /* fall through */
      case EXCP_PREFETCH_ABORT:
      case EXCP_DATA_ABORT:
          /*
 diff --git a/target/arm/tcg/tlb_helper.c b/target/arm/tcg/tlb_helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/tcg/tlb_helper.c
 +++ b/target/arm/tcg/tlb_helper.c
@@ -XXX,XX +XXX,XX @@ static uint32_t compute_fsr_fsc(CPUARMState *env, ARMMMUFaultInfo *fi,
      return fsr;
  }
-+void kvm_arm_init_serror_injection(CPUState *cs)
++static bool report_as_gpc_exception(ARMCPU *cpu, int current_el,
 +                                    ARMMMUFaultInfo *fi)
 +{
-+    cap_has_inject_serror_esr = kvm_check_extension(cs->kvm_state,
++    bool ret;
-+                                    KVM_CAP_ARM_INJECT_SERROR_ESR);
++
-+}
++    switch (fi->gpcf) {
-+
++    case GPCF_None:
- bool kvm_arm_create_scratch_host_vcpu(const uint32_t *cpus_to_try,
++        return false;
-                                       int *fdarray,
++    case GPCF_AddressSize:
-                                       struct kvm_vcpu_init *init)
++    case GPCF_Walk:
-@@ -XXX,XX +XXX,XX @@ int kvm_arm_sync_mpstate_to_qemu(ARMCPU *cpu)
++    case GPCF_EABT:
-     return 0;
++        /* R_PYTGX: GPT faults are reported as GPC. */
- }
++        ret = true;
++        break;
-+int kvm_put_vcpu_events(ARMCPU *cpu)
++    case GPCF_Fail:
-+{
++        /*
-+    CPUARMState *env = &cpu->env;
++         * R_BLYPM: A GPF at EL3 is reported as insn or data abort.
-+    struct kvm_vcpu_events events;
++         * R_VBZMW, R_LXHQR: A GPF at EL[0-2] is reported as a GPC
-+    int ret;
++         * if SCR_EL3.GPF is set, otherwise an insn or data abort.
-+
++         */
-+    if (!kvm_has_vcpu_events()) {
++        ret = (cpu->env.cp15.scr_el3 & SCR_GPF) && current_el != 3;
-+        return 0;
++        break;
-+    }
++    default:
-+
++        g_assert_not_reached();
-+    memset(&events, 0, sizeof(events));
++    }
-+    events.exception.serror_pending = env->serror.pending;
++
-+
++    assert(cpu_isar_feature(aa64_rme, cpu));
-+    /* Inject SError to guest with specified syndrome if host kernel
++    assert(fi->type == ARMFault_GPCFOnWalk ||
-+     * supports it, otherwise inject SError without syndrome.
++           fi->type == ARMFault_GPCFOnOutput);
-+     */
++    if (fi->gpcf == GPCF_AddressSize) {
-+    if (cap_has_inject_serror_esr) {
++        assert(fi->level == 0);
-+        events.exception.serror_has_esr = env->serror.has_esr;
++    } else {
-+        events.exception.serror_esr = env->serror.esr;
++        assert(fi->level >= 0 && fi->level <= 1);
 +    }
 +
 +    ret = kvm_vcpu_ioctl(CPU(cpu), KVM_SET_VCPU_EVENTS, &events);
 +    if (ret) {
 +        error_report("failed to put vcpu events");
 +    }
 +
 +    return ret;
 +}
 +
-+int kvm_get_vcpu_events(ARMCPU *cpu)
++static unsigned encode_gpcsc(ARMMMUFaultInfo *fi)
 +{
-+    CPUARMState *env = &cpu->env;
++    static uint8_t const gpcsc[] = {
-+    struct kvm_vcpu_events events;
++        [GPCF_AddressSize] = 0b000000,
-+    int ret;
++        [GPCF_Walk]        = 0b000100,
-+
++        [GPCF_Fail]        = 0b001100,
-+    if (!kvm_has_vcpu_events()) {
++        [GPCF_EABT]        = 0b010100,
-+        return 0;
++    };
-+    }
++
-+
++    /* Note that we've validated fi->gpcf and fi->level above. */
-+    memset(&events, 0, sizeof(events));
++    return gpcsc[fi->gpcf] | fi->level;
 +    ret = kvm_vcpu_ioctl(CPU(cpu), KVM_GET_VCPU_EVENTS, &events);
 +    if (ret) {
 +        error_report("failed to get vcpu events");
 +        return ret;
 +    }
 +
 +    env->serror.pending = events.exception.serror_pending;
 +    env->serror.has_esr = events.exception.serror_has_esr;
 +    env->serror.esr = events.exception.serror_esr;
 +
 +    return 0;
 +}
 +
- void kvm_arch_pre_run(CPUState *cs, struct kvm_run *run)
+ static G_NORETURN
  void arm_deliver_fault(ARMCPU *cpu, vaddr addr,
                         MMUAccessType access_type,
                         int mmu_idx, ARMMMUFaultInfo *fi)
  {
- }
+     CPUARMState *env = &cpu->env;
-diff --git a/target/arm/kvm32.c b/target/arm/kvm32.c
+-    int target_el;
-index XXXXXXX..XXXXXXX 100644
++    int target_el = exception_target_el(env);
---- a/target/arm/kvm32.c
++    int current_el = arm_current_el(env);
-+++ b/target/arm/kvm32.c
+     bool same_el;
-@@ -XXX,XX +XXX,XX @@ int kvm_arch_init_vcpu(CPUState *cs)
+     uint32_t syn, exc, fsr, fsc;
 -    target_el = exception_target_el(env);
 +    if (report_as_gpc_exception(cpu, current_el, fi)) {
 +        target_el = 3;
 +
 +        fsr = compute_fsr_fsc(env, fi, target_el, mmu_idx, &fsc);
 +
 +        syn = syn_gpc(fi->stage2 && fi->type == ARMFault_GPCFOnWalk,
 +                      access_type == MMU_INST_FETCH,
 +                      encode_gpcsc(fi), 0, fi->s1ptw,
 +                      access_type == MMU_DATA_STORE, fsc);
 +
 +        env->cp15.mfar_el3 = fi->paddr;
 +        switch (fi->paddr_space) {
 +        case ARMSS_Secure:
 +            break;
 +        case ARMSS_NonSecure:
 +            env->cp15.mfar_el3 |= R_MFAR_NS_MASK;
 +            break;
 +        case ARMSS_Root:
 +            env->cp15.mfar_el3 |= R_MFAR_NSE_MASK;
 +            break;
 +        case ARMSS_Realm:
 +            env->cp15.mfar_el3 |= R_MFAR_NSE_MASK | R_MFAR_NS_MASK;
 +            break;
 +        default:
 +            g_assert_not_reached();
 +        }
 +
 +        exc = EXCP_GPC;
 +        goto do_raise;
 +    }
 +
 +    /* If SCR_EL3.GPF is unset, GPF may still be routed to EL2. */
 +    if (fi->gpcf == GPCF_Fail && target_el < 2) {
 +        if (arm_hcr_el2_eff(env) & HCR_GPF) {
 +            target_el = 2;
 +        }
 +    }
 +
      if (fi->stage2) {
          target_el = 2;
          env->cp15.hpfar_el2 = extract64(fi->s2addr, 12, 47) << 4;
@@ -XXX,XX +XXX,XX @@ void arm_deliver_fault(ARMCPU *cpu, vaddr addr,
              env->cp15.hpfar_el2 |= HPFAR_NS;
          }
      }
-     cpu->mp_affinity = mpidr & ARM32_AFFINITY_MASK;
+-    same_el = (arm_current_el(env) == target_el);
-+    /* Check whether userspace can specify guest syndrome value */
++    same_el = current_el == target_el;
-+    kvm_arm_init_serror_injection(cs);
+     fsr = compute_fsr_fsc(env, fi, target_el, mmu_idx, &fsc);
-+
-     return kvm_arm_init_cpreg_list(cpu);
+     if (access_type == MMU_INST_FETCH) {
- }
+@@ -XXX,XX +XXX,XX @@ void arm_deliver_fault(ARMCPU *cpu, vaddr addr,
+         exc = EXCP_DATA_ABORT;
@@ -XXX,XX +XXX,XX @@ int kvm_arch_put_registers(CPUState *cs, int level)
          return ret;
      }
-+    ret = kvm_put_vcpu_events(cpu);
++ do_raise:
-+    if (ret) {
+     env->exception.vaddress = addr;
-+        return ret;
+     env->exception.fsr = fsr;
-+    }
+     raise_exception(env, exc, syn, target_el);
 +
      /* Note that we do not call write_cpustate_to_list()
       * here, so we are only writing the tuple list back to
       * KVM. This is safe because nothing can change the
@@ -XXX,XX +XXX,XX @@ int kvm_arch_get_registers(CPUState *cs)
      }
      vfp_set_fpscr(env, fpscr);
 +    ret = kvm_get_vcpu_events(cpu);
 +    if (ret) {
 +        return ret;
 +    }
 +
      if (!write_kvmstate_to_list(cpu)) {
          return EINVAL;
      }
 diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/kvm64.c
 +++ b/target/arm/kvm64.c
@@ -XXX,XX +XXX,XX @@ int kvm_arch_init_vcpu(CPUState *cs)
      kvm_arm_init_debug(cs);
 +    /* Check whether user space can specify guest syndrome value */
 +    kvm_arm_init_serror_injection(cs);
 +
      return kvm_arm_init_cpreg_list(cpu);
  }
@@ -XXX,XX +XXX,XX @@ int kvm_arch_put_registers(CPUState *cs, int level)
          return ret;
      }
 +    ret = kvm_put_vcpu_events(cpu);
 +    if (ret) {
 +        return ret;
 +    }
 +
      if (!write_list_to_kvmstate(cpu, level)) {
          return EINVAL;
      }
@@ -XXX,XX +XXX,XX @@ int kvm_arch_get_registers(CPUState *cs)
      }
      vfp_set_fpcr(env, fpr);
 +    ret = kvm_get_vcpu_events(cpu);
 +    if (ret) {
 +        return ret;
 +    }
 +
      if (!write_kvmstate_to_list(cpu)) {
          return EINVAL;
      }
 diff --git a/target/arm/machine.c b/target/arm/machine.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/machine.c
 +++ b/target/arm/machine.c
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_sve = {
  };
  #endif /* AARCH64 */
 +static bool serror_needed(void *opaque)
 +{
 +    ARMCPU *cpu = opaque;
 +    CPUARMState *env = &cpu->env;
 +
 +    return env->serror.pending != 0;
 +}
 +
 +static const VMStateDescription vmstate_serror = {
 +    .name = "cpu/serror",
 +    .version_id = 1,
 +    .minimum_version_id = 1,
 +    .needed = serror_needed,
 +    .fields = (VMStateField[]) {
 +        VMSTATE_UINT8(env.serror.pending, ARMCPU),
 +        VMSTATE_UINT8(env.serror.has_esr, ARMCPU),
 +        VMSTATE_UINT64(env.serror.esr, ARMCPU),
 +        VMSTATE_END_OF_LIST()
 +    }
 +};
 +
  static bool m_needed(void *opaque)
  {
      ARMCPU *cpu = opaque;
@@ -XXX,XX +XXX,XX @@ const VMStateDescription vmstate_arm_cpu = {
  #ifdef TARGET_AARCH64
          &vmstate_sve,
  #endif
 +        &vmstate_serror,
          NULL
      }
  };
 --
-.19.1
+.34.1
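
Note: the two status encodings used by the GPC patch above can be checked on their own. The sketch below takes the table values and level checks from the patch. The function names, the GpcFault enum and the flattened (gpcf, level) arguments are hypothetical; the patch reads both from ARMMMUFaultInfo. Hex constants replace the binary literals only for portability.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the patch's ARMGPCF enum. */
typedef enum {
    GPCF_NONE,
    GPCF_ADDRESS_SIZE,
    GPCF_WALK,
    GPCF_EABT,
    GPCF_FAIL,
} GpcFault;

/* GPCSC: fault subtype in the upper bits, GPT level in the low bits. */
unsigned encode_gpcsc(GpcFault gpcf, int level)
{
    static const uint8_t base[] = {
        [GPCF_ADDRESS_SIZE] = 0x00, /* 0b000000 */
        [GPCF_WALK]         = 0x04, /* 0b000100 */
        [GPCF_FAIL]         = 0x0c, /* 0b001100 */
        [GPCF_EABT]         = 0x14, /* 0b010100 */
    };

    /* Same constraints the patch asserts before encoding. */
    assert(gpcf != GPCF_NONE);
    assert(level >= 0 && level <= 1);
    assert(gpcf != GPCF_ADDRESS_SIZE || level == 0);

    return base[gpcf] | (unsigned)level;
}

/* Long-format FSC for a GPC fault on a table walk at the given level. */
unsigned gpcf_on_walk_fsc(int level)
{
    assert(level >= -1 && level <= 3);
    return level < 0 ? 0x23 : (0x24 | (unsigned)level); /* 0b100011, 0b1001LL */
}
```

For example, a GPT walk fault at level 1 encodes as 0b000101, and a fault on the output address uses the fixed FSC 0b101000 with no level field.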

-[Qemu-devel] [PULL 34/45] target/arm: Use gvec for VSRA
+[PULL 19/26] target/arm: Implement the granule protection check
 From: Richard Henderson <richard.henderson@linaro.org>
-Move ssra_op and usra_op expanders from translate-a64.c.
+Place the check at the end of get_phys_addr_with_struct,
 so that we check all physical results.
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20181011205206.3552-14-richard.henderson@linaro.org
+Message-id: 20230620124418.805717-20-richard.henderson@linaro.org
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate.h     |   2 +
+ target/arm/ptw.c | 249 +++++++++++++++++++++++++++++++++++++++++++----
- target/arm/translate-a64.c | 106 ----------------------------
+file changed, 232 insertions(+), 17 deletions(-)
  target/arm/translate.c     | 139 ++++++++++++++++++++++++++++++++++---
 files changed, 130 insertions(+), 117 deletions(-)
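
Note: the check added to ptw.c ends by decoding a 4-bit GPI value against the physical address space of the access. A minimal sketch of that last step follows. All names are hypothetical. The numbering of the security spaces (Secure 0, NonSecure 1, Root 2, Realm 3) is an assumption, chosen so that a GPI of 0b10xx names space xx as the comparison in the patch implies.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed numbering: a GPI of 0b10xx grants access to space xx only. */
typedef enum {
    SS_SECURE    = 0,
    SS_NONSECURE = 1,
    SS_ROOT      = 2,
    SS_REALM     = 3,
} SecuritySpace;

typedef enum {
    GPI_ALLOW,      /* access passes the check */
    GPI_DENY,       /* granule protection failure */
    GPI_RESERVED,   /* reserved encoding, treated as a GPT walk fault */
} GpiResult;

/*
 * Pick the GPI for paddr out of a level 1 granules descriptor:
 * one 64-bit entry holds 16 GPIs, one per granule of size 1 << pgs.
 */
unsigned l1_gpi(uint64_t entry, uint64_t paddr, unsigned pgs)
{
    unsigned index = (unsigned)(paddr >> pgs) & 0xf;
    return (unsigned)(entry >> (index * 4)) & 0xf;
}

/* Decode a GPI against the physical address space of the access. */
GpiResult check_gpi(unsigned gpi, SecuritySpace pspace)
{
    assert(gpi <= 0xf);

    switch (gpi) {
    case 0x0: /* no access */
        return GPI_DENY;
    case 0xf: /* all access */
        return GPI_ALLOW;
    case 0x8:
    case 0x9:
    case 0xa:
    case 0xb:
        return (unsigned)pspace == (gpi & 3) ? GPI_ALLOW : GPI_DENY;
    default:
        return GPI_RESERVED;
    }
}
```

With 4KB granules (pgs = 12), physical address 0x1000 selects the second nibble of the entry, so an entry of 0xB0 yields GPI 0b1011, which passes only for Realm accesses.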
-diff --git a/target/arm/translate.h b/target/arm/translate.h
+diff --git a/target/arm/ptw.c b/target/arm/ptw.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.h
+--- a/target/arm/ptw.c
-+++ b/target/arm/translate.h
++++ b/target/arm/ptw.c
-@@ -XXX,XX +XXX,XX @@ static inline TCGv_i32 get_ahp_flag(void)
+@@ -XXX,XX +XXX,XX @@ typedef struct S1Translate {
- extern const GVecGen3 bsl_op;
+     void *out_host;
- extern const GVecGen3 bit_op;
+ } S1Translate;
- extern const GVecGen3 bif_op;
-+extern const GVecGen2i ssra_op[4];
+-static bool get_phys_addr_with_struct(CPUARMState *env, S1Translate *ptw,
-+extern const GVecGen2i usra_op[4];
+-                                      target_ulong address,
+-                                      MMUAccessType access_type,
- /*
+-                                      GetPhysAddrResult *result,
-  * Forward to the isar_feature_* tests given a DisasContext pointer.
+-                                      ARMMMUFaultInfo *fi);
-diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
++static bool get_phys_addr_nogpc(CPUARMState *env, S1Translate *ptw,
-index XXXXXXX..XXXXXXX 100644
++                                target_ulong address,
---- a/target/arm/translate-a64.c
++                                MMUAccessType access_type,
-+++ b/target/arm/translate-a64.c
++                                GetPhysAddrResult *result,
-@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_two_reg_misc(DisasContext *s, uint32_t insn)
++                                ARMMMUFaultInfo *fi);
 +
 +static bool get_phys_addr_gpc(CPUARMState *env, S1Translate *ptw,
 +                              target_ulong address,
 +                              MMUAccessType access_type,
 +                              GetPhysAddrResult *result,
 +                              ARMMMUFaultInfo *fi);
  /* This mapping is common between ID_AA64MMFR0.PARANGE and TCR_ELx.{I}PS. */
  static const uint8_t pamax_map[] = {
@@ -XXX,XX +XXX,XX @@ static bool regime_translation_disabled(CPUARMState *env, ARMMMUIdx mmu_idx,
      return (regime_sctlr(env, mmu_idx) & SCTLR_M) == 0;
  }
 +static bool granule_protection_check(CPUARMState *env, uint64_t paddress,
 +                                     ARMSecuritySpace pspace,
 +                                     ARMMMUFaultInfo *fi)
 +{
 +    MemTxAttrs attrs = {
 +        .secure = true,
 +        .space = ARMSS_Root,
 +    };
 +    ARMCPU *cpu = env_archcpu(env);
 +    uint64_t gpccr = env->cp15.gpccr_el3;
 +    unsigned pps, pgs, l0gptsz, level = 0;
 +    uint64_t tableaddr, pps_mask, align, entry, index;
 +    AddressSpace *as;
 +    MemTxResult result;
 +    int gpi;
 +
 +    if (!FIELD_EX64(gpccr, GPCCR, GPC)) {
 +        return true;
 +    }
 +
 +    /*
 +     * GPC Priority 1 (R_GMGRR):
 +     * R_JWCSM: If the configuration of GPCCR_EL3 is invalid,
 +     * the access fails as GPT walk fault at level 0.
 +     */
 +
 +    /*
 +     * Configuration of PPS to a value exceeding the implemented
 +     * physical address size is invalid.
 +     */
 +    pps = FIELD_EX64(gpccr, GPCCR, PPS);
 +    if (pps > FIELD_EX64(cpu->isar.id_aa64mmfr0, ID_AA64MMFR0, PARANGE)) {
 +        goto fault_walk;
 +    }
 +    pps = pamax_map[pps];
 +    pps_mask = MAKE_64BIT_MASK(0, pps);
 +
 +    switch (FIELD_EX64(gpccr, GPCCR, SH)) {
 +    case 0b10: /* outer shareable */
 +        break;
 +    case 0b00: /* non-shareable */
 +    case 0b11: /* inner shareable */
 +        /* Inner and Outer non-cacheable requires Outer shareable. */
 +        if (FIELD_EX64(gpccr, GPCCR, ORGN) == 0 &&
 +            FIELD_EX64(gpccr, GPCCR, IRGN) == 0) {
 +            goto fault_walk;
 +        }
 +        break;
 +    default:   /* reserved */
 +        goto fault_walk;
 +    }
 +
 +    switch (FIELD_EX64(gpccr, GPCCR, PGS)) {
 +    case 0b00: /* 4KB */
 +        pgs = 12;
 +        break;
 +    case 0b01: /* 64KB */
 +        pgs = 16;
 +        break;
 +    case 0b10: /* 16KB */
 +        pgs = 14;
 +        break;
 +    default: /* reserved */
 +        goto fault_walk;
 +    }
 +
 +    /* Note this field is read-only and fixed at reset. */
 +    l0gptsz = 30 + FIELD_EX64(gpccr, GPCCR, L0GPTSZ);
 +
 +    /*
 +     * GPC Priority 2: Secure, Realm or Root address exceeds PPS.
 +     * R_CPDSB: A NonSecure physical address input exceeding PPS
 +     * does not experience any fault.
 +     */
 +    if (paddress & ~pps_mask) {
 +        if (pspace == ARMSS_NonSecure) {
 +            return true;
 +        }
 +        goto fault_size;
 +    }
 +
 +    /* GPC Priority 3: the base address of GPTBR_EL3 exceeds PPS. */
 +    tableaddr = env->cp15.gptbr_el3 << 12;
 +    if (tableaddr & ~pps_mask) {
 +        goto fault_size;
 +    }
 +
 +    /*
 +     * BADDR is aligned per a function of PPS and L0GPTSZ.
 +     * These bits of GPTBR_EL3 are RES0, but are not a configuration error,
 +     * unlike the RES0 bits of the GPT entries (R_XNKFZ).
 +     */
 +    align = MAX(pps - l0gptsz + 3, 12);
 +    align = MAKE_64BIT_MASK(0, align);
 +    tableaddr &= ~align;
 +
 +    as = arm_addressspace(env_cpu(env), attrs);
 +
 +    /* Level 0 lookup. */
 +    index = extract64(paddress, l0gptsz, pps - l0gptsz);
 +    tableaddr += index * 8;
 +    entry = address_space_ldq_le(as, tableaddr, attrs, &result);
 +    if (result != MEMTX_OK) {
 +        goto fault_eabt;
 +    }
 +
 +    switch (extract32(entry, 0, 4)) {
 +    case 1: /* block descriptor */
 +        if (entry >> 8) {
 +            goto fault_walk; /* RES0 bits not 0 */
 +        }
 +        gpi = extract32(entry, 4, 4);
 +        goto found;
 +    case 3: /* table descriptor */
 +        tableaddr = entry & ~0xf;
 +        align = MAX(l0gptsz - pgs - 1, 12);
 +        align = MAKE_64BIT_MASK(0, align);
 +        if (tableaddr & (~pps_mask | align)) {
 +            goto fault_walk; /* RES0 bits not 0 */
 +        }
 +        break;
 +    default: /* invalid */
 +        goto fault_walk;
 +    }
 +
 +    /* Level 1 lookup */
 +    level = 1;
 +    index = extract64(paddress, pgs + 4, l0gptsz - pgs - 4);
 +    tableaddr += index * 8;
 +    entry = address_space_ldq_le(as, tableaddr, attrs, &result);
 +    if (result != MEMTX_OK) {
 +        goto fault_eabt;
 +    }
 +
 +    switch (extract32(entry, 0, 4)) {
 +    case 1: /* contiguous descriptor */
 +        if (entry >> 10) {
 +            goto fault_walk; /* RES0 bits not 0 */
 +        }
 +        /*
 +         * Because the softmmu tlb only works on units of TARGET_PAGE_SIZE,
 +         * and because we cannot invalidate by pa, and thus will always
 +         * flush entire tlbs, we don't actually care about the range here
 +         * and can simply extract the GPI as the result.
 +         */
 +        if (extract32(entry, 8, 2) == 0) {
 +            goto fault_walk; /* reserved contig */
 +        }
 +        gpi = extract32(entry, 4, 4);
 +        break;
 +    default:
 +        index = extract64(paddress, pgs, 4);
 +        gpi = extract64(entry, index * 4, 4);
 +        break;
 +    }
 +
 + found:
 +    switch (gpi) {
 +    case 0b0000: /* no access */
 +        break;
 +    case 0b1111: /* all access */
 +        return true;
 +    case 0b1000:
 +    case 0b1001:
 +    case 0b1010:
 +    case 0b1011:
 +        if (pspace == (gpi & 3)) {
 +            return true;
 +        }
 +        break;
 +    default:
 +        goto fault_walk; /* reserved */
 +    }
 +
 +    fi->gpcf = GPCF_Fail;
 +    goto fault_common;
 + fault_eabt:
 +    fi->gpcf = GPCF_EABT;
 +    goto fault_common;
 + fault_size:
 +    fi->gpcf = GPCF_AddressSize;
 +    goto fault_common;
 + fault_walk:
 +    fi->gpcf = GPCF_Walk;
 + fault_common:
 +    fi->level = level;
 +    fi->paddr = paddress;
 +    fi->paddr_space = pspace;
 +    return false;
 +}
 +
  static bool S2_attrs_are_device(uint64_t hcr, uint8_t attrs)
  {
      /*
@@ -XXX,XX +XXX,XX @@ static bool S1_ptw_translate(CPUARMState *env, S1Translate *ptw,
          };
          GetPhysAddrResult s2 = { };
 -        if (get_phys_addr_with_struct(env, &s2ptw, addr,
 -                                      MMU_DATA_LOAD, &s2, fi)) {
 +        if (get_phys_addr_gpc(env, &s2ptw, addr, MMU_DATA_LOAD, &s2, fi)) {
              goto fail;
          }
 +
          ptw->out_phys = s2.f.phys_addr;
          pte_attrs = s2.cacheattrs.attrs;
          ptw->out_host = NULL;
@@ -XXX,XX +XXX,XX @@ static bool S1_ptw_translate(CPUARMState *env, S1Translate *ptw,
   fail:
      assert(fi->type != ARMFault_None);
 +    if (fi->type == ARMFault_GPCFOnOutput) {
 +        fi->type = ARMFault_GPCFOnWalk;
 +    }
      fi->s2addr = addr;
      fi->stage2 = true;
      fi->s1ptw = true;
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_disabled(CPUARMState *env, target_ulong address,
                                     ARMMMUFaultInfo *fi)
  {
      uint8_t memattr = 0x00;    /* Device nGnRnE */
 -    uint8_t shareability = 0;  /* non-sharable */
 +    uint8_t shareability = 0;  /* non-shareable */
      int r_el;
      switch (mmu_idx) {
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_disabled(CPUARMState *env, target_ulong address,
              } else {
                  memattr = 0x44;  /* Normal, NC, No */
              }
 -            shareability = 2; /* outer sharable */
 +            shareability = 2; /* outer shareable */
          }
          result->cacheattrs.is_s2_format = false;
          break;
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_twostage(CPUARMState *env, S1Translate *ptw,
      ARMSecuritySpace ipa_space;
      uint64_t hcr;
 -    ret = get_phys_addr_with_struct(env, ptw, address, access_type, result, fi);
 +    ret = get_phys_addr_nogpc(env, ptw, address, access_type, result, fi);
      /* If S1 fails, return early.  */
      if (ret) {
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_twostage(CPUARMState *env, S1Translate *ptw,
      cacheattrs1 = result->cacheattrs;
      memset(result, 0, sizeof(*result));
 -    ret = get_phys_addr_with_struct(env, ptw, ipa, access_type, result, fi);
 +    ret = get_phys_addr_nogpc(env, ptw, ipa, access_type, result, fi);
      fi->s2addr = ipa;
      /* Combine the S1 and S2 perms.  */
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_twostage(CPUARMState *env, S1Translate *ptw,
      return false;
  }
 -static bool get_phys_addr_with_struct(CPUARMState *env, S1Translate *ptw,
 +static bool get_phys_addr_nogpc(CPUARMState *env, S1Translate *ptw,
                                        target_ulong address,
                                        MMUAccessType access_type,
                                        GetPhysAddrResult *result,
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_with_struct(CPUARMState *env, S1Translate *ptw,
      }
  }
--static void gen_ssra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
++static bool get_phys_addr_gpc(CPUARMState *env, S1Translate *ptw,
--{
++                              target_ulong address,
--    tcg_gen_vec_sar8i_i64(a, a, shift);
++                              MMUAccessType access_type,
--    tcg_gen_vec_add8_i64(d, d, a);
++                              GetPhysAddrResult *result,
--}
++                              ARMMMUFaultInfo *fi)
 -
 -static void gen_ssra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
 -{
 -    tcg_gen_vec_sar16i_i64(a, a, shift);
 -    tcg_gen_vec_add16_i64(d, d, a);
 -}
 -
 -static void gen_ssra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift)
 -{
 -    tcg_gen_sari_i32(a, a, shift);
 -    tcg_gen_add_i32(d, d, a);
 -}
 -
 -static void gen_ssra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
 -{
 -    tcg_gen_sari_i64(a, a, shift);
 -    tcg_gen_add_i64(d, d, a);
 -}
 -
 -static void gen_ssra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
 -{
 -    tcg_gen_sari_vec(vece, a, a, sh);
 -    tcg_gen_add_vec(vece, d, d, a);
 -}
 -
 -static void gen_usra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
 -{
 -    tcg_gen_vec_shr8i_i64(a, a, shift);
 -    tcg_gen_vec_add8_i64(d, d, a);
 -}
 -
 -static void gen_usra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
 -{
 -    tcg_gen_vec_shr16i_i64(a, a, shift);
 -    tcg_gen_vec_add16_i64(d, d, a);
 -}
 -
 -static void gen_usra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift)
 -{
 -    tcg_gen_shri_i32(a, a, shift);
 -    tcg_gen_add_i32(d, d, a);
 -}
 -
 -static void gen_usra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
 -{
 -    tcg_gen_shri_i64(a, a, shift);
 -    tcg_gen_add_i64(d, d, a);
 -}
 -
 -static void gen_usra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
 -{
 -    tcg_gen_shri_vec(vece, a, a, sh);
 -    tcg_gen_add_vec(vece, d, d, a);
 -}
 -
  static void gen_shr8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
  {
      uint64_t mask = dup_const(MO_8, 0xff >> shift);
@@ -XXX,XX +XXX,XX @@ static void gen_shr_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
  static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
                                   int immh, int immb, int opcode, int rn, int rd)
  {
 -    static const GVecGen2i ssra_op[4] = {
 -        { .fni8 = gen_ssra8_i64,
 -          .fniv = gen_ssra_vec,
 -          .load_dest = true,
 -          .opc = INDEX_op_sari_vec,
 -          .vece = MO_8 },
 -        { .fni8 = gen_ssra16_i64,
 -          .fniv = gen_ssra_vec,
 -          .load_dest = true,
 -          .opc = INDEX_op_sari_vec,
 -          .vece = MO_16 },
 -        { .fni4 = gen_ssra32_i32,
 -          .fniv = gen_ssra_vec,
 -          .load_dest = true,
 -          .opc = INDEX_op_sari_vec,
 -          .vece = MO_32 },
 -        { .fni8 = gen_ssra64_i64,
 -          .fniv = gen_ssra_vec,
 -          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -          .load_dest = true,
 -          .opc = INDEX_op_sari_vec,
 -          .vece = MO_64 },
 -    };
 -    static const GVecGen2i usra_op[4] = {
 -        { .fni8 = gen_usra8_i64,
 -          .fniv = gen_usra_vec,
 -          .load_dest = true,
 -          .opc = INDEX_op_shri_vec,
 -          .vece = MO_8, },
 -        { .fni8 = gen_usra16_i64,
 -          .fniv = gen_usra_vec,
 -          .load_dest = true,
 -          .opc = INDEX_op_shri_vec,
 -          .vece = MO_16, },
 -        { .fni4 = gen_usra32_i32,
 -          .fniv = gen_usra_vec,
 -          .load_dest = true,
 -          .opc = INDEX_op_shri_vec,
 -          .vece = MO_32, },
 -        { .fni8 = gen_usra64_i64,
 -          .fniv = gen_usra_vec,
 -          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -          .load_dest = true,
 -          .opc = INDEX_op_shri_vec,
 -          .vece = MO_64, },
 -    };
      static const GVecGen2i sri_op[4] = {
          { .fni8 = gen_shr8_ins_i64,
            .fniv = gen_shr_ins_vec,
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ const GVecGen3 bif_op = {
      .load_dest = true
  };
 +static void gen_ssra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
 +{
-+    tcg_gen_vec_sar8i_i64(a, a, shift);
++    if (get_phys_addr_nogpc(env, ptw, address, access_type, result, fi)) {
-+    tcg_gen_vec_add8_i64(d, d, a);
++        return true;
 +    }
 +    if (!granule_protection_check(env, result->f.phys_addr,
 +                                  result->f.attrs.space, fi)) {
 +        fi->type = ARMFault_GPCFOnOutput;
 +        return true;
 +    }
 +    return false;
 +}
 +
-+static void gen_ssra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
+ bool get_phys_addr_with_secure(CPUARMState *env, target_ulong address,
-+{
+                                MMUAccessType access_type, ARMMMUIdx mmu_idx,
-+    tcg_gen_vec_sar16i_i64(a, a, shift);
+                                bool is_secure, GetPhysAddrResult *result,
-+    tcg_gen_vec_add16_i64(d, d, a);
+@@ -XXX,XX +XXX,XX @@ bool get_phys_addr_with_secure(CPUARMState *env, target_ulong address,
-+}
+         .in_secure = is_secure,
-+
+         .in_space = arm_secure_to_space(is_secure),
-+static void gen_ssra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift)
+     };
-+{
+-    return get_phys_addr_with_struct(env, &ptw, address, access_type,
-+    tcg_gen_sari_i32(a, a, shift);
+-                                     result, fi);
-+    tcg_gen_add_i32(d, d, a);
++    return get_phys_addr_gpc(env, &ptw, address, access_type, result, fi);
-+}
+ }
-+
-+static void gen_ssra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
+ bool get_phys_addr(CPUARMState *env, target_ulong address,
-+{
+@@ -XXX,XX +XXX,XX @@ bool get_phys_addr(CPUARMState *env, target_ulong address,
-+    tcg_gen_sari_i64(a, a, shift);
-+    tcg_gen_add_i64(d, d, a);
+     ptw.in_space = ss;
-+}
+     ptw.in_secure = arm_space_is_secure(ss);
-+
+-    return get_phys_addr_with_struct(env, &ptw, address, access_type,
-+static void gen_ssra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
+-                                     result, fi);
-+{
++    return get_phys_addr_gpc(env, &ptw, address, access_type, result, fi);
-+    tcg_gen_sari_vec(vece, a, a, sh);
+ }
-+    tcg_gen_add_vec(vece, d, d, a);
-+}
+ hwaddr arm_cpu_get_phys_page_attrs_debug(CPUState *cs, vaddr addr,
-+
+@@ -XXX,XX +XXX,XX @@ hwaddr arm_cpu_get_phys_page_attrs_debug(CPUState *cs, vaddr addr,
-+const GVecGen2i ssra_op[4] = {
+     ARMMMUFaultInfo fi = {};
-+    { .fni8 = gen_ssra8_i64,
+     bool ret;
-+      .fniv = gen_ssra_vec,
-+      .load_dest = true,
+-    ret = get_phys_addr_with_struct(env, &ptw, addr, MMU_DATA_LOAD, &res, &fi);
-+      .opc = INDEX_op_sari_vec,
++    ret = get_phys_addr_gpc(env, &ptw, addr, MMU_DATA_LOAD, &res, &fi);
-+      .vece = MO_8 },
+     *attrs = res.f.attrs;
-+    { .fni8 = gen_ssra16_i64,
-+      .fniv = gen_ssra_vec,
+     if (ret) {
 +      .load_dest = true,
 +      .opc = INDEX_op_sari_vec,
 +      .vece = MO_16 },
 +    { .fni4 = gen_ssra32_i32,
 +      .fniv = gen_ssra_vec,
 +      .load_dest = true,
 +      .opc = INDEX_op_sari_vec,
 +      .vece = MO_32 },
 +    { .fni8 = gen_ssra64_i64,
 +      .fniv = gen_ssra_vec,
 +      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +      .load_dest = true,
 +      .opc = INDEX_op_sari_vec,
 +      .vece = MO_64 },
 +};
 +
 +static void gen_usra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
 +{
 +    tcg_gen_vec_shr8i_i64(a, a, shift);
 +    tcg_gen_vec_add8_i64(d, d, a);
 +}
 +
 +static void gen_usra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
 +{
 +    tcg_gen_vec_shr16i_i64(a, a, shift);
 +    tcg_gen_vec_add16_i64(d, d, a);
 +}
 +
 +static void gen_usra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift)
 +{
 +    tcg_gen_shri_i32(a, a, shift);
 +    tcg_gen_add_i32(d, d, a);
 +}
 +
 +static void gen_usra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
 +{
 +    tcg_gen_shri_i64(a, a, shift);
 +    tcg_gen_add_i64(d, d, a);
 +}
 +
 +static void gen_usra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
 +{
 +    tcg_gen_shri_vec(vece, a, a, sh);
 +    tcg_gen_add_vec(vece, d, d, a);
 +}
 +
 +const GVecGen2i usra_op[4] = {
 +    { .fni8 = gen_usra8_i64,
 +      .fniv = gen_usra_vec,
 +      .load_dest = true,
 +      .opc = INDEX_op_shri_vec,
 +      .vece = MO_8, },
 +    { .fni8 = gen_usra16_i64,
 +      .fniv = gen_usra_vec,
 +      .load_dest = true,
 +      .opc = INDEX_op_shri_vec,
 +      .vece = MO_16, },
 +    { .fni4 = gen_usra32_i32,
 +      .fniv = gen_usra_vec,
 +      .load_dest = true,
 +      .opc = INDEX_op_shri_vec,
 +      .vece = MO_32, },
 +    { .fni8 = gen_usra64_i64,
 +      .fniv = gen_usra_vec,
 +      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +      .load_dest = true,
 +      .opc = INDEX_op_shri_vec,
 +      .vece = MO_64, },
 +};
  /* Translate a NEON data processing instruction.  Return nonzero if the
     instruction is invalid.
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                      }
                      return 0;
 +                case 1:  /* VSRA */
 +                    /* Right shift comes here negative.  */
 +                    shift = -shift;
 +                    /* Shifts larger than the element size are architecturally
 +                     * valid.  Unsigned results in all zeros; signed results
 +                     * in all sign bits.
 +                     */
 +                    if (!u) {
 +                        tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, vec_size,
 +                                        MIN(shift, (8 << size) - 1),
 +                                        &ssra_op[size]);
 +                    } else if (shift >= 8 << size) {
 +                        /* rd += 0 */
 +                    } else {
 +                        tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, vec_size,
 +                                        shift, &usra_op[size]);
 +                    }
 +                    return 0;
 +
                  case 5: /* VSHL, VSLI */
                      if (!u) { /* VSHL */
                          /* Shifts larger than the element size are
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                          neon_load_reg64(cpu_V0, rm + pass);
                          tcg_gen_movi_i64(cpu_V1, imm);
                          switch (op) {
 -                        case 1:  /* VSRA */
 -                            if (u)
 -                                gen_helper_neon_shl_u64(cpu_V0, cpu_V0, cpu_V1);
 -                            else
 -                                gen_helper_neon_shl_s64(cpu_V0, cpu_V0, cpu_V1);
 -                            break;
                          case 2: /* VRSHR */
                          case 3: /* VRSRA */
                              if (u)
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                          default:
                              g_assert_not_reached();
                          }
 -                        if (op == 1 || op == 3) {
 +                        if (op == 3) {
                              /* Accumulate.  */
                              neon_load_reg64(cpu_V1, rd + pass);
                              tcg_gen_add_i64(cpu_V0, cpu_V0, cpu_V1);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                          tmp2 = tcg_temp_new_i32();
                          tcg_gen_movi_i32(tmp2, imm);
                          switch (op) {
 -                        case 1:  /* VSRA */
 -                            GEN_NEON_INTEGER_OP(shl);
 -                            break;
                          case 2: /* VRSHR */
                          case 3: /* VRSRA */
                              GEN_NEON_INTEGER_OP(rshl);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                          }
                          tcg_temp_free_i32(tmp2);
 -                        if (op == 1 || op == 3) {
 +                        if (op == 3) {
                              /* Accumulate.  */
                              tmp2 = neon_load_reg(rd, pass);
                              gen_neon_add(size, tmp, tmp2);
 --
-.19.1
+.34.1

-[Qemu-devel] [PULL 35/45] target/arm: Use gvec for VSRI, VSLI
+[PULL 20/26] target/arm: Add cpu properties for enabling FEAT_RME
 From: Richard Henderson <richard.henderson@linaro.org>
-Move shi_op and sli_op expanders from translate-a64.c.
+Add an x-rme cpu property to enable FEAT_RME.
 Add an x-l0gptsz property to set GPCCR_EL3.L0GPTSZ,
 for testing various possible configurations.
 We're not currently completely sure whether FEAT_RME will
 be OK to enable purely as a CPU-level property, or if it will
 need board co-operation, so we're making these experimental
 x- properties, so that the people developing the system
 level software for RME can try to start using this and let
 us know how it goes. The command line syntax for enabling
 this will change in future, without backwards-compatibility.
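
 For reference, a minimal standalone sketch of the x-l0gptsz value
 mapping used by the setter and getter in this patch. The helper names
 l0gptsz_encode and l0gptsz_decode are hypothetical and do not exist in
 QEMU; only the accepted values (30, 34, 36, 39) and the "- 30" / "+ 30"
 offset are taken from the diff below.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): map an x-l0gptsz
 * property value onto the field value that the setter stores in
 * cpu->reset_l0gptsz.  Returns false for values the setter rejects.
 */
bool l0gptsz_encode(uint32_t value, uint32_t *field)
{
    switch (value) {
    case 30:
    case 34:
    case 36:
    case 39:
        /* Same arithmetic as the patch: stored value is value - 30. */
        *field = value - 30;
        return true;
    default:
        /* The real setter reports "invalid value for l0gptsz" here. */
        return false;
    }
}

/* Inverse mapping, mirroring the "+ 30" done by the property getter. */
uint32_t l0gptsz_decode(uint32_t field)
{
    return field + 30;
}
```

 With this mapping the four accepted property values are stored as
 0, 4, 6 and 9, and reading the property back returns the original
 number; any other input is refused rather than rounded.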
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20181011205206.3552-15-richard.henderson@linaro.org
+Message-id: 20230620124418.805717-21-richard.henderson@linaro.org
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate.h     |   2 +
+ target/arm/tcg/cpu64.c | 53 ++++++++++++++++++++++++++++++++++++++++++
- target/arm/translate-a64.c | 152 +----------------------
+file changed, 53 insertions(+)
  target/arm/translate.c     | 244 ++++++++++++++++++++++++++-----------
 files changed, 179 insertions(+), 219 deletions(-)
-diff --git a/target/arm/translate.h b/target/arm/translate.h
+diff --git a/target/arm/tcg/cpu64.c b/target/arm/tcg/cpu64.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.h
+--- a/target/arm/tcg/cpu64.c
-+++ b/target/arm/translate.h
++++ b/target/arm/tcg/cpu64.c
-@@ -XXX,XX +XXX,XX @@ extern const GVecGen3 bit_op;
+@@ -XXX,XX +XXX,XX @@ static void cpu_max_set_sve_max_vq(Object *obj, Visitor *v, const char *name,
- extern const GVecGen3 bif_op;
+     cpu->sve_max_vq = max_vq;
  extern const GVecGen2i ssra_op[4];
  extern const GVecGen2i usra_op[4];
 +extern const GVecGen2i sri_op[4];
 +extern const GVecGen2i sli_op[4];
  /*
   * Forward to the isar_feature_* tests given a DisasContext pointer.
 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-a64.c
 +++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_two_reg_misc(DisasContext *s, uint32_t insn)
      }
  }
--static void gen_shr8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
++static bool cpu_arm_get_rme(Object *obj, Error **errp)
 -{
 -    uint64_t mask = dup_const(MO_8, 0xff >> shift);
 -    TCGv_i64 t = tcg_temp_new_i64();
 -
 -    tcg_gen_shri_i64(t, a, shift);
 -    tcg_gen_andi_i64(t, t, mask);
 -    tcg_gen_andi_i64(d, d, ~mask);
 -    tcg_gen_or_i64(d, d, t);
 -    tcg_temp_free_i64(t);
 -}
 -
 -static void gen_shr16_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
 -{
 -    uint64_t mask = dup_const(MO_16, 0xffff >> shift);
 -    TCGv_i64 t = tcg_temp_new_i64();
 -
 -    tcg_gen_shri_i64(t, a, shift);
 -    tcg_gen_andi_i64(t, t, mask);
 -    tcg_gen_andi_i64(d, d, ~mask);
 -    tcg_gen_or_i64(d, d, t);
 -    tcg_temp_free_i64(t);
 -}
 -
 -static void gen_shr32_ins_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift)
 -{
 -    tcg_gen_shri_i32(a, a, shift);
 -    tcg_gen_deposit_i32(d, d, a, 0, 32 - shift);
 -}
 -
 -static void gen_shr64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
 -{
 -    tcg_gen_shri_i64(a, a, shift);
 -    tcg_gen_deposit_i64(d, d, a, 0, 64 - shift);
 -}
 -
 -static void gen_shr_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
 -{
 -    uint64_t mask = (2ull << ((8 << vece) - 1)) - 1;
 -    TCGv_vec t = tcg_temp_new_vec_matching(d);
 -    TCGv_vec m = tcg_temp_new_vec_matching(d);
 -
 -    tcg_gen_dupi_vec(vece, m, mask ^ (mask >> sh));
 -    tcg_gen_shri_vec(vece, t, a, sh);
 -    tcg_gen_and_vec(vece, d, d, m);
 -    tcg_gen_or_vec(vece, d, d, t);
 -
 -    tcg_temp_free_vec(t);
 -    tcg_temp_free_vec(m);
 -}
 -
  /* SSHR[RA]/USHR[RA] - Vector shift right (optional rounding/accumulate) */
  static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
                                   int immh, int immb, int opcode, int rn, int rd)
  {
 -    static const GVecGen2i sri_op[4] = {
 -        { .fni8 = gen_shr8_ins_i64,
 -          .fniv = gen_shr_ins_vec,
 -          .load_dest = true,
 -          .opc = INDEX_op_shri_vec,
 -          .vece = MO_8 },
 -        { .fni8 = gen_shr16_ins_i64,
 -          .fniv = gen_shr_ins_vec,
 -          .load_dest = true,
 -          .opc = INDEX_op_shri_vec,
 -          .vece = MO_16 },
 -        { .fni4 = gen_shr32_ins_i32,
 -          .fniv = gen_shr_ins_vec,
 -          .load_dest = true,
 -          .opc = INDEX_op_shri_vec,
 -          .vece = MO_32 },
 -        { .fni8 = gen_shr64_ins_i64,
 -          .fniv = gen_shr_ins_vec,
 -          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -          .load_dest = true,
 -          .opc = INDEX_op_shri_vec,
 -          .vece = MO_64 },
 -    };
 -
      int size = 32 - clz32(immh) - 1;
      int immhb = immh << 3 | immb;
      int shift = 2 * (8 << size) - immhb;
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
      clear_vec_high(s, is_q, rd);
  }
 -static void gen_shl8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
 -{
 -    uint64_t mask = dup_const(MO_8, 0xff << shift);
 -    TCGv_i64 t = tcg_temp_new_i64();
 -
 -    tcg_gen_shli_i64(t, a, shift);
 -    tcg_gen_andi_i64(t, t, mask);
 -    tcg_gen_andi_i64(d, d, ~mask);
 -    tcg_gen_or_i64(d, d, t);
 -    tcg_temp_free_i64(t);
 -}
 -
 -static void gen_shl16_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
 -{
 -    uint64_t mask = dup_const(MO_16, 0xffff << shift);
 -    TCGv_i64 t = tcg_temp_new_i64();
 -
 -    tcg_gen_shli_i64(t, a, shift);
 -    tcg_gen_andi_i64(t, t, mask);
 -    tcg_gen_andi_i64(d, d, ~mask);
 -    tcg_gen_or_i64(d, d, t);
 -    tcg_temp_free_i64(t);
 -}
 -
 -static void gen_shl32_ins_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift)
 -{
 -    tcg_gen_deposit_i32(d, d, a, shift, 32 - shift);
 -}
 -
 -static void gen_shl64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
 -{
 -    tcg_gen_deposit_i64(d, d, a, shift, 64 - shift);
 -}
 -
 -static void gen_shl_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
 -{
 -    uint64_t mask = (1ull << sh) - 1;
 -    TCGv_vec t = tcg_temp_new_vec_matching(d);
 -    TCGv_vec m = tcg_temp_new_vec_matching(d);
 -
 -    tcg_gen_dupi_vec(vece, m, mask);
 -    tcg_gen_shli_vec(vece, t, a, sh);
 -    tcg_gen_and_vec(vece, d, d, m);
 -    tcg_gen_or_vec(vece, d, d, t);
 -
 -    tcg_temp_free_vec(t);
 -    tcg_temp_free_vec(m);
 -}
 -
  /* SHL/SLI - Vector shift left */
  static void handle_vec_simd_shli(DisasContext *s, bool is_q, bool insert,
                                   int immh, int immb, int opcode, int rn, int rd)
  {
 -    static const GVecGen2i shi_op[4] = {
 -        { .fni8 = gen_shl8_ins_i64,
 -          .fniv = gen_shl_ins_vec,
 -          .opc = INDEX_op_shli_vec,
 -          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -          .load_dest = true,
 -          .vece = MO_8 },
 -        { .fni8 = gen_shl16_ins_i64,
 -          .fniv = gen_shl_ins_vec,
 -          .opc = INDEX_op_shli_vec,
 -          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -          .load_dest = true,
 -          .vece = MO_16 },
 -        { .fni4 = gen_shl32_ins_i32,
 -          .fniv = gen_shl_ins_vec,
 -          .opc = INDEX_op_shli_vec,
 -          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -          .load_dest = true,
 -          .vece = MO_32 },
 -        { .fni8 = gen_shl64_ins_i64,
 -          .fniv = gen_shl_ins_vec,
 -          .opc = INDEX_op_shli_vec,
 -          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -          .load_dest = true,
 -          .vece = MO_64 },
 -    };
      int size = 32 - clz32(immh) - 1;
      int immhb = immh << 3 | immb;
      int shift = immhb - (8 << size);
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shli(DisasContext *s, bool is_q, bool insert,
      }
      if (insert) {
 -        gen_gvec_op2i(s, is_q, rd, rn, shift, &shi_op[size]);
 +        gen_gvec_op2i(s, is_q, rd, rn, shift, &sli_op[size]);
      } else {
          gen_gvec_fn2i(s, is_q, rd, rn, shift, tcg_gen_gvec_shli, size);
      }
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ const GVecGen2i usra_op[4] = {
        .vece = MO_64, },
  };
 +static void gen_shr8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
 +{
-+    uint64_t mask = dup_const(MO_8, 0xff >> shift);
++    ARMCPU *cpu = ARM_CPU(obj);
-+    TCGv_i64 t = tcg_temp_new_i64();
++    return cpu_isar_feature(aa64_rme, cpu);
 +
 +    tcg_gen_shri_i64(t, a, shift);
 +    tcg_gen_andi_i64(t, t, mask);
 +    tcg_gen_andi_i64(d, d, ~mask);
 +    tcg_gen_or_i64(d, d, t);
 +    tcg_temp_free_i64(t);
 +}
 +
-+static void gen_shr16_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
++static void cpu_arm_set_rme(Object *obj, bool value, Error **errp)
 +{
-+    uint64_t mask = dup_const(MO_16, 0xffff >> shift);
++    ARMCPU *cpu = ARM_CPU(obj);
-+    TCGv_i64 t = tcg_temp_new_i64();
++    uint64_t t;
 +
-+    tcg_gen_shri_i64(t, a, shift);
++    t = cpu->isar.id_aa64pfr0;
-+    tcg_gen_andi_i64(t, t, mask);
++    t = FIELD_DP64(t, ID_AA64PFR0, RME, value);
-+    tcg_gen_andi_i64(d, d, ~mask);
++    cpu->isar.id_aa64pfr0 = t;
 +    tcg_gen_or_i64(d, d, t);
 +    tcg_temp_free_i64(t);
 +}
 +
-+static void gen_shr32_ins_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift)
++static void cpu_max_set_l0gptsz(Object *obj, Visitor *v, const char *name,
 +                                void *opaque, Error **errp)
 +{
-+    tcg_gen_shri_i32(a, a, shift);
++    ARMCPU *cpu = ARM_CPU(obj);
-+    tcg_gen_deposit_i32(d, d, a, 0, 32 - shift);
++    uint32_t value;
 +}
 +
-+static void gen_shr64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
++    if (!visit_type_uint32(v, name, &value, errp)) {
-+{
++        return;
-+    tcg_gen_shri_i64(a, a, shift);
++    }
 +    tcg_gen_deposit_i64(d, d, a, 0, 64 - shift);
 +}
 +
-+static void gen_shr_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
++    /* Encode the value for the GPCCR_EL3 field. */
-+{
++    switch (value) {
-+    if (sh == 0) {
++    case 30:
-+        tcg_gen_mov_vec(d, a);
++    case 34:
-+    } else {
++    case 36:
-+        TCGv_vec t = tcg_temp_new_vec_matching(d);
++    case 39:
-+        TCGv_vec m = tcg_temp_new_vec_matching(d);
++        cpu->reset_l0gptsz = value - 30;
-+
++        break;
-+        tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK((8 << vece) - sh, sh));
++    default:
-+        tcg_gen_shri_vec(vece, t, a, sh);
++        error_setg(errp, "invalid value for l0gptsz");
-+        tcg_gen_and_vec(vece, d, d, m);
++        error_append_hint(errp, "valid values are 30, 34, 36, 39\n");
-+        tcg_gen_or_vec(vece, d, d, t);
++        break;
 +
 +        tcg_temp_free_vec(t);
 +        tcg_temp_free_vec(m);
 +    }
 +}
 +
-+const GVecGen2i sri_op[4] = {
++static void cpu_max_get_l0gptsz(Object *obj, Visitor *v, const char *name,
-+    { .fni8 = gen_shr8_ins_i64,
++                                void *opaque, Error **errp)
-+      .fniv = gen_shr_ins_vec,
++{
-+      .load_dest = true,
++    ARMCPU *cpu = ARM_CPU(obj);
-+      .opc = INDEX_op_shri_vec,
++    uint32_t value = cpu->reset_l0gptsz + 30;
 +      .vece = MO_8 },
 +    { .fni8 = gen_shr16_ins_i64,
 +      .fniv = gen_shr_ins_vec,
 +      .load_dest = true,
 +      .opc = INDEX_op_shri_vec,
 +      .vece = MO_16 },
 +    { .fni4 = gen_shr32_ins_i32,
 +      .fniv = gen_shr_ins_vec,
 +      .load_dest = true,
 +      .opc = INDEX_op_shri_vec,
 +      .vece = MO_32 },
 +    { .fni8 = gen_shr64_ins_i64,
 +      .fniv = gen_shr_ins_vec,
 +      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +      .load_dest = true,
 +      .opc = INDEX_op_shri_vec,
 +      .vece = MO_64 },
 +};
 +
-+static void gen_shl8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
++    visit_type_uint32(v, name, &value, errp);
 +{
 +    uint64_t mask = dup_const(MO_8, 0xff << shift);
 +    TCGv_i64 t = tcg_temp_new_i64();
 +
 +    tcg_gen_shli_i64(t, a, shift);
 +    tcg_gen_andi_i64(t, t, mask);
 +    tcg_gen_andi_i64(d, d, ~mask);
 +    tcg_gen_or_i64(d, d, t);
 +    tcg_temp_free_i64(t);
 +}
 +
-+static void gen_shl16_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
+ static Property arm_cpu_lpa2_property =
-+{
+     DEFINE_PROP_BOOL("lpa2", ARMCPU, prop_lpa2, true);
-+    uint64_t mask = dup_const(MO_16, 0xffff << shift);
-+    TCGv_i64 t = tcg_temp_new_i64();
+@@ -XXX,XX +XXX,XX @@ void aarch64_max_tcg_initfn(Object *obj)
-+
+     aarch64_add_sme_properties(obj);
-+    tcg_gen_shli_i64(t, a, shift);
+     object_property_add(obj, "sve-max-vq", "uint32", cpu_max_get_sve_max_vq,
-+    tcg_gen_andi_i64(t, t, mask);
+                         cpu_max_set_sve_max_vq, NULL, NULL);
-+    tcg_gen_andi_i64(d, d, ~mask);
++    object_property_add_bool(obj, "x-rme", cpu_arm_get_rme, cpu_arm_set_rme);
-+    tcg_gen_or_i64(d, d, t);
++    object_property_add(obj, "x-l0gptsz", "uint32", cpu_max_get_l0gptsz,
-+    tcg_temp_free_i64(t);
++                        cpu_max_set_l0gptsz, NULL, NULL);
-+}
+     qdev_property_add_static(DEVICE(obj), &arm_cpu_lpa2_property);
-+
+ }
-+static void gen_shl32_ins_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift)
 +{
 +    tcg_gen_deposit_i32(d, d, a, shift, 32 - shift);
 +}
 +
 +static void gen_shl64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
 +{
 +    tcg_gen_deposit_i64(d, d, a, shift, 64 - shift);
 +}
 +
 +static void gen_shl_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
 +{
 +    if (sh == 0) {
 +        tcg_gen_mov_vec(d, a);
 +    } else {
 +        TCGv_vec t = tcg_temp_new_vec_matching(d);
 +        TCGv_vec m = tcg_temp_new_vec_matching(d);
 +
 +        tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK(0, sh));
 +        tcg_gen_shli_vec(vece, t, a, sh);
 +        tcg_gen_and_vec(vece, d, d, m);
 +        tcg_gen_or_vec(vece, d, d, t);
 +
 +        tcg_temp_free_vec(t);
 +        tcg_temp_free_vec(m);
 +    }
 +}
 +
 +const GVecGen2i sli_op[4] = {
 +    { .fni8 = gen_shl8_ins_i64,
 +      .fniv = gen_shl_ins_vec,
 +      .load_dest = true,
 +      .opc = INDEX_op_shli_vec,
 +      .vece = MO_8 },
 +    { .fni8 = gen_shl16_ins_i64,
 +      .fniv = gen_shl_ins_vec,
 +      .load_dest = true,
 +      .opc = INDEX_op_shli_vec,
 +      .vece = MO_16 },
 +    { .fni4 = gen_shl32_ins_i32,
 +      .fniv = gen_shl_ins_vec,
 +      .load_dest = true,
 +      .opc = INDEX_op_shli_vec,
 +      .vece = MO_32 },
 +    { .fni8 = gen_shl64_ins_i64,
 +      .fniv = gen_shl_ins_vec,
 +      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +      .load_dest = true,
 +      .opc = INDEX_op_shli_vec,
 +      .vece = MO_64 },
 +};
 +
  /* Translate a NEON data processing instruction.  Return nonzero if the
     instruction is invalid.
     We process data in a mixture of 32-bit and 64-bit chunks.
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
      int pairwise;
      int u;
      int vec_size;
 -    uint32_t imm, mask;
 +    uint32_t imm;
      TCGv_i32 tmp, tmp2, tmp3, tmp4, tmp5;
      TCGv_ptr ptr1, ptr2, ptr3;
      TCGv_i64 tmp64;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                      }
                      return 0;
 +                case 4: /* VSRI */
 +                    if (!u) {
 +                        return 1;
 +                    }
 +                    /* Right shift comes here negative.  */
 +                    shift = -shift;
 +                    /* Shift out of range leaves destination unchanged.  */
 +                    if (shift < 8 << size) {
 +                        tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, vec_size,
 +                                        shift, &sri_op[size]);
 +                    }
 +                    return 0;
 +
                  case 5: /* VSHL, VSLI */
 -                    if (!u) { /* VSHL */
 +                    if (u) { /* VSLI */
 +                        /* Shift out of range leaves destination unchanged.  */
 +                        if (shift < 8 << size) {
 +                            tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size,
 +                                            vec_size, shift, &sli_op[size]);
 +                        }
 +                    } else { /* VSHL */
                          /* Shifts larger than the element size are
                           * architecturally valid and results in zero.
                           */
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                              tcg_gen_gvec_shli(size, rd_ofs, rm_ofs, shift,
                                                vec_size, vec_size);
                          }
 -                        return 0;
                      }
 -                    break;
 +                    return 0;
                  }
                  if (size == 3) {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                              else
                                  gen_helper_neon_rshl_s64(cpu_V0, cpu_V0, cpu_V1);
                              break;
 -                        case 4: /* VSRI */
 -                        case 5: /* VSHL, VSLI */
 -                            gen_helper_neon_shl_u64(cpu_V0, cpu_V0, cpu_V1);
 -                            break;
                          case 6: /* VQSHLU */
                              gen_helper_neon_qshlu_s64(cpu_V0, cpu_env,
                                                        cpu_V0, cpu_V1);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                              /* Accumulate.  */
                              neon_load_reg64(cpu_V1, rd + pass);
                              tcg_gen_add_i64(cpu_V0, cpu_V0, cpu_V1);
 -                        } else if (op == 4 || (op == 5 && u)) {
 -                            /* Insert */
 -                            neon_load_reg64(cpu_V1, rd + pass);
 -                            uint64_t mask;
 -                            if (shift < -63 || shift > 63) {
 -                                mask = 0;
 -                            } else {
 -                                if (op == 4) {
 -                                    mask = 0xffffffffffffffffull >> -shift;
 -                                } else {
 -                                    mask = 0xffffffffffffffffull << shift;
 -                                }
 -                            }
 -                            tcg_gen_andi_i64(cpu_V1, cpu_V1, ~mask);
 -                            tcg_gen_or_i64(cpu_V0, cpu_V0, cpu_V1);
                          }
                          neon_store_reg64(cpu_V0, rd + pass);
                      } else { /* size < 3 */
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                          case 3: /* VRSRA */
                              GEN_NEON_INTEGER_OP(rshl);
                              break;
 -                        case 4: /* VSRI */
 -                        case 5: /* VSHL, VSLI */
 -                            switch (size) {
 -                            case 0: gen_helper_neon_shl_u8(tmp, tmp, tmp2); break;
 -                            case 1: gen_helper_neon_shl_u16(tmp, tmp, tmp2); break;
 -                            case 2: gen_helper_neon_shl_u32(tmp, tmp, tmp2); break;
 -                            default: abort();
 -                            }
 -                            break;
                          case 6: /* VQSHLU */
                              switch (size) {
                              case 0:
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                              tmp2 = neon_load_reg(rd, pass);
                              gen_neon_add(size, tmp, tmp2);
                              tcg_temp_free_i32(tmp2);
 -                        } else if (op == 4 || (op == 5 && u)) {
 -                            /* Insert */
 -                            switch (size) {
 -                            case 0:
 -                                if (op == 4)
 -                                    mask = 0xff >> -shift;
 -                                else
 -                                    mask = (uint8_t)(0xff << shift);
 -                                mask |= mask << 8;
 -                                mask |= mask << 16;
 -                                break;
 -                            case 1:
 -                                if (op == 4)
 -                                    mask = 0xffff >> -shift;
 -                                else
 -                                    mask = (uint16_t)(0xffff << shift);
 -                                mask |= mask << 16;
 -                                break;
 -                            case 2:
 -                                if (shift < -31 || shift > 31) {
 -                                    mask = 0;
 -                                } else {
 -                                    if (op == 4)
 -                                        mask = 0xffffffffu >> -shift;
 -                                    else
 -                                        mask = 0xffffffffu << shift;
 -                                }
 -                                break;
 -                            default:
 -                                abort();
 -                            }
 -                            tmp2 = neon_load_reg(rd, pass);
 -                            tcg_gen_andi_i32(tmp, tmp, mask);
 -                            tcg_gen_andi_i32(tmp2, tmp2, ~mask);
 -                            tcg_gen_or_i32(tmp, tmp, tmp2);
 -                            tcg_temp_free_i32(tmp2);
                          }
                          neon_store_reg(rd, pass, tmp);
                      }
 --
-.19.1
+.34.1

-[Qemu-devel] [PULL 33/45] target/arm: Use gvec for VSHR, VSHL
+[PULL 21/26] docs/system/arm: Document FEAT_RME
 From: Richard Henderson <richard.henderson@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20181011205206.3552-13-richard.henderson@linaro.org
+Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Message-id: 20230622143046.1578160-1-richard.henderson@linaro.org
 [PMM: fixed typo; note experimental status in emulation.rst too]
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate.c | 70 +++++++++++++++++++++++++++++-------------
+ docs/system/arm/cpu-features.rst | 23 +++++++++++++++++++++++
-file changed, 48 insertions(+), 22 deletions(-)
+ docs/system/arm/emulation.rst    |  1 +
 files changed, 24 insertions(+)
-diff --git a/target/arm/translate.c b/target/arm/translate.c
+diff --git a/docs/system/arm/cpu-features.rst b/docs/system/arm/cpu-features.rst
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.c
+--- a/docs/system/arm/cpu-features.rst
-+++ b/target/arm/translate.c
++++ b/docs/system/arm/cpu-features.rst
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ As with ``sve-default-vector-length``, if the default length is larger
-                     size--;
+ than the maximum vector length enabled, the actual vector length will
-             }
+ be reduced.  If this property is set to ``-1`` then the default vector
-             shift = (insn >> 16) & ((1 << (3 + size)) - 1);
+ length is set to the maximum possible length.
 -            /* To avoid excessive duplication of ops we implement shift
 -               by immediate using the variable shift operations.  */
              if (op < 8) {
                  /* Shift by immediate:
                     VSHR, VSRA, VRSHR, VRSRA, VSRI, VSHL, VQSHL, VQSHLU.  */
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                  }
                  /* Right shifts are encoded as N - shift, where N is the
                     element size in bits.  */
 -                if (op <= 4)
 +                if (op <= 4) {
                      shift = shift - (1 << (size + 3));
 +                }
 +
-+                switch (op) {
++RME CPU Properties
-+                case 0:  /* VSHR */
++==================
 +                    /* Right shift comes here negative.  */
 +                    shift = -shift;
 +                    /* Shifts larger than the element size are architecturally
 +                     * valid.  Unsigned results in all zeros; signed results
 +                     * in all sign bits.
 +                     */
 +                    if (!u) {
 +                        tcg_gen_gvec_sari(size, rd_ofs, rm_ofs,
 +                                          MIN(shift, (8 << size) - 1),
 +                                          vec_size, vec_size);
 +                    } else if (shift >= 8 << size) {
 +                        tcg_gen_gvec_dup8i(rd_ofs, vec_size, vec_size, 0);
 +                    } else {
 +                        tcg_gen_gvec_shri(size, rd_ofs, rm_ofs, shift,
 +                                          vec_size, vec_size);
 +                    }
 +                    return 0;
 +
-+                case 5: /* VSHL, VSLI */
++The status of RME support with QEMU is experimental.  At this time we
-+                    if (!u) { /* VSHL */
++only support RME within the CPU proper, not within the SMMU or GIC.
-+                        /* Shifts larger than the element size are
++The feature is enabled by the CPU property ``x-rme``, with the ``x-``
-+                         * architecturally valid and results in zero.
++prefix present as a reminder of the experimental status, and defaults off.
 +                         */
 +                        if (shift >= 8 << size) {
 +                            tcg_gen_gvec_dup8i(rd_ofs, vec_size, vec_size, 0);
 +                        } else {
 +                            tcg_gen_gvec_shli(size, rd_ofs, rm_ofs, shift,
 +                                              vec_size, vec_size);
 +                        }
 +                        return 0;
 +                    }
 +                    break;
 +                }
 +
-                 if (size == 3) {
++The method for enabling RME will change in some future QEMU release
-                     count = q + 1;
++without notice or backward compatibility.
                  } else {
                      count = q ? 4: 2;
                  }
 -                switch (size) {
 -                case 0:
 -                    imm = (uint8_t) shift;
 -                    imm |= imm << 8;
 -                    imm |= imm << 16;
 -                    break;
 -                case 1:
 -                    imm = (uint16_t) shift;
 -                    imm |= imm << 16;
 -                    break;
 -                case 2:
 -                case 3:
 -                    imm = shift;
 -                    break;
 -                default:
 -                    abort();
 -                }
 +
-+                /* To avoid excessive duplication of ops we implement shift
++RME Level 0 GPT Size Property
-+                 * by immediate using the variable shift operations.
++-----------------------------
-+                  */
++
-+                imm = dup_const(size, shift);
++To aid firmware developers in testing different possible CPU
++configurations, ``x-l0gptsz=S`` may be used to specify the value
-                 for (pass = 0; pass < count; pass++) {
++to encode into ``GPCCR_EL3.L0GPTSZ``, a read-only field that
-                     if (size == 3) {
++specifies the size of the Level 0 Granule Protection Table.
-                         neon_load_reg64(cpu_V0, rm + pass);
++Legal values for ``S`` are 30, 34, 36, and 39; the default is 30.
-                         tcg_gen_movi_i64(cpu_V1, imm);
++
-                         switch (op) {
++As with ``x-rme``, the ``x-l0gptsz`` property may be renamed or
--                        case 0:  /* VSHR */
++removed in some future QEMU release.
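
The legal values listed for ``x-l0gptsz`` can be checked with a few lines of C. This is a sketch, not the QEMU property setter: the function name `l0gptsz_encode` is hypothetical, and the mapping from size in bits to the field encoding (size minus 30) is an assumption based on the architectural description of ``GPCCR_EL3.L0GPTSZ`` and should be verified against the Arm documentation.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: validate a Level 0 GPT size given in address
 * bits and compute the value to place in GPCCR_EL3.L0GPTSZ.
 * Assumption: the field encoding is (size_bits - 30).
 * Returns false, leaving *field untouched, for any other size.
 */
static bool l0gptsz_encode(unsigned size_bits, uint8_t *field)
{
    switch (size_bits) {
    case 30:
    case 34:
    case 36:
    case 39:
        *field = (uint8_t)(size_bits - 30);
        return true;
    default:
        return false;
    }
}
```

Rejecting everything outside the four listed sizes matches the documentation text, which names 30, 34, 36 and 39 as the only legal values.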
-                         case 1:  /* VSRA */
+diff --git a/docs/system/arm/emulation.rst b/docs/system/arm/emulation.rst
-                             if (u)
+index XXXXXXX..XXXXXXX 100644
-                                 gen_helper_neon_shl_u64(cpu_V0, cpu_V0, cpu_V1);
+--- a/docs/system/arm/emulation.rst
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
++++ b/docs/system/arm/emulation.rst
-                                                          cpu_V0, cpu_V1);
+@@ -XXX,XX +XXX,XX @@ the following architecture extensions:
-                             }
+ - FEAT_RAS (Reliability, availability, and serviceability)
-                             break;
+ - FEAT_RASv1p1 (RAS Extension v1.1)
-+                        default:
+ - FEAT_RDM (Advanced SIMD rounding double multiply accumulate instructions)
-+                            g_assert_not_reached();
++- FEAT_RME (Realm Management Extension) (NB: support status in QEMU is experimental)
-                         }
+ - FEAT_RNG (Random number generator)
-                         if (op == 1 || op == 3) {
+ - FEAT_S2FWB (Stage 2 forced Write-Back)
-                             /* Accumulate.  */
+ - FEAT_SB (Speculation Barrier)
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                          tmp2 = tcg_temp_new_i32();
                          tcg_gen_movi_i32(tmp2, imm);
                          switch (op) {
 -                        case 0:  /* VSHR */
                          case 1:  /* VSRA */
                              GEN_NEON_INTEGER_OP(shl);
                              break;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                          case 7: /* VQSHL */
                              GEN_NEON_INTEGER_OP_ENV(qshl);
                              break;
 +                        default:
 +                            g_assert_not_reached();
                          }
                          tcg_temp_free_i32(tmp2);
 --
-.19.1
+.34.1
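
The comments in the VSHR and VSHL cases above describe how a shift count equal to the element size is handled. A scalar sketch for one 8-bit lane follows; the function names are hypothetical, and the gvec calls in the patch apply the same rule to whole vectors.

```c
#include <stdint.h>

/*
 * Hypothetical model of signed VSHR on one 8-bit lane, shift in 1..8.
 * A shift by the full element size must yield all sign bits, so the
 * count is clamped to 7, mirroring MIN(shift, (8 << size) - 1).
 * Note: right-shifting a negative value is implementation-defined in
 * C; mainstream compilers perform an arithmetic shift.
 */
static int8_t vshr_s8(int8_t x, unsigned shift)
{
    unsigned s = shift > 7 ? 7 : shift;

    return (int8_t)(x >> s);
}

/* Unsigned VSHR: a shift by the full element size yields zero. */
static uint8_t vshr_u8(uint8_t x, unsigned shift)
{
    return shift >= 8 ? 0 : (uint8_t)(x >> shift);
}

/* VSHL: a shift of the element size or more yields zero. */
static uint8_t vshl_u8(uint8_t x, unsigned shift)
{
    return shift >= 8 ? 0 : (uint8_t)(x << shift);
}
```

Handling the over-wide case explicitly avoids passing a shift count equal to the element width to a host shift operation, which C leaves undefined for the promoted type once the count reaches the type width.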

-[Qemu-devel] [PULL 13/45] target/arm: Implement HCR.FB
+[PULL 22/26] host-utils: Avoid using __builtin_subcll on buggy versions of Apple Clang
-The HCR.FB virtualization configuration register bit requests that
+We use __builtin_subcll() to do a 64-bit subtract with borrow-in and
-TLB maintenance, branch predictor invalidate-all and icache
+borrow-out when the host compiler supports it.  Unfortunately some
-invalidate-all operations performed in NS EL1 should be upgraded
+versions of Apple Clang have a bug in their implementation of this
-from "local CPU only" to "broadcast within Inner Shareable domain".
+intrinsic which means it returns the wrong value.  The effect is that
-For QEMU we NOP the branch predictor and icache operations, so
+a QEMU built with the affected compiler will hang when emulating x86
-we only need to upgrade the TLB invalidates:
+or m68k float80 division.
  AArch32 TLBIALL, TLBIMVA, TLBIASID, DTLBIALL, DTLBIMVA, DTLBIASID,
          ITLBIALL, ITLBIMVA, ITLBIASID, TLBIMVAA, TLBIMVAL, TLBIMVAAL
  AArch64 TLBI VMALLE1, TLBI VAE1, TLBI ASIDE1, TLBI VAAE1,
          TLBI VALE1, TLBI VAALE1
+The upstream LLVM issue is:
+https://github.com/llvm/llvm-project/issues/55253
+The commit that introduced the bug apparently never made it into an
+upstream LLVM release without the subsequent fix
+https://github.com/llvm/llvm-project/commit/fffb6e6afdbaba563189c1f715058ed401fbc88d
+but unfortunately it did make it into Apple Clang 14.0, as shipped
+in Xcode 14.3 (14.2 is reported to be OK). The Apple bug number is
+FB12210478.
+Add ifdefs to avoid use of __builtin_subcll() on Apple Clang version
+14 or greater.  There is not currently a version of Apple Clang which
+has the bug fix -- when one appears we should be able to add an upper
+bound to the ifdef condition so we can start using the builtin again.
+We make the lower bound a conservative "any Apple clang with major
+version 14 or greater" because the consequences of incorrectly
+disabling the builtin when it would work are pretty small and the
+consequences of not disabling it when we should are pretty bad.
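
The guard described above can be sketched as a standalone subtract-with-borrow helper. This is a minimal illustration, not the code in host-utils.h: the function name `sub64_borrow_sketch` and the fallback arithmetic are assumptions, while the `BUILTIN_SUBCLL_BROKEN` test mirrors the condition added by this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Older compilers lack __has_builtin; treat every builtin as absent. */
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

/* Same condition as the patch: any Apple clang with major version >= 14. */
#if defined(__apple_build_version__) && __clang_major__ >= 14
#define BUILTIN_SUBCLL_BROKEN
#endif

/*
 * Hypothetical helper: compute x - y - *pborrow, storing the
 * borrow-out back into *pborrow.
 */
static inline uint64_t sub64_borrow_sketch(uint64_t x, uint64_t y,
                                           bool *pborrow)
{
#if __has_builtin(__builtin_subcll) && !defined(BUILTIN_SUBCLL_BROKEN)
    unsigned long long b = *pborrow;
    unsigned long long r = __builtin_subcll(x, y, b, &b);

    *pborrow = b & 1;
    return r;
#else
    /* Portable path: two subtractions, each checked for wraparound. */
    uint64_t t = x - y;
    bool b1 = x < y;
    uint64_t r = t - (uint64_t)*pborrow;
    bool b2 = t < (uint64_t)*pborrow;

    *pborrow = b1 || b2;
    return r;
#endif
}
```

Both paths must give identical results, so the same checks can be run on an affected and an unaffected compiler to confirm that the guard selects a correct implementation.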
+Many thanks to those users who both reported this bug and also
+did a lot of work in identifying the root cause; in particular
+to Daniel Bertalan and osy.
+Cc: qemu-stable@nongnu.org
+Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1631
+Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1659
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20181012144235.19646-4-peter.maydell@linaro.org
+Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
 Tested-by: Daniel Bertalan <dani@danielbertalan.dev>
 Tested-by: Solra Bizna <solra@bizna.name>
 Message-id: 20230622130823.1631719-1-peter.maydell@linaro.org
 ---
- target/arm/helper.c | 191 +++++++++++++++++++++++++++-----------------
+ include/qemu/compiler.h   | 13 +++++++++++++
-file changed, 116 insertions(+), 75 deletions(-)
+ include/qemu/host-utils.h |  2 +-
 files changed, 14 insertions(+), 1 deletion(-)
-diff --git a/target/arm/helper.c b/target/arm/helper.c
+diff --git a/include/qemu/compiler.h b/include/qemu/compiler.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
+--- a/include/qemu/compiler.h
-+++ b/target/arm/helper.c
++++ b/include/qemu/compiler.h
-@@ -XXX,XX +XXX,XX @@ static void contextidr_write(CPUARMState *env, const ARMCPRegInfo *ri,
+@@ -XXX,XX +XXX,XX @@
-     raw_write(env, ri, value);
+ #define QEMU_DISABLE_CFI
- }
+ #endif
 -static void tlbiall_write(CPUARMState *env, const ARMCPRegInfo *ri,
 -                          uint64_t value)
 -{
 -    /* Invalidate all (TLBIALL) */
 -    ARMCPU *cpu = arm_env_get_cpu(env);
 -
 -    tlb_flush(CPU(cpu));
 -}
 -
 -static void tlbimva_write(CPUARMState *env, const ARMCPRegInfo *ri,
 -                          uint64_t value)
 -{
 -    /* Invalidate single TLB entry by MVA and ASID (TLBIMVA) */
 -    ARMCPU *cpu = arm_env_get_cpu(env);
 -
 -    tlb_flush_page(CPU(cpu), value & TARGET_PAGE_MASK);
 -}
 -
 -static void tlbiasid_write(CPUARMState *env, const ARMCPRegInfo *ri,
 -                           uint64_t value)
 -{
 -    /* Invalidate by ASID (TLBIASID) */
 -    ARMCPU *cpu = arm_env_get_cpu(env);
 -
 -    tlb_flush(CPU(cpu));
 -}
 -
 -static void tlbimvaa_write(CPUARMState *env, const ARMCPRegInfo *ri,
 -                           uint64_t value)
 -{
 -    /* Invalidate single entry by MVA, all ASIDs (TLBIMVAA) */
 -    ARMCPU *cpu = arm_env_get_cpu(env);
 -
 -    tlb_flush_page(CPU(cpu), value & TARGET_PAGE_MASK);
 -}
 -
  /* IS variants of TLB operations must affect all cores */
  static void tlbiall_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
                               uint64_t value)
@@ -XXX,XX +XXX,XX @@ static void tlbimvaa_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
      tlb_flush_page_all_cpus_synced(cs, value & TARGET_PAGE_MASK);
  }
 +/*
-+ * Non-IS variants of TLB operations are upgraded to
++ * Apple clang version 14 has a bug in its __builtin_subcll(); define
-+ * IS versions if we are at NS EL1 and HCR_EL2.FB is set to
++ * BUILTIN_SUBCLL_BROKEN for the offending versions so we can avoid it.
-+ * force broadcast of these operations.
++ * When a version of Apple clang which has this bug fixed is released
 + * we can add an upper bound to this check.
 + * See https://gitlab.com/qemu-project/qemu/-/issues/1631
 + * and https://gitlab.com/qemu-project/qemu/-/issues/1659 for details.
 + * The bug never made it into any upstream LLVM releases, only Apple ones.
 + */
-+static bool tlb_force_broadcast(CPUARMState *env)
++#if defined(__apple_build_version__) && __clang_major__ >= 14
-+{
++#define BUILTIN_SUBCLL_BROKEN
-+    return (env->cp15.hcr_el2 & HCR_FB) &&
++#endif
 +        arm_current_el(env) == 1 && !arm_is_secure_below_el3(env);
 +}
 +
-+static void tlbiall_write(CPUARMState *env, const ARMCPRegInfo *ri,
+ #endif /* COMPILER_H */
-+                          uint64_t value)
+diff --git a/include/qemu/host-utils.h b/include/qemu/host-utils.h
-+{
+index XXXXXXX..XXXXXXX 100644
-+    /* Invalidate all (TLBIALL) */
+--- a/include/qemu/host-utils.h
-+    ARMCPU *cpu = arm_env_get_cpu(env);
++++ b/include/qemu/host-utils.h
-+
+@@ -XXX,XX +XXX,XX @@ static inline uint64_t uadd64_carry(uint64_t x, uint64_t y, bool *pcarry)
-+    if (tlb_force_broadcast(env)) {
+  */
-+        tlbiall_is_write(env, NULL, value);
+ static inline uint64_t usub64_borrow(uint64_t x, uint64_t y, bool *pborrow)
 +        return;
 +    }
 +
 +    tlb_flush(CPU(cpu));
 +}
 +
 +static void tlbimva_write(CPUARMState *env, const ARMCPRegInfo *ri,
 +                          uint64_t value)
 +{
 +    /* Invalidate single TLB entry by MVA and ASID (TLBIMVA) */
 +    ARMCPU *cpu = arm_env_get_cpu(env);
 +
 +    if (tlb_force_broadcast(env)) {
 +        tlbimva_is_write(env, NULL, value);
 +        return;
 +    }
 +
 +    tlb_flush_page(CPU(cpu), value & TARGET_PAGE_MASK);
 +}
 +
 +static void tlbiasid_write(CPUARMState *env, const ARMCPRegInfo *ri,
 +                           uint64_t value)
 +{
 +    /* Invalidate by ASID (TLBIASID) */
 +    ARMCPU *cpu = arm_env_get_cpu(env);
 +
 +    if (tlb_force_broadcast(env)) {
 +        tlbiasid_is_write(env, NULL, value);
 +        return;
 +    }
 +
 +    tlb_flush(CPU(cpu));
 +}
 +
 +static void tlbimvaa_write(CPUARMState *env, const ARMCPRegInfo *ri,
 +                           uint64_t value)
 +{
 +    /* Invalidate single entry by MVA, all ASIDs (TLBIMVAA) */
 +    ARMCPU *cpu = arm_env_get_cpu(env);
 +
 +    if (tlb_force_broadcast(env)) {
 +        tlbimvaa_is_write(env, NULL, value);
 +        return;
 +    }
 +
 +    tlb_flush_page(CPU(cpu), value & TARGET_PAGE_MASK);
 +}
 +
  static void tlbiall_nsnh_write(CPUARMState *env, const ARMCPRegInfo *ri,
                                 uint64_t value)
  {
-@@ -XXX,XX +XXX,XX @@ static CPAccessResult aa64_cacheop_access(CPUARMState *env,
+-#if __has_builtin(__builtin_subcll)
-  * Page D4-1736 (DDI0487A.b)
++#if __has_builtin(__builtin_subcll) && !defined(BUILTIN_SUBCLL_BROKEN)
-  */
+     unsigned long long b = *pborrow;
+     x = __builtin_subcll(x, y, b, &b);
--static void tlbi_aa64_vmalle1_write(CPUARMState *env, const ARMCPRegInfo *ri,
+     *pborrow = b & 1;
 -                                    uint64_t value)
 -{
 -    CPUState *cs = ENV_GET_CPU(env);
 -
 -    if (arm_is_secure_below_el3(env)) {
 -        tlb_flush_by_mmuidx(cs,
 -                            ARMMMUIdxBit_S1SE1 |
 -                            ARMMMUIdxBit_S1SE0);
 -    } else {
 -        tlb_flush_by_mmuidx(cs,
 -                            ARMMMUIdxBit_S12NSE1 |
 -                            ARMMMUIdxBit_S12NSE0);
 -    }
 -}
 -
  static void tlbi_aa64_vmalle1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
                                        uint64_t value)
  {
@@ -XXX,XX +XXX,XX @@ static void tlbi_aa64_vmalle1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
      }
  }
 +static void tlbi_aa64_vmalle1_write(CPUARMState *env, const ARMCPRegInfo *ri,
 +                                    uint64_t value)
 +{
 +    CPUState *cs = ENV_GET_CPU(env);
 +
 +    if (tlb_force_broadcast(env)) {
 +        tlbi_aa64_vmalle1is_write(env, NULL, value);
 +        return;
 +    }
 +
 +    if (arm_is_secure_below_el3(env)) {
 +        tlb_flush_by_mmuidx(cs,
 +                            ARMMMUIdxBit_S1SE1 |
 +                            ARMMMUIdxBit_S1SE0);
 +    } else {
 +        tlb_flush_by_mmuidx(cs,
 +                            ARMMMUIdxBit_S12NSE1 |
 +                            ARMMMUIdxBit_S12NSE0);
 +    }
 +}
 +
  static void tlbi_aa64_alle1_write(CPUARMState *env, const ARMCPRegInfo *ri,
                                    uint64_t value)
  {
@@ -XXX,XX +XXX,XX @@ static void tlbi_aa64_alle3is_write(CPUARMState *env, const ARMCPRegInfo *ri,
      tlb_flush_by_mmuidx_all_cpus_synced(cs, ARMMMUIdxBit_S1E3);
  }
 -static void tlbi_aa64_vae1_write(CPUARMState *env, const ARMCPRegInfo *ri,
 -                                 uint64_t value)
 -{
 -    /* Invalidate by VA, EL1&0 (AArch64 version).
 -     * Currently handles all of VAE1, VAAE1, VAALE1 and VALE1,
 -     * since we don't support flush-for-specific-ASID-only or
 -     * flush-last-level-only.
 -     */
 -    ARMCPU *cpu = arm_env_get_cpu(env);
 -    CPUState *cs = CPU(cpu);
 -    uint64_t pageaddr = sextract64(value << 12, 0, 56);
 -
 -    if (arm_is_secure_below_el3(env)) {
 -        tlb_flush_page_by_mmuidx(cs, pageaddr,
 -                                 ARMMMUIdxBit_S1SE1 |
 -                                 ARMMMUIdxBit_S1SE0);
 -    } else {
 -        tlb_flush_page_by_mmuidx(cs, pageaddr,
 -                                 ARMMMUIdxBit_S12NSE1 |
 -                                 ARMMMUIdxBit_S12NSE0);
 -    }
 -}
 -
  static void tlbi_aa64_vae2_write(CPUARMState *env, const ARMCPRegInfo *ri,
                                   uint64_t value)
  {
@@ -XXX,XX +XXX,XX @@ static void tlbi_aa64_vae1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
      }
  }
 +static void tlbi_aa64_vae1_write(CPUARMState *env, const ARMCPRegInfo *ri,
 +                                 uint64_t value)
 +{
 +    /* Invalidate by VA, EL1&0 (AArch64 version).
 +     * Currently handles all of VAE1, VAAE1, VAALE1 and VALE1,
 +     * since we don't support flush-for-specific-ASID-only or
 +     * flush-last-level-only.
 +     */
 +    ARMCPU *cpu = arm_env_get_cpu(env);
 +    CPUState *cs = CPU(cpu);
 +    uint64_t pageaddr = sextract64(value << 12, 0, 56);
 +
 +    if (tlb_force_broadcast(env)) {
 +        tlbi_aa64_vae1is_write(env, NULL, value);
 +        return;
 +    }
 +
 +    if (arm_is_secure_below_el3(env)) {
 +        tlb_flush_page_by_mmuidx(cs, pageaddr,
 +                                 ARMMMUIdxBit_S1SE1 |
 +                                 ARMMMUIdxBit_S1SE0);
 +    } else {
 +        tlb_flush_page_by_mmuidx(cs, pageaddr,
 +                                 ARMMMUIdxBit_S12NSE1 |
 +                                 ARMMMUIdxBit_S12NSE0);
 +    }
 +}
 +
  static void tlbi_aa64_vae2is_write(CPUARMState *env, const ARMCPRegInfo *ri,
                                     uint64_t value)
  {
 --
-.19.1
+.34.1
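
The HCR.FB upgrade rule from the old-series commit message above can be modelled in isolation. This toy sketch uses hypothetical types and names (`toy_cpu_state`, `tlbi_scope`) and follows the commit message wording, in which only operations performed in Non-secure EL1 are upgraded; it does not reproduce QEMU's CPU state or TLB API.

```c
#include <stdbool.h>

/* Scope a non-IS TLB maintenance operation ends up with. */
enum flush_scope {
    FLUSH_LOCAL,            /* affects the executing CPU only */
    FLUSH_INNER_SHAREABLE   /* broadcast within the Inner Shareable domain */
};

/* Hypothetical, heavily reduced stand-in for the CPU state. */
struct toy_cpu_state {
    bool hcr_fb;      /* HCR.FB: force broadcast */
    int current_el;   /* current exception level, 0..3 */
    bool secure;      /* true when executing in Secure state */
};

/* Decide the scope of a non-IS TLB invalidate under the HCR.FB rule. */
static enum flush_scope tlbi_scope(const struct toy_cpu_state *s)
{
    bool force = s->hcr_fb && s->current_el == 1 && !s->secure;

    return force ? FLUSH_INNER_SHAREABLE : FLUSH_LOCAL;
}
```

In the patch the same decision is made once per handler, and the upgraded case simply calls the matching IS handler.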

-[Qemu-devel] [PULL 04/45] target/arm: V8M should not imply V7VE
+[PULL 23/26] target/arm: Restructure has_vfp_d32 test
 From: Richard Henderson <richard.henderson@linaro.org>
-Instantiating mps2-an505 (cortex-m33) will fail make check when
+One cannot test for feature aa32_simd_r32 without first
-V7VE asserts that ID_ISAR0.Divide includes ARM division.  It is
+testing if AArch32 mode is supported at all.  This leads to
 also wrong to include ARM_FEATURE_LPAE.
+qemu-system-aarch64: ARM CPUs must have both VFP-D32 and Neon or neither
+for Apple M1 cpus.
+We already have a check for ARMv8-A never setting vfp-d32 true,
+so restructure the code so that AArch64 avoids the test entirely.
+Reported-by: Mads Ynddal <mads@ynddal.dk>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20181016223115.24100-3-richard.henderson@linaro.org
+Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Tested-by: Mads Ynddal <m.ynddal@samsung.com>
 Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
 Reviewed-by: Cédric Le Goater <clg@kaod.org>
 Reviewed-by: Mads Ynddal <m.ynddal@samsung.com>
 Message-id: 20230619140216.402530-1-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/cpu.c | 6 +++++-
+ target/arm/cpu.c | 28 +++++++++++++++-------------
-file changed, 5 insertions(+), 1 deletion(-)
+file changed, 15 insertions(+), 13 deletions(-)
 diff --git a/target/arm/cpu.c b/target/arm/cpu.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu.c
 +++ b/target/arm/cpu.c
-@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
+@@ -XXX,XX +XXX,XX @@ void arm_cpu_post_init(Object *obj)
+      * KVM does not currently allow us to lie to the guest about its
-     /* Some features automatically imply others: */
+      * ID/feature registers, so the guest always sees what the host has.
-     if (arm_feature(env, ARM_FEATURE_V8)) {
+      */
--        set_feature(env, ARM_FEATURE_V7VE);
+-    if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64)
-+        if (arm_feature(env, ARM_FEATURE_M)) {
+-        ? cpu_isar_feature(aa64_fp_simd, cpu)
-+            set_feature(env, ARM_FEATURE_V7);
+-        : cpu_isar_feature(aa32_vfp, cpu)) {
-+        } else {
+-        cpu->has_vfp = true;
-+            set_feature(env, ARM_FEATURE_V7VE);
+-        if (!kvm_enabled()) {
-+        }
+-            qdev_property_add_static(DEVICE(obj), &arm_cpu_has_vfp_property);
-     }
++    if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64)) {
-     if (arm_feature(env, ARM_FEATURE_V7VE)) {
++        if (cpu_isar_feature(aa64_fp_simd, cpu)) {
-         /* v7 Virtualization Extensions. In real hardware this implies
++            cpu->has_vfp = true;
 +            cpu->has_vfp_d32 = true;
 +            if (tcg_enabled() || qtest_enabled()) {
 +                qdev_property_add_static(DEVICE(obj),
 +                                         &arm_cpu_has_vfp_property);
 +            }
          }
 -    }
 -
 -    if (cpu->has_vfp && cpu_isar_feature(aa32_simd_r32, cpu)) {
 -        cpu->has_vfp_d32 = true;
 -        if (!kvm_enabled()) {
 +    } else if (cpu_isar_feature(aa32_vfp, cpu)) {
 +        cpu->has_vfp = true;
 +        if (cpu_isar_feature(aa32_simd_r32, cpu)) {
 +            cpu->has_vfp_d32 = true;
              /*
               * The permitted values of the SIMDReg bits [3:0] on
               * Armv8-A are either 0b0000 or 0b0010. On such CPUs,
               * make sure that has_vfp_d32 can not be set to false.
               */
 -            if (!(arm_feature(&cpu->env, ARM_FEATURE_V8) &&
 -                  !arm_feature(&cpu->env, ARM_FEATURE_M))) {
 +            if ((tcg_enabled() || qtest_enabled())
 +                && !(arm_feature(&cpu->env, ARM_FEATURE_V8)
 +                     && !arm_feature(&cpu->env, ARM_FEATURE_M))) {
                  qdev_property_add_static(DEVICE(obj),
                                           &arm_cpu_has_vfp_d32_property);
              }
 --
-.19.1
+.34.1
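
The restructured has_vfp_d32 decision in the new-series patch above can be summarised as a small pure function. This is a sketch with hypothetical types (`toy_cpu_model`, `toy_vfp_flags`); the booleans stand in for the `cpu_isar_feature()` and `arm_feature()` queries, and the property registration is left out.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the feature queries used in the patch. */
struct toy_cpu_model {
    bool aarch64;         /* ARM_FEATURE_AARCH64 */
    bool aa64_fp_simd;    /* isar feature aa64_fp_simd */
    bool aa32_vfp;        /* isar feature aa32_vfp */
    bool aa32_simd_r32;   /* isar feature aa32_simd_r32 */
};

struct toy_vfp_flags {
    bool has_vfp;
    bool has_vfp_d32;
};

/*
 * Mirror of the restructured logic: an AArch64 CPU never consults the
 * AArch32 ID fields, so a CPU without AArch32 support cannot end up
 * with has_vfp set and has_vfp_d32 clear.
 */
static struct toy_vfp_flags derive_vfp_flags(const struct toy_cpu_model *m)
{
    struct toy_vfp_flags f = { false, false };

    if (m->aarch64) {
        if (m->aa64_fp_simd) {
            f.has_vfp = true;
            f.has_vfp_d32 = true;
        }
    } else if (m->aa32_vfp) {
        f.has_vfp = true;
        if (m->aa32_simd_r32) {
            f.has_vfp_d32 = true;
        }
    }
    return f;
}
```

The first test case below corresponds to the situation in the commit message: an AArch64-only CPU whose AArch32 ID fields read as zero.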

-[Qemu-devel] [PULL 11/45] target/arm: Improve debug logging of AArch32 exception return
+Deleted patch
-For AArch32, exception return happens through certain kinds
-of CPSR write. We don't currently have any CPU_LOG_INT logging
-of these events (unlike AArch64, where we log in the ERET
-instruction). Add some suitable logging.
-This will log exception returns like this:
-Exception return from AArch32 hyp to usr PC 0x80100374
-paralleling the existing logging in the exception_return
-helper for AArch64 exception returns:
-Exception return from AArch64 EL2 to AArch64 EL0 PC 0x8003045c
-Exception return from AArch64 EL2 to AArch32 EL0 PC 0x8003045c
-(Note that an AArch32 exception return can only be
-AArch32->AArch32, never to AArch64.)
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20181012144235.19646-2-peter.maydell@linaro.org
----
- target/arm/internals.h | 18 ++++++++++++++++++
- target/arm/helper.c    | 10 ++++++++++
- target/arm/translate.c |  7 +------
-files changed, 29 insertions(+), 6 deletions(-)
-diff --git a/target/arm/internals.h b/target/arm/internals.h
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/internals.h
-+++ b/target/arm/internals.h
-@@ -XXX,XX +XXX,XX @@ static inline uint32_t v7m_sp_limit(CPUARMState *env)
-     }
- }
-+/**
-+ * aarch32_mode_name(): Return name of the AArch32 CPU mode
-+ * @psr: Program Status Register indicating CPU mode
-+ *
-+ * Returns, for debug logging purposes, a printable representation
-+ * of the AArch32 CPU mode ("svc", "usr", etc) as indicated by
-+ * the low bits of the specified PSR.
-+ */
-+static inline const char *aarch32_mode_name(uint32_t psr)
-+{
-+    static const char cpu_mode_names[16][4] = {
-+        "usr", "fiq", "irq", "svc", "???", "???", "mon", "abt",
-+        "???", "???", "hyp", "und", "???", "???", "???", "sys"
-+    };
-+
-+    return cpu_mode_names[psr & 0xf];
-+}
-+
- #endif
-diff --git a/target/arm/helper.c b/target/arm/helper.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
-+++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ void cpsr_write(CPUARMState *env, uint32_t val, uint32_t mask,
-                 mask |= CPSR_IL;
-                 val |= CPSR_IL;
-             }
-+            qemu_log_mask(LOG_GUEST_ERROR,
-+                          "Illegal AArch32 mode switch attempt from %s to %s\n",
-+                          aarch32_mode_name(env->uncached_cpsr),
-+                          aarch32_mode_name(val));
-         } else {
-+            qemu_log_mask(CPU_LOG_INT, "%s %s to %s PC 0x%" PRIx32 "\n",
-+                          write_type == CPSRWriteExceptionReturn ?
-+                          "Exception return from AArch32" :
-+                          "AArch32 mode switch from",
-+                          aarch32_mode_name(env->uncached_cpsr),
-+                          aarch32_mode_name(val), env->regs[15]);
-             switch_mode(env, val & CPSR_M);
-         }
-     }
-diff --git a/target/arm/translate.c b/target/arm/translate.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.c
-+++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ void gen_intermediate_code(CPUState *cpu, TranslationBlock *tb)
-     translator_loop(ops, &dc.base, cpu, tb);
- }
--static const char *cpu_mode_names[16] = {
--  "usr", "fiq", "irq", "svc", "???", "???", "mon", "abt",
--  "???", "???", "hyp", "und", "???", "???", "???", "sys"
--};
--
- void arm_cpu_dump_state(CPUState *cs, FILE *f, fprintf_function cpu_fprintf,
-                         int flags)
- {
-@@ -XXX,XX +XXX,XX @@ void arm_cpu_dump_state(CPUState *cs, FILE *f, fprintf_function cpu_fprintf,
-                     psr & CPSR_V ? 'V' : '-',
-                     psr & CPSR_T ? 'T' : 'A',
-                     ns_status,
--                    cpu_mode_names[psr & 0xf], (psr & 0x10) ? 32 : 26);
-+                    aarch32_mode_name(psr), (psr & 0x10) ? 32 : 26);
-     }
-     if (flags & CPU_DUMP_FPU) {
---
-.19.1

-[Qemu-devel] [PULL 12/45] target/arm: Make switch_mode() file-local
+Deleted patch
-The switch_mode() function is defined in target/arm/helper.c and used
-only in that file and nowhere else, so we can make it file-local
-rather than global.
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20181012144235.19646-3-peter.maydell@linaro.org
----
- target/arm/internals.h | 1 -
- target/arm/helper.c    | 6 ++++--
-files changed, 4 insertions(+), 3 deletions(-)
-diff --git a/target/arm/internals.h b/target/arm/internals.h
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/internals.h
-+++ b/target/arm/internals.h
-@@ -XXX,XX +XXX,XX @@ static inline int bank_number(int mode)
-     g_assert_not_reached();
- }
--void switch_mode(CPUARMState *, int);
- void arm_cpu_register_gdb_regs_for_features(ARMCPU *cpu);
- void arm_translate_init(void);
-diff --git a/target/arm/helper.c b/target/arm/helper.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
-+++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ static void v8m_security_lookup(CPUARMState *env, uint32_t address,
-                                 V8M_SAttributes *sattrs);
- #endif
-+static void switch_mode(CPUARMState *env, int mode);
-+
- static int vfp_gdb_get_reg(CPUARMState *env, uint8_t *buf, int reg)
- {
-     int nregs;
-@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(v7m_tt)(CPUARMState *env, uint32_t addr, uint32_t op)
-     return 0;
- }
--void switch_mode(CPUARMState *env, int mode)
-+static void switch_mode(CPUARMState *env, int mode)
- {
-     ARMCPU *cpu = arm_env_get_cpu(env);
-@@ -XXX,XX +XXX,XX @@ void aarch64_sync_64_to_32(CPUARMState *env)
- #else
--void switch_mode(CPUARMState *env, int mode)
-+static void switch_mode(CPUARMState *env, int mode)
- {
-     int old_mode;
-     int i;
---
-.19.1

-[Qemu-devel] [PULL 14/45] target/arm: Implement HCR.DC
+Deleted patch
-The HCR.DC virtualization configuration register bit has the
-following effects:
- * SCTLR.M behaves as if it is 0 for all purposes except
-   direct reads of the bit
- * HCR.VM behaves as if it is 1 for all purposes except
-   direct reads of the bit
- * the memory type produced by the first stage of the EL1&EL0
-   translation regime is Normal Non-Shareable,
-   Inner Write-Back Read-Allocate Write-Allocate,
-   Outer Write-Back Read-Allocate Write-Allocate.
-Implement this behaviour.
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20181012144235.19646-5-peter.maydell@linaro.org
----
- target/arm/helper.c | 23 +++++++++++++++++++++--
-file changed, 21 insertions(+), 2 deletions(-)
-diff --git a/target/arm/helper.c b/target/arm/helper.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
-+++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ static uint64_t do_ats_write(CPUARMState *env, uint64_t value,
-          * * The Non-secure TTBCR.EAE bit is set to 1
-          * * The implementation includes EL2, and the value of HCR.VM is 1
-          *
-+         * (Note that HCR.DC makes HCR.VM behave as if it is 1.)
-+         *
-          * ATS1Hx always uses the 64bit format (not supported yet).
-          */
-         format64 = arm_s1_regime_using_lpae_format(env, mmu_idx);
-         if (arm_feature(env, ARM_FEATURE_EL2)) {
-             if (mmu_idx == ARMMMUIdx_S12NSE0 || mmu_idx == ARMMMUIdx_S12NSE1) {
--                format64 |= env->cp15.hcr_el2 & HCR_VM;
-+                format64 |= env->cp15.hcr_el2 & (HCR_VM | HCR_DC);
-             } else {
-                 format64 |= arm_current_el(env) == 2;
-             }
-@@ -XXX,XX +XXX,XX @@ static inline bool regime_translation_disabled(CPUARMState *env,
-     }
-     if (mmu_idx == ARMMMUIdx_S2NS) {
--        return (env->cp15.hcr_el2 & HCR_VM) == 0;
-+        /* HCR.DC means HCR.VM behaves as 1 */
-+        return (env->cp15.hcr_el2 & (HCR_DC | HCR_VM)) == 0;
-     }
-     if (env->cp15.hcr_el2 & HCR_TGE) {
-@@ -XXX,XX +XXX,XX @@ static inline bool regime_translation_disabled(CPUARMState *env,
-         }
-     }
-+    if ((env->cp15.hcr_el2 & HCR_DC) &&
-+        (mmu_idx == ARMMMUIdx_S1NSE0 || mmu_idx == ARMMMUIdx_S1NSE1)) {
-+        /* HCR.DC means SCTLR_EL1.M behaves as 0 */
-+        return true;
-+    }
-+
-     return (regime_sctlr(env, mmu_idx) & SCTLR_M) == 0;
- }
-@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr(CPUARMState *env, target_ulong address,
-             /* Combine the S1 and S2 cache attributes, if needed */
-             if (!ret && cacheattrs != NULL) {
-+                if (env->cp15.hcr_el2 & HCR_DC) {
-+                    /*
-+                     * HCR.DC forces the first stage attributes to
-+                     *  Normal Non-Shareable,
-+                     *  Inner Write-Back Read-Allocate Write-Allocate,
-+                     *  Outer Write-Back Read-Allocate Write-Allocate.
-+                     */
-+                    cacheattrs->attrs = 0xff;
-+                    cacheattrs->shareability = 0;
-+                }
-                 *cacheattrs = combine_cacheattrs(*cacheattrs, cacheattrs2);
-             }
---
-.19.1
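
Reviewer's sketch for the HCR.DC patch above: a minimal standalone C model of the two translation-enable checks it changes. The `Regime` enum and `translation_disabled()` are hypothetical stand-ins for QEMU's `ARMMMUIdx` values and `regime_translation_disabled()`. The HCR.TGE handling of the real function is left out, and the bit positions are assumed from the architectural HCR_EL2 and SCTLR layouts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions (architectural HCR_EL2 / SCTLR layout). */
#define HCR_VM   (1ULL << 0)
#define HCR_DC   (1ULL << 12)
#define SCTLR_M  (1ULL << 0)

/* Hypothetical stand-ins for the stage 1 EL1&0 and stage 2 MMU indexes. */
typedef enum { REGIME_STAGE1_EL10, REGIME_STAGE2 } Regime;

/* Returns true if address translation is off for the given regime. */
static inline bool translation_disabled(uint64_t hcr, uint64_t sctlr, Regime r)
{
    if (r == REGIME_STAGE2) {
        /* HCR.DC makes HCR.VM behave as 1: either bit enables stage 2. */
        return (hcr & (HCR_DC | HCR_VM)) == 0;
    }
    if (hcr & HCR_DC) {
        /* HCR.DC makes SCTLR_EL1.M behave as 0: stage 1 is off. */
        return true;
    }
    return (sctlr & SCTLR_M) == 0;
}
```

With only HCR.DC set, this model reports stage 2 as enabled and the EL1&0 stage 1 as disabled, whatever the value of SCTLR.M.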

-[Qemu-devel] [PULL 15/45] target/arm: ISR_EL1 bits track virtual interrupts if IMO/FMO set
+Deleted patch
-The A/I/F bits in ISR_EL1 should track the virtual interrupt
-status, not the physical interrupt status, if the associated
-HCR_EL2.AMO/IMO/FMO bit is set. Implement this, rather than
-always showing the physical interrupt status.
-We don't currently implement anything to do with external
-aborts, so this applies only to the I and F bits (though it
-ought to be possible for the outer guest to present a virtual
-external abort to the inner guest, even if QEMU doesn't
-emulate physical external aborts, so there is missing
-functionality in this area).
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20181012144235.19646-6-peter.maydell@linaro.org
----
- target/arm/helper.c | 22 ++++++++++++++++++----
-file changed, 18 insertions(+), 4 deletions(-)
-diff --git a/target/arm/helper.c b/target/arm/helper.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
-+++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ static uint64_t isr_read(CPUARMState *env, const ARMCPRegInfo *ri)
-     CPUState *cs = ENV_GET_CPU(env);
-     uint64_t ret = 0;
--    if (cs->interrupt_request & CPU_INTERRUPT_HARD) {
--        ret |= CPSR_I;
-+    if (arm_hcr_el2_imo(env)) {
-+        if (cs->interrupt_request & CPU_INTERRUPT_VIRQ) {
-+            ret |= CPSR_I;
-+        }
-+    } else {
-+        if (cs->interrupt_request & CPU_INTERRUPT_HARD) {
-+            ret |= CPSR_I;
-+        }
-     }
--    if (cs->interrupt_request & CPU_INTERRUPT_FIQ) {
--        ret |= CPSR_F;
-+
-+    if (arm_hcr_el2_fmo(env)) {
-+        if (cs->interrupt_request & CPU_INTERRUPT_VFIQ) {
-+            ret |= CPSR_F;
-+        }
-+    } else {
-+        if (cs->interrupt_request & CPU_INTERRUPT_FIQ) {
-+            ret |= CPSR_F;
-+        }
-     }
-+
-     /* External aborts are not possible in QEMU so A bit is always clear */
-     return ret;
- }
---
-.19.1
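
Reviewer's sketch for the ISR_EL1 patch above: a small C model of the selection between physical and virtual pending state. The `PEND_*` flags are illustrative placeholders, not QEMU's `CPU_INTERRUPT_*` values, and `isr_value()` is a hypothetical name; the `imo`/`fmo` arguments stand for the results of `arm_hcr_el2_imo()` and `arm_hcr_el2_fmo()`. The A bit is left clear, as in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CPSR/ISR bit positions for F and I (architectural layout). */
#define CPSR_F (1U << 6)
#define CPSR_I (1U << 7)

/* Illustrative pending-interrupt flags (placeholder values). */
#define PEND_IRQ  (1U << 0)
#define PEND_FIQ  (1U << 1)
#define PEND_VIRQ (1U << 2)
#define PEND_VFIQ (1U << 3)

/* Model of isr_read(): each bit tracks virtual state if its xMO bit is set. */
static inline uint32_t isr_value(uint32_t pending, bool imo, bool fmo)
{
    uint32_t ret = 0;

    if (pending & (imo ? PEND_VIRQ : PEND_IRQ)) {
        ret |= CPSR_I;
    }
    if (pending & (fmo ? PEND_VFIQ : PEND_FIQ)) {
        ret |= CPSR_F;
    }
    return ret;
}
```

A pending physical IRQ is therefore invisible in this model once IMO is set, and a pending virtual IRQ is invisible while IMO is clear.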

-[Qemu-devel] [PULL 16/45] target/arm: Implement HCR.VI and VF
+Deleted patch
-The HCR_EL2 VI and VF bits are supposed to track whether there is
-a pending virtual IRQ or virtual FIQ. For QEMU we store the
-pending VIRQ/VFIQ status in cs->interrupt_request, so this means:
- * if the register is read we must get these bit values from
-   cs->interrupt_request
- * if the register is written then we must write the bit
-   values back into cs->interrupt_request
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20181012144235.19646-7-peter.maydell@linaro.org
----
- target/arm/helper.c | 47 +++++++++++++++++++++++++++++++++++++++++----
-file changed, 43 insertions(+), 4 deletions(-)
-diff --git a/target/arm/helper.c b/target/arm/helper.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
-+++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo el3_no_el2_v8_cp_reginfo[] = {
- static void hcr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
- {
-     ARMCPU *cpu = arm_env_get_cpu(env);
-+    CPUState *cs = ENV_GET_CPU(env);
-     uint64_t valid_mask = HCR_MASK;
-     if (arm_feature(env, ARM_FEATURE_EL3)) {
-@@ -XXX,XX +XXX,XX @@ static void hcr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
-     /* Clear RES0 bits.  */
-     value &= valid_mask;
-+    /*
-+     * VI and VF are kept in cs->interrupt_request. Modifying that
-+     * requires that we have the iothread lock, which is done by
-+     * marking the reginfo structs as ARM_CP_IO.
-+     * Note that if a write to HCR pends a VIRQ or VFIQ it is never
-+     * possible for it to be taken immediately, because VIRQ and
-+     * VFIQ are masked unless running at EL0 or EL1, and HCR
-+     * can only be written at EL2.
-+     */
-+    g_assert(qemu_mutex_iothread_locked());
-+    if (value & HCR_VI) {
-+        cs->interrupt_request |= CPU_INTERRUPT_VIRQ;
-+    } else {
-+        cs->interrupt_request &= ~CPU_INTERRUPT_VIRQ;
-+    }
-+    if (value & HCR_VF) {
-+        cs->interrupt_request |= CPU_INTERRUPT_VFIQ;
-+    } else {
-+        cs->interrupt_request &= ~CPU_INTERRUPT_VFIQ;
-+    }
-+    value &= ~(HCR_VI | HCR_VF);
-+
-     /* These bits change the MMU setup:
-      * HCR_VM enables stage 2 translation
-      * HCR_PTW forbids certain page-table setups
-@@ -XXX,XX +XXX,XX @@ static void hcr_writelow(CPUARMState *env, const ARMCPRegInfo *ri,
-     hcr_write(env, NULL, value);
- }
-+static uint64_t hcr_read(CPUARMState *env, const ARMCPRegInfo *ri)
-+{
-+    /* The VI and VF bits live in cs->interrupt_request */
-+    uint64_t ret = env->cp15.hcr_el2 & ~(HCR_VI | HCR_VF);
-+    CPUState *cs = ENV_GET_CPU(env);
-+
-+    if (cs->interrupt_request & CPU_INTERRUPT_VIRQ) {
-+        ret |= HCR_VI;
-+    }
-+    if (cs->interrupt_request & CPU_INTERRUPT_VFIQ) {
-+        ret |= HCR_VF;
-+    }
-+    return ret;
-+}
-+
- static const ARMCPRegInfo el2_cp_reginfo[] = {
-     { .name = "HCR_EL2", .state = ARM_CP_STATE_AA64,
-+      .type = ARM_CP_IO,
-       .opc0 = 3, .opc1 = 4, .crn = 1, .crm = 1, .opc2 = 0,
-       .access = PL2_RW, .fieldoffset = offsetof(CPUARMState, cp15.hcr_el2),
--      .writefn = hcr_write },
-+      .writefn = hcr_write, .readfn = hcr_read },
-     { .name = "HCR", .state = ARM_CP_STATE_AA32,
--      .type = ARM_CP_ALIAS,
-+      .type = ARM_CP_ALIAS | ARM_CP_IO,
-       .cp = 15, .opc1 = 4, .crn = 1, .crm = 1, .opc2 = 0,
-       .access = PL2_RW, .fieldoffset = offsetof(CPUARMState, cp15.hcr_el2),
--      .writefn = hcr_writelow },
-+      .writefn = hcr_writelow, .readfn = hcr_read },
-     { .name = "ELR_EL2", .state = ARM_CP_STATE_AA64,
-       .type = ARM_CP_ALIAS,
-       .opc0 = 3, .opc1 = 4, .crn = 4, .crm = 0, .opc2 = 1,
-@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo el2_cp_reginfo[] = {
- static const ARMCPRegInfo el2_v8_cp_reginfo[] = {
-     { .name = "HCR2", .state = ARM_CP_STATE_AA32,
--      .type = ARM_CP_ALIAS,
-+      .type = ARM_CP_ALIAS | ARM_CP_IO,
-       .cp = 15, .opc1 = 4, .crn = 1, .crm = 1, .opc2 = 4,
-       .access = PL2_RW,
-       .fieldoffset = offsetofhigh32(CPUARMState, cp15.hcr_el2),
---
-.19.1
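
Reviewer's sketch for the HCR.VI/VF patch above: a toy C model of keeping the two bits in a separate pending-interrupt word rather than in the stored register value. `ModelCPU`, `model_hcr_write()` and `model_hcr_read()` are hypothetical names, the `PEND_*` values are placeholders rather than QEMU's `CPU_INTERRUPT_*` constants, and the RES0 masking, TLB flush and iothread-lock assertion of the real `hcr_write()` are omitted.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions (architectural HCR_EL2 layout). */
#define HCR_VF (1ULL << 6)
#define HCR_VI (1ULL << 7)

/* Illustrative pending-interrupt flags (placeholder values). */
#define PEND_VIRQ (1U << 0)
#define PEND_VFIQ (1U << 1)

typedef struct {
    uint64_t hcr_el2;            /* stored register value, VI/VF always 0 */
    uint32_t interrupt_request;  /* pending state, source of truth for VI/VF */
} ModelCPU;

/* A write moves VI/VF into interrupt_request and stores the rest. */
static inline void model_hcr_write(ModelCPU *c, uint64_t value)
{
    if (value & HCR_VI) {
        c->interrupt_request |= PEND_VIRQ;
    } else {
        c->interrupt_request &= ~PEND_VIRQ;
    }
    if (value & HCR_VF) {
        c->interrupt_request |= PEND_VFIQ;
    } else {
        c->interrupt_request &= ~PEND_VFIQ;
    }
    c->hcr_el2 = value & ~(HCR_VI | HCR_VF);
}

/* A read rebuilds VI/VF from interrupt_request. */
static inline uint64_t model_hcr_read(const ModelCPU *c)
{
    uint64_t ret = c->hcr_el2 & ~(HCR_VI | HCR_VF);

    if (c->interrupt_request & PEND_VIRQ) {
        ret |= HCR_VI;
    }
    if (c->interrupt_request & PEND_VFIQ) {
        ret |= HCR_VF;
    }
    return ret;
}
```

Because the read path consults the pending word, a virtual interrupt raised by some other route shows up in the value read back without any write to the register.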

-[Qemu-devel] [PULL 17/45] target/arm: Implement HCR.PTW
+Deleted patch
-If the HCR_EL2 PTW virtualization configuration register bit
-is set, then this means that a stage 2 Permission fault must
-be generated if a stage 1 translation table access is made
-to an address that is mapped as Device memory in stage 2.
-Implement this.
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20181012144235.19646-8-peter.maydell@linaro.org
----
- target/arm/helper.c | 21 ++++++++++++++++++++-
-file changed, 20 insertions(+), 1 deletion(-)
-diff --git a/target/arm/helper.c b/target/arm/helper.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
-+++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ static hwaddr S1_ptw_translate(CPUARMState *env, ARMMMUIdx mmu_idx,
-         hwaddr s2pa;
-         int s2prot;
-         int ret;
-+        ARMCacheAttrs cacheattrs = {};
-+        ARMCacheAttrs *pcacheattrs = NULL;
-+
-+        if (env->cp15.hcr_el2 & HCR_PTW) {
-+            /*
-+             * PTW means we must fault if this S1 walk touches S2 Device
-+             * memory; otherwise we don't care about the attributes and can
-+             * save the S2 translation the effort of computing them.
-+             */
-+            pcacheattrs = &cacheattrs;
-+        }
-         ret = get_phys_addr_lpae(env, addr, 0, ARMMMUIdx_S2NS, &s2pa,
--                                 &txattrs, &s2prot, &s2size, fi, NULL);
-+                                 &txattrs, &s2prot, &s2size, fi, pcacheattrs);
-         if (ret) {
-             assert(fi->type != ARMFault_None);
-             fi->s2addr = addr;
-@@ -XXX,XX +XXX,XX @@ static hwaddr S1_ptw_translate(CPUARMState *env, ARMMMUIdx mmu_idx,
-             fi->s1ptw = true;
-             return ~0;
-         }
-+        if (pcacheattrs && (pcacheattrs->attrs & 0xf0) == 0) {
-+            /* Access was to Device memory: generate Permission fault */
-+            fi->type = ARMFault_Permission;
-+            fi->s2addr = addr;
-+            fi->stage2 = true;
-+            fi->s1ptw = true;
-+            return ~0;
-+        }
-         addr = s2pa;
-     }
-     return addr;
---
-.19.1
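
Reviewer's sketch for the HCR.PTW patch above: the fault condition reduced to a standalone C predicate. `s1_ptw_must_fault()` and `s2_attrs_are_device()` are hypothetical names. The attribute byte is assumed to use the MAIR-style encoding implied by the patch's `(attrs & 0xf0) == 0` test, where a zero upper nibble denotes Device memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position (architectural HCR_EL2 layout). */
#define HCR_PTW (1ULL << 2)

/* MAIR-style attribute byte: upper nibble 0 means Device memory. */
static inline bool s2_attrs_are_device(uint8_t attrs)
{
    return (attrs & 0xf0) == 0;
}

/*
 * True if a stage 1 table walk whose stage 2 translation yielded
 * 's2_attrs' must raise a stage 2 Permission fault.
 */
static inline bool s1_ptw_must_fault(uint64_t hcr, uint8_t s2_attrs)
{
    return (hcr & HCR_PTW) != 0 && s2_attrs_are_device(s2_attrs);
}
```

When PTW is clear the attributes are irrelevant, which is why the patch only asks the stage 2 walk to compute them when the bit is set.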

-[Qemu-devel] [PULL 18/45] target/arm: New utility function to extract EC from syndrome
+Deleted patch
-Create and use a utility function to extract the EC field
-from a syndrome, rather than open-coding the shift.
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20181012144235.19646-9-peter.maydell@linaro.org
----
- target/arm/internals.h | 5 +++++
- target/arm/helper.c    | 4 ++--
- target/arm/kvm64.c     | 2 +-
- target/arm/op_helper.c | 2 +-
-files changed, 9 insertions(+), 4 deletions(-)
-diff --git a/target/arm/internals.h b/target/arm/internals.h
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/internals.h
-+++ b/target/arm/internals.h
-@@ -XXX,XX +XXX,XX @@ enum arm_exception_class {
- #define ARM_EL_IL (1 << ARM_EL_IL_SHIFT)
- #define ARM_EL_ISV (1 << ARM_EL_ISV_SHIFT)
-+static inline uint32_t syn_get_ec(uint32_t syn)
-+{
-+    return syn >> ARM_EL_EC_SHIFT;
-+}
-+
- /* Utility functions for constructing various kinds of syndrome value.
-  * Note that in general we follow the AArch64 syndrome values; in a
-  * few cases the value in HSR for exceptions taken to AArch32 Hyp
-diff --git a/target/arm/helper.c b/target/arm/helper.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
-+++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ static void arm_cpu_do_interrupt_aarch32(CPUState *cs)
-     uint32_t moe;
-     /* If this is a debug exception we must update the DBGDSCR.MOE bits */
--    switch (env->exception.syndrome >> ARM_EL_EC_SHIFT) {
-+    switch (syn_get_ec(env->exception.syndrome)) {
-     case EC_BREAKPOINT:
-     case EC_BREAKPOINT_SAME_EL:
-         moe = 1;
-@@ -XXX,XX +XXX,XX @@ void arm_cpu_do_interrupt(CPUState *cs)
-     if (qemu_loglevel_mask(CPU_LOG_INT)
-         && !excp_is_internal(cs->exception_index)) {
-         qemu_log_mask(CPU_LOG_INT, "...with ESR 0x%x/0x%" PRIx32 "\n",
--                      env->exception.syndrome >> ARM_EL_EC_SHIFT,
-+                      syn_get_ec(env->exception.syndrome),
-                       env->exception.syndrome);
-     }
-diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/kvm64.c
-+++ b/target/arm/kvm64.c
-@@ -XXX,XX +XXX,XX @@ int kvm_arch_remove_sw_breakpoint(CPUState *cs, struct kvm_sw_breakpoint *bp)
- bool kvm_arm_handle_debug(CPUState *cs, struct kvm_debug_exit_arch *debug_exit)
- {
--    int hsr_ec = debug_exit->hsr >> ARM_EL_EC_SHIFT;
-+    int hsr_ec = syn_get_ec(debug_exit->hsr);
-     ARMCPU *cpu = ARM_CPU(cs);
-     CPUClass *cc = CPU_GET_CLASS(cs);
-     CPUARMState *env = &cpu->env;
-diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/op_helper.c
-+++ b/target/arm/op_helper.c
-@@ -XXX,XX +XXX,XX @@ void raise_exception(CPUARMState *env, uint32_t excp,
-          * (see DDI0478C.a D1.10.4)
-          */
-         target_el = 2;
--        if (syndrome >> ARM_EL_EC_SHIFT == EC_ADVSIMDFPACCESSTRAP) {
-+        if (syn_get_ec(syndrome) == EC_ADVSIMDFPACCESSTRAP) {
-             syndrome = syn_uncategorized();
-         }
-     }
---
-.19.1
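
Reviewer's sketch for the `syn_get_ec()` patch above: the helper in isolation with a small worked value. `ARM_EL_EC_SHIFT` is assumed to be 26, matching the architectural position of the EC field in bits [31:26] of the syndrome; `make_syndrome()` is a hypothetical helper added only for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field positions (architectural ESR layout). */
#define ARM_EL_EC_SHIFT 26
#define ARM_EL_IL_SHIFT 25
#define ARM_EL_IL (1U << ARM_EL_IL_SHIFT)

/* Same body as the helper added by the patch. */
static inline uint32_t syn_get_ec(uint32_t syn)
{
    return syn >> ARM_EL_EC_SHIFT;
}

/* Hypothetical helper: pack an EC and a 25-bit ISS, with IL set. */
static inline uint32_t make_syndrome(uint32_t ec, uint32_t iss)
{
    return (ec << ARM_EL_EC_SHIFT) | ARM_EL_IL | (iss & 0x1ffffffU);
}
```

Since EC occupies the top six bits of a 32-bit value, the plain right shift needs no mask.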

-[Qemu-devel] [PULL 19/45] target/arm: Get IL bit correct for v7 syndrome values
+Deleted patch
-For the v7 version of the Arm architecture, the IL bit in
-syndrome register values where the field is not valid was
-defined to be UNK/SBZP. In v8 this is RES1, which is what
-QEMU currently implements. Handle the desired v7 behaviour
-by squashing the IL bit for the affected cases:
- * EC == EC_UNCATEGORIZED
- * prefetch aborts
- * data aborts where ISV is 0
-(The fourth case listed in the v8 Arm ARM DDI 0487C.a in
-section G7.2.70, "illegal state exception", can't happen
-on a v7 CPU.)
-This deals with a corner case noted in a comment.
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20181012144235.19646-10-peter.maydell@linaro.org
----
- target/arm/internals.h |  7 ++-----
- target/arm/helper.c    | 13 +++++++++++++
-files changed, 15 insertions(+), 5 deletions(-)
-diff --git a/target/arm/internals.h b/target/arm/internals.h
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/internals.h
-+++ b/target/arm/internals.h
-@@ -XXX,XX +XXX,XX @@ static inline uint32_t syn_get_ec(uint32_t syn)
- /* Utility functions for constructing various kinds of syndrome value.
-  * Note that in general we follow the AArch64 syndrome values; in a
-  * few cases the value in HSR for exceptions taken to AArch32 Hyp
-- * mode differs slightly, so if we ever implemented Hyp mode then the
-- * syndrome value would need some massaging on exception entry.
-- * (One example of this is that AArch64 defaults to IL bit set for
-- * exceptions which don't specifically indicate information about the
-- * trapping instruction, whereas AArch32 defaults to IL bit clear.)
-+ * mode differs slightly, and we fix this up when populating HSR in
-+ * arm_cpu_do_interrupt_aarch32_hyp().
-  */
- static inline uint32_t syn_uncategorized(void)
- {
-diff --git a/target/arm/helper.c b/target/arm/helper.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
-+++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ static void arm_cpu_do_interrupt_aarch32_hyp(CPUState *cs)
-     }
-     if (cs->exception_index != EXCP_IRQ && cs->exception_index != EXCP_FIQ) {
-+        if (!arm_feature(env, ARM_FEATURE_V8)) {
-+            /*
-+             * QEMU syndrome values are v8-style. v7 has the IL bit
-+             * UNK/SBZP for "field not valid" cases, where v8 uses RES1.
-+             * If this is a v7 CPU, squash the IL bit in those cases.
-+             */
-+            if (cs->exception_index == EXCP_PREFETCH_ABORT ||
-+                (cs->exception_index == EXCP_DATA_ABORT &&
-+                 !(env->exception.syndrome & ARM_EL_ISV)) ||
-+                syn_get_ec(env->exception.syndrome) == EC_UNCATEGORIZED) {
-+                env->exception.syndrome &= ~ARM_EL_IL;
-+            }
-+        }
-         env->cp15.esr_el[2] = env->exception.syndrome;
-     }
---
-.19.1
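
Reviewer's sketch for the v7 IL-bit patch above: the squashing rule as a standalone C function. `ExcKind` and `hsr_for_v7()` are hypothetical stand-ins for QEMU's `EXCP_*` exception indexes and the code in `arm_cpu_do_interrupt_aarch32_hyp()`; the field positions are assumed from the architectural syndrome layout (EC in [31:26], IL at bit 25, ISV at bit 24).

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field positions (architectural ESR/HSR layout). */
#define ARM_EL_EC_SHIFT  26
#define ARM_EL_IL        (1U << 25)
#define ARM_EL_ISV       (1U << 24)
#define EC_UNCATEGORIZED 0x00U

/* Hypothetical stand-in for the exception index. */
typedef enum { EXC_PREFETCH_ABORT, EXC_DATA_ABORT, EXC_OTHER } ExcKind;

/*
 * Convert a v8-style syndrome to the value a v7 CPU would see in HSR:
 * clear IL in the cases where v7 defines it as UNK/SBZP.
 */
static inline uint32_t hsr_for_v7(uint32_t syn, ExcKind kind)
{
    if (kind == EXC_PREFETCH_ABORT ||
        (kind == EXC_DATA_ABORT && !(syn & ARM_EL_ISV)) ||
        (syn >> ARM_EL_EC_SHIFT) == EC_UNCATEGORIZED) {
        syn &= ~ARM_EL_IL;
    }
    return syn;
}
```

A data abort with ISV set keeps its IL bit, because in that case the field carries real information about the trapping instruction.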

-[Qemu-devel] [PULL 29/45] target/arm: Use gvec for NEON_3R_LOGIC insns
+[PULL 24/26] hw/arm/sbsa-ref: add ITS support in SBSA GIC
-From: Richard Henderson <richard.henderson@linaro.org>
+From: Shashi Mallela <shashi.mallela@linaro.org>
-Move expanders for VBSL, VBIT, and VBIF from translate-a64.c.
+Create ITS as part of SBSA platform GIC initialization.
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+GIC ITS information is in DeviceTree so TF-A can pass it to EDK2.
-Message-id: 20181011205206.3552-9-richard.henderson@linaro.org
 Bumping platform version to 0.2 as this is an important hardware change.
 Signed-off-by: Shashi Mallela <shashi.mallela@linaro.org>
 Signed-off-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
 Message-id: 20230619170913.517373-2-marcin.juszkiewicz@linaro.org
 Co-authored-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
 Signed-off-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate.h     |   6 ++
+ docs/system/arm/sbsa.rst | 14 ++++++++++++++
- target/arm/translate-a64.c |  61 --------------
+ hw/arm/sbsa-ref.c        | 33 ++++++++++++++++++++++++++++++---
- target/arm/translate.c     | 162 +++++++++++++++++++++++++++----------
+files changed, 44 insertions(+), 3 deletions(-)
 files changed, 124 insertions(+), 105 deletions(-)
-diff --git a/target/arm/translate.h b/target/arm/translate.h
+diff --git a/docs/system/arm/sbsa.rst b/docs/system/arm/sbsa.rst
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.h
+--- a/docs/system/arm/sbsa.rst
-+++ b/target/arm/translate.h
++++ b/docs/system/arm/sbsa.rst
-@@ -XXX,XX +XXX,XX @@ static inline TCGv_i32 get_ahp_flag(void)
+@@ -XXX,XX +XXX,XX @@ to be a complete compliant DT. It currently reports:
-     return ret;
+    - platform version
     - GIC addresses
 +Platform version
 +''''''''''''''''
 +
  The platform version is only for informing platform firmware about
  what kind of ``sbsa-ref`` board it is running on. It is neither
  a QEMU versioned machine type nor a reflection of the level of the
@@ -XXX,XX +XXX,XX @@ SBSA/SystemReady SR support provided.
  The ``machine-version-major`` value is updated when changes breaking
  fw compatibility are introduced. The ``machine-version-minor`` value
  is updated when features are added that don't break fw compatibility.
 +
 +Platform version changes:
 +
 +0.0
 +  Devicetree holds information about CPUs, memory and platform version.
 +
 +0.1
 +  GIC information is present in devicetree.
 +
 +0.2
 +  GIC ITS information is present in devicetree.
 diff --git a/hw/arm/sbsa-ref.c b/hw/arm/sbsa-ref.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/arm/sbsa-ref.c
 +++ b/hw/arm/sbsa-ref.c
@@ -XXX,XX +XXX,XX @@ enum {
      SBSA_CPUPERIPHS,
      SBSA_GIC_DIST,
      SBSA_GIC_REDIST,
 +    SBSA_GIC_ITS,
      SBSA_SECURE_EC,
      SBSA_GWDT_WS0,
      SBSA_GWDT_REFRESH,
@@ -XXX,XX +XXX,XX @@ static const MemMapEntry sbsa_ref_memmap[] = {
      [SBSA_CPUPERIPHS] =         { 0x40000000, 0x00040000 },
      [SBSA_GIC_DIST] =           { 0x40060000, 0x00010000 },
      [SBSA_GIC_REDIST] =         { 0x40080000, 0x04000000 },
 +    [SBSA_GIC_ITS] =            { 0x44081000, 0x00020000 },
      [SBSA_SECURE_EC] =          { 0x50000000, 0x00001000 },
      [SBSA_GWDT_REFRESH] =       { 0x50010000, 0x00001000 },
      [SBSA_GWDT_CONTROL] =       { 0x50011000, 0x00001000 },
@@ -XXX,XX +XXX,XX @@ static void sbsa_fdt_add_gic_node(SBSAMachineState *sms)
 , sbsa_ref_memmap[SBSA_GIC_REDIST].base,
 , sbsa_ref_memmap[SBSA_GIC_REDIST].size);
 +    nodename = g_strdup_printf("/intc/its");
 +    qemu_fdt_add_subnode(sms->fdt, nodename);
 +    qemu_fdt_setprop_sized_cells(sms->fdt, nodename, "reg",
 +                                 2, sbsa_ref_memmap[SBSA_GIC_ITS].base,
 +                                 2, sbsa_ref_memmap[SBSA_GIC_ITS].size);
 +
      g_free(nodename);
  }
-+
-+/* Vector operations shared between ARM and AArch64.  */
-+extern const GVecGen3 bsl_op;
-+extern const GVecGen3 bit_op;
-+extern const GVecGen3 bif_op;
 +
  /*
-  * Forward to the isar_feature_* tests given a DisasContext pointer.
+  * Firmware on this machine only uses ACPI table to load OS, these limited
-  */
+  * device tree nodes are just to let firmware know the info which varies from
-diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
+@@ -XXX,XX +XXX,XX @@ static void create_fdt(SBSAMachineState *sms)
-index XXXXXXX..XXXXXXX 100644
+      *                        fw compatibility.
---- a/target/arm/translate-a64.c
+      */
-+++ b/target/arm/translate-a64.c
+     qemu_fdt_setprop_cell(fdt, "/", "machine-version-major", 0);
-@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_diff(DisasContext *s, uint32_t insn)
+-    qemu_fdt_setprop_cell(fdt, "/", "machine-version-minor", 1);
-     }
++    qemu_fdt_setprop_cell(fdt, "/", "machine-version-minor", 2);
      if (ms->numa_state->have_numa_distance) {
          int size = nb_numa_nodes * nb_numa_nodes * 3 * sizeof(uint32_t);
@@ -XXX,XX +XXX,XX @@ static void create_secure_ram(SBSAMachineState *sms,
      memory_region_add_subregion(secure_sysmem, base, secram);
  }
--static void gen_bsl_i64(TCGv_i64 rd, TCGv_i64 rn, TCGv_i64 rm)
+-static void create_gic(SBSAMachineState *sms)
--{
++static void create_its(SBSAMachineState *sms)
 -    tcg_gen_xor_i64(rn, rn, rm);
 -    tcg_gen_and_i64(rn, rn, rd);
 -    tcg_gen_xor_i64(rd, rm, rn);
 -}
 -
 -static void gen_bit_i64(TCGv_i64 rd, TCGv_i64 rn, TCGv_i64 rm)
 -{
 -    tcg_gen_xor_i64(rn, rn, rd);
 -    tcg_gen_and_i64(rn, rn, rm);
 -    tcg_gen_xor_i64(rd, rd, rn);
 -}
 -
 -static void gen_bif_i64(TCGv_i64 rd, TCGv_i64 rn, TCGv_i64 rm)
 -{
 -    tcg_gen_xor_i64(rn, rn, rd);
 -    tcg_gen_andc_i64(rn, rn, rm);
 -    tcg_gen_xor_i64(rd, rd, rn);
 -}
 -
 -static void gen_bsl_vec(unsigned vece, TCGv_vec rd, TCGv_vec rn, TCGv_vec rm)
 -{
 -    tcg_gen_xor_vec(vece, rn, rn, rm);
 -    tcg_gen_and_vec(vece, rn, rn, rd);
 -    tcg_gen_xor_vec(vece, rd, rm, rn);
 -}
 -
 -static void gen_bit_vec(unsigned vece, TCGv_vec rd, TCGv_vec rn, TCGv_vec rm)
 -{
 -    tcg_gen_xor_vec(vece, rn, rn, rd);
 -    tcg_gen_and_vec(vece, rn, rn, rm);
 -    tcg_gen_xor_vec(vece, rd, rd, rn);
 -}
 -
 -static void gen_bif_vec(unsigned vece, TCGv_vec rd, TCGv_vec rn, TCGv_vec rm)
 -{
 -    tcg_gen_xor_vec(vece, rn, rn, rd);
 -    tcg_gen_andc_vec(vece, rn, rn, rm);
 -    tcg_gen_xor_vec(vece, rd, rd, rn);
 -}
 -
  /* Logic op (opcode == 3) subgroup of C3.6.16. */
  static void disas_simd_3same_logic(DisasContext *s, uint32_t insn)
  {
 -    static const GVecGen3 bsl_op = {
 -        .fni8 = gen_bsl_i64,
 -        .fniv = gen_bsl_vec,
 -        .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -        .load_dest = true
 -    };
 -    static const GVecGen3 bit_op = {
 -        .fni8 = gen_bit_i64,
 -        .fniv = gen_bit_vec,
 -        .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -        .load_dest = true
 -    };
 -    static const GVecGen3 bif_op = {
 -        .fni8 = gen_bif_i64,
 -        .fniv = gen_bif_vec,
 -        .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -        .load_dest = true
 -    };
 -
      int rd = extract32(insn, 0, 5);
      int rn = extract32(insn, 5, 5);
      int rm = extract32(insn, 16, 5);
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
      return 0;
  }
 -/* Bitwise select.  dest = c ? t : f.  Clobbers T and F.  */
 -static void gen_neon_bsl(TCGv_i32 dest, TCGv_i32 t, TCGv_i32 f, TCGv_i32 c)
 -{
 -    tcg_gen_and_i32(t, t, c);
 -    tcg_gen_andc_i32(f, f, c);
 -    tcg_gen_or_i32(dest, t, f);
 -}
 -
  static inline void gen_neon_narrow(int size, TCGv_i32 dest, TCGv_i64 src)
  {
      switch (size) {
@@ -XXX,XX +XXX,XX @@ static int do_v81_helper(DisasContext *s, gen_helper_gvec_3_ptr *fn,
      return 1;
  }
 +/*
 + * Expanders for VBitOps_VBIF, VBIT, VBSL.
 + */
 +static void gen_bsl_i64(TCGv_i64 rd, TCGv_i64 rn, TCGv_i64 rm)
 +{
-+    tcg_gen_xor_i64(rn, rn, rm);
++    const char *itsclass = its_class_name();
-+    tcg_gen_and_i64(rn, rn, rd);
++    DeviceState *dev;
-+    tcg_gen_xor_i64(rd, rm, rn);
++
 +    dev = qdev_new(itsclass);
 +
 +    object_property_set_link(OBJECT(dev), "parent-gicv3", OBJECT(sms->gic),
 +                             &error_abort);
 +    sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
 +    sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, sbsa_ref_memmap[SBSA_GIC_ITS].base);
 +}
 +
-+static void gen_bit_i64(TCGv_i64 rd, TCGv_i64 rn, TCGv_i64 rm)
++static void create_gic(SBSAMachineState *sms, MemoryRegion *mem)
-+{
+ {
-+    tcg_gen_xor_i64(rn, rn, rd);
+     unsigned int smp_cpus = MACHINE(sms)->smp.cpus;
-+    tcg_gen_and_i64(rn, rn, rm);
+     SysBusDevice *gicbusdev;
-+    tcg_gen_xor_i64(rd, rd, rn);
+@@ -XXX,XX +XXX,XX @@ static void create_gic(SBSAMachineState *sms)
-+}
+     qdev_prop_set_uint32(sms->gic, "len-redist-region-count", 1);
      qdev_prop_set_uint32(sms->gic, "redist-region-count[0]", redist0_count);
 +    object_property_set_link(OBJECT(sms->gic), "sysmem",
 +                             OBJECT(mem), &error_fatal);
 +    qdev_prop_set_bit(sms->gic, "has-lpi", true);
 +
-+static void gen_bif_i64(TCGv_i64 rd, TCGv_i64 rn, TCGv_i64 rm)
+     gicbusdev = SYS_BUS_DEVICE(sms->gic);
-+{
+     sysbus_realize_and_unref(gicbusdev, &error_fatal);
-+    tcg_gen_xor_i64(rn, rn, rd);
+     sysbus_mmio_map(gicbusdev, 0, sbsa_ref_memmap[SBSA_GIC_DIST].base);
-+    tcg_gen_andc_i64(rn, rn, rm);
+@@ -XXX,XX +XXX,XX @@ static void create_gic(SBSAMachineState *sms)
-+    tcg_gen_xor_i64(rd, rd, rn);
+         sysbus_connect_irq(gicbusdev, i + 3 * smp_cpus,
-+}
+                            qdev_get_gpio_in(cpudev, ARM_CPU_VFIQ));
-+
+     }
-+static void gen_bsl_vec(unsigned vece, TCGv_vec rd, TCGv_vec rn, TCGv_vec rm)
++    create_its(sms);
-+{
+ }
-+    tcg_gen_xor_vec(vece, rn, rn, rm);
-+    tcg_gen_and_vec(vece, rn, rn, rd);
+ static void create_uart(const SBSAMachineState *sms, int uart,
-+    tcg_gen_xor_vec(vece, rd, rm, rn);
+@@ -XXX,XX +XXX,XX @@ static void sbsa_ref_init(MachineState *machine)
-+}
-+
+     create_secure_ram(sms, secure_sysmem);
-+static void gen_bit_vec(unsigned vece, TCGv_vec rd, TCGv_vec rn, TCGv_vec rm)
-+{
+-    create_gic(sms);
-+    tcg_gen_xor_vec(vece, rn, rn, rd);
++    create_gic(sms, sysmem);
-+    tcg_gen_and_vec(vece, rn, rn, rm);
-+    tcg_gen_xor_vec(vece, rd, rd, rn);
+     create_uart(sms, SBSA_UART, sysmem, serial_hd(0));
-+}
+     create_uart(sms, SBSA_SECURE_UART, secure_sysmem, serial_hd(1));
 +
 +static void gen_bif_vec(unsigned vece, TCGv_vec rd, TCGv_vec rn, TCGv_vec rm)
 +{
 +    tcg_gen_xor_vec(vece, rn, rn, rd);
 +    tcg_gen_andc_vec(vece, rn, rn, rm);
 +    tcg_gen_xor_vec(vece, rd, rd, rn);
 +}
 +
 +const GVecGen3 bsl_op = {
 +    .fni8 = gen_bsl_i64,
 +    .fniv = gen_bsl_vec,
 +    .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +    .load_dest = true
 +};
 +
 +const GVecGen3 bit_op = {
 +    .fni8 = gen_bit_i64,
 +    .fniv = gen_bit_vec,
 +    .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +    .load_dest = true
 +};
 +
 +const GVecGen3 bif_op = {
 +    .fni8 = gen_bif_i64,
 +    .fniv = gen_bif_vec,
 +    .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +    .load_dest = true
 +};
 +
 +
  /* Translate a NEON data processing instruction.  Return nonzero if the
     instruction is invalid.
     We process data in a mixture of 32-bit and 64-bit chunks.
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
  {
      int op;
      int q;
 -    int rd, rn, rm;
 +    int rd, rn, rm, rd_ofs, rn_ofs, rm_ofs;
      int size;
      int shift;
      int pass;
      int count;
      int pairwise;
      int u;
 +    int vec_size;
      uint32_t imm, mask;
      TCGv_i32 tmp, tmp2, tmp3, tmp4, tmp5;
      TCGv_ptr ptr1, ptr2, ptr3;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
      VFP_DREG_N(rn, insn);
      VFP_DREG_M(rm, insn);
      size = (insn >> 20) & 3;
 +    vec_size = q ? 16 : 8;
 +    rd_ofs = neon_reg_offset(rd, 0);
 +    rn_ofs = neon_reg_offset(rn, 0);
 +    rm_ofs = neon_reg_offset(rm, 0);
 +
      if ((insn & (1 << 23)) == 0) {
          /* Three register same length.  */
          op = ((insn >> 7) & 0x1e) | ((insn >> 4) & 1);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                                       q, rd, rn, rm);
              }
              return 1;
 +
 +        case NEON_3R_LOGIC: /* Logic ops.  */
 +            switch ((u << 2) | size) {
 +            case 0: /* VAND */
 +                tcg_gen_gvec_and(0, rd_ofs, rn_ofs, rm_ofs,
 +                                 vec_size, vec_size);
 +                break;
 +            case 1: /* VBIC */
 +                tcg_gen_gvec_andc(0, rd_ofs, rn_ofs, rm_ofs,
 +                                  vec_size, vec_size);
 +                break;
 +            case 2:
 +                if (rn == rm) {
 +                    /* VMOV */
 +                    tcg_gen_gvec_mov(0, rd_ofs, rn_ofs, vec_size, vec_size);
 +                } else {
 +                    /* VORR */
 +                    tcg_gen_gvec_or(0, rd_ofs, rn_ofs, rm_ofs,
 +                                    vec_size, vec_size);
 +                }
 +                break;
 +            case 3: /* VORN */
 +                tcg_gen_gvec_orc(0, rd_ofs, rn_ofs, rm_ofs,
 +                                 vec_size, vec_size);
 +                break;
 +            case 4: /* VEOR */
 +                tcg_gen_gvec_xor(0, rd_ofs, rn_ofs, rm_ofs,
 +                                 vec_size, vec_size);
 +                break;
 +            case 5: /* VBSL */
 +                tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,
 +                               vec_size, vec_size, &bsl_op);
 +                break;
 +            case 6: /* VBIT */
 +                tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,
 +                               vec_size, vec_size, &bit_op);
 +                break;
 +            case 7: /* VBIF */
 +                tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,
 +                               vec_size, vec_size, &bif_op);
 +                break;
 +            }
 +            return 0;
          }
 -        if (size == 3 && op != NEON_3R_LOGIC) {
 +        if (size == 3) {
              /* 64-bit element instructions. */
              for (pass = 0; pass < (q ? 2 : 1); pass++) {
                  neon_load_reg64(cpu_V0, rn + pass);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          case NEON_3R_VRHADD:
              GEN_NEON_INTEGER_OP(rhadd);
              break;
 -        case NEON_3R_LOGIC: /* Logic ops.  */
 -            switch ((u << 2) | size) {
 -            case 0: /* VAND */
 -                tcg_gen_and_i32(tmp, tmp, tmp2);
 -                break;
 -            case 1: /* BIC */
 -                tcg_gen_andc_i32(tmp, tmp, tmp2);
 -                break;
 -            case 2: /* VORR */
 -                tcg_gen_or_i32(tmp, tmp, tmp2);
 -                break;
 -            case 3: /* VORN */
 -                tcg_gen_orc_i32(tmp, tmp, tmp2);
 -                break;
 -            case 4: /* VEOR */
 -                tcg_gen_xor_i32(tmp, tmp, tmp2);
 -                break;
 -            case 5: /* VBSL */
 -                tmp3 = neon_load_reg(rd, pass);
 -                gen_neon_bsl(tmp, tmp, tmp2, tmp3);
 -                tcg_temp_free_i32(tmp3);
 -                break;
 -            case 6: /* VBIT */
 -                tmp3 = neon_load_reg(rd, pass);
 -                gen_neon_bsl(tmp, tmp, tmp3, tmp2);
 -                tcg_temp_free_i32(tmp3);
 -                break;
 -            case 7: /* VBIF */
 -                tmp3 = neon_load_reg(rd, pass);
 -                gen_neon_bsl(tmp, tmp3, tmp, tmp2);
 -                tcg_temp_free_i32(tmp3);
 -                break;
 -            }
 -            break;
          case NEON_3R_VHSUB:
              GEN_NEON_INTEGER_OP(hsub);
              break;
 --
-.19.1
+.34.1
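
For readers comparing the removed per-pass code with the gvec version, the three select operations differ only in which register plays the selector role. The following is a minimal standalone sketch, not QEMU code. It assumes that gen_neon_bsl(dest, t, f, c) computes (t & c) | (f & ~c) and that tmp and tmp2 hold the rn and rm operands, neither of which is shown in the hunk above. The names bsl32, vbsl32, vbit32, vbif32 and bsl_examples are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bitwise select: bits of t where c is 1, bits of f where c is 0. */
static uint32_t bsl32(uint32_t t, uint32_t f, uint32_t c)
{
    return (t & c) | (f & ~c);
}

/* VBSL: the old destination value is the selector between rn and rm. */
static uint32_t vbsl32(uint32_t rd, uint32_t rn, uint32_t rm)
{
    return bsl32(rn, rm, rd);
}

/* VBIT: copy bits of rn into rd where rm is 1. */
static uint32_t vbit32(uint32_t rd, uint32_t rn, uint32_t rm)
{
    return bsl32(rn, rd, rm);
}

/* VBIF: copy bits of rn into rd where rm is 0. */
static uint32_t vbif32(uint32_t rd, uint32_t rn, uint32_t rm)
{
    return bsl32(rd, rn, rm);
}

/* Small usage example for one 32-bit lane. */
static void bsl_examples(void)
{
    assert(vbsl32(0xFFFF0000u, 0x12345678u, 0x9ABCDEF0u) == 0x1234DEF0u);
    assert(vbit32(0xAAAAAAAAu, 0x12345678u, 0x0000FFFFu) == 0xAAAA5678u);
    assert(vbif32(0xAAAAAAAAu, 0x12345678u, 0x0000FFFFu) == 0x1234AAAAu);
}
```

The argument order passed to bsl32 mirrors the order of the three gen_neon_bsl calls in the removed lines, under the operand assumption stated above.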

-[Qemu-devel] [PULL 44/45] target/arm: Remove writefn from TTBR0_EL3
+[PULL 25/26] target/arm: Fix sve predicate store, 8 <= VQ <= 15
 From: Richard Henderson <richard.henderson@linaro.org>
-The EL3 version of this register does not include an ASID,
+Brown bag time: store instead of load results in uninitialized temp.
 and so the tlb_flush performed by vmsa_ttbr_write is not needed.
-Reviewed-by: Aaron Lindsay <aaron@os.amperecomputing.com>
 Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1704
 Reported-by: Mark Rutland <mark.rutland@arm.com>
 Tested-by: Alex Bennée <alex.bennee@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
 Message-id: 20230620134659.817559-1-richard.henderson@linaro.org
 Fixes: e6dd5e782be ("target/arm: Use tcg_gen_qemu_{ld, st}_i128 in gen_sve_{ld, st}r")
 Tested-by: Alex Bennée <alex.bennee@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Message-id: 20181019015617.22583-2-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/helper.c | 2 +-
+ target/arm/tcg/translate-sve.c | 2 +-
 file changed, 1 insertion(+), 1 deletion(-)
-diff --git a/target/arm/helper.c b/target/arm/helper.c
+diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
+--- a/target/arm/tcg/translate-sve.c
-+++ b/target/arm/helper.c
++++ b/target/arm/tcg/translate-sve.c
-@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo el3_cp_reginfo[] = {
+@@ -XXX,XX +XXX,XX @@ void gen_sve_str(DisasContext *s, TCGv_ptr base, int vofs,
-       .fieldoffset = offsetof(CPUARMState, cp15.mvbar) },
+     /* Predicate register stores can be any multiple of 2.  */
-     { .name = "TTBR0_EL3", .state = ARM_CP_STATE_AA64,
+     if (len_remain >= 8) {
-       .opc0 = 3, .opc1 = 6, .crn = 2, .crm = 0, .opc2 = 0,
+         t0 = tcg_temp_new_i64();
--      .access = PL3_RW, .writefn = vmsa_ttbr_write, .resetvalue = 0,
+-        tcg_gen_st_i64(t0, base, vofs + len_align);
-+      .access = PL3_RW, .resetvalue = 0,
++        tcg_gen_ld_i64(t0, base, vofs + len_align);
-       .fieldoffset = offsetof(CPUARMState, cp15.ttbr0_el[3]) },
+         tcg_gen_qemu_st_i64(t0, clean_addr, midx, MO_LEUQ | MO_ATOM_NONE);
-     { .name = "TCR_EL3", .state = ARM_CP_STATE_AA64,
+         len_remain -= 8;
-       .opc0 = 3, .opc1 = 6, .crn = 2, .crm = 0, .opc2 = 2,
+         len_align += 8;
 --
-.19.1
+.34.1
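
The one-line fix in the predicate store patch is easier to see in a plain C model. This is a rough sketch under simplifying assumptions, not the TCG code: the register image and guest memory are modelled as byte arrays, and store_tail_chunk is a hypothetical name. The point is only the ordering: the temporary has to be filled from the register image before it is written out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy one trailing 8-byte chunk of a register image (reg, standing in
 * for CPU state) out to guest memory (mem) at the same offset.
 */
static void store_tail_chunk(uint8_t *mem, const uint8_t *reg, size_t len_align)
{
    uint64_t t0;

    assert(mem != NULL && reg != NULL);

    /* Load step: fill the temporary from the register image. */
    memcpy(&t0, reg + len_align, sizeof(t0));
    /* Store step: write the temporary out to guest memory. */
    memcpy(mem + len_align, &t0, sizeof(t0));
}
```

In this model the bug corresponds to replacing the first memcpy with a second store, which would leave t0 unset when it is written to memory.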

-[Qemu-devel] [PULL 20/45] target/arm: Report correct syndrome for FP/SIMD traps to Hyp mode
+[PULL 26/26] pc-bios/keymaps: Use the official xkb name for Arabic layout, not the legacy synonym
-For traps of FP/SIMD instructions to AArch32 Hyp mode, the syndrome
+The xkb official name for the Arabic keyboard layout is 'ara'.
-provided in HSR has more information than is reported to AArch64.
+However xkb has for at least the past 15 years also permitted it to
-Specifically, there are extra fields TA and coproc which indicate
+be named via the legacy synonym 'ar'.  In xkeyboard-config 2.39 this
-whether the trapped instruction was FP or SIMD. Add this extra
+synonym was removed, which breaks compilation of QEMU:
 information to the syndromes we construct, and mask it out when
 taking the exception to AArch64.
+FAILED: pc-bios/keymaps/ar
+/home/fred/qemu-git/src/qemu/build-full/qemu-keymap -f pc-bios/keymaps/ar -l ar
+xkbcommon: ERROR: Couldn't find file "symbols/ar" in include paths
+xkbcommon: ERROR: 1 include paths searched:
+xkbcommon: ERROR:     /usr/share/X11/xkb
+xkbcommon: ERROR: 3 include paths could not be added:
+xkbcommon: ERROR:     /home/fred/.config/xkb
+xkbcommon: ERROR:     /home/fred/.xkb
+xkbcommon: ERROR:     /etc/xkb
+xkbcommon: ERROR: Abandoning symbols file "(unnamed)"
+xkbcommon: ERROR: Failed to compile xkb_symbols
+xkbcommon: ERROR: Failed to compile keymap
+The upstream xkeyboard-config change removing the compat
+mapping is:
+https://gitlab.freedesktop.org/xkeyboard-config/xkeyboard-config/-/commit/470ad2cd8fea84d7210377161d86b31999bb5ea6
+Make QEMU always ask for the 'ara' xkb layout, which should work on
+both older and newer xkeyboard-config.  We leave the QEMU name for
+this keyboard layout as 'ar'; it is not the only one where our name
+for it deviates from the xkb standard name.
+Cc: qemu-stable@nongnu.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20181012144235.19646-11-peter.maydell@linaro.org
+Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
 Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
 Message-id: 20230620162024.1132013-1-peter.maydell@linaro.org
 Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1709
 ---
- target/arm/internals.h | 14 +++++++++++++-
+ pc-bios/keymaps/meson.build | 2 +-
- target/arm/helper.c    |  9 +++++++++
+file changed, 1 insertion(+), 1 deletion(-)
  target/arm/translate.c |  8 ++++----
 files changed, 26 insertions(+), 5 deletions(-)
-diff --git a/target/arm/internals.h b/target/arm/internals.h
+diff --git a/pc-bios/keymaps/meson.build b/pc-bios/keymaps/meson.build
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/internals.h
+--- a/pc-bios/keymaps/meson.build
-+++ b/target/arm/internals.h
++++ b/pc-bios/keymaps/meson.build
-@@ -XXX,XX +XXX,XX @@ static inline uint32_t syn_get_ec(uint32_t syn)
+@@ -XXX,XX +XXX,XX @@
-  * few cases the value in HSR for exceptions taken to AArch32 Hyp
+ keymaps = {
-  * mode differs slightly, and we fix this up when populating HSR in
+-  'ar': '-l ar',
-  * arm_cpu_do_interrupt_aarch32_hyp().
++  'ar': '-l ara',
-+ * The exception is FP/SIMD access traps -- these report extra information
+   'bepo': '-l fr -v dvorak',
-+ * when taking an exception to AArch32. For those we include the extra coproc
+   'cz': '-l cz',
-+ * and TA fields, and mask them out when taking the exception to AArch64.
+   'da': '-l dk',
   */
  static inline uint32_t syn_uncategorized(void)
  {
@@ -XXX,XX +XXX,XX @@ static inline uint32_t syn_cp15_rrt_trap(int cv, int cond, int opc1, int crm,
  static inline uint32_t syn_fp_access_trap(int cv, int cond, bool is_16bit)
  {
 +    /* AArch32 FP trap or any AArch64 FP/SIMD trap: TA == 0 coproc == 0xa */
      return (EC_ADVSIMDFPACCESSTRAP << ARM_EL_EC_SHIFT)
          | (is_16bit ? 0 : ARM_EL_IL)
 -        | (cv << 24) | (cond << 20);
 +        | (cv << 24) | (cond << 20) | 0xa;
 +}
 +
 +static inline uint32_t syn_simd_access_trap(int cv, int cond, bool is_16bit)
 +{
 +    /* AArch32 SIMD trap: TA == 1 coproc == 0 */
 +    return (EC_ADVSIMDFPACCESSTRAP << ARM_EL_EC_SHIFT)
 +        | (is_16bit ? 0 : ARM_EL_IL)
 +        | (cv << 24) | (cond << 20) | (1 << 5);
  }
  static inline uint32_t syn_sve_access_trap(void)
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.c
 +++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_do_interrupt_aarch64(CPUState *cs)
      case EXCP_HVC:
      case EXCP_HYP_TRAP:
      case EXCP_SMC:
 +        if (syn_get_ec(env->exception.syndrome) == EC_ADVSIMDFPACCESSTRAP) {
 +            /*
 +             * QEMU internal FP/SIMD syndromes from AArch32 include the
 +             * TA and coproc fields which are only exposed if the exception
 +             * is taken to AArch32 Hyp mode. Mask them out to get a valid
 +             * AArch64 format syndrome.
 +             */
 +            env->exception.syndrome &= ~MAKE_64BIT_MASK(0, 20);
 +        }
          env->cp15.esr_el[new_el] = env->exception.syndrome;
          break;
      case EXCP_IRQ:
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
       */
      if (s->fp_excp_el) {
          gen_exception_insn(s, 4, EXCP_UDEF,
 -                           syn_fp_access_trap(1, 0xe, false), s->fp_excp_el);
 +                           syn_simd_access_trap(1, 0xe, false), s->fp_excp_el);
          return 0;
      }
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
       */
      if (s->fp_excp_el) {
          gen_exception_insn(s, 4, EXCP_UDEF,
 -                           syn_fp_access_trap(1, 0xe, false), s->fp_excp_el);
 +                           syn_simd_access_trap(1, 0xe, false), s->fp_excp_el);
          return 0;
      }
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
      if (s->fp_excp_el) {
          gen_exception_insn(s, 4, EXCP_UDEF,
 -                           syn_fp_access_trap(1, 0xe, false), s->fp_excp_el);
 +                           syn_simd_access_trap(1, 0xe, false), s->fp_excp_el);
          return 0;
      }
      if (!s->vfp_enabled) {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
      if (s->fp_excp_el) {
          gen_exception_insn(s, 4, EXCP_UDEF,
 -                           syn_fp_access_trap(1, 0xe, false), s->fp_excp_el);
 +                           syn_simd_access_trap(1, 0xe, false), s->fp_excp_el);
          return 0;
      }
      if (!s->vfp_enabled) {
 --
-.19.1
+.34.1

-[Qemu-devel] [PULL 21/45] hw/arm/boot: Increase compliance with kernel arm64 boot protocol
+Deleted patch
-From: Stewart Hildebrand <Stewart.Hildebrand@dornerworks.com>
-"The Image must be placed text_offset bytes from a 2MB aligned base
-address anywhere in usable system RAM and called there."
-For the virt board, we write our startup bootloader at the very
-bottom of RAM, so that bit can't be used for the image. To avoid
-overlap in case the image requests to be loaded at an offset
-smaller than our bootloader, we increment the load offset to the
-next 2MB.
-This fixes a boot failure for Xen AArch64.
-Signed-off-by: Stewart Hildebrand <stewart.hildebrand@dornerworks.com>
-Tested-by: Andre Przywara <andre.przywara@arm.com>
-Message-id: b8a89518794b4436af0c151ed10de4fa@dornerworks.com
-[PMM: Rephrased a comment a bit]
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
----
- hw/arm/boot.c | 18 ++++++++++++++++++
-file changed, 18 insertions(+)
-diff --git a/hw/arm/boot.c b/hw/arm/boot.c
-index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/boot.c
-+++ b/hw/arm/boot.c
-@@ -XXX,XX +XXX,XX @@
- #include "qemu/config-file.h"
- #include "qemu/option.h"
- #include "exec/address-spaces.h"
-+#include "qemu/units.h"
- /* Kernel boot protocol is specified in the kernel docs
-  * Documentation/arm/Booting and Documentation/arm64/booting.txt
-@@ -XXX,XX +XXX,XX @@
- #define ARM64_TEXT_OFFSET_OFFSET    8
- #define ARM64_MAGIC_OFFSET          56
-+#define BOOTLOADER_MAX_SIZE         (4 * KiB)
-+
- AddressSpace *arm_boot_address_space(ARMCPU *cpu,
-                                      const struct arm_boot_info *info)
- {
-@@ -XXX,XX +XXX,XX @@ static void write_bootloader(const char *name, hwaddr addr,
-         code[i] = tswap32(insn);
-     }
-+    assert((len * sizeof(uint32_t)) < BOOTLOADER_MAX_SIZE);
-+
-     rom_add_blob_fixed_as(name, code, len * sizeof(uint32_t), addr, as);
-     g_free(code);
-@@ -XXX,XX +XXX,XX @@ static uint64_t load_aarch64_image(const char *filename, hwaddr mem_base,
-         memcpy(&hdrvals, buffer + ARM64_TEXT_OFFSET_OFFSET, sizeof(hdrvals));
-         if (hdrvals[1] != 0) {
-             kernel_load_offset = le64_to_cpu(hdrvals[0]);
-+
-+            /*
-+             * We write our startup "bootloader" at the very bottom of RAM,
-+             * so that bit can't be used for the image. Luckily the Image
-+             * format specification is that the image requests only an offset
-+             * from a 2MB boundary, not an absolute load address. So if the
-+             * image requests an offset that might mean it overlaps with the
-+             * bootloader, we can just load it starting at 2MB+offset rather
-+             * than 0MB + offset.
-+             */
-+            if (kernel_load_offset < BOOTLOADER_MAX_SIZE) {
-+                kernel_load_offset += 2 * MiB;
-+            }
-         }
-     }
---
-.19.1
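
The load-offset adjustment from the deleted hw/arm/boot patch can be illustrated in isolation. The following is a minimal sketch, assuming KiB and MiB have their usual binary meanings as in qemu/units.h; adjust_kernel_load_offset is a hypothetical helper name, not a function from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define KiB (1024ULL)
#define MiB (1024ULL * KiB)
#define BOOTLOADER_MAX_SIZE (4 * KiB)

/*
 * If the image's requested text_offset would overlap the bootloader
 * stub at the bottom of RAM, move it up by one 2MB boundary. The Image
 * header only asks for an offset from a 2MB-aligned base, so this
 * remains a valid placement.
 */
static uint64_t adjust_kernel_load_offset(uint64_t text_offset)
{
    if (text_offset < BOOTLOADER_MAX_SIZE) {
        text_offset += 2 * MiB;
    }
    assert(text_offset >= BOOTLOADER_MAX_SIZE);
    return text_offset;
}
```

An image asking for offset 0 is therefore loaded at 2MB, while one asking for 0x80000 is left where it is.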

-[Qemu-devel] [PULL 22/45] target/arm: Hoist address increment for vector memory ops
+Deleted patch
-From: Richard Henderson <rth@twiddle.net>
-This can reduce the number of opcodes required for certain
-complex forms of load-multiple (e.g. ld4.16b).
-Signed-off-by: Richard Henderson <rth@twiddle.net>
-Message-id: 20181011205206.3552-2-richard.henderson@linaro.org
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
----
- target/arm/translate-a64.c | 12 ++++++++----
-file changed, 8 insertions(+), 4 deletions(-)
-diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-a64.c
-+++ b/target/arm/translate-a64.c
-@@ -XXX,XX +XXX,XX @@ static void disas_ldst_multiple_struct(DisasContext *s, uint32_t insn)
-     bool is_store = !extract32(insn, 22, 1);
-     bool is_postidx = extract32(insn, 23, 1);
-     bool is_q = extract32(insn, 30, 1);
--    TCGv_i64 tcg_addr, tcg_rn;
-+    TCGv_i64 tcg_addr, tcg_rn, tcg_ebytes;
-     int ebytes = 1 << size;
-     int elements = (is_q ? 128 : 64) / (8 << size);
-@@ -XXX,XX +XXX,XX @@ static void disas_ldst_multiple_struct(DisasContext *s, uint32_t insn)
-     tcg_rn = cpu_reg_sp(s, rn);
-     tcg_addr = tcg_temp_new_i64();
-     tcg_gen_mov_i64(tcg_addr, tcg_rn);
-+    tcg_ebytes = tcg_const_i64(ebytes);
-     for (r = 0; r < rpt; r++) {
-         int e;
-@@ -XXX,XX +XXX,XX @@ static void disas_ldst_multiple_struct(DisasContext *s, uint32_t insn)
-                         clear_vec_high(s, is_q, tt);
-                     }
-                 }
--                tcg_gen_addi_i64(tcg_addr, tcg_addr, ebytes);
-+                tcg_gen_add_i64(tcg_addr, tcg_addr, tcg_ebytes);
-                 tt = (tt + 1) % 32;
-             }
-         }
-@@ -XXX,XX +XXX,XX @@ static void disas_ldst_multiple_struct(DisasContext *s, uint32_t insn)
-             tcg_gen_add_i64(tcg_rn, tcg_rn, cpu_reg(s, rm));
-         }
-     }
-+    tcg_temp_free_i64(tcg_ebytes);
-     tcg_temp_free_i64(tcg_addr);
- }
-@@ -XXX,XX +XXX,XX @@ static void disas_ldst_single_struct(DisasContext *s, uint32_t insn)
-     bool replicate = false;
-     int index = is_q << 3 | S << 2 | size;
-     int ebytes, xs;
--    TCGv_i64 tcg_addr, tcg_rn;
-+    TCGv_i64 tcg_addr, tcg_rn, tcg_ebytes;
-     switch (scale) {
-     case 3:
-@@ -XXX,XX +XXX,XX @@ static void disas_ldst_single_struct(DisasContext *s, uint32_t insn)
-     tcg_rn = cpu_reg_sp(s, rn);
-     tcg_addr = tcg_temp_new_i64();
-     tcg_gen_mov_i64(tcg_addr, tcg_rn);
-+    tcg_ebytes = tcg_const_i64(ebytes);
-     for (xs = 0; xs < selem; xs++) {
-         if (replicate) {
-@@ -XXX,XX +XXX,XX @@ static void disas_ldst_single_struct(DisasContext *s, uint32_t insn)
-                 do_vec_st(s, rt, index, tcg_addr, scale);
-             }
-         }
--        tcg_gen_addi_i64(tcg_addr, tcg_addr, ebytes);
-+        tcg_gen_add_i64(tcg_addr, tcg_addr, tcg_ebytes);
-         rt = (rt + 1) % 32;
-     }
-@@ -XXX,XX +XXX,XX @@ static void disas_ldst_single_struct(DisasContext *s, uint32_t insn)
-             tcg_gen_add_i64(tcg_rn, tcg_rn, cpu_reg(s, rm));
-         }
-     }
-+    tcg_temp_free_i64(tcg_ebytes);
-     tcg_temp_free_i64(tcg_addr);
- }
---
-.19.1

-[Qemu-devel] [PULL 24/45] target/arm: Use tcg_gen_gvec_dup_i64 for LD[1-4]R
+Deleted patch
-From: Richard Henderson <richard.henderson@linaro.org>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20181011205206.3552-4-richard.henderson@linaro.org
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
----
- target/arm/translate-a64.c | 28 +++-------------------------
-file changed, 3 insertions(+), 25 deletions(-)
-diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-a64.c
-+++ b/target/arm/translate-a64.c
-@@ -XXX,XX +XXX,XX @@ static void disas_ldst_single_struct(DisasContext *s, uint32_t insn)
-     for (xs = 0; xs < selem; xs++) {
-         if (replicate) {
-             /* Load and replicate to all elements */
--            uint64_t mulconst;
-             TCGv_i64 tcg_tmp = tcg_temp_new_i64();
-             tcg_gen_qemu_ld_i64(tcg_tmp, tcg_addr,
-                                 get_mem_index(s), s->be_data + scale);
--            switch (scale) {
--            case 0:
--                mulconst = 0x0101010101010101ULL;
--                break;
--            case 1:
--                mulconst = 0x0001000100010001ULL;
--                break;
--            case 2:
--                mulconst = 0x0000000100000001ULL;
--                break;
--            case 3:
--                mulconst = 0;
--                break;
--            default:
--                g_assert_not_reached();
--            }
--            if (mulconst) {
--                tcg_gen_muli_i64(tcg_tmp, tcg_tmp, mulconst);
--            }
--            write_vec_element(s, tcg_tmp, rt, 0, MO_64);
--            if (is_q) {
--                write_vec_element(s, tcg_tmp, rt, 1, MO_64);
--            }
-+            tcg_gen_gvec_dup_i64(scale, vec_full_reg_offset(s, rt),
-+                                 (is_q + 1) * 8, vec_full_reg_size(s),
-+                                 tcg_tmp);
-             tcg_temp_free_i64(tcg_tmp);
--            clear_vec_high(s, is_q, rt);
-         } else {
-             /* Load/store one element per register */
-             if (is_load) {
---
-.19.1
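
The mulconst table removed by the LD[1-4]R patch relies on a multiplication trick that is worth spelling out. This is a standalone sketch, not the TCG code; replicate_element is a hypothetical name, and a multiplier of 1 is used for the 64-bit case where the original code skipped the multiply.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Replicate one zero-extended element of size (1 << scale) bytes across
 * a 64-bit value by multiplying with a constant that has a 1 at the
 * start of every element slot.
 */
static uint64_t replicate_element(uint64_t value, int scale)
{
    static const uint64_t mulconst[4] = {
        0x0101010101010101ULL,  /* 8-bit elements */
        0x0001000100010001ULL,  /* 16-bit elements */
        0x0000000100000001ULL,  /* 32-bit elements */
        1ULL,                   /* 64-bit: nothing to replicate */
    };
    uint64_t mask;

    assert(scale >= 0 && scale <= 3);

    /* Keep only the low element, as a zero-extending load would. */
    mask = (scale == 3) ? UINT64_MAX : ((1ULL << (8 << scale)) - 1);
    return (value & mask) * mulconst[scale];
}
```

No carries occur between slots because the element is narrower than the spacing of the 1 bits in the multiplier, so each partial product lands in its own slot.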

-[Qemu-devel] [PULL 26/45] target/arm: Mark some arrays const
+Deleted patch
-From: Richard Henderson <richard.henderson@linaro.org>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
-Message-id: 20181011205206.3552-6-richard.henderson@linaro.org
-[PMM: drop change to now-deleted cpu_mode_names array]
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
----
- target/arm/translate.c | 4 ++--
-file changed, 2 insertions(+), 2 deletions(-)
-diff --git a/target/arm/translate.c b/target/arm/translate.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.c
-+++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static TCGv_i64 cpu_F0d, cpu_F1d;
- #include "exec/gen-icount.h"
--static const char *regnames[] =
-+static const char * const regnames[] =
-     { "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
-       "r8", "r9", "r10", "r11", "r12", "r13", "r14", "pc" };
-@@ -XXX,XX +XXX,XX @@ static struct {
-     int nregs;
-     int interleave;
-     int spacing;
--} neon_ls_element_type[11] = {
-+} const neon_ls_element_type[11] = {
-     {4, 4, 1},
-     {4, 4, 2},
-     {4, 1, 1},
---
-.19.1

-[Qemu-devel] [PULL 28/45] target/arm: Use gvec for NEON VMOV, VMVN, VBIC & VORR (immediate)
+Deleted patch
-From: Richard Henderson <richard.henderson@linaro.org>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20181011205206.3552-8-richard.henderson@linaro.org
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
----
- target/arm/translate.c | 67 ++++++++++++++++++++++++------------------
-file changed, 39 insertions(+), 28 deletions(-)
-diff --git a/target/arm/translate.c b/target/arm/translate.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.c
-+++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-                 return 1;
-             }
-         } else { /* (insn & 0x00380080) == 0 */
--            int invert;
-+            int invert, reg_ofs, vec_size;
-+
-             if (q && (rd & 1)) {
-                 return 1;
-             }
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-                 break;
-             case 14:
-                 imm |= (imm << 8) | (imm << 16) | (imm << 24);
--                if (invert)
-+                if (invert) {
-                     imm = ~imm;
-+                }
-                 break;
-             case 15:
-                 if (invert) {
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-                       | ((imm & 0x40) ? (0x1f << 25) : (1 << 30));
-                 break;
-             }
--            if (invert)
-+            if (invert) {
-                 imm = ~imm;
-+            }
--            for (pass = 0; pass < (q ? 4 : 2); pass++) {
--                if (op & 1 && op < 12) {
--                    tmp = neon_load_reg(rd, pass);
--                    if (invert) {
--                        /* The immediate value has already been inverted, so
--                           BIC becomes AND.  */
--                        tcg_gen_andi_i32(tmp, tmp, imm);
--                    } else {
--                        tcg_gen_ori_i32(tmp, tmp, imm);
--                    }
-+            reg_ofs = neon_reg_offset(rd, 0);
-+            vec_size = q ? 16 : 8;
-+
-+            if (op & 1 && op < 12) {
-+                if (invert) {
-+                    /* The immediate value has already been inverted,
-+                     * so BIC becomes AND.
-+                     */
-+                    tcg_gen_gvec_andi(MO_32, reg_ofs, reg_ofs, imm,
-+                                      vec_size, vec_size);
-                 } else {
--                    /* VMOV, VMVN.  */
--                    tmp = tcg_temp_new_i32();
--                    if (op == 14 && invert) {
--                        int n;
--                        uint32_t val;
--                        val = 0;
--                        for (n = 0; n < 4; n++) {
--                            if (imm & (1 << (n + (pass & 1) * 4)))
--                                val |= 0xff << (n * 8);
--                        }
--                        tcg_gen_movi_i32(tmp, val);
--                    } else {
--                        tcg_gen_movi_i32(tmp, imm);
--                    }
-+                    tcg_gen_gvec_ori(MO_32, reg_ofs, reg_ofs, imm,
-+                                     vec_size, vec_size);
-+                }
-+            } else {
-+                /* VMOV, VMVN.  */
-+                if (op == 14 && invert) {
-+                    TCGv_i64 t64 = tcg_temp_new_i64();
-+
-+                    for (pass = 0; pass <= q; ++pass) {
-+                        uint64_t val = 0;
-+                        int n;
-+
-+                        for (n = 0; n < 8; n++) {
-+                            if (imm & (1 << (n + pass * 8))) {
-+                                val |= 0xffull << (n * 8);
-+                            }
-+                        }
-+                        tcg_gen_movi_i64(t64, val);
-+                        neon_store_reg64(t64, rd + pass);
-+                    }
-+                    tcg_temp_free_i64(t64);
-+                } else {
-+                    tcg_gen_gvec_dup32i(reg_ofs, vec_size, vec_size, imm);
-                 }
--                neon_store_reg(rd, pass, tmp);
-             }
-         }
-     } else { /* (insn & 0x00800010 == 0x00800000) */
---
-.19.1
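
The op == 14 with invert case in the deleted VMOV patch expands each bit of the immediate into a full byte. A minimal standalone sketch of the inner loop follows; expand_byte_mask is a hypothetical name, and the function models one 64-bit pass rather than the TCG code that stores the result.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Build the 64-bit value for one pass: bit (n + pass * 8) of imm
 * decides whether byte n of the result is 0xff or 0x00.
 */
static uint64_t expand_byte_mask(uint32_t imm, int pass)
{
    uint64_t val = 0;
    int n;

    assert(pass == 0 || pass == 1);

    for (n = 0; n < 8; n++) {
        if (imm & (1u << (n + pass * 8))) {
            val |= 0xffULL << (n * 8);
        }
    }
    return val;
}
```

For example, an immediate of 0xA5 on pass 0 has bits 0, 2, 5 and 7 set, so bytes 0, 2, 5 and 7 of the result become 0xff.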

-[Qemu-devel] [PULL 30/45] target/arm: Use gvec for NEON_3R_VADD_VSUB insns
+Deleted patch
-From: Richard Henderson <richard.henderson@linaro.org>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20181011205206.3552-10-richard.henderson@linaro.org
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
----
- target/arm/translate.c | 29 ++++++++++-------------------
-file changed, 10 insertions(+), 19 deletions(-)
-diff --git a/target/arm/translate.c b/target/arm/translate.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.c
-+++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-                 break;
-             }
-             return 0;
-+
-+        case NEON_3R_VADD_VSUB:
-+            if (u) {
-+                tcg_gen_gvec_sub(size, rd_ofs, rn_ofs, rm_ofs,
-+                                 vec_size, vec_size);
-+            } else {
-+                tcg_gen_gvec_add(size, rd_ofs, rn_ofs, rm_ofs,
-+                                 vec_size, vec_size);
-+            }
-+            return 0;
-         }
-         if (size == 3) {
-             /* 64-bit element instructions. */
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-                                                   cpu_V1, cpu_V0);
-                     }
-                     break;
--                case NEON_3R_VADD_VSUB:
--                    if (u) {
--                        tcg_gen_sub_i64(CPU_V001);
--                    } else {
--                        tcg_gen_add_i64(CPU_V001);
--                    }
--                    break;
-                 default:
-                     abort();
-                 }
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-             tmp2 = neon_load_reg(rd, pass);
-             gen_neon_add(size, tmp, tmp2);
-             break;
--        case NEON_3R_VADD_VSUB:
--            if (!u) { /* VADD */
--                gen_neon_add(size, tmp, tmp2);
--            } else { /* VSUB */
--                switch (size) {
--                case 0: gen_helper_neon_sub_u8(tmp, tmp, tmp2); break;
--                case 1: gen_helper_neon_sub_u16(tmp, tmp, tmp2); break;
--                case 2: tcg_gen_sub_i32(tmp, tmp, tmp2); break;
--                default: abort();
--                }
--            }
--            break;
-         case NEON_3R_VTST_VCEQ:
-             if (!u) { /* VTST */
-                 switch (size) {
---
-.19.1

-[Qemu-devel] [PULL 31/45] target/arm: Use gvec for NEON_2RM_VMN, NEON_2RM_VNEG
+Deleted patch
-From: Richard Henderson <richard.henderson@linaro.org>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20181011205206.3552-11-richard.henderson@linaro.org
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
----
- target/arm/translate.c | 16 ++++++++--------
-file changed, 8 insertions(+), 8 deletions(-)
-diff --git a/target/arm/translate.c b/target/arm/translate.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.c
-+++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-                     tcg_temp_free_ptr(ptr1);
-                     tcg_temp_free_ptr(ptr2);
-                     break;
-+
-+                case NEON_2RM_VMVN:
-+                    tcg_gen_gvec_not(0, rd_ofs, rm_ofs, vec_size, vec_size);
-+                    break;
-+                case NEON_2RM_VNEG:
-+                    tcg_gen_gvec_neg(size, rd_ofs, rm_ofs, vec_size, vec_size);
-+                    break;
-+
-                 default:
-                 elementwise:
-                     for (pass = 0; pass < (q ? 4 : 2); pass++) {
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-                         case NEON_2RM_VCNT:
-                             gen_helper_neon_cnt_u8(tmp, tmp);
-                             break;
--                        case NEON_2RM_VMVN:
--                            tcg_gen_not_i32(tmp, tmp);
--                            break;
-                         case NEON_2RM_VQABS:
-                             switch (size) {
-                             case 0:
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-                             default: abort();
-                             }
-                             break;
--                        case NEON_2RM_VNEG:
--                            tmp2 = tcg_const_i32(0);
--                            gen_neon_rsb(size, tmp, tmp2);
--                            tcg_temp_free_i32(tmp2);
--                            break;
-                         case NEON_2RM_VCGT0_F:
-                         {
-                             TCGv_ptr fpstatus = get_fpstatus_ptr(1);
---
-.19.1

-[Qemu-devel] [PULL 32/45] target/arm: Use gvec for NEON_3R_VMUL
+Deleted patch
-From: Richard Henderson <richard.henderson@linaro.org>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20181011205206.3552-12-richard.henderson@linaro.org
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
----
- target/arm/translate.c | 31 +++++++++++++++----------------
-file changed, 15 insertions(+), 16 deletions(-)
-diff --git a/target/arm/translate.c b/target/arm/translate.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.c
-+++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-                                  vec_size, vec_size);
-             }
-             return 0;
-+
-+        case NEON_3R_VMUL: /* VMUL */
-+            if (u) {
-+                /* Polynomial case allows only P8 and is handled below.  */
-+                if (size != 0) {
-+                    return 1;
-+                }
-+            } else {
-+                tcg_gen_gvec_mul(size, rd_ofs, rn_ofs, rm_ofs,
-+                                 vec_size, vec_size);
-+                return 0;
-+            }
-+            break;
-         }
-         if (size == 3) {
-             /* 64-bit element instructions. */
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-                 return 1;
-             }
-             break;
--        case NEON_3R_VMUL:
--            if (u && (size != 0)) {
--                /* UNDEF on invalid size for polynomial subcase */
--                return 1;
--            }
--            break;
-         case NEON_3R_VFM_VQRDMLSH:
-             if (!arm_dc_feature(s, ARM_FEATURE_VFP4)) {
-                 return 1;
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-             }
-             break;
-         case NEON_3R_VMUL:
--            if (u) { /* polynomial */
--                gen_helper_neon_mul_p8(tmp, tmp, tmp2);
--            } else { /* Integer */
--                switch (size) {
--                case 0: gen_helper_neon_mul_u8(tmp, tmp, tmp2); break;
--                case 1: gen_helper_neon_mul_u16(tmp, tmp, tmp2); break;
--                case 2: tcg_gen_mul_i32(tmp, tmp, tmp2); break;
--                default: abort();
--                }
--            }
-+            /* VMUL.P8; other cases already eliminated.  */
-+            gen_helper_neon_mul_p8(tmp, tmp, tmp2);
-             break;
-         case NEON_3R_VPMAX:
-             GEN_NEON_INTEGER_OP(pmax);
---
-.19.1

-[Qemu-devel] [PULL 42/45] net: cadence_gem: Announce availability of priority queues
+Deleted patch
-From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
-Announce the availability of the various priority queues.
-This fixes an issue where guest kernels would fail to
-configure secondary queues due to improper feature bits.
-Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
-Message-id: 20181017213932.19973-2-edgar.iglesias@gmail.com
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
----
- hw/net/cadence_gem.c | 8 +++++++-
- 1 file changed, 7 insertions(+), 1 deletion(-)
-diff --git a/hw/net/cadence_gem.c b/hw/net/cadence_gem.c
-index XXXXXXX..XXXXXXX 100644
---- a/hw/net/cadence_gem.c
-+++ b/hw/net/cadence_gem.c
-@@ -XXX,XX +XXX,XX @@ static void gem_reset(DeviceState *d)
-     int i;
-     CadenceGEMState *s = CADENCE_GEM(d);
-     const uint8_t *a;
-+    uint32_t queues_mask = 0;
-     DB_PRINT("\n");
-@@ -XXX,XX +XXX,XX @@ static void gem_reset(DeviceState *d)
-     s->regs[GEM_DESCONF] = 0x02500111;
-     s->regs[GEM_DESCONF2] = 0x2ab13fff;
-     s->regs[GEM_DESCONF5] = 0x002f2045;
--    s->regs[GEM_DESCONF6] = 0x00000200;
-+    s->regs[GEM_DESCONF6] = 0x0;
-+
-+    if (s->num_priority_queues > 1) {
-+        queues_mask = MAKE_64BIT_MASK(1, s->num_priority_queues - 1);
-+        s->regs[GEM_DESCONF6] |= queues_mask;
-+    }
-     /* Set MAC address */
-     a = &s->conf.macaddr.a[0];
---
-2.19.1

-[Qemu-devel] [PULL 43/45] net: cadence_gem: Announce 64bit addressing support
+Deleted patch
-From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
-Announce 64bit addressing support.
-Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
-Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
-Message-id: 20181017213932.19973-3-edgar.iglesias@gmail.com
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
----
- hw/net/cadence_gem.c | 3 ++-
- 1 file changed, 2 insertions(+), 1 deletion(-)
-diff --git a/hw/net/cadence_gem.c b/hw/net/cadence_gem.c
-index XXXXXXX..XXXXXXX 100644
---- a/hw/net/cadence_gem.c
-+++ b/hw/net/cadence_gem.c
-@@ -XXX,XX +XXX,XX @@
- #define GEM_DESCONF4      (0x0000028C/4)
- #define GEM_DESCONF5      (0x00000290/4)
- #define GEM_DESCONF6      (0x00000294/4)
-+#define GEM_DESCONF6_64B_MASK (1U << 23)
- #define GEM_DESCONF7      (0x00000298/4)
- #define GEM_INT_Q1_STATUS               (0x00000400 / 4)
-@@ -XXX,XX +XXX,XX @@ static void gem_reset(DeviceState *d)
-     s->regs[GEM_DESCONF] = 0x02500111;
-     s->regs[GEM_DESCONF2] = 0x2ab13fff;
-     s->regs[GEM_DESCONF5] = 0x002f2045;
--    s->regs[GEM_DESCONF6] = 0x0;
-+    s->regs[GEM_DESCONF6] = GEM_DESCONF6_64B_MASK;
-     if (s->num_priority_queues > 1) {
-         queues_mask = MAKE_64BIT_MASK(1, s->num_priority_queues - 1);
---
-2.19.1
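
For reference, the two cadence_gem patches above together determine the reset value of GEM_DESCONF6: bit 23 announces 64-bit addressing, and bits 1..(n-1) announce the extra priority queues. The following is a minimal sketch of that arithmetic, not the device code. make_64bit_mask() is assumed to mirror QEMU's MAKE_64BIT_MASK(shift, length), and gem_desconf6_reset() is a hypothetical helper named only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed equivalent of MAKE_64BIT_MASK(shift, length):
 * `length` consecutive set bits, the lowest at bit `shift`. */
static uint64_t make_64bit_mask(unsigned shift, unsigned length)
{
    assert(length >= 1 && length <= 64 && shift < 64);
    return (~0ULL >> (64 - length)) << shift;
}

/* Bit 23 of DESCONF6, as defined by the second patch. */
#define DESCONF6_64B_MASK (1U << 23)

/* Hypothetical helper: DESCONF6 reset value after both patches. */
static uint32_t gem_desconf6_reset(unsigned num_priority_queues)
{
    uint32_t val = DESCONF6_64B_MASK;

    /* Queue 0 always exists; only queues 1..n-1 are announced. */
    if (num_priority_queues > 1) {
        val |= (uint32_t)make_64bit_mask(1, num_priority_queues - 1);
    }
    return val;
}
```

With a single queue only the 64-bit addressing bit is set; each additional queue sets one more bit starting at bit 1.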

As promised, another pullreq... This one's mostly RTH's patches.

thanks
-- PMM

The following changes since commit 784c2e4f232adf5ef47a84a262ec72a07d068d6a:

Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging (2018-10-19 15:30:40 +0100)

are available in the Git repository at:

https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20181019

for you to fetch changes up to 88c9add25e7120e8622796c81ad3f3fb7f8d40e7:

target/arm: Only flush tlb if ASID changes (2018-10-19 17:38:48 +0100)

----------------------------------------------------------------
target-arm queue:
 * ssi-sd: Make devices picking up backends unavailable with -device
 * Add support for VCPU event states
 * Move towards making ID registers the source of truth for
   whether a guest CPU implements a feature, rather than having
   parallel ID registers and feature bit flags
 * Implement various HCR hypervisor trap/config bits
 * Get IL bit correct for v7 syndrome values
 * Report correct syndrome for FP/SIMD traps to Hyp mode
 * hw/arm/boot: Increase compliance with kernel arm64 boot protocol
 * Refactor A32 Neon to use generic vector infrastructure
 * Fix a bug in A32 VLD2 "(multiple 2-element structures)" insn
 * net: cadence_gem: Report features correctly in ID register
 * Avoid some unnecessary TLB flushes on TTBR register writes

----------------------------------------------------------------
Dongjiu Geng (1):
      target/arm: Add support for VCPU event states

Edgar E. Iglesias (2):
      net: cadence_gem: Announce availability of priority queues
      net: cadence_gem: Announce 64bit addressing support

Markus Armbruster (1):
      ssi-sd: Make devices picking up backends unavailable with -device

Peter Maydell (10):
      target/arm: Improve debug logging of AArch32 exception return
      target/arm: Make switch_mode() file-local
      target/arm: Implement HCR.FB
      target/arm: Implement HCR.DC
      target/arm: ISR_EL1 bits track virtual interrupts if IMO/FMO set
      target/arm: Implement HCR.VI and VF
      target/arm: Implement HCR.PTW
      target/arm: New utility function to extract EC from syndrome
      target/arm: Get IL bit correct for v7 syndrome values
      target/arm: Report correct syndrome for FP/SIMD traps to Hyp mode

Richard Henderson (30):
      target/arm: Move some system registers into a substructure
      target/arm: V8M should not imply V7VE
      target/arm: Convert v8 extensions from feature bits to isar tests
      target/arm: Convert division from feature bits to isar0 tests
      target/arm: Convert jazelle from feature bit to isar1 test
      target/arm: Convert t32ee from feature bit to isar3 test
      target/arm: Convert sve from feature bit to aa64pfr0 test
      target/arm: Convert v8.2-fp16 from feature bit to aa64pfr0 test
      target/arm: Hoist address increment for vector memory ops
      target/arm: Don't call tcg_clear_temp_count
      target/arm: Use tcg_gen_gvec_dup_i64 for LD[1-4]R
      target/arm: Promote consecutive memory ops for aa64
      target/arm: Mark some arrays const
      target/arm: Use gvec for NEON VDUP
      target/arm: Use gvec for NEON VMOV, VMVN, VBIC & VORR (immediate)
      target/arm: Use gvec for NEON_3R_LOGIC insns
      target/arm: Use gvec for NEON_3R_VADD_VSUB insns
      target/arm: Use gvec for NEON_2RM_VMN, NEON_2RM_VNEG
      target/arm: Use gvec for NEON_3R_VMUL
      target/arm: Use gvec for VSHR, VSHL
      target/arm: Use gvec for VSRA
      target/arm: Use gvec for VSRI, VSLI
      target/arm: Use gvec for NEON_3R_VML
      target/arm: Use gvec for NEON_3R_VTST_VCEQ, NEON_3R_VCGT, NEON_3R_VCGE
      target/arm: Use gvec for NEON VLD all lanes
      target/arm: Reorg NEON VLD/VST all elements
      target/arm: Promote consecutive memory ops for aa32
      target/arm: Reorg NEON VLD/VST single element to one lane
      target/arm: Remove writefn from TTBR0_EL3
      target/arm: Only flush tlb if ASID changes

Stewart Hildebrand (1):
      hw/arm/boot: Increase compliance with kernel arm64 boot protocol

From: Markus Armbruster <armbru@redhat.com>

Device models aren't supposed to go on fishing expeditions for
backends.  They should expose suitable properties for the user to set.
For onboard devices, board code sets them.

Device ssi-sd picks up its block backend in its init() method with
drive_get_next() instead.  This mistake is already marked FIXME since
commit af9e40a.

Unset user_creatable to remove the mistake from our external
interface.  Since the SSI bus doesn't support hotplug, only -device
can be affected.  Only certain ARM machines have ssi-sd and provide an
SSI bus for it; this patch breaks -device ssi-sd for these machines.
No actual use of -device ssi-sd is known.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: Thomas Huth <thuth@redhat.com>
Message-id: 20181009060835.4608-1-armbru@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
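
Illustration only, not part of the patch: a rough sketch of what clearing user_creatable means for "-device". The types and names below (MockDeviceClass, mock_device_add_allowed, "some-nic") are hypothetical stand-ins and are not QEMU's DeviceClass or its lookup code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a device class; not QEMU's DeviceClass. */
typedef struct {
    const char *name;
    bool user_creatable;
} MockDeviceClass;

static const MockDeviceClass mock_classes[] = {
    { "ssi-sd",   false }, /* init() picks up its backend by itself */
    { "some-nic", true  }, /* hypothetical device configured via properties */
};

/* Returns true if a "-device <name>" request would be accepted
 * in this simplified model. */
static bool mock_device_add_allowed(const char *name)
{
    size_t n = sizeof(mock_classes) / sizeof(mock_classes[0]);

    for (size_t i = 0; i < n; i++) {
        if (strcmp(mock_classes[i].name, name) == 0) {
            return mock_classes[i].user_creatable;
        }
    }
    return false; /* unknown device type */
}
```

Board code can still instantiate such a device directly; only the user-facing creation path is closed in this model.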
 hw/sd/ssi-sd.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/sd/ssi-sd.c b/hw/sd/ssi-sd.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/sd/ssi-sd.c
+++ b/hw/sd/ssi-sd.c
@@ -XXX,XX +XXX,XX @@ static void ssi_sd_class_init(ObjectClass *klass, void *data)
     k->cs_polarity = SSI_CS_LOW;
     dc->vmsd = &vmstate_ssi_sd;
     dc->reset = ssi_sd_reset;
+    /* Reason: init() method uses drive_get_next() */
+    dc->user_creatable = false;
 }
 
 static const TypeInfo ssi_sd_info = {
-- 
2.19.1

From: Dongjiu Geng <gengdongjiu@huawei.com>

This patch extends the qemu-kvm state sync logic with support for
KVM_GET/SET_VCPU_EVENTS, giving access to the previously missing SError
exception state. It also allows that exception state to be migrated.

The SError exception state consists of the SError pending state and the
ESR value. kvm_put/get_vcpu_events() is called when the system registers
are set or read. During migration, if the source machine has an SError
pending, QEMU migrates it regardless of whether the target machine
supports specifying the guest ESR value: a target that lacks that support
can still inject the SError with a zero ESR value.

Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1538067351-23931-3-git-send-email-gengdongjiu@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
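
Illustration only, not part of the patch: a minimal sketch of the fallback described in the commit message, using mock structures rather than the real struct kvm_vcpu_events. All Mock* and mock_* names are hypothetical. The pending flag is always forwarded, while the syndrome is forwarded only when the host advertises the ESR-injection capability. The migration subsection is modelled as being sent only while an SError is pending.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of the env->serror fields added by the patch. */
typedef struct {
    uint8_t pending;
    uint8_t has_esr;
    uint64_t esr;
} MockSError;

/* Hypothetical mirror of the exception fields passed to KVM. */
typedef struct {
    uint8_t serror_pending;
    uint8_t serror_has_esr;
    uint64_t serror_esr;
} MockEvents;

/* Build the event state to hand to the kernel. Without the capability
 * the ESR fields stay zero, so the SError is injected with no syndrome. */
static MockEvents mock_build_events(const MockSError *s, bool cap_inject_esr)
{
    MockEvents ev;

    memset(&ev, 0, sizeof(ev));
    ev.serror_pending = s->pending;
    if (cap_inject_esr) {
        ev.serror_has_esr = s->has_esr;
        ev.serror_esr = s->esr;
    }
    return ev;
}

/* Same test as serror_needed(): migrate the subsection only if pending. */
static bool mock_serror_needed(const MockSError *s)
{
    return s->pending != 0;
}
```

Because the subsection is optional, a migration stream from a guest with no pending SError looks the same as it did before the patch.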
 target/arm/cpu.h     |  7 ++++++
 target/arm/kvm_arm.h | 24 ++++++++++++++++++
 target/arm/kvm.c     | 60 ++++++++++++++++++++++++++++++++++++++++++++
 target/arm/kvm32.c   | 13 ++++++++++
 target/arm/kvm64.c   | 13 ++++++++++
 target/arm/machine.c | 22 ++++++++++++++++
 6 files changed, 139 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ typedef struct CPUARMState {
          */
     } exception;
 
+    /* Information associated with an SError */
+    struct {
+        uint8_t pending;
+        uint8_t has_esr;
+        uint64_t esr;
+    } serror;
+
     /* Thumb-2 EE state.  */
     uint32_t teecr;
     uint32_t teehbr;
diff --git a/target/arm/kvm_arm.h b/target/arm/kvm_arm.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/kvm_arm.h
+++ b/target/arm/kvm_arm.h
@@ -XXX,XX +XXX,XX @@ bool write_kvmstate_to_list(ARMCPU *cpu);
  */
 void kvm_arm_reset_vcpu(ARMCPU *cpu);
 
+/**
+ * kvm_arm_init_serror_injection:
+ * @cs: CPUState
+ *
+ * Check whether KVM can set guest SError syndrome.
+ */
+void kvm_arm_init_serror_injection(CPUState *cs);
+
+/**
+ * kvm_get_vcpu_events:
+ * @cpu: ARMCPU
+ *
+ * Get VCPU related state from kvm.
+ */
+int kvm_get_vcpu_events(ARMCPU *cpu);
+
+/**
+ * kvm_put_vcpu_events:
+ * @cpu: ARMCPU
+ *
+ * Put VCPU related state to kvm.
+ */
+int kvm_put_vcpu_events(ARMCPU *cpu);
+
 #ifdef CONFIG_KVM
 /**
  * kvm_arm_create_scratch_host_vcpu:
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -XXX,XX +XXX,XX @@ const KVMCapabilityInfo kvm_arch_required_capabilities[] = {
 };
 
 static bool cap_has_mp_state;
+static bool cap_has_inject_serror_esr;
 
 static ARMHostCPUFeatures arm_host_cpu_features;
 
@@ -XXX,XX +XXX,XX @@ int kvm_arm_vcpu_init(CPUState *cs)
     return kvm_vcpu_ioctl(cs, KVM_ARM_VCPU_INIT, &init);
 }
 
+void kvm_arm_init_serror_injection(CPUState *cs)
+{
+    cap_has_inject_serror_esr = kvm_check_extension(cs->kvm_state,
+                                    KVM_CAP_ARM_INJECT_SERROR_ESR);
+}
+
 bool kvm_arm_create_scratch_host_vcpu(const uint32_t *cpus_to_try,
                                       int *fdarray,
                                       struct kvm_vcpu_init *init)
@@ -XXX,XX +XXX,XX @@ int kvm_arm_sync_mpstate_to_qemu(ARMCPU *cpu)
     return 0;
 }
 
+int kvm_put_vcpu_events(ARMCPU *cpu)
+{
+    CPUARMState *env = &cpu->env;
+    struct kvm_vcpu_events events;
+    int ret;
+
+    if (!kvm_has_vcpu_events()) {
+        return 0;
+    }
+
+    memset(&events, 0, sizeof(events));
+    events.exception.serror_pending = env->serror.pending;
+
+    /* Inject SError to guest with specified syndrome if host kernel
+     * supports it, otherwise inject SError without syndrome.
+     */
+    if (cap_has_inject_serror_esr) {
+        events.exception.serror_has_esr = env->serror.has_esr;
+        events.exception.serror_esr = env->serror.esr;
+    }
+
+    ret = kvm_vcpu_ioctl(CPU(cpu), KVM_SET_VCPU_EVENTS, &events);
+    if (ret) {
+        error_report("failed to put vcpu events");
+    }
+
+    return ret;
+}
+
+int kvm_get_vcpu_events(ARMCPU *cpu)
+{
+    CPUARMState *env = &cpu->env;
+    struct kvm_vcpu_events events;
+    int ret;
+
+    if (!kvm_has_vcpu_events()) {
+        return 0;
+    }
+
+    memset(&events, 0, sizeof(events));
+    ret = kvm_vcpu_ioctl(CPU(cpu), KVM_GET_VCPU_EVENTS, &events);
+    if (ret) {
+        error_report("failed to get vcpu events");
+        return ret;
+    }
+
+    env->serror.pending = events.exception.serror_pending;
+    env->serror.has_esr = events.exception.serror_has_esr;
+    env->serror.esr = events.exception.serror_esr;
+
+    return 0;
+}
+
 void kvm_arch_pre_run(CPUState *cs, struct kvm_run *run)
 {
 }
diff --git a/target/arm/kvm32.c b/target/arm/kvm32.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/kvm32.c
+++ b/target/arm/kvm32.c
@@ -XXX,XX +XXX,XX @@ int kvm_arch_init_vcpu(CPUState *cs)
     }
     cpu->mp_affinity = mpidr & ARM32_AFFINITY_MASK;
 
+    /* Check whether userspace can specify guest syndrome value */
+    kvm_arm_init_serror_injection(cs);
+
     return kvm_arm_init_cpreg_list(cpu);
 }
 
@@ -XXX,XX +XXX,XX @@ int kvm_arch_put_registers(CPUState *cs, int level)
         return ret;
     }
 
+    ret = kvm_put_vcpu_events(cpu);
+    if (ret) {
+        return ret;
+    }
+
     /* Note that we do not call write_cpustate_to_list()
      * here, so we are only writing the tuple list back to
      * KVM. This is safe because nothing can change the
@@ -XXX,XX +XXX,XX @@ int kvm_arch_get_registers(CPUState *cs)
     }
     vfp_set_fpscr(env, fpscr);
 
+    ret = kvm_get_vcpu_events(cpu);
+    if (ret) {
+        return ret;
+    }
+
     if (!write_kvmstate_to_list(cpu)) {
         return EINVAL;
     }
diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/kvm64.c
+++ b/target/arm/kvm64.c
@@ -XXX,XX +XXX,XX @@ int kvm_arch_init_vcpu(CPUState *cs)
 
     kvm_arm_init_debug(cs);
 
+    /* Check whether user space can specify guest syndrome value */
+    kvm_arm_init_serror_injection(cs);
+
     return kvm_arm_init_cpreg_list(cpu);
 }
 
@@ -XXX,XX +XXX,XX @@ int kvm_arch_put_registers(CPUState *cs, int level)
         return ret;
     }
 
+    ret = kvm_put_vcpu_events(cpu);
+    if (ret) {
+        return ret;
+    }
+
     if (!write_list_to_kvmstate(cpu, level)) {
         return EINVAL;
     }
@@ -XXX,XX +XXX,XX @@ int kvm_arch_get_registers(CPUState *cs)
     }
     vfp_set_fpcr(env, fpr);
 
+    ret = kvm_get_vcpu_events(cpu);
+    if (ret) {
+        return ret;
+    }
+
     if (!write_kvmstate_to_list(cpu)) {
         return EINVAL;
     }
diff --git a/target/arm/machine.c b/target/arm/machine.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/machine.c
+++ b/target/arm/machine.c
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_sve = {
 };
 #endif /* AARCH64 */
 
+static bool serror_needed(void *opaque)
+{
+    ARMCPU *cpu = opaque;
+    CPUARMState *env = &cpu->env;
+
+    return env->serror.pending != 0;
+}
+
+static const VMStateDescription vmstate_serror = {
+    .name = "cpu/serror",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .needed = serror_needed,
+    .fields = (VMStateField[]) {
+        VMSTATE_UINT8(env.serror.pending, ARMCPU),
+        VMSTATE_UINT8(env.serror.has_esr, ARMCPU),
+        VMSTATE_UINT64(env.serror.esr, ARMCPU),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
 static bool m_needed(void *opaque)
 {
     ARMCPU *cpu = opaque;
@@ -XXX,XX +XXX,XX @@ const VMStateDescription vmstate_arm_cpu = {
 #ifdef TARGET_AARCH64
         &vmstate_sve,
 #endif
+        &vmstate_serror,
         NULL
     }
 };
-- 
2.19.1

From: Richard Henderson <richard.henderson@linaro.org>

Create struct ARMISARegisters, to be accessed during translation.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181016223115.24100-2-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
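
Illustration only, not part of the patch: this patch just moves the fields, but the point of the substructure is that feature tests can read ID register fields directly. The sketch below shows roughly what such a test could look like for the divide instructions, assuming the architectural layout of ID_ISAR0 (Divide field in bits [27:24]). MockISARegisters, extract_field32() and the mock_feature_* helpers are hypothetical names and are not the helpers used later in the series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cut-down version of the new substructure. */
typedef struct {
    uint32_t id_isar0;
} MockISARegisters;

/* Extract `length` bits (1..32) of `value`, starting at bit `start`. */
static uint32_t extract_field32(uint32_t value, unsigned start, unsigned length)
{
    return (value >> start) & (~0U >> (32 - length));
}

/* ID_ISAR0.Divide, bits [27:24]:
 * 0 = no divide, 1 = SDIV/UDIV in Thumb only, 2 = also in ARM state. */
static bool mock_feature_thumb_div(const MockISARegisters *id)
{
    return extract_field32(id->id_isar0, 24, 4) != 0;
}

static bool mock_feature_arm_div(const MockISARegisters *id)
{
    return extract_field32(id->id_isar0, 24, 4) > 1;
}
```

Fed with the values from this diff, the Cortex-A15 value 0x02101110 reports both forms of divide, the Cortex-M3 value 0x01141110 reports Thumb divide only, and the Cortex-A8 value 0x00101111 reports neither.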
 target/arm/cpu.h      |  32 ++++----
 hw/intc/armv7m_nvic.c |  12 +--
 target/arm/cpu.c      | 178 +++++++++++++++++++++---------------------
 target/arm/cpu64.c    |  70 ++++++++---------
 target/arm/helper.c   |  28 +++----
 5 files changed, 162 insertions(+), 158 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ struct ARMCPU {
      * ARMv7AR ARM Architecture Reference Manual. A reset_ prefix
      * is used for reset values of non-constant registers; no reset_
      * prefix means a constant register.
+     * Some of these registers are split out into a substructure that
+     * is shared with the translators to control the ISA.
      */
+    struct ARMISARegisters {
+        uint32_t id_isar0;
+        uint32_t id_isar1;
+        uint32_t id_isar2;
+        uint32_t id_isar3;
+        uint32_t id_isar4;
+        uint32_t id_isar5;
+        uint32_t id_isar6;
+        uint32_t mvfr0;
+        uint32_t mvfr1;
+        uint32_t mvfr2;
+        uint64_t id_aa64isar0;
+        uint64_t id_aa64isar1;
+        uint64_t id_aa64pfr0;
+        uint64_t id_aa64pfr1;
+    } isar;
     uint32_t midr;
     uint32_t revidr;
     uint32_t reset_fpsid;
-    uint32_t mvfr0;
-    uint32_t mvfr1;
-    uint32_t mvfr2;
     uint32_t ctr;
     uint32_t reset_sctlr;
     uint32_t id_pfr0;
@@ -XXX,XX +XXX,XX @@ struct ARMCPU {
     uint32_t id_mmfr2;
     uint32_t id_mmfr3;
     uint32_t id_mmfr4;
-    uint32_t id_isar0;
-    uint32_t id_isar1;
-    uint32_t id_isar2;
-    uint32_t id_isar3;
-    uint32_t id_isar4;
-    uint32_t id_isar5;
-    uint32_t id_isar6;
-    uint64_t id_aa64pfr0;
-    uint64_t id_aa64pfr1;
     uint64_t id_aa64dfr0;
     uint64_t id_aa64dfr1;
     uint64_t id_aa64afr0;
     uint64_t id_aa64afr1;
-    uint64_t id_aa64isar0;
-    uint64_t id_aa64isar1;
     uint64_t id_aa64mmfr0;
     uint64_t id_aa64mmfr1;
     uint32_t dbgdidr;
diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/intc/armv7m_nvic.c
+++ b/hw/intc/armv7m_nvic.c
@@ -XXX,XX +XXX,XX @@ static uint32_t nvic_readl(NVICState *s, uint32_t offset, MemTxAttrs attrs)
     case 0xd5c: /* MMFR3.  */
         return cpu->id_mmfr3;
     case 0xd60: /* ISAR0.  */
-        return cpu->id_isar0;
+        return cpu->isar.id_isar0;
     case 0xd64: /* ISAR1.  */
-        return cpu->id_isar1;
+        return cpu->isar.id_isar1;
     case 0xd68: /* ISAR2.  */
-        return cpu->id_isar2;
+        return cpu->isar.id_isar2;
     case 0xd6c: /* ISAR3.  */
-        return cpu->id_isar3;
+        return cpu->isar.id_isar3;
     case 0xd70: /* ISAR4.  */
-        return cpu->id_isar4;
+        return cpu->isar.id_isar4;
     case 0xd74: /* ISAR5.  */
-        return cpu->id_isar5;
+        return cpu->isar.id_isar5;
     case 0xd78: /* CLIDR */
         return cpu->clidr;
     case 0xd7c: /* CTR */
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_reset(CPUState *s)
     g_hash_table_foreach(cpu->cp_regs, cp_reg_check_reset, cpu);
 
     env->vfp.xregs[ARM_VFP_FPSID] = cpu->reset_fpsid;
-    env->vfp.xregs[ARM_VFP_MVFR0] = cpu->mvfr0;
-    env->vfp.xregs[ARM_VFP_MVFR1] = cpu->mvfr1;
-    env->vfp.xregs[ARM_VFP_MVFR2] = cpu->mvfr2;
+    env->vfp.xregs[ARM_VFP_MVFR0] = cpu->isar.mvfr0;
+    env->vfp.xregs[ARM_VFP_MVFR1] = cpu->isar.mvfr1;
+    env->vfp.xregs[ARM_VFP_MVFR2] = cpu->isar.mvfr2;
 
     cpu->power_state = cpu->start_powered_off ? PSCI_OFF : PSCI_ON;
     s->halted = cpu->start_powered_off;
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
          * registers as well. These are id_pfr1[7:4] and id_aa64pfr0[15:12].
          */
         cpu->id_pfr1 &= ~0xf0;
-        cpu->id_aa64pfr0 &= ~0xf000;
+        cpu->isar.id_aa64pfr0 &= ~0xf000;
     }
 
     if (!cpu->has_el2) {
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
          * registers if we don't have EL2. These are id_pfr1[15:12] and
          * id_aa64pfr0_el1[11:8].
          */
-        cpu->id_aa64pfr0 &= ~0xf00;
+        cpu->isar.id_aa64pfr0 &= ~0xf00;
         cpu->id_pfr1 &= ~0xf000;
     }
 
@@ -XXX,XX +XXX,XX @@ static void arm1136_r2_initfn(Object *obj)
     set_feature(&cpu->env, ARM_FEATURE_CACHE_BLOCK_OPS);
     cpu->midr = 0x4107b362;
     cpu->reset_fpsid = 0x410120b4;
-    cpu->mvfr0 = 0x11111111;
-    cpu->mvfr1 = 0x00000000;
+    cpu->isar.mvfr0 = 0x11111111;
+    cpu->isar.mvfr1 = 0x00000000;
     cpu->ctr = 0x1dd20d2;
     cpu->reset_sctlr = 0x00050078;
     cpu->id_pfr0 = 0x111;
@@ -XXX,XX +XXX,XX @@ static void arm1136_r2_initfn(Object *obj)
     cpu->id_mmfr0 = 0x01130003;
     cpu->id_mmfr1 = 0x10030302;
     cpu->id_mmfr2 = 0x01222110;
-    cpu->id_isar0 = 0x00140011;
-    cpu->id_isar1 = 0x12002111;
-    cpu->id_isar2 = 0x11231111;
-    cpu->id_isar3 = 0x01102131;
-    cpu->id_isar4 = 0x141;
+    cpu->isar.id_isar0 = 0x00140011;
+    cpu->isar.id_isar1 = 0x12002111;
+    cpu->isar.id_isar2 = 0x11231111;
+    cpu->isar.id_isar3 = 0x01102131;
+    cpu->isar.id_isar4 = 0x141;
     cpu->reset_auxcr = 7;
 }
 
@@ -XXX,XX +XXX,XX @@ static void arm1136_initfn(Object *obj)
     set_feature(&cpu->env, ARM_FEATURE_CACHE_BLOCK_OPS);
     cpu->midr = 0x4117b363;
     cpu->reset_fpsid = 0x410120b4;
-    cpu->mvfr0 = 0x11111111;
-    cpu->mvfr1 = 0x00000000;
+    cpu->isar.mvfr0 = 0x11111111;
+    cpu->isar.mvfr1 = 0x00000000;
     cpu->ctr = 0x1dd20d2;
     cpu->reset_sctlr = 0x00050078;
     cpu->id_pfr0 = 0x111;
@@ -XXX,XX +XXX,XX @@ static void arm1136_initfn(Object *obj)
     cpu->id_mmfr0 = 0x01130003;
     cpu->id_mmfr1 = 0x10030302;
     cpu->id_mmfr2 = 0x01222110;
-    cpu->id_isar0 = 0x00140011;
-    cpu->id_isar1 = 0x12002111;
-    cpu->id_isar2 = 0x11231111;
-    cpu->id_isar3 = 0x01102131;
-    cpu->id_isar4 = 0x141;
+    cpu->isar.id_isar0 = 0x00140011;
+    cpu->isar.id_isar1 = 0x12002111;
+    cpu->isar.id_isar2 = 0x11231111;
+    cpu->isar.id_isar3 = 0x01102131;
+    cpu->isar.id_isar4 = 0x141;
     cpu->reset_auxcr = 7;
 }
 
@@ -XXX,XX +XXX,XX @@ static void arm1176_initfn(Object *obj)
     set_feature(&cpu->env, ARM_FEATURE_EL3);
     cpu->midr = 0x410fb767;
     cpu->reset_fpsid = 0x410120b5;
-    cpu->mvfr0 = 0x11111111;
-    cpu->mvfr1 = 0x00000000;
+    cpu->isar.mvfr0 = 0x11111111;
+    cpu->isar.mvfr1 = 0x00000000;
     cpu->ctr = 0x1dd20d2;
     cpu->reset_sctlr = 0x00050078;
     cpu->id_pfr0 = 0x111;
@@ -XXX,XX +XXX,XX @@ static void arm1176_initfn(Object *obj)
     cpu->id_mmfr0 = 0x01130003;
     cpu->id_mmfr1 = 0x10030302;
     cpu->id_mmfr2 = 0x01222100;
-    cpu->id_isar0 = 0x0140011;
-    cpu->id_isar1 = 0x12002111;
-    cpu->id_isar2 = 0x11231121;
-    cpu->id_isar3 = 0x01102131;
-    cpu->id_isar4 = 0x01141;
+    cpu->isar.id_isar0 = 0x0140011;
+    cpu->isar.id_isar1 = 0x12002111;
+    cpu->isar.id_isar2 = 0x11231121;
+    cpu->isar.id_isar3 = 0x01102131;
+    cpu->isar.id_isar4 = 0x01141;
     cpu->reset_auxcr = 7;
 }
 
@@ -XXX,XX +XXX,XX @@ static void arm11mpcore_initfn(Object *obj)
     set_feature(&cpu->env, ARM_FEATURE_DUMMY_C15_REGS);
     cpu->midr = 0x410fb022;
     cpu->reset_fpsid = 0x410120b4;
-    cpu->mvfr0 = 0x11111111;
-    cpu->mvfr1 = 0x00000000;
+    cpu->isar.mvfr0 = 0x11111111;
+    cpu->isar.mvfr1 = 0x00000000;
     cpu->ctr = 0x1d192992; /* 32K icache 32K dcache */
     cpu->id_pfr0 = 0x111;
     cpu->id_pfr1 = 0x1;
@@ -XXX,XX +XXX,XX @@ static void arm11mpcore_initfn(Object *obj)
     cpu->id_mmfr0 = 0x01100103;
     cpu->id_mmfr1 = 0x10020302;
     cpu->id_mmfr2 = 0x01222000;
-    cpu->id_isar0 = 0x00100011;
-    cpu->id_isar1 = 0x12002111;
-    cpu->id_isar2 = 0x11221011;
-    cpu->id_isar3 = 0x01102131;
-    cpu->id_isar4 = 0x141;
+    cpu->isar.id_isar0 = 0x00100011;
+    cpu->isar.id_isar1 = 0x12002111;
+    cpu->isar.id_isar2 = 0x11221011;
+    cpu->isar.id_isar3 = 0x01102131;
+    cpu->isar.id_isar4 = 0x141;
     cpu->reset_auxcr = 1;
 }
 
@@ -XXX,XX +XXX,XX @@ static void cortex_m3_initfn(Object *obj)
     cpu->id_mmfr1 = 0x00000000;
     cpu->id_mmfr2 = 0x00000000;
     cpu->id_mmfr3 = 0x00000000;
-    cpu->id_isar0 = 0x01141110;
-    cpu->id_isar1 = 0x02111000;
-    cpu->id_isar2 = 0x21112231;
-    cpu->id_isar3 = 0x01111110;
-    cpu->id_isar4 = 0x01310102;
-    cpu->id_isar5 = 0x00000000;
-    cpu->id_isar6 = 0x00000000;
+    cpu->isar.id_isar0 = 0x01141110;
+    cpu->isar.id_isar1 = 0x02111000;
+    cpu->isar.id_isar2 = 0x21112231;
+    cpu->isar.id_isar3 = 0x01111110;
+    cpu->isar.id_isar4 = 0x01310102;
+    cpu->isar.id_isar5 = 0x00000000;
+    cpu->isar.id_isar6 = 0x00000000;
 }
 
 static void cortex_m4_initfn(Object *obj)
@@ -XXX,XX +XXX,XX @@ static void cortex_m4_initfn(Object *obj)
     cpu->id_mmfr1 = 0x00000000;
     cpu->id_mmfr2 = 0x00000000;
     cpu->id_mmfr3 = 0x00000000;
-    cpu->id_isar0 = 0x01141110;
-    cpu->id_isar1 = 0x02111000;
-    cpu->id_isar2 = 0x21112231;
-    cpu->id_isar3 = 0x01111110;
-    cpu->id_isar4 = 0x01310102;
-    cpu->id_isar5 = 0x00000000;
-    cpu->id_isar6 = 0x00000000;
+    cpu->isar.id_isar0 = 0x01141110;
+    cpu->isar.id_isar1 = 0x02111000;
+    cpu->isar.id_isar2 = 0x21112231;
+    cpu->isar.id_isar3 = 0x01111110;
+    cpu->isar.id_isar4 = 0x01310102;
+    cpu->isar.id_isar5 = 0x00000000;
+    cpu->isar.id_isar6 = 0x00000000;
 }
 
 static void cortex_m33_initfn(Object *obj)
@@ -XXX,XX +XXX,XX @@ static void cortex_m33_initfn(Object *obj)
     cpu->id_mmfr1 = 0x00000000;
     cpu->id_mmfr2 = 0x01000000;
     cpu->id_mmfr3 = 0x00000000;
-    cpu->id_isar0 = 0x01101110;
-    cpu->id_isar1 = 0x02212000;
-    cpu->id_isar2 = 0x20232232;
-    cpu->id_isar3 = 0x01111131;
-    cpu->id_isar4 = 0x01310132;
-    cpu->id_isar5 = 0x00000000;
-    cpu->id_isar6 = 0x00000000;
+    cpu->isar.id_isar0 = 0x01101110;
+    cpu->isar.id_isar1 = 0x02212000;
+    cpu->isar.id_isar2 = 0x20232232;
+    cpu->isar.id_isar3 = 0x01111131;
+    cpu->isar.id_isar4 = 0x01310132;
+    cpu->isar.id_isar5 = 0x00000000;
+    cpu->isar.id_isar6 = 0x00000000;
     cpu->clidr = 0x00000000;
     cpu->ctr = 0x8000c000;
 }
@@ -XXX,XX +XXX,XX @@ static void cortex_r5_initfn(Object *obj)
     cpu->id_mmfr1 = 0x00000000;
     cpu->id_mmfr2 = 0x01200000;
     cpu->id_mmfr3 = 0x0211;
-    cpu->id_isar0 = 0x02101111;
-    cpu->id_isar1 = 0x13112111;
-    cpu->id_isar2 = 0x21232141;
-    cpu->id_isar3 = 0x01112131;
-    cpu->id_isar4 = 0x0010142;
-    cpu->id_isar5 = 0x0;
-    cpu->id_isar6 = 0x0;
+    cpu->isar.id_isar0 = 0x02101111;
+    cpu->isar.id_isar1 = 0x13112111;
+    cpu->isar.id_isar2 = 0x21232141;
+    cpu->isar.id_isar3 = 0x01112131;
+    cpu->isar.id_isar4 = 0x0010142;
+    cpu->isar.id_isar5 = 0x0;
+    cpu->isar.id_isar6 = 0x0;
     cpu->mp_is_up = true;
     cpu->pmsav7_dregion = 16;
     define_arm_cp_regs(cpu, cortexr5_cp_reginfo);
@@ -XXX,XX +XXX,XX @@ static void cortex_a8_initfn(Object *obj)
     set_feature(&cpu->env, ARM_FEATURE_EL3);
     cpu->midr = 0x410fc080;
     cpu->reset_fpsid = 0x410330c0;
-    cpu->mvfr0 = 0x11110222;
-    cpu->mvfr1 = 0x00011111;
+    cpu->isar.mvfr0 = 0x11110222;
+    cpu->isar.mvfr1 = 0x00011111;
     cpu->ctr = 0x82048004;
     cpu->reset_sctlr = 0x00c50078;
     cpu->id_pfr0 = 0x1031;
@@ -XXX,XX +XXX,XX @@ static void cortex_a8_initfn(Object *obj)
     cpu->id_mmfr1 = 0x20000000;
     cpu->id_mmfr2 = 0x01202000;
     cpu->id_mmfr3 = 0x11;
-    cpu->id_isar0 = 0x00101111;
-    cpu->id_isar1 = 0x12112111;
-    cpu->id_isar2 = 0x21232031;
-    cpu->id_isar3 = 0x11112131;
-    cpu->id_isar4 = 0x00111142;
+    cpu->isar.id_isar0 = 0x00101111;
+    cpu->isar.id_isar1 = 0x12112111;
+    cpu->isar.id_isar2 = 0x21232031;
+    cpu->isar.id_isar3 = 0x11112131;
+    cpu->isar.id_isar4 = 0x00111142;
     cpu->dbgdidr = 0x15141000;
     cpu->clidr = (1 << 27) | (2 << 24) | 3;
     cpu->ccsidr[0] = 0xe007e01a; /* 16k L1 dcache. */
@@ -XXX,XX +XXX,XX @@ static void cortex_a9_initfn(Object *obj)
     set_feature(&cpu->env, ARM_FEATURE_CBAR);
     cpu->midr = 0x410fc090;
     cpu->reset_fpsid = 0x41033090;
-    cpu->mvfr0 = 0x11110222;
-    cpu->mvfr1 = 0x01111111;
+    cpu->isar.mvfr0 = 0x11110222;
+    cpu->isar.mvfr1 = 0x01111111;
     cpu->ctr = 0x80038003;
     cpu->reset_sctlr = 0x00c50078;
     cpu->id_pfr0 = 0x1031;
@@ -XXX,XX +XXX,XX @@ static void cortex_a9_initfn(Object *obj)
     cpu->id_mmfr1 = 0x20000000;
     cpu->id_mmfr2 = 0x01230000;
     cpu->id_mmfr3 = 0x00002111;
-    cpu->id_isar0 = 0x00101111;
-    cpu->id_isar1 = 0x13112111;
-    cpu->id_isar2 = 0x21232041;
-    cpu->id_isar3 = 0x11112131;
-    cpu->id_isar4 = 0x00111142;
+    cpu->isar.id_isar0 = 0x00101111;
+    cpu->isar.id_isar1 = 0x13112111;
+    cpu->isar.id_isar2 = 0x21232041;
+    cpu->isar.id_isar3 = 0x11112131;
+    cpu->isar.id_isar4 = 0x00111142;
     cpu->dbgdidr = 0x35141000;
     cpu->clidr = (1 << 27) | (1 << 24) | 3;
     cpu->ccsidr[0] = 0xe00fe019; /* 16k L1 dcache. */
@@ -XXX,XX +XXX,XX @@ static void cortex_a7_initfn(Object *obj)
     cpu->kvm_target = QEMU_KVM_ARM_TARGET_CORTEX_A7;
     cpu->midr = 0x410fc075;
     cpu->reset_fpsid = 0x41023075;
-    cpu->mvfr0 = 0x10110222;
-    cpu->mvfr1 = 0x11111111;
+    cpu->isar.mvfr0 = 0x10110222;
+    cpu->isar.mvfr1 = 0x11111111;
     cpu->ctr = 0x84448003;
     cpu->reset_sctlr = 0x00c50078;
     cpu->id_pfr0 = 0x00001131;
@@ -XXX,XX +XXX,XX @@ static void cortex_a7_initfn(Object *obj)
     /* a7_mpcore_r0p5_trm, page 4-4 gives 0x01101110; but
      * table 4-41 gives 0x02101110, which includes the arm div insns.
      */
-    cpu->id_isar0 = 0x02101110;
-    cpu->id_isar1 = 0x13112111;
-    cpu->id_isar2 = 0x21232041;
-    cpu->id_isar3 = 0x11112131;
-    cpu->id_isar4 = 0x10011142;
+    cpu->isar.id_isar0 = 0x02101110;
+    cpu->isar.id_isar1 = 0x13112111;
+    cpu->isar.id_isar2 = 0x21232041;
+    cpu->isar.id_isar3 = 0x11112131;
+    cpu->isar.id_isar4 = 0x10011142;
     cpu->dbgdidr = 0x3515f005;
     cpu->clidr = 0x0a200023;
     cpu->ccsidr[0] = 0x701fe00a; /* 32K L1 dcache */
@@ -XXX,XX +XXX,XX @@ static void cortex_a15_initfn(Object *obj)
     cpu->kvm_target = QEMU_KVM_ARM_TARGET_CORTEX_A15;
     cpu->midr = 0x412fc0f1;
     cpu->reset_fpsid = 0x410430f0;
-    cpu->mvfr0 = 0x10110222;
-    cpu->mvfr1 = 0x11111111;
+    cpu->isar.mvfr0 = 0x10110222;
+    cpu->isar.mvfr1 = 0x11111111;
     cpu->ctr = 0x8444c004;
     cpu->reset_sctlr = 0x00c50078;
     cpu->id_pfr0 = 0x00001131;
@@ -XXX,XX +XXX,XX @@ static void cortex_a15_initfn(Object *obj)
     cpu->id_mmfr1 = 0x20000000;
     cpu->id_mmfr2 = 0x01240000;
     cpu->id_mmfr3 = 0x02102211;
-    cpu->id_isar0 = 0x02101110;
-    cpu->id_isar1 = 0x13112111;
-    cpu->id_isar2 = 0x21232041;
-    cpu->id_isar3 = 0x11112131;
-    cpu->id_isar4 = 0x10011142;
+    cpu->isar.id_isar0 = 0x02101110;
+    cpu->isar.id_isar1 = 0x13112111;
+    cpu->isar.id_isar2 = 0x21232041;
+    cpu->isar.id_isar3 = 0x11112131;
+    cpu->isar.id_isar4 = 0x10011142;
     cpu->dbgdidr = 0x3515f021;
     cpu->clidr = 0x0a200023;
     cpu->ccsidr[0] = 0x701fe00a; /* 32K L1 dcache */
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_a57_initfn(Object *obj)
     cpu->midr = 0x411fd070;
     cpu->revidr = 0x00000000;
     cpu->reset_fpsid = 0x41034070;
-    cpu->mvfr0 = 0x10110222;
-    cpu->mvfr1 = 0x12111111;
-    cpu->mvfr2 = 0x00000043;
+    cpu->isar.mvfr0 = 0x10110222;
+    cpu->isar.mvfr1 = 0x12111111;
+    cpu->isar.mvfr2 = 0x00000043;
     cpu->ctr = 0x8444c004;
     cpu->reset_sctlr = 0x00c50838;
     cpu->id_pfr0 = 0x00000131;
@@ -XXX,XX +XXX,XX @@ static void aarch64_a57_initfn(Object *obj)
     cpu->id_mmfr1 = 0x40000000;
     cpu->id_mmfr2 = 0x01260000;
     cpu->id_mmfr3 = 0x02102211;
-    cpu->id_isar0 = 0x02101110;
-    cpu->id_isar1 = 0x13112111;
-    cpu->id_isar2 = 0x21232042;
-    cpu->id_isar3 = 0x01112131;
-    cpu->id_isar4 = 0x00011142;
-    cpu->id_isar5 = 0x00011121;
-    cpu->id_isar6 = 0;
-    cpu->id_aa64pfr0 = 0x00002222;
+    cpu->isar.id_isar0 = 0x02101110;
+    cpu->isar.id_isar1 = 0x13112111;
+    cpu->isar.id_isar2 = 0x21232042;
+    cpu->isar.id_isar3 = 0x01112131;
+    cpu->isar.id_isar4 = 0x00011142;
+    cpu->isar.id_isar5 = 0x00011121;
+    cpu->isar.id_isar6 = 0;
+    cpu->isar.id_aa64pfr0 = 0x00002222;
     cpu->id_aa64dfr0 = 0x10305106;
     cpu->pmceid0 = 0x00000000;
     cpu->pmceid1 = 0x00000000;
-    cpu->id_aa64isar0 = 0x00011120;
+    cpu->isar.id_aa64isar0 = 0x00011120;
     cpu->id_aa64mmfr0 = 0x00001124;
     cpu->dbgdidr = 0x3516d000;
     cpu->clidr = 0x0a200023;
@@ -XXX,XX +XXX,XX @@ static void aarch64_a53_initfn(Object *obj)
     cpu->midr = 0x410fd034;
     cpu->revidr = 0x00000000;
     cpu->reset_fpsid = 0x41034070;
-    cpu->mvfr0 = 0x10110222;
-    cpu->mvfr1 = 0x12111111;
-    cpu->mvfr2 = 0x00000043;
+    cpu->isar.mvfr0 = 0x10110222;
+    cpu->isar.mvfr1 = 0x12111111;
+    cpu->isar.mvfr2 = 0x00000043;
     cpu->ctr = 0x84448004; /* L1Ip = VIPT */
     cpu->reset_sctlr = 0x00c50838;
     cpu->id_pfr0 = 0x00000131;
@@ -XXX,XX +XXX,XX @@ static void aarch64_a53_initfn(Object *obj)
     cpu->id_mmfr1 = 0x40000000;
     cpu->id_mmfr2 = 0x01260000;
     cpu->id_mmfr3 = 0x02102211;
-    cpu->id_isar0 = 0x02101110;
-    cpu->id_isar1 = 0x13112111;
-    cpu->id_isar2 = 0x21232042;
-    cpu->id_isar3 = 0x01112131;
-    cpu->id_isar4 = 0x00011142;
-    cpu->id_isar5 = 0x00011121;
-    cpu->id_isar6 = 0;
-    cpu->id_aa64pfr0 = 0x00002222;
+    cpu->isar.id_isar0 = 0x02101110;
+    cpu->isar.id_isar1 = 0x13112111;
+    cpu->isar.id_isar2 = 0x21232042;
+    cpu->isar.id_isar3 = 0x01112131;
+    cpu->isar.id_isar4 = 0x00011142;
+    cpu->isar.id_isar5 = 0x00011121;
+    cpu->isar.id_isar6 = 0;
+    cpu->isar.id_aa64pfr0 = 0x00002222;
     cpu->id_aa64dfr0 = 0x10305106;
-    cpu->id_aa64isar0 = 0x00011120;
+    cpu->isar.id_aa64isar0 = 0x00011120;
     cpu->id_aa64mmfr0 = 0x00001122; /* 40 bit physical addr */
     cpu->dbgdidr = 0x3516d000;
     cpu->clidr = 0x0a200023;
@@ -XXX,XX +XXX,XX @@ static void aarch64_a72_initfn(Object *obj)
     cpu->midr = 0x410fd083;
     cpu->revidr = 0x00000000;
     cpu->reset_fpsid = 0x41034080;
-    cpu->mvfr0 = 0x10110222;
-    cpu->mvfr1 = 0x12111111;
-    cpu->mvfr2 = 0x00000043;
+    cpu->isar.mvfr0 = 0x10110222;
+    cpu->isar.mvfr1 = 0x12111111;
+    cpu->isar.mvfr2 = 0x00000043;
     cpu->ctr = 0x8444c004;
     cpu->reset_sctlr = 0x00c50838;
     cpu->id_pfr0 = 0x00000131;
@@ -XXX,XX +XXX,XX @@ static void aarch64_a72_initfn(Object *obj)
     cpu->id_mmfr1 = 0x40000000;
     cpu->id_mmfr2 = 0x01260000;
     cpu->id_mmfr3 = 0x02102211;
-    cpu->id_isar0 = 0x02101110;
-    cpu->id_isar1 = 0x13112111;
-    cpu->id_isar2 = 0x21232042;
-    cpu->id_isar3 = 0x01112131;
-    cpu->id_isar4 = 0x00011142;
-    cpu->id_isar5 = 0x00011121;
-    cpu->id_aa64pfr0 = 0x00002222;
+    cpu->isar.id_isar0 = 0x02101110;
+    cpu->isar.id_isar1 = 0x13112111;
+    cpu->isar.id_isar2 = 0x21232042;
+    cpu->isar.id_isar3 = 0x01112131;
+    cpu->isar.id_isar4 = 0x00011142;
+    cpu->isar.id_isar5 = 0x00011121;
+    cpu->isar.id_aa64pfr0 = 0x00002222;
     cpu->id_aa64dfr0 = 0x10305106;
     cpu->pmceid0 = 0x00000000;
     cpu->pmceid1 = 0x00000000;
-    cpu->id_aa64isar0 = 0x00011120;
+    cpu->isar.id_aa64isar0 = 0x00011120;
     cpu->id_aa64mmfr0 = 0x00001124;
     cpu->dbgdidr = 0x3516d000;
     cpu->clidr = 0x0a200023;
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static uint64_t id_pfr1_read(CPUARMState *env, const ARMCPRegInfo *ri)
 static uint64_t id_aa64pfr0_read(CPUARMState *env, const ARMCPRegInfo *ri)
 {
     ARMCPU *cpu = arm_env_get_cpu(env);
-    uint64_t pfr0 = cpu->id_aa64pfr0;
+    uint64_t pfr0 = cpu->isar.id_aa64pfr0;
 
     if (env->gicv3state) {
         pfr0 |= 1 << 24;
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
             { .name = "ID_ISAR0", .state = ARM_CP_STATE_BOTH,
               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 2, .opc2 = 0,
               .access = PL1_R, .type = ARM_CP_CONST,
-              .resetvalue = cpu->id_isar0 },
+              .resetvalue = cpu->isar.id_isar0 },
             { .name = "ID_ISAR1", .state = ARM_CP_STATE_BOTH,
               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 2, .opc2 = 1,
               .access = PL1_R, .type = ARM_CP_CONST,
-              .resetvalue = cpu->id_isar1 },
+              .resetvalue = cpu->isar.id_isar1 },
             { .name = "ID_ISAR2", .state = ARM_CP_STATE_BOTH,
               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 2, .opc2 = 2,
               .access = PL1_R, .type = ARM_CP_CONST,
-              .resetvalue = cpu->id_isar2 },
+              .resetvalue = cpu->isar.id_isar2 },
             { .name = "ID_ISAR3", .state = ARM_CP_STATE_BOTH,
               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 2, .opc2 = 3,
               .access = PL1_R, .type = ARM_CP_CONST,
-              .resetvalue = cpu->id_isar3 },
+              .resetvalue = cpu->isar.id_isar3 },
             { .name = "ID_ISAR4", .state = ARM_CP_STATE_BOTH,
               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 2, .opc2 = 4,
               .access = PL1_R, .type = ARM_CP_CONST,
-              .resetvalue = cpu->id_isar4 },
+              .resetvalue = cpu->isar.id_isar4 },
             { .name = "ID_ISAR5", .state = ARM_CP_STATE_BOTH,
               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 2, .opc2 = 5,
               .access = PL1_R, .type = ARM_CP_CONST,
-              .resetvalue = cpu->id_isar5 },
+              .resetvalue = cpu->isar.id_isar5 },
             { .name = "ID_MMFR4", .state = ARM_CP_STATE_BOTH,
               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 2, .opc2 = 6,
               .access = PL1_R, .type = ARM_CP_CONST,
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
             { .name = "ID_ISAR6", .state = ARM_CP_STATE_BOTH,
               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 2, .opc2 = 7,
               .access = PL1_R, .type = ARM_CP_CONST,
-              .resetvalue = cpu->id_isar6 },
+              .resetvalue = cpu->isar.id_isar6 },
             REGINFO_SENTINEL
         };
         define_arm_cp_regs(cpu, v6_idregs);
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
             { .name = "ID_AA64PFR1_EL1", .state = ARM_CP_STATE_AA64,
               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 4, .opc2 = 1,
               .access = PL1_R, .type = ARM_CP_CONST,
-              .resetvalue = cpu->id_aa64pfr1},
+              .resetvalue = cpu->isar.id_aa64pfr1},
             { .name = "ID_AA64PFR2_EL1_RESERVED", .state = ARM_CP_STATE_AA64,
               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 4, .opc2 = 2,
               .access = PL1_R, .type = ARM_CP_CONST,
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
             { .name = "ID_AA64ISAR0_EL1", .state = ARM_CP_STATE_AA64,
               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 6, .opc2 = 0,
               .access = PL1_R, .type = ARM_CP_CONST,
-              .resetvalue = cpu->id_aa64isar0 },
+              .resetvalue = cpu->isar.id_aa64isar0 },
             { .name = "ID_AA64ISAR1_EL1", .state = ARM_CP_STATE_AA64,
               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 6, .opc2 = 1,
               .access = PL1_R, .type = ARM_CP_CONST,
-              .resetvalue = cpu->id_aa64isar1 },
+              .resetvalue = cpu->isar.id_aa64isar1 },
             { .name = "ID_AA64ISAR2_EL1_RESERVED", .state = ARM_CP_STATE_AA64,
               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 6, .opc2 = 2,
               .access = PL1_R, .type = ARM_CP_CONST,
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
             { .name = "MVFR0_EL1", .state = ARM_CP_STATE_AA64,
               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 3, .opc2 = 0,
               .access = PL1_R, .type = ARM_CP_CONST,
-              .resetvalue = cpu->mvfr0 },
+              .resetvalue = cpu->isar.mvfr0 },
             { .name = "MVFR1_EL1", .state = ARM_CP_STATE_AA64,
               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 3, .opc2 = 1,
               .access = PL1_R, .type = ARM_CP_CONST,
-              .resetvalue = cpu->mvfr1 },
+              .resetvalue = cpu->isar.mvfr1 },
             { .name = "MVFR2_EL1", .state = ARM_CP_STATE_AA64,
               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 3, .opc2 = 2,
               .access = PL1_R, .type = ARM_CP_CONST,
-              .resetvalue = cpu->mvfr2 },
+              .resetvalue = cpu->isar.mvfr2 },
             { .name = "MVFR3_EL1_RESERVED", .state = ARM_CP_STATE_AA64,
               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 3, .opc2 = 3,
               .access = PL1_R, .type = ARM_CP_CONST,
-- 
2.19.1

From: Richard Henderson <richard.henderson@linaro.org>

Instantiating mps2-an505 (cortex-m33) will fail "make check" once
V7VE asserts that ID_ISAR0.Divide includes ARM division, so a v8
M-profile CPU must imply only V7, not V7VE.  It is also wrong to
include ARM_FEATURE_LPAE for an M-profile CPU.
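
As a rough illustration of the check described above, the sketch below
decodes ID_ISAR0.Divide (bits [27:24]) by hand. The helper names are
hypothetical and are not QEMU functions. 0x02101110 is the Cortex-A15
value used elsewhere in this series; 0x01101110 is an assumed
M-profile-style value with Divide = 1, chosen only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * ID_ISAR0.Divide lives in bits [27:24]:
 *   0 = no hardware divide
 *   1 = SDIV/UDIV in the Thumb instruction set only
 *   2 = SDIV/UDIV in the ARM instruction set as well
 * All helper names here are hypothetical, for illustration only.
 */
static inline uint32_t sketch_isar0_divide(uint32_t id_isar0)
{
    return (id_isar0 >> 24) & 0xf;
}

/* True if Thumb-state division is advertised. */
static inline bool sketch_has_thumb_div(uint32_t id_isar0)
{
    return sketch_isar0_divide(id_isar0) != 0;
}

/* True if ARM-state division is advertised; this is what V7VE expects. */
static inline bool sketch_has_arm_div(uint32_t id_isar0)
{
    return sketch_isar0_divide(id_isar0) > 1;
}
```

With these helpers, sketch_has_arm_div(0x02101110) holds, while a value
with Divide = 1 passes only the Thumb test, which is why an assertion
tied to V7VE would trip on a Thumb-only CPU.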

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181016223115.24100-3-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
 
     /* Some features automatically imply others: */
     if (arm_feature(env, ARM_FEATURE_V8)) {
-        set_feature(env, ARM_FEATURE_V7VE);
+        if (arm_feature(env, ARM_FEATURE_M)) {
+            set_feature(env, ARM_FEATURE_V7);
+        } else {
+            set_feature(env, ARM_FEATURE_V7VE);
+        }
     }
     if (arm_feature(env, ARM_FEATURE_V7VE)) {
         /* v7 Virtualization Extensions. In real hardware this implies
-- 
2.19.1

From: Richard Henderson <richard.henderson@linaro.org>

Most of the v8 extensions are self-contained within the ISAR
registers and are not implied by other feature bits, which
makes them the easiest to convert.
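
A minimal standalone sketch of what such a conversion looks like: the
feature test reads a 4-bit field straight out of the ID register
instead of consulting a separate feature flag. The names below
(field_ex32, SketchISARegisters, sketch_aa32_*) are hypothetical
stand-ins for FIELD_EX32, ARMISARegisters and the isar_feature_*
helpers, and the shift constants assume the ID_ISAR5 layout from the
Arm ARM.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for FIELD_EX32: extract len bits (len < 32). */
static inline uint32_t field_ex32(uint32_t reg, unsigned shift, unsigned len)
{
    return (reg >> shift) & ((1u << len) - 1u);
}

/* Assumed ID_ISAR5 field offsets; every field is 4 bits wide. */
enum {
    SKETCH_ISAR5_AES_SHIFT   = 4,
    SKETCH_ISAR5_SHA1_SHIFT  = 8,
    SKETCH_ISAR5_SHA2_SHIFT  = 12,
    SKETCH_ISAR5_CRC32_SHIFT = 16,
};

/* Cut-down stand-in for ARMISARegisters, holding a single register. */
typedef struct {
    uint32_t id_isar5;
} SketchISARegisters;

/* AES instructions are present when the AES field is nonzero. */
static inline bool sketch_aa32_aes(const SketchISARegisters *id)
{
    return field_ex32(id->id_isar5, SKETCH_ISAR5_AES_SHIFT, 4) != 0;
}

/* PMULL shares the AES field: a value of 2 means AES plus PMULL. */
static inline bool sketch_aa32_pmull(const SketchISARegisters *id)
{
    return field_ex32(id->id_isar5, SKETCH_ISAR5_AES_SHIFT, 4) > 1;
}

/* CRC32 instructions are present when the CRC32 field is nonzero. */
static inline bool sketch_aa32_crc32(const SketchISARegisters *id)
{
    return field_ex32(id->id_isar5, SKETCH_ISAR5_CRC32_SHIFT, 4) != 0;
}
```

Fed the Cortex-A57 value 0x00011121 from this series, all three tests
report the feature as present, with no separate feature bits needed.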

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181016223115.24100-4-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h           | 131 +++++++++++++++++++++++++++++++++----
 target/arm/translate.h     |   7 ++
 linux-user/elfload.c       |  46 ++++++++-----
 target/arm/cpu.c           |  27 +++++---
 target/arm/cpu64.c         |  57 +++++++++-------
 target/arm/translate-a64.c | 101 ++++++++++++++--------------
 target/arm/translate.c     |  36 +++++-----
 7 files changed, 273 insertions(+), 132 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ typedef enum ARMPSCIState {
     PSCI_ON_PENDING = 2
 } ARMPSCIState;
 
+typedef struct ARMISARegisters ARMISARegisters;
+
 /**
  * ARMCPU:
  * @env: #CPUARMState
@@ -XXX,XX +XXX,XX @@ enum arm_features {
     ARM_FEATURE_LPAE, /* has Large Physical Address Extension */
     ARM_FEATURE_V8,
     ARM_FEATURE_AARCH64, /* supports 64 bit mode */
-    ARM_FEATURE_V8_AES, /* implements AES part of v8 Crypto Extensions */
     ARM_FEATURE_CBAR, /* has cp15 CBAR */
     ARM_FEATURE_CRC, /* ARMv8 CRC instructions */
     ARM_FEATURE_CBAR_RO, /* has cp15 CBAR and it is read-only */
     ARM_FEATURE_EL2, /* has EL2 Virtualization support */
     ARM_FEATURE_EL3, /* has EL3 Secure monitor support */
-    ARM_FEATURE_V8_SHA1, /* implements SHA1 part of v8 Crypto Extensions */
-    ARM_FEATURE_V8_SHA256, /* implements SHA256 part of v8 Crypto Extensions */
-    ARM_FEATURE_V8_PMULL, /* implements PMULL part of v8 Crypto Extensions */
     ARM_FEATURE_THUMB_DSP, /* DSP insns supported in the Thumb encodings */
     ARM_FEATURE_PMU, /* has PMU support */
     ARM_FEATURE_VBAR, /* has cp15 VBAR */
     ARM_FEATURE_M_SECURITY, /* M profile Security Extension */
     ARM_FEATURE_JAZELLE, /* has (trivial) Jazelle implementation */
     ARM_FEATURE_SVE, /* has Scalable Vector Extension */
-    ARM_FEATURE_V8_SHA512, /* implements SHA512 part of v8 Crypto Extensions */
-    ARM_FEATURE_V8_SHA3, /* implements SHA3 part of v8 Crypto Extensions */
-    ARM_FEATURE_V8_SM3, /* implements SM3 part of v8 Crypto Extensions */
-    ARM_FEATURE_V8_SM4, /* implements SM4 part of v8 Crypto Extensions */
-    ARM_FEATURE_V8_ATOMICS, /* ARMv8.1-Atomics feature */
-    ARM_FEATURE_V8_RDM, /* implements v8.1 simd round multiply */
-    ARM_FEATURE_V8_DOTPROD, /* implements v8.2 simd dot product */
     ARM_FEATURE_V8_FP16, /* implements v8.2 half-precision float */
-    ARM_FEATURE_V8_FCMA, /* has complex number part of v8.3 extensions.  */
     ARM_FEATURE_M_MAIN, /* M profile Main Extension */
 };
 
@@ -XXX,XX +XXX,XX @@ static inline uint64_t *aa64_vfp_qreg(CPUARMState *env, unsigned regno)
 /* Shared between translate-sve.c and sve_helper.c.  */
 extern const uint64_t pred_esz_masks[4];
 
+/*
+ * 32-bit feature tests via id registers.
+ */
+static inline bool isar_feature_aa32_aes(const ARMISARegisters *id)
+{
+    return FIELD_EX32(id->id_isar5, ID_ISAR5, AES) != 0;
+}
+
+static inline bool isar_feature_aa32_pmull(const ARMISARegisters *id)
+{
+    return FIELD_EX32(id->id_isar5, ID_ISAR5, AES) > 1;
+}
+
+static inline bool isar_feature_aa32_sha1(const ARMISARegisters *id)
+{
+    return FIELD_EX32(id->id_isar5, ID_ISAR5, SHA1) != 0;
+}
+
+static inline bool isar_feature_aa32_sha2(const ARMISARegisters *id)
+{
+    return FIELD_EX32(id->id_isar5, ID_ISAR5, SHA2) != 0;
+}
+
+static inline bool isar_feature_aa32_crc32(const ARMISARegisters *id)
+{
+    return FIELD_EX32(id->id_isar5, ID_ISAR5, CRC32) != 0;
+}
+
+static inline bool isar_feature_aa32_rdm(const ARMISARegisters *id)
+{
+    return FIELD_EX32(id->id_isar5, ID_ISAR5, RDM) != 0;
+}
+
+static inline bool isar_feature_aa32_vcma(const ARMISARegisters *id)
+{
+    return FIELD_EX32(id->id_isar5, ID_ISAR5, VCMA) != 0;
+}
+
+static inline bool isar_feature_aa32_dp(const ARMISARegisters *id)
+{
+    return FIELD_EX32(id->id_isar6, ID_ISAR6, DP) != 0;
+}
+
+/*
+ * 64-bit feature tests via id registers.
+ */
+static inline bool isar_feature_aa64_aes(const ARMISARegisters *id)
+{
+    return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, AES) != 0;
+}
+
+static inline bool isar_feature_aa64_pmull(const ARMISARegisters *id)
+{
+    return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, AES) > 1;
+}
+
+static inline bool isar_feature_aa64_sha1(const ARMISARegisters *id)
+{
+    return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, SHA1) != 0;
+}
+
+static inline bool isar_feature_aa64_sha256(const ARMISARegisters *id)
+{
+    return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, SHA2) != 0;
+}
+
+static inline bool isar_feature_aa64_sha512(const ARMISARegisters *id)
+{
+    return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, SHA2) > 1;
+}
+
+static inline bool isar_feature_aa64_crc32(const ARMISARegisters *id)
+{
+    return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, CRC32) != 0;
+}
+
+static inline bool isar_feature_aa64_atomics(const ARMISARegisters *id)
+{
+    return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, ATOMIC) != 0;
+}
+
+static inline bool isar_feature_aa64_rdm(const ARMISARegisters *id)
+{
+    return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, RDM) != 0;
+}
+
+static inline bool isar_feature_aa64_sha3(const ARMISARegisters *id)
+{
+    return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, SHA3) != 0;
+}
+
+static inline bool isar_feature_aa64_sm3(const ARMISARegisters *id)
+{
+    return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, SM3) != 0;
+}
+
+static inline bool isar_feature_aa64_sm4(const ARMISARegisters *id)
+{
+    return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, SM4) != 0;
+}
+
+static inline bool isar_feature_aa64_dp(const ARMISARegisters *id)
+{
+    return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, DP) != 0;
+}
+
+static inline bool isar_feature_aa64_fcma(const ARMISARegisters *id)
+{
+    return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, FCMA) != 0;
+}
+
+/*
+ * Forward to the above feature tests given an ARMCPU pointer.
+ */
+#define cpu_isar_feature(name, cpu) \
+    ({ ARMCPU *cpu_ = (cpu); isar_feature_##name(&cpu_->isar); })
+
 #endif
diff --git a/target/arm/translate.h b/target/arm/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@
 /* internal defines */
 typedef struct DisasContext {
     DisasContextBase base;
+    const ARMISARegisters *isar;
 
     target_ulong pc;
     target_ulong page_start;
@@ -XXX,XX +XXX,XX @@ static inline TCGv_i32 get_ahp_flag(void)
     return ret;
 }
 
+/*
+ * Forward to the isar_feature_* tests given a DisasContext pointer.
+ */
+#define dc_isar_feature(name, ctx) \
+    ({ DisasContext *ctx_ = (ctx); isar_feature_##name(ctx_->isar); })
+
 #endif /* TARGET_ARM_TRANSLATE_H */
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index XXXXXXX..XXXXXXX 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -XXX,XX +XXX,XX @@ static uint32_t get_elf_hwcap(void)
     /* probe for the extra features */
 #define GET_FEATURE(feat, hwcap) \
     do { if (arm_feature(&cpu->env, feat)) { hwcaps |= hwcap; } } while (0)
+
+#define GET_FEATURE_ID(feat, hwcap) \
+    do { if (cpu_isar_feature(feat, cpu)) { hwcaps |= hwcap; } } while (0)
+
     /* EDSP is in v5TE and above, but all our v5 CPUs are v5TE */
     GET_FEATURE(ARM_FEATURE_V5, ARM_HWCAP_ARM_EDSP);
     GET_FEATURE(ARM_FEATURE_VFP, ARM_HWCAP_ARM_VFP);
@@ -XXX,XX +XXX,XX @@ static uint32_t get_elf_hwcap2(void)
     ARMCPU *cpu = ARM_CPU(thread_cpu);
     uint32_t hwcaps = 0;
 
-    GET_FEATURE(ARM_FEATURE_V8_AES, ARM_HWCAP2_ARM_AES);
-    GET_FEATURE(ARM_FEATURE_V8_PMULL, ARM_HWCAP2_ARM_PMULL);
-    GET_FEATURE(ARM_FEATURE_V8_SHA1, ARM_HWCAP2_ARM_SHA1);
-    GET_FEATURE(ARM_FEATURE_V8_SHA256, ARM_HWCAP2_ARM_SHA2);
-    GET_FEATURE(ARM_FEATURE_CRC, ARM_HWCAP2_ARM_CRC32);
+    GET_FEATURE_ID(aa32_aes, ARM_HWCAP2_ARM_AES);
+    GET_FEATURE_ID(aa32_pmull, ARM_HWCAP2_ARM_PMULL);
+    GET_FEATURE_ID(aa32_sha1, ARM_HWCAP2_ARM_SHA1);
+    GET_FEATURE_ID(aa32_sha2, ARM_HWCAP2_ARM_SHA2);
+    GET_FEATURE_ID(aa32_crc32, ARM_HWCAP2_ARM_CRC32);
     return hwcaps;
 }
 
 #undef GET_FEATURE
+#undef GET_FEATURE_ID
 
 #else
 /* 64 bit ARM definitions */
@@ -XXX,XX +XXX,XX @@ static uint32_t get_elf_hwcap(void)
     /* probe for the extra features */
 #define GET_FEATURE(feat, hwcap) \
     do { if (arm_feature(&cpu->env, feat)) { hwcaps |= hwcap; } } while (0)
-    GET_FEATURE(ARM_FEATURE_V8_AES, ARM_HWCAP_A64_AES);
-    GET_FEATURE(ARM_FEATURE_V8_PMULL, ARM_HWCAP_A64_PMULL);
-    GET_FEATURE(ARM_FEATURE_V8_SHA1, ARM_HWCAP_A64_SHA1);
-    GET_FEATURE(ARM_FEATURE_V8_SHA256, ARM_HWCAP_A64_SHA2);
-    GET_FEATURE(ARM_FEATURE_CRC, ARM_HWCAP_A64_CRC32);
-    GET_FEATURE(ARM_FEATURE_V8_SHA3, ARM_HWCAP_A64_SHA3);
-    GET_FEATURE(ARM_FEATURE_V8_SM3, ARM_HWCAP_A64_SM3);
-    GET_FEATURE(ARM_FEATURE_V8_SM4, ARM_HWCAP_A64_SM4);
-    GET_FEATURE(ARM_FEATURE_V8_SHA512, ARM_HWCAP_A64_SHA512);
+#define GET_FEATURE_ID(feat, hwcap) \
+    do { if (cpu_isar_feature(feat, cpu)) { hwcaps |= hwcap; } } while (0)
+
+    GET_FEATURE_ID(aa64_aes, ARM_HWCAP_A64_AES);
+    GET_FEATURE_ID(aa64_pmull, ARM_HWCAP_A64_PMULL);
+    GET_FEATURE_ID(aa64_sha1, ARM_HWCAP_A64_SHA1);
+    GET_FEATURE_ID(aa64_sha256, ARM_HWCAP_A64_SHA2);
+    GET_FEATURE_ID(aa64_sha512, ARM_HWCAP_A64_SHA512);
+    GET_FEATURE_ID(aa64_crc32, ARM_HWCAP_A64_CRC32);
+    GET_FEATURE_ID(aa64_sha3, ARM_HWCAP_A64_SHA3);
+    GET_FEATURE_ID(aa64_sm3, ARM_HWCAP_A64_SM3);
+    GET_FEATURE_ID(aa64_sm4, ARM_HWCAP_A64_SM4);
     GET_FEATURE(ARM_FEATURE_V8_FP16,
                 ARM_HWCAP_A64_FPHP | ARM_HWCAP_A64_ASIMDHP);
-    GET_FEATURE(ARM_FEATURE_V8_ATOMICS, ARM_HWCAP_A64_ATOMICS);
-    GET_FEATURE(ARM_FEATURE_V8_RDM, ARM_HWCAP_A64_ASIMDRDM);
-    GET_FEATURE(ARM_FEATURE_V8_DOTPROD, ARM_HWCAP_A64_ASIMDDP);
-    GET_FEATURE(ARM_FEATURE_V8_FCMA, ARM_HWCAP_A64_FCMA);
+    GET_FEATURE_ID(aa64_atomics, ARM_HWCAP_A64_ATOMICS);
+    GET_FEATURE_ID(aa64_rdm, ARM_HWCAP_A64_ASIMDRDM);
+    GET_FEATURE_ID(aa64_dp, ARM_HWCAP_A64_ASIMDDP);
+    GET_FEATURE_ID(aa64_fcma, ARM_HWCAP_A64_FCMA);
     GET_FEATURE(ARM_FEATURE_SVE, ARM_HWCAP_A64_SVE);
+
 #undef GET_FEATURE
+#undef GET_FEATURE_ID
 
     return hwcaps;
 }
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void arm_max_initfn(Object *obj)
         cortex_a15_initfn(obj);
 #ifdef CONFIG_USER_ONLY
         /* We don't set these in system emulation mode for the moment,
-         * since we don't correctly set the ID registers to advertise them,
+         * since we don't correctly set (all of) the ID registers to
+         * advertise them.
          */
         set_feature(&cpu->env, ARM_FEATURE_V8);
-        set_feature(&cpu->env, ARM_FEATURE_V8_AES);
-        set_feature(&cpu->env, ARM_FEATURE_V8_SHA1);
-        set_feature(&cpu->env, ARM_FEATURE_V8_SHA256);
-        set_feature(&cpu->env, ARM_FEATURE_V8_PMULL);
-        set_feature(&cpu->env, ARM_FEATURE_CRC);
-        set_feature(&cpu->env, ARM_FEATURE_V8_RDM);
-        set_feature(&cpu->env, ARM_FEATURE_V8_DOTPROD);
-        set_feature(&cpu->env, ARM_FEATURE_V8_FCMA);
+        {
+            uint32_t t;
+
+            t = cpu->isar.id_isar5;
+            t = FIELD_DP32(t, ID_ISAR5, AES, 2);
+            t = FIELD_DP32(t, ID_ISAR5, SHA1, 1);
+            t = FIELD_DP32(t, ID_ISAR5, SHA2, 1);
+            t = FIELD_DP32(t, ID_ISAR5, CRC32, 1);
+            t = FIELD_DP32(t, ID_ISAR5, RDM, 1);
+            t = FIELD_DP32(t, ID_ISAR5, VCMA, 1);
+            cpu->isar.id_isar5 = t;
+
+            t = cpu->isar.id_isar6;
+            t = FIELD_DP32(t, ID_ISAR6, DP, 1);
+            cpu->isar.id_isar6 = t;
+        }
 #endif
     }
 }
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_a57_initfn(Object *obj)
     set_feature(&cpu->env, ARM_FEATURE_GENERIC_TIMER);
     set_feature(&cpu->env, ARM_FEATURE_AARCH64);
     set_feature(&cpu->env, ARM_FEATURE_CBAR_RO);
-    set_feature(&cpu->env, ARM_FEATURE_V8_AES);
-    set_feature(&cpu->env, ARM_FEATURE_V8_SHA1);
-    set_feature(&cpu->env, ARM_FEATURE_V8_SHA256);
-    set_feature(&cpu->env, ARM_FEATURE_V8_PMULL);
-    set_feature(&cpu->env, ARM_FEATURE_CRC);
     set_feature(&cpu->env, ARM_FEATURE_EL2);
     set_feature(&cpu->env, ARM_FEATURE_EL3);
     set_feature(&cpu->env, ARM_FEATURE_PMU);
@@ -XXX,XX +XXX,XX @@ static void aarch64_a53_initfn(Object *obj)
     set_feature(&cpu->env, ARM_FEATURE_GENERIC_TIMER);
     set_feature(&cpu->env, ARM_FEATURE_AARCH64);
     set_feature(&cpu->env, ARM_FEATURE_CBAR_RO);
-    set_feature(&cpu->env, ARM_FEATURE_V8_AES);
-    set_feature(&cpu->env, ARM_FEATURE_V8_SHA1);
-    set_feature(&cpu->env, ARM_FEATURE_V8_SHA256);
-    set_feature(&cpu->env, ARM_FEATURE_V8_PMULL);
-    set_feature(&cpu->env, ARM_FEATURE_CRC);
     set_feature(&cpu->env, ARM_FEATURE_EL2);
     set_feature(&cpu->env, ARM_FEATURE_EL3);
     set_feature(&cpu->env, ARM_FEATURE_PMU);
@@ -XXX,XX +XXX,XX @@ static void aarch64_a72_initfn(Object *obj)
     set_feature(&cpu->env, ARM_FEATURE_GENERIC_TIMER);
     set_feature(&cpu->env, ARM_FEATURE_AARCH64);
     set_feature(&cpu->env, ARM_FEATURE_CBAR_RO);
-    set_feature(&cpu->env, ARM_FEATURE_V8_AES);
-    set_feature(&cpu->env, ARM_FEATURE_V8_SHA1);
-    set_feature(&cpu->env, ARM_FEATURE_V8_SHA256);
-    set_feature(&cpu->env, ARM_FEATURE_V8_PMULL);
-    set_feature(&cpu->env, ARM_FEATURE_CRC);
     set_feature(&cpu->env, ARM_FEATURE_EL2);
     set_feature(&cpu->env, ARM_FEATURE_EL3);
     set_feature(&cpu->env, ARM_FEATURE_PMU);
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
     if (kvm_enabled()) {
         kvm_arm_set_cpu_features_from_host(cpu);
     } else {
+        uint64_t t;
+        uint32_t u;
         aarch64_a57_initfn(obj);
+
+        t = cpu->isar.id_aa64isar0;
+        t = FIELD_DP64(t, ID_AA64ISAR0, AES, 2); /* AES + PMULL */
+        t = FIELD_DP64(t, ID_AA64ISAR0, SHA1, 1);
+        t = FIELD_DP64(t, ID_AA64ISAR0, SHA2, 2); /* SHA512 */
+        t = FIELD_DP64(t, ID_AA64ISAR0, CRC32, 1);
+        t = FIELD_DP64(t, ID_AA64ISAR0, ATOMIC, 2);
+        t = FIELD_DP64(t, ID_AA64ISAR0, RDM, 1);
+        t = FIELD_DP64(t, ID_AA64ISAR0, SHA3, 1);
+        t = FIELD_DP64(t, ID_AA64ISAR0, SM3, 1);
+        t = FIELD_DP64(t, ID_AA64ISAR0, SM4, 1);
+        t = FIELD_DP64(t, ID_AA64ISAR0, DP, 1);
+        cpu->isar.id_aa64isar0 = t;
+
+        t = cpu->isar.id_aa64isar1;
+        t = FIELD_DP64(t, ID_AA64ISAR1, FCMA, 1);
+        cpu->isar.id_aa64isar1 = t;
+
+        /* Replicate the same data to the 32-bit id registers.  */
+        u = cpu->isar.id_isar5;
+        u = FIELD_DP32(u, ID_ISAR5, AES, 2); /* AES + PMULL */
+        u = FIELD_DP32(u, ID_ISAR5, SHA1, 1);
+        u = FIELD_DP32(u, ID_ISAR5, SHA2, 1);
+        u = FIELD_DP32(u, ID_ISAR5, CRC32, 1);
+        u = FIELD_DP32(u, ID_ISAR5, RDM, 1);
+        u = FIELD_DP32(u, ID_ISAR5, VCMA, 1);
+        cpu->isar.id_isar5 = u;
+
+        u = cpu->isar.id_isar6;
+        u = FIELD_DP32(u, ID_ISAR6, DP, 1);
+        cpu->isar.id_isar6 = u;
+
 #ifdef CONFIG_USER_ONLY
         /* We don't set these in system emulation mode for the moment,
          * since we don't correctly set the ID registers to advertise them,
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
          * whereas the architecture requires them to be present in both if
          * present in either.
          */
-        set_feature(&cpu->env, ARM_FEATURE_V8_SHA512);
-        set_feature(&cpu->env, ARM_FEATURE_V8_SHA3);
-        set_feature(&cpu->env, ARM_FEATURE_V8_SM3);
-        set_feature(&cpu->env, ARM_FEATURE_V8_SM4);
-        set_feature(&cpu->env, ARM_FEATURE_V8_ATOMICS);
-        set_feature(&cpu->env, ARM_FEATURE_V8_RDM);
-        set_feature(&cpu->env, ARM_FEATURE_V8_DOTPROD);
         set_feature(&cpu->env, ARM_FEATURE_V8_FP16);
-        set_feature(&cpu->env, ARM_FEATURE_V8_FCMA);
         set_feature(&cpu->env, ARM_FEATURE_SVE);
         /* For usermode -cpu max we can use a larger and more efficient DCZ
          * blocksize since we don't have to follow what the hardware does.
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_excl(DisasContext *s, uint32_t insn)
         }
         if (rt2 == 31
             && ((rt | rs) & 1) == 0
-            && arm_dc_feature(s, ARM_FEATURE_V8_ATOMICS)) {
+            && dc_isar_feature(aa64_atomics, s)) {
             /* CASP / CASPL */
             gen_compare_and_swap_pair(s, rs, rt, rn, size | 2);
             return;
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_excl(DisasContext *s, uint32_t insn)
         }
         if (rt2 == 31
             && ((rt | rs) & 1) == 0
-            && arm_dc_feature(s, ARM_FEATURE_V8_ATOMICS)) {
+            && dc_isar_feature(aa64_atomics, s)) {
             /* CASPA / CASPAL */
             gen_compare_and_swap_pair(s, rs, rt, rn, size | 2);
             return;
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_excl(DisasContext *s, uint32_t insn)
     case 0xb: /* CASL */
     case 0xe: /* CASA */
     case 0xf: /* CASAL */
-        if (rt2 == 31 && arm_dc_feature(s, ARM_FEATURE_V8_ATOMICS)) {
+        if (rt2 == 31 && dc_isar_feature(aa64_atomics, s)) {
             gen_compare_and_swap(s, rs, rt, rn, size);
             return;
         }
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_atomic(DisasContext *s, uint32_t insn,
     int rs = extract32(insn, 16, 5);
     int rn = extract32(insn, 5, 5);
     int o3_opc = extract32(insn, 12, 4);
-    int feature = ARM_FEATURE_V8_ATOMICS;
     TCGv_i64 tcg_rn, tcg_rs;
     AtomicThreeOpFn *fn;
 
-    if (is_vector) {
+    if (is_vector || !dc_isar_feature(aa64_atomics, s)) {
         unallocated_encoding(s);
         return;
     }
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_atomic(DisasContext *s, uint32_t insn,
         unallocated_encoding(s);
         return;
     }
-    if (!arm_dc_feature(s, feature)) {
-        unallocated_encoding(s);
-        return;
-    }
 
     if (rn == 31) {
         gen_check_sp_alignment(s);
@@ -XXX,XX +XXX,XX @@ static void handle_crc32(DisasContext *s,
     TCGv_i64 tcg_acc, tcg_val;
     TCGv_i32 tcg_bytes;
 
-    if (!arm_dc_feature(s, ARM_FEATURE_CRC)
+    if (!dc_isar_feature(aa64_crc32, s)
         || (sf == 1 && sz != 3)
         || (sf == 0 && sz == 3)) {
         unallocated_encoding(s);
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_same_extra(DisasContext *s,
     bool u = extract32(insn, 29, 1);
     TCGv_i32 ele1, ele2, ele3;
     TCGv_i64 res;
-    int feature;
+    bool feature;
 
     switch (u * 16 + opcode) {
     case 0x10: /* SQRDMLAH (vector) */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_same_extra(DisasContext *s,
             unallocated_encoding(s);
             return;
         }
-        feature = ARM_FEATURE_V8_RDM;
+        feature = dc_isar_feature(aa64_rdm, s);
         break;
     default:
         unallocated_encoding(s);
         return;
     }
-    if (!arm_dc_feature(s, feature)) {
+    if (!feature) {
         unallocated_encoding(s);
         return;
     }
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_diff(DisasContext *s, uint32_t insn)
             return;
         }
         if (size == 3) {
-            if (!arm_dc_feature(s, ARM_FEATURE_V8_PMULL)) {
+            if (!dc_isar_feature(aa64_pmull, s)) {
                 unallocated_encoding(s);
                 return;
             }
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn)
     int size = extract32(insn, 22, 2);
     bool u = extract32(insn, 29, 1);
     bool is_q = extract32(insn, 30, 1);
-    int feature, rot;
+    bool feature;
+    int rot;
 
     switch (u * 16 + opcode) {
     case 0x10: /* SQRDMLAH (vector) */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn)
             unallocated_encoding(s);
             return;
         }
-        feature = ARM_FEATURE_V8_RDM;
+        feature = dc_isar_feature(aa64_rdm, s);
         break;
     case 0x02: /* SDOT (vector) */
     case 0x12: /* UDOT (vector) */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn)
             unallocated_encoding(s);
             return;
         }
-        feature = ARM_FEATURE_V8_DOTPROD;
+        feature = dc_isar_feature(aa64_dp, s);
         break;
     case 0x18: /* FCMLA, #0 */
     case 0x19: /* FCMLA, #90 */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn)
             unallocated_encoding(s);
             return;
         }
-        feature = ARM_FEATURE_V8_FCMA;
+        feature = dc_isar_feature(aa64_fcma, s);
         break;
     default:
         unallocated_encoding(s);
         return;
     }
-    if (!arm_dc_feature(s, feature)) {
+    if (!feature) {
         unallocated_encoding(s);
         return;
     }
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
         break;
     case 0x1d: /* SQRDMLAH */
     case 0x1f: /* SQRDMLSH */
-        if (!arm_dc_feature(s, ARM_FEATURE_V8_RDM)) {
+        if (!dc_isar_feature(aa64_rdm, s)) {
             unallocated_encoding(s);
             return;
         }
         break;
     case 0x0e: /* SDOT */
     case 0x1e: /* UDOT */
-        if (size != MO_32 || !arm_dc_feature(s, ARM_FEATURE_V8_DOTPROD)) {
+        if (size != MO_32 || !dc_isar_feature(aa64_dp, s)) {
             unallocated_encoding(s);
             return;
         }
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
     case 0x13: /* FCMLA #90 */
     case 0x15: /* FCMLA #180 */
     case 0x17: /* FCMLA #270 */
-        if (!arm_dc_feature(s, ARM_FEATURE_V8_FCMA)) {
+        if (!dc_isar_feature(aa64_fcma, s)) {
             unallocated_encoding(s);
             return;
         }
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_aes(DisasContext *s, uint32_t insn)
     TCGv_i32 tcg_decrypt;
     CryptoThreeOpIntFn *genfn;
 
-    if (!arm_dc_feature(s, ARM_FEATURE_V8_AES)
-        || size != 0) {
+    if (!dc_isar_feature(aa64_aes, s) || size != 0) {
         unallocated_encoding(s);
         return;
     }
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_three_reg_sha(DisasContext *s, uint32_t insn)
     int rd = extract32(insn, 0, 5);
     CryptoThreeOpFn *genfn;
     TCGv_ptr tcg_rd_ptr, tcg_rn_ptr, tcg_rm_ptr;
-    int feature = ARM_FEATURE_V8_SHA256;
+    bool feature;
 
     if (size != 0) {
         unallocated_encoding(s);
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_three_reg_sha(DisasContext *s, uint32_t insn)
     case 2: /* SHA1M */
     case 3: /* SHA1SU0 */
         genfn = NULL;
-        feature = ARM_FEATURE_V8_SHA1;
+        feature = dc_isar_feature(aa64_sha1, s);
         break;
     case 4: /* SHA256H */
         genfn = gen_helper_crypto_sha256h;
+        feature = dc_isar_feature(aa64_sha256, s);
         break;
     case 5: /* SHA256H2 */
         genfn = gen_helper_crypto_sha256h2;
+        feature = dc_isar_feature(aa64_sha256, s);
         break;
     case 6: /* SHA256SU1 */
         genfn = gen_helper_crypto_sha256su1;
+        feature = dc_isar_feature(aa64_sha256, s);
         break;
     default:
         unallocated_encoding(s);
         return;
     }
 
-    if (!arm_dc_feature(s, feature)) {
+    if (!feature) {
         unallocated_encoding(s);
         return;
     }
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_two_reg_sha(DisasContext *s, uint32_t insn)
     int rn = extract32(insn, 5, 5);
     int rd = extract32(insn, 0, 5);
     CryptoTwoOpFn *genfn;
-    int feature;
+    bool feature;
     TCGv_ptr tcg_rd_ptr, tcg_rn_ptr;
 
     if (size != 0) {
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_two_reg_sha(DisasContext *s, uint32_t insn)
 
     switch (opcode) {
     case 0: /* SHA1H */
-        feature = ARM_FEATURE_V8_SHA1;
+        feature = dc_isar_feature(aa64_sha1, s);
         genfn = gen_helper_crypto_sha1h;
         break;
     case 1: /* SHA1SU1 */
-        feature = ARM_FEATURE_V8_SHA1;
+        feature = dc_isar_feature(aa64_sha1, s);
         genfn = gen_helper_crypto_sha1su1;
         break;
     case 2: /* SHA256SU0 */
-        feature = ARM_FEATURE_V8_SHA256;
+        feature = dc_isar_feature(aa64_sha256, s);
         genfn = gen_helper_crypto_sha256su0;
         break;
     default:
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_two_reg_sha(DisasContext *s, uint32_t insn)
         return;
     }
 
-    if (!arm_dc_feature(s, feature)) {
+    if (!feature) {
         unallocated_encoding(s);
         return;
     }
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_three_reg_sha512(DisasContext *s, uint32_t insn)
     int rm = extract32(insn, 16, 5);
     int rn = extract32(insn, 5, 5);
     int rd = extract32(insn, 0, 5);
-    int feature;
+    bool feature;
     CryptoThreeOpFn *genfn;
 
     if (o == 0) {
         switch (opcode) {
         case 0: /* SHA512H */
-            feature = ARM_FEATURE_V8_SHA512;
+            feature = dc_isar_feature(aa64_sha512, s);
             genfn = gen_helper_crypto_sha512h;
             break;
         case 1: /* SHA512H2 */
-            feature = ARM_FEATURE_V8_SHA512;
+            feature = dc_isar_feature(aa64_sha512, s);
             genfn = gen_helper_crypto_sha512h2;
             break;
         case 2: /* SHA512SU1 */
-            feature = ARM_FEATURE_V8_SHA512;
+            feature = dc_isar_feature(aa64_sha512, s);
             genfn = gen_helper_crypto_sha512su1;
             break;
         case 3: /* RAX1 */
-            feature = ARM_FEATURE_V8_SHA3;
+            feature = dc_isar_feature(aa64_sha3, s);
             genfn = NULL;
             break;
         }
     } else {
         switch (opcode) {
         case 0: /* SM3PARTW1 */
-            feature = ARM_FEATURE_V8_SM3;
+            feature = dc_isar_feature(aa64_sm3, s);
             genfn = gen_helper_crypto_sm3partw1;
             break;
         case 1: /* SM3PARTW2 */
-            feature = ARM_FEATURE_V8_SM3;
+            feature = dc_isar_feature(aa64_sm3, s);
             genfn = gen_helper_crypto_sm3partw2;
             break;
         case 2: /* SM4EKEY */
-            feature = ARM_FEATURE_V8_SM4;
+            feature = dc_isar_feature(aa64_sm4, s);
             genfn = gen_helper_crypto_sm4ekey;
             break;
         default:
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_three_reg_sha512(DisasContext *s, uint32_t insn)
         }
     }
 
-    if (!arm_dc_feature(s, feature)) {
+    if (!feature) {
         unallocated_encoding(s);
         return;
     }
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_two_reg_sha512(DisasContext *s, uint32_t insn)
     int rn = extract32(insn, 5, 5);
     int rd = extract32(insn, 0, 5);
     TCGv_ptr tcg_rd_ptr, tcg_rn_ptr;
-    int feature;
+    bool feature;
     CryptoTwoOpFn *genfn;
 
     switch (opcode) {
     case 0: /* SHA512SU0 */
-        feature = ARM_FEATURE_V8_SHA512;
+        feature = dc_isar_feature(aa64_sha512, s);
         genfn = gen_helper_crypto_sha512su0;
         break;
     case 1: /* SM4E */
-        feature = ARM_FEATURE_V8_SM4;
+        feature = dc_isar_feature(aa64_sm4, s);
         genfn = gen_helper_crypto_sm4e;
         break;
     default:
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_two_reg_sha512(DisasContext *s, uint32_t insn)
         return;
     }
 
-    if (!arm_dc_feature(s, feature)) {
+    if (!feature) {
         unallocated_encoding(s);
         return;
     }
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_four_reg(DisasContext *s, uint32_t insn)
     int ra = extract32(insn, 10, 5);
     int rn = extract32(insn, 5, 5);
     int rd = extract32(insn, 0, 5);
-    int feature;
+    bool feature;
 
     switch (op0) {
     case 0: /* EOR3 */
     case 1: /* BCAX */
-        feature = ARM_FEATURE_V8_SHA3;
+        feature = dc_isar_feature(aa64_sha3, s);
         break;
     case 2: /* SM3SS1 */
-        feature = ARM_FEATURE_V8_SM3;
+        feature = dc_isar_feature(aa64_sm3, s);
         break;
     default:
         unallocated_encoding(s);
         return;
     }
 
-    if (!arm_dc_feature(s, feature)) {
+    if (!feature) {
         unallocated_encoding(s);
         return;
     }
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_xar(DisasContext *s, uint32_t insn)
     TCGv_i64 tcg_op1, tcg_op2, tcg_res[2];
     int pass;
 
-    if (!arm_dc_feature(s, ARM_FEATURE_V8_SHA3)) {
+    if (!dc_isar_feature(aa64_sha3, s)) {
         unallocated_encoding(s);
         return;
     }
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_three_reg_imm2(DisasContext *s, uint32_t insn)
     TCGv_ptr tcg_rd_ptr, tcg_rn_ptr, tcg_rm_ptr;
     TCGv_i32 tcg_imm2, tcg_opcode;
 
-    if (!arm_dc_feature(s, ARM_FEATURE_V8_SM3)) {
+    if (!dc_isar_feature(aa64_sm3, s)) {
         unallocated_encoding(s);
         return;
     }
@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_init_disas_context(DisasContextBase *dcbase,
     ARMCPU *arm_cpu = arm_env_get_cpu(env);
     int bound;
 
+    dc->isar = &arm_cpu->isar;
     dc->pc = dc->base.pc_first;
     dc->condjmp = 0;
 
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static const uint8_t neon_2rm_sizes[] = {
 static int do_v81_helper(DisasContext *s, gen_helper_gvec_3_ptr *fn,
                          int q, int rd, int rn, int rm)
 {
-    if (arm_dc_feature(s, ARM_FEATURE_V8_RDM)) {
+    if (dc_isar_feature(aa32_rdm, s)) {
         int opr_sz = (1 + q) * 8;
         tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd),
                            vfp_reg_offset(1, rn),
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                 return 1;
             }
             if (!u) { /* SHA-1 */
-                if (!arm_dc_feature(s, ARM_FEATURE_V8_SHA1)) {
+                if (!dc_isar_feature(aa32_sha1, s)) {
                     return 1;
                 }
                 ptr1 = vfp_reg_ptr(true, rd);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                 gen_helper_crypto_sha1_3reg(ptr1, ptr2, ptr3, tmp4);
                 tcg_temp_free_i32(tmp4);
             } else { /* SHA-256 */
-                if (!arm_dc_feature(s, ARM_FEATURE_V8_SHA256) || size == 3) {
+                if (!dc_isar_feature(aa32_sha2, s) || size == 3) {
                     return 1;
                 }
                 ptr1 = vfp_reg_ptr(true, rd);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                 if (op == 14 && size == 2) {
                     TCGv_i64 tcg_rn, tcg_rm, tcg_rd;
 
-                    if (!arm_dc_feature(s, ARM_FEATURE_V8_PMULL)) {
+                    if (!dc_isar_feature(aa32_pmull, s)) {
                         return 1;
                     }
                     tcg_rn = tcg_temp_new_i64();
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                     {
                         NeonGenThreeOpEnvFn *fn;
 
-                        if (!arm_dc_feature(s, ARM_FEATURE_V8_RDM)) {
+                        if (!dc_isar_feature(aa32_rdm, s)) {
                             return 1;
                         }
                         if (u && ((rd | rn) & 1)) {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                     break;
                 }
                 case NEON_2RM_AESE: case NEON_2RM_AESMC:
-                    if (!arm_dc_feature(s, ARM_FEATURE_V8_AES)
-                        || ((rm | rd) & 1)) {
+                    if (!dc_isar_feature(aa32_aes, s) || ((rm | rd) & 1)) {
                         return 1;
                     }
                     ptr1 = vfp_reg_ptr(true, rd);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                     tcg_temp_free_i32(tmp3);
                     break;
                 case NEON_2RM_SHA1H:
-                    if (!arm_dc_feature(s, ARM_FEATURE_V8_SHA1)
-                        || ((rm | rd) & 1)) {
+                    if (!dc_isar_feature(aa32_sha1, s) || ((rm | rd) & 1)) {
                         return 1;
                     }
                     ptr1 = vfp_reg_ptr(true, rd);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                     }
                     /* bit 6 (q): set -> SHA256SU0, cleared -> SHA1SU1 */
                     if (q) {
-                        if (!arm_dc_feature(s, ARM_FEATURE_V8_SHA256)) {
+                        if (!dc_isar_feature(aa32_sha2, s)) {
                             return 1;
                         }
-                    } else if (!arm_dc_feature(s, ARM_FEATURE_V8_SHA1)) {
+                    } else if (!dc_isar_feature(aa32_sha1, s)) {
                         return 1;
                     }
                     ptr1 = vfp_reg_ptr(true, rd);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
         /* VCMLA -- 1111 110R R.1S .... .... 1000 ...0 .... */
         int size = extract32(insn, 20, 1);
         data = extract32(insn, 23, 2); /* rot */
-        if (!arm_dc_feature(s, ARM_FEATURE_V8_FCMA)
+        if (!dc_isar_feature(aa32_vcma, s)
             || (!size && !arm_dc_feature(s, ARM_FEATURE_V8_FP16))) {
             return 1;
         }
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
         /* VCADD -- 1111 110R 1.0S .... .... 1000 ...0 .... */
         int size = extract32(insn, 20, 1);
         data = extract32(insn, 24, 1); /* rot */
-        if (!arm_dc_feature(s, ARM_FEATURE_V8_FCMA)
+        if (!dc_isar_feature(aa32_vcma, s)
             || (!size && !arm_dc_feature(s, ARM_FEATURE_V8_FP16))) {
             return 1;
         }
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
     } else if ((insn & 0xfeb00f00) == 0xfc200d00) {
         /* V[US]DOT -- 1111 1100 0.10 .... .... 1101 .Q.U .... */
         bool u = extract32(insn, 4, 1);
-        if (!arm_dc_feature(s, ARM_FEATURE_V8_DOTPROD)) {
+        if (!dc_isar_feature(aa32_dp, s)) {
             return 1;
         }
         fn_gvec = u ? gen_helper_gvec_udot_b : gen_helper_gvec_sdot_b;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
         int size = extract32(insn, 23, 1);
         int index;
 
-        if (!arm_dc_feature(s, ARM_FEATURE_V8_FCMA)) {
+        if (!dc_isar_feature(aa32_vcma, s)) {
             return 1;
         }
         if (size == 0) {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
     } else if ((insn & 0xffb00f00) == 0xfe200d00) {
         /* V[US]DOT -- 1111 1110 0.10 .... .... 1101 .Q.U .... */
         int u = extract32(insn, 4, 1);
-        if (!arm_dc_feature(s, ARM_FEATURE_V8_DOTPROD)) {
+        if (!dc_isar_feature(aa32_dp, s)) {
             return 1;
         }
         fn_gvec = u ? gen_helper_gvec_udot_idx_b : gen_helper_gvec_sdot_idx_b;
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
              * op1 == 3 is UNPREDICTABLE but handle as UNDEFINED.
              * Bits 8, 10 and 11 should be zero.
              */
-            if (!arm_dc_feature(s, ARM_FEATURE_CRC) || op1 == 0x3 ||
-                (c & 0xd) != 0) {
+            if (!dc_isar_feature(aa32_crc32, s) || op1 == 0x3 || (c & 0xd) != 0) {
                 goto illegal_op;
             }
 
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
                 case 0x28:
                 case 0x29:
                 case 0x2a:
-                    if (!arm_dc_feature(s, ARM_FEATURE_CRC)) {
+                    if (!dc_isar_feature(aa32_crc32, s)) {
                         goto illegal_op;
                     }
                     break;
@@ -XXX,XX +XXX,XX @@ static void arm_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
     CPUARMState *env = cs->env_ptr;
     ARMCPU *cpu = arm_env_get_cpu(env);
 
+    dc->isar = &cpu->isar;
     dc->pc = dc->base.pc_first;
     dc->condjmp = 0;
 
-- 
2.19.1

From: Richard Henderson <richard.henderson@linaro.org>

Both Arm and Thumb2 division are controlled by the same ISAR field
(ID_ISAR0.DIVIDE), which takes care of the "Arm implies Thumb" case.
Having M imply Thumb2 division was wrong for cortex-m0, which is v6m
and does not have Thumb2 at all, much less Thumb2 division.
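
As a rough illustration of why a single field covers both cases, the
sketch below decodes the Divide field the same way the two new tests
do: non-zero means Thumb division, greater than one means Arm division
as well. It is a standalone approximation, not code from this patch.
The helper sketch_extract32() and the sketch_has_*() names are
hypothetical stand-ins for FIELD_EX32() and the isar_feature_*()
functions, and the field position (bits [27:24] of ID_ISAR0) is taken
from the Arm architecture documentation rather than from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for FIELD_EX32(): extract `length` bits
 * starting at bit `start`.  Only valid for length < 32. */
static inline uint32_t sketch_extract32(uint32_t value, unsigned start,
                                        unsigned length)
{
    return (value >> start) & ((1u << length) - 1u);
}

/* Assumption: ID_ISAR0.Divide occupies bits [27:24]. */
#define SKETCH_ID_ISAR0_DIVIDE_SHIFT  24
#define SKETCH_ID_ISAR0_DIVIDE_LENGTH 4

static inline uint32_t sketch_divide_field(uint32_t id_isar0)
{
    return sketch_extract32(id_isar0, SKETCH_ID_ISAR0_DIVIDE_SHIFT,
                            SKETCH_ID_ISAR0_DIVIDE_LENGTH);
}

/* Divide != 0: SDIV/UDIV available in the Thumb encoding. */
static inline bool sketch_has_thumb_div(uint32_t id_isar0)
{
    return sketch_divide_field(id_isar0) != 0;
}

/* Divide > 1: SDIV/UDIV available in the Arm encoding too, so Arm
 * division can never be reported without Thumb division. */
static inline bool sketch_has_arm_div(uint32_t id_isar0)
{
    return sketch_divide_field(id_isar0) > 1;
}
```

Because both tests read the same field, a CPU whose ID_ISAR0 reports no
division (as one would expect for a v6m core) fails both tests without
any per-profile special case.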

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181016223115.24100-5-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h       | 12 ++++++++++--
 linux-user/elfload.c   |  4 ++--
 target/arm/cpu.c       | 10 +---------
 target/arm/translate.c |  4 ++--
 4 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ enum arm_features {
     ARM_FEATURE_VFP3,
     ARM_FEATURE_VFP_FP16,
     ARM_FEATURE_NEON,
-    ARM_FEATURE_THUMB_DIV, /* divide supported in Thumb encoding */
     ARM_FEATURE_M, /* Microcontroller profile.  */
     ARM_FEATURE_OMAPCP, /* OMAP specific CP15 ops handling.  */
     ARM_FEATURE_THUMB2EE,
@@ -XXX,XX +XXX,XX @@ enum arm_features {
     ARM_FEATURE_V5,
     ARM_FEATURE_STRONGARM,
     ARM_FEATURE_VAPA, /* cp15 VA to PA lookups */
-    ARM_FEATURE_ARM_DIV, /* divide supported in ARM encoding */
     ARM_FEATURE_VFP4, /* VFPv4 (implies that NEON is v2) */
     ARM_FEATURE_GENERIC_TIMER,
     ARM_FEATURE_MVFR, /* Media and VFP Feature Registers 0 and 1 */
@@ -XXX,XX +XXX,XX @@ extern const uint64_t pred_esz_masks[4];
 /*
  * 32-bit feature tests via id registers.
  */
+static inline bool isar_feature_thumb_div(const ARMISARegisters *id)
+{
+    return FIELD_EX32(id->id_isar0, ID_ISAR0, DIVIDE) != 0;
+}
+
+static inline bool isar_feature_arm_div(const ARMISARegisters *id)
+{
+    return FIELD_EX32(id->id_isar0, ID_ISAR0, DIVIDE) > 1;
+}
+
 static inline bool isar_feature_aa32_aes(const ARMISARegisters *id)
 {
     return FIELD_EX32(id->id_isar5, ID_ISAR5, AES) != 0;
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index XXXXXXX..XXXXXXX 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -XXX,XX +XXX,XX @@ static uint32_t get_elf_hwcap(void)
     GET_FEATURE(ARM_FEATURE_VFP3, ARM_HWCAP_ARM_VFPv3);
     GET_FEATURE(ARM_FEATURE_V6K, ARM_HWCAP_ARM_TLS);
     GET_FEATURE(ARM_FEATURE_VFP4, ARM_HWCAP_ARM_VFPv4);
-    GET_FEATURE(ARM_FEATURE_ARM_DIV, ARM_HWCAP_ARM_IDIVA);
-    GET_FEATURE(ARM_FEATURE_THUMB_DIV, ARM_HWCAP_ARM_IDIVT);
+    GET_FEATURE_ID(arm_div, ARM_HWCAP_ARM_IDIVA);
+    GET_FEATURE_ID(thumb_div, ARM_HWCAP_ARM_IDIVT);
     /* All QEMU's VFPv3 CPUs have 32 registers, see VFP_DREG in translate.c.
      * Note that the ARM_HWCAP_ARM_VFPv3D16 bit is always the inverse of
      * ARM_HWCAP_ARM_VFPD32 (and so always clear for QEMU); it is unrelated
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
          * Presence of EL2 itself is ARM_FEATURE_EL2, and of the
          * Security Extensions is ARM_FEATURE_EL3.
          */
-        set_feature(env, ARM_FEATURE_ARM_DIV);
+        assert(cpu_isar_feature(arm_div, cpu));
         set_feature(env, ARM_FEATURE_LPAE);
         set_feature(env, ARM_FEATURE_V7);
     }
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
     if (arm_feature(env, ARM_FEATURE_V5)) {
         set_feature(env, ARM_FEATURE_V4T);
     }
-    if (arm_feature(env, ARM_FEATURE_M)) {
-        set_feature(env, ARM_FEATURE_THUMB_DIV);
-    }
-    if (arm_feature(env, ARM_FEATURE_ARM_DIV)) {
-        set_feature(env, ARM_FEATURE_THUMB_DIV);
-    }
     if (arm_feature(env, ARM_FEATURE_VFP4)) {
         set_feature(env, ARM_FEATURE_VFP3);
         set_feature(env, ARM_FEATURE_VFP_FP16);
@@ -XXX,XX +XXX,XX @@ static void cortex_r5_initfn(Object *obj)
     ARMCPU *cpu = ARM_CPU(obj);
 
     set_feature(&cpu->env, ARM_FEATURE_V7);
-    set_feature(&cpu->env, ARM_FEATURE_THUMB_DIV);
-    set_feature(&cpu->env, ARM_FEATURE_ARM_DIV);
     set_feature(&cpu->env, ARM_FEATURE_V7MP);
     set_feature(&cpu->env, ARM_FEATURE_PMSA);
     cpu->midr = 0x411fc153; /* r1p3 */
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
                     case 1:
                     case 3:
                         /* SDIV, UDIV */
-                        if (!arm_dc_feature(s, ARM_FEATURE_ARM_DIV)) {
+                        if (!dc_isar_feature(arm_div, s)) {
                             goto illegal_op;
                         }
                         if (((insn >> 5) & 7) || (rd != 15)) {
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
             tmp2 = load_reg(s, rm);
             if ((op & 0x50) == 0x10) {
                 /* sdiv, udiv */
-                if (!arm_dc_feature(s, ARM_FEATURE_THUMB_DIV)) {
+                if (!dc_isar_feature(thumb_div, s)) {
                     goto illegal_op;
                 }
                 if (op & 0x20)
-- 
2.19.1

From: Richard Henderson <richard.henderson@linaro.org>

Having V6 alone imply jazelle was wrong for cortex-m0.
Change to an assertion for V6 & !M.

The old implication was harmless, because the only place we tested
ARM_FEATURE_JAZELLE was for 'bxj' in disas_arm_insn(), which is
unreachable for M-profile cores.
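
The rule "V6 & !M must have Jazelle" can be sketched as a small
consistency check. This is only an approximation of the idea: the
struct sketch_cpu type and both function names are hypothetical, the
real check lives in arm_cpu_realizefn() and is not shown in this mail,
and the field position (bits [31:28] of ID_ISAR1) is taken from the Arm
architecture documentation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, much reduced CPU description for this sketch only. */
struct sketch_cpu {
    bool is_v6;          /* stands in for ARM_FEATURE_V6 */
    bool is_m_profile;   /* stands in for ARM_FEATURE_M */
    uint32_t id_isar1;   /* raw ID_ISAR1 value */
};

/* Assumption: ID_ISAR1.Jazelle occupies bits [31:28]; non-zero means
 * the BXJ instruction is present. */
static inline bool sketch_has_jazelle(const struct sketch_cpu *cpu)
{
    return ((cpu->id_isar1 >> 28) & 0xfu) != 0;
}

/* Returns true when the CPU description obeys the rule from the commit
 * message: a v6 core that is not M-profile must report Jazelle.  Any
 * other core is accepted whatever the field says. */
static inline bool sketch_jazelle_consistent(const struct sketch_cpu *cpu)
{
    if (cpu->is_v6 && !cpu->is_m_profile) {
        return sketch_has_jazelle(cpu);
    }
    return true;
}
```

Under this rule a v6m core such as cortex-m0 passes with a zero Jazelle
field, whereas an A-profile v6 core with a zero field would be flagged.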

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181016223115.24100-6-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h       |  6 +++++-
 target/arm/cpu.c       | 17 ++++++++++++++---
 target/arm/translate.c |  2 +-
 3 files changed, 20 insertions(+), 5 deletions(-)

From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181016223115.24100-7-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h     | 6 +++++-
 linux-user/elfload.c | 2 +-
 target/arm/cpu.c     | 4 ----
 target/arm/helper.c  | 2 +-
 target/arm/machine.c | 3 +--
 5 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ enum arm_features {
     ARM_FEATURE_NEON,
     ARM_FEATURE_M, /* Microcontroller profile.  */
     ARM_FEATURE_OMAPCP, /* OMAP specific CP15 ops handling.  */
-    ARM_FEATURE_THUMB2EE,
     ARM_FEATURE_V7MP,    /* v7 Multiprocessing Extensions */
     ARM_FEATURE_V7VE, /* v7 Virtualization Extensions (non-EL2 parts) */
     ARM_FEATURE_V4T,
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_jazelle(const ARMISARegisters *id)
     return FIELD_EX32(id->id_isar1, ID_ISAR1, JAZELLE) != 0;
 }
 
+static inline bool isar_feature_t32ee(const ARMISARegisters *id)
+{
+    return FIELD_EX32(id->id_isar3, ID_ISAR3, T32EE) != 0;
+}
+
 static inline bool isar_feature_aa32_aes(const ARMISARegisters *id)
 {
     return FIELD_EX32(id->id_isar5, ID_ISAR5, AES) != 0;
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index XXXXXXX..XXXXXXX 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -XXX,XX +XXX,XX @@ static uint32_t get_elf_hwcap(void)
     GET_FEATURE(ARM_FEATURE_V5, ARM_HWCAP_ARM_EDSP);
     GET_FEATURE(ARM_FEATURE_VFP, ARM_HWCAP_ARM_VFP);
     GET_FEATURE(ARM_FEATURE_IWMMXT, ARM_HWCAP_ARM_IWMMXT);
-    GET_FEATURE(ARM_FEATURE_THUMB2EE, ARM_HWCAP_ARM_THUMBEE);
+    GET_FEATURE_ID(t32ee, ARM_HWCAP_ARM_THUMBEE);
     GET_FEATURE(ARM_FEATURE_NEON, ARM_HWCAP_ARM_NEON);
     GET_FEATURE(ARM_FEATURE_VFP3, ARM_HWCAP_ARM_VFPv3);
     GET_FEATURE(ARM_FEATURE_V6K, ARM_HWCAP_ARM_TLS);
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void cortex_a8_initfn(Object *obj)
     set_feature(&cpu->env, ARM_FEATURE_V7);
     set_feature(&cpu->env, ARM_FEATURE_VFP3);
     set_feature(&cpu->env, ARM_FEATURE_NEON);
-    set_feature(&cpu->env, ARM_FEATURE_THUMB2EE);
     set_feature(&cpu->env, ARM_FEATURE_DUMMY_C15_REGS);
     set_feature(&cpu->env, ARM_FEATURE_EL3);
     cpu->midr = 0x410fc080;
@@ -XXX,XX +XXX,XX @@ static void cortex_a9_initfn(Object *obj)
     set_feature(&cpu->env, ARM_FEATURE_VFP3);
     set_feature(&cpu->env, ARM_FEATURE_VFP_FP16);
     set_feature(&cpu->env, ARM_FEATURE_NEON);
-    set_feature(&cpu->env, ARM_FEATURE_THUMB2EE);
     set_feature(&cpu->env, ARM_FEATURE_EL3);
     /* Note that A9 supports the MP extensions even for
      * A9UP and single-core A9MP (which are both different
@@ -XXX,XX +XXX,XX @@ static void cortex_a7_initfn(Object *obj)
     set_feature(&cpu->env, ARM_FEATURE_V7VE);
     set_feature(&cpu->env, ARM_FEATURE_VFP4);
     set_feature(&cpu->env, ARM_FEATURE_NEON);
-    set_feature(&cpu->env, ARM_FEATURE_THUMB2EE);
     set_feature(&cpu->env, ARM_FEATURE_GENERIC_TIMER);
     set_feature(&cpu->env, ARM_FEATURE_DUMMY_C15_REGS);
     set_feature(&cpu->env, ARM_FEATURE_CBAR_RO);
@@ -XXX,XX +XXX,XX @@ static void cortex_a15_initfn(Object *obj)
     set_feature(&cpu->env, ARM_FEATURE_V7VE);
     set_feature(&cpu->env, ARM_FEATURE_VFP4);
     set_feature(&cpu->env, ARM_FEATURE_NEON);
-    set_feature(&cpu->env, ARM_FEATURE_THUMB2EE);
     set_feature(&cpu->env, ARM_FEATURE_GENERIC_TIMER);
     set_feature(&cpu->env, ARM_FEATURE_DUMMY_C15_REGS);
     set_feature(&cpu->env, ARM_FEATURE_CBAR_RO);
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
         define_arm_cp_regs(cpu, vmsa_pmsa_cp_reginfo);
         define_arm_cp_regs(cpu, vmsa_cp_reginfo);
     }
-    if (arm_feature(env, ARM_FEATURE_THUMB2EE)) {
+    if (cpu_isar_feature(t32ee, cpu)) {
         define_arm_cp_regs(cpu, t2ee_cp_reginfo);
     }
     if (arm_feature(env, ARM_FEATURE_GENERIC_TIMER)) {
diff --git a/target/arm/machine.c b/target/arm/machine.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/machine.c
+++ b/target/arm/machine.c
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_m = {
 static bool thumb2ee_needed(void *opaque)
 {
     ARMCPU *cpu = opaque;
-    CPUARMState *env = &cpu->env;
 
-    return arm_feature(env, ARM_FEATURE_THUMB2EE);
+    return cpu_isar_feature(t32ee, cpu);
 }
 
 static const VMStateDescription vmstate_thumb2ee = {
-- 
2.19.1

From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181016223115.24100-8-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h            | 16 +++++++++++++++-
 linux-user/aarch64/signal.c |  4 ++--
 linux-user/elfload.c        |  2 +-
 linux-user/syscall.c        | 10 ++++++----
 target/arm/cpu64.c          |  5 ++++-
 target/arm/helper.c         |  9 ++++++---
 target/arm/machine.c        |  3 +--
 target/arm/translate-a64.c  |  4 ++--
 8 files changed, 37 insertions(+), 16 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ FIELD(ID_AA64ISAR1, FRINTTS, 32, 4)
 FIELD(ID_AA64ISAR1, SB, 36, 4)
 FIELD(ID_AA64ISAR1, SPECRES, 40, 4)
 
+FIELD(ID_AA64PFR0, EL0, 0, 4)
+FIELD(ID_AA64PFR0, EL1, 4, 4)
+FIELD(ID_AA64PFR0, EL2, 8, 4)
+FIELD(ID_AA64PFR0, EL3, 12, 4)
+FIELD(ID_AA64PFR0, FP, 16, 4)
+FIELD(ID_AA64PFR0, ADVSIMD, 20, 4)
+FIELD(ID_AA64PFR0, GIC, 24, 4)
+FIELD(ID_AA64PFR0, RAS, 28, 4)
+FIELD(ID_AA64PFR0, SVE, 32, 4)
+
 QEMU_BUILD_BUG_ON(ARRAY_SIZE(((ARMCPU *)0)->ccsidr) <= R_V7M_CSSELR_INDEX_MASK);
 
 /* If adding a feature bit which corresponds to a Linux ELF
@@ -XXX,XX +XXX,XX @@ enum arm_features {
     ARM_FEATURE_PMU, /* has PMU support */
     ARM_FEATURE_VBAR, /* has cp15 VBAR */
     ARM_FEATURE_M_SECURITY, /* M profile Security Extension */
-    ARM_FEATURE_SVE, /* has Scalable Vector Extension */
     ARM_FEATURE_V8_FP16, /* implements v8.2 half-precision float */
     ARM_FEATURE_M_MAIN, /* M profile Main Extension */
 };
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa64_fcma(const ARMISARegisters *id)
     return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, FCMA) != 0;
 }
 
+static inline bool isar_feature_aa64_sve(const ARMISARegisters *id)
+{
+    return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, SVE) != 0;
+}
+
 /*
  * Forward to the above feature tests given an ARMCPU pointer.
  */
diff --git a/linux-user/aarch64/signal.c b/linux-user/aarch64/signal.c
index XXXXXXX..XXXXXXX 100644
--- a/linux-user/aarch64/signal.c
+++ b/linux-user/aarch64/signal.c
@@ -XXX,XX +XXX,XX @@ static int target_restore_sigframe(CPUARMState *env,
             break;
 
         case TARGET_SVE_MAGIC:
-            if (arm_feature(env, ARM_FEATURE_SVE)) {
+            if (cpu_isar_feature(aa64_sve, arm_env_get_cpu(env))) {
                 vq = (env->vfp.zcr_el[1] & 0xf) + 1;
                 sve_size = QEMU_ALIGN_UP(TARGET_SVE_SIG_CONTEXT_SIZE(vq), 16);
                 if (!sve && size == sve_size) {
@@ -XXX,XX +XXX,XX @@ static void target_setup_frame(int usig, struct target_sigaction *ka,
                                       &layout);
 
     /* SVE state needs saving only if it exists.  */
-    if (arm_feature(env, ARM_FEATURE_SVE)) {
+    if (cpu_isar_feature(aa64_sve, arm_env_get_cpu(env))) {
         vq = (env->vfp.zcr_el[1] & 0xf) + 1;
         sve_size = QEMU_ALIGN_UP(TARGET_SVE_SIG_CONTEXT_SIZE(vq), 16);
         sve_ofs = alloc_sigframe_space(sve_size, &layout);
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index XXXXXXX..XXXXXXX 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -XXX,XX +XXX,XX @@ static uint32_t get_elf_hwcap(void)
     GET_FEATURE_ID(aa64_rdm, ARM_HWCAP_A64_ASIMDRDM);
     GET_FEATURE_ID(aa64_dp, ARM_HWCAP_A64_ASIMDDP);
     GET_FEATURE_ID(aa64_fcma, ARM_HWCAP_A64_FCMA);
-    GET_FEATURE(ARM_FEATURE_SVE, ARM_HWCAP_A64_SVE);
+    GET_FEATURE_ID(aa64_sve, ARM_HWCAP_A64_SVE);
 
 #undef GET_FEATURE
 #undef GET_FEATURE_ID
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index XXXXXXX..XXXXXXX 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -XXX,XX +XXX,XX @@ static abi_long do_syscall1(void *cpu_env, int num, abi_long arg1,
              * even though the current architectural maximum is VQ=16.
              */
             ret = -TARGET_EINVAL;
-            if (arm_feature(cpu_env, ARM_FEATURE_SVE)
+            if (cpu_isar_feature(aa64_sve, arm_env_get_cpu(cpu_env))
                 && arg2 >= 0 && arg2 <= 512 * 16 && !(arg2 & 15)) {
                 CPUARMState *env = cpu_env;
                 ARMCPU *cpu = arm_env_get_cpu(env);
@@ -XXX,XX +XXX,XX @@ static abi_long do_syscall1(void *cpu_env, int num, abi_long arg1,
             return ret;
         case TARGET_PR_SVE_GET_VL:
             ret = -TARGET_EINVAL;
-            if (arm_feature(cpu_env, ARM_FEATURE_SVE)) {
-                CPUARMState *env = cpu_env;
-                ret = ((env->vfp.zcr_el[1] & 0xf) + 1) * 16;
+            {
+                ARMCPU *cpu = arm_env_get_cpu(cpu_env);
+                if (cpu_isar_feature(aa64_sve, cpu)) {
+                    ret = ((cpu->env.vfp.zcr_el[1] & 0xf) + 1) * 16;
+                }
             }
             return ret;
 #endif /* AARCH64 */
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
         t = FIELD_DP64(t, ID_AA64ISAR1, FCMA, 1);
         cpu->isar.id_aa64isar1 = t;
 
+        t = cpu->isar.id_aa64pfr0;
+        t = FIELD_DP64(t, ID_AA64PFR0, SVE, 1);
+        cpu->isar.id_aa64pfr0 = t;
+
         /* Replicate the same data to the 32-bit id registers.  */
         u = cpu->isar.id_isar5;
         u = FIELD_DP32(u, ID_ISAR5, AES, 2); /* AES + PMULL */
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
          * present in either.
          */
         set_feature(&cpu->env, ARM_FEATURE_V8_FP16);
-        set_feature(&cpu->env, ARM_FEATURE_SVE);
         /* For usermode -cpu max we can use a larger and more efficient DCZ
          * blocksize since we don't have to follow what the hardware does.
          */
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
         define_one_arm_cp_reg(cpu, &sctlr);
     }
 
-    if (arm_feature(env, ARM_FEATURE_SVE)) {
+    if (cpu_isar_feature(aa64_sve, cpu)) {
         define_one_arm_cp_reg(cpu, &zcr_el1_reginfo);
         if (arm_feature(env, ARM_FEATURE_EL2)) {
             define_one_arm_cp_reg(cpu, &zcr_el2_reginfo);
@@ -XXX,XX +XXX,XX @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
     uint32_t flags;
 
     if (is_a64(env)) {
+        ARMCPU *cpu = arm_env_get_cpu(env);
+
         *pc = env->pc;
         flags = ARM_TBFLAG_AARCH64_STATE_MASK;
         /* Get control bits for tagged addresses */
         flags |= (arm_regime_tbi0(env, mmu_idx) << ARM_TBFLAG_TBI0_SHIFT);
         flags |= (arm_regime_tbi1(env, mmu_idx) << ARM_TBFLAG_TBI1_SHIFT);
 
-        if (arm_feature(env, ARM_FEATURE_SVE)) {
+        if (cpu_isar_feature(aa64_sve, cpu)) {
             int sve_el = sve_exception_el(env, current_el);
             uint32_t zcr_len;
 
@@ -XXX,XX +XXX,XX @@ void aarch64_sve_narrow_vq(CPUARMState *env, unsigned vq)
 void aarch64_sve_change_el(CPUARMState *env, int old_el,
                            int new_el, bool el0_a64)
 {
+    ARMCPU *cpu = arm_env_get_cpu(env);
     int old_len, new_len;
     bool old_a64, new_a64;
 
     /* Nothing to do if no SVE.  */
-    if (!arm_feature(env, ARM_FEATURE_SVE)) {
+    if (!cpu_isar_feature(aa64_sve, cpu)) {
         return;
     }
 
diff --git a/target/arm/machine.c b/target/arm/machine.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/machine.c
+++ b/target/arm/machine.c
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_iwmmxt = {
 static bool sve_needed(void *opaque)
 {
     ARMCPU *cpu = opaque;
-    CPUARMState *env = &cpu->env;
 
-    return arm_feature(env, ARM_FEATURE_SVE);
+    return cpu_isar_feature(aa64_sve, cpu);
 }
 
 /* The first two words of each Zreg is stored in VFP state.  */
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ void aarch64_cpu_dump_state(CPUState *cs, FILE *f,
     cpu_fprintf(f, "     FPCR=%08x FPSR=%08x\n",
                 vfp_get_fpcr(env), vfp_get_fpsr(env));
 
-    if (arm_feature(env, ARM_FEATURE_SVE) && sve_exception_el(env, el) == 0) {
+    if (cpu_isar_feature(aa64_sve, cpu) && sve_exception_el(env, el) == 0) {
         int j, zcr_len = sve_zcr_len_for_el(env, el);
 
         for (i = 0; i <= FFR_PRED_NUM; i++) {
@@ -XXX,XX +XXX,XX @@ static void disas_a64_insn(CPUARMState *env, DisasContext *s)
         unallocated_encoding(s);
         break;
     case 0x2:
-        if (!arm_dc_feature(s, ARM_FEATURE_SVE) || !disas_sve(s, insn)) {
+        if (!dc_isar_feature(aa64_sve, s) || !disas_sve(s, insn)) {
             unallocated_encoding(s);
         }
         break;
-- 
2.19.1

From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181016223115.24100-9-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h           | 17 +++++++++++++++-
 linux-user/elfload.c       |  6 +-----
 target/arm/cpu64.c         | 16 ++++++++-------
 target/arm/helper.c        |  2 +-
 target/arm/translate-a64.c | 40 +++++++++++++++++++-------------------
 target/arm/translate.c     |  6 +++---
 6 files changed, 50 insertions(+), 37 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ enum arm_features {
     ARM_FEATURE_PMU, /* has PMU support */
     ARM_FEATURE_VBAR, /* has cp15 VBAR */
     ARM_FEATURE_M_SECURITY, /* M profile Security Extension */
-    ARM_FEATURE_V8_FP16, /* implements v8.2 half-precision float */
     ARM_FEATURE_M_MAIN, /* M profile Main Extension */
 };
 
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa32_dp(const ARMISARegisters *id)
     return FIELD_EX32(id->id_isar6, ID_ISAR6, DP) != 0;
 }
 
+static inline bool isar_feature_aa32_fp16_arith(const ARMISARegisters *id)
+{
+    /*
+     * This is a placeholder for use by VCMA until the rest of
+     * the ARMv8.2-FP16 extension is implemented for aa32 mode.
+     * At which point we can properly set and check MVFR1.FPHP.
+     */
+    return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, FP) == 1;
+}
+
 /*
  * 64-bit feature tests via id registers.
  */
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa64_fcma(const ARMISARegisters *id)
     return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, FCMA) != 0;
 }
 
+static inline bool isar_feature_aa64_fp16(const ARMISARegisters *id)
+{
+    /* We always set the AdvSIMD and FP fields identically wrt FP16.  */
+    return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, FP) == 1;
+}
+
 static inline bool isar_feature_aa64_sve(const ARMISARegisters *id)
 {
     return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, SVE) != 0;
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index XXXXXXX..XXXXXXX 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -XXX,XX +XXX,XX @@ static uint32_t get_elf_hwcap(void)
     hwcaps |= ARM_HWCAP_A64_ASIMD;
 
     /* probe for the extra features */
-#define GET_FEATURE(feat, hwcap) \
-    do { if (arm_feature(&cpu->env, feat)) { hwcaps |= hwcap; } } while (0)
 #define GET_FEATURE_ID(feat, hwcap) \
     do { if (cpu_isar_feature(feat, cpu)) { hwcaps |= hwcap; } } while (0)
 
@@ -XXX,XX +XXX,XX @@ static uint32_t get_elf_hwcap(void)
     GET_FEATURE_ID(aa64_sha3, ARM_HWCAP_A64_SHA3);
     GET_FEATURE_ID(aa64_sm3, ARM_HWCAP_A64_SM3);
     GET_FEATURE_ID(aa64_sm4, ARM_HWCAP_A64_SM4);
-    GET_FEATURE(ARM_FEATURE_V8_FP16,
-                ARM_HWCAP_A64_FPHP | ARM_HWCAP_A64_ASIMDHP);
+    GET_FEATURE_ID(aa64_fp16, ARM_HWCAP_A64_FPHP | ARM_HWCAP_A64_ASIMDHP);
     GET_FEATURE_ID(aa64_atomics, ARM_HWCAP_A64_ATOMICS);
     GET_FEATURE_ID(aa64_rdm, ARM_HWCAP_A64_ASIMDRDM);
     GET_FEATURE_ID(aa64_dp, ARM_HWCAP_A64_ASIMDDP);
     GET_FEATURE_ID(aa64_fcma, ARM_HWCAP_A64_FCMA);
     GET_FEATURE_ID(aa64_sve, ARM_HWCAP_A64_SVE);
 
-#undef GET_FEATURE
 #undef GET_FEATURE_ID
 
     return hwcaps;
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
 
         t = cpu->isar.id_aa64pfr0;
         t = FIELD_DP64(t, ID_AA64PFR0, SVE, 1);
+        t = FIELD_DP64(t, ID_AA64PFR0, FP, 1);
+        t = FIELD_DP64(t, ID_AA64PFR0, ADVSIMD, 1);
         cpu->isar.id_aa64pfr0 = t;
 
         /* Replicate the same data to the 32-bit id registers.  */
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
         u = FIELD_DP32(u, ID_ISAR6, DP, 1);
         cpu->isar.id_isar6 = u;
 
-#ifdef CONFIG_USER_ONLY
-        /* We don't set these in system emulation mode for the moment,
-         * since we don't correctly set the ID registers to advertise them,
-         * and in some cases they're only available in AArch64 and not AArch32,
-         * whereas the architecture requires them to be present in both if
-         * present in either.
+        /*
+         * FIXME: We do not yet support ARMv8.2-fp16 for AArch32,
+         * so do not set MVFR1.FPHP.  Strictly speaking this is not legal,
+         * but it is also not legal to enable SVE without support for FP16,
+         * and enabling SVE in system mode is more useful in the short term.
          */
-        set_feature(&cpu->env, ARM_FEATURE_V8_FP16);
+
+#ifdef CONFIG_USER_ONLY
         /* For usermode -cpu max we can use a larger and more efficient DCZ
          * blocksize since we don't have to follow what the hardware does.
          */
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(vfp_set_fpscr)(CPUARMState *env, uint32_t val)
     uint32_t changed;
 
     /* When ARMv8.2-FP16 is not supported, FZ16 is RES0.  */
-    if (!arm_feature(env, ARM_FEATURE_V8_FP16)) {
+    if (!cpu_isar_feature(aa64_fp16, arm_env_get_cpu(env))) {
         val &= ~FPCR_FZ16;
     }
 
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void disas_fp_compare(DisasContext *s, uint32_t insn)
         break;
     case 3:
         size = MO_16;
-        if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+        if (dc_isar_feature(aa64_fp16, s)) {
             break;
         }
         /* fallthru */
@@ -XXX,XX +XXX,XX @@ static void disas_fp_ccomp(DisasContext *s, uint32_t insn)
         break;
     case 3:
         size = MO_16;
-        if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+        if (dc_isar_feature(aa64_fp16, s)) {
             break;
         }
         /* fallthru */
@@ -XXX,XX +XXX,XX @@ static void disas_fp_csel(DisasContext *s, uint32_t insn)
         break;
     case 3:
         sz = MO_16;
-        if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+        if (dc_isar_feature(aa64_fp16, s)) {
             break;
         }
         /* fallthru */
@@ -XXX,XX +XXX,XX @@ static void disas_fp_1src(DisasContext *s, uint32_t insn)
             handle_fp_1src_double(s, opcode, rd, rn);
             break;
         case 3:
-            if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+            if (!dc_isar_feature(aa64_fp16, s)) {
                 unallocated_encoding(s);
                 return;
             }
@@ -XXX,XX +XXX,XX @@ static void disas_fp_2src(DisasContext *s, uint32_t insn)
         handle_fp_2src_double(s, opcode, rd, rn, rm);
         break;
     case 3:
-        if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+        if (!dc_isar_feature(aa64_fp16, s)) {
             unallocated_encoding(s);
             return;
         }
@@ -XXX,XX +XXX,XX @@ static void disas_fp_3src(DisasContext *s, uint32_t insn)
         handle_fp_3src_double(s, o0, o1, rd, rn, rm, ra);
         break;
     case 3:
-        if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+        if (!dc_isar_feature(aa64_fp16, s)) {
             unallocated_encoding(s);
             return;
         }
@@ -XXX,XX +XXX,XX @@ static void disas_fp_imm(DisasContext *s, uint32_t insn)
         break;
     case 3:
         sz = MO_16;
-        if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+        if (dc_isar_feature(aa64_fp16, s)) {
             break;
         }
         /* fallthru */
@@ -XXX,XX +XXX,XX @@ static void disas_fp_fixed_conv(DisasContext *s, uint32_t insn)
     case 1: /* float64 */
         break;
     case 3: /* float16 */
-        if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+        if (dc_isar_feature(aa64_fp16, s)) {
             break;
         }
         /* fallthru */
@@ -XXX,XX +XXX,XX @@ static void disas_fp_int_conv(DisasContext *s, uint32_t insn)
             break;
         case 0x6: /* 16-bit float, 32-bit int */
         case 0xe: /* 16-bit float, 64-bit int */
-            if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+            if (dc_isar_feature(aa64_fp16, s)) {
                 break;
             }
             /* fallthru */
@@ -XXX,XX +XXX,XX @@ static void disas_fp_int_conv(DisasContext *s, uint32_t insn)
         case 1: /* float64 */
             break;
         case 3: /* float16 */
-            if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+            if (dc_isar_feature(aa64_fp16, s)) {
                 break;
             }
             /* fallthru */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_across_lanes(DisasContext *s, uint32_t insn)
          */
         is_min = extract32(size, 1, 1);
         is_fp = true;
-        if (!is_u && arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+        if (!is_u && dc_isar_feature(aa64_fp16, s)) {
             size = 1;
         } else if (!is_u || !is_q || extract32(size, 0, 1)) {
             unallocated_encoding(s);
@@ -XXX,XX +XXX,XX @@ static void disas_simd_mod_imm(DisasContext *s, uint32_t insn)
 
     if (o2 != 0 || ((cmode == 0xf) && is_neg && !is_q)) {
         /* Check for FMOV (vector, immediate) - half-precision */
-        if (!(arm_dc_feature(s, ARM_FEATURE_V8_FP16) && o2 && cmode == 0xf)) {
+        if (!(dc_isar_feature(aa64_fp16, s) && o2 && cmode == 0xf)) {
             unallocated_encoding(s);
             return;
         }
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn)
     case 0x2f: /* FMINP */
         /* FP op, size[0] is 32 or 64 bit*/
         if (!u) {
-            if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+            if (!dc_isar_feature(aa64_fp16, s)) {
                 unallocated_encoding(s);
                 return;
             } else {
@@ -XXX,XX +XXX,XX @@ static void handle_simd_shift_intfp_conv(DisasContext *s, bool is_scalar,
         size = MO_32;
     } else if (immh & 2) {
         size = MO_16;
-        if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+        if (!dc_isar_feature(aa64_fp16, s)) {
             unallocated_encoding(s);
             return;
         }
@@ -XXX,XX +XXX,XX @@ static void handle_simd_shift_fpint_conv(DisasContext *s, bool is_scalar,
         size = MO_32;
     } else if (immh & 0x2) {
         size = MO_16;
-        if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+        if (!dc_isar_feature(aa64_fp16, s)) {
             unallocated_encoding(s);
             return;
         }
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_same_fp16(DisasContext *s,
         return;
     }
 
-    if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+    if (!dc_isar_feature(aa64_fp16, s)) {
         unallocated_encoding(s);
     }
 
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
     TCGv_ptr fpst;
     bool pairwise = false;
 
-    if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+    if (!dc_isar_feature(aa64_fp16, s)) {
         unallocated_encoding(s);
         return;
     }
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn)
     case 0x1c: /* FCADD, #90 */
     case 0x1e: /* FCADD, #270 */
         if (size == 0
-            || (size == 1 && !arm_dc_feature(s, ARM_FEATURE_V8_FP16))
+            || (size == 1 && !dc_isar_feature(aa64_fp16, s))
             || (size == 3 && !is_q)) {
             unallocated_encoding(s);
             return;
@@ -XXX,XX +XXX,XX @@ static void disas_simd_two_reg_misc_fp16(DisasContext *s, uint32_t insn)
     bool need_fpst = true;
     int rmode;
 
-    if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+    if (!dc_isar_feature(aa64_fp16, s)) {
         unallocated_encoding(s);
         return;
     }
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
         }
         break;
     }
-    if (is_fp16 && !arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+    if (is_fp16 && !dc_isar_feature(aa64_fp16, s)) {
         unallocated_encoding(s);
         return;
     }
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
         int size = extract32(insn, 20, 1);
         data = extract32(insn, 23, 2); /* rot */
         if (!dc_isar_feature(aa32_vcma, s)
-            || (!size && !arm_dc_feature(s, ARM_FEATURE_V8_FP16))) {
+            || (!size && !dc_isar_feature(aa32_fp16_arith, s))) {
             return 1;
         }
         fn_gvec_ptr = size ? gen_helper_gvec_fcmlas : gen_helper_gvec_fcmlah;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
         int size = extract32(insn, 20, 1);
         data = extract32(insn, 24, 1); /* rot */
         if (!dc_isar_feature(aa32_vcma, s)
-            || (!size && !arm_dc_feature(s, ARM_FEATURE_V8_FP16))) {
+            || (!size && !dc_isar_feature(aa32_fp16_arith, s))) {
             return 1;
         }
         fn_gvec_ptr = size ? gen_helper_gvec_fcadds : gen_helper_gvec_fcaddh;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
             return 1;
         }
         if (size == 0) {
-            if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+            if (!dc_isar_feature(aa32_fp16_arith, s)) {
                 return 1;
             }
             /* For fp16, rm is just Vm, and index is M.  */
-- 
2.19.1

For AArch32, exception return happens through certain kinds
of CPSR write. We don't currently have any CPU_LOG_INT logging
of these events (unlike AArch64, where we log in the ERET
instruction). Add some suitable logging.

This will log exception returns like this:
Exception return from AArch32 hyp to usr PC 0x80100374

paralleling the existing logging in the exception_return
helper for AArch64 exception returns:
Exception return from AArch64 EL2 to AArch64 EL0 PC 0x8003045c
Exception return from AArch64 EL2 to AArch32 EL0 PC 0x8003045c

(Note that an AArch32 exception return can only be
AArch32->AArch32, never to AArch64.)
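
As an illustration of the log format described above, here is a minimal
standalone sketch. It is not the QEMU code: mode_name() mirrors the
lookup table this patch adds, while format_exception_return() is a
hypothetical helper that writes into a buffer instead of calling
qemu_log_mask().

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Printable AArch32 mode name, indexed by the low four bits of the PSR. */
static const char *mode_name(uint32_t psr)
{
    static const char names[16][4] = {
        "usr", "fiq", "irq", "svc", "???", "???", "mon", "abt",
        "???", "???", "hyp", "und", "???", "???", "???", "sys"
    };

    return names[psr & 0xf];
}

/*
 * Hypothetical helper (not part of QEMU): format the exception-return
 * log line into buf. Returns the snprintf() result.
 */
static int format_exception_return(char *buf, size_t len,
                                   uint32_t old_psr, uint32_t new_psr,
                                   uint32_t pc)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len,
                    "Exception return from AArch32 %s to %s PC 0x%" PRIx32 "\n",
                    mode_name(old_psr), mode_name(new_psr), pc);
}
```

With old_psr = 0x1a (hyp), new_psr = 0x10 (usr) and pc = 0x80100374 the
helper produces the example line quoted above.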

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181012144235.19646-2-peter.maydell@linaro.org
---
 target/arm/internals.h | 18 ++++++++++++++++++
 target/arm/helper.c    | 10 ++++++++++
 target/arm/translate.c |  7 +------
 3 files changed, 29 insertions(+), 6 deletions(-)

diff --git a/target/arm/internals.h b/target/arm/internals.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -XXX,XX +XXX,XX @@ static inline uint32_t v7m_sp_limit(CPUARMState *env)
     }
 }
 
+/**
+ * aarch32_mode_name(): Return name of the AArch32 CPU mode
+ * @psr: Program Status Register indicating CPU mode
+ *
+ * Returns, for debug logging purposes, a printable representation
+ * of the AArch32 CPU mode ("svc", "usr", etc) as indicated by
+ * the low bits of the specified PSR.
+ */
+static inline const char *aarch32_mode_name(uint32_t psr)
+{
+    static const char cpu_mode_names[16][4] = {
+        "usr", "fiq", "irq", "svc", "???", "???", "mon", "abt",
+        "???", "???", "hyp", "und", "???", "???", "???", "sys"
+    };
+
+    return cpu_mode_names[psr & 0xf];
+}
+
 #endif
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ void cpsr_write(CPUARMState *env, uint32_t val, uint32_t mask,
                 mask |= CPSR_IL;
                 val |= CPSR_IL;
             }
+            qemu_log_mask(LOG_GUEST_ERROR,
+                          "Illegal AArch32 mode switch attempt from %s to %s\n",
+                          aarch32_mode_name(env->uncached_cpsr),
+                          aarch32_mode_name(val));
         } else {
+            qemu_log_mask(CPU_LOG_INT, "%s %s to %s PC 0x%" PRIx32 "\n",
+                          write_type == CPSRWriteExceptionReturn ?
+                          "Exception return from AArch32" :
+                          "AArch32 mode switch from",
+                          aarch32_mode_name(env->uncached_cpsr),
+                          aarch32_mode_name(val), env->regs[15]);
             switch_mode(env, val & CPSR_M);
         }
     }
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ void gen_intermediate_code(CPUState *cpu, TranslationBlock *tb)
     translator_loop(ops, &dc.base, cpu, tb);
 }
 
-static const char *cpu_mode_names[16] = {
-  "usr", "fiq", "irq", "svc", "???", "???", "mon", "abt",
-  "???", "???", "hyp", "und", "???", "???", "???", "sys"
-};
-
 void arm_cpu_dump_state(CPUState *cs, FILE *f, fprintf_function cpu_fprintf,
                         int flags)
 {
@@ -XXX,XX +XXX,XX @@ void arm_cpu_dump_state(CPUState *cs, FILE *f, fprintf_function cpu_fprintf,
                     psr & CPSR_V ? 'V' : '-',
                     psr & CPSR_T ? 'T' : 'A',
                     ns_status,
-                    cpu_mode_names[psr & 0xf], (psr & 0x10) ? 32 : 26);
+                    aarch32_mode_name(psr), (psr & 0x10) ? 32 : 26);
     }
 
     if (flags & CPU_DUMP_FPU) {
-- 
2.19.1

The switch_mode() function is defined in target/arm/helper.c and used
only in that file and nowhere else, so we can make it file-local
rather than global.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181012144235.19646-3-peter.maydell@linaro.org
---
 target/arm/internals.h | 1 -
 target/arm/helper.c    | 6 ++++--
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/target/arm/internals.h b/target/arm/internals.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -XXX,XX +XXX,XX @@ static inline int bank_number(int mode)
     g_assert_not_reached();
 }
 
-void switch_mode(CPUARMState *, int);
 void arm_cpu_register_gdb_regs_for_features(ARMCPU *cpu);
 void arm_translate_init(void);
 
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void v8m_security_lookup(CPUARMState *env, uint32_t address,
                                 V8M_SAttributes *sattrs);
 #endif
 
+static void switch_mode(CPUARMState *env, int mode);
+
 static int vfp_gdb_get_reg(CPUARMState *env, uint8_t *buf, int reg)
 {
     int nregs;
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(v7m_tt)(CPUARMState *env, uint32_t addr, uint32_t op)
     return 0;
 }
 
-void switch_mode(CPUARMState *env, int mode)
+static void switch_mode(CPUARMState *env, int mode)
 {
     ARMCPU *cpu = arm_env_get_cpu(env);
 
@@ -XXX,XX +XXX,XX @@ void aarch64_sync_64_to_32(CPUARMState *env)
 
 #else
 
-void switch_mode(CPUARMState *env, int mode)
+static void switch_mode(CPUARMState *env, int mode)
 {
     int old_mode;
     int i;
-- 
2.19.1

The HCR.FB virtualization configuration register bit requests that
TLB maintenance, branch predictor invalidate-all and icache
invalidate-all operations performed in NS EL1 should be upgraded
from "local CPU only" to "broadcast within Inner Shareable domain".
For QEMU we NOP the branch predictor and icache operations, so
we only need to upgrade the TLB invalidates:
 AArch32 TLBIALL, TLBIMVA, TLBIASID, DTLBIALL, DTLBIMVA, DTLBIASID,
         ITLBIALL, ITLBIMVA, ITLBIASID, TLBIMVAA, TLBIMVAL, TLBIMVAAL
 AArch64 TLBI VMALLE1, TLBI VAE1, TLBI ASIDE1, TLBI VAAE1,
         TLBI VALE1, TLBI VAALE1
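
The upgrade rule can be sketched as a small standalone model. This is
an illustration only, following the "NS EL1" wording above; ModelCpu,
FlushScope and tlbi_scope() are hypothetical names, not QEMU APIs, and
the bit position assumes HCR_EL2.FB is bit 9.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_HCR_FB (1ULL << 9)   /* assumed position of HCR_EL2.FB */

/* Hypothetical, simplified CPU state for this sketch. */
typedef struct {
    uint64_t hcr_el2;
    int current_el;
    bool secure;
} ModelCpu;

typedef enum {
    SCOPE_LOCAL,            /* invalidate on the issuing CPU only */
    SCOPE_INNER_SHAREABLE   /* broadcast to all CPUs in the domain */
} FlushScope;

/* Broadcast is forced only at Non-secure EL1 with HCR_EL2.FB set. */
static bool force_broadcast(const ModelCpu *cpu)
{
    assert(cpu != NULL);
    return (cpu->hcr_el2 & MODEL_HCR_FB) &&
        cpu->current_el == 1 && !cpu->secure;
}

/* Scope of a TLB invalidate; is_variant marks the explicit IS forms. */
static FlushScope tlbi_scope(const ModelCpu *cpu, bool is_variant)
{
    if (is_variant || force_broadcast(cpu)) {
        return SCOPE_INNER_SHAREABLE;
    }
    return SCOPE_LOCAL;
}
```

In this model a non-IS invalidate at NS EL1 with FB set is treated
exactly like its IS counterpart, which is why the patch can forward to
the existing *_is_write functions.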

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181012144235.19646-4-peter.maydell@linaro.org
---
 target/arm/helper.c | 191 +++++++++++++++++++++++++++-----------------
 1 file changed, 116 insertions(+), 75 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void contextidr_write(CPUARMState *env, const ARMCPRegInfo *ri,
     raw_write(env, ri, value);
 }
 
-static void tlbiall_write(CPUARMState *env, const ARMCPRegInfo *ri,
-                          uint64_t value)
-{
-    /* Invalidate all (TLBIALL) */
-    ARMCPU *cpu = arm_env_get_cpu(env);
-
-    tlb_flush(CPU(cpu));
-}
-
-static void tlbimva_write(CPUARMState *env, const ARMCPRegInfo *ri,
-                          uint64_t value)
-{
-    /* Invalidate single TLB entry by MVA and ASID (TLBIMVA) */
-    ARMCPU *cpu = arm_env_get_cpu(env);
-
-    tlb_flush_page(CPU(cpu), value & TARGET_PAGE_MASK);
-}
-
-static void tlbiasid_write(CPUARMState *env, const ARMCPRegInfo *ri,
-                           uint64_t value)
-{
-    /* Invalidate by ASID (TLBIASID) */
-    ARMCPU *cpu = arm_env_get_cpu(env);
-
-    tlb_flush(CPU(cpu));
-}
-
-static void tlbimvaa_write(CPUARMState *env, const ARMCPRegInfo *ri,
-                           uint64_t value)
-{
-    /* Invalidate single entry by MVA, all ASIDs (TLBIMVAA) */
-    ARMCPU *cpu = arm_env_get_cpu(env);
-
-    tlb_flush_page(CPU(cpu), value & TARGET_PAGE_MASK);
-}
-
 /* IS variants of TLB operations must affect all cores */
 static void tlbiall_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
                              uint64_t value)
@@ -XXX,XX +XXX,XX @@ static void tlbimvaa_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
     tlb_flush_page_all_cpus_synced(cs, value & TARGET_PAGE_MASK);
 }
 
+/*
+ * Non-IS variants of TLB operations are upgraded to
+ * IS versions if we are at NS EL1 and HCR_EL2.FB is set to
+ * force broadcast of these operations.
+ */
+static bool tlb_force_broadcast(CPUARMState *env)
+{
+    return (env->cp15.hcr_el2 & HCR_FB) &&
+        arm_current_el(env) == 1 && !arm_is_secure_below_el3(env);
+}
+
+static void tlbiall_write(CPUARMState *env, const ARMCPRegInfo *ri,
+                          uint64_t value)
+{
+    /* Invalidate all (TLBIALL) */
+    ARMCPU *cpu = arm_env_get_cpu(env);
+
+    if (tlb_force_broadcast(env)) {
+        tlbiall_is_write(env, NULL, value);
+        return;
+    }
+
+    tlb_flush(CPU(cpu));
+}
+
+static void tlbimva_write(CPUARMState *env, const ARMCPRegInfo *ri,
+                          uint64_t value)
+{
+    /* Invalidate single TLB entry by MVA and ASID (TLBIMVA) */
+    ARMCPU *cpu = arm_env_get_cpu(env);
+
+    if (tlb_force_broadcast(env)) {
+        tlbimva_is_write(env, NULL, value);
+        return;
+    }
+
+    tlb_flush_page(CPU(cpu), value & TARGET_PAGE_MASK);
+}
+
+static void tlbiasid_write(CPUARMState *env, const ARMCPRegInfo *ri,
+                           uint64_t value)
+{
+    /* Invalidate by ASID (TLBIASID) */
+    ARMCPU *cpu = arm_env_get_cpu(env);
+
+    if (tlb_force_broadcast(env)) {
+        tlbiasid_is_write(env, NULL, value);
+        return;
+    }
+
+    tlb_flush(CPU(cpu));
+}
+
+static void tlbimvaa_write(CPUARMState *env, const ARMCPRegInfo *ri,
+                           uint64_t value)
+{
+    /* Invalidate single entry by MVA, all ASIDs (TLBIMVAA) */
+    ARMCPU *cpu = arm_env_get_cpu(env);
+
+    if (tlb_force_broadcast(env)) {
+        tlbimvaa_is_write(env, NULL, value);
+        return;
+    }
+
+    tlb_flush_page(CPU(cpu), value & TARGET_PAGE_MASK);
+}
+
 static void tlbiall_nsnh_write(CPUARMState *env, const ARMCPRegInfo *ri,
                                uint64_t value)
 {
@@ -XXX,XX +XXX,XX @@ static CPAccessResult aa64_cacheop_access(CPUARMState *env,
  * Page D4-1736 (DDI0487A.b)
  */
 
-static void tlbi_aa64_vmalle1_write(CPUARMState *env, const ARMCPRegInfo *ri,
-                                    uint64_t value)
-{
-    CPUState *cs = ENV_GET_CPU(env);
-
-    if (arm_is_secure_below_el3(env)) {
-        tlb_flush_by_mmuidx(cs,
-                            ARMMMUIdxBit_S1SE1 |
-                            ARMMMUIdxBit_S1SE0);
-    } else {
-        tlb_flush_by_mmuidx(cs,
-                            ARMMMUIdxBit_S12NSE1 |
-                            ARMMMUIdxBit_S12NSE0);
-    }
-}
-
 static void tlbi_aa64_vmalle1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
                                       uint64_t value)
 {
@@ -XXX,XX +XXX,XX @@ static void tlbi_aa64_vmalle1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
     }
 }
 
+static void tlbi_aa64_vmalle1_write(CPUARMState *env, const ARMCPRegInfo *ri,
+                                    uint64_t value)
+{
+    CPUState *cs = ENV_GET_CPU(env);
+
+    if (tlb_force_broadcast(env)) {
+        tlbi_aa64_vmalle1is_write(env, NULL, value);
+        return;
+    }
+
+    if (arm_is_secure_below_el3(env)) {
+        tlb_flush_by_mmuidx(cs,
+                            ARMMMUIdxBit_S1SE1 |
+                            ARMMMUIdxBit_S1SE0);
+    } else {
+        tlb_flush_by_mmuidx(cs,
+                            ARMMMUIdxBit_S12NSE1 |
+                            ARMMMUIdxBit_S12NSE0);
+    }
+}
+
 static void tlbi_aa64_alle1_write(CPUARMState *env, const ARMCPRegInfo *ri,
                                   uint64_t value)
 {
@@ -XXX,XX +XXX,XX @@ static void tlbi_aa64_alle3is_write(CPUARMState *env, const ARMCPRegInfo *ri,
     tlb_flush_by_mmuidx_all_cpus_synced(cs, ARMMMUIdxBit_S1E3);
 }
 
-static void tlbi_aa64_vae1_write(CPUARMState *env, const ARMCPRegInfo *ri,
-                                 uint64_t value)
-{
-    /* Invalidate by VA, EL1&0 (AArch64 version).
-     * Currently handles all of VAE1, VAAE1, VAALE1 and VALE1,
-     * since we don't support flush-for-specific-ASID-only or
-     * flush-last-level-only.
-     */
-    ARMCPU *cpu = arm_env_get_cpu(env);
-    CPUState *cs = CPU(cpu);
-    uint64_t pageaddr = sextract64(value << 12, 0, 56);
-
-    if (arm_is_secure_below_el3(env)) {
-        tlb_flush_page_by_mmuidx(cs, pageaddr,
-                                 ARMMMUIdxBit_S1SE1 |
-                                 ARMMMUIdxBit_S1SE0);
-    } else {
-        tlb_flush_page_by_mmuidx(cs, pageaddr,
-                                 ARMMMUIdxBit_S12NSE1 |
-                                 ARMMMUIdxBit_S12NSE0);
-    }
-}
-
 static void tlbi_aa64_vae2_write(CPUARMState *env, const ARMCPRegInfo *ri,
                                  uint64_t value)
 {
@@ -XXX,XX +XXX,XX @@ static void tlbi_aa64_vae1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
     }
 }
 
+static void tlbi_aa64_vae1_write(CPUARMState *env, const ARMCPRegInfo *ri,
+                                 uint64_t value)
+{
+    /* Invalidate by VA, EL1&0 (AArch64 version).
+     * Currently handles all of VAE1, VAAE1, VAALE1 and VALE1,
+     * since we don't support flush-for-specific-ASID-only or
+     * flush-last-level-only.
+     */
+    ARMCPU *cpu = arm_env_get_cpu(env);
+    CPUState *cs = CPU(cpu);
+    uint64_t pageaddr = sextract64(value << 12, 0, 56);
+
+    if (tlb_force_broadcast(env)) {
+        tlbi_aa64_vae1is_write(env, NULL, value);
+        return;
+    }
+
+    if (arm_is_secure_below_el3(env)) {
+        tlb_flush_page_by_mmuidx(cs, pageaddr,
+                                 ARMMMUIdxBit_S1SE1 |
+                                 ARMMMUIdxBit_S1SE0);
+    } else {
+        tlb_flush_page_by_mmuidx(cs, pageaddr,
+                                 ARMMMUIdxBit_S12NSE1 |
+                                 ARMMMUIdxBit_S12NSE0);
+    }
+}
+
 static void tlbi_aa64_vae2is_write(CPUARMState *env, const ARMCPRegInfo *ri,
                                    uint64_t value)
 {
-- 
2.19.1

The HCR.DC virtualization configuration register bit has the
following effects:
 * SCTLR.M behaves as if it is 0 for all purposes except
   direct reads of the bit
 * HCR.VM behaves as if it is 1 for all purposes except
   direct reads of the bit
 * the memory type produced by the first stage of the EL1&EL0
   translation regime is Normal Non-Shareable,
   Inner Write-Back Read-Allocate Write-Allocate,
   Outer Write-Back Read-Allocate Write-Allocate.

Implement this behaviour.
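
As a rough illustration of the first two effects, here is a minimal standalone sketch. It is not the QEMU code: the helper names are hypothetical, the bit positions are assumptions taken from my reading of the architecture, and the HCR.TGE handling present in the real function is omitted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions; check against the Arm ARM before relying on them. */
#define HCR_VM  (1ULL << 0)
#define HCR_DC  (1ULL << 12)
#define SCTLR_M (1ULL << 0)

/* Hypothetical helper: stage 2 is off only if neither VM nor DC is set. */
static bool stage2_disabled(uint64_t hcr_el2)
{
    return (hcr_el2 & (HCR_VM | HCR_DC)) == 0;
}

/* Hypothetical helper: DC makes SCTLR_EL1.M behave as 0 for EL1&0 stage 1. */
static bool stage1_el10_disabled(uint64_t hcr_el2, uint64_t sctlr_el1)
{
    if (hcr_el2 & HCR_DC) {
        return true;
    }
    return (sctlr_el1 & SCTLR_M) == 0;
}
```

With DC set, stage 1 reports as disabled even when SCTLR_EL1.M is 1, and stage 2 reports as enabled even when VM is 0.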

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181012144235.19646-5-peter.maydell@linaro.org
---
 target/arm/helper.c | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static uint64_t do_ats_write(CPUARMState *env, uint64_t value,
          * * The Non-secure TTBCR.EAE bit is set to 1
          * * The implementation includes EL2, and the value of HCR.VM is 1
          *
+         * (Note that HCR.DC makes HCR.VM behave as if it is 1.)
+         *
          * ATS1Hx always uses the 64bit format (not supported yet).
          */
         format64 = arm_s1_regime_using_lpae_format(env, mmu_idx);
 
         if (arm_feature(env, ARM_FEATURE_EL2)) {
             if (mmu_idx == ARMMMUIdx_S12NSE0 || mmu_idx == ARMMMUIdx_S12NSE1) {
-                format64 |= env->cp15.hcr_el2 & HCR_VM;
+                format64 |= env->cp15.hcr_el2 & (HCR_VM | HCR_DC);
             } else {
                 format64 |= arm_current_el(env) == 2;
             }
@@ -XXX,XX +XXX,XX @@ static inline bool regime_translation_disabled(CPUARMState *env,
     }
 
     if (mmu_idx == ARMMMUIdx_S2NS) {
-        return (env->cp15.hcr_el2 & HCR_VM) == 0;
+        /* HCR.DC means HCR.VM behaves as 1 */
+        return (env->cp15.hcr_el2 & (HCR_DC | HCR_VM)) == 0;
     }
 
     if (env->cp15.hcr_el2 & HCR_TGE) {
@@ -XXX,XX +XXX,XX @@ static inline bool regime_translation_disabled(CPUARMState *env,
         }
     }
 
+    if ((env->cp15.hcr_el2 & HCR_DC) &&
+        (mmu_idx == ARMMMUIdx_S1NSE0 || mmu_idx == ARMMMUIdx_S1NSE1)) {
+        /* HCR.DC means SCTLR_EL1.M behaves as 0 */
+        return true;
+    }
+
     return (regime_sctlr(env, mmu_idx) & SCTLR_M) == 0;
 }
 
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr(CPUARMState *env, target_ulong address,
 
             /* Combine the S1 and S2 cache attributes, if needed */
             if (!ret && cacheattrs != NULL) {
+                if (env->cp15.hcr_el2 & HCR_DC) {
+                    /*
+                     * HCR.DC forces the first stage attributes to
+                     *  Normal Non-Shareable,
+                     *  Inner Write-Back Read-Allocate Write-Allocate,
+                     *  Outer Write-Back Read-Allocate Write-Allocate.
+                     */
+                    cacheattrs->attrs = 0xff;
+                    cacheattrs->shareability = 0;
+                }
                 *cacheattrs = combine_cacheattrs(*cacheattrs, cacheattrs2);
             }
 
-- 
2.19.1

The A/I/F bits in ISR_EL1 should track the virtual interrupt
status, not the physical interrupt status, if the associated
HCR_EL2.AMO/IMO/FMO bit is set. Implement this, rather than
always showing the physical interrupt status.

We don't currently implement anything to do with external
aborts, so this applies only to the I and F bits (though it
ought to be possible for the outer guest to present a virtual
external abort to the inner guest, even if QEMU doesn't
emulate physical external aborts, so there is missing
functionality in this area).
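
The selection rule for the I and F bits can be sketched in isolation as follows. This is an illustrative model only: the PEND_* flags and the function name are hypothetical stand-ins for cs->interrupt_request and the real read function, and the CPSR bit positions are stated as assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed CPSR bit positions. */
#define CPSR_F (1U << 6)
#define CPSR_I (1U << 7)

/* Hypothetical pending-interrupt flags for this sketch. */
#define PEND_IRQ  (1U << 0)
#define PEND_FIQ  (1U << 1)
#define PEND_VIRQ (1U << 2)
#define PEND_VFIQ (1U << 3)

/* Report virtual status when the matching IMO/FMO bit is set, else physical. */
static uint32_t isr_value(unsigned pending, bool imo, bool fmo)
{
    uint32_t ret = 0;

    if (pending & (imo ? PEND_VIRQ : PEND_IRQ)) {
        ret |= CPSR_I;
    }
    if (pending & (fmo ? PEND_VFIQ : PEND_FIQ)) {
        ret |= CPSR_F;
    }
    /* The A bit is never set in this sketch (no external aborts). */
    return ret;
}
```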

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181012144235.19646-6-peter.maydell@linaro.org
---
 target/arm/helper.c | 22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static uint64_t isr_read(CPUARMState *env, const ARMCPRegInfo *ri)
     CPUState *cs = ENV_GET_CPU(env);
     uint64_t ret = 0;
 
-    if (cs->interrupt_request & CPU_INTERRUPT_HARD) {
-        ret |= CPSR_I;
+    if (arm_hcr_el2_imo(env)) {
+        if (cs->interrupt_request & CPU_INTERRUPT_VIRQ) {
+            ret |= CPSR_I;
+        }
+    } else {
+        if (cs->interrupt_request & CPU_INTERRUPT_HARD) {
+            ret |= CPSR_I;
+        }
     }
-    if (cs->interrupt_request & CPU_INTERRUPT_FIQ) {
-        ret |= CPSR_F;
+
+    if (arm_hcr_el2_fmo(env)) {
+        if (cs->interrupt_request & CPU_INTERRUPT_VFIQ) {
+            ret |= CPSR_F;
+        }
+    } else {
+        if (cs->interrupt_request & CPU_INTERRUPT_FIQ) {
+            ret |= CPSR_F;
+        }
     }
+
     /* External aborts are not possible in QEMU so A bit is always clear */
     return ret;
 }
-- 
2.19.1

The HCR_EL2 VI and VF bits are supposed to track whether there is
a pending virtual IRQ or virtual FIQ. For QEMU we store the
pending VIRQ/VFIQ status in cs->interrupt_request, so this means:
 * if the register is read we must get these bit values from
   cs->interrupt_request
 * if the register is written then we must write the bit
   values back into cs->interrupt_request
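
A toy model of that read/write split, assuming VF is bit 6 and VI is bit 7 of the register. The struct and function names are hypothetical, and the locking requirement of the real code is not modelled here.

```c
#include <stdint.h>

/* Assumed bit positions. */
#define HCR_VF (1ULL << 6)
#define HCR_VI (1ULL << 7)

/* Hypothetical pending flags standing in for cs->interrupt_request. */
#define PEND_VIRQ (1U << 0)
#define PEND_VFIQ (1U << 1)

struct toy_cpu {
    uint64_t hcr_stored;   /* register value with VI and VF stripped */
    unsigned pending;      /* where VI and VF actually live */
};

/* On write, move VI/VF into the pending flags and store the rest. */
static void toy_hcr_write(struct toy_cpu *c, uint64_t value)
{
    c->pending &= ~(PEND_VIRQ | PEND_VFIQ);
    if (value & HCR_VI) {
        c->pending |= PEND_VIRQ;
    }
    if (value & HCR_VF) {
        c->pending |= PEND_VFIQ;
    }
    c->hcr_stored = value & ~(HCR_VI | HCR_VF);
}

/* On read, rebuild VI/VF from the pending flags. */
static uint64_t toy_hcr_read(const struct toy_cpu *c)
{
    uint64_t ret = c->hcr_stored & ~(HCR_VI | HCR_VF);

    if (c->pending & PEND_VIRQ) {
        ret |= HCR_VI;
    }
    if (c->pending & PEND_VFIQ) {
        ret |= HCR_VF;
    }
    return ret;
}
```

A value written with VI set reads back with VI set, even though the stored copy never contains the bit.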

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181012144235.19646-7-peter.maydell@linaro.org
---
 target/arm/helper.c | 47 +++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 43 insertions(+), 4 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo el3_no_el2_v8_cp_reginfo[] = {
 static void hcr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
 {
     ARMCPU *cpu = arm_env_get_cpu(env);
+    CPUState *cs = ENV_GET_CPU(env);
     uint64_t valid_mask = HCR_MASK;
 
     if (arm_feature(env, ARM_FEATURE_EL3)) {
@@ -XXX,XX +XXX,XX @@ static void hcr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
     /* Clear RES0 bits.  */
     value &= valid_mask;
 
+    /*
+     * VI and VF are kept in cs->interrupt_request. Modifying that
+     * requires that we have the iothread lock, which is done by
+     * marking the reginfo structs as ARM_CP_IO.
+     * Note that if a write to HCR pends a VIRQ or VFIQ it is never
+     * possible for it to be taken immediately, because VIRQ and
+     * VFIQ are masked unless running at EL0 or EL1, and HCR
+     * can only be written at EL2.
+     */
+    g_assert(qemu_mutex_iothread_locked());
+    if (value & HCR_VI) {
+        cs->interrupt_request |= CPU_INTERRUPT_VIRQ;
+    } else {
+        cs->interrupt_request &= ~CPU_INTERRUPT_VIRQ;
+    }
+    if (value & HCR_VF) {
+        cs->interrupt_request |= CPU_INTERRUPT_VFIQ;
+    } else {
+        cs->interrupt_request &= ~CPU_INTERRUPT_VFIQ;
+    }
+    value &= ~(HCR_VI | HCR_VF);
+
     /* These bits change the MMU setup:
      * HCR_VM enables stage 2 translation
      * HCR_PTW forbids certain page-table setups
@@ -XXX,XX +XXX,XX @@ static void hcr_writelow(CPUARMState *env, const ARMCPRegInfo *ri,
     hcr_write(env, NULL, value);
 }
 
+static uint64_t hcr_read(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+    /* The VI and VF bits live in cs->interrupt_request */
+    uint64_t ret = env->cp15.hcr_el2 & ~(HCR_VI | HCR_VF);
+    CPUState *cs = ENV_GET_CPU(env);
+
+    if (cs->interrupt_request & CPU_INTERRUPT_VIRQ) {
+        ret |= HCR_VI;
+    }
+    if (cs->interrupt_request & CPU_INTERRUPT_VFIQ) {
+        ret |= HCR_VF;
+    }
+    return ret;
+}
+
 static const ARMCPRegInfo el2_cp_reginfo[] = {
     { .name = "HCR_EL2", .state = ARM_CP_STATE_AA64,
+      .type = ARM_CP_IO,
       .opc0 = 3, .opc1 = 4, .crn = 1, .crm = 1, .opc2 = 0,
       .access = PL2_RW, .fieldoffset = offsetof(CPUARMState, cp15.hcr_el2),
-      .writefn = hcr_write },
+      .writefn = hcr_write, .readfn = hcr_read },
     { .name = "HCR", .state = ARM_CP_STATE_AA32,
-      .type = ARM_CP_ALIAS,
+      .type = ARM_CP_ALIAS | ARM_CP_IO,
       .cp = 15, .opc1 = 4, .crn = 1, .crm = 1, .opc2 = 0,
       .access = PL2_RW, .fieldoffset = offsetof(CPUARMState, cp15.hcr_el2),
-      .writefn = hcr_writelow },
+      .writefn = hcr_writelow, .readfn = hcr_read },
     { .name = "ELR_EL2", .state = ARM_CP_STATE_AA64,
       .type = ARM_CP_ALIAS,
       .opc0 = 3, .opc1 = 4, .crn = 4, .crm = 0, .opc2 = 1,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo el2_cp_reginfo[] = {
 
 static const ARMCPRegInfo el2_v8_cp_reginfo[] = {
     { .name = "HCR2", .state = ARM_CP_STATE_AA32,
-      .type = ARM_CP_ALIAS,
+      .type = ARM_CP_ALIAS | ARM_CP_IO,
       .cp = 15, .opc1 = 4, .crn = 1, .crm = 1, .opc2 = 4,
       .access = PL2_RW,
       .fieldoffset = offsetofhigh32(CPUARMState, cp15.hcr_el2),
-- 
2.19.1

If the HCR_EL2 PTW virtualization configuration register bit
is set, then this means that a stage 2 Permission fault must
be generated if a stage 1 translation table access is made
to an address that is mapped as Device memory in stage 2.
Implement this.
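
The fault condition itself is small enough to sketch on its own. The function name is hypothetical, the PTW bit position is an assumption, and the Device test relies on the attribute byte using a MAIR-style encoding in which a zero high nibble means Device memory.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position for HCR_EL2.PTW. */
#define HCR_PTW (1ULL << 2)

/*
 * Hypothetical helper: true if a stage 1 table walk that landed on
 * memory with the given stage 2 attributes must take a Permission fault.
 */
static bool s1ptw_must_fault(uint64_t hcr_el2, uint8_t s2_attrs)
{
    bool is_device = (s2_attrs & 0xf0) == 0;

    return (hcr_el2 & HCR_PTW) && is_device;
}
```

When PTW is clear the attributes are irrelevant, which is why the patch only asks the stage 2 walk to compute them when the bit is set.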

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181012144235.19646-8-peter.maydell@linaro.org
---
 target/arm/helper.c | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static hwaddr S1_ptw_translate(CPUARMState *env, ARMMMUIdx mmu_idx,
         hwaddr s2pa;
         int s2prot;
         int ret;
+        ARMCacheAttrs cacheattrs = {};
+        ARMCacheAttrs *pcacheattrs = NULL;
+
+        if (env->cp15.hcr_el2 & HCR_PTW) {
+            /*
+             * PTW means we must fault if this S1 walk touches S2 Device
+             * memory; otherwise we don't care about the attributes and can
+             * save the S2 translation the effort of computing them.
+             */
+            pcacheattrs = &cacheattrs;
+        }
 
         ret = get_phys_addr_lpae(env, addr, 0, ARMMMUIdx_S2NS, &s2pa,
-                                 &txattrs, &s2prot, &s2size, fi, NULL);
+                                 &txattrs, &s2prot, &s2size, fi, pcacheattrs);
         if (ret) {
             assert(fi->type != ARMFault_None);
             fi->s2addr = addr;
@@ -XXX,XX +XXX,XX @@ static hwaddr S1_ptw_translate(CPUARMState *env, ARMMMUIdx mmu_idx,
             fi->s1ptw = true;
             return ~0;
         }
+        if (pcacheattrs && (pcacheattrs->attrs & 0xf0) == 0) {
+            /* Access was to Device memory: generate Permission fault */
+            fi->type = ARMFault_Permission;
+            fi->s2addr = addr;
+            fi->stage2 = true;
+            fi->s1ptw = true;
+            return ~0;
+        }
         addr = s2pa;
     }
     return addr;
-- 
2.19.1

Create and use a utility function to extract the EC field
from a syndrome, rather than open-coding the shift.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181012144235.19646-9-peter.maydell@linaro.org
---
 target/arm/internals.h | 5 +++++
 target/arm/helper.c    | 4 ++--
 target/arm/kvm64.c     | 2 +-
 target/arm/op_helper.c | 2 +-
 4 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/target/arm/internals.h b/target/arm/internals.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -XXX,XX +XXX,XX @@ enum arm_exception_class {
 #define ARM_EL_IL (1 << ARM_EL_IL_SHIFT)
 #define ARM_EL_ISV (1 << ARM_EL_ISV_SHIFT)
 
+static inline uint32_t syn_get_ec(uint32_t syn)
+{
+    return syn >> ARM_EL_EC_SHIFT;
+}
+
 /* Utility functions for constructing various kinds of syndrome value.
  * Note that in general we follow the AArch64 syndrome values; in a
  * few cases the value in HSR for exceptions taken to AArch32 Hyp
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_do_interrupt_aarch32(CPUState *cs)
     uint32_t moe;
 
     /* If this is a debug exception we must update the DBGDSCR.MOE bits */
-    switch (env->exception.syndrome >> ARM_EL_EC_SHIFT) {
+    switch (syn_get_ec(env->exception.syndrome)) {
     case EC_BREAKPOINT:
     case EC_BREAKPOINT_SAME_EL:
         moe = 1;
@@ -XXX,XX +XXX,XX @@ void arm_cpu_do_interrupt(CPUState *cs)
     if (qemu_loglevel_mask(CPU_LOG_INT)
         && !excp_is_internal(cs->exception_index)) {
         qemu_log_mask(CPU_LOG_INT, "...with ESR 0x%x/0x%" PRIx32 "\n",
-                      env->exception.syndrome >> ARM_EL_EC_SHIFT,
+                      syn_get_ec(env->exception.syndrome),
                       env->exception.syndrome);
     }
 
diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/kvm64.c
+++ b/target/arm/kvm64.c
@@ -XXX,XX +XXX,XX @@ int kvm_arch_remove_sw_breakpoint(CPUState *cs, struct kvm_sw_breakpoint *bp)
 
 bool kvm_arm_handle_debug(CPUState *cs, struct kvm_debug_exit_arch *debug_exit)
 {
-    int hsr_ec = debug_exit->hsr >> ARM_EL_EC_SHIFT;
+    int hsr_ec = syn_get_ec(debug_exit->hsr);
     ARMCPU *cpu = ARM_CPU(cs);
     CPUClass *cc = CPU_GET_CLASS(cs);
     CPUARMState *env = &cpu->env;
diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/op_helper.c
+++ b/target/arm/op_helper.c
@@ -XXX,XX +XXX,XX @@ void raise_exception(CPUARMState *env, uint32_t excp,
          * (see DDI0478C.a D1.10.4)
          */
         target_el = 2;
-        if (syndrome >> ARM_EL_EC_SHIFT == EC_ADVSIMDFPACCESSTRAP) {
+        if (syn_get_ec(syndrome) == EC_ADVSIMDFPACCESSTRAP) {
             syndrome = syn_uncategorized();
         }
     }
-- 
2.19.1

For the v7 version of the Arm architecture, the IL bit in
syndrome register values where the field is not valid was
defined to be UNK/SBZP. In v8 this is RES1, which is what
QEMU currently implements. Handle the desired v7 behaviour
by squashing the IL bit for the affected cases:
 * EC == EC_UNCATEGORIZED
 * prefetch aborts
 * data aborts where ISV is 0

(The fourth case listed in the v8 Arm ARM DDI 0487C.a in
section G7.2.70, "illegal state exception", can't happen
on a v7 CPU.)

This deals with a corner case noted in a comment.
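
A standalone sketch of the three cases, for illustration only. The toy_excp enum and the function name are hypothetical, and the shift values are assumptions matching the v8 syndrome layout (EC in bits [31:26], IL at bit 25, ISV at bit 24).

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed syndrome layout. */
#define ARM_EL_EC_SHIFT  26
#define ARM_EL_IL        (1U << 25)
#define ARM_EL_ISV       (1U << 24)
#define EC_UNCATEGORIZED 0x00U

/* Hypothetical exception kinds for this sketch. */
enum toy_excp { TOY_PREFETCH_ABORT, TOY_DATA_ABORT, TOY_OTHER };

/* Clear IL for the "field not valid" cases when the CPU is not v8. */
static uint32_t v7_fixup_syndrome(uint32_t syn, enum toy_excp kind, bool is_v8)
{
    if (is_v8) {
        return syn;
    }
    if (kind == TOY_PREFETCH_ABORT ||
        (kind == TOY_DATA_ABORT && !(syn & ARM_EL_ISV)) ||
        (syn >> ARM_EL_EC_SHIFT) == EC_UNCATEGORIZED) {
        syn &= ~ARM_EL_IL;
    }
    return syn;
}
```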

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181012144235.19646-10-peter.maydell@linaro.org
---
 target/arm/internals.h |  7 ++-----
 target/arm/helper.c    | 13 +++++++++++++
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/target/arm/internals.h b/target/arm/internals.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -XXX,XX +XXX,XX @@ static inline uint32_t syn_get_ec(uint32_t syn)
 /* Utility functions for constructing various kinds of syndrome value.
  * Note that in general we follow the AArch64 syndrome values; in a
  * few cases the value in HSR for exceptions taken to AArch32 Hyp
- * mode differs slightly, so if we ever implemented Hyp mode then the
- * syndrome value would need some massaging on exception entry.
- * (One example of this is that AArch64 defaults to IL bit set for
- * exceptions which don't specifically indicate information about the
- * trapping instruction, whereas AArch32 defaults to IL bit clear.)
+ * mode differs slightly, and we fix this up when populating HSR in
+ * arm_cpu_do_interrupt_aarch32_hyp().
  */
 static inline uint32_t syn_uncategorized(void)
 {
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_do_interrupt_aarch32_hyp(CPUState *cs)
     }
 
     if (cs->exception_index != EXCP_IRQ && cs->exception_index != EXCP_FIQ) {
+        if (!arm_feature(env, ARM_FEATURE_V8)) {
+            /*
+             * QEMU syndrome values are v8-style. v7 has the IL bit
+             * UNK/SBZP for "field not valid" cases, where v8 uses RES1.
+             * If this is a v7 CPU, squash the IL bit in those cases.
+             */
+            if (cs->exception_index == EXCP_PREFETCH_ABORT ||
+                (cs->exception_index == EXCP_DATA_ABORT &&
+                 !(env->exception.syndrome & ARM_EL_ISV)) ||
+                syn_get_ec(env->exception.syndrome) == EC_UNCATEGORIZED) {
+                env->exception.syndrome &= ~ARM_EL_IL;
+            }
+        }
         env->cp15.esr_el[2] = env->exception.syndrome;
     }
 
-- 
2.19.1

For traps of FP/SIMD instructions to AArch32 Hyp mode, the syndrome
provided in HSR has more information than is reported to AArch64.
Specifically, there are extra fields TA and coproc which indicate
whether the trapped instruction was FP or SIMD. Add this extra
information to the syndromes we construct, and mask it out when
taking the exception to AArch64.
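
To make the bit layout concrete, here is a minimal sketch of building the two syndrome variants and stripping the extra fields again. The toy_* names are hypothetical; the EC value 0x07 and the field positions are assumptions based on the description above (coproc in the low bits, TA at bit 5).

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed syndrome layout and exception class value. */
#define ARM_EL_EC_SHIFT        26
#define ARM_EL_IL              (1U << 25)
#define EC_ADVSIMDFPACCESSTRAP 0x07U

/* FP trap: TA == 0, coproc == 0xa. */
static uint32_t toy_syn_fp(int cv, int cond, bool is_16bit)
{
    return (EC_ADVSIMDFPACCESSTRAP << ARM_EL_EC_SHIFT)
        | (is_16bit ? 0 : ARM_EL_IL)
        | ((uint32_t)cv << 24) | ((uint32_t)cond << 20) | 0xa;
}

/* SIMD trap: TA == 1, coproc == 0. */
static uint32_t toy_syn_simd(int cv, int cond, bool is_16bit)
{
    return (EC_ADVSIMDFPACCESSTRAP << ARM_EL_EC_SHIFT)
        | (is_16bit ? 0 : ARM_EL_IL)
        | ((uint32_t)cv << 24) | ((uint32_t)cond << 20) | (1U << 5);
}

/* For AArch64 targets, drop bits [19:0] of FP/SIMD access trap syndromes. */
static uint32_t toy_to_aarch64(uint32_t syn)
{
    if ((syn >> ARM_EL_EC_SHIFT) == EC_ADVSIMDFPACCESSTRAP) {
        syn &= ~0xfffffU;
    }
    return syn;
}
```

After masking, the FP and SIMD variants collapse to the same value, which is the point: AArch64 does not distinguish them.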

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181012144235.19646-11-peter.maydell@linaro.org
---
 target/arm/internals.h | 14 +++++++++++++-
 target/arm/helper.c    |  9 +++++++++
 target/arm/translate.c |  8 ++++----
 3 files changed, 26 insertions(+), 5 deletions(-)

diff --git a/target/arm/internals.h b/target/arm/internals.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -XXX,XX +XXX,XX @@ static inline uint32_t syn_get_ec(uint32_t syn)
  * few cases the value in HSR for exceptions taken to AArch32 Hyp
  * mode differs slightly, and we fix this up when populating HSR in
  * arm_cpu_do_interrupt_aarch32_hyp().
+ * The exception is FP/SIMD access traps -- these report extra information
+ * when taking an exception to AArch32. For those we include the extra coproc
+ * and TA fields, and mask them out when taking the exception to AArch64.
  */
 static inline uint32_t syn_uncategorized(void)
 {
@@ -XXX,XX +XXX,XX @@ static inline uint32_t syn_cp15_rrt_trap(int cv, int cond, int opc1, int crm,
 
 static inline uint32_t syn_fp_access_trap(int cv, int cond, bool is_16bit)
 {
+    /* AArch32 FP trap or any AArch64 FP/SIMD trap: TA == 0 coproc == 0xa */
     return (EC_ADVSIMDFPACCESSTRAP << ARM_EL_EC_SHIFT)
         | (is_16bit ? 0 : ARM_EL_IL)
-        | (cv << 24) | (cond << 20);
+        | (cv << 24) | (cond << 20) | 0xa;
+}
+
+static inline uint32_t syn_simd_access_trap(int cv, int cond, bool is_16bit)
+{
+    /* AArch32 SIMD trap: TA == 1 coproc == 0 */
+    return (EC_ADVSIMDFPACCESSTRAP << ARM_EL_EC_SHIFT)
+        | (is_16bit ? 0 : ARM_EL_IL)
+        | (cv << 24) | (cond << 20) | (1 << 5);
 }
 
 static inline uint32_t syn_sve_access_trap(void)
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_do_interrupt_aarch64(CPUState *cs)
     case EXCP_HVC:
     case EXCP_HYP_TRAP:
     case EXCP_SMC:
+        if (syn_get_ec(env->exception.syndrome) == EC_ADVSIMDFPACCESSTRAP) {
+            /*
+             * QEMU internal FP/SIMD syndromes from AArch32 include the
+             * TA and coproc fields which are only exposed if the exception
+             * is taken to AArch32 Hyp mode. Mask them out to get a valid
+             * AArch64 format syndrome.
+             */
+            env->exception.syndrome &= ~MAKE_64BIT_MASK(0, 20);
+        }
         env->cp15.esr_el[new_el] = env->exception.syndrome;
         break;
     case EXCP_IRQ:
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
      */
     if (s->fp_excp_el) {
         gen_exception_insn(s, 4, EXCP_UDEF,
-                           syn_fp_access_trap(1, 0xe, false), s->fp_excp_el);
+                           syn_simd_access_trap(1, 0xe, false), s->fp_excp_el);
         return 0;
     }
 
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
      */
     if (s->fp_excp_el) {
         gen_exception_insn(s, 4, EXCP_UDEF,
-                           syn_fp_access_trap(1, 0xe, false), s->fp_excp_el);
+                           syn_simd_access_trap(1, 0xe, false), s->fp_excp_el);
         return 0;
     }
 
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
 
     if (s->fp_excp_el) {
         gen_exception_insn(s, 4, EXCP_UDEF,
-                           syn_fp_access_trap(1, 0xe, false), s->fp_excp_el);
+                           syn_simd_access_trap(1, 0xe, false), s->fp_excp_el);
         return 0;
     }
     if (!s->vfp_enabled) {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
 
     if (s->fp_excp_el) {
         gen_exception_insn(s, 4, EXCP_UDEF,
-                           syn_fp_access_trap(1, 0xe, false), s->fp_excp_el);
+                           syn_simd_access_trap(1, 0xe, false), s->fp_excp_el);
         return 0;
     }
     if (!s->vfp_enabled) {
-- 
2.19.1

From: Stewart Hildebrand <Stewart.Hildebrand@dornerworks.com>

"The Image must be placed text_offset bytes from a 2MB aligned base
address anywhere in usable system RAM and called there."

For the virt board, we write our startup bootloader at the very
bottom of RAM, so that region can't be used for the image. To avoid
an overlap when the image requests a load offset smaller than our
bootloader, we add 2MB to the load offset, which places the image
at the requested offset from the next 2MB boundary.

This fixes a boot failure for Xen AArch64.
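
The adjustment amounts to the following arithmetic, shown here as a standalone sketch. The function name is hypothetical; the 4 KiB bound mirrors the BOOTLOADER_MAX_SIZE value introduced by this patch.

```c
#include <stdint.h>

#define KiB 1024ULL
#define MiB (1024ULL * KiB)

#define BOOTLOADER_MAX_SIZE (4 * KiB)

/*
 * Hypothetical helper: if the requested offset could overlap the
 * bootloader at the bottom of RAM, use the same offset from the
 * next 2MB boundary instead.
 */
static uint64_t adjust_load_offset(uint64_t kernel_load_offset)
{
    if (kernel_load_offset < BOOTLOADER_MAX_SIZE) {
        kernel_load_offset += 2 * MiB;
    }
    return kernel_load_offset;
}
```

An image requesting offset 0 ends up at 2MB, while an image requesting 0x80000 is left where it asked to be.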

Signed-off-by: Stewart Hildebrand <stewart.hildebrand@dornerworks.com>
Tested-by: Andre Przywara <andre.przywara@arm.com>
Message-id: b8a89518794b4436af0c151ed10de4fa@dornerworks.com
[PMM: Rephrased a comment a bit]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/boot.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -XXX,XX +XXX,XX @@
 #include "qemu/config-file.h"
 #include "qemu/option.h"
 #include "exec/address-spaces.h"
+#include "qemu/units.h"
 
 /* Kernel boot protocol is specified in the kernel docs
  * Documentation/arm/Booting and Documentation/arm64/booting.txt
@@ -XXX,XX +XXX,XX @@
 #define ARM64_TEXT_OFFSET_OFFSET    8
 #define ARM64_MAGIC_OFFSET          56
 
+#define BOOTLOADER_MAX_SIZE         (4 * KiB)
+
 AddressSpace *arm_boot_address_space(ARMCPU *cpu,
                                      const struct arm_boot_info *info)
 {
@@ -XXX,XX +XXX,XX @@ static void write_bootloader(const char *name, hwaddr addr,
         code[i] = tswap32(insn);
     }
 
+    assert((len * sizeof(uint32_t)) < BOOTLOADER_MAX_SIZE);
+
     rom_add_blob_fixed_as(name, code, len * sizeof(uint32_t), addr, as);
 
     g_free(code);
@@ -XXX,XX +XXX,XX @@ static uint64_t load_aarch64_image(const char *filename, hwaddr mem_base,
         memcpy(&hdrvals, buffer + ARM64_TEXT_OFFSET_OFFSET, sizeof(hdrvals));
         if (hdrvals[1] != 0) {
             kernel_load_offset = le64_to_cpu(hdrvals[0]);
+
+            /*
+             * We write our startup "bootloader" at the very bottom of RAM,
+             * so that bit can't be used for the image. Luckily the Image
+             * format specification is that the image requests only an offset
+             * from a 2MB boundary, not an absolute load address. So if the
+             * image requests an offset that might mean it overlaps with the
+             * bootloader, we can just load it starting at 2MB+offset rather
+             * than 0MB + offset.
+             */
+            if (kernel_load_offset < BOOTLOADER_MAX_SIZE) {
+                kernel_load_offset += 2 * MiB;
+            }
         }
     }
 
-- 
2.19.1

From: Richard Henderson <rth@twiddle.net>

This can reduce the number of opcodes required for certain
complex forms of load-multiple (e.g. ld4.16b).

Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-id: 20181011205206.3552-2-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate-a64.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_multiple_struct(DisasContext *s, uint32_t insn)
     bool is_store = !extract32(insn, 22, 1);
     bool is_postidx = extract32(insn, 23, 1);
     bool is_q = extract32(insn, 30, 1);
-    TCGv_i64 tcg_addr, tcg_rn;
+    TCGv_i64 tcg_addr, tcg_rn, tcg_ebytes;
 
     int ebytes = 1 << size;
     int elements = (is_q ? 128 : 64) / (8 << size);
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_multiple_struct(DisasContext *s, uint32_t insn)
     tcg_rn = cpu_reg_sp(s, rn);
     tcg_addr = tcg_temp_new_i64();
     tcg_gen_mov_i64(tcg_addr, tcg_rn);
+    tcg_ebytes = tcg_const_i64(ebytes);
 
     for (r = 0; r < rpt; r++) {
         int e;
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_multiple_struct(DisasContext *s, uint32_t insn)
                         clear_vec_high(s, is_q, tt);
                     }
                 }
-                tcg_gen_addi_i64(tcg_addr, tcg_addr, ebytes);
+                tcg_gen_add_i64(tcg_addr, tcg_addr, tcg_ebytes);
                 tt = (tt + 1) % 32;
             }
         }
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_multiple_struct(DisasContext *s, uint32_t insn)
             tcg_gen_add_i64(tcg_rn, tcg_rn, cpu_reg(s, rm));
         }
     }
+    tcg_temp_free_i64(tcg_ebytes);
     tcg_temp_free_i64(tcg_addr);
 }
 
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_single_struct(DisasContext *s, uint32_t insn)
     bool replicate = false;
     int index = is_q << 3 | S << 2 | size;
     int ebytes, xs;
-    TCGv_i64 tcg_addr, tcg_rn;
+    TCGv_i64 tcg_addr, tcg_rn, tcg_ebytes;
 
     switch (scale) {
     case 3:
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_single_struct(DisasContext *s, uint32_t insn)
     tcg_rn = cpu_reg_sp(s, rn);
     tcg_addr = tcg_temp_new_i64();
     tcg_gen_mov_i64(tcg_addr, tcg_rn);
+    tcg_ebytes = tcg_const_i64(ebytes);
 
     for (xs = 0; xs < selem; xs++) {
         if (replicate) {
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_single_struct(DisasContext *s, uint32_t insn)
                 do_vec_st(s, rt, index, tcg_addr, scale);
             }
         }
-        tcg_gen_addi_i64(tcg_addr, tcg_addr, ebytes);
+        tcg_gen_add_i64(tcg_addr, tcg_addr, tcg_ebytes);
         rt = (rt + 1) % 32;
     }
 
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_single_struct(DisasContext *s, uint32_t insn)
             tcg_gen_add_i64(tcg_rn, tcg_rn, cpu_reg(s, rm));
         }
     }
+    tcg_temp_free_i64(tcg_ebytes);
     tcg_temp_free_i64(tcg_addr);
 }
 
-- 
2.19.1

From: Richard Henderson <richard.henderson@linaro.org>

This is done generically in translator_loop.

Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20181011205206.3552-3-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate-a64.c | 1 -
 target/arm/translate.c     | 1 -
 2 files changed, 2 deletions(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_init_disas_context(DisasContextBase *dcbase,
 
 static void aarch64_tr_tb_start(DisasContextBase *db, CPUState *cpu)
 {
-    tcg_clear_temp_count();
 }
 
 static void aarch64_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void arm_tr_tb_start(DisasContextBase *dcbase, CPUState *cpu)
         tcg_gen_movi_i32(tmp, 0);
         store_cpu_field(tmp, condexec_bits);
     }
-    tcg_clear_temp_count();
 }
 
 static void arm_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
-- 
2.19.1

From: Richard Henderson <richard.henderson@linaro.org>

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181011205206.3552-4-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate-a64.c | 28 +++-------------------------
 1 file changed, 3 insertions(+), 25 deletions(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_single_struct(DisasContext *s, uint32_t insn)
     for (xs = 0; xs < selem; xs++) {
         if (replicate) {
             /* Load and replicate to all elements */
-            uint64_t mulconst;
             TCGv_i64 tcg_tmp = tcg_temp_new_i64();
 
             tcg_gen_qemu_ld_i64(tcg_tmp, tcg_addr,
                                 get_mem_index(s), s->be_data + scale);
-            switch (scale) {
-            case 0:
-                mulconst = 0x0101010101010101ULL;
-                break;
-            case 1:
-                mulconst = 0x0001000100010001ULL;
-                break;
-            case 2:
-                mulconst = 0x0000000100000001ULL;
-                break;
-            case 3:
-                mulconst = 0;
-                break;
-            default:
-                g_assert_not_reached();
-            }
-            if (mulconst) {
-                tcg_gen_muli_i64(tcg_tmp, tcg_tmp, mulconst);
-            }
-            write_vec_element(s, tcg_tmp, rt, 0, MO_64);
-            if (is_q) {
-                write_vec_element(s, tcg_tmp, rt, 1, MO_64);
-            }
+            tcg_gen_gvec_dup_i64(scale, vec_full_reg_offset(s, rt),
+                                 (is_q + 1) * 8, vec_full_reg_size(s),
+                                 tcg_tmp);
             tcg_temp_free_i64(tcg_tmp);
-            clear_vec_high(s, is_q, rt);
         } else {
             /* Load/store one element per register */
             if (is_load) {
-- 
2.19.1

From: Richard Henderson <richard.henderson@linaro.org>

For a sequence of loads or stores from a single register,
little-endian operations can be promoted to an 8-byte op.
This can reduce the number of operations by a factor of 8.
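
The size selection can be sketched outside of TCG as plain arithmetic. This is an illustrative model with hypothetical names; it only mirrors the promotion rule described above and does not generate any code.

```c
#include <stdbool.h>

struct ldst_plan {
    int size;      /* log2 of bytes per memory operation */
    int ebytes;    /* bytes per memory operation */
    int elements;  /* memory operations per vector register */
};

/* Hypothetical helper: choose the access size for one register. */
static struct ldst_plan plan_ldst(int size, int selem, bool is_q,
                                  bool big_endian_data)
{
    struct ldst_plan p;
    bool little_endian = !big_endian_data;

    /* Single bytes have no byte order, so treat them as little-endian. */
    if (size == 0) {
        little_endian = true;
    }
    /* Consecutive little-endian elements of one register become 8-byte ops. */
    if (selem == 1 && little_endian) {
        size = 3;
    }
    p.size = size;
    p.ebytes = 1 << size;
    p.elements = (is_q ? 16 : 8) / p.ebytes;
    return p;
}
```

For a 128-bit register of bytes with a single structure element this gives 2 operations instead of 16, which is where the factor of 8 comes from.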

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181011205206.3552-5-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate-a64.c | 66 +++++++++++++++++++++++---------------
 1 file changed, 40 insertions(+), 26 deletions(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void write_vec_element_i32(DisasContext *s, TCGv_i32 tcg_src,
 
 /* Store from vector register to memory */
 static void do_vec_st(DisasContext *s, int srcidx, int element,
-                      TCGv_i64 tcg_addr, int size)
+                      TCGv_i64 tcg_addr, int size, TCGMemOp endian)
 {
-    TCGMemOp memop = s->be_data + size;
     TCGv_i64 tcg_tmp = tcg_temp_new_i64();
 
     read_vec_element(s, tcg_tmp, srcidx, element, size);
-    tcg_gen_qemu_st_i64(tcg_tmp, tcg_addr, get_mem_index(s), memop);
+    tcg_gen_qemu_st_i64(tcg_tmp, tcg_addr, get_mem_index(s), endian | size);
 
     tcg_temp_free_i64(tcg_tmp);
 }
 
 /* Load from memory to vector register */
 static void do_vec_ld(DisasContext *s, int destidx, int element,
-                      TCGv_i64 tcg_addr, int size)
+                      TCGv_i64 tcg_addr, int size, TCGMemOp endian)
 {
-    TCGMemOp memop = s->be_data + size;
     TCGv_i64 tcg_tmp = tcg_temp_new_i64();
 
-    tcg_gen_qemu_ld_i64(tcg_tmp, tcg_addr, get_mem_index(s), memop);
+    tcg_gen_qemu_ld_i64(tcg_tmp, tcg_addr, get_mem_index(s), endian | size);
     write_vec_element(s, tcg_tmp, destidx, element, size);
 
     tcg_temp_free_i64(tcg_tmp);
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_multiple_struct(DisasContext *s, uint32_t insn)
     bool is_postidx = extract32(insn, 23, 1);
     bool is_q = extract32(insn, 30, 1);
     TCGv_i64 tcg_addr, tcg_rn, tcg_ebytes;
+    TCGMemOp endian = s->be_data;
 
-    int ebytes = 1 << size;
-    int elements = (is_q ? 128 : 64) / (8 << size);
+    int ebytes;   /* bytes per element */
+    int elements; /* elements per vector */
     int rpt;    /* num iterations */
     int selem;  /* structure elements */
     int r;
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_multiple_struct(DisasContext *s, uint32_t insn)
         gen_check_sp_alignment(s);
     }
 
+    /* For our purposes, bytes are always little-endian.  */
+    if (size == 0) {
+        endian = MO_LE;
+    }
+
+    /* Consecutive little-endian elements from a single register
+     * can be promoted to a larger little-endian operation.
+     */
+    if (selem == 1 && endian == MO_LE) {
+        size = 3;
+    }
+    ebytes = 1 << size;
+    elements = (is_q ? 16 : 8) / ebytes;
+
     tcg_rn = cpu_reg_sp(s, rn);
     tcg_addr = tcg_temp_new_i64();
     tcg_gen_mov_i64(tcg_addr, tcg_rn);
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_multiple_struct(DisasContext *s, uint32_t insn)
     for (r = 0; r < rpt; r++) {
         int e;
         for (e = 0; e < elements; e++) {
-            int tt = (rt + r) % 32;
             int xs;
             for (xs = 0; xs < selem; xs++) {
+                int tt = (rt + r + xs) % 32;
                 if (is_store) {
-                    do_vec_st(s, tt, e, tcg_addr, size);
+                    do_vec_st(s, tt, e, tcg_addr, size, endian);
                 } else {
-                    do_vec_ld(s, tt, e, tcg_addr, size);
-
-                    /* For non-quad operations, setting a slice of the low
-                     * 64 bits of the register clears the high 64 bits (in
-                     * the ARM ARM pseudocode this is implicit in the fact
-                     * that 'rval' is a 64 bit wide variable).
-                     * For quad operations, we might still need to zero the
-                     * high bits of SVE.  We optimize by noticing that we only
-                     * need to do this the first time we touch a register.
-                     */
-                    if (e == 0 && (r == 0 || xs == selem - 1)) {
-                        clear_vec_high(s, is_q, tt);
-                    }
+                    do_vec_ld(s, tt, e, tcg_addr, size, endian);
                 }
                 tcg_gen_add_i64(tcg_addr, tcg_addr, tcg_ebytes);
-                tt = (tt + 1) % 32;
             }
         }
     }
 
+    if (!is_store) {
+        /* For non-quad operations, setting a slice of the low
+         * 64 bits of the register clears the high 64 bits (in
+         * the ARM ARM pseudocode this is implicit in the fact
+         * that 'rval' is a 64 bit wide variable).
+         * For quad operations, we might still need to zero the
+         * high bits of SVE.
+         */
+        for (r = 0; r < rpt * selem; r++) {
+            int tt = (rt + r) % 32;
+            clear_vec_high(s, is_q, tt);
+        }
+    }
+
     if (is_postidx) {
         int rm = extract32(insn, 16, 5);
         if (rm == 31) {
@@ -XXX,XX +XXX,XX @@ static void disas_ldst_single_struct(DisasContext *s, uint32_t insn)
         } else {
             /* Load/store one element per register */
             if (is_load) {
-                do_vec_ld(s, rt, index, tcg_addr, scale);
+                do_vec_ld(s, rt, index, tcg_addr, scale, s->be_data);
             } else {
-                do_vec_st(s, rt, index, tcg_addr, scale);
+                do_vec_st(s, rt, index, tcg_addr, scale, s->be_data);
             }
         }
         tcg_gen_add_i64(tcg_addr, tcg_addr, tcg_ebytes);
-- 
2.19.1

From: Richard Henderson <richard.henderson@linaro.org>

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20181011205206.3552-6-richard.henderson@linaro.org
[PMM: drop change to now-deleted cpu_mode_names array]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static TCGv_i64 cpu_F0d, cpu_F1d;
 
 #include "exec/gen-icount.h"
 
-static const char *regnames[] =
+static const char * const regnames[] =
     { "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
       "r8", "r9", "r10", "r11", "r12", "r13", "r14", "pc" };
 
@@ -XXX,XX +XXX,XX @@ static struct {
     int nregs;
     int interleave;
     int spacing;
-} neon_ls_element_type[11] = {
+} const neon_ls_element_type[11] = {
     {4, 4, 1},
     {4, 4, 2},
     {4, 1, 1},
-- 
2.19.1

From: Richard Henderson <richard.henderson@linaro.org>

Also introduces neon_element_offset to find the env offset
of a specific element within a neon register.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181011205206.3552-7-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
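Note (illustration only, not part of the patch): a standalone sketch of
the offset arithmetic that neon_element_offset performs, for readers who
want to check the big-endian XOR by hand. The function name
element_byte_offset is hypothetical, and the host_big_endian parameter is
an assumption that replaces the compile-time HOST_WORDS_BIGENDIAN test so
both cases can be exercised on any host. The register base offset is left
out.

```c
#include <assert.h>
#include <stdbool.h>

/* Byte offset of element `element` (1 << log2_size bytes wide) inside a
 * register that is stored as a sequence of host-endian 8-byte units.
 * Element 0 is the least significant end of the register.
 */
static int element_byte_offset(int element, int log2_size,
                               bool host_big_endian)
{
    int element_size = 1 << log2_size;
    int ofs = element * element_size;

    assert(log2_size >= 0 && log2_size <= 3);
    /* Compute the fully little-endian offset first, then flip the
     * position within the 8-byte unit on a big-endian host.
     */
    if (host_big_endian && element_size < 8) {
        ofs ^= 8 - element_size;
    }
    return ofs;
}
```

For example, byte element 0 sits at offset 0 on a little-endian host and
at offset 7 on a big-endian host, while 64-bit elements are unaffected
because each one fills a whole 8-byte unit.
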
 target/arm/translate.c | 63 ++++++++++++++++++++++++------------------
 1 file changed, 36 insertions(+), 27 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ neon_reg_offset (int reg, int n)
     return vfp_reg_offset(0, sreg);
 }
 
+/* Return the offset of a 2**SIZE piece of a NEON register, at index ELE,
+ * where 0 is the least significant end of the register.
+ */
+static inline long
+neon_element_offset(int reg, int element, TCGMemOp size)
+{
+    int element_size = 1 << size;
+    int ofs = element * element_size;
+#ifdef HOST_WORDS_BIGENDIAN
+    /* Calculate the offset assuming fully little-endian,
+     * then XOR to account for the order of the 8-byte units.
+     */
+    if (element_size < 8) {
+        ofs ^= 8 - element_size;
+    }
+#endif
+    return neon_reg_offset(reg, 0) + ofs;
+}
+
 static TCGv_i32 neon_load_reg(int reg, int pass)
 {
     TCGv_i32 tmp = tcg_temp_new_i32();
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
                     tmp = load_reg(s, rd);
                     if (insn & (1 << 23)) {
                         /* VDUP */
-                        if (size == 0) {
-                            gen_neon_dup_u8(tmp, 0);
-                        } else if (size == 1) {
-                            gen_neon_dup_low16(tmp);
-                        }
-                        for (n = 0; n <= pass * 2; n++) {
-                            tmp2 = tcg_temp_new_i32();
-                            tcg_gen_mov_i32(tmp2, tmp);
-                            neon_store_reg(rn, n, tmp2);
-                        }
-                        neon_store_reg(rn, n, tmp);
+                        int vec_size = pass ? 16 : 8;
+                        tcg_gen_gvec_dup_i32(size, neon_reg_offset(rn, 0),
+                                             vec_size, vec_size, tmp);
+                        tcg_temp_free_i32(tmp);
                     } else {
                         /* VMOV */
                         switch (size) {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                 tcg_temp_free_i32(tmp);
             } else if ((insn & 0x380) == 0) {
                 /* VDUP */
+                int element;
+                TCGMemOp size;
+
                 if ((insn & (7 << 16)) == 0 || (q && (rd & 1))) {
                     return 1;
                 }
-                if (insn & (1 << 19)) {
-                    tmp = neon_load_reg(rm, 1);
-                } else {
-                    tmp = neon_load_reg(rm, 0);
-                }
                 if (insn & (1 << 16)) {
-                    gen_neon_dup_u8(tmp, ((insn >> 17) & 3) * 8);
+                    size = MO_8;
+                    element = (insn >> 17) & 7;
                 } else if (insn & (1 << 17)) {
-                    if ((insn >> 18) & 1)
-                        gen_neon_dup_high16(tmp);
-                    else
-                        gen_neon_dup_low16(tmp);
+                    size = MO_16;
+                    element = (insn >> 18) & 3;
+                } else {
+                    size = MO_32;
+                    element = (insn >> 19) & 1;
                 }
-                for (pass = 0; pass < (q ? 4 : 2); pass++) {
-                    tmp2 = tcg_temp_new_i32();
-                    tcg_gen_mov_i32(tmp2, tmp);
-                    neon_store_reg(rd, pass, tmp2);
-                }
-                tcg_temp_free_i32(tmp);
+                tcg_gen_gvec_dup_mem(size, neon_reg_offset(rd, 0),
+                                     neon_element_offset(rm, element, size),
+                                     q ? 16 : 8, q ? 16 : 8);
             } else {
                 return 1;
             }
-- 
2.19.1

From: Richard Henderson <richard.henderson@linaro.org>

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181011205206.3552-8-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate.c | 67 ++++++++++++++++++++++++------------------
 1 file changed, 39 insertions(+), 28 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                 return 1;
             }
         } else { /* (insn & 0x00380080) == 0 */
-            int invert;
+            int invert, reg_ofs, vec_size;
+
             if (q && (rd & 1)) {
                 return 1;
             }
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                 break;
             case 14:
                 imm |= (imm << 8) | (imm << 16) | (imm << 24);
-                if (invert)
+                if (invert) {
                     imm = ~imm;
+                }
                 break;
             case 15:
                 if (invert) {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                       | ((imm & 0x40) ? (0x1f << 25) : (1 << 30));
                 break;
             }
-            if (invert)
+            if (invert) {
                 imm = ~imm;
+            }
 
-            for (pass = 0; pass < (q ? 4 : 2); pass++) {
-                if (op & 1 && op < 12) {
-                    tmp = neon_load_reg(rd, pass);
-                    if (invert) {
-                        /* The immediate value has already been inverted, so
-                           BIC becomes AND.  */
-                        tcg_gen_andi_i32(tmp, tmp, imm);
-                    } else {
-                        tcg_gen_ori_i32(tmp, tmp, imm);
-                    }
+            reg_ofs = neon_reg_offset(rd, 0);
+            vec_size = q ? 16 : 8;
+
+            if (op & 1 && op < 12) {
+                if (invert) {
+                    /* The immediate value has already been inverted,
+                     * so BIC becomes AND.
+                     */
+                    tcg_gen_gvec_andi(MO_32, reg_ofs, reg_ofs, imm,
+                                      vec_size, vec_size);
                 } else {
-                    /* VMOV, VMVN.  */
-                    tmp = tcg_temp_new_i32();
-                    if (op == 14 && invert) {
-                        int n;
-                        uint32_t val;
-                        val = 0;
-                        for (n = 0; n < 4; n++) {
-                            if (imm & (1 << (n + (pass & 1) * 4)))
-                                val |= 0xff << (n * 8);
-                        }
-                        tcg_gen_movi_i32(tmp, val);
-                    } else {
-                        tcg_gen_movi_i32(tmp, imm);
-                    }
+                    tcg_gen_gvec_ori(MO_32, reg_ofs, reg_ofs, imm,
+                                     vec_size, vec_size);
+                }
+            } else {
+                /* VMOV, VMVN.  */
+                if (op == 14 && invert) {
+                    TCGv_i64 t64 = tcg_temp_new_i64();
+
+                    for (pass = 0; pass <= q; ++pass) {
+                        uint64_t val = 0;
+                        int n;
+
+                        for (n = 0; n < 8; n++) {
+                            if (imm & (1 << (n + pass * 8))) {
+                                val |= 0xffull << (n * 8);
+                            }
+                        }
+                        tcg_gen_movi_i64(t64, val);
+                        neon_store_reg64(t64, rd + pass);
+                    }
+                    tcg_temp_free_i64(t64);
+                } else {
+                    tcg_gen_gvec_dup32i(reg_ofs, vec_size, vec_size, imm);
                 }
-                neon_store_reg(rd, pass, tmp);
             }
         }
     } else { /* (insn & 0x00800010 == 0x00800000) */
-- 
2.19.1

From: Richard Henderson <richard.henderson@linaro.org>

Move expanders for VBSL, VBIT, and VBIF from translate-a64.c.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181011205206.3552-9-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
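Note (illustration only, not part of the patch): the expanders moved here
compute each bitwise select with an xor/and/xor sequence rather than the
and/andc/or sequence of the removed gen_neon_bsl. A standalone sketch on
plain uint64_t values, checking that the two formulations agree, is given
below. All function names are hypothetical; the operand roles (rd as
destination, rn and rm as sources) follow the expanders in the diff.

```c
#include <assert.h>
#include <stdint.h>

/* xor/and/xor forms, mirroring gen_bsl_i64, gen_bit_i64, gen_bif_i64 */
static uint64_t bsl_xor_form(uint64_t rd, uint64_t rn, uint64_t rm)
{
    rn ^= rm;
    rn &= rd;
    return rm ^ rn;
}

static uint64_t bit_xor_form(uint64_t rd, uint64_t rn, uint64_t rm)
{
    rn ^= rd;
    rn &= rm;
    return rd ^ rn;
}

static uint64_t bif_xor_form(uint64_t rd, uint64_t rn, uint64_t rm)
{
    rn ^= rd;
    rn &= ~rm;
    return rd ^ rn;
}

/* Check each form against the direct "select" definition:
 *   BSL: rd selects between rn (bit set) and rm (bit clear)
 *   BIT: insert rn into rd where rm is set
 *   BIF: insert rn into rd where rm is clear
 */
static void check_forms_agree(uint64_t rd, uint64_t rn, uint64_t rm)
{
    assert(bsl_xor_form(rd, rn, rm) == ((rn & rd) | (rm & ~rd)));
    assert(bit_xor_form(rd, rn, rm) == ((rn & rm) | (rd & ~rm)));
    assert(bif_xor_form(rd, rn, rm) == ((rn & ~rm) | (rd & rm)));
}
```

The xor form needs only one temporary, which is presumably why the
expanders can overwrite rn in place; that motivation is an inference, not
something the commit message states.
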
 target/arm/translate.h     |   6 ++
 target/arm/translate-a64.c |  61 --------------
 target/arm/translate.c     | 162 +++++++++++++++++++++++++++----------
 3 files changed, 124 insertions(+), 105 deletions(-)

diff --git a/target/arm/translate.h b/target/arm/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ static inline TCGv_i32 get_ahp_flag(void)
     return ret;
 }
 
+
+/* Vector operations shared between ARM and AArch64.  */
+extern const GVecGen3 bsl_op;
+extern const GVecGen3 bit_op;
+extern const GVecGen3 bif_op;
+
 /*
  * Forward to the isar_feature_* tests given a DisasContext pointer.
  */
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_diff(DisasContext *s, uint32_t insn)
     }
 }
 
-static void gen_bsl_i64(TCGv_i64 rd, TCGv_i64 rn, TCGv_i64 rm)
-{
-    tcg_gen_xor_i64(rn, rn, rm);
-    tcg_gen_and_i64(rn, rn, rd);
-    tcg_gen_xor_i64(rd, rm, rn);
-}
-
-static void gen_bit_i64(TCGv_i64 rd, TCGv_i64 rn, TCGv_i64 rm)
-{
-    tcg_gen_xor_i64(rn, rn, rd);
-    tcg_gen_and_i64(rn, rn, rm);
-    tcg_gen_xor_i64(rd, rd, rn);
-}
-
-static void gen_bif_i64(TCGv_i64 rd, TCGv_i64 rn, TCGv_i64 rm)
-{
-    tcg_gen_xor_i64(rn, rn, rd);
-    tcg_gen_andc_i64(rn, rn, rm);
-    tcg_gen_xor_i64(rd, rd, rn);
-}
-
-static void gen_bsl_vec(unsigned vece, TCGv_vec rd, TCGv_vec rn, TCGv_vec rm)
-{
-    tcg_gen_xor_vec(vece, rn, rn, rm);
-    tcg_gen_and_vec(vece, rn, rn, rd);
-    tcg_gen_xor_vec(vece, rd, rm, rn);
-}
-
-static void gen_bit_vec(unsigned vece, TCGv_vec rd, TCGv_vec rn, TCGv_vec rm)
-{
-    tcg_gen_xor_vec(vece, rn, rn, rd);
-    tcg_gen_and_vec(vece, rn, rn, rm);
-    tcg_gen_xor_vec(vece, rd, rd, rn);
-}
-
-static void gen_bif_vec(unsigned vece, TCGv_vec rd, TCGv_vec rn, TCGv_vec rm)
-{
-    tcg_gen_xor_vec(vece, rn, rn, rd);
-    tcg_gen_andc_vec(vece, rn, rn, rm);
-    tcg_gen_xor_vec(vece, rd, rd, rn);
-}
-
 /* Logic op (opcode == 3) subgroup of C3.6.16. */
 static void disas_simd_3same_logic(DisasContext *s, uint32_t insn)
 {
-    static const GVecGen3 bsl_op = {
-        .fni8 = gen_bsl_i64,
-        .fniv = gen_bsl_vec,
-        .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-        .load_dest = true
-    };
-    static const GVecGen3 bit_op = {
-        .fni8 = gen_bit_i64,
-        .fniv = gen_bit_vec,
-        .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-        .load_dest = true
-    };
-    static const GVecGen3 bif_op = {
-        .fni8 = gen_bif_i64,
-        .fniv = gen_bif_vec,
-        .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-        .load_dest = true
-    };
-
     int rd = extract32(insn, 0, 5);
     int rn = extract32(insn, 5, 5);
     int rm = extract32(insn, 16, 5);
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
     return 0;
 }
 
-/* Bitwise select.  dest = c ? t : f.  Clobbers T and F.  */
-static void gen_neon_bsl(TCGv_i32 dest, TCGv_i32 t, TCGv_i32 f, TCGv_i32 c)
-{
-    tcg_gen_and_i32(t, t, c);
-    tcg_gen_andc_i32(f, f, c);
-    tcg_gen_or_i32(dest, t, f);
-}
-
 static inline void gen_neon_narrow(int size, TCGv_i32 dest, TCGv_i64 src)
 {
     switch (size) {
@@ -XXX,XX +XXX,XX @@ static int do_v81_helper(DisasContext *s, gen_helper_gvec_3_ptr *fn,
     return 1;
 }
 
+/*
+ * Expanders for VBitOps_VBIF, VBIT, VBSL.
+ */
+static void gen_bsl_i64(TCGv_i64 rd, TCGv_i64 rn, TCGv_i64 rm)
+{
+    tcg_gen_xor_i64(rn, rn, rm);
+    tcg_gen_and_i64(rn, rn, rd);
+    tcg_gen_xor_i64(rd, rm, rn);
+}
+
+static void gen_bit_i64(TCGv_i64 rd, TCGv_i64 rn, TCGv_i64 rm)
+{
+    tcg_gen_xor_i64(rn, rn, rd);
+    tcg_gen_and_i64(rn, rn, rm);
+    tcg_gen_xor_i64(rd, rd, rn);
+}
+
+static void gen_bif_i64(TCGv_i64 rd, TCGv_i64 rn, TCGv_i64 rm)
+{
+    tcg_gen_xor_i64(rn, rn, rd);
+    tcg_gen_andc_i64(rn, rn, rm);
+    tcg_gen_xor_i64(rd, rd, rn);
+}
+
+static void gen_bsl_vec(unsigned vece, TCGv_vec rd, TCGv_vec rn, TCGv_vec rm)
+{
+    tcg_gen_xor_vec(vece, rn, rn, rm);
+    tcg_gen_and_vec(vece, rn, rn, rd);
+    tcg_gen_xor_vec(vece, rd, rm, rn);
+}
+
+static void gen_bit_vec(unsigned vece, TCGv_vec rd, TCGv_vec rn, TCGv_vec rm)
+{
+    tcg_gen_xor_vec(vece, rn, rn, rd);
+    tcg_gen_and_vec(vece, rn, rn, rm);
+    tcg_gen_xor_vec(vece, rd, rd, rn);
+}
+
+static void gen_bif_vec(unsigned vece, TCGv_vec rd, TCGv_vec rn, TCGv_vec rm)
+{
+    tcg_gen_xor_vec(vece, rn, rn, rd);
+    tcg_gen_andc_vec(vece, rn, rn, rm);
+    tcg_gen_xor_vec(vece, rd, rd, rn);
+}
+
+const GVecGen3 bsl_op = {
+    .fni8 = gen_bsl_i64,
+    .fniv = gen_bsl_vec,
+    .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+    .load_dest = true
+};
+
+const GVecGen3 bit_op = {
+    .fni8 = gen_bit_i64,
+    .fniv = gen_bit_vec,
+    .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+    .load_dest = true
+};
+
+const GVecGen3 bif_op = {
+    .fni8 = gen_bif_i64,
+    .fniv = gen_bif_vec,
+    .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+    .load_dest = true
+};
+
+
 /* Translate a NEON data processing instruction.  Return nonzero if the
    instruction is invalid.
    We process data in a mixture of 32-bit and 64-bit chunks.
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
 {
     int op;
     int q;
-    int rd, rn, rm;
+    int rd, rn, rm, rd_ofs, rn_ofs, rm_ofs;
     int size;
     int shift;
     int pass;
     int count;
     int pairwise;
     int u;
+    int vec_size;
     uint32_t imm, mask;
     TCGv_i32 tmp, tmp2, tmp3, tmp4, tmp5;
     TCGv_ptr ptr1, ptr2, ptr3;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
     VFP_DREG_N(rn, insn);
     VFP_DREG_M(rm, insn);
     size = (insn >> 20) & 3;
+    vec_size = q ? 16 : 8;
+    rd_ofs = neon_reg_offset(rd, 0);
+    rn_ofs = neon_reg_offset(rn, 0);
+    rm_ofs = neon_reg_offset(rm, 0);
+
     if ((insn & (1 << 23)) == 0) {
         /* Three register same length.  */
         op = ((insn >> 7) & 0x1e) | ((insn >> 4) & 1);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                                      q, rd, rn, rm);
             }
             return 1;
+
+        case NEON_3R_LOGIC: /* Logic ops.  */
+            switch ((u << 2) | size) {
+            case 0: /* VAND */
+                tcg_gen_gvec_and(0, rd_ofs, rn_ofs, rm_ofs,
+                                 vec_size, vec_size);
+                break;
+            case 1: /* VBIC */
+                tcg_gen_gvec_andc(0, rd_ofs, rn_ofs, rm_ofs,
+                                  vec_size, vec_size);
+                break;
+            case 2:
+                if (rn == rm) {
+                    /* VMOV */
+                    tcg_gen_gvec_mov(0, rd_ofs, rn_ofs, vec_size, vec_size);
+                } else {
+                    /* VORR */
+                    tcg_gen_gvec_or(0, rd_ofs, rn_ofs, rm_ofs,
+                                    vec_size, vec_size);
+                }
+                break;
+            case 3: /* VORN */
+                tcg_gen_gvec_orc(0, rd_ofs, rn_ofs, rm_ofs,
+                                 vec_size, vec_size);
+                break;
+            case 4: /* VEOR */
+                tcg_gen_gvec_xor(0, rd_ofs, rn_ofs, rm_ofs,
+                                 vec_size, vec_size);
+                break;
+            case 5: /* VBSL */
+                tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,
+                               vec_size, vec_size, &bsl_op);
+                break;
+            case 6: /* VBIT */
+                tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,
+                               vec_size, vec_size, &bit_op);
+                break;
+            case 7: /* VBIF */
+                tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,
+                               vec_size, vec_size, &bif_op);
+                break;
+            }
+            return 0;
         }
-        if (size == 3 && op != NEON_3R_LOGIC) {
+        if (size == 3) {
             /* 64-bit element instructions. */
             for (pass = 0; pass < (q ? 2 : 1); pass++) {
                 neon_load_reg64(cpu_V0, rn + pass);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
         case NEON_3R_VRHADD:
             GEN_NEON_INTEGER_OP(rhadd);
             break;
-        case NEON_3R_LOGIC: /* Logic ops.  */
-            switch ((u << 2) | size) {
-            case 0: /* VAND */
-                tcg_gen_and_i32(tmp, tmp, tmp2);
-                break;
-            case 1: /* BIC */
-                tcg_gen_andc_i32(tmp, tmp, tmp2);
-                break;
-            case 2: /* VORR */
-                tcg_gen_or_i32(tmp, tmp, tmp2);
-                break;
-            case 3: /* VORN */
-                tcg_gen_orc_i32(tmp, tmp, tmp2);
-                break;
-            case 4: /* VEOR */
-                tcg_gen_xor_i32(tmp, tmp, tmp2);
-                break;
-            case 5: /* VBSL */
-                tmp3 = neon_load_reg(rd, pass);
-                gen_neon_bsl(tmp, tmp, tmp2, tmp3);
-                tcg_temp_free_i32(tmp3);
-                break;
-            case 6: /* VBIT */
-                tmp3 = neon_load_reg(rd, pass);
-                gen_neon_bsl(tmp, tmp, tmp3, tmp2);
-                tcg_temp_free_i32(tmp3);
-                break;
-            case 7: /* VBIF */
-                tmp3 = neon_load_reg(rd, pass);
-                gen_neon_bsl(tmp, tmp3, tmp, tmp2);
-                tcg_temp_free_i32(tmp3);
-                break;
-            }
-            break;
         case NEON_3R_VHSUB:
             GEN_NEON_INTEGER_OP(hsub);
             break;
-- 
2.19.1

From: Richard Henderson <richard.henderson@linaro.org>

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181011205206.3552-10-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate.c | 29 ++++++++++-------------------
 1 file changed, 10 insertions(+), 19 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                 break;
             }
             return 0;
+
+        case NEON_3R_VADD_VSUB:
+            if (u) {
+                tcg_gen_gvec_sub(size, rd_ofs, rn_ofs, rm_ofs,
+                                 vec_size, vec_size);
+            } else {
+                tcg_gen_gvec_add(size, rd_ofs, rn_ofs, rm_ofs,
+                                 vec_size, vec_size);
+            }
+            return 0;
         }
         if (size == 3) {
             /* 64-bit element instructions. */
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                                                   cpu_V1, cpu_V0);
                     }
                     break;
-                case NEON_3R_VADD_VSUB:
-                    if (u) {
-                        tcg_gen_sub_i64(CPU_V001);
-                    } else {
-                        tcg_gen_add_i64(CPU_V001);
-                    }
-                    break;
                 default:
                     abort();
                 }
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
             tmp2 = neon_load_reg(rd, pass);
             gen_neon_add(size, tmp, tmp2);
             break;
-        case NEON_3R_VADD_VSUB:
-            if (!u) { /* VADD */
-                gen_neon_add(size, tmp, tmp2);
-            } else { /* VSUB */
-                switch (size) {
-                case 0: gen_helper_neon_sub_u8(tmp, tmp, tmp2); break;
-                case 1: gen_helper_neon_sub_u16(tmp, tmp, tmp2); break;
-                case 2: tcg_gen_sub_i32(tmp, tmp, tmp2); break;
-                default: abort();
-                }
-            }
-            break;
         case NEON_3R_VTST_VCEQ:
             if (!u) { /* VTST */
                 switch (size) {
-- 
2.19.1

From: Richard Henderson <richard.henderson@linaro.org>

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181011205206.3552-11-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                     tcg_temp_free_ptr(ptr1);
                     tcg_temp_free_ptr(ptr2);
                     break;
+
+                case NEON_2RM_VMVN:
+                    tcg_gen_gvec_not(0, rd_ofs, rm_ofs, vec_size, vec_size);
+                    break;
+                case NEON_2RM_VNEG:
+                    tcg_gen_gvec_neg(size, rd_ofs, rm_ofs, vec_size, vec_size);
+                    break;
+
                 default:
                 elementwise:
                     for (pass = 0; pass < (q ? 4 : 2); pass++) {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                         case NEON_2RM_VCNT:
                             gen_helper_neon_cnt_u8(tmp, tmp);
                             break;
-                        case NEON_2RM_VMVN:
-                            tcg_gen_not_i32(tmp, tmp);
-                            break;
                         case NEON_2RM_VQABS:
                             switch (size) {
                             case 0:
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                             default: abort();
                             }
                             break;
-                        case NEON_2RM_VNEG:
-                            tmp2 = tcg_const_i32(0);
-                            gen_neon_rsb(size, tmp, tmp2);
-                            tcg_temp_free_i32(tmp2);
-                            break;
                         case NEON_2RM_VCGT0_F:
                         {
                             TCGv_ptr fpstatus = get_fpstatus_ptr(1);
-- 
2.19.1

From: Richard Henderson <richard.henderson@linaro.org>

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181011205206.3552-12-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate.c | 31 +++++++++++++++----------------
 1 file changed, 15 insertions(+), 16 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                                  vec_size, vec_size);
             }
             return 0;
+
+        case NEON_3R_VMUL: /* VMUL */
+            if (u) {
+                /* Polynomial case allows only P8 and is handled below.  */
+                if (size != 0) {
+                    return 1;
+                }
+            } else {
+                tcg_gen_gvec_mul(size, rd_ofs, rn_ofs, rm_ofs,
+                                 vec_size, vec_size);
+                return 0;
+            }
+            break;
         }
         if (size == 3) {
             /* 64-bit element instructions. */
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                 return 1;
             }
             break;
-        case NEON_3R_VMUL:
-            if (u && (size != 0)) {
-                /* UNDEF on invalid size for polynomial subcase */
-                return 1;
-            }
-            break;
         case NEON_3R_VFM_VQRDMLSH:
             if (!arm_dc_feature(s, ARM_FEATURE_VFP4)) {
                 return 1;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
             }
             break;
         case NEON_3R_VMUL:
-            if (u) { /* polynomial */
-                gen_helper_neon_mul_p8(tmp, tmp, tmp2);
-            } else { /* Integer */
-                switch (size) {
-                case 0: gen_helper_neon_mul_u8(tmp, tmp, tmp2); break;
-                case 1: gen_helper_neon_mul_u16(tmp, tmp, tmp2); break;
-                case 2: tcg_gen_mul_i32(tmp, tmp, tmp2); break;
-                default: abort();
-                }
-            }
+            /* VMUL.P8; other cases already eliminated.  */
+            gen_helper_neon_mul_p8(tmp, tmp, tmp2);
             break;
         case NEON_3R_VPMAX:
             GEN_NEON_INTEGER_OP(pmax);
-- 
2.19.1

From: Richard Henderson <richard.henderson@linaro.org>

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181011205206.3552-13-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
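Reviewer note (not part of the commit): the out-of-range handling that
the diff below adds for VSHR and VSHL can be modelled on a single 8-bit
lane. This is only a sketch under stated assumptions. The model_*
helpers are hypothetical names, not QEMU functions, and the 8-bit lane
width is chosen purely for illustration.

```c
#include <stdint.h>

/* Hypothetical scalar model of signed VSHR on one 8-bit lane.
 * The shift count is clamped to esize - 1, as MIN(shift, (8 << size) - 1)
 * does in the patch, so a shift of 8 yields all sign bits.
 */
static int8_t model_vshr_s8(int8_t x, int shift)
{
    int n = shift < 7 ? shift : 7;
    int v = x;

    /* Portable arithmetic right shift: never shift a negative value. */
    return (int8_t)(v < 0 ? ~(~v >> n) : v >> n);
}

/* Hypothetical scalar model of unsigned VSHR on one 8-bit lane.
 * A shift of the full element size produces zero.
 */
static uint8_t model_vshr_u8(uint8_t x, int shift)
{
    if (shift >= 8) {
        return 0;
    }
    return (uint8_t)(x >> shift);
}

/* Hypothetical scalar model of VSHL on one 8-bit lane, with the same
 * "shift >= element size gives zero" guard as the patch.
 */
static uint8_t model_vshl_u8(uint8_t x, int shift)
{
    if (shift >= 8) {
        return 0;
    }
    return (uint8_t)(x << shift);
}
```

The signed case clamps rather than special-casing because an arithmetic
shift by esize - 1 already yields the same all-sign-bits result as any
larger shift.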
 target/arm/translate.c | 70 +++++++++++++++++++++++++++++-------------
 1 file changed, 48 insertions(+), 22 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                     size--;
             }
             shift = (insn >> 16) & ((1 << (3 + size)) - 1);
-            /* To avoid excessive duplication of ops we implement shift
-               by immediate using the variable shift operations.  */
             if (op < 8) {
                 /* Shift by immediate:
                    VSHR, VSRA, VRSHR, VRSRA, VSRI, VSHL, VQSHL, VQSHLU.  */
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                 }
                 /* Right shifts are encoded as N - shift, where N is the
                    element size in bits.  */
-                if (op <= 4)
+                if (op <= 4) {
                     shift = shift - (1 << (size + 3));
+                }
+
+                switch (op) {
+                case 0:  /* VSHR */
+                    /* Right shift comes here negative.  */
+                    shift = -shift;
+                    /* Shifts larger than the element size are architecturally
+                     * valid.  Unsigned results in all zeros; signed results
+                     * in all sign bits.
+                     */
+                    if (!u) {
+                        tcg_gen_gvec_sari(size, rd_ofs, rm_ofs,
+                                          MIN(shift, (8 << size) - 1),
+                                          vec_size, vec_size);
+                    } else if (shift >= 8 << size) {
+                        tcg_gen_gvec_dup8i(rd_ofs, vec_size, vec_size, 0);
+                    } else {
+                        tcg_gen_gvec_shri(size, rd_ofs, rm_ofs, shift,
+                                          vec_size, vec_size);
+                    }
+                    return 0;
+
+                case 5: /* VSHL, VSLI */
+                    if (!u) { /* VSHL */
+                        /* Shifts larger than the element size are
+                         * architecturally valid and results in zero.
+                         */
+                        if (shift >= 8 << size) {
+                            tcg_gen_gvec_dup8i(rd_ofs, vec_size, vec_size, 0);
+                        } else {
+                            tcg_gen_gvec_shli(size, rd_ofs, rm_ofs, shift,
+                                              vec_size, vec_size);
+                        }
+                        return 0;
+                    }
+                    break;
+                }
+
                 if (size == 3) {
                     count = q + 1;
                 } else {
                     count = q ? 4: 2;
                 }
-                switch (size) {
-                case 0:
-                    imm = (uint8_t) shift;
-                    imm |= imm << 8;
-                    imm |= imm << 16;
-                    break;
-                case 1:
-                    imm = (uint16_t) shift;
-                    imm |= imm << 16;
-                    break;
-                case 2:
-                case 3:
-                    imm = shift;
-                    break;
-                default:
-                    abort();
-                }
+
+                /* To avoid excessive duplication of ops we implement shift
+                 * by immediate using the variable shift operations.
+                 */
+                imm = dup_const(size, shift);
 
                 for (pass = 0; pass < count; pass++) {
                     if (size == 3) {
                         neon_load_reg64(cpu_V0, rm + pass);
                         tcg_gen_movi_i64(cpu_V1, imm);
                         switch (op) {
-                        case 0:  /* VSHR */
                         case 1:  /* VSRA */
                             if (u)
                                 gen_helper_neon_shl_u64(cpu_V0, cpu_V0, cpu_V1);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                                                          cpu_V0, cpu_V1);
                             }
                             break;
+                        default:
+                            g_assert_not_reached();
                         }
                         if (op == 1 || op == 3) {
                             /* Accumulate.  */
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                         tmp2 = tcg_temp_new_i32();
                         tcg_gen_movi_i32(tmp2, imm);
                         switch (op) {
-                        case 0:  /* VSHR */
                         case 1:  /* VSRA */
                             GEN_NEON_INTEGER_OP(shl);
                             break;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                         case 7: /* VQSHL */
                             GEN_NEON_INTEGER_OP_ENV(qshl);
                             break;
+                        default:
+                            g_assert_not_reached();
                         }
                         tcg_temp_free_i32(tmp2);
 
-- 
2.19.1

From: Richard Henderson <richard.henderson@linaro.org>

Move ssra_op and usra_op expanders from translate-a64.c.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181011205206.3552-14-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate.h     |   2 +
 target/arm/translate-a64.c | 106 ----------------------------
 target/arm/translate.c     | 139 ++++++++++++++++++++++++++++++++++---
 3 files changed, 130 insertions(+), 117 deletions(-)

From: Richard Henderson <richard.henderson@linaro.org>

Move the sri_op and shi_op expanders from translate-a64.c,
renaming shi_op to sli_op.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181011205206.3552-15-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
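Reviewer note (not part of the commit): the shift-and-insert operation
that the gen_shr8_ins_i64 and gen_shl8_ins_i64 helpers perform on packed
bytes can be modelled on one 8-bit lane. This is only a sketch.
model_sri8 and model_sli8 are hypothetical names, and the single-lane
form is an assumption made for readability.

```c
#include <stdint.h>

/* Hypothetical one-lane model of SRI (shift right and insert).
 * The shifted source replaces the low (8 - shift) bits of d; the top
 * `shift` bits of d are preserved.  With shift == 8 the mask is zero,
 * so d is returned unchanged.
 */
static uint8_t model_sri8(uint8_t d, uint8_t a, int shift)
{
    uint8_t mask = (uint8_t)(0xff >> shift);

    return (uint8_t)((d & ~mask) | ((a >> shift) & mask));
}

/* Hypothetical one-lane model of SLI (shift left and insert).
 * The shifted source replaces the high (8 - shift) bits of d; the low
 * `shift` bits of d are preserved.  With shift == 0 the result is a.
 */
static uint8_t model_sli8(uint8_t d, uint8_t a, int shift)
{
    uint8_t mask = (uint8_t)(0xff << shift);

    return (uint8_t)((d & ~mask) | ((a << shift) & mask));
}
```

For example, model_sri8(0xAB, 0xFF, 4) keeps the high nibble 0xA of the
destination and inserts 0xF below it, giving 0xAF.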
 target/arm/translate.h     |   2 +
 target/arm/translate-a64.c | 152 +----------------------
 target/arm/translate.c     | 244 ++++++++++++++++++++++++++-----------
 3 files changed, 179 insertions(+), 219 deletions(-)

diff --git a/target/arm/translate.h b/target/arm/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ extern const GVecGen3 bit_op;
 extern const GVecGen3 bif_op;
 extern const GVecGen2i ssra_op[4];
 extern const GVecGen2i usra_op[4];
+extern const GVecGen2i sri_op[4];
+extern const GVecGen2i sli_op[4];
 
 /*
  * Forward to the isar_feature_* tests given a DisasContext pointer.
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_two_reg_misc(DisasContext *s, uint32_t insn)
     }
 }
 
-static void gen_shr8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
-{
-    uint64_t mask = dup_const(MO_8, 0xff >> shift);
-    TCGv_i64 t = tcg_temp_new_i64();
-
-    tcg_gen_shri_i64(t, a, shift);
-    tcg_gen_andi_i64(t, t, mask);
-    tcg_gen_andi_i64(d, d, ~mask);
-    tcg_gen_or_i64(d, d, t);
-    tcg_temp_free_i64(t);
-}
-
-static void gen_shr16_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
-{
-    uint64_t mask = dup_const(MO_16, 0xffff >> shift);
-    TCGv_i64 t = tcg_temp_new_i64();
-
-    tcg_gen_shri_i64(t, a, shift);
-    tcg_gen_andi_i64(t, t, mask);
-    tcg_gen_andi_i64(d, d, ~mask);
-    tcg_gen_or_i64(d, d, t);
-    tcg_temp_free_i64(t);
-}
-
-static void gen_shr32_ins_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift)
-{
-    tcg_gen_shri_i32(a, a, shift);
-    tcg_gen_deposit_i32(d, d, a, 0, 32 - shift);
-}
-
-static void gen_shr64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
-{
-    tcg_gen_shri_i64(a, a, shift);
-    tcg_gen_deposit_i64(d, d, a, 0, 64 - shift);
-}
-
-static void gen_shr_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
-{
-    uint64_t mask = (2ull << ((8 << vece) - 1)) - 1;
-    TCGv_vec t = tcg_temp_new_vec_matching(d);
-    TCGv_vec m = tcg_temp_new_vec_matching(d);
-
-    tcg_gen_dupi_vec(vece, m, mask ^ (mask >> sh));
-    tcg_gen_shri_vec(vece, t, a, sh);
-    tcg_gen_and_vec(vece, d, d, m);
-    tcg_gen_or_vec(vece, d, d, t);
-
-    tcg_temp_free_vec(t);
-    tcg_temp_free_vec(m);
-}
-
 /* SSHR[RA]/USHR[RA] - Vector shift right (optional rounding/accumulate) */
 static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
                                  int immh, int immb, int opcode, int rn, int rd)
 {
-    static const GVecGen2i sri_op[4] = {
-        { .fni8 = gen_shr8_ins_i64,
-          .fniv = gen_shr_ins_vec,
-          .load_dest = true,
-          .opc = INDEX_op_shri_vec,
-          .vece = MO_8 },
-        { .fni8 = gen_shr16_ins_i64,
-          .fniv = gen_shr_ins_vec,
-          .load_dest = true,
-          .opc = INDEX_op_shri_vec,
-          .vece = MO_16 },
-        { .fni4 = gen_shr32_ins_i32,
-          .fniv = gen_shr_ins_vec,
-          .load_dest = true,
-          .opc = INDEX_op_shri_vec,
-          .vece = MO_32 },
-        { .fni8 = gen_shr64_ins_i64,
-          .fniv = gen_shr_ins_vec,
-          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-          .load_dest = true,
-          .opc = INDEX_op_shri_vec,
-          .vece = MO_64 },
-    };
-
     int size = 32 - clz32(immh) - 1;
     int immhb = immh << 3 | immb;
     int shift = 2 * (8 << size) - immhb;
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
     clear_vec_high(s, is_q, rd);
 }
 
-static void gen_shl8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
-{
-    uint64_t mask = dup_const(MO_8, 0xff << shift);
-    TCGv_i64 t = tcg_temp_new_i64();
-
-    tcg_gen_shli_i64(t, a, shift);
-    tcg_gen_andi_i64(t, t, mask);
-    tcg_gen_andi_i64(d, d, ~mask);
-    tcg_gen_or_i64(d, d, t);
-    tcg_temp_free_i64(t);
-}
-
-static void gen_shl16_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
-{
-    uint64_t mask = dup_const(MO_16, 0xffff << shift);
-    TCGv_i64 t = tcg_temp_new_i64();
-
-    tcg_gen_shli_i64(t, a, shift);
-    tcg_gen_andi_i64(t, t, mask);
-    tcg_gen_andi_i64(d, d, ~mask);
-    tcg_gen_or_i64(d, d, t);
-    tcg_temp_free_i64(t);
-}
-
-static void gen_shl32_ins_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift)
-{
-    tcg_gen_deposit_i32(d, d, a, shift, 32 - shift);
-}
-
-static void gen_shl64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
-{
-    tcg_gen_deposit_i64(d, d, a, shift, 64 - shift);
-}
-
-static void gen_shl_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
-{
-    uint64_t mask = (1ull << sh) - 1;
-    TCGv_vec t = tcg_temp_new_vec_matching(d);
-    TCGv_vec m = tcg_temp_new_vec_matching(d);
-
-    tcg_gen_dupi_vec(vece, m, mask);
-    tcg_gen_shli_vec(vece, t, a, sh);
-    tcg_gen_and_vec(vece, d, d, m);
-    tcg_gen_or_vec(vece, d, d, t);
-
-    tcg_temp_free_vec(t);
-    tcg_temp_free_vec(m);
-}
-
 /* SHL/SLI - Vector shift left */
 static void handle_vec_simd_shli(DisasContext *s, bool is_q, bool insert,
                                  int immh, int immb, int opcode, int rn, int rd)
 {
-    static const GVecGen2i shi_op[4] = {
-        { .fni8 = gen_shl8_ins_i64,
-          .fniv = gen_shl_ins_vec,
-          .opc = INDEX_op_shli_vec,
-          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-          .load_dest = true,
-          .vece = MO_8 },
-        { .fni8 = gen_shl16_ins_i64,
-          .fniv = gen_shl_ins_vec,
-          .opc = INDEX_op_shli_vec,
-          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-          .load_dest = true,
-          .vece = MO_16 },
-        { .fni4 = gen_shl32_ins_i32,
-          .fniv = gen_shl_ins_vec,
-          .opc = INDEX_op_shli_vec,
-          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-          .load_dest = true,
-          .vece = MO_32 },
-        { .fni8 = gen_shl64_ins_i64,
-          .fniv = gen_shl_ins_vec,
-          .opc = INDEX_op_shli_vec,
-          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-          .load_dest = true,
-          .vece = MO_64 },
-    };
     int size = 32 - clz32(immh) - 1;
     int immhb = immh << 3 | immb;
     int shift = immhb - (8 << size);
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shli(DisasContext *s, bool is_q, bool insert,
     }
 
     if (insert) {
-        gen_gvec_op2i(s, is_q, rd, rn, shift, &shi_op[size]);
+        gen_gvec_op2i(s, is_q, rd, rn, shift, &sli_op[size]);
     } else {
         gen_gvec_fn2i(s, is_q, rd, rn, shift, tcg_gen_gvec_shli, size);
     }
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ const GVecGen2i usra_op[4] = {
       .vece = MO_64, },
 };
 
+static void gen_shr8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
+{
+    uint64_t mask = dup_const(MO_8, 0xff >> shift);
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_shri_i64(t, a, shift);
+    tcg_gen_andi_i64(t, t, mask);
+    tcg_gen_andi_i64(d, d, ~mask);
+    tcg_gen_or_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_shr16_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
+{
+    uint64_t mask = dup_const(MO_16, 0xffff >> shift);
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_shri_i64(t, a, shift);
+    tcg_gen_andi_i64(t, t, mask);
+    tcg_gen_andi_i64(d, d, ~mask);
+    tcg_gen_or_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_shr32_ins_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift)
+{
+    tcg_gen_shri_i32(a, a, shift);
+    tcg_gen_deposit_i32(d, d, a, 0, 32 - shift);
+}
+
+static void gen_shr64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
+{
+    tcg_gen_shri_i64(a, a, shift);
+    tcg_gen_deposit_i64(d, d, a, 0, 64 - shift);
+}
+
+static void gen_shr_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
+{
+    if (sh == 0) {
+        tcg_gen_mov_vec(d, a);
+    } else {
+        TCGv_vec t = tcg_temp_new_vec_matching(d);
+        TCGv_vec m = tcg_temp_new_vec_matching(d);
+
+        tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK((8 << vece) - sh, sh));
+        tcg_gen_shri_vec(vece, t, a, sh);
+        tcg_gen_and_vec(vece, d, d, m);
+        tcg_gen_or_vec(vece, d, d, t);
+
+        tcg_temp_free_vec(t);
+        tcg_temp_free_vec(m);
+    }
+}
+
+const GVecGen2i sri_op[4] = {
+    { .fni8 = gen_shr8_ins_i64,
+      .fniv = gen_shr_ins_vec,
+      .load_dest = true,
+      .opc = INDEX_op_shri_vec,
+      .vece = MO_8 },
+    { .fni8 = gen_shr16_ins_i64,
+      .fniv = gen_shr_ins_vec,
+      .load_dest = true,
+      .opc = INDEX_op_shri_vec,
+      .vece = MO_16 },
+    { .fni4 = gen_shr32_ins_i32,
+      .fniv = gen_shr_ins_vec,
+      .load_dest = true,
+      .opc = INDEX_op_shri_vec,
+      .vece = MO_32 },
+    { .fni8 = gen_shr64_ins_i64,
+      .fniv = gen_shr_ins_vec,
+      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+      .load_dest = true,
+      .opc = INDEX_op_shri_vec,
+      .vece = MO_64 },
+};
+
+static void gen_shl8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
+{
+    uint64_t mask = dup_const(MO_8, 0xff << shift);
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_shli_i64(t, a, shift);
+    tcg_gen_andi_i64(t, t, mask);
+    tcg_gen_andi_i64(d, d, ~mask);
+    tcg_gen_or_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_shl16_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
+{
+    uint64_t mask = dup_const(MO_16, 0xffff << shift);
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_shli_i64(t, a, shift);
+    tcg_gen_andi_i64(t, t, mask);
+    tcg_gen_andi_i64(d, d, ~mask);
+    tcg_gen_or_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_shl32_ins_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift)
+{
+    tcg_gen_deposit_i32(d, d, a, shift, 32 - shift);
+}
+
+static void gen_shl64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
+{
+    tcg_gen_deposit_i64(d, d, a, shift, 64 - shift);
+}
+
+static void gen_shl_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
+{
+    if (sh == 0) {
+        tcg_gen_mov_vec(d, a);
+    } else {
+        TCGv_vec t = tcg_temp_new_vec_matching(d);
+        TCGv_vec m = tcg_temp_new_vec_matching(d);
+
+        tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK(0, sh));
+        tcg_gen_shli_vec(vece, t, a, sh);
+        tcg_gen_and_vec(vece, d, d, m);
+        tcg_gen_or_vec(vece, d, d, t);
+
+        tcg_temp_free_vec(t);
+        tcg_temp_free_vec(m);
+    }
+}
+
+const GVecGen2i sli_op[4] = {
+    { .fni8 = gen_shl8_ins_i64,
+      .fniv = gen_shl_ins_vec,
+      .load_dest = true,
+      .opc = INDEX_op_shli_vec,
+      .vece = MO_8 },
+    { .fni8 = gen_shl16_ins_i64,
+      .fniv = gen_shl_ins_vec,
+      .load_dest = true,
+      .opc = INDEX_op_shli_vec,
+      .vece = MO_16 },
+    { .fni4 = gen_shl32_ins_i32,
+      .fniv = gen_shl_ins_vec,
+      .load_dest = true,
+      .opc = INDEX_op_shli_vec,
+      .vece = MO_32 },
+    { .fni8 = gen_shl64_ins_i64,
+      .fniv = gen_shl_ins_vec,
+      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+      .load_dest = true,
+      .opc = INDEX_op_shli_vec,
+      .vece = MO_64 },
+};
+
 /* Translate a NEON data processing instruction.  Return nonzero if the
    instruction is invalid.
    We process data in a mixture of 32-bit and 64-bit chunks.
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
     int pairwise;
     int u;
     int vec_size;
-    uint32_t imm, mask;
+    uint32_t imm;
     TCGv_i32 tmp, tmp2, tmp3, tmp4, tmp5;
     TCGv_ptr ptr1, ptr2, ptr3;
     TCGv_i64 tmp64;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                     }
                     return 0;
 
+                case 4: /* VSRI */
+                    if (!u) {
+                        return 1;
+                    }
+                    /* Right shift comes here negative.  */
+                    shift = -shift;
+                    /* Shift out of range leaves destination unchanged.  */
+                    if (shift < 8 << size) {
+                        tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, vec_size,
+                                        shift, &sri_op[size]);
+                    }
+                    return 0;
+
                 case 5: /* VSHL, VSLI */
-                    if (!u) { /* VSHL */
+                    if (u) { /* VSLI */
+                        /* Shift out of range leaves destination unchanged.  */
+                        if (shift < 8 << size) {
+                            tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size,
+                                            vec_size, shift, &sli_op[size]);
+                        }
+                    } else { /* VSHL */
                         /* Shifts larger than the element size are
                          * architecturally valid and results in zero.
                          */
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                             tcg_gen_gvec_shli(size, rd_ofs, rm_ofs, shift,
                                               vec_size, vec_size);
                         }
-                        return 0;
                     }
-                    break;
+                    return 0;
                 }
 
                 if (size == 3) {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                             else
                                 gen_helper_neon_rshl_s64(cpu_V0, cpu_V0, cpu_V1);
                             break;
-                        case 4: /* VSRI */
-                        case 5: /* VSHL, VSLI */
-                            gen_helper_neon_shl_u64(cpu_V0, cpu_V0, cpu_V1);
-                            break;
                         case 6: /* VQSHLU */
                             gen_helper_neon_qshlu_s64(cpu_V0, cpu_env,
                                                       cpu_V0, cpu_V1);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                             /* Accumulate.  */
                             neon_load_reg64(cpu_V1, rd + pass);
                             tcg_gen_add_i64(cpu_V0, cpu_V0, cpu_V1);
-                        } else if (op == 4 || (op == 5 && u)) {
-                            /* Insert */
-                            neon_load_reg64(cpu_V1, rd + pass);
-                            uint64_t mask;
-                            if (shift < -63 || shift > 63) {
-                                mask = 0;
-                            } else {
-                                if (op == 4) {
-                                    mask = 0xffffffffffffffffull >> -shift;
-                                } else {
-                                    mask = 0xffffffffffffffffull << shift;
-                                }
-                            }
-                            tcg_gen_andi_i64(cpu_V1, cpu_V1, ~mask);
-                            tcg_gen_or_i64(cpu_V0, cpu_V0, cpu_V1);
                         }
                         neon_store_reg64(cpu_V0, rd + pass);
                     } else { /* size < 3 */
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                         case 3: /* VRSRA */
                             GEN_NEON_INTEGER_OP(rshl);
                             break;
-                        case 4: /* VSRI */
-                        case 5: /* VSHL, VSLI */
-                            switch (size) {
-                            case 0: gen_helper_neon_shl_u8(tmp, tmp, tmp2); break;
-                            case 1: gen_helper_neon_shl_u16(tmp, tmp, tmp2); break;
-                            case 2: gen_helper_neon_shl_u32(tmp, tmp, tmp2); break;
-                            default: abort();
-                            }
-                            break;
                         case 6: /* VQSHLU */
                             switch (size) {
                             case 0:
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                             tmp2 = neon_load_reg(rd, pass);
                             gen_neon_add(size, tmp, tmp2);
                             tcg_temp_free_i32(tmp2);
-                        } else if (op == 4 || (op == 5 && u)) {
-                            /* Insert */
-                            switch (size) {
-                            case 0:
-                                if (op == 4)
-                                    mask = 0xff >> -shift;
-                                else
-                                    mask = (uint8_t)(0xff << shift);
-                                mask |= mask << 8;
-                                mask |= mask << 16;
-                                break;
-                            case 1:
-                                if (op == 4)
-                                    mask = 0xffff >> -shift;
-                                else
-                                    mask = (uint16_t)(0xffff << shift);
-                                mask |= mask << 16;
-                                break;
-                            case 2:
-                                if (shift < -31 || shift > 31) {
-                                    mask = 0;
-                                } else {
-                                    if (op == 4)
-                                        mask = 0xffffffffu >> -shift;
-                                    else
-                                        mask = 0xffffffffu << shift;
-                                }
-                                break;
-                            default:
-                                abort();
-                            }
-                            tmp2 = neon_load_reg(rd, pass);
-                            tcg_gen_andi_i32(tmp, tmp, mask);
-                            tcg_gen_andi_i32(tmp2, tmp2, ~mask);
-                            tcg_gen_or_i32(tmp, tmp, tmp2);
-                            tcg_temp_free_i32(tmp2);
                         }
                         neon_store_reg(rd, pass, tmp);
                     }
-- 
2.19.1

From: Richard Henderson <richard.henderson@linaro.org>

Move mla_op and mls_op expanders from translate-a64.c.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181011205206.3552-16-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate.h     |   2 +
 target/arm/translate-a64.c | 106 -----------------------------
 target/arm/translate.c     | 134 ++++++++++++++++++++++++++++++++-----
 3 files changed, 120 insertions(+), 122 deletions(-)

From: Richard Henderson <richard.henderson@linaro.org>

Move cmtst_op expanders from translate-a64.c.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181011205206.3552-17-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate.h     |  2 +
 target/arm/translate-a64.c | 38 ------------------
 target/arm/translate.c     | 81 +++++++++++++++++++++++++++-----------
 3 files changed, 60 insertions(+), 61 deletions(-)

From: Richard Henderson <richard.henderson@linaro.org>

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181011205206.3552-18-richard.henderson@linaro.org
[PMM: added parens in ?: expression]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
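Reviewer note (not part of the commit): the diff below derives the
register stride and the number of bytes written per loaded element from
bit 5 of the instruction, then replicates the element across that many
bytes. A plain-C model of that arithmetic is sketched here. The model_*
names are hypothetical, and the little-endian byte layout in model_dup
is an assumption for illustration only.

```c
#include <stdint.h>

/* Register stride selected by bit 5 of the instruction word. */
static int model_stride(uint32_t insn)
{
    return (insn & (1u << 5)) ? 2 : 1;
}

/* Bytes written per loaded element.  For VLD1 (nregs == 1), bit 5
 * selects one or two D registers; VLD2/3/4 always fill one D register
 * per element.
 */
static int model_vec_size(int nregs, int stride)
{
    return nregs == 1 ? stride * 8 : 8;
}

/* Replicate the low (1 << size) bytes of value across vec_size bytes.
 * size is 0, 1 or 2; bytes are stored little-endian (an assumption).
 */
static void model_dup(uint8_t *dst, int vec_size, int size, uint32_t value)
{
    int esize = 1 << size;

    for (int i = 0; i < vec_size; i += esize) {
        for (int b = 0; b < esize; b++) {
            dst[i + b] = (uint8_t)(value >> (8 * b));
        }
    }
}
```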
 target/arm/translate.c | 81 ++++++++++++++----------------------------
 1 file changed, 26 insertions(+), 55 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_vfp_msr(TCGv_i32 tmp)
     tcg_temp_free_i32(tmp);
 }
 
-static void gen_neon_dup_u8(TCGv_i32 var, int shift)
-{
-    TCGv_i32 tmp = tcg_temp_new_i32();
-    if (shift)
-        tcg_gen_shri_i32(var, var, shift);
-    tcg_gen_ext8u_i32(var, var);
-    tcg_gen_shli_i32(tmp, var, 8);
-    tcg_gen_or_i32(var, var, tmp);
-    tcg_gen_shli_i32(tmp, var, 16);
-    tcg_gen_or_i32(var, var, tmp);
-    tcg_temp_free_i32(tmp);
-}
-
 static void gen_neon_dup_low16(TCGv_i32 var)
 {
     TCGv_i32 tmp = tcg_temp_new_i32();
@@ -XXX,XX +XXX,XX @@ static void gen_neon_dup_high16(TCGv_i32 var)
     tcg_temp_free_i32(tmp);
 }
 
-static TCGv_i32 gen_load_and_replicate(DisasContext *s, TCGv_i32 addr, int size)
-{
-    /* Load a single Neon element and replicate into a 32 bit TCG reg */
-    TCGv_i32 tmp = tcg_temp_new_i32();
-    switch (size) {
-    case 0:
-        gen_aa32_ld8u(s, tmp, addr, get_mem_index(s));
-        gen_neon_dup_u8(tmp, 0);
-        break;
-    case 1:
-        gen_aa32_ld16u(s, tmp, addr, get_mem_index(s));
-        gen_neon_dup_low16(tmp);
-        break;
-    case 2:
-        gen_aa32_ld32u(s, tmp, addr, get_mem_index(s));
-        break;
-    default: /* Avoid compiler warnings.  */
-        abort();
-    }
-    return tmp;
-}
-
 static int handle_vsel(uint32_t insn, uint32_t rd, uint32_t rn, uint32_t rm,
                        uint32_t dp)
 {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
     int load;
     int shift;
     int n;
+    int vec_size;
     TCGv_i32 addr;
     TCGv_i32 tmp;
     TCGv_i32 tmp2;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
             }
             addr = tcg_temp_new_i32();
             load_reg_var(s, addr, rn);
-            if (nregs == 1) {
-                /* VLD1 to all lanes: bit 5 indicates how many Dregs to write */
-                tmp = gen_load_and_replicate(s, addr, size);
-                tcg_gen_st_i32(tmp, cpu_env, neon_reg_offset(rd, 0));
-                tcg_gen_st_i32(tmp, cpu_env, neon_reg_offset(rd, 1));
-                if (insn & (1 << 5)) {
-                    tcg_gen_st_i32(tmp, cpu_env, neon_reg_offset(rd + 1, 0));
-                    tcg_gen_st_i32(tmp, cpu_env, neon_reg_offset(rd + 1, 1));
-                }
-                tcg_temp_free_i32(tmp);
-            } else {
-                /* VLD2/3/4 to all lanes: bit 5 indicates register stride */
-                stride = (insn & (1 << 5)) ? 2 : 1;
-                for (reg = 0; reg < nregs; reg++) {
-                    tmp = gen_load_and_replicate(s, addr, size);
-                    tcg_gen_st_i32(tmp, cpu_env, neon_reg_offset(rd, 0));
-                    tcg_gen_st_i32(tmp, cpu_env, neon_reg_offset(rd, 1));
-                    tcg_temp_free_i32(tmp);
-                    tcg_gen_addi_i32(addr, addr, 1 << size);
-                    rd += stride;
+
+            /* VLD1 to all lanes: bit 5 indicates how many Dregs to write.
+             * VLD2/3/4 to all lanes: bit 5 indicates register stride.
+             */
+            stride = (insn & (1 << 5)) ? 2 : 1;
+            vec_size = nregs == 1 ? stride * 8 : 8;
+
+            tmp = tcg_temp_new_i32();
+            for (reg = 0; reg < nregs; reg++) {
+                gen_aa32_ld_i32(s, tmp, addr, get_mem_index(s),
+                                s->be_data | size);
+                if ((rd & 1) && vec_size == 16) {
+                    /* We cannot write 16 bytes at once because the
+                     * destination is unaligned.
+                     */
+                    tcg_gen_gvec_dup_i32(size, neon_reg_offset(rd, 0),
+                                         8, 8, tmp);
+                    tcg_gen_gvec_mov(0, neon_reg_offset(rd + 1, 0),
+                                     neon_reg_offset(rd, 0), 8, 8);
+                } else {
+                    tcg_gen_gvec_dup_i32(size, neon_reg_offset(rd, 0),
+                                         vec_size, vec_size, tmp);
                 }
+                tcg_gen_addi_i32(addr, addr, 1 << size);
+                rd += stride;
             }
+            tcg_temp_free_i32(tmp);
             tcg_temp_free_i32(addr);
             stride = (1 << size) * nregs;
         } else {
-- 
2.19.1

From: Richard Henderson <richard.henderson@linaro.org>

Instead of shifts and masks, use direct loads and stores from the neon
register file.  Mirror the iteration structure of the ARM pseudocode
more closely.  Correct the parameters of the VLD2 A2 insn.

Note that this includes a bugfix for handling of the insn
"VLD2 (multiple 2-element structures)" -- we were using an
incorrect stride value.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181011205206.3552-19-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
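Reviewer note (not part of the commit): the element-by-element
iteration that the commit message refers to can be illustrated with the
architectural behaviour of an interleaved load such as VLD2.8, where
consecutive memory elements go to alternating registers. The sketch
below models only that behaviour. model_vld_bytes, DREG_BYTES, and the
loop shape are assumptions and are not taken from the patch.

```c
#include <stdint.h>

enum { DREG_BYTES = 8 };   /* one D register holds 8 byte elements */

/* Hypothetical model of an interleaved byte load.  Element n of
 * register xs is read from memory offset n * interleave + xs, so
 * with interleave == 2 the first register receives the even-indexed
 * bytes and the second the odd-indexed bytes.
 */
static void model_vld_bytes(uint8_t regs[][DREG_BYTES], int interleave,
                            const uint8_t *mem)
{
    for (int n = 0; n < DREG_BYTES; n++) {
        for (int xs = 0; xs < interleave; xs++) {
            regs[xs][n] = *mem++;
        }
    }
}
```

Writing each element directly to its destination slot avoids the
shift-and-mask assembly of 32-bit words that the old code relied on.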
 target/arm/translate.c | 170 ++++++++++++++++++-----------------------
 1 file changed, 74 insertions(+), 96 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static TCGv_i32 neon_load_reg(int reg, int pass)
     return tmp;
 }
 
+static void neon_load_element64(TCGv_i64 var, int reg, int ele, TCGMemOp mop)
+{
+    long offset = neon_element_offset(reg, ele, mop & MO_SIZE);
+
+    switch (mop) {
+    case MO_UB:
+        tcg_gen_ld8u_i64(var, cpu_env, offset);
+        break;
+    case MO_UW:
+        tcg_gen_ld16u_i64(var, cpu_env, offset);
+        break;
+    case MO_UL:
+        tcg_gen_ld32u_i64(var, cpu_env, offset);
+        break;
+    case MO_Q:
+        tcg_gen_ld_i64(var, cpu_env, offset);
+        break;
+    default:
+        g_assert_not_reached();
+    }
+}
+
 static void neon_store_reg(int reg, int pass, TCGv_i32 var)
 {
     tcg_gen_st_i32(var, cpu_env, neon_reg_offset(reg, pass));
     tcg_temp_free_i32(var);
 }
 
+static void neon_store_element64(int reg, int ele, TCGMemOp size, TCGv_i64 var)
+{
+    long offset = neon_element_offset(reg, ele, size);
+
+    switch (size) {
+    case MO_8:
+        tcg_gen_st8_i64(var, cpu_env, offset);
+        break;
+    case MO_16:
+        tcg_gen_st16_i64(var, cpu_env, offset);
+        break;
+    case MO_32:
+        tcg_gen_st32_i64(var, cpu_env, offset);
+        break;
+    case MO_64:
+        tcg_gen_st_i64(var, cpu_env, offset);
+        break;
+    default:
+        g_assert_not_reached();
+    }
+}
+
 static inline void neon_load_reg64(TCGv_i64 var, int reg)
 {
     tcg_gen_ld_i64(var, cpu_env, vfp_reg_offset(1, reg));
@@ -XXX,XX +XXX,XX @@ static struct {
     int interleave;
     int spacing;
 } const neon_ls_element_type[11] = {
-    {4, 4, 1},
-    {4, 4, 2},
+    {1, 4, 1},
+    {1, 4, 2},
     {4, 1, 1},
-    {4, 2, 1},
-    {3, 3, 1},
-    {3, 3, 2},
+    {2, 2, 2},
+    {1, 3, 1},
+    {1, 3, 2},
     {3, 1, 1},
     {1, 1, 1},
-    {2, 2, 1},
-    {2, 2, 2},
+    {1, 2, 1},
+    {1, 2, 2},
     {2, 1, 1}
 };
 
@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
     int shift;
     int n;
     int vec_size;
+    int mmu_idx;
+    TCGMemOp endian;
     TCGv_i32 addr;
     TCGv_i32 tmp;
     TCGv_i32 tmp2;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
     rn = (insn >> 16) & 0xf;
     rm = insn & 0xf;
     load = (insn & (1 << 21)) != 0;
+    endian = s->be_data;
+    mmu_idx = get_mem_index(s);
     if ((insn & (1 << 23)) == 0) {
         /* Load store all elements.  */
         op = (insn >> 8) & 0xf;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
         nregs = neon_ls_element_type[op].nregs;
         interleave = neon_ls_element_type[op].interleave;
         spacing = neon_ls_element_type[op].spacing;
-        if (size == 3 && (interleave | spacing) != 1)
+        if (size == 3 && (interleave | spacing) != 1) {
             return 1;
+        }
+        tmp64 = tcg_temp_new_i64();
         addr = tcg_temp_new_i32();
+        tmp2 = tcg_const_i32(1 << size);
         load_reg_var(s, addr, rn);
-        stride = (1 << size) * interleave;
         for (reg = 0; reg < nregs; reg++) {
-            if (interleave > 2 || (interleave == 2 && nregs == 2)) {
-                load_reg_var(s, addr, rn);
-                tcg_gen_addi_i32(addr, addr, (1 << size) * reg);
-            } else if (interleave == 2 && nregs == 4 && reg == 2) {
-                load_reg_var(s, addr, rn);
-                tcg_gen_addi_i32(addr, addr, 1 << size);
-            }
-            if (size == 3) {
-                tmp64 = tcg_temp_new_i64();
-                if (load) {
-                    gen_aa32_ld64(s, tmp64, addr, get_mem_index(s));
-                    neon_store_reg64(tmp64, rd);
-                } else {
-                    neon_load_reg64(tmp64, rd);
-                    gen_aa32_st64(s, tmp64, addr, get_mem_index(s));
-                }
-                tcg_temp_free_i64(tmp64);
-                tcg_gen_addi_i32(addr, addr, stride);
-            } else {
-                for (pass = 0; pass < 2; pass++) {
-                    if (size == 2) {
-                        if (load) {
-                            tmp = tcg_temp_new_i32();
-                            gen_aa32_ld32u(s, tmp, addr, get_mem_index(s));
-                            neon_store_reg(rd, pass, tmp);
-                        } else {
-                            tmp = neon_load_reg(rd, pass);
-                            gen_aa32_st32(s, tmp, addr, get_mem_index(s));
-                            tcg_temp_free_i32(tmp);
-                        }
-                        tcg_gen_addi_i32(addr, addr, stride);
-                    } else if (size == 1) {
-                        if (load) {
-                            tmp = tcg_temp_new_i32();
-                            gen_aa32_ld16u(s, tmp, addr, get_mem_index(s));
-                            tcg_gen_addi_i32(addr, addr, stride);
-                            tmp2 = tcg_temp_new_i32();
-                            gen_aa32_ld16u(s, tmp2, addr, get_mem_index(s));
-                            tcg_gen_addi_i32(addr, addr, stride);
-                            tcg_gen_shli_i32(tmp2, tmp2, 16);
-                            tcg_gen_or_i32(tmp, tmp, tmp2);
-                            tcg_temp_free_i32(tmp2);
-                            neon_store_reg(rd, pass, tmp);
-                        } else {
-                            tmp = neon_load_reg(rd, pass);
-                            tmp2 = tcg_temp_new_i32();
-                            tcg_gen_shri_i32(tmp2, tmp, 16);
-                            gen_aa32_st16(s, tmp, addr, get_mem_index(s));
-                            tcg_temp_free_i32(tmp);
-                            tcg_gen_addi_i32(addr, addr, stride);
-                            gen_aa32_st16(s, tmp2, addr, get_mem_index(s));
-                            tcg_temp_free_i32(tmp2);
-                            tcg_gen_addi_i32(addr, addr, stride);
-                        }
-                    } else /* size == 0 */ {
-                        if (load) {
-                            tmp2 = NULL;
-                            for (n = 0; n < 4; n++) {
-                                tmp = tcg_temp_new_i32();
-                                gen_aa32_ld8u(s, tmp, addr, get_mem_index(s));
-                                tcg_gen_addi_i32(addr, addr, stride);
-                                if (n == 0) {
-                                    tmp2 = tmp;
-                                } else {
-                                    tcg_gen_shli_i32(tmp, tmp, n * 8);
-                                    tcg_gen_or_i32(tmp2, tmp2, tmp);
-                                    tcg_temp_free_i32(tmp);
-                                }
-                            }
-                            neon_store_reg(rd, pass, tmp2);
-                        } else {
-                            tmp2 = neon_load_reg(rd, pass);
-                            for (n = 0; n < 4; n++) {
-                                tmp = tcg_temp_new_i32();
-                                if (n == 0) {
-                                    tcg_gen_mov_i32(tmp, tmp2);
-                                } else {
-                                    tcg_gen_shri_i32(tmp, tmp2, n * 8);
-                                }
-                                gen_aa32_st8(s, tmp, addr, get_mem_index(s));
-                                tcg_temp_free_i32(tmp);
-                                tcg_gen_addi_i32(addr, addr, stride);
-                            }
-                            tcg_temp_free_i32(tmp2);
-                        }
+            for (n = 0; n < 8 >> size; n++) {
+                int xs;
+                for (xs = 0; xs < interleave; xs++) {
+                    int tt = rd + reg + spacing * xs;
+
+                    if (load) {
+                        gen_aa32_ld_i64(s, tmp64, addr, mmu_idx, endian | size);
+                        neon_store_element64(tt, n, size, tmp64);
+                    } else {
+                        neon_load_element64(tmp64, tt, n, size);
+                        gen_aa32_st_i64(s, tmp64, addr, mmu_idx, endian | size);
                     }
+                    tcg_gen_add_i32(addr, addr, tmp2);
                 }
             }
-            rd += spacing;
         }
         tcg_temp_free_i32(addr);
-        stride = nregs * 8;
+        tcg_temp_free_i32(tmp2);
+        tcg_temp_free_i64(tmp64);
+        stride = nregs * interleave * 8;
     } else {
         size = (insn >> 10) & 3;
         if (size == 3) {
-- 
2.19.1
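
The loop nest this patch introduces can be modelled outside TCG to see which memory element lands in which register slot. The following is only a minimal sketch: `NeonModel` and `model_vld_multiple` are hypothetical names, the register file is a plain byte array rather than QEMU's CPUARMState, and host-endian offset adjustments and guest big-endian data are ignored.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model: 32 D registers of 8 bytes each (not QEMU's state). */
typedef struct {
    uint8_t d[32][8];
} NeonModel;

/*
 * Model of the "load all elements" loop nest from the patch above.
 * size is log2 of the element size in bytes (0..3).
 * Returns the number of bytes consumed from mem.
 */
static size_t model_vld_multiple(NeonModel *m, const uint8_t *mem,
                                 int rd, int nregs, int interleave,
                                 int spacing, int size)
{
    size_t addr = 0;
    size_t esize = (size_t)1 << size;

    for (int reg = 0; reg < nregs; reg++) {
        for (int n = 0; n < (8 >> size); n++) {
            for (int xs = 0; xs < interleave; xs++) {
                /* Same target register computation as the patch. */
                int tt = rd + reg + spacing * xs;

                memcpy(&m->d[tt][n * esize], mem + addr, esize);
                addr += esize;
            }
        }
    }
    return addr;
}
```

With the table entry {1, 2, 1} (nregs, interleave, spacing) and byte elements, consecutive bytes in memory alternate between d[rd] and d[rd + 1], and the number of bytes consumed equals nregs * interleave * 8, matching the writeback stride computed at the end of the patch.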

From: Richard Henderson <richard.henderson@linaro.org>

For a sequence of loads or stores from a single register,
little-endian operations can be promoted to an 8-byte op.
This can reduce the number of operations by a factor of 8.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181011205206.3552-20-richard.henderson@linaro.org
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
         if (size == 3 && (interleave | spacing) != 1) {
             return 1;
         }
+        /* For our purposes, bytes are always little-endian.  */
+        if (size == 0) {
+            endian = MO_LE;
+        }
+        /* Consecutive little-endian elements from a single register
+         * can be promoted to a larger little-endian operation.
+         */
+        if (interleave == 1 && endian == MO_LE) {
+            size = 3;
+        }
         tmp64 = tcg_temp_new_i64();
         addr = tcg_temp_new_i32();
         tmp2 = tcg_const_i32(1 << size);
-- 
2.19.1
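
The promotion is valid because element n of a D register occupies bits [n * ebits, (n + 1) * ebits), so consecutive little-endian elements read from memory concatenate to the same 64-bit value as one little-endian 8-byte load. A small sketch of that equivalence, using hypothetical helper names and plain byte arithmetic instead of TCG ops:

```c
#include <stdint.h>

/* Read 8 bytes from p as a single little-endian 64-bit value. */
static uint64_t load_le64(const uint8_t *p)
{
    uint64_t v = 0;

    for (int i = 7; i >= 0; i--) {
        v = (v << 8) | p[i];
    }
    return v;
}

/*
 * Build the same register value from (8 >> size) little-endian
 * element loads, placing element n at bit offset n * ebits.
 * size is log2 of the element size in bytes (0..3).
 */
static uint64_t load_by_elements(const uint8_t *p, int size)
{
    uint64_t reg = 0;
    int esize = 1 << size;      /* bytes per element */
    int ebits = 8 * esize;      /* bits per element */

    for (int n = 0; n < (8 >> size); n++) {
        uint64_t elt = 0;

        for (int i = esize - 1; i >= 0; i--) {
            elt = (elt << 8) | p[n * esize + i];
        }
        reg |= elt << (n * ebits);
    }
    return reg;
}
```

For every element size the two functions return the same value, which is why the patch can set size = 3 when interleave == 1 and the data is little-endian.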

From: Richard Henderson <richard.henderson@linaro.org>

Instead of shifts and masks, use direct loads and stores from
the neon register file.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20181011205206.3552-21-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate.c | 92 +++++++++++++++++++++++-------------------
 1 file changed, 50 insertions(+), 42 deletions(-)

From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>

Announce the availability of the various priority queues.
This fixes an issue where guest kernels would fail to
configure secondary queues due to improper feature bits.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 20181017213932.19973-2-edgar.iglesias@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/net/cadence_gem.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/hw/net/cadence_gem.c b/hw/net/cadence_gem.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/net/cadence_gem.c
+++ b/hw/net/cadence_gem.c
@@ -XXX,XX +XXX,XX @@ static void gem_reset(DeviceState *d)
     int i;
     CadenceGEMState *s = CADENCE_GEM(d);
     const uint8_t *a;
+    uint32_t queues_mask = 0;
 
     DB_PRINT("\n");
 
@@ -XXX,XX +XXX,XX @@ static void gem_reset(DeviceState *d)
     s->regs[GEM_DESCONF] = 0x02500111;
     s->regs[GEM_DESCONF2] = 0x2ab13fff;
     s->regs[GEM_DESCONF5] = 0x002f2045;
-    s->regs[GEM_DESCONF6] = 0x00000200;
+    s->regs[GEM_DESCONF6] = 0x0;
+
+    if (s->num_priority_queues > 1) {
+        queues_mask = MAKE_64BIT_MASK(1, s->num_priority_queues - 1);
+        s->regs[GEM_DESCONF6] |= queues_mask;
+    }
 
     /* Set MAC address */
     a = &s->conf.macaddr.a[0];
-- 
2.19.1

From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>

Announce 64-bit addressing support.

Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 20181017213932.19973-3-edgar.iglesias@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/net/cadence_gem.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/net/cadence_gem.c b/hw/net/cadence_gem.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/net/cadence_gem.c
+++ b/hw/net/cadence_gem.c
@@ -XXX,XX +XXX,XX @@
 #define GEM_DESCONF4      (0x0000028C/4)
 #define GEM_DESCONF5      (0x00000290/4)
 #define GEM_DESCONF6      (0x00000294/4)
+#define GEM_DESCONF6_64B_MASK (1U << 23)
 #define GEM_DESCONF7      (0x00000298/4)
 
 #define GEM_INT_Q1_STATUS               (0x00000400 / 4)
@@ -XXX,XX +XXX,XX @@ static void gem_reset(DeviceState *d)
     s->regs[GEM_DESCONF] = 0x02500111;
     s->regs[GEM_DESCONF2] = 0x2ab13fff;
     s->regs[GEM_DESCONF5] = 0x002f2045;
-    s->regs[GEM_DESCONF6] = 0x0;
+    s->regs[GEM_DESCONF6] = GEM_DESCONF6_64B_MASK;
 
     if (s->num_priority_queues > 1) {
         queues_mask = MAKE_64BIT_MASK(1, s->num_priority_queues - 1);
-- 
2.19.1
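
Taken together, the two cadence_gem patches compute the GEM_DESCONF6 reset value from the 64-bit addressing flag plus one bit per secondary priority queue. A rough standalone sketch of that computation follows; the `SKETCH_` macros and `sketch_desconf6_reset` are stand-ins written for this illustration (the mask macro is assumed to behave like QEMU's MAKE_64BIT_MASK), not code from the device model.

```c
#include <stdint.h>

/*
 * Stand-in assumed to mirror MAKE_64BIT_MASK(shift, length):
 * `length` consecutive set bits starting at bit `shift`.
 * length must be in the range 1..64.
 */
#define SKETCH_MAKE_64BIT_MASK(shift, length) \
    (((~0ULL) >> (64 - (length))) << (shift))

/* Same bit position as GEM_DESCONF6_64B_MASK in the patch. */
#define SKETCH_DESCONF6_64B_MASK (1U << 23)

/* Reset value of GEM_DESCONF6 for a given number of priority queues. */
static uint32_t sketch_desconf6_reset(unsigned num_priority_queues)
{
    uint32_t val = SKETCH_DESCONF6_64B_MASK;

    if (num_priority_queues > 1) {
        /* Bits 1..(n - 1) announce queues 1..(n - 1); queue 0 has no bit. */
        val |= (uint32_t)SKETCH_MAKE_64BIT_MASK(1, num_priority_queues - 1);
    }
    return val;
}
```

The guard on num_priority_queues matters: with a single queue the mask length would be 0, which the macro cannot express.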

From: Richard Henderson <richard.henderson@linaro.org>

The EL3 version of this register does not include an ASID,
and so the tlb_flush performed by vmsa_ttbr_write is not needed.

Reviewed-by: Aaron Lindsay <aaron@os.amperecomputing.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20181019015617.22583-2-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo el3_cp_reginfo[] = {
       .fieldoffset = offsetof(CPUARMState, cp15.mvbar) },
     { .name = "TTBR0_EL3", .state = ARM_CP_STATE_AA64,
       .opc0 = 3, .opc1 = 6, .crn = 2, .crm = 0, .opc2 = 0,
-      .access = PL3_RW, .writefn = vmsa_ttbr_write, .resetvalue = 0,
+      .access = PL3_RW, .resetvalue = 0,
       .fieldoffset = offsetof(CPUARMState, cp15.ttbr0_el[3]) },
     { .name = "TCR_EL3", .state = ARM_CP_STATE_AA64,
       .opc0 = 3, .opc1 = 6, .crn = 2, .crm = 0, .opc2 = 2,
-- 
2.19.1

From: Richard Henderson <richard.henderson@linaro.org>

Since QEMU does not implement ASIDs, changes to the ASID must flush the
tlb.  However, if the ASID does not change there is no reason to flush.

In testing a boot of the Ubuntu installer to the first menu, this reduces
the number of flushes by 30%, or nearly 600k instances.

Reviewed-by: Aaron Lindsay <aaron@os.amperecomputing.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20181019015617.22583-3-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

Hi; here's a target-arm pullreq. Mostly this is RTH's FEAT_RME
series; there are also a handful of bug fixes including some
which aren't arm-specific but which it's convenient to include
here.

thanks
-- PMM

The following changes since commit b455ce4c2f300c8ba47cba7232dd03261368a4cb:

Merge tag 'q800-for-8.1-pull-request' of https://github.com/vivier/qemu-m68k into staging (2023-06-22 10:18:32 +0200)

are available in the Git repository at:

https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20230623

for you to fetch changes up to 497fad38979c16b6412388927401e577eba43d26:

pc-bios/keymaps: Use the official xkb name for Arabic layout, not the legacy synonym (2023-06-23 11:46:02 +0100)

----------------------------------------------------------------
target-arm queue:
 * Add (experimental) support for FEAT_RME
 * host-utils: Avoid using __builtin_subcll on buggy versions of Apple Clang
 * target/arm: Restructure has_vfp_d32 test
 * hw/arm/sbsa-ref: add ITS support in SBSA GIC
 * target/arm: Fix sve predicate store, 8 <= VQ <= 15
 * pc-bios/keymaps: Use the official xkb name for Arabic layout, not the legacy synonym

----------------------------------------------------------------
Peter Maydell (2):
      host-utils: Avoid using __builtin_subcll on buggy versions of Apple Clang
      pc-bios/keymaps: Use the official xkb name for Arabic layout, not the legacy synonym

Richard Henderson (23):
      target/arm: Add isar_feature_aa64_rme
      target/arm: Update SCR and HCR for RME
      target/arm: SCR_EL3.NS may be RES1
      target/arm: Add RME cpregs
      target/arm: Introduce ARMSecuritySpace
      include/exec/memattrs: Add two bits of space to MemTxAttrs
      target/arm: Adjust the order of Phys and Stage2 ARMMMUIdx
      target/arm: Introduce ARMMMUIdx_Phys_{Realm,Root}
      target/arm: Remove __attribute__((nonnull)) from ptw.c
      target/arm: Pipe ARMSecuritySpace through ptw.c
      target/arm: NSTable is RES0 for the RME EL3 regime
      target/arm: Handle Block and Page bits for security space
      target/arm: Handle no-execute for Realm and Root regimes
      target/arm: Use get_phys_addr_with_struct in S1_ptw_translate
      target/arm: Move s1_is_el0 into S1Translate
      target/arm: Use get_phys_addr_with_struct for stage2
      target/arm: Add GPC syndrome
      target/arm: Implement GPC exceptions
      target/arm: Implement the granule protection check
      target/arm: Add cpu properties for enabling FEAT_RME
      docs/system/arm: Document FEAT_RME
      target/arm: Restructure has_vfp_d32 test
      target/arm: Fix sve predicate store, 8 <= VQ <= 15

Shashi Mallela (1):
      hw/arm/sbsa-ref: add ITS support in SBSA GIC

From: Richard Henderson <richard.henderson@linaro.org>

Add the missing field for ID_AA64PFR0, and the predicate.
Disable it if EL3 is forced off by the board or command-line.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230620124418.805717-2-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h | 6 ++++++
 target/arm/cpu.c | 4 ++++
 2 files changed, 10 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ FIELD(ID_AA64PFR0, SEL2, 36, 4)
 FIELD(ID_AA64PFR0, MPAM, 40, 4)
 FIELD(ID_AA64PFR0, AMU, 44, 4)
 FIELD(ID_AA64PFR0, DIT, 48, 4)
+FIELD(ID_AA64PFR0, RME, 52, 4)
 FIELD(ID_AA64PFR0, CSV2, 56, 4)
 FIELD(ID_AA64PFR0, CSV3, 60, 4)
 
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa64_sel2(const ARMISARegisters *id)
     return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, SEL2) != 0;
 }
 
+static inline bool isar_feature_aa64_rme(const ARMISARegisters *id)
+{
+    return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, RME) != 0;
+}
+
 static inline bool isar_feature_aa64_vh(const ARMISARegisters *id)
 {
     return FIELD_EX64(id->id_aa64mmfr1, ID_AA64MMFR1, VH) != 0;
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
         cpu->isar.id_dfr0 = FIELD_DP32(cpu->isar.id_dfr0, ID_DFR0, COPSDBG, 0);
         cpu->isar.id_aa64pfr0 = FIELD_DP64(cpu->isar.id_aa64pfr0,
                                            ID_AA64PFR0, EL3, 0);
+
+        /* Disable the realm management extension, which requires EL3. */
+        cpu->isar.id_aa64pfr0 = FIELD_DP64(cpu->isar.id_aa64pfr0,
+                                           ID_AA64PFR0, RME, 0);
     }
 
     if (!cpu->has_el2) {
-- 
2.34.1
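
The predicate and the realize-time override both operate on the 4-bit RME field at bits [55:52] of ID_AA64PFR0. A self-contained sketch of the same bit manipulation is shown below; the `sketch_` helpers are local stand-ins for the FIELD_EX64 and FIELD_DP64 macros and are not QEMU functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Extract `len` bits starting at bit `shift` (len must be 1..63). */
static uint64_t sketch_extract64(uint64_t v, int shift, int len)
{
    return (v >> shift) & ((1ULL << len) - 1);
}

/* Replace `len` bits starting at bit `shift` with `field`. */
static uint64_t sketch_deposit64(uint64_t v, int shift, int len,
                                 uint64_t field)
{
    uint64_t mask = ((1ULL << len) - 1) << shift;

    return (v & ~mask) | ((field << shift) & mask);
}

/* Counterpart of isar_feature_aa64_rme(): RME field is nonzero. */
static bool sketch_has_rme(uint64_t id_aa64pfr0)
{
    return sketch_extract64(id_aa64pfr0, 52, 4) != 0;
}

/* Counterpart of the realize-time override when EL3 is disabled. */
static uint64_t sketch_disable_rme(uint64_t id_aa64pfr0)
{
    return sketch_deposit64(id_aa64pfr0, 52, 4, 0);
}
```

Clearing the field leaves every other ID_AA64PFR0 field unchanged, so the guest sees a CPU that differs only in not advertising RME.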

From: Richard Henderson <richard.henderson@linaro.org>

Define the missing SCR and HCR bits, allow SCR_NSE and {SCR,HCR}_GPF
to be set, and invalidate TLBs when NSE changes.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230620124418.805717-3-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h    |  5 +++--
 target/arm/helper.c | 10 ++++++++--
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ static inline void xpsr_write(CPUARMState *env, uint32_t val, uint32_t mask)
 #define HCR_TERR      (1ULL << 36)
 #define HCR_TEA       (1ULL << 37)
 #define HCR_MIOCNCE   (1ULL << 38)
-/* RES0 bit 39 */
+#define HCR_TME       (1ULL << 39)
 #define HCR_APK       (1ULL << 40)
 #define HCR_API       (1ULL << 41)
 #define HCR_NV        (1ULL << 42)
@@ -XXX,XX +XXX,XX @@ static inline void xpsr_write(CPUARMState *env, uint32_t val, uint32_t mask)
 #define HCR_NV2       (1ULL << 45)
 #define HCR_FWB       (1ULL << 46)
 #define HCR_FIEN      (1ULL << 47)
-/* RES0 bit 48 */
+#define HCR_GPF       (1ULL << 48)
 #define HCR_TID4      (1ULL << 49)
 #define HCR_TICAB     (1ULL << 50)
 #define HCR_AMVOFFEN  (1ULL << 51)
@@ -XXX,XX +XXX,XX @@ static inline void xpsr_write(CPUARMState *env, uint32_t val, uint32_t mask)
 #define SCR_TRNDR             (1ULL << 40)
 #define SCR_ENTP2             (1ULL << 41)
 #define SCR_GPF               (1ULL << 48)
+#define SCR_NSE               (1ULL << 62)
 
 #define HSTR_TTEE (1 << 16)
 #define HSTR_TJDBX (1 << 17)
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void scr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
         if (cpu_isar_feature(aa64_fgt, cpu)) {
             valid_mask |= SCR_FGTEN;
         }
+        if (cpu_isar_feature(aa64_rme, cpu)) {
+            valid_mask |= SCR_NSE | SCR_GPF;
+        }
     } else {
         valid_mask &= ~(SCR_RW | SCR_ST);
         if (cpu_isar_feature(aa32_ras, cpu)) {
@@ -XXX,XX +XXX,XX @@ static void scr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
     env->cp15.scr_el3 = value;
 
     /*
-     * If SCR_EL3.NS changes, i.e. arm_is_secure_below_el3, then
+     * If SCR_EL3.{NS,NSE} changes, i.e. change of security state,
      * we must invalidate all TLBs below EL3.
      */
-    if (changed & SCR_NS) {
+    if (changed & (SCR_NS | SCR_NSE)) {
         tlb_flush_by_mmuidx(env_cpu(env), (ARMMMUIdxBit_E10_0 |
                                            ARMMMUIdxBit_E20_0 |
                                            ARMMMUIdxBit_E10_1 |
@@ -XXX,XX +XXX,XX @@ static void do_hcr_write(CPUARMState *env, uint64_t value, uint64_t valid_mask)
         if (cpu_isar_feature(aa64_fwb, cpu)) {
             valid_mask |= HCR_FWB;
         }
+        if (cpu_isar_feature(aa64_rme, cpu)) {
+            valid_mask |= HCR_GPF;
+        }
     }
 
     if (cpu_isar_feature(any_evt, cpu)) {
-- 
2.34.1
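
The interaction between the valid mask and the TLB flush condition can be seen in isolation. This is a simplified sketch under stated assumptions: `sketch_scr_write`, `SketchScrResult` and the `SKETCH_` constants are hypothetical, the caller passes in whatever mask the other features allow, and the real scr_write() handles many more bits.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_SCR_NS   (1ULL << 0)
#define SKETCH_SCR_GPF  (1ULL << 48)
#define SKETCH_SCR_NSE  (1ULL << 62)

typedef struct {
    uint64_t value;         /* value actually stored in SCR_EL3 */
    bool flush_below_el3;   /* true if TLBs below EL3 must be flushed */
} SketchScrResult;

/*
 * Mask the written value and decide whether the security state of
 * the lower exception levels changed.
 */
static SketchScrResult sketch_scr_write(uint64_t old, uint64_t value,
                                        uint64_t base_valid_mask,
                                        bool has_rme)
{
    uint64_t valid_mask = base_valid_mask;
    SketchScrResult res;

    if (has_rme) {
        valid_mask |= SKETCH_SCR_NSE | SKETCH_SCR_GPF;
    }
    res.value = value & valid_mask;
    res.flush_below_el3 =
        ((old ^ res.value) & (SKETCH_SCR_NS | SKETCH_SCR_NSE)) != 0;
    return res;
}
```

Without RME the NSE bit is masked off before the comparison, so a guest write that sets it neither changes the stored value nor triggers a flush.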

From: Richard Henderson <richard.henderson@linaro.org>

With RME, SEL2 must also be present to support secure state.
The NS bit is RES1 if SEL2 is not present.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230620124418.805717-4-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void scr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
         }
         if (cpu_isar_feature(aa64_sel2, cpu)) {
             valid_mask |= SCR_EEL2;
+        } else if (cpu_isar_feature(aa64_rme, cpu)) {
+            /* With RME and without SEL2, NS is RES1 (R_GSWWH, I_DJJQJ). */
+            value |= SCR_NS;
         }
         if (cpu_isar_feature(aa64_mte, cpu)) {
             valid_mask |= SCR_ATA;
-- 
2.34.1

From: Richard Henderson <richard.henderson@linaro.org>

This includes GPCCR, GPTBR, MFAR, the TLB flush insns PAALL, PAALLOS,
RPALOS, RPAOS, and the cache flush insns CIPAPA and CIGDPAPA.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230620124418.805717-5-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h    | 19 ++++++++++
 target/arm/helper.c | 84 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 103 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ typedef struct CPUArchState {
         uint64_t fgt_read[2]; /* HFGRTR, HDFGRTR */
         uint64_t fgt_write[2]; /* HFGWTR, HDFGWTR */
         uint64_t fgt_exec[1]; /* HFGITR */
+
+        /* RME registers */
+        uint64_t gpccr_el3;
+        uint64_t gptbr_el3;
+        uint64_t mfar_el3;
     } cp15;
 
     struct {
@@ -XXX,XX +XXX,XX @@ struct ArchCPU {
     uint64_t reset_cbar;
     uint32_t reset_auxcr;
     bool reset_hivecs;
+    uint8_t reset_l0gptsz;
 
     /*
      * Intermediate values used during property parsing.
@@ -XXX,XX +XXX,XX @@ FIELD(MVFR1, SIMDFMAC, 28, 4)
 FIELD(MVFR2, SIMDMISC, 0, 4)
 FIELD(MVFR2, FPMISC, 4, 4)
 
+FIELD(GPCCR, PPS, 0, 3)
+FIELD(GPCCR, IRGN, 8, 2)
+FIELD(GPCCR, ORGN, 10, 2)
+FIELD(GPCCR, SH, 12, 2)
+FIELD(GPCCR, PGS, 14, 2)
+FIELD(GPCCR, GPC, 16, 1)
+FIELD(GPCCR, GPCP, 17, 1)
+FIELD(GPCCR, L0GPTSZ, 20, 4)
+
+FIELD(MFAR, FPA, 12, 40)
+FIELD(MFAR, NSE, 62, 1)
+FIELD(MFAR, NS, 63, 1)
+
 QEMU_BUILD_BUG_ON(ARRAY_SIZE(((ARMCPU *)0)->ccsidr) <= R_V7M_CSSELR_INDEX_MASK);
 
 /* If adding a feature bit which corresponds to a Linux ELF
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo sme_reginfo[] = {
       .access = PL2_RW, .accessfn = access_esm,
       .type = ARM_CP_CONST, .resetvalue = 0 },
 };
+
+static void tlbi_aa64_paall_write(CPUARMState *env, const ARMCPRegInfo *ri,
+                                  uint64_t value)
+{
+    CPUState *cs = env_cpu(env);
+
+    tlb_flush(cs);
+}
+
+static void gpccr_write(CPUARMState *env, const ARMCPRegInfo *ri,
+                        uint64_t value)
+{
+    /* L0GPTSZ is RO; other bits not mentioned are RES0. */
+    uint64_t rw_mask = R_GPCCR_PPS_MASK | R_GPCCR_IRGN_MASK |
+        R_GPCCR_ORGN_MASK | R_GPCCR_SH_MASK | R_GPCCR_PGS_MASK |
+        R_GPCCR_GPC_MASK | R_GPCCR_GPCP_MASK;
+
+    env->cp15.gpccr_el3 = (value & rw_mask) | (env->cp15.gpccr_el3 & ~rw_mask);
+}
+
+static void gpccr_reset(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+    env->cp15.gpccr_el3 = FIELD_DP64(0, GPCCR, L0GPTSZ,
+                                     env_archcpu(env)->reset_l0gptsz);
+}
+
+static void tlbi_aa64_paallos_write(CPUARMState *env, const ARMCPRegInfo *ri,
+                                    uint64_t value)
+{
+    CPUState *cs = env_cpu(env);
+
+    tlb_flush_all_cpus_synced(cs);
+}
+
+static const ARMCPRegInfo rme_reginfo[] = {
+    { .name = "GPCCR_EL3", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 6, .crn = 2, .crm = 1, .opc2 = 6,
+      .access = PL3_RW, .writefn = gpccr_write, .resetfn = gpccr_reset,
+      .fieldoffset = offsetof(CPUARMState, cp15.gpccr_el3) },
+    { .name = "GPTBR_EL3", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 6, .crn = 2, .crm = 1, .opc2 = 4,
+      .access = PL3_RW, .fieldoffset = offsetof(CPUARMState, cp15.gptbr_el3) },
+    { .name = "MFAR_EL3", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 6, .crn = 6, .crm = 0, .opc2 = 5,
+      .access = PL3_RW, .fieldoffset = offsetof(CPUARMState, cp15.mfar_el3) },
+    { .name = "TLBI_PAALL", .state = ARM_CP_STATE_AA64,
+      .opc0 = 1, .opc1 = 6, .crn = 8, .crm = 7, .opc2 = 4,
+      .access = PL3_W, .type = ARM_CP_NO_RAW,
+      .writefn = tlbi_aa64_paall_write },
+    { .name = "TLBI_PAALLOS", .state = ARM_CP_STATE_AA64,
+      .opc0 = 1, .opc1 = 6, .crn = 8, .crm = 1, .opc2 = 4,
+      .access = PL3_W, .type = ARM_CP_NO_RAW,
+      .writefn = tlbi_aa64_paallos_write },
+    /*
+     * QEMU does not have a way to invalidate by physical address, thus
+     * invalidating a range of physical addresses is accomplished by
+     * flushing all tlb entries in the outer sharable domain,
+     * just like PAALLOS.
+     */
+    { .name = "TLBI_RPALOS", .state = ARM_CP_STATE_AA64,
+      .opc0 = 1, .opc1 = 6, .crn = 8, .crm = 4, .opc2 = 7,
+      .access = PL3_W, .type = ARM_CP_NO_RAW,
+      .writefn = tlbi_aa64_paallos_write },
+    { .name = "TLBI_RPAOS", .state = ARM_CP_STATE_AA64,
+      .opc0 = 1, .opc1 = 6, .crn = 8, .crm = 4, .opc2 = 3,
+      .access = PL3_W, .type = ARM_CP_NO_RAW,
+      .writefn = tlbi_aa64_paallos_write },
+    { .name = "DC_CIPAPA", .state = ARM_CP_STATE_AA64,
+      .opc0 = 1, .opc1 = 6, .crn = 7, .crm = 14, .opc2 = 1,
+      .access = PL3_W, .type = ARM_CP_NOP },
+};
+
+static const ARMCPRegInfo rme_mte_reginfo[] = {
+    { .name = "DC_CIGDPAPA", .state = ARM_CP_STATE_AA64,
+      .opc0 = 1, .opc1 = 6, .crn = 7, .crm = 14, .opc2 = 5,
+      .access = PL3_W, .type = ARM_CP_NOP },
+};
 #endif /* TARGET_AARCH64 */
 
 static void define_pmu_regs(ARMCPU *cpu)
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
     if (cpu_isar_feature(aa64_fgt, cpu)) {
         define_arm_cp_regs(cpu, fgt_reginfo);
     }
+
+    if (cpu_isar_feature(aa64_rme, cpu)) {
+        define_arm_cp_regs(cpu, rme_reginfo);
+        if (cpu_isar_feature(aa64_mte, cpu)) {
+            define_arm_cp_regs(cpu, rme_mte_reginfo);
+        }
+    }
 #endif
 
     if (cpu_isar_feature(any_predinv, cpu)) {
-- 
2.34.1
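
The GPCCR_EL3 write handler keeps the read-only L0GPTSZ field, and any RES0 bits, at their current values while accepting the writable fields from the guest. The sketch below restates that masking with local macros so it can be compiled on its own; `SKETCH_FIELD_MASK`, `sketch_gpccr_write` and `sketch_gpccr_reset` are names invented for this illustration, with field positions copied from the FIELD() definitions in the patch.

```c
#include <stdint.h>

/* Mask for a field of `len` bits starting at bit `shift` (len 1..63). */
#define SKETCH_FIELD_MASK(shift, len) (((1ULL << (len)) - 1) << (shift))

/* Writable fields: PPS, IRGN, ORGN, SH, PGS, GPC, GPCP. */
#define SKETCH_GPCCR_RW_MASK                          \
    (SKETCH_FIELD_MASK(0, 3)  | SKETCH_FIELD_MASK(8, 2)  | \
     SKETCH_FIELD_MASK(10, 2) | SKETCH_FIELD_MASK(12, 2) | \
     SKETCH_FIELD_MASK(14, 2) | SKETCH_FIELD_MASK(16, 1) | \
     SKETCH_FIELD_MASK(17, 1))

/* Reset: everything zero except the read-only L0GPTSZ field at [23:20]. */
static uint64_t sketch_gpccr_reset(uint8_t reset_l0gptsz)
{
    return (uint64_t)(reset_l0gptsz & 0xf) << 20;
}

/* Write: take RW bits from `value`, keep all other bits from `cur`. */
static uint64_t sketch_gpccr_write(uint64_t cur, uint64_t value)
{
    return (value & SKETCH_GPCCR_RW_MASK) | (cur & ~SKETCH_GPCCR_RW_MASK);
}
```

Even a write of all ones leaves L0GPTSZ at its reset value, which is the property the "L0GPTSZ is RO" comment in the patch describes.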

From: Richard Henderson <richard.henderson@linaro.org>

Introduce the ARMSecuritySpace enumeration, along with functions to
retrieve the current security space and the security space of the
exception levels below EL3.
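
As a rough illustration of how the enumeration values line up with SCR_EL3.{NSE,NS}, the sketch below derives the space of the levels below EL3 from the two bits. It is an assumption-laden model, not the helper.c implementation added by this patch: the `Sketch` names are hypothetical, EL3 support is taken as given, and the reserved NSE=1, NS=0 encoding is simply treated as Secure.

```c
#include <stdbool.h>

/* Same numbering as ARMSecuritySpace in the patch. */
typedef enum {
    SketchSS_Secure    = 0,
    SketchSS_NonSecure = 1,
    SketchSS_Root      = 2,
    SketchSS_Realm     = 3,
} SketchSpace;

/* Security space of the exception levels below EL3. */
static SketchSpace sketch_space_below_el3(bool nse, bool ns, bool has_rme)
{
    if (!ns) {
        return SketchSS_Secure;
    }
    if (nse && has_rme) {
        return SketchSS_Realm;
    }
    return SketchSS_NonSecure;
}

/* "Secure" in the pre-v9 sense: Secure or Root. */
static bool sketch_space_is_secure(SketchSpace space)
{
    return space == SketchSS_Secure || space == SketchSS_Root;
}
```

For the three architecturally valid encodings the result equals (NSE << 1) | NS, which is the correspondence the enumeration's comment refers to; Root is never produced here because it only applies at EL3 itself.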

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230620124418.805717-6-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h    | 89 ++++++++++++++++++++++++++++++++++-----------
 target/arm/helper.c | 60 ++++++++++++++++++++++++++++++
 2 files changed, 127 insertions(+), 22 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ static inline int arm_feature(CPUARMState *env, int feature)
 
 void arm_cpu_finalize_features(ARMCPU *cpu, Error **errp);
 
-#if !defined(CONFIG_USER_ONLY)
 /*
+ * ARM v9 security states.
+ * The ordering of the enumeration corresponds to the low 2 bits
+ * of the GPI value, and (except for Root) the concat of NSE:NS.
+ */
+
+typedef enum ARMSecuritySpace {
+    ARMSS_Secure     = 0,
+    ARMSS_NonSecure  = 1,
+    ARMSS_Root       = 2,
+    ARMSS_Realm      = 3,
+} ARMSecuritySpace;
+
+/* Return true if @space is secure, in the pre-v9 sense. */
+static inline bool arm_space_is_secure(ARMSecuritySpace space)
+{
+    return space == ARMSS_Secure || space == ARMSS_Root;
+}
+
+/* Return the ARMSecuritySpace for @secure, assuming !RME or EL[0-2]. */
+static inline ARMSecuritySpace arm_secure_to_space(bool secure)
+{
+    return secure ? ARMSS_Secure : ARMSS_NonSecure;
+}
+
+#if !defined(CONFIG_USER_ONLY)
+/**
+ * arm_security_space_below_el3:
+ * @env: cpu context
+ *
+ * Return the security space of exception levels below EL3, following
+ * an exception return to those levels.  Unlike arm_security_space,
+ * this doesn't care about the current EL.
+ */
+ARMSecuritySpace arm_security_space_below_el3(CPUARMState *env);
+
+/**
+ * arm_is_secure_below_el3:
+ * @env: cpu context
+ *
  * Return true if exception levels below EL3 are in secure state,
- * or would be following an exception return to that level.
- * Unlike arm_is_secure() (which is always a question about the
- * _current_ state of the CPU) this doesn't care about the current
- * EL or mode.
+ * or would be following an exception return to those levels.
  */
 static inline bool arm_is_secure_below_el3(CPUARMState *env)
 {
-    assert(!arm_feature(env, ARM_FEATURE_M));
-    if (arm_feature(env, ARM_FEATURE_EL3)) {
-        return !(env->cp15.scr_el3 & SCR_NS);
-    } else {
-        /* If EL3 is not supported then the secure state is implementation
-         * defined, in which case QEMU defaults to non-secure.
-         */
-        return false;
-    }
+    ARMSecuritySpace ss = arm_security_space_below_el3(env);
+    return ss == ARMSS_Secure;
 }
 
 /* Return true if the CPU is AArch64 EL3 or AArch32 Mon */
@@ -XXX,XX +XXX,XX @@ static inline bool arm_is_el3_or_mon(CPUARMState *env)
     return false;
 }
 
-/* Return true if the processor is in secure state */
+/**
+ * arm_security_space:
+ * @env: cpu context
+ *
+ * Return the current security space of the cpu.
+ */
+ARMSecuritySpace arm_security_space(CPUARMState *env);
+
+/**
+ * arm_is_secure:
+ * @env: cpu context
+ *
+ * Return true if the processor is in secure state.
+ */
 static inline bool arm_is_secure(CPUARMState *env)
 {
-    if (arm_feature(env, ARM_FEATURE_M)) {
-        return env->v7m.secure;
-    }
-    if (arm_is_el3_or_mon(env)) {
-        return true;
-    }
-    return arm_is_secure_below_el3(env);
+    return arm_space_is_secure(arm_security_space(env));
 }
 
 /*
@@ -XXX,XX +XXX,XX @@ static inline bool arm_is_el2_enabled(CPUARMState *env)
 }
 
 #else
+static inline ARMSecuritySpace arm_security_space_below_el3(CPUARMState *env)
+{
+    return ARMSS_NonSecure;
+}
+
 static inline bool arm_is_secure_below_el3(CPUARMState *env)
 {
     return false;
 }
 
+static inline ARMSecuritySpace arm_security_space(CPUARMState *env)
+{
+    return ARMSS_NonSecure;
+}
+
 static inline bool arm_is_secure(CPUARMState *env)
 {
     return false;
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ void aarch64_sve_change_el(CPUARMState *env, int old_el,
     }
 }
 #endif
+
+#ifndef CONFIG_USER_ONLY
+ARMSecuritySpace arm_security_space(CPUARMState *env)
+{
+    if (arm_feature(env, ARM_FEATURE_M)) {
+        return arm_secure_to_space(env->v7m.secure);
+    }
+
+    /*
+     * If EL3 is not supported then the secure state is implementation
+     * defined, in which case QEMU defaults to non-secure.
+     */
+    if (!arm_feature(env, ARM_FEATURE_EL3)) {
+        return ARMSS_NonSecure;
+    }
+
+    /* Check for AArch64 EL3 or AArch32 Mon. */
+    if (is_a64(env)) {
+        if (extract32(env->pstate, 2, 2) == 3) {
+            if (cpu_isar_feature(aa64_rme, env_archcpu(env))) {
+                return ARMSS_Root;
+            } else {
+                return ARMSS_Secure;
+            }
+        }
+    } else {
+        if ((env->uncached_cpsr & CPSR_M) == ARM_CPU_MODE_MON) {
+            return ARMSS_Secure;
+        }
+    }
+
+    return arm_security_space_below_el3(env);
+}
+
+ARMSecuritySpace arm_security_space_below_el3(CPUARMState *env)
+{
+    assert(!arm_feature(env, ARM_FEATURE_M));
+
+    /*
+     * If EL3 is not supported then the secure state is implementation
+     * defined, in which case QEMU defaults to non-secure.
+     */
+    if (!arm_feature(env, ARM_FEATURE_EL3)) {
+        return ARMSS_NonSecure;
+    }
+
+    /*
+     * Note NSE cannot be set without RME, and NSE & !NS is Reserved.
+     * Ignoring NSE when !NS retains consistency without having to
+     * modify other predicates.
+     */
+    if (!(env->cp15.scr_el3 & SCR_NS)) {
+        return ARMSS_Secure;
+    } else if (env->cp15.scr_el3 & SCR_NSE) {
+        return ARMSS_Realm;
+    } else {
+        return ARMSS_NonSecure;
+    }
+}
+#endif /* !CONFIG_USER_ONLY */
-- 
2.34.1

From: Richard Henderson <richard.henderson@linaro.org>

We will need 2 bits to represent ARMSecuritySpace.

Do not attempt to replace or widen secure, even though it
logically overlaps the new field -- there are uses within
e.g. hw/block/pflash_cfi01.c, which don't know anything
specific about ARM.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230620124418.805717-7-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
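Note, not for commit: a small sketch of how the two fields can be kept
consistent.  SketchAttrs is a hypothetical cut-down struct, not
MemTxAttrs itself, and the helper name is made up for illustration.

```c
#include <assert.h>

/* Hypothetical cut-down version of the attributes struct. */
typedef struct SketchAttrs {
    unsigned int secure:1;   /* legacy secure/non-secure view */
    unsigned int space:2;    /* one of the four security spaces, 0..3 */
} SketchAttrs;

/* Fill both fields from a 2-bit space value (0..3). */
static inline SketchAttrs sketch_attrs_for_space(unsigned int space)
{
    assert(space <= 3);
    SketchAttrs a = {
        /* Secure (0) and Root (2) count as secure for legacy users. */
        .secure = (space == 0 || space == 2),
        .space = space,
    };
    return a;
}
```

Code that only understands the 1-bit field keeps working, while
RME-aware code can look at the wider one.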
 include/exec/memattrs.h | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/include/exec/memattrs.h b/include/exec/memattrs.h
index XXXXXXX..XXXXXXX 100644
--- a/include/exec/memattrs.h
+++ b/include/exec/memattrs.h
@@ -XXX,XX +XXX,XX @@ typedef struct MemTxAttrs {
      * "didn't specify" if necessary.
      */
     unsigned int unspecified:1;
-    /* ARM/AMBA: TrustZone Secure access
+    /*
+     * ARM/AMBA: TrustZone Secure access
      * x86: System Management Mode access
      */
     unsigned int secure:1;
+    /*
+     * ARM: ArmSecuritySpace.  This partially overlaps secure, but it is
+     * easier to have both fields to assist code that does not understand
+     * ARMv9 RME, or no specific knowledge of ARM at all (e.g. pflash).
+     */
+    unsigned int space:2;
     /* Memory access is usermode (unprivileged) */
     unsigned int user:1;
     /*
-- 
2.34.1

From: Richard Henderson <richard.henderson@linaro.org>

It is helpful for the ARMMMUIdx_Phys_* values to be in the same
relative order as the ARMSecuritySpace enumerators.  This requires
an adjustment to the nstable check.  While there, test explicitly for
being in secure state, rather than relying on the fact that clearing
the low bit makes no change to a non-secure index.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230620124418.805717-8-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
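Note, not for commit: a standalone sketch of the reordered indexes and
the adjusted nstable step.  The SK_IDX_* values and the helper are
hypothetical; only the relative order mirrors the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical index values; only their relative order matters here. */
enum {
    SK_IDX_STAGE2_S = 8,
    SK_IDX_STAGE2   = 9,
    SK_IDX_PHYS_S   = 10,
    SK_IDX_PHYS_NS  = 11,
};

/* Each non-secure index must directly follow its secure partner. */
static_assert(SK_IDX_PHYS_S + 1 == SK_IDX_PHYS_NS, "Phys order");
static_assert(SK_IDX_STAGE2_S + 1 == SK_IDX_STAGE2, "Stage2 order");

/*
 * Return the index to use for the next table fetch.  The step to the
 * non-secure partner is taken only when the walk is still secure, so a
 * non-secure index is never incremented.
 */
static inline int sketch_next_ptw_idx(int ptw_idx, bool nstable,
                                      bool in_secure)
{
    if (nstable && in_secure) {
        return ptw_idx + 1;
    }
    return ptw_idx;
}
```

With the old layout the same effect came from clearing bit 0, which was
harmless for an already non-secure index; adding 1 is not, hence the
explicit in_secure test.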
 target/arm/cpu.h | 12 ++++++------
 target/arm/ptw.c | 12 +++++-------
 2 files changed, 11 insertions(+), 13 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ typedef enum ARMMMUIdx {
     ARMMMUIdx_E2        = 6 | ARM_MMU_IDX_A,
     ARMMMUIdx_E3        = 7 | ARM_MMU_IDX_A,
 
-    /* TLBs with 1-1 mapping to the physical address spaces. */
-    ARMMMUIdx_Phys_NS   = 8 | ARM_MMU_IDX_A,
-    ARMMMUIdx_Phys_S    = 9 | ARM_MMU_IDX_A,
-
     /*
      * Used for second stage of an S12 page table walk, or for descriptor
      * loads during first stage of an S1 page table walk.  Note that both
      * are in use simultaneously for SecureEL2: the security state for
      * the S2 ptw is selected by the NS bit from the S1 ptw.
      */
-    ARMMMUIdx_Stage2    = 10 | ARM_MMU_IDX_A,
-    ARMMMUIdx_Stage2_S  = 11 | ARM_MMU_IDX_A,
+    ARMMMUIdx_Stage2_S  = 8 | ARM_MMU_IDX_A,
+    ARMMMUIdx_Stage2    = 9 | ARM_MMU_IDX_A,
+
+    /* TLBs with 1-1 mapping to the physical address spaces. */
+    ARMMMUIdx_Phys_S    = 10 | ARM_MMU_IDX_A,
+    ARMMMUIdx_Phys_NS   = 11 | ARM_MMU_IDX_A,
 
     /*
      * These are not allocated TLBs and are used only for AT system
diff --git a/target/arm/ptw.c b/target/arm/ptw.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/ptw.c
+++ b/target/arm/ptw.c
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
     descaddr |= (address >> (stride * (4 - level))) & indexmask;
     descaddr &= ~7ULL;
     nstable = !regime_is_stage2(mmu_idx) && extract32(tableattrs, 4, 1);
-    if (nstable) {
+    if (nstable && ptw->in_secure) {
         /*
          * Stage2_S -> Stage2 or Phys_S -> Phys_NS
-         * Assert that the non-secure idx are even, and relative order.
+         * Assert the relative order of the secure/non-secure indexes.
          */
-        QEMU_BUILD_BUG_ON((ARMMMUIdx_Phys_NS & 1) != 0);
-        QEMU_BUILD_BUG_ON((ARMMMUIdx_Stage2 & 1) != 0);
-        QEMU_BUILD_BUG_ON(ARMMMUIdx_Phys_NS + 1 != ARMMMUIdx_Phys_S);
-        QEMU_BUILD_BUG_ON(ARMMMUIdx_Stage2 + 1 != ARMMMUIdx_Stage2_S);
-        ptw->in_ptw_idx &= ~1;
+        QEMU_BUILD_BUG_ON(ARMMMUIdx_Phys_S + 1 != ARMMMUIdx_Phys_NS);
+        QEMU_BUILD_BUG_ON(ARMMMUIdx_Stage2_S + 1 != ARMMMUIdx_Stage2);
+        ptw->in_ptw_idx += 1;
         ptw->in_secure = false;
     }
     if (!S1_ptw_translate(env, ptw, descaddr, fi)) {
-- 
2.34.1

From: Richard Henderson <richard.henderson@linaro.org>

With FEAT_RME, there are four physical address spaces.
For now, just define the symbols, and mention them in
the same spots as the other Phys indexes in ptw.c.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230620124418.805717-9-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h | 23 +++++++++++++++++++++--
 target/arm/ptw.c | 10 ++++++++--
 2 files changed, 29 insertions(+), 4 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ typedef enum ARMMMUIdx {
     ARMMMUIdx_Stage2    = 9 | ARM_MMU_IDX_A,
 
     /* TLBs with 1-1 mapping to the physical address spaces. */
-    ARMMMUIdx_Phys_S    = 10 | ARM_MMU_IDX_A,
-    ARMMMUIdx_Phys_NS   = 11 | ARM_MMU_IDX_A,
+    ARMMMUIdx_Phys_S     = 10 | ARM_MMU_IDX_A,
+    ARMMMUIdx_Phys_NS    = 11 | ARM_MMU_IDX_A,
+    ARMMMUIdx_Phys_Root  = 12 | ARM_MMU_IDX_A,
+    ARMMMUIdx_Phys_Realm = 13 | ARM_MMU_IDX_A,
 
     /*
      * These are not allocated TLBs and are used only for AT system
@@ -XXX,XX +XXX,XX @@ typedef enum ARMASIdx {
     ARMASIdx_TagS = 3,
 } ARMASIdx;
 
+static inline ARMMMUIdx arm_space_to_phys(ARMSecuritySpace space)
+{
+    /* Assert the relative order of the physical mmu indexes. */
+    QEMU_BUILD_BUG_ON(ARMSS_Secure != 0);
+    QEMU_BUILD_BUG_ON(ARMMMUIdx_Phys_NS != ARMMMUIdx_Phys_S + ARMSS_NonSecure);
+    QEMU_BUILD_BUG_ON(ARMMMUIdx_Phys_Root != ARMMMUIdx_Phys_S + ARMSS_Root);
+    QEMU_BUILD_BUG_ON(ARMMMUIdx_Phys_Realm != ARMMMUIdx_Phys_S + ARMSS_Realm);
+
+    return ARMMMUIdx_Phys_S + space;
+}
+
+static inline ARMSecuritySpace arm_phys_to_space(ARMMMUIdx idx)
+{
+    assert(idx >= ARMMMUIdx_Phys_S && idx <= ARMMMUIdx_Phys_Realm);
+    return idx - ARMMMUIdx_Phys_S;
+}
+
 static inline bool arm_v7m_csselr_razwi(ARMCPU *cpu)
 {
     /* If all the CLIDR.Ctypem bits are 0 there are no caches, and
diff --git a/target/arm/ptw.c b/target/arm/ptw.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/ptw.c
+++ b/target/arm/ptw.c
@@ -XXX,XX +XXX,XX @@ static bool regime_translation_disabled(CPUARMState *env, ARMMMUIdx mmu_idx,
     case ARMMMUIdx_E3:
         break;
 
-    case ARMMMUIdx_Phys_NS:
     case ARMMMUIdx_Phys_S:
+    case ARMMMUIdx_Phys_NS:
+    case ARMMMUIdx_Phys_Root:
+    case ARMMMUIdx_Phys_Realm:
         /* No translation for physical address spaces. */
         return true;
 
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_disabled(CPUARMState *env, target_ulong address,
     switch (mmu_idx) {
     case ARMMMUIdx_Stage2:
     case ARMMMUIdx_Stage2_S:
-    case ARMMMUIdx_Phys_NS:
     case ARMMMUIdx_Phys_S:
+    case ARMMMUIdx_Phys_NS:
+    case ARMMMUIdx_Phys_Root:
+    case ARMMMUIdx_Phys_Realm:
         break;
 
     default:
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_with_struct(CPUARMState *env, S1Translate *ptw,
     switch (mmu_idx) {
     case ARMMMUIdx_Phys_S:
     case ARMMMUIdx_Phys_NS:
+    case ARMMMUIdx_Phys_Root:
+    case ARMMMUIdx_Phys_Realm:
         /* Checking Phys early avoids special casing later vs regime_el. */
         return get_phys_addr_disabled(env, address, access_type, mmu_idx,
                                       is_secure, result, fi);
-- 
2.34.1

From: Richard Henderson <richard.henderson@linaro.org>

The nonnull attribute was added in 7e98e21c098 as part of a reorg in
which one of the arguments had been legally NULL, and it caught
actual instances.  Now that the reorg is complete, it serves
little purpose.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230620124418.805717-10-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/ptw.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/target/arm/ptw.c b/target/arm/ptw.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/ptw.c
+++ b/target/arm/ptw.c
@@ -XXX,XX +XXX,XX @@ typedef struct S1Translate {
 static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
                                uint64_t address,
                                MMUAccessType access_type, bool s1_is_el0,
-                               GetPhysAddrResult *result, ARMMMUFaultInfo *fi)
-    __attribute__((nonnull));
+                               GetPhysAddrResult *result, ARMMMUFaultInfo *fi);
 
 static bool get_phys_addr_with_struct(CPUARMState *env, S1Translate *ptw,
                                       target_ulong address,
                                       MMUAccessType access_type,
                                       GetPhysAddrResult *result,
-                                      ARMMMUFaultInfo *fi)
-    __attribute__((nonnull));
+                                      ARMMMUFaultInfo *fi);
 
 /* This mapping is common between ID_AA64MMFR0.PARANGE and TCR_ELx.{I}PS. */
 static const uint8_t pamax_map[] = {
-- 
2.34.1

From: Richard Henderson <richard.henderson@linaro.org>

Add input and output space members to S1Translate.  Set and adjust
them in S1_ptw_translate, and the various points at which we drop
secure state.  Initialize the space in get_phys_addr; for now leave
get_phys_addr_with_secure considering only secure vs non-secure spaces.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230620124418.805717-11-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
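Note, not for commit: a sketch of how the initial space is chosen in
get_phys_addr after this patch, reduced to three regime classes.  The
SketchRegime enum and the helper are hypothetical simplifications; the
real code switches on the full ARMMMUIdx.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum SketchSpace {
    SK_SECURE    = 0,
    SK_NONSECURE = 1,
    SK_ROOT      = 2,
    SK_REALM     = 3,
} SketchSpace;

/* Hypothetical grouping of the mmu indexes that need a decision. */
typedef enum SketchRegime {
    SK_REGIME_BELOW_EL3,   /* EL0/EL1/EL2 stage 1 regimes */
    SK_REGIME_STAGE2,      /* the non-secure stage 2 index */
    SK_REGIME_EL3,
} SketchRegime;

/*
 * @below_el3: the space reported for the levels below EL3
 * @have_rme:  true if the CPU implements FEAT_RME
 */
static inline SketchSpace sketch_initial_space(SketchRegime regime,
                                               SketchSpace below_el3,
                                               bool have_rme)
{
    /* The levels below EL3 are never in Root space. */
    assert(below_el3 != SK_ROOT);

    switch (regime) {
    case SK_REGIME_STAGE2:
        /* For Secure EL2 this index must still be NonSecure. */
        return below_el3 == SK_SECURE ? SK_NONSECURE : below_el3;
    case SK_REGIME_EL3:
        return have_rme ? SK_ROOT : SK_SECURE;
    default:
        return below_el3;
    }
}
```

The legacy in_secure flag is then derived from the chosen space rather
than tracked separately.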
 target/arm/ptw.c | 86 +++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 71 insertions(+), 15 deletions(-)

diff --git a/target/arm/ptw.c b/target/arm/ptw.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/ptw.c
+++ b/target/arm/ptw.c
@@ -XXX,XX +XXX,XX @@
 typedef struct S1Translate {
     ARMMMUIdx in_mmu_idx;
     ARMMMUIdx in_ptw_idx;
+    ARMSecuritySpace in_space;
     bool in_secure;
     bool in_debug;
     bool out_secure;
     bool out_rw;
     bool out_be;
+    ARMSecuritySpace out_space;
     hwaddr out_virt;
     hwaddr out_phys;
     void *out_host;
@@ -XXX,XX +XXX,XX @@ static bool S2_attrs_are_device(uint64_t hcr, uint8_t attrs)
 static bool S1_ptw_translate(CPUARMState *env, S1Translate *ptw,
                              hwaddr addr, ARMMMUFaultInfo *fi)
 {
+    ARMSecuritySpace space = ptw->in_space;
     bool is_secure = ptw->in_secure;
     ARMMMUIdx mmu_idx = ptw->in_mmu_idx;
     ARMMMUIdx s2_mmu_idx = ptw->in_ptw_idx;
@@ -XXX,XX +XXX,XX @@ static bool S1_ptw_translate(CPUARMState *env, S1Translate *ptw,
                 .in_mmu_idx = s2_mmu_idx,
                 .in_ptw_idx = ptw_idx_for_stage_2(env, s2_mmu_idx),
                 .in_secure = s2_mmu_idx == ARMMMUIdx_Stage2_S,
+                .in_space = (s2_mmu_idx == ARMMMUIdx_Stage2_S ? ARMSS_Secure
+                             : space == ARMSS_Realm ? ARMSS_Realm
+                             : ARMSS_NonSecure),
                 .in_debug = true,
             };
             GetPhysAddrResult s2 = { };
@@ -XXX,XX +XXX,XX @@ static bool S1_ptw_translate(CPUARMState *env, S1Translate *ptw,
             ptw->out_phys = s2.f.phys_addr;
             pte_attrs = s2.cacheattrs.attrs;
             ptw->out_secure = s2.f.attrs.secure;
+            ptw->out_space = s2.f.attrs.space;
         } else {
             /* Regime is physical. */
             ptw->out_phys = addr;
             pte_attrs = 0;
             ptw->out_secure = s2_mmu_idx == ARMMMUIdx_Phys_S;
+            ptw->out_space = (s2_mmu_idx == ARMMMUIdx_Phys_S ? ARMSS_Secure
+                              : space == ARMSS_Realm ? ARMSS_Realm
+                              : ARMSS_NonSecure);
         }
         ptw->out_host = NULL;
         ptw->out_rw = false;
@@ -XXX,XX +XXX,XX @@ static bool S1_ptw_translate(CPUARMState *env, S1Translate *ptw,
         ptw->out_rw = full->prot & PAGE_WRITE;
         pte_attrs = full->pte_attrs;
         ptw->out_secure = full->attrs.secure;
+        ptw->out_space = full->attrs.space;
 #else
         g_assert_not_reached();
 #endif
@@ -XXX,XX +XXX,XX @@ static uint32_t arm_ldl_ptw(CPUARMState *env, S1Translate *ptw,
         }
     } else {
         /* Page tables are in MMIO. */
-        MemTxAttrs attrs = { .secure = ptw->out_secure };
+        MemTxAttrs attrs = {
+            .secure = ptw->out_secure,
+            .space = ptw->out_space,
+        };
         AddressSpace *as = arm_addressspace(cs, attrs);
         MemTxResult result = MEMTX_OK;
 
@@ -XXX,XX +XXX,XX @@ static uint64_t arm_ldq_ptw(CPUARMState *env, S1Translate *ptw,
 #endif
     } else {
         /* Page tables are in MMIO. */
-        MemTxAttrs attrs = { .secure = ptw->out_secure };
+        MemTxAttrs attrs = {
+            .secure = ptw->out_secure,
+            .space = ptw->out_space,
+        };
         AddressSpace *as = arm_addressspace(cs, attrs);
         MemTxResult result = MEMTX_OK;
 
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_v6(CPUARMState *env, S1Translate *ptw,
          * regime, because the attribute will already be non-secure.
          */
         result->f.attrs.secure = false;
+        result->f.attrs.space = ARMSS_NonSecure;
     }
     result->f.phys_addr = phys_addr;
     return false;
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
          * regime, because the attribute will already be non-secure.
          */
         result->f.attrs.secure = false;
+        result->f.attrs.space = ARMSS_NonSecure;
     }
 
     if (regime_is_stage2(mmu_idx)) {
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_pmsav8(CPUARMState *env, uint32_t address,
              */
             if (sattrs.ns) {
                 result->f.attrs.secure = false;
+                result->f.attrs.space = ARMSS_NonSecure;
             } else if (!secure) {
                 /*
                  * NS access to S memory must fault.
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_twostage(CPUARMState *env, S1Translate *ptw,
     bool is_secure = ptw->in_secure;
     bool ret, ipa_secure;
     ARMCacheAttrs cacheattrs1;
+    ARMSecuritySpace ipa_space;
     bool is_el0;
     uint64_t hcr;
 
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_twostage(CPUARMState *env, S1Translate *ptw,
 
     ipa = result->f.phys_addr;
     ipa_secure = result->f.attrs.secure;
+    ipa_space = result->f.attrs.space;
 
     is_el0 = ptw->in_mmu_idx == ARMMMUIdx_Stage1_E0;
     ptw->in_mmu_idx = ipa_secure ? ARMMMUIdx_Stage2_S : ARMMMUIdx_Stage2;
     ptw->in_secure = ipa_secure;
+    ptw->in_space = ipa_space;
     ptw->in_ptw_idx = ptw_idx_for_stage_2(env, ptw->in_mmu_idx);
 
     /*
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_with_struct(CPUARMState *env, S1Translate *ptw,
     ARMMMUIdx s1_mmu_idx;
 
     /*
-     * The page table entries may downgrade secure to non-secure, but
-     * cannot upgrade an non-secure translation regime's attributes
-     * to secure.
+     * The page table entries may downgrade Secure to NonSecure, but
+     * cannot upgrade a NonSecure translation regime's attributes
+     * to Secure or Realm.
      */
     result->f.attrs.secure = is_secure;
+    result->f.attrs.space = ptw->in_space;
 
     switch (mmu_idx) {
     case ARMMMUIdx_Phys_S:
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_with_struct(CPUARMState *env, S1Translate *ptw,
 
     default:
         /* Single stage uses physical for ptw. */
-        ptw->in_ptw_idx = is_secure ? ARMMMUIdx_Phys_S : ARMMMUIdx_Phys_NS;
+        ptw->in_ptw_idx = arm_space_to_phys(ptw->in_space);
         break;
     }
 
@@ -XXX,XX +XXX,XX @@ bool get_phys_addr_with_secure(CPUARMState *env, target_ulong address,
     S1Translate ptw = {
         .in_mmu_idx = mmu_idx,
         .in_secure = is_secure,
+        .in_space = arm_secure_to_space(is_secure),
     };
     return get_phys_addr_with_struct(env, &ptw, address, access_type,
                                      result, fi);
@@ -XXX,XX +XXX,XX @@ bool get_phys_addr(CPUARMState *env, target_ulong address,
                    MMUAccessType access_type, ARMMMUIdx mmu_idx,
                    GetPhysAddrResult *result, ARMMMUFaultInfo *fi)
 {
-    bool is_secure;
+    S1Translate ptw = {
+        .in_mmu_idx = mmu_idx,
+    };
+    ARMSecuritySpace ss;
 
     switch (mmu_idx) {
     case ARMMMUIdx_E10_0:
@@ -XXX,XX +XXX,XX @@ bool get_phys_addr(CPUARMState *env, target_ulong address,
     case ARMMMUIdx_Stage1_E1:
     case ARMMMUIdx_Stage1_E1_PAN:
     case ARMMMUIdx_E2:
-        is_secure = arm_is_secure_below_el3(env);
+        ss = arm_security_space_below_el3(env);
         break;
     case ARMMMUIdx_Stage2:
+        /*
+         * For Secure EL2, we need this index to be NonSecure;
+         * otherwise this will already be NonSecure or Realm.
+         */
+        ss = arm_security_space_below_el3(env);
+        if (ss == ARMSS_Secure) {
+            ss = ARMSS_NonSecure;
+        }
+        break;
     case ARMMMUIdx_Phys_NS:
     case ARMMMUIdx_MPrivNegPri:
     case ARMMMUIdx_MUserNegPri:
     case ARMMMUIdx_MPriv:
     case ARMMMUIdx_MUser:
-        is_secure = false;
+        ss = ARMSS_NonSecure;
         break;
-    case ARMMMUIdx_E3:
     case ARMMMUIdx_Stage2_S:
     case ARMMMUIdx_Phys_S:
     case ARMMMUIdx_MSPrivNegPri:
     case ARMMMUIdx_MSUserNegPri:
     case ARMMMUIdx_MSPriv:
     case ARMMMUIdx_MSUser:
-        is_secure = true;
+        ss = ARMSS_Secure;
+        break;
+    case ARMMMUIdx_E3:
+        if (arm_feature(env, ARM_FEATURE_AARCH64) &&
+            cpu_isar_feature(aa64_rme, env_archcpu(env))) {
+            ss = ARMSS_Root;
+        } else {
+            ss = ARMSS_Secure;
+        }
+        break;
+    case ARMMMUIdx_Phys_Root:
+        ss = ARMSS_Root;
+        break;
+    case ARMMMUIdx_Phys_Realm:
+        ss = ARMSS_Realm;
         break;
     default:
         g_assert_not_reached();
     }
-    return get_phys_addr_with_secure(env, address, access_type, mmu_idx,
-                                     is_secure, result, fi);
+
+    ptw.in_space = ss;
+    ptw.in_secure = arm_space_is_secure(ss);
+    return get_phys_addr_with_struct(env, &ptw, address, access_type,
+                                     result, fi);
 }
 
 hwaddr arm_cpu_get_phys_page_attrs_debug(CPUState *cs, vaddr addr,
@@ -XXX,XX +XXX,XX @@ hwaddr arm_cpu_get_phys_page_attrs_debug(CPUState *cs, vaddr addr,
 {
     ARMCPU *cpu = ARM_CPU(cs);
     CPUARMState *env = &cpu->env;
+    ARMMMUIdx mmu_idx = arm_mmu_idx(env);
+    ARMSecuritySpace ss = arm_security_space(env);
     S1Translate ptw = {
-        .in_mmu_idx = arm_mmu_idx(env),
-        .in_secure = arm_is_secure(env),
+        .in_mmu_idx = mmu_idx,
+        .in_space = ss,
+        .in_secure = arm_space_is_secure(ss),
         .in_debug = true,
     };
     GetPhysAddrResult res = {};
-- 
2.34.1

From: Richard Henderson <richard.henderson@linaro.org>

Test in_space instead of in_secure so that we don't
switch out of Root space.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230620124418.805717-12-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
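Note, not for commit: a sketch of why the test moves from in_secure to
in_space.  Root counts as "secure" in the pre-v9 sense, so a boolean
test would also downgrade a Root walk.  Names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum SketchSpace {
    SK_SECURE    = 0,
    SK_NONSECURE = 1,
    SK_ROOT      = 2,
    SK_REALM     = 3,
} SketchSpace;

/*
 * Apply the NSTable bit from the previous level to the space used for
 * the next table fetch.  Only a Secure stage 1 walk is downgraded.
 */
static inline SketchSpace sketch_nstable_step(SketchSpace in_space,
                                              bool is_stage2, bool nstable)
{
    if (in_space == SK_SECURE && !is_stage2 && nstable) {
        return SK_NONSECURE;
    }
    return in_space;
}

/* Small self-check showing the Root case is left alone. */
static inline void sketch_nstable_demo(void)
{
    assert(sketch_nstable_step(SK_SECURE, false, true) == SK_NONSECURE);
    assert(sketch_nstable_step(SK_ROOT, false, true) == SK_ROOT);
}
```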
 target/arm/ptw.c | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/target/arm/ptw.c b/target/arm/ptw.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/ptw.c
+++ b/target/arm/ptw.c
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
 {
     ARMCPU *cpu = env_archcpu(env);
     ARMMMUIdx mmu_idx = ptw->in_mmu_idx;
-    bool is_secure = ptw->in_secure;
     int32_t level;
     ARMVAParameters param;
     uint64_t ttbr;
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
     uint64_t descaddrmask;
     bool aarch64 = arm_el_is_aa64(env, el);
     uint64_t descriptor, new_descriptor;
-    bool nstable;
 
     /* TODO: This code does not support shareability levels. */
     if (aarch64) {
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
         descaddrmask = MAKE_64BIT_MASK(0, 40);
     }
     descaddrmask &= ~indexmask_grainsize;
-
-    /*
-     * Secure stage 1 accesses start with the page table in secure memory and
-     * can be downgraded to non-secure at any step. Non-secure accesses
-     * remain non-secure. We implement this by just ORing in the NSTable/NS
-     * bits at each step.
-     * Stage 2 never gets this kind of downgrade.
-     */
-    tableattrs = is_secure ? 0 : (1 << 4);
+    tableattrs = 0;
 
  next_level:
     descaddr |= (address >> (stride * (4 - level))) & indexmask;
     descaddr &= ~7ULL;
-    nstable = !regime_is_stage2(mmu_idx) && extract32(tableattrs, 4, 1);
-    if (nstable && ptw->in_secure) {
+
+    /*
+     * Process the NSTable bit from the previous level.  This changes
+     * the table address space and the output space from Secure to
+     * NonSecure.  With RME, the EL3 translation regime does not change
+     * from Root to NonSecure.
+     */
+    if (ptw->in_space == ARMSS_Secure
+        && !regime_is_stage2(mmu_idx)
+        && extract32(tableattrs, 4, 1)) {
         /*
          * Stage2_S -> Stage2 or Phys_S -> Phys_NS
          * Assert the relative order of the secure/non-secure indexes.
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
         QEMU_BUILD_BUG_ON(ARMMMUIdx_Stage2_S + 1 != ARMMMUIdx_Stage2);
         ptw->in_ptw_idx += 1;
         ptw->in_secure = false;
+        ptw->in_space = ARMSS_NonSecure;
     }
+
     if (!S1_ptw_translate(env, ptw, descaddr, fi)) {
         goto do_fault;
     }
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
      */
     attrs = new_descriptor & (MAKE_64BIT_MASK(2, 10) | MAKE_64BIT_MASK(50, 14));
     if (!regime_is_stage2(mmu_idx)) {
-        attrs |= nstable << 5; /* NS */
+        attrs |= !ptw->in_secure << 5; /* NS */
         if (!param.hpd) {
             attrs |= extract64(tableattrs, 0, 2) << 53;     /* XN, PXN */
             /*
-- 
2.34.1

From: Richard Henderson <richard.henderson@linaro.org>

With Realm security state, bit 55 of a block or page descriptor during
the stage2 walk becomes the NS bit; during the EL1&0 stage1 walk the NS
bit (bit 5) is RES0.  With Root security state, bit 11 of the block or
page descriptor during the stage1 walk becomes the NSE bit.

Rather than collecting an NS bit and applying it later, compute the
output pa space from the input pa space and unconditionally assign.
This means that we no longer need to adjust the output space earlier
for the NSTable bit.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230620124418.805717-13-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
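Note, not for commit: a standalone sketch of the stage 1 output space
computation described above.  The helper and its parameters are
hypothetical; in particular the regime is collapsed to one boolean and
the descriptor bits are passed as 0/1 values.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum SketchSpace {
    SK_SECURE    = 0,
    SK_NONSECURE = 1,
    SK_ROOT      = 2,
    SK_REALM     = 3,
} SketchSpace;

/*
 * Output space of a stage 1 block or page descriptor.
 * @in:         space of the walk, after any NSTable adjustment
 * @el2_regime: true for the EL2 and EL2&0 regimes (matters for Realm)
 * @nse, @ns:   descriptor bits 11 and 5, as 0/1 values
 * @have_sel2:  true if Secure EL2 is implemented
 */
static inline SketchSpace sketch_s1_out_space(SketchSpace in, bool el2_regime,
                                              unsigned nse, unsigned ns,
                                              bool have_sel2)
{
    SketchSpace out = in;

    assert(nse <= 1 && ns <= 1);
    switch (in) {
    case SK_ROOT:
        /* In the EL3 regime NSE:NS selects the space directly. */
        out = (SketchSpace)((nse << 1) | ns);
        if (out == SK_SECURE && !have_sel2) {
            out = SK_NONSECURE;
        }
        break;
    case SK_SECURE:
        if (ns) {
            out = SK_NONSECURE;
        }
        break;
    case SK_REALM:
        /* NS is RES0 for Realm EL1&0, but active for Realm EL2. */
        if (el2_regime && ns) {
            out = SK_NONSECURE;
        }
        break;
    case SK_NONSECURE:
        /* NS is RES0; the output stays NonSecure. */
        break;
    }
    return out;
}
```

Because the result is assigned unconditionally, there is no separate
"if (ns)" fix-up afterwards.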
 target/arm/ptw.c | 89 +++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 73 insertions(+), 16 deletions(-)

diff --git a/target/arm/ptw.c b/target/arm/ptw.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/ptw.c
+++ b/target/arm/ptw.c
@@ -XXX,XX +XXX,XX @@ static int get_S2prot(CPUARMState *env, int s2ap, int xn, bool s1_is_el0)
  * @mmu_idx: MMU index indicating required translation regime
  * @is_aa64: TRUE if AArch64
  * @ap:      The 2-bit simple AP (AP[2:1])
- * @ns:      NS (non-secure) bit
  * @xn:      XN (execute-never) bit
  * @pxn:     PXN (privileged execute-never) bit
+ * @in_pa:   The original input pa space
+ * @out_pa:  The output pa space, modified by NSTable, NS, and NSE
  */
 static int get_S1prot(CPUARMState *env, ARMMMUIdx mmu_idx, bool is_aa64,
-                      int ap, int ns, int xn, int pxn)
+                      int ap, int xn, int pxn,
+                      ARMSecuritySpace in_pa, ARMSecuritySpace out_pa)
 {
     ARMCPU *cpu = env_archcpu(env);
     bool is_user = regime_is_user(env, mmu_idx);
@@ -XXX,XX +XXX,XX @@ static int get_S1prot(CPUARMState *env, ARMMMUIdx mmu_idx, bool is_aa64,
         }
     }
 
-    if (ns && arm_is_secure(env) && (env->cp15.scr_el3 & SCR_SIF)) {
+    if (out_pa == ARMSS_NonSecure && in_pa == ARMSS_Secure &&
+        (env->cp15.scr_el3 & SCR_SIF)) {
         return prot_rw;
     }
 
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
     int32_t stride;
     int addrsize, inputsize, outputsize;
     uint64_t tcr = regime_tcr(env, mmu_idx);
-    int ap, ns, xn, pxn;
+    int ap, xn, pxn;
     uint32_t el = regime_el(env, mmu_idx);
     uint64_t descaddrmask;
     bool aarch64 = arm_el_is_aa64(env, el);
     uint64_t descriptor, new_descriptor;
+    ARMSecuritySpace out_space;
 
     /* TODO: This code does not support shareability levels. */
     if (aarch64) {
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
     }
 
     ap = extract32(attrs, 6, 2);
+    out_space = ptw->in_space;
     if (regime_is_stage2(mmu_idx)) {
-        ns = mmu_idx == ARMMMUIdx_Stage2;
+        /*
+         * R_GYNXY: For stage2 in Realm security state, bit 55 is NS.
+         * The bit remains ignored for other security states.
+         */
+        if (out_space == ARMSS_Realm && extract64(attrs, 55, 1)) {
+            out_space = ARMSS_NonSecure;
+        }
         xn = extract64(attrs, 53, 2);
         result->f.prot = get_S2prot(env, ap, xn, s1_is_el0);
     } else {
-        ns = extract32(attrs, 5, 1);
+        int nse, ns = extract32(attrs, 5, 1);
+        switch (out_space) {
+        case ARMSS_Root:
+            /*
+             * R_GVZML: Bit 11 becomes the NSE field in the EL3 regime.
+             * R_XTYPW: NSE and NS together select the output pa space.
+             */
+            nse = extract32(attrs, 11, 1);
+            out_space = (nse << 1) | ns;
+            if (out_space == ARMSS_Secure &&
+                !cpu_isar_feature(aa64_sel2, cpu)) {
+                out_space = ARMSS_NonSecure;
+            }
+            break;
+        case ARMSS_Secure:
+            if (ns) {
+                out_space = ARMSS_NonSecure;
+            }
+            break;
+        case ARMSS_Realm:
+            switch (mmu_idx) {
+            case ARMMMUIdx_Stage1_E0:
+            case ARMMMUIdx_Stage1_E1:
+            case ARMMMUIdx_Stage1_E1_PAN:
+                /* I_CZPRF: For Realm EL1&0 stage1, NS bit is RES0. */
+                break;
+            case ARMMMUIdx_E2:
+            case ARMMMUIdx_E20_0:
+            case ARMMMUIdx_E20_2:
+            case ARMMMUIdx_E20_2_PAN:
+                /*
+                 * R_LYKFZ, R_WGRZN: For Realm EL2 and EL2&1,
+                 * NS changes the output to non-secure space.
+                 */
+                if (ns) {
+                    out_space = ARMSS_NonSecure;
+                }
+                break;
+            default:
+                g_assert_not_reached();
+            }
+            break;
+        case ARMSS_NonSecure:
+            /* R_QRMFF: For NonSecure state, the NS bit is RES0. */
+            break;
+        default:
+            g_assert_not_reached();
+        }
         xn = extract64(attrs, 54, 1);
         pxn = extract64(attrs, 53, 1);
-        result->f.prot = get_S1prot(env, mmu_idx, aarch64, ap, ns, xn, pxn);
+
+        /*
+         * Note that we modified ptw->in_space earlier for NSTable, but
+         * result->f.attrs retains a copy of the original security space.
+         */
+        result->f.prot = get_S1prot(env, mmu_idx, aarch64, ap, xn, pxn,
+                                    result->f.attrs.space, out_space);
     }
 
     if (!(result->f.prot & (1 << access_type))) {
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
         }
     }
 
-    if (ns) {
-        /*
-         * The NS bit will (as required by the architecture) have no effect if
-         * the CPU doesn't support TZ or this is a non-secure translation
-         * regime, because the attribute will already be non-secure.
-         */
-        result->f.attrs.secure = false;
-        result->f.attrs.space = ARMSS_NonSecure;
-    }
+    result->f.attrs.space = out_space;
+    result->f.attrs.secure = arm_space_is_secure(out_space);
 
     if (regime_is_stage2(mmu_idx)) {
         result->cacheattrs.is_s2_format = true;
-- 
2.34.1

From: Richard Henderson <richard.henderson@linaro.org>

While Root and Realm may read and write data from other spaces,
neither may execute from other pa spaces.

This happens for Stage1 EL3, EL2, EL2&0, and Stage2 EL1&0.
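
As a rough illustration of the rule above (not the code in this patch),
the execute check can be sketched as a standalone function. The names
Space, exec_allowed, realm_el2_regime and sif are hypothetical; the
numeric values assume the (NSE << 1) | NS encoding of the pa space, and
the default case returns false where the real code asserts.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the four physical address spaces. */
typedef enum {
    SPACE_SECURE    = 0,
    SPACE_NONSECURE = 1,
    SPACE_ROOT      = 2,
    SPACE_REALM     = 3,
} Space;

/*
 * Return true if an instruction fetch is permitted when the
 * translation goes from in_pa to out_pa.  realm_el2_regime is true
 * for the Realm EL2 and EL2&0 regimes; sif models SCR_EL3.SIF.
 */
static bool exec_allowed(Space in_pa, Space out_pa,
                         bool realm_el2_regime, bool sif)
{
    if (in_pa == out_pa) {
        return true;
    }
    switch (in_pa) {
    case SPACE_ROOT:
        return false;               /* Root never executes non-Root */
    case SPACE_REALM:
        /* For EL1&0 the fault is raised by stage 2 instead. */
        return !realm_el2_regime;
    case SPACE_SECURE:
        return !sif;                /* only blocked when SIF is set */
    default:
        return false;               /* NonSecure cannot change space */
    }
}
```

Read and write permissions are unaffected in every case; only the
execute permission is withheld.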

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230620124418.805717-14-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/ptw.c | 52 ++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 46 insertions(+), 6 deletions(-)

diff --git a/target/arm/ptw.c b/target/arm/ptw.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/ptw.c
+++ b/target/arm/ptw.c
@@ -XXX,XX +XXX,XX @@ do_fault:
  * @xn:      XN (execute-never) bits
  * @s1_is_el0: true if this is S2 of an S1+2 walk for EL0
  */
-static int get_S2prot(CPUARMState *env, int s2ap, int xn, bool s1_is_el0)
+static int get_S2prot_noexecute(int s2ap)
 {
     int prot = 0;
 
@@ -XXX,XX +XXX,XX @@ static int get_S2prot(CPUARMState *env, int s2ap, int xn, bool s1_is_el0)
     if (s2ap & 2) {
         prot |= PAGE_WRITE;
     }
+    return prot;
+}
+
+static int get_S2prot(CPUARMState *env, int s2ap, int xn, bool s1_is_el0)
+{
+    int prot = get_S2prot_noexecute(s2ap);
 
     if (cpu_isar_feature(any_tts2uxn, env_archcpu(env))) {
         switch (xn) {
@@ -XXX,XX +XXX,XX @@ static int get_S1prot(CPUARMState *env, ARMMMUIdx mmu_idx, bool is_aa64,
         }
     }
 
-    if (out_pa == ARMSS_NonSecure && in_pa == ARMSS_Secure &&
-        (env->cp15.scr_el3 & SCR_SIF)) {
-        return prot_rw;
+    if (in_pa != out_pa) {
+        switch (in_pa) {
+        case ARMSS_Root:
+            /*
+             * R_ZWRVD: permission fault for insn fetched from non-Root,
+             * I_WWBFB: SIF has no effect in EL3.
+             */
+            return prot_rw;
+        case ARMSS_Realm:
+            /*
+             * R_PKTDS: permission fault for insn fetched from non-Realm,
+             * for Realm EL2 or EL2&0.  The corresponding fault for EL1&0
+             * happens during any stage2 translation.
+             */
+            switch (mmu_idx) {
+            case ARMMMUIdx_E2:
+            case ARMMMUIdx_E20_0:
+            case ARMMMUIdx_E20_2:
+            case ARMMMUIdx_E20_2_PAN:
+                return prot_rw;
+            default:
+                break;
+            }
+            break;
+        case ARMSS_Secure:
+            if (env->cp15.scr_el3 & SCR_SIF) {
+                return prot_rw;
+            }
+            break;
+        default:
+            /* Input NonSecure must have output NonSecure. */
+            g_assert_not_reached();
+        }
     }
 
     /* TODO have_wxn should be replaced with
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
         /*
          * R_GYNXY: For stage2 in Realm security state, bit 55 is NS.
          * The bit remains ignored for other security states.
+         * R_YMCSL: Executing an insn fetched from non-Realm causes
+         * a stage2 permission fault.
          */
         if (out_space == ARMSS_Realm && extract64(attrs, 55, 1)) {
             out_space = ARMSS_NonSecure;
+            result->f.prot = get_S2prot_noexecute(ap);
+        } else {
+            xn = extract64(attrs, 53, 2);
+            result->f.prot = get_S2prot(env, ap, xn, s1_is_el0);
         }
-        xn = extract64(attrs, 53, 2);
-        result->f.prot = get_S2prot(env, ap, xn, s1_is_el0);
     } else {
         int nse, ns = extract32(attrs, 5, 1);
         switch (out_space) {
-- 
2.34.1

From: Richard Henderson <richard.henderson@linaro.org>

Do not provide a fast-path for physical addresses,
as those will need to be validated for GPC.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230620124418.805717-15-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/ptw.c | 44 +++++++++++++++++---------------------------
 1 file changed, 17 insertions(+), 27 deletions(-)

diff --git a/target/arm/ptw.c b/target/arm/ptw.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/ptw.c
+++ b/target/arm/ptw.c
@@ -XXX,XX +XXX,XX @@ static bool S1_ptw_translate(CPUARMState *env, S1Translate *ptw,
          * From gdbstub, do not use softmmu so that we don't modify the
          * state of the cpu at all, including softmmu tlb contents.
          */
-        if (regime_is_stage2(s2_mmu_idx)) {
-            S1Translate s2ptw = {
-                .in_mmu_idx = s2_mmu_idx,
-                .in_ptw_idx = ptw_idx_for_stage_2(env, s2_mmu_idx),
-                .in_secure = s2_mmu_idx == ARMMMUIdx_Stage2_S,
-                .in_space = (s2_mmu_idx == ARMMMUIdx_Stage2_S ? ARMSS_Secure
-                             : space == ARMSS_Realm ? ARMSS_Realm
-                             : ARMSS_NonSecure),
-                .in_debug = true,
-            };
-            GetPhysAddrResult s2 = { };
+        S1Translate s2ptw = {
+            .in_mmu_idx = s2_mmu_idx,
+            .in_ptw_idx = ptw_idx_for_stage_2(env, s2_mmu_idx),
+            .in_secure = s2_mmu_idx == ARMMMUIdx_Stage2_S,
+            .in_space = (s2_mmu_idx == ARMMMUIdx_Stage2_S ? ARMSS_Secure
+                         : space == ARMSS_Realm ? ARMSS_Realm
+                         : ARMSS_NonSecure),
+            .in_debug = true,
+        };
+        GetPhysAddrResult s2 = { };
 
-            if (get_phys_addr_lpae(env, &s2ptw, addr, MMU_DATA_LOAD,
-                                   false, &s2, fi)) {
-                goto fail;
-            }
-            ptw->out_phys = s2.f.phys_addr;
-            pte_attrs = s2.cacheattrs.attrs;
-            ptw->out_secure = s2.f.attrs.secure;
-            ptw->out_space = s2.f.attrs.space;
-        } else {
-            /* Regime is physical. */
-            ptw->out_phys = addr;
-            pte_attrs = 0;
-            ptw->out_secure = s2_mmu_idx == ARMMMUIdx_Phys_S;
-            ptw->out_space = (s2_mmu_idx == ARMMMUIdx_Phys_S ? ARMSS_Secure
-                              : space == ARMSS_Realm ? ARMSS_Realm
-                              : ARMSS_NonSecure);
+        if (get_phys_addr_with_struct(env, &s2ptw, addr,
+                                      MMU_DATA_LOAD, &s2, fi)) {
+            goto fail;
         }
+        ptw->out_phys = s2.f.phys_addr;
+        pte_attrs = s2.cacheattrs.attrs;
         ptw->out_host = NULL;
         ptw->out_rw = false;
+        ptw->out_secure = s2.f.attrs.secure;
+        ptw->out_space = s2.f.attrs.space;
     } else {
 #ifdef CONFIG_TCG
         CPUTLBEntryFull *full;
-- 
2.34.1

From: Richard Henderson <richard.henderson@linaro.org>

Instead of passing s1_is_el0 to get_phys_addr_lpae, stash it
in the S1Translate structure as in_s1_is_el0.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230620124418.805717-16-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/ptw.c | 27 ++++++++++++---------------
 1 file changed, 12 insertions(+), 15 deletions(-)

diff --git a/target/arm/ptw.c b/target/arm/ptw.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/ptw.c
+++ b/target/arm/ptw.c
@@ -XXX,XX +XXX,XX @@ typedef struct S1Translate {
     ARMSecuritySpace in_space;
     bool in_secure;
     bool in_debug;
+    /*
+     * If this is stage 2 of a stage 1+2 page table walk, then this must
+     * be true if stage 1 is an EL0 access; otherwise this is ignored.
+     * Stage 2 is indicated by in_mmu_idx set to ARMMMUIdx_Stage2{,_S}.
+     */
+    bool in_s1_is_el0;
     bool out_secure;
     bool out_rw;
     bool out_be;
@@ -XXX,XX +XXX,XX @@ typedef struct S1Translate {
 } S1Translate;
 
 static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
-                               uint64_t address,
-                               MMUAccessType access_type, bool s1_is_el0,
+                               uint64_t address, MMUAccessType access_type,
                                GetPhysAddrResult *result, ARMMMUFaultInfo *fi);
 
 static bool get_phys_addr_with_struct(CPUARMState *env, S1Translate *ptw,
@@ -XXX,XX +XXX,XX @@ static int check_s2_mmu_setup(ARMCPU *cpu, bool is_aa64, uint64_t tcr,
  * @ptw: Current and next stage parameters for the walk.
  * @address: virtual address to get physical address for
  * @access_type: MMU_DATA_LOAD, MMU_DATA_STORE or MMU_INST_FETCH
- * @s1_is_el0: if @ptw->in_mmu_idx is ARMMMUIdx_Stage2
- *             (so this is a stage 2 page table walk),
- *             must be true if this is stage 2 of a stage 1+2
- *             walk for an EL0 access. If @mmu_idx is anything else,
- *             @s1_is_el0 is ignored.
  * @result: set on translation success,
  * @fi: set to fault info if the translation fails
  */
 static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
                                uint64_t address,
-                               MMUAccessType access_type, bool s1_is_el0,
+                               MMUAccessType access_type,
                                GetPhysAddrResult *result, ARMMMUFaultInfo *fi)
 {
     ARMCPU *cpu = env_archcpu(env);
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
             result->f.prot = get_S2prot_noexecute(ap);
         } else {
             xn = extract64(attrs, 53, 2);
-            result->f.prot = get_S2prot(env, ap, xn, s1_is_el0);
+            result->f.prot = get_S2prot(env, ap, xn, ptw->in_s1_is_el0);
         }
     } else {
         int nse, ns = extract32(attrs, 5, 1);
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_twostage(CPUARMState *env, S1Translate *ptw,
     bool ret, ipa_secure;
     ARMCacheAttrs cacheattrs1;
     ARMSecuritySpace ipa_space;
-    bool is_el0;
     uint64_t hcr;
 
     ret = get_phys_addr_with_struct(env, ptw, address, access_type, result, fi);
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_twostage(CPUARMState *env, S1Translate *ptw,
     ipa_secure = result->f.attrs.secure;
     ipa_space = result->f.attrs.space;
 
-    is_el0 = ptw->in_mmu_idx == ARMMMUIdx_Stage1_E0;
+    ptw->in_s1_is_el0 = ptw->in_mmu_idx == ARMMMUIdx_Stage1_E0;
     ptw->in_mmu_idx = ipa_secure ? ARMMMUIdx_Stage2_S : ARMMMUIdx_Stage2;
     ptw->in_secure = ipa_secure;
     ptw->in_space = ipa_space;
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_twostage(CPUARMState *env, S1Translate *ptw,
         ret = get_phys_addr_pmsav8(env, ipa, access_type,
                                    ptw->in_mmu_idx, is_secure, result, fi);
     } else {
-        ret = get_phys_addr_lpae(env, ptw, ipa, access_type,
-                                 is_el0, result, fi);
+        ret = get_phys_addr_lpae(env, ptw, ipa, access_type, result, fi);
     }
     fi->s2addr = ipa;
 
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_with_struct(CPUARMState *env, S1Translate *ptw,
     }
 
     if (regime_using_lpae_format(env, mmu_idx)) {
-        return get_phys_addr_lpae(env, ptw, address, access_type, false,
-                                  result, fi);
+        return get_phys_addr_lpae(env, ptw, address, access_type, result, fi);
     } else if (arm_feature(env, ARM_FEATURE_V7) ||
                regime_sctlr(env, mmu_idx) & SCTLR_XP) {
         return get_phys_addr_v6(env, ptw, address, access_type, result, fi);
-- 
2.34.1

From: Richard Henderson <richard.henderson@linaro.org>

This fixes a bug in which we failed to initialize
the result attributes properly after the memset.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230620124418.805717-17-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/ptw.c | 11 +----------
 1 file changed, 1 insertion(+), 10 deletions(-)

From: Richard Henderson <richard.henderson@linaro.org>

The syn_gpc function takes its fields as filled in by
the Arm ARM pseudocode for TakeGPCException.
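
For reference, a minimal standalone sketch of how those fields pack
into the 32-bit syndrome. The SKETCH_* names are hypothetical, and the
positions of the EC field (bits [31:26]) and the IL bit (bit 25) are
assumptions made here so the example compiles on its own.

```c
#include <stdint.h>

/* Assumed layout: EC in bits [31:26], IL in bit 25. */
#define SKETCH_EC_SHIFT 26
#define SKETCH_IL       (1u << 25)
#define SKETCH_EC_GPC   0x1eu

/* Pack the GPC syndrome fields using the same shifts as syn_gpc. */
static uint32_t sketch_syn_gpc(unsigned s2ptw, unsigned ind,
                               unsigned gpcsc, unsigned cm,
                               unsigned s1ptw, unsigned wnr,
                               unsigned fsc)
{
    return (SKETCH_EC_GPC << SKETCH_EC_SHIFT) | SKETCH_IL
         | (s2ptw << 21) | (ind << 20) | (gpcsc << 14)
         | (cm << 8) | (s1ptw << 7) | (wnr << 6) | fsc;
}
```

Under these assumptions, a store (wnr = 1) that fails the check at GPT
level 1 (gpcsc = 0b001101) with fsc = 0b101000 yields 0x7a034068.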

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230620124418.805717-18-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/syndrome.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/target/arm/syndrome.h b/target/arm/syndrome.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/syndrome.h
+++ b/target/arm/syndrome.h
@@ -XXX,XX +XXX,XX @@ enum arm_exception_class {
     EC_SVEACCESSTRAP          = 0x19,
     EC_ERETTRAP               = 0x1a,
     EC_SMETRAP                = 0x1d,
+    EC_GPC                    = 0x1e,
     EC_INSNABORT              = 0x20,
     EC_INSNABORT_SAME_EL      = 0x21,
     EC_PCALIGNMENT            = 0x22,
@@ -XXX,XX +XXX,XX @@ static inline uint32_t syn_bxjtrap(int cv, int cond, int rm)
         (cv << 24) | (cond << 20) | rm;
 }
 
+static inline uint32_t syn_gpc(int s2ptw, int ind, int gpcsc,
+                               int cm, int s1ptw, int wnr, int fsc)
+{
+    /* TODO: FEAT_NV2 adds VNCR */
+    return (EC_GPC << ARM_EL_EC_SHIFT) | ARM_EL_IL | (s2ptw << 21)
+            | (ind << 20) | (gpcsc << 14) | (cm << 8) | (s1ptw << 7)
+            | (wnr << 6) | fsc;
+}
+
 static inline uint32_t syn_insn_abort(int same_el, int ea, int s1ptw, int fsc)
 {
     return (EC_INSNABORT << ARM_EL_EC_SHIFT) | (same_el << ARM_EL_EC_SHIFT)
-- 
2.34.1

From: Richard Henderson <richard.henderson@linaro.org>

Handle GPC Fault types in arm_deliver_fault, reporting as
either a GPC exception at EL3, or falling through to insn
or data aborts at various exception levels.
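
The routing decision can be summarised with a small standalone sketch.
The SketchGPCF type and the sketch_* helpers are hypothetical mirrors
of the code added below, reduced to plain arguments so the example
compiles without the QEMU headers.

```c
#include <stdbool.h>

/* Hypothetical mirror of the ARMGPCF subtypes in this patch. */
typedef enum {
    SK_GPCF_NONE,
    SK_GPCF_ADDRESS_SIZE,
    SK_GPCF_WALK,
    SK_GPCF_EABT,
    SK_GPCF_FAIL,
} SketchGPCF;

/* True if the fault is taken to EL3 as a GPC exception. */
static bool sketch_report_as_gpc(SketchGPCF gpcf, bool scr_gpf,
                                 int current_el)
{
    switch (gpcf) {
    case SK_GPCF_NONE:
        return false;           /* not a granule protection fault */
    case SK_GPCF_FAIL:
        /* A GPF is a GPC exception only below EL3 with GPF set. */
        return scr_gpf && current_el != 3;
    default:
        return true;            /* GPT faults are always GPC */
    }
}

/* GPCSC: fault subtype code combined with the GPT level (0 or 1). */
static unsigned sketch_encode_gpcsc(SketchGPCF gpcf, unsigned level)
{
    static const unsigned char base[] = {
        [SK_GPCF_ADDRESS_SIZE] = 0x00,
        [SK_GPCF_WALK]         = 0x04,
        [SK_GPCF_FAIL]         = 0x0c,
        [SK_GPCF_EABT]         = 0x14,
    };
    return base[gpcf] | level;
}
```

When the first helper returns false for a GPF, the fault continues down
the ordinary insn or data abort path.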

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230620124418.805717-19-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h            |  1 +
 target/arm/internals.h      | 27 +++++++++++
 target/arm/helper.c         |  5 ++
 target/arm/tcg/tlb_helper.c | 96 +++++++++++++++++++++++++++++++++++--
 4 files changed, 126 insertions(+), 3 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@
 #define EXCP_UNALIGNED      22   /* v7M UNALIGNED UsageFault */
 #define EXCP_DIVBYZERO      23   /* v7M DIVBYZERO UsageFault */
 #define EXCP_VSERR          24
+#define EXCP_GPC            25   /* v9 Granule Protection Check Fault */
 /* NB: add new EXCP_ defines to the array in arm_log_exception() too */
 
 #define ARMV7M_EXCP_RESET   1
diff --git a/target/arm/internals.h b/target/arm/internals.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -XXX,XX +XXX,XX @@ typedef enum ARMFaultType {
     ARMFault_ICacheMaint,
     ARMFault_QEMU_NSCExec, /* v8M: NS executing in S&NSC memory */
     ARMFault_QEMU_SFault, /* v8M: SecureFault INVTRAN, INVEP or AUVIOL */
+    ARMFault_GPCFOnWalk,
+    ARMFault_GPCFOnOutput,
 } ARMFaultType;
 
+typedef enum ARMGPCF {
+    GPCF_None,
+    GPCF_AddressSize,
+    GPCF_Walk,
+    GPCF_EABT,
+    GPCF_Fail,
+} ARMGPCF;
+
 /**
  * ARMMMUFaultInfo: Information describing an ARM MMU Fault
  * @type: Type of fault
+ * @gpcf: Subtype of ARMFault_GPCFOn{Walk,Output}.
  * @level: Table walk level (for translation, access flag and permission faults)
  * @domain: Domain of the fault address (for non-LPAE CPUs only)
  * @s2addr: Address that caused a fault at stage 2
+ * @paddr: physical address that caused a fault for gpc
+ * @paddr_space: physical address space that caused a fault for gpc
  * @stage2: True if we faulted at stage 2
  * @s1ptw: True if we faulted at stage 2 while doing a stage 1 page-table walk
  * @s1ns: True if we faulted on a non-secure IPA while in secure state
@@ -XXX,XX +XXX,XX @@ typedef enum ARMFaultType {
 typedef struct ARMMMUFaultInfo ARMMMUFaultInfo;
 struct ARMMMUFaultInfo {
     ARMFaultType type;
+    ARMGPCF gpcf;
     target_ulong s2addr;
+    target_ulong paddr;
+    ARMSecuritySpace paddr_space;
     int level;
     int domain;
     bool stage2;
@@ -XXX,XX +XXX,XX @@ static inline uint32_t arm_fi_to_lfsc(ARMMMUFaultInfo *fi)
     case ARMFault_Exclusive:
         fsc = 0x35;
         break;
+    case ARMFault_GPCFOnWalk:
+        assert(fi->level >= -1 && fi->level <= 3);
+        if (fi->level < 0) {
+            fsc = 0b100011;
+        } else {
+            fsc = 0b100100 | fi->level;
+        }
+        break;
+    case ARMFault_GPCFOnOutput:
+        fsc = 0b101000;
+        break;
     default:
         /* Other faults can't occur in a context that requires a
          * long-format status code.
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ void arm_log_exception(CPUState *cs)
             [EXCP_UNALIGNED] = "v7M UNALIGNED UsageFault",
             [EXCP_DIVBYZERO] = "v7M DIVBYZERO UsageFault",
             [EXCP_VSERR] = "Virtual SERR",
+            [EXCP_GPC] = "Granule Protection Check",
         };
 
         if (idx >= 0 && idx < ARRAY_SIZE(excnames)) {
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_do_interrupt_aarch64(CPUState *cs)
     }
 
     switch (cs->exception_index) {
+    case EXCP_GPC:
+        qemu_log_mask(CPU_LOG_INT, "...with MFAR 0x%" PRIx64 "\n",
+                      env->cp15.mfar_el3);
+        /* fall through */
     case EXCP_PREFETCH_ABORT:
     case EXCP_DATA_ABORT:
         /*
diff --git a/target/arm/tcg/tlb_helper.c b/target/arm/tcg/tlb_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/tlb_helper.c
+++ b/target/arm/tcg/tlb_helper.c
@@ -XXX,XX +XXX,XX @@ static uint32_t compute_fsr_fsc(CPUARMState *env, ARMMMUFaultInfo *fi,
     return fsr;
 }
 
+static bool report_as_gpc_exception(ARMCPU *cpu, int current_el,
+                                    ARMMMUFaultInfo *fi)
+{
+    bool ret;
+
+    switch (fi->gpcf) {
+    case GPCF_None:
+        return false;
+    case GPCF_AddressSize:
+    case GPCF_Walk:
+    case GPCF_EABT:
+        /* R_PYTGX: GPT faults are reported as GPC. */
+        ret = true;
+        break;
+    case GPCF_Fail:
+        /*
+         * R_BLYPM: A GPF at EL3 is reported as insn or data abort.
+         * R_VBZMW, R_LXHQR: A GPF at EL[0-2] is reported as a GPC
+         * if SCR_EL3.GPF is set, otherwise an insn or data abort.
+         */
+        ret = (cpu->env.cp15.scr_el3 & SCR_GPF) && current_el != 3;
+        break;
+    default:
+        g_assert_not_reached();
+    }
+
+    assert(cpu_isar_feature(aa64_rme, cpu));
+    assert(fi->type == ARMFault_GPCFOnWalk ||
+           fi->type == ARMFault_GPCFOnOutput);
+    if (fi->gpcf == GPCF_AddressSize) {
+        assert(fi->level == 0);
+    } else {
+        assert(fi->level >= 0 && fi->level <= 1);
+    }
+
+    return ret;
+}
+
+static unsigned encode_gpcsc(ARMMMUFaultInfo *fi)
+{
+    static uint8_t const gpcsc[] = {
+        [GPCF_AddressSize] = 0b000000,
+        [GPCF_Walk]        = 0b000100,
+        [GPCF_Fail]        = 0b001100,
+        [GPCF_EABT]        = 0b010100,
+    };
+
+    /* Note that we've validated fi->gpcf and fi->level above. */
+    return gpcsc[fi->gpcf] | fi->level;
+}
+
 static G_NORETURN
 void arm_deliver_fault(ARMCPU *cpu, vaddr addr,
                        MMUAccessType access_type,
                        int mmu_idx, ARMMMUFaultInfo *fi)
 {
     CPUARMState *env = &cpu->env;
-    int target_el;
+    int target_el = exception_target_el(env);
+    int current_el = arm_current_el(env);
     bool same_el;
     uint32_t syn, exc, fsr, fsc;
 
-    target_el = exception_target_el(env);
+    if (report_as_gpc_exception(cpu, current_el, fi)) {
+        target_el = 3;
+
+        fsr = compute_fsr_fsc(env, fi, target_el, mmu_idx, &fsc);
+
+        syn = syn_gpc(fi->stage2 && fi->type == ARMFault_GPCFOnWalk,
+                      access_type == MMU_INST_FETCH,
+                      encode_gpcsc(fi), 0, fi->s1ptw,
+                      access_type == MMU_DATA_STORE, fsc);
+
+        env->cp15.mfar_el3 = fi->paddr;
+        switch (fi->paddr_space) {
+        case ARMSS_Secure:
+            break;
+        case ARMSS_NonSecure:
+            env->cp15.mfar_el3 |= R_MFAR_NS_MASK;
+            break;
+        case ARMSS_Root:
+            env->cp15.mfar_el3 |= R_MFAR_NSE_MASK;
+            break;
+        case ARMSS_Realm:
+            env->cp15.mfar_el3 |= R_MFAR_NSE_MASK | R_MFAR_NS_MASK;
+            break;
+        default:
+            g_assert_not_reached();
+        }
+
+        exc = EXCP_GPC;
+        goto do_raise;
+    }
+
+    /* If SCR_EL3.GPF is unset, GPF may still be routed to EL2. */
+    if (fi->gpcf == GPCF_Fail && target_el < 2) {
+        if (arm_hcr_el2_eff(env) & HCR_GPF) {
+            target_el = 2;
+        }
+    }
+
     if (fi->stage2) {
         target_el = 2;
         env->cp15.hpfar_el2 = extract64(fi->s2addr, 12, 47) << 4;
@@ -XXX,XX +XXX,XX @@ void arm_deliver_fault(ARMCPU *cpu, vaddr addr,
             env->cp15.hpfar_el2 |= HPFAR_NS;
         }
     }
-    same_el = (arm_current_el(env) == target_el);
 
+    same_el = current_el == target_el;
     fsr = compute_fsr_fsc(env, fi, target_el, mmu_idx, &fsc);
 
     if (access_type == MMU_INST_FETCH) {
@@ -XXX,XX +XXX,XX @@ void arm_deliver_fault(ARMCPU *cpu, vaddr addr,
         exc = EXCP_DATA_ABORT;
     }
 
+ do_raise:
     env->exception.vaddress = addr;
     env->exception.fsr = fsr;
     raise_exception(env, exc, syn, target_el);
-- 
2.34.1

From: Richard Henderson <richard.henderson@linaro.org>

Place the granule protection check at the end of
get_phys_addr_with_struct, so that we check all physical results.
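
The final step of the check, once a 4-bit GPI has been extracted from
the table, might be sketched as follows. GpiResult and gpi_decision
are hypothetical names, and pspace is assumed to use the
(NSE << 1) | NS numbering (Secure 0, NonSecure 1, Root 2, Realm 3).

```c
/* Hypothetical outcome of evaluating one GPI against an access. */
typedef enum {
    GPI_ALLOW,      /* access permitted */
    GPI_DENY,       /* granule protection fault */
    GPI_RESERVED,   /* reserved encoding: GPT walk fault */
} GpiResult;

/* Decide whether an access to physical space pspace may proceed. */
static GpiResult gpi_decision(unsigned gpi, unsigned pspace)
{
    switch (gpi) {
    case 0x0:                   /* no access */
        return GPI_DENY;
    case 0xf:                   /* all access */
        return GPI_ALLOW;
    case 0x8:
    case 0x9:
    case 0xa:
    case 0xb:
        /* The low two bits name the one space that may access. */
        return pspace == (gpi & 3) ? GPI_ALLOW : GPI_DENY;
    default:
        return GPI_RESERVED;
    }
}
```

A deny result corresponds to GPCF_Fail and a reserved encoding to
GPCF_Walk in the code below.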

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230620124418.805717-20-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/ptw.c | 249 +++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 232 insertions(+), 17 deletions(-)

diff --git a/target/arm/ptw.c b/target/arm/ptw.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/ptw.c
+++ b/target/arm/ptw.c
@@ -XXX,XX +XXX,XX @@ typedef struct S1Translate {
     void *out_host;
 } S1Translate;
 
-static bool get_phys_addr_with_struct(CPUARMState *env, S1Translate *ptw,
-                                      target_ulong address,
-                                      MMUAccessType access_type,
-                                      GetPhysAddrResult *result,
-                                      ARMMMUFaultInfo *fi);
+static bool get_phys_addr_nogpc(CPUARMState *env, S1Translate *ptw,
+                                target_ulong address,
+                                MMUAccessType access_type,
+                                GetPhysAddrResult *result,
+                                ARMMMUFaultInfo *fi);
+
+static bool get_phys_addr_gpc(CPUARMState *env, S1Translate *ptw,
+                              target_ulong address,
+                              MMUAccessType access_type,
+                              GetPhysAddrResult *result,
+                              ARMMMUFaultInfo *fi);
 
 /* This mapping is common between ID_AA64MMFR0.PARANGE and TCR_ELx.{I}PS. */
 static const uint8_t pamax_map[] = {
@@ -XXX,XX +XXX,XX @@ static bool regime_translation_disabled(CPUARMState *env, ARMMMUIdx mmu_idx,
     return (regime_sctlr(env, mmu_idx) & SCTLR_M) == 0;
 }
 
+static bool granule_protection_check(CPUARMState *env, uint64_t paddress,
+                                     ARMSecuritySpace pspace,
+                                     ARMMMUFaultInfo *fi)
+{
+    MemTxAttrs attrs = {
+        .secure = true,
+        .space = ARMSS_Root,
+    };
+    ARMCPU *cpu = env_archcpu(env);
+    uint64_t gpccr = env->cp15.gpccr_el3;
+    unsigned pps, pgs, l0gptsz, level = 0;
+    uint64_t tableaddr, pps_mask, align, entry, index;
+    AddressSpace *as;
+    MemTxResult result;
+    int gpi;
+
+    if (!FIELD_EX64(gpccr, GPCCR, GPC)) {
+        return true;
+    }
+
+    /*
+     * GPC Priority 1 (R_GMGRR):
+     * R_JWCSM: If the configuration of GPCCR_EL3 is invalid,
+     * the access fails as GPT walk fault at level 0.
+     */
+
+    /*
+     * Configuration of PPS to a value exceeding the implemented
+     * physical address size is invalid.
+     */
+    pps = FIELD_EX64(gpccr, GPCCR, PPS);
+    if (pps > FIELD_EX64(cpu->isar.id_aa64mmfr0, ID_AA64MMFR0, PARANGE)) {
+        goto fault_walk;
+    }
+    pps = pamax_map[pps];
+    pps_mask = MAKE_64BIT_MASK(0, pps);
+
+    switch (FIELD_EX64(gpccr, GPCCR, SH)) {
+    case 0b10: /* outer shareable */
+        break;
+    case 0b00: /* non-shareable */
+    case 0b11: /* inner shareable */
+        /* Inner and Outer non-cacheable requires Outer shareable. */
+        if (FIELD_EX64(gpccr, GPCCR, ORGN) == 0 &&
+            FIELD_EX64(gpccr, GPCCR, IRGN) == 0) {
+            goto fault_walk;
+        }
+        break;
+    default:   /* reserved */
+        goto fault_walk;
+    }
+
+    switch (FIELD_EX64(gpccr, GPCCR, PGS)) {
+    case 0b00: /* 4KB */
+        pgs = 12;
+        break;
+    case 0b01: /* 64KB */
+        pgs = 16;
+        break;
+    case 0b10: /* 16KB */
+        pgs = 14;
+        break;
+    default: /* reserved */
+        goto fault_walk;
+    }
+
+    /* Note this field is read-only and fixed at reset. */
+    l0gptsz = 30 + FIELD_EX64(gpccr, GPCCR, L0GPTSZ);
+
+    /*
+     * GPC Priority 2: Secure, Realm or Root address exceeds PPS.
+     * R_CPDSB: A NonSecure physical address input exceeding PPS
+     * does not experience any fault.
+     */
+    if (paddress & ~pps_mask) {
+        if (pspace == ARMSS_NonSecure) {
+            return true;
+        }
+        goto fault_size;
+    }
+
+    /* GPC Priority 3: the base address of GPTBR_EL3 exceeds PPS. */
+    tableaddr = env->cp15.gptbr_el3 << 12;
+    if (tableaddr & ~pps_mask) {
+        goto fault_size;
+    }
+
+    /*
+     * BADDR is aligned per a function of PPS and L0GPTSZ.
+     * These bits of GPTBR_EL3 are RES0, but are not a configuration error,
+     * unlike the RES0 bits of the GPT entries (R_XNKFZ).
+     */
+    align = MAX(pps - l0gptsz + 3, 12);
+    align = MAKE_64BIT_MASK(0, align);
+    tableaddr &= ~align;
+
+    as = arm_addressspace(env_cpu(env), attrs);
+
+    /* Level 0 lookup. */
+    index = extract64(paddress, l0gptsz, pps - l0gptsz);
+    tableaddr += index * 8;
+    entry = address_space_ldq_le(as, tableaddr, attrs, &result);
+    if (result != MEMTX_OK) {
+        goto fault_eabt;
+    }
+
+    switch (extract32(entry, 0, 4)) {
+    case 1: /* block descriptor */
+        if (entry >> 8) {
+            goto fault_walk; /* RES0 bits not 0 */
+        }
+        gpi = extract32(entry, 4, 4);
+        goto found;
+    case 3: /* table descriptor */
+        tableaddr = entry & ~0xf;
+        align = MAX(l0gptsz - pgs - 1, 12);
+        align = MAKE_64BIT_MASK(0, align);
+        if (tableaddr & (~pps_mask | align)) {
+            goto fault_walk; /* RES0 bits not 0 */
+        }
+        break;
+    default: /* invalid */
+        goto fault_walk;
+    }
+
+    /* Level 1 lookup */
+    level = 1;
+    index = extract64(paddress, pgs + 4, l0gptsz - pgs - 4);
+    tableaddr += index * 8;
+    entry = address_space_ldq_le(as, tableaddr, attrs, &result);
+    if (result != MEMTX_OK) {
+        goto fault_eabt;
+    }
+
+    switch (extract32(entry, 0, 4)) {
+    case 1: /* contiguous descriptor */
+        if (entry >> 10) {
+            goto fault_walk; /* RES0 bits not 0 */
+        }
+        /*
+         * Because the softmmu tlb only works on units of TARGET_PAGE_SIZE,
+         * and because we cannot invalidate by pa, and thus will always
+         * flush entire tlbs, we don't actually care about the range here
+         * and can simply extract the GPI as the result.
+         */
+        if (extract32(entry, 8, 2) == 0) {
+            goto fault_walk; /* reserved contig */
+        }
+        gpi = extract32(entry, 4, 4);
+        break;
+    default:
+        index = extract64(paddress, pgs, 4);
+        gpi = extract64(entry, index * 4, 4);
+        break;
+    }
+
+ found:
+    switch (gpi) {
+    case 0b0000: /* no access */
+        break;
+    case 0b1111: /* all access */
+        return true;
+    case 0b1000:
+    case 0b1001:
+    case 0b1010:
+    case 0b1011:
+        if (pspace == (gpi & 3)) {
+            return true;
+        }
+        break;
+    default:
+        goto fault_walk; /* reserved */
+    }
+
+    fi->gpcf = GPCF_Fail;
+    goto fault_common;
+ fault_eabt:
+    fi->gpcf = GPCF_EABT;
+    goto fault_common;
+ fault_size:
+    fi->gpcf = GPCF_AddressSize;
+    goto fault_common;
+ fault_walk:
+    fi->gpcf = GPCF_Walk;
+ fault_common:
+    fi->level = level;
+    fi->paddr = paddress;
+    fi->paddr_space = pspace;
+    return false;
+}
+
 static bool S2_attrs_are_device(uint64_t hcr, uint8_t attrs)
 {
     /*
@@ -XXX,XX +XXX,XX @@ static bool S1_ptw_translate(CPUARMState *env, S1Translate *ptw,
         };
         GetPhysAddrResult s2 = { };
 
-        if (get_phys_addr_with_struct(env, &s2ptw, addr,
-                                      MMU_DATA_LOAD, &s2, fi)) {
+        if (get_phys_addr_gpc(env, &s2ptw, addr, MMU_DATA_LOAD, &s2, fi)) {
             goto fail;
         }
+
         ptw->out_phys = s2.f.phys_addr;
         pte_attrs = s2.cacheattrs.attrs;
         ptw->out_host = NULL;
@@ -XXX,XX +XXX,XX @@ static bool S1_ptw_translate(CPUARMState *env, S1Translate *ptw,
 
  fail:
     assert(fi->type != ARMFault_None);
+    if (fi->type == ARMFault_GPCFOnOutput) {
+        fi->type = ARMFault_GPCFOnWalk;
+    }
     fi->s2addr = addr;
     fi->stage2 = true;
     fi->s1ptw = true;
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_disabled(CPUARMState *env, target_ulong address,
                                    ARMMMUFaultInfo *fi)
 {
     uint8_t memattr = 0x00;    /* Device nGnRnE */
-    uint8_t shareability = 0;  /* non-sharable */
+    uint8_t shareability = 0;  /* non-shareable */
     int r_el;
 
     switch (mmu_idx) {
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_disabled(CPUARMState *env, target_ulong address,
             } else {
                 memattr = 0x44;  /* Normal, NC, No */
             }
-            shareability = 2; /* outer sharable */
+            shareability = 2; /* outer shareable */
         }
         result->cacheattrs.is_s2_format = false;
         break;
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_twostage(CPUARMState *env, S1Translate *ptw,
     ARMSecuritySpace ipa_space;
     uint64_t hcr;
 
-    ret = get_phys_addr_with_struct(env, ptw, address, access_type, result, fi);
+    ret = get_phys_addr_nogpc(env, ptw, address, access_type, result, fi);
 
     /* If S1 fails, return early.  */
     if (ret) {
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_twostage(CPUARMState *env, S1Translate *ptw,
     cacheattrs1 = result->cacheattrs;
     memset(result, 0, sizeof(*result));
 
-    ret = get_phys_addr_with_struct(env, ptw, ipa, access_type, result, fi);
+    ret = get_phys_addr_nogpc(env, ptw, ipa, access_type, result, fi);
     fi->s2addr = ipa;
 
     /* Combine the S1 and S2 perms.  */
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_twostage(CPUARMState *env, S1Translate *ptw,
     return false;
 }
 
-static bool get_phys_addr_with_struct(CPUARMState *env, S1Translate *ptw,
+static bool get_phys_addr_nogpc(CPUARMState *env, S1Translate *ptw,
                                       target_ulong address,
                                       MMUAccessType access_type,
                                       GetPhysAddrResult *result,
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_with_struct(CPUARMState *env, S1Translate *ptw,
     }
 }
 
+static bool get_phys_addr_gpc(CPUARMState *env, S1Translate *ptw,
+                              target_ulong address,
+                              MMUAccessType access_type,
+                              GetPhysAddrResult *result,
+                              ARMMMUFaultInfo *fi)
+{
+    if (get_phys_addr_nogpc(env, ptw, address, access_type, result, fi)) {
+        return true;
+    }
+    if (!granule_protection_check(env, result->f.phys_addr,
+                                  result->f.attrs.space, fi)) {
+        fi->type = ARMFault_GPCFOnOutput;
+        return true;
+    }
+    return false;
+}
+
 bool get_phys_addr_with_secure(CPUARMState *env, target_ulong address,
                                MMUAccessType access_type, ARMMMUIdx mmu_idx,
                                bool is_secure, GetPhysAddrResult *result,
@@ -XXX,XX +XXX,XX @@ bool get_phys_addr_with_secure(CPUARMState *env, target_ulong address,
         .in_secure = is_secure,
         .in_space = arm_secure_to_space(is_secure),
     };
-    return get_phys_addr_with_struct(env, &ptw, address, access_type,
-                                     result, fi);
+    return get_phys_addr_gpc(env, &ptw, address, access_type, result, fi);
 }
 
 bool get_phys_addr(CPUARMState *env, target_ulong address,
@@ -XXX,XX +XXX,XX @@ bool get_phys_addr(CPUARMState *env, target_ulong address,
 
     ptw.in_space = ss;
     ptw.in_secure = arm_space_is_secure(ss);
-    return get_phys_addr_with_struct(env, &ptw, address, access_type,
-                                     result, fi);
+    return get_phys_addr_gpc(env, &ptw, address, access_type, result, fi);
 }
 
 hwaddr arm_cpu_get_phys_page_attrs_debug(CPUState *cs, vaddr addr,
@@ -XXX,XX +XXX,XX @@ hwaddr arm_cpu_get_phys_page_attrs_debug(CPUState *cs, vaddr addr,
     ARMMMUFaultInfo fi = {};
     bool ret;
 
-    ret = get_phys_addr_with_struct(env, &ptw, addr, MMU_DATA_LOAD, &res, &fi);
+    ret = get_phys_addr_gpc(env, &ptw, addr, MMU_DATA_LOAD, &res, &fi);
     *attrs = res.f.attrs;
 
     if (ret) {
-- 
2.34.1

From: Richard Henderson <richard.henderson@linaro.org>

Add an x-rme cpu property to enable FEAT_RME.
Add an x-l0gptsz property to set GPCCR_EL3.L0GPTSZ,
for testing various possible configurations.

We're not currently completely sure whether FEAT_RME will
be OK to enable purely as a CPU-level property, or if it will
need board co-operation, so we're making these experimental
x- properties, so that the people developing the system
level software for RME can try to start using this and let
us know how it goes. The command line syntax for enabling
this will change in future, without backwards-compatibility.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230620124418.805717-21-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/cpu64.c | 53 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 53 insertions(+)

diff --git a/target/arm/tcg/cpu64.c b/target/arm/tcg/cpu64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/cpu64.c
+++ b/target/arm/tcg/cpu64.c
@@ -XXX,XX +XXX,XX @@ static void cpu_max_set_sve_max_vq(Object *obj, Visitor *v, const char *name,
     cpu->sve_max_vq = max_vq;
 }
 
+static bool cpu_arm_get_rme(Object *obj, Error **errp)
+{
+    ARMCPU *cpu = ARM_CPU(obj);
+    return cpu_isar_feature(aa64_rme, cpu);
+}
+
+static void cpu_arm_set_rme(Object *obj, bool value, Error **errp)
+{
+    ARMCPU *cpu = ARM_CPU(obj);
+    uint64_t t;
+
+    t = cpu->isar.id_aa64pfr0;
+    t = FIELD_DP64(t, ID_AA64PFR0, RME, value);
+    cpu->isar.id_aa64pfr0 = t;
+}
+
+static void cpu_max_set_l0gptsz(Object *obj, Visitor *v, const char *name,
+                                void *opaque, Error **errp)
+{
+    ARMCPU *cpu = ARM_CPU(obj);
+    uint32_t value;
+
+    if (!visit_type_uint32(v, name, &value, errp)) {
+        return;
+    }
+
+    /* Encode the value for the GPCCR_EL3 field. */
+    switch (value) {
+    case 30:
+    case 34:
+    case 36:
+    case 39:
+        cpu->reset_l0gptsz = value - 30;
+        break;
+    default:
+        error_setg(errp, "invalid value for l0gptsz");
+        error_append_hint(errp, "valid values are 30, 34, 36, 39\n");
+        break;
+    }
+}
+
+static void cpu_max_get_l0gptsz(Object *obj, Visitor *v, const char *name,
+                                void *opaque, Error **errp)
+{
+    ARMCPU *cpu = ARM_CPU(obj);
+    uint32_t value = cpu->reset_l0gptsz + 30;
+
+    visit_type_uint32(v, name, &value, errp);
+}
+
 static Property arm_cpu_lpa2_property =
     DEFINE_PROP_BOOL("lpa2", ARMCPU, prop_lpa2, true);
 
@@ -XXX,XX +XXX,XX @@ void aarch64_max_tcg_initfn(Object *obj)
     aarch64_add_sme_properties(obj);
     object_property_add(obj, "sve-max-vq", "uint32", cpu_max_get_sve_max_vq,
                         cpu_max_set_sve_max_vq, NULL, NULL);
+    object_property_add_bool(obj, "x-rme", cpu_arm_get_rme, cpu_arm_set_rme);
+    object_property_add(obj, "x-l0gptsz", "uint32", cpu_max_get_l0gptsz,
+                        cpu_max_set_l0gptsz, NULL, NULL);
     qdev_property_add_static(DEVICE(obj), &arm_cpu_lpa2_property);
 }
 
-- 
2.34.1

From: Richard Henderson <richard.henderson@linaro.org>

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20230622143046.1578160-1-richard.henderson@linaro.org
[PMM: fixed typo; note experimental status in emulation.rst too]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 docs/system/arm/cpu-features.rst | 23 +++++++++++++++++++++++
 docs/system/arm/emulation.rst    |  1 +
 2 files changed, 24 insertions(+)

diff --git a/docs/system/arm/cpu-features.rst b/docs/system/arm/cpu-features.rst
index XXXXXXX..XXXXXXX 100644
--- a/docs/system/arm/cpu-features.rst
+++ b/docs/system/arm/cpu-features.rst
@@ -XXX,XX +XXX,XX @@ As with ``sve-default-vector-length``, if the default length is larger
 than the maximum vector length enabled, the actual vector length will
 be reduced.  If this property is set to ``-1`` then the default vector
 length is set to the maximum possible length.
+
+RME CPU Properties
+==================
+
+The status of RME support with QEMU is experimental.  At this time we
+only support RME within the CPU proper, not within the SMMU or GIC.
+The feature is enabled by the CPU property ``x-rme``, with the ``x-``
+prefix present as a reminder of the experimental status, and defaults off.
+
+The method for enabling RME will change in some future QEMU release
+without notice or backward compatibility.
+
+RME Level 0 GPT Size Property
+-----------------------------
+
+To aid firmware developers in testing different possible CPU
+configurations, ``x-l0gptsz=S`` may be used to specify the value
+to encode into ``GPCCR_EL3.L0GPTSZ``, a read-only field that
+specifies the size of the Level 0 Granule Protection Table.
+Legal values for ``S`` are 30, 34, 36, and 39; the default is 30.
+
+As with ``x-rme``, the ``x-l0gptsz`` property may be renamed or
+removed in some future QEMU release.
diff --git a/docs/system/arm/emulation.rst b/docs/system/arm/emulation.rst
index XXXXXXX..XXXXXXX 100644
--- a/docs/system/arm/emulation.rst
+++ b/docs/system/arm/emulation.rst
@@ -XXX,XX +XXX,XX @@ the following architecture extensions:
 - FEAT_RAS (Reliability, availability, and serviceability)
 - FEAT_RASv1p1 (RAS Extension v1.1)
 - FEAT_RDM (Advanced SIMD rounding double multiply accumulate instructions)
+- FEAT_RME (Realm Management Extension) (NB: support status in QEMU is experimental)
 - FEAT_RNG (Random number generator)
 - FEAT_S2FWB (Stage 2 forced Write-Back)
 - FEAT_SB (Speculation Barrier)
-- 
2.34.1

We use __builtin_subcll() to do a 64-bit subtract with borrow-in and
borrow-out when the host compiler supports it.  Unfortunately some
versions of Apple Clang have a bug in their implementation of this
intrinsic which means it returns the wrong value.  The effect is that
a QEMU built with the affected compiler will hang when emulating x86
or m68k float80 division.

The upstream LLVM issue is:
https://github.com/llvm/llvm-project/issues/55253

The commit that introduced the bug apparently never made it into an
upstream LLVM release without the subsequent fix
https://github.com/llvm/llvm-project/commit/fffb6e6afdbaba563189c1f715058ed401fbc88d
but unfortunately it did make it into Apple Clang 14.0, as shipped
in Xcode 14.3 (14.2 is reported to be OK). The Apple bug number is
FB12210478.

Add ifdefs to avoid use of __builtin_subcll() on Apple Clang version
14 or greater.  There is not currently a version of Apple Clang which
has the bug fix -- when one appears we should be able to add an upper
bound to the ifdef condition so we can start using the builtin again.
We make the lower bound a conservative "any Apple clang with major
version 14 or greater" because the consequences of incorrectly
disabling the builtin when it would work are pretty small and the
consequences of not disabling it when we should are pretty bad.

Many thanks to those users who both reported this bug and also
did a lot of work in identifying the root cause; in particular
to Daniel Bertalan and osy.

Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1631
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1659
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Tested-by: Daniel Bertalan <dani@danielbertalan.dev>
Tested-by: Solra Bizna <solra@bizna.name>
Message-id: 20230622130823.1631719-1-peter.maydell@linaro.org
---
 include/qemu/compiler.h   | 13 +++++++++++++
 include/qemu/host-utils.h |  2 +-
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/include/qemu/compiler.h b/include/qemu/compiler.h
index XXXXXXX..XXXXXXX 100644
--- a/include/qemu/compiler.h
+++ b/include/qemu/compiler.h
@@ -XXX,XX +XXX,XX @@
 #define QEMU_DISABLE_CFI
 #endif
 
+/*
+ * Apple clang version 14 has a bug in its __builtin_subcll(); define
+ * BUILTIN_SUBCLL_BROKEN for the offending versions so we can avoid it.
+ * When a version of Apple clang which has this bug fixed is released
+ * we can add an upper bound to this check.
+ * See https://gitlab.com/qemu-project/qemu/-/issues/1631
+ * and https://gitlab.com/qemu-project/qemu/-/issues/1659 for details.
+ * The bug never made it into any upstream LLVM releases, only Apple ones.
+ */
+#if defined(__apple_build_version__) && __clang_major__ >= 14
+#define BUILTIN_SUBCLL_BROKEN
+#endif
+
 #endif /* COMPILER_H */
diff --git a/include/qemu/host-utils.h b/include/qemu/host-utils.h
index XXXXXXX..XXXXXXX 100644
--- a/include/qemu/host-utils.h
+++ b/include/qemu/host-utils.h
@@ -XXX,XX +XXX,XX @@ static inline uint64_t uadd64_carry(uint64_t x, uint64_t y, bool *pcarry)
  */
 static inline uint64_t usub64_borrow(uint64_t x, uint64_t y, bool *pborrow)
 {
-#if __has_builtin(__builtin_subcll)
+#if __has_builtin(__builtin_subcll) && !defined(BUILTIN_SUBCLL_BROKEN)
     unsigned long long b = *pborrow;
     x = __builtin_subcll(x, y, b, &b);
     *pborrow = b & 1;
-- 
2.34.1

From: Richard Henderson <richard.henderson@linaro.org>

One cannot test for feature aa32_simd_r32 without first
testing if AArch32 mode is supported at all.  This leads to

qemu-system-aarch64: ARM CPUs must have both VFP-D32 and Neon or neither

for Apple M1 CPUs.

We already have a check for ARMv8-A never setting vfp-d32 true,
so restructure the code so that AArch64 avoids the test entirely.
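
The restructured decision can be modelled in isolation roughly like
this. It is a simplified sketch: struct cpu_model and model_post_init
are hypothetical stand-ins for ARMCPU and arm_cpu_post_init(), and
property registration is left out:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the CPU state involved. */
struct cpu_model {
    bool is_aarch64;      /* models ARM_FEATURE_AARCH64 */
    bool aa64_fp_simd;    /* models cpu_isar_feature(aa64_fp_simd) */
    bool aa32_vfp;        /* models cpu_isar_feature(aa32_vfp) */
    bool aa32_simd_r32;   /* only meaningful if AArch32 is implemented */
    bool has_vfp;         /* output */
    bool has_vfp_d32;     /* output */
};

static void model_post_init(struct cpu_model *c)
{
    assert(c != NULL);
    c->has_vfp = false;
    c->has_vfp_d32 = false;

    if (c->is_aarch64) {
        /* AArch64: never consult the AArch32 ID register fields. */
        if (c->aa64_fp_simd) {
            c->has_vfp = true;
            c->has_vfp_d32 = true;
        }
    } else if (c->aa32_vfp) {
        c->has_vfp = true;
        if (c->aa32_simd_r32) {
            c->has_vfp_d32 = true;
        }
    }
}
```

In this model an AArch64 CPU with FP/SIMD gets has_vfp_d32 set even
when its AArch32 fields read as zero, which is the case the old code
got wrong.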

Reported-by: Mads Ynddal <mads@ynddal.dk>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Mads Ynddal <m.ynddal@samsung.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Mads Ynddal <m.ynddal@samsung.com>
Message-id: 20230619140216.402530-1-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.c | 28 +++++++++++++++-------------
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ void arm_cpu_post_init(Object *obj)
      * KVM does not currently allow us to lie to the guest about its
      * ID/feature registers, so the guest always sees what the host has.
      */
-    if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64)
-        ? cpu_isar_feature(aa64_fp_simd, cpu)
-        : cpu_isar_feature(aa32_vfp, cpu)) {
-        cpu->has_vfp = true;
-        if (!kvm_enabled()) {
-            qdev_property_add_static(DEVICE(obj), &arm_cpu_has_vfp_property);
+    if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64)) {
+        if (cpu_isar_feature(aa64_fp_simd, cpu)) {
+            cpu->has_vfp = true;
+            cpu->has_vfp_d32 = true;
+            if (tcg_enabled() || qtest_enabled()) {
+                qdev_property_add_static(DEVICE(obj),
+                                         &arm_cpu_has_vfp_property);
+            }
         }
-    }
-
-    if (cpu->has_vfp && cpu_isar_feature(aa32_simd_r32, cpu)) {
-        cpu->has_vfp_d32 = true;
-        if (!kvm_enabled()) {
+    } else if (cpu_isar_feature(aa32_vfp, cpu)) {
+        cpu->has_vfp = true;
+        if (cpu_isar_feature(aa32_simd_r32, cpu)) {
+            cpu->has_vfp_d32 = true;
             /*
              * The permitted values of the SIMDReg bits [3:0] on
              * Armv8-A are either 0b0000 and 0b0010. On such CPUs,
              * make sure that has_vfp_d32 can not be set to false.
              */
-            if (!(arm_feature(&cpu->env, ARM_FEATURE_V8) &&
-                  !arm_feature(&cpu->env, ARM_FEATURE_M))) {
+            if ((tcg_enabled() || qtest_enabled())
+                && !(arm_feature(&cpu->env, ARM_FEATURE_V8)
+                     && !arm_feature(&cpu->env, ARM_FEATURE_M))) {
                 qdev_property_add_static(DEVICE(obj),
                                          &arm_cpu_has_vfp_d32_property);
             }
-- 
2.34.1

From: Shashi Mallela <shashi.mallela@linaro.org>

Create ITS as part of SBSA platform GIC initialization.

GIC ITS information is in DeviceTree so TF-A can pass it to EDK2.

Bump the platform version to 0.2, as this is an important hardware change.
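
As a quick sanity check of where the new ITS frame lands, the
following sketch tests it against its neighbours in the memory map.
The base and size values are the ones in this patch; the MemRegion
type and the helper names are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a memory map entry: base and size. */
typedef struct {
    uint64_t base;
    uint64_t size;
} MemRegion;

/* Values taken from sbsa_ref_memmap[] as modified by this patch. */
static const MemRegion gic_redist = { 0x40080000, 0x04000000 };
static const MemRegion gic_its    = { 0x44081000, 0x00020000 };
static const MemRegion secure_ec  = { 0x50000000, 0x00001000 };

/* First address past the end of the region. */
static uint64_t region_end(MemRegion r)
{
    assert(r.size != 0);
    return r.base + r.size;
}

/* True if the two half-open ranges [base, end) intersect. */
static bool regions_overlap(MemRegion a, MemRegion b)
{
    return a.base < region_end(b) && b.base < region_end(a);
}
```

The redistributor region ends at 0x44080000, so the ITS frame at
0x44081000 sits above it and well below the secure EC at 0x50000000.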

Signed-off-by: Shashi Mallela <shashi.mallela@linaro.org>
Signed-off-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Message-id: 20230619170913.517373-2-marcin.juszkiewicz@linaro.org
Co-authored-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 docs/system/arm/sbsa.rst | 14 ++++++++++++++
 hw/arm/sbsa-ref.c        | 33 ++++++++++++++++++++++++++++++---
 2 files changed, 44 insertions(+), 3 deletions(-)

diff --git a/docs/system/arm/sbsa.rst b/docs/system/arm/sbsa.rst
index XXXXXXX..XXXXXXX 100644
--- a/docs/system/arm/sbsa.rst
+++ b/docs/system/arm/sbsa.rst
@@ -XXX,XX +XXX,XX @@ to be a complete compliant DT. It currently reports:
    - platform version
    - GIC addresses
 
+Platform version
+''''''''''''''''
+
 The platform version is only for informing platform firmware about
 what kind of ``sbsa-ref`` board it is running on. It is neither
 a QEMU versioned machine type nor a reflection of the level of the
@@ -XXX,XX +XXX,XX @@ SBSA/SystemReady SR support provided.
 The ``machine-version-major`` value is updated when changes breaking
 fw compatibility are introduced. The ``machine-version-minor`` value
 is updated when features are added that don't break fw compatibility.
+
+Platform version changes:
+
+0.0
+  Devicetree holds information about CPUs, memory and platform version.
+
+0.1
+  GIC information is present in devicetree.
+
+0.2
+  GIC ITS information is present in devicetree.
diff --git a/hw/arm/sbsa-ref.c b/hw/arm/sbsa-ref.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/sbsa-ref.c
+++ b/hw/arm/sbsa-ref.c
@@ -XXX,XX +XXX,XX @@ enum {
     SBSA_CPUPERIPHS,
     SBSA_GIC_DIST,
     SBSA_GIC_REDIST,
+    SBSA_GIC_ITS,
     SBSA_SECURE_EC,
     SBSA_GWDT_WS0,
     SBSA_GWDT_REFRESH,
@@ -XXX,XX +XXX,XX @@ static const MemMapEntry sbsa_ref_memmap[] = {
     [SBSA_CPUPERIPHS] =         { 0x40000000, 0x00040000 },
     [SBSA_GIC_DIST] =           { 0x40060000, 0x00010000 },
     [SBSA_GIC_REDIST] =         { 0x40080000, 0x04000000 },
+    [SBSA_GIC_ITS] =            { 0x44081000, 0x00020000 },
     [SBSA_SECURE_EC] =          { 0x50000000, 0x00001000 },
     [SBSA_GWDT_REFRESH] =       { 0x50010000, 0x00001000 },
     [SBSA_GWDT_CONTROL] =       { 0x50011000, 0x00001000 },
@@ -XXX,XX +XXX,XX @@ static void sbsa_fdt_add_gic_node(SBSAMachineState *sms)
                                  2, sbsa_ref_memmap[SBSA_GIC_REDIST].base,
                                  2, sbsa_ref_memmap[SBSA_GIC_REDIST].size);
 
+    nodename = g_strdup_printf("/intc/its");
+    qemu_fdt_add_subnode(sms->fdt, nodename);
+    qemu_fdt_setprop_sized_cells(sms->fdt, nodename, "reg",
+                                 2, sbsa_ref_memmap[SBSA_GIC_ITS].base,
+                                 2, sbsa_ref_memmap[SBSA_GIC_ITS].size);
+
     g_free(nodename);
 }
+
 /*
  * Firmware on this machine only uses ACPI table to load OS, these limited
  * device tree nodes are just to let firmware know the info which varies from
@@ -XXX,XX +XXX,XX @@ static void create_fdt(SBSAMachineState *sms)
      *                        fw compatibility.
      */
     qemu_fdt_setprop_cell(fdt, "/", "machine-version-major", 0);
-    qemu_fdt_setprop_cell(fdt, "/", "machine-version-minor", 1);
+    qemu_fdt_setprop_cell(fdt, "/", "machine-version-minor", 2);
 
     if (ms->numa_state->have_numa_distance) {
         int size = nb_numa_nodes * nb_numa_nodes * 3 * sizeof(uint32_t);
@@ -XXX,XX +XXX,XX @@ static void create_secure_ram(SBSAMachineState *sms,
     memory_region_add_subregion(secure_sysmem, base, secram);
 }
 
-static void create_gic(SBSAMachineState *sms)
+static void create_its(SBSAMachineState *sms)
+{
+    const char *itsclass = its_class_name();
+    DeviceState *dev;
+
+    dev = qdev_new(itsclass);
+
+    object_property_set_link(OBJECT(dev), "parent-gicv3", OBJECT(sms->gic),
+                             &error_abort);
+    sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
+    sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, sbsa_ref_memmap[SBSA_GIC_ITS].base);
+}
+
+static void create_gic(SBSAMachineState *sms, MemoryRegion *mem)
 {
     unsigned int smp_cpus = MACHINE(sms)->smp.cpus;
     SysBusDevice *gicbusdev;
@@ -XXX,XX +XXX,XX @@ static void create_gic(SBSAMachineState *sms)
     qdev_prop_set_uint32(sms->gic, "len-redist-region-count", 1);
     qdev_prop_set_uint32(sms->gic, "redist-region-count[0]", redist0_count);
 
+    object_property_set_link(OBJECT(sms->gic), "sysmem",
+                             OBJECT(mem), &error_fatal);
+    qdev_prop_set_bit(sms->gic, "has-lpi", true);
+
     gicbusdev = SYS_BUS_DEVICE(sms->gic);
     sysbus_realize_and_unref(gicbusdev, &error_fatal);
     sysbus_mmio_map(gicbusdev, 0, sbsa_ref_memmap[SBSA_GIC_DIST].base);
@@ -XXX,XX +XXX,XX @@ static void create_gic(SBSAMachineState *sms)
         sysbus_connect_irq(gicbusdev, i + 3 * smp_cpus,
                            qdev_get_gpio_in(cpudev, ARM_CPU_VFIQ));
     }
+    create_its(sms);
 }
 
 static void create_uart(const SBSAMachineState *sms, int uart,
@@ -XXX,XX +XXX,XX @@ static void sbsa_ref_init(MachineState *machine)
 
     create_secure_ram(sms, secure_sysmem);
 
-    create_gic(sms);
+    create_gic(sms, sysmem);
 
     create_uart(sms, SBSA_UART, sysmem, serial_hd(0));
     create_uart(sms, SBSA_SECURE_UART, secure_sysmem, serial_hd(1));
-- 
2.34.1

From: Richard Henderson <richard.henderson@linaro.org>

Brown bag time: using a store instead of a load leaves the temp uninitialized.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1704
Reported-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230620134659.817559-1-richard.henderson@linaro.org
Fixes: e6dd5e782be ("target/arm: Use tcg_gen_qemu_{ld, st}_i128 in gen_sve_{ld, st}r")
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/translate-sve.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-sve.c
+++ b/target/arm/tcg/translate-sve.c
@@ -XXX,XX +XXX,XX @@ void gen_sve_str(DisasContext *s, TCGv_ptr base, int vofs,
     /* Predicate register stores can be any multiple of 2.  */
     if (len_remain >= 8) {
         t0 = tcg_temp_new_i64();
-        tcg_gen_st_i64(t0, base, vofs + len_align);
+        tcg_gen_ld_i64(t0, base, vofs + len_align);
         tcg_gen_qemu_st_i64(t0, clean_addr, midx, MO_LEUQ | MO_ATOM_NONE);
         len_remain -= 8;
         len_align += 8;
-- 
2.34.1

The xkb official name for the Arabic keyboard layout is 'ara'.
However xkb has for at least the past 15 years also permitted it to
be named via the legacy synonym 'ar'.  In xkeyboard-config 2.39 this
synonym was removed, which breaks compilation of QEMU:

FAILED: pc-bios/keymaps/ar
/home/fred/qemu-git/src/qemu/build-full/qemu-keymap -f pc-bios/keymaps/ar -l ar
xkbcommon: ERROR: Couldn't find file "symbols/ar" in include paths
xkbcommon: ERROR: 1 include paths searched:
xkbcommon: ERROR: 	/usr/share/X11/xkb
xkbcommon: ERROR: 3 include paths could not be added:
xkbcommon: ERROR: 	/home/fred/.config/xkb
xkbcommon: ERROR: 	/home/fred/.xkb
xkbcommon: ERROR: 	/etc/xkb
xkbcommon: ERROR: Abandoning symbols file "(unnamed)"
xkbcommon: ERROR: Failed to compile xkb_symbols
xkbcommon: ERROR: Failed to compile keymap

The upstream xkeyboard-config change removing the compat
mapping is:
https://gitlab.freedesktop.org/xkeyboard-config/xkeyboard-config/-/commit/470ad2cd8fea84d7210377161d86b31999bb5ea6

Make QEMU always ask for the 'ara' xkb layout, which should work on
both older and newer xkeyboard-config.  We leave the QEMU name for
this keyboard layout as 'ar'; it is not the only one where our name
for it deviates from the xkb standard name.

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20230620162024.1132013-1-peter.maydell@linaro.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1709
---
 pc-bios/keymaps/meson.build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/pc-bios/keymaps/meson.build b/pc-bios/keymaps/meson.build
index XXXXXXX..XXXXXXX 100644
--- a/pc-bios/keymaps/meson.build
+++ b/pc-bios/keymaps/meson.build
@@ -XXX,XX +XXX,XX @@
 keymaps = {
-  'ar': '-l ar',
+  'ar': '-l ara',
   'bepo': '-l fr -v dvorak',
   'cz': '-l cz',
   'da': '-l dk',
-- 
2.34.1