Series comparison

-[PULL 00/39] target-arm queue
+[PULL 00/33] target-arm queue
-Most of this is the Neon decodetree patches, followed by Edgar's versal cleanups.
+The following changes since commit bf4460a8d9a86f6cfe05d7a7f470c48e3a93d8b2:
-thanks
+  Merge tag 'pull-tcg-20230123' of https://gitlab.com/rth7680/qemu into staging (2023-02-03 09:30:45 +0000)
 -- PMM
 The following changes since commit 2ef486e76d64436be90f7359a3071fb2a56ce835:
   Merge remote-tracking branch 'remotes/marcel/tags/rdma-pull-request' into staging (2020-05-03 14:12:56 +0100)
 are available in the Git repository at:
-  https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20200504
+  https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20230203
-for you to fetch changes up to 9aefc6cf9b73f66062d2f914a0136756e7a28211:
+for you to fetch changes up to bb18151d8bd9bedc497ee9d4e8d81b39a4e5bbf6:
-  target/arm: Move gen_ function typedefs to translate.h (2020-05-04 12:59:26 +0100)
+  target/arm: Enable FEAT_FGT on '-cpu max' (2023-02-03 12:59:24 +0000)
 ----------------------------------------------------------------
 target-arm queue:
- * Start of conversion of Neon insns to decodetree
+ * Fix physical address resolution for Stage2
- * versal board: support SD and RTC
+ * pl011: refactoring, implement reset method
- * Implement ARMv8.2-TTS2UXN
+ * Support GICv3 with hvf acceleration
- * Make VQDMULL undefined when U=1
+ * sbsa-ref: remove cortex-a76 from list of supported cpus
- * Some minor code cleanups
+ * Correct syndrome for ATS12NSO* traps at Secure EL1
  * Fix priority of HSTR_EL2 traps vs UNDEFs
  * Implement FEAT_FGT for '-cpu max'
 ----------------------------------------------------------------
-Edgar E. Iglesias (11):
+Alexander Graf (3):
-      hw/arm: versal: Remove inclusion of arm_gicv3_common.h
+      hvf: arm: Add support for GICv3
-      hw/arm: versal: Move misplaced comment
+      hw/arm/virt: Consolidate GIC finalize logic
-      hw/arm: versal-virt: Fix typo xlnx-ve -> xlnx-versal
+      hw/arm/virt: Make accels in GIC finalize logic explicit
       hw/arm: versal: Embed the UARTs into the SoC type
       hw/arm: versal: Embed the GEMs into the SoC type
       hw/arm: versal: Embed the ADMAs into the SoC type
       hw/arm: versal: Embed the APUs into the SoC type
       hw/arm: versal: Add support for SD
       hw/arm: versal: Add support for the RTC
       hw/arm: versal-virt: Add support for SD
       hw/arm: versal-virt: Add support for the RTC
-Fredrik Strupe (1):
+Evgeny Iakovlev (4):
-      target/arm: Make VQDMULL undefined when U=1
+      hw/char/pl011: refactor FIFO depth handling code
       hw/char/pl011: add post_load hook for backwards-compatibility
       hw/char/pl011: implement a reset method
       hw/char/pl011: better handling of FIFO flags on LCR reset
-Peter Maydell (25):
+Marcin Juszkiewicz (1):
-      target/arm: Don't use a TLB for ARMMMUIdx_Stage2
+      sbsa-ref: remove cortex-a76 from list of supported cpus
       target/arm: Use enum constant in get_phys_addr_lpae() call
       target/arm: Add new 's1_is_el0' argument to get_phys_addr_lpae()
       target/arm: Implement ARMv8.2-TTS2UXN
       target/arm: Use correct variable for setting 'max' cpu's ID_AA64DFR0
       target/arm/translate-vfp.inc.c: Remove duplicate simd_r32 check
       target/arm: Don't allow Thumb Neon insns without FEATURE_NEON
       target/arm: Add stubs for AArch32 Neon decodetree
       target/arm: Convert VCMLA (vector) to decodetree
       target/arm: Convert VCADD (vector) to decodetree
       target/arm: Convert V[US]DOT (vector) to decodetree
       target/arm: Convert VFM[AS]L (vector) to decodetree
       target/arm: Convert VCMLA (scalar) to decodetree
       target/arm: Convert V[US]DOT (scalar) to decodetree
       target/arm: Convert VFM[AS]L (scalar) to decodetree
       target/arm: Convert Neon load/store multiple structures to decodetree
       target/arm: Convert Neon 'load single structure to all lanes' to decodetree
       target/arm: Convert Neon 'load/store single structure' to decodetree
       target/arm: Convert Neon 3-reg-same VADD/VSUB to decodetree
       target/arm: Convert Neon 3-reg-same logic ops to decodetree
       target/arm: Convert Neon 3-reg-same VMAX/VMIN to decodetree
       target/arm: Convert Neon 3-reg-same comparisons to decodetree
       target/arm: Convert Neon 3-reg-same VQADD/VQSUB to decodetree
       target/arm: Convert Neon 3-reg-same VMUL, VMLA, VMLS, VSHL to decodetree
       target/arm: Move gen_ function typedefs to translate.h
-Philippe Mathieu-Daudé (2):
+Peter Maydell (23):
-      hw/arm/mps2-tz: Use TYPE_IOTKIT instead of hardcoded string
+      target/arm: Name AT_S1E1RP and AT_S1E1WP cpregs correctly
-      target/arm: Use uint64_t for midr field in CPU state struct
+      target/arm: Correct syndrome for ATS12NSO* at Secure EL1
       target/arm: Remove CP_ACCESS_TRAP_UNCATEGORIZED_{EL2, EL3}
       target/arm: Move do_coproc_insn() syndrome calculation earlier
       target/arm: All UNDEF-at-EL0 traps take priority over HSTR_EL2 traps
       target/arm: Make HSTR_EL2 traps take priority over UNDEF-at-EL1
       target/arm: Disable HSTR_EL2 traps if EL2 is not enabled
       target/arm: Define the FEAT_FGT registers
       target/arm: Implement FGT trapping infrastructure
       target/arm: Mark up sysregs for HFGRTR bits 0..11
       target/arm: Mark up sysregs for HFGRTR bits 12..23
       target/arm: Mark up sysregs for HFGRTR bits 24..35
       target/arm: Mark up sysregs for HFGRTR bits 36..63
       target/arm: Mark up sysregs for HDFGRTR bits 0..11
       target/arm: Mark up sysregs for HDFGRTR bits 12..63
       target/arm: Mark up sysregs for HFGITR bits 0..11
       target/arm: Mark up sysregs for HFGITR bits 12..17
       target/arm: Mark up sysregs for HFGITR bits 18..47
       target/arm: Mark up sysregs for HFGITR bits 48..63
       target/arm: Implement the HFGITR_EL2.ERET trap
       target/arm: Implement the HFGITR_EL2.SVC_EL0 and SVC_EL1 traps
       target/arm: Implement MDCR_EL2.TDCC and MDCR_EL3.TDCC traps
       target/arm: Enable FEAT_FGT on '-cpu max'
- include/hw/arm/xlnx-versal.h    |  31 +-
+Richard Henderson (2):
- target/arm/cpu-param.h          |   2 +-
+      hw/arm: Use TYPE_ARM_SMMUV3
- target/arm/cpu.h                |  38 ++-
+      target/arm: Fix physical address resolution for Stage2
  target/arm/translate-a64.h      |   9 -
  target/arm/translate.h          |  26 ++
  target/arm/neon-dp.decode       |  86 +++++
  target/arm/neon-ls.decode       |  52 +++
  target/arm/neon-shared.decode   |  66 ++++
  hw/arm/mps2-tz.c                |   2 +-
  hw/arm/xlnx-versal-virt.c       |  74 ++++-
  hw/arm/xlnx-versal.c            | 115 +++++--
  target/arm/cpu.c                |   3 +-
  target/arm/cpu64.c              |   8 +-
  target/arm/helper.c             | 183 ++++------
  target/arm/translate-a64.c      |  17 -
  target/arm/translate-neon.inc.c | 714 +++++++++++++++++++++++++++++++++++++++
  target/arm/translate-vfp.inc.c  |   6 -
  target/arm/translate.c          | 716 +++-------------------------------------
  target/arm/Makefile.objs        |  18 +
 files changed, 1302 insertions(+), 864 deletions(-)
  create mode 100644 target/arm/neon-dp.decode
  create mode 100644 target/arm/neon-ls.decode
  create mode 100644 target/arm/neon-shared.decode
  create mode 100644 target/arm/translate-neon.inc.c
+ docs/system/arm/emulation.rst |   1 +
+ include/hw/arm/virt.h         |  15 +-
+ include/hw/char/pl011.h       |   5 +-
+ target/arm/cpregs.h           | 484 +++++++++++++++++++++++++++++++++++++++++-
+ target/arm/cpu.h              |  18 ++
+ target/arm/internals.h        |  20 ++
+ target/arm/syndrome.h         |  10 +
+ target/arm/translate.h        |   6 +
+ hw/arm/sbsa-ref.c             |   4 +-
+ hw/arm/virt.c                 | 203 +++++++++---------
+ hw/char/pl011.c               |  93 ++++++--
+ hw/intc/arm_gicv3_cpuif.c     |  18 +-
+ target/arm/cpu64.c            |   1 +
+ target/arm/debug_helper.c     |  46 +++-
+ target/arm/helper.c           | 245 ++++++++++++++++++++-
+ target/arm/hvf/hvf.c          | 151 +++++++++++++
+ target/arm/op_helper.c        |  58 ++++-
+ target/arm/ptw.c              |   2 +-
+ target/arm/translate-a64.c    |  22 +-
+ target/arm/translate.c        | 125 +++++++----
+ target/arm/hvf/trace-events   |   2 +
+files changed, 1340 insertions(+), 189 deletions(-)

-[PULL 01/39] target/arm: Make VQDMULL undefined when U=1
+Deleted patch
-From: Fredrik Strupe <fredrik@strupe.net>
-According to Arm ARM, VQDMULL is only valid when U=0, while having
-U=1 is unallocated.
-Signed-off-by: Fredrik Strupe <fredrik@strupe.net>
-Fixes: 695272dcb976 ("target-arm: Handle UNDEF cases for Neon 3-regs-different-widths")
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
----
- target/arm/translate.c | 2 +-
-file changed, 1 insertion(+), 1 deletion(-)
-diff --git a/target/arm/translate.c b/target/arm/translate.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.c
-+++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-                     {0, 0, 0, 0}, /* VMLSL */
-                     {0, 0, 0, 9}, /* VQDMLSL */
-                     {0, 0, 0, 0}, /* Integer VMULL */
--                    {0, 0, 0, 1}, /* VQDMULL */
-+                    {0, 0, 0, 9}, /* VQDMULL */
-                     {0, 0, 0, 0xa}, /* Polynomial VMULL */
-                     {0, 0, 0, 7}, /* Reserved: always UNDEF */
-                 };
---
-.20.1

-[PULL 11/39] hw/arm: versal-virt: Fix typo xlnx-ve -> xlnx-versal
+[PULL 01/33] hw/arm: Use TYPE_ARM_SMMUV3
-From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
+From: Richard Henderson <richard.henderson@linaro.org>
-Fix typo xlnx-ve -> xlnx-versal.
+Use the macro instead of two explicit string literals.
-Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
+Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
+Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
-Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
+Reviewed-by: Eric Auger <eric.auger@redhat.com>
-Reviewed-by: Luc Michel <luc.michel@greensocs.com>
+Message-id: 20230124232059.4017615-1-richard.henderson@linaro.org
 Message-id: 20200427181649.26851-4-edgar.iglesias@gmail.com
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- hw/arm/xlnx-versal-virt.c | 2 +-
+ hw/arm/sbsa-ref.c | 3 ++-
-file changed, 1 insertion(+), 1 deletion(-)
+ hw/arm/virt.c     | 2 +-
 files changed, 3 insertions(+), 2 deletions(-)
-diff --git a/hw/arm/xlnx-versal-virt.c b/hw/arm/xlnx-versal-virt.c
+diff --git a/hw/arm/sbsa-ref.c b/hw/arm/sbsa-ref.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/xlnx-versal-virt.c
+--- a/hw/arm/sbsa-ref.c
-+++ b/hw/arm/xlnx-versal-virt.c
++++ b/hw/arm/sbsa-ref.c
-@@ -XXX,XX +XXX,XX @@ static void versal_virt_init(MachineState *machine)
+@@ -XXX,XX +XXX,XX @@
-         psci_conduit = QEMU_PSCI_CONDUIT_SMC;
+ #include "exec/hwaddr.h"
  #include "kvm_arm.h"
  #include "hw/arm/boot.h"
 +#include "hw/arm/smmuv3.h"
  #include "hw/block/flash.h"
  #include "hw/boards.h"
  #include "hw/ide/internal.h"
@@ -XXX,XX +XXX,XX @@ static void create_smmu(const SBSAMachineState *sms, PCIBus *bus)
      DeviceState *dev;
      int i;
 -    dev = qdev_new("arm-smmuv3");
 +    dev = qdev_new(TYPE_ARM_SMMUV3);
      object_property_set_link(OBJECT(dev), "primary-bus", OBJECT(bus),
                               &error_abort);
 diff --git a/hw/arm/virt.c b/hw/arm/virt.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/arm/virt.c
 +++ b/hw/arm/virt.c
@@ -XXX,XX +XXX,XX @@ static void create_smmu(const VirtMachineState *vms,
          return;
      }
--    sysbus_init_child_obj(OBJECT(machine), "xlnx-ve", &s->soc,
+-    dev = qdev_new("arm-smmuv3");
-+    sysbus_init_child_obj(OBJECT(machine), "xlnx-versal", &s->soc,
++    dev = qdev_new(TYPE_ARM_SMMUV3);
-                           sizeof(s->soc), TYPE_XLNX_VERSAL);
-     object_property_set_link(OBJECT(&s->soc), OBJECT(machine->ram),
+     object_property_set_link(OBJECT(dev), "primary-bus", OBJECT(bus),
-                              "ddr", &error_abort);
+                              &error_abort);
 --
-.20.1
+.34.1

-[PULL 10/39] hw/arm: versal: Move misplaced comment
+[PULL 02/33] target/arm: Fix physical address resolution for Stage2
-From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
+From: Richard Henderson <richard.henderson@linaro.org>
-Move misplaced comment.
+Conversion to probe_access_full missed applying the page offset.
-Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
+Cc: qemu-stable@nongnu.org
-Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
+Reported-by: Sid Manning <sidneym@quicinc.com>
-Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
+Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Reviewed-by: Luc Michel <luc.michel@greensocs.com>
+Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
-Message-id: 20200427181649.26851-3-edgar.iglesias@gmail.com
+Message-id: 20230126233134.103193-1-richard.henderson@linaro.org
 Fixes: f3639a64f602 ("target/arm: Use softmmu tlbs for page table walking")
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- hw/arm/xlnx-versal.c | 2 +-
+ target/arm/ptw.c | 2 +-
 file changed, 1 insertion(+), 1 deletion(-)
-diff --git a/hw/arm/xlnx-versal.c b/hw/arm/xlnx-versal.c
+diff --git a/target/arm/ptw.c b/target/arm/ptw.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/xlnx-versal.c
+--- a/target/arm/ptw.c
-+++ b/hw/arm/xlnx-versal.c
++++ b/target/arm/ptw.c
-@@ -XXX,XX +XXX,XX @@ static void versal_create_apu_cpus(Versal *s)
+@@ -XXX,XX +XXX,XX @@ static bool S1_ptw_translate(CPUARMState *env, S1Translate *ptw,
+         if (unlikely(flags & TLB_INVALID_MASK)) {
-         obj = object_new(XLNX_VERSAL_ACPU_TYPE);
+             goto fail;
          if (!obj) {
 -            /* Secondary CPUs start in PSCI powered-down state */
              error_report("Unable to create apu.cpu[%d] of type %s",
                           i, XLNX_VERSAL_ACPU_TYPE);
              exit(EXIT_FAILURE);
@@ -XXX,XX +XXX,XX @@ static void versal_create_apu_cpus(Versal *s)
          object_property_set_int(obj, s->cfg.psci_conduit,
                                  "psci-conduit", &error_abort);
          if (i) {
 +            /* Secondary CPUs start in PSCI powered-down state */
              object_property_set_bool(obj, true,
                                       "start-powered-off", &error_abort);
          }
+-        ptw->out_phys = full->phys_addr;
++        ptw->out_phys = full->phys_addr | (addr & ~TARGET_PAGE_MASK);
+         ptw->out_rw = full->prot & PAGE_WRITE;
+         pte_attrs = full->pte_attrs;
+         pte_secure = full->attrs.secure;
 --
-.20.1
+.34.1
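
The Stage2 fix above ORs the in-page offset of the looked-up address back
into the physical address. A minimal standalone sketch of that arithmetic
follows. It assumes 4 KiB pages and a page-aligned physical address;
PAGE_BITS, PAGE_MASK and combine_phys_addr() are hypothetical stand-ins
for illustration, not QEMU's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. QEMU's TARGET_PAGE_MASK depends on the target. */
#define PAGE_BITS 12
#define PAGE_MASK (~((UINT64_C(1) << PAGE_BITS) - 1))

/*
 * Combine a page-aligned physical address with the offset of 'addr'
 * inside its page, mirroring the shape of
 *     full->phys_addr | (addr & ~TARGET_PAGE_MASK)
 * from the patch. Hypothetical helper, not a QEMU function.
 */
static uint64_t combine_phys_addr(uint64_t page_phys, uint64_t addr)
{
    /* OR only equals addition when the low bits of page_phys are clear. */
    assert((page_phys & ~PAGE_MASK) == 0);
    return page_phys | (addr & ~PAGE_MASK);
}
```

Without the OR, every access within a page would resolve to the first
byte of that page, which is the symptom the commit message describes as
the page offset not being applied.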

-[PULL 17/39] hw/arm: versal: Add support for the RTC
+[PULL 03/33] hw/char/pl011: refactor FIFO depth handling code
-From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
+From: Evgeny Iakovlev <eiakovlev@linux.microsoft.com>
-hw/arm: versal: Add support for the RTC.
+PL011 can be in either of 2 modes depending on guest config: FIFO and
 single register. The last mode could be viewed as a 1-element-deep FIFO.
-Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
+Current code open-codes a bunch of depth-dependent logic. Refactor FIFO
-Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
+depth handling code to isolate calculating current FIFO depth.
-Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
-Reviewed-by: Luc Michel <luc.michel@greensocs.com>
+One functional (albeit guest-invisible) side-effect of this change is
-Message-id: 20200427181649.26851-10-edgar.iglesias@gmail.com
+that previously we would always increment s->read_pos in UARTDR read
 handler even if FIFO was disabled, now we are limiting read_pos to not
 exceed FIFO depth (read_pos itself is reset to 0 if user disables FIFO).
 Signed-off-by: Evgeny Iakovlev <eiakovlev@linux.microsoft.com>
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
 Message-id: 20230123162304.26254-2-eiakovlev@linux.microsoft.com
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- include/hw/arm/xlnx-versal.h |  8 ++++++++
+ include/hw/char/pl011.h |  5 ++++-
- hw/arm/xlnx-versal.c         | 21 +++++++++++++++++++++
+ hw/char/pl011.c         | 30 ++++++++++++++++++------------
-files changed, 29 insertions(+)
+files changed, 22 insertions(+), 13 deletions(-)
-diff --git a/include/hw/arm/xlnx-versal.h b/include/hw/arm/xlnx-versal.h
+diff --git a/include/hw/char/pl011.h b/include/hw/char/pl011.h
 index XXXXXXX..XXXXXXX 100644
---- a/include/hw/arm/xlnx-versal.h
+--- a/include/hw/char/pl011.h
-+++ b/include/hw/arm/xlnx-versal.h
++++ b/include/hw/char/pl011.h
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ OBJECT_DECLARE_SIMPLE_TYPE(PL011State, PL011)
- #include "hw/char/pl011.h"
+ /* This shares the same struct (and cast macro) as the base pl011 device */
- #include "hw/dma/xlnx-zdma.h"
+ #define TYPE_PL011_LUMINARY "pl011_luminary"
- #include "hw/net/cadence_gem.h"
-+#include "hw/rtc/xlnx-zynqmp-rtc.h"
++/* Depth of UART FIFO in bytes, when FIFO mode is enabled (else depth == 1) */
++#define PL011_FIFO_DEPTH 16
  #define TYPE_XLNX_VERSAL "xlnx-versal"
  #define XLNX_VERSAL(obj) OBJECT_CHECK(Versal, (obj), TYPE_XLNX_VERSAL)
@@ -XXX,XX +XXX,XX @@ typedef struct Versal {
          struct {
              SDHCIState sd[XLNX_VERSAL_NR_SDS];
          } iou;
 +
-+        XlnxZynqMPRTC rtc;
+ struct PL011State {
-     } pmc;
+     SysBusDevice parent_obj;
-     struct {
+@@ -XXX,XX +XXX,XX @@ struct PL011State {
-@@ -XXX,XX +XXX,XX @@ typedef struct Versal {
+     uint32_t dmacr;
- #define VERSAL_GEM1_IRQ_0          58
+     uint32_t int_enabled;
- #define VERSAL_GEM1_WAKE_IRQ_0     59
+     uint32_t int_level;
- #define VERSAL_ADMA_IRQ_0          60
+-    uint32_t read_fifo[16];
-+#define VERSAL_RTC_APB_ERR_IRQ     121
++    uint32_t read_fifo[PL011_FIFO_DEPTH];
- #define VERSAL_SD0_IRQ_0           126
+     uint32_t ilpr;
-+#define VERSAL_RTC_ALARM_IRQ       142
+     uint32_t ibrd;
-+#define VERSAL_RTC_SECONDS_IRQ     143
+     uint32_t fbrd;
+diff --git a/hw/char/pl011.c b/hw/char/pl011.c
  /* Architecturally reserved IRQs suitable for virtualization.  */
  #define VERSAL_RSVD_IRQ_FIRST 111
@@ -XXX,XX +XXX,XX @@ typedef struct Versal {
  #define MM_PMC_SD0_SIZE             0x10000
  #define MM_PMC_CRP                  0xf1260000U
  #define MM_PMC_CRP_SIZE             0x10000
 +#define MM_PMC_RTC                  0xf12a0000
 +#define MM_PMC_RTC_SIZE             0x10000
  #endif
 diff --git a/hw/arm/xlnx-versal.c b/hw/arm/xlnx-versal.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/xlnx-versal.c
+--- a/hw/char/pl011.c
-+++ b/hw/arm/xlnx-versal.c
++++ b/hw/char/pl011.c
-@@ -XXX,XX +XXX,XX @@ static void versal_create_sds(Versal *s, qemu_irq *pic)
+@@ -XXX,XX +XXX,XX @@ static void pl011_update(PL011State *s)
      }
  }
-+static void versal_create_rtc(Versal *s, qemu_irq *pic)
++static bool pl011_is_fifo_enabled(PL011State *s)
 +{
-+    SysBusDevice *sbd;
++    return (s->lcr & 0x10) != 0;
 +    MemoryRegion *mr;
 +
 +    sysbus_init_child_obj(OBJECT(s), "rtc", &s->pmc.rtc, sizeof(s->pmc.rtc),
 +                          TYPE_XLNX_ZYNQMP_RTC);
 +    sbd = SYS_BUS_DEVICE(&s->pmc.rtc);
 +    qdev_init_nofail(DEVICE(sbd));
 +
 +    mr = sysbus_mmio_get_region(sbd, 0);
 +    memory_region_add_subregion(&s->mr_ps, MM_PMC_RTC, mr);
 +
 +    /*
 +     * TODO: Connect the ALARM and SECONDS interrupts once our RTC model
 +     * supports them.
 +     */
 +    sysbus_connect_irq(sbd, 1, pic[VERSAL_RTC_APB_ERR_IRQ]);
 +}
 +
- /* This takes the board allocated linear DDR memory and creates aliases
++static inline unsigned pl011_get_fifo_depth(PL011State *s)
-  * for each split DDR range/aperture on the Versal address map.
++{
-  */
++    /* Note: FIFO depth is expected to be power-of-2 */
-@@ -XXX,XX +XXX,XX @@ static void versal_realize(DeviceState *dev, Error **errp)
++    return pl011_is_fifo_enabled(s) ? PL011_FIFO_DEPTH : 1;
-     versal_create_gems(s, pic);
++}
-     versal_create_admas(s, pic);
++
-     versal_create_sds(s, pic);
+ static uint64_t pl011_read(void *opaque, hwaddr offset,
-+    versal_create_rtc(s, pic);
+                            unsigned size)
-     versal_map_ddr(s);
+ {
-     versal_unimp(s);
+@@ -XXX,XX +XXX,XX @@ static uint64_t pl011_read(void *opaque, hwaddr offset,
+         c = s->read_fifo[s->read_pos];
          if (s->read_count > 0) {
              s->read_count--;
 -            if (++s->read_pos == 16)
 -                s->read_pos = 0;
 +            s->read_pos = (s->read_pos + 1) & (pl011_get_fifo_depth(s) - 1);
          }
          if (s->read_count == 0) {
              s->flags |= PL011_FLAG_RXFE;
@@ -XXX,XX +XXX,XX @@ static int pl011_can_receive(void *opaque)
      PL011State *s = (PL011State *)opaque;
      int r;
 -    if (s->lcr & 0x10) {
 -        r = s->read_count < 16;
 -    } else {
 -        r = s->read_count < 1;
 -    }
 +    r = s->read_count < pl011_get_fifo_depth(s);
      trace_pl011_can_receive(s->lcr, s->read_count, r);
      return r;
  }
@@ -XXX,XX +XXX,XX @@ static void pl011_put_fifo(void *opaque, uint32_t value)
  {
      PL011State *s = (PL011State *)opaque;
      int slot;
 +    unsigned pipe_depth;
 -    slot = s->read_pos + s->read_count;
 -    if (slot >= 16)
 -        slot -= 16;
 +    pipe_depth = pl011_get_fifo_depth(s);
 +    slot = (s->read_pos + s->read_count) & (pipe_depth - 1);
      s->read_fifo[slot] = value;
      s->read_count++;
      s->flags &= ~PL011_FLAG_RXFE;
      trace_pl011_put_fifo(value, s->read_count);
 -    if (!(s->lcr & 0x10) || s->read_count == 16) {
 +    if (s->read_count == pipe_depth) {
          trace_pl011_put_fifo_full();
          s->flags |= PL011_FLAG_RXFF;
      }
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_pl011 = {
          VMSTATE_UINT32(dmacr, PL011State),
          VMSTATE_UINT32(int_enabled, PL011State),
          VMSTATE_UINT32(int_level, PL011State),
 -        VMSTATE_UINT32_ARRAY(read_fifo, PL011State, 16),
 +        VMSTATE_UINT32_ARRAY(read_fifo, PL011State, PL011_FIFO_DEPTH),
          VMSTATE_UINT32(ilpr, PL011State),
          VMSTATE_UINT32(ibrd, PL011State),
          VMSTATE_UINT32(fbrd, PL011State),
 --
-.20.1
+.34.1
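
The refactoring above treats the single-register mode as a FIFO of depth 1
and wraps indices with a power-of-2 mask. The following standalone sketch
illustrates that idea outside of QEMU. The RxFifo type and the fifo_*
helpers are hypothetical names, and unlike the device model this sketch
rejects a put when the buffer is full.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FIFO_DEPTH 16

/* The mask trick below only works when the depth is a power of two. */
static_assert((FIFO_DEPTH & (FIFO_DEPTH - 1)) == 0,
              "FIFO_DEPTH must be a power of two");

/* Hypothetical receive buffer, loosely modelled on the fields in the patch. */
typedef struct {
    uint32_t data[FIFO_DEPTH];
    unsigned read_pos;    /* index of the oldest element */
    unsigned read_count;  /* number of valid elements */
    bool fifo_enabled;    /* false: behave as a single register */
} RxFifo;

/* Effective depth: full depth in FIFO mode, otherwise 1. */
static unsigned fifo_depth(const RxFifo *f)
{
    return f->fifo_enabled ? FIFO_DEPTH : 1;
}

/* Append a value; returns false if the buffer is already full. */
static bool fifo_put(RxFifo *f, uint32_t value)
{
    unsigned depth = fifo_depth(f);

    if (f->read_count >= depth) {
        return false;
    }
    /* With depth == 1 the mask is 0, so the slot is always element 0. */
    f->data[(f->read_pos + f->read_count) & (depth - 1)] = value;
    f->read_count++;
    return true;
}

/* Remove the oldest value; returns false if the buffer is empty. */
static bool fifo_get(RxFifo *f, uint32_t *value)
{
    if (f->read_count == 0) {
        return false;
    }
    *value = f->data[f->read_pos];
    f->read_count--;
    f->read_pos = (f->read_pos + 1) & (fifo_depth(f) - 1);
    return true;
}
```

Because both modes go through fifo_depth(), callers no longer need
separate branches for the FIFO-enabled and FIFO-disabled cases.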

-[PULL 31/39] target/arm: Convert Neon 'load single structure to all lanes' to decodetree
+[PULL 04/33] hw/char/pl011: add post_load hook for backwards-compatibility
-Convert the Neon "load single structure to all lanes" insns to
+From: Evgeny Iakovlev <eiakovlev@linux.microsoft.com>
 decodetree.
+Previous change slightly modified the way we handle data writes when
+FIFO is disabled. Previously we kept incrementing read_pos and were
+storing data at that position, although we only have a
+single-register-deep FIFO now. Then we changed it to always store data
+at pos 0.
+If guest disables FIFO and then proceeds to read data, it will work out
+fine, because we still read from current read_pos before setting it to
+0.
+However, to make code less fragile, introduce a post_load hook for
+PL011State and use it to fix up read FIFO state when FIFO is disabled. Since
+we are introducing a post_load hook, also do some sanity checking on
+untrusted incoming input state.
+Signed-off-by: Evgeny Iakovlev <eiakovlev@linux.microsoft.com>
+Message-id: 20230123162304.26254-3-eiakovlev@linux.microsoft.com
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200430181003.21682-13-peter.maydell@linaro.org
 ---
- target/arm/neon-ls.decode       |  5 +++
+ hw/char/pl011.c | 25 +++++++++++++++++++++++++
- target/arm/translate-neon.inc.c | 73 +++++++++++++++++++++++++++++++++
+file changed, 25 insertions(+)
  target/arm/translate.c          | 55 +------------------------
 files changed, 80 insertions(+), 53 deletions(-)
-diff --git a/target/arm/neon-ls.decode b/target/arm/neon-ls.decode
+diff --git a/hw/char/pl011.c b/hw/char/pl011.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/neon-ls.decode
+--- a/hw/char/pl011.c
-+++ b/target/arm/neon-ls.decode
++++ b/hw/char/pl011.c
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_pl011_clock = {
+     }
- VLDST_multiple 1111 0100 0 . l:1 0 rn:4 .... itype:4 size:2 align:2 rm:4 \
+ };
-                vd=%vd_dp
 +static int pl011_post_load(void *opaque, int version_id)
 +{
 +    PL011State* s = opaque;
 +
-+# Neon load single element to all lanes
++    /* Sanity-check input state */
-+
++    if (s->read_pos >= ARRAY_SIZE(s->read_fifo) ||
-+VLD_all_lanes  1111 0100 1 . 1 0 rn:4 .... 11 n:2 size:2 t:1 a:1 rm:4 \
++        s->read_count > ARRAY_SIZE(s->read_fifo)) {
-+               vd=%vd_dp
++        return -1;
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VLDST_multiple(DisasContext *s, arg_VLDST_multiple *a)
      gen_neon_ldst_base_update(s, a->rm, a->rn, nregs * interleave * 8);
      return true;
  }
 +
 +static bool trans_VLD_all_lanes(DisasContext *s, arg_VLD_all_lanes *a)
 +{
 +    /* Neon load single structure to all lanes */
 +    int reg, stride, vec_size;
 +    int vd = a->vd;
 +    int size = a->size;
 +    int nregs = a->n + 1;
 +    TCGv_i32 addr, tmp;
 +
 +    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
 +        return false;
 +    }
 +
-+    /* UNDEF accesses to D16-D31 if they don't exist */
++    if (!pl011_is_fifo_enabled(s) && s->read_count > 0 && s->read_pos > 0) {
-+    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vd & 0x10)) {
++        /*
-+        return false;
++         * Older versions of PL011 didn't ensure that the single
 +         * character in the FIFO in FIFO-disabled mode is in
 +         * element 0 of the array; convert to follow the current
 +         * code's assumptions.
 +         */
 +        s->read_fifo[0] = s->read_fifo[s->read_pos];
 +        s->read_pos = 0;
 +    }
 +
-+    if (size == 3) {
++    return 0;
-+        if (nregs != 4 || a->a == 0) {
++}
 +            return false;
 +        }
 +        /* For VLD4 size == 3 a == 1 means 32 bits at 16 byte alignment */
 +        size = 2;
 +    }
 +    if (nregs == 1 && a->a == 1 && size == 0) {
 +        return false;
 +    }
 +    if (nregs == 3 && a->a == 1) {
 +        return false;
 +    }
 +
-+    if (!vfp_access_check(s)) {
+ static const VMStateDescription vmstate_pl011 = {
-+        return true;
+     .name = "pl011",
-+    }
+     .version_id = 2,
-+
+     .minimum_version_id = 2,
-+    /*
++    .post_load = pl011_post_load,
-+     * VLD1 to all lanes: T bit indicates how many Dregs to write.
+     .fields = (VMStateField[]) {
-+     * VLD2/3/4 to all lanes: T bit indicates register stride.
+         VMSTATE_UINT32(readbuff, PL011State),
-+     */
+         VMSTATE_UINT32(flags, PL011State),
 +    stride = a->t ? 2 : 1;
 +    vec_size = nregs == 1 ? stride * 8 : 8;
 +
 +    tmp = tcg_temp_new_i32();
 +    addr = tcg_temp_new_i32();
 +    load_reg_var(s, addr, a->rn);
 +    for (reg = 0; reg < nregs; reg++) {
 +        gen_aa32_ld_i32(s, tmp, addr, get_mem_index(s),
 +                        s->be_data | size);
 +        if ((vd & 1) && vec_size == 16) {
 +            /*
 +             * We cannot write 16 bytes at once because the
 +             * destination is unaligned.
 +             */
 +            tcg_gen_gvec_dup_i32(size, neon_reg_offset(vd, 0),
 +                                 8, 8, tmp);
 +            tcg_gen_gvec_mov(0, neon_reg_offset(vd + 1, 0),
 +                             neon_reg_offset(vd, 0), 8, 8);
 +        } else {
 +            tcg_gen_gvec_dup_i32(size, neon_reg_offset(vd, 0),
 +                                 vec_size, vec_size, tmp);
 +        }
 +        tcg_gen_addi_i32(addr, addr, 1 << size);
 +        vd += stride;
 +    }
 +    tcg_temp_free_i32(tmp);
 +    tcg_temp_free_i32(addr);
 +
 +    gen_neon_ldst_base_update(s, a->rm, a->rn, (1 << size) * nregs);
 +
 +    return true;
 +}
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
      int size;
      int reg;
      int load;
 -    int vec_size;
      TCGv_i32 addr;
      TCGv_i32 tmp;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
      } else {
          size = (insn >> 10) & 3;
          if (size == 3) {
 -            /* Load single element to all lanes.  */
 -            int a = (insn >> 4) & 1;
 -            if (!load) {
 -                return 1;
 -            }
 -            size = (insn >> 6) & 3;
 -            nregs = ((insn >> 8) & 3) + 1;
 -
 -            if (size == 3) {
 -                if (nregs != 4 || a == 0) {
 -                    return 1;
 -                }
 -                /* For VLD4 size==3 a == 1 means 32 bits at 16 byte alignment */
 -                size = 2;
 -            }
 -            if (nregs == 1 && a == 1 && size == 0) {
 -                return 1;
 -            }
 -            if (nregs == 3 && a == 1) {
 -                return 1;
 -            }
 -            addr = tcg_temp_new_i32();
 -            load_reg_var(s, addr, rn);
 -
 -            /* VLD1 to all lanes: bit 5 indicates how many Dregs to write.
 -             * VLD2/3/4 to all lanes: bit 5 indicates register stride.
 -             */
 -            stride = (insn & (1 << 5)) ? 2 : 1;
 -            vec_size = nregs == 1 ? stride * 8 : 8;
 -
 -            tmp = tcg_temp_new_i32();
 -            for (reg = 0; reg < nregs; reg++) {
 -                gen_aa32_ld_i32(s, tmp, addr, get_mem_index(s),
 -                                s->be_data | size);
 -                if ((rd & 1) && vec_size == 16) {
 -                    /* We cannot write 16 bytes at once because the
 -                     * destination is unaligned.
 -                     */
 -                    tcg_gen_gvec_dup_i32(size, neon_reg_offset(rd, 0),
 -                                         8, 8, tmp);
 -                    tcg_gen_gvec_mov(0, neon_reg_offset(rd + 1, 0),
 -                                     neon_reg_offset(rd, 0), 8, 8);
 -                } else {
 -                    tcg_gen_gvec_dup_i32(size, neon_reg_offset(rd, 0),
 -                                         vec_size, vec_size, tmp);
 -                }
 -                tcg_gen_addi_i32(addr, addr, 1 << size);
 -                rd += stride;
 -            }
 -            tcg_temp_free_i32(tmp);
 -            tcg_temp_free_i32(addr);
 -            stride = (1 << size) * nregs;
 +            /* Load single element to all lanes -- handled by decodetree  */
 +            return 1;
          } else {
              /* Single element.  */
              int idx = (insn >> 4) & 0xf;
 --
-.20.1
+.34.1
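
The post_load hook above does two things: it rejects incoming state whose
indices fall outside the FIFO array, and it moves a lone buffered
character to element 0 when FIFO mode is off. A standalone sketch of that
validate-then-normalize pattern follows. SavedUart and
saved_uart_post_load() are hypothetical names; the real hook operates on
PL011State and is registered through VMStateDescription.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FIFO_DEPTH 16

/* Hypothetical subset of device state as received from a migration stream. */
typedef struct {
    uint32_t read_fifo[FIFO_DEPTH];
    uint32_t read_pos;
    uint32_t read_count;
    bool fifo_enabled;
} SavedUart;

/* Returns 0 on success, -1 if the incoming state is inconsistent. */
static int saved_uart_post_load(SavedUart *s)
{
    /* Treat the incoming state as untrusted: bounds-check before use. */
    if (s->read_pos >= FIFO_DEPTH || s->read_count > FIFO_DEPTH) {
        return -1;
    }

    if (!s->fifo_enabled && s->read_count > 0 && s->read_pos > 0) {
        /*
         * A sender using the older layout may have left the single
         * buffered character at a non-zero index; move it to element 0.
         */
        s->read_fifo[0] = s->read_fifo[s->read_pos];
        s->read_pos = 0;
    }

    /* Invariant after fixup: single-register data always sits at index 0. */
    assert(s->fifo_enabled || s->read_count == 0 || s->read_pos == 0);
    return 0;
}
```

Returning an error from the hook makes the load fail instead of letting
an out-of-range index reach the array accesses in the read path.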

-[PULL 15/39] hw/arm: versal: Embed the APUs into the SoC type
+[PULL 05/33] hw/char/pl011: implement a reset method
-From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
+From: Evgeny Iakovlev <eiakovlev@linux.microsoft.com>
-Embed the APUs into the SoC type.
+PL011 currently lacks a reset method. Implement it.
-Suggested-by: Peter Maydell <peter.maydell@linaro.org>
+Signed-off-by: Evgeny Iakovlev <eiakovlev@linux.microsoft.com>
-Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
+Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
-Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
+Message-id: 20230123162304.26254-4-eiakovlev@linux.microsoft.com
 Reviewed-by: Luc Michel <luc.michel@greensocs.com>
 Message-id: 20200427181649.26851-8-edgar.iglesias@gmail.com
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- include/hw/arm/xlnx-versal.h |  2 +-
+ hw/char/pl011.c | 26 +++++++++++++++++++++-----
- hw/arm/xlnx-versal-virt.c    |  4 ++--
+file changed, 21 insertions(+), 5 deletions(-)
  hw/arm/xlnx-versal.c         | 19 +++++--------------
 files changed, 8 insertions(+), 17 deletions(-)
-diff --git a/include/hw/arm/xlnx-versal.h b/include/hw/arm/xlnx-versal.h
+diff --git a/hw/char/pl011.c b/hw/char/pl011.c
 index XXXXXXX..XXXXXXX 100644
---- a/include/hw/arm/xlnx-versal.h
+--- a/hw/char/pl011.c
-+++ b/include/hw/arm/xlnx-versal.h
++++ b/hw/char/pl011.c
-@@ -XXX,XX +XXX,XX @@ typedef struct Versal {
+@@ -XXX,XX +XXX,XX @@ static void pl011_init(Object *obj)
-     struct {
+     s->clk = qdev_init_clock_in(DEVICE(obj), "clk", pl011_clock_update, s,
-         struct {
+                                 ClockUpdate);
-             MemoryRegion mr;
--            ARMCPU *cpu[XLNX_VERSAL_NR_ACPUS];
+-    s->read_trigger = 1;
-+            ARMCPU cpu[XLNX_VERSAL_NR_ACPUS];
+-    s->ifl = 0x12;
-             GICv3State gic;
+-    s->cr = 0x300;
-         } apu;
+-    s->flags = 0x90;
      } fpd;
 diff --git a/hw/arm/xlnx-versal-virt.c b/hw/arm/xlnx-versal-virt.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/arm/xlnx-versal-virt.c
 +++ b/hw/arm/xlnx-versal-virt.c
@@ -XXX,XX +XXX,XX @@ static void versal_virt_init(MachineState *machine)
      s->binfo.get_dtb = versal_virt_get_dtb;
      s->binfo.modify_dtb = versal_virt_modify_dtb;
      if (machine->kernel_filename) {
 -        arm_load_kernel(s->soc.fpd.apu.cpu[0], machine, &s->binfo);
 +        arm_load_kernel(&s->soc.fpd.apu.cpu[0], machine, &s->binfo);
      } else {
 -        AddressSpace *as = arm_boot_address_space(s->soc.fpd.apu.cpu[0],
 +        AddressSpace *as = arm_boot_address_space(&s->soc.fpd.apu.cpu[0],
                                                    &s->binfo);
          /* Some boot-loaders (e.g u-boot) don't like blobs at address 0 (NULL).
           * Offset things by 4K.  */
 diff --git a/hw/arm/xlnx-versal.c b/hw/arm/xlnx-versal.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/arm/xlnx-versal.c
 +++ b/hw/arm/xlnx-versal.c
@@ -XXX,XX +XXX,XX @@ static void versal_create_apu_cpus(Versal *s)
      for (i = 0; i < ARRAY_SIZE(s->fpd.apu.cpu); i++) {
          Object *obj;
 -        char *name;
 -
--        obj = object_new(XLNX_VERSAL_ACPU_TYPE);
+     s->id = pl011_id_arm;
 -        if (!obj) {
 -            error_report("Unable to create apu.cpu[%d] of type %s",
 -                         i, XLNX_VERSAL_ACPU_TYPE);
 -            exit(EXIT_FAILURE);
 -        }
 -
 -        name = g_strdup_printf("apu-cpu[%d]", i);
 -        object_property_add_child(OBJECT(s), name, obj, &error_fatal);
 -        g_free(name);
 +        object_initialize_child(OBJECT(s), "apu-cpu[*]",
 +                                &s->fpd.apu.cpu[i], sizeof(s->fpd.apu.cpu[i]),
 +                                XLNX_VERSAL_ACPU_TYPE, &error_abort, NULL);
 +        obj = OBJECT(&s->fpd.apu.cpu[i]);
          object_property_set_int(obj, s->cfg.psci_conduit,
                                  "psci-conduit", &error_abort);
          if (i) {
@@ -XXX,XX +XXX,XX @@ static void versal_create_apu_cpus(Versal *s)
          object_property_set_link(obj, OBJECT(&s->fpd.apu.mr), "memory",
                                   &error_abort);
          object_property_set_bool(obj, true, "realized", &error_fatal);
 -        s->fpd.apu.cpu[i] = ARM_CPU(obj);
      }
  }
-@@ -XXX,XX +XXX,XX @@ static void versal_create_apu_gic(Versal *s, qemu_irq *pic)
+@@ -XXX,XX +XXX,XX @@ static void pl011_realize(DeviceState *dev, Error **errp)
-     }
+                              pl011_event, NULL, s, NULL, true);
+ }
-     for (i = 0; i < nr_apu_cpus; i++) {
--        DeviceState *cpudev = DEVICE(s->fpd.apu.cpu[i]);
++static void pl011_reset(DeviceState *dev)
-+        DeviceState *cpudev = DEVICE(&s->fpd.apu.cpu[i]);
++{
-         int ppibase = XLNX_VERSAL_NR_IRQS + i * GIC_INTERNAL + GIC_NR_SGIS;
++    PL011State *s = PL011(dev);
-         qemu_irq maint_irq;
++
-         int ti;
++    s->lcr = 0;
 +    s->rsr = 0;
 +    s->dmacr = 0;
 +    s->int_enabled = 0;
 +    s->int_level = 0;
 +    s->ilpr = 0;
 +    s->ibrd = 0;
 +    s->fbrd = 0;
 +    s->read_pos = 0;
 +    s->read_count = 0;
 +    s->read_trigger = 1;
 +    s->ifl = 0x12;
 +    s->cr = 0x300;
 +    s->flags = 0x90;
 +}
 +
  static void pl011_class_init(ObjectClass *oc, void *data)
  {
      DeviceClass *dc = DEVICE_CLASS(oc);
      dc->realize = pl011_realize;
 +    dc->reset = pl011_reset;
      dc->vmsd = &vmstate_pl011;
      device_class_set_props(dc, pl011_properties);
  }
 --
-.20.1
+.34.1

-[PULL 19/39] hw/arm: versal-virt: Add support for the RTC
+[PULL 06/33] hw/char/pl011: better handling of FIFO flags on LCR reset
-From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
+From: Evgeny Iakovlev <eiakovlev@linux.microsoft.com>
-Add support for the RTC.
+Current FIFO handling code does not reset the RXFE/RXFF flags when the guest
 resets the FIFO by writing to the UARTLCR register, although the internal FIFO
 state is reset to a read count of 0. The guest-visible flags are updated
 only on the next data read or write attempt. As a result, any guest
 that expects RXFE to be set (and RXFF to be cleared) after resetting
 the FIFO will never see that happen.
-Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
+Signed-off-by: Evgeny Iakovlev <eiakovlev@linux.microsoft.com>
-Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Luc Michel <luc.michel@greensocs.com>
+Message-id: 20230123162304.26254-5-eiakovlev@linux.microsoft.com
 Message-id: 20200427181649.26851-12-edgar.iglesias@gmail.com
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
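The flag handling described in the commit message can be modelled outside
QEMU. The following is a minimal standalone sketch, not the device code:
MiniPL011, mini_reset_fifo() and mini_write_lcr() are hypothetical names, the
struct keeps only the fields the FIFO reset touches, and the flag values are
assumed from the PL011 UARTFR bit layout (consistent with the 0x90 reset value
in the patch).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* UARTFR bits; names mirror the patch, values assumed from the PL011 layout. */
#define PL011_FLAG_TXFE 0x80
#define PL011_FLAG_RXFF 0x40
#define PL011_FLAG_TXFF 0x20
#define PL011_FLAG_RXFE 0x10

/* Hypothetical cut-down device state, for illustration only. */
typedef struct {
    uint32_t flags;
    uint32_t lcr;
    int read_pos;
    int read_count;
} MiniPL011;

static void mini_reset_fifo(MiniPL011 *s)
{
    assert(s != NULL);
    s->read_count = 0;
    s->read_pos = 0;

    /* An empty FIFO is never full and is always empty, in both directions. */
    s->flags &= ~(uint32_t)(PL011_FLAG_RXFF | PL011_FLAG_TXFF);
    s->flags |= PL011_FLAG_RXFE | PL011_FLAG_TXFE;
}

/* Models only the FEN (bit 4) handling of a UARTLCR_H write. */
static void mini_write_lcr(MiniPL011 *s, uint32_t value)
{
    if ((s->lcr ^ value) & 0x10) {
        /* FIFO enable bit toggled: drop buffered data and refresh flags. */
        mini_reset_fifo(s);
    }
    s->lcr = value;
}
```

With this model, a guest that toggles FEN while RXFF is set observes RXFE and
TXFE immediately after the write, without needing a further data access.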
- hw/arm/xlnx-versal-virt.c | 22 ++++++++++++++++++++++
+ hw/char/pl011.c | 18 +++++++++++++-----
-file changed, 22 insertions(+)
+file changed, 13 insertions(+), 5 deletions(-)
-diff --git a/hw/arm/xlnx-versal-virt.c b/hw/arm/xlnx-versal-virt.c
+diff --git a/hw/char/pl011.c b/hw/char/pl011.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/xlnx-versal-virt.c
+--- a/hw/char/pl011.c
-+++ b/hw/arm/xlnx-versal-virt.c
++++ b/hw/char/pl011.c
-@@ -XXX,XX +XXX,XX @@ static void fdt_add_sd_nodes(VersalVirt *s)
+@@ -XXX,XX +XXX,XX @@ static inline unsigned pl011_get_fifo_depth(PL011State *s)
-     }
+     return pl011_is_fifo_enabled(s) ? PL011_FIFO_DEPTH : 1;
  }
-+static void fdt_add_rtc_node(VersalVirt *s)
++static inline void pl011_reset_fifo(PL011State *s)
 +{
-+    const char compat[] = "xlnx,zynqmp-rtc";
++    s->read_count = 0;
-+    const char interrupt_names[] = "alarm\0sec";
++    s->read_pos = 0;
 +    char *name = g_strdup_printf("/rtc@%x", MM_PMC_RTC);
 +
-+    qemu_fdt_add_subnode(s->fdt, name);
++    /* Reset FIFO flags */
-+
++    s->flags &= ~(PL011_FLAG_RXFF | PL011_FLAG_TXFF);
-+    qemu_fdt_setprop_cells(s->fdt, name, "interrupts",
++    s->flags |= PL011_FLAG_RXFE | PL011_FLAG_TXFE;
 +                           GIC_FDT_IRQ_TYPE_SPI, VERSAL_RTC_ALARM_IRQ,
 +                           GIC_FDT_IRQ_FLAGS_LEVEL_HI,
 +                           GIC_FDT_IRQ_TYPE_SPI, VERSAL_RTC_SECONDS_IRQ,
 +                           GIC_FDT_IRQ_FLAGS_LEVEL_HI);
 +    qemu_fdt_setprop(s->fdt, name, "interrupt-names",
 +                     interrupt_names, sizeof(interrupt_names));
 +    qemu_fdt_setprop_sized_cells(s->fdt, name, "reg",
 +                                 2, MM_PMC_RTC, 2, MM_PMC_RTC_SIZE);
 +    qemu_fdt_setprop(s->fdt, name, "compatible", compat, sizeof(compat));
 +    g_free(name);
 +}
 +
- static void fdt_nop_memory_nodes(void *fdt, Error **errp)
+ static uint64_t pl011_read(void *opaque, hwaddr offset,
                             unsigned size)
  {
-     Error *err = NULL;
+@@ -XXX,XX +XXX,XX @@ static void pl011_write(void *opaque, hwaddr offset,
-@@ -XXX,XX +XXX,XX @@ static void versal_virt_init(MachineState *machine)
+     case 11: /* UARTLCR_H */
-     fdt_add_timer_nodes(s);
+         /* Reset the FIFO state on FIFO enable or disable */
-     fdt_add_zdma_nodes(s);
+         if ((s->lcr ^ value) & 0x10) {
-     fdt_add_sd_nodes(s);
+-            s->read_count = 0;
-+    fdt_add_rtc_node(s);
+-            s->read_pos = 0;
-     fdt_add_cpu_nodes(s, psci_conduit);
++            pl011_reset_fifo(s);
-     fdt_add_clk_node(s, "/clk125", 125000000, s->phandle.clk_125Mhz);
+         }
-     fdt_add_clk_node(s, "/clk25", 25000000, s->phandle.clk_25Mhz);
+         if ((s->lcr ^ value) & 0x1) {
              int break_enable = value & 0x1;
@@ -XXX,XX +XXX,XX @@ static void pl011_reset(DeviceState *dev)
      s->ilpr = 0;
      s->ibrd = 0;
      s->fbrd = 0;
 -    s->read_pos = 0;
 -    s->read_count = 0;
      s->read_trigger = 1;
      s->ifl = 0x12;
      s->cr = 0x300;
 -    s->flags = 0x90;
 +    s->flags = 0;
 +    pl011_reset_fifo(s);
  }
  static void pl011_class_init(ObjectClass *oc, void *data)
 --
-.20.1
+.34.1

-[PULL 18/39] hw/arm: versal-virt: Add support for SD
+[PULL 07/33] hvf: arm: Add support for GICv3
-From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
+From: Alexander Graf <agraf@csgraf.de>
-Add support for SD.
+We currently only support GICv2 emulation. To also support GICv3, we will
+need to pass a few system registers into their respective handler functions.
-Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
-Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
+This patch adds support for HVF to call into the TCG callbacks for GICv3
-Reviewed-by: Luc Michel <luc.michel@greensocs.com>
+system register handlers. This is safe because the GICv3 TCG code is generic
-Message-id: 20200427181649.26851-11-edgar.iglesias@gmail.com
+as long as we limit ourselves to EL0 and EL1 - which are the only modes
 supported by HVF.
 To make sure nobody trips over that, we also annotate callbacks that don't
 work in HVF mode, such as EL state change hooks.
 With GICv3 support in place, we can run with more than 8 vCPUs.
 Signed-off-by: Alexander Graf <agraf@csgraf.de>
 Message-id: 20230128224459.70676-1-agraf@csgraf.de
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
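The read path this patch adds follows a fixed order: look up the register
description, run its access check, then pick the value from a constant, a
read callback, or a raw state field. Below is a minimal standalone sketch of
that order only. MiniEnv, MiniReg, mini_regs and mini_sysreg_read() are
hypothetical stand-ins and do not reproduce QEMU's ARMCPRegInfo or its lookup
table; the integer keys are arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the CPU state and register description. */
typedef struct MiniEnv {
    uint64_t pmr;
} MiniEnv;

typedef struct MiniReg MiniReg;
struct MiniReg {
    uint32_t key;                 /* arbitrary lookup key for this sketch */
    bool is_const;                /* value is always resetvalue */
    uint64_t resetvalue;
    bool (*accessfn)(MiniEnv *env, const MiniReg *ri, bool isread);
    uint64_t (*readfn)(MiniEnv *env, const MiniReg *ri);
    size_t fieldoffset;           /* offset of the backing field in MiniEnv */
};

static bool deny_access(MiniEnv *env, const MiniReg *ri, bool isread)
{
    (void)env; (void)ri; (void)isread;
    return false;
}

static uint64_t read_doubled_pmr(MiniEnv *env, const MiniReg *ri)
{
    (void)ri;
    return env->pmr * 2;
}

static const MiniReg mini_regs[] = {
    { .key = 1, .is_const = true, .resetvalue = 0x7 },
    { .key = 2, .readfn = read_doubled_pmr },
    { .key = 3, .fieldoffset = offsetof(MiniEnv, pmr) },
    { .key = 4, .is_const = true, .resetvalue = 1, .accessfn = deny_access },
};

/* Returns false when the caller should raise an undefined-instruction trap. */
static bool mini_sysreg_read(MiniEnv *env, uint32_t key, uint64_t *val)
{
    assert(env != NULL && val != NULL);

    for (size_t i = 0; i < sizeof(mini_regs) / sizeof(mini_regs[0]); i++) {
        const MiniReg *ri = &mini_regs[i];

        if (ri->key != key) {
            continue;
        }
        if (ri->accessfn && !ri->accessfn(env, ri, true)) {
            return false;         /* access check failed */
        }
        if (ri->is_const) {
            *val = ri->resetvalue;
        } else if (ri->readfn) {
            *val = ri->readfn(env, ri);
        } else {
            *val = *(uint64_t *)((char *)env + ri->fieldoffset);
        }
        return true;
    }
    return false;                 /* unknown register */
}
```

The write path in the patch has the same shape, with a write callback or a
direct field store in place of the three read cases.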
- hw/arm/xlnx-versal-virt.c | 46 +++++++++++++++++++++++++++++++++++++++
+ hw/intc/arm_gicv3_cpuif.c   |  16 +++-
-file changed, 46 insertions(+)
+ target/arm/hvf/hvf.c        | 151 ++++++++++++++++++++++++++++++++++++
+ target/arm/hvf/trace-events |   2 +
-diff --git a/hw/arm/xlnx-versal-virt.c b/hw/arm/xlnx-versal-virt.c
+files changed, 168 insertions(+), 1 deletion(-)
 diff --git a/hw/intc/arm_gicv3_cpuif.c b/hw/intc/arm_gicv3_cpuif.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/xlnx-versal-virt.c
+--- a/hw/intc/arm_gicv3_cpuif.c
-+++ b/hw/arm/xlnx-versal-virt.c
++++ b/hw/intc/arm_gicv3_cpuif.c
 @@ -XXX,XX +XXX,XX @@
- #include "hw/arm/sysbus-fdt.h"
+ #include "hw/irq.h"
  #include "hw/arm/fdt.h"
  #include "cpu.h"
-+#include "hw/qdev-properties.h"
+ #include "target/arm/cpregs.h"
- #include "hw/arm/xlnx-versal.h"
++#include "sysemu/tcg.h"
++#include "sysemu/qtest.h"
- #define TYPE_XLNX_VERSAL_VIRT_MACHINE MACHINE_TYPE_NAME("xlnx-versal-virt")
-@@ -XXX,XX +XXX,XX @@ static void fdt_add_zdma_nodes(VersalVirt *s)
+ /*
   * Special case return value from hppvi_index(); must be larger than
@@ -XXX,XX +XXX,XX @@ void gicv3_init_cpuif(GICv3State *s)
           * which case we'd get the wrong value.
           * So instead we define the regs with no ri->opaque info, and
           * get back to the GICv3CPUState from the CPUARMState.
 +         *
 +         * These CP regs callbacks can be called from either TCG or HVF code.
           */
          define_arm_cp_regs(cpu, gicv3_cpuif_reginfo);
@@ -XXX,XX +XXX,XX @@ void gicv3_init_cpuif(GICv3State *s)
                  define_arm_cp_regs(cpu, gicv3_cpuif_ich_apxr23_reginfo);
              }
          }
 -        arm_register_el_change_hook(cpu, gicv3_cpuif_el_change_hook, cs);
 +        if (tcg_enabled() || qtest_enabled()) {
 +            /*
 +             * We can only trap EL changes with TCG. However the GIC interrupt
 +             * state only changes on EL changes involving EL2 or EL3, so for
 +             * the non-TCG case this is OK, as EL2 and EL3 can't exist.
 +             */
 +            arm_register_el_change_hook(cpu, gicv3_cpuif_el_change_hook, cs);
 +        } else {
 +            assert(!arm_feature(&cpu->env, ARM_FEATURE_EL2));
 +            assert(!arm_feature(&cpu->env, ARM_FEATURE_EL3));
 +        }
      }
  }
+diff --git a/target/arm/hvf/hvf.c b/target/arm/hvf/hvf.c
-+static void fdt_add_sd_nodes(VersalVirt *s)
+index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/hvf/hvf.c
 +++ b/target/arm/hvf/hvf.c
@@ -XXX,XX +XXX,XX @@
  #define SYSREG_PMCCNTR_EL0    SYSREG(3, 3, 9, 13, 0)
  #define SYSREG_PMCCFILTR_EL0  SYSREG(3, 3, 14, 15, 7)
 +#define SYSREG_ICC_AP0R0_EL1     SYSREG(3, 0, 12, 8, 4)
 +#define SYSREG_ICC_AP0R1_EL1     SYSREG(3, 0, 12, 8, 5)
 +#define SYSREG_ICC_AP0R2_EL1     SYSREG(3, 0, 12, 8, 6)
 +#define SYSREG_ICC_AP0R3_EL1     SYSREG(3, 0, 12, 8, 7)
 +#define SYSREG_ICC_AP1R0_EL1     SYSREG(3, 0, 12, 9, 0)
 +#define SYSREG_ICC_AP1R1_EL1     SYSREG(3, 0, 12, 9, 1)
 +#define SYSREG_ICC_AP1R2_EL1     SYSREG(3, 0, 12, 9, 2)
 +#define SYSREG_ICC_AP1R3_EL1     SYSREG(3, 0, 12, 9, 3)
 +#define SYSREG_ICC_ASGI1R_EL1    SYSREG(3, 0, 12, 11, 6)
 +#define SYSREG_ICC_BPR0_EL1      SYSREG(3, 0, 12, 8, 3)
 +#define SYSREG_ICC_BPR1_EL1      SYSREG(3, 0, 12, 12, 3)
 +#define SYSREG_ICC_CTLR_EL1      SYSREG(3, 0, 12, 12, 4)
 +#define SYSREG_ICC_DIR_EL1       SYSREG(3, 0, 12, 11, 1)
 +#define SYSREG_ICC_EOIR0_EL1     SYSREG(3, 0, 12, 8, 1)
 +#define SYSREG_ICC_EOIR1_EL1     SYSREG(3, 0, 12, 12, 1)
 +#define SYSREG_ICC_HPPIR0_EL1    SYSREG(3, 0, 12, 8, 2)
 +#define SYSREG_ICC_HPPIR1_EL1    SYSREG(3, 0, 12, 12, 2)
 +#define SYSREG_ICC_IAR0_EL1      SYSREG(3, 0, 12, 8, 0)
 +#define SYSREG_ICC_IAR1_EL1      SYSREG(3, 0, 12, 12, 0)
 +#define SYSREG_ICC_IGRPEN0_EL1   SYSREG(3, 0, 12, 12, 6)
 +#define SYSREG_ICC_IGRPEN1_EL1   SYSREG(3, 0, 12, 12, 7)
 +#define SYSREG_ICC_PMR_EL1       SYSREG(3, 0, 4, 6, 0)
 +#define SYSREG_ICC_RPR_EL1       SYSREG(3, 0, 12, 11, 3)
 +#define SYSREG_ICC_SGI0R_EL1     SYSREG(3, 0, 12, 11, 7)
 +#define SYSREG_ICC_SGI1R_EL1     SYSREG(3, 0, 12, 11, 5)
 +#define SYSREG_ICC_SRE_EL1       SYSREG(3, 0, 12, 12, 5)
 +
  #define WFX_IS_WFE (1 << 0)
  #define TMR_CTL_ENABLE  (1 << 0)
@@ -XXX,XX +XXX,XX @@ static bool is_id_sysreg(uint32_t reg)
             SYSREG_CRM(reg) < 8;
  }
 +static uint32_t hvf_reg2cp_reg(uint32_t reg)
 +{
-+    const char clocknames[] = "clk_xin\0clk_ahb";
++    return ENCODE_AA64_CP_REG(CP_REG_ARM64_SYSREG_CP,
-+    const char compat[] = "arasan,sdhci-8.9a";
++                              (reg >> SYSREG_CRN_SHIFT) & SYSREG_CRN_MASK,
-+    int i;
++                              (reg >> SYSREG_CRM_SHIFT) & SYSREG_CRM_MASK,
-+
++                              (reg >> SYSREG_OP0_SHIFT) & SYSREG_OP0_MASK,
-+    for (i = ARRAY_SIZE(s->soc.pmc.iou.sd) - 1; i >= 0; i--) {
++                              (reg >> SYSREG_OP1_SHIFT) & SYSREG_OP1_MASK,
-+        uint64_t addr = MM_PMC_SD0 + MM_PMC_SD0_SIZE * i;
++                              (reg >> SYSREG_OP2_SHIFT) & SYSREG_OP2_MASK);
-+        char *name = g_strdup_printf("/sdhci@%" PRIx64, addr);
++}
 +
-+        qemu_fdt_add_subnode(s->fdt, name);
++static bool hvf_sysreg_read_cp(CPUState *cpu, uint32_t reg, uint64_t *val)
-+
++{
-+        qemu_fdt_setprop_cells(s->fdt, name, "clocks",
++    ARMCPU *arm_cpu = ARM_CPU(cpu);
-+                               s->phandle.clk_25Mhz, s->phandle.clk_25Mhz);
++    CPUARMState *env = &arm_cpu->env;
-+        qemu_fdt_setprop(s->fdt, name, "clock-names",
++    const ARMCPRegInfo *ri;
-+                         clocknames, sizeof(clocknames));
++
-+        qemu_fdt_setprop_cells(s->fdt, name, "interrupts",
++    ri = get_arm_cp_reginfo(arm_cpu->cp_regs, hvf_reg2cp_reg(reg));
-+                               GIC_FDT_IRQ_TYPE_SPI, VERSAL_SD0_IRQ_0 + i * 2,
++    if (ri) {
-+                               GIC_FDT_IRQ_FLAGS_LEVEL_HI);
++        if (ri->accessfn) {
-+        qemu_fdt_setprop_sized_cells(s->fdt, name, "reg",
++            if (ri->accessfn(env, ri, true) != CP_ACCESS_OK) {
-+                                     2, addr, 2, MM_PMC_SD0_SIZE);
++                return false;
-+        qemu_fdt_setprop(s->fdt, name, "compatible", compat, sizeof(compat));
++            }
-+        g_free(name);
++        }
 +        if (ri->type & ARM_CP_CONST) {
 +            *val = ri->resetvalue;
 +        } else if (ri->readfn) {
 +            *val = ri->readfn(env, ri);
 +        } else {
 +            *val = CPREG_FIELD64(env, ri);
 +        }
 +        trace_hvf_vgic_read(ri->name, *val);
 +        return true;
 +    }
++
++    return false;
 +}
 +
- static void fdt_nop_memory_nodes(void *fdt, Error **errp)
+ static int hvf_sysreg_read(CPUState *cpu, uint32_t reg, uint32_t rt)
  {
-     Error *err = NULL;
+     ARMCPU *arm_cpu = ARM_CPU(cpu);
-@@ -XXX,XX +XXX,XX @@ static void create_virtio_regions(VersalVirt *s)
+@@ -XXX,XX +XXX,XX @@ static int hvf_sysreg_read(CPUState *cpu, uint32_t reg, uint32_t rt)
      case SYSREG_OSDLR_EL1:
          /* Dummy register */
          break;
 +    case SYSREG_ICC_AP0R0_EL1:
 +    case SYSREG_ICC_AP0R1_EL1:
 +    case SYSREG_ICC_AP0R2_EL1:
 +    case SYSREG_ICC_AP0R3_EL1:
 +    case SYSREG_ICC_AP1R0_EL1:
 +    case SYSREG_ICC_AP1R1_EL1:
 +    case SYSREG_ICC_AP1R2_EL1:
 +    case SYSREG_ICC_AP1R3_EL1:
 +    case SYSREG_ICC_ASGI1R_EL1:
 +    case SYSREG_ICC_BPR0_EL1:
 +    case SYSREG_ICC_BPR1_EL1:
 +    case SYSREG_ICC_DIR_EL1:
 +    case SYSREG_ICC_EOIR0_EL1:
 +    case SYSREG_ICC_EOIR1_EL1:
 +    case SYSREG_ICC_HPPIR0_EL1:
 +    case SYSREG_ICC_HPPIR1_EL1:
 +    case SYSREG_ICC_IAR0_EL1:
 +    case SYSREG_ICC_IAR1_EL1:
 +    case SYSREG_ICC_IGRPEN0_EL1:
 +    case SYSREG_ICC_IGRPEN1_EL1:
 +    case SYSREG_ICC_PMR_EL1:
 +    case SYSREG_ICC_SGI0R_EL1:
 +    case SYSREG_ICC_SGI1R_EL1:
 +    case SYSREG_ICC_SRE_EL1:
 +    case SYSREG_ICC_CTLR_EL1:
 +        /* Call the TCG sysreg handler. This is only safe for GICv3 regs. */
 +        if (!hvf_sysreg_read_cp(cpu, reg, &val)) {
 +            hvf_raise_exception(cpu, EXCP_UDEF, syn_uncategorized());
 +        }
 +        break;
      default:
          if (is_id_sysreg(reg)) {
              /* ID system registers read as RES0 */
@@ -XXX,XX +XXX,XX @@ static void pmswinc_write(CPUARMState *env, uint64_t value)
      }
  }
-+static void sd_plugin_card(SDHCIState *sd, DriveInfo *di)
++static bool hvf_sysreg_write_cp(CPUState *cpu, uint32_t reg, uint64_t val)
 +{
-+    BlockBackend *blk = di ? blk_by_legacy_dinfo(di) : NULL;
++    ARMCPU *arm_cpu = ARM_CPU(cpu);
-+    DeviceState *card;
++    CPUARMState *env = &arm_cpu->env;
-+
++    const ARMCPRegInfo *ri;
-+    card = qdev_create(qdev_get_child_bus(DEVICE(sd), "sd-bus"), TYPE_SD_CARD);
++
-+    object_property_add_child(OBJECT(sd), "card[*]", OBJECT(card),
++    ri = get_arm_cp_reginfo(arm_cpu->cp_regs, hvf_reg2cp_reg(reg));
-+                              &error_fatal);
++
-+    qdev_prop_set_drive(card, "drive", blk, &error_fatal);
++    if (ri) {
-+    object_property_set_bool(OBJECT(card), true, "realized", &error_fatal);
++        if (ri->accessfn) {
 +            if (ri->accessfn(env, ri, false) != CP_ACCESS_OK) {
 +                return false;
 +            }
 +        }
 +        if (ri->writefn) {
 +            ri->writefn(env, ri, val);
 +        } else {
 +            CPREG_FIELD64(env, ri) = val;
 +        }
 +
 +        trace_hvf_vgic_write(ri->name, val);
 +        return true;
 +    }
 +
 +    return false;
 +}
 +
- static void versal_virt_init(MachineState *machine)
+ static int hvf_sysreg_write(CPUState *cpu, uint32_t reg, uint64_t val)
  {
-     VersalVirt *s = XLNX_VERSAL_VIRT_MACHINE(machine);
+     ARMCPU *arm_cpu = ARM_CPU(cpu);
-     int psci_conduit = QEMU_PSCI_CONDUIT_DISABLED;
+@@ -XXX,XX +XXX,XX @@ static int hvf_sysreg_write(CPUState *cpu, uint32_t reg, uint64_t val)
-+    int i;
+     case SYSREG_OSDLR_EL1:
+         /* Dummy register */
-     /*
+         break;
-      * If the user provides an Operating System to be loaded, we expect them
++    case SYSREG_ICC_AP0R0_EL1:
-@@ -XXX,XX +XXX,XX @@ static void versal_virt_init(MachineState *machine)
++    case SYSREG_ICC_AP0R1_EL1:
-     fdt_add_gic_nodes(s);
++    case SYSREG_ICC_AP0R2_EL1:
-     fdt_add_timer_nodes(s);
++    case SYSREG_ICC_AP0R3_EL1:
-     fdt_add_zdma_nodes(s);
++    case SYSREG_ICC_AP1R0_EL1:
-+    fdt_add_sd_nodes(s);
++    case SYSREG_ICC_AP1R1_EL1:
-     fdt_add_cpu_nodes(s, psci_conduit);
++    case SYSREG_ICC_AP1R2_EL1:
-     fdt_add_clk_node(s, "/clk125", 125000000, s->phandle.clk_125Mhz);
++    case SYSREG_ICC_AP1R3_EL1:
-     fdt_add_clk_node(s, "/clk25", 25000000, s->phandle.clk_25Mhz);
++    case SYSREG_ICC_ASGI1R_EL1:
-@@ -XXX,XX +XXX,XX @@ static void versal_virt_init(MachineState *machine)
++    case SYSREG_ICC_BPR0_EL1:
-     memory_region_add_subregion_overlap(get_system_memory(),
++    case SYSREG_ICC_BPR1_EL1:
-, &s->soc.fpd.apu.mr, 0);
++    case SYSREG_ICC_CTLR_EL1:
++    case SYSREG_ICC_DIR_EL1:
-+    /* Plugin SD cards.  */
++    case SYSREG_ICC_EOIR0_EL1:
-+    for (i = 0; i < ARRAY_SIZE(s->soc.pmc.iou.sd); i++) {
++    case SYSREG_ICC_EOIR1_EL1:
-+        sd_plugin_card(&s->soc.pmc.iou.sd[i], drive_get_next(IF_SD));
++    case SYSREG_ICC_HPPIR0_EL1:
-+    }
++    case SYSREG_ICC_HPPIR1_EL1:
-+
++    case SYSREG_ICC_IAR0_EL1:
-     s->binfo.ram_size = machine->ram_size;
++    case SYSREG_ICC_IAR1_EL1:
-     s->binfo.loader_start = 0x0;
++    case SYSREG_ICC_IGRPEN0_EL1:
-     s->binfo.get_dtb = versal_virt_get_dtb;
++    case SYSREG_ICC_IGRPEN1_EL1:
 +    case SYSREG_ICC_PMR_EL1:
 +    case SYSREG_ICC_SGI0R_EL1:
 +    case SYSREG_ICC_SGI1R_EL1:
 +    case SYSREG_ICC_SRE_EL1:
 +        /* Call the TCG sysreg handler. This is only safe for GICv3 regs. */
 +        if (!hvf_sysreg_write_cp(cpu, reg, val)) {
 +            hvf_raise_exception(cpu, EXCP_UDEF, syn_uncategorized());
 +        }
 +        break;
      default:
          cpu_synchronize_state(cpu);
          trace_hvf_unhandled_sysreg_write(env->pc, reg,
 diff --git a/target/arm/hvf/trace-events b/target/arm/hvf/trace-events
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/hvf/trace-events
 +++ b/target/arm/hvf/trace-events
@@ -XXX,XX +XXX,XX @@ hvf_unknown_hvc(uint64_t x0) "unknown HVC! 0x%016"PRIx64
  hvf_unknown_smc(uint64_t x0) "unknown SMC! 0x%016"PRIx64
  hvf_exit(uint64_t syndrome, uint32_t ec, uint64_t pc) "exit: 0x%"PRIx64" [ec=0x%x pc=0x%"PRIx64"]"
  hvf_psci_call(uint64_t x0, uint64_t x1, uint64_t x2, uint64_t x3, uint32_t cpuid) "PSCI Call x0=0x%016"PRIx64" x1=0x%016"PRIx64" x2=0x%016"PRIx64" x3=0x%016"PRIx64" cpu=0x%x"
 +hvf_vgic_write(const char *name, uint64_t val) "vgic write to %s [val=0x%016"PRIx64"]"
 +hvf_vgic_read(const char *name, uint64_t val) "vgic read from %s [val=0x%016"PRIx64"]"
 --
-.20.1
+.34.1

-[PULL 16/39] hw/arm: versal: Add support for SD
+[PULL 08/33] hw/arm/virt: Consolidate GIC finalize logic
-From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
+From: Alexander Graf <agraf@csgraf.de>
-Add support for SD.
+Up to now, the finalize_gic_version() code open coded what is essentially
+a support bitmap match between host/emulation environment and desired
-Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
+target GIC type.
-Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
-Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
+This open coding leads to undesirable side effects. For example, a VM with
-Reviewed-by: Luc Michel <luc.michel@greensocs.com>
+KVM and -smp 10 will automatically choose GICv3 while the same command
-Message-id: 20200427181649.26851-9-edgar.iglesias@gmail.com
+line with TCG will stay on GICv2 and fail the launch.
 This patch combines the TCG and KVM matching code paths by making
 everything a 2 pass process. First, we determine which GIC versions the
 current environment is able to support, then we go through a single
 state machine to determine which target GIC mode that means for us.
 After this patch, the only user-noticeable changes should be consolidated
 error messages as well as TCG -M virt supporting -smp > 8 automatically.
 Signed-off-by: Alexander Graf <agraf@csgraf.de>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
 Reviewed-by: Cornelia Huck <cohuck@redhat.com>
 Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
 Message-id: 20221223090107.98888-2-agraf@csgraf.de
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
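The selection logic described in the commit message can be illustrated with a
small standalone sketch. It is a simplification under stated assumptions, not
the patch's code: mini_pick_gic(), GIC_MASK() and the GIC_* constants are
hypothetical names, the first pass (probing the environment) is represented by
the `supported` bitmap argument, the KVM requirement for gic-version=host is
omitted, and every error path returns -1 instead of reporting and exiting.

```c
#include <assert.h>

/* Concrete versions use their own number so they can index the bitmap. */
enum {
    GIC_MAX = 0,
    GIC_HOST = 1,
    GIC_V2 = 2,
    GIC_V3 = 3,
    GIC_V4 = 4,
    GIC_NOSEL = 5,
};

#define GIC_MASK(v)   (1u << (v))
#define MINI_GIC_NCPU 8   /* CPU limit of GICv2 */

/* Returns 2, 3 or 4, or -1 where the real code would report an error. */
static int mini_pick_gic(int requested, unsigned supported, unsigned max_cpus)
{
    int v = requested;

    /* Step 1: turn host/max/nosel into a concrete version number. */
    switch (requested) {
    case GIC_HOST:   /* treated as max here; the KVM check is omitted */
    case GIC_MAX:
        if (supported & GIC_MASK(GIC_V4)) {
            v = GIC_V4;
        } else if (supported & GIC_MASK(GIC_V3)) {
            v = GIC_V3;
        } else {
            v = GIC_V2;
        }
        break;
    case GIC_NOSEL:
        if ((supported & GIC_MASK(GIC_V2)) && max_cpus <= MINI_GIC_NCPU) {
            v = GIC_V2;
        } else if (supported & GIC_MASK(GIC_V3)) {
            v = GIC_V3;
        } else {
            return -1;   /* nothing usable for this CPU count */
        }
        break;
    default:
        break;           /* explicit version requested, checked below */
    }

    /* Step 2: the chosen version must be in the supported bitmap. */
    if (v < GIC_V2 || v > GIC_V4 || !(supported & GIC_MASK(v))) {
        return -1;
    }
    return v;
}
```

For example, with both v2 and v3 supported, an unspecified version resolves to
v2 for 4 CPUs and to v3 for 10 CPUs, which is the "-smp > 8" behaviour the
commit message mentions.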
- include/hw/arm/xlnx-versal.h | 12 ++++++++++++
+ include/hw/arm/virt.h |  15 ++--
- hw/arm/xlnx-versal.c         | 31 +++++++++++++++++++++++++++++++
+ hw/arm/virt.c         | 198 ++++++++++++++++++++++--------------------
-files changed, 43 insertions(+)
+files changed, 112 insertions(+), 101 deletions(-)
-diff --git a/include/hw/arm/xlnx-versal.h b/include/hw/arm/xlnx-versal.h
+diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
 index XXXXXXX..XXXXXXX 100644
---- a/include/hw/arm/xlnx-versal.h
+--- a/include/hw/arm/virt.h
-+++ b/include/hw/arm/xlnx-versal.h
++++ b/include/hw/arm/virt.h
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ typedef enum VirtMSIControllerType {
+ } VirtMSIControllerType;
- #include "hw/sysbus.h"
- #include "hw/arm/boot.h"
+ typedef enum VirtGICType {
-+#include "hw/sd/sdhci.h"
+-    VIRT_GIC_VERSION_MAX,
- #include "hw/intc/arm_gicv3.h"
+-    VIRT_GIC_VERSION_HOST,
- #include "hw/char/pl011.h"
+-    VIRT_GIC_VERSION_2,
- #include "hw/dma/xlnx-zdma.h"
+-    VIRT_GIC_VERSION_3,
-@@ -XXX,XX +XXX,XX @@
+-    VIRT_GIC_VERSION_4,
- #define XLNX_VERSAL_NR_UARTS   2
++    VIRT_GIC_VERSION_MAX = 0,
- #define XLNX_VERSAL_NR_GEMS    2
++    VIRT_GIC_VERSION_HOST = 1,
- #define XLNX_VERSAL_NR_ADMAS   8
++    /* The concrete GIC values have to match the GIC version number */
-+#define XLNX_VERSAL_NR_SDS     2
++    VIRT_GIC_VERSION_2 = 2,
- #define XLNX_VERSAL_NR_IRQS    192
++    VIRT_GIC_VERSION_3 = 3,
++    VIRT_GIC_VERSION_4 = 4,
- typedef struct Versal {
+     VIRT_GIC_VERSION_NOSEL,
-@@ -XXX,XX +XXX,XX @@ typedef struct Versal {
+ } VirtGICType;
-         } iou;
-     } lpd;
++#define VIRT_GIC_VERSION_2_MASK BIT(VIRT_GIC_VERSION_2)
++#define VIRT_GIC_VERSION_3_MASK BIT(VIRT_GIC_VERSION_3)
-+    /* The Platform Management Controller subsystem.  */
++#define VIRT_GIC_VERSION_4_MASK BIT(VIRT_GIC_VERSION_4)
-+    struct {
++
-+        struct {
+ struct VirtMachineClass {
-+            SDHCIState sd[XLNX_VERSAL_NR_SDS];
+     MachineClass parent;
-+        } iou;
+     bool disallow_affinity_adjustment;
-+    } pmc;
+diff --git a/hw/arm/virt.c b/hw/arm/virt.c
 +
      struct {
          MemoryRegion *mr_ddr;
          uint32_t psci_conduit;
@@ -XXX,XX +XXX,XX @@ typedef struct Versal {
  #define VERSAL_GEM1_IRQ_0          58
  #define VERSAL_GEM1_WAKE_IRQ_0     59
  #define VERSAL_ADMA_IRQ_0          60
 +#define VERSAL_SD0_IRQ_0           126
  /* Architecturally reserved IRQs suitable for virtualization.  */
  #define VERSAL_RSVD_IRQ_FIRST 111
@@ -XXX,XX +XXX,XX @@ typedef struct Versal {
  #define MM_FPD_CRF                  0xfd1a0000U
  #define MM_FPD_CRF_SIZE             0x140000
 +#define MM_PMC_SD0                  0xf1040000U
 +#define MM_PMC_SD0_SIZE             0x10000
  #define MM_PMC_CRP                  0xf1260000U
  #define MM_PMC_CRP_SIZE             0x10000
  #endif
 diff --git a/hw/arm/xlnx-versal.c b/hw/arm/xlnx-versal.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/xlnx-versal.c
+--- a/hw/arm/virt.c
-+++ b/hw/arm/xlnx-versal.c
++++ b/hw/arm/virt.c
-@@ -XXX,XX +XXX,XX @@ static void versal_create_admas(Versal *s, qemu_irq *pic)
+@@ -XXX,XX +XXX,XX @@ static void virt_set_memmap(VirtMachineState *vms, int pa_bits)
      }
  }
-+#define SDHCI_CAPABILITIES  0x280737ec6481 /* Same as on ZynqMP.  */
++static VirtGICType finalize_gic_version_do(const char *accel_name,
-+static void versal_create_sds(Versal *s, qemu_irq *pic)
++                                           VirtGICType gic_version,
 +                                           int gics_supported,
 +                                           unsigned int max_cpus)
 +{
-+    int i;
++    /* Convert host/max/nosel to GIC version number */
-+
++    switch (gic_version) {
-+    for (i = 0; i < ARRAY_SIZE(s->pmc.iou.sd); i++) {
++    case VIRT_GIC_VERSION_HOST:
-+        DeviceState *dev;
++        if (!kvm_enabled()) {
-+        MemoryRegion *mr;
++            error_report("gic-version=host requires KVM");
-+
++            exit(1);
-+        sysbus_init_child_obj(OBJECT(s), "sd[*]",
++        }
-+                              &s->pmc.iou.sd[i], sizeof(s->pmc.iou.sd[i]),
++
-+                              TYPE_SYSBUS_SDHCI);
++        /* For KVM, gic-version=host means gic-version=max */
-+        dev = DEVICE(&s->pmc.iou.sd[i]);
++        return finalize_gic_version_do(accel_name, VIRT_GIC_VERSION_MAX,
-+
++                                       gics_supported, max_cpus);
-+        object_property_set_uint(OBJECT(dev),
++    case VIRT_GIC_VERSION_MAX:
-+                                 3, "sd-spec-version", &error_fatal);
++        if (gics_supported & VIRT_GIC_VERSION_4_MASK) {
-+        object_property_set_uint(OBJECT(dev), SDHCI_CAPABILITIES, "capareg",
++            gic_version = VIRT_GIC_VERSION_4;
-+                                 &error_fatal);
++        } else if (gics_supported & VIRT_GIC_VERSION_3_MASK) {
-+        object_property_set_uint(OBJECT(dev), UHS_I, "uhs", &error_fatal);
++            gic_version = VIRT_GIC_VERSION_3;
-+        qdev_init_nofail(dev);
++        } else {
-+
++            gic_version = VIRT_GIC_VERSION_2;
-+        mr = sysbus_mmio_get_region(SYS_BUS_DEVICE(dev), 0);
++        }
-+        memory_region_add_subregion(&s->mr_ps,
++        break;
-+                                    MM_PMC_SD0 + i * MM_PMC_SD0_SIZE, mr);
++    case VIRT_GIC_VERSION_NOSEL:
-+
++        if ((gics_supported & VIRT_GIC_VERSION_2_MASK) &&
-+        sysbus_connect_irq(SYS_BUS_DEVICE(dev), 0,
++            max_cpus <= GIC_NCPU) {
-+                           pic[VERSAL_SD0_IRQ_0 + i * 2]);
++            gic_version = VIRT_GIC_VERSION_2;
 +        } else if (gics_supported & VIRT_GIC_VERSION_3_MASK) {
 +            /*
 +             * in case the host does not support v2 emulation or
 +             * the end-user requested more than 8 VCPUs we now default
 +             * to v3. In any case defaulting to v2 would be broken.
 +             */
 +            gic_version = VIRT_GIC_VERSION_3;
 +        } else if (max_cpus > GIC_NCPU) {
 +            error_report("%s only supports GICv2 emulation but more than 8 "
 +                         "vcpus are requested", accel_name);
 +            exit(1);
 +        }
 +        break;
 +    case VIRT_GIC_VERSION_2:
 +    case VIRT_GIC_VERSION_3:
 +    case VIRT_GIC_VERSION_4:
 +        break;
 +    }
++
++    /* Check chosen version is effectively supported */
++    switch (gic_version) {
++    case VIRT_GIC_VERSION_2:
++        if (!(gics_supported & VIRT_GIC_VERSION_2_MASK)) {
++            error_report("%s does not support GICv2 emulation", accel_name);
++            exit(1);
++        }
++        break;
++    case VIRT_GIC_VERSION_3:
++        if (!(gics_supported & VIRT_GIC_VERSION_3_MASK)) {
++            error_report("%s does not support GICv3 emulation", accel_name);
++            exit(1);
++        }
++        break;
++    case VIRT_GIC_VERSION_4:
++        if (!(gics_supported & VIRT_GIC_VERSION_4_MASK)) {
++            error_report("%s does not support GICv4 emulation, is virtualization=on?",
++                         accel_name);
++            exit(1);
++        }
++        break;
++    default:
++        error_report("logic error in finalize_gic_version");
++        exit(1);
++        break;
++    }
++
++    return gic_version;
 +}
 +
- /* This takes the board allocated linear DDR memory and creates aliases
+ /*
-  * for each split DDR range/aperture on the Versal address map.
+  * finalize_gic_version - Determines the final gic_version
   * according to the gic-version property
@@ -XXX,XX +XXX,XX @@ static void virt_set_memmap(VirtMachineState *vms, int pa_bits)
   */
-@@ -XXX,XX +XXX,XX @@ static void versal_realize(DeviceState *dev, Error **errp)
+ static void finalize_gic_version(VirtMachineState *vms)
-     versal_create_uarts(s, pic);
+ {
-     versal_create_gems(s, pic);
++    const char *accel_name = current_accel_name();
-     versal_create_admas(s, pic);
+     unsigned int max_cpus = MACHINE(vms)->smp.max_cpus;
-+    versal_create_sds(s, pic);
++    int gics_supported = 0;
-     versal_map_ddr(s);
-     versal_unimp(s);
+-    if (kvm_enabled()) {
+-        int probe_bitmap;
 +    /* Determine which GIC versions the current environment supports */
 +    if (kvm_enabled() && kvm_irqchip_in_kernel()) {
 +        int probe_bitmap = kvm_arm_vgic_probe();
 -        if (!kvm_irqchip_in_kernel()) {
 -            switch (vms->gic_version) {
 -            case VIRT_GIC_VERSION_HOST:
 -                warn_report(
 -                    "gic-version=host not relevant with kernel-irqchip=off "
 -                     "as only userspace GICv2 is supported. Using v2 ...");
 -                return;
 -            case VIRT_GIC_VERSION_MAX:
 -            case VIRT_GIC_VERSION_NOSEL:
 -                vms->gic_version = VIRT_GIC_VERSION_2;
 -                return;
 -            case VIRT_GIC_VERSION_2:
 -                return;
 -            case VIRT_GIC_VERSION_3:
 -                error_report(
 -                    "gic-version=3 is not supported with kernel-irqchip=off");
 -                exit(1);
 -            case VIRT_GIC_VERSION_4:
 -                error_report(
 -                    "gic-version=4 is not supported with kernel-irqchip=off");
 -                exit(1);
 -            }
 -        }
 -
 -        probe_bitmap = kvm_arm_vgic_probe();
          if (!probe_bitmap) {
              error_report("Unable to determine GIC version supported by host");
              exit(1);
          }
 -        switch (vms->gic_version) {
 -        case VIRT_GIC_VERSION_HOST:
 -        case VIRT_GIC_VERSION_MAX:
 -            if (probe_bitmap & KVM_ARM_VGIC_V3) {
 -                vms->gic_version = VIRT_GIC_VERSION_3;
 -            } else {
 -                vms->gic_version = VIRT_GIC_VERSION_2;
 -            }
 -            return;
 -        case VIRT_GIC_VERSION_NOSEL:
 -            if ((probe_bitmap & KVM_ARM_VGIC_V2) && max_cpus <= GIC_NCPU) {
 -                vms->gic_version = VIRT_GIC_VERSION_2;
 -            } else if (probe_bitmap & KVM_ARM_VGIC_V3) {
 -                /*
 -                 * in case the host does not support v2 in-kernel emulation or
 -                 * the end-user requested more than 8 VCPUs we now default
 -                 * to v3. In any case defaulting to v2 would be broken.
 -                 */
 -                vms->gic_version = VIRT_GIC_VERSION_3;
 -            } else if (max_cpus > GIC_NCPU) {
 -                error_report("host only supports in-kernel GICv2 emulation "
 -                             "but more than 8 vcpus are requested");
 -                exit(1);
 -            }
 -            break;
 -        case VIRT_GIC_VERSION_2:
 -        case VIRT_GIC_VERSION_3:
 -            break;
 -        case VIRT_GIC_VERSION_4:
 -            error_report("gic-version=4 is not supported with KVM");
 -            exit(1);
 +        if (probe_bitmap & KVM_ARM_VGIC_V2) {
 +            gics_supported |= VIRT_GIC_VERSION_2_MASK;
          }
 -
 -        /* Check chosen version is effectively supported by the host */
 -        if (vms->gic_version == VIRT_GIC_VERSION_2 &&
 -            !(probe_bitmap & KVM_ARM_VGIC_V2)) {
 -            error_report("host does not support in-kernel GICv2 emulation");
 -            exit(1);
 -        } else if (vms->gic_version == VIRT_GIC_VERSION_3 &&
 -                   !(probe_bitmap & KVM_ARM_VGIC_V3)) {
 -            error_report("host does not support in-kernel GICv3 emulation");
 -            exit(1);
 +        if (probe_bitmap & KVM_ARM_VGIC_V3) {
 +            gics_supported |= VIRT_GIC_VERSION_3_MASK;
          }
 -        return;
 -    }
 -
 -    /* TCG mode */
 -    switch (vms->gic_version) {
 -    case VIRT_GIC_VERSION_NOSEL:
 -        vms->gic_version = VIRT_GIC_VERSION_2;
 -        break;
 -    case VIRT_GIC_VERSION_MAX:
 +    } else if (kvm_enabled() && !kvm_irqchip_in_kernel()) {
 +        /* KVM w/o kernel irqchip can only deal with GICv2 */
 +        gics_supported |= VIRT_GIC_VERSION_2_MASK;
 +        accel_name = "KVM with kernel-irqchip=off";
 +    } else {
 +        gics_supported |= VIRT_GIC_VERSION_2_MASK;
          if (module_object_class_by_name("arm-gicv3")) {
 -            /* CONFIG_ARM_GICV3_TCG was set */
 +            gics_supported |= VIRT_GIC_VERSION_3_MASK;
              if (vms->virt) {
                  /* GICv4 only makes sense if CPU has EL2 */
 -                vms->gic_version = VIRT_GIC_VERSION_4;
 -            } else {
 -                vms->gic_version = VIRT_GIC_VERSION_3;
 +                gics_supported |= VIRT_GIC_VERSION_4_MASK;
              }
 -        } else {
 -            vms->gic_version = VIRT_GIC_VERSION_2;
          }
 -        break;
 -    case VIRT_GIC_VERSION_HOST:
 -        error_report("gic-version=host requires KVM");
 -        exit(1);
 -    case VIRT_GIC_VERSION_4:
 -        if (!vms->virt) {
 -            error_report("gic-version=4 requires virtualization enabled");
 -            exit(1);
 -        }
 -        break;
 -    case VIRT_GIC_VERSION_2:
 -    case VIRT_GIC_VERSION_3:
 -        break;
      }
 +
 +    /*
 +     * Then convert helpers like host/max to concrete GIC versions and ensure
 +     * the desired version is supported
 +     */
 +    vms->gic_version = finalize_gic_version_do(accel_name, vms->gic_version,
 +                                               gics_supported, max_cpus);
  }
  /*
 --
-.20.1
+.34.1
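
Note on the consolidated GIC logic above: the patch splits the decision into two steps. It first collects a bitmask of supported GIC versions, then resolves the requested version against that mask. The following is a minimal standalone sketch of the second step as it appears in the visible hunks (the "no selection" default and the final support check). The names gic_ver, GIC_MASK, GICV2_MAX_CPUS and pick_gic are hypothetical stand-ins for illustration, not QEMU identifiers, and the sketch returns -1 where the patch calls exit(1).

```c
/* Hypothetical stand-ins for the VIRT_GIC_VERSION_* values in the patch. */
enum gic_ver { GIC_NOSEL = 0, GIC_V2 = 2, GIC_V3 = 3, GIC_V4 = 4 };

#define GIC_MASK(v)     (1u << (v))
#define GICV2_MAX_CPUS  8u   /* plays the role of GIC_NCPU */

/*
 * Resolve 'requested' against the 'supported' bitmask.
 * Returns the concrete version, or -1 where the patch would report
 * an error and exit.
 */
static int pick_gic(enum gic_ver requested, unsigned supported, unsigned cpus)
{
    int ver = requested;

    if (requested == GIC_NOSEL) {
        if ((supported & GIC_MASK(GIC_V2)) && cpus <= GICV2_MAX_CPUS) {
            ver = GIC_V2;       /* v2 is the default when it is usable */
        } else if (supported & GIC_MASK(GIC_V3)) {
            ver = GIC_V3;       /* v2 unavailable or too many CPUs */
        } else {
            return -1;          /* neither default can be used */
        }
    }

    /* Second step: the chosen version must be in the supported set. */
    if (ver != GIC_V2 && ver != GIC_V3 && ver != GIC_V4) {
        return -1;
    }
    return (supported & GIC_MASK(ver)) ? ver : -1;
}
```

With a mask containing v2 and v3, pick_gic(GIC_NOSEL, mask, 4) yields 2 and pick_gic(GIC_NOSEL, mask, 16) yields 3. An explicit request for v4 against the same mask is rejected, which corresponds to the "does not support GICv4 emulation" error path.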

-[PULL 02/39] hw/arm/mps2-tz: Use TYPE_IOTKIT instead of hardcoded string
+[PULL 09/33] hw/arm/virt: Make accels in GIC finalize logic explicit
-From: Philippe Mathieu-Daudé <f4bug@amsat.org>
+From: Alexander Graf <agraf@csgraf.de>
-By using the TYPE_* definitions for devices, we can:
+Let's explicitly list out all accelerators that we support when trying to
- - quickly find where devices are used with 'git-grep'
+determine the supported set of GIC versions. KVM was already separate, so
- - easily rename a device (one-line change).
+the only missing one is HVF which simply reuses all of TCG's emulation
 code and thus has the same compatibility matrix.
-Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
+Signed-off-by: Alexander Graf <agraf@csgraf.de>
-Message-id: 20200428154650.21991-1-f4bug@amsat.org
+Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Cornelia Huck <cohuck@redhat.com>
 Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
 Message-id: 20221223090107.98888-3-agraf@csgraf.de
 [PMM: Added qtest to the list of accelerators]
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- hw/arm/mps2-tz.c | 2 +-
+ hw/arm/virt.c | 7 ++++++-
-file changed, 1 insertion(+), 1 deletion(-)
+file changed, 6 insertions(+), 1 deletion(-)
-diff --git a/hw/arm/mps2-tz.c b/hw/arm/mps2-tz.c
+diff --git a/hw/arm/virt.c b/hw/arm/virt.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/mps2-tz.c
+--- a/hw/arm/virt.c
-+++ b/hw/arm/mps2-tz.c
++++ b/hw/arm/virt.c
-@@ -XXX,XX +XXX,XX @@ static void mps2tz_common_init(MachineState *machine)
+@@ -XXX,XX +XXX,XX @@
-         exit(EXIT_FAILURE);
+ #include "sysemu/numa.h"
  #include "sysemu/runstate.h"
  #include "sysemu/tpm.h"
 +#include "sysemu/tcg.h"
  #include "sysemu/kvm.h"
  #include "sysemu/hvf.h"
 +#include "sysemu/qtest.h"
  #include "hw/loader.h"
  #include "qapi/error.h"
  #include "qemu/bitops.h"
@@ -XXX,XX +XXX,XX @@ static void finalize_gic_version(VirtMachineState *vms)
          /* KVM w/o kernel irqchip can only deal with GICv2 */
          gics_supported |= VIRT_GIC_VERSION_2_MASK;
          accel_name = "KVM with kernel-irqchip=off";
 -    } else {
 +    } else if (tcg_enabled() || hvf_enabled() || qtest_enabled())  {
          gics_supported |= VIRT_GIC_VERSION_2_MASK;
          if (module_object_class_by_name("arm-gicv3")) {
              gics_supported |= VIRT_GIC_VERSION_3_MASK;
@@ -XXX,XX +XXX,XX @@ static void finalize_gic_version(VirtMachineState *vms)
                  gics_supported |= VIRT_GIC_VERSION_4_MASK;
              }
          }
 +    } else {
 +        error_report("Unsupported accelerator, can not determine GIC support");
 +        exit(1);
      }
--    sysbus_init_child_obj(OBJECT(machine), "iotkit", &mms->iotkit,
+     /*
 +    sysbus_init_child_obj(OBJECT(machine), TYPE_IOTKIT, &mms->iotkit,
                            sizeof(mms->iotkit), mmc->armsse_type);
      iotkitdev = DEVICE(&mms->iotkit);
      object_property_set_link(OBJECT(&mms->iotkit), OBJECT(system_memory),
 --
-.20.1
+.34.1

-[PULL 09/39] hw/arm: versal: Remove inclusion of arm_gicv3_common.h
+[PULL 10/33] sbsa-ref: remove cortex-a76 from list of supported cpus
-From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
+From: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
-Remove inclusion of arm_gicv3_common.h, this already gets
+Cortex-A76 supports 40 bits of address space. sbsa-ref's memory
-included via xlnx-versal.h.
+starts above this limit.
-Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
+Signed-off-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
-Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
+Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
-Reviewed-by: Luc Michel <luc.michel@greensocs.com>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200427181649.26851-2-edgar.iglesias@gmail.com
+Message-id: 20230126114416.2447685-1-marcin.juszkiewicz@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- hw/arm/xlnx-versal.c | 1 -
+ hw/arm/sbsa-ref.c | 1 -
 file changed, 1 deletion(-)
-diff --git a/hw/arm/xlnx-versal.c b/hw/arm/xlnx-versal.c
+diff --git a/hw/arm/sbsa-ref.c b/hw/arm/sbsa-ref.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/xlnx-versal.c
+--- a/hw/arm/sbsa-ref.c
-+++ b/hw/arm/xlnx-versal.c
++++ b/hw/arm/sbsa-ref.c
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ static const int sbsa_ref_irqmap[] = {
- #include "hw/arm/boot.h"
+ static const char * const valid_cpus[] = {
- #include "kvm_arm.h"
+     ARM_CPU_TYPE_NAME("cortex-a57"),
- #include "hw/misc/unimp.h"
+     ARM_CPU_TYPE_NAME("cortex-a72"),
--#include "hw/intc/arm_gicv3_common.h"
+-    ARM_CPU_TYPE_NAME("cortex-a76"),
- #include "hw/arm/xlnx-versal.h"
+     ARM_CPU_TYPE_NAME("neoverse-n1"),
- #include "hw/char/pl011.h"
+     ARM_CPU_TYPE_NAME("max"),
+ };
 --
-.20.1
+.34.1
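
Note on the address-space argument above: a CPU with N physical address bits can reach addresses below 2^N only, so RAM that starts at or above 2^40 is unreachable from a 40-bit CPU. The sketch below checks this arithmetic. ASSUMED_RAM_BASE (1 TiB) and addr_fits are illustrative assumptions and are not taken from the patch or from hw/arm/sbsa-ref.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed RAM base for illustration only: 1 TiB, i.e. 2^40. */
#define ASSUMED_RAM_BASE 0x10000000000ULL

/* True if 'addr' is representable with 'pa_bits' physical address bits. */
static bool addr_fits(uint64_t addr, unsigned pa_bits)
{
    if (pa_bits >= 64) {
        return true;    /* avoid an undefined 64-bit shift */
    }
    return addr < (UINT64_C(1) << pa_bits);
}
```

Under that assumption, addr_fits(ASSUMED_RAM_BASE, 40) is false while addr_fits(ASSUMED_RAM_BASE, 48) is true, which is the reason a 40-bit CPU model cannot be used with such a memory map.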

-[PULL 04/39] target/arm: Use enum constant in get_phys_addr_lpae() call
+[PULL 11/33] target/arm: Name AT_S1E1RP and AT_S1E1WP cpregs correctly
-The access_type argument to get_phys_addr_lpae() is an MMUAccessType;
+The encodings 0,0,C7,C9,0 and 0,0,C7,C9,1 are AT S1E1RP and AT
-use the enum constant MMU_DATA_LOAD rather than a literal 0 when we
+S1E1WP, but our ARMCPRegInfo definitions for them incorrectly name
-call it in S1_ptw_translate().
+them AT S1E1R and AT S1E1W (which are entirely different
 instructions).  Fix the names.
 (This has no guest-visible effect as the names are for debug purposes
 only.)
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200330210400.11724-3-peter.maydell@linaro.org
+Tested-by: Fuad Tabba <tabba@google.com>
 Message-id: 20230130182459.3309057-2-peter.maydell@linaro.org
 Message-id: 20230127175507.2895013-2-peter.maydell@linaro.org
 ---
- target/arm/helper.c | 5 +++--
+ target/arm/helper.c | 4 ++--
-file changed, 3 insertions(+), 2 deletions(-)
+file changed, 2 insertions(+), 2 deletions(-)
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.c
 +++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ static hwaddr S1_ptw_translate(CPUARMState *env, ARMMMUIdx mmu_idx,
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo vhe_reginfo[] = {
-             pcacheattrs = &cacheattrs;
-         }
+ #ifndef CONFIG_USER_ONLY
+ static const ARMCPRegInfo ats1e1_reginfo[] = {
--        ret = get_phys_addr_lpae(env, addr, 0, ARMMMUIdx_Stage2, &s2pa,
+-    { .name = "AT_S1E1R", .state = ARM_CP_STATE_AA64,
--                                 &txattrs, &s2prot, &s2size, fi, pcacheattrs);
++    { .name = "AT_S1E1RP", .state = ARM_CP_STATE_AA64,
-+        ret = get_phys_addr_lpae(env, addr, MMU_DATA_LOAD, ARMMMUIdx_Stage2,
+       .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 9, .opc2 = 0,
-+                                 &s2pa, &txattrs, &s2prot, &s2size, fi,
+       .access = PL1_W, .type = ARM_CP_NO_RAW | ARM_CP_RAISES_EXC,
-+                                 pcacheattrs);
+       .writefn = ats_write64 },
-         if (ret) {
+-    { .name = "AT_S1E1W", .state = ARM_CP_STATE_AA64,
-             assert(fi->type != ARMFault_None);
++    { .name = "AT_S1E1WP", .state = ARM_CP_STATE_AA64,
-             fi->s2addr = addr;
+       .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 9, .opc2 = 1,
        .access = PL1_W, .type = ARM_CP_NO_RAW | ARM_CP_RAISES_EXC,
        .writefn = ats_write64 },
 --
-.20.1
+.34.1
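
Note on the renaming above: the four operations differ only in their CRm and opc2 fields, which is why the wrong names were easy to apply. The small lookup sketch below makes the distinction explicit. The struct and function names are hypothetical. The C9 rows match the .crm/.opc2 values in the patch, while the C8 rows for AT S1E1R and AT S1E1W come from the Arm architecture and do not appear in this patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table type; not a QEMU structure. */
struct at_enc {
    const char *name;
    uint8_t opc1, crn, crm, opc2;
};

static const struct at_enc at_ops[] = {
    { "AT S1E1R",  0, 7, 8, 0 },   /* architectural encoding, not in patch */
    { "AT S1E1W",  0, 7, 8, 1 },   /* architectural encoding, not in patch */
    { "AT S1E1RP", 0, 7, 9, 0 },   /* matches .crm = 9, .opc2 = 0 */
    { "AT S1E1WP", 0, 7, 9, 1 },   /* matches .crm = 9, .opc2 = 1 */
};

/* Return the operation name for an encoding, or NULL if unknown. */
static const char *at_name(uint8_t opc1, uint8_t crn, uint8_t crm,
                           uint8_t opc2)
{
    for (size_t i = 0; i < sizeof(at_ops) / sizeof(at_ops[0]); i++) {
        const struct at_enc *e = &at_ops[i];
        if (e->opc1 == opc1 && e->crn == crn &&
            e->crm == crm && e->opc2 == opc2) {
            return e->name;
        }
    }
    return NULL;
}
```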

-[PULL 38/39] target/arm: Convert Neon 3-reg-same VMUL, VMLA, VMLS, VSHL to decodetree
+[PULL 12/33] target/arm: Correct syndrome for ATS12NSO* at Secure EL1
-Convert the Neon VMUL, VMLA, VMLS and VSHL insns in the
+The AArch32 ATS12NSO* address translation operations are supposed to
-3-reg-same grouping to decodetree.
+trap to either EL2 or EL3 if they're executed at Secure EL1 (which
 can only happen if EL3 is AArch64).  We implement this, but we got
 the syndrome value wrong: like other traps to EL2 or EL3 on an
 AArch32 cpreg access, they should report the 0x3 syndrome, not the
 0x0 'uncategorized' syndrome.  This is clear in the access pseudocode
 for these instructions.
 Fix the syndrome value for these operations by correcting the
 returned value from the ats_access() function.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200430181003.21682-20-peter.maydell@linaro.org
+Tested-by: Fuad Tabba <tabba@google.com>
 Message-id: 20230130182459.3309057-3-peter.maydell@linaro.org
 Message-id: 20230127175507.2895013-3-peter.maydell@linaro.org
 ---
- target/arm/neon-dp.decode       |  9 +++++++
+ target/arm/helper.c | 4 ++--
- target/arm/translate-neon.inc.c | 44 +++++++++++++++++++++++++++++++++
+file changed, 2 insertions(+), 2 deletions(-)
  target/arm/translate.c          | 28 +++------------------
 files changed, 56 insertions(+), 25 deletions(-)
-diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
+diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/neon-dp.decode
+--- a/target/arm/helper.c
-+++ b/target/arm/neon-dp.decode
++++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ VCGT_U_3s        1111 001 1 0 . .. .... .... 0011 . . . 0 .... @3same
+@@ -XXX,XX +XXX,XX @@ static CPAccessResult ats_access(CPUARMState *env, const ARMCPRegInfo *ri,
- VCGE_S_3s        1111 001 0 0 . .. .... .... 0011 . . . 1 .... @3same
+         if (arm_current_el(env) == 1) {
- VCGE_U_3s        1111 001 1 0 . .. .... .... 0011 . . . 1 .... @3same
+             if (arm_is_secure_below_el3(env)) {
+                 if (env->cp15.scr_el3 & SCR_EEL2) {
-+VSHL_S_3s        1111 001 0 0 . .. .... .... 0100 . . . 0 .... @3same
+-                    return CP_ACCESS_TRAP_UNCATEGORIZED_EL2;
-+VSHL_U_3s        1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same
++                    return CP_ACCESS_TRAP_EL2;
-+
+                 }
- VMAX_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 0 .... @3same
+-                return CP_ACCESS_TRAP_UNCATEGORIZED_EL3;
- VMAX_U_3s        1111 001 1 0 . .. .... .... 0110 . . . 0 .... @3same
++                return CP_ACCESS_TRAP_EL3;
  VMIN_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 1 .... @3same
@@ -XXX,XX +XXX,XX @@ VSUB_3s          1111 001 1 0 . .. .... .... 1000 . . . 0 .... @3same
  VTST_3s          1111 001 0 0 . .. .... .... 1000 . . . 1 .... @3same
  VCEQ_3s          1111 001 1 0 . .. .... .... 1000 . . . 1 .... @3same
 +
 +VMLA_3s          1111 001 0 0 . .. .... .... 1001 . . . 0 .... @3same
 +VMLS_3s          1111 001 1 0 . .. .... .... 1001 . . . 0 .... @3same
 +
 +VMUL_3s          1111 001 0 0 . .. .... .... 1001 . . . 1 .... @3same
 +VMUL_p_3s        1111 001 1 0 . .. .... .... 1001 . . . 1 .... @3same
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_3SAME_NO_SZ_3(VMAX_S, tcg_gen_gvec_smax)
  DO_3SAME_NO_SZ_3(VMAX_U, tcg_gen_gvec_umax)
  DO_3SAME_NO_SZ_3(VMIN_S, tcg_gen_gvec_smin)
  DO_3SAME_NO_SZ_3(VMIN_U, tcg_gen_gvec_umin)
 +DO_3SAME_NO_SZ_3(VMUL, tcg_gen_gvec_mul)
  #define DO_3SAME_CMP(INSN, COND)                                        \
      static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
@@ -XXX,XX +XXX,XX @@ DO_3SAME_GVEC4(VQADD_S, sqadd_op)
  DO_3SAME_GVEC4(VQADD_U, uqadd_op)
  DO_3SAME_GVEC4(VQSUB_S, sqsub_op)
  DO_3SAME_GVEC4(VQSUB_U, uqsub_op)
 +
 +static void gen_VMUL_p_3s(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                           uint32_t rm_ofs, uint32_t oprsz, uint32_t maxsz)
 +{
 +    tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz,
 +                       0, gen_helper_gvec_pmul_b);
 +}
 +
 +static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
 +{
 +    if (a->size != 0) {
 +        return false;
 +    }
 +    return do_3same(s, a, gen_VMUL_p_3s);
 +}
 +
 +#define DO_3SAME_GVEC3_NO_SZ_3(INSN, OPARRAY)                           \
 +    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
 +                                uint32_t rn_ofs, uint32_t rm_ofs,       \
 +                                uint32_t oprsz, uint32_t maxsz)         \
 +    {                                                                   \
 +        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,                          \
 +                       oprsz, maxsz, &OPARRAY[vece]);                   \
 +    }                                                                   \
 +    DO_3SAME_NO_SZ_3(INSN, gen_##INSN##_3s)
 +
 +
 +DO_3SAME_GVEC3_NO_SZ_3(VMLA, mla_op)
 +DO_3SAME_GVEC3_NO_SZ_3(VMLS, mls_op)
 +
 +#define DO_3SAME_GVEC3_SHIFT(INSN, OPARRAY)                             \
 +    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
 +                                uint32_t rn_ofs, uint32_t rm_ofs,       \
 +                                uint32_t oprsz, uint32_t maxsz)         \
 +    {                                                                   \
 +        /* Note the operation is vshl vd,vm,vn */                       \
 +        tcg_gen_gvec_3(rd_ofs, rm_ofs, rn_ofs,                          \
 +                       oprsz, maxsz, &OPARRAY[vece]);                   \
 +    }                                                                   \
 +    DO_3SAME(INSN, gen_##INSN##_3s)
 +
 +DO_3SAME_GVEC3_SHIFT(VSHL_S, sshl_op)
 +DO_3SAME_GVEC3_SHIFT(VSHL_U, ushl_op)
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
              }
-             return 1;
+             return CP_ACCESS_TRAP_UNCATEGORIZED;
 -        case NEON_3R_VMUL: /* VMUL */
 -            if (u) {
 -                /* Polynomial case allows only P8.  */
 -                if (size != 0) {
 -                    return 1;
 -                }
 -                tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, vec_size, vec_size,
 -                                   0, gen_helper_gvec_pmul_b);
 -            } else {
 -                tcg_gen_gvec_mul(size, rd_ofs, rn_ofs, rm_ofs,
 -                                 vec_size, vec_size);
 -            }
 -            return 0;
 -
 -        case NEON_3R_VML: /* VMLA, VMLS */
 -            tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, vec_size, vec_size,
 -                           u ? &mls_op[size] : &mla_op[size]);
 -            return 0;
 -
 -        case NEON_3R_VSHL:
 -            /* Note the operation is vshl vd,vm,vn */
 -            tcg_gen_gvec_3(rd_ofs, rm_ofs, rn_ofs, vec_size, vec_size,
 -                           u ? &ushl_op[size] : &sshl_op[size]);
 -            return 0;
 -
          case NEON_3R_VADD_VSUB:
          case NEON_3R_LOGIC:
          case NEON_3R_VMAX:
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          case NEON_3R_VCGE:
          case NEON_3R_VQADD:
          case NEON_3R_VQSUB:
 +        case NEON_3R_VMUL:
 +        case NEON_3R_VML:
 +        case NEON_3R_VSHL:
              /* Already handled by decodetree */
              return 1;
          }
 --
-.20.1
+.34.1
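
Note on the syndrome fix above: the "0x3" and "0x0" values in the commit message are exception class (EC) values, which occupy bits [31:26] of the syndrome register. The sketch below shows how the two cases differ once packed. The helper names are hypothetical and this is not QEMU's syndrome code. It assumes the standard ESR layout (EC in [31:26], IL in bit 25, ISS in [24:0]) and sets IL because the trapped A32 instruction is 32 bits wide.

```c
#include <stdint.h>

#define EC_SHIFT          26
#define IL_BIT            (UINT32_C(1) << 25)
#define ISS_MASK          UINT32_C(0x01ffffff)

#define EC_UNCATEGORIZED  UINT32_C(0x00)  /* the value reported before the fix */
#define EC_CP15_RT_TRAP   UINT32_C(0x03)  /* trapped MCR/MRC to cp15 */

/* Pack an exception class and ISS into a syndrome word (IL set). */
static uint32_t syn_pack(uint32_t ec, uint32_t iss)
{
    return (ec << EC_SHIFT) | IL_BIT | (iss & ISS_MASK);
}

/* Extract the exception class from a syndrome word. */
static uint32_t syn_ec(uint32_t syn)
{
    return syn >> EC_SHIFT;
}
```

A hypervisor or monitor that dispatches on syn_ec() would route the two values to different handlers, which is why reporting 0x0 instead of 0x3 matters to code running at EL2 or EL3.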

-[PULL 37/39] target/arm: Convert Neon 3-reg-same VQADD/VQSUB to decodetree
+[PULL 13/33] target/arm: Remove CP_ACCESS_TRAP_UNCATEGORIZED_{EL2, EL3}
-Convert the Neon VQADD/VQSUB insns in the 3-reg-same grouping
+We added the CPAccessResult values CP_ACCESS_TRAP_UNCATEGORIZED_EL2
-to decodetree.
+and CP_ACCESS_TRAP_UNCATEGORIZED_EL3 purely in order to use them in
 the ats_access() function, but doing so was incorrect (a bug fixed in
 a previous commit).  There aren't any cases where we want an access
 function to be able to request a trap to EL2 or EL3 with a zero
 syndrome value, so remove these enum values.
 As well as cleaning up dead code, the motivation here is that
 we'd like to implement fine-grained-trap handling in
 helper_access_check_cp_reg(). Although the fine-grained traps
 to EL2 are always lower priority than trap-to-same-EL and
 higher priority than trap-to-EL3, they are in the middle of
 various other kinds of trap-to-EL2. Knowing that a trap-to-EL2
 must always have the same syndrome for us (ie that an access
 function will return CP_ACCESS_TRAP_EL2 and there is no other
 kind of trap-to-EL2 enum value) means we don't have to try
 to choose which of the two syndrome values to report if the
 access would trap to EL2 both for the fine-grained-trap and
 because the access function requires it.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200430181003.21682-19-peter.maydell@linaro.org
+Tested-by: Fuad Tabba <tabba@google.com>
 Message-id: 20230130182459.3309057-4-peter.maydell@linaro.org
 Message-id: 20230127175507.2895013-4-peter.maydell@linaro.org
 ---
- target/arm/neon-dp.decode       |  6 ++++++
+ target/arm/cpregs.h    | 4 ++--
- target/arm/translate-neon.inc.c | 15 +++++++++++++++
+ target/arm/op_helper.c | 2 ++
- target/arm/translate.c          | 14 ++------------
+files changed, 4 insertions(+), 2 deletions(-)
 files changed, 23 insertions(+), 12 deletions(-)
-diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
+diff --git a/target/arm/cpregs.h b/target/arm/cpregs.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/neon-dp.decode
+--- a/target/arm/cpregs.h
-+++ b/target/arm/neon-dp.decode
++++ b/target/arm/cpregs.h
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ typedef enum CPAccessResult {
- @3same           .... ... . . . size:2 .... .... .... . q:1 . . .... \
+      * Access fails and results in an exception syndrome 0x0 ("uncategorized").
-                  &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp
+      * Note that this is not a catch-all case -- the set of cases which may
+      * result in this failure is specifically defined by the architecture.
-+VQADD_S_3s       1111 001 0 0 . .. .... .... 0000 . . . 1 .... @3same
++     * This trap is always to the usual target EL, never directly to a
-+VQADD_U_3s       1111 001 1 0 . .. .... .... 0000 . . . 1 .... @3same
++     * specified target EL.
-+
+      */
- @3same_logic     .... ... . . . .. .... .... .... . q:1 .. .... \
+     CP_ACCESS_TRAP_UNCATEGORIZED = (2 << 2),
-                  &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp size=0
+-    CP_ACCESS_TRAP_UNCATEGORIZED_EL2 = CP_ACCESS_TRAP_UNCATEGORIZED | 2,
+-    CP_ACCESS_TRAP_UNCATEGORIZED_EL3 = CP_ACCESS_TRAP_UNCATEGORIZED | 3,
-@@ -XXX,XX +XXX,XX @@ VBSL_3s          1111 001 1 0 . 01 .... .... 0001 ... 1 .... @3same_logic
+ } CPAccessResult;
- VBIT_3s          1111 001 1 0 . 10 .... .... 0001 ... 1 .... @3same_logic
- VBIF_3s          1111 001 1 0 . 11 .... .... 0001 ... 1 .... @3same_logic
+ typedef struct ARMCPRegInfo ARMCPRegInfo;
+diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
 +VQSUB_S_3s       1111 001 0 0 . .. .... .... 0010 . . . 1 .... @3same
 +VQSUB_U_3s       1111 001 1 0 . .. .... .... 0010 . . . 1 .... @3same
 +
  VCGT_S_3s        1111 001 0 0 . .. .... .... 0011 . . . 0 .... @3same
  VCGT_U_3s        1111 001 1 0 . .. .... .... 0011 . . . 0 .... @3same
  VCGE_S_3s        1111 001 0 0 . .. .... .... 0011 . . . 1 .... @3same
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-neon.inc.c
+--- a/target/arm/op_helper.c
-+++ b/target/arm/translate-neon.inc.c
++++ b/target/arm/op_helper.c
-@@ -XXX,XX +XXX,XX @@ static void gen_VTST_3s(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+@@ -XXX,XX +XXX,XX @@ const void *HELPER(access_check_cp_reg)(CPUARMState *env, uint32_t key,
-     tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &cmtst_op[vece]);
+     case CP_ACCESS_TRAP:
- }
+         break;
- DO_3SAME_NO_SZ_3(VTST, gen_VTST_3s)
+     case CP_ACCESS_TRAP_UNCATEGORIZED:
-+
++        /* Only CP_ACCESS_TRAP traps are direct to a specified EL */
-+#define DO_3SAME_GVEC4(INSN, OPARRAY)                                   \
++        assert((res & CP_ACCESS_EL_MASK) == 0);
-+    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
+         if (cpu_isar_feature(aa64_ids, cpu) && isread &&
-+                                uint32_t rn_ofs, uint32_t rm_ofs,       \
+             arm_cpreg_in_idspace(ri)) {
-+                                uint32_t oprsz, uint32_t maxsz)         \
+             /*
 +    {                                                                   \
 +        tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),           \
 +                       rn_ofs, rm_ofs, oprsz, maxsz, &OPARRAY[vece]);   \
 +    }                                                                   \
 +    DO_3SAME(INSN, gen_##INSN##_3s)
 +
 +DO_3SAME_GVEC4(VQADD_S, sqadd_op)
 +DO_3SAME_GVEC4(VQADD_U, uqadd_op)
 +DO_3SAME_GVEC4(VQSUB_S, sqsub_op)
 +DO_3SAME_GVEC4(VQSUB_U, uqsub_op)
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
              }
              return 1;
 -        case NEON_3R_VQADD:
 -            tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
 -                           rn_ofs, rm_ofs, vec_size, vec_size,
 -                           (u ? uqadd_op : sqadd_op) + size);
 -            return 0;
 -
 -        case NEON_3R_VQSUB:
 -            tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
 -                           rn_ofs, rm_ofs, vec_size, vec_size,
 -                           (u ? uqsub_op : sqsub_op) + size);
 -            return 0;
 -
          case NEON_3R_VMUL: /* VMUL */
              if (u) {
                  /* Polynomial case allows only P8.  */
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          case NEON_3R_VTST_VCEQ:
          case NEON_3R_VCGT:
          case NEON_3R_VCGE:
 +        case NEON_3R_VQADD:
 +        case NEON_3R_VQSUB:
              /* Already handled by decodetree */
              return 1;
          }
 --
-.20.1
+.34.1
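
Note on the enum cleanup above: the access result packs the kind of trap in the upper bits and an optional explicit target EL in the low two bits, and the new assert relies on "uncategorized" results never carrying a target EL. The sketch below models that encoding. The ACCESS_* names are hypothetical. The (2 << 2) value and the "| 2" / "| 3" pattern mirror the patch, while the values given to ACCESS_TRAP and ACCESS_EL_MASK are assumptions made for illustration.

```c
#include <stdbool.h>

enum access_result {
    ACCESS_OK                 = 0,
    ACCESS_EL_MASK            = 3,          /* assumed: low two bits */
    ACCESS_TRAP               = (1 << 2),   /* assumed value */
    ACCESS_TRAP_EL2           = ACCESS_TRAP | 2,
    ACCESS_TRAP_EL3           = ACCESS_TRAP | 3,
    ACCESS_TRAP_UNCATEGORIZED = (2 << 2),   /* as in the patch */
};

/* Explicit target EL requested by the result; 0 means "usual target". */
static int target_el(int res)
{
    return res & ACCESS_EL_MASK;
}

/*
 * After the cleanup, an uncategorized result with a target EL in the low
 * bits is no longer a legal value; this mirrors the added assert.
 */
static bool result_is_well_formed(int res)
{
    if ((res & ~ACCESS_EL_MASK) == ACCESS_TRAP_UNCATEGORIZED) {
        return target_el(res) == 0;
    }
    return true;
}
```

With only one kind of trap-to-EL2 value left, a caller that also detects a fine-grained trap to EL2 does not have to choose between two syndromes.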

-[PULL 30/39] target/arm: Convert Neon load/store multiple structures to decodetree
+[PULL 14/33] target/arm: Move do_coproc_insn() syndrome calculation earlier
-Convert the Neon "load/store multiple structures" insns to decodetree.
+Rearrange the code in do_coproc_insn() so that we calculate the
 syndrome value for a potential trap early; we're about to add a
 second check that wants this value earlier than where it is currently
 determined.
 (Specifically, a trap to EL2 because of HSTR_EL2 should take
 priority over an UNDEF to EL1, even when the UNDEF is because
 the register does not exist at all or because its ri->access
 bits non-configurably fail the access. So the check we put in
 for HSTR_EL2 trapping at EL1 (which needs the syndrome) is
 going to have to be done before the check "is the ARMCPRegInfo
 pointer NULL".)
 This commit is just code motion; the change to HSTR_EL2
 handling that will use the 'syndrome' variable is in a
 subsequent commit.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200430181003.21682-12-peter.maydell@linaro.org
+Tested-by: Fuad Tabba <tabba@google.com>
 Message-id: 20230130182459.3309057-5-peter.maydell@linaro.org
 Message-id: 20230127175507.2895013-5-peter.maydell@linaro.org
 ---
- target/arm/neon-ls.decode       |   7 ++
+ target/arm/translate.c | 83 +++++++++++++++++++++---------------------
- target/arm/translate-neon.inc.c | 124 ++++++++++++++++++++++++++++++++
+file changed, 41 insertions(+), 42 deletions(-)
  target/arm/translate.c          |  91 +----------------------
 files changed, 133 insertions(+), 89 deletions(-)
-diff --git a/target/arm/neon-ls.decode b/target/arm/neon-ls.decode
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/neon-ls.decode
-+++ b/target/arm/neon-ls.decode
-@@ -XXX,XX +XXX,XX @@
- #   0b1111_1001_xxx0_xxxx_xxxx_xxxx_xxxx_xxxx
- # This file works on the A32 encoding only; calling code for T32 has to
- # transform the insn into the A32 version first.
-+
-+%vd_dp  22:1 12:4
-+
-+# Neon load/store multiple structures
-+
-+VLDST_multiple 1111 0100 0 . l:1 0 rn:4 .... itype:4 size:2 align:2 rm:4 \
-+               vd=%vd_dp
-diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-neon.inc.c
-+++ b/target/arm/translate-neon.inc.c
-@@ -XXX,XX +XXX,XX @@ static bool trans_VFML_scalar(DisasContext *s, arg_VFML_scalar *a)
-                        gen_helper_gvec_fmlal_idx_a32);
-     return true;
- }
-+
-+static struct {
-+    int nregs;
-+    int interleave;
-+    int spacing;
-+} const neon_ls_element_type[11] = {
-+    {1, 4, 1},
-+    {1, 4, 2},
-+    {4, 1, 1},
-+    {2, 2, 2},
-+    {1, 3, 1},
-+    {1, 3, 2},
-+    {3, 1, 1},
-+    {1, 1, 1},
-+    {1, 2, 1},
-+    {1, 2, 2},
-+    {2, 1, 1}
-+};
-+
-+static void gen_neon_ldst_base_update(DisasContext *s, int rm, int rn,
-+                                      int stride)
-+{
-+    if (rm != 15) {
-+        TCGv_i32 base;
-+
-+        base = load_reg(s, rn);
-+        if (rm == 13) {
-+            tcg_gen_addi_i32(base, base, stride);
-+        } else {
-+            TCGv_i32 index;
-+            index = load_reg(s, rm);
-+            tcg_gen_add_i32(base, base, index);
-+            tcg_temp_free_i32(index);
-+        }
-+        store_reg(s, rn, base);
-+    }
-+}
-+
-+static bool trans_VLDST_multiple(DisasContext *s, arg_VLDST_multiple *a)
-+{
-+    /* Neon load/store multiple structures */
-+    int nregs, interleave, spacing, reg, n;
-+    MemOp endian = s->be_data;
-+    int mmu_idx = get_mem_index(s);
-+    int size = a->size;
-+    TCGv_i64 tmp64;
-+    TCGv_i32 addr, tmp;
-+
-+    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
-+        return false;
-+    }
-+
-+    /* UNDEF accesses to D16-D31 if they don't exist */
-+    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vd & 0x10)) {
-+        return false;
-+    }
-+    if (a->itype > 10) {
-+        return false;
-+    }
-+    /* Catch UNDEF cases for bad values of align field */
-+    switch (a->itype & 0xc) {
-+    case 4:
-+        if (a->align >= 2) {
-+            return false;
-+        }
-+        break;
-+    case 8:
-+        if (a->align == 3) {
-+            return false;
-+        }
-+        break;
-+    default:
-+        break;
-+    }
-+    nregs = neon_ls_element_type[a->itype].nregs;
-+    interleave = neon_ls_element_type[a->itype].interleave;
-+    spacing = neon_ls_element_type[a->itype].spacing;
-+    if (size == 3 && (interleave | spacing) != 1) {
-+        return false;
-+    }
-+
-+    if (!vfp_access_check(s)) {
-+        return true;
-+    }
-+
-+    /* For our purposes, bytes are always little-endian.  */
-+    if (size == 0) {
-+        endian = MO_LE;
-+    }
-+    /*
-+     * Consecutive little-endian elements from a single register
-+     * can be promoted to a larger little-endian operation.
-+     */
-+    if (interleave == 1 && endian == MO_LE) {
-+        size = 3;
-+    }
-+    tmp64 = tcg_temp_new_i64();
-+    addr = tcg_temp_new_i32();
-+    tmp = tcg_const_i32(1 << size);
-+    load_reg_var(s, addr, a->rn);
-+    for (reg = 0; reg < nregs; reg++) {
-+        for (n = 0; n < 8 >> size; n++) {
-+            int xs;
-+            for (xs = 0; xs < interleave; xs++) {
-+                int tt = a->vd + reg + spacing * xs;
-+
-+                if (a->l) {
-+                    gen_aa32_ld_i64(s, tmp64, addr, mmu_idx, endian | size);
-+                    neon_store_element64(tt, n, size, tmp64);
-+                } else {
-+                    neon_load_element64(tmp64, tt, n, size);
-+                    gen_aa32_st_i64(s, tmp64, addr, mmu_idx, endian | size);
-+                }
-+                tcg_gen_add_i32(addr, addr, tmp);
-+            }
-+        }
-+    }
-+    tcg_temp_free_i32(addr);
-+    tcg_temp_free_i32(tmp);
-+    tcg_temp_free_i64(tmp64);
-+
-+    gen_neon_ldst_base_update(s, a->rm, a->rn, nregs * interleave * 8);
-+    return true;
-+}
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static void gen_neon_trn_u16(TCGv_i32 t0, TCGv_i32 t1)
+@@ -XXX,XX +XXX,XX @@ static void do_coproc_insn(DisasContext *s, int cpnum, int is64,
- }
+     const ARMCPRegInfo *ri = get_arm_cp_reginfo(s->cp_regs, key);
+     TCGv_ptr tcg_ri = NULL;
+     bool need_exit_tb;
--static struct {
++    uint32_t syndrome;
--    int nregs;
++
--    int interleave;
++    /*
--    int spacing;
++     * Note that since we are an implementation which takes an
--} const neon_ls_element_type[11] = {
++     * exception on a trapped conditional instruction only if the
--    {1, 4, 1},
++     * instruction passes its condition code check, we can take
--    {1, 4, 2},
++     * advantage of the clause in the ARM ARM that allows us to set
--    {4, 1, 1},
++     * the COND field in the instruction to 0xE in all cases.
--    {2, 2, 2},
++     * We could fish the actual condition out of the insn (ARM)
--    {1, 3, 1},
++     * or the condexec bits (Thumb) but it isn't necessary.
--    {1, 3, 2},
++     */
--    {3, 1, 1},
++    switch (cpnum) {
--    {1, 1, 1},
++    case 14:
--    {1, 2, 1},
++        if (is64) {
--    {1, 2, 2},
++            syndrome = syn_cp14_rrt_trap(1, 0xe, opc1, crm, rt, rt2,
--    {2, 1, 1}
++                                         isread, false);
--};
++        } else {
 +            syndrome = syn_cp14_rt_trap(1, 0xe, opc1, opc2, crn, crm,
 +                                        rt, isread, false);
 +        }
 +        break;
 +    case 15:
 +        if (is64) {
 +            syndrome = syn_cp15_rrt_trap(1, 0xe, opc1, crm, rt, rt2,
 +                                         isread, false);
 +        } else {
 +            syndrome = syn_cp15_rt_trap(1, 0xe, opc1, opc2, crn, crm,
 +                                        rt, isread, false);
 +        }
 +        break;
 +    default:
 +        /*
 +         * ARMv8 defines that only coprocessors 14 and 15 exist,
 +         * so this can only happen if this is an ARMv7 or earlier CPU,
 +         * in which case the syndrome information won't actually be
 +         * guest visible.
 +         */
 +        assert(!arm_dc_feature(s, ARM_FEATURE_V8));
 +        syndrome = syn_uncategorized();
 +        break;
 +    }
      if (!ri) {
          /*
@@ -XXX,XX +XXX,XX @@ static void do_coproc_insn(DisasContext *s, int cpnum, int is64,
           * Note that on XScale all cp0..c13 registers do an access check
           * call in order to handle c15_cpar.
           */
 -        uint32_t syndrome;
 -
- /* Translate a NEON load/store element instruction.  Return nonzero if the
+-        /*
-    instruction is invalid.  */
+-         * Note that since we are an implementation which takes an
- static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
+-         * exception on a trapped conditional instruction only if the
- {
+-         * instruction passes its condition code check, we can take
-     int rd, rn, rm;
+-         * advantage of the clause in the ARM ARM that allows us to set
--    int op;
+-         * the COND field in the instruction to 0xE in all cases.
-     int nregs;
+-         * We could fish the actual condition out of the insn (ARM)
--    int interleave;
+-         * or the condexec bits (Thumb) but it isn't necessary.
--    int spacing;
+-         */
-     int stride;
+-        switch (cpnum) {
-     int size;
+-        case 14:
-     int reg;
+-            if (is64) {
-     int load;
+-                syndrome = syn_cp14_rrt_trap(1, 0xe, opc1, crm, rt, rt2,
--    int n;
+-                                             isread, false);
-     int vec_size;
+-            } else {
--    int mmu_idx;
+-                syndrome = syn_cp14_rt_trap(1, 0xe, opc1, opc2, crn, crm,
--    MemOp endian;
+-                                            rt, isread, false);
      TCGv_i32 addr;
      TCGv_i32 tmp;
 -    TCGv_i32 tmp2;
 -    TCGv_i64 tmp64;
      if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
          return 1;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
      rn = (insn >> 16) & 0xf;
      rm = insn & 0xf;
      load = (insn & (1 << 21)) != 0;
 -    endian = s->be_data;
 -    mmu_idx = get_mem_index(s);
      if ((insn & (1 << 23)) == 0) {
 -        /* Load store all elements.  */
 -        op = (insn >> 8) & 0xf;
 -        size = (insn >> 6) & 3;
 -        if (op > 10)
 -            return 1;
 -        /* Catch UNDEF cases for bad values of align field */
 -        switch (op & 0xc) {
 -        case 4:
 -            if (((insn >> 5) & 1) == 1) {
 -                return 1;
 -            }
 -            break;
--        case 8:
+-        case 15:
--            if (((insn >> 4) & 3) == 3) {
+-            if (is64) {
--                return 1;
+-                syndrome = syn_cp15_rrt_trap(1, 0xe, opc1, crm, rt, rt2,
 -                                             isread, false);
 -            } else {
 -                syndrome = syn_cp15_rt_trap(1, 0xe, opc1, opc2, crn, crm,
 -                                            rt, isread, false);
 -            }
 -            break;
 -        default:
+-            /*
+-             * ARMv8 defines that only coprocessors 14 and 15 exist,
+-             * so this can only happen if this is an ARMv7 or earlier CPU,
+-             * in which case the syndrome information won't actually be
+-             * guest visible.
+-             */
+-            assert(!arm_dc_feature(s, ARM_FEATURE_V8));
+-            syndrome = syn_uncategorized();
 -            break;
 -        }
--        nregs = neon_ls_element_type[op].nregs;
--        interleave = neon_ls_element_type[op].interleave;
--        spacing = neon_ls_element_type[op].spacing;
--        if (size == 3 && (interleave | spacing) != 1) {
--            return 1;
--        }
--        /* For our purposes, bytes are always little-endian.  */
--        if (size == 0) {
--            endian = MO_LE;
--        }
--        /* Consecutive little-endian elements from a single register
--         * can be promoted to a larger little-endian operation.
--         */
--        if (interleave == 1 && endian == MO_LE) {
--            size = 3;
--        }
--        tmp64 = tcg_temp_new_i64();
--        addr = tcg_temp_new_i32();
--        tmp2 = tcg_const_i32(1 << size);
--        load_reg_var(s, addr, rn);
--        for (reg = 0; reg < nregs; reg++) {
--            for (n = 0; n < 8 >> size; n++) {
--                int xs;
--                for (xs = 0; xs < interleave; xs++) {
--                    int tt = rd + reg + spacing * xs;
 -
--                    if (load) {
+         gen_set_condexec(s);
--                        gen_aa32_ld_i64(s, tmp64, addr, mmu_idx, endian | size);
+         gen_update_pc(s, 0);
--                        neon_store_element64(tt, n, size, tmp64);
+         tcg_ri = tcg_temp_new_ptr();
 -                    } else {
 -                        neon_load_element64(tmp64, tt, n, size);
 -                        gen_aa32_st_i64(s, tmp64, addr, mmu_idx, endian | size);
 -                    }
 -                    tcg_gen_add_i32(addr, addr, tmp2);
 -                }
 -            }
 -        }
 -        tcg_temp_free_i32(addr);
 -        tcg_temp_free_i32(tmp2);
 -        tcg_temp_free_i64(tmp64);
 -        stride = nregs * interleave * 8;
 +        /* Load store all elements -- handled already by decodetree */
 +        return 1;
      } else {
          size = (insn >> 10) & 3;
          if (size == 3) {
 --
-.20.1
+.34.1

-[PULL 29/39] target/arm: Convert VFM[AS]L (scalar) to decodetree
+[PULL 15/33] target/arm: All UNDEF-at-EL0 traps take priority over HSTR_EL2 traps
-Convert the VFM[AS]L (scalar) insns in the 2reg-scalar-ext group
+The HSTR_EL2 register has a collection of trap bits which allow
-to decodetree. These are the last ones in the group so we can remove
+trapping to EL2 for AArch32 EL0 or EL1 accesses to coprocessor
-all the legacy decode for the group.
+registers.  The specification of these bits is that when the bit is
 set we should trap
  * EL1 accesses
  * EL0 accesses, if the access is not UNDEFINED when the
    trap bit is 0
-Note that in disas_thumb2_insn() the parts of this encoding space
+In other words, all UNDEF traps from EL0 to EL1 take precedence over
-where the decodetree decoder returns false will correctly be directed
+the HSTR_EL2 trap to EL2.  (Since this is all AArch32, the only kind
-to illegal_op by the "(insn & (1 << 28))" check so they won't fall
+of trap-to-EL1 is the UNDEF.)
-into disas_coproc_insn() by mistake.
 Our implementation doesn't quite get this right -- we check for traps
 in the order:
  * no such register
  * ARMCPRegInfo::access bits
  * HSTR_EL2 trap bits
  * ARMCPRegInfo::accessfn
 So UNDEFs that happen because of the access bits or because the
 register doesn't exist at all correctly take priority over the
 HSTR_EL2 trap, but where a register can UNDEF at EL0 because of the
 accessfn we are incorrectly always taking the HSTR_EL2 trap.  There
 aren't many of these, but one example is the PMCR; if you look at the
 access pseudocode for this register you can see that UNDEFs taken
 because of the value of PMUSERENR.EN are checked before the HSTR_EL2
 bit.
 Rearrange helper_access_check_cp_reg() so that we always call the
 accessfn, and use its return value if it indicates that the access
 traps from EL0 to EL1, rather than continuing to do the HSTR_EL2 check.
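As a rough illustration of the ordering described above, the following
standalone sketch models the checks for an AArch32 cp15 access from EL0
or EL1 as they stand after this patch. It is not the QEMU code: the
Outcome and Access types and check_access() are hypothetical names made
up for the example, and the HCR_EL2.{E2H,TGE} condition is ignored.

```c
#include <stdbool.h>

/* Hypothetical outcome codes for this sketch; not QEMU's CPAccessResult. */
typedef enum {
    OUTCOME_OK,
    OUTCOME_UNDEF_EL1,   /* UNDEF, i.e. the AArch32 trap to EL1 */
    OUTCOME_TRAP_EL2,
    OUTCOME_TRAP_EL3
} Outcome;

/* Hypothetical description of one coprocessor register access. */
typedef struct {
    int current_el;          /* 0 or 1 in this model */
    bool reg_exists;         /* false: "no such register" */
    bool access_bits_ok;     /* static access permission bits allow it */
    Outcome accessfn_result; /* OUTCOME_OK if there is no accessfn */
    bool hstr_bit_set;       /* relevant HSTR_EL2 trap bit is set */
} Access;

static Outcome check_access(const Access *a)
{
    /* UNDEFs for a missing register or forbidden access come first. */
    if (!a->reg_exists || !a->access_bits_ok) {
        return OUTCOME_UNDEF_EL1;
    }
    /* The accessfn is always consulted; an UNDEF at EL0 wins outright. */
    if (a->accessfn_result == OUTCOME_UNDEF_EL1 && a->current_el == 0) {
        return OUTCOME_UNDEF_EL1;
    }
    /* Otherwise the HSTR_EL2 trap is checked next. */
    if (a->hstr_bit_set) {
        return OUTCOME_TRAP_EL2;
    }
    /* No HSTR_EL2 trap: report whatever the accessfn decided. */
    return a->accessfn_result;
}
```

In this model, an EL0 access that the accessfn would UNDEF (the PMCR
case with PMUSERENR.EN clear) yields OUTCOME_UNDEF_EL1 even when the
HSTR_EL2 bit is set, while the same access from EL1 yields
OUTCOME_TRAP_EL2.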
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200430181003.21682-11-peter.maydell@linaro.org
+Tested-by: Fuad Tabba <tabba@google.com>
 Message-id: 20230130182459.3309057-6-peter.maydell@linaro.org
 Message-id: 20230127175507.2895013-6-peter.maydell@linaro.org
 ---
- target/arm/neon-shared.decode   |   7 +++
+ target/arm/op_helper.c | 21 ++++++++++++++++-----
- target/arm/translate-neon.inc.c |  32 ++++++++++
+file changed, 16 insertions(+), 5 deletions(-)
  target/arm/translate.c          | 107 +-------------------------------
 files changed, 40 insertions(+), 106 deletions(-)
-diff --git a/target/arm/neon-shared.decode b/target/arm/neon-shared.decode
+diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/neon-shared.decode
+--- a/target/arm/op_helper.c
-+++ b/target/arm/neon-shared.decode
++++ b/target/arm/op_helper.c
-@@ -XXX,XX +XXX,XX @@ VCMLA_scalar   1111 1110 1 . rot:2 .... .... 1000 . q:1 . 0 .... \
+@@ -XXX,XX +XXX,XX @@ const void *HELPER(access_check_cp_reg)(CPUARMState *env, uint32_t key,
+         goto fail;
- VDOT_scalar    1111 1110 0 . 10 .... .... 1101 . q:1 index:1 u:1 rm:4 \
+     }
-                vm=%vm_dp vn=%vn_dp vd=%vd_dp
-+
++    if (ri->accessfn) {
-+%vfml_scalar_q0_rm 0:3 5:1
++        res = ri->accessfn(env, ri, isread);
 +%vfml_scalar_q1_index 5:1 3:1
 +VFML_scalar    1111 1110 0 . 0 s:1 .... .... 1000 . 0 . 1 index:1 ... \
 +               rm=%vfml_scalar_q0_rm vn=%vn_sp vd=%vd_dp q=0
 +VFML_scalar    1111 1110 0 . 0 s:1 .... .... 1000 . 1 . 1 . rm:3 \
 +               index=%vfml_scalar_q1_index vn=%vn_dp vd=%vd_dp q=1
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VDOT_scalar(DisasContext *s, arg_VDOT_scalar *a)
      tcg_temp_free_ptr(fpst);
      return true;
  }
 +
 +static bool trans_VFML_scalar(DisasContext *s, arg_VFML_scalar *a)
 +{
 +    int opr_sz;
 +
 +    if (!dc_isar_feature(aa32_fhm, s)) {
 +        return false;
 +    }
 +
-+    /* UNDEF accesses to D16-D31 if they don't exist. */
+     /*
-+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+-     * Check for an EL2 trap due to HSTR_EL2. We expect EL0 accesses
-+        ((a->vd & 0x10) || (a->q && (a->vn & 0x10)))) {
+-     * to sysregs non accessible at EL0 to have UNDEF-ed already.
-+        return false;
++     * If the access function indicates a trap from EL0 to EL1 then
 +     * that always takes priority over the HSTR_EL2 trap. (If it indicates
 +     * a trap to EL3, then the HSTR_EL2 trap takes priority; if it indicates
 +     * a trap to EL2, then the syndrome is the same either way so we don't
 +     * care whether technically the architecture says that HSTR_EL2 trap or
 +     * the other trap takes priority. So we take the "check HSTR_EL2" path
 +     * for all of those cases.)
       */
 +    if (res != CP_ACCESS_OK && ((res & CP_ACCESS_EL_MASK) == 0) &&
 +        arm_current_el(env) == 0) {
 +        goto fail;
 +    }
 +
-+    if (a->vd & a->q) {
+     if (!is_a64(env) && arm_current_el(env) < 2 && ri->cp == 15 &&
-+        return false;
+         (arm_hcr_el2_eff(env) & (HCR_E2H | HCR_TGE)) != (HCR_E2H | HCR_TGE)) {
-+    }
+         uint32_t mask = 1 << ri->crn;
-+
+@@ -XXX,XX +XXX,XX @@ const void *HELPER(access_check_cp_reg)(CPUARMState *env, uint32_t key,
-+    if (!vfp_access_check(s)) {
+         }
-+        return true;
+     }
-+    }
-+
+-    if (ri->accessfn) {
-+    opr_sz = (1 + a->q) * 8;
+-        res = ri->accessfn(env, ri, isread);
 +    tcg_gen_gvec_3_ptr(vfp_reg_offset(1, a->vd),
 +                       vfp_reg_offset(a->q, a->vn),
 +                       vfp_reg_offset(a->q, a->rm),
 +                       cpu_env, opr_sz, opr_sz,
 +                       (a->index << 2) | a->s, /* is_2 == 0 */
 +                       gen_helper_gvec_fmlal_idx_a32);
 +    return true;
 +}
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_dsp_insn(DisasContext *s, uint32_t insn)
  }
  #define VFP_REG_SHR(x, n) (((n) > 0) ? (x) >> (n) : (x) << -(n))
 -#define VFP_SREG(insn, bigbit, smallbit) \
 -  ((VFP_REG_SHR(insn, bigbit - 1) & 0x1e) | (((insn) >> (smallbit)) & 1))
  #define VFP_DREG(reg, insn, bigbit, smallbit) do { \
      if (dc_isar_feature(aa32_simd_r32, s)) { \
          reg = (((insn) >> (bigbit)) & 0x0f) \
@@ -XXX,XX +XXX,XX @@ static int disas_dsp_insn(DisasContext *s, uint32_t insn)
          reg = ((insn) >> (bigbit)) & 0x0f; \
      }} while (0)
 -#define VFP_SREG_D(insn) VFP_SREG(insn, 12, 22)
  #define VFP_DREG_D(reg, insn) VFP_DREG(reg, insn, 12, 22)
 -#define VFP_SREG_N(insn) VFP_SREG(insn, 16,  7)
  #define VFP_DREG_N(reg, insn) VFP_DREG(reg, insn, 16,  7)
 -#define VFP_SREG_M(insn) VFP_SREG(insn,  0,  5)
  #define VFP_DREG_M(reg, insn) VFP_DREG(reg, insn,  0,  5)
  static void gen_neon_dup_low16(TCGv_i32 var)
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
      return 0;
  }
 -/* Advanced SIMD two registers and a scalar extension.
 - *  31             24   23  22   20   16   12  11   10   9    8        3     0
 - * +-----------------+----+---+----+----+----+---+----+---+----+---------+----+
 - * | 1 1 1 1 1 1 1 0 | o1 | D | o2 | Vn | Vd | 1 | o3 | 0 | o4 | N Q M U | Vm |
 - * +-----------------+----+---+----+----+----+---+----+---+----+---------+----+
 - *
 - */
 -
 -static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
 -{
 -    gen_helper_gvec_3 *fn_gvec = NULL;
 -    gen_helper_gvec_3_ptr *fn_gvec_ptr = NULL;
 -    int rd, rn, rm, opr_sz, data;
 -    int off_rn, off_rm;
 -    bool is_long = false, q = extract32(insn, 6, 1);
 -    bool ptr_is_env = false;
 -
 -    if ((insn & 0xffa00f10) == 0xfe000810) {
 -        /* VFM[AS]L -- 1111 1110 0.0S .... .... 1000 .Q.1 .... */
 -        int is_s = extract32(insn, 20, 1);
 -        int vm20 = extract32(insn, 0, 3);
 -        int vm3 = extract32(insn, 3, 1);
 -        int m = extract32(insn, 5, 1);
 -        int index;
 -
 -        if (!dc_isar_feature(aa32_fhm, s)) {
 -            return 1;
 -        }
 -        if (q) {
 -            rm = vm20;
 -            index = m * 2 + vm3;
 -        } else {
 -            rm = vm20 * 2 + m;
 -            index = vm3;
 -        }
 -        is_long = true;
 -        data = (index << 2) | is_s; /* is_2 == 0 */
 -        fn_gvec_ptr = gen_helper_gvec_fmlal_idx_a32;
 -        ptr_is_env = true;
 -    } else {
 -        return 1;
 -    }
--
+     if (likely(res == CP_ACCESS_OK)) {
--    VFP_DREG_D(rd, insn);
+         return ri;
 -    if (rd & q) {
 -        return 1;
 -    }
 -    if (q || !is_long) {
 -        VFP_DREG_N(rn, insn);
 -        if (rn & q & !is_long) {
 -            return 1;
 -        }
 -        off_rn = vfp_reg_offset(1, rn);
 -        off_rm = vfp_reg_offset(1, rm);
 -    } else {
 -        rn = VFP_SREG_N(insn);
 -        off_rn = vfp_reg_offset(0, rn);
 -        off_rm = vfp_reg_offset(0, rm);
 -    }
 -    if (s->fp_excp_el) {
 -        gen_exception_insn(s, s->pc_curr, EXCP_UDEF,
 -                           syn_simd_access_trap(1, 0xe, false), s->fp_excp_el);
 -        return 0;
 -    }
 -    if (!s->vfp_enabled) {
 -        return 1;
 -    }
 -
 -    opr_sz = (1 + q) * 8;
 -    if (fn_gvec_ptr) {
 -        TCGv_ptr ptr;
 -        if (ptr_is_env) {
 -            ptr = cpu_env;
 -        } else {
 -            ptr = get_fpstatus_ptr(1);
 -        }
 -        tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd), off_rn, off_rm, ptr,
 -                           opr_sz, opr_sz, data, fn_gvec_ptr);
 -        if (!ptr_is_env) {
 -            tcg_temp_free_ptr(ptr);
 -        }
 -    } else {
 -        tcg_gen_gvec_3_ool(vfp_reg_offset(1, rd), off_rn, off_rm,
 -                           opr_sz, opr_sz, data, fn_gvec);
 -    }
 -    return 0;
 -}
 -
  static int disas_coproc_insn(DisasContext *s, uint32_t insn)
  {
      int cpnum, is64, crn, crm, opc1, opc2, isread, rt, rt2;
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
                      }
                  }
              }
 -        } else if ((insn & 0x0f000a00) == 0x0e000800
 -                   && arm_dc_feature(s, ARM_FEATURE_V8)) {
 -            if (disas_neon_insn_2reg_scalar_ext(s, insn)) {
 -                goto illegal_op;
 -            }
 -            return;
          }
          goto illegal_op;
      }
-@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
-             }
-             break;
-         }
--        if ((insn & 0xff000a00) == 0xfe000800
--            && arm_dc_feature(s, ARM_FEATURE_V8)) {
--            /* The Thumb2 and ARM encodings are identical.  */
--            if (disas_neon_insn_2reg_scalar_ext(s, insn)) {
--                goto illegal_op;
--            }
--        } else if (((insn >> 24) & 3) == 3) {
-+        if (((insn >> 24) & 3) == 3) {
-             /* Translate into the equivalent ARM encoding.  */
-             insn = (insn & 0xe2ffffff) | ((insn & (1 << 28)) >> 4) | (1 << 28);
-             if (disas_neon_data_insn(s, insn)) {
 --
-.20.1
+.34.1

-[PULL 22/39] target/arm: Add stubs for AArch32 Neon decodetree
+[PULL 16/33] target/arm: Make HSTR_EL2 traps take priority over UNDEF-at-EL1
-Add the infrastructure for building and invoking a decodetree decoder
+The semantics of HSTR_EL2 require that it traps cpreg accesses
-for the AArch32 Neon encodings.  At the moment the new decoder covers
+to EL2 for:
-nothing, so we always fall back to the existing hand-written decode.
+ * EL1 accesses
  * EL0 accesses, if the access is not UNDEFINED when the
    trap bit is 0
-We follow the same pattern we did for the VFP decodetree conversion
+(You can see this in the I_ZFGJP priority ordering, where HSTR_EL2
-(commit 78e138bc1f672c145ef6ace74617d and following): code that deals
+traps from EL1 to EL2 are priority 12, UNDEFs are priority 13, and
-with Neon will be moving gradually out to translate-neon.vfp.inc,
+HSTR_EL2 traps from EL0 are priority 15.)
 which we #include into translate.c.
-In order to share the decode files between A32 and T32, we
+However, we don't get this right for EL1 accesses which UNDEF because
-split Neon into 3 parts:
+the register doesn't exist at all or because its ri->access bits
- * data-processing
+non-configurably forbid the access.  At EL1, check for the HSTR_EL2
- * load-store
+trap early, before either of these UNDEF reasons.
  * 'shared' encodings
-The first two groups of instructions have similar but not identical
+We have to retain the HSTR_EL2 check in access_check_cp_reg(),
-A32 and T32 encodings, so we need to manually transform the T32
+because at EL0 any kind of UNDEF-to-EL1 (including "no such
-encoding into the A32 one before calling the decoder; the third group
+register", "bad ri->access" and "ri->accessfn returns 'trap to EL1'")
-covers the Neon instructions which are identical in A32 and T32.
+takes precedence over the trap to EL2.  But we only need to do that
 check for EL0 now.
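The early EL1 check amounts to picking one HSTR_EL2 bit and testing it.
A minimal sketch of that selection, mirroring the maskbit logic in the
translate.c hunk of this patch, might look like the following. The
helper name hstr_traps_el1_access() is hypothetical, and the real code
emits TCG ops rather than evaluating the register at translate time.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: would an EL1 cp15 access be trapped to EL2 by
 * the given HSTR_EL2 value? 64-bit accesses are selected by CRm and
 * 32-bit accesses by CRn, as in the patch.
 */
static bool hstr_traps_el1_access(uint32_t hstr_el2, bool is64,
                                  unsigned crn, unsigned crm)
{
    unsigned maskbit = is64 ? crm : crn;

    /* T4 and T14 are RES0 so never cause traps */
    if (maskbit == 4 || maskbit == 14) {
        return false;
    }
    return (hstr_el2 & (1u << maskbit)) != 0;
}
```

For instance, with only bit 9 of HSTR_EL2 set, a 32-bit access with
CRn == 9 traps, as does a 64-bit access with CRm == 9; an access
selected by bit 4 or bit 14 never traps, whatever the register holds.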
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Tested-by: Fuad Tabba <tabba@google.com>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200430181003.21682-4-peter.maydell@linaro.org
+Message-id: 20230130182459.3309057-7-peter.maydell@linaro.org
 Message-id: 20230127175507.2895013-7-peter.maydell@linaro.org
 ---
- target/arm/neon-dp.decode       | 29 ++++++++++++++++++++++++++
+ target/arm/op_helper.c |  6 +++++-
- target/arm/neon-ls.decode       | 29 ++++++++++++++++++++++++++
+ target/arm/translate.c | 28 +++++++++++++++++++++++++++-
- target/arm/neon-shared.decode   | 27 +++++++++++++++++++++++++
+files changed, 32 insertions(+), 2 deletions(-)
  target/arm/translate-neon.inc.c | 32 +++++++++++++++++++++++++++++
  target/arm/translate.c          | 36 +++++++++++++++++++++++++++++++--
  target/arm/Makefile.objs        | 18 +++++++++++++++++
 files changed, 169 insertions(+), 2 deletions(-)
  create mode 100644 target/arm/neon-dp.decode
  create mode 100644 target/arm/neon-ls.decode
  create mode 100644 target/arm/neon-shared.decode
  create mode 100644 target/arm/translate-neon.inc.c
-diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
+diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
-new file mode 100644
+index XXXXXXX..XXXXXXX 100644
-index XXXXXXX..XXXXXXX
+--- a/target/arm/op_helper.c
---- /dev/null
++++ b/target/arm/op_helper.c
-+++ b/target/arm/neon-dp.decode
+@@ -XXX,XX +XXX,XX @@ const void *HELPER(access_check_cp_reg)(CPUARMState *env, uint32_t key,
-@@ -XXX,XX +XXX,XX @@
+         goto fail;
-+# AArch32 Neon data-processing instruction descriptions
+     }
-+#
-+#  Copyright (c) 2020 Linaro, Ltd
+-    if (!is_a64(env) && arm_current_el(env) < 2 && ri->cp == 15 &&
-+#
++    /*
-+# This library is free software; you can redistribute it and/or
++     * HSTR_EL2 traps from EL1 are checked earlier, in generated code;
-+# modify it under the terms of the GNU Lesser General Public
++     * we only need to check here for traps from EL0.
-+# License as published by the Free Software Foundation; either
++     */
-+# version 2 of the License, or (at your option) any later version.
++    if (!is_a64(env) && arm_current_el(env) == 0 && ri->cp == 15 &&
-+#
+         (arm_hcr_el2_eff(env) & (HCR_E2H | HCR_TGE)) != (HCR_E2H | HCR_TGE)) {
-+# This library is distributed in the hope that it will be useful,
+         uint32_t mask = 1 << ri->crn;
-+# but WITHOUT ANY WARRANTY; without even the implied warranty of
 +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 +# Lesser General Public License for more details.
 +#
 +# You should have received a copy of the GNU Lesser General Public
 +# License along with this library; if not, see <http://www.gnu.org/licenses/>.
 +
 +#
 +# This file is processed by scripts/decodetree.py
 +#
 +
 +# Encodings for Neon data processing instructions where the T32 encoding
 +# is a simple transformation of the A32 encoding.
 +# More specifically, this file covers instructions where the A32 encoding is
 +#   0b1111_001p_qqqq_qqqq_qqqq_qqqq_qqqq_qqqq
 +# and the T32 encoding is
 +#   0b111p_1111_qqqq_qqqq_qqqq_qqqq_qqqq_qqqq
 +# This file works on the A32 encoding only; calling code for T32 has to
 +# transform the insn into the A32 version first.
 diff --git a/target/arm/neon-ls.decode b/target/arm/neon-ls.decode
 new file mode 100644
 index XXXXXXX..XXXXXXX
 --- /dev/null
 +++ b/target/arm/neon-ls.decode
@@ -XXX,XX +XXX,XX @@
 +# AArch32 Neon load/store instruction descriptions
 +#
 +#  Copyright (c) 2020 Linaro, Ltd
 +#
 +# This library is free software; you can redistribute it and/or
 +# modify it under the terms of the GNU Lesser General Public
 +# License as published by the Free Software Foundation; either
 +# version 2 of the License, or (at your option) any later version.
 +#
 +# This library is distributed in the hope that it will be useful,
 +# but WITHOUT ANY WARRANTY; without even the implied warranty of
 +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 +# Lesser General Public License for more details.
 +#
 +# You should have received a copy of the GNU Lesser General Public
 +# License along with this library; if not, see <http://www.gnu.org/licenses/>.
 +
 +#
 +# This file is processed by scripts/decodetree.py
 +#
 +
 +# Encodings for Neon load/store instructions where the T32 encoding
 +# is a simple transformation of the A32 encoding.
 +# More specifically, this file covers instructions where the A32 encoding is
 +#   0b1111_0100_xxx0_xxxx_xxxx_xxxx_xxxx_xxxx
 +# and the T32 encoding is
 +#   0b1111_1001_xxx0_xxxx_xxxx_xxxx_xxxx_xxxx
 +# This file works on the A32 encoding only; calling code for T32 has to
 +# transform the insn into the A32 version first.
 diff --git a/target/arm/neon-shared.decode b/target/arm/neon-shared.decode
 new file mode 100644
 index XXXXXXX..XXXXXXX
 --- /dev/null
 +++ b/target/arm/neon-shared.decode
@@ -XXX,XX +XXX,XX @@
 +# AArch32 Neon instruction descriptions
 +#
 +#  Copyright (c) 2020 Linaro, Ltd
 +#
 +# This library is free software; you can redistribute it and/or
 +# modify it under the terms of the GNU Lesser General Public
 +# License as published by the Free Software Foundation; either
 +# version 2 of the License, or (at your option) any later version.
 +#
 +# This library is distributed in the hope that it will be useful,
 +# but WITHOUT ANY WARRANTY; without even the implied warranty of
 +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 +# Lesser General Public License for more details.
 +#
 +# You should have received a copy of the GNU Lesser General Public
 +# License along with this library; if not, see <http://www.gnu.org/licenses/>.
 +
 +#
 +# This file is processed by scripts/decodetree.py
 +#
 +
 +# Encodings for Neon instructions whose encoding is the same for
 +# both A32 and T32.
 +
 +# More specifically, this covers:
 +# 2reg scalar ext: 0b1111_1110_xxxx_xxxx_xxxx_1x0x_xxxx_xxxx
 +# 3same ext:       0b1111_110x_xxxx_xxxx_xxxx_1x0x_xxxx_xxxx
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 new file mode 100644
 index XXXXXXX..XXXXXXX
 --- /dev/null
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@
 +/*
 + *  ARM translation: AArch32 Neon instructions
 + *
 + *  Copyright (c) 2003 Fabrice Bellard
 + *  Copyright (c) 2005-2007 CodeSourcery
 + *  Copyright (c) 2007 OpenedHand, Ltd.
 + *  Copyright (c) 2020 Linaro, Ltd.
 + *
 + * This library is free software; you can redistribute it and/or
 + * modify it under the terms of the GNU Lesser General Public
 + * License as published by the Free Software Foundation; either
 + * version 2 of the License, or (at your option) any later version.
 + *
 + * This library is distributed in the hope that it will be useful,
 + * but WITHOUT ANY WARRANTY; without even the implied warranty of
 + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 + * Lesser General Public License for more details.
 + *
 + * You should have received a copy of the GNU Lesser General Public
 + * License along with this library; if not, see <http://www.gnu.org/licenses/>.
 + */
 +
 +/*
 + * This file is intended to be included from translate.c; it uses
 + * some macros and definitions provided by that file.
 + * It might be possible to convert it to a standalone .c file eventually.
 + */
 +
 +/* Include the generated Neon decoder */
 +#include "decode-neon-dp.inc.c"
 +#include "decode-neon-ls.inc.c"
 +#include "decode-neon-shared.inc.c"
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static TCGv_ptr vfp_reg_ptr(bool dp, int reg)
+@@ -XXX,XX +XXX,XX @@ static void do_coproc_insn(DisasContext *s, int cpnum, int is64,
+         break;
  #define ARM_CP_RW_BIT   (1 << 20)
 -/* Include the VFP decoder */
 +/* Include the VFP and Neon decoders */
  #include "translate-vfp.inc.c"
 +#include "translate-neon.inc.c"
  static inline void iwmmxt_load_reg(TCGv_i64 var, int reg)
  {
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
          /* Unconditional instructions.  */
          /* TODO: Perhaps merge these into one decodetree output file.  */
          if (disas_a32_uncond(s, insn) ||
 -            disas_vfp_uncond(s, insn)) {
 +            disas_vfp_uncond(s, insn) ||
 +            disas_neon_dp(s, insn) ||
 +            disas_neon_ls(s, insn) ||
 +            disas_neon_shared(s, insn)) {
              return;
          }
          /* fall back to legacy decoder */
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
          ARCH(6T2);
      }
-+    if ((insn & 0xef000000) == 0xef000000) {
++    if (s->hstr_active && cpnum == 15 && s->current_el == 1) {
 +        /*
-+         * T32 encodings 0b111p_1111_qqqq_qqqq_qqqq_qqqq_qqqq_qqqq
++         * At EL1, check for a HSTR_EL2 trap, which must take precedence
-+         * transform into
++         * over the UNDEF for "no such register" or the UNDEF for "access
-+         * A32 encodings 0b1111_001p_qqqq_qqqq_qqqq_qqqq_qqqq_qqqq
++         * permissions forbid this EL1 access". HSTR_EL2 traps from EL0
 +         * only happen if the cpreg doesn't UNDEF at EL0, so we do those in
 +         * access_check_cp_reg(), after the checks for whether the access
 +         * configurably trapped to EL1.
 +         */
-+        uint32_t a32_insn = (insn & 0xe2ffffff) |
++        uint32_t maskbit = is64 ? crm : crn;
 +            ((insn & (1 << 28)) >> 4) | (1 << 28);
 +
-+        if (disas_neon_dp(s, a32_insn)) {
++        if (maskbit != 4 && maskbit != 14) {
-+            return;
++            /* T4 and T14 are RES0 so never cause traps */
 +            TCGv_i32 t;
 +            DisasLabel over = gen_disas_label(s);
 +
 +            t = load_cpu_offset(offsetoflow32(CPUARMState, cp15.hstr_el2));
 +            tcg_gen_andi_i32(t, t, 1u << maskbit);
 +            tcg_gen_brcondi_i32(TCG_COND_EQ, t, 0, over.label);
 +            tcg_temp_free_i32(t);
 +
 +            gen_exception_insn(s, 0, EXCP_UDEF, syndrome);
 +            set_disas_label(s, over);
 +        }
 +    }
 +
-+    if ((insn & 0xff100000) == 0xf9000000) {
+     if (!ri) {
-+        /*
+         /*
-+         * T32 encodings 0b1111_1001_ppp0_qqqq_qqqq_qqqq_qqqq_qqqq
+          * Unknown register; this might be a guest error or a QEMU
-+         * transform into
+@@ -XXX,XX +XXX,XX @@ static void do_coproc_insn(DisasContext *s, int cpnum, int is64,
 +         * A32 encodings 0b1111_0100_ppp0_qqqq_qqqq_qqqq_qqqq_qqqq
 +         */
 +        uint32_t a32_insn = (insn & 0x00ffffff) | 0xf4000000;
 +
 +        if (disas_neon_ls(s, a32_insn)) {
 +            return;
 +        }
 +    }
 +
      /*
       * TODO: Perhaps merge these into one decodetree output file.
       * Note disas_vfp is written for a32 with cond field in the
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
       */
      if (disas_t32(s, insn) ||
          disas_vfp_uncond(s, insn) ||
 +        disas_neon_shared(s, insn) ||
          ((insn >> 28) == 0xe && disas_vfp(s, insn))) {
          return;
      }
-diff --git a/target/arm/Makefile.objs b/target/arm/Makefile.objs
-index XXXXXXX..XXXXXXX 100644
+-    if (s->hstr_active || ri->accessfn ||
---- a/target/arm/Makefile.objs
++    if ((s->hstr_active && s->current_el == 0) || ri->accessfn ||
-+++ b/target/arm/Makefile.objs
+         (arm_dc_feature(s, ARM_FEATURE_XSCALE) && cpnum < 14)) {
-@@ -XXX,XX +XXX,XX @@ target/arm/decode-sve.inc.c: $(SRC_PATH)/target/arm/sve.decode $(DECODETREE)
+         /*
-       $(PYTHON) $(DECODETREE) --decode disas_sve -o $@ $<,\
+          * Emit code to perform further access permissions checks at
        "GEN", $(TARGET_DIR)$@)
 +target/arm/decode-neon-shared.inc.c: $(SRC_PATH)/target/arm/neon-shared.decode $(DECODETREE)
 +    $(call quiet-command,\
 +      $(PYTHON) $(DECODETREE) --static-decode disas_neon_shared -o $@ $<,\
 +      "GEN", $(TARGET_DIR)$@)
 +
 +target/arm/decode-neon-dp.inc.c: $(SRC_PATH)/target/arm/neon-dp.decode $(DECODETREE)
 +    $(call quiet-command,\
 +      $(PYTHON) $(DECODETREE) --static-decode disas_neon_dp -o $@ $<,\
 +      "GEN", $(TARGET_DIR)$@)
 +
 +target/arm/decode-neon-ls.inc.c: $(SRC_PATH)/target/arm/neon-ls.decode $(DECODETREE)
 +    $(call quiet-command,\
 +      $(PYTHON) $(DECODETREE) --static-decode disas_neon_ls -o $@ $<,\
 +      "GEN", $(TARGET_DIR)$@)
 +
  target/arm/decode-vfp.inc.c: $(SRC_PATH)/target/arm/vfp.decode $(DECODETREE)
      $(call quiet-command,\
        $(PYTHON) $(DECODETREE) --static-decode disas_vfp -o $@ $<,\
@@ -XXX,XX +XXX,XX @@ target/arm/decode-t16.inc.c: $(SRC_PATH)/target/arm/t16.decode $(DECODETREE)
        "GEN", $(TARGET_DIR)$@)
  target/arm/translate-sve.o: target/arm/decode-sve.inc.c
 +target/arm/translate.o: target/arm/decode-neon-shared.inc.c
 +target/arm/translate.o: target/arm/decode-neon-dp.inc.c
 +target/arm/translate.o: target/arm/decode-neon-ls.inc.c
  target/arm/translate.o: target/arm/decode-vfp.inc.c
  target/arm/translate.o: target/arm/decode-vfp-uncond.inc.c
  target/arm/translate.o: target/arm/decode-a32.inc.c
 --
-.20.1
+.34.1

-[PULL 05/39] target/arm: Add new 's1_is_el0' argument to get_phys_addr_lpae()
+[PULL 17/33] target/arm: Disable HSTR_EL2 traps if EL2 is not enabled
-For ARMv8.2-TTS2UXN, the stage 2 page table walk wants to know
+The HSTR_EL2 register is not supposed to have an effect unless EL2 is
-whether the stage 1 access is for EL0 or not, because whether
+enabled in the current security state.  We weren't checking for this,
-exec permission is given can depend on whether this is an EL0
+which meant that if the guest set up the HSTR_EL2 register we would
-or EL1 access. Add a new argument to get_phys_addr_lpae() so
+incorrectly trap even for accesses from Secure EL0 and EL1.
 the call sites can pass this information in.
-Since get_phys_addr_lpae() doesn't already have a doc comment,
+Add the missing checks. (Other places where we look at HSTR_EL2
-add one so we have a place to put the documentation of the
+for the not-in-v8A bits TTEE and TJDBX are already checking that
-semantics of the new s1_is_el0 argument.
+we are in NS EL0 or EL1, so there we already know EL2 is enabled.)
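The condition described above can be sketched in isolation as follows. This is
a minimal illustration, not the QEMU code: SketchCPU, hstr_active() and
hstr_traps() are hypothetical names, the struct fields stand in for
arm_is_el2_enabled() and the HCR_EL2.{E2H,TGE} test, and the EL0 and EL1 cases
(which the real code handles in different places) are folded into one function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the pieces of CPU state the check consults. */
typedef struct {
    int current_el;     /* current exception level, 0..3 */
    bool el2_enabled;   /* EL2 enabled in the current security state */
    bool e2h_tge;       /* true if HCR_EL2.{E2H,TGE} == {1,1} */
    uint32_t hstr_el2;  /* HSTR_EL2 value set up by the guest */
} SketchCPU;

/* HSTR_EL2 only has an effect below EL2, and only if EL2 is enabled. */
static inline bool hstr_active(const SketchCPU *cpu)
{
    return cpu->current_el < 2 && cpu->hstr_el2 != 0 &&
           cpu->el2_enabled && !cpu->e2h_tge;
}

/* Would a cp15 access with this crn/crm be trapped by HSTR_EL2? */
static inline bool hstr_traps(const SketchCPU *cpu, bool is64, int crn, int crm)
{
    int maskbit = is64 ? crm : crn;

    assert(maskbit >= 0 && maskbit < 16);
    /* T4 and T14 are RES0, so they never cause traps. */
    if (!hstr_active(cpu) || maskbit == 4 || maskbit == 14) {
        return false;
    }
    return (cpu->hstr_el2 >> maskbit) & 1;
}
```

With el2_enabled false, hstr_traps() returns false whatever the guest wrote to
HSTR_EL2, which is the behaviour this patch restores for Secure EL0 and EL1.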
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200330210400.11724-4-peter.maydell@linaro.org
+Tested-by: Fuad Tabba <tabba@google.com>
 Message-id: 20230130182459.3309057-8-peter.maydell@linaro.org
 Message-id: 20230127175507.2895013-8-peter.maydell@linaro.org
 ---
- target/arm/helper.c | 29 ++++++++++++++++++++++++++++-
+ target/arm/helper.c    | 2 +-
-file changed, 28 insertions(+), 1 deletion(-)
+ target/arm/op_helper.c | 1 +
 files changed, 2 insertions(+), 1 deletion(-)
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.c
 +++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ static CPUARMTBFlags rebuild_hflags_a32(CPUARMState *env, int fp_el,
+         DP_TBFLAG_A32(flags, VFPEN, 1);
  static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
                                 MMUAccessType access_type, ARMMMUIdx mmu_idx,
 +                               bool s1_is_el0,
                                 hwaddr *phys_ptr, MemTxAttrs *txattrs, int *prot,
                                 target_ulong *page_size_ptr,
                                 ARMMMUFaultInfo *fi, ARMCacheAttrs *cacheattrs);
@@ -XXX,XX +XXX,XX @@ static hwaddr S1_ptw_translate(CPUARMState *env, ARMMMUIdx mmu_idx,
          }
          ret = get_phys_addr_lpae(env, addr, MMU_DATA_LOAD, ARMMMUIdx_Stage2,
 +                                 false,
                                   &s2pa, &txattrs, &s2prot, &s2size, fi,
                                   pcacheattrs);
          if (ret) {
@@ -XXX,XX +XXX,XX @@ static ARMVAParameters aa32_va_parameters(CPUARMState *env, uint32_t va,
      };
  }
 +/**
 + * get_phys_addr_lpae: perform one stage of page table walk, LPAE format
 + *
 + * Returns false if the translation was successful. Otherwise, phys_ptr, attrs,
 + * prot and page_size may not be filled in, and the populated fsr value provides
 + * information on why the translation aborted, in the format of a long-format
 + * DFSR/IFSR fault register, with the following caveats:
 + *  * the WnR bit is never set (the caller must do this).
 + *
 + * @env: CPUARMState
 + * @address: virtual address to get physical address for
 + * @access_type: MMU_DATA_LOAD, MMU_DATA_STORE or MMU_INST_FETCH
 + * @mmu_idx: MMU index indicating required translation regime
 + * @s1_is_el0: if @mmu_idx is ARMMMUIdx_Stage2 (so this is a stage 2 page table
 + *             walk), must be true if this is stage 2 of a stage 1+2 walk for an
 + *             EL0 access. If @mmu_idx is anything else, @s1_is_el0 is ignored.
 + * @phys_ptr: set to the physical address corresponding to the virtual address
 + * @attrs: set to the memory transaction attributes to use
 + * @prot: set to the permissions for the page containing phys_ptr
 + * @page_size_ptr: set to the size of the page containing phys_ptr
 + * @fi: set to fault info if the translation fails
 + * @cacheattrs: (if non-NULL) set to the cacheability/shareability attributes
 + */
  static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
                                 MMUAccessType access_type, ARMMMUIdx mmu_idx,
 +                               bool s1_is_el0,
                                 hwaddr *phys_ptr, MemTxAttrs *txattrs, int *prot,
                                 target_ulong *page_size_ptr,
                                 ARMMMUFaultInfo *fi, ARMCacheAttrs *cacheattrs)
@@ -XXX,XX +XXX,XX @@ bool get_phys_addr(CPUARMState *env, target_ulong address,
              /* S1 is done. Now do S2 translation.  */
              ret = get_phys_addr_lpae(env, ipa, access_type, ARMMMUIdx_Stage2,
 +                                     mmu_idx == ARMMMUIdx_E10_0,
                                       phys_ptr, attrs, &s2_prot,
                                       page_size, fi,
                                       cacheattrs != NULL ? &cacheattrs2 : NULL);
@@ -XXX,XX +XXX,XX @@ bool get_phys_addr(CPUARMState *env, target_ulong address,
      }
-     if (regime_using_lpae_format(env, mmu_idx)) {
+-    if (el < 2 && env->cp15.hstr_el2 &&
--        return get_phys_addr_lpae(env, address, access_type, mmu_idx,
++    if (el < 2 && env->cp15.hstr_el2 && arm_is_el2_enabled(env) &&
-+        return get_phys_addr_lpae(env, address, access_type, mmu_idx, false,
+         (arm_hcr_el2_eff(env) & (HCR_E2H | HCR_TGE)) != (HCR_E2H | HCR_TGE)) {
-                                   phys_ptr, attrs, prot, page_size,
+         DP_TBFLAG_A32(flags, HSTR_ACTIVE, 1);
-                                   fi, cacheattrs);
+     }
-     } else if (regime_sctlr(env, mmu_idx) & SCTLR_XP) {
+diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/op_helper.c
 +++ b/target/arm/op_helper.c
@@ -XXX,XX +XXX,XX @@ const void *HELPER(access_check_cp_reg)(CPUARMState *env, uint32_t key,
       * we only need to check here for traps from EL0.
       */
      if (!is_a64(env) && arm_current_el(env) == 0 && ri->cp == 15 &&
 +        arm_is_el2_enabled(env) &&
          (arm_hcr_el2_eff(env) & (HCR_E2H | HCR_TGE)) != (HCR_E2H | HCR_TGE)) {
          uint32_t mask = 1 << ri->crn;
 --
-.20.1
+.34.1

-[PULL 06/39] target/arm: Implement ARMv8.2-TTS2UXN
+[PULL 18/33] target/arm: Define the FEAT_FGT registers
-The ARMv8.2-TTS2UXN feature extends the XN field in stage 2
+Define the system registers which are provided by the
-translation table descriptors from just bit [54] to bits [54:53],
+FEAT_FGT fine-grained trap architectural feature:
-allowing stage 2 to control execution permissions separately for EL0
+ HFGRTR_EL2, HFGWTR_EL2, HDFGRTR_EL2, HDFGWTR_EL2, HFGITR_EL2
-and EL1. Implement the new semantics of the XN field and enable
-the feature for our 'max' CPU.
+All these registers are a set of bit fields, where each bit is set
 for a trap and clear to not trap on a particular system register
 access.  The R and W register pairs are for system registers,
 allowing trapping to be done separately for reads and writes; the I
 register is for system instructions where trapping is on instruction
 execution.
 The data storage in the CPU state struct is arranged as a set of
 arrays rather than separate fields so that when we're looking up the
 bits for a system register access we can just index into the array
 rather than having to use a switch to select a named struct member.
 The later FEAT_FGT2 will add extra elements to these arrays.
 The field definitions for the new registers are in cpregs.h because
 in practice the code that needs them is code that also needs
 the cpregs information; cpu.h is included in a lot more files.
 We're also going to add some FGT-specific definitions to cpregs.h
 in the next commit.
 We do not implement HAFGRTR_EL2, because we don't implement
 FEAT_AMUv1.
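To illustrate why the arrays are convenient, here is a minimal sketch of a
lookup that indexes straight into fgt_read[] instead of switching on a named
struct member. SketchFGTState, SketchFGTBit and fgt_read_traps() are
hypothetical names used only for this example; the array sizes and the
FGTREG_* index values follow the patch, and the sketch assumes a set bit
means "trap" as described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Indexes into fgt_read[], with the values the patch defines. */
enum { FGTREG_HFGRTR = 0, FGTREG_HDFGRTR = 1 };

/* Hypothetical cut-down copy of the new cp15 fields. */
typedef struct {
    uint64_t fgt_read[2];   /* HFGRTR, HDFGRTR */
    uint64_t fgt_write[2];  /* HFGWTR, HDFGWTR */
    uint64_t fgt_exec[1];   /* HFGITR */
} SketchFGTState;

/* Hypothetical descriptor: which register, and which bit within it. */
typedef struct {
    unsigned idx;     /* FGTREG_* index into the array */
    unsigned bitpos;  /* bit position within that 64-bit register */
} SketchFGTBit;

/* Test the trap bit for a sysreg read with a plain array index. */
static inline bool fgt_read_traps(const SketchFGTState *s, SketchFGTBit b)
{
    assert(b.idx < 2 && b.bitpos < 64);
    return (s->fgt_read[b.idx] >> b.bitpos) & 1;
}
```

For example, HFGRTR_EL2.SCTLR_EL1 is bit 29, so a descriptor of
{ FGTREG_HFGRTR, 29 } selects it. Adding further registers later would only
grow the arrays, leaving the lookup unchanged.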
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200330210400.11724-5-peter.maydell@linaro.org
+Tested-by: Fuad Tabba <tabba@google.com>
 Message-id: 20230130182459.3309057-9-peter.maydell@linaro.org
 Message-id: 20230127175507.2895013-9-peter.maydell@linaro.org
 ---
- target/arm/cpu.h    | 15 +++++++++++++++
+ target/arm/cpregs.h | 285 ++++++++++++++++++++++++++++++++++++++++++++
- target/arm/cpu.c    |  1 +
+ target/arm/cpu.h    |  15 +++
- target/arm/cpu64.c  |  2 ++
+ target/arm/helper.c |  40 +++++++
- target/arm/helper.c | 37 +++++++++++++++++++++++++++++++------
+files changed, 340 insertions(+)
-files changed, 49 insertions(+), 6 deletions(-)
+diff --git a/target/arm/cpregs.h b/target/arm/cpregs.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpregs.h
 +++ b/target/arm/cpregs.h
@@ -XXX,XX +XXX,XX @@ typedef enum CPAccessResult {
      CP_ACCESS_TRAP_UNCATEGORIZED = (2 << 2),
  } CPAccessResult;
 +/* Indexes into fgt_read[] */
 +#define FGTREG_HFGRTR 0
 +#define FGTREG_HDFGRTR 1
 +/* Indexes into fgt_write[] */
 +#define FGTREG_HFGWTR 0
 +#define FGTREG_HDFGWTR 1
 +/* Indexes into fgt_exec[] */
 +#define FGTREG_HFGITR 0
 +
 +FIELD(HFGRTR_EL2, AFSR0_EL1, 0, 1)
 +FIELD(HFGRTR_EL2, AFSR1_EL1, 1, 1)
 +FIELD(HFGRTR_EL2, AIDR_EL1, 2, 1)
 +FIELD(HFGRTR_EL2, AMAIR_EL1, 3, 1)
 +FIELD(HFGRTR_EL2, APDAKEY, 4, 1)
 +FIELD(HFGRTR_EL2, APDBKEY, 5, 1)
 +FIELD(HFGRTR_EL2, APGAKEY, 6, 1)
 +FIELD(HFGRTR_EL2, APIAKEY, 7, 1)
 +FIELD(HFGRTR_EL2, APIBKEY, 8, 1)
 +FIELD(HFGRTR_EL2, CCSIDR_EL1, 9, 1)
 +FIELD(HFGRTR_EL2, CLIDR_EL1, 10, 1)
 +FIELD(HFGRTR_EL2, CONTEXTIDR_EL1, 11, 1)
 +FIELD(HFGRTR_EL2, CPACR_EL1, 12, 1)
 +FIELD(HFGRTR_EL2, CSSELR_EL1, 13, 1)
 +FIELD(HFGRTR_EL2, CTR_EL0, 14, 1)
 +FIELD(HFGRTR_EL2, DCZID_EL0, 15, 1)
 +FIELD(HFGRTR_EL2, ESR_EL1, 16, 1)
 +FIELD(HFGRTR_EL2, FAR_EL1, 17, 1)
 +FIELD(HFGRTR_EL2, ISR_EL1, 18, 1)
 +FIELD(HFGRTR_EL2, LORC_EL1, 19, 1)
 +FIELD(HFGRTR_EL2, LOREA_EL1, 20, 1)
 +FIELD(HFGRTR_EL2, LORID_EL1, 21, 1)
 +FIELD(HFGRTR_EL2, LORN_EL1, 22, 1)
 +FIELD(HFGRTR_EL2, LORSA_EL1, 23, 1)
 +FIELD(HFGRTR_EL2, MAIR_EL1, 24, 1)
 +FIELD(HFGRTR_EL2, MIDR_EL1, 25, 1)
 +FIELD(HFGRTR_EL2, MPIDR_EL1, 26, 1)
 +FIELD(HFGRTR_EL2, PAR_EL1, 27, 1)
 +FIELD(HFGRTR_EL2, REVIDR_EL1, 28, 1)
 +FIELD(HFGRTR_EL2, SCTLR_EL1, 29, 1)
 +FIELD(HFGRTR_EL2, SCXTNUM_EL1, 30, 1)
 +FIELD(HFGRTR_EL2, SCXTNUM_EL0, 31, 1)
 +FIELD(HFGRTR_EL2, TCR_EL1, 32, 1)
 +FIELD(HFGRTR_EL2, TPIDR_EL1, 33, 1)
 +FIELD(HFGRTR_EL2, TPIDRRO_EL0, 34, 1)
 +FIELD(HFGRTR_EL2, TPIDR_EL0, 35, 1)
 +FIELD(HFGRTR_EL2, TTBR0_EL1, 36, 1)
 +FIELD(HFGRTR_EL2, TTBR1_EL1, 37, 1)
 +FIELD(HFGRTR_EL2, VBAR_EL1, 38, 1)
 +FIELD(HFGRTR_EL2, ICC_IGRPENN_EL1, 39, 1)
 +FIELD(HFGRTR_EL2, ERRIDR_EL1, 40, 1)
 +FIELD(HFGRTR_EL2, ERRSELR_EL1, 41, 1)
 +FIELD(HFGRTR_EL2, ERXFR_EL1, 42, 1)
 +FIELD(HFGRTR_EL2, ERXCTLR_EL1, 43, 1)
 +FIELD(HFGRTR_EL2, ERXSTATUS_EL1, 44, 1)
 +FIELD(HFGRTR_EL2, ERXMISCN_EL1, 45, 1)
 +FIELD(HFGRTR_EL2, ERXPFGF_EL1, 46, 1)
 +FIELD(HFGRTR_EL2, ERXPFGCTL_EL1, 47, 1)
 +FIELD(HFGRTR_EL2, ERXPFGCDN_EL1, 48, 1)
 +FIELD(HFGRTR_EL2, ERXADDR_EL1, 49, 1)
 +FIELD(HFGRTR_EL2, NACCDATA_EL1, 50, 1)
 +/* 51-53: RES0 */
 +FIELD(HFGRTR_EL2, NSMPRI_EL1, 54, 1)
 +FIELD(HFGRTR_EL2, NTPIDR2_EL0, 55, 1)
 +/* 56-63: RES0 */
 +
 +/* These match HFGRTR but bits for RO registers are RES0 */
 +FIELD(HFGWTR_EL2, AFSR0_EL1, 0, 1)
 +FIELD(HFGWTR_EL2, AFSR1_EL1, 1, 1)
 +FIELD(HFGWTR_EL2, AMAIR_EL1, 3, 1)
 +FIELD(HFGWTR_EL2, APDAKEY, 4, 1)
 +FIELD(HFGWTR_EL2, APDBKEY, 5, 1)
 +FIELD(HFGWTR_EL2, APGAKEY, 6, 1)
 +FIELD(HFGWTR_EL2, APIAKEY, 7, 1)
 +FIELD(HFGWTR_EL2, APIBKEY, 8, 1)
 +FIELD(HFGWTR_EL2, CONTEXTIDR_EL1, 11, 1)
 +FIELD(HFGWTR_EL2, CPACR_EL1, 12, 1)
 +FIELD(HFGWTR_EL2, CSSELR_EL1, 13, 1)
 +FIELD(HFGWTR_EL2, ESR_EL1, 16, 1)
 +FIELD(HFGWTR_EL2, FAR_EL1, 17, 1)
 +FIELD(HFGWTR_EL2, LORC_EL1, 19, 1)
 +FIELD(HFGWTR_EL2, LOREA_EL1, 20, 1)
 +FIELD(HFGWTR_EL2, LORN_EL1, 22, 1)
 +FIELD(HFGWTR_EL2, LORSA_EL1, 23, 1)
 +FIELD(HFGWTR_EL2, MAIR_EL1, 24, 1)
 +FIELD(HFGWTR_EL2, PAR_EL1, 27, 1)
 +FIELD(HFGWTR_EL2, SCTLR_EL1, 29, 1)
 +FIELD(HFGWTR_EL2, SCXTNUM_EL1, 30, 1)
 +FIELD(HFGWTR_EL2, SCXTNUM_EL0, 31, 1)
 +FIELD(HFGWTR_EL2, TCR_EL1, 32, 1)
 +FIELD(HFGWTR_EL2, TPIDR_EL1, 33, 1)
 +FIELD(HFGWTR_EL2, TPIDRRO_EL0, 34, 1)
 +FIELD(HFGWTR_EL2, TPIDR_EL0, 35, 1)
 +FIELD(HFGWTR_EL2, TTBR0_EL1, 36, 1)
 +FIELD(HFGWTR_EL2, TTBR1_EL1, 37, 1)
 +FIELD(HFGWTR_EL2, VBAR_EL1, 38, 1)
 +FIELD(HFGWTR_EL2, ICC_IGRPENN_EL1, 39, 1)
 +FIELD(HFGWTR_EL2, ERRSELR_EL1, 41, 1)
 +FIELD(HFGWTR_EL2, ERXCTLR_EL1, 43, 1)
 +FIELD(HFGWTR_EL2, ERXSTATUS_EL1, 44, 1)
 +FIELD(HFGWTR_EL2, ERXMISCN_EL1, 45, 1)
 +FIELD(HFGWTR_EL2, ERXPFGCTL_EL1, 47, 1)
 +FIELD(HFGWTR_EL2, ERXPFGCDN_EL1, 48, 1)
 +FIELD(HFGWTR_EL2, ERXADDR_EL1, 49, 1)
 +FIELD(HFGWTR_EL2, NACCDATA_EL1, 50, 1)
 +FIELD(HFGWTR_EL2, NSMPRI_EL1, 54, 1)
 +FIELD(HFGWTR_EL2, NTPIDR2_EL0, 55, 1)
 +
 +FIELD(HFGITR_EL2, ICIALLUIS, 0, 1)
 +FIELD(HFGITR_EL2, ICIALLU, 1, 1)
 +FIELD(HFGITR_EL2, ICIVAU, 2, 1)
 +FIELD(HFGITR_EL2, DCIVAC, 3, 1)
 +FIELD(HFGITR_EL2, DCISW, 4, 1)
 +FIELD(HFGITR_EL2, DCCSW, 5, 1)
 +FIELD(HFGITR_EL2, DCCISW, 6, 1)
 +FIELD(HFGITR_EL2, DCCVAU, 7, 1)
 +FIELD(HFGITR_EL2, DCCVAP, 8, 1)
 +FIELD(HFGITR_EL2, DCCVADP, 9, 1)
 +FIELD(HFGITR_EL2, DCCIVAC, 10, 1)
 +FIELD(HFGITR_EL2, DCZVA, 11, 1)
 +FIELD(HFGITR_EL2, ATS1E1R, 12, 1)
 +FIELD(HFGITR_EL2, ATS1E1W, 13, 1)
 +FIELD(HFGITR_EL2, ATS1E0R, 14, 1)
 +FIELD(HFGITR_EL2, ATS1E0W, 15, 1)
 +FIELD(HFGITR_EL2, ATS1E1RP, 16, 1)
 +FIELD(HFGITR_EL2, ATS1E1WP, 17, 1)
 +FIELD(HFGITR_EL2, TLBIVMALLE1OS, 18, 1)
 +FIELD(HFGITR_EL2, TLBIVAE1OS, 19, 1)
 +FIELD(HFGITR_EL2, TLBIASIDE1OS, 20, 1)
 +FIELD(HFGITR_EL2, TLBIVAAE1OS, 21, 1)
 +FIELD(HFGITR_EL2, TLBIVALE1OS, 22, 1)
 +FIELD(HFGITR_EL2, TLBIVAALE1OS, 23, 1)
 +FIELD(HFGITR_EL2, TLBIRVAE1OS, 24, 1)
 +FIELD(HFGITR_EL2, TLBIRVAAE1OS, 25, 1)
 +FIELD(HFGITR_EL2, TLBIRVALE1OS, 26, 1)
 +FIELD(HFGITR_EL2, TLBIRVAALE1OS, 27, 1)
 +FIELD(HFGITR_EL2, TLBIVMALLE1IS, 28, 1)
 +FIELD(HFGITR_EL2, TLBIVAE1IS, 29, 1)
 +FIELD(HFGITR_EL2, TLBIASIDE1IS, 30, 1)
 +FIELD(HFGITR_EL2, TLBIVAAE1IS, 31, 1)
 +FIELD(HFGITR_EL2, TLBIVALE1IS, 32, 1)
 +FIELD(HFGITR_EL2, TLBIVAALE1IS, 33, 1)
 +FIELD(HFGITR_EL2, TLBIRVAE1IS, 34, 1)
 +FIELD(HFGITR_EL2, TLBIRVAAE1IS, 35, 1)
 +FIELD(HFGITR_EL2, TLBIRVALE1IS, 36, 1)
 +FIELD(HFGITR_EL2, TLBIRVAALE1IS, 37, 1)
 +FIELD(HFGITR_EL2, TLBIRVAE1, 38, 1)
 +FIELD(HFGITR_EL2, TLBIRVAAE1, 39, 1)
 +FIELD(HFGITR_EL2, TLBIRVALE1, 40, 1)
 +FIELD(HFGITR_EL2, TLBIRVAALE1, 41, 1)
 +FIELD(HFGITR_EL2, TLBIVMALLE1, 42, 1)
 +FIELD(HFGITR_EL2, TLBIVAE1, 43, 1)
 +FIELD(HFGITR_EL2, TLBIASIDE1, 44, 1)
 +FIELD(HFGITR_EL2, TLBIVAAE1, 45, 1)
 +FIELD(HFGITR_EL2, TLBIVALE1, 46, 1)
 +FIELD(HFGITR_EL2, TLBIVAALE1, 47, 1)
 +FIELD(HFGITR_EL2, CFPRCTX, 48, 1)
 +FIELD(HFGITR_EL2, DVPRCTX, 49, 1)
 +FIELD(HFGITR_EL2, CPPRCTX, 50, 1)
 +FIELD(HFGITR_EL2, ERET, 51, 1)
 +FIELD(HFGITR_EL2, SVC_EL0, 52, 1)
 +FIELD(HFGITR_EL2, SVC_EL1, 53, 1)
 +FIELD(HFGITR_EL2, DCCVAC, 54, 1)
 +FIELD(HFGITR_EL2, NBRBINJ, 55, 1)
 +FIELD(HFGITR_EL2, NBRBIALL, 56, 1)
 +
 +FIELD(HDFGRTR_EL2, DBGBCRN_EL1, 0, 1)
 +FIELD(HDFGRTR_EL2, DBGBVRN_EL1, 1, 1)
 +FIELD(HDFGRTR_EL2, DBGWCRN_EL1, 2, 1)
 +FIELD(HDFGRTR_EL2, DBGWVRN_EL1, 3, 1)
 +FIELD(HDFGRTR_EL2, MDSCR_EL1, 4, 1)
 +FIELD(HDFGRTR_EL2, DBGCLAIM, 5, 1)
 +FIELD(HDFGRTR_EL2, DBGAUTHSTATUS_EL1, 6, 1)
 +FIELD(HDFGRTR_EL2, DBGPRCR_EL1, 7, 1)
 +/* 8: RES0: OSLAR_EL1 is WO */
 +FIELD(HDFGRTR_EL2, OSLSR_EL1, 9, 1)
 +FIELD(HDFGRTR_EL2, OSECCR_EL1, 10, 1)
 +FIELD(HDFGRTR_EL2, OSDLR_EL1, 11, 1)
 +FIELD(HDFGRTR_EL2, PMEVCNTRN_EL0, 12, 1)
 +FIELD(HDFGRTR_EL2, PMEVTYPERN_EL0, 13, 1)
 +FIELD(HDFGRTR_EL2, PMCCFILTR_EL0, 14, 1)
 +FIELD(HDFGRTR_EL2, PMCCNTR_EL0, 15, 1)
 +FIELD(HDFGRTR_EL2, PMCNTEN, 16, 1)
 +FIELD(HDFGRTR_EL2, PMINTEN, 17, 1)
 +FIELD(HDFGRTR_EL2, PMOVS, 18, 1)
 +FIELD(HDFGRTR_EL2, PMSELR_EL0, 19, 1)
 +/* 20: RES0: PMSWINC_EL0 is WO */
 +/* 21: RES0: PMCR_EL0 is WO */
 +FIELD(HDFGRTR_EL2, PMMIR_EL1, 22, 1)
 +FIELD(HDFGRTR_EL2, PMBLIMITR_EL1, 23, 1)
 +FIELD(HDFGRTR_EL2, PMBPTR_EL1, 24, 1)
 +FIELD(HDFGRTR_EL2, PMBSR_EL1, 25, 1)
 +FIELD(HDFGRTR_EL2, PMSCR_EL1, 26, 1)
 +FIELD(HDFGRTR_EL2, PMSEVFR_EL1, 27, 1)
 +FIELD(HDFGRTR_EL2, PMSFCR_EL1, 28, 1)
 +FIELD(HDFGRTR_EL2, PMSICR_EL1, 29, 1)
 +FIELD(HDFGRTR_EL2, PMSIDR_EL1, 30, 1)
 +FIELD(HDFGRTR_EL2, PMSIRR_EL1, 31, 1)
 +FIELD(HDFGRTR_EL2, PMSLATFR_EL1, 32, 1)
 +FIELD(HDFGRTR_EL2, TRC, 33, 1)
 +FIELD(HDFGRTR_EL2, TRCAUTHSTATUS, 34, 1)
 +FIELD(HDFGRTR_EL2, TRCAUXCTLR, 35, 1)
 +FIELD(HDFGRTR_EL2, TRCCLAIM, 36, 1)
 +FIELD(HDFGRTR_EL2, TRCCNTVRn, 37, 1)
 +/* 38, 39: RES0 */
 +FIELD(HDFGRTR_EL2, TRCID, 40, 1)
 +FIELD(HDFGRTR_EL2, TRCIMSPECN, 41, 1)
 +/* 42: RES0: TRCOSLAR is WO */
 +FIELD(HDFGRTR_EL2, TRCOSLSR, 43, 1)
 +FIELD(HDFGRTR_EL2, TRCPRGCTLR, 44, 1)
 +FIELD(HDFGRTR_EL2, TRCSEQSTR, 45, 1)
 +FIELD(HDFGRTR_EL2, TRCSSCSRN, 46, 1)
 +FIELD(HDFGRTR_EL2, TRCSTATR, 47, 1)
 +FIELD(HDFGRTR_EL2, TRCVICTLR, 48, 1)
 +/* 49: RES0: TRFCR_EL1 is WO */
 +FIELD(HDFGRTR_EL2, TRBBASER_EL1, 50, 1)
 +FIELD(HDFGRTR_EL2, TRBIDR_EL1, 51, 1)
 +FIELD(HDFGRTR_EL2, TRBLIMITR_EL1, 52, 1)
 +FIELD(HDFGRTR_EL2, TRBMAR_EL1, 53, 1)
 +FIELD(HDFGRTR_EL2, TRBPTR_EL1, 54, 1)
 +FIELD(HDFGRTR_EL2, TRBSR_EL1, 55, 1)
 +FIELD(HDFGRTR_EL2, TRBTRG_EL1, 56, 1)
 +FIELD(HDFGRTR_EL2, PMUSERENR_EL0, 57, 1)
 +FIELD(HDFGRTR_EL2, PMCEIDN_EL0, 58, 1)
 +FIELD(HDFGRTR_EL2, NBRBIDR, 59, 1)
 +FIELD(HDFGRTR_EL2, NBRBCTL, 60, 1)
 +FIELD(HDFGRTR_EL2, NBRBDATA, 61, 1)
 +FIELD(HDFGRTR_EL2, NPMSNEVFR_EL1, 62, 1)
 +FIELD(HDFGRTR_EL2, PMBIDR_EL1, 63, 1)
 +
 +/*
 + * These match HDFGRTR_EL2, but bits for RO registers are RES0.
 + * A few bits are for WO registers, where the HDFGRTR_EL2 bit is RES0.
 + */
 +FIELD(HDFGWTR_EL2, DBGBCRN_EL1, 0, 1)
 +FIELD(HDFGWTR_EL2, DBGBVRN_EL1, 1, 1)
 +FIELD(HDFGWTR_EL2, DBGWCRN_EL1, 2, 1)
 +FIELD(HDFGWTR_EL2, DBGWVRN_EL1, 3, 1)
 +FIELD(HDFGWTR_EL2, MDSCR_EL1, 4, 1)
 +FIELD(HDFGWTR_EL2, DBGCLAIM, 5, 1)
 +FIELD(HDFGWTR_EL2, DBGPRCR_EL1, 7, 1)
 +FIELD(HDFGWTR_EL2, OSLAR_EL1, 8, 1)
 +FIELD(HDFGWTR_EL2, OSLSR_EL1, 9, 1)
 +FIELD(HDFGWTR_EL2, OSECCR_EL1, 10, 1)
 +FIELD(HDFGWTR_EL2, OSDLR_EL1, 11, 1)
 +FIELD(HDFGWTR_EL2, PMEVCNTRN_EL0, 12, 1)
 +FIELD(HDFGWTR_EL2, PMEVTYPERN_EL0, 13, 1)
 +FIELD(HDFGWTR_EL2, PMCCFILTR_EL0, 14, 1)
 +FIELD(HDFGWTR_EL2, PMCCNTR_EL0, 15, 1)
 +FIELD(HDFGWTR_EL2, PMCNTEN, 16, 1)
 +FIELD(HDFGWTR_EL2, PMINTEN, 17, 1)
 +FIELD(HDFGWTR_EL2, PMOVS, 18, 1)
 +FIELD(HDFGWTR_EL2, PMSELR_EL0, 19, 1)
 +FIELD(HDFGWTR_EL2, PMSWINC_EL0, 20, 1)
 +FIELD(HDFGWTR_EL2, PMCR_EL0, 21, 1)
 +FIELD(HDFGWTR_EL2, PMBLIMITR_EL1, 23, 1)
 +FIELD(HDFGWTR_EL2, PMBPTR_EL1, 24, 1)
 +FIELD(HDFGWTR_EL2, PMBSR_EL1, 25, 1)
 +FIELD(HDFGWTR_EL2, PMSCR_EL1, 26, 1)
 +FIELD(HDFGWTR_EL2, PMSEVFR_EL1, 27, 1)
 +FIELD(HDFGWTR_EL2, PMSFCR_EL1, 28, 1)
 +FIELD(HDFGWTR_EL2, PMSICR_EL1, 29, 1)
 +FIELD(HDFGWTR_EL2, PMSIRR_EL1, 31, 1)
 +FIELD(HDFGWTR_EL2, PMSLATFR_EL1, 32, 1)
 +FIELD(HDFGWTR_EL2, TRC, 33, 1)
 +FIELD(HDFGWTR_EL2, TRCAUXCTLR, 35, 1)
 +FIELD(HDFGWTR_EL2, TRCCLAIM, 36, 1)
 +FIELD(HDFGWTR_EL2, TRCCNTVRn, 37, 1)
 +FIELD(HDFGWTR_EL2, TRCIMSPECN, 41, 1)
 +FIELD(HDFGWTR_EL2, TRCOSLAR, 42, 1)
 +FIELD(HDFGWTR_EL2, TRCPRGCTLR, 44, 1)
 +FIELD(HDFGWTR_EL2, TRCSEQSTR, 45, 1)
 +FIELD(HDFGWTR_EL2, TRCSSCSRN, 46, 1)
 +FIELD(HDFGWTR_EL2, TRCVICTLR, 48, 1)
 +FIELD(HDFGWTR_EL2, TRFCR_EL1, 49, 1)
 +FIELD(HDFGWTR_EL2, TRBBASER_EL1, 50, 1)
 +FIELD(HDFGWTR_EL2, TRBLIMITR_EL1, 52, 1)
 +FIELD(HDFGWTR_EL2, TRBMAR_EL1, 53, 1)
 +FIELD(HDFGWTR_EL2, TRBPTR_EL1, 54, 1)
 +FIELD(HDFGWTR_EL2, TRBSR_EL1, 55, 1)
 +FIELD(HDFGWTR_EL2, TRBTRG_EL1, 56, 1)
 +FIELD(HDFGWTR_EL2, PMUSERENR_EL0, 57, 1)
 +FIELD(HDFGWTR_EL2, NBRBCTL, 60, 1)
 +FIELD(HDFGWTR_EL2, NBRBDATA, 61, 1)
 +FIELD(HDFGWTR_EL2, NPMSNEVFR_EL1, 62, 1)
 +
  typedef struct ARMCPRegInfo ARMCPRegInfo;
  /*
 diff --git a/target/arm/cpu.h b/target/arm/cpu.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu.h
 +++ b/target/arm/cpu.h
-@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa32_ccidx(const ARMISARegisters *id)
+@@ -XXX,XX +XXX,XX @@ typedef struct CPUArchState {
-     return FIELD_EX32(id->id_mmfr4, ID_MMFR4, CCIDX) != 0;
+         uint64_t disr_el1;
          uint64_t vdisr_el2;
          uint64_t vsesr_el2;
 +
 +        /*
 +         * Fine-Grained Trap registers. We store these as arrays so the
 +         * access checking code doesn't have to manually select
 +         * HFGRTR_EL2 vs HDFGRTR_EL2 etc when looking up the bit to test.
 +         * FEAT_FGT2 will add more elements to these arrays.
 +         */
 +        uint64_t fgt_read[2]; /* HFGRTR, HDFGRTR */
 +        uint64_t fgt_write[2]; /* HFGWTR, HDFGWTR */
 +        uint64_t fgt_exec[1]; /* HFGITR */
      } cp15;
      struct {
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa64_tgran64_2(const ARMISARegisters *id)
      return t >= 2 || (t == 0 && isar_feature_aa64_tgran64(id));
  }
-+static inline bool isar_feature_aa32_tts2uxn(const ARMISARegisters *id)
++static inline bool isar_feature_aa64_fgt(const ARMISARegisters *id)
 +{
-+    return FIELD_EX32(id->id_mmfr4, ID_MMFR4, XNX) != 0;
++    return FIELD_EX64(id->id_aa64mmfr0, ID_AA64MMFR0, FGT) != 0;
 +}
 +
- /*
+ static inline bool isar_feature_aa64_ccidx(const ARMISARegisters *id)
-  * 64-bit feature tests via id registers.
+ {
   */
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa64_ccidx(const ARMISARegisters *id)
      return FIELD_EX64(id->id_aa64mmfr2, ID_AA64MMFR2, CCIDX) != 0;
- }
-+static inline bool isar_feature_aa64_tts2uxn(const ARMISARegisters *id)
-+{
-+    return FIELD_EX64(id->id_aa64mmfr1, ID_AA64MMFR1, XNX) != 0;
-+}
-+
- /*
-  * Feature tests for "does this exist in either 32-bit or 64-bit?"
-  */
-@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_any_ccidx(const ARMISARegisters *id)
-     return isar_feature_aa64_ccidx(id) || isar_feature_aa32_ccidx(id);
- }
-+static inline bool isar_feature_any_tts2uxn(const ARMISARegisters *id)
-+{
-+    return isar_feature_aa64_tts2uxn(id) || isar_feature_aa32_tts2uxn(id);
-+}
-+
- /*
-  * Forward to the above feature tests given an ARMCPU pointer.
-  */
-diff --git a/target/arm/cpu.c b/target/arm/cpu.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.c
-+++ b/target/arm/cpu.c
-@@ -XXX,XX +XXX,XX @@ static void arm_max_initfn(Object *obj)
-             t = FIELD_DP32(t, ID_MMFR4, HPDS, 1); /* AA32HPD */
-             t = FIELD_DP32(t, ID_MMFR4, AC2, 1); /* ACTLR2, HACTLR2 */
-             t = FIELD_DP32(t, ID_MMFR4, CNP, 1); /* TTCNP */
-+            t = FIELD_DP32(t, ID_MMFR4, XNX, 1); /* TTS2UXN */
-             cpu->isar.id_mmfr4 = t;
-         }
- #endif
-diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu64.c
-+++ b/target/arm/cpu64.c
-@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
-         t = FIELD_DP64(t, ID_AA64MMFR1, VH, 1);
-         t = FIELD_DP64(t, ID_AA64MMFR1, PAN, 2); /* ATS1E1 */
-         t = FIELD_DP64(t, ID_AA64MMFR1, VMIDBITS, 2); /* VMID16 */
-+        t = FIELD_DP64(t, ID_AA64MMFR1, XNX, 1); /* TTS2UXN */
-         cpu->isar.id_aa64mmfr1 = t;
-         t = cpu->isar.id_aa64mmfr2;
-@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
-         u = FIELD_DP32(u, ID_MMFR4, HPDS, 1); /* AA32HPD */
-         u = FIELD_DP32(u, ID_MMFR4, AC2, 1); /* ACTLR2, HACTLR2 */
-         u = FIELD_DP32(u, ID_MMFR4, CNP, 1); /* TTCNP */
-+        u = FIELD_DP32(u, ID_MMFR4, XNX, 1); /* TTS2UXN */
-         cpu->isar.id_mmfr4 = u;
-         u = cpu->isar.id_aa64dfr0;
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.c
 +++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ simple_ap_to_rw_prot(CPUARMState *env, ARMMMUIdx mmu_idx, int ap)
+@@ -XXX,XX +XXX,XX @@ static void scr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
-  *
+         if (cpu_isar_feature(aa64_hcx, cpu)) {
-  * @env:     CPUARMState
+             valid_mask |= SCR_HXEN;
-  * @s2ap:    The 2-bit stage2 access permissions (S2AP)
+         }
-- * @xn:      XN (execute-never) bit
++        if (cpu_isar_feature(aa64_fgt, cpu)) {
-+ * @xn:      XN (execute-never) bits
++            valid_mask |= SCR_FGTEN;
-+ * @s1_is_el0: true if this is S2 of an S1+2 walk for EL0
++        }
-  */
+     } else {
--static int get_S2prot(CPUARMState *env, int s2ap, int xn)
+         valid_mask &= ~(SCR_RW | SCR_ST);
-+static int get_S2prot(CPUARMState *env, int s2ap, int xn, bool s1_is_el0)
+         if (cpu_isar_feature(aa32_ras, cpu)) {
- {
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo scxtnum_reginfo[] = {
-     int prot = 0;
+       .access = PL3_RW,
+       .fieldoffset = offsetof(CPUARMState, scxtnum_el[3]) },
-@@ -XXX,XX +XXX,XX @@ static int get_S2prot(CPUARMState *env, int s2ap, int xn)
+ };
-     if (s2ap & 2) {
++
-         prot |= PAGE_WRITE;
++static CPAccessResult access_fgt(CPUARMState *env, const ARMCPRegInfo *ri,
 +                                 bool isread)
 +{
 +    if (arm_current_el(env) == 2 &&
 +        arm_feature(env, ARM_FEATURE_EL3) && !(env->cp15.scr_el3 & SCR_FGTEN)) {
 +        return CP_ACCESS_TRAP_EL3;
 +    }
 +    return CP_ACCESS_OK;
 +}
 +
 +static const ARMCPRegInfo fgt_reginfo[] = {
 +    { .name = "HFGRTR_EL2", .state = ARM_CP_STATE_AA64,
 +      .opc0 = 3, .opc1 = 4, .crn = 1, .crm = 1, .opc2 = 4,
 +      .access = PL2_RW, .accessfn = access_fgt,
 +      .fieldoffset = offsetof(CPUARMState, cp15.fgt_read[FGTREG_HFGRTR]) },
 +    { .name = "HFGWTR_EL2", .state = ARM_CP_STATE_AA64,
 +      .opc0 = 3, .opc1 = 4, .crn = 1, .crm = 1, .opc2 = 5,
 +      .access = PL2_RW, .accessfn = access_fgt,
 +      .fieldoffset = offsetof(CPUARMState, cp15.fgt_write[FGTREG_HFGWTR]) },
 +    { .name = "HDFGRTR_EL2", .state = ARM_CP_STATE_AA64,
 +      .opc0 = 3, .opc1 = 4, .crn = 3, .crm = 1, .opc2 = 4,
 +      .access = PL2_RW, .accessfn = access_fgt,
 +      .fieldoffset = offsetof(CPUARMState, cp15.fgt_read[FGTREG_HDFGRTR]) },
 +    { .name = "HDFGWTR_EL2", .state = ARM_CP_STATE_AA64,
 +      .opc0 = 3, .opc1 = 4, .crn = 3, .crm = 1, .opc2 = 5,
 +      .access = PL2_RW, .accessfn = access_fgt,
 +      .fieldoffset = offsetof(CPUARMState, cp15.fgt_write[FGTREG_HDFGWTR]) },
 +    { .name = "HFGITR_EL2", .state = ARM_CP_STATE_AA64,
 +      .opc0 = 3, .opc1 = 4, .crn = 1, .crm = 1, .opc2 = 6,
 +      .access = PL2_RW, .accessfn = access_fgt,
 +      .fieldoffset = offsetof(CPUARMState, cp15.fgt_exec[FGTREG_HFGITR]) },
 +};
  #endif /* TARGET_AARCH64 */
  static CPAccessResult access_predinv(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
      if (cpu_isar_feature(aa64_scxtnum, cpu)) {
          define_arm_cp_regs(cpu, scxtnum_reginfo);
      }
--    if (!xn) {
++
--        if (arm_el_is_aa64(env, 2) || prot & PAGE_READ) {
++    if (cpu_isar_feature(aa64_fgt, cpu)) {
-+
++        define_arm_cp_regs(cpu, fgt_reginfo);
-+    if (cpu_isar_feature(any_tts2uxn, env_archcpu(env))) {
++    }
-+        switch (xn) {
+ #endif
-+        case 0:
-             prot |= PAGE_EXEC;
+     if (cpu_isar_feature(any_predinv, cpu)) {
 +            break;
 +        case 1:
 +            if (s1_is_el0) {
 +                prot |= PAGE_EXEC;
 +            }
 +            break;
 +        case 2:
 +            break;
 +        case 3:
 +            if (!s1_is_el0) {
 +                prot |= PAGE_EXEC;
 +            }
 +            break;
 +        default:
 +            g_assert_not_reached();
 +        }
 +    } else {
 +        if (!extract32(xn, 1, 1)) {
 +            if (arm_el_is_aa64(env, 2) || prot & PAGE_READ) {
 +                prot |= PAGE_EXEC;
 +            }
          }
      }
      return prot;
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
      }
      ap = extract32(attrs, 4, 2);
 -    xn = extract32(attrs, 12, 1);
      if (mmu_idx == ARMMMUIdx_Stage2) {
          ns = true;
 -        *prot = get_S2prot(env, ap, xn);
 +        xn = extract32(attrs, 11, 2);
 +        *prot = get_S2prot(env, ap, xn, s1_is_el0);
      } else {
          ns = extract32(attrs, 3, 1);
 +        xn = extract32(attrs, 12, 1);
          pxn = extract32(attrs, 11, 1);
          *prot = get_S1prot(env, mmu_idx, aarch64, ap, ns, xn, pxn);
      }
 --
-.20.1
+.34.1

-[PULL 28/39] target/arm: Convert V[US]DOT (scalar) to decodetree
+[PULL 19/33] target/arm: Implement FGT trapping infrastructure
-Convert the V[US]DOT (scalar) insns in the 2reg-scalar-ext group
+Implement the machinery for fine-grained traps on normal sysregs.
-to decodetree.
+Any sysreg with a fine-grained trap will set the new field to
 indicate which FGT register bit it should trap on.
 FGT traps only happen when an AArch64 EL2 enables them for
 an AArch64 EL1. They therefore are only relevant for AArch32
 cpregs when the cpreg can be accessed from EL0. The logic
 in access_check_cp_reg() will check this, so it is safe to
 add a .fgt marking to an ARM_CP_STATE_BOTH ARMCPRegInfo.
 The DO_BIT and DO_REV_BIT macros define enum constants FGT_##bitname
 which can be used to specify the FGT bit, eg
    .fgt = FGT_AFSR0_EL1
 (We assume that there is no bit name duplication across the FGT
 registers, for brevity's sake.)
 Subsequent commits will add the .fgt fields to the relevant register
 definitions and define the FGT_nnn values for them.
 Note that some of the FGT traps are for instructions that we don't
 handle via the cpregs mechanisms (mostly these are instruction traps).
 Those we will have to handle separately.
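
 As a rough illustration of the encoding described above, the following
 standalone sketch packs a trap type, the reversed-sense flag, an array
 index and a bit position into one value, then decides whether an access
 traps. The field positions are taken from the FIELD(FGT, ...) definitions
 in this patch. Everything with a demo_/DEMO_ prefix, the DemoTrapRegs
 struct and its array sizes are assumptions made for the example and are
 not QEMU names. It mirrors the check added to access_check_cp_reg() but
 leaves out the arm_fgt_active() gating.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Field layout as in the patch: TYPE[12:10], REV[9], IDX[8:6], BITPOS[5:0] */
#define DEMO_FGT_TYPE_SHIFT   10
#define DEMO_FGT_REV_SHIFT    9
#define DEMO_FGT_IDX_SHIFT    6
#define DEMO_FGT_IDX_MASK     0x7u
#define DEMO_FGT_BITPOS_MASK  0x3fu

#define DEMO_FGT_R     (1u << DEMO_FGT_TYPE_SHIFT)
#define DEMO_FGT_W     (2u << DEMO_FGT_TYPE_SHIFT)
#define DEMO_FGT_EXEC  (4u << DEMO_FGT_TYPE_SHIFT)
#define DEMO_FGT_RW    (DEMO_FGT_R | DEMO_FGT_W)
#define DEMO_FGT_REV   (1u << DEMO_FGT_REV_SHIFT)

/* Hypothetical stand-in for the cp15.fgt_* arrays; sizes are assumed. */
typedef struct DemoTrapRegs {
    uint64_t fgt_read[2];
    uint64_t fgt_write[2];
    uint64_t fgt_exec[1];
} DemoTrapRegs;

/* Pack the four fields into a single value; 0 means "no trap". */
static inline uint32_t demo_fgt_make(uint32_t type, bool rev,
                                     unsigned idx, unsigned bitpos)
{
    return type | (rev ? DEMO_FGT_REV : 0u) |
           ((idx & DEMO_FGT_IDX_MASK) << DEMO_FGT_IDX_SHIFT) |
           (bitpos & DEMO_FGT_BITPOS_MASK);
}

/* Return true if an access with this marking should trap. */
static inline bool demo_fgt_should_trap(const DemoTrapRegs *regs,
                                        uint32_t fgt, bool isread)
{
    uint64_t trapword = 0;
    unsigned idx = (fgt >> DEMO_FGT_IDX_SHIFT) & DEMO_FGT_IDX_MASK;
    unsigned bitpos = fgt & DEMO_FGT_BITPOS_MASK;
    bool rev = (fgt & DEMO_FGT_REV) != 0;

    /* Pick the trap register array according to the type bits. */
    if (fgt & DEMO_FGT_EXEC) {
        assert(idx < sizeof(regs->fgt_exec) / sizeof(regs->fgt_exec[0]));
        trapword = regs->fgt_exec[idx];
    } else if (isread && (fgt & DEMO_FGT_R)) {
        assert(idx < sizeof(regs->fgt_read) / sizeof(regs->fgt_read[0]));
        trapword = regs->fgt_read[idx];
    } else if (!isread && (fgt & DEMO_FGT_W)) {
        assert(idx < sizeof(regs->fgt_write) / sizeof(regs->fgt_write[0]));
        trapword = regs->fgt_write[idx];
    }

    /* A reversed-sense bit traps when it is 0 rather than 1. */
    return (((trapword >> bitpos) & 1u) != 0) != rev;
}
```

 A marking built with DEMO_FGT_RW is checked against the read array for
 reads and the write array for writes, while one built with DEMO_FGT_W
 alone never traps a read. This matches the way the patch handles the
 registers that have only a write trap.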
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200430181003.21682-10-peter.maydell@linaro.org
+Tested-by: Fuad Tabba <tabba@google.com>
 Message-id: 20230130182459.3309057-10-peter.maydell@linaro.org
 Message-id: 20230127175507.2895013-10-peter.maydell@linaro.org
 ---
- target/arm/neon-shared.decode   |  3 +++
+ target/arm/cpregs.h        | 72 ++++++++++++++++++++++++++++++++++++++
- target/arm/translate-neon.inc.c | 35 +++++++++++++++++++++++++++++++++
+ target/arm/cpu.h           |  1 +
- target/arm/translate.c          | 13 +-----------
+ target/arm/internals.h     | 20 +++++++++++
-files changed, 39 insertions(+), 12 deletions(-)
+ target/arm/translate.h     |  2 ++
+ target/arm/helper.c        |  9 +++++
-diff --git a/target/arm/neon-shared.decode b/target/arm/neon-shared.decode
+ target/arm/op_helper.c     | 30 ++++++++++++++++
-index XXXXXXX..XXXXXXX 100644
+ target/arm/translate-a64.c |  3 +-
---- a/target/arm/neon-shared.decode
+ target/arm/translate.c     |  2 ++
-+++ b/target/arm/neon-shared.decode
+files changed, 138 insertions(+), 1 deletion(-)
-@@ -XXX,XX +XXX,XX @@ VCMLA_scalar   1111 1110 0 . rot:2 .... .... 1000 . q:1 index:1 0 vm:4 \
-                vn=%vn_dp vd=%vd_dp size=0
+diff --git a/target/arm/cpregs.h b/target/arm/cpregs.h
- VCMLA_scalar   1111 1110 1 . rot:2 .... .... 1000 . q:1 . 0 .... \
+index XXXXXXX..XXXXXXX 100644
-                vm=%vm_dp vn=%vn_dp vd=%vd_dp size=1 index=0
+--- a/target/arm/cpregs.h
-+
++++ b/target/arm/cpregs.h
-+VDOT_scalar    1111 1110 0 . 10 .... .... 1101 . q:1 index:1 u:1 rm:4 \
+@@ -XXX,XX +XXX,XX @@ FIELD(HDFGWTR_EL2, NBRBCTL, 60, 1)
-+               vm=%vm_dp vn=%vn_dp vd=%vd_dp
+ FIELD(HDFGWTR_EL2, NBRBDATA, 61, 1)
-diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
+ FIELD(HDFGWTR_EL2, NPMSNEVFR_EL1, 62, 1)
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-neon.inc.c
++/* Which fine-grained trap bit register to check, if any */
-+++ b/target/arm/translate-neon.inc.c
++FIELD(FGT, TYPE, 10, 3)
-@@ -XXX,XX +XXX,XX @@ static bool trans_VCMLA_scalar(DisasContext *s, arg_VCMLA_scalar *a)
++FIELD(FGT, REV, 9, 1) /* Is bit sense reversed? */
-     tcg_temp_free_ptr(fpst);
++FIELD(FGT, IDX, 6, 3) /* Index within a uint64_t[] array */
-     return true;
++FIELD(FGT, BITPOS, 0, 6) /* Bit position within the uint64_t */
 +
 +/*
 + * Macros to define FGT_##bitname enum constants to use in ARMCPRegInfo::fgt
 + * fields. We assume for brevity's sake that there are no duplicated
 + * bit names across the various FGT registers.
 + */
 +#define DO_BIT(REG, BITNAME)                                    \
 +    FGT_##BITNAME = FGT_##REG | R_##REG##_EL2_##BITNAME##_SHIFT
 +
 +/* Some bits have reversed sense, so 0 means trap and 1 means not */
 +#define DO_REV_BIT(REG, BITNAME)                                        \
 +    FGT_##BITNAME = FGT_##REG | FGT_REV | R_##REG##_EL2_##BITNAME##_SHIFT
 +
 +typedef enum FGTBit {
 +    /*
 +     * These bits tell us which register arrays to use:
 +     * if FGT_R is set then reads are checked against fgt_read[];
 +     * if FGT_W is set then writes are checked against fgt_write[];
 +     * if FGT_EXEC is set then all accesses are checked against fgt_exec[].
 +     *
 +     * For almost all bits in the R/W register pairs, the bit exists in
 +     * both registers for a RW register, in HFGRTR/HDFGRTR for a RO register
 +     * with the corresponding HFGWTR/HDFGWTR bit being RES0, and vice-versa
 +     * for a WO register. There are unfortunately a couple of exceptions
 +     * (PMCR_EL0, TRFCR_EL1) where the register being trapped is RW but
 +     * the FGT system only allows trapping of writes, not reads.
 +     *
 +     * Note that we arrange these bits so that a 0 FGTBit means "no trap".
 +     */
 +    FGT_R = 1 << R_FGT_TYPE_SHIFT,
 +    FGT_W = 2 << R_FGT_TYPE_SHIFT,
 +    FGT_EXEC = 4 << R_FGT_TYPE_SHIFT,
 +    FGT_RW = FGT_R | FGT_W,
 +    /* Bit to identify whether trap bit is reversed sense */
 +    FGT_REV = R_FGT_REV_MASK,
 +
 +    /*
 +     * If a bit exists in HFGRTR/HDFGRTR then either the register being
 +     * trapped is RO or the bit also exists in HFGWTR/HDFGWTR, so we either
 +     * want to trap for both reads and writes or else it's harmless to mark
 +     * it as trap-on-writes.
 +     * If a bit exists only in HFGWTR/HDFGWTR then either the register being
 +     * trapped is WO, or else it is one of the two oddball special cases
 +     * which are RW but have only a write trap. We mark these as only
 +     * FGT_W so we get the right behaviour for those special cases.
 +     * (If a bit was added in future that provided only a read trap for an
 +     * RW register we'd need to do something special to get the FGT_R bit
 +     * only. But this seems unlikely to happen.)
 +     *
 +     * So for the DO_BIT/DO_REV_BIT macros: use FGT_HFGRTR/FGT_HDFGRTR if
 +     * the bit exists in that register. Otherwise use FGT_HFGWTR/FGT_HDFGWTR.
 +     */
 +    FGT_HFGRTR = FGT_RW | (FGTREG_HFGRTR << R_FGT_IDX_SHIFT),
 +    FGT_HFGWTR = FGT_W | (FGTREG_HFGWTR << R_FGT_IDX_SHIFT),
 +    FGT_HDFGRTR = FGT_RW | (FGTREG_HDFGRTR << R_FGT_IDX_SHIFT),
 +    FGT_HDFGWTR = FGT_W | (FGTREG_HDFGWTR << R_FGT_IDX_SHIFT),
 +    FGT_HFGITR = FGT_EXEC | (FGTREG_HFGITR << R_FGT_IDX_SHIFT),
 +} FGTBit;
 +
 +#undef DO_BIT
 +#undef DO_REV_BIT
 +
  typedef struct ARMCPRegInfo ARMCPRegInfo;
  /*
@@ -XXX,XX +XXX,XX @@ struct ARMCPRegInfo {
      CPAccessRights access;
      /* Security state: ARM_CP_SECSTATE_* bits/values */
      CPSecureState secure;
 +    /*
 +     * Which fine-grained trap register bit to check, if any. This
 +     * value encodes both the trap register and bit within it.
 +     */
 +    FGTBit fgt;
      /*
       * The opaque pointer passed to define_arm_cp_regs_with_opaque() when
       * this register was defined: can be used to hand data through to the
 diff --git a/target/arm/cpu.h b/target/arm/cpu.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu.h
 +++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ FIELD(TBFLAG_ANY, FPEXC_EL, 8, 2)
  /* Memory operations require alignment: SCTLR_ELx.A or CCR.UNALIGN_TRP */
  FIELD(TBFLAG_ANY, ALIGN_MEM, 10, 1)
  FIELD(TBFLAG_ANY, PSTATE__IL, 11, 1)
 +FIELD(TBFLAG_ANY, FGT_ACTIVE, 12, 1)
  /*
   * Bit usage when in AArch32 state, both A- and M-profile.
 diff --git a/target/arm/internals.h b/target/arm/internals.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/internals.h
 +++ b/target/arm/internals.h
@@ -XXX,XX +XXX,XX @@ static inline uint64_t arm_mdcr_el2_eff(CPUARMState *env)
      ((1 << (1 - 1)) | (1 << (2 - 1)) |                  \
       (1 << (4 - 1)) | (1 << (8 - 1)) | (1 << (16 - 1)))
 +/*
 + * Return true if it is possible to take a fine-grained-trap to EL2.
 + */
 +static inline bool arm_fgt_active(CPUARMState *env, int el)
 +{
 +    /*
 +     * The Arm ARM only requires the "{E2H,TGE} != {1,1}" test for traps
 +     * that can affect EL0, but it is harmless to do the test also for
 +     * traps on registers that are only accessible at EL1 because if the test
 +     * returns true then we can't be executing at EL1 anyway.
 +     * FGT traps only happen when EL2 is enabled and EL1 is AArch64;
 +     * traps from AArch32 only happen when EL0 is AArch32.
 +     */
 +    return cpu_isar_feature(aa64_fgt, env_archcpu(env)) &&
 +        el < 2 && arm_is_el2_enabled(env) &&
 +        arm_el_is_aa64(env, 1) &&
 +        (arm_hcr_el2_eff(env) & (HCR_E2H | HCR_TGE)) != (HCR_E2H | HCR_TGE) &&
 +        (!arm_feature(env, ARM_FEATURE_EL3) || (env->cp15.scr_el3 & SCR_FGTEN));
 +}
 +
  #endif
 diff --git a/target/arm/translate.h b/target/arm/translate.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.h
 +++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ typedef struct DisasContext {
      bool is_nonstreaming;
      /* True if MVE insns are definitely not predicated by VPR or LTPSIZE */
      bool mve_no_pred;
 +    /* True if fine-grained traps are active */
 +    bool fgt_active;
      /*
       * >= 0, a copy of PSTATE.BTYPE, which will be 0 without v8.5-BTI.
       *  < 0, set by the current instruction.
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.c
 +++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static CPUARMTBFlags rebuild_hflags_common(CPUARMState *env, int fp_el,
      if (arm_singlestep_active(env)) {
          DP_TBFLAG_ANY(flags, SS_ACTIVE, 1);
      }
 +
      return flags;
  }
-+
-+static bool trans_VDOT_scalar(DisasContext *s, arg_VDOT_scalar *a)
+@@ -XXX,XX +XXX,XX @@ static CPUARMTBFlags rebuild_hflags_a32(CPUARMState *env, int fp_el,
-+{
+         DP_TBFLAG_A32(flags, HSTR_ACTIVE, 1);
-+    gen_helper_gvec_3 *fn_gvec;
+     }
-+    int opr_sz;
-+    TCGv_ptr fpst;
++    if (arm_fgt_active(env, el)) {
-+
++        DP_TBFLAG_ANY(flags, FGT_ACTIVE, 1);
 +    if (!dc_isar_feature(aa32_dp, s)) {
 +        return false;
 +    }
 +
-+    /* UNDEF accesses to D16-D31 if they don't exist. */
+     if (env->uncached_cpsr & CPSR_IL) {
-+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+         DP_TBFLAG_ANY(flags, PSTATE__IL, 1);
-+        ((a->vd | a->vn) & 0x10)) {
+     }
-+        return false;
+@@ -XXX,XX +XXX,XX @@ static CPUARMTBFlags rebuild_hflags_a64(CPUARMState *env, int el, int fp_el,
          DP_TBFLAG_ANY(flags, PSTATE__IL, 1);
      }
 +    if (arm_fgt_active(env, el)) {
 +        DP_TBFLAG_ANY(flags, FGT_ACTIVE, 1);
 +    }
 +
-+    if ((a->vd | a->vn) & a->q) {
+     if (cpu_isar_feature(aa64_mte, env_archcpu(env))) {
-+        return false;
+         /*
           * Set MTE_ACTIVE if any access may be Checked, and leave clear
 diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/op_helper.c
 +++ b/target/arm/op_helper.c
@@ -XXX,XX +XXX,XX @@ const void *HELPER(access_check_cp_reg)(CPUARMState *env, uint32_t key,
          }
      }
 +    /*
 +     * Fine-grained traps also are lower priority than undef-to-EL1,
 +     * higher priority than trap-to-EL3, and we don't care about priority
 +     * order with other EL2 traps because the syndrome value is the same.
 +     */
 +    if (arm_fgt_active(env, arm_current_el(env))) {
 +        uint64_t trapword = 0;
 +        unsigned int idx = FIELD_EX32(ri->fgt, FGT, IDX);
 +        unsigned int bitpos = FIELD_EX32(ri->fgt, FGT, BITPOS);
 +        bool rev = FIELD_EX32(ri->fgt, FGT, REV);
 +        bool trapbit;
 +
 +        if (ri->fgt & FGT_EXEC) {
 +            assert(idx < ARRAY_SIZE(env->cp15.fgt_exec));
 +            trapword = env->cp15.fgt_exec[idx];
 +        } else if (isread && (ri->fgt & FGT_R)) {
 +            assert(idx < ARRAY_SIZE(env->cp15.fgt_read));
 +            trapword = env->cp15.fgt_read[idx];
 +        } else if (!isread && (ri->fgt & FGT_W)) {
 +            assert(idx < ARRAY_SIZE(env->cp15.fgt_write));
 +            trapword = env->cp15.fgt_write[idx];
 +        }
 +
 +        trapbit = extract64(trapword, bitpos, 1);
 +        if (trapbit != rev) {
 +            res = CP_ACCESS_TRAP_EL2;
 +            goto fail;
 +        }
 +    }
 +
-+    if (!vfp_access_check(s)) {
+     if (likely(res == CP_ACCESS_OK)) {
-+        return true;
+         return ri;
-+    }
+     }
-+
+diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
-+    fn_gvec = a->u ? gen_helper_gvec_udot_idx_b : gen_helper_gvec_sdot_idx_b;
+index XXXXXXX..XXXXXXX 100644
-+    opr_sz = (1 + a->q) * 8;
+--- a/target/arm/translate-a64.c
-+    fpst = get_fpstatus_ptr(1);
++++ b/target/arm/translate-a64.c
-+    tcg_gen_gvec_3_ool(vfp_reg_offset(1, a->vd),
+@@ -XXX,XX +XXX,XX @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
-+                       vfp_reg_offset(1, a->vn),
+         return;
-+                       vfp_reg_offset(1, a->rm),
+     }
-+                       opr_sz, opr_sz, a->index, fn_gvec);
-+    tcg_temp_free_ptr(fpst);
+-    if (ri->accessfn) {
-+    return true;
++    if (ri->accessfn || (ri->fgt && s->fgt_active)) {
-+}
+         /* Emit code to perform further access permissions checks at
           * runtime; this may result in an exception.
           */
@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_init_disas_context(DisasContextBase *dcbase,
      dc->fp_excp_el = EX_TBFLAG_ANY(tb_flags, FPEXC_EL);
      dc->align_mem = EX_TBFLAG_ANY(tb_flags, ALIGN_MEM);
      dc->pstate_il = EX_TBFLAG_ANY(tb_flags, PSTATE__IL);
 +    dc->fgt_active = EX_TBFLAG_ANY(tb_flags, FGT_ACTIVE);
      dc->sve_excp_el = EX_TBFLAG_A64(tb_flags, SVEEXC_EL);
      dc->sme_excp_el = EX_TBFLAG_A64(tb_flags, SMEEXC_EL);
      dc->vl = (EX_TBFLAG_A64(tb_flags, VL) + 1) * 16;
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static void do_coproc_insn(DisasContext *s, int cpnum, int is64,
-     bool is_long = false, q = extract32(insn, 6, 1);
+     }
-     bool ptr_is_env = false;
+     if ((s->hstr_active && s->current_el == 0) || ri->accessfn ||
--    if ((insn & 0xffb00f00) == 0xfe200d00) {
++        (ri->fgt && s->fgt_active) ||
--        /* V[US]DOT -- 1111 1110 0.10 .... .... 1101 .Q.U .... */
+         (arm_dc_feature(s, ARM_FEATURE_XSCALE) && cpnum < 14)) {
--        int u = extract32(insn, 4, 1);
+         /*
--
+          * Emit code to perform further access permissions checks at
--        if (!dc_isar_feature(aa32_dp, s)) {
+@@ -XXX,XX +XXX,XX @@ static void arm_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
--            return 1;
+     dc->fp_excp_el = EX_TBFLAG_ANY(tb_flags, FPEXC_EL);
--        }
+     dc->align_mem = EX_TBFLAG_ANY(tb_flags, ALIGN_MEM);
--        fn_gvec = u ? gen_helper_gvec_udot_idx_b : gen_helper_gvec_sdot_idx_b;
+     dc->pstate_il = EX_TBFLAG_ANY(tb_flags, PSTATE__IL);
--        /* rm is just Vm, and index is M.  */
++    dc->fgt_active = EX_TBFLAG_ANY(tb_flags, FGT_ACTIVE);
--        data = extract32(insn, 5, 1); /* index */
--        rm = extract32(insn, 0, 4);
+     if (arm_feature(env, ARM_FEATURE_M)) {
--    } else if ((insn & 0xffa00f10) == 0xfe000810) {
+         dc->vfp_enabled = 1;
 +    if ((insn & 0xffa00f10) == 0xfe000810) {
          /* VFM[AS]L -- 1111 1110 0.0S .... .... 1000 .Q.1 .... */
          int is_s = extract32(insn, 20, 1);
          int vm20 = extract32(insn, 0, 3);
 --
-.20.1
+.34.1

-[PULL 36/39] target/arm: Convert Neon 3-reg-same comparisons to decodetree
+[PULL 20/33] target/arm: Mark up sysregs for HFGRTR bits 0..11
-Convert the Neon comparison ops in the 3-reg-same grouping
+Mark up the sysreg definitions for the registers trapped
-to decodetree.
+by HFGRTR/HFGWTR bits 0..11.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200430181003.21682-18-peter.maydell@linaro.org
+Tested-by: Fuad Tabba <tabba@google.com>
 Message-id: 20230130182459.3309057-11-peter.maydell@linaro.org
 Message-id: 20230127175507.2895013-11-peter.maydell@linaro.org
 ---
- target/arm/neon-dp.decode       |  8 ++++++++
+ target/arm/cpregs.h | 14 ++++++++++++++
- target/arm/translate-neon.inc.c | 22 ++++++++++++++++++++++
+ target/arm/helper.c | 17 +++++++++++++++++
- target/arm/translate.c          | 23 +++--------------------
+files changed, 31 insertions(+)
 files changed, 33 insertions(+), 20 deletions(-)
-diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
+diff --git a/target/arm/cpregs.h b/target/arm/cpregs.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/neon-dp.decode
+--- a/target/arm/cpregs.h
-+++ b/target/arm/neon-dp.decode
++++ b/target/arm/cpregs.h
-@@ -XXX,XX +XXX,XX @@ VBSL_3s          1111 001 1 0 . 01 .... .... 0001 ... 1 .... @3same_logic
+@@ -XXX,XX +XXX,XX @@ typedef enum FGTBit {
- VBIT_3s          1111 001 1 0 . 10 .... .... 0001 ... 1 .... @3same_logic
+     FGT_HDFGRTR = FGT_RW | (FGTREG_HDFGRTR << R_FGT_IDX_SHIFT),
- VBIF_3s          1111 001 1 0 . 11 .... .... 0001 ... 1 .... @3same_logic
+     FGT_HDFGWTR = FGT_W | (FGTREG_HDFGWTR << R_FGT_IDX_SHIFT),
+     FGT_HFGITR = FGT_EXEC | (FGTREG_HFGITR << R_FGT_IDX_SHIFT),
 +VCGT_S_3s        1111 001 0 0 . .. .... .... 0011 . . . 0 .... @3same
 +VCGT_U_3s        1111 001 1 0 . .. .... .... 0011 . . . 0 .... @3same
 +VCGE_S_3s        1111 001 0 0 . .. .... .... 0011 . . . 1 .... @3same
 +VCGE_U_3s        1111 001 1 0 . .. .... .... 0011 . . . 1 .... @3same
 +
- VMAX_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 0 .... @3same
++    /* Trap bits in HFGRTR_EL2 / HFGWTR_EL2, starting from bit 0. */
- VMAX_U_3s        1111 001 1 0 . .. .... .... 0110 . . . 0 .... @3same
++    DO_BIT(HFGRTR, AFSR0_EL1),
- VMIN_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 1 .... @3same
++    DO_BIT(HFGRTR, AFSR1_EL1),
-@@ -XXX,XX +XXX,XX @@ VMIN_U_3s        1111 001 1 0 . .. .... .... 0110 . . . 1 .... @3same
++    DO_BIT(HFGRTR, AIDR_EL1),
++    DO_BIT(HFGRTR, AMAIR_EL1),
- VADD_3s          1111 001 0 0 . .. .... .... 1000 . . . 0 .... @3same
++    DO_BIT(HFGRTR, APDAKEY),
- VSUB_3s          1111 001 1 0 . .. .... .... 1000 . . . 0 .... @3same
++    DO_BIT(HFGRTR, APDBKEY),
-+
++    DO_BIT(HFGRTR, APGAKEY),
-+VTST_3s          1111 001 0 0 . .. .... .... 1000 . . . 1 .... @3same
++    DO_BIT(HFGRTR, APIAKEY),
-+VCEQ_3s          1111 001 1 0 . .. .... .... 1000 . . . 1 .... @3same
++    DO_BIT(HFGRTR, APIBKEY),
-diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
++    DO_BIT(HFGRTR, CCSIDR_EL1),
 +    DO_BIT(HFGRTR, CLIDR_EL1),
 +    DO_BIT(HFGRTR, CONTEXTIDR_EL1),
  } FGTBit;
  #undef DO_BIT
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-neon.inc.c
+--- a/target/arm/helper.c
-+++ b/target/arm/translate-neon.inc.c
++++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ DO_3SAME_NO_SZ_3(VMAX_S, tcg_gen_gvec_smax)
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo cp_reginfo[] = {
- DO_3SAME_NO_SZ_3(VMAX_U, tcg_gen_gvec_umax)
+     { .name = "CONTEXTIDR_EL1", .state = ARM_CP_STATE_BOTH,
- DO_3SAME_NO_SZ_3(VMIN_S, tcg_gen_gvec_smin)
+       .opc0 = 3, .opc1 = 0, .crn = 13, .crm = 0, .opc2 = 1,
- DO_3SAME_NO_SZ_3(VMIN_U, tcg_gen_gvec_umin)
+       .access = PL1_RW, .accessfn = access_tvm_trvm,
-+
++      .fgt = FGT_CONTEXTIDR_EL1,
-+#define DO_3SAME_CMP(INSN, COND)                                        \
+       .secure = ARM_CP_SECSTATE_NS,
-+    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
+       .fieldoffset = offsetof(CPUARMState, cp15.contextidr_el[1]),
-+                                uint32_t rn_ofs, uint32_t rm_ofs,       \
+       .resetvalue = 0, .writefn = contextidr_write, .raw_writefn = raw_write, },
-+                                uint32_t oprsz, uint32_t maxsz)         \
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
-+    {                                                                   \
+       .opc0 = 3, .crn = 0, .crm = 0, .opc1 = 1, .opc2 = 0,
-+        tcg_gen_gvec_cmp(COND, vece, rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz); \
+       .access = PL1_R,
-+    }                                                                   \
+       .accessfn = access_tid4,
-+    DO_3SAME_NO_SZ_3(INSN, gen_##INSN##_3s)
++      .fgt = FGT_CCSIDR_EL1,
-+
+       .readfn = ccsidr_read, .type = ARM_CP_NO_RAW },
-+DO_3SAME_CMP(VCGT_S, TCG_COND_GT)
+     { .name = "CSSELR", .state = ARM_CP_STATE_BOTH,
-+DO_3SAME_CMP(VCGT_U, TCG_COND_GTU)
+       .opc0 = 3, .crn = 0, .crm = 0, .opc1 = 2, .opc2 = 0,
-+DO_3SAME_CMP(VCGE_S, TCG_COND_GE)
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
-+DO_3SAME_CMP(VCGE_U, TCG_COND_GEU)
+       .opc0 = 3, .opc1 = 1, .crn = 0, .crm = 0, .opc2 = 7,
-+DO_3SAME_CMP(VCEQ, TCG_COND_EQ)
+       .access = PL1_R, .type = ARM_CP_CONST,
-+
+       .accessfn = access_aa64_tid1,
-+static void gen_VTST_3s(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
++      .fgt = FGT_AIDR_EL1,
-+                         uint32_t rm_ofs, uint32_t oprsz, uint32_t maxsz)
+       .resetvalue = 0 },
-+{
+     /*
-+    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &cmtst_op[vece]);
+      * Auxiliary fault status registers: these also are IMPDEF, and we
-+}
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
-+DO_3SAME_NO_SZ_3(VTST, gen_VTST_3s)
+     { .name = "AFSR0_EL1", .state = ARM_CP_STATE_BOTH,
-diff --git a/target/arm/translate.c b/target/arm/translate.c
+       .opc0 = 3, .opc1 = 0, .crn = 5, .crm = 1, .opc2 = 0,
-index XXXXXXX..XXXXXXX 100644
+       .access = PL1_RW, .accessfn = access_tvm_trvm,
---- a/target/arm/translate.c
++      .fgt = FGT_AFSR0_EL1,
-+++ b/target/arm/translate.c
+       .type = ARM_CP_CONST, .resetvalue = 0 },
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+     { .name = "AFSR1_EL1", .state = ARM_CP_STATE_BOTH,
-                            u ? &mls_op[size] : &mla_op[size]);
+       .opc0 = 3, .opc1 = 0, .crn = 5, .crm = 1, .opc2 = 1,
-             return 0;
+       .access = PL1_RW, .accessfn = access_tvm_trvm,
++      .fgt = FGT_AFSR1_EL1,
--        case NEON_3R_VTST_VCEQ:
+       .type = ARM_CP_CONST, .resetvalue = 0 },
--            if (u) { /* VCEQ */
+     /*
--                tcg_gen_gvec_cmp(TCG_COND_EQ, size, rd_ofs, rn_ofs, rm_ofs,
+      * MAIR can just read-as-written because we don't implement caches
--                                 vec_size, vec_size);
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo lpae_cp_reginfo[] = {
--            } else { /* VTST */
+     { .name = "AMAIR0", .state = ARM_CP_STATE_BOTH,
--                tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,
+       .opc0 = 3, .crn = 10, .crm = 3, .opc1 = 0, .opc2 = 0,
--                               vec_size, vec_size, &cmtst_op[size]);
+       .access = PL1_RW, .accessfn = access_tvm_trvm,
--            }
++      .fgt = FGT_AMAIR_EL1,
--            return 0;
+       .type = ARM_CP_CONST, .resetvalue = 0 },
--
+     /* AMAIR1 is mapped to AMAIR_EL1[63:32] */
--        case NEON_3R_VCGT:
+     { .name = "AMAIR1", .cp = 15, .crn = 10, .crm = 3, .opc1 = 0, .opc2 = 1,
--            tcg_gen_gvec_cmp(u ? TCG_COND_GTU : TCG_COND_GT, size,
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo pauth_reginfo[] = {
--                             rd_ofs, rn_ofs, rm_ofs, vec_size, vec_size);
+     { .name = "APDAKEYLO_EL1", .state = ARM_CP_STATE_AA64,
--            return 0;
+       .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 2, .opc2 = 0,
--
+       .access = PL1_RW, .accessfn = access_pauth,
--        case NEON_3R_VCGE:
++      .fgt = FGT_APDAKEY,
--            tcg_gen_gvec_cmp(u ? TCG_COND_GEU : TCG_COND_GE, size,
+       .fieldoffset = offsetof(CPUARMState, keys.apda.lo) },
--                             rd_ofs, rn_ofs, rm_ofs, vec_size, vec_size);
+     { .name = "APDAKEYHI_EL1", .state = ARM_CP_STATE_AA64,
--            return 0;
+       .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 2, .opc2 = 1,
--
+       .access = PL1_RW, .accessfn = access_pauth,
-         case NEON_3R_VSHL:
++      .fgt = FGT_APDAKEY,
-             /* Note the operation is vshl vd,vm,vn */
+       .fieldoffset = offsetof(CPUARMState, keys.apda.hi) },
-             tcg_gen_gvec_3(rd_ofs, rm_ofs, rn_ofs, vec_size, vec_size,
+     { .name = "APDBKEYLO_EL1", .state = ARM_CP_STATE_AA64,
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+       .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 2, .opc2 = 2,
-         case NEON_3R_LOGIC:
+       .access = PL1_RW, .accessfn = access_pauth,
-         case NEON_3R_VMAX:
++      .fgt = FGT_APDBKEY,
-         case NEON_3R_VMIN:
+       .fieldoffset = offsetof(CPUARMState, keys.apdb.lo) },
-+        case NEON_3R_VTST_VCEQ:
+     { .name = "APDBKEYHI_EL1", .state = ARM_CP_STATE_AA64,
-+        case NEON_3R_VCGT:
+       .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 2, .opc2 = 3,
-+        case NEON_3R_VCGE:
+       .access = PL1_RW, .accessfn = access_pauth,
-             /* Already handled by decodetree */
++      .fgt = FGT_APDBKEY,
-             return 1;
+       .fieldoffset = offsetof(CPUARMState, keys.apdb.hi) },
-         }
+     { .name = "APGAKEYLO_EL1", .state = ARM_CP_STATE_AA64,
        .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 3, .opc2 = 0,
        .access = PL1_RW, .accessfn = access_pauth,
 +      .fgt = FGT_APGAKEY,
        .fieldoffset = offsetof(CPUARMState, keys.apga.lo) },
      { .name = "APGAKEYHI_EL1", .state = ARM_CP_STATE_AA64,
        .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 3, .opc2 = 1,
        .access = PL1_RW, .accessfn = access_pauth,
 +      .fgt = FGT_APGAKEY,
        .fieldoffset = offsetof(CPUARMState, keys.apga.hi) },
      { .name = "APIAKEYLO_EL1", .state = ARM_CP_STATE_AA64,
        .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 1, .opc2 = 0,
        .access = PL1_RW, .accessfn = access_pauth,
 +      .fgt = FGT_APIAKEY,
        .fieldoffset = offsetof(CPUARMState, keys.apia.lo) },
      { .name = "APIAKEYHI_EL1", .state = ARM_CP_STATE_AA64,
        .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 1, .opc2 = 1,
        .access = PL1_RW, .accessfn = access_pauth,
 +      .fgt = FGT_APIAKEY,
        .fieldoffset = offsetof(CPUARMState, keys.apia.hi) },
      { .name = "APIBKEYLO_EL1", .state = ARM_CP_STATE_AA64,
        .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 1, .opc2 = 2,
        .access = PL1_RW, .accessfn = access_pauth,
 +      .fgt = FGT_APIBKEY,
        .fieldoffset = offsetof(CPUARMState, keys.apib.lo) },
      { .name = "APIBKEYHI_EL1", .state = ARM_CP_STATE_AA64,
        .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 1, .opc2 = 3,
        .access = PL1_RW, .accessfn = access_pauth,
 +      .fgt = FGT_APIBKEY,
        .fieldoffset = offsetof(CPUARMState, keys.apib.hi) },
  };
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
              .opc0 = 3, .crn = 0, .crm = 0, .opc1 = 1, .opc2 = 1,
              .access = PL1_R, .type = ARM_CP_CONST,
              .accessfn = access_tid4,
 +            .fgt = FGT_CLIDR_EL1,
              .resetvalue = cpu->clidr
          };
          define_one_arm_cp_reg(cpu, &clidr);
 --
-.20.1
+.34.1

-[PULL 35/39] target/arm: Convert Neon 3-reg-same VMAX/VMIN to decodetree
+[PULL 21/33] target/arm: Mark up sysregs for HFGRTR bits 12..23
-Convert the Neon 3-reg-same VMAX and VMIN insns to decodetree.
+Mark up the sysreg definitions for the registers trapped
 by HFGRTR/HFGWTR bits 12..23.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200430181003.21682-17-peter.maydell@linaro.org
+Tested-by: Fuad Tabba <tabba@google.com>
 Message-id: 20230130182459.3309057-12-peter.maydell@linaro.org
 Message-id: 20230127175507.2895013-12-peter.maydell@linaro.org
 ---
- target/arm/neon-dp.decode       |  5 +++++
+ target/arm/cpregs.h | 12 ++++++++++++
- target/arm/translate-neon.inc.c | 14 ++++++++++++++
+ target/arm/helper.c | 12 ++++++++++++
- target/arm/translate.c          | 21 ++-------------------
+files changed, 24 insertions(+)
 files changed, 21 insertions(+), 19 deletions(-)
-diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
+diff --git a/target/arm/cpregs.h b/target/arm/cpregs.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/neon-dp.decode
+--- a/target/arm/cpregs.h
-+++ b/target/arm/neon-dp.decode
++++ b/target/arm/cpregs.h
-@@ -XXX,XX +XXX,XX @@ VBSL_3s          1111 001 1 0 . 01 .... .... 0001 ... 1 .... @3same_logic
+@@ -XXX,XX +XXX,XX @@ typedef enum FGTBit {
- VBIT_3s          1111 001 1 0 . 10 .... .... 0001 ... 1 .... @3same_logic
+     DO_BIT(HFGRTR, CCSIDR_EL1),
- VBIF_3s          1111 001 1 0 . 11 .... .... 0001 ... 1 .... @3same_logic
+     DO_BIT(HFGRTR, CLIDR_EL1),
+     DO_BIT(HFGRTR, CONTEXTIDR_EL1),
-+VMAX_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 0 .... @3same
++    DO_BIT(HFGRTR, CPACR_EL1),
-+VMAX_U_3s        1111 001 1 0 . .. .... .... 0110 . . . 0 .... @3same
++    DO_BIT(HFGRTR, CSSELR_EL1),
-+VMIN_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 1 .... @3same
++    DO_BIT(HFGRTR, CTR_EL0),
-+VMIN_U_3s        1111 001 1 0 . .. .... .... 0110 . . . 1 .... @3same
++    DO_BIT(HFGRTR, DCZID_EL0),
-+
++    DO_BIT(HFGRTR, ESR_EL1),
- VADD_3s          1111 001 0 0 . .. .... .... 1000 . . . 0 .... @3same
++    DO_BIT(HFGRTR, FAR_EL1),
- VSUB_3s          1111 001 1 0 . .. .... .... 1000 . . . 0 .... @3same
++    DO_BIT(HFGRTR, ISR_EL1),
-diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
++    DO_BIT(HFGRTR, LORC_EL1),
 +    DO_BIT(HFGRTR, LOREA_EL1),
 +    DO_BIT(HFGRTR, LORID_EL1),
 +    DO_BIT(HFGRTR, LORN_EL1),
 +    DO_BIT(HFGRTR, LORSA_EL1),
  } FGTBit;
  #undef DO_BIT
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-neon.inc.c
+--- a/target/arm/helper.c
-+++ b/target/arm/translate-neon.inc.c
++++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ DO_3SAME(VEOR, tcg_gen_gvec_xor)
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v6_cp_reginfo[] = {
- DO_3SAME_BITSEL(VBSL, rd_ofs, rn_ofs, rm_ofs)
+       .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0, },
- DO_3SAME_BITSEL(VBIT, rm_ofs, rn_ofs, rd_ofs)
+     { .name = "CPACR", .state = ARM_CP_STATE_BOTH, .opc0 = 3,
- DO_3SAME_BITSEL(VBIF, rm_ofs, rd_ofs, rn_ofs)
+       .crn = 1, .crm = 0, .opc1 = 0, .opc2 = 2, .accessfn = cpacr_access,
-+
++      .fgt = FGT_CPACR_EL1,
-+#define DO_3SAME_NO_SZ_3(INSN, FUNC)                                    \
+       .access = PL1_RW, .fieldoffset = offsetof(CPUARMState, cp15.cpacr_el1),
-+    static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a)        \
+       .resetfn = cpacr_reset, .writefn = cpacr_write, .readfn = cpacr_read },
-+    {                                                                   \
+ };
-+        if (a->size == 3) {                                             \
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
-+            return false;                                               \
+       .opc0 = 3, .crn = 0, .crm = 0, .opc1 = 2, .opc2 = 0,
-+        }                                                               \
+       .access = PL1_RW,
-+        return do_3same(s, a, FUNC);                                    \
+       .accessfn = access_tid4,
-+    }
++      .fgt = FGT_CSSELR_EL1,
-+
+       .writefn = csselr_write, .resetvalue = 0,
-+DO_3SAME_NO_SZ_3(VMAX_S, tcg_gen_gvec_smax)
+       .bank_fieldoffsets = { offsetof(CPUARMState, cp15.csselr_s),
-+DO_3SAME_NO_SZ_3(VMAX_U, tcg_gen_gvec_umax)
+                              offsetof(CPUARMState, cp15.csselr_ns) } },
-+DO_3SAME_NO_SZ_3(VMIN_S, tcg_gen_gvec_smin)
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
-+DO_3SAME_NO_SZ_3(VMIN_U, tcg_gen_gvec_umin)
+       .resetfn = arm_cp_reset_ignore },
-diff --git a/target/arm/translate.c b/target/arm/translate.c
+     { .name = "ISR_EL1", .state = ARM_CP_STATE_BOTH,
-index XXXXXXX..XXXXXXX 100644
+       .opc0 = 3, .opc1 = 0, .crn = 12, .crm = 1, .opc2 = 0,
---- a/target/arm/translate.c
++      .fgt = FGT_ISR_EL1,
-+++ b/target/arm/translate.c
+       .type = ARM_CP_NO_RAW, .access = PL1_R, .readfn = isr_read },
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+     /* 32 bit ITLB invalidates */
-                              rd_ofs, rn_ofs, rm_ofs, vec_size, vec_size);
+     { .name = "ITLBIALL", .cp = 15, .opc1 = 0, .crn = 8, .crm = 5, .opc2 = 0,
-             return 0;
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo vmsa_pmsa_cp_reginfo[] = {
+     { .name = "FAR_EL1", .state = ARM_CP_STATE_AA64,
--        case NEON_3R_VMAX:
+       .opc0 = 3, .crn = 6, .crm = 0, .opc1 = 0, .opc2 = 0,
--            if (u) {
+       .access = PL1_RW, .accessfn = access_tvm_trvm,
--                tcg_gen_gvec_umax(size, rd_ofs, rn_ofs, rm_ofs,
++      .fgt = FGT_FAR_EL1,
--                                  vec_size, vec_size);
+       .fieldoffset = offsetof(CPUARMState, cp15.far_el[1]),
--            } else {
+       .resetvalue = 0, },
--                tcg_gen_gvec_smax(size, rd_ofs, rn_ofs, rm_ofs,
+ };
--                                  vec_size, vec_size);
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo vmsa_cp_reginfo[] = {
--            }
+     { .name = "ESR_EL1", .state = ARM_CP_STATE_AA64,
--            return 0;
+       .opc0 = 3, .crn = 5, .crm = 2, .opc1 = 0, .opc2 = 0,
--        case NEON_3R_VMIN:
+       .access = PL1_RW, .accessfn = access_tvm_trvm,
--            if (u) {
++      .fgt = FGT_ESR_EL1,
--                tcg_gen_gvec_umin(size, rd_ofs, rn_ofs, rm_ofs,
+       .fieldoffset = offsetof(CPUARMState, cp15.esr_el[1]), .resetvalue = 0, },
--                                  vec_size, vec_size);
+     { .name = "TTBR0_EL1", .state = ARM_CP_STATE_BOTH,
--            } else {
+       .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 0, .opc2 = 0,
--                tcg_gen_gvec_smin(size, rd_ofs, rn_ofs, rm_ofs,
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
--                                  vec_size, vec_size);
+     { .name = "DCZID_EL0", .state = ARM_CP_STATE_AA64,
--            }
+       .opc0 = 3, .opc1 = 3, .opc2 = 7, .crn = 0, .crm = 0,
--            return 0;
+       .access = PL0_R, .type = ARM_CP_NO_RAW,
--
++      .fgt = FGT_DCZID_EL0,
-         case NEON_3R_VSHL:
+       .readfn = aa64_dczid_read },
-             /* Note the operation is vshl vd,vm,vn */
+     { .name = "DC_ZVA", .state = ARM_CP_STATE_AA64,
-             tcg_gen_gvec_3(rd_ofs, rm_ofs, rn_ofs, vec_size, vec_size,
+       .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 4, .opc2 = 1,
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo lor_reginfo[] = {
+     { .name = "LORSA_EL1", .state = ARM_CP_STATE_AA64,
-         case NEON_3R_VADD_VSUB:
+       .opc0 = 3, .opc1 = 0, .crn = 10, .crm = 4, .opc2 = 0,
-         case NEON_3R_LOGIC:
+       .access = PL1_RW, .accessfn = access_lor_other,
-+        case NEON_3R_VMAX:
++      .fgt = FGT_LORSA_EL1,
-+        case NEON_3R_VMIN:
+       .type = ARM_CP_CONST, .resetvalue = 0 },
-             /* Already handled by decodetree */
+     { .name = "LOREA_EL1", .state = ARM_CP_STATE_AA64,
-             return 1;
+       .opc0 = 3, .opc1 = 0, .crn = 10, .crm = 4, .opc2 = 1,
-         }
+       .access = PL1_RW, .accessfn = access_lor_other,
 +      .fgt = FGT_LOREA_EL1,
        .type = ARM_CP_CONST, .resetvalue = 0 },
      { .name = "LORN_EL1", .state = ARM_CP_STATE_AA64,
        .opc0 = 3, .opc1 = 0, .crn = 10, .crm = 4, .opc2 = 2,
        .access = PL1_RW, .accessfn = access_lor_other,
 +      .fgt = FGT_LORN_EL1,
        .type = ARM_CP_CONST, .resetvalue = 0 },
      { .name = "LORC_EL1", .state = ARM_CP_STATE_AA64,
        .opc0 = 3, .opc1 = 0, .crn = 10, .crm = 4, .opc2 = 3,
        .access = PL1_RW, .accessfn = access_lor_other,
 +      .fgt = FGT_LORC_EL1,
        .type = ARM_CP_CONST, .resetvalue = 0 },
      { .name = "LORID_EL1", .state = ARM_CP_STATE_AA64,
        .opc0 = 3, .opc1 = 0, .crn = 10, .crm = 4, .opc2 = 7,
        .access = PL1_R, .accessfn = access_lor_ns,
 +      .fgt = FGT_LORID_EL1,
        .type = ARM_CP_CONST, .resetvalue = 0 },
  };
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
              { .name = "CTR_EL0", .state = ARM_CP_STATE_AA64,
                .opc0 = 3, .opc1 = 3, .opc2 = 1, .crn = 0, .crm = 0,
                .access = PL0_R, .accessfn = ctr_el0_access,
 +              .fgt = FGT_CTR_EL0,
                .type = ARM_CP_CONST, .resetvalue = cpu->ctr },
              /* TCMTR and TLBTR exist in v8 but have no 64-bit versions */
              { .name = "TCMTR",
 --
-.20.1
+.34.1

-[PULL 34/39] target/arm: Convert Neon 3-reg-same logic ops to decodetree
+[PULL 22/33] target/arm: Mark up sysregs for HFGRTR bits 24..35
-Convert the Neon logic ops in the 3-reg-same grouping to decodetree.
+Mark up the sysreg definitions for the registers trapped
-Note that for the logic ops the 'size' field forms part of their
+by HFGRTR/HFGWTR bits 24..35.
 decode and the actual operations are always bitwise.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200430181003.21682-16-peter.maydell@linaro.org
+Tested-by: Fuad Tabba <tabba@google.com>
 Message-id: 20230130182459.3309057-13-peter.maydell@linaro.org
 Message-id: 20230127175507.2895013-13-peter.maydell@linaro.org
 ---
- target/arm/neon-dp.decode       | 12 +++++++++++
+ target/arm/cpregs.h | 12 ++++++++++++
- target/arm/translate-neon.inc.c | 19 +++++++++++++++++
+ target/arm/helper.c | 14 ++++++++++++++
- target/arm/translate.c          | 38 +--------------------------------
+files changed, 26 insertions(+)
 files changed, 32 insertions(+), 37 deletions(-)
-diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
+diff --git a/target/arm/cpregs.h b/target/arm/cpregs.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/neon-dp.decode
+--- a/target/arm/cpregs.h
-+++ b/target/arm/neon-dp.decode
++++ b/target/arm/cpregs.h
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ typedef enum FGTBit {
- @3same           .... ... . . . size:2 .... .... .... . q:1 . . .... \
+     DO_BIT(HFGRTR, LORID_EL1),
-                  &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp
+     DO_BIT(HFGRTR, LORN_EL1),
+     DO_BIT(HFGRTR, LORSA_EL1),
-+@3same_logic     .... ... . . . .. .... .... .... . q:1 .. .... \
++    DO_BIT(HFGRTR, MAIR_EL1),
-+                 &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp size=0
++    DO_BIT(HFGRTR, MIDR_EL1),
-+
++    DO_BIT(HFGRTR, MPIDR_EL1),
-+VAND_3s          1111 001 0 0 . 00 .... .... 0001 ... 1 .... @3same_logic
++    DO_BIT(HFGRTR, PAR_EL1),
-+VBIC_3s          1111 001 0 0 . 01 .... .... 0001 ... 1 .... @3same_logic
++    DO_BIT(HFGRTR, REVIDR_EL1),
-+VORR_3s          1111 001 0 0 . 10 .... .... 0001 ... 1 .... @3same_logic
++    DO_BIT(HFGRTR, SCTLR_EL1),
-+VORN_3s          1111 001 0 0 . 11 .... .... 0001 ... 1 .... @3same_logic
++    DO_BIT(HFGRTR, SCXTNUM_EL1),
-+VEOR_3s          1111 001 1 0 . 00 .... .... 0001 ... 1 .... @3same_logic
++    DO_BIT(HFGRTR, SCXTNUM_EL0),
-+VBSL_3s          1111 001 1 0 . 01 .... .... 0001 ... 1 .... @3same_logic
++    DO_BIT(HFGRTR, TCR_EL1),
-+VBIT_3s          1111 001 1 0 . 10 .... .... 0001 ... 1 .... @3same_logic
++    DO_BIT(HFGRTR, TPIDR_EL1),
-+VBIF_3s          1111 001 1 0 . 11 .... .... 0001 ... 1 .... @3same_logic
++    DO_BIT(HFGRTR, TPIDRRO_EL0),
-+
++    DO_BIT(HFGRTR, TPIDR_EL0),
- VADD_3s          1111 001 0 0 . .. .... .... 1000 . . . 0 .... @3same
+ } FGTBit;
- VSUB_3s          1111 001 1 0 . .. .... .... 1000 . . . 0 .... @3same
-diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
+ #undef DO_BIT
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-neon.inc.c
+--- a/target/arm/helper.c
-+++ b/target/arm/translate-neon.inc.c
++++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ static bool do_3same(DisasContext *s, arg_3same *a, GVecGen3Fn fn)
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
+     { .name = "MAIR_EL1", .state = ARM_CP_STATE_AA64,
- DO_3SAME(VADD, tcg_gen_gvec_add)
+       .opc0 = 3, .opc1 = 0, .crn = 10, .crm = 2, .opc2 = 0,
- DO_3SAME(VSUB, tcg_gen_gvec_sub)
+       .access = PL1_RW, .accessfn = access_tvm_trvm,
-+DO_3SAME(VAND, tcg_gen_gvec_and)
++      .fgt = FGT_MAIR_EL1,
-+DO_3SAME(VBIC, tcg_gen_gvec_andc)
+       .fieldoffset = offsetof(CPUARMState, cp15.mair_el[1]),
-+DO_3SAME(VORR, tcg_gen_gvec_or)
+       .resetvalue = 0 },
-+DO_3SAME(VORN, tcg_gen_gvec_orc)
+     { .name = "MAIR_EL3", .state = ARM_CP_STATE_AA64,
-+DO_3SAME(VEOR, tcg_gen_gvec_xor)
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v6k_cp_reginfo[] = {
-+
+     { .name = "TPIDR_EL0", .state = ARM_CP_STATE_AA64,
-+/* These insns are all gvec_bitsel but with the inputs in various orders. */
+       .opc0 = 3, .opc1 = 3, .opc2 = 2, .crn = 13, .crm = 0,
-+#define DO_3SAME_BITSEL(INSN, O1, O2, O3)                               \
+       .access = PL0_RW,
-+    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
++      .fgt = FGT_TPIDR_EL0,
-+                                uint32_t rn_ofs, uint32_t rm_ofs,       \
+       .fieldoffset = offsetof(CPUARMState, cp15.tpidr_el[0]), .resetvalue = 0 },
-+                                uint32_t oprsz, uint32_t maxsz)         \
+     { .name = "TPIDRURW", .cp = 15, .crn = 13, .crm = 0, .opc1 = 0, .opc2 = 2,
-+    {                                                                   \
+       .access = PL0_RW,
-+        tcg_gen_gvec_bitsel(vece, rd_ofs, O1, O2, O3, oprsz, maxsz);    \
++      .fgt = FGT_TPIDR_EL0,
-+    }                                                                   \
+       .bank_fieldoffsets = { offsetoflow32(CPUARMState, cp15.tpidrurw_s),
-+    DO_3SAME(INSN, gen_##INSN##_3s)
+                              offsetoflow32(CPUARMState, cp15.tpidrurw_ns) },
-+
+       .resetfn = arm_cp_reset_ignore },
-+DO_3SAME_BITSEL(VBSL, rd_ofs, rn_ofs, rm_ofs)
+     { .name = "TPIDRRO_EL0", .state = ARM_CP_STATE_AA64,
-+DO_3SAME_BITSEL(VBIT, rm_ofs, rn_ofs, rd_ofs)
+       .opc0 = 3, .opc1 = 3, .opc2 = 3, .crn = 13, .crm = 0,
-+DO_3SAME_BITSEL(VBIF, rm_ofs, rd_ofs, rn_ofs)
+       .access = PL0_R | PL1_W,
-diff --git a/target/arm/translate.c b/target/arm/translate.c
++      .fgt = FGT_TPIDRRO_EL0,
-index XXXXXXX..XXXXXXX 100644
+       .fieldoffset = offsetof(CPUARMState, cp15.tpidrro_el[0]),
---- a/target/arm/translate.c
+       .resetvalue = 0},
-+++ b/target/arm/translate.c
+     { .name = "TPIDRURO", .cp = 15, .crn = 13, .crm = 0, .opc1 = 0, .opc2 = 3,
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+       .access = PL0_R | PL1_W,
-             }
++      .fgt = FGT_TPIDRRO_EL0,
-             return 1;
+       .bank_fieldoffsets = { offsetoflow32(CPUARMState, cp15.tpidruro_s),
+                              offsetoflow32(CPUARMState, cp15.tpidruro_ns) },
--        case NEON_3R_LOGIC: /* Logic ops.  */
+       .resetfn = arm_cp_reset_ignore },
--            switch ((u << 2) | size) {
+     { .name = "TPIDR_EL1", .state = ARM_CP_STATE_AA64,
--            case 0: /* VAND */
+       .opc0 = 3, .opc1 = 0, .opc2 = 4, .crn = 13, .crm = 0,
--                tcg_gen_gvec_and(0, rd_ofs, rn_ofs, rm_ofs,
+       .access = PL1_RW,
--                                 vec_size, vec_size);
++      .fgt = FGT_TPIDR_EL1,
--                break;
+       .fieldoffset = offsetof(CPUARMState, cp15.tpidr_el[1]), .resetvalue = 0 },
--            case 1: /* VBIC */
+     { .name = "TPIDRPRW", .opc1 = 0, .cp = 15, .crn = 13, .crm = 0, .opc2 = 4,
--                tcg_gen_gvec_andc(0, rd_ofs, rn_ofs, rm_ofs,
+       .access = PL1_RW,
--                                  vec_size, vec_size);
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo vmsa_cp_reginfo[] = {
--                break;
+     { .name = "TCR_EL1", .state = ARM_CP_STATE_AA64,
--            case 2: /* VORR */
+       .opc0 = 3, .crn = 2, .crm = 0, .opc1 = 0, .opc2 = 2,
--                tcg_gen_gvec_or(0, rd_ofs, rn_ofs, rm_ofs,
+       .access = PL1_RW, .accessfn = access_tvm_trvm,
--                                vec_size, vec_size);
++      .fgt = FGT_TCR_EL1,
--                break;
+       .writefn = vmsa_tcr_el12_write,
--            case 3: /* VORN */
+       .raw_writefn = raw_write,
--                tcg_gen_gvec_orc(0, rd_ofs, rn_ofs, rm_ofs,
+       .resetvalue = 0,
--                                 vec_size, vec_size);
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
--                break;
+       .type = ARM_CP_ALIAS,
--            case 4: /* VEOR */
+       .opc0 = 3, .opc1 = 0, .crn = 7, .crm = 4, .opc2 = 0,
--                tcg_gen_gvec_xor(0, rd_ofs, rn_ofs, rm_ofs,
+       .access = PL1_RW, .resetvalue = 0,
--                                 vec_size, vec_size);
++      .fgt = FGT_PAR_EL1,
--                break;
+       .fieldoffset = offsetof(CPUARMState, cp15.par_el[1]),
--            case 5: /* VBSL */
+       .writefn = par_write },
--                tcg_gen_gvec_bitsel(MO_8, rd_ofs, rd_ofs, rn_ofs, rm_ofs,
+ #endif
--                                    vec_size, vec_size);
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo scxtnum_reginfo[] = {
--                break;
+     { .name = "SCXTNUM_EL0", .state = ARM_CP_STATE_AA64,
--            case 6: /* VBIT */
+       .opc0 = 3, .opc1 = 3, .crn = 13, .crm = 0, .opc2 = 7,
--                tcg_gen_gvec_bitsel(MO_8, rd_ofs, rm_ofs, rn_ofs, rd_ofs,
+       .access = PL0_RW, .accessfn = access_scxtnum,
--                                    vec_size, vec_size);
++      .fgt = FGT_SCXTNUM_EL0,
--                break;
+       .fieldoffset = offsetof(CPUARMState, scxtnum_el[0]) },
--            case 7: /* VBIF */
+     { .name = "SCXTNUM_EL1", .state = ARM_CP_STATE_AA64,
--                tcg_gen_gvec_bitsel(MO_8, rd_ofs, rm_ofs, rd_ofs, rn_ofs,
+       .opc0 = 3, .opc1 = 0, .crn = 13, .crm = 0, .opc2 = 7,
--                                    vec_size, vec_size);
+       .access = PL1_RW, .accessfn = access_scxtnum,
--                break;
++      .fgt = FGT_SCXTNUM_EL1,
--            }
+       .fieldoffset = offsetof(CPUARMState, scxtnum_el[1]) },
--            return 0;
+     { .name = "SCXTNUM_EL2", .state = ARM_CP_STATE_AA64,
--
+       .opc0 = 3, .opc1 = 4, .crn = 13, .crm = 0, .opc2 = 7,
-         case NEON_3R_VQADD:
+@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
-             tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
+             { .name = "MIDR_EL1", .state = ARM_CP_STATE_BOTH,
-                            rn_ofs, rm_ofs, vec_size, vec_size,
+               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 0, .opc2 = 0,
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+               .access = PL1_R, .type = ARM_CP_NO_RAW, .resetvalue = cpu->midr,
-             return 0;
++              .fgt = FGT_MIDR_EL1,
+               .fieldoffset = offsetof(CPUARMState, cp15.c0_cpuid),
-         case NEON_3R_VADD_VSUB:
+               .readfn = midr_read },
-+        case NEON_3R_LOGIC:
+             /* crn = 0 op1 = 0 crm = 0 op2 = 7 : AArch32 aliases of MIDR */
-             /* Already handled by decodetree */
+@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
-             return 1;
+               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 0, .opc2 = 6,
-         }
+               .access = PL1_R,
                .accessfn = access_aa64_tid1,
 +              .fgt = FGT_REVIDR_EL1,
                .type = ARM_CP_CONST, .resetvalue = cpu->revidr },
          };
          ARMCPRegInfo id_v8_midr_alias_cp_reginfo = {
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
          ARMCPRegInfo mpidr_cp_reginfo[] = {
              { .name = "MPIDR_EL1", .state = ARM_CP_STATE_BOTH,
                .opc0 = 3, .crn = 0, .crm = 0, .opc1 = 0, .opc2 = 5,
 +              .fgt = FGT_MPIDR_EL1,
                .access = PL1_R, .readfn = mpidr_read, .type = ARM_CP_NO_RAW },
          };
  #ifdef CONFIG_USER_ONLY
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
              .name = "SCTLR", .state = ARM_CP_STATE_BOTH,
              .opc0 = 3, .opc1 = 0, .crn = 1, .crm = 0, .opc2 = 0,
              .access = PL1_RW, .accessfn = access_tvm_trvm,
 +            .fgt = FGT_SCTLR_EL1,
              .bank_fieldoffsets = { offsetof(CPUARMState, cp15.sctlr_s),
                                     offsetof(CPUARMState, cp15.sctlr_ns) },
              .writefn = sctlr_write, .resetvalue = cpu->reset_sctlr,
 --
-.20.1
+.34.1
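
The DO_BIT lines in the cpregs.h hunks above add one FGTBit enumerator per trappable register, and each `.fgt = FGT_...` line in helper.c attaches that enumerator to a register definition. The following is a minimal sketch of how such a macro could pack a trap-register index and a bit position into one enum value. The field layout, the IDX_* and BIT_* constants, and the helpers fgt_reg_index(), fgt_bit_pos() and fgt_bit_set() are illustrative assumptions, not QEMU's actual encoding. The bit numbers 24 and 35 are taken from the "bits 24..35" range in the patch title.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: bits [5:0] hold the bit position within the trap
 * register, the bits above hold an index selecting the trap register. */
enum { FGT_IDX_SHIFT = 6, FGT_BITPOS_MASK = 0x3f, NUM_FGT_REGS = 2 };

/* Illustrative trap-register indices (assumed, not from the patch). */
enum { IDX_HFGRTR = 0, IDX_HDFGRTR = 1 };

/* Bit positions; MAIR_EL1 and TPIDR_EL0 are the first and last entries of
 * the 24..35 range, DBGBCRN_EL1 is the first HDFGRTR entry (bit 0). */
enum {
    BIT_HFGRTR_MAIR_EL1 = 24,
    BIT_HFGRTR_TPIDR_EL0 = 35,
    BIT_HDFGRTR_DBGBCRN_EL1 = 0,
};

#define DO_BIT(REG, NAME) \
    FGT_##NAME = (IDX_##REG << FGT_IDX_SHIFT) | BIT_##REG##_##NAME

typedef enum FGTBit {
    DO_BIT(HFGRTR, MAIR_EL1),
    DO_BIT(HFGRTR, TPIDR_EL0),
    DO_BIT(HDFGRTR, DBGBCRN_EL1),
} FGTBit;
#undef DO_BIT

/* Which trap register the enumerator refers to. */
static inline unsigned fgt_reg_index(FGTBit b)
{
    return (unsigned)b >> FGT_IDX_SHIFT;
}

/* Which bit within that trap register. */
static inline unsigned fgt_bit_pos(FGTBit b)
{
    return (unsigned)b & FGT_BITPOS_MASK;
}

/* Returns 1 if the trap bit selected by b is set in trap_regs[], else 0. */
static inline int fgt_bit_set(const uint64_t *trap_regs, FGTBit b)
{
    unsigned idx = fgt_reg_index(b);
    assert(idx < NUM_FGT_REGS);
    return (int)((trap_regs[idx] >> fgt_bit_pos(b)) & 1);
}
```

With an encoding of this kind, a register definition only needs to carry a single small integer, and the access-check code can find the right trap register and bit without a per-register switch.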

-[PULL 26/39] target/arm: Convert VFM[AS]L (vector) to decodetree
+[PULL 23/33] target/arm: Mark up sysregs for HFGRTR bits 36..63
-Convert the VFM[AS]L (vector) insns to decodetree.  This is the last
+Mark up the sysreg definitions for the registers trapped
-insn in the legacy decoder for the 3same_ext group, so we can
+by HFGRTR/HFGWTR bits 36..63.
 delete the legacy decoder function for the group entirely.
-Note that in disas_thumb2_insn() the parts of this encoding space
+Of these, some correspond to RAS registers which we implement as
-where the decodetree decoder returns false will correctly be directed
+always-UNDEF: these don't need any extra handling for FGT because the
-to illegal_op by the "(insn & (1 << 28))" check so they won't fall
+UNDEF-to-EL1 always takes priority over any theoretical
-into disas_coproc_insn() by mistake.
+FGT-trap-to-EL2.
 Bit 50 (NACCDATA_EL1) is for the ACCDATA_EL1 register which is part
 of the FEAT_LS64_ACCDATA feature which we don't yet implement.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200430181003.21682-8-peter.maydell@linaro.org
+Tested-by: Fuad Tabba <tabba@google.com>
 Message-id: 20230130182459.3309057-14-peter.maydell@linaro.org
 Message-id: 20230127175507.2895013-14-peter.maydell@linaro.org
 ---
- target/arm/neon-shared.decode   |  6 +++
+ target/arm/cpregs.h       |  7 +++++++
- target/arm/translate-neon.inc.c | 31 +++++++++++
+ hw/intc/arm_gicv3_cpuif.c |  2 ++
- target/arm/translate.c          | 92 +--------------------------------
+ target/arm/helper.c       | 10 ++++++++++
-files changed, 38 insertions(+), 91 deletions(-)
+files changed, 19 insertions(+)
-diff --git a/target/arm/neon-shared.decode b/target/arm/neon-shared.decode
+diff --git a/target/arm/cpregs.h b/target/arm/cpregs.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/neon-shared.decode
+--- a/target/arm/cpregs.h
-+++ b/target/arm/neon-shared.decode
++++ b/target/arm/cpregs.h
-@@ -XXX,XX +XXX,XX @@ VCADD          1111 110 rot:1 1 . 0 size:1 .... .... 1000 . q:1 . 0 .... \
+@@ -XXX,XX +XXX,XX @@ typedef enum FGTBit {
- # VUDOT and VSDOT
+     DO_BIT(HFGRTR, TPIDR_EL1),
- VDOT           1111 110 00 . 10 .... .... 1101 . q:1 . u:1 .... \
+     DO_BIT(HFGRTR, TPIDRRO_EL0),
-                vm=%vm_dp vn=%vn_dp vd=%vd_dp
+     DO_BIT(HFGRTR, TPIDR_EL0),
-+
++    DO_BIT(HFGRTR, TTBR0_EL1),
-+# VFM[AS]L
++    DO_BIT(HFGRTR, TTBR1_EL1),
-+VFML           1111 110 0 s:1 . 10 .... .... 1000 . 0 . 1 .... \
++    DO_BIT(HFGRTR, VBAR_EL1),
-+               vm=%vm_sp vn=%vn_sp vd=%vd_dp q=0
++    DO_BIT(HFGRTR, ICC_IGRPENN_EL1),
-+VFML           1111 110 0 s:1 . 10 .... .... 1000 . 1 . 1 .... \
++    DO_BIT(HFGRTR, ERRIDR_EL1),
-+               vm=%vm_dp vn=%vn_dp vd=%vd_dp q=1
++    DO_REV_BIT(HFGRTR, NSMPRI_EL1),
-diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
++    DO_REV_BIT(HFGRTR, NTPIDR2_EL0),
  } FGTBit;
  #undef DO_BIT
 diff --git a/hw/intc/arm_gicv3_cpuif.c b/hw/intc/arm_gicv3_cpuif.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-neon.inc.c
+--- a/hw/intc/arm_gicv3_cpuif.c
-+++ b/target/arm/translate-neon.inc.c
++++ b/hw/intc/arm_gicv3_cpuif.c
-@@ -XXX,XX +XXX,XX @@ static bool trans_VDOT(DisasContext *s, arg_VDOT *a)
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo gicv3_cpuif_reginfo[] = {
-                        opr_sz, opr_sz, 0, fn_gvec);
+       .opc0 = 3, .opc1 = 0, .crn = 12, .crm = 12, .opc2 = 6,
-     return true;
+       .type = ARM_CP_IO | ARM_CP_NO_RAW,
- }
+       .access = PL1_RW, .accessfn = gicv3_fiq_access,
-+
++      .fgt = FGT_ICC_IGRPENN_EL1,
-+static bool trans_VFML(DisasContext *s, arg_VFML *a)
+       .readfn = icc_igrpen_read,
-+{
+       .writefn = icc_igrpen_write,
-+    int opr_sz;
+     },
-+
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo gicv3_cpuif_reginfo[] = {
-+    if (!dc_isar_feature(aa32_fhm, s)) {
+       .opc0 = 3, .opc1 = 0, .crn = 12, .crm = 12, .opc2 = 7,
-+        return false;
+       .type = ARM_CP_IO | ARM_CP_NO_RAW,
-+    }
+       .access = PL1_RW, .accessfn = gicv3_irq_access,
-+
++      .fgt = FGT_ICC_IGRPENN_EL1,
-+    /* UNDEF accesses to D16-D31 if they don't exist. */
+       .readfn = icc_igrpen_read,
-+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+       .writefn = icc_igrpen_write,
-+        (a->vd & 0x10)) {
+     },
-+        return false;
+diff --git a/target/arm/helper.c b/target/arm/helper.c
 +    }
 +
 +    if (a->vd & a->q) {
 +        return false;
 +    }
 +
 +    if (!vfp_access_check(s)) {
 +        return true;
 +    }
 +
 +    opr_sz = (1 + a->q) * 8;
 +    tcg_gen_gvec_3_ptr(vfp_reg_offset(1, a->vd),
 +                       vfp_reg_offset(a->q, a->vn),
 +                       vfp_reg_offset(a->q, a->vm),
 +                       cpu_env, opr_sz, opr_sz, a->s, /* is_2 == 0 */
 +                       gen_helper_gvec_fmlal_a32);
 +    return true;
 +}
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.c
+--- a/target/arm/helper.c
-+++ b/target/arm/translate.c
++++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo vmsa_cp_reginfo[] = {
-     return 0;
+     { .name = "TTBR0_EL1", .state = ARM_CP_STATE_BOTH,
- }
+       .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 0, .opc2 = 0,
+       .access = PL1_RW, .accessfn = access_tvm_trvm,
--/* Advanced SIMD three registers of the same length extension.
++      .fgt = FGT_TTBR0_EL1,
-- *  31           25    23  22    20   16   12  11   10   9    8        3     0
+       .writefn = vmsa_ttbr_write, .resetvalue = 0,
-- * +---------------+-----+---+-----+----+----+---+----+---+----+---------+----+
+       .bank_fieldoffsets = { offsetof(CPUARMState, cp15.ttbr0_s),
-- * | 1 1 1 1 1 1 0 | op1 | D | op2 | Vn | Vd | 1 | o3 | 0 | o4 | N Q M U | Vm |
+                              offsetof(CPUARMState, cp15.ttbr0_ns) } },
-- * +---------------+-----+---+-----+----+----+---+----+---+----+---------+----+
+     { .name = "TTBR1_EL1", .state = ARM_CP_STATE_BOTH,
-- */
+       .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 0, .opc2 = 1,
--static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
+       .access = PL1_RW, .accessfn = access_tvm_trvm,
--{
++      .fgt = FGT_TTBR1_EL1,
--    gen_helper_gvec_3 *fn_gvec = NULL;
+       .writefn = vmsa_ttbr_write, .resetvalue = 0,
--    gen_helper_gvec_3_ptr *fn_gvec_ptr = NULL;
+       .bank_fieldoffsets = { offsetof(CPUARMState, cp15.ttbr1_s),
--    int rd, rn, rm, opr_sz;
+                              offsetof(CPUARMState, cp15.ttbr1_ns) } },
--    int data = 0;
+@@ -XXX,XX +XXX,XX @@ static void disr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t val)
--    int off_rn, off_rm;
+  *   ERRSELR_EL1
--    bool is_long = false, q = extract32(insn, 6, 1);
+  * may generate UNDEFINED, which is the effect we get by not
--    bool ptr_is_env = false;
+  * listing them at all.
--
++ *
--    if ((insn & 0xff300f10) == 0xfc200810) {
++ * These registers have fine-grained trap bits, but UNDEF-to-EL1
--        /* VFM[AS]L -- 1111 1100 S.10 .... .... 1000 .Q.1 .... */
++ * is higher priority than FGT-to-EL2 so we do not need to list them
--        int is_s = extract32(insn, 23, 1);
++ * in order to check for an FGT.
--        if (!dc_isar_feature(aa32_fhm, s)) {
+  */
--            return 1;
+ static const ARMCPRegInfo minimal_ras_reginfo[] = {
--        }
+     { .name = "DISR_EL1", .state = ARM_CP_STATE_BOTH,
--        is_long = true;
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo minimal_ras_reginfo[] = {
--        data = is_s; /* is_2 == 0 */
+     { .name = "ERRIDR_EL1", .state = ARM_CP_STATE_BOTH,
--        fn_gvec_ptr = gen_helper_gvec_fmlal_a32;
+       .opc0 = 3, .opc1 = 0, .crn = 5, .crm = 3, .opc2 = 0,
--        ptr_is_env = true;
+       .access = PL1_R, .accessfn = access_terr,
--    } else {
++      .fgt = FGT_ERRIDR_EL1,
--        return 1;
+       .type = ARM_CP_CONST, .resetvalue = 0 },
--    }
+     { .name = "VDISR_EL2", .state = ARM_CP_STATE_BOTH,
--
+       .opc0 = 3, .opc1 = 4, .crn = 12, .crm = 1, .opc2 = 1,
--    VFP_DREG_D(rd, insn);
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo sme_reginfo[] = {
--    if (rd & q) {
+     { .name = "TPIDR2_EL0", .state = ARM_CP_STATE_AA64,
--        return 1;
+       .opc0 = 3, .opc1 = 3, .crn = 13, .crm = 0, .opc2 = 5,
--    }
+       .access = PL0_RW, .accessfn = access_tpidr2,
--    if (q || !is_long) {
++      .fgt = FGT_NTPIDR2_EL0,
--        VFP_DREG_N(rn, insn);
+       .fieldoffset = offsetof(CPUARMState, cp15.tpidr2_el0) },
--        VFP_DREG_M(rm, insn);
+     { .name = "SVCR", .state = ARM_CP_STATE_AA64,
--        if ((rn | rm) & q & !is_long) {
+       .opc0 = 3, .opc1 = 3, .crn = 4, .crm = 2, .opc2 = 2,
--            return 1;
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo sme_reginfo[] = {
--        }
+     { .name = "SMPRI_EL1", .state = ARM_CP_STATE_AA64,
--        off_rn = vfp_reg_offset(1, rn);
+       .opc0 = 3, .opc1 = 0, .crn = 1, .crm = 2, .opc2 = 4,
--        off_rm = vfp_reg_offset(1, rm);
+       .access = PL1_RW, .accessfn = access_esm,
--    } else {
++      .fgt = FGT_NSMPRI_EL1,
--        rn = VFP_SREG_N(insn);
+       .type = ARM_CP_CONST, .resetvalue = 0 },
--        rm = VFP_SREG_M(insn);
+     { .name = "SMPRIMAP_EL2", .state = ARM_CP_STATE_AA64,
--        off_rn = vfp_reg_offset(0, rn);
+       .opc0 = 3, .opc1 = 4, .crn = 1, .crm = 2, .opc2 = 5,
--        off_rm = vfp_reg_offset(0, rm);
+@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
--    }
+             { .name = "VBAR", .state = ARM_CP_STATE_BOTH,
--
+               .opc0 = 3, .crn = 12, .crm = 0, .opc1 = 0, .opc2 = 0,
--    if (s->fp_excp_el) {
+               .access = PL1_RW, .writefn = vbar_write,
--        gen_exception_insn(s, s->pc_curr, EXCP_UDEF,
++              .fgt = FGT_VBAR_EL1,
--                           syn_simd_access_trap(1, 0xe, false), s->fp_excp_el);
+               .bank_fieldoffsets = { offsetof(CPUARMState, cp15.vbar_s),
--        return 0;
+                                      offsetof(CPUARMState, cp15.vbar_ns) },
--    }
+               .resetvalue = 0 },
 -    if (!s->vfp_enabled) {
 -        return 1;
 -    }
 -
 -    opr_sz = (1 + q) * 8;
 -    if (fn_gvec_ptr) {
 -        TCGv_ptr ptr;
 -        if (ptr_is_env) {
 -            ptr = cpu_env;
 -        } else {
 -            ptr = get_fpstatus_ptr(1);
 -        }
 -        tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd), off_rn, off_rm, ptr,
 -                           opr_sz, opr_sz, data, fn_gvec_ptr);
 -        if (!ptr_is_env) {
 -            tcg_temp_free_ptr(ptr);
 -        }
 -    } else {
 -        tcg_gen_gvec_3_ool(vfp_reg_offset(1, rd), off_rn, off_rm,
 -                           opr_sz, opr_sz, data, fn_gvec);
 -    }
 -    return 0;
 -}
 -
  /* Advanced SIMD two registers and a scalar extension.
   *  31             24   23  22   20   16   12  11   10   9    8        3     0
   * +-----------------+----+---+----+----+----+---+----+---+----+---------+----+
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
                      }
                  }
              }
 -        } else if ((insn & 0x0e000a00) == 0x0c000800
 -                   && arm_dc_feature(s, ARM_FEATURE_V8)) {
 -            if (disas_neon_insn_3same_ext(s, insn)) {
 -                goto illegal_op;
 -            }
 -            return;
          } else if ((insn & 0x0f000a00) == 0x0e000800
                     && arm_dc_feature(s, ARM_FEATURE_V8)) {
              if (disas_neon_insn_2reg_scalar_ext(s, insn)) {
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
              }
              break;
          }
 -        if ((insn & 0xfe000a00) == 0xfc000800
 +        if ((insn & 0xff000a00) == 0xfe000800
              && arm_dc_feature(s, ARM_FEATURE_V8)) {
              /* The Thumb2 and ARM encodings are identical.  */
 -            if (disas_neon_insn_3same_ext(s, insn)) {
 -                goto illegal_op;
 -            }
 -        } else if ((insn & 0xff000a00) == 0xfe000800
 -                   && arm_dc_feature(s, ARM_FEATURE_V8)) {
 -            /* The Thumb2 and ARM encodings are identical.  */
              if (disas_neon_insn_2reg_scalar_ext(s, insn)) {
                  goto illegal_op;
              }
 --
-.20.1
+.34.1
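
The commit message of patch 23/33 above argues that RAS registers implemented as always-UNDEF need no FGT markup, because the UNDEF to EL1 takes priority over any FGT trap to EL2. A toy model of that ordering is sketched below. The SysReg struct, the lookup_reg() and check_access() functions, the AccessResult enum and the bit number used for ERRIDR_EL1 are hypothetical and only illustrate the priority argument; the real access check also depends on exception level and other controls that are ignored here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef enum {
    ACCESS_OK,
    ACCESS_UNDEF_EL1,   /* register not implemented: UNDEFINED */
    ACCESS_TRAP_EL2,    /* fine-grained trap taken to EL2 */
} AccessResult;

typedef struct {
    const char *name;
    bool has_fgt;       /* true if the register carries an FGT marking */
    unsigned fgt_bit;   /* toy bit number, assumed for illustration */
} SysReg;

/* Only implemented registers are listed. ERRSELR_EL1 is deliberately
 * absent, mirroring the "not listing them at all" comment in the patch. */
static const SysReg listed_regs[] = {
    { "ERRIDR_EL1", true, 0 },
    { "VDISR_EL2", false, 0 },
};

static const SysReg *lookup_reg(const char *name)
{
    for (size_t i = 0; i < sizeof(listed_regs) / sizeof(listed_regs[0]); i++) {
        if (strcmp(listed_regs[i].name, name) == 0) {
            return &listed_regs[i];
        }
    }
    return NULL;
}

static AccessResult check_access(const char *name, uint64_t trap_bits)
{
    const SysReg *r = lookup_reg(name);

    /* Unlisted registers UNDEF before any FGT bit is even consulted. */
    if (r == NULL) {
        return ACCESS_UNDEF_EL1;
    }
    /* Listed registers are checked against their trap bit, if they have one. */
    if (r->has_fgt && ((trap_bits >> r->fgt_bit) & 1)) {
        return ACCESS_TRAP_EL2;
    }
    return ACCESS_OK;
}
```

In this model, setting every trap bit still yields an UNDEF for ERRSELR_EL1, which is why leaving such registers out of the table gives the required behaviour without extra handling.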

-[PULL 25/39] target/arm: Convert V[US]DOT (vector) to decodetree
+[PULL 24/33] target/arm: Mark up sysregs for HDFGRTR bits 0..11
-Convert the V[US]DOT (vector) insns to decodetree.
+Mark up the sysreg definitions for the registers trapped
 by HDFGRTR/HDFGWTR bits 0..11. These cover various debug
 related registers.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200430181003.21682-7-peter.maydell@linaro.org
+Tested-by: Fuad Tabba <tabba@google.com>
 Message-id: 20230130182459.3309057-15-peter.maydell@linaro.org
 Message-id: 20230127175507.2895013-15-peter.maydell@linaro.org
 ---
- target/arm/neon-shared.decode   |  4 ++++
+ target/arm/cpregs.h       | 12 ++++++++++++
- target/arm/translate-neon.inc.c | 32 ++++++++++++++++++++++++++++++++
+ target/arm/debug_helper.c | 11 +++++++++++
- target/arm/translate.c          |  9 +--------
+files changed, 23 insertions(+)
 files changed, 37 insertions(+), 8 deletions(-)
-diff --git a/target/arm/neon-shared.decode b/target/arm/neon-shared.decode
+diff --git a/target/arm/cpregs.h b/target/arm/cpregs.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/neon-shared.decode
+--- a/target/arm/cpregs.h
-+++ b/target/arm/neon-shared.decode
++++ b/target/arm/cpregs.h
-@@ -XXX,XX +XXX,XX @@ VCMLA          1111 110 rot:2 . 1 size:1 .... .... 1000 . q:1 . 0 .... \
+@@ -XXX,XX +XXX,XX @@ typedef enum FGTBit {
+     DO_BIT(HFGRTR, ERRIDR_EL1),
- VCADD          1111 110 rot:1 1 . 0 size:1 .... .... 1000 . q:1 . 0 .... \
+     DO_REV_BIT(HFGRTR, NSMPRI_EL1),
-                vm=%vm_dp vn=%vn_dp vd=%vd_dp
+     DO_REV_BIT(HFGRTR, NTPIDR2_EL0),
 +
-+# VUDOT and VSDOT
++    /* Trap bits in HDFGRTR_EL2 / HDFGWTR_EL2, starting from bit 0. */
-+VDOT           1111 110 00 . 10 .... .... 1101 . q:1 . u:1 .... \
++    DO_BIT(HDFGRTR, DBGBCRN_EL1),
-+               vm=%vm_dp vn=%vn_dp vd=%vd_dp
++    DO_BIT(HDFGRTR, DBGBVRN_EL1),
-diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
++    DO_BIT(HDFGRTR, DBGWCRN_EL1),
 +    DO_BIT(HDFGRTR, DBGWVRN_EL1),
 +    DO_BIT(HDFGRTR, MDSCR_EL1),
 +    DO_BIT(HDFGRTR, DBGCLAIM),
 +    DO_BIT(HDFGWTR, OSLAR_EL1),
 +    DO_BIT(HDFGRTR, OSLSR_EL1),
 +    DO_BIT(HDFGRTR, OSECCR_EL1),
 +    DO_BIT(HDFGRTR, OSDLR_EL1),
  } FGTBit;
  #undef DO_BIT
 diff --git a/target/arm/debug_helper.c b/target/arm/debug_helper.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-neon.inc.c
+--- a/target/arm/debug_helper.c
-+++ b/target/arm/translate-neon.inc.c
++++ b/target/arm/debug_helper.c
-@@ -XXX,XX +XXX,XX @@ static bool trans_VCADD(DisasContext *s, arg_VCADD *a)
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo debug_cp_reginfo[] = {
-     tcg_temp_free_ptr(fpst);
+     { .name = "MDSCR_EL1", .state = ARM_CP_STATE_BOTH,
-     return true;
+       .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 0, .crm = 2, .opc2 = 2,
- }
+       .access = PL1_RW, .accessfn = access_tda,
-+
++      .fgt = FGT_MDSCR_EL1,
-+static bool trans_VDOT(DisasContext *s, arg_VDOT *a)
+       .fieldoffset = offsetof(CPUARMState, cp15.mdscr_el1),
-+{
+       .resetvalue = 0 },
-+    int opr_sz;
+     /*
-+    gen_helper_gvec_3 *fn_gvec;
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo debug_cp_reginfo[] = {
-+
+     { .name = "OSECCR_EL1", .state = ARM_CP_STATE_BOTH, .cp = 14,
-+    if (!dc_isar_feature(aa32_dp, s)) {
+       .opc0 = 2, .opc1 = 0, .crn = 0, .crm = 6, .opc2 = 2,
-+        return false;
+       .access = PL1_RW, .accessfn = access_tda,
-+    }
++      .fgt = FGT_OSECCR_EL1,
-+
+       .type = ARM_CP_CONST, .resetvalue = 0 },
-+    /* UNDEF accesses to D16-D31 if they don't exist. */
+     /*
-+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+      * DBGDSCRint[15,12,5:2] map to MDSCR_EL1[15,12,5:2].  Map all bits as
-+        ((a->vd | a->vn | a->vm) & 0x10)) {
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo debug_cp_reginfo[] = {
-+        return false;
+       .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 1, .crm = 0, .opc2 = 4,
-+    }
+       .access = PL1_W, .type = ARM_CP_NO_RAW,
-+
+       .accessfn = access_tdosa,
-+    if ((a->vn | a->vm | a->vd) & a->q) {
++      .fgt = FGT_OSLAR_EL1,
-+        return false;
+       .writefn = oslar_write },
-+    }
+     { .name = "OSLSR_EL1", .state = ARM_CP_STATE_BOTH,
-+
+       .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 1, .crm = 1, .opc2 = 4,
-+    if (!vfp_access_check(s)) {
+       .access = PL1_R, .resetvalue = 10,
-+        return true;
+       .accessfn = access_tdosa,
-+    }
++      .fgt = FGT_OSLSR_EL1,
-+
+       .fieldoffset = offsetof(CPUARMState, cp15.oslsr_el1) },
-+    opr_sz = (1 + a->q) * 8;
+     /* Dummy OSDLR_EL1: 32-bit Linux will read this */
-+    fn_gvec = a->u ? gen_helper_gvec_udot_b : gen_helper_gvec_sdot_b;
+     { .name = "OSDLR_EL1", .state = ARM_CP_STATE_BOTH,
-+    tcg_gen_gvec_3_ool(vfp_reg_offset(1, a->vd),
+       .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 1, .crm = 3, .opc2 = 4,
-+                       vfp_reg_offset(1, a->vn),
+       .access = PL1_RW, .accessfn = access_tdosa,
-+                       vfp_reg_offset(1, a->vm),
++      .fgt = FGT_OSDLR_EL1,
-+                       opr_sz, opr_sz, 0, fn_gvec);
+       .writefn = osdlr_write,
-+    return true;
+       .fieldoffset = offsetof(CPUARMState, cp15.osdlr_el1) },
-+}
+     /*
-diff --git a/target/arm/translate.c b/target/arm/translate.c
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo debug_cp_reginfo[] = {
-index XXXXXXX..XXXXXXX 100644
+       .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 7, .crm = 8, .opc2 = 6,
---- a/target/arm/translate.c
+       .type = ARM_CP_ALIAS,
-+++ b/target/arm/translate.c
+       .access = PL1_RW, .accessfn = access_tda,
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
++      .fgt = FGT_DBGCLAIM,
-     bool is_long = false, q = extract32(insn, 6, 1);
+       .writefn = dbgclaimset_write, .readfn = dbgclaimset_read },
-     bool ptr_is_env = false;
+     { .name = "DBGCLAIMCLR_EL1", .state = ARM_CP_STATE_BOTH,
+       .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 7, .crm = 9, .opc2 = 6,
--    if ((insn & 0xfeb00f00) == 0xfc200d00) {
+       .access = PL1_RW, .accessfn = access_tda,
--        /* V[US]DOT -- 1111 1100 0.10 .... .... 1101 .Q.U .... */
++      .fgt = FGT_DBGCLAIM,
--        bool u = extract32(insn, 4, 1);
+       .writefn = dbgclaimclr_write, .raw_writefn = raw_write,
--        if (!dc_isar_feature(aa32_dp, s)) {
+       .fieldoffset = offsetof(CPUARMState, cp15.dbgclaim) },
--            return 1;
+ };
--        }
+@@ -XXX,XX +XXX,XX @@ void define_debug_regs(ARMCPU *cpu)
--        fn_gvec = u ? gen_helper_gvec_udot_b : gen_helper_gvec_sdot_b;
+             { .name = dbgbvr_el1_name, .state = ARM_CP_STATE_BOTH,
--    } else if ((insn & 0xff300f10) == 0xfc200810) {
+               .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 0, .crm = i, .opc2 = 4,
-+    if ((insn & 0xff300f10) == 0xfc200810) {
+               .access = PL1_RW, .accessfn = access_tda,
-         /* VFM[AS]L -- 1111 1100 S.10 .... .... 1000 .Q.1 .... */
++              .fgt = FGT_DBGBVRN_EL1,
-         int is_s = extract32(insn, 23, 1);
+               .fieldoffset = offsetof(CPUARMState, cp15.dbgbvr[i]),
-         if (!dc_isar_feature(aa32_fhm, s)) {
+               .writefn = dbgbvr_write, .raw_writefn = raw_write
              },
              { .name = dbgbcr_el1_name, .state = ARM_CP_STATE_BOTH,
                .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 0, .crm = i, .opc2 = 5,
                .access = PL1_RW, .accessfn = access_tda,
 +              .fgt = FGT_DBGBCRN_EL1,
                .fieldoffset = offsetof(CPUARMState, cp15.dbgbcr[i]),
                .writefn = dbgbcr_write, .raw_writefn = raw_write
              },
@@ -XXX,XX +XXX,XX @@ void define_debug_regs(ARMCPU *cpu)
              { .name = dbgwvr_el1_name, .state = ARM_CP_STATE_BOTH,
                .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 0, .crm = i, .opc2 = 6,
                .access = PL1_RW, .accessfn = access_tda,
 +              .fgt = FGT_DBGWVRN_EL1,
                .fieldoffset = offsetof(CPUARMState, cp15.dbgwvr[i]),
                .writefn = dbgwvr_write, .raw_writefn = raw_write
              },
              { .name = dbgwcr_el1_name, .state = ARM_CP_STATE_BOTH,
                .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 0, .crm = i, .opc2 = 7,
                .access = PL1_RW, .accessfn = access_tda,
 +              .fgt = FGT_DBGWCRN_EL1,
                .fieldoffset = offsetof(CPUARMState, cp15.dbgwcr[i]),
                .writefn = dbgwcr_write, .raw_writefn = raw_write
              },
 --
-.20.1
+.34.1

-[PULL 24/39] target/arm: Convert VCADD (vector) to decodetree
+[PULL 25/33] target/arm: Mark up sysregs for HDFGRTR bits 12..63
-Convert the VCADD (vector) insns to decodetree.
+Mark up the sysreg definitions for the registers trapped
 by HDFGRTR/HDFGWTR bits 12..63.
 Bits 12..22 and bit 58 are for PMU registers.
 The remaining bits in HDFGRTR/HDFGWTR are for traps on
 registers that are part of features we don't implement:
 Bits 23..32 and 63 : FEAT_SPE
 Bits 33..48 : FEAT_ETE
 Bits 50..56 : FEAT_TRBE
 Bits 59..61 : FEAT_BRBE
 Bit 62 : FEAT_SPEv1p2.
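
For readers cross-checking the ranges above, here is a minimal sketch that restates them as a lookup. It is not part of the patch or of QEMU: the enum and the function name hdfg_bit_owner() are hypothetical, and the 0..11 "debug" range is taken from the previous patch's commit message. Bits 49 and 57 are not mentioned in either message, so the sketch reports them as unlisted rather than guessing.

```c
#include <assert.h>

/* Hypothetical classification of HDFGRTR/HDFGWTR bit positions.
 * The ranges restate the commit messages; nothing here is QEMU API. */
typedef enum {
    HDFG_OWNER_DEBUG,     /* bits 0..11: debug related registers */
    HDFG_OWNER_PMU,       /* bits 12..22 and bit 58 */
    HDFG_OWNER_SPE,       /* bits 23..32 and bit 63 */
    HDFG_OWNER_ETE,       /* bits 33..48 */
    HDFG_OWNER_TRBE,      /* bits 50..56 */
    HDFG_OWNER_BRBE,      /* bits 59..61 */
    HDFG_OWNER_SPE_V1P2,  /* bit 62 */
    HDFG_OWNER_UNLISTED,  /* bits 49 and 57: not named in the messages */
} HdfgOwner;

HdfgOwner hdfg_bit_owner(unsigned bit)
{
    assert(bit < 64);

    if (bit <= 11) {
        return HDFG_OWNER_DEBUG;
    }
    if (bit <= 22 || bit == 58) {
        return HDFG_OWNER_PMU;
    }
    if (bit <= 32 || bit == 63) {
        return HDFG_OWNER_SPE;
    }
    if (bit <= 48) {
        return HDFG_OWNER_ETE;
    }
    if (bit >= 50 && bit <= 56) {
        return HDFG_OWNER_TRBE;
    }
    if (bit >= 59 && bit <= 61) {
        return HDFG_OWNER_BRBE;
    }
    if (bit == 62) {
        return HDFG_OWNER_SPE_V1P2;
    }
    return HDFG_OWNER_UNLISTED;
}
```

Only the DEBUG and PMU classes correspond to registers that this series marks up with .fgt; the other classes belong to features the commit message says are not implemented.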
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200430181003.21682-6-peter.maydell@linaro.org
+Tested-by: Fuad Tabba <tabba@google.com>
 Message-id: 20230130182459.3309057-16-peter.maydell@linaro.org
 Message-id: 20230127175507.2895013-16-peter.maydell@linaro.org
 ---
- target/arm/neon-shared.decode   |  3 +++
+ target/arm/cpregs.h | 12 ++++++++++++
- target/arm/translate-neon.inc.c | 37 +++++++++++++++++++++++++++++++++
+ target/arm/helper.c | 37 +++++++++++++++++++++++++++++++++++++
- target/arm/translate.c          | 11 +---------
+files changed, 49 insertions(+)
-files changed, 41 insertions(+), 10 deletions(-)
+diff --git a/target/arm/cpregs.h b/target/arm/cpregs.h
 diff --git a/target/arm/neon-shared.decode b/target/arm/neon-shared.decode
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/neon-shared.decode
+--- a/target/arm/cpregs.h
-+++ b/target/arm/neon-shared.decode
++++ b/target/arm/cpregs.h
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ typedef enum FGTBit {
+     DO_BIT(HDFGRTR, OSLSR_EL1),
- VCMLA          1111 110 rot:2 . 1 size:1 .... .... 1000 . q:1 . 0 .... \
+     DO_BIT(HDFGRTR, OSECCR_EL1),
-                vm=%vm_dp vn=%vn_dp vd=%vd_dp
+     DO_BIT(HDFGRTR, OSDLR_EL1),
-+
++    DO_BIT(HDFGRTR, PMEVCNTRN_EL0),
-+VCADD          1111 110 rot:1 1 . 0 size:1 .... .... 1000 . q:1 . 0 .... \
++    DO_BIT(HDFGRTR, PMEVTYPERN_EL0),
-+               vm=%vm_dp vn=%vn_dp vd=%vd_dp
++    DO_BIT(HDFGRTR, PMCCFILTR_EL0),
-diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
++    DO_BIT(HDFGRTR, PMCCNTR_EL0),
 +    DO_BIT(HDFGRTR, PMCNTEN),
 +    DO_BIT(HDFGRTR, PMINTEN),
 +    DO_BIT(HDFGRTR, PMOVS),
 +    DO_BIT(HDFGRTR, PMSELR_EL0),
 +    DO_BIT(HDFGWTR, PMSWINC_EL0),
 +    DO_BIT(HDFGWTR, PMCR_EL0),
 +    DO_BIT(HDFGRTR, PMMIR_EL1),
 +    DO_BIT(HDFGRTR, PMCEIDN_EL0),
  } FGTBit;
  #undef DO_BIT
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-neon.inc.c
+--- a/target/arm/helper.c
-+++ b/target/arm/translate-neon.inc.c
++++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ static bool trans_VCMLA(DisasContext *s, arg_VCMLA *a)
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
-     tcg_temp_free_ptr(fpst);
+       .fieldoffset = offsetoflow32(CPUARMState, cp15.c9_pmcnten),
-     return true;
+       .writefn = pmcntenset_write,
- }
+       .accessfn = pmreg_access,
-+
++      .fgt = FGT_PMCNTEN,
-+static bool trans_VCADD(DisasContext *s, arg_VCADD *a)
+       .raw_writefn = raw_write },
-+{
+     { .name = "PMCNTENSET_EL0", .state = ARM_CP_STATE_AA64, .type = ARM_CP_IO,
-+    int opr_sz;
+       .opc0 = 3, .opc1 = 3, .crn = 9, .crm = 12, .opc2 = 1,
-+    TCGv_ptr fpst;
+       .access = PL0_RW, .accessfn = pmreg_access,
-+    gen_helper_gvec_3_ptr *fn_gvec_ptr;
++      .fgt = FGT_PMCNTEN,
-+
+       .fieldoffset = offsetof(CPUARMState, cp15.c9_pmcnten), .resetvalue = 0,
-+    if (!dc_isar_feature(aa32_vcma, s)
+       .writefn = pmcntenset_write, .raw_writefn = raw_write },
-+        || (!a->size && !dc_isar_feature(aa32_fp16_arith, s))) {
+     { .name = "PMCNTENCLR", .cp = 15, .crn = 9, .crm = 12, .opc1 = 0, .opc2 = 2,
-+        return false;
+       .access = PL0_RW,
-+    }
+       .fieldoffset = offsetoflow32(CPUARMState, cp15.c9_pmcnten),
-+
+       .accessfn = pmreg_access,
-+    /* UNDEF accesses to D16-D31 if they don't exist. */
++      .fgt = FGT_PMCNTEN,
-+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+       .writefn = pmcntenclr_write,
-+        ((a->vd | a->vn | a->vm) & 0x10)) {
+       .type = ARM_CP_ALIAS | ARM_CP_IO },
-+        return false;
+     { .name = "PMCNTENCLR_EL0", .state = ARM_CP_STATE_AA64,
-+    }
+       .opc0 = 3, .opc1 = 3, .crn = 9, .crm = 12, .opc2 = 2,
-+
+       .access = PL0_RW, .accessfn = pmreg_access,
-+    if ((a->vn | a->vm | a->vd) & a->q) {
++      .fgt = FGT_PMCNTEN,
-+        return false;
+       .type = ARM_CP_ALIAS | ARM_CP_IO,
-+    }
+       .fieldoffset = offsetof(CPUARMState, cp15.c9_pmcnten),
-+
+       .writefn = pmcntenclr_write },
-+    if (!vfp_access_check(s)) {
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
-+        return true;
+       .access = PL0_RW, .type = ARM_CP_IO,
-+    }
+       .fieldoffset = offsetoflow32(CPUARMState, cp15.c9_pmovsr),
-+
+       .accessfn = pmreg_access,
-+    opr_sz = (1 + a->q) * 8;
++      .fgt = FGT_PMOVS,
-+    fpst = get_fpstatus_ptr(1);
+       .writefn = pmovsr_write,
-+    fn_gvec_ptr = a->size ? gen_helper_gvec_fcadds : gen_helper_gvec_fcaddh;
+       .raw_writefn = raw_write },
-+    tcg_gen_gvec_3_ptr(vfp_reg_offset(1, a->vd),
+     { .name = "PMOVSCLR_EL0", .state = ARM_CP_STATE_AA64,
-+                       vfp_reg_offset(1, a->vn),
+       .opc0 = 3, .opc1 = 3, .crn = 9, .crm = 12, .opc2 = 3,
-+                       vfp_reg_offset(1, a->vm),
+       .access = PL0_RW, .accessfn = pmreg_access,
-+                       fpst, opr_sz, opr_sz, a->rot,
++      .fgt = FGT_PMOVS,
-+                       fn_gvec_ptr);
+       .type = ARM_CP_ALIAS | ARM_CP_IO,
-+    tcg_temp_free_ptr(fpst);
+       .fieldoffset = offsetof(CPUARMState, cp15.c9_pmovsr),
-+    return true;
+       .writefn = pmovsr_write,
-+}
+       .raw_writefn = raw_write },
-diff --git a/target/arm/translate.c b/target/arm/translate.c
+     { .name = "PMSWINC", .cp = 15, .crn = 9, .crm = 12, .opc1 = 0, .opc2 = 4,
-index XXXXXXX..XXXXXXX 100644
+       .access = PL0_W, .accessfn = pmreg_access_swinc,
---- a/target/arm/translate.c
++      .fgt = FGT_PMSWINC_EL0,
-+++ b/target/arm/translate.c
+       .type = ARM_CP_NO_RAW | ARM_CP_IO,
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
+       .writefn = pmswinc_write },
-     bool is_long = false, q = extract32(insn, 6, 1);
+     { .name = "PMSWINC_EL0", .state = ARM_CP_STATE_AA64,
-     bool ptr_is_env = false;
+       .opc0 = 3, .opc1 = 3, .crn = 9, .crm = 12, .opc2 = 4,
+       .access = PL0_W, .accessfn = pmreg_access_swinc,
--    if ((insn & 0xfea00f10) == 0xfc800800) {
++      .fgt = FGT_PMSWINC_EL0,
--        /* VCADD -- 1111 110R 1.0S .... .... 1000 ...0 .... */
+       .type = ARM_CP_NO_RAW | ARM_CP_IO,
--        int size = extract32(insn, 20, 1);
+       .writefn = pmswinc_write },
--        data = extract32(insn, 24, 1); /* rot */
+     { .name = "PMSELR", .cp = 15, .crn = 9, .crm = 12, .opc1 = 0, .opc2 = 5,
--        if (!dc_isar_feature(aa32_vcma, s)
+       .access = PL0_RW, .type = ARM_CP_ALIAS,
--            || (!size && !dc_isar_feature(aa32_fp16_arith, s))) {
++      .fgt = FGT_PMSELR_EL0,
--            return 1;
+       .fieldoffset = offsetoflow32(CPUARMState, cp15.c9_pmselr),
--        }
+       .accessfn = pmreg_access_selr, .writefn = pmselr_write,
--        fn_gvec_ptr = size ? gen_helper_gvec_fcadds : gen_helper_gvec_fcaddh;
+       .raw_writefn = raw_write},
--    } else if ((insn & 0xfeb00f00) == 0xfc200d00) {
+     { .name = "PMSELR_EL0", .state = ARM_CP_STATE_AA64,
-+    if ((insn & 0xfeb00f00) == 0xfc200d00) {
+       .opc0 = 3, .opc1 = 3, .crn = 9, .crm = 12, .opc2 = 5,
-         /* V[US]DOT -- 1111 1100 0.10 .... .... 1101 .Q.U .... */
+       .access = PL0_RW, .accessfn = pmreg_access_selr,
-         bool u = extract32(insn, 4, 1);
++      .fgt = FGT_PMSELR_EL0,
-         if (!dc_isar_feature(aa32_dp, s)) {
+       .fieldoffset = offsetof(CPUARMState, cp15.c9_pmselr),
        .writefn = pmselr_write, .raw_writefn = raw_write, },
      { .name = "PMCCNTR", .cp = 15, .crn = 9, .crm = 13, .opc1 = 0, .opc2 = 0,
        .access = PL0_RW, .resetvalue = 0, .type = ARM_CP_ALIAS | ARM_CP_IO,
 +      .fgt = FGT_PMCCNTR_EL0,
        .readfn = pmccntr_read, .writefn = pmccntr_write32,
        .accessfn = pmreg_access_ccntr },
      { .name = "PMCCNTR_EL0", .state = ARM_CP_STATE_AA64,
        .opc0 = 3, .opc1 = 3, .crn = 9, .crm = 13, .opc2 = 0,
        .access = PL0_RW, .accessfn = pmreg_access_ccntr,
 +      .fgt = FGT_PMCCNTR_EL0,
        .type = ARM_CP_IO,
        .fieldoffset = offsetof(CPUARMState, cp15.c15_ccnt),
        .readfn = pmccntr_read, .writefn = pmccntr_write,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
      { .name = "PMCCFILTR", .cp = 15, .opc1 = 0, .crn = 14, .crm = 15, .opc2 = 7,
        .writefn = pmccfiltr_write_a32, .readfn = pmccfiltr_read_a32,
        .access = PL0_RW, .accessfn = pmreg_access,
 +      .fgt = FGT_PMCCFILTR_EL0,
        .type = ARM_CP_ALIAS | ARM_CP_IO,
        .resetvalue = 0, },
      { .name = "PMCCFILTR_EL0", .state = ARM_CP_STATE_AA64,
        .opc0 = 3, .opc1 = 3, .crn = 14, .crm = 15, .opc2 = 7,
        .writefn = pmccfiltr_write, .raw_writefn = raw_write,
        .access = PL0_RW, .accessfn = pmreg_access,
 +      .fgt = FGT_PMCCFILTR_EL0,
        .type = ARM_CP_IO,
        .fieldoffset = offsetof(CPUARMState, cp15.pmccfiltr_el0),
        .resetvalue = 0, },
      { .name = "PMXEVTYPER", .cp = 15, .crn = 9, .crm = 13, .opc1 = 0, .opc2 = 1,
        .access = PL0_RW, .type = ARM_CP_NO_RAW | ARM_CP_IO,
        .accessfn = pmreg_access,
 +      .fgt = FGT_PMEVTYPERN_EL0,
        .writefn = pmxevtyper_write, .readfn = pmxevtyper_read },
      { .name = "PMXEVTYPER_EL0", .state = ARM_CP_STATE_AA64,
        .opc0 = 3, .opc1 = 3, .crn = 9, .crm = 13, .opc2 = 1,
        .access = PL0_RW, .type = ARM_CP_NO_RAW | ARM_CP_IO,
        .accessfn = pmreg_access,
 +      .fgt = FGT_PMEVTYPERN_EL0,
        .writefn = pmxevtyper_write, .readfn = pmxevtyper_read },
      { .name = "PMXEVCNTR", .cp = 15, .crn = 9, .crm = 13, .opc1 = 0, .opc2 = 2,
        .access = PL0_RW, .type = ARM_CP_NO_RAW | ARM_CP_IO,
        .accessfn = pmreg_access_xevcntr,
 +      .fgt = FGT_PMEVCNTRN_EL0,
        .writefn = pmxevcntr_write, .readfn = pmxevcntr_read },
      { .name = "PMXEVCNTR_EL0", .state = ARM_CP_STATE_AA64,
        .opc0 = 3, .opc1 = 3, .crn = 9, .crm = 13, .opc2 = 2,
        .access = PL0_RW, .type = ARM_CP_NO_RAW | ARM_CP_IO,
        .accessfn = pmreg_access_xevcntr,
 +      .fgt = FGT_PMEVCNTRN_EL0,
        .writefn = pmxevcntr_write, .readfn = pmxevcntr_read },
      { .name = "PMUSERENR", .cp = 15, .crn = 9, .crm = 14, .opc1 = 0, .opc2 = 0,
        .access = PL0_R | PL1_RW, .accessfn = access_tpm,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
        .writefn = pmuserenr_write, .raw_writefn = raw_write },
      { .name = "PMINTENSET", .cp = 15, .crn = 9, .crm = 14, .opc1 = 0, .opc2 = 1,
        .access = PL1_RW, .accessfn = access_tpm,
 +      .fgt = FGT_PMINTEN,
        .type = ARM_CP_ALIAS | ARM_CP_IO,
        .fieldoffset = offsetoflow32(CPUARMState, cp15.c9_pminten),
        .resetvalue = 0,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
      { .name = "PMINTENSET_EL1", .state = ARM_CP_STATE_AA64,
        .opc0 = 3, .opc1 = 0, .crn = 9, .crm = 14, .opc2 = 1,
        .access = PL1_RW, .accessfn = access_tpm,
 +      .fgt = FGT_PMINTEN,
        .type = ARM_CP_IO,
        .fieldoffset = offsetof(CPUARMState, cp15.c9_pminten),
        .writefn = pmintenset_write, .raw_writefn = raw_write,
        .resetvalue = 0x0 },
      { .name = "PMINTENCLR", .cp = 15, .crn = 9, .crm = 14, .opc1 = 0, .opc2 = 2,
        .access = PL1_RW, .accessfn = access_tpm,
 +      .fgt = FGT_PMINTEN,
        .type = ARM_CP_ALIAS | ARM_CP_IO | ARM_CP_NO_RAW,
        .fieldoffset = offsetof(CPUARMState, cp15.c9_pminten),
        .writefn = pmintenclr_write, },
      { .name = "PMINTENCLR_EL1", .state = ARM_CP_STATE_AA64,
        .opc0 = 3, .opc1 = 0, .crn = 9, .crm = 14, .opc2 = 2,
        .access = PL1_RW, .accessfn = access_tpm,
 +      .fgt = FGT_PMINTEN,
        .type = ARM_CP_ALIAS | ARM_CP_IO | ARM_CP_NO_RAW,
        .fieldoffset = offsetof(CPUARMState, cp15.c9_pminten),
        .writefn = pmintenclr_write },
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo pmovsset_cp_reginfo[] = {
      /* PMOVSSET is not implemented in v7 before v7ve */
      { .name = "PMOVSSET", .cp = 15, .opc1 = 0, .crn = 9, .crm = 14, .opc2 = 3,
        .access = PL0_RW, .accessfn = pmreg_access,
 +      .fgt = FGT_PMOVS,
        .type = ARM_CP_ALIAS | ARM_CP_IO,
        .fieldoffset = offsetoflow32(CPUARMState, cp15.c9_pmovsr),
        .writefn = pmovsset_write,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo pmovsset_cp_reginfo[] = {
      { .name = "PMOVSSET_EL0", .state = ARM_CP_STATE_AA64,
        .opc0 = 3, .opc1 = 3, .crn = 9, .crm = 14, .opc2 = 3,
        .access = PL0_RW, .accessfn = pmreg_access,
 +      .fgt = FGT_PMOVS,
        .type = ARM_CP_ALIAS | ARM_CP_IO,
        .fieldoffset = offsetof(CPUARMState, cp15.c9_pmovsr),
        .writefn = pmovsset_write,
@@ -XXX,XX +XXX,XX @@ static void define_pmu_regs(ARMCPU *cpu)
      ARMCPRegInfo pmcr = {
          .name = "PMCR", .cp = 15, .crn = 9, .crm = 12, .opc1 = 0, .opc2 = 0,
          .access = PL0_RW,
 +        .fgt = FGT_PMCR_EL0,
          .type = ARM_CP_IO | ARM_CP_ALIAS,
          .fieldoffset = offsetoflow32(CPUARMState, cp15.c9_pmcr),
          .accessfn = pmreg_access, .writefn = pmcr_write,
@@ -XXX,XX +XXX,XX @@ static void define_pmu_regs(ARMCPU *cpu)
          .name = "PMCR_EL0", .state = ARM_CP_STATE_AA64,
          .opc0 = 3, .opc1 = 3, .crn = 9, .crm = 12, .opc2 = 0,
          .access = PL0_RW, .accessfn = pmreg_access,
 +        .fgt = FGT_PMCR_EL0,
          .type = ARM_CP_IO,
          .fieldoffset = offsetof(CPUARMState, cp15.c9_pmcr),
          .resetvalue = cpu->isar.reset_pmcr_el0,
@@ -XXX,XX +XXX,XX @@ static void define_pmu_regs(ARMCPU *cpu)
              { .name = pmevcntr_name, .cp = 15, .crn = 14,
                .crm = 8 | (3 & (i >> 3)), .opc1 = 0, .opc2 = i & 7,
                .access = PL0_RW, .type = ARM_CP_IO | ARM_CP_ALIAS,
 +              .fgt = FGT_PMEVCNTRN_EL0,
                .readfn = pmevcntr_readfn, .writefn = pmevcntr_writefn,
                .accessfn = pmreg_access_xevcntr },
              { .name = pmevcntr_el0_name, .state = ARM_CP_STATE_AA64,
                .opc0 = 3, .opc1 = 3, .crn = 14, .crm = 8 | (3 & (i >> 3)),
                .opc2 = i & 7, .access = PL0_RW, .accessfn = pmreg_access_xevcntr,
                .type = ARM_CP_IO,
 +              .fgt = FGT_PMEVCNTRN_EL0,
                .readfn = pmevcntr_readfn, .writefn = pmevcntr_writefn,
                .raw_readfn = pmevcntr_rawread,
                .raw_writefn = pmevcntr_rawwrite },
              { .name = pmevtyper_name, .cp = 15, .crn = 14,
                .crm = 12 | (3 & (i >> 3)), .opc1 = 0, .opc2 = i & 7,
                .access = PL0_RW, .type = ARM_CP_IO | ARM_CP_ALIAS,
 +              .fgt = FGT_PMEVTYPERN_EL0,
                .readfn = pmevtyper_readfn, .writefn = pmevtyper_writefn,
                .accessfn = pmreg_access },
              { .name = pmevtyper_el0_name, .state = ARM_CP_STATE_AA64,
                .opc0 = 3, .opc1 = 3, .crn = 14, .crm = 12 | (3 & (i >> 3)),
                .opc2 = i & 7, .access = PL0_RW, .accessfn = pmreg_access,
 +              .fgt = FGT_PMEVTYPERN_EL0,
                .type = ARM_CP_IO,
                .readfn = pmevtyper_readfn, .writefn = pmevtyper_writefn,
                .raw_writefn = pmevtyper_rawwrite },
@@ -XXX,XX +XXX,XX @@ static void define_pmu_regs(ARMCPU *cpu)
              { .name = "PMCEID2", .state = ARM_CP_STATE_AA32,
                .cp = 15, .opc1 = 0, .crn = 9, .crm = 14, .opc2 = 4,
                .access = PL0_R, .accessfn = pmreg_access, .type = ARM_CP_CONST,
 +              .fgt = FGT_PMCEIDN_EL0,
                .resetvalue = extract64(cpu->pmceid0, 32, 32) },
              { .name = "PMCEID3", .state = ARM_CP_STATE_AA32,
                .cp = 15, .opc1 = 0, .crn = 9, .crm = 14, .opc2 = 5,
                .access = PL0_R, .accessfn = pmreg_access, .type = ARM_CP_CONST,
 +              .fgt = FGT_PMCEIDN_EL0,
                .resetvalue = extract64(cpu->pmceid1, 32, 32) },
          };
          define_arm_cp_regs(cpu, v81_pmu_regs);
@@ -XXX,XX +XXX,XX @@ static void define_pmu_regs(ARMCPU *cpu)
              .name = "PMMIR_EL1", .state = ARM_CP_STATE_BOTH,
              .opc0 = 3, .opc1 = 0, .crn = 9, .crm = 14, .opc2 = 6,
              .access = PL1_R, .accessfn = pmreg_access, .type = ARM_CP_CONST,
 +            .fgt = FGT_PMMIR_EL1,
              .resetvalue = 0
          };
          define_one_arm_cp_reg(cpu, &v84_pmmir);
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
              { .name = "PMCEID0", .state = ARM_CP_STATE_AA32,
                .cp = 15, .opc1 = 0, .crn = 9, .crm = 12, .opc2 = 6,
                .access = PL0_R, .accessfn = pmreg_access, .type = ARM_CP_CONST,
 +              .fgt = FGT_PMCEIDN_EL0,
                .resetvalue = extract64(cpu->pmceid0, 0, 32) },
              { .name = "PMCEID0_EL0", .state = ARM_CP_STATE_AA64,
                .opc0 = 3, .opc1 = 3, .crn = 9, .crm = 12, .opc2 = 6,
                .access = PL0_R, .accessfn = pmreg_access, .type = ARM_CP_CONST,
 +              .fgt = FGT_PMCEIDN_EL0,
                .resetvalue = cpu->pmceid0 },
              { .name = "PMCEID1", .state = ARM_CP_STATE_AA32,
                .cp = 15, .opc1 = 0, .crn = 9, .crm = 12, .opc2 = 7,
                .access = PL0_R, .accessfn = pmreg_access, .type = ARM_CP_CONST,
 +              .fgt = FGT_PMCEIDN_EL0,
                .resetvalue = extract64(cpu->pmceid1, 0, 32) },
              { .name = "PMCEID1_EL0", .state = ARM_CP_STATE_AA64,
                .opc0 = 3, .opc1 = 3, .crn = 9, .crm = 12, .opc2 = 7,
                .access = PL0_R, .accessfn = pmreg_access, .type = ARM_CP_CONST,
 +              .fgt = FGT_PMCEIDN_EL0,
                .resetvalue = cpu->pmceid1 },
          };
  #ifdef CONFIG_USER_ONLY
 --
-.20.1
+.34.1

-[PULL 23/39] target/arm: Convert VCMLA (vector) to decodetree
+[PULL 26/33] target/arm: Mark up sysregs for HFGITR bits 0..11
-Convert the VCMLA (vector) insns in the 3same extension group to
+Mark up the sysreg definitions for the system instructions
-decodetree.
+trapped by HFGITR bits 0..11. These bits cover various
 cache maintenance operations.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200430181003.21682-5-peter.maydell@linaro.org
+Tested-by: Fuad Tabba <tabba@google.com>
 Message-id: 20230130182459.3309057-17-peter.maydell@linaro.org
 Message-id: 20230127175507.2895013-17-peter.maydell@linaro.org
 ---
- target/arm/neon-shared.decode   | 11 ++++++++++
+ target/arm/cpregs.h | 14 ++++++++++++++
- target/arm/translate-neon.inc.c | 37 +++++++++++++++++++++++++++++++++
+ target/arm/helper.c | 28 ++++++++++++++++++++++++++++
- target/arm/translate.c          | 11 +---------
+files changed, 42 insertions(+)
 files changed, 49 insertions(+), 10 deletions(-)
-diff --git a/target/arm/neon-shared.decode b/target/arm/neon-shared.decode
+diff --git a/target/arm/cpregs.h b/target/arm/cpregs.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/neon-shared.decode
+--- a/target/arm/cpregs.h
-+++ b/target/arm/neon-shared.decode
++++ b/target/arm/cpregs.h
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ typedef enum FGTBit {
- # More specifically, this covers:
+     DO_BIT(HDFGWTR, PMCR_EL0),
- # 2reg scalar ext: 0b1111_1110_xxxx_xxxx_xxxx_1x0x_xxxx_xxxx
+     DO_BIT(HDFGRTR, PMMIR_EL1),
- # 3same ext:       0b1111_110x_xxxx_xxxx_xxxx_1x0x_xxxx_xxxx
+     DO_BIT(HDFGRTR, PMCEIDN_EL0),
 +
-+# VFP/Neon register fields; same as vfp.decode
++    /* Trap bits in HFGITR_EL2, starting from bit 0 */
-+%vm_dp  5:1 0:4
++    DO_BIT(HFGITR, ICIALLUIS),
-+%vm_sp  0:4 5:1
++    DO_BIT(HFGITR, ICIALLU),
-+%vn_dp  7:1 16:4
++    DO_BIT(HFGITR, ICIVAU),
-+%vn_sp  16:4 7:1
++    DO_BIT(HFGITR, DCIVAC),
-+%vd_dp  22:1 12:4
++    DO_BIT(HFGITR, DCISW),
-+%vd_sp  12:4 22:1
++    DO_BIT(HFGITR, DCCSW),
-+
++    DO_BIT(HFGITR, DCCISW),
-+VCMLA          1111 110 rot:2 . 1 size:1 .... .... 1000 . q:1 . 0 .... \
++    DO_BIT(HFGITR, DCCVAU),
-+               vm=%vm_dp vn=%vn_dp vd=%vd_dp
++    DO_BIT(HFGITR, DCCVAP),
-diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
++    DO_BIT(HFGITR, DCCVADP),
 +    DO_BIT(HFGITR, DCCIVAC),
 +    DO_BIT(HFGITR, DCZVA),
  } FGTBit;
  #undef DO_BIT
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-neon.inc.c
+--- a/target/arm/helper.c
-+++ b/target/arm/translate-neon.inc.c
++++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
- #include "decode-neon-dp.inc.c"
+ #ifndef CONFIG_USER_ONLY
- #include "decode-neon-ls.inc.c"
+       /* Avoid overhead of an access check that always passes in user-mode */
- #include "decode-neon-shared.inc.c"
+       .accessfn = aa64_zva_access,
-+
++      .fgt = FGT_DCZVA,
-+static bool trans_VCMLA(DisasContext *s, arg_VCMLA *a)
+ #endif
-+{
+     },
-+    int opr_sz;
+     { .name = "CURRENTEL", .state = ARM_CP_STATE_AA64,
-+    TCGv_ptr fpst;
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
-+    gen_helper_gvec_3_ptr *fn_gvec_ptr;
+     { .name = "IC_IALLUIS", .state = ARM_CP_STATE_AA64,
-+
+       .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 1, .opc2 = 0,
-+    if (!dc_isar_feature(aa32_vcma, s)
+       .access = PL1_W, .type = ARM_CP_NOP,
-+        || (!a->size && !dc_isar_feature(aa32_fp16_arith, s))) {
++      .fgt = FGT_ICIALLUIS,
-+        return false;
+       .accessfn = access_ticab },
-+    }
+     { .name = "IC_IALLU", .state = ARM_CP_STATE_AA64,
-+
+       .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 5, .opc2 = 0,
-+    /* UNDEF accesses to D16-D31 if they don't exist. */
+       .access = PL1_W, .type = ARM_CP_NOP,
-+    if (!dc_isar_feature(aa32_simd_r32, s) &&
++      .fgt = FGT_ICIALLU,
-+        ((a->vd | a->vn | a->vm) & 0x10)) {
+       .accessfn = access_tocu },
-+        return false;
+     { .name = "IC_IVAU", .state = ARM_CP_STATE_AA64,
-+    }
+       .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 5, .opc2 = 1,
-+
+       .access = PL0_W, .type = ARM_CP_NOP,
-+    if ((a->vn | a->vm | a->vd) & a->q) {
++      .fgt = FGT_ICIVAU,
-+        return false;
+       .accessfn = access_tocu },
-+    }
+     { .name = "DC_IVAC", .state = ARM_CP_STATE_AA64,
-+
+       .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 6, .opc2 = 1,
-+    if (!vfp_access_check(s)) {
+       .access = PL1_W, .accessfn = aa64_cacheop_poc_access,
-+        return true;
++      .fgt = FGT_DCIVAC,
-+    }
+       .type = ARM_CP_NOP },
-+
+     { .name = "DC_ISW", .state = ARM_CP_STATE_AA64,
-+    opr_sz = (1 + a->q) * 8;
+       .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 6, .opc2 = 2,
-+    fpst = get_fpstatus_ptr(1);
++      .fgt = FGT_DCISW,
-+    fn_gvec_ptr = a->size ? gen_helper_gvec_fcmlas : gen_helper_gvec_fcmlah;
+       .access = PL1_W, .accessfn = access_tsw, .type = ARM_CP_NOP },
-+    tcg_gen_gvec_3_ptr(vfp_reg_offset(1, a->vd),
+     { .name = "DC_CVAC", .state = ARM_CP_STATE_AA64,
-+                       vfp_reg_offset(1, a->vn),
+       .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 10, .opc2 = 1,
-+                       vfp_reg_offset(1, a->vm),
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
-+                       fpst, opr_sz, opr_sz, a->rot,
+       .accessfn = aa64_cacheop_poc_access },
-+                       fn_gvec_ptr);
+     { .name = "DC_CSW", .state = ARM_CP_STATE_AA64,
-+    tcg_temp_free_ptr(fpst);
+       .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 10, .opc2 = 2,
-+    return true;
++      .fgt = FGT_DCCSW,
-+}
+       .access = PL1_W, .accessfn = access_tsw, .type = ARM_CP_NOP },
-diff --git a/target/arm/translate.c b/target/arm/translate.c
+     { .name = "DC_CVAU", .state = ARM_CP_STATE_AA64,
-index XXXXXXX..XXXXXXX 100644
+       .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 11, .opc2 = 1,
---- a/target/arm/translate.c
+       .access = PL0_W, .type = ARM_CP_NOP,
-+++ b/target/arm/translate.c
++      .fgt = FGT_DCCVAU,
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
+       .accessfn = access_tocu },
-     bool is_long = false, q = extract32(insn, 6, 1);
+     { .name = "DC_CIVAC", .state = ARM_CP_STATE_AA64,
-     bool ptr_is_env = false;
+       .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 14, .opc2 = 1,
+       .access = PL0_W, .type = ARM_CP_NOP,
--    if ((insn & 0xfe200f10) == 0xfc200800) {
++      .fgt = FGT_DCCIVAC,
--        /* VCMLA -- 1111 110R R.1S .... .... 1000 ...0 .... */
+       .accessfn = aa64_cacheop_poc_access },
--        int size = extract32(insn, 20, 1);
+     { .name = "DC_CISW", .state = ARM_CP_STATE_AA64,
--        data = extract32(insn, 23, 2); /* rot */
+       .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 14, .opc2 = 2,
--        if (!dc_isar_feature(aa32_vcma, s)
++      .fgt = FGT_DCCISW,
--            || (!size && !dc_isar_feature(aa32_fp16_arith, s))) {
+       .access = PL1_W, .accessfn = access_tsw, .type = ARM_CP_NOP },
--            return 1;
+     /* TLBI operations */
--        }
+     { .name = "TLBI_VMALLE1IS", .state = ARM_CP_STATE_AA64,
--        fn_gvec_ptr = size ? gen_helper_gvec_fcmlas : gen_helper_gvec_fcmlah;
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo dcpop_reg[] = {
--    } else if ((insn & 0xfea00f10) == 0xfc800800) {
+     { .name = "DC_CVAP", .state = ARM_CP_STATE_AA64,
-+    if ((insn & 0xfea00f10) == 0xfc800800) {
+       .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 12, .opc2 = 1,
-         /* VCADD -- 1111 110R 1.0S .... .... 1000 ...0 .... */
+       .access = PL0_W, .type = ARM_CP_NO_RAW | ARM_CP_SUPPRESS_TB_END,
-         int size = extract32(insn, 20, 1);
++      .fgt = FGT_DCCVAP,
-         data = extract32(insn, 24, 1); /* rot */
+       .accessfn = aa64_cacheop_poc_access, .writefn = dccvap_writefn },
  };
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo dcpodp_reg[] = {
      { .name = "DC_CVADP", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 13, .opc2 = 1,
        .access = PL0_W, .type = ARM_CP_NO_RAW | ARM_CP_SUPPRESS_TB_END,
 +      .fgt = FGT_DCCVADP,
        .accessfn = aa64_cacheop_poc_access, .writefn = dccvap_writefn },
  };
  #endif /*CONFIG_USER_ONLY*/
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo mte_reginfo[] = {
      { .name = "DC_IGVAC", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 6, .opc2 = 3,
        .type = ARM_CP_NOP, .access = PL1_W,
 +      .fgt = FGT_DCIVAC,
        .accessfn = aa64_cacheop_poc_access },
      { .name = "DC_IGSW", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 6, .opc2 = 4,
 +      .fgt = FGT_DCISW,
        .type = ARM_CP_NOP, .access = PL1_W, .accessfn = access_tsw },
      { .name = "DC_IGDVAC", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 6, .opc2 = 5,
        .type = ARM_CP_NOP, .access = PL1_W,
 +      .fgt = FGT_DCIVAC,
        .accessfn = aa64_cacheop_poc_access },
      { .name = "DC_IGDSW", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 6, .opc2 = 6,
 +      .fgt = FGT_DCISW,
        .type = ARM_CP_NOP, .access = PL1_W, .accessfn = access_tsw },
      { .name = "DC_CGSW", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 10, .opc2 = 4,
 +      .fgt = FGT_DCCSW,
        .type = ARM_CP_NOP, .access = PL1_W, .accessfn = access_tsw },
      { .name = "DC_CGDSW", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 10, .opc2 = 6,
 +      .fgt = FGT_DCCSW,
        .type = ARM_CP_NOP, .access = PL1_W, .accessfn = access_tsw },
      { .name = "DC_CIGSW", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 14, .opc2 = 4,
 +      .fgt = FGT_DCCISW,
        .type = ARM_CP_NOP, .access = PL1_W, .accessfn = access_tsw },
      { .name = "DC_CIGDSW", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 14, .opc2 = 6,
 +      .fgt = FGT_DCCISW,
        .type = ARM_CP_NOP, .access = PL1_W, .accessfn = access_tsw },
  };
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo mte_el0_cacheop_reginfo[] = {
      { .name = "DC_CGVAP", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 12, .opc2 = 3,
        .type = ARM_CP_NOP, .access = PL0_W,
 +      .fgt = FGT_DCCVAP,
        .accessfn = aa64_cacheop_poc_access },
      { .name = "DC_CGDVAP", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 12, .opc2 = 5,
        .type = ARM_CP_NOP, .access = PL0_W,
 +      .fgt = FGT_DCCVAP,
        .accessfn = aa64_cacheop_poc_access },
      { .name = "DC_CGVADP", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 13, .opc2 = 3,
        .type = ARM_CP_NOP, .access = PL0_W,
 +      .fgt = FGT_DCCVADP,
        .accessfn = aa64_cacheop_poc_access },
      { .name = "DC_CGDVADP", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 13, .opc2 = 5,
        .type = ARM_CP_NOP, .access = PL0_W,
 +      .fgt = FGT_DCCVADP,
        .accessfn = aa64_cacheop_poc_access },
      { .name = "DC_CIGVAC", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 14, .opc2 = 3,
        .type = ARM_CP_NOP, .access = PL0_W,
 +      .fgt = FGT_DCCIVAC,
        .accessfn = aa64_cacheop_poc_access },
      { .name = "DC_CIGDVAC", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 14, .opc2 = 5,
        .type = ARM_CP_NOP, .access = PL0_W,
 +      .fgt = FGT_DCCIVAC,
        .accessfn = aa64_cacheop_poc_access },
      { .name = "DC_GVA", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 4, .opc2 = 3,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo mte_el0_cacheop_reginfo[] = {
  #ifndef CONFIG_USER_ONLY
        /* Avoid overhead of an access check that always passes in user-mode */
        .accessfn = aa64_zva_access,
 +      .fgt = FGT_DCZVA,
  #endif
      },
      { .name = "DC_GZVA", .state = ARM_CP_STATE_AA64,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo mte_el0_cacheop_reginfo[] = {
  #ifndef CONFIG_USER_ONLY
        /* Avoid overhead of an access check that always passes in user-mode */
        .accessfn = aa64_zva_access,
 +      .fgt = FGT_DCZVA,
  #endif
      },
  };
 --
-.20.1
+.34.1

-[PULL 27/39] target/arm: Convert VCMLA (scalar) to decodetree
+[PULL 27/33] target/arm: Mark up sysregs for HFGITR bits 12..17
-Convert VCMLA (scalar) in the 2reg-scalar-ext group to decodetree.
+Mark up the sysreg definitions for the system instructions
 trapped by HFGITR bits 12..17. These bits cover AT address
 translation instructions.
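
As a rough illustration of what this markup enables, here is a minimal C sketch of a fine-grained trap check for the AT instructions. It assumes the six names this patch adds to the enum map, in order, onto HFGITR bits 12..17. The `sketch_` and `SKETCH_` names and the flat bit-number encoding are hypothetical and are not QEMU's actual FGTBit encoding or trap logic.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical bit numbers: assumes the enum order in this patch
 * mirrors HFGITR bits 12..17. Not QEMU's real FGTBit encoding.
 */
enum sketch_hfgitr_bit {
    SKETCH_ATS1E1R  = 12,
    SKETCH_ATS1E1W  = 13,
    SKETCH_ATS1E0R  = 14,
    SKETCH_ATS1E0W  = 15,
    SKETCH_ATS1E1RP = 16,
    SKETCH_ATS1E1WP = 17,
};

/* Return true if the trap-control value has the given instruction's bit set. */
static bool sketch_at_is_trapped(uint64_t hfgitr, enum sketch_hfgitr_bit bit)
{
    return ((hfgitr >> bit) & 1) != 0;
}
```

A caller would pass the current trap-control register value and the bit attached to the instruction being executed; a true result means the access should be routed to the hypervisor rather than performed.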
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200430181003.21682-9-peter.maydell@linaro.org
+Tested-by: Fuad Tabba <tabba@google.com>
 Message-id: 20230130182459.3309057-18-peter.maydell@linaro.org
 Message-id: 20230127175507.2895013-18-peter.maydell@linaro.org
 ---
- target/arm/neon-shared.decode   |  5 +++++
+ target/arm/cpregs.h | 6 ++++++
- target/arm/translate-neon.inc.c | 40 +++++++++++++++++++++++++++++++++
+ target/arm/helper.c | 6 ++++++
- target/arm/translate.c          | 26 +--------------------
+files changed, 12 insertions(+)
 files changed, 46 insertions(+), 25 deletions(-)
-diff --git a/target/arm/neon-shared.decode b/target/arm/neon-shared.decode
+diff --git a/target/arm/cpregs.h b/target/arm/cpregs.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/neon-shared.decode
+--- a/target/arm/cpregs.h
-+++ b/target/arm/neon-shared.decode
++++ b/target/arm/cpregs.h
-@@ -XXX,XX +XXX,XX @@ VFML           1111 110 0 s:1 . 10 .... .... 1000 . 0 . 1 .... \
+@@ -XXX,XX +XXX,XX @@ typedef enum FGTBit {
-                vm=%vm_sp vn=%vn_sp vd=%vd_dp q=0
+     DO_BIT(HFGITR, DCCVADP),
- VFML           1111 110 0 s:1 . 10 .... .... 1000 . 1 . 1 .... \
+     DO_BIT(HFGITR, DCCIVAC),
-                vm=%vm_dp vn=%vn_dp vd=%vd_dp q=1
+     DO_BIT(HFGITR, DCZVA),
-+
++    DO_BIT(HFGITR, ATS1E1R),
-+VCMLA_scalar   1111 1110 0 . rot:2 .... .... 1000 . q:1 index:1 0 vm:4 \
++    DO_BIT(HFGITR, ATS1E1W),
-+               vn=%vn_dp vd=%vd_dp size=0
++    DO_BIT(HFGITR, ATS1E0R),
-+VCMLA_scalar   1111 1110 1 . rot:2 .... .... 1000 . q:1 . 0 .... \
++    DO_BIT(HFGITR, ATS1E0W),
-+               vm=%vm_dp vn=%vn_dp vd=%vd_dp size=1 index=0
++    DO_BIT(HFGITR, ATS1E1RP),
-diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
++    DO_BIT(HFGITR, ATS1E1WP),
  } FGTBit;
  #undef DO_BIT
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-neon.inc.c
+--- a/target/arm/helper.c
-+++ b/target/arm/translate-neon.inc.c
++++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ static bool trans_VFML(DisasContext *s, arg_VFML *a)
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
-                        gen_helper_gvec_fmlal_a32);
+     { .name = "AT_S1E1R", .state = ARM_CP_STATE_AA64,
-     return true;
+       .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 8, .opc2 = 0,
- }
+       .access = PL1_W, .type = ARM_CP_NO_RAW | ARM_CP_RAISES_EXC,
-+
++      .fgt = FGT_ATS1E1R,
-+static bool trans_VCMLA_scalar(DisasContext *s, arg_VCMLA_scalar *a)
+       .writefn = ats_write64 },
-+{
+     { .name = "AT_S1E1W", .state = ARM_CP_STATE_AA64,
-+    gen_helper_gvec_3_ptr *fn_gvec_ptr;
+       .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 8, .opc2 = 1,
-+    int opr_sz;
+       .access = PL1_W, .type = ARM_CP_NO_RAW | ARM_CP_RAISES_EXC,
-+    TCGv_ptr fpst;
++      .fgt = FGT_ATS1E1W,
-+
+       .writefn = ats_write64 },
-+    if (!dc_isar_feature(aa32_vcma, s)) {
+     { .name = "AT_S1E0R", .state = ARM_CP_STATE_AA64,
-+        return false;
+       .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 8, .opc2 = 2,
-+    }
+       .access = PL1_W, .type = ARM_CP_NO_RAW | ARM_CP_RAISES_EXC,
-+    if (a->size == 0 && !dc_isar_feature(aa32_fp16_arith, s)) {
++      .fgt = FGT_ATS1E0R,
-+        return false;
+       .writefn = ats_write64 },
-+    }
+     { .name = "AT_S1E0W", .state = ARM_CP_STATE_AA64,
-+
+       .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 8, .opc2 = 3,
-+    /* UNDEF accesses to D16-D31 if they don't exist. */
+       .access = PL1_W, .type = ARM_CP_NO_RAW | ARM_CP_RAISES_EXC,
-+    if (!dc_isar_feature(aa32_simd_r32, s) &&
++      .fgt = FGT_ATS1E0W,
-+        ((a->vd | a->vn | a->vm) & 0x10)) {
+       .writefn = ats_write64 },
-+        return false;
+     { .name = "AT_S12E1R", .state = ARM_CP_STATE_AA64,
-+    }
+       .opc0 = 1, .opc1 = 4, .crn = 7, .crm = 8, .opc2 = 4,
-+
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo ats1e1_reginfo[] = {
-+    if ((a->vd | a->vn) & a->q) {
+     { .name = "AT_S1E1RP", .state = ARM_CP_STATE_AA64,
-+        return false;
+       .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 9, .opc2 = 0,
-+    }
+       .access = PL1_W, .type = ARM_CP_NO_RAW | ARM_CP_RAISES_EXC,
-+
++      .fgt = FGT_ATS1E1RP,
-+    if (!vfp_access_check(s)) {
+       .writefn = ats_write64 },
-+        return true;
+     { .name = "AT_S1E1WP", .state = ARM_CP_STATE_AA64,
-+    }
+       .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 9, .opc2 = 1,
-+
+       .access = PL1_W, .type = ARM_CP_NO_RAW | ARM_CP_RAISES_EXC,
-+    fn_gvec_ptr = (a->size ? gen_helper_gvec_fcmlas_idx
++      .fgt = FGT_ATS1E1WP,
-+                   : gen_helper_gvec_fcmlah_idx);
+       .writefn = ats_write64 },
-+    opr_sz = (1 + a->q) * 8;
+ };
 +    fpst = get_fpstatus_ptr(1);
 +    tcg_gen_gvec_3_ptr(vfp_reg_offset(1, a->vd),
 +                       vfp_reg_offset(1, a->vn),
 +                       vfp_reg_offset(1, a->vm),
 +                       fpst, opr_sz, opr_sz,
 +                       (a->index << 2) | a->rot, fn_gvec_ptr);
 +    tcg_temp_free_ptr(fpst);
 +    return true;
 +}
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
      bool is_long = false, q = extract32(insn, 6, 1);
      bool ptr_is_env = false;
 -    if ((insn & 0xff000f10) == 0xfe000800) {
 -        /* VCMLA (indexed) -- 1111 1110 S.RR .... .... 1000 ...0 .... */
 -        int rot = extract32(insn, 20, 2);
 -        int size = extract32(insn, 23, 1);
 -        int index;
 -
 -        if (!dc_isar_feature(aa32_vcma, s)) {
 -            return 1;
 -        }
 -        if (size == 0) {
 -            if (!dc_isar_feature(aa32_fp16_arith, s)) {
 -                return 1;
 -            }
 -            /* For fp16, rm is just Vm, and index is M.  */
 -            rm = extract32(insn, 0, 4);
 -            index = extract32(insn, 5, 1);
 -        } else {
 -            /* For fp32, rm is the usual M:Vm, and index is 0.  */
 -            VFP_DREG_M(rm, insn);
 -            index = 0;
 -        }
 -        data = (index << 2) | rot;
 -        fn_gvec_ptr = (size ? gen_helper_gvec_fcmlas_idx
 -                       : gen_helper_gvec_fcmlah_idx);
 -    } else if ((insn & 0xffb00f00) == 0xfe200d00) {
 +    if ((insn & 0xffb00f00) == 0xfe200d00) {
          /* V[US]DOT -- 1111 1110 0.10 .... .... 1101 .Q.U .... */
          int u = extract32(insn, 4, 1);
 --
-.20.1
+.34.1

-[PULL 03/39] target/arm: Don't use a TLB for ARMMMUIdx_Stage2
+[PULL 28/33] target/arm: Mark up sysregs for HFGITR bits 18..47
-We define ARMMMUIdx_Stage2 as being an MMU index which uses a QEMU
+Mark up the sysreg definitions for the system instructions
-TLB.  However we never actually use the TLB -- all stage 2 lookups
+trapped by HFGITR bits 18..47. These bits cover TLBI
-are done by direct calls to get_phys_addr_lpae() followed by a
+TLB maintenance instructions.
 physical address load via address_space_ld*().
-Remove Stage2 from the list of ARM MMU indexes which correspond to
+(If we implemented FEAT_XS we would need to trap some of the
-real core MMU indexes, and instead put it in the set of "NOTLB" ARM
+instructions added by that feature using these bits; but we don't
-MMU indexes.
+yet, so will need to add the .fgt markup when we do.)
 This allows us to drop NB_MMU_MODES to 11.  It also means we can
 safely add support for the ARMv8.2-TTS2UXN extension, which adds
 permission bits to the stage 2 descriptors which define execute
 permission separately for EL0 and EL1; supporting that while keeping
 Stage2 in a QEMU TLB would require us to use separate TLBs for
 "Stage2 for an EL0 access" and "Stage2 for an EL1 access", which is a
 lot of extra complication given we aren't even using the QEMU TLB.
 In the process of updating the comment on our MMU index use,
 fix a couple of other minor errors:
  * NS EL2 EL2&0 was missing from the list in the comment
  * some text hadn't been updated from when we bumped NB_MMU_MODES
    above 8
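
The split between TLB-backed and "NOTLB" MMU indexes described above can be sketched in C as follows. The tag and mask values (0x10, 0x20, 0xf0, 0x0f) and every `SKETCH_` or `sketch_` name are assumptions made for illustration. Only the index numbers 10 and 3 and the TLB versus NOTLB distinction are taken from this patch.

```c
#include <stdbool.h>

/* Hypothetical type tags kept in the high bits of an MMU index. */
#define SKETCH_IDX_A      0x10  /* A-profile index backed by a core TLB */
#define SKETCH_IDX_NOTLB  0x20  /* index with no core TLB allocated */
#define SKETCH_TYPE_MASK  0xf0
#define SKETCH_CORE_MASK  0x0f

enum sketch_mmu_idx {
    SKETCH_SE3    = 10 | SKETCH_IDX_A,
    SKETCH_STAGE2 = 3 | SKETCH_IDX_NOTLB,
};

/* True if the index is backed by a core TLB slot. */
static bool sketch_has_core_tlb(enum sketch_mmu_idx idx)
{
    return (idx & SKETCH_TYPE_MASK) != SKETCH_IDX_NOTLB;
}

/* Strip the type tag to recover the small per-type index number. */
static int sketch_core_index(enum sketch_mmu_idx idx)
{
    return idx & SKETCH_CORE_MASK;
}
```

Under these assumptions the highest TLB-backed index is 10, so 11 core TLB slots are enough once Stage2 no longer occupies one.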
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200330210400.11724-2-peter.maydell@linaro.org
+Tested-by: Fuad Tabba <tabba@google.com>
 Message-id: 20230130182459.3309057-19-peter.maydell@linaro.org
 Message-id: 20230127175507.2895013-19-peter.maydell@linaro.org
 ---
- target/arm/cpu-param.h |   2 +-
+ target/arm/cpregs.h | 30 ++++++++++++++++++++++++++++++
- target/arm/cpu.h       |  21 +++++---
+ target/arm/helper.c | 30 ++++++++++++++++++++++++++++++
- target/arm/helper.c    | 112 ++++-------------------------------------
+files changed, 60 insertions(+)
 files changed, 27 insertions(+), 108 deletions(-)
-diff --git a/target/arm/cpu-param.h b/target/arm/cpu-param.h
+diff --git a/target/arm/cpregs.h b/target/arm/cpregs.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu-param.h
+--- a/target/arm/cpregs.h
-+++ b/target/arm/cpu-param.h
++++ b/target/arm/cpregs.h
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ typedef enum FGTBit {
- # define TARGET_PAGE_BITS_MIN  10
+     DO_BIT(HFGITR, ATS1E0W),
- #endif
+     DO_BIT(HFGITR, ATS1E1RP),
+     DO_BIT(HFGITR, ATS1E1WP),
--#define NB_MMU_MODES 12
++    DO_BIT(HFGITR, TLBIVMALLE1OS),
-+#define NB_MMU_MODES 11
++    DO_BIT(HFGITR, TLBIVAE1OS),
++    DO_BIT(HFGITR, TLBIASIDE1OS),
- #endif
++    DO_BIT(HFGITR, TLBIVAAE1OS),
-diff --git a/target/arm/cpu.h b/target/arm/cpu.h
++    DO_BIT(HFGITR, TLBIVALE1OS),
-index XXXXXXX..XXXXXXX 100644
++    DO_BIT(HFGITR, TLBIVAALE1OS),
---- a/target/arm/cpu.h
++    DO_BIT(HFGITR, TLBIRVAE1OS),
-+++ b/target/arm/cpu.h
++    DO_BIT(HFGITR, TLBIRVAAE1OS),
-@@ -XXX,XX +XXX,XX @@ bool write_cpustate_to_list(ARMCPU *cpu, bool kvm_sync);
++    DO_BIT(HFGITR, TLBIRVALE1OS),
-  *     handling via the TLB. The only way to do a stage 1 translation without
++    DO_BIT(HFGITR, TLBIRVAALE1OS),
-  *     the immediate stage 2 translation is via the ATS or AT system insns,
++    DO_BIT(HFGITR, TLBIVMALLE1IS),
-  *     which can be slow-pathed and always do a page table walk.
++    DO_BIT(HFGITR, TLBIVAE1IS),
-+ *     The only use of stage 2 translations is either as part of an s1+2
++    DO_BIT(HFGITR, TLBIASIDE1IS),
-+ *     lookup or when loading the descriptors during a stage 1 page table walk,
++    DO_BIT(HFGITR, TLBIVAAE1IS),
-+ *     and in both those cases we don't use the TLB.
++    DO_BIT(HFGITR, TLBIVALE1IS),
-  *  4. we can also safely fold together the "32 bit EL3" and "64 bit EL3"
++    DO_BIT(HFGITR, TLBIVAALE1IS),
-  *     translation regimes, because they map reasonably well to each other
++    DO_BIT(HFGITR, TLBIRVAE1IS),
-  *     and they can't both be active at the same time.
++    DO_BIT(HFGITR, TLBIRVAAE1IS),
-@@ -XXX,XX +XXX,XX @@ bool write_cpustate_to_list(ARMCPU *cpu, bool kvm_sync);
++    DO_BIT(HFGITR, TLBIRVALE1IS),
-  * NS EL1 EL1&0 stage 1+2 (aka NS PL1)
++    DO_BIT(HFGITR, TLBIRVAALE1IS),
-  * NS EL1 EL1&0 stage 1+2 +PAN
++    DO_BIT(HFGITR, TLBIRVAE1),
-  * NS EL0 EL2&0
++    DO_BIT(HFGITR, TLBIRVAAE1),
-+ * NS EL2 EL2&0
++    DO_BIT(HFGITR, TLBIRVALE1),
-  * NS EL2 EL2&0 +PAN
++    DO_BIT(HFGITR, TLBIRVAALE1),
-  * NS EL2 (aka NS PL2)
++    DO_BIT(HFGITR, TLBIVMALLE1),
-  * S EL0 EL1&0 (aka S PL0)
++    DO_BIT(HFGITR, TLBIVAE1),
-  * S EL1 EL1&0 (not used if EL3 is 32 bit)
++    DO_BIT(HFGITR, TLBIASIDE1),
-  * S EL1 EL1&0 +PAN
++    DO_BIT(HFGITR, TLBIVAAE1),
-  * S EL3 (aka S PL1)
++    DO_BIT(HFGITR, TLBIVALE1),
-- * NS EL1&0 stage 2
++    DO_BIT(HFGITR, TLBIVAALE1),
-  *
+ } FGTBit;
-- * for a total of 12 different mmu_idx.
-+ * for a total of 11 different mmu_idx.
+ #undef DO_BIT
   *
   * R profile CPUs have an MPU, but can use the same set of MMU indexes
   * as A profile. They only need to distinguish NS EL0 and NS EL1 (and
@@ -XXX,XX +XXX,XX @@ bool write_cpustate_to_list(ARMCPU *cpu, bool kvm_sync);
   * are not quite the same -- different CPU types (most notably M profile
   * vs A/R profile) would like to use MMU indexes with different semantics,
   * but since we don't ever need to use all of those in a single CPU we
 - * can avoid setting NB_MMU_MODES to more than 8. The lower bits of
 + * can avoid having to set NB_MMU_MODES to "total number of A profile MMU
 + * modes + total number of M profile MMU modes". The lower bits of
   * ARMMMUIdx are the core TLB mmu index, and the higher bits are always
   * the same for any particular CPU.
   * Variables of type ARMMMUIdx are always full values, and the core
@@ -XXX,XX +XXX,XX @@ typedef enum ARMMMUIdx {
      ARMMMUIdx_SE10_1_PAN = 9 | ARM_MMU_IDX_A,
      ARMMMUIdx_SE3        = 10 | ARM_MMU_IDX_A,
 -    ARMMMUIdx_Stage2     = 11 | ARM_MMU_IDX_A,
 -
      /*
       * These are not allocated TLBs and are used only for AT system
       * instructions or for the first stage of an S12 page table walk.
@@ -XXX,XX +XXX,XX @@ typedef enum ARMMMUIdx {
      ARMMMUIdx_Stage1_E0 = 0 | ARM_MMU_IDX_NOTLB,
      ARMMMUIdx_Stage1_E1 = 1 | ARM_MMU_IDX_NOTLB,
      ARMMMUIdx_Stage1_E1_PAN = 2 | ARM_MMU_IDX_NOTLB,
 +    /*
 +     * Not allocated a TLB: used only for second stage of an S12 page
 +     * table walk, or for descriptor loads during first stage of an S1
 +     * page table walk. Note that if we ever want to have a TLB for this
 +     * then various TLB flush insns which currently are no-ops or flush
 +     * only stage 1 MMU indexes will need to change to flush stage 2.
 +     */
 +    ARMMMUIdx_Stage2     = 3 | ARM_MMU_IDX_NOTLB,
      /*
       * M-profile.
@@ -XXX,XX +XXX,XX @@ typedef enum ARMMMUIdxBit {
      TO_CORE_BIT(SE10_1),
      TO_CORE_BIT(SE10_1_PAN),
      TO_CORE_BIT(SE3),
 -    TO_CORE_BIT(Stage2),
      TO_CORE_BIT(MUser),
      TO_CORE_BIT(MPriv),
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.c
 +++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ static void tlbiall_nsnh_write(CPUARMState *env, const ARMCPRegInfo *ri,
-     tlb_flush_by_mmuidx(cs,
-                         ARMMMUIdxBit_E10_1 |
-                         ARMMMUIdxBit_E10_1_PAN |
--                        ARMMMUIdxBit_E10_0 |
--                        ARMMMUIdxBit_Stage2);
-+                        ARMMMUIdxBit_E10_0);
- }
- static void tlbiall_nsnh_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
-@@ -XXX,XX +XXX,XX @@ static void tlbiall_nsnh_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
-     tlb_flush_by_mmuidx_all_cpus_synced(cs,
-                                         ARMMMUIdxBit_E10_1 |
-                                         ARMMMUIdxBit_E10_1_PAN |
--                                        ARMMMUIdxBit_E10_0 |
--                                        ARMMMUIdxBit_Stage2);
-+                                        ARMMMUIdxBit_E10_0);
- }
--static void tlbiipas2_write(CPUARMState *env, const ARMCPRegInfo *ri,
--                            uint64_t value)
--{
--    /* Invalidate by IPA. This has to invalidate any structures that
--     * contain only stage 2 translation information, but does not need
--     * to apply to structures that contain combined stage 1 and stage 2
--     * translation information.
--     * This must NOP if EL2 isn't implemented or SCR_EL3.NS is zero.
--     */
--    CPUState *cs = env_cpu(env);
--    uint64_t pageaddr;
--
--    if (!arm_feature(env, ARM_FEATURE_EL2) || !(env->cp15.scr_el3 & SCR_NS)) {
--        return;
--    }
--
--    pageaddr = sextract64(value << 12, 0, 40);
--
--    tlb_flush_page_by_mmuidx(cs, pageaddr, ARMMMUIdxBit_Stage2);
--}
--
--static void tlbiipas2_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
--                               uint64_t value)
--{
--    CPUState *cs = env_cpu(env);
--    uint64_t pageaddr;
--
--    if (!arm_feature(env, ARM_FEATURE_EL2) || !(env->cp15.scr_el3 & SCR_NS)) {
--        return;
--    }
--
--    pageaddr = sextract64(value << 12, 0, 40);
--
--    tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr,
--                                             ARMMMUIdxBit_Stage2);
--}
- static void tlbiall_hyp_write(CPUARMState *env, const ARMCPRegInfo *ri,
-                               uint64_t value)
-@@ -XXX,XX +XXX,XX @@ static void vttbr_write(CPUARMState *env, const ARMCPRegInfo *ri,
-         tlb_flush_by_mmuidx(cs,
-                             ARMMMUIdxBit_E10_1 |
-                             ARMMMUIdxBit_E10_1_PAN |
--                            ARMMMUIdxBit_E10_0 |
--                            ARMMMUIdxBit_Stage2);
-+                            ARMMMUIdxBit_E10_0);
-         raw_write(env, ri, value);
-     }
- }
-@@ -XXX,XX +XXX,XX @@ static int alle1_tlbmask(CPUARMState *env)
-         return ARMMMUIdxBit_SE10_1 |
-                ARMMMUIdxBit_SE10_1_PAN |
-                ARMMMUIdxBit_SE10_0;
--    } else if (arm_feature(env, ARM_FEATURE_EL2)) {
--        return ARMMMUIdxBit_E10_1 |
--               ARMMMUIdxBit_E10_1_PAN |
--               ARMMMUIdxBit_E10_0 |
--               ARMMMUIdxBit_Stage2;
-     } else {
-         return ARMMMUIdxBit_E10_1 |
-                ARMMMUIdxBit_E10_1_PAN |
-@@ -XXX,XX +XXX,XX @@ static void tlbi_aa64_vae3is_write(CPUARMState *env, const ARMCPRegInfo *ri,
-                                              ARMMMUIdxBit_SE3);
- }
--static void tlbi_aa64_ipas2e1_write(CPUARMState *env, const ARMCPRegInfo *ri,
--                                    uint64_t value)
--{
--    /* Invalidate by IPA. This has to invalidate any structures that
--     * contain only stage 2 translation information, but does not need
--     * to apply to structures that contain combined stage 1 and stage 2
--     * translation information.
--     * This must NOP if EL2 isn't implemented or SCR_EL3.NS is zero.
--     */
--    ARMCPU *cpu = env_archcpu(env);
--    CPUState *cs = CPU(cpu);
--    uint64_t pageaddr;
--
--    if (!arm_feature(env, ARM_FEATURE_EL2) || !(env->cp15.scr_el3 & SCR_NS)) {
--        return;
--    }
--
--    pageaddr = sextract64(value << 12, 0, 48);
--
--    tlb_flush_page_by_mmuidx(cs, pageaddr, ARMMMUIdxBit_Stage2);
--}
--
--static void tlbi_aa64_ipas2e1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
--                                      uint64_t value)
--{
--    CPUState *cs = env_cpu(env);
--    uint64_t pageaddr;
--
--    if (!arm_feature(env, ARM_FEATURE_EL2) || !(env->cp15.scr_el3 & SCR_NS)) {
--        return;
--    }
--
--    pageaddr = sextract64(value << 12, 0, 48);
--
--    tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr,
--                                             ARMMMUIdxBit_Stage2);
--}
--
- static CPAccessResult aa64_zva_access(CPUARMState *env, const ARMCPRegInfo *ri,
-                                       bool isread)
- {
 @@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
+     { .name = "TLBI_VMALLE1IS", .state = ARM_CP_STATE_AA64,
+       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 0,
+       .access = PL1_W, .accessfn = access_ttlbis, .type = ARM_CP_NO_RAW,
++      .fgt = FGT_TLBIVMALLE1IS,
+       .writefn = tlbi_aa64_vmalle1is_write },
+     { .name = "TLBI_VAE1IS", .state = ARM_CP_STATE_AA64,
+       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 1,
+       .access = PL1_W, .accessfn = access_ttlbis, .type = ARM_CP_NO_RAW,
++      .fgt = FGT_TLBIVAE1IS,
+       .writefn = tlbi_aa64_vae1is_write },
+     { .name = "TLBI_ASIDE1IS", .state = ARM_CP_STATE_AA64,
+       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 2,
+       .access = PL1_W, .accessfn = access_ttlbis, .type = ARM_CP_NO_RAW,
++      .fgt = FGT_TLBIASIDE1IS,
+       .writefn = tlbi_aa64_vmalle1is_write },
+     { .name = "TLBI_VAAE1IS", .state = ARM_CP_STATE_AA64,
+       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 3,
+       .access = PL1_W, .accessfn = access_ttlbis, .type = ARM_CP_NO_RAW,
++      .fgt = FGT_TLBIVAAE1IS,
+       .writefn = tlbi_aa64_vae1is_write },
+     { .name = "TLBI_VALE1IS", .state = ARM_CP_STATE_AA64,
+       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 5,
+       .access = PL1_W, .accessfn = access_ttlbis, .type = ARM_CP_NO_RAW,
++      .fgt = FGT_TLBIVALE1IS,
+       .writefn = tlbi_aa64_vae1is_write },
+     { .name = "TLBI_VAALE1IS", .state = ARM_CP_STATE_AA64,
+       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 7,
+       .access = PL1_W, .accessfn = access_ttlbis, .type = ARM_CP_NO_RAW,
++      .fgt = FGT_TLBIVAALE1IS,
+       .writefn = tlbi_aa64_vae1is_write },
+     { .name = "TLBI_VMALLE1", .state = ARM_CP_STATE_AA64,
+       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 7, .opc2 = 0,
+       .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
++      .fgt = FGT_TLBIVMALLE1,
+       .writefn = tlbi_aa64_vmalle1_write },
+     { .name = "TLBI_VAE1", .state = ARM_CP_STATE_AA64,
+       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 7, .opc2 = 1,
+       .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
++      .fgt = FGT_TLBIVAE1,
+       .writefn = tlbi_aa64_vae1_write },
+     { .name = "TLBI_ASIDE1", .state = ARM_CP_STATE_AA64,
+       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 7, .opc2 = 2,
+       .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
++      .fgt = FGT_TLBIASIDE1,
+       .writefn = tlbi_aa64_vmalle1_write },
+     { .name = "TLBI_VAAE1", .state = ARM_CP_STATE_AA64,
+       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 7, .opc2 = 3,
+       .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
++      .fgt = FGT_TLBIVAAE1,
+       .writefn = tlbi_aa64_vae1_write },
+     { .name = "TLBI_VALE1", .state = ARM_CP_STATE_AA64,
+       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 7, .opc2 = 5,
+       .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
++      .fgt = FGT_TLBIVALE1,
+       .writefn = tlbi_aa64_vae1_write },
+     { .name = "TLBI_VAALE1", .state = ARM_CP_STATE_AA64,
+       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 7, .opc2 = 7,
+       .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
++      .fgt = FGT_TLBIVAALE1,
        .writefn = tlbi_aa64_vae1_write },
      { .name = "TLBI_IPAS2E1IS", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 4, .crn = 8, .crm = 0, .opc2 = 1,
--      .access = PL2_W, .type = ARM_CP_NO_RAW,
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo tlbirange_reginfo[] = {
--      .writefn = tlbi_aa64_ipas2e1is_write },
+     { .name = "TLBI_RVAE1IS", .state = ARM_CP_STATE_AA64,
-+      .access = PL2_W, .type = ARM_CP_NOP },
+       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 2, .opc2 = 1,
-     { .name = "TLBI_IPAS2LE1IS", .state = ARM_CP_STATE_AA64,
+       .access = PL1_W, .accessfn = access_ttlbis, .type = ARM_CP_NO_RAW,
-       .opc0 = 1, .opc1 = 4, .crn = 8, .crm = 0, .opc2 = 5,
++      .fgt = FGT_TLBIRVAE1IS,
--      .access = PL2_W, .type = ARM_CP_NO_RAW,
+       .writefn = tlbi_aa64_rvae1is_write },
--      .writefn = tlbi_aa64_ipas2e1is_write },
+     { .name = "TLBI_RVAAE1IS", .state = ARM_CP_STATE_AA64,
-+      .access = PL2_W, .type = ARM_CP_NOP },
+       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 2, .opc2 = 3,
-     { .name = "TLBI_ALLE1IS", .state = ARM_CP_STATE_AA64,
+       .access = PL1_W, .accessfn = access_ttlbis, .type = ARM_CP_NO_RAW,
-       .opc0 = 1, .opc1 = 4, .crn = 8, .crm = 3, .opc2 = 4,
++      .fgt = FGT_TLBIRVAAE1IS,
-       .access = PL2_W, .type = ARM_CP_NO_RAW,
+       .writefn = tlbi_aa64_rvae1is_write },
-@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
+    { .name = "TLBI_RVALE1IS", .state = ARM_CP_STATE_AA64,
-       .writefn = tlbi_aa64_alle1is_write },
+       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 2, .opc2 = 5,
-     { .name = "TLBI_IPAS2E1", .state = ARM_CP_STATE_AA64,
+       .access = PL1_W, .accessfn = access_ttlbis, .type = ARM_CP_NO_RAW,
-       .opc0 = 1, .opc1 = 4, .crn = 8, .crm = 4, .opc2 = 1,
++      .fgt = FGT_TLBIRVALE1IS,
--      .access = PL2_W, .type = ARM_CP_NO_RAW,
+       .writefn = tlbi_aa64_rvae1is_write },
--      .writefn = tlbi_aa64_ipas2e1_write },
+     { .name = "TLBI_RVAALE1IS", .state = ARM_CP_STATE_AA64,
-+      .access = PL2_W, .type = ARM_CP_NOP },
+       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 2, .opc2 = 7,
-     { .name = "TLBI_IPAS2LE1", .state = ARM_CP_STATE_AA64,
+       .access = PL1_W, .accessfn = access_ttlbis, .type = ARM_CP_NO_RAW,
-       .opc0 = 1, .opc1 = 4, .crn = 8, .crm = 4, .opc2 = 5,
++      .fgt = FGT_TLBIRVAALE1IS,
--      .access = PL2_W, .type = ARM_CP_NO_RAW,
+       .writefn = tlbi_aa64_rvae1is_write },
--      .writefn = tlbi_aa64_ipas2e1_write },
+     { .name = "TLBI_RVAE1OS", .state = ARM_CP_STATE_AA64,
-+      .access = PL2_W, .type = ARM_CP_NOP },
+       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 5, .opc2 = 1,
-     { .name = "TLBI_ALLE1", .state = ARM_CP_STATE_AA64,
+       .access = PL1_W, .accessfn = access_ttlbos, .type = ARM_CP_NO_RAW,
-       .opc0 = 1, .opc1 = 4, .crn = 8, .crm = 7, .opc2 = 4,
++      .fgt = FGT_TLBIRVAE1OS,
-       .access = PL2_W, .type = ARM_CP_NO_RAW,
+       .writefn = tlbi_aa64_rvae1is_write },
-@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
+     { .name = "TLBI_RVAAE1OS", .state = ARM_CP_STATE_AA64,
-       .writefn = tlbimva_hyp_is_write },
+       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 5, .opc2 = 3,
-     { .name = "TLBIIPAS2",
+       .access = PL1_W, .accessfn = access_ttlbos, .type = ARM_CP_NO_RAW,
-       .cp = 15, .opc1 = 4, .crn = 8, .crm = 4, .opc2 = 1,
++      .fgt = FGT_TLBIRVAAE1OS,
--      .type = ARM_CP_NO_RAW, .access = PL2_W,
+       .writefn = tlbi_aa64_rvae1is_write },
--      .writefn = tlbiipas2_write },
+    { .name = "TLBI_RVALE1OS", .state = ARM_CP_STATE_AA64,
-+      .type = ARM_CP_NOP, .access = PL2_W },
+       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 5, .opc2 = 5,
-     { .name = "TLBIIPAS2IS",
+       .access = PL1_W, .accessfn = access_ttlbos, .type = ARM_CP_NO_RAW,
-       .cp = 15, .opc1 = 4, .crn = 8, .crm = 0, .opc2 = 1,
++      .fgt = FGT_TLBIRVALE1OS,
--      .type = ARM_CP_NO_RAW, .access = PL2_W,
+       .writefn = tlbi_aa64_rvae1is_write },
--      .writefn = tlbiipas2_is_write },
+     { .name = "TLBI_RVAALE1OS", .state = ARM_CP_STATE_AA64,
-+      .type = ARM_CP_NOP, .access = PL2_W },
+       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 5, .opc2 = 7,
-     { .name = "TLBIIPAS2L",
+       .access = PL1_W, .accessfn = access_ttlbos, .type = ARM_CP_NO_RAW,
-       .cp = 15, .opc1 = 4, .crn = 8, .crm = 4, .opc2 = 5,
++      .fgt = FGT_TLBIRVAALE1OS,
--      .type = ARM_CP_NO_RAW, .access = PL2_W,
+       .writefn = tlbi_aa64_rvae1is_write },
--      .writefn = tlbiipas2_write },
+     { .name = "TLBI_RVAE1", .state = ARM_CP_STATE_AA64,
-+      .type = ARM_CP_NOP, .access = PL2_W },
+       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 6, .opc2 = 1,
-     { .name = "TLBIIPAS2LIS",
+       .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
-       .cp = 15, .opc1 = 4, .crn = 8, .crm = 0, .opc2 = 5,
++      .fgt = FGT_TLBIRVAE1,
--      .type = ARM_CP_NO_RAW, .access = PL2_W,
+       .writefn = tlbi_aa64_rvae1_write },
--      .writefn = tlbiipas2_is_write },
+     { .name = "TLBI_RVAAE1", .state = ARM_CP_STATE_AA64,
-+      .type = ARM_CP_NOP, .access = PL2_W },
+       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 6, .opc2 = 3,
-     /* 32 bit cache operations */
+       .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
-     { .name = "ICIALLUIS", .cp = 15, .opc1 = 0, .crn = 7, .crm = 1, .opc2 = 0,
++      .fgt = FGT_TLBIRVAAE1,
-       .type = ARM_CP_NOP, .access = PL1_W, .accessfn = aa64_cacheop_pou_access },
+       .writefn = tlbi_aa64_rvae1_write },
     { .name = "TLBI_RVALE1", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 6, .opc2 = 5,
        .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
 +      .fgt = FGT_TLBIRVALE1,
        .writefn = tlbi_aa64_rvae1_write },
      { .name = "TLBI_RVAALE1", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 6, .opc2 = 7,
        .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
 +      .fgt = FGT_TLBIRVAALE1,
        .writefn = tlbi_aa64_rvae1_write },
      { .name = "TLBI_RIPAS2E1IS", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 4, .crn = 8, .crm = 0, .opc2 = 2,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo tlbios_reginfo[] = {
      { .name = "TLBI_VMALLE1OS", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 1, .opc2 = 0,
        .access = PL1_W, .accessfn = access_ttlbos, .type = ARM_CP_NO_RAW,
 +      .fgt = FGT_TLBIVMALLE1OS,
        .writefn = tlbi_aa64_vmalle1is_write },
      { .name = "TLBI_VAE1OS", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 1, .opc2 = 1,
 +      .fgt = FGT_TLBIVAE1OS,
        .access = PL1_W, .accessfn = access_ttlbos, .type = ARM_CP_NO_RAW,
        .writefn = tlbi_aa64_vae1is_write },
      { .name = "TLBI_ASIDE1OS", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 1, .opc2 = 2,
        .access = PL1_W, .accessfn = access_ttlbos, .type = ARM_CP_NO_RAW,
 +      .fgt = FGT_TLBIASIDE1OS,
        .writefn = tlbi_aa64_vmalle1is_write },
      { .name = "TLBI_VAAE1OS", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 1, .opc2 = 3,
        .access = PL1_W, .accessfn = access_ttlbos, .type = ARM_CP_NO_RAW,
 +      .fgt = FGT_TLBIVAAE1OS,
        .writefn = tlbi_aa64_vae1is_write },
      { .name = "TLBI_VALE1OS", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 1, .opc2 = 5,
        .access = PL1_W, .accessfn = access_ttlbos, .type = ARM_CP_NO_RAW,
 +      .fgt = FGT_TLBIVALE1OS,
        .writefn = tlbi_aa64_vae1is_write },
      { .name = "TLBI_VAALE1OS", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 1, .opc2 = 7,
        .access = PL1_W, .accessfn = access_ttlbos, .type = ARM_CP_NO_RAW,
 +      .fgt = FGT_TLBIVAALE1OS,
        .writefn = tlbi_aa64_vae1is_write },
      { .name = "TLBI_ALLE2OS", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 4, .crn = 8, .crm = 1, .opc2 = 0,
 --
-.20.1
+.34.1

-[PULL 21/39] target/arm: Don't allow Thumb Neon insns without FEATURE_NEON
+[PULL 29/33] target/arm: Mark up sysregs for HFGITR bits 48..63
-We were accidentally permitting decode of Thumb Neon insns even if
+Mark up the sysreg definitions for the system instructions
-the CPU didn't have the FEATURE_NEON bit set, because the feature
+trapped by HFGITR bits 48..63.
-check was being done before the call to disas_neon_data_insn() and
-disas_neon_ls_insn() in the Arm decoder but was omitted from the
+Some of these bits are for trapping instructions which are
-Thumb decoder.  Push the feature bit check down into the called
+not in the system instruction encoding (i.e. which are
-functions so it is done for both Arm and Thumb encodings.
+not handled by the ARMCPRegInfo mechanism):
  * ERET, ERETAA, ERETAB
  * SVC
 We will have to handle those separately and manually.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
+Tested-by: Fuad Tabba <tabba@google.com>
-Message-id: 20200430181003.21682-3-peter.maydell@linaro.org
+Message-id: 20230130182459.3309057-20-peter.maydell@linaro.org
 Message-id: 20230127175507.2895013-20-peter.maydell@linaro.org
 ---
- target/arm/translate.c | 16 ++++++++--------
+ target/arm/cpregs.h | 4 ++++
-file changed, 8 insertions(+), 8 deletions(-)
+ target/arm/helper.c | 9 +++++++++
 files changed, 13 insertions(+)
-diff --git a/target/arm/translate.c b/target/arm/translate.c
+diff --git a/target/arm/cpregs.h b/target/arm/cpregs.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.c
+--- a/target/arm/cpregs.h
-+++ b/target/arm/translate.c
++++ b/target/arm/cpregs.h
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ typedef enum FGTBit {
-     TCGv_i32 tmp2;
+     DO_BIT(HFGITR, TLBIVAAE1),
-     TCGv_i64 tmp64;
+     DO_BIT(HFGITR, TLBIVALE1),
+     DO_BIT(HFGITR, TLBIVAALE1),
-+    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
++    DO_BIT(HFGITR, CFPRCTX),
-+        return 1;
++    DO_BIT(HFGITR, DVPRCTX),
-+    }
++    DO_BIT(HFGITR, CPPRCTX),
-+
++    DO_BIT(HFGITR, DCCVAC),
-     /* FIXME: this access check should not take precedence over UNDEF
+ } FGTBit;
-      * for invalid encodings; we will generate incorrect syndrome information
-      * for attempts to execute invalid vfp/neon encodings with FP disabled.
+ #undef DO_BIT
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+diff --git a/target/arm/helper.c b/target/arm/helper.c
-     TCGv_ptr ptr1, ptr2, ptr3;
+index XXXXXXX..XXXXXXX 100644
-     TCGv_i64 tmp64;
+--- a/target/arm/helper.c
++++ b/target/arm/helper.c
-+    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
-+        return 1;
+     { .name = "DC_CVAC", .state = ARM_CP_STATE_AA64,
-+    }
+       .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 10, .opc2 = 1,
-+
+       .access = PL0_W, .type = ARM_CP_NOP,
-     /* FIXME: this access check should not take precedence over UNDEF
++      .fgt = FGT_DCCVAC,
-      * for invalid encodings; we will generate incorrect syndrome information
+       .accessfn = aa64_cacheop_poc_access },
-      * for attempts to execute invalid vfp/neon encodings with FP disabled.
+     { .name = "DC_CSW", .state = ARM_CP_STATE_AA64,
-@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
+       .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 10, .opc2 = 2,
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo mte_el0_cacheop_reginfo[] = {
-         if (((insn >> 25) & 7) == 1) {
+     { .name = "DC_CGVAC", .state = ARM_CP_STATE_AA64,
-             /* NEON Data processing.  */
+       .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 10, .opc2 = 3,
--            if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+       .type = ARM_CP_NOP, .access = PL0_W,
--                goto illegal_op;
++      .fgt = FGT_DCCVAC,
--            }
+       .accessfn = aa64_cacheop_poc_access },
--
+     { .name = "DC_CGDVAC", .state = ARM_CP_STATE_AA64,
-             if (disas_neon_data_insn(s, insn)) {
+       .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 10, .opc2 = 5,
-                 goto illegal_op;
+       .type = ARM_CP_NOP, .access = PL0_W,
-             }
++      .fgt = FGT_DCCVAC,
-@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
+       .accessfn = aa64_cacheop_poc_access },
-         }
+     { .name = "DC_CGVAP", .state = ARM_CP_STATE_AA64,
-         if ((insn & 0x0f100000) == 0x04000000) {
+       .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 12, .opc2 = 3,
-             /* NEON load/store.  */
+@@ -XXX,XX +XXX,XX @@ static CPAccessResult access_predinv(CPUARMState *env, const ARMCPRegInfo *ri,
--            if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+ static const ARMCPRegInfo predinv_reginfo[] = {
--                goto illegal_op;
+     { .name = "CFP_RCTX", .state = ARM_CP_STATE_AA64,
--            }
+       .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 3, .opc2 = 4,
--
++      .fgt = FGT_CFPRCTX,
-             if (disas_neon_ls_insn(s, insn)) {
+       .type = ARM_CP_NOP, .access = PL0_W, .accessfn = access_predinv },
-                 goto illegal_op;
+     { .name = "DVP_RCTX", .state = ARM_CP_STATE_AA64,
-             }
+       .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 3, .opc2 = 5,
 +      .fgt = FGT_DVPRCTX,
        .type = ARM_CP_NOP, .access = PL0_W, .accessfn = access_predinv },
      { .name = "CPP_RCTX", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 3, .opc2 = 7,
 +      .fgt = FGT_CPPRCTX,
        .type = ARM_CP_NOP, .access = PL0_W, .accessfn = access_predinv },
      /*
       * Note the AArch32 opcodes have a different OPC1.
       */
      { .name = "CFPRCTX", .state = ARM_CP_STATE_AA32,
        .cp = 15, .opc1 = 0, .crn = 7, .crm = 3, .opc2 = 4,
 +      .fgt = FGT_CFPRCTX,
        .type = ARM_CP_NOP, .access = PL0_W, .accessfn = access_predinv },
      { .name = "DVPRCTX", .state = ARM_CP_STATE_AA32,
        .cp = 15, .opc1 = 0, .crn = 7, .crm = 3, .opc2 = 5,
 +      .fgt = FGT_DVPRCTX,
        .type = ARM_CP_NOP, .access = PL0_W, .accessfn = access_predinv },
      { .name = "CPPRCTX", .state = ARM_CP_STATE_AA32,
        .cp = 15, .opc1 = 0, .crn = 7, .crm = 3, .opc2 = 7,
 +      .fgt = FGT_CPPRCTX,
        .type = ARM_CP_NOP, .access = PL0_W, .accessfn = access_predinv },
  };
 --
-.20.1
+.34.1

-[PULL 39/39] target/arm: Move gen_ function typedefs to translate.h
+[PULL 30/33] target/arm: Implement the HFGITR_EL2.ERET trap
-We're going to want at least some of the NeonGen* typedefs
+Implement the HFGITR_EL2.ERET fine-grained trap.  This traps
-for the refactored 32-bit Neon decoder, so move them all
+execution from AArch64 EL1 of ERET, ERETAA and ERETAB.  The trap is
-to translate.h since it makes more sense to keep them in
+reported with a syndrome value of 0x1a.
-one group.
 The trap must take precedence over a possible pointer-authentication
 trap for ERETAA and ERETAB.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200430181003.21682-23-peter.maydell@linaro.org
+Tested-by: Fuad Tabba <tabba@google.com>
 Message-id: 20230130182459.3309057-21-peter.maydell@linaro.org
 Message-id: 20230127175507.2895013-21-peter.maydell@linaro.org
 ---
- target/arm/translate.h     | 17 +++++++++++++++++
+ target/arm/cpu.h           |  1 +
- target/arm/translate-a64.c | 17 -----------------
+ target/arm/syndrome.h      | 10 ++++++++++
-files changed, 17 insertions(+), 17 deletions(-)
+ target/arm/translate.h     |  2 ++
  target/arm/helper.c        |  3 +++
  target/arm/translate-a64.c | 10 ++++++++++
 files changed, 26 insertions(+)
+diff --git a/target/arm/cpu.h b/target/arm/cpu.h
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/cpu.h
++++ b/target/arm/cpu.h
+@@ -XXX,XX +XXX,XX @@ FIELD(TBFLAG_A64, PSTATE_ZA, 23, 1)
+ FIELD(TBFLAG_A64, SVL, 24, 4)
+ /* Indicates that SME Streaming mode is active, and SMCR_ELx.FA64 is not. */
+ FIELD(TBFLAG_A64, SME_TRAP_NONSTREAMING, 28, 1)
++FIELD(TBFLAG_A64, FGT_ERET, 29, 1)
+ /*
+  * Helpers for using the above.
+diff --git a/target/arm/syndrome.h b/target/arm/syndrome.h
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/syndrome.h
++++ b/target/arm/syndrome.h
+@@ -XXX,XX +XXX,XX @@ enum arm_exception_class {
+     EC_AA64_SMC               = 0x17,
+     EC_SYSTEMREGISTERTRAP     = 0x18,
+     EC_SVEACCESSTRAP          = 0x19,
++    EC_ERETTRAP               = 0x1a,
+     EC_SMETRAP                = 0x1d,
+     EC_INSNABORT              = 0x20,
+     EC_INSNABORT_SAME_EL      = 0x21,
+@@ -XXX,XX +XXX,XX @@ static inline uint32_t syn_sve_access_trap(void)
+     return EC_SVEACCESSTRAP << ARM_EL_EC_SHIFT;
+ }
++/*
++ * eret_op is bits [1:0] of the ERET instruction, so:
++ * 0 for ERET, 2 for ERETAA, 3 for ERETAB.
++ */
++static inline uint32_t syn_erettrap(int eret_op)
++{
++    return (EC_ERETTRAP << ARM_EL_EC_SHIFT) | ARM_EL_IL | eret_op;
++}
++
+ static inline uint32_t syn_smetrap(SMEExceptionType etype, bool is_16bit)
+ {
+     return (EC_SMETRAP << ARM_EL_EC_SHIFT)
 diff --git a/target/arm/translate.h b/target/arm/translate.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.h
 +++ b/target/arm/translate.h
-@@ -XXX,XX +XXX,XX @@ typedef void GVecGen3Fn(unsigned, uint32_t, uint32_t,
+@@ -XXX,XX +XXX,XX @@ typedef struct DisasContext {
- typedef void GVecGen4Fn(unsigned, uint32_t, uint32_t, uint32_t,
+     bool mve_no_pred;
-                         uint32_t, uint32_t, uint32_t);
+     /* True if fine-grained traps are active */
+     bool fgt_active;
-+/* Function prototype for gen_ functions for calling Neon helpers */
++    /* True if fine-grained trap on ERET is enabled */
-+typedef void NeonGenOneOpEnvFn(TCGv_i32, TCGv_ptr, TCGv_i32);
++    bool fgt_eret;
-+typedef void NeonGenTwoOpFn(TCGv_i32, TCGv_i32, TCGv_i32);
+     /*
-+typedef void NeonGenTwoOpEnvFn(TCGv_i32, TCGv_ptr, TCGv_i32, TCGv_i32);
+      * >= 0, a copy of PSTATE.BTYPE, which will be 0 without v8.5-BTI.
-+typedef void NeonGenTwo64OpFn(TCGv_i64, TCGv_i64, TCGv_i64);
+      *  < 0, set by the current instruction.
-+typedef void NeonGenTwo64OpEnvFn(TCGv_i64, TCGv_ptr, TCGv_i64, TCGv_i64);
+diff --git a/target/arm/helper.c b/target/arm/helper.c
-+typedef void NeonGenNarrowFn(TCGv_i32, TCGv_i64);
+index XXXXXXX..XXXXXXX 100644
-+typedef void NeonGenNarrowEnvFn(TCGv_i32, TCGv_ptr, TCGv_i64);
+--- a/target/arm/helper.c
-+typedef void NeonGenWidenFn(TCGv_i64, TCGv_i32);
++++ b/target/arm/helper.c
-+typedef void NeonGenTwoSingleOPFn(TCGv_i32, TCGv_i32, TCGv_i32, TCGv_ptr);
+@@ -XXX,XX +XXX,XX @@ static CPUARMTBFlags rebuild_hflags_a64(CPUARMState *env, int el, int fp_el,
-+typedef void NeonGenTwoDoubleOPFn(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_ptr);
-+typedef void NeonGenOneOpFn(TCGv_i64, TCGv_i64);
+     if (arm_fgt_active(env, el)) {
-+typedef void CryptoTwoOpFn(TCGv_ptr, TCGv_ptr);
+         DP_TBFLAG_ANY(flags, FGT_ACTIVE, 1);
-+typedef void CryptoThreeOpIntFn(TCGv_ptr, TCGv_ptr, TCGv_i32);
++        if (FIELD_EX64(env->cp15.fgt_exec[FGTREG_HFGITR], HFGITR_EL2, ERET)) {
-+typedef void CryptoThreeOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr);
++            DP_TBFLAG_A64(flags, FGT_ERET, 1);
-+typedef void AtomicThreeOpFn(TCGv_i64, TCGv_i64, TCGv_i64, TCGArg, MemOp);
++        }
-+
+     }
- #endif /* TARGET_ARM_TRANSLATE_H */
      if (cpu_isar_feature(aa64_mte, env_archcpu(env))) {
 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-a64.c
 +++ b/target/arm/translate-a64.c
-@@ -XXX,XX +XXX,XX @@ typedef struct AArch64DecodeTable {
+@@ -XXX,XX +XXX,XX @@ static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
-     AArch64DecodeFn *disas_fn;
+             if (op4 != 0) {
- } AArch64DecodeTable;
+                 goto do_unallocated;
+             }
--/* Function prototype for gen_ functions for calling Neon helpers */
++            if (s->fgt_eret) {
--typedef void NeonGenOneOpEnvFn(TCGv_i32, TCGv_ptr, TCGv_i32);
++                gen_exception_insn_el(s, 0, EXCP_UDEF, syn_erettrap(op3), 2);
--typedef void NeonGenTwoOpFn(TCGv_i32, TCGv_i32, TCGv_i32);
++                return;
--typedef void NeonGenTwoOpEnvFn(TCGv_i32, TCGv_ptr, TCGv_i32, TCGv_i32);
++            }
--typedef void NeonGenTwo64OpFn(TCGv_i64, TCGv_i64, TCGv_i64);
+             dst = tcg_temp_new_i64();
--typedef void NeonGenTwo64OpEnvFn(TCGv_i64, TCGv_ptr, TCGv_i64, TCGv_i64);
+             tcg_gen_ld_i64(dst, cpu_env,
--typedef void NeonGenNarrowFn(TCGv_i32, TCGv_i64);
+                            offsetof(CPUARMState, elr_el[s->current_el]));
--typedef void NeonGenNarrowEnvFn(TCGv_i32, TCGv_ptr, TCGv_i64);
+@@ -XXX,XX +XXX,XX @@ static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
--typedef void NeonGenWidenFn(TCGv_i64, TCGv_i32);
+             if (rn != 0x1f || op4 != 0x1f) {
--typedef void NeonGenTwoSingleOPFn(TCGv_i32, TCGv_i32, TCGv_i32, TCGv_ptr);
+                 goto do_unallocated;
--typedef void NeonGenTwoDoubleOPFn(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_ptr);
+             }
--typedef void NeonGenOneOpFn(TCGv_i64, TCGv_i64);
++            /* The FGT trap takes precedence over an auth trap. */
--typedef void CryptoTwoOpFn(TCGv_ptr, TCGv_ptr);
++            if (s->fgt_eret) {
--typedef void CryptoThreeOpIntFn(TCGv_ptr, TCGv_ptr, TCGv_i32);
++                gen_exception_insn_el(s, 0, EXCP_UDEF, syn_erettrap(op3), 2);
--typedef void CryptoThreeOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr);
++                return;
--typedef void AtomicThreeOpFn(TCGv_i64, TCGv_i64, TCGv_i64, TCGArg, MemOp);
++            }
--
+             dst = tcg_temp_new_i64();
- /* initialize TCG globals.  */
+             tcg_gen_ld_i64(dst, cpu_env,
- void a64_translate_init(void)
+                            offsetof(CPUARMState, elr_el[s->current_el]));
- {
+@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_init_disas_context(DisasContextBase *dcbase,
      dc->align_mem = EX_TBFLAG_ANY(tb_flags, ALIGN_MEM);
      dc->pstate_il = EX_TBFLAG_ANY(tb_flags, PSTATE__IL);
      dc->fgt_active = EX_TBFLAG_ANY(tb_flags, FGT_ACTIVE);
 +    dc->fgt_eret = EX_TBFLAG_A64(tb_flags, FGT_ERET);
      dc->sve_excp_el = EX_TBFLAG_A64(tb_flags, SVEEXC_EL);
      dc->sme_excp_el = EX_TBFLAG_A64(tb_flags, SMEEXC_EL);
      dc->vl = (EX_TBFLAG_A64(tb_flags, VL) + 1) * 16;
 --
-.20.1
+.34.1
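
Editorial sketch, not part of the patch: the value built by syn_erettrap() in
the syndrome.h hunk above can be worked through in isolation. The 0x1a quoted
in the commit message is the exception class (EC) field of the syndrome. The
definitions of ARM_EL_EC_SHIFT and ARM_EL_IL do not appear in the hunk; they
are assumed here from the ESR_ELx layout (EC in bits [31:26], IL in bit 25).

```c
#include <stdint.h>

/* Assumed from the ESR_ELx layout; not shown in the hunk above. */
#define ARM_EL_EC_SHIFT 26
#define ARM_EL_IL       (1u << 25)

/* Exception class added by the patch. */
#define EC_ERETTRAP     0x1au

/*
 * eret_op is bits [1:0] of the ERET instruction, so:
 * 0 for ERET, 2 for ERETAA, 3 for ERETAB.
 */
static inline uint32_t syn_erettrap(int eret_op)
{
    return (EC_ERETTRAP << ARM_EL_EC_SHIFT) | ARM_EL_IL | (uint32_t)eret_op;
}
```

Under these assumptions a trapped ERET yields 0x6a000000, and ERETAA and ERETAB
differ from it only in the low two bits (0x6a000002 and 0x6a000003).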

-[PULL 33/39] target/arm: Convert Neon 3-reg-same VADD/VSUB to decodetree
+[PULL 31/33] target/arm: Implement the HFGITR_EL2.SVC_EL0 and SVC_EL1 traps
-Convert the Neon 3-reg-same VADD and VSUB insns to decodetree.
+Implement the HFGITR_EL2.SVC_EL0 and SVC_EL1 fine-grained traps.
+These trap execution of the SVC instruction from AArch32 and AArch64.
-Note that we don't need the neon_3r_sizes[op] check here because all
+(As usual, AArch32 can only trap from EL0, as fine-grained traps are
-size values are OK for VADD and VSUB; we'll add this when we convert
+disabled with an AArch32 EL1.)
 the first insn that has size restrictions.
 For this we need one of the GVecGen*Fn typedefs currently in
 translate-a64.h; move them all to translate.h as a block so they
 are visible to the 32-bit decoder.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200430181003.21682-15-peter.maydell@linaro.org
+Tested-by: Fuad Tabba <tabba@google.com>
 Message-id: 20230130182459.3309057-22-peter.maydell@linaro.org
 Message-id: 20230127175507.2895013-22-peter.maydell@linaro.org
 ---
- target/arm/translate-a64.h      |  9 --------
+ target/arm/cpu.h           |  1 +
- target/arm/translate.h          |  9 ++++++++
+ target/arm/translate.h     |  2 ++
- target/arm/neon-dp.decode       | 17 +++++++++++++++
+ target/arm/helper.c        | 20 ++++++++++++++++++++
- target/arm/translate-neon.inc.c | 38 +++++++++++++++++++++++++++++++++
+ target/arm/translate-a64.c |  9 ++++++++-
- target/arm/translate.c          | 14 ++++--------
+ target/arm/translate.c     | 12 +++++++++---
-files changed, 68 insertions(+), 19 deletions(-)
+files changed, 40 insertions(+), 4 deletions(-)
-diff --git a/target/arm/translate-a64.h b/target/arm/translate-a64.h
+diff --git a/target/arm/cpu.h b/target/arm/cpu.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-a64.h
+--- a/target/arm/cpu.h
-+++ b/target/arm/translate-a64.h
++++ b/target/arm/cpu.h
-@@ -XXX,XX +XXX,XX @@ static inline int vec_full_reg_size(DisasContext *s)
+@@ -XXX,XX +XXX,XX @@ FIELD(TBFLAG_ANY, FPEXC_EL, 8, 2)
+ FIELD(TBFLAG_ANY, ALIGN_MEM, 10, 1)
- bool disas_sve(DisasContext *, uint32_t);
+ FIELD(TBFLAG_ANY, PSTATE__IL, 11, 1)
+ FIELD(TBFLAG_ANY, FGT_ACTIVE, 12, 1)
--/* Note that the gvec expanders operate on offsets + sizes.  */
++FIELD(TBFLAG_ANY, FGT_SVC, 13, 1)
--typedef void GVecGen2Fn(unsigned, uint32_t, uint32_t, uint32_t, uint32_t);
--typedef void GVecGen2iFn(unsigned, uint32_t, uint32_t, int64_t,
+ /*
--                         uint32_t, uint32_t);
+  * Bit usage when in AArch32 state, both A- and M-profile.
 -typedef void GVecGen3Fn(unsigned, uint32_t, uint32_t,
 -                        uint32_t, uint32_t, uint32_t);
 -typedef void GVecGen4Fn(unsigned, uint32_t, uint32_t, uint32_t,
 -                        uint32_t, uint32_t, uint32_t);
 -
  #endif /* TARGET_ARM_TRANSLATE_A64_H */
 diff --git a/target/arm/translate.h b/target/arm/translate.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.h
 +++ b/target/arm/translate.h
-@@ -XXX,XX +XXX,XX @@ void gen_sshl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
+@@ -XXX,XX +XXX,XX @@ typedef struct DisasContext {
- #define dc_isar_feature(name, ctx) \
+     bool fgt_active;
-     ({ DisasContext *ctx_ = (ctx); isar_feature_##name(ctx_->isar); })
+     /* True if fine-grained trap on ERET is enabled */
+     bool fgt_eret;
-+/* Note that the gvec expanders operate on offsets + sizes.  */
++    /* True if fine-grained trap on SVC is enabled */
-+typedef void GVecGen2Fn(unsigned, uint32_t, uint32_t, uint32_t, uint32_t);
++    bool fgt_svc;
-+typedef void GVecGen2iFn(unsigned, uint32_t, uint32_t, int64_t,
+     /*
-+                         uint32_t, uint32_t);
+      * >= 0, a copy of PSTATE.BTYPE, which will be 0 without v8.5-BTI.
-+typedef void GVecGen3Fn(unsigned, uint32_t, uint32_t,
+      *  < 0, set by the current instruction.
-+                        uint32_t, uint32_t, uint32_t);
+diff --git a/target/arm/helper.c b/target/arm/helper.c
 +typedef void GVecGen4Fn(unsigned, uint32_t, uint32_t, uint32_t,
 +                        uint32_t, uint32_t, uint32_t);
 +
  #endif /* TARGET_ARM_TRANSLATE_H */
 diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/neon-dp.decode
+--- a/target/arm/helper.c
-+++ b/target/arm/neon-dp.decode
++++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ ARMMMUIdx arm_mmu_idx(CPUARMState *env)
- #
+     return arm_mmu_idx_el(env, arm_current_el(env));
  # This file is processed by scripts/decodetree.py
  #
 +# VFP/Neon register fields; same as vfp.decode
 +%vm_dp  5:1 0:4
 +%vn_dp  7:1 16:4
 +%vd_dp  22:1 12:4
  # Encodings for Neon data processing instructions where the T32 encoding
  # is a simple transformation of the A32 encoding.
@@ -XXX,XX +XXX,XX @@
  #   0b111p_1111_qqqq_qqqq_qqqq_qqqq_qqqq_qqqq
  # This file works on the A32 encoding only; calling code for T32 has to
  # transform the insn into the A32 version first.
 +
 +######################################################################
 +# 3-reg-same grouping:
 +# 1111 001 U 0 D sz:2 Vn:4 Vd:4 opc:4 N Q M op Vm:4
 +######################################################################
 +
 +&3same vm vn vd q size
 +
 +@3same           .... ... . . . size:2 .... .... .... . q:1 . . .... \
 +                 &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp
 +
 +VADD_3s          1111 001 0 0 . .. .... .... 1000 . . . 0 .... @3same
 +VSUB_3s          1111 001 1 0 . .. .... .... 1000 . . . 0 .... @3same
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VLDST_single(DisasContext *s, arg_VLDST_single *a)
      return true;
  }
-+
-+static bool do_3same(DisasContext *s, arg_3same *a, GVecGen3Fn fn)
++static inline bool fgt_svc(CPUARMState *env, int el)
 +{
-+    int vec_size = a->q ? 16 : 8;
++    /*
-+    int rd_ofs = neon_reg_offset(a->vd, 0);
++     * Assuming fine-grained-traps are active, return true if we
-+    int rn_ofs = neon_reg_offset(a->vn, 0);
++     * should be trapping on SVC instructions. Only AArch64 can
-+    int rm_ofs = neon_reg_offset(a->vm, 0);
++     * trap on an SVC at EL1, but we don't need to special-case this
-+
++     * because if this is AArch32 EL1 then arm_fgt_active() is false.
-+    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
++     * We also know el is 0 or 1.
-+        return false;
++     */
-+    }
++    return el == 0 ?
-+
++        FIELD_EX64(env->cp15.fgt_exec[FGTREG_HFGITR], HFGITR_EL2, SVC_EL0) :
-+    /* UNDEF accesses to D16-D31 if they don't exist. */
++        FIELD_EX64(env->cp15.fgt_exec[FGTREG_HFGITR], HFGITR_EL2, SVC_EL1);
 +    if (!dc_isar_feature(aa32_simd_r32, s) &&
 +        ((a->vd | a->vn | a->vm) & 0x10)) {
 +        return false;
 +    }
 +
 +    if ((a->vn | a->vm | a->vd) & a->q) {
 +        return false;
 +    }
 +
 +    if (!vfp_access_check(s)) {
 +        return true;
 +    }
 +
 +    fn(a->size, rd_ofs, rn_ofs, rm_ofs, vec_size, vec_size);
 +    return true;
 +}
 +
-+#define DO_3SAME(INSN, FUNC)                                            \
+ static CPUARMTBFlags rebuild_hflags_common(CPUARMState *env, int fp_el,
-+    static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a)        \
+                                            ARMMMUIdx mmu_idx,
-+    {                                                                   \
+                                            CPUARMTBFlags flags)
-+        return do_3same(s, a, FUNC);                                    \
+@@ -XXX,XX +XXX,XX @@ static CPUARMTBFlags rebuild_hflags_a32(CPUARMState *env, int fp_el,
-+    }
-+
+     if (arm_fgt_active(env, el)) {
-+DO_3SAME(VADD, tcg_gen_gvec_add)
+         DP_TBFLAG_ANY(flags, FGT_ACTIVE, 1);
-+DO_3SAME(VSUB, tcg_gen_gvec_sub)
++        if (fgt_svc(env, el)) {
 +            DP_TBFLAG_ANY(flags, FGT_SVC, 1);
 +        }
      }
      if (env->uncached_cpsr & CPSR_IL) {
@@ -XXX,XX +XXX,XX @@ static CPUARMTBFlags rebuild_hflags_a64(CPUARMState *env, int el, int fp_el,
          if (FIELD_EX64(env->cp15.fgt_exec[FGTREG_HFGITR], HFGITR_EL2, ERET)) {
              DP_TBFLAG_A64(flags, FGT_ERET, 1);
          }
 +        if (fgt_svc(env, el)) {
 +            DP_TBFLAG_ANY(flags, FGT_SVC, 1);
 +        }
      }
      if (cpu_isar_feature(aa64_mte, env_archcpu(env))) {
 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-a64.c
 +++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void disas_exc(DisasContext *s, uint32_t insn)
      int opc = extract32(insn, 21, 3);
      int op2_ll = extract32(insn, 0, 5);
      int imm16 = extract32(insn, 5, 16);
 +    uint32_t syndrome;
      switch (opc) {
      case 0:
@@ -XXX,XX +XXX,XX @@ static void disas_exc(DisasContext *s, uint32_t insn)
           */
          switch (op2_ll) {
          case 1:                                                     /* SVC */
 +            syndrome = syn_aa64_svc(imm16);
 +            if (s->fgt_svc) {
 +                gen_exception_insn_el(s, 0, EXCP_UDEF, syndrome, 2);
 +                break;
 +            }
              gen_ss_advance(s);
 -            gen_exception_insn(s, 4, EXCP_SWI, syn_aa64_svc(imm16));
 +            gen_exception_insn(s, 4, EXCP_SWI, syndrome);
              break;
          case 2:                                                     /* HVC */
              if (s->current_el == 0) {
@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_init_disas_context(DisasContextBase *dcbase,
      dc->align_mem = EX_TBFLAG_ANY(tb_flags, ALIGN_MEM);
      dc->pstate_il = EX_TBFLAG_ANY(tb_flags, PSTATE__IL);
      dc->fgt_active = EX_TBFLAG_ANY(tb_flags, FGT_ACTIVE);
 +    dc->fgt_svc = EX_TBFLAG_ANY(tb_flags, FGT_SVC);
      dc->fgt_eret = EX_TBFLAG_A64(tb_flags, FGT_ERET);
      dc->sve_excp_el = EX_TBFLAG_A64(tb_flags, SVEEXC_EL);
      dc->sme_excp_el = EX_TBFLAG_A64(tb_flags, SMEEXC_EL);
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static bool trans_SVC(DisasContext *s, arg_SVC *a)
-             }
+         (a->imm == semihost_imm)) {
-             return 0;
+         gen_exception_internal_insn(s, EXCP_SEMIHOST);
+     } else {
--        case NEON_3R_VADD_VSUB:
+-        gen_update_pc(s, curr_insn_len(s));
--            if (u) {
+-        s->svc_imm = a->imm;
--                tcg_gen_gvec_sub(size, rd_ofs, rn_ofs, rm_ofs,
+-        s->base.is_jmp = DISAS_SWI;
--                                 vec_size, vec_size);
++        if (s->fgt_svc) {
--            } else {
++            uint32_t syndrome = syn_aa32_svc(a->imm, s->thumb);
--                tcg_gen_gvec_add(size, rd_ofs, rn_ofs, rm_ofs,
++            gen_exception_insn_el(s, 0, EXCP_UDEF, syndrome, 2);
--                                 vec_size, vec_size);
++        } else {
--            }
++            gen_update_pc(s, curr_insn_len(s));
--            return 0;
++            s->svc_imm = a->imm;
--
++            s->base.is_jmp = DISAS_SWI;
-         case NEON_3R_VQADD:
++        }
-             tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
+     }
-                            rn_ofs, rm_ofs, vec_size, vec_size,
+     return true;
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+ }
-             tcg_gen_gvec_3(rd_ofs, rm_ofs, rn_ofs, vec_size, vec_size,
+@@ -XXX,XX +XXX,XX @@ static void arm_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
-                            u ? &ushl_op[size] : &sshl_op[size]);
+     dc->align_mem = EX_TBFLAG_ANY(tb_flags, ALIGN_MEM);
-             return 0;
+     dc->pstate_il = EX_TBFLAG_ANY(tb_flags, PSTATE__IL);
-+
+     dc->fgt_active = EX_TBFLAG_ANY(tb_flags, FGT_ACTIVE);
-+        case NEON_3R_VADD_VSUB:
++    dc->fgt_svc = EX_TBFLAG_ANY(tb_flags, FGT_SVC);
-+            /* Already handled by decodetree */
-+            return 1;
+     if (arm_feature(env, ARM_FEATURE_M)) {
-         }
+         dc->vfp_enabled = 1;
          if (size == 3) {
 --
-2.20.1
+2.34.1

-[PULL 32/39] target/arm: Convert Neon 'load/store single structure' to decodetree
+[PULL 32/33] target/arm: Implement MDCR_EL2.TDCC and MDCR_EL3.TDCC traps
-Convert the Neon "load/store single structure to one lane" insns to
+FEAT_FGT also implements an extra trap bit in the MDCR_EL2 and
-decodetree.
+MDCR_EL3 registers: bit TDCC enables trapping of use of the Debug
 Comms Channel registers OSDTRRX_EL1, OSDTRTX_EL1, MDCCSR_EL0,
 MDCCINT_EL0, DBGDTR_EL0, DBGDTRRX_EL0 and DBGDTRTX_EL0 (and their
 AArch32 equivalents).  This trapping is independent of whether
 fine-grained traps are enabled or not.
-As this is the last set of insns in the neon load/store group,
+Implement these extra traps.  (We don't implement DBGDTR_EL0,
-we can remove the whole disas_neon_ls_insn() function.
+DBGDTRRX_EL0 and DBGDTRTX_EL0.)
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200430181003.21682-14-peter.maydell@linaro.org
+Tested-by: Fuad Tabba <tabba@google.com>
 Message-id: 20230130182459.3309057-23-peter.maydell@linaro.org
 Message-id: 20230127175507.2895013-23-peter.maydell@linaro.org
 ---
- target/arm/neon-ls.decode       |  11 +++
+ target/arm/debug_helper.c | 35 +++++++++++++++++++++++++++++++----
- target/arm/translate-neon.inc.c |  89 +++++++++++++++++++
+ 1 file changed, 31 insertions(+), 4 deletions(-)
  target/arm/translate.c          | 147 --------------------------------
  3 files changed, 100 insertions(+), 147 deletions(-)
-diff --git a/target/arm/neon-ls.decode b/target/arm/neon-ls.decode
+diff --git a/target/arm/debug_helper.c b/target/arm/debug_helper.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/neon-ls.decode
+--- a/target/arm/debug_helper.c
-+++ b/target/arm/neon-ls.decode
++++ b/target/arm/debug_helper.c
-@@ -XXX,XX +XXX,XX @@ VLDST_multiple 1111 0100 0 . l:1 0 rn:4 .... itype:4 size:2 align:2 rm:4 \
+@@ -XXX,XX +XXX,XX @@ static CPAccessResult access_tda(CPUARMState *env, const ARMCPRegInfo *ri,
+     return CP_ACCESS_OK;
- VLD_all_lanes  1111 0100 1 . 1 0 rn:4 .... 11 n:2 size:2 t:1 a:1 rm:4 \
+ }
-                vd=%vd_dp
 +/*
 + * Check for traps to Debug Comms Channel registers. If FEAT_FGT
 + * is implemented then these are controlled by MDCR_EL2.TDCC for
 + * EL2 and MDCR_EL3.TDCC for EL3. They are also controlled by
 + * the general debug access trap bits MDCR_EL2.TDA and MDCR_EL3.TDA.
 + */
 +static CPAccessResult access_tdcc(CPUARMState *env, const ARMCPRegInfo *ri,
 +                                  bool isread)
 +{
 +    int el = arm_current_el(env);
 +    uint64_t mdcr_el2 = arm_mdcr_el2_eff(env);
 +    bool mdcr_el2_tda = (mdcr_el2 & MDCR_TDA) || (mdcr_el2 & MDCR_TDE) ||
 +        (arm_hcr_el2_eff(env) & HCR_TGE);
 +    bool mdcr_el2_tdcc = cpu_isar_feature(aa64_fgt, env_archcpu(env)) &&
 +                                          (mdcr_el2 & MDCR_TDCC);
 +    bool mdcr_el3_tdcc = cpu_isar_feature(aa64_fgt, env_archcpu(env)) &&
 +                                          (env->cp15.mdcr_el3 & MDCR_TDCC);
 +
-+# Neon load/store single structure to one lane
++    if (el < 2 && (mdcr_el2_tda || mdcr_el2_tdcc)) {
-+%imm1_5_p1 5:1 !function=plus1
++        return CP_ACCESS_TRAP_EL2;
-+%imm1_6_p1 6:1 !function=plus1
++    }
-+
++    if (el < 3 && ((env->cp15.mdcr_el3 & MDCR_TDA) || mdcr_el3_tdcc)) {
-+VLDST_single   1111 0100 1 . l:1 0 rn:4 .... 00 n:2 reg_idx:3 align:1 rm:4 \
++        return CP_ACCESS_TRAP_EL3;
-+               vd=%vd_dp size=0 stride=1
++    }
-+VLDST_single   1111 0100 1 . l:1 0 rn:4 .... 01 n:2 reg_idx:2 align:2 rm:4 \
++    return CP_ACCESS_OK;
 +               vd=%vd_dp size=1 stride=%imm1_5_p1
 +VLDST_single   1111 0100 1 . l:1 0 rn:4 .... 10 n:2 reg_idx:1 align:3 rm:4 \
 +               vd=%vd_dp size=2 stride=%imm1_6_p1
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@
   * It might be possible to convert it to a standalone .c file eventually.
   */
 +static inline int plus1(DisasContext *s, int x)
 +{
 +    return x + 1;
 +}
 +
- /* Include the generated Neon decoder */
+ static void oslar_write(CPUARMState *env, const ARMCPRegInfo *ri,
- #include "decode-neon-dp.inc.c"
+                         uint64_t value)
  #include "decode-neon-ls.inc.c"
@@ -XXX,XX +XXX,XX @@ static bool trans_VLD_all_lanes(DisasContext *s, arg_VLD_all_lanes *a)
      return true;
  }
 +
 +static bool trans_VLDST_single(DisasContext *s, arg_VLDST_single *a)
 +{
 +    /* Neon load/store single structure to one lane */
 +    int reg;
 +    int nregs = a->n + 1;
 +    int vd = a->vd;
 +    TCGv_i32 addr, tmp;
 +
 +    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
 +        return false;
 +    }
 +
 +    /* UNDEF accesses to D16-D31 if they don't exist */
 +    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vd & 0x10)) {
 +        return false;
 +    }
 +
 +    /* Catch the UNDEF cases. This is unavoidably a bit messy. */
 +    switch (nregs) {
 +    case 1:
 +        if (((a->align & (1 << a->size)) != 0) ||
 +            (a->size == 2 && ((a->align & 3) == 1 || (a->align & 3) == 2))) {
 +            return false;
 +        }
 +        break;
 +    case 3:
 +        if ((a->align & 1) != 0) {
 +            return false;
 +        }
 +        /* fall through */
 +    case 2:
 +        if (a->size == 2 && (a->align & 2) != 0) {
 +            return false;
 +        }
 +        break;
 +    case 4:
 +        if ((a->size == 2) && ((a->align & 3) == 3)) {
 +            return false;
 +        }
 +        break;
 +    default:
 +        abort();
 +    }
 +    if ((vd + a->stride * (nregs - 1)) > 31) {
 +        /*
 +         * Attempts to write off the end of the register file are
 +         * UNPREDICTABLE; we choose to UNDEF because otherwise we would
 +         * access off the end of the array that holds the register data.
 +         */
 +        return false;
 +    }
 +
 +    if (!vfp_access_check(s)) {
 +        return true;
 +    }
 +
 +    tmp = tcg_temp_new_i32();
 +    addr = tcg_temp_new_i32();
 +    load_reg_var(s, addr, a->rn);
 +    /*
 +     * TODO: if we implemented alignment exceptions, we should check
 +     * addr against the alignment encoded in a->align here.
 +     */
 +    for (reg = 0; reg < nregs; reg++) {
 +        if (a->l) {
 +            gen_aa32_ld_i32(s, tmp, addr, get_mem_index(s),
 +                            s->be_data | a->size);
 +            neon_store_element(vd, a->reg_idx, a->size, tmp);
 +        } else { /* Store */
 +            neon_load_element(tmp, vd, a->reg_idx, a->size);
 +            gen_aa32_st_i32(s, tmp, addr, get_mem_index(s),
 +                            s->be_data | a->size);
 +        }
 +        vd += a->stride;
 +        tcg_gen_addi_i32(addr, addr, 1 << a->size);
 +    }
 +    tcg_temp_free_i32(addr);
 +    tcg_temp_free_i32(tmp);
 +
 +    gen_neon_ldst_base_update(s, a->rm, a->rn, (1 << a->size) * nregs);
 +
 +    return true;
 +}
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_neon_trn_u16(TCGv_i32 t0, TCGv_i32 t1)
      tcg_temp_free_i32(rd);
  }
 -
 -/* Translate a NEON load/store element instruction.  Return nonzero if the
 -   instruction is invalid.  */
 -static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
 -{
 -    int rd, rn, rm;
 -    int nregs;
 -    int stride;
 -    int size;
 -    int reg;
 -    int load;
 -    TCGv_i32 addr;
 -    TCGv_i32 tmp;
 -
 -    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
 -        return 1;
 -    }
 -
 -    /* FIXME: this access check should not take precedence over UNDEF
 -     * for invalid encodings; we will generate incorrect syndrome information
 -     * for attempts to execute invalid vfp/neon encodings with FP disabled.
 -     */
 -    if (s->fp_excp_el) {
 -        gen_exception_insn(s, s->pc_curr, EXCP_UDEF,
 -                           syn_simd_access_trap(1, 0xe, false), s->fp_excp_el);
 -        return 0;
 -    }
 -
 -    if (!s->vfp_enabled)
 -      return 1;
 -    VFP_DREG_D(rd, insn);
 -    rn = (insn >> 16) & 0xf;
 -    rm = insn & 0xf;
 -    load = (insn & (1 << 21)) != 0;
 -    if ((insn & (1 << 23)) == 0) {
 -        /* Load store all elements -- handled already by decodetree */
 -        return 1;
 -    } else {
 -        size = (insn >> 10) & 3;
 -        if (size == 3) {
 -            /* Load single element to all lanes -- handled by decodetree  */
 -            return 1;
 -        } else {
 -            /* Single element.  */
 -            int idx = (insn >> 4) & 0xf;
 -            int reg_idx;
 -            switch (size) {
 -            case 0:
 -                reg_idx = (insn >> 5) & 7;
 -                stride = 1;
 -                break;
 -            case 1:
 -                reg_idx = (insn >> 6) & 3;
 -                stride = (insn & (1 << 5)) ? 2 : 1;
 -                break;
 -            case 2:
 -                reg_idx = (insn >> 7) & 1;
 -                stride = (insn & (1 << 6)) ? 2 : 1;
 -                break;
 -            default:
 -                abort();
 -            }
 -            nregs = ((insn >> 8) & 3) + 1;
 -            /* Catch the UNDEF cases. This is unavoidably a bit messy. */
 -            switch (nregs) {
 -            case 1:
 -                if (((idx & (1 << size)) != 0) ||
 -                    (size == 2 && ((idx & 3) == 1 || (idx & 3) == 2))) {
 -                    return 1;
 -                }
 -                break;
 -            case 3:
 -                if ((idx & 1) != 0) {
 -                    return 1;
 -                }
 -                /* fall through */
 -            case 2:
 -                if (size == 2 && (idx & 2) != 0) {
 -                    return 1;
 -                }
 -                break;
 -            case 4:
 -                if ((size == 2) && ((idx & 3) == 3)) {
 -                    return 1;
 -                }
 -                break;
 -            default:
 -                abort();
 -            }
 -            if ((rd + stride * (nregs - 1)) > 31) {
 -                /* Attempts to write off the end of the register file
 -                 * are UNPREDICTABLE; we choose to UNDEF because otherwise
 -                 * the neon_load_reg() would write off the end of the array.
 -                 */
 -                return 1;
 -            }
 -            tmp = tcg_temp_new_i32();
 -            addr = tcg_temp_new_i32();
 -            load_reg_var(s, addr, rn);
 -            for (reg = 0; reg < nregs; reg++) {
 -                if (load) {
 -                    gen_aa32_ld_i32(s, tmp, addr, get_mem_index(s),
 -                                    s->be_data | size);
 -                    neon_store_element(rd, reg_idx, size, tmp);
 -                } else { /* Store */
 -                    neon_load_element(tmp, rd, reg_idx, size);
 -                    gen_aa32_st_i32(s, tmp, addr, get_mem_index(s),
 -                                    s->be_data | size);
 -                }
 -                rd += stride;
 -                tcg_gen_addi_i32(addr, addr, 1 << size);
 -            }
 -            tcg_temp_free_i32(addr);
 -            tcg_temp_free_i32(tmp);
 -            stride = nregs * (1 << size);
 -        }
 -    }
 -    if (rm != 15) {
 -        TCGv_i32 base;
 -
 -        base = load_reg(s, rn);
 -        if (rm == 13) {
 -            tcg_gen_addi_i32(base, base, stride);
 -        } else {
 -            TCGv_i32 index;
 -            index = load_reg(s, rm);
 -            tcg_gen_add_i32(base, base, index);
 -            tcg_temp_free_i32(index);
 -        }
 -        store_reg(s, rn, base);
 -    }
 -    return 0;
 -}
 -
  static inline void gen_neon_narrow(int size, TCGv_i32 dest, TCGv_i64 src)
  {
-     switch (size) {
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo debug_cp_reginfo[] = {
-@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
+      */
-             }
+     { .name = "MDCCSR_EL0", .state = ARM_CP_STATE_AA64,
-             return;
+       .opc0 = 2, .opc1 = 3, .crn = 0, .crm = 1, .opc2 = 0,
-         }
+-      .access = PL0_R, .accessfn = access_tda,
--        if ((insn & 0x0f100000) == 0x04000000) {
++      .access = PL0_R, .accessfn = access_tdcc,
--            /* NEON load/store.  */
+       .type = ARM_CP_CONST, .resetvalue = 0 },
--            if (disas_neon_ls_insn(s, insn)) {
+     /*
--                goto illegal_op;
+      * OSDTRRX_EL1/OSDTRTX_EL1 are used for save and restore of DBGDTRRX_EL0.
--            }
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo debug_cp_reginfo[] = {
--            return;
+      */
--        }
+     { .name = "OSDTRRX_EL1", .state = ARM_CP_STATE_BOTH, .cp = 14,
-         if ((insn & 0x0e000f00) == 0x0c000100) {
+       .opc0 = 2, .opc1 = 0, .crn = 0, .crm = 0, .opc2 = 2,
-             if (arm_dc_feature(s, ARM_FEATURE_IWMMXT)) {
+-      .access = PL1_RW, .accessfn = access_tda,
-                 /* iWMMXt register transfer.  */
++      .access = PL1_RW, .accessfn = access_tdcc,
-@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
+       .type = ARM_CP_CONST, .resetvalue = 0 },
-         }
+     { .name = "OSDTRTX_EL1", .state = ARM_CP_STATE_BOTH, .cp = 14,
-         break;
+       .opc0 = 2, .opc1 = 0, .crn = 0, .crm = 3, .opc2 = 2,
-     case 12:
+-      .access = PL1_RW, .accessfn = access_tda,
--        if ((insn & 0x01100000) == 0x01000000) {
++      .access = PL1_RW, .accessfn = access_tdcc,
--            if (disas_neon_ls_insn(s, insn)) {
+       .type = ARM_CP_CONST, .resetvalue = 0 },
--                goto illegal_op;
+     /*
--            }
+      * OSECCR_EL1 provides a mechanism for an operating system
--            break;
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo debug_cp_reginfo[] = {
--        }
+      */
-         goto illegal_op;
+     { .name = "MDCCINT_EL1", .state = ARM_CP_STATE_BOTH,
-     default:
+       .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 0, .crm = 2, .opc2 = 0,
-     illegal_op:
+-      .access = PL1_RW, .accessfn = access_tda,
 +      .access = PL1_RW, .accessfn = access_tdcc,
        .type = ARM_CP_NOP },
      /*
       * Dummy DBGCLAIM registers.
 --
-2.20.1
+2.34.1
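
As a reading aid for the access_tdcc() check in the patch above, the trap
decision can be modelled as a small standalone function. This is only a
sketch: DccTrapState, DccAccess and dcc_access_check() are hypothetical names
introduced here, and the struct flattens state that the real code reads
through arm_current_el(), arm_mdcr_el2_eff() and env->cp15.mdcr_el3. The
el2_tda flag is assumed to already fold in MDCR_EL2.TDE and HCR_EL2.TGE, as
the patch does.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the CP_ACCESS_* results used by the patch. */
typedef enum {
    DCC_ACCESS_OK,
    DCC_ACCESS_TRAP_EL2,
    DCC_ACCESS_TRAP_EL3,
} DccAccess;

/* Hypothetical flattened view of the control bits the check consults. */
typedef struct {
    int el;          /* current exception level, 0..3 */
    bool has_fgt;    /* FEAT_FGT implemented: TDCC bits are meaningful */
    bool el2_tda;    /* effective MDCR_EL2.TDA (assumed to include TDE/TGE) */
    bool el2_tdcc;   /* raw MDCR_EL2.TDCC bit */
    bool el3_tda;    /* MDCR_EL3.TDA */
    bool el3_tdcc;   /* raw MDCR_EL3.TDCC bit */
} DccTrapState;

static DccAccess dcc_access_check(const DccTrapState *st)
{
    assert(st->el >= 0 && st->el <= 3);

    /* The TDCC bits only take effect when FEAT_FGT is implemented. */
    bool el2_tdcc = st->has_fgt && st->el2_tdcc;
    bool el3_tdcc = st->has_fgt && st->el3_tdcc;

    /* EL2 traps are checked first and apply only below EL2. */
    if (st->el < 2 && (st->el2_tda || el2_tdcc)) {
        return DCC_ACCESS_TRAP_EL2;
    }
    /* EL3 traps apply to anything below EL3. */
    if (st->el < 3 && (st->el3_tda || el3_tdcc)) {
        return DCC_ACCESS_TRAP_EL3;
    }
    return DCC_ACCESS_OK;
}
```

With this model, an EL1 access with only MDCR_EL2.TDCC set traps to EL2 when
has_fgt is true and is allowed when it is false, which matches the commit
message's point that TDCC is an extra bit that comes with FEAT_FGT.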

-[PULL 07/39] target/arm: Use correct variable for setting 'max' cpu's ID_AA64DFR0
+[PULL 33/33] target/arm: Enable FEAT_FGT on '-cpu max'
-In aarch64_max_initfn() we update both 32-bit and 64-bit ID
+Update the ID registers for TCG's '-cpu max' to report the
-registers.  The intended pattern is that for 64-bit ID registers we
+presence of FEAT_FGT Fine-Grained Traps support.
 use FIELD_DP64 and the uint64_t 't' register, while 32-bit ID
 registers use FIELD_DP32 and the uint32_t 'u' register.  For
 ID_AA64DFR0 we accidentally used 'u', meaning that the top 32 bits of
 this 64-bit ID register would end up always zero.  Luckily at the
 moment that's what they should be anyway, so this bug has no visible
 effects.
-Use the right-sized variable.
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
 Tested-by: Fuad Tabba <tabba@google.com>
 Message-id: 20230130182459.3309057-24-peter.maydell@linaro.org
 Message-id: 20230127175507.2895013-24-peter.maydell@linaro.org
 ---
  docs/system/arm/emulation.rst | 1 +
  target/arm/cpu64.c            | 1 +
  2 files changed, 2 insertions(+)
-Fixes: 3bec78447a958d481991
+diff --git a/docs/system/arm/emulation.rst b/docs/system/arm/emulation.rst
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+index XXXXXXX..XXXXXXX 100644
-Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com>
+--- a/docs/system/arm/emulation.rst
-Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
++++ b/docs/system/arm/emulation.rst
-Message-id: 20200423110915.10527-1-peter.maydell@linaro.org
+@@ -XXX,XX +XXX,XX @@ the following architecture extensions:
----
+ - FEAT_ETS (Enhanced Translation Synchronization)
- target/arm/cpu64.c | 6 +++---
+ - FEAT_EVT (Enhanced Virtualization Traps)
- 1 file changed, 3 insertions(+), 3 deletions(-)
+ - FEAT_FCMA (Floating-point complex number instructions)
++- FEAT_FGT (Fine-Grained Traps)
  - FEAT_FHM (Floating-point half-precision multiplication instructions)
  - FEAT_FP16 (Half-precision floating-point data processing)
  - FEAT_FRINTTS (Floating-point to integer instructions)
 diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu64.c
 +++ b/target/arm/cpu64.c
 @@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
-         u = FIELD_DP32(u, ID_MMFR4, XNX, 1); /* TTS2UXN */
+     t = FIELD_DP64(t, ID_AA64MMFR0, TGRAN16_2, 2); /* 16k stage2 supported */
-         cpu->isar.id_mmfr4 = u;
+     t = FIELD_DP64(t, ID_AA64MMFR0, TGRAN64_2, 2); /* 64k stage2 supported */
+     t = FIELD_DP64(t, ID_AA64MMFR0, TGRAN4_2, 2);  /*  4k stage2 supported */
--        u = cpu->isar.id_aa64dfr0;
++    t = FIELD_DP64(t, ID_AA64MMFR0, FGT, 1);       /* FEAT_FGT */
--        u = FIELD_DP64(u, ID_AA64DFR0, PMUVER, 5); /* v8.4-PMU */
+     cpu->isar.id_aa64mmfr0 = t;
--        cpu->isar.id_aa64dfr0 = u;
-+        t = cpu->isar.id_aa64dfr0;
+     t = cpu->isar.id_aa64mmfr1;
 +        t = FIELD_DP64(t, ID_AA64DFR0, PMUVER, 5); /* v8.4-PMU */
 +        cpu->isar.id_aa64dfr0 = t;
          u = cpu->isar.id_dfr0;
          u = FIELD_DP32(u, ID_DFR0, PERFMON, 5); /* v8.4-PMU */
 --
-2.20.1
+2.34.1
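
The ID_AA64DFR0 fix in the older series (updating a 64-bit ID register through
the 32-bit 'u' variable) can be illustrated in isolation. The sketch below is
not QEMU's FIELD_DP64 macro: deposit_field64(), update_via_u32() and
update_via_u64() are hypothetical helpers written for this example, and the
register values used with them are made up.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: write 'val' into the 'len'-bit field starting at
 * bit 'shift' of a 64-bit register value and return the result.
 */
static uint64_t deposit_field64(uint64_t reg, unsigned shift, unsigned len,
                                uint64_t val)
{
    assert(len >= 1 && len <= 64 && shift <= 64 - len);
    uint64_t mask = (len == 64 ? ~0ULL : ((1ULL << len) - 1)) << shift;
    return (reg & ~mask) | ((val << shift) & mask);
}

/* Buggy pattern: the 64-bit register passes through a 32-bit temporary. */
static uint64_t update_via_u32(uint64_t reg, unsigned shift, unsigned len,
                               uint64_t val)
{
    uint32_t u = (uint32_t)reg;                 /* top 32 bits dropped here */
    u = (uint32_t)deposit_field64(u, shift, len, val);
    return u;                                   /* zero-extended on return */
}

/* Fixed pattern: the temporary has the same width as the register. */
static uint64_t update_via_u64(uint64_t reg, unsigned shift, unsigned len,
                               uint64_t val)
{
    uint64_t t = reg;
    t = deposit_field64(t, shift, len, val);
    return t;
}
```

When the upper half of the register is already zero, both paths return the
same value, which is why the original bug had no visible effect. They diverge
as soon as any bit above bit 31 is set in the input.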

-[PULL 08/39] target/arm: Use uint64_t for midr field in CPU state struct
+Deleted patch
-From: Philippe Mathieu-Daudé <f4bug@amsat.org>
-MIDR_EL1 is a 64-bit system register with the top 32-bit being RES0.
-Represent it in QEMU's ARMCPU struct with a uint64_t, not a
-uint32_t.
-This fixes an error when compiling with -Werror=conversion
-because we were manipulating the register value using a
-local uint64_t variable:
-  target/arm/cpu64.c: In function ‘aarch64_max_initfn’:
-  target/arm/cpu64.c:628:21: error: conversion from ‘uint64_t’ {aka ‘long unsigned int’} to ‘uint32_t’ {aka ‘unsigned int’} may change value [-Werror=conversion]
-    628 |         cpu->midr = t;
-        |                     ^
-and future-proofs us against a possible future architecture
-change using some of the top 32 bits.
-Suggested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
-Suggested-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
-Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com>
-Message-id: 20200428172634.29707-1-f4bug@amsat.org
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
----
- target/arm/cpu.h | 2 +-
- target/arm/cpu.c | 2 +-
- 2 files changed, 2 insertions(+), 2 deletions(-)
-diff --git a/target/arm/cpu.h b/target/arm/cpu.h
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.h
-+++ b/target/arm/cpu.h
-@@ -XXX,XX +XXX,XX @@ struct ARMCPU {
-         uint64_t id_aa64dfr0;
-         uint64_t id_aa64dfr1;
-     } isar;
--    uint32_t midr;
-+    uint64_t midr;
-     uint32_t revidr;
-     uint32_t reset_fpsid;
-     uint32_t ctr;
-diff --git a/target/arm/cpu.c b/target/arm/cpu.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.c
-+++ b/target/arm/cpu.c
-@@ -XXX,XX +XXX,XX @@ static const ARMCPUInfo arm_cpus[] = {
- static Property arm_cpu_properties[] = {
-     DEFINE_PROP_BOOL("start-powered-off", ARMCPU, start_powered_off, false),
-     DEFINE_PROP_UINT32("psci-conduit", ARMCPU, psci_conduit, 0),
--    DEFINE_PROP_UINT32("midr", ARMCPU, midr, 0),
-+    DEFINE_PROP_UINT64("midr", ARMCPU, midr, 0),
-     DEFINE_PROP_UINT64("mp-affinity", ARMCPU,
-                         mp_affinity, ARM64_AFFINITY_INVALID),
-     DEFINE_PROP_INT32("node-id", ARMCPU, node_id, CPU_UNSET_NUMA_NODE_ID),
---
-2.20.1

-[PULL 12/39] hw/arm: versal: Embed the UARTs into the SoC type
+Deleted patch
-From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
-Embed the UARTs into the SoC type.
-Suggested-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
-Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
-Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
-Reviewed-by: Luc Michel <luc.michel@greensocs.com>
-Message-id: 20200427181649.26851-5-edgar.iglesias@gmail.com
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
----
- include/hw/arm/xlnx-versal.h |  3 ++-
- hw/arm/xlnx-versal.c         | 12 ++++++------
- 2 files changed, 8 insertions(+), 7 deletions(-)
-diff --git a/include/hw/arm/xlnx-versal.h b/include/hw/arm/xlnx-versal.h
-index XXXXXXX..XXXXXXX 100644
---- a/include/hw/arm/xlnx-versal.h
-+++ b/include/hw/arm/xlnx-versal.h
-@@ -XXX,XX +XXX,XX @@
- #include "hw/sysbus.h"
- #include "hw/arm/boot.h"
- #include "hw/intc/arm_gicv3.h"
-+#include "hw/char/pl011.h"
- #define TYPE_XLNX_VERSAL "xlnx-versal"
- #define XLNX_VERSAL(obj) OBJECT_CHECK(Versal, (obj), TYPE_XLNX_VERSAL)
-@@ -XXX,XX +XXX,XX @@ typedef struct Versal {
-         MemoryRegion mr_ocm;
-         struct {
--            SysBusDevice *uart[XLNX_VERSAL_NR_UARTS];
-+            PL011State uart[XLNX_VERSAL_NR_UARTS];
-             SysBusDevice *gem[XLNX_VERSAL_NR_GEMS];
-             SysBusDevice *adma[XLNX_VERSAL_NR_ADMAS];
-         } iou;
-diff --git a/hw/arm/xlnx-versal.c b/hw/arm/xlnx-versal.c
-index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/xlnx-versal.c
-+++ b/hw/arm/xlnx-versal.c
-@@ -XXX,XX +XXX,XX @@
- #include "kvm_arm.h"
- #include "hw/misc/unimp.h"
- #include "hw/arm/xlnx-versal.h"
--#include "hw/char/pl011.h"
- #define XLNX_VERSAL_ACPU_TYPE ARM_CPU_TYPE_NAME("cortex-a72")
- #define GEM_REVISION        0x40070106
-@@ -XXX,XX +XXX,XX @@ static void versal_create_uarts(Versal *s, qemu_irq *pic)
-         DeviceState *dev;
-         MemoryRegion *mr;
--        dev = qdev_create(NULL, TYPE_PL011);
--        s->lpd.iou.uart[i] = SYS_BUS_DEVICE(dev);
-+        sysbus_init_child_obj(OBJECT(s), name,
-+                              &s->lpd.iou.uart[i], sizeof(s->lpd.iou.uart[i]),
-+                              TYPE_PL011);
-+        dev = DEVICE(&s->lpd.iou.uart[i]);
-         qdev_prop_set_chr(dev, "chardev", serial_hd(i));
--        object_property_add_child(OBJECT(s), name, OBJECT(dev), &error_fatal);
-         qdev_init_nofail(dev);
--        mr = sysbus_mmio_get_region(s->lpd.iou.uart[i], 0);
-+        mr = sysbus_mmio_get_region(SYS_BUS_DEVICE(dev), 0);
-         memory_region_add_subregion(&s->mr_ps, addrs[i], mr);
--        sysbus_connect_irq(s->lpd.iou.uart[i], 0, pic[irqs[i]]);
-+        sysbus_connect_irq(SYS_BUS_DEVICE(dev), 0, pic[irqs[i]]);
-         g_free(name);
-     }
- }
---
-2.20.1

-[PULL 13/39] hw/arm: versal: Embed the GEMs into the SoC type
+Deleted patch
-From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
-Embed the GEMs into the SoC type.
-Suggested-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
-Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
-Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
-Reviewed-by: Luc Michel <luc.michel@greensocs.com>
-Message-id: 20200427181649.26851-6-edgar.iglesias@gmail.com
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
----
- include/hw/arm/xlnx-versal.h |  3 ++-
- hw/arm/xlnx-versal.c         | 15 ++++++++-------
- 2 files changed, 10 insertions(+), 8 deletions(-)
-diff --git a/include/hw/arm/xlnx-versal.h b/include/hw/arm/xlnx-versal.h
-index XXXXXXX..XXXXXXX 100644
---- a/include/hw/arm/xlnx-versal.h
-+++ b/include/hw/arm/xlnx-versal.h
-@@ -XXX,XX +XXX,XX @@
- #include "hw/arm/boot.h"
- #include "hw/intc/arm_gicv3.h"
- #include "hw/char/pl011.h"
-+#include "hw/net/cadence_gem.h"
- #define TYPE_XLNX_VERSAL "xlnx-versal"
- #define XLNX_VERSAL(obj) OBJECT_CHECK(Versal, (obj), TYPE_XLNX_VERSAL)
-@@ -XXX,XX +XXX,XX @@ typedef struct Versal {
-         struct {
-             PL011State uart[XLNX_VERSAL_NR_UARTS];
--            SysBusDevice *gem[XLNX_VERSAL_NR_GEMS];
-+            CadenceGEMState gem[XLNX_VERSAL_NR_GEMS];
-             SysBusDevice *adma[XLNX_VERSAL_NR_ADMAS];
-         } iou;
-     } lpd;
-diff --git a/hw/arm/xlnx-versal.c b/hw/arm/xlnx-versal.c
-index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/xlnx-versal.c
-+++ b/hw/arm/xlnx-versal.c
-@@ -XXX,XX +XXX,XX @@ static void versal_create_gems(Versal *s, qemu_irq *pic)
-         DeviceState *dev;
-         MemoryRegion *mr;
--        dev = qdev_create(NULL, "cadence_gem");
--        s->lpd.iou.gem[i] = SYS_BUS_DEVICE(dev);
--        object_property_add_child(OBJECT(s), name, OBJECT(dev), &error_fatal);
-+        sysbus_init_child_obj(OBJECT(s), name,
-+                              &s->lpd.iou.gem[i], sizeof(s->lpd.iou.gem[i]),
-+                              TYPE_CADENCE_GEM);
-+        dev = DEVICE(&s->lpd.iou.gem[i]);
-         if (nd->used) {
-             qemu_check_nic_model(nd, "cadence_gem");
-             qdev_set_nic_properties(dev, nd);
-         }
--        object_property_set_int(OBJECT(s->lpd.iou.gem[i]),
-+        object_property_set_int(OBJECT(dev),
-                                 2, "num-priority-queues",
-                                 &error_abort);
--        object_property_set_link(OBJECT(s->lpd.iou.gem[i]),
-+        object_property_set_link(OBJECT(dev),
-                                  OBJECT(&s->mr_ps), "dma",
-                                  &error_abort);
-         qdev_init_nofail(dev);
--        mr = sysbus_mmio_get_region(s->lpd.iou.gem[i], 0);
-+        mr = sysbus_mmio_get_region(SYS_BUS_DEVICE(dev), 0);
-         memory_region_add_subregion(&s->mr_ps, addrs[i], mr);
--        sysbus_connect_irq(s->lpd.iou.gem[i], 0, pic[irqs[i]]);
-+        sysbus_connect_irq(SYS_BUS_DEVICE(dev), 0, pic[irqs[i]]);
-         g_free(name);
-     }
- }
---
-2.20.1

-[PULL 14/39] hw/arm: versal: Embed the ADMAs into the SoC type
+Deleted patch
-From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
-Embed the ADMAs into the SoC type.
-Suggested-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
-Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
-Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
-Reviewed-by: Luc Michel <luc.michel@greensocs.com>
-Message-id: 20200427181649.26851-7-edgar.iglesias@gmail.com
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
----
- include/hw/arm/xlnx-versal.h |  3 ++-
- hw/arm/xlnx-versal.c         | 14 +++++++-------
- 2 files changed, 9 insertions(+), 8 deletions(-)
-diff --git a/include/hw/arm/xlnx-versal.h b/include/hw/arm/xlnx-versal.h
-index XXXXXXX..XXXXXXX 100644
---- a/include/hw/arm/xlnx-versal.h
-+++ b/include/hw/arm/xlnx-versal.h
-@@ -XXX,XX +XXX,XX @@
- #include "hw/arm/boot.h"
- #include "hw/intc/arm_gicv3.h"
- #include "hw/char/pl011.h"
-+#include "hw/dma/xlnx-zdma.h"
- #include "hw/net/cadence_gem.h"
- #define TYPE_XLNX_VERSAL "xlnx-versal"
-@@ -XXX,XX +XXX,XX @@ typedef struct Versal {
-         struct {
-             PL011State uart[XLNX_VERSAL_NR_UARTS];
-             CadenceGEMState gem[XLNX_VERSAL_NR_GEMS];
--            SysBusDevice *adma[XLNX_VERSAL_NR_ADMAS];
-+            XlnxZDMA adma[XLNX_VERSAL_NR_ADMAS];
-         } iou;
-     } lpd;
-diff --git a/hw/arm/xlnx-versal.c b/hw/arm/xlnx-versal.c
-index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/xlnx-versal.c
-+++ b/hw/arm/xlnx-versal.c
-@@ -XXX,XX +XXX,XX @@ static void versal_create_admas(Versal *s, qemu_irq *pic)
-         DeviceState *dev;
-         MemoryRegion *mr;
--        dev = qdev_create(NULL, "xlnx.zdma");
--        s->lpd.iou.adma[i] = SYS_BUS_DEVICE(dev);
--        object_property_set_int(OBJECT(s->lpd.iou.adma[i]), 128, "bus-width",
--                                &error_abort);
--        object_property_add_child(OBJECT(s), name, OBJECT(dev), &error_fatal);
-+        sysbus_init_child_obj(OBJECT(s), name,
-+                              &s->lpd.iou.adma[i], sizeof(s->lpd.iou.adma[i]),
-+                              TYPE_XLNX_ZDMA);
-+        dev = DEVICE(&s->lpd.iou.adma[i]);
-+        object_property_set_int(OBJECT(dev), 128, "bus-width", &error_abort);
-         qdev_init_nofail(dev);
--        mr = sysbus_mmio_get_region(s->lpd.iou.adma[i], 0);
-+        mr = sysbus_mmio_get_region(SYS_BUS_DEVICE(dev), 0);
-         memory_region_add_subregion(&s->mr_ps,
-                                     MM_ADMA_CH0 + i * MM_ADMA_CH0_SIZE, mr);
--        sysbus_connect_irq(s->lpd.iou.adma[i], 0, pic[VERSAL_ADMA_IRQ_0 + i]);
-+        sysbus_connect_irq(SYS_BUS_DEVICE(dev), 0, pic[VERSAL_ADMA_IRQ_0 + i]);
-         g_free(name);
-     }
- }
---
-2.20.1

-[PULL 20/39] target/arm/translate-vfp.inc.c: Remove duplicate simd_r32 check
+Deleted patch
-Somewhere along the line we accidentally added a duplicate
-"using D16-D31 when they don't exist" check to do_vfm_dp()
-(probably an artifact of a patchseries rebase). Remove it.
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
-Message-id: 20200430181003.21682-2-peter.maydell@linaro.org
----
- target/arm/translate-vfp.inc.c | 6 ------
- 1 file changed, 6 deletions(-)
-diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-vfp.inc.c
-+++ b/target/arm/translate-vfp.inc.c
-@@ -XXX,XX +XXX,XX @@ static bool do_vfm_dp(DisasContext *s, arg_VFMA_dp *a, bool neg_n, bool neg_d)
-         return false;
-     }
--    /* UNDEF accesses to D16-D31 if they don't exist. */
--    if (!dc_isar_feature(aa32_simd_r32, s) &&
--        ((a->vd | a->vn | a->vm) & 0x10)) {
--        return false;
--    }
--
-     if (!vfp_access_check(s)) {
-         return true;
-     }
---
-2.20.1
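
The "using D16-D31 when they don't exist" check that the patch above
de-duplicates relies on a small bit trick: OR-ing the register numbers
together and testing bit 4 detects whether any of them is 16 or above. The
sketch below shows that check on its own. vfp_dregs_valid() is a hypothetical
name, and has_simd_r32 stands in for dc_isar_feature(aa32_simd_r32, s).

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: return true if the D-register numbers can be used,
 * false if the instruction should UNDEF because it names D16-D31 on a CPU
 * that only has D0-D15.
 */
static bool vfp_dregs_valid(bool has_simd_r32,
                            unsigned vd, unsigned vn, unsigned vm)
{
    assert(vd < 32 && vn < 32 && vm < 32);

    /* Bit 4 of the OR is set exactly when at least one index is >= 16. */
    if (!has_simd_r32 && ((vd | vn | vm) & 0x10)) {
        return false;
    }
    return true;
}
```

Because the test is a pure function of the decoded fields, running it twice in
a row, as the code did before the patch, is harmless but redundant.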

Most of this is the Neon decodetree patches, followed by Edgar's versal cleanups.

thanks
-- PMM

The following changes since commit 2ef486e76d64436be90f7359a3071fb2a56ce835:

Merge remote-tracking branch 'remotes/marcel/tags/rdma-pull-request' into staging (2020-05-03 14:12:56 +0100)

are available in the Git repository at:

https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20200504

for you to fetch changes up to 9aefc6cf9b73f66062d2f914a0136756e7a28211:

target/arm: Move gen_ function typedefs to translate.h (2020-05-04 12:59:26 +0100)

----------------------------------------------------------------
target-arm queue:
 * Start of conversion of Neon insns to decodetree
 * versal board: support SD and RTC
 * Implement ARMv8.2-TTS2UXN
 * Make VQDMULL undefined when U=1
 * Some minor code cleanups

----------------------------------------------------------------
Edgar E. Iglesias (11):
      hw/arm: versal: Remove inclusion of arm_gicv3_common.h
      hw/arm: versal: Move misplaced comment
      hw/arm: versal-virt: Fix typo xlnx-ve -> xlnx-versal
      hw/arm: versal: Embed the UARTs into the SoC type
      hw/arm: versal: Embed the GEMs into the SoC type
      hw/arm: versal: Embed the ADMAs into the SoC type
      hw/arm: versal: Embed the APUs into the SoC type
      hw/arm: versal: Add support for SD
      hw/arm: versal: Add support for the RTC
      hw/arm: versal-virt: Add support for SD
      hw/arm: versal-virt: Add support for the RTC

Fredrik Strupe (1):
      target/arm: Make VQDMULL undefined when U=1

Peter Maydell (25):
      target/arm: Don't use a TLB for ARMMMUIdx_Stage2
      target/arm: Use enum constant in get_phys_addr_lpae() call
      target/arm: Add new 's1_is_el0' argument to get_phys_addr_lpae()
      target/arm: Implement ARMv8.2-TTS2UXN
      target/arm: Use correct variable for setting 'max' cpu's ID_AA64DFR0
      target/arm/translate-vfp.inc.c: Remove duplicate simd_r32 check
      target/arm: Don't allow Thumb Neon insns without FEATURE_NEON
      target/arm: Add stubs for AArch32 Neon decodetree
      target/arm: Convert VCMLA (vector) to decodetree
      target/arm: Convert VCADD (vector) to decodetree
      target/arm: Convert V[US]DOT (vector) to decodetree
      target/arm: Convert VFM[AS]L (vector) to decodetree
      target/arm: Convert VCMLA (scalar) to decodetree
      target/arm: Convert V[US]DOT (scalar) to decodetree
      target/arm: Convert VFM[AS]L (scalar) to decodetree
      target/arm: Convert Neon load/store multiple structures to decodetree
      target/arm: Convert Neon 'load single structure to all lanes' to decodetree
      target/arm: Convert Neon 'load/store single structure' to decodetree
      target/arm: Convert Neon 3-reg-same VADD/VSUB to decodetree
      target/arm: Convert Neon 3-reg-same logic ops to decodetree
      target/arm: Convert Neon 3-reg-same VMAX/VMIN to decodetree
      target/arm: Convert Neon 3-reg-same comparisons to decodetree
      target/arm: Convert Neon 3-reg-same VQADD/VQSUB to decodetree
      target/arm: Convert Neon 3-reg-same VMUL, VMLA, VMLS, VSHL to decodetree
      target/arm: Move gen_ function typedefs to translate.h

Philippe Mathieu-Daudé (2):
      hw/arm/mps2-tz: Use TYPE_IOTKIT instead of hardcoded string
      target/arm: Use uint64_t for midr field in CPU state struct

 include/hw/arm/xlnx-versal.h    |  31 +-
 target/arm/cpu-param.h          |   2 +-
 target/arm/cpu.h                |  38 ++-
 target/arm/translate-a64.h      |   9 -
 target/arm/translate.h          |  26 ++
 target/arm/neon-dp.decode       |  86 +++++
 target/arm/neon-ls.decode       |  52 +++
 target/arm/neon-shared.decode   |  66 ++++
 hw/arm/mps2-tz.c                |   2 +-
 hw/arm/xlnx-versal-virt.c       |  74 ++++-
 hw/arm/xlnx-versal.c            | 115 +++++--
 target/arm/cpu.c                |   3 +-
 target/arm/cpu64.c              |   8 +-
 target/arm/helper.c             | 183 ++++------
 target/arm/translate-a64.c      |  17 -
 target/arm/translate-neon.inc.c | 714 +++++++++++++++++++++++++++++++++++++++
 target/arm/translate-vfp.inc.c  |   6 -
 target/arm/translate.c          | 716 +++-------------------------------------
 target/arm/Makefile.objs        |  18 +
 19 files changed, 1302 insertions(+), 864 deletions(-)
 create mode 100644 target/arm/neon-dp.decode
 create mode 100644 target/arm/neon-ls.decode
 create mode 100644 target/arm/neon-shared.decode
 create mode 100644 target/arm/translate-neon.inc.c

From: Fredrik Strupe <fredrik@strupe.net>

According to Arm ARM, VQDMULL is only valid when U=0, while having
U=1 is unallocated.

Signed-off-by: Fredrik Strupe <fredrik@strupe.net>
Fixes: 695272dcb976 ("target-arm: Handle UNDEF cases for Neon 3-regs-different-widths")
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                     {0, 0, 0, 0}, /* VMLSL */
                     {0, 0, 0, 9}, /* VQDMLSL */
                     {0, 0, 0, 0}, /* Integer VMULL */
-                    {0, 0, 0, 1}, /* VQDMULL */
+                    {0, 0, 0, 9}, /* VQDMULL */
                     {0, 0, 0, 0xa}, /* Polynomial VMULL */
                     {0, 0, 0, 7}, /* Reserved: always UNDEF */
                 };
-- 
2.20.1
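
The one-digit change in the VQDMULL patch above can be read as a bitmask update. The following is a minimal sketch, not the code in translate.c. It assumes (the hunk does not show this) that the fourth column of the table is an "UNDEF request" mask in which bits 0 to 2 reject element sizes 0 to 2 and bit 3 rejects U=1. The helper name neon_3diff_undef and the macro UNDEF_IF_U are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed layout of the fourth table column (an assumption, not taken
 * from the hunk):
 *   bit n (n = 0..2): UNDEF if size == n
 *   bit 3:            UNDEF if U == 1
 */
#define UNDEF_IF_U 0x8u

/* Hypothetical helper: returns true if the encoding must UNDEF. */
static bool neon_3diff_undef(uint32_t undefreq, unsigned size, bool u)
{
    assert(size < 4); /* size is a 2-bit field */

    if (size < 3 && (undefreq & (1u << size))) {
        return true;
    }
    return u && (undefreq & UNDEF_IF_U);
}
```

Under that assumption, the old value 1 rejected only size 0, so a U=1 encoding with size 1 or 2 was accepted. The new value 9 keeps the size 0 check and additionally rejects every U=1 encoding, which is the same mask the VQDMLSL row already uses.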

From: Philippe Mathieu-Daudé <f4bug@amsat.org>

By using the TYPE_* definitions for devices, we can:
 - quickly find where devices are used with 'git-grep'
 - easily rename a device (one-line change).

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20200428154650.21991-1-f4bug@amsat.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/mps2-tz.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/arm/mps2-tz.c b/hw/arm/mps2-tz.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/mps2-tz.c
+++ b/hw/arm/mps2-tz.c
@@ -XXX,XX +XXX,XX @@ static void mps2tz_common_init(MachineState *machine)
         exit(EXIT_FAILURE);
     }
 
-    sysbus_init_child_obj(OBJECT(machine), "iotkit", &mms->iotkit,
+    sysbus_init_child_obj(OBJECT(machine), TYPE_IOTKIT, &mms->iotkit,
                           sizeof(mms->iotkit), mmc->armsse_type);
     iotkitdev = DEVICE(&mms->iotkit);
     object_property_set_link(OBJECT(&mms->iotkit), OBJECT(system_memory),
-- 
2.20.1

We define ARMMMUIdx_Stage2 as being an MMU index which uses a QEMU
TLB.  However we never actually use the TLB -- all stage 2 lookups
are done by direct calls to get_phys_addr_lpae() followed by a
physical address load via address_space_ld*().

Remove Stage2 from the list of ARM MMU indexes which correspond to
real core MMU indexes, and instead put it in the set of "NOTLB" ARM
MMU indexes.

This allows us to drop NB_MMU_MODES to 11.  It also means we can
safely add support for the ARMv8.2-TTS2UXN extension, which adds
permission bits to the stage 2 descriptors which define execute
permission separately for EL0 and EL1; supporting that while keeping
Stage2 in a QEMU TLB would require us to use separate TLBs for
"Stage2 for an EL0 access" and "Stage2 for an EL1 access", which is a
lot of extra complication given we aren't even using the QEMU TLB.

In the process of updating the comment on our MMU index use,
fix a couple of other minor errors:
 * NS EL2 EL2&0 was missing from the list in the comment
 * some text hadn't been updated from when we bumped NB_MMU_MODES
   above 8

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200330210400.11724-2-peter.maydell@linaro.org
---
 target/arm/cpu-param.h |   2 +-
 target/arm/cpu.h       |  21 +++++---
 target/arm/helper.c    | 112 ++++-------------------------------------
 3 files changed, 27 insertions(+), 108 deletions(-)

diff --git a/target/arm/cpu-param.h b/target/arm/cpu-param.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu-param.h
+++ b/target/arm/cpu-param.h
@@ -XXX,XX +XXX,XX @@
 # define TARGET_PAGE_BITS_MIN  10
 #endif
 
-#define NB_MMU_MODES 12
+#define NB_MMU_MODES 11
 
 #endif
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ bool write_cpustate_to_list(ARMCPU *cpu, bool kvm_sync);
  *     handling via the TLB. The only way to do a stage 1 translation without
  *     the immediate stage 2 translation is via the ATS or AT system insns,
  *     which can be slow-pathed and always do a page table walk.
+ *     The only use of stage 2 translations is either as part of an s1+2
+ *     lookup or when loading the descriptors during a stage 1 page table walk,
+ *     and in both those cases we don't use the TLB.
  *  4. we can also safely fold together the "32 bit EL3" and "64 bit EL3"
  *     translation regimes, because they map reasonably well to each other
  *     and they can't both be active at the same time.
@@ -XXX,XX +XXX,XX @@ bool write_cpustate_to_list(ARMCPU *cpu, bool kvm_sync);
  * NS EL1 EL1&0 stage 1+2 (aka NS PL1)
  * NS EL1 EL1&0 stage 1+2 +PAN
  * NS EL0 EL2&0
+ * NS EL2 EL2&0
  * NS EL2 EL2&0 +PAN
  * NS EL2 (aka NS PL2)
  * S EL0 EL1&0 (aka S PL0)
  * S EL1 EL1&0 (not used if EL3 is 32 bit)
  * S EL1 EL1&0 +PAN
  * S EL3 (aka S PL1)
- * NS EL1&0 stage 2
  *
- * for a total of 12 different mmu_idx.
+ * for a total of 11 different mmu_idx.
  *
  * R profile CPUs have an MPU, but can use the same set of MMU indexes
  * as A profile. They only need to distinguish NS EL0 and NS EL1 (and
@@ -XXX,XX +XXX,XX @@ bool write_cpustate_to_list(ARMCPU *cpu, bool kvm_sync);
  * are not quite the same -- different CPU types (most notably M profile
  * vs A/R profile) would like to use MMU indexes with different semantics,
  * but since we don't ever need to use all of those in a single CPU we
- * can avoid setting NB_MMU_MODES to more than 8. The lower bits of
+ * can avoid having to set NB_MMU_MODES to "total number of A profile MMU
+ * modes + total number of M profile MMU modes". The lower bits of
  * ARMMMUIdx are the core TLB mmu index, and the higher bits are always
  * the same for any particular CPU.
  * Variables of type ARMMUIdx are always full values, and the core
@@ -XXX,XX +XXX,XX @@ typedef enum ARMMMUIdx {
     ARMMMUIdx_SE10_1_PAN = 9 | ARM_MMU_IDX_A,
     ARMMMUIdx_SE3        = 10 | ARM_MMU_IDX_A,
 
-    ARMMMUIdx_Stage2     = 11 | ARM_MMU_IDX_A,
-
     /*
      * These are not allocated TLBs and are used only for AT system
      * instructions or for the first stage of an S12 page table walk.
@@ -XXX,XX +XXX,XX @@ typedef enum ARMMMUIdx {
     ARMMMUIdx_Stage1_E0 = 0 | ARM_MMU_IDX_NOTLB,
     ARMMMUIdx_Stage1_E1 = 1 | ARM_MMU_IDX_NOTLB,
     ARMMMUIdx_Stage1_E1_PAN = 2 | ARM_MMU_IDX_NOTLB,
+    /*
+     * Not allocated a TLB: used only for second stage of an S12 page
+     * table walk, or for descriptor loads during first stage of an S1
+     * page table walk. Note that if we ever want to have a TLB for this
+     * then various TLB flush insns which currently are no-ops or flush
+     * only stage 1 MMU indexes will need to change to flush stage 2.
+     */
+    ARMMMUIdx_Stage2     = 3 | ARM_MMU_IDX_NOTLB,
 
     /*
      * M-profile.
@@ -XXX,XX +XXX,XX @@ typedef enum ARMMMUIdxBit {
     TO_CORE_BIT(SE10_1),
     TO_CORE_BIT(SE10_1_PAN),
     TO_CORE_BIT(SE3),
-    TO_CORE_BIT(Stage2),
 
     TO_CORE_BIT(MUser),
     TO_CORE_BIT(MPriv),
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void tlbiall_nsnh_write(CPUARMState *env, const ARMCPRegInfo *ri,
     tlb_flush_by_mmuidx(cs,
                         ARMMMUIdxBit_E10_1 |
                         ARMMMUIdxBit_E10_1_PAN |
-                        ARMMMUIdxBit_E10_0 |
-                        ARMMMUIdxBit_Stage2);
+                        ARMMMUIdxBit_E10_0);
 }
 
 static void tlbiall_nsnh_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -XXX,XX +XXX,XX @@ static void tlbiall_nsnh_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
     tlb_flush_by_mmuidx_all_cpus_synced(cs,
                                         ARMMMUIdxBit_E10_1 |
                                         ARMMMUIdxBit_E10_1_PAN |
-                                        ARMMMUIdxBit_E10_0 |
-                                        ARMMMUIdxBit_Stage2);
+                                        ARMMMUIdxBit_E10_0);
 }
 
-static void tlbiipas2_write(CPUARMState *env, const ARMCPRegInfo *ri,
-                            uint64_t value)
-{
-    /* Invalidate by IPA. This has to invalidate any structures that
-     * contain only stage 2 translation information, but does not need
-     * to apply to structures that contain combined stage 1 and stage 2
-     * translation information.
-     * This must NOP if EL2 isn't implemented or SCR_EL3.NS is zero.
-     */
-    CPUState *cs = env_cpu(env);
-    uint64_t pageaddr;
-
-    if (!arm_feature(env, ARM_FEATURE_EL2) || !(env->cp15.scr_el3 & SCR_NS)) {
-        return;
-    }
-
-    pageaddr = sextract64(value << 12, 0, 40);
-
-    tlb_flush_page_by_mmuidx(cs, pageaddr, ARMMMUIdxBit_Stage2);
-}
-
-static void tlbiipas2_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
-                               uint64_t value)
-{
-    CPUState *cs = env_cpu(env);
-    uint64_t pageaddr;
-
-    if (!arm_feature(env, ARM_FEATURE_EL2) || !(env->cp15.scr_el3 & SCR_NS)) {
-        return;
-    }
-
-    pageaddr = sextract64(value << 12, 0, 40);
-
-    tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr,
-                                             ARMMMUIdxBit_Stage2);
-}
 
 static void tlbiall_hyp_write(CPUARMState *env, const ARMCPRegInfo *ri,
                               uint64_t value)
@@ -XXX,XX +XXX,XX @@ static void vttbr_write(CPUARMState *env, const ARMCPRegInfo *ri,
         tlb_flush_by_mmuidx(cs,
                             ARMMMUIdxBit_E10_1 |
                             ARMMMUIdxBit_E10_1_PAN |
-                            ARMMMUIdxBit_E10_0 |
-                            ARMMMUIdxBit_Stage2);
+                            ARMMMUIdxBit_E10_0);
         raw_write(env, ri, value);
     }
 }
@@ -XXX,XX +XXX,XX @@ static int alle1_tlbmask(CPUARMState *env)
         return ARMMMUIdxBit_SE10_1 |
                ARMMMUIdxBit_SE10_1_PAN |
                ARMMMUIdxBit_SE10_0;
-    } else if (arm_feature(env, ARM_FEATURE_EL2)) {
-        return ARMMMUIdxBit_E10_1 |
-               ARMMMUIdxBit_E10_1_PAN |
-               ARMMMUIdxBit_E10_0 |
-               ARMMMUIdxBit_Stage2;
     } else {
         return ARMMMUIdxBit_E10_1 |
                ARMMMUIdxBit_E10_1_PAN |
@@ -XXX,XX +XXX,XX @@ static void tlbi_aa64_vae3is_write(CPUARMState *env, const ARMCPRegInfo *ri,
                                              ARMMMUIdxBit_SE3);
 }
 
-static void tlbi_aa64_ipas2e1_write(CPUARMState *env, const ARMCPRegInfo *ri,
-                                    uint64_t value)
-{
-    /* Invalidate by IPA. This has to invalidate any structures that
-     * contain only stage 2 translation information, but does not need
-     * to apply to structures that contain combined stage 1 and stage 2
-     * translation information.
-     * This must NOP if EL2 isn't implemented or SCR_EL3.NS is zero.
-     */
-    ARMCPU *cpu = env_archcpu(env);
-    CPUState *cs = CPU(cpu);
-    uint64_t pageaddr;
-
-    if (!arm_feature(env, ARM_FEATURE_EL2) || !(env->cp15.scr_el3 & SCR_NS)) {
-        return;
-    }
-
-    pageaddr = sextract64(value << 12, 0, 48);
-
-    tlb_flush_page_by_mmuidx(cs, pageaddr, ARMMMUIdxBit_Stage2);
-}
-
-static void tlbi_aa64_ipas2e1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
-                                      uint64_t value)
-{
-    CPUState *cs = env_cpu(env);
-    uint64_t pageaddr;
-
-    if (!arm_feature(env, ARM_FEATURE_EL2) || !(env->cp15.scr_el3 & SCR_NS)) {
-        return;
-    }
-
-    pageaddr = sextract64(value << 12, 0, 48);
-
-    tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr,
-                                             ARMMMUIdxBit_Stage2);
-}
-
 static CPAccessResult aa64_zva_access(CPUARMState *env, const ARMCPRegInfo *ri,
                                       bool isread)
 {
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
       .writefn = tlbi_aa64_vae1_write },
     { .name = "TLBI_IPAS2E1IS", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 4, .crn = 8, .crm = 0, .opc2 = 1,
-      .access = PL2_W, .type = ARM_CP_NO_RAW,
-      .writefn = tlbi_aa64_ipas2e1is_write },
+      .access = PL2_W, .type = ARM_CP_NOP },
     { .name = "TLBI_IPAS2LE1IS", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 4, .crn = 8, .crm = 0, .opc2 = 5,
-      .access = PL2_W, .type = ARM_CP_NO_RAW,
-      .writefn = tlbi_aa64_ipas2e1is_write },
+      .access = PL2_W, .type = ARM_CP_NOP },
     { .name = "TLBI_ALLE1IS", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 4, .crn = 8, .crm = 3, .opc2 = 4,
       .access = PL2_W, .type = ARM_CP_NO_RAW,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
       .writefn = tlbi_aa64_alle1is_write },
     { .name = "TLBI_IPAS2E1", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 4, .crn = 8, .crm = 4, .opc2 = 1,
-      .access = PL2_W, .type = ARM_CP_NO_RAW,
-      .writefn = tlbi_aa64_ipas2e1_write },
+      .access = PL2_W, .type = ARM_CP_NOP },
     { .name = "TLBI_IPAS2LE1", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 4, .crn = 8, .crm = 4, .opc2 = 5,
-      .access = PL2_W, .type = ARM_CP_NO_RAW,
-      .writefn = tlbi_aa64_ipas2e1_write },
+      .access = PL2_W, .type = ARM_CP_NOP },
     { .name = "TLBI_ALLE1", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 4, .crn = 8, .crm = 7, .opc2 = 4,
       .access = PL2_W, .type = ARM_CP_NO_RAW,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
       .writefn = tlbimva_hyp_is_write },
     { .name = "TLBIIPAS2",
       .cp = 15, .opc1 = 4, .crn = 8, .crm = 4, .opc2 = 1,
-      .type = ARM_CP_NO_RAW, .access = PL2_W,
-      .writefn = tlbiipas2_write },
+      .type = ARM_CP_NOP, .access = PL2_W },
     { .name = "TLBIIPAS2IS",
       .cp = 15, .opc1 = 4, .crn = 8, .crm = 0, .opc2 = 1,
-      .type = ARM_CP_NO_RAW, .access = PL2_W,
-      .writefn = tlbiipas2_is_write },
+      .type = ARM_CP_NOP, .access = PL2_W },
     { .name = "TLBIIPAS2L",
       .cp = 15, .opc1 = 4, .crn = 8, .crm = 4, .opc2 = 5,
-      .type = ARM_CP_NO_RAW, .access = PL2_W,
-      .writefn = tlbiipas2_write },
+      .type = ARM_CP_NOP, .access = PL2_W },
     { .name = "TLBIIPAS2LIS",
       .cp = 15, .opc1 = 4, .crn = 8, .crm = 0, .opc2 = 5,
-      .type = ARM_CP_NO_RAW, .access = PL2_W,
-      .writefn = tlbiipas2_is_write },
+      .type = ARM_CP_NOP, .access = PL2_W },
     /* 32 bit cache operations */
     { .name = "ICIALLUIS", .cp = 15, .opc1 = 0, .crn = 7, .crm = 1, .opc2 = 0,
       .type = ARM_CP_NOP, .access = PL1_W, .accessfn = aa64_cacheop_pou_access },
-- 
2.20.1
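
The Stage2 patch above relies on the split between the low bits of an ARMMMUIdx (the core TLB index) and the high bits (the index type). The following is a minimal sketch of that split, not the definitions in target/arm/cpu.h. The numeric values of the SKETCH_* type constants and masks are illustrative assumptions, and all SKETCH_/sketch_ names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only; the real constants live in target/arm/cpu.h. */
#define SKETCH_IDX_A        0x10  /* A-profile index backed by a core TLB */
#define SKETCH_IDX_NOTLB    0x20  /* index with no core TLB allocated */
#define SKETCH_TYPE_MASK    0x70
#define SKETCH_COREIDX_MASK 0x0f
#define SKETCH_NB_MMU_MODES 11    /* value after the patch */

enum sketch_mmu_idx {
    SKETCH_SE3       = 10 | SKETCH_IDX_A,
    SKETCH_STAGE1_E0 = 0 | SKETCH_IDX_NOTLB,
    SKETCH_STAGE2    = 3 | SKETCH_IDX_NOTLB, /* moved here by the patch */
};

/* True if the index corresponds to a real core TLB. */
static bool sketch_has_core_tlb(enum sketch_mmu_idx idx)
{
    return (idx & SKETCH_TYPE_MASK) != SKETCH_IDX_NOTLB;
}

/* Core TLB index; only meaningful for TLB-backed indexes. */
static int sketch_core_idx(enum sketch_mmu_idx idx)
{
    assert(sketch_has_core_tlb(idx));
    return idx & SKETCH_COREIDX_MASK;
}
```

With Stage2 in the NOTLB group, the largest TLB-backed core index in this sketch is 10 (SE3), so 11 modes are enough, and no TLB flush can name Stage2 because it has no core bit.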

The access_type argument to get_phys_addr_lpae() is an MMUAccessType;
use the enum constant MMU_DATA_LOAD rather than a literal 0 when we
call it in S1_ptw_translate().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200330210400.11724-3-peter.maydell@linaro.org
---
 target/arm/helper.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static hwaddr S1_ptw_translate(CPUARMState *env, ARMMMUIdx mmu_idx,
             pcacheattrs = &cacheattrs;
         }
 
-        ret = get_phys_addr_lpae(env, addr, 0, ARMMMUIdx_Stage2, &s2pa,
-                                 &txattrs, &s2prot, &s2size, fi, pcacheattrs);
+        ret = get_phys_addr_lpae(env, addr, MMU_DATA_LOAD, ARMMMUIdx_Stage2,
+                                 &s2pa, &txattrs, &s2prot, &s2size, fi,
+                                 pcacheattrs);
         if (ret) {
             assert(fi->type != ARMFault_None);
             fi->s2addr = addr;
-- 
2.20.1

For ARMv8.2-TTS2UXN, the stage 2 page table walk wants to know
whether the stage 1 access is for EL0 or not, because whether
exec permission is given can depend on whether this is an EL0
or EL1 access. Add a new argument to get_phys_addr_lpae() so
the call sites can pass this information in.

Since get_phys_addr_lpae() doesn't already have a doc comment,
add one so we have a place to put the documentation of the
semantics of the new s1_is_el0 argument.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200330210400.11724-4-peter.maydell@linaro.org
---
 target/arm/helper.c | 29 ++++++++++++++++++++++++++++-
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@
 
 static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
                                MMUAccessType access_type, ARMMMUIdx mmu_idx,
+                               bool s1_is_el0,
                                hwaddr *phys_ptr, MemTxAttrs *txattrs, int *prot,
                                target_ulong *page_size_ptr,
                                ARMMMUFaultInfo *fi, ARMCacheAttrs *cacheattrs);
@@ -XXX,XX +XXX,XX @@ static hwaddr S1_ptw_translate(CPUARMState *env, ARMMMUIdx mmu_idx,
         }
 
         ret = get_phys_addr_lpae(env, addr, MMU_DATA_LOAD, ARMMMUIdx_Stage2,
+                                 false,
                                  &s2pa, &txattrs, &s2prot, &s2size, fi,
                                  pcacheattrs);
         if (ret) {
@@ -XXX,XX +XXX,XX @@ static ARMVAParameters aa32_va_parameters(CPUARMState *env, uint32_t va,
     };
 }
 
+/**
+ * get_phys_addr_lpae: perform one stage of page table walk, LPAE format
+ *
+ * Returns false if the translation was successful. Otherwise, phys_ptr, attrs,
+ * prot and page_size may not be filled in, and the populated fsr value provides
+ * information on why the translation aborted, in the format of a long-format
+ * DFSR/IFSR fault register, with the following caveats:
+ *  * the WnR bit is never set (the caller must do this).
+ *
+ * @env: CPUARMState
+ * @address: virtual address to get physical address for
+ * @access_type: MMU_DATA_LOAD, MMU_DATA_STORE or MMU_INST_FETCH
+ * @mmu_idx: MMU index indicating required translation regime
+ * @s1_is_el0: if @mmu_idx is ARMMMUIdx_Stage2 (so this is a stage 2 page table
+ *             walk), must be true if this is stage 2 of a stage 1+2 walk for an
+ *             EL0 access. If @mmu_idx is anything else, @s1_is_el0 is ignored.
+ * @phys_ptr: set to the physical address corresponding to the virtual address
+ * @attrs: set to the memory transaction attributes to use
+ * @prot: set to the permissions for the page containing phys_ptr
+ * @page_size_ptr: set to the size of the page containing phys_ptr
+ * @fi: set to fault info if the translation fails
+ * @cacheattrs: (if non-NULL) set to the cacheability/shareability attributes
+ */
 static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
                                MMUAccessType access_type, ARMMMUIdx mmu_idx,
+                               bool s1_is_el0,
                                hwaddr *phys_ptr, MemTxAttrs *txattrs, int *prot,
                                target_ulong *page_size_ptr,
                                ARMMMUFaultInfo *fi, ARMCacheAttrs *cacheattrs)
@@ -XXX,XX +XXX,XX @@ bool get_phys_addr(CPUARMState *env, target_ulong address,
 
             /* S1 is done. Now do S2 translation.  */
             ret = get_phys_addr_lpae(env, ipa, access_type, ARMMMUIdx_Stage2,
+                                     mmu_idx == ARMMMUIdx_E10_0,
                                      phys_ptr, attrs, &s2_prot,
                                      page_size, fi,
                                      cacheattrs != NULL ? &cacheattrs2 : NULL);
@@ -XXX,XX +XXX,XX @@ bool get_phys_addr(CPUARMState *env, target_ulong address,
     }
 
     if (regime_using_lpae_format(env, mmu_idx)) {
-        return get_phys_addr_lpae(env, address, access_type, mmu_idx,
+        return get_phys_addr_lpae(env, address, access_type, mmu_idx, false,
                                   phys_ptr, attrs, prot, page_size,
                                   fi, cacheattrs);
     } else if (regime_sctlr(env, mmu_idx) & SCTLR_XP) {
-- 
2.20.1

The ARMv8.2-TTS2UXN feature extends the XN field in stage 2
translation table descriptors from just bit [54] to bits [54:53],
allowing stage 2 to control execution permissions separately for EL0
and EL1. Implement the new semantics of the XN field and enable
the feature for our 'max' CPU.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200330210400.11724-5-peter.maydell@linaro.org
---
 target/arm/cpu.h    | 15 +++++++++++++++
 target/arm/cpu.c    |  1 +
 target/arm/cpu64.c  |  2 ++
 target/arm/helper.c | 37 +++++++++++++++++++++++++++++++------
 4 files changed, 49 insertions(+), 6 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa32_ccidx(const ARMISARegisters *id)
     return FIELD_EX32(id->id_mmfr4, ID_MMFR4, CCIDX) != 0;
 }
 
+static inline bool isar_feature_aa32_tts2uxn(const ARMISARegisters *id)
+{
+    return FIELD_EX32(id->id_mmfr4, ID_MMFR4, XNX) != 0;
+}
+
 /*
  * 64-bit feature tests via id registers.
  */
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa64_ccidx(const ARMISARegisters *id)
     return FIELD_EX64(id->id_aa64mmfr2, ID_AA64MMFR2, CCIDX) != 0;
 }
 
+static inline bool isar_feature_aa64_tts2uxn(const ARMISARegisters *id)
+{
+    return FIELD_EX64(id->id_aa64mmfr1, ID_AA64MMFR1, XNX) != 0;
+}
+
 /*
  * Feature tests for "does this exist in either 32-bit or 64-bit?"
  */
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_any_ccidx(const ARMISARegisters *id)
     return isar_feature_aa64_ccidx(id) || isar_feature_aa32_ccidx(id);
 }
 
+static inline bool isar_feature_any_tts2uxn(const ARMISARegisters *id)
+{
+    return isar_feature_aa64_tts2uxn(id) || isar_feature_aa32_tts2uxn(id);
+}
+
 /*
  * Forward to the above feature tests given an ARMCPU pointer.
  */
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void arm_max_initfn(Object *obj)
             t = FIELD_DP32(t, ID_MMFR4, HPDS, 1); /* AA32HPD */
             t = FIELD_DP32(t, ID_MMFR4, AC2, 1); /* ACTLR2, HACTLR2 */
             t = FIELD_DP32(t, ID_MMFR4, CNP, 1); /* TTCNP */
+            t = FIELD_DP32(t, ID_MMFR4, XNX, 1); /* TTS2UXN */
             cpu->isar.id_mmfr4 = t;
         }
 #endif
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
         t = FIELD_DP64(t, ID_AA64MMFR1, VH, 1);
         t = FIELD_DP64(t, ID_AA64MMFR1, PAN, 2); /* ATS1E1 */
         t = FIELD_DP64(t, ID_AA64MMFR1, VMIDBITS, 2); /* VMID16 */
+        t = FIELD_DP64(t, ID_AA64MMFR1, XNX, 1); /* TTS2UXN */
         cpu->isar.id_aa64mmfr1 = t;
 
         t = cpu->isar.id_aa64mmfr2;
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
         u = FIELD_DP32(u, ID_MMFR4, HPDS, 1); /* AA32HPD */
         u = FIELD_DP32(u, ID_MMFR4, AC2, 1); /* ACTLR2, HACTLR2 */
         u = FIELD_DP32(u, ID_MMFR4, CNP, 1); /* TTCNP */
+        u = FIELD_DP32(u, ID_MMFR4, XNX, 1); /* TTS2UXN */
         cpu->isar.id_mmfr4 = u;
 
         u = cpu->isar.id_aa64dfr0;
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ simple_ap_to_rw_prot(CPUARMState *env, ARMMMUIdx mmu_idx, int ap)
  *
  * @env:     CPUARMState
  * @s2ap:    The 2-bit stage2 access permissions (S2AP)
- * @xn:      XN (execute-never) bit
+ * @xn:      XN (execute-never) bits
+ * @s1_is_el0: true if this is S2 of an S1+2 walk for EL0
  */
-static int get_S2prot(CPUARMState *env, int s2ap, int xn)
+static int get_S2prot(CPUARMState *env, int s2ap, int xn, bool s1_is_el0)
 {
     int prot = 0;
 
@@ -XXX,XX +XXX,XX @@ static int get_S2prot(CPUARMState *env, int s2ap, int xn)
     if (s2ap & 2) {
         prot |= PAGE_WRITE;
     }
-    if (!xn) {
-        if (arm_el_is_aa64(env, 2) || prot & PAGE_READ) {
+
+    if (cpu_isar_feature(any_tts2uxn, env_archcpu(env))) {
+        switch (xn) {
+        case 0:
             prot |= PAGE_EXEC;
+            break;
+        case 1:
+            if (s1_is_el0) {
+                prot |= PAGE_EXEC;
+            }
+            break;
+        case 2:
+            break;
+        case 3:
+            if (!s1_is_el0) {
+                prot |= PAGE_EXEC;
+            }
+            break;
+        default:
+            g_assert_not_reached();
+        }
+    } else {
+        if (!extract32(xn, 1, 1)) {
+            if (arm_el_is_aa64(env, 2) || prot & PAGE_READ) {
+                prot |= PAGE_EXEC;
+            }
         }
     }
     return prot;
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
     }
 
     ap = extract32(attrs, 4, 2);
-    xn = extract32(attrs, 12, 1);
 
     if (mmu_idx == ARMMMUIdx_Stage2) {
         ns = true;
-        *prot = get_S2prot(env, ap, xn);
+        xn = extract32(attrs, 11, 2);
+        *prot = get_S2prot(env, ap, xn, s1_is_el0);
     } else {
         ns = extract32(attrs, 3, 1);
+        xn = extract32(attrs, 12, 1);
         pxn = extract32(attrs, 11, 1);
         *prot = get_S1prot(env, mmu_idx, aarch64, ap, ns, xn, pxn);
     }
-- 
2.20.1
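
The XN decoding added to get_S2prot() in the TTS2UXN patch above can be exercised in isolation. The following is a minimal standalone sketch that mirrors the switch in the patch and reads the field from bits [54:53] of a raw descriptor, as the commit message describes. The helper names s2_desc_xn and s2_exec_allowed are hypothetical, and the sketch models only the case where the feature is present.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Extract the 2-bit XN field, bits [54:53] of a stage 2 descriptor. */
static unsigned s2_desc_xn(uint64_t desc)
{
    return (unsigned)((desc >> 53) & 0x3);
}

/*
 * Hypothetical helper mirroring the TTS2UXN switch in get_S2prot():
 *   XN == 0: executable at EL0 and EL1
 *   XN == 1: executable at EL0 only
 *   XN == 2: never executable
 *   XN == 3: executable at EL1 only
 */
static bool s2_exec_allowed(unsigned xn, bool s1_is_el0)
{
    assert(xn < 4); /* xn is a 2-bit field */

    switch (xn) {
    case 0:
        return true;
    case 1:
        return s1_is_el0;
    case 2:
        return false;
    default:
        return !s1_is_el0;
    }
}
```

Without the feature, the patch looks only at the upper bit of the field (descriptor bit [54]), which keeps the pre-existing single-bit XN behaviour.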

In aarch64_max_initfn() we update both 32-bit and 64-bit ID
registers.  The intended pattern is that for 64-bit ID registers we
use FIELD_DP64 and the uint64_t 't' register, while 32-bit ID
registers use FIELD_DP32 and the uint32_t 'u' register.  For
ID_AA64DFR0 we accidentally used 'u', meaning that the top 32 bits of
this 64-bit ID register would end up always zero.  Luckily at the
moment that's what they should be anyway, so this bug has no visible
effects.

Use the right-sized variable.

Fixes: 3bec78447a958d481991
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20200423110915.10527-1-peter.maydell@linaro.org
---
 target/arm/cpu64.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
         u = FIELD_DP32(u, ID_MMFR4, XNX, 1); /* TTS2UXN */
         cpu->isar.id_mmfr4 = u;
 
-        u = cpu->isar.id_aa64dfr0;
-        u = FIELD_DP64(u, ID_AA64DFR0, PMUVER, 5); /* v8.4-PMU */
-        cpu->isar.id_aa64dfr0 = u;
+        t = cpu->isar.id_aa64dfr0;
+        t = FIELD_DP64(t, ID_AA64DFR0, PMUVER, 5); /* v8.4-PMU */
+        cpu->isar.id_aa64dfr0 = t;
 
         u = cpu->isar.id_dfr0;
         u = FIELD_DP32(u, ID_DFR0, PERFMON, 5); /* v8.4-PMU */
-- 
2.20.1
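
The effect of using the 32-bit 'u' variable for a 64-bit ID register, as described in the ID_AA64DFR0 patch above, can be shown with a small standalone sketch. This is not QEMU code: sketch_deposit64 is a hypothetical stand-in for a 64-bit field deposit, and the field position (start 8, length 4) is an illustrative assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in: replace 'length' bits of 'value' at 'start'. */
static uint64_t sketch_deposit64(uint64_t value, int start, int length,
                                 uint64_t field)
{
    assert(start >= 0 && length > 0 && length <= 64 - start);

    uint64_t mask = (~UINT64_C(0) >> (64 - length)) << start;
    return (value & ~mask) | ((field << start) & mask);
}

/* Buggy pattern: the 64-bit register round-trips through a uint32_t. */
static uint64_t update_via_u32(uint64_t reg)
{
    uint32_t u = (uint32_t)reg;                  /* top 32 bits lost here */
    u = (uint32_t)sketch_deposit64(u, 8, 4, 5);
    return u;
}

/* Fixed pattern: a uint64_t temporary preserves the whole register. */
static uint64_t update_via_u64(uint64_t reg)
{
    uint64_t t = reg;
    t = sketch_deposit64(t, 8, 4, 5);
    return t;
}
```

The two functions agree whenever the top 32 bits of the register are zero, which matches the commit message's observation that the bug had no visible effect at the time.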

From: Philippe Mathieu-Daudé <f4bug@amsat.org>

MIDR_EL1 is a 64-bit system register with the top 32 bits being RES0.
Represent it in QEMU's ARMCPU struct with a uint64_t, not a
uint32_t.

This fixes an error when compiling with -Werror=conversion
because we were manipulating the register value using a
local uint64_t variable:

target/arm/cpu64.c: In function ‘aarch64_max_initfn’:
  target/arm/cpu64.c:628:21: error: conversion from ‘uint64_t’ {aka ‘long unsigned int’} to ‘uint32_t’ {aka ‘unsigned int’} may change value [-Werror=conversion]
    628 |         cpu->midr = t;
        |                     ^

and future-proofs us against a possible future architecture
change using some of the top 32 bits.

Suggested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Message-id: 20200428172634.29707-1-f4bug@amsat.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h | 2 +-
 target/arm/cpu.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ struct ARMCPU {
         uint64_t id_aa64dfr0;
         uint64_t id_aa64dfr1;
     } isar;
-    uint32_t midr;
+    uint64_t midr;
     uint32_t revidr;
     uint32_t reset_fpsid;
     uint32_t ctr;
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static const ARMCPUInfo arm_cpus[] = {
 static Property arm_cpu_properties[] = {
     DEFINE_PROP_BOOL("start-powered-off", ARMCPU, start_powered_off, false),
     DEFINE_PROP_UINT32("psci-conduit", ARMCPU, psci_conduit, 0),
-    DEFINE_PROP_UINT32("midr", ARMCPU, midr, 0),
+    DEFINE_PROP_UINT64("midr", ARMCPU, midr, 0),
     DEFINE_PROP_UINT64("mp-affinity", ARMCPU,
                         mp_affinity, ARM64_AFFINITY_INVALID),
     DEFINE_PROP_INT32("node-id", ARMCPU, node_id, CPU_UNSET_NUMA_NODE_ID),
-- 
2.20.1