Series comparison

-[PULL 00/37] target-arm queue
+[PULL 00/39] target-arm queue
-Nothing much exciting here, but it's 37 patches worth...
+Most of this is the Neon decodetree patches, followed by Edgar's versal cleanups.
 thanks
 -- PMM
-The following changes since commit e64a62df378a746c0b257105959613c9f8122e59:
-  Merge remote-tracking branch 'remotes/stsquad/tags/pull-testing-040320-1' into staging (2020-03-05 12:13:51 +0000)
+The following changes since commit 2ef486e76d64436be90f7359a3071fb2a56ce835:
   Merge remote-tracking branch 'remotes/marcel/tags/rdma-pull-request' into staging (2020-05-03 14:12:56 +0100)
 are available in the Git repository at:
-  https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20200305
+  https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20200504
-for you to fetch changes up to 597d61a3b1f94c53a3aaa77671697c0c5f797dbf:
+for you to fetch changes up to 9aefc6cf9b73f66062d2f914a0136756e7a28211:
-  target/arm: Clean address for DC ZVA (2020-03-05 16:09:21 +0000)
+  target/arm: Move gen_ function typedefs to translate.h (2020-05-04 12:59:26 +0100)
 ----------------------------------------------------------------
- * versal: Implement ADMA
+target-arm queue:
- * Implement (trivially) ARMv8.2-TTCNP
+ * Start of conversion of Neon insns to decodetree
- * hw/arm/smmu-common: a fix to smmu_find_smmu_pcibus
+ * versal board: support SD and RTC
- * Remove unnecessary endianness-handling on some boards
+ * Implement ARMv8.2-TTS2UXN
- * Avoid minor memory leaks from timer_new in some devices
+ * Make VQDMULL undefined when U=1
- * Honour more of the HCR_EL2 trap bits
+ * Some minor code cleanups
  * Complain rather than ignoring bad command line options for cubieboard
  * Honour TBI for DC ZVA and exception return
 ----------------------------------------------------------------
-Edgar E. Iglesias (2):
+Edgar E. Iglesias (11):
-      hw/arm: versal: Add support for the LPD ADMAs
+      hw/arm: versal: Remove inclusion of arm_gicv3_common.h
-      hw/arm: versal: Generate xlnx-versal-virt zdma FDT nodes
+      hw/arm: versal: Move misplaced comment
       hw/arm: versal-virt: Fix typo xlnx-ve -> xlnx-versal
       hw/arm: versal: Embed the UARTs into the SoC type
       hw/arm: versal: Embed the GEMs into the SoC type
       hw/arm: versal: Embed the ADMAs into the SoC type
       hw/arm: versal: Embed the APUs into the SoC type
       hw/arm: versal: Add support for SD
       hw/arm: versal: Add support for the RTC
       hw/arm: versal-virt: Add support for SD
       hw/arm: versal-virt: Add support for the RTC
-Eric Auger (1):
+Fredrik Strupe (1):
-      hw/arm/smmu-common: a fix to smmu_find_smmu_pcibus
+      target/arm: Make VQDMULL undefined when U=1
-Niek Linnenbank (4):
+Peter Maydell (25):
-      hw/arm/cubieboard: use ARM Cortex-A8 as the default CPU in machine definition
+      target/arm: Don't use a TLB for ARMMMUIdx_Stage2
-      hw/arm/cubieboard: restrict allowed CPU type to ARM Cortex-A8
+      target/arm: Use enum constant in get_phys_addr_lpae() call
-      hw/arm/cubieboard: restrict allowed RAM size to 512MiB and 1GiB
+      target/arm: Add new 's1_is_el0' argument to get_phys_addr_lpae()
-      hw/arm/cubieboard: report error when using unsupported -bios argument
+      target/arm: Implement ARMv8.2-TTS2UXN
       target/arm: Use correct variable for setting 'max' cpu's ID_AA64DFR0
       target/arm/translate-vfp.inc.c: Remove duplicate simd_r32 check
       target/arm: Don't allow Thumb Neon insns without FEATURE_NEON
       target/arm: Add stubs for AArch32 Neon decodetree
       target/arm: Convert VCMLA (vector) to decodetree
       target/arm: Convert VCADD (vector) to decodetree
       target/arm: Convert V[US]DOT (vector) to decodetree
       target/arm: Convert VFM[AS]L (vector) to decodetree
       target/arm: Convert VCMLA (scalar) to decodetree
       target/arm: Convert V[US]DOT (scalar) to decodetree
       target/arm: Convert VFM[AS]L (scalar) to decodetree
       target/arm: Convert Neon load/store multiple structures to decodetree
       target/arm: Convert Neon 'load single structure to all lanes' to decodetree
       target/arm: Convert Neon 'load/store single structure' to decodetree
       target/arm: Convert Neon 3-reg-same VADD/VSUB to decodetree
       target/arm: Convert Neon 3-reg-same logic ops to decodetree
       target/arm: Convert Neon 3-reg-same VMAX/VMIN to decodetree
       target/arm: Convert Neon 3-reg-same comparisons to decodetree
       target/arm: Convert Neon 3-reg-same VQADD/VQSUB to decodetree
       target/arm: Convert Neon 3-reg-same VMUL, VMLA, VMLS, VSHL to decodetree
       target/arm: Move gen_ function typedefs to translate.h
-Pan Nengyuan (4):
+Philippe Mathieu-Daudé (2):
-      hw/arm/pxa2xx: move timer_new from init() into realize() to avoid memleaks
+      hw/arm/mps2-tz: Use TYPE_IOTKIT instead of hardcoded string
-      hw/arm/spitz: move timer_new from init() into realize() to avoid memleaks
+      target/arm: Use uint64_t for midr field in CPU state struct
       hw/arm/strongarm: move timer_new from init() into realize() to avoid memleaks
       hw/timer/cadence_ttc: move timer_new from init() into realize() to avoid memleaks
-Peter Maydell (1):
+ include/hw/arm/xlnx-versal.h    |  31 +-
-      target/arm: Implement (trivially) ARMv8.2-TTCNP
+ target/arm/cpu-param.h          |   2 +-
  target/arm/cpu.h                |  38 ++-
  target/arm/translate-a64.h      |   9 -
  target/arm/translate.h          |  26 ++
  target/arm/neon-dp.decode       |  86 +++++
  target/arm/neon-ls.decode       |  52 +++
  target/arm/neon-shared.decode   |  66 ++++
  hw/arm/mps2-tz.c                |   2 +-
  hw/arm/xlnx-versal-virt.c       |  74 ++++-
  hw/arm/xlnx-versal.c            | 115 +++++--
  target/arm/cpu.c                |   3 +-
  target/arm/cpu64.c              |   8 +-
  target/arm/helper.c             | 183 ++++------
  target/arm/translate-a64.c      |  17 -
  target/arm/translate-neon.inc.c | 714 +++++++++++++++++++++++++++++++++++++++
  target/arm/translate-vfp.inc.c  |   6 -
  target/arm/translate.c          | 716 +++-------------------------------------
  target/arm/Makefile.objs        |  18 +
 files changed, 1302 insertions(+), 864 deletions(-)
  create mode 100644 target/arm/neon-dp.decode
  create mode 100644 target/arm/neon-ls.decode
  create mode 100644 target/arm/neon-shared.decode
  create mode 100644 target/arm/translate-neon.inc.c
-Philippe Mathieu-Daudé (6):
-      hw/arm/smmu-common: Simplify smmu_find_smmu_pcibus() logic
-      hw/arm/gumstix: Simplify since the machines are little-endian only
-      hw/arm/mainstone: Simplify since the machines are little-endian only
-      hw/arm/omap_sx1: Simplify since the machines are little-endian only
-      hw/arm/z2: Simplify since the machines are little-endian only
-      hw/arm/musicpal: Simplify since the machines are little-endian only
-Richard Henderson (19):
-      target/arm: Improve masking of HCR/HCR2 RES0 bits
-      target/arm: Add HCR_EL2 bit definitions from ARMv8.6
-      target/arm: Disable has_el2 and has_el3 for user-only
-      target/arm: Remove EL2 and EL3 setup from user-only
-      target/arm: Improve masking in arm_hcr_el2_eff
-      target/arm: Honor the HCR_EL2.{TVM,TRVM} bits
-      target/arm: Honor the HCR_EL2.TSW bit
-      target/arm: Honor the HCR_EL2.TACR bit
-      target/arm: Honor the HCR_EL2.TPCP bit
-      target/arm: Honor the HCR_EL2.TPU bit
-      target/arm: Honor the HCR_EL2.TTLB bit
-      tests/tcg/aarch64: Add newline in pauth-1 printf
-      target/arm: Replicate TBI/TBID bits for single range regimes
-      target/arm: Optimize cpu_mmu_index
-      target/arm: Introduce core_to_aa64_mmu_idx
-      target/arm: Apply TBI to ESR_ELx in helper_exception_return
-      target/arm: Move helper_dc_zva to helper-a64.c
-      target/arm: Use DEF_HELPER_FLAGS for helper_dc_zva
-      target/arm: Clean address for DC ZVA
- include/hw/arm/xlnx-versal.h |   6 +
- target/arm/cpu.h             |  30 ++--
- target/arm/helper-a64.h      |   1 +
- target/arm/helper.h          |   1 -
- target/arm/internals.h       |   6 +
- hw/arm/cubieboard.c          |  29 +++-
- hw/arm/gumstix.c             |  16 +-
- hw/arm/mainstone.c           |   8 +-
- hw/arm/musicpal.c            |  10 --
- hw/arm/omap_sx1.c            |  11 +-
- hw/arm/pxa2xx.c              |  17 +-
- hw/arm/smmu-common.c         |  20 +--
- hw/arm/spitz.c               |   8 +-
- hw/arm/strongarm.c           |  18 ++-
- hw/arm/xlnx-versal-virt.c    |  28 ++++
- hw/arm/xlnx-versal.c         |  24 +++
- hw/arm/z2.c                  |   8 +-
- hw/timer/cadence_ttc.c       |  18 ++-
- target/arm/cpu.c             |  13 +-
- target/arm/cpu64.c           |   2 +
- target/arm/helper-a64.c      | 114 ++++++++++++-
- target/arm/helper.c          | 373 ++++++++++++++++++++++++++++++-------------
- target/arm/op_helper.c       |  93 -----------
- target/arm/translate-a64.c   |   4 +-
- tests/tcg/aarch64/pauth-1.c  |   2 +-
-files changed, 551 insertions(+), 309 deletions(-)

-[PULL 26/37] tests/tcg/aarch64: Add newline in pauth-1 printf
+[PULL 01/39] target/arm: Make VQDMULL undefined when U=1
-From: Richard Henderson <richard.henderson@linaro.org>
+From: Fredrik Strupe <fredrik@strupe.net>
-Make the output just a bit prettier when running by hand.
+According to Arm ARM, VQDMULL is only valid when U=0, while having
 U=1 is unallocated.
-Cc: Alex Bennée <alex.bennee@linaro.org>
+Signed-off-by: Fredrik Strupe <fredrik@strupe.net>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Fixes: 695272dcb976 ("target-arm: Handle UNDEF cases for Neon 3-regs-different-widths")
 Message-id: 20200229012811.24129-13-richard.henderson@linaro.org
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- tests/tcg/aarch64/pauth-1.c | 2 +-
+ target/arm/translate.c | 2 +-
 file changed, 1 insertion(+), 1 deletion(-)
-diff --git a/tests/tcg/aarch64/pauth-1.c b/tests/tcg/aarch64/pauth-1.c
+diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
---- a/tests/tcg/aarch64/pauth-1.c
+--- a/target/arm/translate.c
-+++ b/tests/tcg/aarch64/pauth-1.c
++++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ int main()
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-     }
+                     {0, 0, 0, 0}, /* VMLSL */
+                     {0, 0, 0, 9}, /* VQDMLSL */
-     perc = (float) count / (float) (TESTS * 2);
+                     {0, 0, 0, 0}, /* Integer VMULL */
--    printf("Ptr Check: %0.2f%%", perc * 100.0);
+-                    {0, 0, 0, 1}, /* VQDMULL */
-+    printf("Ptr Check: %0.2f%%\n", perc * 100.0);
++                    {0, 0, 0, 9}, /* VQDMULL */
-     assert(perc > 0.95);
+                     {0, 0, 0, 0xa}, /* Polynomial VMULL */
-     return 0;
+                     {0, 0, 0, 7}, /* Reserved: always UNDEF */
- }
+                 };
 --
 .20.1
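
Note on the one-line change in PULL 01/39: the fourth entry of each table row
appears to act as an "UNDEF request" bitmask, so moving VQDMULL from 1 to 9
keeps the existing "UNDEF when size == 0" bit and adds a "UNDEF when U == 1"
bit. The sketch below models that reading. The bit meanings and the helper
name neon_wide_is_undef() are assumptions made for illustration, not a copy of
the code in translate.c.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Illustrative model of the fourth table column (assumed layout):
 *   bit 0: UNDEF if size == 0
 *   bit 1: UNDEF if size == 1
 *   bit 2: UNDEF if size == 2
 *   bit 3: UNDEF if U == 1
 * neon_wide_is_undef() is a hypothetical helper name.
 */
static bool neon_wide_is_undef(unsigned undefreq, unsigned size, bool u)
{
    /* This sketch only models element sizes 0..2. */
    assert(size < 3);
    return (undefreq & (1u << size)) || ((undefreq & 8u) && u);
}
```

Under this model the old value 1 lets a size == 1, U == 1 encoding through,
while the new value 9 rejects it; U == 0 behaviour is the same for both.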

-[PULL 06/37] hw/arm/gumstix: Simplify since the machines are little-endian only
+[PULL 02/39] hw/arm/mps2-tz: Use TYPE_IOTKIT instead of hardcoded string
 From: Philippe Mathieu-Daudé <f4bug@amsat.org>
-As the Connex and Verdex machines only boot in little-endian,
+By using the TYPE_* definitions for devices, we can:
-we can simplify the code.
+ - quickly find where devices are used with 'git-grep'
  - easily rename a device (one-line change).
+Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
+Message-id: 20200428154650.21991-1-f4bug@amsat.org
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- hw/arm/gumstix.c | 16 ++--------------
+ hw/arm/mps2-tz.c | 2 +-
-file changed, 2 insertions(+), 14 deletions(-)
+file changed, 1 insertion(+), 1 deletion(-)
-diff --git a/hw/arm/gumstix.c b/hw/arm/gumstix.c
+diff --git a/hw/arm/mps2-tz.c b/hw/arm/mps2-tz.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/gumstix.c
+--- a/hw/arm/mps2-tz.c
-+++ b/hw/arm/gumstix.c
++++ b/hw/arm/mps2-tz.c
-@@ -XXX,XX +XXX,XX @@ static void connex_init(MachineState *machine)
+@@ -XXX,XX +XXX,XX @@ static void mps2tz_common_init(MachineState *machine)
- {
+         exit(EXIT_FAILURE);
      PXA2xxState *cpu;
      DriveInfo *dinfo;
 -    int be;
      MemoryRegion *address_space_mem = get_system_memory();
      uint32_t connex_rom = 0x01000000;
@@ -XXX,XX +XXX,XX @@ static void connex_init(MachineState *machine)
          exit(1);
      }
--#ifdef TARGET_WORDS_BIGENDIAN
+-    sysbus_init_child_obj(OBJECT(machine), "iotkit", &mms->iotkit,
--    be = 1;
++    sysbus_init_child_obj(OBJECT(machine), TYPE_IOTKIT, &mms->iotkit,
--#else
+                           sizeof(mms->iotkit), mmc->armsse_type);
--    be = 0;
+     iotkitdev = DEVICE(&mms->iotkit);
--#endif
+     object_property_set_link(OBJECT(&mms->iotkit), OBJECT(system_memory),
      if (!pflash_cfi01_register(0x00000000, "connext.rom", connex_rom,
                                 dinfo ? blk_by_legacy_dinfo(dinfo) : NULL,
 -                               sector_len, 2, 0, 0, 0, 0, be)) {
 +                               sector_len, 2, 0, 0, 0, 0, 0)) {
          error_report("Error registering flash memory");
          exit(1);
      }
@@ -XXX,XX +XXX,XX @@ static void verdex_init(MachineState *machine)
  {
      PXA2xxState *cpu;
      DriveInfo *dinfo;
 -    int be;
      MemoryRegion *address_space_mem = get_system_memory();
      uint32_t verdex_rom = 0x02000000;
@@ -XXX,XX +XXX,XX @@ static void verdex_init(MachineState *machine)
          exit(1);
      }
 -#ifdef TARGET_WORDS_BIGENDIAN
 -    be = 1;
 -#else
 -    be = 0;
 -#endif
      if (!pflash_cfi01_register(0x00000000, "verdex.rom", verdex_rom,
                                 dinfo ? blk_by_legacy_dinfo(dinfo) : NULL,
 -                               sector_len, 2, 0, 0, 0, 0, be)) {
 +                               sector_len, 2, 0, 0, 0, 0, 0)) {
          error_report("Error registering flash memory");
          exit(1);
      }
 --
 .20.1
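
Note on PULL 02/39: besides the grep and rename benefits listed in the commit
message, a named constant also turns a misspelled type name into a compile
error instead of a run-time lookup failure. A minimal sketch of that effect
follows. TYPE_DEMO_IOTKIT, demo_known_types and demo_type_is_known() are
hypothetical names invented for this example, not QEMU APIs.

```c
#include <string.h>

/* Hypothetical stand-in for a device type-name macro. */
#define TYPE_DEMO_IOTKIT "iotkit"

/* Hypothetical registry of known type names. */
static const char *const demo_known_types[] = {
    TYPE_DEMO_IOTKIT,
    "demo-uart",
};

/* Return 1 if the given name is in the registry, 0 otherwise. */
static int demo_type_is_known(const char *name)
{
    size_t i;
    size_t n = sizeof(demo_known_types) / sizeof(demo_known_types[0]);

    for (i = 0; i < n; i++) {
        if (strcmp(demo_known_types[i], name) == 0) {
            return 1;
        }
    }
    return 0;
}
```

A literal such as "iotkt" compiles and only fails when looked up, whereas a
misspelled macro name would not compile at all.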

-[PULL 32/37] target/arm: Optimize cpu_mmu_index
+[PULL 03/39] target/arm: Don't use a TLB for ARMMMUIdx_Stage2
-From: Richard Henderson <richard.henderson@linaro.org>
+We define ARMMMUIdx_Stage2 as being an MMU index which uses a QEMU
+TLB.  However we never actually use the TLB -- all stage 2 lookups
-We now cache the core mmu_idx in env->hflags.  Rather than recompute
+are done by direct calls to get_phys_addr_lpae() followed by a
-from scratch, extract the field.  All of the uses of cpu_mmu_index
+physical address load via address_space_ld*().
-within target/arm are within helpers, and env->hflags is always stable
-within a translation block from whence helpers are called.
+Remove Stage2 from the list of ARM MMU indexes which correspond to
+real core MMU indexes, and instead put it in the set of "NOTLB" ARM
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+MMU indexes.
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Message-id: 20200302175829.2183-3-richard.henderson@linaro.org
+This allows us to drop NB_MMU_MODES to 11.  It also means we can
 safely add support for the ARMv8.2-TTS2UXN extension, which adds
 permission bits to the stage 2 descriptors which define execute
 permission separately for EL0 and EL1; supporting that while keeping
 Stage2 in a QEMU TLB would require us to use separate TLBs for
 "Stage2 for an EL0 access" and "Stage2 for an EL1 access", which is a
 lot of extra complication given we aren't even using the QEMU TLB.
 In the process of updating the comment on our MMU index use,
 fix a couple of other minor errors:
  * NS EL2 EL2&0 was missing from the list in the comment
  * some text hadn't been updated from when we bumped NB_MMU_MODES
    above 8
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20200330210400.11724-2-peter.maydell@linaro.org
 ---
- target/arm/cpu.h    | 23 +++++++++++++----------
+ target/arm/cpu-param.h |   2 +-
- target/arm/helper.c |  5 -----
+ target/arm/cpu.h       |  21 +++++---
-files changed, 13 insertions(+), 15 deletions(-)
+ target/arm/helper.c    | 112 ++++-------------------------------------
+files changed, 27 insertions(+), 108 deletions(-)
 diff --git a/target/arm/cpu-param.h b/target/arm/cpu-param.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu-param.h
 +++ b/target/arm/cpu-param.h
@@ -XXX,XX +XXX,XX @@
  # define TARGET_PAGE_BITS_MIN  10
  #endif
 -#define NB_MMU_MODES 12
 +#define NB_MMU_MODES 11
  #endif
 diff --git a/target/arm/cpu.h b/target/arm/cpu.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu.h
 +++ b/target/arm/cpu.h
+@@ -XXX,XX +XXX,XX @@ bool write_cpustate_to_list(ARMCPU *cpu, bool kvm_sync);
+  *     handling via the TLB. The only way to do a stage 1 translation without
+  *     the immediate stage 2 translation is via the ATS or AT system insns,
+  *     which can be slow-pathed and always do a page table walk.
++ *     The only use of stage 2 translations is either as part of an s1+2
++ *     lookup or when loading the descriptors during a stage 1 page table walk,
++ *     and in both those cases we don't use the TLB.
+  *  4. we can also safely fold together the "32 bit EL3" and "64 bit EL3"
+  *     translation regimes, because they map reasonably well to each other
+  *     and they can't both be active at the same time.
+@@ -XXX,XX +XXX,XX @@ bool write_cpustate_to_list(ARMCPU *cpu, bool kvm_sync);
+  * NS EL1 EL1&0 stage 1+2 (aka NS PL1)
+  * NS EL1 EL1&0 stage 1+2 +PAN
+  * NS EL0 EL2&0
++ * NS EL2 EL2&0
+  * NS EL2 EL2&0 +PAN
+  * NS EL2 (aka NS PL2)
+  * S EL0 EL1&0 (aka S PL0)
+  * S EL1 EL1&0 (not used if EL3 is 32 bit)
+  * S EL1 EL1&0 +PAN
+  * S EL3 (aka S PL1)
+- * NS EL1&0 stage 2
+  *
+- * for a total of 12 different mmu_idx.
++ * for a total of 11 different mmu_idx.
+  *
+  * R profile CPUs have an MPU, but can use the same set of MMU indexes
+  * as A profile. They only need to distinguish NS EL0 and NS EL1 (and
+@@ -XXX,XX +XXX,XX @@ bool write_cpustate_to_list(ARMCPU *cpu, bool kvm_sync);
+  * are not quite the same -- different CPU types (most notably M profile
+  * vs A/R profile) would like to use MMU indexes with different semantics,
+  * but since we don't ever need to use all of those in a single CPU we
+- * can avoid setting NB_MMU_MODES to more than 8. The lower bits of
++ * can avoid having to set NB_MMU_MODES to "total number of A profile MMU
++ * modes + total number of M profile MMU modes". The lower bits of
+  * ARMMMUIdx are the core TLB mmu index, and the higher bits are always
+  * the same for any particular CPU.
+  * Variables of type ARMMUIdx are always full values, and the core
+@@ -XXX,XX +XXX,XX @@ typedef enum ARMMMUIdx {
+     ARMMMUIdx_SE10_1_PAN = 9 | ARM_MMU_IDX_A,
+     ARMMMUIdx_SE3        = 10 | ARM_MMU_IDX_A,
+-    ARMMMUIdx_Stage2     = 11 | ARM_MMU_IDX_A,
+-
+     /*
+      * These are not allocated TLBs and are used only for AT system
+      * instructions or for the first stage of an S12 page table walk.
+@@ -XXX,XX +XXX,XX @@ typedef enum ARMMMUIdx {
+     ARMMMUIdx_Stage1_E0 = 0 | ARM_MMU_IDX_NOTLB,
+     ARMMMUIdx_Stage1_E1 = 1 | ARM_MMU_IDX_NOTLB,
+     ARMMMUIdx_Stage1_E1_PAN = 2 | ARM_MMU_IDX_NOTLB,
++    /*
++     * Not allocated a TLB: used only for second stage of an S12 page
++     * table walk, or for descriptor loads during first stage of an S1
++     * page table walk. Note that if we ever want to have a TLB for this
++     * then various TLB flush insns which currently are no-ops or flush
++     * only stage 1 MMU indexes will need to change to flush stage 2.
++     */
++    ARMMMUIdx_Stage2     = 3 | ARM_MMU_IDX_NOTLB,
+     /*
+      * M-profile.
 @@ -XXX,XX +XXX,XX @@ typedef enum ARMMMUIdxBit {
+     TO_CORE_BIT(SE10_1),
- #define MMU_USER_IDX 0
+     TO_CORE_BIT(SE10_1_PAN),
+     TO_CORE_BIT(SE3),
--/**
+-    TO_CORE_BIT(Stage2),
-- * cpu_mmu_index:
-- * @env: The cpu environment
+     TO_CORE_BIT(MUser),
-- * @ifetch: True for code access, false for data access.
+     TO_CORE_BIT(MPriv),
 - *
 - * Return the core mmu index for the current translation regime.
 - * This function is used by generic TCG code paths.
 - */
 -int cpu_mmu_index(CPUARMState *env, bool ifetch);
 -
  /* Indexes used when registering address spaces with cpu_address_space_init */
  typedef enum ARMASIdx {
      ARMASIdx_NS = 0,
@@ -XXX,XX +XXX,XX @@ FIELD(TBFLAG_A64, BTYPE, 10, 2)         /* Not cached. */
  FIELD(TBFLAG_A64, TBID, 12, 2)
  FIELD(TBFLAG_A64, UNPRIV, 14, 1)
 +/**
 + * cpu_mmu_index:
 + * @env: The cpu environment
 + * @ifetch: True for code access, false for data access.
 + *
 + * Return the core mmu index for the current translation regime.
 + * This function is used by generic TCG code paths.
 + */
 +static inline int cpu_mmu_index(CPUARMState *env, bool ifetch)
 +{
 +    return FIELD_EX32(env->hflags, TBFLAG_ANY, MMUIDX);
 +}
 +
  static inline bool bswap_code(bool sctlr_b)
  {
  #ifdef CONFIG_USER_ONLY
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.c
 +++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ ARMMMUIdx arm_mmu_idx(CPUARMState *env)
+@@ -XXX,XX +XXX,XX @@ static void tlbiall_nsnh_write(CPUARMState *env, const ARMCPRegInfo *ri,
-     return arm_mmu_idx_el(env, arm_current_el(env));
+     tlb_flush_by_mmuidx(cs,
                          ARMMMUIdxBit_E10_1 |
                          ARMMMUIdxBit_E10_1_PAN |
 -                        ARMMMUIdxBit_E10_0 |
 -                        ARMMMUIdxBit_Stage2);
 +                        ARMMMUIdxBit_E10_0);
  }
--int cpu_mmu_index(CPUARMState *env, bool ifetch)
+ static void tlbiall_nsnh_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -XXX,XX +XXX,XX @@ static void tlbiall_nsnh_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
      tlb_flush_by_mmuidx_all_cpus_synced(cs,
                                          ARMMMUIdxBit_E10_1 |
                                          ARMMMUIdxBit_E10_1_PAN |
 -                                        ARMMMUIdxBit_E10_0 |
 -                                        ARMMMUIdxBit_Stage2);
 +                                        ARMMMUIdxBit_E10_0);
  }
 -static void tlbiipas2_write(CPUARMState *env, const ARMCPRegInfo *ri,
 -                            uint64_t value)
 -{
--    return arm_to_core_mmu_idx(arm_mmu_idx(env));
+-    /* Invalidate by IPA. This has to invalidate any structures that
 -     * contain only stage 2 translation information, but does not need
 -     * to apply to structures that contain combined stage 1 and stage 2
 -     * translation information.
 -     * This must NOP if EL2 isn't implemented or SCR_EL3.NS is zero.
 -     */
 -    CPUState *cs = env_cpu(env);
 -    uint64_t pageaddr;
 -
 -    if (!arm_feature(env, ARM_FEATURE_EL2) || !(env->cp15.scr_el3 & SCR_NS)) {
 -        return;
 -    }
 -
 -    pageaddr = sextract64(value << 12, 0, 40);
 -
 -    tlb_flush_page_by_mmuidx(cs, pageaddr, ARMMMUIdxBit_Stage2);
 -}
 -
- #ifndef CONFIG_USER_ONLY
+-static void tlbiipas2_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
- ARMMMUIdx arm_stage1_mmu_idx(CPUARMState *env)
+-                               uint64_t value)
 -{
 -    CPUState *cs = env_cpu(env);
 -    uint64_t pageaddr;
 -
 -    if (!arm_feature(env, ARM_FEATURE_EL2) || !(env->cp15.scr_el3 & SCR_NS)) {
 -        return;
 -    }
 -
 -    pageaddr = sextract64(value << 12, 0, 40);
 -
 -    tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr,
 -                                             ARMMMUIdxBit_Stage2);
 -}
  static void tlbiall_hyp_write(CPUARMState *env, const ARMCPRegInfo *ri,
                                uint64_t value)
@@ -XXX,XX +XXX,XX @@ static void vttbr_write(CPUARMState *env, const ARMCPRegInfo *ri,
          tlb_flush_by_mmuidx(cs,
                              ARMMMUIdxBit_E10_1 |
                              ARMMMUIdxBit_E10_1_PAN |
 -                            ARMMMUIdxBit_E10_0 |
 -                            ARMMMUIdxBit_Stage2);
 +                            ARMMMUIdxBit_E10_0);
          raw_write(env, ri, value);
      }
  }
@@ -XXX,XX +XXX,XX @@ static int alle1_tlbmask(CPUARMState *env)
          return ARMMMUIdxBit_SE10_1 |
                 ARMMMUIdxBit_SE10_1_PAN |
                 ARMMMUIdxBit_SE10_0;
 -    } else if (arm_feature(env, ARM_FEATURE_EL2)) {
 -        return ARMMMUIdxBit_E10_1 |
 -               ARMMMUIdxBit_E10_1_PAN |
 -               ARMMMUIdxBit_E10_0 |
 -               ARMMMUIdxBit_Stage2;
      } else {
          return ARMMMUIdxBit_E10_1 |
                 ARMMMUIdxBit_E10_1_PAN |
@@ -XXX,XX +XXX,XX @@ static void tlbi_aa64_vae3is_write(CPUARMState *env, const ARMCPRegInfo *ri,
                                               ARMMMUIdxBit_SE3);
  }
 -static void tlbi_aa64_ipas2e1_write(CPUARMState *env, const ARMCPRegInfo *ri,
 -                                    uint64_t value)
 -{
 -    /* Invalidate by IPA. This has to invalidate any structures that
 -     * contain only stage 2 translation information, but does not need
 -     * to apply to structures that contain combined stage 1 and stage 2
 -     * translation information.
 -     * This must NOP if EL2 isn't implemented or SCR_EL3.NS is zero.
 -     */
 -    ARMCPU *cpu = env_archcpu(env);
 -    CPUState *cs = CPU(cpu);
 -    uint64_t pageaddr;
 -
 -    if (!arm_feature(env, ARM_FEATURE_EL2) || !(env->cp15.scr_el3 & SCR_NS)) {
 -        return;
 -    }
 -
 -    pageaddr = sextract64(value << 12, 0, 48);
 -
 -    tlb_flush_page_by_mmuidx(cs, pageaddr, ARMMMUIdxBit_Stage2);
 -}
 -
 -static void tlbi_aa64_ipas2e1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
 -                                      uint64_t value)
 -{
 -    CPUState *cs = env_cpu(env);
 -    uint64_t pageaddr;
 -
 -    if (!arm_feature(env, ARM_FEATURE_EL2) || !(env->cp15.scr_el3 & SCR_NS)) {
 -        return;
 -    }
 -
 -    pageaddr = sextract64(value << 12, 0, 48);
 -
 -    tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr,
 -                                             ARMMMUIdxBit_Stage2);
 -}
 -
  static CPAccessResult aa64_zva_access(CPUARMState *env, const ARMCPRegInfo *ri,
                                        bool isread)
  {
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
+       .writefn = tlbi_aa64_vae1_write },
+     { .name = "TLBI_IPAS2E1IS", .state = ARM_CP_STATE_AA64,
+       .opc0 = 1, .opc1 = 4, .crn = 8, .crm = 0, .opc2 = 1,
+-      .access = PL2_W, .type = ARM_CP_NO_RAW,
+-      .writefn = tlbi_aa64_ipas2e1is_write },
++      .access = PL2_W, .type = ARM_CP_NOP },
+     { .name = "TLBI_IPAS2LE1IS", .state = ARM_CP_STATE_AA64,
+       .opc0 = 1, .opc1 = 4, .crn = 8, .crm = 0, .opc2 = 5,
+-      .access = PL2_W, .type = ARM_CP_NO_RAW,
+-      .writefn = tlbi_aa64_ipas2e1is_write },
++      .access = PL2_W, .type = ARM_CP_NOP },
+     { .name = "TLBI_ALLE1IS", .state = ARM_CP_STATE_AA64,
+       .opc0 = 1, .opc1 = 4, .crn = 8, .crm = 3, .opc2 = 4,
+       .access = PL2_W, .type = ARM_CP_NO_RAW,
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
+       .writefn = tlbi_aa64_alle1is_write },
+     { .name = "TLBI_IPAS2E1", .state = ARM_CP_STATE_AA64,
+       .opc0 = 1, .opc1 = 4, .crn = 8, .crm = 4, .opc2 = 1,
+-      .access = PL2_W, .type = ARM_CP_NO_RAW,
+-      .writefn = tlbi_aa64_ipas2e1_write },
++      .access = PL2_W, .type = ARM_CP_NOP },
+     { .name = "TLBI_IPAS2LE1", .state = ARM_CP_STATE_AA64,
+       .opc0 = 1, .opc1 = 4, .crn = 8, .crm = 4, .opc2 = 5,
+-      .access = PL2_W, .type = ARM_CP_NO_RAW,
+-      .writefn = tlbi_aa64_ipas2e1_write },
++      .access = PL2_W, .type = ARM_CP_NOP },
+     { .name = "TLBI_ALLE1", .state = ARM_CP_STATE_AA64,
+       .opc0 = 1, .opc1 = 4, .crn = 8, .crm = 7, .opc2 = 4,
+       .access = PL2_W, .type = ARM_CP_NO_RAW,
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
+       .writefn = tlbimva_hyp_is_write },
+     { .name = "TLBIIPAS2",
+       .cp = 15, .opc1 = 4, .crn = 8, .crm = 4, .opc2 = 1,
+-      .type = ARM_CP_NO_RAW, .access = PL2_W,
+-      .writefn = tlbiipas2_write },
++      .type = ARM_CP_NOP, .access = PL2_W },
+     { .name = "TLBIIPAS2IS",
+       .cp = 15, .opc1 = 4, .crn = 8, .crm = 0, .opc2 = 1,
+-      .type = ARM_CP_NO_RAW, .access = PL2_W,
+-      .writefn = tlbiipas2_is_write },
++      .type = ARM_CP_NOP, .access = PL2_W },
+     { .name = "TLBIIPAS2L",
+       .cp = 15, .opc1 = 4, .crn = 8, .crm = 4, .opc2 = 5,
+-      .type = ARM_CP_NO_RAW, .access = PL2_W,
+-      .writefn = tlbiipas2_write },
++      .type = ARM_CP_NOP, .access = PL2_W },
+     { .name = "TLBIIPAS2LIS",
+       .cp = 15, .opc1 = 4, .crn = 8, .crm = 0, .opc2 = 5,
+-      .type = ARM_CP_NO_RAW, .access = PL2_W,
+-      .writefn = tlbiipas2_is_write },
++      .type = ARM_CP_NOP, .access = PL2_W },
+     /* 32 bit cache operations */
+     { .name = "ICIALLUIS", .cp = 15, .opc1 = 0, .crn = 7, .crm = 1, .opc2 = 0,
+       .type = ARM_CP_NOP, .access = PL1_W, .accessfn = aa64_cacheop_pou_access },
 --
 .20.1
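
Note on PULL 03/39: the cpu.h comment describes an ARMMMUIdx value as a core
TLB index in the low bits plus type bits above it, and the patch moves Stage2
from an A-profile TLB index (11) to a NOTLB index (3). The sketch below shows
how such an encoding can be tested and decoded. All DEMO_ and demo_ names and
the numeric flag values are assumptions chosen for illustration; the real
constants are defined in target/arm/cpu.h.

```c
#include <stdbool.h>

/* Illustrative flag values only (assumed, not taken from the patch). */
#define DEMO_MMU_IDX_A            0x10  /* A-profile index backed by a TLB */
#define DEMO_MMU_IDX_NOTLB        0x20  /* index with no TLB behind it */
#define DEMO_MMU_IDX_COREIDX_MASK 0xf   /* low bits: core TLB index */

enum demo_mmu_idx {
    DEMO_IDX_SE3       = 10 | DEMO_MMU_IDX_A,
    DEMO_IDX_STAGE1_E0 = 0 | DEMO_MMU_IDX_NOTLB,
    DEMO_IDX_STAGE2    = 3 | DEMO_MMU_IDX_NOTLB,  /* after the patch */
};

/* True if the index is backed by a TLB in this simplified model. */
static bool demo_idx_has_tlb(enum demo_mmu_idx idx)
{
    return (idx & DEMO_MMU_IDX_NOTLB) == 0;
}

/* Extract the low bits that select a core TLB. */
static int demo_to_core_idx(enum demo_mmu_idx idx)
{
    return idx & DEMO_MMU_IDX_COREIDX_MASK;
}

/* Bit for a flush mask; only meaningful for TLB-backed indexes. */
static unsigned demo_core_bit(enum demo_mmu_idx idx)
{
    return 1u << demo_to_core_idx(idx);
}
```

In this model a NOTLB index has no flush bit worth setting, which matches the
patch dropping ARMMMUIdxBit_Stage2 from the flush masks and turning the
IPAS2 invalidate registers into NOPs.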

-[PULL 25/37] target/arm: Honor the HCR_EL2.TTLB bit
+[PULL 04/39] target/arm: Use enum constant in get_phys_addr_lpae() call
-From: Richard Henderson <richard.henderson@linaro.org>
+The access_type argument to get_phys_addr_lpae() is an MMUAccessType;
 use the enum constant MMU_DATA_LOAD rather than a literal 0 when we
 call it in S1_ptw_translate().
-This bit traps EL1 access to tlb maintenance insns.
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200229012811.24129-12-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20200330210400.11724-3-peter.maydell@linaro.org
 ---
- target/arm/helper.c | 85 +++++++++++++++++++++++++++++----------------
+ target/arm/helper.c | 5 +++--
-file changed, 55 insertions(+), 30 deletions(-)
+file changed, 3 insertions(+), 2 deletions(-)
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.c
 +++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ static CPAccessResult access_tacr(CPUARMState *env, const ARMCPRegInfo *ri,
+@@ -XXX,XX +XXX,XX @@ static hwaddr S1_ptw_translate(CPUARMState *env, ARMMMUIdx mmu_idx,
-     return CP_ACCESS_OK;
+             pcacheattrs = &cacheattrs;
- }
+         }
-+/* Check for traps from EL1 due to HCR_EL2.TTLB. */
+-        ret = get_phys_addr_lpae(env, addr, 0, ARMMMUIdx_Stage2, &s2pa,
-+static CPAccessResult access_ttlb(CPUARMState *env, const ARMCPRegInfo *ri,
+-                                 &txattrs, &s2prot, &s2size, fi, pcacheattrs);
-+                                  bool isread)
++        ret = get_phys_addr_lpae(env, addr, MMU_DATA_LOAD, ARMMMUIdx_Stage2,
-+{
++                                 &s2pa, &txattrs, &s2prot, &s2size, fi,
-+    if (arm_current_el(env) == 1 && (arm_hcr_el2_eff(env) & HCR_TTLB)) {
++                                 pcacheattrs);
-+        return CP_ACCESS_TRAP_EL2;
+         if (ret) {
-+    }
+             assert(fi->type != ARMFault_None);
-+    return CP_ACCESS_OK;
+             fi->s2addr = addr;
 +}
 +
  static void dacr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
  {
      ARMCPU *cpu = env_archcpu(env);
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
        .type = ARM_CP_NO_RAW, .access = PL1_R, .readfn = isr_read },
      /* 32 bit ITLB invalidates */
      { .name = "ITLBIALL", .cp = 15, .opc1 = 0, .crn = 8, .crm = 5, .opc2 = 0,
 -      .type = ARM_CP_NO_RAW, .access = PL1_W, .writefn = tlbiall_write },
 +      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
 +      .writefn = tlbiall_write },
      { .name = "ITLBIMVA", .cp = 15, .opc1 = 0, .crn = 8, .crm = 5, .opc2 = 1,
 -      .type = ARM_CP_NO_RAW, .access = PL1_W, .writefn = tlbimva_write },
 +      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
 +      .writefn = tlbimva_write },
      { .name = "ITLBIASID", .cp = 15, .opc1 = 0, .crn = 8, .crm = 5, .opc2 = 2,
 -      .type = ARM_CP_NO_RAW, .access = PL1_W, .writefn = tlbiasid_write },
 +      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
 +      .writefn = tlbiasid_write },
      /* 32 bit DTLB invalidates */
      { .name = "DTLBIALL", .cp = 15, .opc1 = 0, .crn = 8, .crm = 6, .opc2 = 0,
 -      .type = ARM_CP_NO_RAW, .access = PL1_W, .writefn = tlbiall_write },
 +      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
 +      .writefn = tlbiall_write },
      { .name = "DTLBIMVA", .cp = 15, .opc1 = 0, .crn = 8, .crm = 6, .opc2 = 1,
 -      .type = ARM_CP_NO_RAW, .access = PL1_W, .writefn = tlbimva_write },
 +      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
 +      .writefn = tlbimva_write },
      { .name = "DTLBIASID", .cp = 15, .opc1 = 0, .crn = 8, .crm = 6, .opc2 = 2,
 -      .type = ARM_CP_NO_RAW, .access = PL1_W, .writefn = tlbiasid_write },
 +      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
 +      .writefn = tlbiasid_write },
      /* 32 bit TLB invalidates */
      { .name = "TLBIALL", .cp = 15, .opc1 = 0, .crn = 8, .crm = 7, .opc2 = 0,
 -      .type = ARM_CP_NO_RAW, .access = PL1_W, .writefn = tlbiall_write },
 +      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
 +      .writefn = tlbiall_write },
      { .name = "TLBIMVA", .cp = 15, .opc1 = 0, .crn = 8, .crm = 7, .opc2 = 1,
 -      .type = ARM_CP_NO_RAW, .access = PL1_W, .writefn = tlbimva_write },
 +      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
 +      .writefn = tlbimva_write },
      { .name = "TLBIASID", .cp = 15, .opc1 = 0, .crn = 8, .crm = 7, .opc2 = 2,
 -      .type = ARM_CP_NO_RAW, .access = PL1_W, .writefn = tlbiasid_write },
 +      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
 +      .writefn = tlbiasid_write },
      { .name = "TLBIMVAA", .cp = 15, .opc1 = 0, .crn = 8, .crm = 7, .opc2 = 3,
 -      .type = ARM_CP_NO_RAW, .access = PL1_W, .writefn = tlbimvaa_write },
 +      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
 +      .writefn = tlbimvaa_write },
      REGINFO_SENTINEL
  };
  static const ARMCPRegInfo v7mp_cp_reginfo[] = {
      /* 32 bit TLB invalidates, Inner Shareable */
      { .name = "TLBIALLIS", .cp = 15, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 0,
 -      .type = ARM_CP_NO_RAW, .access = PL1_W, .writefn = tlbiall_is_write },
 +      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
 +      .writefn = tlbiall_is_write },
      { .name = "TLBIMVAIS", .cp = 15, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 1,
 -      .type = ARM_CP_NO_RAW, .access = PL1_W, .writefn = tlbimva_is_write },
 +      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
 +      .writefn = tlbimva_is_write },
      { .name = "TLBIASIDIS", .cp = 15, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 2,
 -      .type = ARM_CP_NO_RAW, .access = PL1_W,
 +      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
        .writefn = tlbiasid_is_write },
      { .name = "TLBIMVAAIS", .cp = 15, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 3,
 -      .type = ARM_CP_NO_RAW, .access = PL1_W,
 +      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
        .writefn = tlbimvaa_is_write },
      REGINFO_SENTINEL
  };
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
      /* TLBI operations */
      { .name = "TLBI_VMALLE1IS", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 0,
 -      .access = PL1_W, .type = ARM_CP_NO_RAW,
 +      .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
        .writefn = tlbi_aa64_vmalle1is_write },
      { .name = "TLBI_VAE1IS", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 1,
 -      .access = PL1_W, .type = ARM_CP_NO_RAW,
 +      .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
        .writefn = tlbi_aa64_vae1is_write },
      { .name = "TLBI_ASIDE1IS", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 2,
 -      .access = PL1_W, .type = ARM_CP_NO_RAW,
 +      .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
        .writefn = tlbi_aa64_vmalle1is_write },
      { .name = "TLBI_VAAE1IS", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 3,
 -      .access = PL1_W, .type = ARM_CP_NO_RAW,
 +      .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
        .writefn = tlbi_aa64_vae1is_write },
      { .name = "TLBI_VALE1IS", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 5,
 -      .access = PL1_W, .type = ARM_CP_NO_RAW,
 +      .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
        .writefn = tlbi_aa64_vae1is_write },
      { .name = "TLBI_VAALE1IS", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 7,
 -      .access = PL1_W, .type = ARM_CP_NO_RAW,
 +      .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
        .writefn = tlbi_aa64_vae1is_write },
      { .name = "TLBI_VMALLE1", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 7, .opc2 = 0,
 -      .access = PL1_W, .type = ARM_CP_NO_RAW,
 +      .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
        .writefn = tlbi_aa64_vmalle1_write },
      { .name = "TLBI_VAE1", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 7, .opc2 = 1,
 -      .access = PL1_W, .type = ARM_CP_NO_RAW,
 +      .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
        .writefn = tlbi_aa64_vae1_write },
      { .name = "TLBI_ASIDE1", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 7, .opc2 = 2,
 -      .access = PL1_W, .type = ARM_CP_NO_RAW,
 +      .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
        .writefn = tlbi_aa64_vmalle1_write },
      { .name = "TLBI_VAAE1", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 7, .opc2 = 3,
 -      .access = PL1_W, .type = ARM_CP_NO_RAW,
 +      .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
        .writefn = tlbi_aa64_vae1_write },
      { .name = "TLBI_VALE1", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 7, .opc2 = 5,
 -      .access = PL1_W, .type = ARM_CP_NO_RAW,
 +      .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
        .writefn = tlbi_aa64_vae1_write },
      { .name = "TLBI_VAALE1", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 7, .opc2 = 7,
 -      .access = PL1_W, .type = ARM_CP_NO_RAW,
 +      .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
        .writefn = tlbi_aa64_vae1_write },
      { .name = "TLBI_IPAS2E1IS", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 4, .crn = 8, .crm = 0, .opc2 = 1,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
  #endif
      /* TLB invalidate last level of translation table walk */
      { .name = "TLBIMVALIS", .cp = 15, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 5,
 -      .type = ARM_CP_NO_RAW, .access = PL1_W, .writefn = tlbimva_is_write },
 +      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
 +      .writefn = tlbimva_is_write },
      { .name = "TLBIMVAALIS", .cp = 15, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 7,
 -      .type = ARM_CP_NO_RAW, .access = PL1_W,
 +      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
        .writefn = tlbimvaa_is_write },
      { .name = "TLBIMVAL", .cp = 15, .opc1 = 0, .crn = 8, .crm = 7, .opc2 = 5,
 -      .type = ARM_CP_NO_RAW, .access = PL1_W, .writefn = tlbimva_write },
 +      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
 +      .writefn = tlbimva_write },
      { .name = "TLBIMVAAL", .cp = 15, .opc1 = 0, .crn = 8, .crm = 7, .opc2 = 7,
 -      .type = ARM_CP_NO_RAW, .access = PL1_W, .writefn = tlbimvaa_write },
 +      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
 +      .writefn = tlbimvaa_write },
      { .name = "TLBIMVALH", .cp = 15, .opc1 = 4, .crn = 8, .crm = 7, .opc2 = 5,
        .type = ARM_CP_NO_RAW, .access = PL2_W,
        .writefn = tlbimva_hyp_write },
 --
 .20.1

-[PULL 24/37] target/arm: Honor the HCR_EL2.TPU bit
+[PULL 05/39] target/arm: Add new 's1_is_el0' argument to get_phys_addr_lpae()
-From: Richard Henderson <richard.henderson@linaro.org>
+For ARMv8.2-TTS2UXN, the stage 2 page table walk wants to know
 whether the stage 1 access is for EL0 or not, because whether
 exec permission is given can depend on whether this is an EL0
 or EL1 access. Add a new argument to get_phys_addr_lpae() so
 the call sites can pass this information in.
-This bit traps EL1 access to cache maintenance insns that operate
+Since get_phys_addr_lpae() doesn't already have a doc comment,
-to the point of unification.  There are no longer any references to
+add one so we have a place to put the documentation of the
-plain aa64_cacheop_access, so remove it.
+semantics of the new s1_is_el0 argument.
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200229012811.24129-11-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20200330210400.11724-4-peter.maydell@linaro.org
 ---
- target/arm/helper.c | 53 +++++++++++++++++++++++++++------------------
+ target/arm/helper.c | 29 ++++++++++++++++++++++++++++-
-file changed, 32 insertions(+), 21 deletions(-)
+file changed, 28 insertions(+), 1 deletion(-)
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.c
 +++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo uao_reginfo = {
+@@ -XXX,XX +XXX,XX @@
-     .readfn = aa64_uao_read, .writefn = aa64_uao_write
- };
+ static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
+                                MMUAccessType access_type, ARMMMUIdx mmu_idx,
--static CPAccessResult aa64_cacheop_access(CPUARMState *env,
++                               bool s1_is_el0,
--                                          const ARMCPRegInfo *ri,
+                                hwaddr *phys_ptr, MemTxAttrs *txattrs, int *prot,
--                                          bool isread)
+                                target_ulong *page_size_ptr,
--{
+                                ARMMMUFaultInfo *fi, ARMCacheAttrs *cacheattrs);
--    /* Cache invalidate/clean: NOP, but EL0 must UNDEF unless
+@@ -XXX,XX +XXX,XX @@ static hwaddr S1_ptw_translate(CPUARMState *env, ARMMMUIdx mmu_idx,
--     * SCTLR_EL1.UCI is set.
+         }
--     */
--    if (arm_current_el(env) == 0 && !(arm_sctlr(env, 0) & SCTLR_UCI)) {
+         ret = get_phys_addr_lpae(env, addr, MMU_DATA_LOAD, ARMMMUIdx_Stage2,
--        return CP_ACCESS_TRAP;
++                                 false,
--    }
+                                  &s2pa, &txattrs, &s2prot, &s2size, fi,
--    return CP_ACCESS_OK;
+                                  pcacheattrs);
--}
+         if (ret) {
--
+@@ -XXX,XX +XXX,XX @@ static ARMVAParameters aa32_va_parameters(CPUARMState *env, uint32_t va,
- static CPAccessResult aa64_cacheop_poc_access(CPUARMState *env,
+     };
                                                const ARMCPRegInfo *ri,
                                                bool isread)
@@ -XXX,XX +XXX,XX @@ static CPAccessResult aa64_cacheop_poc_access(CPUARMState *env,
      return CP_ACCESS_OK;
  }
-+static CPAccessResult aa64_cacheop_pou_access(CPUARMState *env,
++/**
-+                                              const ARMCPRegInfo *ri,
++ * get_phys_addr_lpae: perform one stage of page table walk, LPAE format
-+                                              bool isread)
++ *
-+{
++ * Returns false if the translation was successful. Otherwise, phys_ptr, attrs,
-+    /* Cache invalidate/clean to Point of Unification... */
++ * prot and page_size may not be filled in, and the populated fsr value provides
-+    switch (arm_current_el(env)) {
++ * information on why the translation aborted, in the format of a long-format
-+    case 0:
++ * DFSR/IFSR fault register, with the following caveats:
-+        /* ... EL0 must UNDEF unless SCTLR_EL1.UCI is set.  */
++ *  * the WnR bit is never set (the caller must do this).
-+        if (!(arm_sctlr(env, 0) & SCTLR_UCI)) {
++ *
-+            return CP_ACCESS_TRAP;
++ * @env: CPUARMState
-+        }
++ * @address: virtual address to get physical address for
-+        /* fall through */
++ * @access_type: MMU_DATA_LOAD, MMU_DATA_STORE or MMU_INST_FETCH
-+    case 1:
++ * @mmu_idx: MMU index indicating required translation regime
-+        /* ... EL1 must trap to EL2 if HCR_EL2.TPU is set.  */
++ * @s1_is_el0: if @mmu_idx is ARMMMUIdx_Stage2 (so this is a stage 2 page table
-+        if (arm_hcr_el2_eff(env) & HCR_TPU) {
++ *             walk), must be true if this is stage 2 of a stage 1+2 walk for an
-+            return CP_ACCESS_TRAP_EL2;
++ *             EL0 access. If @mmu_idx is anything else, @s1_is_el0 is ignored.
-+        }
++ * @phys_ptr: set to the physical address corresponding to the virtual address
-+        break;
++ * @attrs: set to the memory transaction attributes to use
-+    }
++ * @prot: set to the permissions for the page containing phys_ptr
-+    return CP_ACCESS_OK;
++ * @page_size_ptr: set to the size of the page containing phys_ptr
-+}
++ * @fi: set to fault info if the translation fails
-+
++ * @cacheattrs: (if non-NULL) set to the cacheability/shareability attributes
- /* See: D4.7.2 TLB maintenance requirements and the TLB maintenance instructions
++ */
-  * Page D4-1736 (DDI0487A.b)
+ static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
-  */
+                                MMUAccessType access_type, ARMMMUIdx mmu_idx,
-@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
++                               bool s1_is_el0,
-     /* Cache ops: all NOPs since we don't emulate caches */
+                                hwaddr *phys_ptr, MemTxAttrs *txattrs, int *prot,
-     { .name = "IC_IALLUIS", .state = ARM_CP_STATE_AA64,
+                                target_ulong *page_size_ptr,
-       .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 1, .opc2 = 0,
+                                ARMMMUFaultInfo *fi, ARMCacheAttrs *cacheattrs)
--      .access = PL1_W, .type = ARM_CP_NOP },
+@@ -XXX,XX +XXX,XX @@ bool get_phys_addr(CPUARMState *env, target_ulong address,
-+      .access = PL1_W, .type = ARM_CP_NOP,
-+      .accessfn = aa64_cacheop_pou_access },
+             /* S1 is done. Now do S2 translation.  */
-     { .name = "IC_IALLU", .state = ARM_CP_STATE_AA64,
+             ret = get_phys_addr_lpae(env, ipa, access_type, ARMMMUIdx_Stage2,
-       .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 5, .opc2 = 0,
++                                     mmu_idx == ARMMMUIdx_E10_0,
--      .access = PL1_W, .type = ARM_CP_NOP },
+                                      phys_ptr, attrs, &s2_prot,
-+      .access = PL1_W, .type = ARM_CP_NOP,
+                                      page_size, fi,
-+      .accessfn = aa64_cacheop_pou_access },
+                                      cacheattrs != NULL ? &cacheattrs2 : NULL);
-     { .name = "IC_IVAU", .state = ARM_CP_STATE_AA64,
+@@ -XXX,XX +XXX,XX @@ bool get_phys_addr(CPUARMState *env, target_ulong address,
-       .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 5, .opc2 = 1,
+     }
-       .access = PL0_W, .type = ARM_CP_NOP,
--      .accessfn = aa64_cacheop_access },
+     if (regime_using_lpae_format(env, mmu_idx)) {
-+      .accessfn = aa64_cacheop_pou_access },
+-        return get_phys_addr_lpae(env, address, access_type, mmu_idx,
-     { .name = "DC_IVAC", .state = ARM_CP_STATE_AA64,
++        return get_phys_addr_lpae(env, address, access_type, mmu_idx, false,
-       .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 6, .opc2 = 1,
+                                   phys_ptr, attrs, prot, page_size,
-       .access = PL1_W, .accessfn = aa64_cacheop_poc_access,
+                                   fi, cacheattrs);
-@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
+     } else if (regime_sctlr(env, mmu_idx) & SCTLR_XP) {
      { .name = "DC_CVAU", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 11, .opc2 = 1,
        .access = PL0_W, .type = ARM_CP_NOP,
 -      .accessfn = aa64_cacheop_access },
 +      .accessfn = aa64_cacheop_pou_access },
      { .name = "DC_CIVAC", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 14, .opc2 = 1,
        .access = PL0_W, .type = ARM_CP_NOP,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
        .writefn = tlbiipas2_is_write },
      /* 32 bit cache operations */
      { .name = "ICIALLUIS", .cp = 15, .opc1 = 0, .crn = 7, .crm = 1, .opc2 = 0,
 -      .type = ARM_CP_NOP, .access = PL1_W },
 +      .type = ARM_CP_NOP, .access = PL1_W, .accessfn = aa64_cacheop_pou_access },
      { .name = "BPIALLUIS", .cp = 15, .opc1 = 0, .crn = 7, .crm = 1, .opc2 = 6,
        .type = ARM_CP_NOP, .access = PL1_W },
      { .name = "ICIALLU", .cp = 15, .opc1 = 0, .crn = 7, .crm = 5, .opc2 = 0,
 -      .type = ARM_CP_NOP, .access = PL1_W },
 +      .type = ARM_CP_NOP, .access = PL1_W, .accessfn = aa64_cacheop_pou_access },
      { .name = "ICIMVAU", .cp = 15, .opc1 = 0, .crn = 7, .crm = 5, .opc2 = 1,
 -      .type = ARM_CP_NOP, .access = PL1_W },
 +      .type = ARM_CP_NOP, .access = PL1_W, .accessfn = aa64_cacheop_pou_access },
      { .name = "BPIALL", .cp = 15, .opc1 = 0, .crn = 7, .crm = 5, .opc2 = 6,
        .type = ARM_CP_NOP, .access = PL1_W },
      { .name = "BPIMVA", .cp = 15, .opc1 = 0, .crn = 7, .crm = 5, .opc2 = 7,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
      { .name = "DCCSW", .cp = 15, .opc1 = 0, .crn = 7, .crm = 10, .opc2 = 2,
        .type = ARM_CP_NOP, .access = PL1_W, .accessfn = access_tsw },
      { .name = "DCCMVAU", .cp = 15, .opc1 = 0, .crn = 7, .crm = 11, .opc2 = 1,
 -      .type = ARM_CP_NOP, .access = PL1_W },
 +      .type = ARM_CP_NOP, .access = PL1_W, .accessfn = aa64_cacheop_pou_access },
      { .name = "DCCIMVAC", .cp = 15, .opc1 = 0, .crn = 7, .crm = 14, .opc2 = 1,
        .type = ARM_CP_NOP, .access = PL1_W, .accessfn = aa64_cacheop_poc_access },
      { .name = "DCCISW", .cp = 15, .opc1 = 0, .crn = 7, .crm = 14, .opc2 = 2,
 --
 .20.1

-[PULL 03/37] target/arm: Implement (trivially) ARMv8.2-TTCNP
+[PULL 06/39] target/arm: Implement ARMv8.2-TTS2UXN
-The ARMv8.2-TTCNP extension allows an implementation to optimize by
+The ARMv8.2-TTS2UXN feature extends the XN field in stage 2
-sharing TLB entries between multiple cores, provided that software
+translation table descriptors from just bit [54] to bits [54:53],
-declares that it's ready to deal with this by setting a CnP bit in
+allowing stage 2 to control execution permissions separately for EL0
-the TTBRn_ELx.  It is mandatory from ARMv8.2 onward.
+and EL1. Implement the new semantics of the XN field and enable
+the feature for our 'max' CPU.
 For QEMU's TLB implementation, sharing TLB entries between different
 cores would not really benefit us and would be a lot of work to
 implement.  So we implement this extension in the "trivial" manner:
 we allow the guest to set and read back the CnP bit, but don't change
 our behaviour (this is an architecturally valid implementation
 choice).
 The only code path which looks at the TTBRn_ELx values for the
 long-descriptor format where the CnP bit is defined is already doing
 enough masking to not get confused when the CnP bit at the bottom of
 the register is set, so we can simply add a comment noting why we're
 relying on that mask.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200225193822.18874-1-peter.maydell@linaro.org
+Message-id: 20200330210400.11724-5-peter.maydell@linaro.org
 ---
- target/arm/cpu.c    | 1 +
+ target/arm/cpu.h    | 15 +++++++++++++++
- target/arm/cpu64.c  | 2 ++
+ target/arm/cpu.c    |  1 +
- target/arm/helper.c | 4 ++++
+ target/arm/cpu64.c  |  2 ++
-files changed, 7 insertions(+)
+ target/arm/helper.c | 37 +++++++++++++++++++++++++++++++------
 files changed, 49 insertions(+), 6 deletions(-)
+diff --git a/target/arm/cpu.h b/target/arm/cpu.h
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/cpu.h
++++ b/target/arm/cpu.h
+@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa32_ccidx(const ARMISARegisters *id)
+     return FIELD_EX32(id->id_mmfr4, ID_MMFR4, CCIDX) != 0;
+ }
++static inline bool isar_feature_aa32_tts2uxn(const ARMISARegisters *id)
++{
++    return FIELD_EX32(id->id_mmfr4, ID_MMFR4, XNX) != 0;
++}
++
+ /*
+  * 64-bit feature tests via id registers.
+  */
+@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa64_ccidx(const ARMISARegisters *id)
+     return FIELD_EX64(id->id_aa64mmfr2, ID_AA64MMFR2, CCIDX) != 0;
+ }
++static inline bool isar_feature_aa64_tts2uxn(const ARMISARegisters *id)
++{
++    return FIELD_EX64(id->id_aa64mmfr1, ID_AA64MMFR1, XNX) != 0;
++}
++
+ /*
+  * Feature tests for "does this exist in either 32-bit or 64-bit?"
+  */
+@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_any_ccidx(const ARMISARegisters *id)
+     return isar_feature_aa64_ccidx(id) || isar_feature_aa32_ccidx(id);
+ }
++static inline bool isar_feature_any_tts2uxn(const ARMISARegisters *id)
++{
++    return isar_feature_aa64_tts2uxn(id) || isar_feature_aa32_tts2uxn(id);
++}
++
+ /*
+  * Forward to the above feature tests given an ARMCPU pointer.
+  */
 diff --git a/target/arm/cpu.c b/target/arm/cpu.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu.c
 +++ b/target/arm/cpu.c
 @@ -XXX,XX +XXX,XX @@ static void arm_max_initfn(Object *obj)
-             t = cpu->isar.id_mmfr4;
              t = FIELD_DP32(t, ID_MMFR4, HPDS, 1); /* AA32HPD */
              t = FIELD_DP32(t, ID_MMFR4, AC2, 1); /* ACTLR2, HACTLR2 */
-+            t = FIELD_DP32(t, ID_MMFR4, CNP, 1); /* TTCNP */
+             t = FIELD_DP32(t, ID_MMFR4, CNP, 1); /* TTCNP */
 +            t = FIELD_DP32(t, ID_MMFR4, XNX, 1); /* TTS2UXN */
              cpu->isar.id_mmfr4 = t;
          }
  #endif
 diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu64.c
 +++ b/target/arm/cpu64.c
 @@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
+         t = FIELD_DP64(t, ID_AA64MMFR1, VH, 1);
+         t = FIELD_DP64(t, ID_AA64MMFR1, PAN, 2); /* ATS1E1 */
+         t = FIELD_DP64(t, ID_AA64MMFR1, VMIDBITS, 2); /* VMID16 */
++        t = FIELD_DP64(t, ID_AA64MMFR1, XNX, 1); /* TTS2UXN */
+         cpu->isar.id_aa64mmfr1 = t;
          t = cpu->isar.id_aa64mmfr2;
-         t = FIELD_DP64(t, ID_AA64MMFR2, UAO, 1);
-+        t = FIELD_DP64(t, ID_AA64MMFR2, CNP, 1); /* TTCNP */
-         cpu->isar.id_aa64mmfr2 = t;
-         /* Replicate the same data to the 32-bit id registers.  */
 @@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
-         u = cpu->isar.id_mmfr4;
          u = FIELD_DP32(u, ID_MMFR4, HPDS, 1); /* AA32HPD */
          u = FIELD_DP32(u, ID_MMFR4, AC2, 1); /* ACTLR2, HACTLR2 */
-+        u = FIELD_DP32(t, ID_MMFR4, CNP, 1); /* TTCNP */
+         u = FIELD_DP32(u, ID_MMFR4, CNP, 1); /* TTCNP */
 +        u = FIELD_DP32(u, ID_MMFR4, XNX, 1); /* TTS2UXN */
          cpu->isar.id_mmfr4 = u;
          u = cpu->isar.id_aa64dfr0;
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.c
 +++ b/target/arm/helper.c
+@@ -XXX,XX +XXX,XX @@ simple_ap_to_rw_prot(CPUARMState *env, ARMMMUIdx mmu_idx, int ap)
+  *
+  * @env:     CPUARMState
+  * @s2ap:    The 2-bit stage2 access permissions (S2AP)
+- * @xn:      XN (execute-never) bit
++ * @xn:      XN (execute-never) bits
++ * @s1_is_el0: true if this is S2 of an S1+2 walk for EL0
+  */
+-static int get_S2prot(CPUARMState *env, int s2ap, int xn)
++static int get_S2prot(CPUARMState *env, int s2ap, int xn, bool s1_is_el0)
+ {
+     int prot = 0;
+@@ -XXX,XX +XXX,XX @@ static int get_S2prot(CPUARMState *env, int s2ap, int xn)
+     if (s2ap & 2) {
+         prot |= PAGE_WRITE;
+     }
+-    if (!xn) {
+-        if (arm_el_is_aa64(env, 2) || prot & PAGE_READ) {
++
++    if (cpu_isar_feature(any_tts2uxn, env_archcpu(env))) {
++        switch (xn) {
++        case 0:
+             prot |= PAGE_EXEC;
++            break;
++        case 1:
++            if (s1_is_el0) {
++                prot |= PAGE_EXEC;
++            }
++            break;
++        case 2:
++            break;
++        case 3:
++            if (!s1_is_el0) {
++                prot |= PAGE_EXEC;
++            }
++            break;
++        default:
++            g_assert_not_reached();
++        }
++    } else {
++        if (!extract32(xn, 1, 1)) {
++            if (arm_el_is_aa64(env, 2) || prot & PAGE_READ) {
++                prot |= PAGE_EXEC;
++            }
+         }
+     }
+     return prot;
 @@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
+     }
-     /* Now we can extract the actual base address from the TTBR */
-     descaddr = extract64(ttbr, 0, 48);
+     ap = extract32(attrs, 4, 2);
-+    /*
+-    xn = extract32(attrs, 12, 1);
-+     * We rely on this masking to clear the RES0 bits at the bottom of the TTBR
-+     * and also to mask out CnP (bit 0) which could validly be non-zero.
+     if (mmu_idx == ARMMMUIdx_Stage2) {
-+     */
+         ns = true;
-     descaddr &= ~indexmask;
+-        *prot = get_S2prot(env, ap, xn);
++        xn = extract32(attrs, 11, 2);
-     /* The address field in the descriptor goes up to bit 39 for ARMv7
++        *prot = get_S2prot(env, ap, xn, s1_is_el0);
      } else {
          ns = extract32(attrs, 3, 1);
 +        xn = extract32(attrs, 12, 1);
          pxn = extract32(attrs, 11, 1);
          *prot = get_S1prot(env, mmu_idx, aarch64, ap, ns, xn, pxn);
      }
 --
 .20.1
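
Note on the TTS2UXN patch above: the new XN decoding in get_S2prot() can be read in isolation as a small truth table. The following is a minimal sketch of that table, assuming the feature is implemented. The names sketch_s2prot_tts2uxn and SKETCH_PAGE_* are hypothetical stand-ins and are not QEMU identifiers; the flag values are illustrative only, and the S2AP bit 0 "read" check is assumed to match the part of get_S2prot() that the hunk does not show.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for QEMU's PAGE_* flags; values are illustrative. */
enum {
    SKETCH_PAGE_READ  = 1 << 0,
    SKETCH_PAGE_WRITE = 1 << 1,
    SKETCH_PAGE_EXEC  = 1 << 2,
};

/*
 * Compute stage 2 permissions from the 2-bit S2AP field and the 2-bit
 * XN field, following the switch added to get_S2prot() in the patch.
 * s1_is_el0 says whether the stage 1 access was made from EL0.
 */
static int sketch_s2prot_tts2uxn(int s2ap, int xn, bool s1_is_el0)
{
    int prot = 0;

    assert(xn >= 0 && xn <= 3);

    if (s2ap & 1) {
        prot |= SKETCH_PAGE_READ;
    }
    if (s2ap & 2) {
        prot |= SKETCH_PAGE_WRITE;
    }

    switch (xn) {
    case 0:                 /* executable at both EL0 and EL1 */
        prot |= SKETCH_PAGE_EXEC;
        break;
    case 1:                 /* executable at EL0 only */
        if (s1_is_el0) {
            prot |= SKETCH_PAGE_EXEC;
        }
        break;
    case 2:                 /* never executable */
        break;
    case 3:                 /* executable at EL1 only */
        if (!s1_is_el0) {
            prot |= SKETCH_PAGE_EXEC;
        }
        break;
    }
    return prot;
}
```

With the feature absent, the patch instead looks only at the upper XN bit (descriptor bit [54]), which is the pre-TTS2UXN behaviour; that fallback path is not reproduced in the sketch.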

-[PULL 22/37] target/arm: Honor the HCR_EL2.TACR bit
+[PULL 07/39] target/arm: Use correct variable for setting 'max' cpu's ID_AA64DFR0
-From: Richard Henderson <richard.henderson@linaro.org>
+In aarch64_max_initfn() we update both 32-bit and 64-bit ID
 registers.  The intended pattern is that for 64-bit ID registers we
 use FIELD_DP64 and the uint64_t 't' register, while 32-bit ID
 registers use FIELD_DP32 and the uint32_t 'u' register.  For
 ID_AA64DFR0 we accidentally used 'u', meaning that the top 32 bits of
 this 64-bit ID register would end up always zero.  Luckily at the
 moment that's what they should be anyway, so this bug has no visible
 effects.
-This bit traps EL1 access to the auxiliary control registers.
+Use the right-sized variable.
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Fixes: 3bec78447a958d481991
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
 Message-id: 20200229012811.24129-9-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com>
+Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
+Message-id: 20200423110915.10527-1-peter.maydell@linaro.org
 ---
- target/arm/helper.c | 18 ++++++++++++++----
+ target/arm/cpu64.c | 6 +++---
-file changed, 14 insertions(+), 4 deletions(-)
+file changed, 3 insertions(+), 3 deletions(-)
-diff --git a/target/arm/helper.c b/target/arm/helper.c
+diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
+--- a/target/arm/cpu64.c
-+++ b/target/arm/helper.c
++++ b/target/arm/cpu64.c
-@@ -XXX,XX +XXX,XX @@ static CPAccessResult access_tsw(CPUARMState *env, const ARMCPRegInfo *ri,
+@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
-     return CP_ACCESS_OK;
+         u = FIELD_DP32(u, ID_MMFR4, XNX, 1); /* TTS2UXN */
- }
+         cpu->isar.id_mmfr4 = u;
-+/* Check for traps from EL1 due to HCR_EL2.TACR.  */
+-        u = cpu->isar.id_aa64dfr0;
-+static CPAccessResult access_tacr(CPUARMState *env, const ARMCPRegInfo *ri,
+-        u = FIELD_DP64(u, ID_AA64DFR0, PMUVER, 5); /* v8.4-PMU */
-+                                  bool isread)
+-        cpu->isar.id_aa64dfr0 = u;
-+{
++        t = cpu->isar.id_aa64dfr0;
-+    if (arm_current_el(env) == 1 && (arm_hcr_el2_eff(env) & HCR_TACR)) {
++        t = FIELD_DP64(t, ID_AA64DFR0, PMUVER, 5); /* v8.4-PMU */
-+        return CP_ACCESS_TRAP_EL2;
++        cpu->isar.id_aa64dfr0 = t;
-+    }
-+    return CP_ACCESS_OK;
+         u = cpu->isar.id_dfr0;
-+}
+         u = FIELD_DP32(u, ID_DFR0, PERFMON, 5); /* v8.4-PMU */
 +
  static void dacr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
  {
      ARMCPU *cpu = env_archcpu(env);
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo ats1cp_reginfo[] = {
  static const ARMCPRegInfo actlr2_hactlr2_reginfo[] = {
      { .name = "ACTLR2", .state = ARM_CP_STATE_AA32,
        .cp = 15, .opc1 = 0, .crn = 1, .crm = 0, .opc2 = 3,
 -      .access = PL1_RW, .type = ARM_CP_CONST,
 -      .resetvalue = 0 },
 +      .access = PL1_RW, .accessfn = access_tacr,
 +      .type = ARM_CP_CONST, .resetvalue = 0 },
      { .name = "HACTLR2", .state = ARM_CP_STATE_AA32,
        .cp = 15, .opc1 = 4, .crn = 1, .crm = 0, .opc2 = 3,
        .access = PL2_RW, .type = ARM_CP_CONST,
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
          ARMCPRegInfo auxcr_reginfo[] = {
              { .name = "ACTLR_EL1", .state = ARM_CP_STATE_BOTH,
                .opc0 = 3, .opc1 = 0, .crn = 1, .crm = 0, .opc2 = 1,
 -              .access = PL1_RW, .type = ARM_CP_CONST,
 -              .resetvalue = cpu->reset_auxcr },
 +              .access = PL1_RW, .accessfn = access_tacr,
 +              .type = ARM_CP_CONST, .resetvalue = cpu->reset_auxcr },
              { .name = "ACTLR_EL2", .state = ARM_CP_STATE_BOTH,
                .opc0 = 3, .opc1 = 4, .crn = 1, .crm = 0, .opc2 = 1,
                .access = PL2_RW, .type = ARM_CP_CONST,
 --
 .20.1
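
Note on the ID_AA64DFR0 fix above: the effect of routing a 64-bit ID register through the 32-bit 'u' temporary can be shown with a small standalone sketch. The helper names below (sketch_deposit64, sketch_update_via_u32, sketch_update_via_u64) are hypothetical and are not QEMU's FIELD_DP64 macro; they only imitate a field deposit so that the truncation is visible.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: write 'val' into the 'len'-bit field at 'shift'. */
static uint64_t sketch_deposit64(uint64_t reg, unsigned shift, unsigned len,
                                 uint64_t val)
{
    uint64_t field = (len < 64) ? ((UINT64_C(1) << len) - 1) : ~UINT64_C(0);
    uint64_t mask = field << shift;

    return (reg & ~mask) | ((val << shift) & mask);
}

/* Pattern before the fix: a 64-bit register held in a 32-bit temporary. */
static uint64_t sketch_update_via_u32(uint64_t reg, unsigned shift,
                                      unsigned len, uint64_t val)
{
    uint32_t u = (uint32_t)reg;   /* the top 32 bits are dropped here */

    u = (uint32_t)sketch_deposit64(u, shift, len, val);
    return u;                     /* widened back with zeroed top half */
}

/* Pattern after the fix: the temporary is as wide as the register. */
static uint64_t sketch_update_via_u64(uint64_t reg, unsigned shift,
                                      unsigned len, uint64_t val)
{
    uint64_t t = reg;

    t = sketch_deposit64(t, shift, len, val);
    return t;
}
```

For a register value with any bit set above bit 31, the two variants return different results. As the commit message says, the top half of ID_AA64DFR0 is currently expected to be zero, so the bug had no visible effect.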

-[PULL 16/37] target/arm: Add HCR_EL2 bit definitions from ARMv8.6
+[PULL 08/39] target/arm: Use uint64_t for midr field in CPU state struct
-From: Richard Henderson <richard.henderson@linaro.org>
+From: Philippe Mathieu-Daudé <f4bug@amsat.org>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+MIDR_EL1 is a 64-bit system register with the top 32 bits being RES0.
-Message-id: 20200229012811.24129-3-richard.henderson@linaro.org
+Represent it in QEMU's ARMCPU struct with a uint64_t, not a
 uint32_t.
 This fixes an error when compiling with -Werror=conversion
 because we were manipulating the register value using a
 local uint64_t variable:
   target/arm/cpu64.c: In function ‘aarch64_max_initfn’:
   target/arm/cpu64.c:628:21: error: conversion from ‘uint64_t’ {aka ‘long unsigned int’} to ‘uint32_t’ {aka ‘unsigned int’} may change value [-Werror=conversion]
 |         cpu->midr = t;
         |                     ^
 and future-proofs us against a possible future architecture
 change using some of the top 32 bits.
 Suggested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
 Suggested-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
 Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com>
 Message-id: 20200428172634.29707-1-f4bug@amsat.org
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/cpu.h | 7 +++++++
+ target/arm/cpu.h | 2 +-
-file changed, 7 insertions(+)
+ target/arm/cpu.c | 2 +-
 files changed, 2 insertions(+), 2 deletions(-)
 diff --git a/target/arm/cpu.h b/target/arm/cpu.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu.h
 +++ b/target/arm/cpu.h
-@@ -XXX,XX +XXX,XX @@ static inline void xpsr_write(CPUARMState *env, uint32_t val, uint32_t mask)
+@@ -XXX,XX +XXX,XX @@ struct ARMCPU {
- #define HCR_TERR      (1ULL << 36)
+         uint64_t id_aa64dfr0;
- #define HCR_TEA       (1ULL << 37)
+         uint64_t id_aa64dfr1;
- #define HCR_MIOCNCE   (1ULL << 38)
+     } isar;
-+/* RES0 bit 39 */
+-    uint32_t midr;
- #define HCR_APK       (1ULL << 40)
++    uint64_t midr;
- #define HCR_API       (1ULL << 41)
+     uint32_t revidr;
- #define HCR_NV        (1ULL << 42)
+     uint32_t reset_fpsid;
-@@ -XXX,XX +XXX,XX @@ static inline void xpsr_write(CPUARMState *env, uint32_t val, uint32_t mask)
+     uint32_t ctr;
- #define HCR_NV2       (1ULL << 45)
+diff --git a/target/arm/cpu.c b/target/arm/cpu.c
- #define HCR_FWB       (1ULL << 46)
+index XXXXXXX..XXXXXXX 100644
- #define HCR_FIEN      (1ULL << 47)
+--- a/target/arm/cpu.c
-+/* RES0 bit 48 */
++++ b/target/arm/cpu.c
- #define HCR_TID4      (1ULL << 49)
+@@ -XXX,XX +XXX,XX @@ static const ARMCPUInfo arm_cpus[] = {
- #define HCR_TICAB     (1ULL << 50)
+ static Property arm_cpu_properties[] = {
-+#define HCR_AMVOFFEN  (1ULL << 51)
+     DEFINE_PROP_BOOL("start-powered-off", ARMCPU, start_powered_off, false),
- #define HCR_TOCU      (1ULL << 52)
+     DEFINE_PROP_UINT32("psci-conduit", ARMCPU, psci_conduit, 0),
-+#define HCR_ENSCXT    (1ULL << 53)
+-    DEFINE_PROP_UINT32("midr", ARMCPU, midr, 0),
- #define HCR_TTLBIS    (1ULL << 54)
++    DEFINE_PROP_UINT64("midr", ARMCPU, midr, 0),
- #define HCR_TTLBOS    (1ULL << 55)
+     DEFINE_PROP_UINT64("mp-affinity", ARMCPU,
- #define HCR_ATA       (1ULL << 56)
+                         mp_affinity, ARM64_AFFINITY_INVALID),
- #define HCR_DCT       (1ULL << 57)
+     DEFINE_PROP_INT32("node-id", ARMCPU, node_id, CPU_UNSET_NUMA_NODE_ID),
 +#define HCR_TID5      (1ULL << 58)
 +#define HCR_TWEDEN    (1ULL << 59)
 +#define HCR_TWEDEL    MAKE_64BIT_MASK(60, 4)
  #define SCR_NS                (1U << 0)
  #define SCR_IRQ               (1U << 1)
 --
 .20.1
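
The narrowing that -Werror=conversion rejects in the midr patch above can be reproduced outside QEMU. The following is a minimal sketch, not QEMU code: the two structs and both helper functions are hypothetical stand-ins for the field before and after the type change.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the CPU state field before and after the patch. */
struct cpu_before { uint32_t midr; };
struct cpu_after  { uint64_t midr; };

/*
 * Storing a uint64_t into a uint32_t field drops the top 32 bits.
 * Without the explicit cast, GCC's -Wconversion flags this assignment,
 * and -Werror=conversion turns that into a build failure.
 */
static uint32_t store_midr_narrow(struct cpu_before *cpu, uint64_t t)
{
    cpu->midr = (uint32_t)t;
    return cpu->midr;
}

/* With a uint64_t field there is no conversion, so the value is kept whole. */
static uint64_t store_midr_wide(struct cpu_after *cpu, uint64_t t)
{
    cpu->midr = t;
    return cpu->midr;
}
```

Widening the field rather than adding a cast keeps the full register value, which is what the commit message argues for.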

-[PULL 21/37] target/arm: Honor the HCR_EL2.TSW bit
+[PULL 09/39] hw/arm: versal: Remove inclusion of arm_gicv3_common.h
-From: Richard Henderson <richard.henderson@linaro.org>
+From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
-These bits trap EL1 access to set/way cache maintenance insns.
+Remove inclusion of arm_gicv3_common.h; this already gets
 included via xlnx-versal.h.
-Buglink: https://bugs.launchpad.net/bugs/1863685
+Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Reviewed-by: Luc Michel <luc.michel@greensocs.com>
-Message-id: 20200229012811.24129-8-richard.henderson@linaro.org
+Message-id: 20200427181649.26851-2-edgar.iglesias@gmail.com
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/helper.c | 22 ++++++++++++++++------
+ hw/arm/xlnx-versal.c | 1 -
-file changed, 16 insertions(+), 6 deletions(-)
+file changed, 1 deletion(-)
-diff --git a/target/arm/helper.c b/target/arm/helper.c
+diff --git a/hw/arm/xlnx-versal.c b/hw/arm/xlnx-versal.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
+--- a/hw/arm/xlnx-versal.c
-+++ b/target/arm/helper.c
++++ b/hw/arm/xlnx-versal.c
-@@ -XXX,XX +XXX,XX @@ static CPAccessResult access_tvm_trvm(CPUARMState *env, const ARMCPRegInfo *ri,
+@@ -XXX,XX +XXX,XX @@
-     return CP_ACCESS_OK;
+ #include "hw/arm/boot.h"
- }
+ #include "kvm_arm.h"
+ #include "hw/misc/unimp.h"
-+/* Check for traps from EL1 due to HCR_EL2.TSW.  */
+-#include "hw/intc/arm_gicv3_common.h"
-+static CPAccessResult access_tsw(CPUARMState *env, const ARMCPRegInfo *ri,
+ #include "hw/arm/xlnx-versal.h"
-+                                 bool isread)
+ #include "hw/char/pl011.h"
-+{
 +    if (arm_current_el(env) == 1 && (arm_hcr_el2_eff(env) & HCR_TSW)) {
 +        return CP_ACCESS_TRAP_EL2;
 +    }
 +    return CP_ACCESS_OK;
 +}
 +
  static void dacr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
  {
      ARMCPU *cpu = env_archcpu(env);
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
        .access = PL1_W, .type = ARM_CP_NOP },
      { .name = "DC_ISW", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 6, .opc2 = 2,
 -      .access = PL1_W, .type = ARM_CP_NOP },
 +      .access = PL1_W, .accessfn = access_tsw, .type = ARM_CP_NOP },
      { .name = "DC_CVAC", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 10, .opc2 = 1,
        .access = PL0_W, .type = ARM_CP_NOP,
        .accessfn = aa64_cacheop_access },
      { .name = "DC_CSW", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 10, .opc2 = 2,
 -      .access = PL1_W, .type = ARM_CP_NOP },
 +      .access = PL1_W, .accessfn = access_tsw, .type = ARM_CP_NOP },
      { .name = "DC_CVAU", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 11, .opc2 = 1,
        .access = PL0_W, .type = ARM_CP_NOP,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
        .accessfn = aa64_cacheop_access },
      { .name = "DC_CISW", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 14, .opc2 = 2,
 -      .access = PL1_W, .type = ARM_CP_NOP },
 +      .access = PL1_W, .accessfn = access_tsw, .type = ARM_CP_NOP },
      /* TLBI operations */
      { .name = "TLBI_VMALLE1IS", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 0,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
      { .name = "DCIMVAC", .cp = 15, .opc1 = 0, .crn = 7, .crm = 6, .opc2 = 1,
        .type = ARM_CP_NOP, .access = PL1_W },
      { .name = "DCISW", .cp = 15, .opc1 = 0, .crn = 7, .crm = 6, .opc2 = 2,
 -      .type = ARM_CP_NOP, .access = PL1_W },
 +      .type = ARM_CP_NOP, .access = PL1_W, .accessfn = access_tsw },
      { .name = "DCCMVAC", .cp = 15, .opc1 = 0, .crn = 7, .crm = 10, .opc2 = 1,
        .type = ARM_CP_NOP, .access = PL1_W },
      { .name = "DCCSW", .cp = 15, .opc1 = 0, .crn = 7, .crm = 10, .opc2 = 2,
 -      .type = ARM_CP_NOP, .access = PL1_W },
 +      .type = ARM_CP_NOP, .access = PL1_W, .accessfn = access_tsw },
      { .name = "DCCMVAU", .cp = 15, .opc1 = 0, .crn = 7, .crm = 11, .opc2 = 1,
        .type = ARM_CP_NOP, .access = PL1_W },
      { .name = "DCCIMVAC", .cp = 15, .opc1 = 0, .crn = 7, .crm = 14, .opc2 = 1,
        .type = ARM_CP_NOP, .access = PL1_W },
      { .name = "DCCISW", .cp = 15, .opc1 = 0, .crn = 7, .crm = 14, .opc2 = 2,
 -      .type = ARM_CP_NOP, .access = PL1_W },
 +      .type = ARM_CP_NOP, .access = PL1_W, .accessfn = access_tsw },
      /* MMU Domain access control / MPU write buffer control */
      { .name = "DACR", .cp = 15, .opc1 = 0, .crn = 3, .crm = 0, .opc2 = 0,
        .access = PL1_RW, .accessfn = access_tvm_trvm, .resetvalue = 0,
 --
 .20.1
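
The trap condition used by access_tsw in the hunk above can be exercised in isolation. This is a rough sketch under stated assumptions: the AccessResult enum and check_tsw() are hypothetical simplifications, and the current exception level and effective HCR_EL2 value are passed in directly instead of being read from CPU state. HCR_TSW is taken to be bit 22 of HCR_EL2.

```c
#include <stdint.h>

#define HCR_TSW (1ULL << 22)   /* HCR_EL2.TSW: trap set/way cache maintenance */

/* Hypothetical reduced version of the access-check result type. */
typedef enum { ACCESS_OK, ACCESS_TRAP_EL2 } AccessResult;

/*
 * Only accesses made from EL1 are trapped, and only when TSW is set in the
 * effective HCR_EL2 value. Every other combination is allowed through.
 */
static AccessResult check_tsw(int current_el, uint64_t hcr_el2_eff)
{
    if (current_el == 1 && (hcr_el2_eff & HCR_TSW)) {
        return ACCESS_TRAP_EL2;
    }
    return ACCESS_OK;
}
```

In the patch the same check is attached through .accessfn to each set/way operation (DC_ISW, DC_CSW, DC_CISW and their AArch32 counterparts), so the register table stays declarative.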

-[PULL 36/37] target/arm: Use DEF_HELPER_FLAGS for helper_dc_zva
+[PULL 10/39] hw/arm: versal: Move misplaced comment
-From: Richard Henderson <richard.henderson@linaro.org>
+From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
-The function does not write registers, and only reads them by
+Move misplaced comment.
 implication via the exception path.
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
-Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
-Message-id: 20200302175829.2183-7-richard.henderson@linaro.org
+Reviewed-by: Luc Michel <luc.michel@greensocs.com>
 Message-id: 20200427181649.26851-3-edgar.iglesias@gmail.com
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/helper-a64.h | 2 +-
+ hw/arm/xlnx-versal.c | 2 +-
 file changed, 1 insertion(+), 1 deletion(-)
-diff --git a/target/arm/helper-a64.h b/target/arm/helper-a64.h
+diff --git a/hw/arm/xlnx-versal.c b/hw/arm/xlnx-versal.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper-a64.h
+--- a/hw/arm/xlnx-versal.c
-+++ b/target/arm/helper-a64.h
++++ b/hw/arm/xlnx-versal.c
-@@ -XXX,XX +XXX,XX @@ DEF_HELPER_2(advsimd_f16touinth, i32, f16, ptr)
+@@ -XXX,XX +XXX,XX @@ static void versal_create_apu_cpus(Versal *s)
- DEF_HELPER_2(sqrt_f16, f16, f16, ptr)
+         obj = object_new(XLNX_VERSAL_ACPU_TYPE);
- DEF_HELPER_2(exception_return, void, env, i64)
+         if (!obj) {
--DEF_HELPER_2(dc_zva, void, env, i64)
+-            /* Secondary CPUs start in PSCI powered-down state */
-+DEF_HELPER_FLAGS_2(dc_zva, TCG_CALL_NO_WG, void, env, i64)
+             error_report("Unable to create apu.cpu[%d] of type %s",
+                          i, XLNX_VERSAL_ACPU_TYPE);
- DEF_HELPER_FLAGS_3(pacia, TCG_CALL_NO_WG, i64, env, i64, i64)
+             exit(EXIT_FAILURE);
- DEF_HELPER_FLAGS_3(pacib, TCG_CALL_NO_WG, i64, env, i64, i64)
+@@ -XXX,XX +XXX,XX @@ static void versal_create_apu_cpus(Versal *s)
          object_property_set_int(obj, s->cfg.psci_conduit,
                                  "psci-conduit", &error_abort);
          if (i) {
 +            /* Secondary CPUs start in PSCI powered-down state */
              object_property_set_bool(obj, true,
                                       "start-powered-off", &error_abort);
          }
 --
 .20.1

-[PULL 17/37] target/arm: Disable has_el2 and has_el3 for user-only
+[PULL 11/39] hw/arm: versal-virt: Fix typo xlnx-ve -> xlnx-versal
-From: Richard Henderson <richard.henderson@linaro.org>
+From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
-In arm_cpu_reset, we configure many system registers so that user-only
+Fix typo xlnx-ve -> xlnx-versal.
 behaves as it should with a minimum of ifdefs.  However, we do not set
 all of the system registers as required for a cpu with EL2 and EL3.
-Disabling EL2 and EL3 mean that we will not look at those registers,
+Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
-which means that we don't have to worry about configuring them.
+Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
+Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Reviewed-by: Luc Michel <luc.michel@greensocs.com>
-Message-id: 20200229012811.24129-4-richard.henderson@linaro.org
+Message-id: 20200427181649.26851-4-edgar.iglesias@gmail.com
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/cpu.c | 6 ++++--
+ hw/arm/xlnx-versal-virt.c | 2 +-
-file changed, 4 insertions(+), 2 deletions(-)
+file changed, 1 insertion(+), 1 deletion(-)
-diff --git a/target/arm/cpu.c b/target/arm/cpu.c
+diff --git a/hw/arm/xlnx-versal-virt.c b/hw/arm/xlnx-versal-virt.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.c
+--- a/hw/arm/xlnx-versal-virt.c
-+++ b/target/arm/cpu.c
++++ b/hw/arm/xlnx-versal-virt.c
-@@ -XXX,XX +XXX,XX @@ static Property arm_cpu_reset_hivecs_property =
+@@ -XXX,XX +XXX,XX @@ static void versal_virt_init(MachineState *machine)
- static Property arm_cpu_rvbar_property =
+         psci_conduit = QEMU_PSCI_CONDUIT_SMC;
              DEFINE_PROP_UINT64("rvbar", ARMCPU, rvbar, 0);
 +#ifndef CONFIG_USER_ONLY
  static Property arm_cpu_has_el2_property =
              DEFINE_PROP_BOOL("has_el2", ARMCPU, has_el2, true);
  static Property arm_cpu_has_el3_property =
              DEFINE_PROP_BOOL("has_el3", ARMCPU, has_el3, true);
 +#endif
  static Property arm_cpu_cfgend_property =
              DEFINE_PROP_BOOL("cfgend", ARMCPU, cfgend, false);
@@ -XXX,XX +XXX,XX @@ void arm_cpu_post_init(Object *obj)
          qdev_property_add_static(DEVICE(obj), &arm_cpu_rvbar_property);
      }
-+#ifndef CONFIG_USER_ONLY
+-    sysbus_init_child_obj(OBJECT(machine), "xlnx-ve", &s->soc,
-     if (arm_feature(&cpu->env, ARM_FEATURE_EL3)) {
++    sysbus_init_child_obj(OBJECT(machine), "xlnx-versal", &s->soc,
-         /* Add the has_el3 state CPU property only if EL3 is allowed.  This will
+                           sizeof(s->soc), TYPE_XLNX_VERSAL);
-          * prevent "has_el3" from existing on CPUs which cannot support EL3.
+     object_property_set_link(OBJECT(&s->soc), OBJECT(machine->ram),
-          */
+                              "ddr", &error_abort);
          qdev_property_add_static(DEVICE(obj), &arm_cpu_has_el3_property);
 -#ifndef CONFIG_USER_ONLY
          object_property_add_link(obj, "secure-memory",
                                   TYPE_MEMORY_REGION,
                                   (Object **)&cpu->secure_memory,
                                   qdev_prop_allow_set_link_before_realize,
                                   OBJ_PROP_LINK_STRONG,
                                   &error_abort);
 -#endif
      }
      if (arm_feature(&cpu->env, ARM_FEATURE_EL2)) {
          qdev_property_add_static(DEVICE(obj), &arm_cpu_has_el2_property);
      }
 +#endif
      if (arm_feature(&cpu->env, ARM_FEATURE_PMU)) {
          cpu->has_pmu = true;
 --
 .20.1

-[PULL 34/37] target/arm: Apply TBI to ESR_ELx in helper_exception_return
+[PULL 12/39] hw/arm: versal: Embed the UARTs into the SoC type
-From: Richard Henderson <richard.henderson@linaro.org>
+From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
-We missed this case within AArch64.ExceptionReturn.
+Embed the UARTs into the SoC type.
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Suggested-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
-Message-id: 20200302175829.2183-5-richard.henderson@linaro.org
+Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
 Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
 Reviewed-by: Luc Michel <luc.michel@greensocs.com>
 Message-id: 20200427181649.26851-5-edgar.iglesias@gmail.com
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/helper-a64.c | 23 ++++++++++++++++++++++-
+ include/hw/arm/xlnx-versal.h |  3 ++-
-file changed, 22 insertions(+), 1 deletion(-)
+ hw/arm/xlnx-versal.c         | 12 ++++++------
 files changed, 8 insertions(+), 7 deletions(-)
-diff --git a/target/arm/helper-a64.c b/target/arm/helper-a64.c
+diff --git a/include/hw/arm/xlnx-versal.h b/include/hw/arm/xlnx-versal.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper-a64.c
+--- a/include/hw/arm/xlnx-versal.h
-+++ b/target/arm/helper-a64.c
++++ b/include/hw/arm/xlnx-versal.h
-@@ -XXX,XX +XXX,XX @@ void HELPER(exception_return)(CPUARMState *env, uint64_t new_pc)
+@@ -XXX,XX +XXX,XX @@
-                       "AArch32 EL%d PC 0x%" PRIx32 "\n",
+ #include "hw/sysbus.h"
-                       cur_el, new_el, env->regs[15]);
+ #include "hw/arm/boot.h"
-     } else {
+ #include "hw/intc/arm_gicv3.h"
-+        int tbii;
++#include "hw/char/pl011.h"
-+
-         env->aarch64 = 1;
+ #define TYPE_XLNX_VERSAL "xlnx-versal"
-         spsr &= aarch64_pstate_valid_mask(&env_archcpu(env)->isar);
+ #define XLNX_VERSAL(obj) OBJECT_CHECK(Versal, (obj), TYPE_XLNX_VERSAL)
-         pstate_write(env, spsr);
+@@ -XXX,XX +XXX,XX @@ typedef struct Versal {
-@@ -XXX,XX +XXX,XX @@ void HELPER(exception_return)(CPUARMState *env, uint64_t new_pc)
+         MemoryRegion mr_ocm;
-             env->pstate &= ~PSTATE_SS;
-         }
+         struct {
-         aarch64_restore_sp(env, new_el);
+-            SysBusDevice *uart[XLNX_VERSAL_NR_UARTS];
--        env->pc = new_pc;
++            PL011State uart[XLNX_VERSAL_NR_UARTS];
-         helper_rebuild_hflags_a64(env, new_el);
+             SysBusDevice *gem[XLNX_VERSAL_NR_GEMS];
-+
+             SysBusDevice *adma[XLNX_VERSAL_NR_ADMAS];
-+        /*
+         } iou;
-+         * Apply TBI to the exception return address.  We had to delay this
+diff --git a/hw/arm/xlnx-versal.c b/hw/arm/xlnx-versal.c
-+         * until after we selected the new EL, so that we could select the
+index XXXXXXX..XXXXXXX 100644
-+         * correct TBI+TBID bits.  This is made easier by waiting until after
+--- a/hw/arm/xlnx-versal.c
-+         * the hflags rebuild, since we can pull the composite TBII field
++++ b/hw/arm/xlnx-versal.c
-+         * from there.
+@@ -XXX,XX +XXX,XX @@
-+         */
+ #include "kvm_arm.h"
-+        tbii = FIELD_EX32(env->hflags, TBFLAG_A64, TBII);
+ #include "hw/misc/unimp.h"
-+        if ((tbii >> extract64(new_pc, 55, 1)) & 1) {
+ #include "hw/arm/xlnx-versal.h"
-+            /* TBI is enabled. */
+-#include "hw/char/pl011.h"
-+            int core_mmu_idx = cpu_mmu_index(env, false);
-+            if (regime_has_2_ranges(core_to_aa64_mmu_idx(core_mmu_idx))) {
+ #define XLNX_VERSAL_ACPU_TYPE ARM_CPU_TYPE_NAME("cortex-a72")
-+                new_pc = sextract64(new_pc, 0, 56);
+ #define GEM_REVISION        0x40070106
-+            } else {
+@@ -XXX,XX +XXX,XX @@ static void versal_create_uarts(Versal *s, qemu_irq *pic)
-+                new_pc = extract64(new_pc, 0, 56);
+         DeviceState *dev;
-+            }
+         MemoryRegion *mr;
-+        }
-+        env->pc = new_pc;
+-        dev = qdev_create(NULL, TYPE_PL011);
-+
+-        s->lpd.iou.uart[i] = SYS_BUS_DEVICE(dev);
-         qemu_log_mask(CPU_LOG_INT, "Exception return from AArch64 EL%d to "
++        sysbus_init_child_obj(OBJECT(s), name,
-                       "AArch64 EL%d PC 0x%" PRIx64 "\n",
++                              &s->lpd.iou.uart[i], sizeof(s->lpd.iou.uart[i]),
-                       cur_el, new_el, env->pc);
++                              TYPE_PL011);
 +        dev = DEVICE(&s->lpd.iou.uart[i]);
          qdev_prop_set_chr(dev, "chardev", serial_hd(i));
 -        object_property_add_child(OBJECT(s), name, OBJECT(dev), &error_fatal);
          qdev_init_nofail(dev);
 -        mr = sysbus_mmio_get_region(s->lpd.iou.uart[i], 0);
 +        mr = sysbus_mmio_get_region(SYS_BUS_DEVICE(dev), 0);
          memory_region_add_subregion(&s->mr_ps, addrs[i], mr);
 -        sysbus_connect_irq(s->lpd.iou.uart[i], 0, pic[irqs[i]]);
 +        sysbus_connect_irq(SYS_BUS_DEVICE(dev), 0, pic[irqs[i]]);
          g_free(name);
      }
  }
 --
 .20.1
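
The "embed into the SoC type" change above replaces an array of device pointers with an array of device structs held by value, initialised in place. A simplified sketch of that layout follows; Uart, Soc, soc_init_uarts() and the IRQ numbers are all hypothetical and do not use the QEMU object model.

```c
#include <stdio.h>
#include <string.h>

#define NR_UARTS 2

/* Hypothetical child device. */
typedef struct Uart {
    int irq;
    char name[16];
} Uart;

/* The children live inside the parent, so no separate allocation is needed. */
typedef struct Soc {
    Uart uart[NR_UARTS];
} Soc;

/* Initialise each embedded child in place, using its address within the parent. */
static void soc_init_uarts(Soc *s)
{
    for (int i = 0; i < NR_UARTS; i++) {
        Uart *u = &s->uart[i];

        memset(u, 0, sizeof(*u));
        snprintf(u->name, sizeof(u->name), "uart%d", i);
        u->irq = 18 + i;   /* illustrative IRQ numbers only */
    }
}
```

Because the child storage belongs to the parent, its lifetime follows the parent's, and code that needs a generic handle takes the address of the embedded member, much as the patch does with DEVICE(&s->lpd.iou.uart[i]).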

-[PULL 35/37] target/arm: Move helper_dc_zva to helper-a64.c
+[PULL 13/39] hw/arm: versal: Embed the GEMs into the SoC type
-From: Richard Henderson <richard.henderson@linaro.org>
+From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
-This is an aarch64-only function.  Move it out of the shared file.
+Embed the GEMs into the SoC type.
 This patch is code movement only.
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Suggested-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
-Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
-Message-id: 20200302175829.2183-6-richard.henderson@linaro.org
+Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
 Reviewed-by: Luc Michel <luc.michel@greensocs.com>
 Message-id: 20200427181649.26851-6-edgar.iglesias@gmail.com
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/helper-a64.h |  1 +
+ include/hw/arm/xlnx-versal.h |  3 ++-
- target/arm/helper.h     |  1 -
+ hw/arm/xlnx-versal.c         | 15 ++++++++-------
- target/arm/helper-a64.c | 91 ++++++++++++++++++++++++++++++++++++++++
+files changed, 10 insertions(+), 8 deletions(-)
  target/arm/op_helper.c  | 93 -----------------------------------------
 files changed, 92 insertions(+), 94 deletions(-)
-diff --git a/target/arm/helper-a64.h b/target/arm/helper-a64.h
+diff --git a/include/hw/arm/xlnx-versal.h b/include/hw/arm/xlnx-versal.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper-a64.h
+--- a/include/hw/arm/xlnx-versal.h
-+++ b/target/arm/helper-a64.h
++++ b/include/hw/arm/xlnx-versal.h
-@@ -XXX,XX +XXX,XX @@ DEF_HELPER_2(advsimd_f16touinth, i32, f16, ptr)
+@@ -XXX,XX +XXX,XX @@
- DEF_HELPER_2(sqrt_f16, f16, f16, ptr)
+ #include "hw/arm/boot.h"
+ #include "hw/intc/arm_gicv3.h"
- DEF_HELPER_2(exception_return, void, env, i64)
+ #include "hw/char/pl011.h"
-+DEF_HELPER_2(dc_zva, void, env, i64)
++#include "hw/net/cadence_gem.h"
- DEF_HELPER_FLAGS_3(pacia, TCG_CALL_NO_WG, i64, env, i64, i64)
+ #define TYPE_XLNX_VERSAL "xlnx-versal"
- DEF_HELPER_FLAGS_3(pacib, TCG_CALL_NO_WG, i64, env, i64, i64)
+ #define XLNX_VERSAL(obj) OBJECT_CHECK(Versal, (obj), TYPE_XLNX_VERSAL)
-diff --git a/target/arm/helper.h b/target/arm/helper.h
+@@ -XXX,XX +XXX,XX @@ typedef struct Versal {
          struct {
              PL011State uart[XLNX_VERSAL_NR_UARTS];
 -            SysBusDevice *gem[XLNX_VERSAL_NR_GEMS];
 +            CadenceGEMState gem[XLNX_VERSAL_NR_GEMS];
              SysBusDevice *adma[XLNX_VERSAL_NR_ADMAS];
          } iou;
      } lpd;
 diff --git a/hw/arm/xlnx-versal.c b/hw/arm/xlnx-versal.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.h
+--- a/hw/arm/xlnx-versal.c
-+++ b/target/arm/helper.h
++++ b/hw/arm/xlnx-versal.c
-@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(crypto_sm4ekey, TCG_CALL_NO_RWG, void, ptr, ptr, ptr)
+@@ -XXX,XX +XXX,XX @@ static void versal_create_gems(Versal *s, qemu_irq *pic)
+         DeviceState *dev;
- DEF_HELPER_FLAGS_3(crc32, TCG_CALL_NO_RWG_SE, i32, i32, i32, i32)
+         MemoryRegion *mr;
- DEF_HELPER_FLAGS_3(crc32c, TCG_CALL_NO_RWG_SE, i32, i32, i32, i32)
--DEF_HELPER_2(dc_zva, void, env, i64)
+-        dev = qdev_create(NULL, "cadence_gem");
+-        s->lpd.iou.gem[i] = SYS_BUS_DEVICE(dev);
- DEF_HELPER_FLAGS_5(gvec_qrdmlah_s16, TCG_CALL_NO_RWG,
+-        object_property_add_child(OBJECT(s), name, OBJECT(dev), &error_fatal);
-                    void, ptr, ptr, ptr, ptr, i32)
++        sysbus_init_child_obj(OBJECT(s), name,
-diff --git a/target/arm/helper-a64.c b/target/arm/helper-a64.c
++                              &s->lpd.iou.gem[i], sizeof(s->lpd.iou.gem[i]),
-index XXXXXXX..XXXXXXX 100644
++                              TYPE_CADENCE_GEM);
---- a/target/arm/helper-a64.c
++        dev = DEVICE(&s->lpd.iou.gem[i]);
-+++ b/target/arm/helper-a64.c
+         if (nd->used) {
-@@ -XXX,XX +XXX,XX @@
+             qemu_check_nic_model(nd, "cadence_gem");
-  */
+             qdev_set_nic_properties(dev, nd);
+         }
- #include "qemu/osdep.h"
+-        object_property_set_int(OBJECT(s->lpd.iou.gem[i]),
-+#include "qemu/units.h"
++        object_property_set_int(OBJECT(dev),
- #include "cpu.h"
+, "num-priority-queues",
- #include "exec/gdbstub.h"
+                                 &error_abort);
- #include "exec/helper-proto.h"
+-        object_property_set_link(OBJECT(s->lpd.iou.gem[i]),
-@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(sqrt_f16)(uint32_t a, void *fpstp)
++        object_property_set_link(OBJECT(dev),
-     return float16_sqrt(a, s);
+                                  OBJECT(&s->mr_ps), "dma",
- }
+                                  &error_abort);
+         qdev_init_nofail(dev);
-+void HELPER(dc_zva)(CPUARMState *env, uint64_t vaddr_in)
-+{
+-        mr = sysbus_mmio_get_region(s->lpd.iou.gem[i], 0);
-+    /*
++        mr = sysbus_mmio_get_region(SYS_BUS_DEVICE(dev), 0);
-+     * Implement DC ZVA, which zeroes a fixed-length block of memory.
+         memory_region_add_subregion(&s->mr_ps, addrs[i], mr);
-+     * Note that we do not implement the (architecturally mandated)
-+     * alignment fault for attempts to use this on Device memory
+-        sysbus_connect_irq(s->lpd.iou.gem[i], 0, pic[irqs[i]]);
-+     * (which matches the usual QEMU behaviour of not implementing either
++        sysbus_connect_irq(SYS_BUS_DEVICE(dev), 0, pic[irqs[i]]);
-+     * alignment faults or any memory attribute handling).
+         g_free(name);
 +     */
 +    ARMCPU *cpu = env_archcpu(env);
 +    uint64_t blocklen = 4 << cpu->dcz_blocksize;
 +    uint64_t vaddr = vaddr_in & ~(blocklen - 1);
 +
 +#ifndef CONFIG_USER_ONLY
 +    {
 +        /*
 +         * Slightly awkwardly, QEMU's TARGET_PAGE_SIZE may be less than
 +         * the block size so we might have to do more than one TLB lookup.
 +         * We know that in fact for any v8 CPU the page size is at least 4K
 +         * and the block size must be 2K or less, but TARGET_PAGE_SIZE is only
 +         * 1K as an artefact of legacy v5 subpage support being present in the
 +         * same QEMU executable. So in practice the hostaddr[] array has
 +         * two entries, given the current setting of TARGET_PAGE_BITS_MIN.
 +         */
 +        int maxidx = DIV_ROUND_UP(blocklen, TARGET_PAGE_SIZE);
 +        void *hostaddr[DIV_ROUND_UP(2 * KiB, 1 << TARGET_PAGE_BITS_MIN)];
 +        int try, i;
 +        unsigned mmu_idx = cpu_mmu_index(env, false);
 +        TCGMemOpIdx oi = make_memop_idx(MO_UB, mmu_idx);
 +
 +        assert(maxidx <= ARRAY_SIZE(hostaddr));
 +
 +        for (try = 0; try < 2; try++) {
 +
 +            for (i = 0; i < maxidx; i++) {
 +                hostaddr[i] = tlb_vaddr_to_host(env,
 +                                                vaddr + TARGET_PAGE_SIZE * i,
 +                                                1, mmu_idx);
 +                if (!hostaddr[i]) {
 +                    break;
 +                }
 +            }
 +            if (i == maxidx) {
 +                /*
 +                 * If it's all in the TLB it's fair game for just writing to;
 +                 * we know we don't need to update dirty status, etc.
 +                 */
 +                for (i = 0; i < maxidx - 1; i++) {
 +                    memset(hostaddr[i], 0, TARGET_PAGE_SIZE);
 +                }
 +                memset(hostaddr[i], 0, blocklen - (i * TARGET_PAGE_SIZE));
 +                return;
 +            }
 +            /*
 +             * OK, try a store and see if we can populate the tlb. This
 +             * might cause an exception if the memory isn't writable,
 +             * in which case we will longjmp out of here. We must for
 +             * this purpose use the actual register value passed to us
 +             * so that we get the fault address right.
 +             */
 +            helper_ret_stb_mmu(env, vaddr_in, 0, oi, GETPC());
 +            /* Now we can populate the other TLB entries, if any */
 +            for (i = 0; i < maxidx; i++) {
 +                uint64_t va = vaddr + TARGET_PAGE_SIZE * i;
 +                if (va != (vaddr_in & TARGET_PAGE_MASK)) {
 +                    helper_ret_stb_mmu(env, va, 0, oi, GETPC());
 +                }
 +            }
 +        }
 +
 +        /*
 +         * Slow path (probably attempt to do this to an I/O device or
 +         * similar, or clearing of a block of code we have translations
 +         * cached for). Just do a series of byte writes as the architecture
 +         * demands. It's not worth trying to use a cpu_physical_memory_map(),
 +         * memset(), unmap() sequence here because:
 +         *  + we'd need to account for the blocksize being larger than a page
 +         *  + the direct-RAM access case is almost always going to be dealt
 +         *    with in the fastpath code above, so there's no speed benefit
 +         *  + we would have to deal with the map returning NULL because the
 +         *    bounce buffer was in use
 +         */
 +        for (i = 0; i < blocklen; i++) {
 +            helper_ret_stb_mmu(env, vaddr + i, 0, oi, GETPC());
 +        }
 +    }
 +#else
 +    memset(g2h(vaddr), 0, blocklen);
 +#endif
 +}
 diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/op_helper.c
 +++ b/target/arm/op_helper.c
@@ -XXX,XX +XXX,XX @@
   * License along with this library; if not, see <http://www.gnu.org/licenses/>.
   */
  #include "qemu/osdep.h"
 -#include "qemu/units.h"
  #include "qemu/log.h"
  #include "qemu/main-loop.h"
  #include "cpu.h"
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(ror_cc)(CPUARMState *env, uint32_t x, uint32_t i)
          return ((uint32_t)x >> shift) | (x << (32 - shift));
      }
  }
--
--void HELPER(dc_zva)(CPUARMState *env, uint64_t vaddr_in)
--{
--    /*
--     * Implement DC ZVA, which zeroes a fixed-length block of memory.
--     * Note that we do not implement the (architecturally mandated)
--     * alignment fault for attempts to use this on Device memory
--     * (which matches the usual QEMU behaviour of not implementing either
--     * alignment faults or any memory attribute handling).
--     */
--
--    ARMCPU *cpu = env_archcpu(env);
--    uint64_t blocklen = 4 << cpu->dcz_blocksize;
--    uint64_t vaddr = vaddr_in & ~(blocklen - 1);
--
--#ifndef CONFIG_USER_ONLY
--    {
--        /*
--         * Slightly awkwardly, QEMU's TARGET_PAGE_SIZE may be less than
--         * the block size so we might have to do more than one TLB lookup.
--         * We know that in fact for any v8 CPU the page size is at least 4K
--         * and the block size must be 2K or less, but TARGET_PAGE_SIZE is only
--         * 1K as an artefact of legacy v5 subpage support being present in the
--         * same QEMU executable. So in practice the hostaddr[] array has
--         * two entries, given the current setting of TARGET_PAGE_BITS_MIN.
--         */
--        int maxidx = DIV_ROUND_UP(blocklen, TARGET_PAGE_SIZE);
--        void *hostaddr[DIV_ROUND_UP(2 * KiB, 1 << TARGET_PAGE_BITS_MIN)];
--        int try, i;
--        unsigned mmu_idx = cpu_mmu_index(env, false);
--        TCGMemOpIdx oi = make_memop_idx(MO_UB, mmu_idx);
--
--        assert(maxidx <= ARRAY_SIZE(hostaddr));
--
--        for (try = 0; try < 2; try++) {
--
--            for (i = 0; i < maxidx; i++) {
--                hostaddr[i] = tlb_vaddr_to_host(env,
--                                                vaddr + TARGET_PAGE_SIZE * i,
--                                                1, mmu_idx);
--                if (!hostaddr[i]) {
--                    break;
--                }
--            }
--            if (i == maxidx) {
--                /*
--                 * If it's all in the TLB it's fair game for just writing to;
--                 * we know we don't need to update dirty status, etc.
--                 */
--                for (i = 0; i < maxidx - 1; i++) {
--                    memset(hostaddr[i], 0, TARGET_PAGE_SIZE);
--                }
--                memset(hostaddr[i], 0, blocklen - (i * TARGET_PAGE_SIZE));
--                return;
--            }
--            /*
--             * OK, try a store and see if we can populate the tlb. This
--             * might cause an exception if the memory isn't writable,
--             * in which case we will longjmp out of here. We must for
--             * this purpose use the actual register value passed to us
--             * so that we get the fault address right.
--             */
--            helper_ret_stb_mmu(env, vaddr_in, 0, oi, GETPC());
--            /* Now we can populate the other TLB entries, if any */
--            for (i = 0; i < maxidx; i++) {
--                uint64_t va = vaddr + TARGET_PAGE_SIZE * i;
--                if (va != (vaddr_in & TARGET_PAGE_MASK)) {
--                    helper_ret_stb_mmu(env, va, 0, oi, GETPC());
--                }
--            }
--        }
--
--        /*
--         * Slow path (probably attempt to do this to an I/O device or
--         * similar, or clearing of a block of code we have translations
--         * cached for). Just do a series of byte writes as the architecture
--         * demands. It's not worth trying to use a cpu_physical_memory_map(),
--         * memset(), unmap() sequence here because:
--         *  + we'd need to account for the blocksize being larger than a page
--         *  + the direct-RAM access case is almost always going to be dealt
--         *    with in the fastpath code above, so there's no speed benefit
--         *  + we would have to deal with the map returning NULL because the
--         *    bounce buffer was in use
--         */
--        for (i = 0; i < blocklen; i++) {
--            helper_ret_stb_mmu(env, vaddr + i, 0, oi, GETPC());
--        }
--    }
--#else
--    memset(g2h(vaddr), 0, blocklen);
--#endif
--}
 --
 .20.1

-[PULL 04/37] hw/arm/smmu-common: a fix to smmu_find_smmu_pcibus
+[PULL 14/39] hw/arm: versal: Embed the ADMAs into the SoC type
-From: Eric Auger <eric.auger@redhat.com>
+From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
-Make sure a null SMMUPciBus is returned in case we were
+Embed the ADMAs into the SoC type.
 not able to identify a pci bus matching the @bus_num.
-This matches the fix done on intel iommu in commit:
+Suggested-by: Peter Maydell <peter.maydell@linaro.org>
-a2e1cd41ccfe796529abfd1b6aeb1dd4393762a2
+Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
+Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
-Signed-off-by: Eric Auger <eric.auger@redhat.com>
+Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
-Reviewed-by: Peter Xu <peterx@redhat.com>
+Reviewed-by: Luc Michel <luc.michel@greensocs.com>
-Message-Id: <20200226172628.17449-1-eric.auger@redhat.com>
+Message-id: 20200427181649.26851-7-edgar.iglesias@gmail.com
 Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
 Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- hw/arm/smmu-common.c | 1 +
+ include/hw/arm/xlnx-versal.h |  3 ++-
-file changed, 1 insertion(+)
+ hw/arm/xlnx-versal.c         | 14 +++++++-------
 files changed, 9 insertions(+), 8 deletions(-)
-diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
+diff --git a/include/hw/arm/xlnx-versal.h b/include/hw/arm/xlnx-versal.h
 index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/smmu-common.c
+--- a/include/hw/arm/xlnx-versal.h
-+++ b/hw/arm/smmu-common.c
++++ b/include/hw/arm/xlnx-versal.h
-@@ -XXX,XX +XXX,XX @@ SMMUPciBus *smmu_find_smmu_pcibus(SMMUState *s, uint8_t bus_num)
+@@ -XXX,XX +XXX,XX @@
-                 return smmu_pci_bus;
+ #include "hw/arm/boot.h"
-             }
+ #include "hw/intc/arm_gicv3.h"
-         }
+ #include "hw/char/pl011.h"
-+        smmu_pci_bus = NULL;
++#include "hw/dma/xlnx-zdma.h"
  #include "hw/net/cadence_gem.h"
  #define TYPE_XLNX_VERSAL "xlnx-versal"
@@ -XXX,XX +XXX,XX @@ typedef struct Versal {
          struct {
              PL011State uart[XLNX_VERSAL_NR_UARTS];
              CadenceGEMState gem[XLNX_VERSAL_NR_GEMS];
 -            SysBusDevice *adma[XLNX_VERSAL_NR_ADMAS];
 +            XlnxZDMA adma[XLNX_VERSAL_NR_ADMAS];
          } iou;
      } lpd;
 diff --git a/hw/arm/xlnx-versal.c b/hw/arm/xlnx-versal.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/arm/xlnx-versal.c
 +++ b/hw/arm/xlnx-versal.c
@@ -XXX,XX +XXX,XX @@ static void versal_create_admas(Versal *s, qemu_irq *pic)
          DeviceState *dev;
          MemoryRegion *mr;
 -        dev = qdev_create(NULL, "xlnx.zdma");
 -        s->lpd.iou.adma[i] = SYS_BUS_DEVICE(dev);
 -        object_property_set_int(OBJECT(s->lpd.iou.adma[i]), 128, "bus-width",
 -                                &error_abort);
 -        object_property_add_child(OBJECT(s), name, OBJECT(dev), &error_fatal);
 +        sysbus_init_child_obj(OBJECT(s), name,
 +                              &s->lpd.iou.adma[i], sizeof(s->lpd.iou.adma[i]),
 +                              TYPE_XLNX_ZDMA);
 +        dev = DEVICE(&s->lpd.iou.adma[i]);
 +        object_property_set_int(OBJECT(dev), 128, "bus-width", &error_abort);
          qdev_init_nofail(dev);
 -        mr = sysbus_mmio_get_region(s->lpd.iou.adma[i], 0);
 +        mr = sysbus_mmio_get_region(SYS_BUS_DEVICE(dev), 0);
          memory_region_add_subregion(&s->mr_ps,
                                      MM_ADMA_CH0 + i * MM_ADMA_CH0_SIZE, mr);
 -        sysbus_connect_irq(s->lpd.iou.adma[i], 0, pic[VERSAL_ADMA_IRQ_0 + i]);
 +        sysbus_connect_irq(SYS_BUS_DEVICE(dev), 0, pic[VERSAL_ADMA_IRQ_0 + i]);
          g_free(name);
      }
-     return smmu_pci_bus;
  }
 --
 .20.1
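
The ADMA patch above changes `adma[]` from an array of `SysBusDevice *` to an array of `XlnxZDMA`, so the child devices occupy storage inside the `Versal` struct instead of being allocated separately. The sketch below shows only that layout difference in plain C. `Child`, `ParentByPointer`, `ParentEmbedded`, `init_embedded` and `lives_inside` are hypothetical stand-ins, not QEMU types or APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CHILDREN 8   /* same count as XLNX_VERSAL_NR_ADMAS in the patch */

/* Hypothetical stand-in for a device state struct. */
typedef struct Child {
    int bus_width;
} Child;

/* Before: the parent only holds pointers; each child is allocated elsewhere. */
typedef struct ParentByPointer {
    Child *child[NR_CHILDREN];
} ParentByPointer;

/* After: the children are members of the parent and share its lifetime. */
typedef struct ParentEmbedded {
    Child child[NR_CHILDREN];
} ParentEmbedded;

/* Initialise every embedded child in place; no allocation is needed. */
static void init_embedded(ParentEmbedded *s)
{
    assert(s != NULL);
    for (size_t i = 0; i < NR_CHILDREN; i++) {
        s->child[i].bus_width = 128;   /* value the patch sets for "bus-width" */
    }
}

/* Return 1 if address p falls inside the object of `size` bytes at `base`. */
static int lives_inside(const void *base, size_t size, const void *p)
{
    uintptr_t b = (uintptr_t)base;
    uintptr_t q = (uintptr_t)p;
    return q >= b && q - b < size;
}
```

With the embedded layout, `&s.child[i]` always points into `s` itself, which is why the patch can take `DEVICE(&s->lpd.iou.adma[i])` without a separate creation step.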

-[PULL 15/37] target/arm: Improve masking of HCR/HCR2 RES0 bits
+[PULL 15/39] hw/arm: versal: Embed the APUs into the SoC type
-From: Richard Henderson <richard.henderson@linaro.org>
+From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
-Don't merely start with v8.0, handle v7VE as well.  Ensure that writes
+Embed the APUs into the SoC type.
 from aarch32 mode do not change bits in the other half of the register.
 Protect reads of aa64 id registers with ARM_FEATURE_AARCH64.
 Suggested-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
-Message-id: 20200229012811.24129-2-richard.henderson@linaro.org
+Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
 Reviewed-by: Luc Michel <luc.michel@greensocs.com>
 Message-id: 20200427181649.26851-8-edgar.iglesias@gmail.com
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/helper.c | 38 +++++++++++++++++++++++++-------------
+ include/hw/arm/xlnx-versal.h |  2 +-
-file changed, 25 insertions(+), 13 deletions(-)
+ hw/arm/xlnx-versal-virt.c    |  4 ++--
  hw/arm/xlnx-versal.c         | 19 +++++--------------
 files changed, 8 insertions(+), 17 deletions(-)
-diff --git a/target/arm/helper.c b/target/arm/helper.c
+diff --git a/include/hw/arm/xlnx-versal.h b/include/hw/arm/xlnx-versal.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
+--- a/include/hw/arm/xlnx-versal.h
-+++ b/target/arm/helper.c
++++ b/include/hw/arm/xlnx-versal.h
-@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo el3_no_el2_v8_cp_reginfo[] = {
+@@ -XXX,XX +XXX,XX @@ typedef struct Versal {
-     REGINFO_SENTINEL
+     struct {
- };
+         struct {
+             MemoryRegion mr;
--static void hcr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
+-            ARMCPU *cpu[XLNX_VERSAL_NR_ACPUS];
-+static void do_hcr_write(CPUARMState *env, uint64_t value, uint64_t valid_mask)
++            ARMCPU cpu[XLNX_VERSAL_NR_ACPUS];
- {
+             GICv3State gic;
-     ARMCPU *cpu = env_archcpu(env);
+         } apu;
--    /* Begin with bits defined in base ARMv8.0.  */
+     } fpd;
--    uint64_t valid_mask = MAKE_64BIT_MASK(0, 34);
+diff --git a/hw/arm/xlnx-versal-virt.c b/hw/arm/xlnx-versal-virt.c
-+
+index XXXXXXX..XXXXXXX 100644
-+    if (arm_feature(env, ARM_FEATURE_V8)) {
+--- a/hw/arm/xlnx-versal-virt.c
-+        valid_mask |= MAKE_64BIT_MASK(0, 34);  /* ARMv8.0 */
++++ b/hw/arm/xlnx-versal-virt.c
-+    } else {
+@@ -XXX,XX +XXX,XX @@ static void versal_virt_init(MachineState *machine)
-+        valid_mask |= MAKE_64BIT_MASK(0, 28);  /* ARMv7VE */
+     s->binfo.get_dtb = versal_virt_get_dtb;
-+    }
+     s->binfo.modify_dtb = versal_virt_modify_dtb;
+     if (machine->kernel_filename) {
-     if (arm_feature(env, ARM_FEATURE_EL3)) {
+-        arm_load_kernel(s->soc.fpd.apu.cpu[0], machine, &s->binfo);
-         valid_mask &= ~HCR_HCD;
++        arm_load_kernel(&s->soc.fpd.apu.cpu[0], machine, &s->binfo);
-@@ -XXX,XX +XXX,XX @@ static void hcr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
+     } else {
-          */
+-        AddressSpace *as = arm_boot_address_space(s->soc.fpd.apu.cpu[0],
-         valid_mask &= ~HCR_TSC;
++        AddressSpace *as = arm_boot_address_space(&s->soc.fpd.apu.cpu[0],
                                                    &s->binfo);
          /* Some boot-loaders (e.g u-boot) don't like blobs at address 0 (NULL).
           * Offset things by 4K.  */
 diff --git a/hw/arm/xlnx-versal.c b/hw/arm/xlnx-versal.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/arm/xlnx-versal.c
 +++ b/hw/arm/xlnx-versal.c
@@ -XXX,XX +XXX,XX @@ static void versal_create_apu_cpus(Versal *s)
      for (i = 0; i < ARRAY_SIZE(s->fpd.apu.cpu); i++) {
          Object *obj;
 -        char *name;
 -
 -        obj = object_new(XLNX_VERSAL_ACPU_TYPE);
 -        if (!obj) {
 -            error_report("Unable to create apu.cpu[%d] of type %s",
 -                         i, XLNX_VERSAL_ACPU_TYPE);
 -            exit(EXIT_FAILURE);
 -        }
 -
 -        name = g_strdup_printf("apu-cpu[%d]", i);
 -        object_property_add_child(OBJECT(s), name, obj, &error_fatal);
 -        g_free(name);
 +        object_initialize_child(OBJECT(s), "apu-cpu[*]",
 +                                &s->fpd.apu.cpu[i], sizeof(s->fpd.apu.cpu[i]),
 +                                XLNX_VERSAL_ACPU_TYPE, &error_abort, NULL);
 +        obj = OBJECT(&s->fpd.apu.cpu[i]);
          object_property_set_int(obj, s->cfg.psci_conduit,
                                  "psci-conduit", &error_abort);
          if (i) {
@@ -XXX,XX +XXX,XX @@ static void versal_create_apu_cpus(Versal *s)
          object_property_set_link(obj, OBJECT(&s->fpd.apu.mr), "memory",
                                   &error_abort);
          object_property_set_bool(obj, true, "realized", &error_fatal);
 -        s->fpd.apu.cpu[i] = ARM_CPU(obj);
      }
--    if (cpu_isar_feature(aa64_vh, cpu)) {
+ }
--        valid_mask |= HCR_E2H;
--    }
+@@ -XXX,XX +XXX,XX @@ static void versal_create_apu_gic(Versal *s, qemu_irq *pic)
 -    if (cpu_isar_feature(aa64_lor, cpu)) {
 -        valid_mask |= HCR_TLOR;
 -    }
 -    if (cpu_isar_feature(aa64_pauth, cpu)) {
 -        valid_mask |= HCR_API | HCR_APK;
 +
 +    if (arm_feature(env, ARM_FEATURE_AARCH64)) {
 +        if (cpu_isar_feature(aa64_vh, cpu)) {
 +            valid_mask |= HCR_E2H;
 +        }
 +        if (cpu_isar_feature(aa64_lor, cpu)) {
 +            valid_mask |= HCR_TLOR;
 +        }
 +        if (cpu_isar_feature(aa64_pauth, cpu)) {
 +            valid_mask |= HCR_API | HCR_APK;
 +        }
      }
-     /* Clear RES0 bits.  */
+     for (i = 0; i < nr_apu_cpus; i++) {
-@@ -XXX,XX +XXX,XX @@ static void hcr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
+-        DeviceState *cpudev = DEVICE(s->fpd.apu.cpu[i]);
-     arm_cpu_update_vfiq(cpu);
++        DeviceState *cpudev = DEVICE(&s->fpd.apu.cpu[i]);
- }
+         int ppibase = XLNX_VERSAL_NR_IRQS + i * GIC_INTERNAL + GIC_NR_SGIS;
+         qemu_irq maint_irq;
-+static void hcr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
+         int ti;
 +{
 +    do_hcr_write(env, value, 0);
 +}
 +
  static void hcr_writehigh(CPUARMState *env, const ARMCPRegInfo *ri,
                            uint64_t value)
  {
      /* Handle HCR2 write, i.e. write to high half of HCR_EL2 */
      value = deposit64(env->cp15.hcr_el2, 32, 32, value);
 -    hcr_write(env, NULL, value);
 +    do_hcr_write(env, value, MAKE_64BIT_MASK(0, 32));
  }
  static void hcr_writelow(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -XXX,XX +XXX,XX @@ static void hcr_writelow(CPUARMState *env, const ARMCPRegInfo *ri,
  {
      /* Handle HCR write, i.e. write to low half of HCR_EL2 */
      value = deposit64(env->cp15.hcr_el2, 0, 32, value);
 -    hcr_write(env, NULL, value);
 +    do_hcr_write(env, value, MAKE_64BIT_MASK(32, 32));
  }
  /*
 --
 .20.1

-[PULL 01/37] hw/arm: versal: Add support for the LPD ADMAs
+[PULL 16/39] hw/arm: versal: Add support for SD
 From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
-Add support for the Versal LPD ADMAs.
+Add support for SD.
 Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
-Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com>
+Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
-Reviewed-by: KONRAD Frederic <frederic.konrad@adacore.com>
+Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
 Reviewed-by: Luc Michel <luc.michel@greensocs.com>
+Message-id: 20200427181649.26851-9-edgar.iglesias@gmail.com
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- include/hw/arm/xlnx-versal.h |  6 ++++++
+ include/hw/arm/xlnx-versal.h | 12 ++++++++++++
- hw/arm/xlnx-versal.c         | 24 ++++++++++++++++++++++++
+ hw/arm/xlnx-versal.c         | 31 +++++++++++++++++++++++++++++++
-files changed, 30 insertions(+)
+files changed, 43 insertions(+)
 diff --git a/include/hw/arm/xlnx-versal.h b/include/hw/arm/xlnx-versal.h
 index XXXXXXX..XXXXXXX 100644
 --- a/include/hw/arm/xlnx-versal.h
 +++ b/include/hw/arm/xlnx-versal.h
 @@ -XXX,XX +XXX,XX @@
- #define XLNX_VERSAL_NR_ACPUS   2
  #include "hw/sysbus.h"
  #include "hw/arm/boot.h"
 +#include "hw/sd/sdhci.h"
  #include "hw/intc/arm_gicv3.h"
  #include "hw/char/pl011.h"
  #include "hw/dma/xlnx-zdma.h"
@@ -XXX,XX +XXX,XX @@
  #define XLNX_VERSAL_NR_UARTS   2
  #define XLNX_VERSAL_NR_GEMS    2
-+#define XLNX_VERSAL_NR_ADMAS   8
+ #define XLNX_VERSAL_NR_ADMAS   8
 +#define XLNX_VERSAL_NR_SDS     2
  #define XLNX_VERSAL_NR_IRQS    192
  typedef struct Versal {
 @@ -XXX,XX +XXX,XX @@ typedef struct Versal {
-         struct {
-             SysBusDevice *uart[XLNX_VERSAL_NR_UARTS];
-             SysBusDevice *gem[XLNX_VERSAL_NR_GEMS];
-+            SysBusDevice *adma[XLNX_VERSAL_NR_ADMAS];
          } iou;
      } lpd;
++    /* The Platform Management Controller subsystem.  */
++    struct {
++        struct {
++            SDHCIState sd[XLNX_VERSAL_NR_SDS];
++        } iou;
++    } pmc;
++
+     struct {
+         MemoryRegion *mr_ddr;
+         uint32_t psci_conduit;
 @@ -XXX,XX +XXX,XX @@ typedef struct Versal {
- #define VERSAL_GEM0_WAKE_IRQ_0     57
  #define VERSAL_GEM1_IRQ_0          58
  #define VERSAL_GEM1_WAKE_IRQ_0     59
-+#define VERSAL_ADMA_IRQ_0          60
+ #define VERSAL_ADMA_IRQ_0          60
 +#define VERSAL_SD0_IRQ_0           126
  /* Architecturally reserved IRQs suitable for virtualization.  */
  #define VERSAL_RSVD_IRQ_FIRST 111
 @@ -XXX,XX +XXX,XX @@ typedef struct Versal {
- #define MM_GEM1                     0xff0d0000U
+ #define MM_FPD_CRF                  0xfd1a0000U
- #define MM_GEM1_SIZE                0x10000
+ #define MM_FPD_CRF_SIZE             0x140000
-+#define MM_ADMA_CH0                 0xffa80000U
++#define MM_PMC_SD0                  0xf1040000U
-+#define MM_ADMA_CH0_SIZE            0x10000
++#define MM_PMC_SD0_SIZE             0x10000
-+
+ #define MM_PMC_CRP                  0xf1260000U
- #define MM_OCM                      0xfffc0000U
+ #define MM_PMC_CRP_SIZE             0x10000
- #define MM_OCM_SIZE                 0x40000
+ #endif
 diff --git a/hw/arm/xlnx-versal.c b/hw/arm/xlnx-versal.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/arm/xlnx-versal.c
 +++ b/hw/arm/xlnx-versal.c
-@@ -XXX,XX +XXX,XX @@ static void versal_create_gems(Versal *s, qemu_irq *pic)
+@@ -XXX,XX +XXX,XX @@ static void versal_create_admas(Versal *s, qemu_irq *pic)
      }
  }
-+static void versal_create_admas(Versal *s, qemu_irq *pic)
++#define SDHCI_CAPABILITIES  0x280737ec6481 /* Same as on ZynqMP.  */
 +static void versal_create_sds(Versal *s, qemu_irq *pic)
 +{
 +    int i;
 +
-+    for (i = 0; i < ARRAY_SIZE(s->lpd.iou.adma); i++) {
++    for (i = 0; i < ARRAY_SIZE(s->pmc.iou.sd); i++) {
 +        char *name = g_strdup_printf("adma%d", i);
 +        DeviceState *dev;
 +        MemoryRegion *mr;
 +
-+        dev = qdev_create(NULL, "xlnx.zdma");
++        sysbus_init_child_obj(OBJECT(s), "sd[*]",
-+        s->lpd.iou.adma[i] = SYS_BUS_DEVICE(dev);
++                              &s->pmc.iou.sd[i], sizeof(s->pmc.iou.sd[i]),
-+        object_property_add_child(OBJECT(s), name, OBJECT(dev), &error_fatal);
++                              TYPE_SYSBUS_SDHCI);
 +        dev = DEVICE(&s->pmc.iou.sd[i]);
 +
 +        object_property_set_uint(OBJECT(dev),
 +                                 3, "sd-spec-version", &error_fatal);
 +        object_property_set_uint(OBJECT(dev), SDHCI_CAPABILITIES, "capareg",
 +                                 &error_fatal);
 +        object_property_set_uint(OBJECT(dev), UHS_I, "uhs", &error_fatal);
 +        qdev_init_nofail(dev);
 +
-+        mr = sysbus_mmio_get_region(s->lpd.iou.adma[i], 0);
++        mr = sysbus_mmio_get_region(SYS_BUS_DEVICE(dev), 0);
 +        memory_region_add_subregion(&s->mr_ps,
-+                                    MM_ADMA_CH0 + i * MM_ADMA_CH0_SIZE, mr);
++                                    MM_PMC_SD0 + i * MM_PMC_SD0_SIZE, mr);
 +
-+        sysbus_connect_irq(s->lpd.iou.adma[i], 0, pic[VERSAL_ADMA_IRQ_0 + i]);
++        sysbus_connect_irq(SYS_BUS_DEVICE(dev), 0,
-+        g_free(name);
++                           pic[VERSAL_SD0_IRQ_0 + i * 2]);
 +    }
 +}
 +
  /* This takes the board allocated linear DDR memory and creates aliases
   * for each split DDR range/aperture on the Versal address map.
   */
 @@ -XXX,XX +XXX,XX @@ static void versal_realize(DeviceState *dev, Error **errp)
-     versal_create_apu_gic(s, pic);
      versal_create_uarts(s, pic);
      versal_create_gems(s, pic);
-+    versal_create_admas(s, pic);
+     versal_create_admas(s, pic);
 +    versal_create_sds(s, pic);
      versal_map_ddr(s);
      versal_unimp(s);
 --
 .20.1
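
As a quick check of the SD patch's arithmetic, the sketch below computes the MMIO base and interrupt line for each controller from the constants the patch defines. The helper names `sd_mmio_base` and `sd_irq` are hypothetical. The constants, and the step of two interrupt lines per controller, are taken from the patch.

```c
#include <stdint.h>

/* Constants copied from the patch above. */
#define MM_PMC_SD0          0xf1040000U
#define MM_PMC_SD0_SIZE     0x10000
#define VERSAL_SD0_IRQ_0    126
#define XLNX_VERSAL_NR_SDS  2

/* Hypothetical helper: MMIO base address of SD controller i. */
static uint64_t sd_mmio_base(unsigned int i)
{
    return (uint64_t)MM_PMC_SD0 + (uint64_t)i * MM_PMC_SD0_SIZE;
}

/* Hypothetical helper: interrupt line of SD controller i (stride of 2). */
static unsigned int sd_irq(unsigned int i)
{
    return VERSAL_SD0_IRQ_0 + i * 2;
}
```

Under these definitions the second controller lands at 0xf1050000 on interrupt 128, which is the same pair the FDT node generation in the versal-virt patch derives from `MM_PMC_SD0 + MM_PMC_SD0_SIZE * i` and `VERSAL_SD0_IRQ_0 + i * 2`.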

-[PULL 31/37] target/arm: Replicate TBI/TBID bits for single range regimes
+[PULL 17/39] hw/arm: versal: Add support for the RTC
-From: Richard Henderson <richard.henderson@linaro.org>
+From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
-Replicate the single TBI bit from TCR_EL2 and TCR_EL3 so that
+hw/arm: versal: Add support for the RTC.
 we can unconditionally use pointer bit 55 to index into our
 composite TBI1:TBI0 field.
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
-Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
-Message-id: 20200302175829.2183-2-richard.henderson@linaro.org
+Reviewed-by: Luc Michel <luc.michel@greensocs.com>
 Message-id: 20200427181649.26851-10-edgar.iglesias@gmail.com
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/helper.c | 6 ++++--
+ include/hw/arm/xlnx-versal.h |  8 ++++++++
-file changed, 4 insertions(+), 2 deletions(-)
+ hw/arm/xlnx-versal.c         | 21 +++++++++++++++++++++
 files changed, 29 insertions(+)
-diff --git a/target/arm/helper.c b/target/arm/helper.c
+diff --git a/include/hw/arm/xlnx-versal.h b/include/hw/arm/xlnx-versal.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
+--- a/include/hw/arm/xlnx-versal.h
-+++ b/target/arm/helper.c
++++ b/include/hw/arm/xlnx-versal.h
-@@ -XXX,XX +XXX,XX @@ static int aa64_va_parameter_tbi(uint64_t tcr, ARMMMUIdx mmu_idx)
+@@ -XXX,XX +XXX,XX @@
-     } else if (mmu_idx == ARMMMUIdx_Stage2) {
+ #include "hw/char/pl011.h"
-         return 0; /* VTCR_EL2 */
+ #include "hw/dma/xlnx-zdma.h"
-     } else {
+ #include "hw/net/cadence_gem.h"
--        return extract32(tcr, 20, 1);
++#include "hw/rtc/xlnx-zynqmp-rtc.h"
-+        /* Replicate the single TBI bit so we always have 2 bits.  */
-+        return extract32(tcr, 20, 1) * 3;
+ #define TYPE_XLNX_VERSAL "xlnx-versal"
  #define XLNX_VERSAL(obj) OBJECT_CHECK(Versal, (obj), TYPE_XLNX_VERSAL)
@@ -XXX,XX +XXX,XX @@ typedef struct Versal {
          struct {
              SDHCIState sd[XLNX_VERSAL_NR_SDS];
          } iou;
 +
 +        XlnxZynqMPRTC rtc;
      } pmc;
      struct {
@@ -XXX,XX +XXX,XX @@ typedef struct Versal {
  #define VERSAL_GEM1_IRQ_0          58
  #define VERSAL_GEM1_WAKE_IRQ_0     59
  #define VERSAL_ADMA_IRQ_0          60
 +#define VERSAL_RTC_APB_ERR_IRQ     121
  #define VERSAL_SD0_IRQ_0           126
 +#define VERSAL_RTC_ALARM_IRQ       142
 +#define VERSAL_RTC_SECONDS_IRQ     143
  /* Architecturally reserved IRQs suitable for virtualization.  */
  #define VERSAL_RSVD_IRQ_FIRST 111
@@ -XXX,XX +XXX,XX @@ typedef struct Versal {
  #define MM_PMC_SD0_SIZE             0x10000
  #define MM_PMC_CRP                  0xf1260000U
  #define MM_PMC_CRP_SIZE             0x10000
 +#define MM_PMC_RTC                  0xf12a0000
 +#define MM_PMC_RTC_SIZE             0x10000
  #endif
 diff --git a/hw/arm/xlnx-versal.c b/hw/arm/xlnx-versal.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/arm/xlnx-versal.c
 +++ b/hw/arm/xlnx-versal.c
@@ -XXX,XX +XXX,XX @@ static void versal_create_sds(Versal *s, qemu_irq *pic)
      }
  }
-@@ -XXX,XX +XXX,XX @@ static int aa64_va_parameter_tbid(uint64_t tcr, ARMMMUIdx mmu_idx)
++static void versal_create_rtc(Versal *s, qemu_irq *pic)
-     } else if (mmu_idx == ARMMMUIdx_Stage2) {
++{
-         return 0; /* VTCR_EL2 */
++    SysBusDevice *sbd;
-     } else {
++    MemoryRegion *mr;
--        return extract32(tcr, 29, 1);
++
-+        /* Replicate the single TBID bit so we always have 2 bits.  */
++    sysbus_init_child_obj(OBJECT(s), "rtc", &s->pmc.rtc, sizeof(s->pmc.rtc),
-+        return extract32(tcr, 29, 1) * 3;
++                          TYPE_XLNX_ZYNQMP_RTC);
-     }
++    sbd = SYS_BUS_DEVICE(&s->pmc.rtc);
- }
++    qdev_init_nofail(DEVICE(sbd));
 +
 +    mr = sysbus_mmio_get_region(sbd, 0);
 +    memory_region_add_subregion(&s->mr_ps, MM_PMC_RTC, mr);
 +
 +    /*
 +     * TODO: Connect the ALARM and SECONDS interrupts once our RTC model
 +     * supports them.
 +     */
 +    sysbus_connect_irq(sbd, 1, pic[VERSAL_RTC_APB_ERR_IRQ]);
 +}
 +
  /* This takes the board allocated linear DDR memory and creates aliases
   * for each split DDR range/aperture on the Versal address map.
   */
@@ -XXX,XX +XXX,XX @@ static void versal_realize(DeviceState *dev, Error **errp)
      versal_create_gems(s, pic);
      versal_create_admas(s, pic);
      versal_create_sds(s, pic);
 +    versal_create_rtc(s, pic);
      versal_map_ddr(s);
      versal_unimp(s);
 --
 .20.1

-[PULL 02/37] hw/arm: versal: Generate xlnx-versal-virt zdma FDT nodes
+[PULL 18/39] hw/arm: versal-virt: Add support for SD
 From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
-Generate xlnx-versal-virt zdma FDT nodes.
+Add support for SD.
 Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
-Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com>
+Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
 Reviewed-by: KONRAD Frederic <frederic.konrad@adacore.com>
 Reviewed-by: Luc Michel <luc.michel@greensocs.com>
+Message-id: 20200427181649.26851-11-edgar.iglesias@gmail.com
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- hw/arm/xlnx-versal-virt.c | 28 ++++++++++++++++++++++++++++
+ hw/arm/xlnx-versal-virt.c | 46 +++++++++++++++++++++++++++++++++++++++
-file changed, 28 insertions(+)
+file changed, 46 insertions(+)
 diff --git a/hw/arm/xlnx-versal-virt.c b/hw/arm/xlnx-versal-virt.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/arm/xlnx-versal-virt.c
 +++ b/hw/arm/xlnx-versal-virt.c
-@@ -XXX,XX +XXX,XX @@ static void fdt_add_gem_nodes(VersalVirt *s)
+@@ -XXX,XX +XXX,XX @@
  #include "hw/arm/sysbus-fdt.h"
  #include "hw/arm/fdt.h"
  #include "cpu.h"
 +#include "hw/qdev-properties.h"
  #include "hw/arm/xlnx-versal.h"
  #define TYPE_XLNX_VERSAL_VIRT_MACHINE MACHINE_TYPE_NAME("xlnx-versal-virt")
@@ -XXX,XX +XXX,XX @@ static void fdt_add_zdma_nodes(VersalVirt *s)
      }
  }
-+static void fdt_add_zdma_nodes(VersalVirt *s)
++static void fdt_add_sd_nodes(VersalVirt *s)
 +{
-+    const char clocknames[] = "clk_main\0clk_apb";
++    const char clocknames[] = "clk_xin\0clk_ahb";
-+    const char compat[] = "xlnx,zynqmp-dma-1.0";
++    const char compat[] = "arasan,sdhci-8.9a";
 +    int i;
 +
-+    for (i = XLNX_VERSAL_NR_ADMAS - 1; i >= 0; i--) {
++    for (i = ARRAY_SIZE(s->soc.pmc.iou.sd) - 1; i >= 0; i--) {
-+        uint64_t addr = MM_ADMA_CH0 + MM_ADMA_CH0_SIZE * i;
++        uint64_t addr = MM_PMC_SD0 + MM_PMC_SD0_SIZE * i;
-+        char *name = g_strdup_printf("/dma@%" PRIx64, addr);
++        char *name = g_strdup_printf("/sdhci@%" PRIx64, addr);
 +
 +        qemu_fdt_add_subnode(s->fdt, name);
 +
-+        qemu_fdt_setprop_cell(s->fdt, name, "xlnx,bus-width", 64);
 +        qemu_fdt_setprop_cells(s->fdt, name, "clocks",
 +                               s->phandle.clk_25Mhz, s->phandle.clk_25Mhz);
 +        qemu_fdt_setprop(s->fdt, name, "clock-names",
 +                         clocknames, sizeof(clocknames));
 +        qemu_fdt_setprop_cells(s->fdt, name, "interrupts",
-+                               GIC_FDT_IRQ_TYPE_SPI, VERSAL_ADMA_IRQ_0 + i,
++                               GIC_FDT_IRQ_TYPE_SPI, VERSAL_SD0_IRQ_0 + i * 2,
 +                               GIC_FDT_IRQ_FLAGS_LEVEL_HI);
 +        qemu_fdt_setprop_sized_cells(s->fdt, name, "reg",
-+                                     2, addr, 2, 0x1000);
++                                     2, addr, 2, MM_PMC_SD0_SIZE);
 +        qemu_fdt_setprop(s->fdt, name, "compatible", compat, sizeof(compat));
 +        g_free(name);
 +    }
 +}
 +
  static void fdt_nop_memory_nodes(void *fdt, Error **errp)
  {
      Error *err = NULL;
+@@ -XXX,XX +XXX,XX @@ static void create_virtio_regions(VersalVirt *s)
+     }
+ }
++static void sd_plugin_card(SDHCIState *sd, DriveInfo *di)
++{
++    BlockBackend *blk = di ? blk_by_legacy_dinfo(di) : NULL;
++    DeviceState *card;
++
++    card = qdev_create(qdev_get_child_bus(DEVICE(sd), "sd-bus"), TYPE_SD_CARD);
++    object_property_add_child(OBJECT(sd), "card[*]", OBJECT(card),
++                              &error_fatal);
++    qdev_prop_set_drive(card, "drive", blk, &error_fatal);
++    object_property_set_bool(OBJECT(card), true, "realized", &error_fatal);
++}
++
+ static void versal_virt_init(MachineState *machine)
+ {
+     VersalVirt *s = XLNX_VERSAL_VIRT_MACHINE(machine);
+     int psci_conduit = QEMU_PSCI_CONDUIT_DISABLED;
++    int i;
+     /*
+      * If the user provides an Operating System to be loaded, we expect them
 @@ -XXX,XX +XXX,XX @@ static void versal_virt_init(MachineState *machine)
-     fdt_add_uart_nodes(s);
      fdt_add_gic_nodes(s);
      fdt_add_timer_nodes(s);
-+    fdt_add_zdma_nodes(s);
+     fdt_add_zdma_nodes(s);
 +    fdt_add_sd_nodes(s);
      fdt_add_cpu_nodes(s, psci_conduit);
      fdt_add_clk_node(s, "/clk125", 125000000, s->phandle.clk_125Mhz);
      fdt_add_clk_node(s, "/clk25", 25000000, s->phandle.clk_25Mhz);
+@@ -XXX,XX +XXX,XX @@ static void versal_virt_init(MachineState *machine)
+     memory_region_add_subregion_overlap(get_system_memory(),
+                                         0, &s->soc.fpd.apu.mr, 0);
++    /* Plugin SD cards.  */
++    for (i = 0; i < ARRAY_SIZE(s->soc.pmc.iou.sd); i++) {
++        sd_plugin_card(&s->soc.pmc.iou.sd[i], drive_get_next(IF_SD));
++    }
++
+     s->binfo.ram_size = machine->ram_size;
+     s->binfo.loader_start = 0x0;
+     s->binfo.get_dtb = versal_virt_get_dtb;
 --
 .20.1
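
`fdt_add_sd_nodes()` passes `sizeof(clocknames)` for a literal with an embedded NUL, relying on the device-tree string-list encoding: NUL-terminated strings stored back to back. The sketch below shows why `sizeof` gives the right length where `strlen` would not. `count_strings` and `nth_string` are hypothetical helpers for illustration, not libfdt or QEMU functions.

```c
#include <stddef.h>
#include <string.h>

/* Same literal as in the patch: two names separated by an embedded NUL. */
static const char clocknames[] = "clk_xin\0clk_ahb";

/* Hypothetical helper: count the NUL-terminated strings in buf[0..len). */
static size_t count_strings(const char *buf, size_t len)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\0') {
            n++;
        }
    }
    return n;
}

/* Hypothetical helper: return the idx-th string in the list, or NULL. */
static const char *nth_string(const char *buf, size_t len, size_t idx)
{
    size_t off = 0;
    while (off < len) {
        if (idx == 0) {
            return buf + off;
        }
        off += strlen(buf + off) + 1;   /* skip this string and its NUL */
        idx--;
    }
    return NULL;
}
```

`strlen(clocknames)` stops at the first NUL and reports 7, while `sizeof(clocknames)` covers both names and both terminators, so the whole list reaches the property.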

-[PULL 33/37] target/arm: Introduce core_to_aa64_mmu_idx
+[PULL 19/39] hw/arm: versal-virt: Add support for the RTC
-From: Richard Henderson <richard.henderson@linaro.org>
+From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
-If by context we know that we're in AArch64 mode, we need not
+Add support for the RTC.
 test for M-profile when reconstructing the full ARMMMUIdx.
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
-Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Luc Michel <luc.michel@greensocs.com>
-Message-id: 20200302175829.2183-4-richard.henderson@linaro.org
+Message-id: 20200427181649.26851-12-edgar.iglesias@gmail.com
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/internals.h     | 6 ++++++
+ hw/arm/xlnx-versal-virt.c | 22 ++++++++++++++++++++++
- target/arm/translate-a64.c | 2 +-
+file changed, 22 insertions(+)
 files changed, 7 insertions(+), 1 deletion(-)
-diff --git a/target/arm/internals.h b/target/arm/internals.h
+diff --git a/hw/arm/xlnx-versal-virt.c b/hw/arm/xlnx-versal-virt.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/internals.h
+--- a/hw/arm/xlnx-versal-virt.c
-+++ b/target/arm/internals.h
++++ b/hw/arm/xlnx-versal-virt.c
-@@ -XXX,XX +XXX,XX @@ static inline ARMMMUIdx core_to_arm_mmu_idx(CPUARMState *env, int mmu_idx)
+@@ -XXX,XX +XXX,XX @@ static void fdt_add_sd_nodes(VersalVirt *s)
      }
  }
-+static inline ARMMMUIdx core_to_aa64_mmu_idx(int mmu_idx)
++static void fdt_add_rtc_node(VersalVirt *s)
 +{
-+    /* AArch64 is always a-profile. */
++    const char compat[] = "xlnx,zynqmp-rtc";
-+    return mmu_idx | ARM_MMU_IDX_A;
++    const char interrupt_names[] = "alarm\0sec";
 +    char *name = g_strdup_printf("/rtc@%x", MM_PMC_RTC);
 +
 +    qemu_fdt_add_subnode(s->fdt, name);
 +
 +    qemu_fdt_setprop_cells(s->fdt, name, "interrupts",
 +                           GIC_FDT_IRQ_TYPE_SPI, VERSAL_RTC_ALARM_IRQ,
 +                           GIC_FDT_IRQ_FLAGS_LEVEL_HI,
 +                           GIC_FDT_IRQ_TYPE_SPI, VERSAL_RTC_SECONDS_IRQ,
 +                           GIC_FDT_IRQ_FLAGS_LEVEL_HI);
 +    qemu_fdt_setprop(s->fdt, name, "interrupt-names",
 +                     interrupt_names, sizeof(interrupt_names));
 +    qemu_fdt_setprop_sized_cells(s->fdt, name, "reg",
 +                                 2, MM_PMC_RTC, 2, MM_PMC_RTC_SIZE);
 +    qemu_fdt_setprop(s->fdt, name, "compatible", compat, sizeof(compat));
 +    g_free(name);
 +}
 +
- int arm_mmu_idx_to_el(ARMMMUIdx mmu_idx);
+ static void fdt_nop_memory_nodes(void *fdt, Error **errp)
+ {
- /*
+     Error *err = NULL;
-diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
+@@ -XXX,XX +XXX,XX @@ static void versal_virt_init(MachineState *machine)
-index XXXXXXX..XXXXXXX 100644
+     fdt_add_timer_nodes(s);
---- a/target/arm/translate-a64.c
+     fdt_add_zdma_nodes(s);
-+++ b/target/arm/translate-a64.c
+     fdt_add_sd_nodes(s);
-@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_init_disas_context(DisasContextBase *dcbase,
++    fdt_add_rtc_node(s);
-     dc->condexec_mask = 0;
+     fdt_add_cpu_nodes(s, psci_conduit);
-     dc->condexec_cond = 0;
+     fdt_add_clk_node(s, "/clk125", 125000000, s->phandle.clk_125Mhz);
-     core_mmu_idx = FIELD_EX32(tb_flags, TBFLAG_ANY, MMUIDX);
+     fdt_add_clk_node(s, "/clk25", 25000000, s->phandle.clk_25Mhz);
 -    dc->mmu_idx = core_to_arm_mmu_idx(env, core_mmu_idx);
 +    dc->mmu_idx = core_to_aa64_mmu_idx(core_mmu_idx);
      dc->tbii = FIELD_EX32(tb_flags, TBFLAG_A64, TBII);
      dc->tbid = FIELD_EX32(tb_flags, TBFLAG_A64, TBID);
      dc->current_el = arm_mmu_idx_to_el(dc->mmu_idx);
 --
 .20.1

-[PULL 18/37] target/arm: Remove EL2 and EL3 setup from user-only
+[PULL 20/39] target/arm/translate-vfp.inc.c: Remove duplicate simd_r32 check
-From: Richard Henderson <richard.henderson@linaro.org>
+Somewhere along the line we accidentally added a duplicate
 "using D16-D31 when they don't exist" check to do_vfm_dp()
 (probably an artifact of a patch series rebase). Remove it.
-We have disabled EL2 and EL3 for user-only, which means that these
-registers "don't exist" and should not be set.
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200229012811.24129-5-richard.henderson@linaro.org
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
+Message-id: 20200430181003.21682-2-peter.maydell@linaro.org
 ---
- target/arm/cpu.c | 6 ------
+ target/arm/translate-vfp.inc.c | 6 ------
 file changed, 6 deletions(-)
-diff --git a/target/arm/cpu.c b/target/arm/cpu.c
+diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.c
+--- a/target/arm/translate-vfp.inc.c
-+++ b/target/arm/cpu.c
++++ b/target/arm/translate-vfp.inc.c
-@@ -XXX,XX +XXX,XX @@ static void arm_cpu_reset(CPUState *s)
+@@ -XXX,XX +XXX,XX @@ static bool do_vfm_dp(DisasContext *s, arg_VFMA_dp *a, bool neg_n, bool neg_d)
-         /* Enable all PAC keys.  */
+         return false;
-         env->cp15.sctlr_el[1] |= (SCTLR_EnIA | SCTLR_EnIB |
+     }
-                                   SCTLR_EnDA | SCTLR_EnDB);
--        /* Enable all PAC instructions */
+-    /* UNDEF accesses to D16-D31 if they don't exist. */
--        env->cp15.hcr_el2 |= HCR_API;
+-    if (!dc_isar_feature(aa32_simd_r32, s) &&
--        env->cp15.scr_el3 |= SCR_API;
+-        ((a->vd | a->vn | a->vm) & 0x10)) {
-         /* and to the FP/Neon instructions */
+-        return false;
-         env->cp15.cpacr_el1 = deposit64(env->cp15.cpacr_el1, 20, 2, 3);
+-    }
-         /* and to the SVE instructions */
+-
-         env->cp15.cpacr_el1 = deposit64(env->cp15.cpacr_el1, 16, 2, 3);
+     if (!vfp_access_check(s)) {
--        env->cp15.cptr_el[3] |= CPTR_EZ;
+         return true;
-         /* with maximum vector length */
+     }
          env->vfp.zcr_el[1] = cpu_isar_feature(aa64_sve, cpu) ?
                               cpu->sve_max_vq - 1 : 0;
 -        env->vfp.zcr_el[2] = env->vfp.zcr_el[1];
 -        env->vfp.zcr_el[3] = env->vfp.zcr_el[1];
          /*
           * Enable TBI0 and TBI1.  While the real kernel only enables TBI0,
           * turning on both here will produce smaller code and otherwise
 --
 .20.1

-New patch
+[PULL 21/39] target/arm: Don't allow Thumb Neon insns without FEATURE_NEON
+We were accidentally permitting decode of Thumb Neon insns even if
+the CPU didn't have the FEATURE_NEON bit set, because the feature
+check was being done before the call to disas_neon_data_insn() and
+disas_neon_ls_insn() in the Arm decoder but was omitted from the
+Thumb decoder.  Push the feature bit check down into the called
+functions so it is done for both Arm and Thumb encodings.
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
+Message-id: 20200430181003.21682-3-peter.maydell@linaro.org
+---
+ target/arm/translate.c | 16 ++++++++--------
+file changed, 8 insertions(+), 8 deletions(-)
+diff --git a/target/arm/translate.c b/target/arm/translate.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/translate.c
++++ b/target/arm/translate.c
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
+     TCGv_i32 tmp2;
+     TCGv_i64 tmp64;
++    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
++        return 1;
++    }
++
+     /* FIXME: this access check should not take precedence over UNDEF
+      * for invalid encodings; we will generate incorrect syndrome information
+      * for attempts to execute invalid vfp/neon encodings with FP disabled.
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+     TCGv_ptr ptr1, ptr2, ptr3;
+     TCGv_i64 tmp64;
++    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
++        return 1;
++    }
++
+     /* FIXME: this access check should not take precedence over UNDEF
+      * for invalid encodings; we will generate incorrect syndrome information
+      * for attempts to execute invalid vfp/neon encodings with FP disabled.
+@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
+         if (((insn >> 25) & 7) == 1) {
+             /* NEON Data processing.  */
+-            if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+-                goto illegal_op;
+-            }
+-
+             if (disas_neon_data_insn(s, insn)) {
+                 goto illegal_op;
+             }
+@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
+         }
+         if ((insn & 0x0f100000) == 0x04000000) {
+             /* NEON load/store.  */
+-            if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+-                goto illegal_op;
+-            }
+-
+             if (disas_neon_ls_insn(s, insn)) {
+                 goto illegal_op;
+             }
+--
+.20.1
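
The fix in patch 21/39 moves a precondition out of one caller and into the
shared callee, so every entry point gets it. The following is only a minimal
sketch of that pattern and is not QEMU code: DecodeCtx, decode_neon(),
arm_accepts() and thumb_accepts() are hypothetical names, and the actual
decoding of the instruction is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for DisasContext: only the feature bit matters here. */
typedef struct {
    bool has_neon;
} DecodeCtx;

/*
 * Shared Neon decoder. Returns nonzero when the insn must UNDEF, the same
 * convention the legacy disas_neon_*_insn() functions use in the patch.
 * The feature check lives here, so no caller can forget it.
 */
static int decode_neon(const DecodeCtx *s, uint32_t insn)
{
    assert(s != NULL);
    if (!s->has_neon) {
        return 1;
    }
    (void)insn; /* the real decode of the encoding would happen here */
    return 0;
}

/* Both front ends simply forward to the shared decoder. */
static bool arm_accepts(const DecodeCtx *s, uint32_t insn)
{
    return decode_neon(s, insn) == 0;
}

static bool thumb_accepts(const DecodeCtx *s, uint32_t insn)
{
    return decode_neon(s, insn) == 0;
}
```

With the check in the callee, a CPU without the feature bit rejects the
instruction on both the Arm and the Thumb path, which is the behaviour the
commit message asks for.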

-[PULL 20/37] target/arm: Honor the HCR_EL2.{TVM,TRVM} bits
+[PULL 22/39] target/arm: Add stubs for AArch32 Neon decodetree
-From: Richard Henderson <richard.henderson@linaro.org>
+Add the infrastructure for building and invoking a decodetree decoder
+for the AArch32 Neon encodings.  At the moment the new decoder covers
-These bits trap EL1 access to various virtual memory controls.
+nothing, so we always fall back to the existing hand-written decode.
-Buglink: https://bugs.launchpad.net/bugs/1855072
+We follow the same pattern we did for the VFP decodetree conversion
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+(commit 78e138bc1f672c145ef6ace74617d and following): code that deals
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+with Neon will be moving gradually out to translate-neon.inc.c,
-Message-id: 20200229012811.24129-7-richard.henderson@linaro.org
+which we #include into translate.c.
 In order to share the decode files between A32 and T32, we
 split Neon into 3 parts:
  * data-processing
  * load-store
  * 'shared' encodings
 The first two groups of instructions have similar but not identical
 A32 and T32 encodings, so we need to manually transform the T32
 encoding into the A32 one before calling the decoder; the third group
 covers the Neon instructions which are identical in A32 and T32.
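
The manual T32-to-A32 transformation described above can be sketched as a
pair of standalone helpers. The masks and bit patterns are taken from the
encoding comments in this patch; the function names are assumptions made for
illustration, and the patch itself does the equivalent bit manipulation
inline in disas_thumb2_insn().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* T32 Neon data-processing: 0b111p_1111_qqqq_qqqq_qqqq_qqqq_qqqq_qqqq */
static bool is_t32_neon_dp(uint32_t insn)
{
    return (insn & 0xef000000u) == 0xef000000u;
}

/* T32 Neon load/store: 0b1111_1001_ppp0_qqqq_qqqq_qqqq_qqqq_qqqq */
static bool is_t32_neon_ls(uint32_t insn)
{
    return (insn & 0xff100000u) == 0xf9000000u;
}

/*
 * Data-processing: move the 'p' bit from bit 28 down to bit 24 and
 * replace the top byte pattern, giving A32 0b1111_001p_qqqq_...
 */
static uint32_t t32_to_a32_neon_dp(uint32_t insn)
{
    assert(is_t32_neon_dp(insn));
    uint32_t low24 = insn & 0x00ffffffu;
    uint32_t p = (insn >> 28) & 1u;
    return 0xf2000000u | (p << 24) | low24;
}

/*
 * Load/store: only the top byte differs (0xf9 in T32, 0xf4 in A32);
 * the low 24 bits are carried over unchanged.
 */
static uint32_t t32_to_a32_neon_ls(uint32_t insn)
{
    assert(is_t32_neon_ls(insn));
    return (insn & 0x00ffffffu) | 0xf4000000u;
}
```

For example, under these definitions the T32 word 0xff000110 (p = 1) maps to
the A32 word 0xf3000110, and 0xf9200a0f maps to 0xf4200a0f. The 'shared'
group needs no such helper because its encodings are bit-identical in A32
and T32.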
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20200430181003.21682-4-peter.maydell@linaro.org
 ---
- target/arm/helper.c | 82 ++++++++++++++++++++++++++++++---------------
+ target/arm/neon-dp.decode       | 29 ++++++++++++++++++++++++++
-file changed, 55 insertions(+), 27 deletions(-)
+ target/arm/neon-ls.decode       | 29 ++++++++++++++++++++++++++
+ target/arm/neon-shared.decode   | 27 +++++++++++++++++++++++++
-diff --git a/target/arm/helper.c b/target/arm/helper.c
+ target/arm/translate-neon.inc.c | 32 +++++++++++++++++++++++++++++
  target/arm/translate.c          | 36 +++++++++++++++++++++++++++++++--
  target/arm/Makefile.objs        | 18 +++++++++++++++++
 files changed, 169 insertions(+), 2 deletions(-)
  create mode 100644 target/arm/neon-dp.decode
  create mode 100644 target/arm/neon-ls.decode
  create mode 100644 target/arm/neon-shared.decode
  create mode 100644 target/arm/translate-neon.inc.c
 diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 new file mode 100644
 index XXXXXXX..XXXXXXX
 --- /dev/null
 +++ b/target/arm/neon-dp.decode
@@ -XXX,XX +XXX,XX @@
 +# AArch32 Neon data-processing instruction descriptions
 +#
 +#  Copyright (c) 2020 Linaro, Ltd
 +#
 +# This library is free software; you can redistribute it and/or
 +# modify it under the terms of the GNU Lesser General Public
 +# License as published by the Free Software Foundation; either
 +# version 2 of the License, or (at your option) any later version.
 +#
 +# This library is distributed in the hope that it will be useful,
 +# but WITHOUT ANY WARRANTY; without even the implied warranty of
 +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 +# Lesser General Public License for more details.
 +#
 +# You should have received a copy of the GNU Lesser General Public
 +# License along with this library; if not, see <http://www.gnu.org/licenses/>.
 +
 +#
 +# This file is processed by scripts/decodetree.py
 +#
 +
 +# Encodings for Neon data processing instructions where the T32 encoding
 +# is a simple transformation of the A32 encoding.
 +# More specifically, this file covers instructions where the A32 encoding is
 +#   0b1111_001p_qqqq_qqqq_qqqq_qqqq_qqqq_qqqq
 +# and the T32 encoding is
 +#   0b111p_1111_qqqq_qqqq_qqqq_qqqq_qqqq_qqqq
 +# This file works on the A32 encoding only; calling code for T32 has to
 +# transform the insn into the A32 version first.
 diff --git a/target/arm/neon-ls.decode b/target/arm/neon-ls.decode
 new file mode 100644
 index XXXXXXX..XXXXXXX
 --- /dev/null
 +++ b/target/arm/neon-ls.decode
@@ -XXX,XX +XXX,XX @@
 +# AArch32 Neon load/store instruction descriptions
 +#
 +#  Copyright (c) 2020 Linaro, Ltd
 +#
 +# This library is free software; you can redistribute it and/or
 +# modify it under the terms of the GNU Lesser General Public
 +# License as published by the Free Software Foundation; either
 +# version 2 of the License, or (at your option) any later version.
 +#
 +# This library is distributed in the hope that it will be useful,
 +# but WITHOUT ANY WARRANTY; without even the implied warranty of
 +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 +# Lesser General Public License for more details.
 +#
 +# You should have received a copy of the GNU Lesser General Public
 +# License along with this library; if not, see <http://www.gnu.org/licenses/>.
 +
 +#
 +# This file is processed by scripts/decodetree.py
 +#
 +
 +# Encodings for Neon load/store instructions where the T32 encoding
 +# is a simple transformation of the A32 encoding.
 +# More specifically, this file covers instructions where the A32 encoding is
 +#   0b1111_0100_xxx0_xxxx_xxxx_xxxx_xxxx_xxxx
 +# and the T32 encoding is
 +#   0b1111_1001_xxx0_xxxx_xxxx_xxxx_xxxx_xxxx
 +# This file works on the A32 encoding only; calling code for T32 has to
 +# transform the insn into the A32 version first.
 diff --git a/target/arm/neon-shared.decode b/target/arm/neon-shared.decode
 new file mode 100644
 index XXXXXXX..XXXXXXX
 --- /dev/null
 +++ b/target/arm/neon-shared.decode
@@ -XXX,XX +XXX,XX @@
 +# AArch32 Neon instruction descriptions
 +#
 +#  Copyright (c) 2020 Linaro, Ltd
 +#
 +# This library is free software; you can redistribute it and/or
 +# modify it under the terms of the GNU Lesser General Public
 +# License as published by the Free Software Foundation; either
 +# version 2 of the License, or (at your option) any later version.
 +#
 +# This library is distributed in the hope that it will be useful,
 +# but WITHOUT ANY WARRANTY; without even the implied warranty of
 +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 +# Lesser General Public License for more details.
 +#
 +# You should have received a copy of the GNU Lesser General Public
 +# License along with this library; if not, see <http://www.gnu.org/licenses/>.
 +
 +#
 +# This file is processed by scripts/decodetree.py
 +#
 +
 +# Encodings for Neon instructions whose encoding is the same for
 +# both A32 and T32.
 +
 +# More specifically, this covers:
 +# 2reg scalar ext: 0b1111_1110_xxxx_xxxx_xxxx_1x0x_xxxx_xxxx
 +# 3same ext:       0b1111_110x_xxxx_xxxx_xxxx_1x0x_xxxx_xxxx
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 new file mode 100644
 index XXXXXXX..XXXXXXX
 --- /dev/null
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@
 +/*
 + *  ARM translation: AArch32 Neon instructions
 + *
 + *  Copyright (c) 2003 Fabrice Bellard
 + *  Copyright (c) 2005-2007 CodeSourcery
 + *  Copyright (c) 2007 OpenedHand, Ltd.
 + *  Copyright (c) 2020 Linaro, Ltd.
 + *
 + * This library is free software; you can redistribute it and/or
 + * modify it under the terms of the GNU Lesser General Public
 + * License as published by the Free Software Foundation; either
 + * version 2 of the License, or (at your option) any later version.
 + *
 + * This library is distributed in the hope that it will be useful,
 + * but WITHOUT ANY WARRANTY; without even the implied warranty of
 + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 + * Lesser General Public License for more details.
 + *
 + * You should have received a copy of the GNU Lesser General Public
 + * License along with this library; if not, see <http://www.gnu.org/licenses/>.
 + */
 +
 +/*
 + * This file is intended to be included from translate.c; it uses
 + * some macros and definitions provided by that file.
 + * It might be possible to convert it to a standalone .c file eventually.
 + */
 +
 +/* Include the generated Neon decoder */
 +#include "decode-neon-dp.inc.c"
 +#include "decode-neon-ls.inc.c"
 +#include "decode-neon-shared.inc.c"
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
+--- a/target/arm/translate.c
-+++ b/target/arm/helper.c
++++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static CPAccessResult access_tpm(CPUARMState *env, const ARMCPRegInfo *ri,
+@@ -XXX,XX +XXX,XX @@ static TCGv_ptr vfp_reg_ptr(bool dp, int reg)
-     return CP_ACCESS_OK;
- }
+ #define ARM_CP_RW_BIT   (1 << 20)
-+/* Check for traps from EL1 due to HCR_EL2.TVM and HCR_EL2.TRVM.  */
+-/* Include the VFP decoder */
-+static CPAccessResult access_tvm_trvm(CPUARMState *env, const ARMCPRegInfo *ri,
++/* Include the VFP and Neon decoders */
-+                                      bool isread)
+ #include "translate-vfp.inc.c"
-+{
++#include "translate-neon.inc.c"
-+    if (arm_current_el(env) == 1) {
-+        uint64_t trap = isread ? HCR_TRVM : HCR_TVM;
+ static inline void iwmmxt_load_reg(TCGv_i64 var, int reg)
-+        if (arm_hcr_el2_eff(env) & trap) {
+ {
-+            return CP_ACCESS_TRAP_EL2;
+@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
          /* Unconditional instructions.  */
          /* TODO: Perhaps merge these into one decodetree output file.  */
          if (disas_a32_uncond(s, insn) ||
 -            disas_vfp_uncond(s, insn)) {
 +            disas_vfp_uncond(s, insn) ||
 +            disas_neon_dp(s, insn) ||
 +            disas_neon_ls(s, insn) ||
 +            disas_neon_shared(s, insn)) {
              return;
          }
          /* fall back to legacy decoder */
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
          ARCH(6T2);
      }
 +    if ((insn & 0xef000000) == 0xef000000) {
 +        /*
 +         * T32 encodings 0b111p_1111_qqqq_qqqq_qqqq_qqqq_qqqq_qqqq
 +         * transform into
 +         * A32 encodings 0b1111_001p_qqqq_qqqq_qqqq_qqqq_qqqq_qqqq
 +         */
 +        uint32_t a32_insn = (insn & 0xe2ffffff) |
 +            ((insn & (1 << 28)) >> 4) | (1 << 28);
 +
 +        if (disas_neon_dp(s, a32_insn)) {
 +            return;
 +        }
 +    }
-+    return CP_ACCESS_OK;
++
-+}
++    if ((insn & 0xff100000) == 0xf9000000) {
-+
++        /*
- static void dacr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
++         * T32 encodings 0b1111_1001_ppp0_qqqq_qqqq_qqqq_qqqq_qqqq
- {
++         * transform into
-     ARMCPU *cpu = env_archcpu(env);
++         * A32 encodings 0b1111_0100_ppp0_qqqq_qqqq_qqqq_qqqq_qqqq
-@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo cp_reginfo[] = {
++         */
 +        uint32_t a32_insn = (insn & 0x00ffffff) | 0xf4000000;
 +
 +        if (disas_neon_ls(s, a32_insn)) {
 +            return;
 +        }
 +    }
 +
      /*
       * TODO: Perhaps merge these into one decodetree output file.
       * Note disas_vfp is written for a32 with cond field in the
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
       */
-     { .name = "CONTEXTIDR_EL1", .state = ARM_CP_STATE_BOTH,
+     if (disas_t32(s, insn) ||
-       .opc0 = 3, .opc1 = 0, .crn = 13, .crm = 0, .opc2 = 1,
+         disas_vfp_uncond(s, insn) ||
--      .access = PL1_RW, .secure = ARM_CP_SECSTATE_NS,
++        disas_neon_shared(s, insn) ||
-+      .access = PL1_RW, .accessfn = access_tvm_trvm,
+         ((insn >> 28) == 0xe && disas_vfp(s, insn))) {
-+      .secure = ARM_CP_SECSTATE_NS,
+         return;
-       .fieldoffset = offsetof(CPUARMState, cp15.contextidr_el[1]),
+     }
-       .resetvalue = 0, .writefn = contextidr_write, .raw_writefn = raw_write, },
+diff --git a/target/arm/Makefile.objs b/target/arm/Makefile.objs
-     { .name = "CONTEXTIDR_S", .state = ARM_CP_STATE_AA32,
+index XXXXXXX..XXXXXXX 100644
-       .cp = 15, .opc1 = 0, .crn = 13, .crm = 0, .opc2 = 1,
+--- a/target/arm/Makefile.objs
--      .access = PL1_RW, .secure = ARM_CP_SECSTATE_S,
++++ b/target/arm/Makefile.objs
-+      .access = PL1_RW, .accessfn = access_tvm_trvm,
+@@ -XXX,XX +XXX,XX @@ target/arm/decode-sve.inc.c: $(SRC_PATH)/target/arm/sve.decode $(DECODETREE)
-+      .secure = ARM_CP_SECSTATE_S,
+       $(PYTHON) $(DECODETREE) --decode disas_sve -o $@ $<,\
-       .fieldoffset = offsetof(CPUARMState, cp15.contextidr_s),
+       "GEN", $(TARGET_DIR)$@)
-       .resetvalue = 0, .writefn = contextidr_write, .raw_writefn = raw_write, },
-     REGINFO_SENTINEL
++target/arm/decode-neon-shared.inc.c: $(SRC_PATH)/target/arm/neon-shared.decode $(DECODETREE)
-@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo not_v8_cp_reginfo[] = {
++    $(call quiet-command,\
-     /* MMU Domain access control / MPU write buffer control */
++      $(PYTHON) $(DECODETREE) --static-decode disas_neon_shared -o $@ $<,\
-     { .name = "DACR",
++      "GEN", $(TARGET_DIR)$@)
-       .cp = 15, .opc1 = CP_ANY, .crn = 3, .crm = CP_ANY, .opc2 = CP_ANY,
++
--      .access = PL1_RW, .resetvalue = 0,
++target/arm/decode-neon-dp.inc.c: $(SRC_PATH)/target/arm/neon-dp.decode $(DECODETREE)
-+      .access = PL1_RW, .accessfn = access_tvm_trvm, .resetvalue = 0,
++    $(call quiet-command,\
-       .writefn = dacr_write, .raw_writefn = raw_write,
++      $(PYTHON) $(DECODETREE) --static-decode disas_neon_dp -o $@ $<,\
-       .bank_fieldoffsets = { offsetoflow32(CPUARMState, cp15.dacr_s),
++      "GEN", $(TARGET_DIR)$@)
-                              offsetoflow32(CPUARMState, cp15.dacr_ns) } },
++
-@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v6_cp_reginfo[] = {
++target/arm/decode-neon-ls.inc.c: $(SRC_PATH)/target/arm/neon-ls.decode $(DECODETREE)
-     { .name = "DMB", .cp = 15, .crn = 7, .crm = 10, .opc1 = 0, .opc2 = 5,
++    $(call quiet-command,\
-       .access = PL0_W, .type = ARM_CP_NOP },
++      $(PYTHON) $(DECODETREE) --static-decode disas_neon_ls -o $@ $<,\
-     { .name = "IFAR", .cp = 15, .crn = 6, .crm = 0, .opc1 = 0, .opc2 = 2,
++      "GEN", $(TARGET_DIR)$@)
--      .access = PL1_RW,
++
-+      .access = PL1_RW, .accessfn = access_tvm_trvm,
+ target/arm/decode-vfp.inc.c: $(SRC_PATH)/target/arm/vfp.decode $(DECODETREE)
-       .bank_fieldoffsets = { offsetof(CPUARMState, cp15.ifar_s),
+     $(call quiet-command,\
-                              offsetof(CPUARMState, cp15.ifar_ns) },
+       $(PYTHON) $(DECODETREE) --static-decode disas_vfp -o $@ $<,\
-       .resetvalue = 0, },
+@@ -XXX,XX +XXX,XX @@ target/arm/decode-t16.inc.c: $(SRC_PATH)/target/arm/t16.decode $(DECODETREE)
-@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
+       "GEN", $(TARGET_DIR)$@)
-      */
-     { .name = "AFSR0_EL1", .state = ARM_CP_STATE_BOTH,
+ target/arm/translate-sve.o: target/arm/decode-sve.inc.c
-       .opc0 = 3, .opc1 = 0, .crn = 5, .crm = 1, .opc2 = 0,
++target/arm/translate.o: target/arm/decode-neon-shared.inc.c
--      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
++target/arm/translate.o: target/arm/decode-neon-dp.inc.c
-+      .access = PL1_RW, .accessfn = access_tvm_trvm,
++target/arm/translate.o: target/arm/decode-neon-ls.inc.c
-+      .type = ARM_CP_CONST, .resetvalue = 0 },
+ target/arm/translate.o: target/arm/decode-vfp.inc.c
-     { .name = "AFSR1_EL1", .state = ARM_CP_STATE_BOTH,
+ target/arm/translate.o: target/arm/decode-vfp-uncond.inc.c
-       .opc0 = 3, .opc1 = 0, .crn = 5, .crm = 1, .opc2 = 1,
+ target/arm/translate.o: target/arm/decode-a32.inc.c
 -      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
 +      .access = PL1_RW, .accessfn = access_tvm_trvm,
 +      .type = ARM_CP_CONST, .resetvalue = 0 },
      /* MAIR can just read-as-written because we don't implement caches
       * and so don't need to care about memory attributes.
       */
      { .name = "MAIR_EL1", .state = ARM_CP_STATE_AA64,
        .opc0 = 3, .opc1 = 0, .crn = 10, .crm = 2, .opc2 = 0,
 -      .access = PL1_RW, .fieldoffset = offsetof(CPUARMState, cp15.mair_el[1]),
 +      .access = PL1_RW, .accessfn = access_tvm_trvm,
 +      .fieldoffset = offsetof(CPUARMState, cp15.mair_el[1]),
        .resetvalue = 0 },
      { .name = "MAIR_EL3", .state = ARM_CP_STATE_AA64,
        .opc0 = 3, .opc1 = 6, .crn = 10, .crm = 2, .opc2 = 0,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
        * handled in the field definitions.
        */
      { .name = "MAIR0", .state = ARM_CP_STATE_AA32,
 -      .cp = 15, .opc1 = 0, .crn = 10, .crm = 2, .opc2 = 0, .access = PL1_RW,
 +      .cp = 15, .opc1 = 0, .crn = 10, .crm = 2, .opc2 = 0,
 +      .access = PL1_RW, .accessfn = access_tvm_trvm,
        .bank_fieldoffsets = { offsetof(CPUARMState, cp15.mair0_s),
                               offsetof(CPUARMState, cp15.mair0_ns) },
        .resetfn = arm_cp_reset_ignore },
      { .name = "MAIR1", .state = ARM_CP_STATE_AA32,
 -      .cp = 15, .opc1 = 0, .crn = 10, .crm = 2, .opc2 = 1, .access = PL1_RW,
 +      .cp = 15, .opc1 = 0, .crn = 10, .crm = 2, .opc2 = 1,
 +      .access = PL1_RW, .accessfn = access_tvm_trvm,
        .bank_fieldoffsets = { offsetof(CPUARMState, cp15.mair1_s),
                               offsetof(CPUARMState, cp15.mair1_ns) },
        .resetfn = arm_cp_reset_ignore },
@@ -XXX,XX +XXX,XX @@ static void vttbr_write(CPUARMState *env, const ARMCPRegInfo *ri,
  static const ARMCPRegInfo vmsa_pmsa_cp_reginfo[] = {
      { .name = "DFSR", .cp = 15, .crn = 5, .crm = 0, .opc1 = 0, .opc2 = 0,
 -      .access = PL1_RW, .type = ARM_CP_ALIAS,
 +      .access = PL1_RW, .accessfn = access_tvm_trvm, .type = ARM_CP_ALIAS,
        .bank_fieldoffsets = { offsetoflow32(CPUARMState, cp15.dfsr_s),
                               offsetoflow32(CPUARMState, cp15.dfsr_ns) }, },
      { .name = "IFSR", .cp = 15, .crn = 5, .crm = 0, .opc1 = 0, .opc2 = 1,
 -      .access = PL1_RW, .resetvalue = 0,
 +      .access = PL1_RW, .accessfn = access_tvm_trvm, .resetvalue = 0,
        .bank_fieldoffsets = { offsetoflow32(CPUARMState, cp15.ifsr_s),
                               offsetoflow32(CPUARMState, cp15.ifsr_ns) } },
      { .name = "DFAR", .cp = 15, .opc1 = 0, .crn = 6, .crm = 0, .opc2 = 0,
 -      .access = PL1_RW, .resetvalue = 0,
 +      .access = PL1_RW, .accessfn = access_tvm_trvm, .resetvalue = 0,
        .bank_fieldoffsets = { offsetof(CPUARMState, cp15.dfar_s),
                               offsetof(CPUARMState, cp15.dfar_ns) } },
      { .name = "FAR_EL1", .state = ARM_CP_STATE_AA64,
        .opc0 = 3, .crn = 6, .crm = 0, .opc1 = 0, .opc2 = 0,
 -      .access = PL1_RW, .fieldoffset = offsetof(CPUARMState, cp15.far_el[1]),
 +      .access = PL1_RW, .accessfn = access_tvm_trvm,
 +      .fieldoffset = offsetof(CPUARMState, cp15.far_el[1]),
        .resetvalue = 0, },
      REGINFO_SENTINEL
  };
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo vmsa_pmsa_cp_reginfo[] = {
  static const ARMCPRegInfo vmsa_cp_reginfo[] = {
      { .name = "ESR_EL1", .state = ARM_CP_STATE_AA64,
        .opc0 = 3, .crn = 5, .crm = 2, .opc1 = 0, .opc2 = 0,
 -      .access = PL1_RW,
 +      .access = PL1_RW, .accessfn = access_tvm_trvm,
        .fieldoffset = offsetof(CPUARMState, cp15.esr_el[1]), .resetvalue = 0, },
      { .name = "TTBR0_EL1", .state = ARM_CP_STATE_BOTH,
        .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 0, .opc2 = 0,
 -      .access = PL1_RW, .writefn = vmsa_ttbr_write, .resetvalue = 0,
 +      .access = PL1_RW, .accessfn = access_tvm_trvm,
 +      .writefn = vmsa_ttbr_write, .resetvalue = 0,
        .bank_fieldoffsets = { offsetof(CPUARMState, cp15.ttbr0_s),
                               offsetof(CPUARMState, cp15.ttbr0_ns) } },
      { .name = "TTBR1_EL1", .state = ARM_CP_STATE_BOTH,
        .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 0, .opc2 = 1,
 -      .access = PL1_RW, .writefn = vmsa_ttbr_write, .resetvalue = 0,
 +      .access = PL1_RW, .accessfn = access_tvm_trvm,
 +      .writefn = vmsa_ttbr_write, .resetvalue = 0,
        .bank_fieldoffsets = { offsetof(CPUARMState, cp15.ttbr1_s),
                               offsetof(CPUARMState, cp15.ttbr1_ns) } },
      { .name = "TCR_EL1", .state = ARM_CP_STATE_AA64,
        .opc0 = 3, .crn = 2, .crm = 0, .opc1 = 0, .opc2 = 2,
 -      .access = PL1_RW, .writefn = vmsa_tcr_el12_write,
 +      .access = PL1_RW, .accessfn = access_tvm_trvm,
 +      .writefn = vmsa_tcr_el12_write,
        .resetfn = vmsa_ttbcr_reset, .raw_writefn = raw_write,
        .fieldoffset = offsetof(CPUARMState, cp15.tcr_el[1]) },
      { .name = "TTBCR", .cp = 15, .crn = 2, .crm = 0, .opc1 = 0, .opc2 = 2,
 -      .access = PL1_RW, .type = ARM_CP_ALIAS, .writefn = vmsa_ttbcr_write,
 +      .access = PL1_RW, .accessfn = access_tvm_trvm,
 +      .type = ARM_CP_ALIAS, .writefn = vmsa_ttbcr_write,
        .raw_writefn = vmsa_ttbcr_raw_write,
        .bank_fieldoffsets = { offsetoflow32(CPUARMState, cp15.tcr_el[3]),
                               offsetoflow32(CPUARMState, cp15.tcr_el[1])} },
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo vmsa_cp_reginfo[] = {
   */
  static const ARMCPRegInfo ttbcr2_reginfo = {
      .name = "TTBCR2", .cp = 15, .opc1 = 0, .crn = 2, .crm = 0, .opc2 = 3,
 -    .access = PL1_RW, .type = ARM_CP_ALIAS,
 +    .access = PL1_RW, .accessfn = access_tvm_trvm,
 +    .type = ARM_CP_ALIAS,
      .bank_fieldoffsets = { offsetofhigh32(CPUARMState, cp15.tcr_el[3]),
                             offsetofhigh32(CPUARMState, cp15.tcr_el[1]) },
  };
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo lpae_cp_reginfo[] = {
      /* NOP AMAIR0/1 */
      { .name = "AMAIR0", .state = ARM_CP_STATE_BOTH,
        .opc0 = 3, .crn = 10, .crm = 3, .opc1 = 0, .opc2 = 0,
 -      .access = PL1_RW, .type = ARM_CP_CONST,
 -      .resetvalue = 0 },
 +      .access = PL1_RW, .accessfn = access_tvm_trvm,
 +      .type = ARM_CP_CONST, .resetvalue = 0 },
      /* AMAIR1 is mapped to AMAIR_EL1[63:32] */
      { .name = "AMAIR1", .cp = 15, .crn = 10, .crm = 3, .opc1 = 0, .opc2 = 1,
 -      .access = PL1_RW, .type = ARM_CP_CONST,
 -      .resetvalue = 0 },
 +      .access = PL1_RW, .accessfn = access_tvm_trvm,
 +      .type = ARM_CP_CONST, .resetvalue = 0 },
      { .name = "PAR", .cp = 15, .crm = 7, .opc1 = 0,
        .access = PL1_RW, .type = ARM_CP_64BIT, .resetvalue = 0,
        .bank_fieldoffsets = { offsetof(CPUARMState, cp15.par_s),
                               offsetof(CPUARMState, cp15.par_ns)} },
      { .name = "TTBR0", .cp = 15, .crm = 2, .opc1 = 0,
 -      .access = PL1_RW, .type = ARM_CP_64BIT | ARM_CP_ALIAS,
 +      .access = PL1_RW, .accessfn = access_tvm_trvm,
 +      .type = ARM_CP_64BIT | ARM_CP_ALIAS,
        .bank_fieldoffsets = { offsetof(CPUARMState, cp15.ttbr0_s),
                               offsetof(CPUARMState, cp15.ttbr0_ns) },
        .writefn = vmsa_ttbr_write, },
      { .name = "TTBR1", .cp = 15, .crm = 2, .opc1 = 1,
 -      .access = PL1_RW, .type = ARM_CP_64BIT | ARM_CP_ALIAS,
 +      .access = PL1_RW, .accessfn = access_tvm_trvm,
 +      .type = ARM_CP_64BIT | ARM_CP_ALIAS,
        .bank_fieldoffsets = { offsetof(CPUARMState, cp15.ttbr1_s),
                               offsetof(CPUARMState, cp15.ttbr1_ns) },
        .writefn = vmsa_ttbr_write, },
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
        .type = ARM_CP_NOP, .access = PL1_W },
      /* MMU Domain access control / MPU write buffer control */
      { .name = "DACR", .cp = 15, .opc1 = 0, .crn = 3, .crm = 0, .opc2 = 0,
 -      .access = PL1_RW, .resetvalue = 0,
 +      .access = PL1_RW, .accessfn = access_tvm_trvm, .resetvalue = 0,
        .writefn = dacr_write, .raw_writefn = raw_write,
        .bank_fieldoffsets = { offsetoflow32(CPUARMState, cp15.dacr_s),
                               offsetoflow32(CPUARMState, cp15.dacr_ns) } },
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
          ARMCPRegInfo sctlr = {
              .name = "SCTLR", .state = ARM_CP_STATE_BOTH,
              .opc0 = 3, .opc1 = 0, .crn = 1, .crm = 0, .opc2 = 0,
 -            .access = PL1_RW,
 +            .access = PL1_RW, .accessfn = access_tvm_trvm,
              .bank_fieldoffsets = { offsetof(CPUARMState, cp15.sctlr_s),
                                     offsetof(CPUARMState, cp15.sctlr_ns) },
              .writefn = sctlr_write, .resetvalue = cpu->reset_sctlr,
 --
 .20.1

-[PULL 30/37] hw/arm/cubieboard: report error when using unsupported -bios argument
+[PULL 23/39] target/arm: Convert VCMLA (vector) to decodetree
-From: Niek Linnenbank <nieklinnenbank@gmail.com>
+Convert the VCMLA (vector) insns in the 3same extension group to
 decodetree.
-The Cubieboard machine does not support the -bios argument.
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Report an error when -bios is used and exit immediately.
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
 Message-id: 20200430181003.21682-5-peter.maydell@linaro.org
 ---
  target/arm/neon-shared.decode   | 11 ++++++++++
  target/arm/translate-neon.inc.c | 37 +++++++++++++++++++++++++++++++++
  target/arm/translate.c          | 11 +---------
 files changed, 49 insertions(+), 10 deletions(-)
-Signed-off-by: Niek Linnenbank <nieklinnenbank@gmail.com>
+diff --git a/target/arm/neon-shared.decode b/target/arm/neon-shared.decode
 Message-id: 20200227220149.6845-5-nieklinnenbank@gmail.com
 Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
  hw/arm/cubieboard.c | 7 +++++++
 file changed, 7 insertions(+)
 diff --git a/hw/arm/cubieboard.c b/hw/arm/cubieboard.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/cubieboard.c
+--- a/target/arm/neon-shared.decode
-+++ b/hw/arm/cubieboard.c
++++ b/target/arm/neon-shared.decode
 @@ -XXX,XX +XXX,XX @@
- #include "exec/address-spaces.h"
+ # More specifically, this covers:
- #include "qapi/error.h"
+ # 2reg scalar ext: 0b1111_1110_xxxx_xxxx_xxxx_1x0x_xxxx_xxxx
- #include "cpu.h"
+ # 3same ext:       0b1111_110x_xxxx_xxxx_xxxx_1x0x_xxxx_xxxx
-+#include "sysemu/sysemu.h"
++
- #include "hw/sysbus.h"
++# VFP/Neon register fields; same as vfp.decode
- #include "hw/boards.h"
++%vm_dp  5:1 0:4
- #include "hw/arm/allwinner-a10.h"
++%vm_sp  0:4 5:1
-@@ -XXX,XX +XXX,XX @@ static void cubieboard_init(MachineState *machine)
++%vn_dp  7:1 16:4
-     AwA10State *a10;
++%vn_sp  16:4 7:1
-     Error *err = NULL;
++%vd_dp  22:1 12:4
++%vd_sp  12:4 22:1
-+    /* BIOS is not supported by this board */
++
-+    if (bios_name) {
++VCMLA          1111 110 rot:2 . 1 size:1 .... .... 1000 . q:1 . 0 .... \
-+        error_report("BIOS not supported for this machine");
++               vm=%vm_dp vn=%vn_dp vd=%vd_dp
-+        exit(1);
+diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@
  #include "decode-neon-dp.inc.c"
  #include "decode-neon-ls.inc.c"
  #include "decode-neon-shared.inc.c"
 +
 +static bool trans_VCMLA(DisasContext *s, arg_VCMLA *a)
 +{
 +    int opr_sz;
 +    TCGv_ptr fpst;
 +    gen_helper_gvec_3_ptr *fn_gvec_ptr;
 +
 +    if (!dc_isar_feature(aa32_vcma, s)
 +        || (!a->size && !dc_isar_feature(aa32_fp16_arith, s))) {
 +        return false;
 +    }
 +
-     /* This board has fixed size RAM (512MiB or 1GiB) */
++    /* UNDEF accesses to D16-D31 if they don't exist. */
-     if (machine->ram_size != 512 * MiB &&
++    if (!dc_isar_feature(aa32_simd_r32, s) &&
-         machine->ram_size != 1 * GiB) {
++        ((a->vd | a->vn | a->vm) & 0x10)) {
 +        return false;
 +    }
 +
 +    if ((a->vn | a->vm | a->vd) & a->q) {
 +        return false;
 +    }
 +
 +    if (!vfp_access_check(s)) {
 +        return true;
 +    }
 +
 +    opr_sz = (1 + a->q) * 8;
 +    fpst = get_fpstatus_ptr(1);
 +    fn_gvec_ptr = a->size ? gen_helper_gvec_fcmlas : gen_helper_gvec_fcmlah;
 +    tcg_gen_gvec_3_ptr(vfp_reg_offset(1, a->vd),
 +                       vfp_reg_offset(1, a->vn),
 +                       vfp_reg_offset(1, a->vm),
 +                       fpst, opr_sz, opr_sz, a->rot,
 +                       fn_gvec_ptr);
 +    tcg_temp_free_ptr(fpst);
 +    return true;
 +}
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
      bool is_long = false, q = extract32(insn, 6, 1);
      bool ptr_is_env = false;
 -    if ((insn & 0xfe200f10) == 0xfc200800) {
 -        /* VCMLA -- 1111 110R R.1S .... .... 1000 ...0 .... */
 -        int size = extract32(insn, 20, 1);
 -        data = extract32(insn, 23, 2); /* rot */
 -        if (!dc_isar_feature(aa32_vcma, s)
 -            || (!size && !dc_isar_feature(aa32_fp16_arith, s))) {
 -            return 1;
 -        }
 -        fn_gvec_ptr = size ? gen_helper_gvec_fcmlas : gen_helper_gvec_fcmlah;
 -    } else if ((insn & 0xfea00f10) == 0xfc800800) {
 +    if ((insn & 0xfea00f10) == 0xfc800800) {
          /* VCADD -- 1111 110R 1.0S .... .... 1000 ...0 .... */
          int size = extract32(insn, 20, 1);
          data = extract32(insn, 24, 1); /* rot */
 --
 .20.1
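
The `%vm_dp`, `%vm_sp`, `%vn_dp` and `%vd_dp` field definitions added to neon-shared.decode in the patch above each concatenate two instruction bit ranges, with the first-listed range as the most significant part. The following is a minimal sketch of what those definitions compute, not the code the decodetree script generates. The `field_*` helper names are hypothetical, and `extract32()` is a local stand-in written for this example rather than QEMU's own implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in: return bits [start, start + length) of value. */
static inline uint32_t extract32(uint32_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 32 - start);
    return (value >> start) & (~0U >> (32 - length));
}

/* %vm_dp 5:1 0:4 : M is the top bit, Vm the low four bits (D register). */
static inline int field_vm_dp(uint32_t insn)
{
    return (int)((extract32(insn, 5, 1) << 4) | extract32(insn, 0, 4));
}

/* %vm_sp 0:4 5:1 : Vm is the top four bits, M the low bit (S register). */
static inline int field_vm_sp(uint32_t insn)
{
    return (int)((extract32(insn, 0, 4) << 1) | extract32(insn, 5, 1));
}

/* %vn_dp 7:1 16:4 : N is the top bit, Vn the low four bits. */
static inline int field_vn_dp(uint32_t insn)
{
    return (int)((extract32(insn, 7, 1) << 4) | extract32(insn, 16, 4));
}

/* %vd_dp 22:1 12:4 : D is the top bit, Vd the low four bits. */
static inline int field_vd_dp(uint32_t insn)
{
    return (int)((extract32(insn, 22, 1) << 4) | extract32(insn, 12, 4));
}
```

The same two bit ranges therefore yield different register numbers depending on the order in which they are listed, which is why the `_dp` and `_sp` variants exist side by side.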

-New patch
+[PULL 24/39] target/arm: Convert VCADD (vector) to decodetree
+Convert the VCADD (vector) insns to decodetree.
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20200430181003.21682-6-peter.maydell@linaro.org
+---
+ target/arm/neon-shared.decode   |  3 +++
+ target/arm/translate-neon.inc.c | 37 +++++++++++++++++++++++++++++++++
+ target/arm/translate.c          | 11 +---------
+ 3 files changed, 41 insertions(+), 10 deletions(-)
+diff --git a/target/arm/neon-shared.decode b/target/arm/neon-shared.decode
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/neon-shared.decode
++++ b/target/arm/neon-shared.decode
+@@ -XXX,XX +XXX,XX @@
+ VCMLA          1111 110 rot:2 . 1 size:1 .... .... 1000 . q:1 . 0 .... \
+                vm=%vm_dp vn=%vn_dp vd=%vd_dp
++
++VCADD          1111 110 rot:1 1 . 0 size:1 .... .... 1000 . q:1 . 0 .... \
++               vm=%vm_dp vn=%vn_dp vd=%vd_dp
+diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/translate-neon.inc.c
++++ b/target/arm/translate-neon.inc.c
+@@ -XXX,XX +XXX,XX @@ static bool trans_VCMLA(DisasContext *s, arg_VCMLA *a)
+     tcg_temp_free_ptr(fpst);
+     return true;
+ }
++
++static bool trans_VCADD(DisasContext *s, arg_VCADD *a)
++{
++    int opr_sz;
++    TCGv_ptr fpst;
++    gen_helper_gvec_3_ptr *fn_gvec_ptr;
++
++    if (!dc_isar_feature(aa32_vcma, s)
++        || (!a->size && !dc_isar_feature(aa32_fp16_arith, s))) {
++        return false;
++    }
++
++    /* UNDEF accesses to D16-D31 if they don't exist. */
++    if (!dc_isar_feature(aa32_simd_r32, s) &&
++        ((a->vd | a->vn | a->vm) & 0x10)) {
++        return false;
++    }
++
++    if ((a->vn | a->vm | a->vd) & a->q) {
++        return false;
++    }
++
++    if (!vfp_access_check(s)) {
++        return true;
++    }
++
++    opr_sz = (1 + a->q) * 8;
++    fpst = get_fpstatus_ptr(1);
++    fn_gvec_ptr = a->size ? gen_helper_gvec_fcadds : gen_helper_gvec_fcaddh;
++    tcg_gen_gvec_3_ptr(vfp_reg_offset(1, a->vd),
++                       vfp_reg_offset(1, a->vn),
++                       vfp_reg_offset(1, a->vm),
++                       fpst, opr_sz, opr_sz, a->rot,
++                       fn_gvec_ptr);
++    tcg_temp_free_ptr(fpst);
++    return true;
++}
+diff --git a/target/arm/translate.c b/target/arm/translate.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/translate.c
++++ b/target/arm/translate.c
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
+     bool is_long = false, q = extract32(insn, 6, 1);
+     bool ptr_is_env = false;
+-    if ((insn & 0xfea00f10) == 0xfc800800) {
+-        /* VCADD -- 1111 110R 1.0S .... .... 1000 ...0 .... */
+-        int size = extract32(insn, 20, 1);
+-        data = extract32(insn, 24, 1); /* rot */
+-        if (!dc_isar_feature(aa32_vcma, s)
+-            || (!size && !dc_isar_feature(aa32_fp16_arith, s))) {
+-            return 1;
+-        }
+-        fn_gvec_ptr = size ? gen_helper_gvec_fcadds : gen_helper_gvec_fcaddh;
+-    } else if ((insn & 0xfeb00f00) == 0xfc200d00) {
++    if ((insn & 0xfeb00f00) == 0xfc200d00) {
+         /* V[US]DOT -- 1111 1100 0.10 .... .... 1101 .Q.U .... */
+         bool u = extract32(insn, 4, 1);
+         if (!dc_isar_feature(aa32_dp, s)) {
+--
+.20.1
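
The VCMLA and VCADD patterns in neon-shared.decode encode the same constraints as the mask/match tests removed from translate.c. The sketch below illustrates that correspondence with a deliberately simplified pattern reader: it only understands `0`, `1`, `.` and `name:width` tokens, and `BitPattern`, `push_bit()` and `parse_pattern()` are hypothetical names invented for this example. It is not how the real decodetree script is implemented.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    uint32_t mask;   /* 1 where the pattern fixes a bit */
    uint32_t match;  /* required value of the fixed bits */
    int nbits;       /* bits consumed so far; 32 for a full insn */
} BitPattern;

/* Append one bit at the least significant end. */
static void push_bit(BitPattern *bp, int fixed, int value)
{
    assert(bp->nbits < 32);
    bp->mask = (bp->mask << 1) | (uint32_t)fixed;
    bp->match = (bp->match << 1) | (uint32_t)value;
    bp->nbits++;
}

/* Read a space-separated pattern, most significant bit first. */
static BitPattern parse_pattern(const char *p)
{
    BitPattern bp = { 0, 0, 0 };

    while (*p) {
        size_t len = strcspn(p, " ");
        const char *colon = memchr(p, ':', len);

        if (colon) {
            /* "name:width" is an unconstrained field of that width. */
            long width = strtol(colon + 1, NULL, 10);
            for (long i = 0; i < width; i++) {
                push_bit(&bp, 0, 0);
            }
        } else {
            for (size_t i = 0; i < len; i++) {
                if (p[i] == '0' || p[i] == '1') {
                    push_bit(&bp, 1, p[i] == '1');
                } else {
                    push_bit(&bp, 0, 0);   /* '.' is a don't-care bit */
                }
            }
        }
        p += len;
        p += strspn(p, " ");
    }
    return bp;
}
```

Feeding it the VCADD pattern text gives a mask of 0xfea00f10 and a match value of 0xfc800800, the same constants as the removed `(insn & 0xfea00f10) == 0xfc800800` test.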

-[PULL 29/37] hw/arm/cubieboard: restrict allowed RAM size to 512MiB and 1GiB
+[PULL 25/39] target/arm: Convert V[US]DOT (vector) to decodetree
-From: Niek Linnenbank <nieklinnenbank@gmail.com>
+Convert the V[US]DOT (vector) insns to decodetree.
-The Cubieboard contains either 512MiB or 1GiB of onboard RAM [1].
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Prevent changing RAM to a different size which could break user programs.
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
 Message-id: 20200430181003.21682-7-peter.maydell@linaro.org
 ---
  target/arm/neon-shared.decode   |  4 ++++
  target/arm/translate-neon.inc.c | 32 ++++++++++++++++++++++++++++++++
  target/arm/translate.c          |  9 +--------
  3 files changed, 37 insertions(+), 8 deletions(-)
- [1] http://linux-sunxi.org/Cubieboard
+diff --git a/target/arm/neon-shared.decode b/target/arm/neon-shared.decode
 Signed-off-by: Niek Linnenbank <nieklinnenbank@gmail.com>
 Message-id: 20200227220149.6845-4-nieklinnenbank@gmail.com
 Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
  hw/arm/cubieboard.c | 8 ++++++++
  1 file changed, 8 insertions(+)
 diff --git a/hw/arm/cubieboard.c b/hw/arm/cubieboard.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/cubieboard.c
+--- a/target/arm/neon-shared.decode
-+++ b/hw/arm/cubieboard.c
++++ b/target/arm/neon-shared.decode
-@@ -XXX,XX +XXX,XX @@ static void cubieboard_init(MachineState *machine)
+@@ -XXX,XX +XXX,XX @@ VCMLA          1111 110 rot:2 . 1 size:1 .... .... 1000 . q:1 . 0 .... \
-     AwA10State *a10;
-     Error *err = NULL;
+ VCADD          1111 110 rot:1 1 . 0 size:1 .... .... 1000 . q:1 . 0 .... \
+                vm=%vm_dp vn=%vn_dp vd=%vd_dp
-+    /* This board has fixed size RAM (512MiB or 1GiB) */
++
-+    if (machine->ram_size != 512 * MiB &&
++# VUDOT and VSDOT
-+        machine->ram_size != 1 * GiB) {
++VDOT           1111 110 00 . 10 .... .... 1101 . q:1 . u:1 .... \
-+        error_report("This machine can only be used with 512MiB or 1GiB RAM");
++               vm=%vm_dp vn=%vn_dp vd=%vd_dp
-+        exit(1);
+diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VCADD(DisasContext *s, arg_VCADD *a)
      tcg_temp_free_ptr(fpst);
      return true;
  }
 +
 +static bool trans_VDOT(DisasContext *s, arg_VDOT *a)
 +{
 +    int opr_sz;
 +    gen_helper_gvec_3 *fn_gvec;
 +
 +    if (!dc_isar_feature(aa32_dp, s)) {
 +        return false;
 +    }
 +
-     /* Only allow Cortex-A8 for this board */
++    /* UNDEF accesses to D16-D31 if they don't exist. */
-     if (strcmp(machine->cpu_type, ARM_CPU_TYPE_NAME("cortex-a8")) != 0) {
++    if (!dc_isar_feature(aa32_simd_r32, s) &&
-         error_report("This board can only be used with cortex-a8 CPU");
++        ((a->vd | a->vn | a->vm) & 0x10)) {
-@@ -XXX,XX +XXX,XX @@ static void cubieboard_machine_init(MachineClass *mc)
++        return false;
- {
++    }
-     mc->desc = "cubietech cubieboard (Cortex-A8)";
++
-     mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-a8");
++    if ((a->vn | a->vm | a->vd) & a->q) {
-+    mc->default_ram_size = 1 * GiB;
++        return false;
-     mc->init = cubieboard_init;
++    }
-     mc->block_default_type = IF_IDE;
++
-     mc->units_per_default_bus = 1;
++    if (!vfp_access_check(s)) {
 +        return true;
 +    }
 +
 +    opr_sz = (1 + a->q) * 8;
 +    fn_gvec = a->u ? gen_helper_gvec_udot_b : gen_helper_gvec_sdot_b;
 +    tcg_gen_gvec_3_ool(vfp_reg_offset(1, a->vd),
 +                       vfp_reg_offset(1, a->vn),
 +                       vfp_reg_offset(1, a->vm),
 +                       opr_sz, opr_sz, 0, fn_gvec);
 +    return true;
 +}
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
      bool is_long = false, q = extract32(insn, 6, 1);
      bool ptr_is_env = false;
 -    if ((insn & 0xfeb00f00) == 0xfc200d00) {
 -        /* V[US]DOT -- 1111 1100 0.10 .... .... 1101 .Q.U .... */
 -        bool u = extract32(insn, 4, 1);
 -        if (!dc_isar_feature(aa32_dp, s)) {
 -            return 1;
 -        }
 -        fn_gvec = u ? gen_helper_gvec_udot_b : gen_helper_gvec_sdot_b;
 -    } else if ((insn & 0xff300f10) == 0xfc200810) {
 +    if ((insn & 0xff300f10) == 0xfc200810) {
          /* VFM[AS]L -- 1111 1100 S.10 .... .... 1000 .Q.1 .... */
          int is_s = extract32(insn, 23, 1);
          if (!dc_isar_feature(aa32_fhm, s)) {
 --
 .20.1
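
trans_VCMLA(), trans_VCADD() and trans_VDOT() all repeat the same register validity checks before emitting code. The sketch below isolates that logic as a plain function so it can be read on its own. `neon_3same_regs_valid()` and `neon_opr_size()` are hypothetical names, and `have_d32` stands in for the `dc_isar_feature(aa32_simd_r32, s)` test; this is an illustration of the checks, not a proposed refactoring of the patches.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Return true if the D register numbers are acceptable:
 *  - D16-D31 may only be named when the CPU has 32 D registers;
 *  - with Q == 1 each operand is a Q register, so every D register
 *    number must be even (bit 0 clear).
 */
static bool neon_3same_regs_valid(int vd, int vn, int vm, int q, bool have_d32)
{
    assert(q == 0 || q == 1);

    if (!have_d32 && ((vd | vn | vm) & 0x10)) {
        return false;
    }
    if ((vd | vn | vm) & q) {
        return false;
    }
    return true;
}

/* Operation size in bytes: 8 for a D register, 16 for a Q register. */
static int neon_opr_size(int q)
{
    return (1 + q) * 8;
}
```

In the patches a failed check makes the trans function return false, so the instruction UNDEFs, while a failed `vfp_access_check()` returns true because the exception has already been generated.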

-[PULL 05/37] hw/arm/smmu-common: Simplify smmu_find_smmu_pcibus() logic
+[PULL 26/39] target/arm: Convert VFM[AS]L (vector) to decodetree
-From: Philippe Mathieu-Daudé <philmd@redhat.com>
+Convert the VFM[AS]L (vector) insns to decodetree.  This is the last
 insn in the legacy decoder for the 3same_ext group, so we can
 delete the legacy decoder function for the group entirely.
-The smmu_find_smmu_pcibus() function was introduced (in commit
+Note that in disas_thumb2_insn() the parts of this encoding space
-cac994ef43b) in a code format that could return an incorrect
+where the decodetree decoder returns false will correctly be directed
-pointer, which was then fixed by the previous commit.
+to illegal_op by the "(insn & (1 << 28))" check so they won't fall
-We could have avoided this by writing the if() statement
+into disas_coproc_insn() by mistake.
 differently. Do it now, in case this function is re-used.
 The code is easier to review (harder to miss bugs).
-Acked-by: Eric Auger <eric.auger@redhat.com>
-Reviewed-by: Peter Xu <peterx@redhat.com>
-Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20200430181003.21682-8-peter.maydell@linaro.org
 ---
- hw/arm/smmu-common.c | 25 +++++++++++++------------
+ target/arm/neon-shared.decode   |  6 +++
- 1 file changed, 13 insertions(+), 12 deletions(-)
+ target/arm/translate-neon.inc.c | 31 +++++++++++
  target/arm/translate.c          | 92 +--------------------------------
  3 files changed, 38 insertions(+), 91 deletions(-)
-diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
+diff --git a/target/arm/neon-shared.decode b/target/arm/neon-shared.decode
 index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/smmu-common.c
+--- a/target/arm/neon-shared.decode
-+++ b/hw/arm/smmu-common.c
++++ b/target/arm/neon-shared.decode
-@@ -XXX,XX +XXX,XX @@ inline int smmu_ptw(SMMUTransCfg *cfg, dma_addr_t iova, IOMMUAccessFlags perm,
+@@ -XXX,XX +XXX,XX @@ VCADD          1111 110 rot:1 1 . 0 size:1 .... .... 1000 . q:1 . 0 .... \
- SMMUPciBus *smmu_find_smmu_pcibus(SMMUState *s, uint8_t bus_num)
+ # VUDOT and VSDOT
- {
+ VDOT           1111 110 00 . 10 .... .... 1101 . q:1 . u:1 .... \
-     SMMUPciBus *smmu_pci_bus = s->smmu_pcibus_by_bus_num[bus_num];
+                vm=%vm_dp vn=%vn_dp vd=%vd_dp
 +    GHashTableIter iter;
 -    if (!smmu_pci_bus) {
 -        GHashTableIter iter;
 -
 -        g_hash_table_iter_init(&iter, s->smmu_pcibus_by_busptr);
 -        while (g_hash_table_iter_next(&iter, NULL, (void **)&smmu_pci_bus)) {
 -            if (pci_bus_num(smmu_pci_bus->bus) == bus_num) {
 -                s->smmu_pcibus_by_bus_num[bus_num] = smmu_pci_bus;
 -                return smmu_pci_bus;
 -            }
 -        }
 -        smmu_pci_bus = NULL;
 +    if (smmu_pci_bus) {
 +        return smmu_pci_bus;
      }
 -    return smmu_pci_bus;
 +
-+    g_hash_table_iter_init(&iter, s->smmu_pcibus_by_busptr);
++# VFM[AS]L
-+    while (g_hash_table_iter_next(&iter, NULL, (void **)&smmu_pci_bus)) {
++VFML           1111 110 0 s:1 . 10 .... .... 1000 . 0 . 1 .... \
-+        if (pci_bus_num(smmu_pci_bus->bus) == bus_num) {
++               vm=%vm_sp vn=%vn_sp vd=%vd_dp q=0
-+            s->smmu_pcibus_by_bus_num[bus_num] = smmu_pci_bus;
++VFML           1111 110 0 s:1 . 10 .... .... 1000 . 1 . 1 .... \
-+            return smmu_pci_bus;
++               vm=%vm_dp vn=%vn_dp vd=%vd_dp q=1
-+        }
+diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VDOT(DisasContext *s, arg_VDOT *a)
                         opr_sz, opr_sz, 0, fn_gvec);
      return true;
  }
 +
 +static bool trans_VFML(DisasContext *s, arg_VFML *a)
 +{
 +    int opr_sz;
 +
 +    if (!dc_isar_feature(aa32_fhm, s)) {
 +        return false;
 +    }
 +
-+    return NULL;
++    /* UNDEF accesses to D16-D31 if they don't exist. */
 +    if (!dc_isar_feature(aa32_simd_r32, s) &&
 +        (a->vd & 0x10)) {
 +        return false;
 +    }
 +
 +    if (a->vd & a->q) {
 +        return false;
 +    }
 +
 +    if (!vfp_access_check(s)) {
 +        return true;
 +    }
 +
 +    opr_sz = (1 + a->q) * 8;
 +    tcg_gen_gvec_3_ptr(vfp_reg_offset(1, a->vd),
 +                       vfp_reg_offset(a->q, a->vn),
 +                       vfp_reg_offset(a->q, a->vm),
 +                       cpu_env, opr_sz, opr_sz, a->s, /* is_2 == 0 */
 +                       gen_helper_gvec_fmlal_a32);
 +    return true;
 +}
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
      return 0;
  }
- static AddressSpace *smmu_find_add_as(PCIBus *bus, void *opaque, int devfn)
+-/* Advanced SIMD three registers of the same length extension.
 - *  31           25    23  22    20   16   12  11   10   9    8        3     0
 - * +---------------+-----+---+-----+----+----+---+----+---+----+---------+----+
 - * | 1 1 1 1 1 1 0 | op1 | D | op2 | Vn | Vd | 1 | o3 | 0 | o4 | N Q M U | Vm |
 - * +---------------+-----+---+-----+----+----+---+----+---+----+---------+----+
 - */
 -static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
 -{
 -    gen_helper_gvec_3 *fn_gvec = NULL;
 -    gen_helper_gvec_3_ptr *fn_gvec_ptr = NULL;
 -    int rd, rn, rm, opr_sz;
 -    int data = 0;
 -    int off_rn, off_rm;
 -    bool is_long = false, q = extract32(insn, 6, 1);
 -    bool ptr_is_env = false;
 -
 -    if ((insn & 0xff300f10) == 0xfc200810) {
 -        /* VFM[AS]L -- 1111 1100 S.10 .... .... 1000 .Q.1 .... */
 -        int is_s = extract32(insn, 23, 1);
 -        if (!dc_isar_feature(aa32_fhm, s)) {
 -            return 1;
 -        }
 -        is_long = true;
 -        data = is_s; /* is_2 == 0 */
 -        fn_gvec_ptr = gen_helper_gvec_fmlal_a32;
 -        ptr_is_env = true;
 -    } else {
 -        return 1;
 -    }
 -
 -    VFP_DREG_D(rd, insn);
 -    if (rd & q) {
 -        return 1;
 -    }
 -    if (q || !is_long) {
 -        VFP_DREG_N(rn, insn);
 -        VFP_DREG_M(rm, insn);
 -        if ((rn | rm) & q & !is_long) {
 -            return 1;
 -        }
 -        off_rn = vfp_reg_offset(1, rn);
 -        off_rm = vfp_reg_offset(1, rm);
 -    } else {
 -        rn = VFP_SREG_N(insn);
 -        rm = VFP_SREG_M(insn);
 -        off_rn = vfp_reg_offset(0, rn);
 -        off_rm = vfp_reg_offset(0, rm);
 -    }
 -
 -    if (s->fp_excp_el) {
 -        gen_exception_insn(s, s->pc_curr, EXCP_UDEF,
 -                           syn_simd_access_trap(1, 0xe, false), s->fp_excp_el);
 -        return 0;
 -    }
 -    if (!s->vfp_enabled) {
 -        return 1;
 -    }
 -
 -    opr_sz = (1 + q) * 8;
 -    if (fn_gvec_ptr) {
 -        TCGv_ptr ptr;
 -        if (ptr_is_env) {
 -            ptr = cpu_env;
 -        } else {
 -            ptr = get_fpstatus_ptr(1);
 -        }
 -        tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd), off_rn, off_rm, ptr,
 -                           opr_sz, opr_sz, data, fn_gvec_ptr);
 -        if (!ptr_is_env) {
 -            tcg_temp_free_ptr(ptr);
 -        }
 -    } else {
 -        tcg_gen_gvec_3_ool(vfp_reg_offset(1, rd), off_rn, off_rm,
 -                           opr_sz, opr_sz, data, fn_gvec);
 -    }
 -    return 0;
 -}
 -
  /* Advanced SIMD two registers and a scalar extension.
   *  31             24   23  22   20   16   12  11   10   9    8        3     0
   * +-----------------+----+---+----+----+----+---+----+---+----+---------+----+
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
                      }
                  }
              }
 -        } else if ((insn & 0x0e000a00) == 0x0c000800
 -                   && arm_dc_feature(s, ARM_FEATURE_V8)) {
 -            if (disas_neon_insn_3same_ext(s, insn)) {
 -                goto illegal_op;
 -            }
 -            return;
          } else if ((insn & 0x0f000a00) == 0x0e000800
                     && arm_dc_feature(s, ARM_FEATURE_V8)) {
              if (disas_neon_insn_2reg_scalar_ext(s, insn)) {
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
              }
              break;
          }
 -        if ((insn & 0xfe000a00) == 0xfc000800
 +        if ((insn & 0xff000a00) == 0xfe000800
              && arm_dc_feature(s, ARM_FEATURE_V8)) {
              /* The Thumb2 and ARM encodings are identical.  */
 -            if (disas_neon_insn_3same_ext(s, insn)) {
 -                goto illegal_op;
 -            }
 -        } else if ((insn & 0xff000a00) == 0xfe000800
 -                   && arm_dc_feature(s, ARM_FEATURE_V8)) {
 -            /* The Thumb2 and ARM encodings are identical.  */
              if (disas_neon_insn_2reg_scalar_ext(s, insn)) {
                  goto illegal_op;
              }
 --
 .20.1

-[PULL 10/37] hw/arm/musicpal: Simplify since the machines are little-endian only
+[PULL 27/39] target/arm: Convert VCMLA (scalar) to decodetree
-From: Philippe Mathieu-Daudé <philmd@redhat.com>
+Convert VCMLA (scalar) in the 2reg-scalar-ext group to decodetree.
-We only build the little-endian softmmu configurations. Checking
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-for big endian is pointless, remove the unused code.
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
 Message-id: 20200430181003.21682-9-peter.maydell@linaro.org
 ---
  target/arm/neon-shared.decode   |  5 +++++
  target/arm/translate-neon.inc.c | 40 +++++++++++++++++++++++++++++++++
  target/arm/translate.c          | 26 +--------------------
  3 files changed, 46 insertions(+), 25 deletions(-)
-Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+diff --git a/target/arm/neon-shared.decode b/target/arm/neon-shared.decode
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
  hw/arm/musicpal.c | 10 ----------
  1 file changed, 10 deletions(-)
 diff --git a/hw/arm/musicpal.c b/hw/arm/musicpal.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/musicpal.c
+--- a/target/arm/neon-shared.decode
-+++ b/hw/arm/musicpal.c
++++ b/target/arm/neon-shared.decode
-@@ -XXX,XX +XXX,XX @@ static void musicpal_init(MachineState *machine)
+@@ -XXX,XX +XXX,XX @@ VFML           1111 110 0 s:1 . 10 .... .... 1000 . 0 . 1 .... \
-          * 0xFF800000 (if there is 8 MB flash). So remap flash access if the
+                vm=%vm_sp vn=%vn_sp vd=%vd_dp q=0
-          * image is smaller than 32 MB.
+ VFML           1111 110 0 s:1 . 10 .... .... 1000 . 1 . 1 .... \
-          */
+                vm=%vm_dp vn=%vn_dp vd=%vd_dp q=1
--#ifdef TARGET_WORDS_BIGENDIAN
++
--        pflash_cfi02_register(0x100000000ULL - MP_FLASH_SIZE_MAX,
++VCMLA_scalar   1111 1110 0 . rot:2 .... .... 1000 . q:1 index:1 0 vm:4 \
--                              "musicpal.flash", flash_size,
++               vn=%vn_dp vd=%vd_dp size=0
--                              blk, 0x10000,
++VCMLA_scalar   1111 1110 1 . rot:2 .... .... 1000 . q:1 . 0 .... \
--                              MP_FLASH_SIZE_MAX / flash_size,
++               vm=%vm_dp vn=%vn_dp vd=%vd_dp size=1 index=0
--                              2, 0x00BF, 0x236D, 0x0000, 0x0000,
+diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
--                              0x5555, 0x2AAA, 1);
+index XXXXXXX..XXXXXXX 100644
--#else
+--- a/target/arm/translate-neon.inc.c
-         pflash_cfi02_register(0x100000000ULL - MP_FLASH_SIZE_MAX,
++++ b/target/arm/translate-neon.inc.c
-                               "musicpal.flash", flash_size,
+@@ -XXX,XX +XXX,XX @@ static bool trans_VFML(DisasContext *s, arg_VFML *a)
-                               blk, 0x10000,
+                        gen_helper_gvec_fmlal_a32);
-                               MP_FLASH_SIZE_MAX / flash_size,
+     return true;
-                               2, 0x00BF, 0x236D, 0x0000, 0x0000,
+ }
-                               0x5555, 0x2AAA, 0);
++
--#endif
++static bool trans_VCMLA_scalar(DisasContext *s, arg_VCMLA_scalar *a)
 +{
 +    gen_helper_gvec_3_ptr *fn_gvec_ptr;
 +    int opr_sz;
 +    TCGv_ptr fpst;
 +
 +    if (!dc_isar_feature(aa32_vcma, s)) {
 +        return false;
 +    }
 +    if (a->size == 0 && !dc_isar_feature(aa32_fp16_arith, s)) {
 +        return false;
 +    }
 +
 +    /* UNDEF accesses to D16-D31 if they don't exist. */
 +    if (!dc_isar_feature(aa32_simd_r32, s) &&
 +        ((a->vd | a->vn | a->vm) & 0x10)) {
 +        return false;
 +    }
 +
 +    if ((a->vd | a->vn) & a->q) {
 +        return false;
 +    }
 +
 +    if (!vfp_access_check(s)) {
 +        return true;
 +    }
 +
 +    fn_gvec_ptr = (a->size ? gen_helper_gvec_fcmlas_idx
 +                   : gen_helper_gvec_fcmlah_idx);
 +    opr_sz = (1 + a->q) * 8;
 +    fpst = get_fpstatus_ptr(1);
 +    tcg_gen_gvec_3_ptr(vfp_reg_offset(1, a->vd),
 +                       vfp_reg_offset(1, a->vn),
 +                       vfp_reg_offset(1, a->vm),
 +                       fpst, opr_sz, opr_sz,
 +                       (a->index << 2) | a->rot, fn_gvec_ptr);
 +    tcg_temp_free_ptr(fpst);
 +    return true;
 +}
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
      bool is_long = false, q = extract32(insn, 6, 1);
      bool ptr_is_env = false;
 -    if ((insn & 0xff000f10) == 0xfe000800) {
 -        /* VCMLA (indexed) -- 1111 1110 S.RR .... .... 1000 ...0 .... */
 -        int rot = extract32(insn, 20, 2);
 -        int size = extract32(insn, 23, 1);
 -        int index;
 -
-     }
+-        if (!dc_isar_feature(aa32_vcma, s)) {
-     sysbus_create_simple(TYPE_MV88W8618_FLASHCFG, MP_FLASHCFG_BASE, NULL);
+-            return 1;
 -        }
 -        if (size == 0) {
 -            if (!dc_isar_feature(aa32_fp16_arith, s)) {
 -                return 1;
 -            }
 -            /* For fp16, rm is just Vm, and index is M.  */
 -            rm = extract32(insn, 0, 4);
 -            index = extract32(insn, 5, 1);
 -        } else {
 -            /* For fp32, rm is the usual M:Vm, and index is 0.  */
 -            VFP_DREG_M(rm, insn);
 -            index = 0;
 -        }
 -        data = (index << 2) | rot;
 -        fn_gvec_ptr = (size ? gen_helper_gvec_fcmlas_idx
 -                       : gen_helper_gvec_fcmlah_idx);
 -    } else if ((insn & 0xffb00f00) == 0xfe200d00) {
 +    if ((insn & 0xffb00f00) == 0xfe200d00) {
          /* V[US]DOT -- 1111 1110 0.10 .... .... 1101 .Q.U .... */
          int u = extract32(insn, 4, 1);
 --
 .20.1
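
trans_VCMLA_scalar() passes both the element index and the rotation to the helper in one immediate, `(a->index << 2) | a->rot`. The sketch below shows that packing and one way the two values can be split apart again. The function names are hypothetical, and the unpacking side is an assumption made for illustration, not a copy of the QEMU helper code.

```c
#include <assert.h>
#include <stdint.h>

/* Pack as in trans_VCMLA_scalar(): rot in bits [1:0], index above it. */
static inline uint32_t vcmla_scalar_pack(unsigned index, unsigned rot)
{
    assert(index <= 1 && rot <= 3);
    return (index << 2) | rot;
}

/* Recover the 2-bit rotation from the packed value. */
static inline unsigned vcmla_scalar_rot(uint32_t data)
{
    return data & 3;
}

/* Recover the element index from the packed value. */
static inline unsigned vcmla_scalar_index(uint32_t data)
{
    return data >> 2;
}
```

For fp32 the decode pattern fixes `index=0`, so in that case the packed value is just the rotation.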

-[PULL 28/37] hw/arm/cubieboard: restrict allowed CPU type to ARM Cortex-A8
+[PULL 28/39] target/arm: Convert V[US]DOT (scalar) to decodetree
-From: Niek Linnenbank <nieklinnenbank@gmail.com>
+Convert the V[US]DOT (scalar) insns in the 2reg-scalar-ext group
 to decodetree.
-The Cubieboard has an ARM Cortex-A8.  Instead of simply ignoring a
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-bogus -cpu option provided by the user, give them an error message so
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-they know their command line is wrong.
+Message-id: 20200430181003.21682-10-peter.maydell@linaro.org
 ---
  target/arm/neon-shared.decode   |  3 +++
  target/arm/translate-neon.inc.c | 35 +++++++++++++++++++++++++++++++++
  target/arm/translate.c          | 13 +-----------
  3 files changed, 39 insertions(+), 12 deletions(-)
-Signed-off-by: Niek Linnenbank <nieklinnenbank@gmail.com>
+diff --git a/target/arm/neon-shared.decode b/target/arm/neon-shared.decode
 Message-id: 20200227220149.6845-3-nieklinnenbank@gmail.com
 Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 [PMM: tweaked commit message]
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
  hw/arm/cubieboard.c | 10 +++++++++-
  1 file changed, 9 insertions(+), 1 deletion(-)
 diff --git a/hw/arm/cubieboard.c b/hw/arm/cubieboard.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/cubieboard.c
+--- a/target/arm/neon-shared.decode
-+++ b/hw/arm/cubieboard.c
++++ b/target/arm/neon-shared.decode
-@@ -XXX,XX +XXX,XX @@ static struct arm_boot_info cubieboard_binfo = {
+@@ -XXX,XX +XXX,XX @@ VCMLA_scalar   1111 1110 0 . rot:2 .... .... 1000 . q:1 index:1 0 vm:4 \
+                vn=%vn_dp vd=%vd_dp size=0
- static void cubieboard_init(MachineState *machine)
+ VCMLA_scalar   1111 1110 1 . rot:2 .... .... 1000 . q:1 . 0 .... \
- {
+                vm=%vm_dp vn=%vn_dp vd=%vd_dp size=1 index=0
--    AwA10State *a10 = AW_A10(object_new(TYPE_AW_A10));
++
-+    AwA10State *a10;
++VDOT_scalar    1111 1110 0 . 10 .... .... 1101 . q:1 index:1 u:1 rm:4 \
-     Error *err = NULL;
++               vm=%vm_dp vn=%vn_dp vd=%vd_dp
+diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
-+    /* Only allow Cortex-A8 for this board */
+index XXXXXXX..XXXXXXX 100644
-+    if (strcmp(machine->cpu_type, ARM_CPU_TYPE_NAME("cortex-a8")) != 0) {
+--- a/target/arm/translate-neon.inc.c
-+        error_report("This board can only be used with cortex-a8 CPU");
++++ b/target/arm/translate-neon.inc.c
-+        exit(1);
+@@ -XXX,XX +XXX,XX @@ static bool trans_VCMLA_scalar(DisasContext *s, arg_VCMLA_scalar *a)
      tcg_temp_free_ptr(fpst);
      return true;
  }
 +
 +static bool trans_VDOT_scalar(DisasContext *s, arg_VDOT_scalar *a)
 +{
 +    gen_helper_gvec_3 *fn_gvec;
 +    int opr_sz;
 +    TCGv_ptr fpst;
 +
 +    if (!dc_isar_feature(aa32_dp, s)) {
 +        return false;
 +    }
 +
-+    a10 = AW_A10(object_new(TYPE_AW_A10));
++    /* UNDEF accesses to D16-D31 if they don't exist. */
 +    if (!dc_isar_feature(aa32_simd_r32, s) &&
 +        ((a->vd | a->vn) & 0x10)) {
 +        return false;
 +    }
 +
-     object_property_set_int(OBJECT(&a10->emac), 1, "phy-addr", &err);
++    if ((a->vd | a->vn) & a->q) {
-     if (err != NULL) {
++        return false;
-         error_reportf_err(err, "Couldn't set phy address: ");
++    }
 +
 +    if (!vfp_access_check(s)) {
 +        return true;
 +    }
 +
 +    fn_gvec = a->u ? gen_helper_gvec_udot_idx_b : gen_helper_gvec_sdot_idx_b;
 +    opr_sz = (1 + a->q) * 8;
 +    fpst = get_fpstatus_ptr(1);
 +    tcg_gen_gvec_3_ool(vfp_reg_offset(1, a->vd),
 +                       vfp_reg_offset(1, a->vn),
 +                       vfp_reg_offset(1, a->rm),
 +                       opr_sz, opr_sz, a->index, fn_gvec);
 +    tcg_temp_free_ptr(fpst);
 +    return true;
 +}
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
      bool is_long = false, q = extract32(insn, 6, 1);
      bool ptr_is_env = false;
 -    if ((insn & 0xffb00f00) == 0xfe200d00) {
 -        /* V[US]DOT -- 1111 1110 0.10 .... .... 1101 .Q.U .... */
 -        int u = extract32(insn, 4, 1);
 -
 -        if (!dc_isar_feature(aa32_dp, s)) {
 -            return 1;
 -        }
 -        fn_gvec = u ? gen_helper_gvec_udot_idx_b : gen_helper_gvec_sdot_idx_b;
 -        /* rm is just Vm, and index is M.  */
 -        data = extract32(insn, 5, 1); /* index */
 -        rm = extract32(insn, 0, 4);
 -    } else if ((insn & 0xffa00f10) == 0xfe000810) {
 +    if ((insn & 0xffa00f10) == 0xfe000810) {
          /* VFM[AS]L -- 1111 1110 0.0S .... .... 1000 .Q.1 .... */
          int is_s = extract32(insn, 20, 1);
          int vm20 = extract32(insn, 0, 3);
 --
 .20.1

-[PULL 11/37] hw/arm/pxa2xx: move timer_new from init() into realize() to avoid memleaks
+[PULL 29/39] target/arm: Convert VFM[AS]L (scalar) to decodetree
-From: Pan Nengyuan <pannengyuan@huawei.com>
+Convert the VFM[AS]L (scalar) insns in the 2reg-scalar-ext group
+to decodetree. These are the last ones in the group so we can remove
-There are some memleaks when we call 'device_list_properties'. This patch moves timer_new from init into realize to fix it.
+all the legacy decode for the group.
-Reported-by: Euler Robot <euler.robot@huawei.com>
+Note that in disas_thumb2_insn() the parts of this encoding space
-Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com>
+where the decodetree decoder returns false will correctly be directed
-Message-id: 20200227025055.14341-3-pannengyuan@huawei.com
+to illegal_op by the "(insn & (1 << 28))" check so they won't fall
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+into disas_coproc_insn() by mistake.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20200430181003.21682-11-peter.maydell@linaro.org
 ---
- hw/arm/pxa2xx.c | 17 +++++++++++------
+ target/arm/neon-shared.decode   |   7 +++
- 1 file changed, 11 insertions(+), 6 deletions(-)
+ target/arm/translate-neon.inc.c |  32 ++++++++++
+ target/arm/translate.c          | 107 +-------------------------------
-diff --git a/hw/arm/pxa2xx.c b/hw/arm/pxa2xx.c
+ 3 files changed, 40 insertions(+), 106 deletions(-)
 diff --git a/target/arm/neon-shared.decode b/target/arm/neon-shared.decode
 index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/pxa2xx.c
+--- a/target/arm/neon-shared.decode
-+++ b/hw/arm/pxa2xx.c
++++ b/target/arm/neon-shared.decode
-@@ -XXX,XX +XXX,XX @@ static void pxa2xx_rtc_init(Object *obj)
+@@ -XXX,XX +XXX,XX @@ VCMLA_scalar   1111 1110 1 . rot:2 .... .... 1000 . q:1 . 0 .... \
-     s->last_rtcpicr = 0;
-     s->last_hz = s->last_sw = s->last_pi = qemu_clock_get_ms(rtc_clock);
+ VDOT_scalar    1111 1110 0 . 10 .... .... 1101 . q:1 index:1 u:1 rm:4 \
+                vm=%vm_dp vn=%vn_dp vd=%vd_dp
-+    sysbus_init_irq(dev, &s->rtc_irq);
++
-+
++%vfml_scalar_q0_rm 0:3 5:1
-+    memory_region_init_io(&s->iomem, obj, &pxa2xx_rtc_ops, s,
++%vfml_scalar_q1_index 5:1 3:1
-+                          "pxa2xx-rtc", 0x10000);
++VFML_scalar    1111 1110 0 . 0 s:1 .... .... 1000 . 0 . 1 index:1 ... \
-+    sysbus_init_mmio(dev, &s->iomem);
++               rm=%vfml_scalar_q0_rm vn=%vn_sp vd=%vd_dp q=0
 +VFML_scalar    1111 1110 0 . 0 s:1 .... .... 1000 . 1 . 1 . rm:3 \
 +               index=%vfml_scalar_q1_index vn=%vn_dp vd=%vd_dp q=1
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VDOT_scalar(DisasContext *s, arg_VDOT_scalar *a)
      tcg_temp_free_ptr(fpst);
      return true;
  }
 +
 +static bool trans_VFML_scalar(DisasContext *s, arg_VFML_scalar *a)
 +{
 +    int opr_sz;
 +
 +    if (!dc_isar_feature(aa32_fhm, s)) {
 +        return false;
 +    }
 +
 +    /* UNDEF accesses to D16-D31 if they don't exist. */
 +    if (!dc_isar_feature(aa32_simd_r32, s) &&
 +        ((a->vd & 0x10) || (a->q && (a->vn & 0x10)))) {
 +        return false;
 +    }
 +
 +    if (a->vd & a->q) {
 +        return false;
 +    }
 +
 +    if (!vfp_access_check(s)) {
 +        return true;
 +    }
 +
 +    opr_sz = (1 + a->q) * 8;
 +    tcg_gen_gvec_3_ptr(vfp_reg_offset(1, a->vd),
 +                       vfp_reg_offset(a->q, a->vn),
 +                       vfp_reg_offset(a->q, a->rm),
 +                       cpu_env, opr_sz, opr_sz,
 +                       (a->index << 2) | a->s, /* is_2 == 0 */
 +                       gen_helper_gvec_fmlal_idx_a32);
 +    return true;
 +}
-+
+diff --git a/target/arm/translate.c b/target/arm/translate.c
-+static void pxa2xx_rtc_realize(DeviceState *dev, Error **errp)
+index XXXXXXX..XXXXXXX 100644
-+{
+--- a/target/arm/translate.c
-+    PXA2xxRTCState *s = PXA2XX_RTC(dev);
++++ b/target/arm/translate.c
-     s->rtc_hz    = timer_new_ms(rtc_clock, pxa2xx_rtc_hz_tick,    s);
+@@ -XXX,XX +XXX,XX @@ static int disas_dsp_insn(DisasContext *s, uint32_t insn)
      s->rtc_rdal1 = timer_new_ms(rtc_clock, pxa2xx_rtc_rdal1_tick, s);
      s->rtc_rdal2 = timer_new_ms(rtc_clock, pxa2xx_rtc_rdal2_tick, s);
      s->rtc_swal1 = timer_new_ms(rtc_clock, pxa2xx_rtc_swal1_tick, s);
      s->rtc_swal2 = timer_new_ms(rtc_clock, pxa2xx_rtc_swal2_tick, s);
      s->rtc_pi    = timer_new_ms(rtc_clock, pxa2xx_rtc_pi_tick,    s);
 -
 -    sysbus_init_irq(dev, &s->rtc_irq);
 -
 -    memory_region_init_io(&s->iomem, obj, &pxa2xx_rtc_ops, s,
 -                          "pxa2xx-rtc", 0x10000);
 -    sysbus_init_mmio(dev, &s->iomem);
  }
- static int pxa2xx_rtc_pre_save(void *opaque)
+ #define VFP_REG_SHR(x, n) (((n) > 0) ? (x) >> (n) : (x) << -(n))
-@@ -XXX,XX +XXX,XX @@ static void pxa2xx_rtc_sysbus_class_init(ObjectClass *klass, void *data)
+-#define VFP_SREG(insn, bigbit, smallbit) \
+-  ((VFP_REG_SHR(insn, bigbit - 1) & 0x1e) | (((insn) >> (smallbit)) & 1))
-     dc->desc = "PXA2xx RTC Controller";
+ #define VFP_DREG(reg, insn, bigbit, smallbit) do { \
-     dc->vmsd = &vmstate_pxa2xx_rtc_regs;
+     if (dc_isar_feature(aa32_simd_r32, s)) { \
-+    dc->realize = pxa2xx_rtc_realize;
+         reg = (((insn) >> (bigbit)) & 0x0f) \
@@ -XXX,XX +XXX,XX @@ static int disas_dsp_insn(DisasContext *s, uint32_t insn)
          reg = ((insn) >> (bigbit)) & 0x0f; \
      }} while (0)
 -#define VFP_SREG_D(insn) VFP_SREG(insn, 12, 22)
  #define VFP_DREG_D(reg, insn) VFP_DREG(reg, insn, 12, 22)
 -#define VFP_SREG_N(insn) VFP_SREG(insn, 16,  7)
  #define VFP_DREG_N(reg, insn) VFP_DREG(reg, insn, 16,  7)
 -#define VFP_SREG_M(insn) VFP_SREG(insn,  0,  5)
  #define VFP_DREG_M(reg, insn) VFP_DREG(reg, insn,  0,  5)
  static void gen_neon_dup_low16(TCGv_i32 var)
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
      return 0;
  }
- static const TypeInfo pxa2xx_rtc_sysbus_info = {
+-/* Advanced SIMD two registers and a scalar extension.
 - *  31             24   23  22   20   16   12  11   10   9    8        3     0
 - * +-----------------+----+---+----+----+----+---+----+---+----+---------+----+
 - * | 1 1 1 1 1 1 1 0 | o1 | D | o2 | Vn | Vd | 1 | o3 | 0 | o4 | N Q M U | Vm |
 - * +-----------------+----+---+----+----+----+---+----+---+----+---------+----+
 - *
 - */
 -
 -static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
 -{
 -    gen_helper_gvec_3 *fn_gvec = NULL;
 -    gen_helper_gvec_3_ptr *fn_gvec_ptr = NULL;
 -    int rd, rn, rm, opr_sz, data;
 -    int off_rn, off_rm;
 -    bool is_long = false, q = extract32(insn, 6, 1);
 -    bool ptr_is_env = false;
 -
 -    if ((insn & 0xffa00f10) == 0xfe000810) {
 -        /* VFM[AS]L -- 1111 1110 0.0S .... .... 1000 .Q.1 .... */
 -        int is_s = extract32(insn, 20, 1);
 -        int vm20 = extract32(insn, 0, 3);
 -        int vm3 = extract32(insn, 3, 1);
 -        int m = extract32(insn, 5, 1);
 -        int index;
 -
 -        if (!dc_isar_feature(aa32_fhm, s)) {
 -            return 1;
 -        }
 -        if (q) {
 -            rm = vm20;
 -            index = m * 2 + vm3;
 -        } else {
 -            rm = vm20 * 2 + m;
 -            index = vm3;
 -        }
 -        is_long = true;
 -        data = (index << 2) | is_s; /* is_2 == 0 */
 -        fn_gvec_ptr = gen_helper_gvec_fmlal_idx_a32;
 -        ptr_is_env = true;
 -    } else {
 -        return 1;
 -    }
 -
 -    VFP_DREG_D(rd, insn);
 -    if (rd & q) {
 -        return 1;
 -    }
 -    if (q || !is_long) {
 -        VFP_DREG_N(rn, insn);
 -        if (rn & q & !is_long) {
 -            return 1;
 -        }
 -        off_rn = vfp_reg_offset(1, rn);
 -        off_rm = vfp_reg_offset(1, rm);
 -    } else {
 -        rn = VFP_SREG_N(insn);
 -        off_rn = vfp_reg_offset(0, rn);
 -        off_rm = vfp_reg_offset(0, rm);
 -    }
 -    if (s->fp_excp_el) {
 -        gen_exception_insn(s, s->pc_curr, EXCP_UDEF,
 -                           syn_simd_access_trap(1, 0xe, false), s->fp_excp_el);
 -        return 0;
 -    }
 -    if (!s->vfp_enabled) {
 -        return 1;
 -    }
 -
 -    opr_sz = (1 + q) * 8;
 -    if (fn_gvec_ptr) {
 -        TCGv_ptr ptr;
 -        if (ptr_is_env) {
 -            ptr = cpu_env;
 -        } else {
 -            ptr = get_fpstatus_ptr(1);
 -        }
 -        tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd), off_rn, off_rm, ptr,
 -                           opr_sz, opr_sz, data, fn_gvec_ptr);
 -        if (!ptr_is_env) {
 -            tcg_temp_free_ptr(ptr);
 -        }
 -    } else {
 -        tcg_gen_gvec_3_ool(vfp_reg_offset(1, rd), off_rn, off_rm,
 -                           opr_sz, opr_sz, data, fn_gvec);
 -    }
 -    return 0;
 -}
 -
  static int disas_coproc_insn(DisasContext *s, uint32_t insn)
  {
      int cpnum, is64, crn, crm, opc1, opc2, isread, rt, rt2;
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
                      }
                  }
              }
 -        } else if ((insn & 0x0f000a00) == 0x0e000800
 -                   && arm_dc_feature(s, ARM_FEATURE_V8)) {
 -            if (disas_neon_insn_2reg_scalar_ext(s, insn)) {
 -                goto illegal_op;
 -            }
 -            return;
          }
          goto illegal_op;
      }
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
              }
              break;
          }
 -        if ((insn & 0xff000a00) == 0xfe000800
 -            && arm_dc_feature(s, ARM_FEATURE_V8)) {
 -            /* The Thumb2 and ARM encodings are identical.  */
 -            if (disas_neon_insn_2reg_scalar_ext(s, insn)) {
 -                goto illegal_op;
 -            }
 -        } else if (((insn >> 24) & 3) == 3) {
 +        if (((insn >> 24) & 3) == 3) {
              /* Translate into the equivalent ARM encoding.  */
              insn = (insn & 0xe2ffffff) | ((insn & (1 << 28)) >> 4) | (1 << 28);
              if (disas_neon_data_insn(s, insn)) {
 --
 .20.1
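
The two VFML_scalar patterns above split the Vm/M bits differently depending on Q. The following is a minimal standalone sketch of that field extraction, mirroring the arithmetic in the removed disas_neon_insn_2reg_scalar_ext() code. The names bits(), vfml_scalar_fields and vfml_scalar_decode() are hypothetical and exist only for this illustration; they are not QEMU APIs.

```c
#include <stdint.h>

/* Extract 'len' bits of 'value' starting at bit 'start' (len < 32).
 * Hypothetical stand-in for QEMU's extract32(). */
static inline uint32_t bits(uint32_t value, int start, int len)
{
    return (value >> start) & ((1u << len) - 1);
}

/* Hypothetical container for the fields discussed above. */
struct vfml_scalar_fields {
    int q;
    int rm;
    int index;
};

static struct vfml_scalar_fields vfml_scalar_decode(uint32_t insn)
{
    struct vfml_scalar_fields f;
    uint32_t vm20 = bits(insn, 0, 3);   /* Vm<2:0> */
    uint32_t vm3 = bits(insn, 3, 1);    /* Vm<3>   */
    uint32_t m = bits(insn, 5, 1);      /* M       */

    f.q = (int)bits(insn, 6, 1);
    if (f.q) {
        /* Q=1: rm is Vm<2:0>; index is M:Vm<3> (%vfml_scalar_q1_index 5:1 3:1) */
        f.rm = (int)vm20;
        f.index = (int)(m * 2 + vm3);
    } else {
        /* Q=0: rm is Vm<2:0>:M (%vfml_scalar_q0_rm 0:3 5:1); index is Vm<3> */
        f.rm = (int)(vm20 * 2 + m);
        f.index = (int)vm3;
    }
    return f;
}
```

With Vm<2:0> = 5, Vm<3> = 1 and M = 1, the Q=0 form selects rm = 11 with index = 1, while the Q=1 form selects rm = 5 with index = 3.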

-[PULL 12/37] hw/arm/spitz: move timer_new from init() into realize() to avoid memleaks
+[PULL 30/39] target/arm: Convert Neon load/store multiple structures to decodetree
-From: Pan Nengyuan <pannengyuan@huawei.com>
+Convert the Neon "load/store multiple structures" insns to decodetree.
 There are some memleaks when we call 'device_list_properties'. This patch moves timer_new from init into realize to fix them.
 Reported-by: Euler Robot <euler.robot@huawei.com>
 Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com>
 Message-id: 20200227025055.14341-4-pannengyuan@huawei.com
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20200430181003.21682-12-peter.maydell@linaro.org
 ---
- hw/arm/spitz.c | 8 +++++++-
+ target/arm/neon-ls.decode       |   7 ++
-file changed, 7 insertions(+), 1 deletion(-)
+ target/arm/translate-neon.inc.c | 124 ++++++++++++++++++++++++++++++++
+ target/arm/translate.c          |  91 +----------------------
-diff --git a/hw/arm/spitz.c b/hw/arm/spitz.c
+files changed, 133 insertions(+), 89 deletions(-)
 diff --git a/target/arm/neon-ls.decode b/target/arm/neon-ls.decode
 index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/spitz.c
+--- a/target/arm/neon-ls.decode
-+++ b/hw/arm/spitz.c
++++ b/target/arm/neon-ls.decode
-@@ -XXX,XX +XXX,XX @@ static void spitz_keyboard_init(Object *obj)
+@@ -XXX,XX +XXX,XX @@
+ #   0b1111_1001_xxx0_xxxx_xxxx_xxxx_xxxx_xxxx
-     spitz_keyboard_pre_map(s);
+ # This file works on the A32 encoding only; calling code for T32 has to
+ # transform the insn into the A32 version first.
--    s->kbdtimer = timer_new_ns(QEMU_CLOCK_VIRTUAL, spitz_keyboard_tick, s);
++
-     qdev_init_gpio_in(dev, spitz_keyboard_strobe, SPITZ_KEY_STROBE_NUM);
++%vd_dp  22:1 12:4
-     qdev_init_gpio_out(dev, s->sense, SPITZ_KEY_SENSE_NUM);
++
 +# Neon load/store multiple structures
 +
 +VLDST_multiple 1111 0100 0 . l:1 0 rn:4 .... itype:4 size:2 align:2 rm:4 \
 +               vd=%vd_dp
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VFML_scalar(DisasContext *s, arg_VFML_scalar *a)
                         gen_helper_gvec_fmlal_idx_a32);
      return true;
  }
++
-+static void spitz_keyboard_realize(DeviceState *dev, Error **errp)
++static struct {
 +    int nregs;
 +    int interleave;
 +    int spacing;
 +} const neon_ls_element_type[11] = {
 +    {1, 4, 1},
 +    {1, 4, 2},
 +    {4, 1, 1},
 +    {2, 2, 2},
 +    {1, 3, 1},
 +    {1, 3, 2},
 +    {3, 1, 1},
 +    {1, 1, 1},
 +    {1, 2, 1},
 +    {1, 2, 2},
 +    {2, 1, 1}
 +};
 +
 +static void gen_neon_ldst_base_update(DisasContext *s, int rm, int rn,
 +                                      int stride)
 +{
-+    SpitzKeyboardState *s = SPITZ_KEYBOARD(dev);
++    if (rm != 15) {
-+    s->kbdtimer = timer_new_ns(QEMU_CLOCK_VIRTUAL, spitz_keyboard_tick, s);
++        TCGv_i32 base;
 +
 +        base = load_reg(s, rn);
 +        if (rm == 13) {
 +            tcg_gen_addi_i32(base, base, stride);
 +        } else {
 +            TCGv_i32 index;
 +            index = load_reg(s, rm);
 +            tcg_gen_add_i32(base, base, index);
 +            tcg_temp_free_i32(index);
 +        }
 +        store_reg(s, rn, base);
 +    }
 +}
 +
- /* LCD backlight controller */
++static bool trans_VLDST_multiple(DisasContext *s, arg_VLDST_multiple *a)
++{
- #define LCDTG_RESCTL    0x00
++    /* Neon load/store multiple structures */
-@@ -XXX,XX +XXX,XX @@ static void spitz_keyboard_class_init(ObjectClass *klass, void *data)
++    int nregs, interleave, spacing, reg, n;
-     DeviceClass *dc = DEVICE_CLASS(klass);
++    MemOp endian = s->be_data;
++    int mmu_idx = get_mem_index(s);
-     dc->vmsd = &vmstate_spitz_kbd;
++    int size = a->size;
-+    dc->realize = spitz_keyboard_realize;
++    TCGv_i64 tmp64;
 +    TCGv_i32 addr, tmp;
 +
 +    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
 +        return false;
 +    }
 +
 +    /* UNDEF accesses to D16-D31 if they don't exist */
 +    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vd & 0x10)) {
 +        return false;
 +    }
 +    if (a->itype > 10) {
 +        return false;
 +    }
 +    /* Catch UNDEF cases for bad values of align field */
 +    switch (a->itype & 0xc) {
 +    case 4:
 +        if (a->align >= 2) {
 +            return false;
 +        }
 +        break;
 +    case 8:
 +        if (a->align == 3) {
 +            return false;
 +        }
 +        break;
 +    default:
 +        break;
 +    }
 +    nregs = neon_ls_element_type[a->itype].nregs;
 +    interleave = neon_ls_element_type[a->itype].interleave;
 +    spacing = neon_ls_element_type[a->itype].spacing;
 +    if (size == 3 && (interleave | spacing) != 1) {
 +        return false;
 +    }
 +
 +    if (!vfp_access_check(s)) {
 +        return true;
 +    }
 +
 +    /* For our purposes, bytes are always little-endian.  */
 +    if (size == 0) {
 +        endian = MO_LE;
 +    }
 +    /*
 +     * Consecutive little-endian elements from a single register
 +     * can be promoted to a larger little-endian operation.
 +     */
 +    if (interleave == 1 && endian == MO_LE) {
 +        size = 3;
 +    }
 +    tmp64 = tcg_temp_new_i64();
 +    addr = tcg_temp_new_i32();
 +    tmp = tcg_const_i32(1 << size);
 +    load_reg_var(s, addr, a->rn);
 +    for (reg = 0; reg < nregs; reg++) {
 +        for (n = 0; n < 8 >> size; n++) {
 +            int xs;
 +            for (xs = 0; xs < interleave; xs++) {
 +                int tt = a->vd + reg + spacing * xs;
 +
 +                if (a->l) {
 +                    gen_aa32_ld_i64(s, tmp64, addr, mmu_idx, endian | size);
 +                    neon_store_element64(tt, n, size, tmp64);
 +                } else {
 +                    neon_load_element64(tmp64, tt, n, size);
 +                    gen_aa32_st_i64(s, tmp64, addr, mmu_idx, endian | size);
 +                }
 +                tcg_gen_add_i32(addr, addr, tmp);
 +            }
 +        }
 +    }
 +    tcg_temp_free_i32(addr);
 +    tcg_temp_free_i32(tmp);
 +    tcg_temp_free_i64(tmp64);
 +
 +    gen_neon_ldst_base_update(s, a->rm, a->rn, nregs * interleave * 8);
 +    return true;
 +}
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_neon_trn_u16(TCGv_i32 t0, TCGv_i32 t1)
  }
- static const TypeInfo spitz_keyboard_info = {
 -static struct {
 -    int nregs;
 -    int interleave;
 -    int spacing;
 -} const neon_ls_element_type[11] = {
 -    {1, 4, 1},
 -    {1, 4, 2},
 -    {4, 1, 1},
 -    {2, 2, 2},
 -    {1, 3, 1},
 -    {1, 3, 2},
 -    {3, 1, 1},
 -    {1, 1, 1},
 -    {1, 2, 1},
 -    {1, 2, 2},
 -    {2, 1, 1}
 -};
 -
  /* Translate a NEON load/store element instruction.  Return nonzero if the
     instruction is invalid.  */
  static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
  {
      int rd, rn, rm;
 -    int op;
      int nregs;
 -    int interleave;
 -    int spacing;
      int stride;
      int size;
      int reg;
      int load;
 -    int n;
      int vec_size;
 -    int mmu_idx;
 -    MemOp endian;
      TCGv_i32 addr;
      TCGv_i32 tmp;
 -    TCGv_i32 tmp2;
 -    TCGv_i64 tmp64;
      if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
          return 1;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
      rn = (insn >> 16) & 0xf;
      rm = insn & 0xf;
      load = (insn & (1 << 21)) != 0;
 -    endian = s->be_data;
 -    mmu_idx = get_mem_index(s);
      if ((insn & (1 << 23)) == 0) {
 -        /* Load store all elements.  */
 -        op = (insn >> 8) & 0xf;
 -        size = (insn >> 6) & 3;
 -        if (op > 10)
 -            return 1;
 -        /* Catch UNDEF cases for bad values of align field */
 -        switch (op & 0xc) {
 -        case 4:
 -            if (((insn >> 5) & 1) == 1) {
 -                return 1;
 -            }
 -            break;
 -        case 8:
 -            if (((insn >> 4) & 3) == 3) {
 -                return 1;
 -            }
 -            break;
 -        default:
 -            break;
 -        }
 -        nregs = neon_ls_element_type[op].nregs;
 -        interleave = neon_ls_element_type[op].interleave;
 -        spacing = neon_ls_element_type[op].spacing;
 -        if (size == 3 && (interleave | spacing) != 1) {
 -            return 1;
 -        }
 -        /* For our purposes, bytes are always little-endian.  */
 -        if (size == 0) {
 -            endian = MO_LE;
 -        }
 -        /* Consecutive little-endian elements from a single register
 -         * can be promoted to a larger little-endian operation.
 -         */
 -        if (interleave == 1 && endian == MO_LE) {
 -            size = 3;
 -        }
 -        tmp64 = tcg_temp_new_i64();
 -        addr = tcg_temp_new_i32();
 -        tmp2 = tcg_const_i32(1 << size);
 -        load_reg_var(s, addr, rn);
 -        for (reg = 0; reg < nregs; reg++) {
 -            for (n = 0; n < 8 >> size; n++) {
 -                int xs;
 -                for (xs = 0; xs < interleave; xs++) {
 -                    int tt = rd + reg + spacing * xs;
 -
 -                    if (load) {
 -                        gen_aa32_ld_i64(s, tmp64, addr, mmu_idx, endian | size);
 -                        neon_store_element64(tt, n, size, tmp64);
 -                    } else {
 -                        neon_load_element64(tmp64, tt, n, size);
 -                        gen_aa32_st_i64(s, tmp64, addr, mmu_idx, endian | size);
 -                    }
 -                    tcg_gen_add_i32(addr, addr, tmp2);
 -                }
 -            }
 -        }
 -        tcg_temp_free_i32(addr);
 -        tcg_temp_free_i32(tmp2);
 -        tcg_temp_free_i64(tmp64);
 -        stride = nregs * interleave * 8;
 +        /* Load store all elements -- handled already by decodetree */
 +        return 1;
      } else {
          size = (insn >> 10) & 3;
          if (size == 3) {
 --
 .20.1
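
As a rough illustration of how trans_VLDST_multiple() uses the neon_ls_element_type table, the sketch below reproduces the register selection and the post-index stride outside of QEMU. The table values are copied from the patch; the helper names ls_multiple_stride(), ls_multiple_regs() and ls_base_update() are hypothetical, and plain integers stand in for TCG values.

```c
#include <stdint.h>

/* Copy of the table from the patch: one row per valid itype (0..10). */
static const struct {
    int nregs;
    int interleave;
    int spacing;
} ls_element_type[11] = {
    {1, 4, 1}, {1, 4, 2}, {4, 1, 1}, {2, 2, 2}, {1, 3, 1}, {1, 3, 2},
    {3, 1, 1}, {1, 1, 1}, {1, 2, 1}, {1, 2, 2}, {2, 1, 1}
};

/* Bytes transferred, which is also the immediate post-index stride.
 * Returns -1 for itype values that the translator treats as UNDEF. */
static int ls_multiple_stride(int itype)
{
    if (itype < 0 || itype > 10) {
        return -1;
    }
    return ls_element_type[itype].nregs * ls_element_type[itype].interleave * 8;
}

/* Store the D register numbers touched, in the order the reg/xs loops
 * first visit them, and return how many there are (at most 4). */
static int ls_multiple_regs(int itype, int vd, int regs[4])
{
    int count = 0;

    for (int reg = 0; reg < ls_element_type[itype].nregs; reg++) {
        for (int xs = 0; xs < ls_element_type[itype].interleave; xs++) {
            regs[count++] = vd + reg + ls_element_type[itype].spacing * xs;
        }
    }
    return count;
}

/* Base register writeback rule from gen_neon_ldst_base_update():
 * rm == 15 means no writeback, rm == 13 adds the stride,
 * any other rm adds the value held in that register. */
static uint32_t ls_base_update(uint32_t base, int rm, uint32_t rm_value,
                               int stride)
{
    if (rm == 15) {
        return base;
    }
    return base + (rm == 13 ? (uint32_t)stride : rm_value);
}
```

For itype 3 (nregs 2, interleave 2, spacing 2) starting at D0, the registers are visited as D0, D2, D1, D3 and the stride is 32 bytes. The sketch leaves out the alignment and size == 3 UNDEF checks that the real function performs first.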

-[PULL 19/37] target/arm: Improve masking in arm_hcr_el2_eff
+[PULL 31/39] target/arm: Convert Neon 'load single structure to all lanes' to decodetree
-From: Richard Henderson <richard.henderson@linaro.org>
+Convert the Neon "load single structure to all lanes" insns to
 decodetree.
-Update the {TGE,E2H} == '11' masking to ARMv8.6.
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-If EL2 is configured for aarch32, disable all of
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-the bits that are RES0 in aarch32 mode.
+Message-id: 20200430181003.21682-13-peter.maydell@linaro.org
 ---
  target/arm/neon-ls.decode       |  5 +++
  target/arm/translate-neon.inc.c | 73 +++++++++++++++++++++++++++++++++
  target/arm/translate.c          | 55 +------------------------
 files changed, 80 insertions(+), 53 deletions(-)
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+diff --git a/target/arm/neon-ls.decode b/target/arm/neon-ls.decode
 Message-id: 20200229012811.24129-6-richard.henderson@linaro.org
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
  target/arm/helper.c | 31 +++++++++++++++++++++++++++----
 file changed, 27 insertions(+), 4 deletions(-)
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
+--- a/target/arm/neon-ls.decode
-+++ b/target/arm/helper.c
++++ b/target/arm/neon-ls.decode
-@@ -XXX,XX +XXX,XX @@ uint64_t arm_hcr_el2_eff(CPUARMState *env)
+@@ -XXX,XX +XXX,XX @@
-          * Since the v8.4 language applies to the entire register, and
-          * appears to be backward compatible, use that.
+ VLDST_multiple 1111 0100 0 . l:1 0 rn:4 .... itype:4 size:2 align:2 rm:4 \
-          */
+                vd=%vd_dp
--        ret = 0;
++
--    } else if (ret & HCR_TGE) {
++# Neon load single element to all lanes
--        /* These bits are up-to-date as of ARMv8.4.  */
++
-+        return 0;
++VLD_all_lanes  1111 0100 1 . 1 0 rn:4 .... 11 n:2 size:2 t:1 a:1 rm:4 \
 +               vd=%vd_dp
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VLDST_multiple(DisasContext *s, arg_VLDST_multiple *a)
      gen_neon_ldst_base_update(s, a->rm, a->rn, nregs * interleave * 8);
      return true;
  }
 +
 +static bool trans_VLD_all_lanes(DisasContext *s, arg_VLD_all_lanes *a)
 +{
 +    /* Neon load single structure to all lanes */
 +    int reg, stride, vec_size;
 +    int vd = a->vd;
 +    int size = a->size;
 +    int nregs = a->n + 1;
 +    TCGv_i32 addr, tmp;
 +
 +    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
 +        return false;
 +    }
 +
 +    /* UNDEF accesses to D16-D31 if they don't exist */
 +    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vd & 0x10)) {
 +        return false;
 +    }
 +
 +    if (size == 3) {
 +        if (nregs != 4 || a->a == 0) {
 +            return false;
 +        }
 +        /* For VLD4 size == 3 a == 1 means 32 bits at 16 byte alignment */
 +        size = 2;
 +    }
 +    if (nregs == 1 && a->a == 1 && size == 0) {
 +        return false;
 +    }
 +    if (nregs == 3 && a->a == 1) {
 +        return false;
 +    }
 +
 +    if (!vfp_access_check(s)) {
 +        return true;
 +    }
 +
 +    /*
-+     * For a cpu that supports both aarch64 and aarch32, we can set bits
++     * VLD1 to all lanes: T bit indicates how many Dregs to write.
-+     * in HCR_EL2 (e.g. via EL3) that are RES0 when we enter EL2 as aa32.
++     * VLD2/3/4 to all lanes: T bit indicates register stride.
 +     * Ignore all of the bits in HCR+HCR2 that are not valid for aarch32.
 +     */
-+    if (!arm_el_is_aa64(env, 2)) {
++    stride = a->t ? 2 : 1;
-+        uint64_t aa32_valid;
++    vec_size = nregs == 1 ? stride * 8 : 8;
 +
-+        /*
++    tmp = tcg_temp_new_i32();
-+         * These bits are up-to-date as of ARMv8.6.
++    addr = tcg_temp_new_i32();
-+         * For HCR, it's easiest to list just the 2 bits that are invalid.
++    load_reg_var(s, addr, a->rn);
-+         * For HCR2, list those that are valid.
++    for (reg = 0; reg < nregs; reg++) {
-+         */
++        gen_aa32_ld_i32(s, tmp, addr, get_mem_index(s),
-+        aa32_valid = MAKE_64BIT_MASK(0, 32) & ~(HCR_RW | HCR_TDZ);
++                        s->be_data | size);
-+        aa32_valid |= (HCR_CD | HCR_ID | HCR_TERR | HCR_TEA | HCR_MIOCNCE |
++        if ((vd & 1) && vec_size == 16) {
-+                       HCR_TID4 | HCR_TICAB | HCR_TOCU | HCR_TTLBIS);
++            /*
-+        ret &= aa32_valid;
++             * We cannot write 16 bytes at once because the
 +             * destination is unaligned.
 +             */
 +            tcg_gen_gvec_dup_i32(size, neon_reg_offset(vd, 0),
 +                                 8, 8, tmp);
 +            tcg_gen_gvec_mov(0, neon_reg_offset(vd + 1, 0),
 +                             neon_reg_offset(vd, 0), 8, 8);
 +        } else {
 +            tcg_gen_gvec_dup_i32(size, neon_reg_offset(vd, 0),
 +                                 vec_size, vec_size, tmp);
 +        }
 +        tcg_gen_addi_i32(addr, addr, 1 << size);
 +        vd += stride;
 +    }
++    tcg_temp_free_i32(tmp);
++    tcg_temp_free_i32(addr);
 +
-+    if (ret & HCR_TGE) {
++    gen_neon_ldst_base_update(s, a->rm, a->rn, (1 << size) * nregs);
-+        /* These bits are up-to-date as of ARMv8.6.  */
++
-         if (ret & HCR_E2H) {
++    return true;
-             ret &= ~(HCR_VM | HCR_FMO | HCR_IMO | HCR_AMO |
++}
-                      HCR_BSU_MASK | HCR_DC | HCR_TWI | HCR_TWE |
+diff --git a/target/arm/translate.c b/target/arm/translate.c
-                      HCR_TID0 | HCR_TID2 | HCR_TPCP | HCR_TPU |
+index XXXXXXX..XXXXXXX 100644
--                     HCR_TDZ | HCR_CD | HCR_ID | HCR_MIOCNCE);
+--- a/target/arm/translate.c
-+                     HCR_TDZ | HCR_CD | HCR_ID | HCR_MIOCNCE |
++++ b/target/arm/translate.c
-+                     HCR_TID4 | HCR_TICAB | HCR_TOCU | HCR_ENSCXT |
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
-+                     HCR_TTLBIS | HCR_TTLBOS | HCR_TID5);
+     int size;
      int reg;
      int load;
 -    int vec_size;
      TCGv_i32 addr;
      TCGv_i32 tmp;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
      } else {
          size = (insn >> 10) & 3;
          if (size == 3) {
 -            /* Load single element to all lanes.  */
 -            int a = (insn >> 4) & 1;
 -            if (!load) {
 -                return 1;
 -            }
 -            size = (insn >> 6) & 3;
 -            nregs = ((insn >> 8) & 3) + 1;
 -
 -            if (size == 3) {
 -                if (nregs != 4 || a == 0) {
 -                    return 1;
 -                }
 -                /* For VLD4 size==3 a == 1 means 32 bits at 16 byte alignment */
 -                size = 2;
 -            }
 -            if (nregs == 1 && a == 1 && size == 0) {
 -                return 1;
 -            }
 -            if (nregs == 3 && a == 1) {
 -                return 1;
 -            }
 -            addr = tcg_temp_new_i32();
 -            load_reg_var(s, addr, rn);
 -
 -            /* VLD1 to all lanes: bit 5 indicates how many Dregs to write.
 -             * VLD2/3/4 to all lanes: bit 5 indicates register stride.
 -             */
 -            stride = (insn & (1 << 5)) ? 2 : 1;
 -            vec_size = nregs == 1 ? stride * 8 : 8;
 -
 -            tmp = tcg_temp_new_i32();
 -            for (reg = 0; reg < nregs; reg++) {
 -                gen_aa32_ld_i32(s, tmp, addr, get_mem_index(s),
 -                                s->be_data | size);
 -                if ((rd & 1) && vec_size == 16) {
 -                    /* We cannot write 16 bytes at once because the
 -                     * destination is unaligned.
 -                     */
 -                    tcg_gen_gvec_dup_i32(size, neon_reg_offset(rd, 0),
 -                                         8, 8, tmp);
 -                    tcg_gen_gvec_mov(0, neon_reg_offset(rd + 1, 0),
 -                                     neon_reg_offset(rd, 0), 8, 8);
 -                } else {
 -                    tcg_gen_gvec_dup_i32(size, neon_reg_offset(rd, 0),
 -                                         vec_size, vec_size, tmp);
 -                }
 -                tcg_gen_addi_i32(addr, addr, 1 << size);
 -                rd += stride;
 -            }
 -            tcg_temp_free_i32(tmp);
 -            tcg_temp_free_i32(addr);
 -            stride = (1 << size) * nregs;
 +            /* Load single element to all lanes -- handled by decodetree  */
 +            return 1;
          } else {
-             ret |= HCR_FMO | HCR_IMO | HCR_AMO;
+             /* Single element.  */
-         }
+             int idx = (insn >> 4) & 0xf;
 --
 .20.1
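
The field-level checks in trans_VLD_all_lanes() can be tried in isolation. The sketch below restates them as a pure function over the decoded n, size, t and a fields, assuming the same meanings as in the VLD_all_lanes pattern. The struct and function names are hypothetical, and the feature checks (Neon present, D16-D31 present) and the FP access check are deliberately left out.

```c
#include <stdbool.h>

/* Hypothetical summary of what the translator would do for one insn. */
struct vld_all_lanes_plan {
    bool undef;     /* true if the encoding is treated as UNDEF */
    int size;       /* log2 of the element size in bytes, after adjustment */
    int nregs;      /* number of structure elements loaded */
    int stride;     /* D register step between elements */
    int vec_size;   /* bytes written per element replication */
    int writeback;  /* immediate post-index amount in bytes */
};

static struct vld_all_lanes_plan plan_vld_all_lanes(int n, int size,
                                                    int t, int a)
{
    struct vld_all_lanes_plan p = { .undef = true };
    int nregs = n + 1;

    if (size == 3) {
        if (nregs != 4 || a == 0) {
            return p;
        }
        /* For VLD4, size == 3 with a == 1 means 32 bits at 16 byte alignment */
        size = 2;
    }
    if (nregs == 1 && a == 1 && size == 0) {
        return p;
    }
    if (nregs == 3 && a == 1) {
        return p;
    }

    p.undef = false;
    p.size = size;
    p.nregs = nregs;
    /* VLD1: T selects one or two D registers; VLD2/3/4: T is the stride */
    p.stride = t ? 2 : 1;
    p.vec_size = nregs == 1 ? p.stride * 8 : 8;
    p.writeback = (1 << size) * nregs;
    return p;
}
```

For example, n = 3, size = 3, a = 1 is accepted as a VLD4 of 32-bit elements with a 16 byte writeback, whereas the same encoding with a = 0 is rejected.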

-[PULL 23/37] target/arm: Honor the HCR_EL2.TPCP bit
+[PULL 32/39] target/arm: Convert Neon 'load/store single structure' to decodetree
-From: Richard Henderson <richard.henderson@linaro.org>
+Convert the Neon "load/store single structure to one lane" insns to
+decodetree.
-This bit traps EL1 access to cache maintenance insns that operate
-to the point of coherency or persistence.
+As this is the last set of insns in the neon load/store group,
+we can remove the whole disas_neon_ls_insn() function.
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
 Message-id: 20200229012811.24129-10-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20200430181003.21682-14-peter.maydell@linaro.org
 ---
- target/arm/helper.c | 39 +++++++++++++++++++++++++++++++--------
+ target/arm/neon-ls.decode       |  11 +++
-file changed, 31 insertions(+), 8 deletions(-)
+ target/arm/translate-neon.inc.c |  89 +++++++++++++++++++
+ target/arm/translate.c          | 147 --------------------------------
-diff --git a/target/arm/helper.c b/target/arm/helper.c
+files changed, 100 insertions(+), 147 deletions(-)
 diff --git a/target/arm/neon-ls.decode b/target/arm/neon-ls.decode
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
+--- a/target/arm/neon-ls.decode
-+++ b/target/arm/helper.c
++++ b/target/arm/neon-ls.decode
-@@ -XXX,XX +XXX,XX @@ static CPAccessResult aa64_cacheop_access(CPUARMState *env,
+@@ -XXX,XX +XXX,XX @@ VLDST_multiple 1111 0100 0 . l:1 0 rn:4 .... itype:4 size:2 align:2 rm:4 \
-     return CP_ACCESS_OK;
  VLD_all_lanes  1111 0100 1 . 1 0 rn:4 .... 11 n:2 size:2 t:1 a:1 rm:4 \
                 vd=%vd_dp
 +
 +# Neon load/store single structure to one lane
 +%imm1_5_p1 5:1 !function=plus1
 +%imm1_6_p1 6:1 !function=plus1
 +
 +VLDST_single   1111 0100 1 . l:1 0 rn:4 .... 00 n:2 reg_idx:3 align:1 rm:4 \
 +               vd=%vd_dp size=0 stride=1
 +VLDST_single   1111 0100 1 . l:1 0 rn:4 .... 01 n:2 reg_idx:2 align:2 rm:4 \
 +               vd=%vd_dp size=1 stride=%imm1_5_p1
 +VLDST_single   1111 0100 1 . l:1 0 rn:4 .... 10 n:2 reg_idx:1 align:3 rm:4 \
 +               vd=%vd_dp size=2 stride=%imm1_6_p1
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@
   * It might be possible to convert it to a standalone .c file eventually.
   */
 +static inline int plus1(DisasContext *s, int x)
 +{
 +    return x + 1;
 +}
 +
  /* Include the generated Neon decoder */
  #include "decode-neon-dp.inc.c"
  #include "decode-neon-ls.inc.c"
@@ -XXX,XX +XXX,XX @@ static bool trans_VLD_all_lanes(DisasContext *s, arg_VLD_all_lanes *a)
      return true;
  }
++
-+static CPAccessResult aa64_cacheop_poc_access(CPUARMState *env,
++static bool trans_VLDST_single(DisasContext *s, arg_VLDST_single *a)
 +                                              const ARMCPRegInfo *ri,
 +                                              bool isread)
 +{
-+    /* Cache invalidate/clean to Point of Coherency or Persistence...  */
++    /* Neon load/store single structure to one lane */
-+    switch (arm_current_el(env)) {
++    int reg;
-+    case 0:
++    int nregs = a->n + 1;
-+        /* ... EL0 must UNDEF unless SCTLR_EL1.UCI is set.  */
++    int vd = a->vd;
-+        if (!(arm_sctlr(env, 0) & SCTLR_UCI)) {
++    TCGv_i32 addr, tmp;
-+            return CP_ACCESS_TRAP;
++
 +    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
 +        return false;
 +    }
 +
 +    /* UNDEF accesses to D16-D31 if they don't exist */
 +    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vd & 0x10)) {
 +        return false;
 +    }
 +
 +    /* Catch the UNDEF cases. This is unavoidably a bit messy. */
 +    switch (nregs) {
 +    case 1:
 +        if (((a->align & (1 << a->size)) != 0) ||
 +            (a->size == 2 && ((a->align & 3) == 1 || (a->align & 3) == 2))) {
 +            return false;
 +        }
 +        break;
 +    case 3:
 +        if ((a->align & 1) != 0) {
 +            return false;
 +        }
 +        /* fall through */
-+    case 1:
++    case 2:
-+        /* ... EL1 must trap to EL2 if HCR_EL2.TPCP is set.  */
++        if (a->size == 2 && (a->align & 2) != 0) {
-+        if (arm_hcr_el2_eff(env) & HCR_TPCP) {
++            return false;
 +            return CP_ACCESS_TRAP_EL2;
 +        }
 +        break;
-+    }
++    case 4:
-+    return CP_ACCESS_OK;
++        if ((a->size == 2) && ((a->align & 3) == 3)) {
 +            return false;
 +        }
 +        break;
 +    default:
 +        abort();
 +    }
 +    if ((vd + a->stride * (nregs - 1)) > 31) {
 +        /*
 +         * Attempts to write off the end of the register file are
 +         * UNPREDICTABLE; we choose to UNDEF because otherwise we would
 +         * access off the end of the array that holds the register data.
 +         */
 +        return false;
 +    }
 +
 +    if (!vfp_access_check(s)) {
 +        return true;
 +    }
 +
 +    tmp = tcg_temp_new_i32();
 +    addr = tcg_temp_new_i32();
 +    load_reg_var(s, addr, a->rn);
 +    /*
 +     * TODO: if we implemented alignment exceptions, we should check
 +     * addr against the alignment encoded in a->align here.
 +     */
 +    for (reg = 0; reg < nregs; reg++) {
 +        if (a->l) {
 +            gen_aa32_ld_i32(s, tmp, addr, get_mem_index(s),
 +                            s->be_data | a->size);
 +            neon_store_element(vd, a->reg_idx, a->size, tmp);
 +        } else { /* Store */
 +            neon_load_element(tmp, vd, a->reg_idx, a->size);
 +            gen_aa32_st_i32(s, tmp, addr, get_mem_index(s),
 +                            s->be_data | a->size);
 +        }
 +        vd += a->stride;
 +        tcg_gen_addi_i32(addr, addr, 1 << a->size);
 +    }
 +    tcg_temp_free_i32(addr);
 +    tcg_temp_free_i32(tmp);
 +
 +    gen_neon_ldst_base_update(s, a->rm, a->rn, (1 << a->size) * nregs);
 +
 +    return true;
 +}
-+
+diff --git a/target/arm/translate.c b/target/arm/translate.c
- /* See: D4.7.2 TLB maintenance requirements and the TLB maintenance instructions
+index XXXXXXX..XXXXXXX 100644
-  * Page D4-1736 (DDI0487A.b)
+--- a/target/arm/translate.c
-  */
++++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
+@@ -XXX,XX +XXX,XX @@ static void gen_neon_trn_u16(TCGv_i32 t0, TCGv_i32 t1)
-       .accessfn = aa64_cacheop_access },
+     tcg_temp_free_i32(rd);
-     { .name = "DC_IVAC", .state = ARM_CP_STATE_AA64,
+ }
-       .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 6, .opc2 = 1,
--      .access = PL1_W, .type = ARM_CP_NOP },
+-
-+      .access = PL1_W, .accessfn = aa64_cacheop_poc_access,
+-/* Translate a NEON load/store element instruction.  Return nonzero if the
-+      .type = ARM_CP_NOP },
+-   instruction is invalid.  */
-     { .name = "DC_ISW", .state = ARM_CP_STATE_AA64,
+-static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
-       .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 6, .opc2 = 2,
+-{
-       .access = PL1_W, .accessfn = access_tsw, .type = ARM_CP_NOP },
+-    int rd, rn, rm;
-     { .name = "DC_CVAC", .state = ARM_CP_STATE_AA64,
+-    int nregs;
-       .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 10, .opc2 = 1,
+-    int stride;
-       .access = PL0_W, .type = ARM_CP_NOP,
+-    int size;
--      .accessfn = aa64_cacheop_access },
+-    int reg;
-+      .accessfn = aa64_cacheop_poc_access },
+-    int load;
-     { .name = "DC_CSW", .state = ARM_CP_STATE_AA64,
+-    TCGv_i32 addr;
-       .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 10, .opc2 = 2,
+-    TCGv_i32 tmp;
-       .access = PL1_W, .accessfn = access_tsw, .type = ARM_CP_NOP },
+-
-@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
+-    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
-     { .name = "DC_CIVAC", .state = ARM_CP_STATE_AA64,
+-        return 1;
-       .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 14, .opc2 = 1,
+-    }
-       .access = PL0_W, .type = ARM_CP_NOP,
+-
--      .accessfn = aa64_cacheop_access },
+-    /* FIXME: this access check should not take precedence over UNDEF
-+      .accessfn = aa64_cacheop_poc_access },
+-     * for invalid encodings; we will generate incorrect syndrome information
-     { .name = "DC_CISW", .state = ARM_CP_STATE_AA64,
+-     * for attempts to execute invalid vfp/neon encodings with FP disabled.
-       .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 14, .opc2 = 2,
+-     */
-       .access = PL1_W, .accessfn = access_tsw, .type = ARM_CP_NOP },
+-    if (s->fp_excp_el) {
-@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
+-        gen_exception_insn(s, s->pc_curr, EXCP_UDEF,
-     { .name = "BPIMVA", .cp = 15, .opc1 = 0, .crn = 7, .crm = 5, .opc2 = 7,
+-                           syn_simd_access_trap(1, 0xe, false), s->fp_excp_el);
-       .type = ARM_CP_NOP, .access = PL1_W },
+-        return 0;
-     { .name = "DCIMVAC", .cp = 15, .opc1 = 0, .crn = 7, .crm = 6, .opc2 = 1,
+-    }
--      .type = ARM_CP_NOP, .access = PL1_W },
+-
-+      .type = ARM_CP_NOP, .access = PL1_W, .accessfn = aa64_cacheop_poc_access },
+-    if (!s->vfp_enabled)
-     { .name = "DCISW", .cp = 15, .opc1 = 0, .crn = 7, .crm = 6, .opc2 = 2,
+-      return 1;
-       .type = ARM_CP_NOP, .access = PL1_W, .accessfn = access_tsw },
+-    VFP_DREG_D(rd, insn);
-     { .name = "DCCMVAC", .cp = 15, .opc1 = 0, .crn = 7, .crm = 10, .opc2 = 1,
+-    rn = (insn >> 16) & 0xf;
--      .type = ARM_CP_NOP, .access = PL1_W },
+-    rm = insn & 0xf;
-+      .type = ARM_CP_NOP, .access = PL1_W, .accessfn = aa64_cacheop_poc_access },
+-    load = (insn & (1 << 21)) != 0;
-     { .name = "DCCSW", .cp = 15, .opc1 = 0, .crn = 7, .crm = 10, .opc2 = 2,
+-    if ((insn & (1 << 23)) == 0) {
-       .type = ARM_CP_NOP, .access = PL1_W, .accessfn = access_tsw },
+-        /* Load store all elements -- handled already by decodetree */
-     { .name = "DCCMVAU", .cp = 15, .opc1 = 0, .crn = 7, .crm = 11, .opc2 = 1,
+-        return 1;
-       .type = ARM_CP_NOP, .access = PL1_W },
+-    } else {
-     { .name = "DCCIMVAC", .cp = 15, .opc1 = 0, .crn = 7, .crm = 14, .opc2 = 1,
+-        size = (insn >> 10) & 3;
--      .type = ARM_CP_NOP, .access = PL1_W },
+-        if (size == 3) {
-+      .type = ARM_CP_NOP, .access = PL1_W, .accessfn = aa64_cacheop_poc_access },
+-            /* Load single element to all lanes -- handled by decodetree  */
-     { .name = "DCCISW", .cp = 15, .opc1 = 0, .crn = 7, .crm = 14, .opc2 = 2,
+-            return 1;
-       .type = ARM_CP_NOP, .access = PL1_W, .accessfn = access_tsw },
+-        } else {
-     /* MMU Domain access control / MPU write buffer control */
+-            /* Single element.  */
-@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo dcpop_reg[] = {
+-            int idx = (insn >> 4) & 0xf;
-     { .name = "DC_CVAP", .state = ARM_CP_STATE_AA64,
+-            int reg_idx;
-       .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 12, .opc2 = 1,
+-            switch (size) {
-       .access = PL0_W, .type = ARM_CP_NO_RAW | ARM_CP_SUPPRESS_TB_END,
+-            case 0:
--      .accessfn = aa64_cacheop_access, .writefn = dccvap_writefn },
+-                reg_idx = (insn >> 5) & 7;
-+      .accessfn = aa64_cacheop_poc_access, .writefn = dccvap_writefn },
+-                stride = 1;
-     REGINFO_SENTINEL
+-                break;
- };
+-            case 1:
+-                reg_idx = (insn >> 6) & 3;
-@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo dcpodp_reg[] = {
+-                stride = (insn & (1 << 5)) ? 2 : 1;
-     { .name = "DC_CVADP", .state = ARM_CP_STATE_AA64,
+-                break;
-       .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 13, .opc2 = 1,
+-            case 2:
-       .access = PL0_W, .type = ARM_CP_NO_RAW | ARM_CP_SUPPRESS_TB_END,
+-                reg_idx = (insn >> 7) & 1;
--      .accessfn = aa64_cacheop_access, .writefn = dccvap_writefn },
+-                stride = (insn & (1 << 6)) ? 2 : 1;
-+      .accessfn = aa64_cacheop_poc_access, .writefn = dccvap_writefn },
+-                break;
-     REGINFO_SENTINEL
+-            default:
- };
+-                abort();
- #endif /*CONFIG_USER_ONLY*/
+-            }
 -            nregs = ((insn >> 8) & 3) + 1;
 -            /* Catch the UNDEF cases. This is unavoidably a bit messy. */
 -            switch (nregs) {
 -            case 1:
 -                if (((idx & (1 << size)) != 0) ||
 -                    (size == 2 && ((idx & 3) == 1 || (idx & 3) == 2))) {
 -                    return 1;
 -                }
 -                break;
 -            case 3:
 -                if ((idx & 1) != 0) {
 -                    return 1;
 -                }
 -                /* fall through */
 -            case 2:
 -                if (size == 2 && (idx & 2) != 0) {
 -                    return 1;
 -                }
 -                break;
 -            case 4:
 -                if ((size == 2) && ((idx & 3) == 3)) {
 -                    return 1;
 -                }
 -                break;
 -            default:
 -                abort();
 -            }
 -            if ((rd + stride * (nregs - 1)) > 31) {
 -                /* Attempts to write off the end of the register file
 -                 * are UNPREDICTABLE; we choose to UNDEF because otherwise
 -                 * the neon_load_reg() would write off the end of the array.
 -                 */
 -                return 1;
 -            }
 -            tmp = tcg_temp_new_i32();
 -            addr = tcg_temp_new_i32();
 -            load_reg_var(s, addr, rn);
 -            for (reg = 0; reg < nregs; reg++) {
 -                if (load) {
 -                    gen_aa32_ld_i32(s, tmp, addr, get_mem_index(s),
 -                                    s->be_data | size);
 -                    neon_store_element(rd, reg_idx, size, tmp);
 -                } else { /* Store */
 -                    neon_load_element(tmp, rd, reg_idx, size);
 -                    gen_aa32_st_i32(s, tmp, addr, get_mem_index(s),
 -                                    s->be_data | size);
 -                }
 -                rd += stride;
 -                tcg_gen_addi_i32(addr, addr, 1 << size);
 -            }
 -            tcg_temp_free_i32(addr);
 -            tcg_temp_free_i32(tmp);
 -            stride = nregs * (1 << size);
 -        }
 -    }
 -    if (rm != 15) {
 -        TCGv_i32 base;
 -
 -        base = load_reg(s, rn);
 -        if (rm == 13) {
 -            tcg_gen_addi_i32(base, base, stride);
 -        } else {
 -            TCGv_i32 index;
 -            index = load_reg(s, rm);
 -            tcg_gen_add_i32(base, base, index);
 -            tcg_temp_free_i32(index);
 -        }
 -        store_reg(s, rn, base);
 -    }
 -    return 0;
 -}
 -
  static inline void gen_neon_narrow(int size, TCGv_i32 dest, TCGv_i64 src)
  {
      switch (size) {
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
              }
              return;
          }
 -        if ((insn & 0x0f100000) == 0x04000000) {
 -            /* NEON load/store.  */
 -            if (disas_neon_ls_insn(s, insn)) {
 -                goto illegal_op;
 -            }
 -            return;
 -        }
          if ((insn & 0x0e000f00) == 0x0c000100) {
              if (arm_dc_feature(s, ARM_FEATURE_IWMMXT)) {
                  /* iWMMXt register transfer.  */
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
          }
          break;
      case 12:
 -        if ((insn & 0x01100000) == 0x01000000) {
 -            if (disas_neon_ls_insn(s, insn)) {
 -                goto illegal_op;
 -            }
 -            break;
 -        }
          goto illegal_op;
      default:
      illegal_op:
 --
 .20.1
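
Reviewer note: the UNDEF checks in trans_VLDST_single() above are easier to follow as a standalone predicate. The following is a minimal sketch, not code from the series. The function names vldst_single_encoding_ok() and vldst_single_regs_ok() are hypothetical, and the arguments stand in for nregs, a->size, a->align, a->vd and a->stride as used in the patch (a->align plays the role of idx in the removed code).

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Illustrative only: mirrors the switch on nregs in trans_VLDST_single().
 * Returns true when the (nregs, size, align) combination is a valid
 * encoding, false when the patch would return false (UNDEF).
 */
static bool vldst_single_encoding_ok(int nregs, int size, int align)
{
    assert(nregs >= 1 && nregs <= 4);

    switch (nregs) {
    case 1:
        if ((align & (1 << size)) != 0) {
            return false;
        }
        if (size == 2 && ((align & 3) == 1 || (align & 3) == 2)) {
            return false;
        }
        return true;
    case 3:
        if ((align & 1) != 0) {
            return false;
        }
        /* fall through: the 3-register form shares the 2-register check */
    case 2:
        return !(size == 2 && (align & 2) != 0);
    case 4:
        return !(size == 2 && (align & 3) == 3);
    }
    return false;
}

/*
 * Illustrative only: the register-file bound check. The last register
 * touched is vd + stride * (nregs - 1), which must not exceed D31.
 */
static bool vldst_single_regs_ok(int vd, int stride, int nregs)
{
    return vd + stride * (nregs - 1) <= 31;
}
```

For example, vldst_single_regs_ok(28, 2, 3) is false because the third register would be D32, which is the case the patch chooses to UNDEF.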

-[PULL 14/37] hw/timer/cadence_ttc: move timer_new from init() into realize() to avoid memleaks
+[PULL 33/39] target/arm: Convert Neon 3-reg-same VADD/VSUB to decodetree
-From: Pan Nengyuan <pannengyuan@huawei.com>
+Convert the Neon 3-reg-same VADD and VSUB insns to decodetree.
-There are some memleaks when we call 'device_list_properties'. This patch move timer_new from init into realize to fix it.
+Note that we don't need the neon_3r_sizes[op] check here because all
 size values are OK for VADD and VSUB; we'll add this when we convert
 the first insn that has size restrictions.
-Reported-by: Euler Robot <euler.robot@huawei.com>
+For this we need one of the GVecGen*Fn typedefs currently in
-Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com>
+translate-a64.h; move them all to translate.h as a block so they
-Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
+are visible to the 32-bit decoder.
-Message-id: 20200227025055.14341-7-pannengyuan@huawei.com
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20200430181003.21682-15-peter.maydell@linaro.org
 ---
- hw/timer/cadence_ttc.c | 18 ++++++++++++------
+ target/arm/translate-a64.h      |  9 --------
-file changed, 12 insertions(+), 6 deletions(-)
+ target/arm/translate.h          |  9 ++++++++
  target/arm/neon-dp.decode       | 17 +++++++++++++++
  target/arm/translate-neon.inc.c | 38 +++++++++++++++++++++++++++++++++
  target/arm/translate.c          | 14 ++++--------
 files changed, 68 insertions(+), 19 deletions(-)
-diff --git a/hw/timer/cadence_ttc.c b/hw/timer/cadence_ttc.c
+diff --git a/target/arm/translate-a64.h b/target/arm/translate-a64.h
 index XXXXXXX..XXXXXXX 100644
---- a/hw/timer/cadence_ttc.c
+--- a/target/arm/translate-a64.h
-+++ b/hw/timer/cadence_ttc.c
++++ b/target/arm/translate-a64.h
-@@ -XXX,XX +XXX,XX @@ static void cadence_timer_init(uint32_t freq, CadenceTimerState *s)
+@@ -XXX,XX +XXX,XX @@ static inline int vec_full_reg_size(DisasContext *s)
- static void cadence_ttc_init(Object *obj)
- {
+ bool disas_sve(DisasContext *, uint32_t);
-     CadenceTTCState *s = CADENCE_TTC(obj);
--    int i;
+-/* Note that the gvec expanders operate on offsets + sizes.  */
 -typedef void GVecGen2Fn(unsigned, uint32_t, uint32_t, uint32_t, uint32_t);
 -typedef void GVecGen2iFn(unsigned, uint32_t, uint32_t, int64_t,
 -                         uint32_t, uint32_t);
 -typedef void GVecGen3Fn(unsigned, uint32_t, uint32_t,
 -                        uint32_t, uint32_t, uint32_t);
 -typedef void GVecGen4Fn(unsigned, uint32_t, uint32_t, uint32_t,
 -                        uint32_t, uint32_t, uint32_t);
 -
--    for (i = 0; i < 3; ++i) {
+ #endif /* TARGET_ARM_TRANSLATE_A64_H */
--        cadence_timer_init(133000000, &s->timer[i]);
+diff --git a/target/arm/translate.h b/target/arm/translate.h
--        sysbus_init_irq(SYS_BUS_DEVICE(obj), &s->timer[i].irq);
+index XXXXXXX..XXXXXXX 100644
--    }
+--- a/target/arm/translate.h
++++ b/target/arm/translate.h
-     memory_region_init_io(&s->iomem, obj, &cadence_ttc_ops, s,
+@@ -XXX,XX +XXX,XX @@ void gen_sshl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
-                           "timer", 0x1000);
+ #define dc_isar_feature(name, ctx) \
-     sysbus_init_mmio(SYS_BUS_DEVICE(obj), &s->iomem);
+     ({ DisasContext *ctx_ = (ctx); isar_feature_##name(ctx_->isar); })
 +/* Note that the gvec expanders operate on offsets + sizes.  */
 +typedef void GVecGen2Fn(unsigned, uint32_t, uint32_t, uint32_t, uint32_t);
 +typedef void GVecGen2iFn(unsigned, uint32_t, uint32_t, int64_t,
 +                         uint32_t, uint32_t);
 +typedef void GVecGen3Fn(unsigned, uint32_t, uint32_t,
 +                        uint32_t, uint32_t, uint32_t);
 +typedef void GVecGen4Fn(unsigned, uint32_t, uint32_t, uint32_t,
 +                        uint32_t, uint32_t, uint32_t);
 +
  #endif /* TARGET_ARM_TRANSLATE_H */
 diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/neon-dp.decode
 +++ b/target/arm/neon-dp.decode
@@ -XXX,XX +XXX,XX @@
  #
  # This file is processed by scripts/decodetree.py
  #
 +# VFP/Neon register fields; same as vfp.decode
 +%vm_dp  5:1 0:4
 +%vn_dp  7:1 16:4
 +%vd_dp  22:1 12:4
  # Encodings for Neon data processing instructions where the T32 encoding
  # is a simple transformation of the A32 encoding.
@@ -XXX,XX +XXX,XX @@
  #   0b111p_1111_qqqq_qqqq_qqqq_qqqq_qqqq_qqqq
  # This file works on the A32 encoding only; calling code for T32 has to
  # transform the insn into the A32 version first.
 +
 +######################################################################
 +# 3-reg-same grouping:
 +# 1111 001 U 0 D sz:2 Vn:4 Vd:4 opc:4 N Q M op Vm:4
 +######################################################################
 +
 +&3same vm vn vd q size
 +
 +@3same           .... ... . . . size:2 .... .... .... . q:1 . . .... \
 +                 &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp
 +
 +VADD_3s          1111 001 0 0 . .. .... .... 1000 . . . 0 .... @3same
 +VSUB_3s          1111 001 1 0 . .. .... .... 1000 . . . 0 .... @3same
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VLDST_single(DisasContext *s, arg_VLDST_single *a)
      return true;
  }
++
-+static void cadence_ttc_realize(DeviceState *dev, Error **errp)
++static bool do_3same(DisasContext *s, arg_3same *a, GVecGen3Fn fn)
 +{
-+    CadenceTTCState *s = CADENCE_TTC(dev);
++    int vec_size = a->q ? 16 : 8;
-+    int i;
++    int rd_ofs = neon_reg_offset(a->vd, 0);
 +    int rn_ofs = neon_reg_offset(a->vn, 0);
 +    int rm_ofs = neon_reg_offset(a->vm, 0);
 +
-+    for (i = 0; i < 3; ++i) {
++    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
-+        cadence_timer_init(133000000, &s->timer[i]);
++        return false;
 +        sysbus_init_irq(SYS_BUS_DEVICE(dev), &s->timer[i].irq);
 +    }
++
++    /* UNDEF accesses to D16-D31 if they don't exist. */
++    if (!dc_isar_feature(aa32_simd_r32, s) &&
++        ((a->vd | a->vn | a->vm) & 0x10)) {
++        return false;
++    }
++
++    if ((a->vn | a->vm | a->vd) & a->q) {
++        return false;
++    }
++
++    if (!vfp_access_check(s)) {
++        return true;
++    }
++
++    fn(a->size, rd_ofs, rn_ofs, rm_ofs, vec_size, vec_size);
++    return true;
 +}
 +
- static int cadence_timer_pre_save(void *opaque)
++#define DO_3SAME(INSN, FUNC)                                            \
- {
++    static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a)        \
-     cadence_timer_sync((CadenceTimerState *)opaque);
++    {                                                                   \
-@@ -XXX,XX +XXX,XX @@ static void cadence_ttc_class_init(ObjectClass *klass, void *data)
++        return do_3same(s, a, FUNC);                                    \
-     DeviceClass *dc = DEVICE_CLASS(klass);
++    }
++
-     dc->vmsd = &vmstate_cadence_ttc;
++DO_3SAME(VADD, tcg_gen_gvec_add)
-+    dc->realize = cadence_ttc_realize;
++DO_3SAME(VSUB, tcg_gen_gvec_sub)
- }
+diff --git a/target/arm/translate.c b/target/arm/translate.c
+index XXXXXXX..XXXXXXX 100644
- static const TypeInfo cadence_ttc_info = {
+--- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
              }
              return 0;
 -        case NEON_3R_VADD_VSUB:
 -            if (u) {
 -                tcg_gen_gvec_sub(size, rd_ofs, rn_ofs, rm_ofs,
 -                                 vec_size, vec_size);
 -            } else {
 -                tcg_gen_gvec_add(size, rd_ofs, rn_ofs, rm_ofs,
 -                                 vec_size, vec_size);
 -            }
 -            return 0;
 -
          case NEON_3R_VQADD:
              tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
                             rn_ofs, rm_ofs, vec_size, vec_size,
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
              tcg_gen_gvec_3(rd_ofs, rm_ofs, rn_ofs, vec_size, vec_size,
                             u ? &ushl_op[size] : &sshl_op[size]);
              return 0;
 +
 +        case NEON_3R_VADD_VSUB:
 +            /* Already handled by decodetree */
 +            return 1;
          }
          if (size == 3) {
 --
 .20.1
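
Reviewer note: the %vm_dp, %vn_dp and %vd_dp field definitions and the @3same format in this patch can be read as plain bit extraction on the A32 encoding. The sketch below is an assumption-labelled illustration, not the output of scripts/decodetree.py. The names bits(), Args3Same, decode_3same() and q_regs_ok() are hypothetical; the bit positions are taken from the .decode lines above (a field such as "22:1 12:4" concatenates bit 22 above bits 15:12).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: extract len bits of insn starting at bit pos. */
static uint32_t bits(uint32_t insn, int pos, int len)
{
    assert(len > 0 && len < 32);
    return (insn >> pos) & ((1u << len) - 1);
}

/* Hypothetical stand-in for the generated arg_3same struct. */
typedef struct {
    int vm, vn, vd, q, size;
} Args3Same;

/* Illustrative decode of the @3same format for an A32 encoding. */
static Args3Same decode_3same(uint32_t insn)
{
    Args3Same a;

    a.vm = (int)((bits(insn, 5, 1) << 4) | bits(insn, 0, 4));   /* %vm_dp */
    a.vn = (int)((bits(insn, 7, 1) << 4) | bits(insn, 16, 4));  /* %vn_dp */
    a.vd = (int)((bits(insn, 22, 1) << 4) | bits(insn, 12, 4)); /* %vd_dp */
    a.q = (int)bits(insn, 6, 1);
    a.size = (int)bits(insn, 20, 2);
    return a;
}

/*
 * Mirrors the "(a->vn | a->vm | a->vd) & a->q" test in do_3same():
 * when Q is set, every register number must be even.
 */
static bool q_regs_ok(const Args3Same *a)
{
    return ((a->vn | a->vm | a->vd) & a->q) == 0;
}
```

As a usage example, 0xF2220844 matches the VADD_3s pattern with Q set, size 2, Vd 0, Vn 2 and Vm 4, so q_regs_ok() accepts it; setting bit 12 makes Vd odd and the check rejects it.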

-[PULL 27/37] hw/arm/cubieboard: use ARM Cortex-A8 as the default CPU in machine definition
+[PULL 34/39] target/arm: Convert Neon 3-reg-same logic ops to decodetree
-From: Niek Linnenbank <nieklinnenbank@gmail.com>
+Convert the Neon logic ops in the 3-reg-same grouping to decodetree.
 Note that for the logic ops the 'size' field forms part of their
 decode and the actual operations are always bitwise.
-The Cubieboard is a singleboard computer with an Allwinner A10 System-on-Chip [1].
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-As documented in the Allwinner A10 User Manual V1.5 [2], the SoC has an ARM
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Cortex-A8 processor. Currently the Cubieboard machine definition specifies the
+Message-id: 20200430181003.21682-16-peter.maydell@linaro.org
-ARM Cortex-A9 in its description and as the default CPU.
+---
  target/arm/neon-dp.decode       | 12 +++++++++++
  target/arm/translate-neon.inc.c | 19 +++++++++++++++++
  target/arm/translate.c          | 38 +--------------------------------
 files changed, 32 insertions(+), 37 deletions(-)
-This patch corrects the Cubieboard machine definition to use the ARM Cortex-A8.
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 The only user-visible effect is that our textual description of the
 machine was wrong, because hw/arm/allwinner-a10.c always creates a
 Cortex-A8 CPU regardless of the default value in the MachineClass struct.
  [1] http://docs.cubieboard.org/products/start#cubieboard1
  [2] https://linux-sunxi.org/File:Allwinner_A10_User_manual_V1.5.pdf
 Fixes: 8a863c8120994981a099
 Signed-off-by: Niek Linnenbank <nieklinnenbank@gmail.com>
 Message-id: 20200227220149.6845-2-nieklinnenbank@gmail.com
 Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 [note in commit message that the bug didn't have much visible effect]
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
  hw/arm/cubieboard.c | 4 ++--
 file changed, 2 insertions(+), 2 deletions(-)
 diff --git a/hw/arm/cubieboard.c b/hw/arm/cubieboard.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/cubieboard.c
+--- a/target/arm/neon-dp.decode
-+++ b/hw/arm/cubieboard.c
++++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ static void cubieboard_init(MachineState *machine)
+@@ -XXX,XX +XXX,XX @@
+ @3same           .... ... . . . size:2 .... .... .... . q:1 . . .... \
- static void cubieboard_machine_init(MachineClass *mc)
+                  &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp
- {
--    mc->desc = "cubietech cubieboard (Cortex-A9)";
++@3same_logic     .... ... . . . .. .... .... .... . q:1 .. .... \
--    mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-a9");
++                 &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp size=0
-+    mc->desc = "cubietech cubieboard (Cortex-A8)";
++
-+    mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-a8");
++VAND_3s          1111 001 0 0 . 00 .... .... 0001 ... 1 .... @3same_logic
-     mc->init = cubieboard_init;
++VBIC_3s          1111 001 0 0 . 01 .... .... 0001 ... 1 .... @3same_logic
-     mc->block_default_type = IF_IDE;
++VORR_3s          1111 001 0 0 . 10 .... .... 0001 ... 1 .... @3same_logic
-     mc->units_per_default_bus = 1;
++VORN_3s          1111 001 0 0 . 11 .... .... 0001 ... 1 .... @3same_logic
 +VEOR_3s          1111 001 1 0 . 00 .... .... 0001 ... 1 .... @3same_logic
 +VBSL_3s          1111 001 1 0 . 01 .... .... 0001 ... 1 .... @3same_logic
 +VBIT_3s          1111 001 1 0 . 10 .... .... 0001 ... 1 .... @3same_logic
 +VBIF_3s          1111 001 1 0 . 11 .... .... 0001 ... 1 .... @3same_logic
 +
  VADD_3s          1111 001 0 0 . .. .... .... 1000 . . . 0 .... @3same
  VSUB_3s          1111 001 1 0 . .. .... .... 1000 . . . 0 .... @3same
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool do_3same(DisasContext *s, arg_3same *a, GVecGen3Fn fn)
  DO_3SAME(VADD, tcg_gen_gvec_add)
  DO_3SAME(VSUB, tcg_gen_gvec_sub)
 +DO_3SAME(VAND, tcg_gen_gvec_and)
 +DO_3SAME(VBIC, tcg_gen_gvec_andc)
 +DO_3SAME(VORR, tcg_gen_gvec_or)
 +DO_3SAME(VORN, tcg_gen_gvec_orc)
 +DO_3SAME(VEOR, tcg_gen_gvec_xor)
 +
 +/* These insns are all gvec_bitsel but with the inputs in various orders. */
 +#define DO_3SAME_BITSEL(INSN, O1, O2, O3)                               \
 +    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
 +                                uint32_t rn_ofs, uint32_t rm_ofs,       \
 +                                uint32_t oprsz, uint32_t maxsz)         \
 +    {                                                                   \
 +        tcg_gen_gvec_bitsel(vece, rd_ofs, O1, O2, O3, oprsz, maxsz);    \
 +    }                                                                   \
 +    DO_3SAME(INSN, gen_##INSN##_3s)
 +
 +DO_3SAME_BITSEL(VBSL, rd_ofs, rn_ofs, rm_ofs)
 +DO_3SAME_BITSEL(VBIT, rm_ofs, rn_ofs, rd_ofs)
 +DO_3SAME_BITSEL(VBIF, rm_ofs, rd_ofs, rn_ofs)
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
              }
              return 1;
 -        case NEON_3R_LOGIC: /* Logic ops.  */
 -            switch ((u << 2) | size) {
 -            case 0: /* VAND */
 -                tcg_gen_gvec_and(0, rd_ofs, rn_ofs, rm_ofs,
 -                                 vec_size, vec_size);
 -                break;
 -            case 1: /* VBIC */
 -                tcg_gen_gvec_andc(0, rd_ofs, rn_ofs, rm_ofs,
 -                                  vec_size, vec_size);
 -                break;
 -            case 2: /* VORR */
 -                tcg_gen_gvec_or(0, rd_ofs, rn_ofs, rm_ofs,
 -                                vec_size, vec_size);
 -                break;
 -            case 3: /* VORN */
 -                tcg_gen_gvec_orc(0, rd_ofs, rn_ofs, rm_ofs,
 -                                 vec_size, vec_size);
 -                break;
 -            case 4: /* VEOR */
 -                tcg_gen_gvec_xor(0, rd_ofs, rn_ofs, rm_ofs,
 -                                 vec_size, vec_size);
 -                break;
 -            case 5: /* VBSL */
 -                tcg_gen_gvec_bitsel(MO_8, rd_ofs, rd_ofs, rn_ofs, rm_ofs,
 -                                    vec_size, vec_size);
 -                break;
 -            case 6: /* VBIT */
 -                tcg_gen_gvec_bitsel(MO_8, rd_ofs, rm_ofs, rn_ofs, rd_ofs,
 -                                    vec_size, vec_size);
 -                break;
 -            case 7: /* VBIF */
 -                tcg_gen_gvec_bitsel(MO_8, rd_ofs, rm_ofs, rd_ofs, rn_ofs,
 -                                    vec_size, vec_size);
 -                break;
 -            }
 -            return 0;
 -
          case NEON_3R_VQADD:
              tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
                             rn_ofs, rm_ofs, vec_size, vec_size,
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
              return 0;
          case NEON_3R_VADD_VSUB:
 +        case NEON_3R_LOGIC:
              /* Already handled by decodetree */
              return 1;
          }
 --
 .20.1
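
Reviewer note: the three DO_3SAME_BITSEL instances in this patch differ only in operand order. The sketch below models that on a single 64-bit value, assuming the bitsel operation computes d = (b & a) | (c & ~a) for operands (a, b, c). The helpers bitsel64(), vbsl64(), vbit64() and vbif64() are hypothetical scalar stand-ins, not TCG calls.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of bit-select: take t where sel is 1, f where sel is 0. */
static uint64_t bitsel64(uint64_t sel, uint64_t t, uint64_t f)
{
    return (t & sel) | (f & ~sel);
}

/* VBSL: operand order (rd, rn, rm); the destination is the selector. */
static uint64_t vbsl64(uint64_t d, uint64_t n, uint64_t m)
{
    return bitsel64(d, n, m);
}

/* VBIT: operand order (rm, rn, rd); insert n where m is 1. */
static uint64_t vbit64(uint64_t d, uint64_t n, uint64_t m)
{
    return bitsel64(m, n, d);
}

/* VBIF: operand order (rm, rd, rn); insert n where m is 0. */
static uint64_t vbif64(uint64_t d, uint64_t n, uint64_t m)
{
    return bitsel64(m, d, n);
}
```

Because the operation is purely bitwise, the element size does not affect the result, which is consistent with @3same_logic fixing size=0 for these insns.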

-[PULL 09/37] hw/arm/z2: Simplify since the machines are little-endian only
+[PULL 35/39] target/arm: Convert Neon 3-reg-same VMAX/VMIN to decodetree
-From: Philippe Mathieu-Daudé <philmd@redhat.com>
+Convert the Neon 3-reg-same VMAX and VMIN insns to decodetree.
-We only build the little-endian softmmu configurations. Checking
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-for big endian is pointless, remove the unused code.
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
 Message-id: 20200430181003.21682-17-peter.maydell@linaro.org
 ---
  target/arm/neon-dp.decode       |  5 +++++
  target/arm/translate-neon.inc.c | 14 ++++++++++++++
  target/arm/translate.c          | 21 ++-------------------
 files changed, 21 insertions(+), 19 deletions(-)
-Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
  hw/arm/z2.c | 8 +-------
 file changed, 1 insertion(+), 7 deletions(-)
 diff --git a/hw/arm/z2.c b/hw/arm/z2.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/z2.c
+--- a/target/arm/neon-dp.decode
-+++ b/hw/arm/z2.c
++++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ static void z2_init(MachineState *machine)
+@@ -XXX,XX +XXX,XX @@ VBSL_3s          1111 001 1 0 . 01 .... .... 0001 ... 1 .... @3same_logic
-     uint32_t sector_len = 0x10000;
+ VBIT_3s          1111 001 1 0 . 10 .... .... 0001 ... 1 .... @3same_logic
-     PXA2xxState *mpu;
+ VBIF_3s          1111 001 1 0 . 11 .... .... 0001 ... 1 .... @3same_logic
-     DriveInfo *dinfo;
--    int be;
++VMAX_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 0 .... @3same
-     void *z2_lcd;
++VMAX_U_3s        1111 001 1 0 . .. .... .... 0110 . . . 0 .... @3same
-     I2CBus *bus;
++VMIN_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 1 .... @3same
-     DeviceState *wm;
++VMIN_U_3s        1111 001 1 0 . .. .... .... 0110 . . . 1 .... @3same
-@@ -XXX,XX +XXX,XX @@ static void z2_init(MachineState *machine)
++
-     /* Setup CPU & memory */
+ VADD_3s          1111 001 0 0 . .. .... .... 1000 . . . 0 .... @3same
-     mpu = pxa270_init(address_space_mem, z2_binfo.ram_size, machine->cpu_type);
+ VSUB_3s          1111 001 1 0 . .. .... .... 1000 . . . 0 .... @3same
+diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
--#ifdef TARGET_WORDS_BIGENDIAN
+index XXXXXXX..XXXXXXX 100644
--    be = 1;
+--- a/target/arm/translate-neon.inc.c
--#else
++++ b/target/arm/translate-neon.inc.c
--    be = 0;
+@@ -XXX,XX +XXX,XX @@ DO_3SAME(VEOR, tcg_gen_gvec_xor)
--#endif
+ DO_3SAME_BITSEL(VBSL, rd_ofs, rn_ofs, rm_ofs)
-     dinfo = drive_get(IF_PFLASH, 0, 0);
+ DO_3SAME_BITSEL(VBIT, rm_ofs, rn_ofs, rd_ofs)
-     if (!pflash_cfi01_register(Z2_FLASH_BASE, "z2.flash0", Z2_FLASH_SIZE,
+ DO_3SAME_BITSEL(VBIF, rm_ofs, rd_ofs, rn_ofs)
-                                dinfo ? blk_by_legacy_dinfo(dinfo) : NULL,
++
--                               sector_len, 4, 0, 0, 0, 0, be)) {
++#define DO_3SAME_NO_SZ_3(INSN, FUNC)                                    \
-+                               sector_len, 4, 0, 0, 0, 0, 0)) {
++    static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a)        \
-         error_report("Error registering flash memory");
++    {                                                                   \
-         exit(1);
++        if (a->size == 3) {                                             \
-     }
++            return false;                                               \
 +        }                                                               \
 +        return do_3same(s, a, FUNC);                                    \
 +    }
 +
 +DO_3SAME_NO_SZ_3(VMAX_S, tcg_gen_gvec_smax)
 +DO_3SAME_NO_SZ_3(VMAX_U, tcg_gen_gvec_umax)
 +DO_3SAME_NO_SZ_3(VMIN_S, tcg_gen_gvec_smin)
 +DO_3SAME_NO_SZ_3(VMIN_U, tcg_gen_gvec_umin)
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                               rd_ofs, rn_ofs, rm_ofs, vec_size, vec_size);
              return 0;
 -        case NEON_3R_VMAX:
 -            if (u) {
 -                tcg_gen_gvec_umax(size, rd_ofs, rn_ofs, rm_ofs,
 -                                  vec_size, vec_size);
 -            } else {
 -                tcg_gen_gvec_smax(size, rd_ofs, rn_ofs, rm_ofs,
 -                                  vec_size, vec_size);
 -            }
 -            return 0;
 -        case NEON_3R_VMIN:
 -            if (u) {
 -                tcg_gen_gvec_umin(size, rd_ofs, rn_ofs, rm_ofs,
 -                                  vec_size, vec_size);
 -            } else {
 -                tcg_gen_gvec_smin(size, rd_ofs, rn_ofs, rm_ofs,
 -                                  vec_size, vec_size);
 -            }
 -            return 0;
 -
          case NEON_3R_VSHL:
              /* Note the operation is vshl vd,vm,vn */
              tcg_gen_gvec_3(rd_ofs, rm_ofs, rn_ofs, vec_size, vec_size,
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          case NEON_3R_VADD_VSUB:
          case NEON_3R_LOGIC:
 +        case NEON_3R_VMAX:
 +        case NEON_3R_VMIN:
              /* Already handled by decodetree */
              return 1;
          }
 --
 2.20.1
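The DO_3SAME_NO_SZ_3 macro above stamps out one trans_ function per insn and rejects size == 3 before calling the expander. A minimal standalone sketch of the same token-pasting pattern follows. The names DemoArgs, DEMO_NO_SZ_3, demo_smax and demo_smin are hypothetical stand-ins, and a single int8_t element replaces the real gvec expansion:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for arg_3same; only the size field matters here. */
typedef struct {
    int size;
} DemoArgs;

/* Scalar stand-ins for the vector expanders passed as FUNC. */
static int8_t demo_smax(int8_t a, int8_t b) { return a > b ? a : b; }
static int8_t demo_smin(int8_t a, int8_t b) { return a < b ? a : b; }

/*
 * Generate demo_trans_<INSN>(): returns false (no match) when
 * size == 3, otherwise applies FUNC and reports success.
 */
#define DEMO_NO_SZ_3(INSN, FUNC)                                        \
    static bool demo_trans_##INSN(const DemoArgs *a, int8_t n,          \
                                  int8_t m, int8_t *d)                  \
    {                                                                   \
        if (a->size == 3) {                                             \
            return false;                                               \
        }                                                               \
        *d = FUNC(n, m);                                                \
        return true;                                                    \
    }

DEMO_NO_SZ_3(VMAX_S, demo_smax)
DEMO_NO_SZ_3(VMIN_S, demo_smin)
```

Returning false from the generated function plays the role of "this pattern does not apply", which is how the size == 3 encodings are kept out of the min/max handlers.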

-[PULL 08/37] hw/arm/omap_sx1: Simplify since the machines are little-endian only
+[PULL 36/39] target/arm: Convert Neon 3-reg-same comparisons to decodetree
-From: Philippe Mathieu-Daudé <philmd@redhat.com>
+Convert the Neon comparison ops in the 3-reg-same grouping
 to decodetree.
-We only build the little-endian softmmu configurations. Checking
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-for big endian is pointless, remove the unused code.
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
 Message-id: 20200430181003.21682-18-peter.maydell@linaro.org
 ---
  target/arm/neon-dp.decode       |  8 ++++++++
  target/arm/translate-neon.inc.c | 22 ++++++++++++++++++++++
  target/arm/translate.c          | 23 +++--------------------
 3 files changed, 33 insertions(+), 20 deletions(-)
-Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
  hw/arm/omap_sx1.c | 11 ++---------
 1 file changed, 2 insertions(+), 9 deletions(-)
 diff --git a/hw/arm/omap_sx1.c b/hw/arm/omap_sx1.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/omap_sx1.c
+--- a/target/arm/neon-dp.decode
-+++ b/hw/arm/omap_sx1.c
++++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ static void sx1_init(MachineState *machine, const int version)
+@@ -XXX,XX +XXX,XX @@ VBSL_3s          1111 001 1 0 . 01 .... .... 0001 ... 1 .... @3same_logic
-     DriveInfo *dinfo;
+ VBIT_3s          1111 001 1 0 . 10 .... .... 0001 ... 1 .... @3same_logic
-     int fl_idx;
+ VBIF_3s          1111 001 1 0 . 11 .... .... 0001 ... 1 .... @3same_logic
-     uint32_t flash_size = flash0_size;
--    int be;
++VCGT_S_3s        1111 001 0 0 . .. .... .... 0011 . . . 0 .... @3same
++VCGT_U_3s        1111 001 1 0 . .. .... .... 0011 . . . 0 .... @3same
-     if (machine->ram_size != mc->default_ram_size) {
++VCGE_S_3s        1111 001 0 0 . .. .... .... 0011 . . . 1 .... @3same
-         char *sz = size_to_str(mc->default_ram_size);
++VCGE_U_3s        1111 001 1 0 . .. .... .... 0011 . . . 1 .... @3same
-@@ -XXX,XX +XXX,XX @@ static void sx1_init(MachineState *machine, const int version)
++
-                                 OMAP_CS2_BASE, &cs[3]);
+ VMAX_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 0 .... @3same
+ VMAX_U_3s        1111 001 1 0 . .. .... .... 0110 . . . 0 .... @3same
-     fl_idx = 0;
+ VMIN_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 1 .... @3same
--#ifdef TARGET_WORDS_BIGENDIAN
+@@ -XXX,XX +XXX,XX @@ VMIN_U_3s        1111 001 1 0 . .. .... .... 0110 . . . 1 .... @3same
--    be = 1;
--#else
+ VADD_3s          1111 001 0 0 . .. .... .... 1000 . . . 0 .... @3same
--    be = 0;
+ VSUB_3s          1111 001 1 0 . .. .... .... 1000 . . . 0 .... @3same
--#endif
++
 +VTST_3s          1111 001 0 0 . .. .... .... 1000 . . . 1 .... @3same
 +VCEQ_3s          1111 001 1 0 . .. .... .... 1000 . . . 1 .... @3same
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_3SAME_NO_SZ_3(VMAX_S, tcg_gen_gvec_smax)
  DO_3SAME_NO_SZ_3(VMAX_U, tcg_gen_gvec_umax)
  DO_3SAME_NO_SZ_3(VMIN_S, tcg_gen_gvec_smin)
  DO_3SAME_NO_SZ_3(VMIN_U, tcg_gen_gvec_umin)
 +
 +#define DO_3SAME_CMP(INSN, COND)                                        \
 +    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
 +                                uint32_t rn_ofs, uint32_t rm_ofs,       \
 +                                uint32_t oprsz, uint32_t maxsz)         \
 +    {                                                                   \
 +        tcg_gen_gvec_cmp(COND, vece, rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz); \
 +    }                                                                   \
 +    DO_3SAME_NO_SZ_3(INSN, gen_##INSN##_3s)
 +
 +DO_3SAME_CMP(VCGT_S, TCG_COND_GT)
 +DO_3SAME_CMP(VCGT_U, TCG_COND_GTU)
 +DO_3SAME_CMP(VCGE_S, TCG_COND_GE)
 +DO_3SAME_CMP(VCGE_U, TCG_COND_GEU)
 +DO_3SAME_CMP(VCEQ, TCG_COND_EQ)
 +
 +static void gen_VTST_3s(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                         uint32_t rm_ofs, uint32_t oprsz, uint32_t maxsz)
 +{
 +    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &cmtst_op[vece]);
 +}
 +DO_3SAME_NO_SZ_3(VTST, gen_VTST_3s)
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                             u ? &mls_op[size] : &mla_op[size]);
              return 0;
 -        case NEON_3R_VTST_VCEQ:
 -            if (u) { /* VCEQ */
 -                tcg_gen_gvec_cmp(TCG_COND_EQ, size, rd_ofs, rn_ofs, rm_ofs,
 -                                 vec_size, vec_size);
 -            } else { /* VTST */
 -                tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,
 -                               vec_size, vec_size, &cmtst_op[size]);
 -            }
 -            return 0;
 -
-     if ((dinfo = drive_get(IF_PFLASH, 0, fl_idx)) != NULL) {
+-        case NEON_3R_VCGT:
-         if (!pflash_cfi01_register(OMAP_CS0_BASE,
+-            tcg_gen_gvec_cmp(u ? TCG_COND_GTU : TCG_COND_GT, size,
-                                    "omap_sx1.flash0-1", flash_size,
+-                             rd_ofs, rn_ofs, rm_ofs, vec_size, vec_size);
-                                    blk_by_legacy_dinfo(dinfo),
+-            return 0;
--                                   sector_size, 4, 0, 0, 0, 0, be)) {
+-
-+                                   sector_size, 4, 0, 0, 0, 0, 0)) {
+-        case NEON_3R_VCGE:
-             fprintf(stderr, "qemu: Error registering flash memory %d.\n",
+-            tcg_gen_gvec_cmp(u ? TCG_COND_GEU : TCG_COND_GE, size,
-                            fl_idx);
+-                             rd_ofs, rn_ofs, rm_ofs, vec_size, vec_size);
-         }
+-            return 0;
-@@ -XXX,XX +XXX,XX @@ static void sx1_init(MachineState *machine, const int version)
+-
-         if (!pflash_cfi01_register(OMAP_CS1_BASE,
+         case NEON_3R_VSHL:
-                                    "omap_sx1.flash1-1", flash1_size,
+             /* Note the operation is vshl vd,vm,vn */
-                                    blk_by_legacy_dinfo(dinfo),
+             tcg_gen_gvec_3(rd_ofs, rm_ofs, rn_ofs, vec_size, vec_size,
--                                   sector_size, 4, 0, 0, 0, 0, be)) {
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-+                                   sector_size, 4, 0, 0, 0, 0, 0)) {
+         case NEON_3R_LOGIC:
-             fprintf(stderr, "qemu: Error registering flash memory %d.\n",
+         case NEON_3R_VMAX:
-                            fl_idx);
+         case NEON_3R_VMIN:
 +        case NEON_3R_VTST_VCEQ:
 +        case NEON_3R_VCGT:
 +        case NEON_3R_VCGE:
              /* Already handled by decodetree */
              return 1;
          }
 --
 2.20.1
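Each neon-dp.decode line in the patch above fixes some instruction bits and leaves the dotted ones free, with @3same pulling size from bits [21:20] and q from bit 6. As a rough illustration only (the real decoder is generated from the .decode file, and Demo3Same, demo_match and the *_BITS constants are hypothetical names), the four comparison patterns reduce to a mask-and-compare followed by field extraction:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical illustration type holding two of the @3same fields. */
typedef struct {
    unsigned size;  /* element size, insn bits [21:20] */
    unsigned q;     /* Q bit, insn bit 6 */
} Demo3Same;

/* Bits fixed by the patterns: [31:23], [11:8] and bit 4. */
#define CMP_MASK    0xff800f10u
#define VCGT_S_BITS 0xf2000300u  /* 1111 001 0 0 ... 0011 ... 0 */
#define VCGT_U_BITS 0xf3000300u  /* 1111 001 1 0 ... 0011 ... 0 */
#define VCGE_S_BITS 0xf2000310u  /* 1111 001 0 0 ... 0011 ... 1 */
#define VCGE_U_BITS 0xf3000310u  /* 1111 001 1 0 ... 0011 ... 1 */

/* Return true and fill *a if insn matches the given fixed bits. */
static bool demo_match(uint32_t insn, uint32_t bits, Demo3Same *a)
{
    if ((insn & CMP_MASK) != bits) {
        return false;
    }
    a->size = (insn >> 20) & 3;
    a->q = (insn >> 6) & 1;
    return true;
}
```

For example, the word 0xf3200340 matches only VCGT_U_BITS and yields size = 2, q = 1; bit 24 (U) is what separates the signed and unsigned patterns.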

-[PULL 07/37] hw/arm/mainstone: Simplify since the machines are little-endian only
+[PULL 37/39] target/arm: Convert Neon 3-reg-same VQADD/VQSUB to decodetree
-From: Philippe Mathieu-Daudé <philmd@redhat.com>
+Convert the Neon VQADD/VQSUB insns in the 3-reg-same grouping
 to decodetree.
-We only build the little-endian softmmu configurations. Checking
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-for big endian is pointless, remove the unused code.
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
 Message-id: 20200430181003.21682-19-peter.maydell@linaro.org
 ---
  target/arm/neon-dp.decode       |  6 ++++++
  target/arm/translate-neon.inc.c | 15 +++++++++++++++
  target/arm/translate.c          | 14 ++------------
 3 files changed, 23 insertions(+), 12 deletions(-)
-Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
  hw/arm/mainstone.c | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)
 diff --git a/hw/arm/mainstone.c b/hw/arm/mainstone.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/mainstone.c
+--- a/target/arm/neon-dp.decode
-+++ b/hw/arm/mainstone.c
++++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ static void mainstone_common_init(MemoryRegion *address_space_mem,
+@@ -XXX,XX +XXX,XX @@
-     DeviceState *mst_irq;
+ @3same           .... ... . . . size:2 .... .... .... . q:1 . . .... \
-     DriveInfo *dinfo;
+                  &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp
-     int i;
--    int be;
++VQADD_S_3s       1111 001 0 0 . .. .... .... 0000 . . . 1 .... @3same
-     MemoryRegion *rom = g_new(MemoryRegion, 1);
++VQADD_U_3s       1111 001 1 0 . .. .... .... 0000 . . . 1 .... @3same
++
-     /* Setup CPU & memory */
+ @3same_logic     .... ... . . . .. .... .... .... . q:1 .. .... \
-@@ -XXX,XX +XXX,XX @@ static void mainstone_common_init(MemoryRegion *address_space_mem,
+                  &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp size=0
-     memory_region_set_readonly(rom, true);
-     memory_region_add_subregion(address_space_mem, 0, rom);
+@@ -XXX,XX +XXX,XX @@ VBSL_3s          1111 001 1 0 . 01 .... .... 0001 ... 1 .... @3same_logic
+ VBIT_3s          1111 001 1 0 . 10 .... .... 0001 ... 1 .... @3same_logic
--#ifdef TARGET_WORDS_BIGENDIAN
+ VBIF_3s          1111 001 1 0 . 11 .... .... 0001 ... 1 .... @3same_logic
--    be = 1;
--#else
++VQSUB_S_3s       1111 001 0 0 . .. .... .... 0010 . . . 1 .... @3same
--    be = 0;
++VQSUB_U_3s       1111 001 1 0 . .. .... .... 0010 . . . 1 .... @3same
--#endif
++
-     /* There are two 32MiB flash devices on the board */
+ VCGT_S_3s        1111 001 0 0 . .. .... .... 0011 . . . 0 .... @3same
-     for (i = 0; i < 2; i ++) {
+ VCGT_U_3s        1111 001 1 0 . .. .... .... 0011 . . . 0 .... @3same
-         dinfo = drive_get(IF_PFLASH, 0, i);
+ VCGE_S_3s        1111 001 0 0 . .. .... .... 0011 . . . 1 .... @3same
-@@ -XXX,XX +XXX,XX @@ static void mainstone_common_init(MemoryRegion *address_space_mem,
+diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
-                                    i ? "mainstone.flash1" : "mainstone.flash0",
+index XXXXXXX..XXXXXXX 100644
-                                    MAINSTONE_FLASH,
+--- a/target/arm/translate-neon.inc.c
-                                    dinfo ? blk_by_legacy_dinfo(dinfo) : NULL,
++++ b/target/arm/translate-neon.inc.c
--                                   sector_len, 4, 0, 0, 0, 0, be)) {
+@@ -XXX,XX +XXX,XX @@ static void gen_VTST_3s(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
-+                                   sector_len, 4, 0, 0, 0, 0, 0)) {
+     tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &cmtst_op[vece]);
-             error_report("Error registering flash memory");
+ }
-             exit(1);
+ DO_3SAME_NO_SZ_3(VTST, gen_VTST_3s)
 +
 +#define DO_3SAME_GVEC4(INSN, OPARRAY)                                   \
 +    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
 +                                uint32_t rn_ofs, uint32_t rm_ofs,       \
 +                                uint32_t oprsz, uint32_t maxsz)         \
 +    {                                                                   \
 +        tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),           \
 +                       rn_ofs, rm_ofs, oprsz, maxsz, &OPARRAY[vece]);   \
 +    }                                                                   \
 +    DO_3SAME(INSN, gen_##INSN##_3s)
 +
 +DO_3SAME_GVEC4(VQADD_S, sqadd_op)
 +DO_3SAME_GVEC4(VQADD_U, uqadd_op)
 +DO_3SAME_GVEC4(VQSUB_S, sqsub_op)
 +DO_3SAME_GVEC4(VQSUB_U, uqsub_op)
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
              }
              return 1;
 -        case NEON_3R_VQADD:
 -            tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
 -                           rn_ofs, rm_ofs, vec_size, vec_size,
 -                           (u ? uqadd_op : sqadd_op) + size);
 -            return 0;
 -
 -        case NEON_3R_VQSUB:
 -            tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
 -                           rn_ofs, rm_ofs, vec_size, vec_size,
 -                           (u ? uqsub_op : sqsub_op) + size);
 -            return 0;
 -
          case NEON_3R_VMUL: /* VMUL */
              if (u) {
                  /* Polynomial case allows only P8.  */
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          case NEON_3R_VTST_VCEQ:
          case NEON_3R_VCGT:
          case NEON_3R_VCGE:
 +        case NEON_3R_VQADD:
 +        case NEON_3R_VQSUB:
              /* Already handled by decodetree */
              return 1;
          }
 --
 2.20.1
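VQADD and VQSUB differ from plain add/sub in that results clamp to the element range and a sticky saturation flag is recorded, which is presumably why DO_3SAME_GVEC4 passes the vfp.qc offset as an extra operand. A scalar sketch of the 8-bit add case, with demo_sqadd8 and demo_uqadd8 as hypothetical helper names and a plain bool standing in for the QC state:

```c
#include <stdbool.h>
#include <stdint.h>

/* Signed 8-bit saturating add; *qc is set on saturation, never cleared. */
static int8_t demo_sqadd8(int8_t n, int8_t m, bool *qc)
{
    int sum = (int)n + (int)m;

    if (sum > INT8_MAX) {
        *qc = true;
        return INT8_MAX;
    }
    if (sum < INT8_MIN) {
        *qc = true;
        return INT8_MIN;
    }
    return (int8_t)sum;
}

/* Unsigned 8-bit saturating add with the same sticky flag behaviour. */
static uint8_t demo_uqadd8(uint8_t n, uint8_t m, bool *qc)
{
    unsigned sum = (unsigned)n + (unsigned)m;

    if (sum > UINT8_MAX) {
        *qc = true;
        return UINT8_MAX;
    }
    return (uint8_t)sum;
}
```

Because the flag is only ever set, one saturating lane anywhere in the vector is enough to leave it set after the whole operation.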

-[PULL 13/37] hw/arm/strongarm: move timer_new from init() into realize() to avoid memleaks
+[PULL 38/39] target/arm: Convert Neon 3-reg-same VMUL, VMLA, VMLS, VSHL to decodetree
-From: Pan Nengyuan <pannengyuan@huawei.com>
+Convert the Neon VMUL, VMLA, VMLS and VSHL insns in the
 3-reg-same grouping to decodetree.
-There are some memleaks when we call 'device_list_properties'. This patch move timer_new from init into realize to fix it.
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
 Message-id: 20200430181003.21682-20-peter.maydell@linaro.org
 ---
  target/arm/neon-dp.decode       |  9 +++++++
  target/arm/translate-neon.inc.c | 44 +++++++++++++++++++++++++++++++++
  target/arm/translate.c          | 28 +++------------------
 3 files changed, 56 insertions(+), 25 deletions(-)
-Reported-by: Euler Robot <euler.robot@huawei.com>
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com>
 Message-id: 20200227025055.14341-5-pannengyuan@huawei.com
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
  hw/arm/strongarm.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)
 diff --git a/hw/arm/strongarm.c b/hw/arm/strongarm.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/strongarm.c
+--- a/target/arm/neon-dp.decode
-+++ b/hw/arm/strongarm.c
++++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ static void strongarm_rtc_init(Object *obj)
+@@ -XXX,XX +XXX,XX @@ VCGT_U_3s        1111 001 1 0 . .. .... .... 0011 . . . 0 .... @3same
-     s->last_rcnr = (uint32_t) mktimegm(&tm);
+ VCGE_S_3s        1111 001 0 0 . .. .... .... 0011 . . . 1 .... @3same
-     s->last_hz = qemu_clock_get_ms(rtc_clock);
+ VCGE_U_3s        1111 001 1 0 . .. .... .... 0011 . . . 1 .... @3same
--    s->rtc_alarm = timer_new_ms(rtc_clock, strongarm_rtc_alarm_tick, s);
++VSHL_S_3s        1111 001 0 0 . .. .... .... 0100 . . . 0 .... @3same
--    s->rtc_hz = timer_new_ms(rtc_clock, strongarm_rtc_hz_tick, s);
++VSHL_U_3s        1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same
--
++
-     sysbus_init_irq(dev, &s->rtc_irq);
+ VMAX_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 0 .... @3same
-     sysbus_init_irq(dev, &s->rtc_hz_irq);
+ VMAX_U_3s        1111 001 1 0 . .. .... .... 0110 . . . 0 .... @3same
+ VMIN_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 1 .... @3same
-@@ -XXX,XX +XXX,XX @@ static void strongarm_rtc_init(Object *obj)
+@@ -XXX,XX +XXX,XX @@ VSUB_3s          1111 001 1 0 . .. .... .... 1000 . . . 0 .... @3same
-     sysbus_init_mmio(dev, &s->iomem);
- }
+ VTST_3s          1111 001 0 0 . .. .... .... 1000 . . . 1 .... @3same
+ VCEQ_3s          1111 001 1 0 . .. .... .... 1000 . . . 1 .... @3same
-+static void strongarm_rtc_realize(DeviceState *dev, Error **errp)
++
 +VMLA_3s          1111 001 0 0 . .. .... .... 1001 . . . 0 .... @3same
 +VMLS_3s          1111 001 1 0 . .. .... .... 1001 . . . 0 .... @3same
 +
 +VMUL_3s          1111 001 0 0 . .. .... .... 1001 . . . 1 .... @3same
 +VMUL_p_3s        1111 001 1 0 . .. .... .... 1001 . . . 1 .... @3same
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_3SAME_NO_SZ_3(VMAX_S, tcg_gen_gvec_smax)
  DO_3SAME_NO_SZ_3(VMAX_U, tcg_gen_gvec_umax)
  DO_3SAME_NO_SZ_3(VMIN_S, tcg_gen_gvec_smin)
  DO_3SAME_NO_SZ_3(VMIN_U, tcg_gen_gvec_umin)
 +DO_3SAME_NO_SZ_3(VMUL, tcg_gen_gvec_mul)
  #define DO_3SAME_CMP(INSN, COND)                                        \
      static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
@@ -XXX,XX +XXX,XX @@ DO_3SAME_GVEC4(VQADD_S, sqadd_op)
  DO_3SAME_GVEC4(VQADD_U, uqadd_op)
  DO_3SAME_GVEC4(VQSUB_S, sqsub_op)
  DO_3SAME_GVEC4(VQSUB_U, uqsub_op)
 +
 +static void gen_VMUL_p_3s(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                           uint32_t rm_ofs, uint32_t oprsz, uint32_t maxsz)
 +{
-+    StrongARMRTCState *s = STRONGARM_RTC(dev);
++    tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz,
-+    s->rtc_alarm = timer_new_ms(rtc_clock, strongarm_rtc_alarm_tick, s);
++                       0, gen_helper_gvec_pmul_b);
 +    s->rtc_hz = timer_new_ms(rtc_clock, strongarm_rtc_hz_tick, s);
 +}
 +
- static int strongarm_rtc_pre_save(void *opaque)
++static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
- {
++{
-     StrongARMRTCState *s = opaque;
++    if (a->size != 0) {
-@@ -XXX,XX +XXX,XX @@ static void strongarm_rtc_sysbus_class_init(ObjectClass *klass, void *data)
++        return false;
++    }
-     dc->desc = "StrongARM RTC Controller";
++    return do_3same(s, a, gen_VMUL_p_3s);
-     dc->vmsd = &vmstate_strongarm_rtc_regs;
++}
-+    dc->realize = strongarm_rtc_realize;
++
- }
++#define DO_3SAME_GVEC3_NO_SZ_3(INSN, OPARRAY)                           \
++    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
- static const TypeInfo strongarm_rtc_sysbus_info = {
++                                uint32_t rn_ofs, uint32_t rm_ofs,       \
-@@ -XXX,XX +XXX,XX @@ static void strongarm_uart_init(Object *obj)
++                                uint32_t oprsz, uint32_t maxsz)         \
-                           "uart", 0x10000);
++    {                                                                   \
-     sysbus_init_mmio(dev, &s->iomem);
++        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,                          \
-     sysbus_init_irq(dev, &s->irq);
++                       oprsz, maxsz, &OPARRAY[vece]);                   \
 +    }                                                                   \
 +    DO_3SAME_NO_SZ_3(INSN, gen_##INSN##_3s)
 +
 +
 +DO_3SAME_GVEC3_NO_SZ_3(VMLA, mla_op)
 +DO_3SAME_GVEC3_NO_SZ_3(VMLS, mls_op)
 +
 +#define DO_3SAME_GVEC3_SHIFT(INSN, OPARRAY)                             \
 +    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
 +                                uint32_t rn_ofs, uint32_t rm_ofs,       \
 +                                uint32_t oprsz, uint32_t maxsz)         \
 +    {                                                                   \
 +        /* Note the operation is vshl vd,vm,vn */                       \
 +        tcg_gen_gvec_3(rd_ofs, rm_ofs, rn_ofs,                          \
 +                       oprsz, maxsz, &OPARRAY[vece]);                   \
 +    }                                                                   \
 +    DO_3SAME(INSN, gen_##INSN##_3s)
 +
 +DO_3SAME_GVEC3_SHIFT(VSHL_S, sshl_op)
 +DO_3SAME_GVEC3_SHIFT(VSHL_U, ushl_op)
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
              }
              return 1;
 -        case NEON_3R_VMUL: /* VMUL */
 -            if (u) {
 -                /* Polynomial case allows only P8.  */
 -                if (size != 0) {
 -                    return 1;
 -                }
 -                tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, vec_size, vec_size,
 -                                   0, gen_helper_gvec_pmul_b);
 -            } else {
 -                tcg_gen_gvec_mul(size, rd_ofs, rn_ofs, rm_ofs,
 -                                 vec_size, vec_size);
 -            }
 -            return 0;
 -
--    s->rx_timeout_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, strongarm_uart_rx_to, s);
+-        case NEON_3R_VML: /* VMLA, VMLS */
--    s->tx_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, strongarm_uart_tx, s);
+-            tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, vec_size, vec_size,
- }
+-                           u ? &mls_op[size] : &mla_op[size]);
+-            return 0;
- static void strongarm_uart_realize(DeviceState *dev, Error **errp)
+-
- {
+-        case NEON_3R_VSHL:
-     StrongARMUARTState *s = STRONGARM_UART(dev);
+-            /* Note the operation is vshl vd,vm,vn */
+-            tcg_gen_gvec_3(rd_ofs, rm_ofs, rn_ofs, vec_size, vec_size,
-+    s->rx_timeout_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL,
+-                           u ? &ushl_op[size] : &sshl_op[size]);
-+                                       strongarm_uart_rx_to,
+-            return 0;
-+                                       s);
+-
-+    s->tx_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, strongarm_uart_tx, s);
+         case NEON_3R_VADD_VSUB:
-     qemu_chr_fe_set_handlers(&s->chr,
+         case NEON_3R_LOGIC:
-                              strongarm_uart_can_receive,
+         case NEON_3R_VMAX:
-                              strongarm_uart_receive,
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          case NEON_3R_VCGE:
          case NEON_3R_VQADD:
          case NEON_3R_VQSUB:
 +        case NEON_3R_VMUL:
 +        case NEON_3R_VML:
 +        case NEON_3R_VSHL:
              /* Already handled by decodetree */
              return 1;
          }
 --
 2.20.1
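The polynomial form of VMUL is accepted only for size == 0 (P8), as trans_VMUL_p_3s above checks. For readers unfamiliar with the operation, here is a scalar sketch of an 8-bit polynomial (carry-less) multiply that keeps the low 8 bits of the product. demo_pmul8 is a hypothetical name and is not the body of gen_helper_gvec_pmul_b:

```c
#include <stdint.h>

/*
 * Multiply two polynomials over GF(2), each held in a byte, and
 * return the low 8 bits of the product. Partial products are
 * combined with XOR instead of addition, so no carries propagate.
 */
static uint8_t demo_pmul8(uint8_t n, uint8_t m)
{
    uint8_t r = 0;

    for (int i = 0; i < 8; i++) {
        if (m & (1u << i)) {
            r ^= (uint8_t)(n << i);
        }
    }
    return r;
}
```

With ordinary integer multiply 3 * 3 is 9, but here (x + 1) * (x + 1) = x^2 + 1, so demo_pmul8(3, 3) returns 5.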

-[PULL 37/37] target/arm: Clean address for DC ZVA
+[PULL 39/39] target/arm: Move gen_ function typedefs to translate.h
-From: Richard Henderson <richard.henderson@linaro.org>
+We're going to want at least some of the NeonGen* typedefs
 for the refactored 32-bit Neon decoder, so move them all
 to translate.h since it makes more sense to keep them in
 one group.
-This data access was forgotten when we added support for cleaning
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-addresses of TBI information.
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
 Message-id: 20200430181003.21682-23-peter.maydell@linaro.org
 ---
  target/arm/translate.h     | 17 +++++++++++++++++
  target/arm/translate-a64.c | 17 -----------------
 2 files changed, 17 insertions(+), 17 deletions(-)
-Fixes: 3a471103ac1823ba
+diff --git a/target/arm/translate.h b/target/arm/translate.h
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+index XXXXXXX..XXXXXXX 100644
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+--- a/target/arm/translate.h
-Message-id: 20200302175829.2183-8-richard.henderson@linaro.org
++++ b/target/arm/translate.h
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+@@ -XXX,XX +XXX,XX @@ typedef void GVecGen3Fn(unsigned, uint32_t, uint32_t,
----
+ typedef void GVecGen4Fn(unsigned, uint32_t, uint32_t, uint32_t,
- target/arm/translate-a64.c | 2 +-
+                         uint32_t, uint32_t, uint32_t);
-1 file changed, 1 insertion(+), 1 deletion(-)
++/* Function prototype for gen_ functions for calling Neon helpers */
 +typedef void NeonGenOneOpEnvFn(TCGv_i32, TCGv_ptr, TCGv_i32);
 +typedef void NeonGenTwoOpFn(TCGv_i32, TCGv_i32, TCGv_i32);
 +typedef void NeonGenTwoOpEnvFn(TCGv_i32, TCGv_ptr, TCGv_i32, TCGv_i32);
 +typedef void NeonGenTwo64OpFn(TCGv_i64, TCGv_i64, TCGv_i64);
 +typedef void NeonGenTwo64OpEnvFn(TCGv_i64, TCGv_ptr, TCGv_i64, TCGv_i64);
 +typedef void NeonGenNarrowFn(TCGv_i32, TCGv_i64);
 +typedef void NeonGenNarrowEnvFn(TCGv_i32, TCGv_ptr, TCGv_i64);
 +typedef void NeonGenWidenFn(TCGv_i64, TCGv_i32);
 +typedef void NeonGenTwoSingleOPFn(TCGv_i32, TCGv_i32, TCGv_i32, TCGv_ptr);
 +typedef void NeonGenTwoDoubleOPFn(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_ptr);
 +typedef void NeonGenOneOpFn(TCGv_i64, TCGv_i64);
 +typedef void CryptoTwoOpFn(TCGv_ptr, TCGv_ptr);
 +typedef void CryptoThreeOpIntFn(TCGv_ptr, TCGv_ptr, TCGv_i32);
 +typedef void CryptoThreeOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr);
 +typedef void AtomicThreeOpFn(TCGv_i64, TCGv_i64, TCGv_i64, TCGArg, MemOp);
 +
  #endif /* TARGET_ARM_TRANSLATE_H */
 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-a64.c
 +++ b/target/arm/translate-a64.c
-@@ -XXX,XX +XXX,XX @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
+@@ -XXX,XX +XXX,XX @@ typedef struct AArch64DecodeTable {
-         return;
+     AArch64DecodeFn *disas_fn;
-     case ARM_CP_DC_ZVA:
+ } AArch64DecodeTable;
-         /* Writes clear the aligned block of memory which rt points into. */
--        tcg_rt = cpu_reg(s, rt);
+-/* Function prototype for gen_ functions for calling Neon helpers */
-+        tcg_rt = clean_data_tbi(s, cpu_reg(s, rt));
+-typedef void NeonGenOneOpEnvFn(TCGv_i32, TCGv_ptr, TCGv_i32);
-         gen_helper_dc_zva(cpu_env, tcg_rt);
+-typedef void NeonGenTwoOpFn(TCGv_i32, TCGv_i32, TCGv_i32);
-         return;
+-typedef void NeonGenTwoOpEnvFn(TCGv_i32, TCGv_ptr, TCGv_i32, TCGv_i32);
-     default:
+-typedef void NeonGenTwo64OpFn(TCGv_i64, TCGv_i64, TCGv_i64);
 -typedef void NeonGenTwo64OpEnvFn(TCGv_i64, TCGv_ptr, TCGv_i64, TCGv_i64);
 -typedef void NeonGenNarrowFn(TCGv_i32, TCGv_i64);
 -typedef void NeonGenNarrowEnvFn(TCGv_i32, TCGv_ptr, TCGv_i64);
 -typedef void NeonGenWidenFn(TCGv_i64, TCGv_i32);
 -typedef void NeonGenTwoSingleOPFn(TCGv_i32, TCGv_i32, TCGv_i32, TCGv_ptr);
 -typedef void NeonGenTwoDoubleOPFn(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_ptr);
 -typedef void NeonGenOneOpFn(TCGv_i64, TCGv_i64);
 -typedef void CryptoTwoOpFn(TCGv_ptr, TCGv_ptr);
 -typedef void CryptoThreeOpIntFn(TCGv_ptr, TCGv_ptr, TCGv_i32);
 -typedef void CryptoThreeOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr);
 -typedef void AtomicThreeOpFn(TCGv_i64, TCGv_i64, TCGv_i64, TCGArg, MemOp);
 -
  /* initialize TCG globals.  */
  void a64_translate_init(void)
  {
 --
 2.20.1

Nothing much exciting here, but it's 37 patches worth...

thanks
-- PMM

The following changes since commit e64a62df378a746c0b257105959613c9f8122e59:

Merge remote-tracking branch 'remotes/stsquad/tags/pull-testing-040320-1' into staging (2020-03-05 12:13:51 +0000)

are available in the Git repository at:

https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20200305

for you to fetch changes up to 597d61a3b1f94c53a3aaa77671697c0c5f797dbf:

target/arm: Clean address for DC ZVA (2020-03-05 16:09:21 +0000)

----------------------------------------------------------------
 * versal: Implement ADMA
 * Implement (trivially) ARMv8.2-TTCNP
 * hw/arm/smmu-common: a fix to smmu_find_smmu_pcibus
 * Remove unnecessary endianness-handling on some boards
 * Avoid minor memory leaks from timer_new in some devices
 * Honour more of the HCR_EL2 trap bits
 * Complain rather than ignoring bad command line options for cubieboard
 * Honour TBI for DC ZVA and exception return

----------------------------------------------------------------
Edgar E. Iglesias (2):
      hw/arm: versal: Add support for the LPD ADMAs
      hw/arm: versal: Generate xlnx-versal-virt zdma FDT nodes

Eric Auger (1):
      hw/arm/smmu-common: a fix to smmu_find_smmu_pcibus

Niek Linnenbank (4):
      hw/arm/cubieboard: use ARM Cortex-A8 as the default CPU in machine definition
      hw/arm/cubieboard: restrict allowed CPU type to ARM Cortex-A8
      hw/arm/cubieboard: restrict allowed RAM size to 512MiB and 1GiB
      hw/arm/cubieboard: report error when using unsupported -bios argument

Pan Nengyuan (4):
      hw/arm/pxa2xx: move timer_new from init() into realize() to avoid memleaks
      hw/arm/spitz: move timer_new from init() into realize() to avoid memleaks
      hw/arm/strongarm: move timer_new from init() into realize() to avoid memleaks
      hw/timer/cadence_ttc: move timer_new from init() into realize() to avoid memleaks

Peter Maydell (1):
      target/arm: Implement (trivially) ARMv8.2-TTCNP

Philippe Mathieu-Daudé (6):
      hw/arm/smmu-common: Simplify smmu_find_smmu_pcibus() logic
      hw/arm/gumstix: Simplify since the machines are little-endian only
      hw/arm/mainstone: Simplify since the machines are little-endian only
      hw/arm/omap_sx1: Simplify since the machines are little-endian only
      hw/arm/z2: Simplify since the machines are little-endian only
      hw/arm/musicpal: Simplify since the machines are little-endian only

Richard Henderson (19):
      target/arm: Improve masking of HCR/HCR2 RES0 bits
      target/arm: Add HCR_EL2 bit definitions from ARMv8.6
      target/arm: Disable has_el2 and has_el3 for user-only
      target/arm: Remove EL2 and EL3 setup from user-only
      target/arm: Improve masking in arm_hcr_el2_eff
      target/arm: Honor the HCR_EL2.{TVM,TRVM} bits
      target/arm: Honor the HCR_EL2.TSW bit
      target/arm: Honor the HCR_EL2.TACR bit
      target/arm: Honor the HCR_EL2.TPCP bit
      target/arm: Honor the HCR_EL2.TPU bit
      target/arm: Honor the HCR_EL2.TTLB bit
      tests/tcg/aarch64: Add newline in pauth-1 printf
      target/arm: Replicate TBI/TBID bits for single range regimes
      target/arm: Optimize cpu_mmu_index
      target/arm: Introduce core_to_aa64_mmu_idx
      target/arm: Apply TBI to ESR_ELx in helper_exception_return
      target/arm: Move helper_dc_zva to helper-a64.c
      target/arm: Use DEF_HELPER_FLAGS for helper_dc_zva
      target/arm: Clean address for DC ZVA

From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>

Add support for the Versal LPD ADMAs.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Reviewed-by: KONRAD Frederic <frederic.konrad@adacore.com>
Reviewed-by: Luc Michel <luc.michel@greensocs.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/arm/xlnx-versal.h |  6 ++++++
 hw/arm/xlnx-versal.c         | 24 ++++++++++++++++++++++++
 2 files changed, 30 insertions(+)

diff --git a/include/hw/arm/xlnx-versal.h b/include/hw/arm/xlnx-versal.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/arm/xlnx-versal.h
+++ b/include/hw/arm/xlnx-versal.h
@@ -XXX,XX +XXX,XX @@
 #define XLNX_VERSAL_NR_ACPUS   2
 #define XLNX_VERSAL_NR_UARTS   2
 #define XLNX_VERSAL_NR_GEMS    2
+#define XLNX_VERSAL_NR_ADMAS   8
 #define XLNX_VERSAL_NR_IRQS    192
 
 typedef struct Versal {
@@ -XXX,XX +XXX,XX @@ typedef struct Versal {
         struct {
             SysBusDevice *uart[XLNX_VERSAL_NR_UARTS];
             SysBusDevice *gem[XLNX_VERSAL_NR_GEMS];
+            SysBusDevice *adma[XLNX_VERSAL_NR_ADMAS];
         } iou;
     } lpd;
 
@@ -XXX,XX +XXX,XX @@ typedef struct Versal {
 #define VERSAL_GEM0_WAKE_IRQ_0     57
 #define VERSAL_GEM1_IRQ_0          58
 #define VERSAL_GEM1_WAKE_IRQ_0     59
+#define VERSAL_ADMA_IRQ_0          60
 
 /* Architecturally reserved IRQs suitable for virtualization.  */
 #define VERSAL_RSVD_IRQ_FIRST 111
@@ -XXX,XX +XXX,XX @@ typedef struct Versal {
 #define MM_GEM1                     0xff0d0000U
 #define MM_GEM1_SIZE                0x10000
 
+#define MM_ADMA_CH0                 0xffa80000U
+#define MM_ADMA_CH0_SIZE            0x10000
+
 #define MM_OCM                      0xfffc0000U
 #define MM_OCM_SIZE                 0x40000
 
diff --git a/hw/arm/xlnx-versal.c b/hw/arm/xlnx-versal.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/xlnx-versal.c
+++ b/hw/arm/xlnx-versal.c
@@ -XXX,XX +XXX,XX @@ static void versal_create_gems(Versal *s, qemu_irq *pic)
     }
 }
 
+static void versal_create_admas(Versal *s, qemu_irq *pic)
+{
+    int i;
+
+    for (i = 0; i < ARRAY_SIZE(s->lpd.iou.adma); i++) {
+        char *name = g_strdup_printf("adma%d", i);
+        DeviceState *dev;
+        MemoryRegion *mr;
+
+        dev = qdev_create(NULL, "xlnx.zdma");
+        s->lpd.iou.adma[i] = SYS_BUS_DEVICE(dev);
+        object_property_add_child(OBJECT(s), name, OBJECT(dev), &error_fatal);
+        qdev_init_nofail(dev);
+
+        mr = sysbus_mmio_get_region(s->lpd.iou.adma[i], 0);
+        memory_region_add_subregion(&s->mr_ps,
+                                    MM_ADMA_CH0 + i * MM_ADMA_CH0_SIZE, mr);
+
+        sysbus_connect_irq(s->lpd.iou.adma[i], 0, pic[VERSAL_ADMA_IRQ_0 + i]);
+        g_free(name);
+    }
+}
+
 /* This takes the board allocated linear DDR memory and creates aliases
  * for each split DDR range/aperture on the Versal address map.
  */
@@ -XXX,XX +XXX,XX @@ static void versal_realize(DeviceState *dev, Error **errp)
     versal_create_apu_gic(s, pic);
     versal_create_uarts(s, pic);
     versal_create_gems(s, pic);
+    versal_create_admas(s, pic);
     versal_map_ddr(s);
     versal_unimp(s);
 
-- 
2.20.1

From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>

Generate xlnx-versal-virt zdma FDT nodes.
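
As a rough sketch of the layout these nodes describe (the three constants
are copied from the xlnx-versal.h hunk earlier in this series; the
sketch_* helpers are hypothetical and exist only for illustration), each
channel's node name, base address and SPI number follow from its index:

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Values copied from the xlnx-versal.h hunk earlier in this series. */
#define MM_ADMA_CH0        0xffa80000U
#define MM_ADMA_CH0_SIZE   0x10000
#define VERSAL_ADMA_IRQ_0  60

/* Hypothetical helper: MMIO base address of ADMA channel i. */
static uint64_t sketch_adma_addr(int i)
{
    return MM_ADMA_CH0 + (uint64_t)MM_ADMA_CH0_SIZE * i;
}

/* Hypothetical helper: interrupt number of ADMA channel i. */
static int sketch_adma_irq(int i)
{
    return VERSAL_ADMA_IRQ_0 + i;
}

/* Hypothetical helper: format the FDT node name the same way the patch does. */
static void sketch_adma_node_name(char *buf, size_t len, int i)
{
    snprintf(buf, len, "/dma@%" PRIx64, sketch_adma_addr(i));
}
```

The channels are spaced MM_ADMA_CH0_SIZE apart, so the eight nodes do not
overlap even though each "reg" entry only advertises 0x1000 bytes.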

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Reviewed-by: KONRAD Frederic <frederic.konrad@adacore.com>
Reviewed-by: Luc Michel <luc.michel@greensocs.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/xlnx-versal-virt.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/hw/arm/xlnx-versal-virt.c b/hw/arm/xlnx-versal-virt.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/xlnx-versal-virt.c
+++ b/hw/arm/xlnx-versal-virt.c
@@ -XXX,XX +XXX,XX @@ static void fdt_add_gem_nodes(VersalVirt *s)
     }
 }
 
+static void fdt_add_zdma_nodes(VersalVirt *s)
+{
+    const char clocknames[] = "clk_main\0clk_apb";
+    const char compat[] = "xlnx,zynqmp-dma-1.0";
+    int i;
+
+    for (i = XLNX_VERSAL_NR_ADMAS - 1; i >= 0; i--) {
+        uint64_t addr = MM_ADMA_CH0 + MM_ADMA_CH0_SIZE * i;
+        char *name = g_strdup_printf("/dma@%" PRIx64, addr);
+
+        qemu_fdt_add_subnode(s->fdt, name);
+
+        qemu_fdt_setprop_cell(s->fdt, name, "xlnx,bus-width", 64);
+        qemu_fdt_setprop_cells(s->fdt, name, "clocks",
+                               s->phandle.clk_25Mhz, s->phandle.clk_25Mhz);
+        qemu_fdt_setprop(s->fdt, name, "clock-names",
+                         clocknames, sizeof(clocknames));
+        qemu_fdt_setprop_cells(s->fdt, name, "interrupts",
+                               GIC_FDT_IRQ_TYPE_SPI, VERSAL_ADMA_IRQ_0 + i,
+                               GIC_FDT_IRQ_FLAGS_LEVEL_HI);
+        qemu_fdt_setprop_sized_cells(s->fdt, name, "reg",
+                                     2, addr, 2, 0x1000);
+        qemu_fdt_setprop(s->fdt, name, "compatible", compat, sizeof(compat));
+        g_free(name);
+    }
+}
+
 static void fdt_nop_memory_nodes(void *fdt, Error **errp)
 {
     Error *err = NULL;
@@ -XXX,XX +XXX,XX @@ static void versal_virt_init(MachineState *machine)
     fdt_add_uart_nodes(s);
     fdt_add_gic_nodes(s);
     fdt_add_timer_nodes(s);
+    fdt_add_zdma_nodes(s);
     fdt_add_cpu_nodes(s, psci_conduit);
     fdt_add_clk_node(s, "/clk125", 125000000, s->phandle.clk_125Mhz);
     fdt_add_clk_node(s, "/clk25", 25000000, s->phandle.clk_25Mhz);
-- 
2.20.1

The ARMv8.2-TTCNP extension allows an implementation to optimize by
sharing TLB entries between multiple cores, provided that software
declares that it's ready to deal with this by setting a CnP bit in
the TTBRn_ELx.  It is mandatory from ARMv8.2 onward.

For QEMU's TLB implementation, sharing TLB entries between different
cores would not really benefit us and would be a lot of work to
implement.  So we implement this extension in the "trivial" manner:
we allow the guest to set and read back the CnP bit, but don't change
our behaviour (this is an architecturally valid implementation
choice).

The only code path which looks at the TTBRn_ELx values for the
long-descriptor format where the CnP bit is defined is already doing
enough masking to not get confused when the CnP bit at the bottom of
the register is set, so we can simply add a comment noting why we're
relying on that mask.
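
A minimal standalone sketch of why that masking is sufficient (this is
not the QEMU code: sketch_extract64() is a stand-in for QEMU's
extract64(), and the indexmask value used with it is an assumed example):

```c
#include <stdint.h>

/*
 * Stand-in for QEMU's extract64(): return 'length' bits of 'value'
 * starting at bit 'start'.  Valid for 1 <= length <= 64.
 */
static uint64_t sketch_extract64(uint64_t value, int start, int length)
{
    return (value >> start) & (~0ULL >> (64 - length));
}

/*
 * Hypothetical helper mirroring the two statements in
 * get_phys_addr_lpae(): keep the low 48 bits of the TTBR, then clear
 * everything covered by indexmask.  Bit 0 (CnP) is always inside
 * indexmask, so it never reaches the table base address.
 */
static uint64_t sketch_ttbr_base(uint64_t ttbr, uint64_t indexmask)
{
    uint64_t descaddr = sketch_extract64(ttbr, 0, 48);

    return descaddr & ~indexmask;
}
```

With an assumed indexmask of 0xfff, a TTBR value of 0x40001001 (CnP set)
and one of 0x40001000 (CnP clear) both give a base of 0x40001000.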

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200225193822.18874-1-peter.maydell@linaro.org
---
 target/arm/cpu.c    | 1 +
 target/arm/cpu64.c  | 2 ++
 target/arm/helper.c | 4 ++++
 3 files changed, 7 insertions(+)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void arm_max_initfn(Object *obj)
             t = cpu->isar.id_mmfr4;
             t = FIELD_DP32(t, ID_MMFR4, HPDS, 1); /* AA32HPD */
             t = FIELD_DP32(t, ID_MMFR4, AC2, 1); /* ACTLR2, HACTLR2 */
+            t = FIELD_DP32(t, ID_MMFR4, CNP, 1); /* TTCNP */
             cpu->isar.id_mmfr4 = t;
         }
 #endif
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
 
         t = cpu->isar.id_aa64mmfr2;
         t = FIELD_DP64(t, ID_AA64MMFR2, UAO, 1);
+        t = FIELD_DP64(t, ID_AA64MMFR2, CNP, 1); /* TTCNP */
         cpu->isar.id_aa64mmfr2 = t;
 
         /* Replicate the same data to the 32-bit id registers.  */
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
         u = cpu->isar.id_mmfr4;
         u = FIELD_DP32(u, ID_MMFR4, HPDS, 1); /* AA32HPD */
         u = FIELD_DP32(u, ID_MMFR4, AC2, 1); /* ACTLR2, HACTLR2 */
+        u = FIELD_DP32(u, ID_MMFR4, CNP, 1); /* TTCNP */
         cpu->isar.id_mmfr4 = u;
 
         u = cpu->isar.id_aa64dfr0;
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
 
     /* Now we can extract the actual base address from the TTBR */
     descaddr = extract64(ttbr, 0, 48);
+    /*
+     * We rely on this masking to clear the RES0 bits at the bottom of the TTBR
+     * and also to mask out CnP (bit 0) which could validly be non-zero.
+     */
     descaddr &= ~indexmask;
 
     /* The address field in the descriptor goes up to bit 39 for ARMv7
-- 
2.20.1

From: Eric Auger <eric.auger@redhat.com>

Make sure a null SMMUPciBus is returned in case we were
not able to identify a pci bus matching the @bus_num.

This matches the fix done on intel iommu in commit:
a2e1cd41ccfe796529abfd1b6aeb1dd4393762a2

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20200226172628.17449-1-eric.auger@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/smmu-common.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/smmu-common.c
+++ b/hw/arm/smmu-common.c
@@ -XXX,XX +XXX,XX @@ SMMUPciBus *smmu_find_smmu_pcibus(SMMUState *s, uint8_t bus_num)
                 return smmu_pci_bus;
             }
         }
+        smmu_pci_bus = NULL;
     }
     return smmu_pci_bus;
 }
-- 
2.20.1

From: Philippe Mathieu-Daudé <philmd@redhat.com>

The smmu_find_smmu_pcibus() function was introduced (in commit
cac994ef43b) in a form that could return an incorrect
pointer, which was fixed by the previous commit.
We could have avoided this by writing the if() statement
differently. Do it now, in case this function is re-used.
The code is easier to review (harder to miss bugs).
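
The shape of the rewritten function can be sketched in isolation as
follows (a simplified model, assuming a plain array in place of the
GHashTable; all sketch_* names are hypothetical):

```c
#include <stddef.h>
#include <stdint.h>

struct sketch_bus {
    uint8_t num;
};

struct sketch_state {
    struct sketch_bus *by_num[256];  /* lookup cache, indexed by bus number */
    struct sketch_bus *all[4];       /* every known bus (stand-in for the hash table) */
    size_t n_all;
};

static struct sketch_bus *sketch_find_bus(struct sketch_state *s, uint8_t bus_num)
{
    struct sketch_bus *bus = s->by_num[bus_num];
    size_t i;

    /* Fast path: the cache already knows this bus number. */
    if (bus) {
        return bus;
    }

    /* Slow path: scan every bus and cache the first match. */
    for (i = 0; i < s->n_all; i++) {
        bus = s->all[i];
        if (bus->num == bus_num) {
            s->by_num[bus_num] = bus;
            return bus;
        }
    }

    /* No match: never hand back the last element visited by the scan. */
    return NULL;
}
```

Because every exit is an explicit return, the iteration variable cannot
leak out of the loop as a stale result.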

Acked-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/smmu-common.c | 25 +++++++++++++------------
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/smmu-common.c
+++ b/hw/arm/smmu-common.c
@@ -XXX,XX +XXX,XX @@ inline int smmu_ptw(SMMUTransCfg *cfg, dma_addr_t iova, IOMMUAccessFlags perm,
 SMMUPciBus *smmu_find_smmu_pcibus(SMMUState *s, uint8_t bus_num)
 {
     SMMUPciBus *smmu_pci_bus = s->smmu_pcibus_by_bus_num[bus_num];
+    GHashTableIter iter;
 
-    if (!smmu_pci_bus) {
-        GHashTableIter iter;
-
-        g_hash_table_iter_init(&iter, s->smmu_pcibus_by_busptr);
-        while (g_hash_table_iter_next(&iter, NULL, (void **)&smmu_pci_bus)) {
-            if (pci_bus_num(smmu_pci_bus->bus) == bus_num) {
-                s->smmu_pcibus_by_bus_num[bus_num] = smmu_pci_bus;
-                return smmu_pci_bus;
-            }
-        }
-        smmu_pci_bus = NULL;
+    if (smmu_pci_bus) {
+        return smmu_pci_bus;
     }
-    return smmu_pci_bus;
+
+    g_hash_table_iter_init(&iter, s->smmu_pcibus_by_busptr);
+    while (g_hash_table_iter_next(&iter, NULL, (void **)&smmu_pci_bus)) {
+        if (pci_bus_num(smmu_pci_bus->bus) == bus_num) {
+            s->smmu_pcibus_by_bus_num[bus_num] = smmu_pci_bus;
+            return smmu_pci_bus;
+        }
+    }
+
+    return NULL;
 }
 
 static AddressSpace *smmu_find_add_as(PCIBus *bus, void *opaque, int devfn)
-- 
2.20.1

From: Philippe Mathieu-Daudé <f4bug@amsat.org>

As the Connex and Verdex machines only boot in little-endian,
we can simplify the code.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/gumstix.c | 16 ++--------------
 1 file changed, 2 insertions(+), 14 deletions(-)

diff --git a/hw/arm/gumstix.c b/hw/arm/gumstix.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/gumstix.c
+++ b/hw/arm/gumstix.c
@@ -XXX,XX +XXX,XX @@ static void connex_init(MachineState *machine)
 {
     PXA2xxState *cpu;
     DriveInfo *dinfo;
-    int be;
     MemoryRegion *address_space_mem = get_system_memory();
 
     uint32_t connex_rom = 0x01000000;
@@ -XXX,XX +XXX,XX @@ static void connex_init(MachineState *machine)
         exit(1);
     }
 
-#ifdef TARGET_WORDS_BIGENDIAN
-    be = 1;
-#else
-    be = 0;
-#endif
     if (!pflash_cfi01_register(0x00000000, "connext.rom", connex_rom,
                                dinfo ? blk_by_legacy_dinfo(dinfo) : NULL,
-                               sector_len, 2, 0, 0, 0, 0, be)) {
+                               sector_len, 2, 0, 0, 0, 0, 0)) {
         error_report("Error registering flash memory");
         exit(1);
     }
@@ -XXX,XX +XXX,XX @@ static void verdex_init(MachineState *machine)
 {
     PXA2xxState *cpu;
     DriveInfo *dinfo;
-    int be;
     MemoryRegion *address_space_mem = get_system_memory();
 
     uint32_t verdex_rom = 0x02000000;
@@ -XXX,XX +XXX,XX @@ static void verdex_init(MachineState *machine)
         exit(1);
     }
 
-#ifdef TARGET_WORDS_BIGENDIAN
-    be = 1;
-#else
-    be = 0;
-#endif
     if (!pflash_cfi01_register(0x00000000, "verdex.rom", verdex_rom,
                                dinfo ? blk_by_legacy_dinfo(dinfo) : NULL,
-                               sector_len, 2, 0, 0, 0, 0, be)) {
+                               sector_len, 2, 0, 0, 0, 0, 0)) {
         error_report("Error registering flash memory");
         exit(1);
     }
-- 
2.20.1

From: Philippe Mathieu-Daudé <philmd@redhat.com>

We only build the little-endian softmmu configurations. Checking
for big endian is pointless, so remove the unused code.

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/mainstone.c | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/hw/arm/mainstone.c b/hw/arm/mainstone.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/mainstone.c
+++ b/hw/arm/mainstone.c
@@ -XXX,XX +XXX,XX @@ static void mainstone_common_init(MemoryRegion *address_space_mem,
     DeviceState *mst_irq;
     DriveInfo *dinfo;
     int i;
-    int be;
     MemoryRegion *rom = g_new(MemoryRegion, 1);
 
     /* Setup CPU & memory */
@@ -XXX,XX +XXX,XX @@ static void mainstone_common_init(MemoryRegion *address_space_mem,
     memory_region_set_readonly(rom, true);
     memory_region_add_subregion(address_space_mem, 0, rom);
 
-#ifdef TARGET_WORDS_BIGENDIAN
-    be = 1;
-#else
-    be = 0;
-#endif
     /* There are two 32MiB flash devices on the board */
     for (i = 0; i < 2; i ++) {
         dinfo = drive_get(IF_PFLASH, 0, i);
@@ -XXX,XX +XXX,XX @@ static void mainstone_common_init(MemoryRegion *address_space_mem,
                                    i ? "mainstone.flash1" : "mainstone.flash0",
                                    MAINSTONE_FLASH,
                                    dinfo ? blk_by_legacy_dinfo(dinfo) : NULL,
-                                   sector_len, 4, 0, 0, 0, 0, be)) {
+                                   sector_len, 4, 0, 0, 0, 0, 0)) {
             error_report("Error registering flash memory");
             exit(1);
         }
-- 
2.20.1

From: Philippe Mathieu-Daudé <philmd@redhat.com>

We only build the little-endian softmmu configurations. Checking
for big endian is pointless, so remove the unused code.

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/omap_sx1.c | 11 ++---------
 1 file changed, 2 insertions(+), 9 deletions(-)

diff --git a/hw/arm/omap_sx1.c b/hw/arm/omap_sx1.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/omap_sx1.c
+++ b/hw/arm/omap_sx1.c
@@ -XXX,XX +XXX,XX @@ static void sx1_init(MachineState *machine, const int version)
     DriveInfo *dinfo;
     int fl_idx;
     uint32_t flash_size = flash0_size;
-    int be;
 
     if (machine->ram_size != mc->default_ram_size) {
         char *sz = size_to_str(mc->default_ram_size);
@@ -XXX,XX +XXX,XX @@ static void sx1_init(MachineState *machine, const int version)
                                 OMAP_CS2_BASE, &cs[3]);
 
     fl_idx = 0;
-#ifdef TARGET_WORDS_BIGENDIAN
-    be = 1;
-#else
-    be = 0;
-#endif
-
     if ((dinfo = drive_get(IF_PFLASH, 0, fl_idx)) != NULL) {
         if (!pflash_cfi01_register(OMAP_CS0_BASE,
                                    "omap_sx1.flash0-1", flash_size,
                                    blk_by_legacy_dinfo(dinfo),
-                                   sector_size, 4, 0, 0, 0, 0, be)) {
+                                   sector_size, 4, 0, 0, 0, 0, 0)) {
             fprintf(stderr, "qemu: Error registering flash memory %d.\n",
                            fl_idx);
         }
@@ -XXX,XX +XXX,XX @@ static void sx1_init(MachineState *machine, const int version)
         if (!pflash_cfi01_register(OMAP_CS1_BASE,
                                    "omap_sx1.flash1-1", flash1_size,
                                    blk_by_legacy_dinfo(dinfo),
-                                   sector_size, 4, 0, 0, 0, 0, be)) {
+                                   sector_size, 4, 0, 0, 0, 0, 0)) {
             fprintf(stderr, "qemu: Error registering flash memory %d.\n",
                            fl_idx);
         }
-- 
2.20.1

From: Philippe Mathieu-Daudé <philmd@redhat.com>

We only build the little-endian softmmu configurations. Checking
for big endian is pointless, so remove the unused code.

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/z2.c | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/hw/arm/z2.c b/hw/arm/z2.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/z2.c
+++ b/hw/arm/z2.c
@@ -XXX,XX +XXX,XX @@ static void z2_init(MachineState *machine)
     uint32_t sector_len = 0x10000;
     PXA2xxState *mpu;
     DriveInfo *dinfo;
-    int be;
     void *z2_lcd;
     I2CBus *bus;
     DeviceState *wm;
@@ -XXX,XX +XXX,XX @@ static void z2_init(MachineState *machine)
     /* Setup CPU & memory */
     mpu = pxa270_init(address_space_mem, z2_binfo.ram_size, machine->cpu_type);
 
-#ifdef TARGET_WORDS_BIGENDIAN
-    be = 1;
-#else
-    be = 0;
-#endif
     dinfo = drive_get(IF_PFLASH, 0, 0);
     if (!pflash_cfi01_register(Z2_FLASH_BASE, "z2.flash0", Z2_FLASH_SIZE,
                                dinfo ? blk_by_legacy_dinfo(dinfo) : NULL,
-                               sector_len, 4, 0, 0, 0, 0, be)) {
+                               sector_len, 4, 0, 0, 0, 0, 0)) {
         error_report("Error registering flash memory");
         exit(1);
     }
-- 
2.20.1

From: Philippe Mathieu-Daudé <philmd@redhat.com>

We only build the little-endian softmmu configurations. Checking
for big endian is pointless, so remove the unused code.

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/musicpal.c | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/hw/arm/musicpal.c b/hw/arm/musicpal.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/musicpal.c
+++ b/hw/arm/musicpal.c
@@ -XXX,XX +XXX,XX @@ static void musicpal_init(MachineState *machine)
          * 0xFF800000 (if there is 8 MB flash). So remap flash access if the
          * image is smaller than 32 MB.
          */
-#ifdef TARGET_WORDS_BIGENDIAN
-        pflash_cfi02_register(0x100000000ULL - MP_FLASH_SIZE_MAX,
-                              "musicpal.flash", flash_size,
-                              blk, 0x10000,
-                              MP_FLASH_SIZE_MAX / flash_size,
-                              2, 0x00BF, 0x236D, 0x0000, 0x0000,
-                              0x5555, 0x2AAA, 1);
-#else
         pflash_cfi02_register(0x100000000ULL - MP_FLASH_SIZE_MAX,
                               "musicpal.flash", flash_size,
                               blk, 0x10000,
                               MP_FLASH_SIZE_MAX / flash_size,
                               2, 0x00BF, 0x236D, 0x0000, 0x0000,
                               0x5555, 0x2AAA, 0);
-#endif
-
     }
     sysbus_create_simple(TYPE_MV88W8618_FLASHCFG, MP_FLASHCFG_BASE, NULL);
 
-- 
2.20.1

From: Pan Nengyuan <pannengyuan@huawei.com>

There are some memleaks when we call 'device_list_properties'. This patch moves timer_new from init into realize to fix it.
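
A toy model of the problem (not QEMU code; it assumes, as for the devices
touched in this series, that nothing frees the timer when an unrealized
object is dropped, and all sketch_* names are hypothetical):

```c
/* Toy model: counts timers that were created and have no owner to free them. */
struct sketch_dev {
    int timers;    /* timers currently allocated for the device */
    int realized;  /* non-zero once the device has been realized */
};

/* Old layout: the timer is created as soon as the object is instantiated. */
static void sketch_init_old(struct sketch_dev *d)
{
    d->realized = 0;
    d->timers = 1;
}

/* New layout: init only sets up state that needs no cleanup. */
static void sketch_init_new(struct sketch_dev *d)
{
    d->realized = 0;
    d->timers = 0;
}

/* New layout: the timer is created only when the device is really used. */
static void sketch_realize_new(struct sketch_dev *d)
{
    d->timers = 1;
    d->realized = 1;
}

/*
 * Model of a property-listing pass: the object is instantiated and then
 * dropped without ever being realized.  Returns the number of timers
 * that would be left behind.
 */
static int sketch_introspect(void (*init)(struct sketch_dev *))
{
    struct sketch_dev d;

    init(&d);
    return d.timers;
}
```

In this model the old init leaves one timer behind per introspection
pass, while the new init leaves none and a realized device still gets its
timer.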

Reported-by: Euler Robot <euler.robot@huawei.com>
Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com>
Message-id: 20200227025055.14341-3-pannengyuan@huawei.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/pxa2xx.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/hw/arm/pxa2xx.c b/hw/arm/pxa2xx.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/pxa2xx.c
+++ b/hw/arm/pxa2xx.c
@@ -XXX,XX +XXX,XX @@ static void pxa2xx_rtc_init(Object *obj)
     s->last_rtcpicr = 0;
     s->last_hz = s->last_sw = s->last_pi = qemu_clock_get_ms(rtc_clock);
 
+    sysbus_init_irq(dev, &s->rtc_irq);
+
+    memory_region_init_io(&s->iomem, obj, &pxa2xx_rtc_ops, s,
+                          "pxa2xx-rtc", 0x10000);
+    sysbus_init_mmio(dev, &s->iomem);
+}
+
+static void pxa2xx_rtc_realize(DeviceState *dev, Error **errp)
+{
+    PXA2xxRTCState *s = PXA2XX_RTC(dev);
     s->rtc_hz    = timer_new_ms(rtc_clock, pxa2xx_rtc_hz_tick,    s);
     s->rtc_rdal1 = timer_new_ms(rtc_clock, pxa2xx_rtc_rdal1_tick, s);
     s->rtc_rdal2 = timer_new_ms(rtc_clock, pxa2xx_rtc_rdal2_tick, s);
     s->rtc_swal1 = timer_new_ms(rtc_clock, pxa2xx_rtc_swal1_tick, s);
     s->rtc_swal2 = timer_new_ms(rtc_clock, pxa2xx_rtc_swal2_tick, s);
     s->rtc_pi    = timer_new_ms(rtc_clock, pxa2xx_rtc_pi_tick,    s);
-
-    sysbus_init_irq(dev, &s->rtc_irq);
-
-    memory_region_init_io(&s->iomem, obj, &pxa2xx_rtc_ops, s,
-                          "pxa2xx-rtc", 0x10000);
-    sysbus_init_mmio(dev, &s->iomem);
 }
 
 static int pxa2xx_rtc_pre_save(void *opaque)
@@ -XXX,XX +XXX,XX @@ static void pxa2xx_rtc_sysbus_class_init(ObjectClass *klass, void *data)
 
     dc->desc = "PXA2xx RTC Controller";
     dc->vmsd = &vmstate_pxa2xx_rtc_regs;
+    dc->realize = pxa2xx_rtc_realize;
 }
 
 static const TypeInfo pxa2xx_rtc_sysbus_info = {
-- 
2.20.1

From: Pan Nengyuan <pannengyuan@huawei.com>

There are some memleaks when we call 'device_list_properties'. This patch moves timer_new from init into realize to fix it.

Reported-by: Euler Robot <euler.robot@huawei.com>
Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com>
Message-id: 20200227025055.14341-4-pannengyuan@huawei.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/spitz.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/hw/arm/spitz.c b/hw/arm/spitz.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/spitz.c
+++ b/hw/arm/spitz.c
@@ -XXX,XX +XXX,XX @@ static void spitz_keyboard_init(Object *obj)
 
     spitz_keyboard_pre_map(s);
 
-    s->kbdtimer = timer_new_ns(QEMU_CLOCK_VIRTUAL, spitz_keyboard_tick, s);
     qdev_init_gpio_in(dev, spitz_keyboard_strobe, SPITZ_KEY_STROBE_NUM);
     qdev_init_gpio_out(dev, s->sense, SPITZ_KEY_SENSE_NUM);
 }
 
+static void spitz_keyboard_realize(DeviceState *dev, Error **errp)
+{
+    SpitzKeyboardState *s = SPITZ_KEYBOARD(dev);
+    s->kbdtimer = timer_new_ns(QEMU_CLOCK_VIRTUAL, spitz_keyboard_tick, s);
+}
+
 /* LCD backlight controller */
 
 #define LCDTG_RESCTL	0x00
@@ -XXX,XX +XXX,XX @@ static void spitz_keyboard_class_init(ObjectClass *klass, void *data)
     DeviceClass *dc = DEVICE_CLASS(klass);
 
     dc->vmsd = &vmstate_spitz_kbd;
+    dc->realize = spitz_keyboard_realize;
 }
 
 static const TypeInfo spitz_keyboard_info = {
-- 
2.20.1

From: Pan Nengyuan <pannengyuan@huawei.com>

There are some memleaks when we call 'device_list_properties'. This patch moves timer_new from init into realize to fix it.

Reported-by: Euler Robot <euler.robot@huawei.com>
Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com>
Message-id: 20200227025055.14341-5-pannengyuan@huawei.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/strongarm.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/hw/arm/strongarm.c b/hw/arm/strongarm.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/strongarm.c
+++ b/hw/arm/strongarm.c
@@ -XXX,XX +XXX,XX @@ static void strongarm_rtc_init(Object *obj)
     s->last_rcnr = (uint32_t) mktimegm(&tm);
     s->last_hz = qemu_clock_get_ms(rtc_clock);
 
-    s->rtc_alarm = timer_new_ms(rtc_clock, strongarm_rtc_alarm_tick, s);
-    s->rtc_hz = timer_new_ms(rtc_clock, strongarm_rtc_hz_tick, s);
-
     sysbus_init_irq(dev, &s->rtc_irq);
     sysbus_init_irq(dev, &s->rtc_hz_irq);
 
@@ -XXX,XX +XXX,XX @@ static void strongarm_rtc_init(Object *obj)
     sysbus_init_mmio(dev, &s->iomem);
 }
 
+static void strongarm_rtc_realize(DeviceState *dev, Error **errp)
+{
+    StrongARMRTCState *s = STRONGARM_RTC(dev);
+    s->rtc_alarm = timer_new_ms(rtc_clock, strongarm_rtc_alarm_tick, s);
+    s->rtc_hz = timer_new_ms(rtc_clock, strongarm_rtc_hz_tick, s);
+}
+
 static int strongarm_rtc_pre_save(void *opaque)
 {
     StrongARMRTCState *s = opaque;
@@ -XXX,XX +XXX,XX @@ static void strongarm_rtc_sysbus_class_init(ObjectClass *klass, void *data)
 
     dc->desc = "StrongARM RTC Controller";
     dc->vmsd = &vmstate_strongarm_rtc_regs;
+    dc->realize = strongarm_rtc_realize;
 }
 
 static const TypeInfo strongarm_rtc_sysbus_info = {
@@ -XXX,XX +XXX,XX @@ static void strongarm_uart_init(Object *obj)
                           "uart", 0x10000);
     sysbus_init_mmio(dev, &s->iomem);
     sysbus_init_irq(dev, &s->irq);
-
-    s->rx_timeout_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, strongarm_uart_rx_to, s);
-    s->tx_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, strongarm_uart_tx, s);
 }
 
 static void strongarm_uart_realize(DeviceState *dev, Error **errp)
 {
     StrongARMUARTState *s = STRONGARM_UART(dev);
 
+    s->rx_timeout_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL,
+                                       strongarm_uart_rx_to,
+                                       s);
+    s->tx_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, strongarm_uart_tx, s);
     qemu_chr_fe_set_handlers(&s->chr,
                              strongarm_uart_can_receive,
                              strongarm_uart_receive,
-- 
2.20.1

From: Pan Nengyuan <pannengyuan@huawei.com>

There are some memleaks when we call 'device_list_properties'. This patch moves timer_new from init into realize to fix it.

Reported-by: Euler Robot <euler.robot@huawei.com>
Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20200227025055.14341-7-pannengyuan@huawei.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/timer/cadence_ttc.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/hw/timer/cadence_ttc.c b/hw/timer/cadence_ttc.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/timer/cadence_ttc.c
+++ b/hw/timer/cadence_ttc.c
@@ -XXX,XX +XXX,XX @@ static void cadence_timer_init(uint32_t freq, CadenceTimerState *s)
 static void cadence_ttc_init(Object *obj)
 {
     CadenceTTCState *s = CADENCE_TTC(obj);
-    int i;
-
-    for (i = 0; i < 3; ++i) {
-        cadence_timer_init(133000000, &s->timer[i]);
-        sysbus_init_irq(SYS_BUS_DEVICE(obj), &s->timer[i].irq);
-    }
 
     memory_region_init_io(&s->iomem, obj, &cadence_ttc_ops, s,
                           "timer", 0x1000);
     sysbus_init_mmio(SYS_BUS_DEVICE(obj), &s->iomem);
 }
 
+static void cadence_ttc_realize(DeviceState *dev, Error **errp)
+{
+    CadenceTTCState *s = CADENCE_TTC(dev);
+    int i;
+
+    for (i = 0; i < 3; ++i) {
+        cadence_timer_init(133000000, &s->timer[i]);
+        sysbus_init_irq(SYS_BUS_DEVICE(dev), &s->timer[i].irq);
+    }
+}
+
 static int cadence_timer_pre_save(void *opaque)
 {
     cadence_timer_sync((CadenceTimerState *)opaque);
@@ -XXX,XX +XXX,XX @@ static void cadence_ttc_class_init(ObjectClass *klass, void *data)
     DeviceClass *dc = DEVICE_CLASS(klass);
 
     dc->vmsd = &vmstate_cadence_ttc;
+    dc->realize = cadence_ttc_realize;
 }
 
 static const TypeInfo cadence_ttc_info = {
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

Don't merely start with v8.0, handle v7VE as well.  Ensure that writes
from aarch32 mode do not change bits in the other half of the register.
Protect reads of aa64 id registers with ARM_FEATURE_AARCH64.
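
The half-register handling can be sketched on its own like this (a
simplified model rather than the patch itself: SKETCH_MASK and
sketch_deposit64() are stand-ins for MAKE_64BIT_MASK and deposit64(),
and feature_mask is an assumed parameter standing for the bits the CPU
implements):

```c
#include <stdint.h>

/* Stand-in for MAKE_64BIT_MASK(shift, length); valid for 1 <= length <= 64. */
#define SKETCH_MASK(shift, length) (((~0ULL) >> (64 - (length))) << (shift))

/* Stand-in for deposit64(): replace 'length' bits of 'value' at 'start'. */
static uint64_t sketch_deposit64(uint64_t value, int start, int length,
                                 uint64_t field)
{
    uint64_t mask = SKETCH_MASK(start, length);

    return (value & ~mask) | ((field << start) & mask);
}

/*
 * Hypothetical model of an AArch32 write to the high half: the new bits
 * are filtered by feature_mask, while the whole low half is added to the
 * valid mask so that it passes through unchanged.
 */
static uint64_t sketch_write_high(uint64_t old, uint32_t new_high,
                                  uint64_t feature_mask)
{
    uint64_t value = sketch_deposit64(old, 32, 32, new_high);
    uint64_t valid_mask = feature_mask | SKETCH_MASK(0, 32);

    return value & valid_mask;
}
```

Passing the untouched half in as extra valid bits is what keeps a 32-bit
write from clearing bits that only the other half's writer may change.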

Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200229012811.24129-2-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.c | 38 +++++++++++++++++++++++++-------------
 1 file changed, 25 insertions(+), 13 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo el3_no_el2_v8_cp_reginfo[] = {
     REGINFO_SENTINEL
 };
 
-static void hcr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
+static void do_hcr_write(CPUARMState *env, uint64_t value, uint64_t valid_mask)
 {
     ARMCPU *cpu = env_archcpu(env);
-    /* Begin with bits defined in base ARMv8.0.  */
-    uint64_t valid_mask = MAKE_64BIT_MASK(0, 34);
+
+    if (arm_feature(env, ARM_FEATURE_V8)) {
+        valid_mask |= MAKE_64BIT_MASK(0, 34);  /* ARMv8.0 */
+    } else {
+        valid_mask |= MAKE_64BIT_MASK(0, 28);  /* ARMv7VE */
+    }
 
     if (arm_feature(env, ARM_FEATURE_EL3)) {
         valid_mask &= ~HCR_HCD;
@@ -XXX,XX +XXX,XX @@ static void hcr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
          */
         valid_mask &= ~HCR_TSC;
     }
-    if (cpu_isar_feature(aa64_vh, cpu)) {
-        valid_mask |= HCR_E2H;
-    }
-    if (cpu_isar_feature(aa64_lor, cpu)) {
-        valid_mask |= HCR_TLOR;
-    }
-    if (cpu_isar_feature(aa64_pauth, cpu)) {
-        valid_mask |= HCR_API | HCR_APK;
+
+    if (arm_feature(env, ARM_FEATURE_AARCH64)) {
+        if (cpu_isar_feature(aa64_vh, cpu)) {
+            valid_mask |= HCR_E2H;
+        }
+        if (cpu_isar_feature(aa64_lor, cpu)) {
+            valid_mask |= HCR_TLOR;
+        }
+        if (cpu_isar_feature(aa64_pauth, cpu)) {
+            valid_mask |= HCR_API | HCR_APK;
+        }
     }
 
     /* Clear RES0 bits.  */
@@ -XXX,XX +XXX,XX @@ static void hcr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
     arm_cpu_update_vfiq(cpu);
 }
 
+static void hcr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
+{
+    do_hcr_write(env, value, 0);
+}
+
 static void hcr_writehigh(CPUARMState *env, const ARMCPRegInfo *ri,
                           uint64_t value)
 {
     /* Handle HCR2 write, i.e. write to high half of HCR_EL2 */
     value = deposit64(env->cp15.hcr_el2, 32, 32, value);
-    hcr_write(env, NULL, value);
+    do_hcr_write(env, value, MAKE_64BIT_MASK(0, 32));
 }
 
 static void hcr_writelow(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -XXX,XX +XXX,XX @@ static void hcr_writelow(CPUARMState *env, const ARMCPRegInfo *ri,
 {
     /* Handle HCR write, i.e. write to low half of HCR_EL2 */
     value = deposit64(env->cp15.hcr_el2, 0, 32, value);
-    hcr_write(env, NULL, value);
+    do_hcr_write(env, value, MAKE_64BIT_MASK(32, 32));
 }
 
 /*
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200229012811.24129-3-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ static inline void xpsr_write(CPUARMState *env, uint32_t val, uint32_t mask)
 #define HCR_TERR      (1ULL << 36)
 #define HCR_TEA       (1ULL << 37)
 #define HCR_MIOCNCE   (1ULL << 38)
+/* RES0 bit 39 */
 #define HCR_APK       (1ULL << 40)
 #define HCR_API       (1ULL << 41)
 #define HCR_NV        (1ULL << 42)
@@ -XXX,XX +XXX,XX @@ static inline void xpsr_write(CPUARMState *env, uint32_t val, uint32_t mask)
 #define HCR_NV2       (1ULL << 45)
 #define HCR_FWB       (1ULL << 46)
 #define HCR_FIEN      (1ULL << 47)
+/* RES0 bit 48 */
 #define HCR_TID4      (1ULL << 49)
 #define HCR_TICAB     (1ULL << 50)
+#define HCR_AMVOFFEN  (1ULL << 51)
 #define HCR_TOCU      (1ULL << 52)
+#define HCR_ENSCXT    (1ULL << 53)
 #define HCR_TTLBIS    (1ULL << 54)
 #define HCR_TTLBOS    (1ULL << 55)
 #define HCR_ATA       (1ULL << 56)
 #define HCR_DCT       (1ULL << 57)
+#define HCR_TID5      (1ULL << 58)
+#define HCR_TWEDEN    (1ULL << 59)
+#define HCR_TWEDEL    MAKE_64BIT_MASK(60, 4)
 
 #define SCR_NS                (1U << 0)
 #define SCR_IRQ               (1U << 1)
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

In arm_cpu_reset, we configure many system registers so that user-only
behaves as it should with a minimum of ifdefs.  However, we do not set
all of the system registers as required for a cpu with EL2 and EL3.

Disabling EL2 and EL3 means that we will not look at those registers,
so we don't have to worry about configuring them.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200229012811.24129-4-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static Property arm_cpu_reset_hivecs_property =
 static Property arm_cpu_rvbar_property =
             DEFINE_PROP_UINT64("rvbar", ARMCPU, rvbar, 0);
 
+#ifndef CONFIG_USER_ONLY
 static Property arm_cpu_has_el2_property =
             DEFINE_PROP_BOOL("has_el2", ARMCPU, has_el2, true);
 
 static Property arm_cpu_has_el3_property =
             DEFINE_PROP_BOOL("has_el3", ARMCPU, has_el3, true);
+#endif
 
 static Property arm_cpu_cfgend_property =
             DEFINE_PROP_BOOL("cfgend", ARMCPU, cfgend, false);
@@ -XXX,XX +XXX,XX @@ void arm_cpu_post_init(Object *obj)
         qdev_property_add_static(DEVICE(obj), &arm_cpu_rvbar_property);
     }
 
+#ifndef CONFIG_USER_ONLY
     if (arm_feature(&cpu->env, ARM_FEATURE_EL3)) {
         /* Add the has_el3 state CPU property only if EL3 is allowed.  This will
          * prevent "has_el3" from existing on CPUs which cannot support EL3.
          */
         qdev_property_add_static(DEVICE(obj), &arm_cpu_has_el3_property);
 
-#ifndef CONFIG_USER_ONLY
         object_property_add_link(obj, "secure-memory",
                                  TYPE_MEMORY_REGION,
                                  (Object **)&cpu->secure_memory,
                                  qdev_prop_allow_set_link_before_realize,
                                  OBJ_PROP_LINK_STRONG,
                                  &error_abort);
-#endif
     }
 
     if (arm_feature(&cpu->env, ARM_FEATURE_EL2)) {
         qdev_property_add_static(DEVICE(obj), &arm_cpu_has_el2_property);
     }
+#endif
 
     if (arm_feature(&cpu->env, ARM_FEATURE_PMU)) {
         cpu->has_pmu = true;
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

We have disabled EL2 and EL3 for user-only, which means that these
registers "don't exist" and should not be set.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200229012811.24129-5-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_reset(CPUState *s)
         /* Enable all PAC keys.  */
         env->cp15.sctlr_el[1] |= (SCTLR_EnIA | SCTLR_EnIB |
                                   SCTLR_EnDA | SCTLR_EnDB);
-        /* Enable all PAC instructions */
-        env->cp15.hcr_el2 |= HCR_API;
-        env->cp15.scr_el3 |= SCR_API;
         /* and to the FP/Neon instructions */
         env->cp15.cpacr_el1 = deposit64(env->cp15.cpacr_el1, 20, 2, 3);
         /* and to the SVE instructions */
         env->cp15.cpacr_el1 = deposit64(env->cp15.cpacr_el1, 16, 2, 3);
-        env->cp15.cptr_el[3] |= CPTR_EZ;
         /* with maximum vector length */
         env->vfp.zcr_el[1] = cpu_isar_feature(aa64_sve, cpu) ?
                              cpu->sve_max_vq - 1 : 0;
-        env->vfp.zcr_el[2] = env->vfp.zcr_el[1];
-        env->vfp.zcr_el[3] = env->vfp.zcr_el[1];
         /*
          * Enable TBI0 and TBI1.  While the real kernel only enables TBI0,
          * turning on both here will produce smaller code and otherwise
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

Update the {TGE,E2H} == '11' masking to ARMv8.6.
If EL2 is configured for AArch32, disable all of
the bits that are RES0 in AArch32 mode.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200229012811.24129-6-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.c | 31 +++++++++++++++++++++++++++----
 1 file changed, 27 insertions(+), 4 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ uint64_t arm_hcr_el2_eff(CPUARMState *env)
          * Since the v8.4 language applies to the entire register, and
          * appears to be backward compatible, use that.
          */
-        ret = 0;
-    } else if (ret & HCR_TGE) {
-        /* These bits are up-to-date as of ARMv8.4.  */
+        return 0;
+    }
+
+    /*
+     * For a cpu that supports both aarch64 and aarch32, we can set bits
+     * in HCR_EL2 (e.g. via EL3) that are RES0 when we enter EL2 as aa32.
+     * Ignore all of the bits in HCR+HCR2 that are not valid for aarch32.
+     */
+    if (!arm_el_is_aa64(env, 2)) {
+        uint64_t aa32_valid;
+
+        /*
+         * These bits are up-to-date as of ARMv8.6.
+         * For HCR, it's easiest to list just the 2 bits that are invalid.
+         * For HCR2, list those that are valid.
+         */
+        aa32_valid = MAKE_64BIT_MASK(0, 32) & ~(HCR_RW | HCR_TDZ);
+        aa32_valid |= (HCR_CD | HCR_ID | HCR_TERR | HCR_TEA | HCR_MIOCNCE |
+                       HCR_TID4 | HCR_TICAB | HCR_TOCU | HCR_TTLBIS);
+        ret &= aa32_valid;
+    }
+
+    if (ret & HCR_TGE) {
+        /* These bits are up-to-date as of ARMv8.6.  */
         if (ret & HCR_E2H) {
             ret &= ~(HCR_VM | HCR_FMO | HCR_IMO | HCR_AMO |
                      HCR_BSU_MASK | HCR_DC | HCR_TWI | HCR_TWE |
                      HCR_TID0 | HCR_TID2 | HCR_TPCP | HCR_TPU |
-                     HCR_TDZ | HCR_CD | HCR_ID | HCR_MIOCNCE);
+                     HCR_TDZ | HCR_CD | HCR_ID | HCR_MIOCNCE |
+                     HCR_TID4 | HCR_TICAB | HCR_TOCU | HCR_ENSCXT |
+                     HCR_TTLBIS | HCR_TTLBOS | HCR_TID5);
         } else {
             ret |= HCR_FMO | HCR_IMO | HCR_AMO;
         }
-- 
2.20.1
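
The AArch32 filtering added to arm_hcr_el2_eff() above can be sketched in isolation. This is a minimal standalone illustration, not the QEMU code: MASK64, HCR_AA32_VALID and hcr_aa32_filter are hypothetical names, and the bit positions for HCR_TDZ (28), HCR_RW (31), HCR_CD (32) and HCR_ID (33) are assumed from the Arm definition of HCR_EL2 rather than taken from this excerpt.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical local equivalent of a (shift, length) 64-bit mask macro. */
#define MASK64(shift, len) (((~0ULL) >> (64 - (len))) << (shift))

/* Assumed bit positions (not shown in the patch excerpt). */
#define HCR_TDZ     (1ULL << 28)
#define HCR_RW      (1ULL << 31)
#define HCR_CD      (1ULL << 32)
#define HCR_ID      (1ULL << 33)
/* Bit positions as listed in the cpu.h hunk earlier in the series. */
#define HCR_TERR    (1ULL << 36)
#define HCR_TEA     (1ULL << 37)
#define HCR_MIOCNCE (1ULL << 38)
#define HCR_TID4    (1ULL << 49)
#define HCR_TICAB   (1ULL << 50)
#define HCR_TOCU    (1ULL << 52)
#define HCR_TTLBIS  (1ULL << 54)

/* Low half: everything except RW and TDZ. High half: an explicit list. */
#define HCR_AA32_VALID \
    ((MASK64(0, 32) & ~(HCR_RW | HCR_TDZ)) | \
     HCR_CD | HCR_ID | HCR_TERR | HCR_TEA | HCR_MIOCNCE | \
     HCR_TID4 | HCR_TICAB | HCR_TOCU | HCR_TTLBIS)

static_assert((HCR_AA32_VALID & (HCR_RW | HCR_TDZ)) == 0,
              "RW and TDZ must not survive the AArch32 filter");

/* Drop every HCR_EL2 bit that is RES0 when EL2 runs as AArch32. */
static uint64_t hcr_aa32_filter(uint64_t hcr)
{
    return hcr & HCR_AA32_VALID;
}
```

Listing the invalid bits for the low half and the valid bits for the high half keeps both lists short, which matches the approach described in the patch comment.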

From: Richard Henderson <richard.henderson@linaro.org>

These bits trap EL1 access to various virtual memory controls.

Buglink: https://bugs.launchpad.net/bugs/1855072
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200229012811.24129-7-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.c | 82 ++++++++++++++++++++++++++++++---------------
 1 file changed, 55 insertions(+), 27 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static CPAccessResult access_tpm(CPUARMState *env, const ARMCPRegInfo *ri,
     return CP_ACCESS_OK;
 }
 
+/* Check for traps from EL1 due to HCR_EL2.TVM and HCR_EL2.TRVM.  */
+static CPAccessResult access_tvm_trvm(CPUARMState *env, const ARMCPRegInfo *ri,
+                                      bool isread)
+{
+    if (arm_current_el(env) == 1) {
+        uint64_t trap = isread ? HCR_TRVM : HCR_TVM;
+        if (arm_hcr_el2_eff(env) & trap) {
+            return CP_ACCESS_TRAP_EL2;
+        }
+    }
+    return CP_ACCESS_OK;
+}
+
 static void dacr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
 {
     ARMCPU *cpu = env_archcpu(env);
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo cp_reginfo[] = {
      */
     { .name = "CONTEXTIDR_EL1", .state = ARM_CP_STATE_BOTH,
       .opc0 = 3, .opc1 = 0, .crn = 13, .crm = 0, .opc2 = 1,
-      .access = PL1_RW, .secure = ARM_CP_SECSTATE_NS,
+      .access = PL1_RW, .accessfn = access_tvm_trvm,
+      .secure = ARM_CP_SECSTATE_NS,
       .fieldoffset = offsetof(CPUARMState, cp15.contextidr_el[1]),
       .resetvalue = 0, .writefn = contextidr_write, .raw_writefn = raw_write, },
     { .name = "CONTEXTIDR_S", .state = ARM_CP_STATE_AA32,
       .cp = 15, .opc1 = 0, .crn = 13, .crm = 0, .opc2 = 1,
-      .access = PL1_RW, .secure = ARM_CP_SECSTATE_S,
+      .access = PL1_RW, .accessfn = access_tvm_trvm,
+      .secure = ARM_CP_SECSTATE_S,
       .fieldoffset = offsetof(CPUARMState, cp15.contextidr_s),
       .resetvalue = 0, .writefn = contextidr_write, .raw_writefn = raw_write, },
     REGINFO_SENTINEL
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo not_v8_cp_reginfo[] = {
     /* MMU Domain access control / MPU write buffer control */
     { .name = "DACR",
       .cp = 15, .opc1 = CP_ANY, .crn = 3, .crm = CP_ANY, .opc2 = CP_ANY,
-      .access = PL1_RW, .resetvalue = 0,
+      .access = PL1_RW, .accessfn = access_tvm_trvm, .resetvalue = 0,
       .writefn = dacr_write, .raw_writefn = raw_write,
       .bank_fieldoffsets = { offsetoflow32(CPUARMState, cp15.dacr_s),
                              offsetoflow32(CPUARMState, cp15.dacr_ns) } },
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v6_cp_reginfo[] = {
     { .name = "DMB", .cp = 15, .crn = 7, .crm = 10, .opc1 = 0, .opc2 = 5,
       .access = PL0_W, .type = ARM_CP_NOP },
     { .name = "IFAR", .cp = 15, .crn = 6, .crm = 0, .opc1 = 0, .opc2 = 2,
-      .access = PL1_RW,
+      .access = PL1_RW, .accessfn = access_tvm_trvm,
       .bank_fieldoffsets = { offsetof(CPUARMState, cp15.ifar_s),
                              offsetof(CPUARMState, cp15.ifar_ns) },
       .resetvalue = 0, },
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
      */
     { .name = "AFSR0_EL1", .state = ARM_CP_STATE_BOTH,
       .opc0 = 3, .opc1 = 0, .crn = 5, .crm = 1, .opc2 = 0,
-      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+      .access = PL1_RW, .accessfn = access_tvm_trvm,
+      .type = ARM_CP_CONST, .resetvalue = 0 },
     { .name = "AFSR1_EL1", .state = ARM_CP_STATE_BOTH,
       .opc0 = 3, .opc1 = 0, .crn = 5, .crm = 1, .opc2 = 1,
-      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+      .access = PL1_RW, .accessfn = access_tvm_trvm,
+      .type = ARM_CP_CONST, .resetvalue = 0 },
     /* MAIR can just read-as-written because we don't implement caches
      * and so don't need to care about memory attributes.
      */
     { .name = "MAIR_EL1", .state = ARM_CP_STATE_AA64,
       .opc0 = 3, .opc1 = 0, .crn = 10, .crm = 2, .opc2 = 0,
-      .access = PL1_RW, .fieldoffset = offsetof(CPUARMState, cp15.mair_el[1]),
+      .access = PL1_RW, .accessfn = access_tvm_trvm,
+      .fieldoffset = offsetof(CPUARMState, cp15.mair_el[1]),
       .resetvalue = 0 },
     { .name = "MAIR_EL3", .state = ARM_CP_STATE_AA64,
       .opc0 = 3, .opc1 = 6, .crn = 10, .crm = 2, .opc2 = 0,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
       * handled in the field definitions.
       */
     { .name = "MAIR0", .state = ARM_CP_STATE_AA32,
-      .cp = 15, .opc1 = 0, .crn = 10, .crm = 2, .opc2 = 0, .access = PL1_RW,
+      .cp = 15, .opc1 = 0, .crn = 10, .crm = 2, .opc2 = 0,
+      .access = PL1_RW, .accessfn = access_tvm_trvm,
       .bank_fieldoffsets = { offsetof(CPUARMState, cp15.mair0_s),
                              offsetof(CPUARMState, cp15.mair0_ns) },
       .resetfn = arm_cp_reset_ignore },
     { .name = "MAIR1", .state = ARM_CP_STATE_AA32,
-      .cp = 15, .opc1 = 0, .crn = 10, .crm = 2, .opc2 = 1, .access = PL1_RW,
+      .cp = 15, .opc1 = 0, .crn = 10, .crm = 2, .opc2 = 1,
+      .access = PL1_RW, .accessfn = access_tvm_trvm,
       .bank_fieldoffsets = { offsetof(CPUARMState, cp15.mair1_s),
                              offsetof(CPUARMState, cp15.mair1_ns) },
       .resetfn = arm_cp_reset_ignore },
@@ -XXX,XX +XXX,XX @@ static void vttbr_write(CPUARMState *env, const ARMCPRegInfo *ri,
 
 static const ARMCPRegInfo vmsa_pmsa_cp_reginfo[] = {
     { .name = "DFSR", .cp = 15, .crn = 5, .crm = 0, .opc1 = 0, .opc2 = 0,
-      .access = PL1_RW, .type = ARM_CP_ALIAS,
+      .access = PL1_RW, .accessfn = access_tvm_trvm, .type = ARM_CP_ALIAS,
       .bank_fieldoffsets = { offsetoflow32(CPUARMState, cp15.dfsr_s),
                              offsetoflow32(CPUARMState, cp15.dfsr_ns) }, },
     { .name = "IFSR", .cp = 15, .crn = 5, .crm = 0, .opc1 = 0, .opc2 = 1,
-      .access = PL1_RW, .resetvalue = 0,
+      .access = PL1_RW, .accessfn = access_tvm_trvm, .resetvalue = 0,
       .bank_fieldoffsets = { offsetoflow32(CPUARMState, cp15.ifsr_s),
                              offsetoflow32(CPUARMState, cp15.ifsr_ns) } },
     { .name = "DFAR", .cp = 15, .opc1 = 0, .crn = 6, .crm = 0, .opc2 = 0,
-      .access = PL1_RW, .resetvalue = 0,
+      .access = PL1_RW, .accessfn = access_tvm_trvm, .resetvalue = 0,
       .bank_fieldoffsets = { offsetof(CPUARMState, cp15.dfar_s),
                              offsetof(CPUARMState, cp15.dfar_ns) } },
     { .name = "FAR_EL1", .state = ARM_CP_STATE_AA64,
       .opc0 = 3, .crn = 6, .crm = 0, .opc1 = 0, .opc2 = 0,
-      .access = PL1_RW, .fieldoffset = offsetof(CPUARMState, cp15.far_el[1]),
+      .access = PL1_RW, .accessfn = access_tvm_trvm,
+      .fieldoffset = offsetof(CPUARMState, cp15.far_el[1]),
       .resetvalue = 0, },
     REGINFO_SENTINEL
 };
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo vmsa_pmsa_cp_reginfo[] = {
 static const ARMCPRegInfo vmsa_cp_reginfo[] = {
     { .name = "ESR_EL1", .state = ARM_CP_STATE_AA64,
       .opc0 = 3, .crn = 5, .crm = 2, .opc1 = 0, .opc2 = 0,
-      .access = PL1_RW,
+      .access = PL1_RW, .accessfn = access_tvm_trvm,
       .fieldoffset = offsetof(CPUARMState, cp15.esr_el[1]), .resetvalue = 0, },
     { .name = "TTBR0_EL1", .state = ARM_CP_STATE_BOTH,
       .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 0, .opc2 = 0,
-      .access = PL1_RW, .writefn = vmsa_ttbr_write, .resetvalue = 0,
+      .access = PL1_RW, .accessfn = access_tvm_trvm,
+      .writefn = vmsa_ttbr_write, .resetvalue = 0,
       .bank_fieldoffsets = { offsetof(CPUARMState, cp15.ttbr0_s),
                              offsetof(CPUARMState, cp15.ttbr0_ns) } },
     { .name = "TTBR1_EL1", .state = ARM_CP_STATE_BOTH,
       .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 0, .opc2 = 1,
-      .access = PL1_RW, .writefn = vmsa_ttbr_write, .resetvalue = 0,
+      .access = PL1_RW, .accessfn = access_tvm_trvm,
+      .writefn = vmsa_ttbr_write, .resetvalue = 0,
       .bank_fieldoffsets = { offsetof(CPUARMState, cp15.ttbr1_s),
                              offsetof(CPUARMState, cp15.ttbr1_ns) } },
     { .name = "TCR_EL1", .state = ARM_CP_STATE_AA64,
       .opc0 = 3, .crn = 2, .crm = 0, .opc1 = 0, .opc2 = 2,
-      .access = PL1_RW, .writefn = vmsa_tcr_el12_write,
+      .access = PL1_RW, .accessfn = access_tvm_trvm,
+      .writefn = vmsa_tcr_el12_write,
       .resetfn = vmsa_ttbcr_reset, .raw_writefn = raw_write,
       .fieldoffset = offsetof(CPUARMState, cp15.tcr_el[1]) },
     { .name = "TTBCR", .cp = 15, .crn = 2, .crm = 0, .opc1 = 0, .opc2 = 2,
-      .access = PL1_RW, .type = ARM_CP_ALIAS, .writefn = vmsa_ttbcr_write,
+      .access = PL1_RW, .accessfn = access_tvm_trvm,
+      .type = ARM_CP_ALIAS, .writefn = vmsa_ttbcr_write,
       .raw_writefn = vmsa_ttbcr_raw_write,
       .bank_fieldoffsets = { offsetoflow32(CPUARMState, cp15.tcr_el[3]),
                              offsetoflow32(CPUARMState, cp15.tcr_el[1])} },
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo vmsa_cp_reginfo[] = {
  */
 static const ARMCPRegInfo ttbcr2_reginfo = {
     .name = "TTBCR2", .cp = 15, .opc1 = 0, .crn = 2, .crm = 0, .opc2 = 3,
-    .access = PL1_RW, .type = ARM_CP_ALIAS,
+    .access = PL1_RW, .accessfn = access_tvm_trvm,
+    .type = ARM_CP_ALIAS,
     .bank_fieldoffsets = { offsetofhigh32(CPUARMState, cp15.tcr_el[3]),
                            offsetofhigh32(CPUARMState, cp15.tcr_el[1]) },
 };
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo lpae_cp_reginfo[] = {
     /* NOP AMAIR0/1 */
     { .name = "AMAIR0", .state = ARM_CP_STATE_BOTH,
       .opc0 = 3, .crn = 10, .crm = 3, .opc1 = 0, .opc2 = 0,
-      .access = PL1_RW, .type = ARM_CP_CONST,
-      .resetvalue = 0 },
+      .access = PL1_RW, .accessfn = access_tvm_trvm,
+      .type = ARM_CP_CONST, .resetvalue = 0 },
     /* AMAIR1 is mapped to AMAIR_EL1[63:32] */
     { .name = "AMAIR1", .cp = 15, .crn = 10, .crm = 3, .opc1 = 0, .opc2 = 1,
-      .access = PL1_RW, .type = ARM_CP_CONST,
-      .resetvalue = 0 },
+      .access = PL1_RW, .accessfn = access_tvm_trvm,
+      .type = ARM_CP_CONST, .resetvalue = 0 },
     { .name = "PAR", .cp = 15, .crm = 7, .opc1 = 0,
       .access = PL1_RW, .type = ARM_CP_64BIT, .resetvalue = 0,
       .bank_fieldoffsets = { offsetof(CPUARMState, cp15.par_s),
                              offsetof(CPUARMState, cp15.par_ns)} },
     { .name = "TTBR0", .cp = 15, .crm = 2, .opc1 = 0,
-      .access = PL1_RW, .type = ARM_CP_64BIT | ARM_CP_ALIAS,
+      .access = PL1_RW, .accessfn = access_tvm_trvm,
+      .type = ARM_CP_64BIT | ARM_CP_ALIAS,
       .bank_fieldoffsets = { offsetof(CPUARMState, cp15.ttbr0_s),
                              offsetof(CPUARMState, cp15.ttbr0_ns) },
       .writefn = vmsa_ttbr_write, },
     { .name = "TTBR1", .cp = 15, .crm = 2, .opc1 = 1,
-      .access = PL1_RW, .type = ARM_CP_64BIT | ARM_CP_ALIAS,
+      .access = PL1_RW, .accessfn = access_tvm_trvm,
+      .type = ARM_CP_64BIT | ARM_CP_ALIAS,
       .bank_fieldoffsets = { offsetof(CPUARMState, cp15.ttbr1_s),
                              offsetof(CPUARMState, cp15.ttbr1_ns) },
       .writefn = vmsa_ttbr_write, },
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
       .type = ARM_CP_NOP, .access = PL1_W },
     /* MMU Domain access control / MPU write buffer control */
     { .name = "DACR", .cp = 15, .opc1 = 0, .crn = 3, .crm = 0, .opc2 = 0,
-      .access = PL1_RW, .resetvalue = 0,
+      .access = PL1_RW, .accessfn = access_tvm_trvm, .resetvalue = 0,
       .writefn = dacr_write, .raw_writefn = raw_write,
       .bank_fieldoffsets = { offsetoflow32(CPUARMState, cp15.dacr_s),
                              offsetoflow32(CPUARMState, cp15.dacr_ns) } },
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
         ARMCPRegInfo sctlr = {
             .name = "SCTLR", .state = ARM_CP_STATE_BOTH,
             .opc0 = 3, .opc1 = 0, .crn = 1, .crm = 0, .opc2 = 0,
-            .access = PL1_RW,
+            .access = PL1_RW, .accessfn = access_tvm_trvm,
             .bank_fieldoffsets = { offsetof(CPUARMState, cp15.sctlr_s),
                                    offsetof(CPUARMState, cp15.sctlr_ns) },
             .writefn = sctlr_write, .resetvalue = cpu->reset_sctlr,
-- 
2.20.1
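
The trap selection in access_tvm_trvm() above can be sketched as a pure function. This is a simplified illustration under stated assumptions, not the QEMU implementation: AccessResult and check_tvm_trvm are hypothetical names, the effective HCR_EL2 value is passed in directly, and the bit positions for HCR_TVM (26) and HCR_TRVM (30) are assumed from the Arm definition of HCR_EL2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions (not shown in the patch excerpt). */
#define HCR_TVM  (1ULL << 26)
#define HCR_TRVM (1ULL << 30)

/* Hypothetical result type for this sketch. */
typedef enum {
    ACCESS_OK,
    ACCESS_TRAP_EL2,
} AccessResult;

/*
 * Reads are governed by TRVM and writes by TVM. Only accesses made
 * from EL1 are candidates for the trap to EL2.
 */
static AccessResult check_tvm_trvm(int current_el, uint64_t hcr_eff,
                                   bool isread)
{
    assert(current_el >= 0 && current_el <= 3);

    if (current_el == 1) {
        uint64_t trap = isread ? HCR_TRVM : HCR_TVM;
        if (hcr_eff & trap) {
            return ACCESS_TRAP_EL2;
        }
    }
    return ACCESS_OK;
}
```

With only HCR_TVM set, a write from EL1 traps while a read from EL1 does not, which is why a single access function can serve every register in the tables above.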

From: Richard Henderson <richard.henderson@linaro.org>

These bits trap EL1 access to set/way cache maintenance insns.

Buglink: https://bugs.launchpad.net/bugs/1863685
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200229012811.24129-8-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.c | 22 ++++++++++++++++------
 1 file changed, 16 insertions(+), 6 deletions(-)

From: Richard Henderson <richard.henderson@linaro.org>

This bit traps EL1 access to the auxiliary control registers.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200229012811.24129-9-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

From: Richard Henderson <richard.henderson@linaro.org>

This bit traps EL1 access to cache maintenance insns that operate
to the point of coherency or persistence.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200229012811.24129-10-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.c | 39 +++++++++++++++++++++++++++++++--------
 1 file changed, 31 insertions(+), 8 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static CPAccessResult aa64_cacheop_access(CPUARMState *env,
     return CP_ACCESS_OK;
 }
 
+static CPAccessResult aa64_cacheop_poc_access(CPUARMState *env,
+                                              const ARMCPRegInfo *ri,
+                                              bool isread)
+{
+    /* Cache invalidate/clean to Point of Coherency or Persistence...  */
+    switch (arm_current_el(env)) {
+    case 0:
+        /* ... EL0 must UNDEF unless SCTLR_EL1.UCI is set.  */
+        if (!(arm_sctlr(env, 0) & SCTLR_UCI)) {
+            return CP_ACCESS_TRAP;
+        }
+        /* fall through */
+    case 1:
+        /* ... EL1 must trap to EL2 if HCR_EL2.TPCP is set.  */
+        if (arm_hcr_el2_eff(env) & HCR_TPCP) {
+            return CP_ACCESS_TRAP_EL2;
+        }
+        break;
+    }
+    return CP_ACCESS_OK;
+}
+
 /* See: D4.7.2 TLB maintenance requirements and the TLB maintenance instructions
  * Page D4-1736 (DDI0487A.b)
  */
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
       .accessfn = aa64_cacheop_access },
     { .name = "DC_IVAC", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 6, .opc2 = 1,
-      .access = PL1_W, .type = ARM_CP_NOP },
+      .access = PL1_W, .accessfn = aa64_cacheop_poc_access,
+      .type = ARM_CP_NOP },
     { .name = "DC_ISW", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 6, .opc2 = 2,
       .access = PL1_W, .accessfn = access_tsw, .type = ARM_CP_NOP },
     { .name = "DC_CVAC", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 10, .opc2 = 1,
       .access = PL0_W, .type = ARM_CP_NOP,
-      .accessfn = aa64_cacheop_access },
+      .accessfn = aa64_cacheop_poc_access },
     { .name = "DC_CSW", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 10, .opc2 = 2,
       .access = PL1_W, .accessfn = access_tsw, .type = ARM_CP_NOP },
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
     { .name = "DC_CIVAC", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 14, .opc2 = 1,
       .access = PL0_W, .type = ARM_CP_NOP,
-      .accessfn = aa64_cacheop_access },
+      .accessfn = aa64_cacheop_poc_access },
     { .name = "DC_CISW", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 14, .opc2 = 2,
       .access = PL1_W, .accessfn = access_tsw, .type = ARM_CP_NOP },
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
     { .name = "BPIMVA", .cp = 15, .opc1 = 0, .crn = 7, .crm = 5, .opc2 = 7,
       .type = ARM_CP_NOP, .access = PL1_W },
     { .name = "DCIMVAC", .cp = 15, .opc1 = 0, .crn = 7, .crm = 6, .opc2 = 1,
-      .type = ARM_CP_NOP, .access = PL1_W },
+      .type = ARM_CP_NOP, .access = PL1_W, .accessfn = aa64_cacheop_poc_access },
     { .name = "DCISW", .cp = 15, .opc1 = 0, .crn = 7, .crm = 6, .opc2 = 2,
       .type = ARM_CP_NOP, .access = PL1_W, .accessfn = access_tsw },
     { .name = "DCCMVAC", .cp = 15, .opc1 = 0, .crn = 7, .crm = 10, .opc2 = 1,
-      .type = ARM_CP_NOP, .access = PL1_W },
+      .type = ARM_CP_NOP, .access = PL1_W, .accessfn = aa64_cacheop_poc_access },
     { .name = "DCCSW", .cp = 15, .opc1 = 0, .crn = 7, .crm = 10, .opc2 = 2,
       .type = ARM_CP_NOP, .access = PL1_W, .accessfn = access_tsw },
     { .name = "DCCMVAU", .cp = 15, .opc1 = 0, .crn = 7, .crm = 11, .opc2 = 1,
       .type = ARM_CP_NOP, .access = PL1_W },
     { .name = "DCCIMVAC", .cp = 15, .opc1 = 0, .crn = 7, .crm = 14, .opc2 = 1,
-      .type = ARM_CP_NOP, .access = PL1_W },
+      .type = ARM_CP_NOP, .access = PL1_W, .accessfn = aa64_cacheop_poc_access },
     { .name = "DCCISW", .cp = 15, .opc1 = 0, .crn = 7, .crm = 14, .opc2 = 2,
       .type = ARM_CP_NOP, .access = PL1_W, .accessfn = access_tsw },
     /* MMU Domain access control / MPU write buffer control */
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo dcpop_reg[] = {
     { .name = "DC_CVAP", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 12, .opc2 = 1,
       .access = PL0_W, .type = ARM_CP_NO_RAW | ARM_CP_SUPPRESS_TB_END,
-      .accessfn = aa64_cacheop_access, .writefn = dccvap_writefn },
+      .accessfn = aa64_cacheop_poc_access, .writefn = dccvap_writefn },
     REGINFO_SENTINEL
 };
 
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo dcpodp_reg[] = {
     { .name = "DC_CVADP", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 13, .opc2 = 1,
       .access = PL0_W, .type = ARM_CP_NO_RAW | ARM_CP_SUPPRESS_TB_END,
-      .accessfn = aa64_cacheop_access, .writefn = dccvap_writefn },
+      .accessfn = aa64_cacheop_poc_access, .writefn = dccvap_writefn },
     REGINFO_SENTINEL
 };
 #endif /*CONFIG_USER_ONLY*/
-- 
2.20.1
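
The fall-through structure of aa64_cacheop_poc_access() above can be sketched as a pure function. This is a simplified illustration under stated assumptions, not the QEMU implementation: CacheopResult and check_cacheop_poc are hypothetical names, the SCTLR_EL1 and effective HCR_EL2 values are passed in directly, and the bit positions for SCTLR_UCI (26) and HCR_TPCP (23) are assumed from the Arm register definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions (not shown in the patch excerpt). */
#define SCTLR_UCI (1ULL << 26)
#define HCR_TPCP  (1ULL << 23)

/* Hypothetical result type for this sketch. */
typedef enum {
    CACHEOP_OK,
    CACHEOP_TRAP,      /* UNDEF at the current level */
    CACHEOP_TRAP_EL2,  /* trapped to the hypervisor */
} CacheopResult;

static CacheopResult check_cacheop_poc(int current_el, uint64_t sctlr_el1,
                                       uint64_t hcr_eff)
{
    assert(current_el >= 0 && current_el <= 3);

    switch (current_el) {
    case 0:
        /* EL0 is checked against SCTLR_EL1.UCI first... */
        if (!(sctlr_el1 & SCTLR_UCI)) {
            return CACHEOP_TRAP;
        }
        /* fall through: ...then against the same EL2 trap as EL1. */
    case 1:
        if (hcr_eff & HCR_TPCP) {
            return CACHEOP_TRAP_EL2;
        }
        break;
    default:
        /* EL2 and EL3 are not subject to either check. */
        break;
    }
    return CACHEOP_OK;
}
```

The fall-through means an EL0 access that passes the UCI check is still subject to the EL2 trap, so the UNDEF case takes priority over the trap to EL2.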

From: Richard Henderson <richard.henderson@linaro.org>

This bit traps EL1 access to cache maintenance insns that operate
to the point of unification.  There are no longer any references to
plain aa64_cacheop_access, so remove it.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200229012811.24129-11-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.c | 53 +++++++++++++++++++++++++++------------------
 1 file changed, 32 insertions(+), 21 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo uao_reginfo = {
     .readfn = aa64_uao_read, .writefn = aa64_uao_write
 };
 
-static CPAccessResult aa64_cacheop_access(CPUARMState *env,
-                                          const ARMCPRegInfo *ri,
-                                          bool isread)
-{
-    /* Cache invalidate/clean: NOP, but EL0 must UNDEF unless
-     * SCTLR_EL1.UCI is set.
-     */
-    if (arm_current_el(env) == 0 && !(arm_sctlr(env, 0) & SCTLR_UCI)) {
-        return CP_ACCESS_TRAP;
-    }
-    return CP_ACCESS_OK;
-}
-
 static CPAccessResult aa64_cacheop_poc_access(CPUARMState *env,
                                               const ARMCPRegInfo *ri,
                                               bool isread)
@@ -XXX,XX +XXX,XX @@ static CPAccessResult aa64_cacheop_poc_access(CPUARMState *env,
     return CP_ACCESS_OK;
 }
 
+static CPAccessResult aa64_cacheop_pou_access(CPUARMState *env,
+                                              const ARMCPRegInfo *ri,
+                                              bool isread)
+{
+    /* Cache invalidate/clean to Point of Unification... */
+    switch (arm_current_el(env)) {
+    case 0:
+        /* ... EL0 must UNDEF unless SCTLR_EL1.UCI is set.  */
+        if (!(arm_sctlr(env, 0) & SCTLR_UCI)) {
+            return CP_ACCESS_TRAP;
+        }
+        /* fall through */
+    case 1:
+        /* ... EL1 must trap to EL2 if HCR_EL2.TPU is set.  */
+        if (arm_hcr_el2_eff(env) & HCR_TPU) {
+            return CP_ACCESS_TRAP_EL2;
+        }
+        break;
+    }
+    return CP_ACCESS_OK;
+}
+
 /* See: D4.7.2 TLB maintenance requirements and the TLB maintenance instructions
  * Page D4-1736 (DDI0487A.b)
  */
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
     /* Cache ops: all NOPs since we don't emulate caches */
     { .name = "IC_IALLUIS", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 1, .opc2 = 0,
-      .access = PL1_W, .type = ARM_CP_NOP },
+      .access = PL1_W, .type = ARM_CP_NOP,
+      .accessfn = aa64_cacheop_pou_access },
     { .name = "IC_IALLU", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 5, .opc2 = 0,
-      .access = PL1_W, .type = ARM_CP_NOP },
+      .access = PL1_W, .type = ARM_CP_NOP,
+      .accessfn = aa64_cacheop_pou_access },
     { .name = "IC_IVAU", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 5, .opc2 = 1,
       .access = PL0_W, .type = ARM_CP_NOP,
-      .accessfn = aa64_cacheop_access },
+      .accessfn = aa64_cacheop_pou_access },
     { .name = "DC_IVAC", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 6, .opc2 = 1,
       .access = PL1_W, .accessfn = aa64_cacheop_poc_access,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
     { .name = "DC_CVAU", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 11, .opc2 = 1,
       .access = PL0_W, .type = ARM_CP_NOP,
-      .accessfn = aa64_cacheop_access },
+      .accessfn = aa64_cacheop_pou_access },
     { .name = "DC_CIVAC", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 14, .opc2 = 1,
       .access = PL0_W, .type = ARM_CP_NOP,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
       .writefn = tlbiipas2_is_write },
     /* 32 bit cache operations */
     { .name = "ICIALLUIS", .cp = 15, .opc1 = 0, .crn = 7, .crm = 1, .opc2 = 0,
-      .type = ARM_CP_NOP, .access = PL1_W },
+      .type = ARM_CP_NOP, .access = PL1_W, .accessfn = aa64_cacheop_pou_access },
     { .name = "BPIALLUIS", .cp = 15, .opc1 = 0, .crn = 7, .crm = 1, .opc2 = 6,
       .type = ARM_CP_NOP, .access = PL1_W },
     { .name = "ICIALLU", .cp = 15, .opc1 = 0, .crn = 7, .crm = 5, .opc2 = 0,
-      .type = ARM_CP_NOP, .access = PL1_W },
+      .type = ARM_CP_NOP, .access = PL1_W, .accessfn = aa64_cacheop_pou_access },
     { .name = "ICIMVAU", .cp = 15, .opc1 = 0, .crn = 7, .crm = 5, .opc2 = 1,
-      .type = ARM_CP_NOP, .access = PL1_W },
+      .type = ARM_CP_NOP, .access = PL1_W, .accessfn = aa64_cacheop_pou_access },
     { .name = "BPIALL", .cp = 15, .opc1 = 0, .crn = 7, .crm = 5, .opc2 = 6,
       .type = ARM_CP_NOP, .access = PL1_W },
     { .name = "BPIMVA", .cp = 15, .opc1 = 0, .crn = 7, .crm = 5, .opc2 = 7,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
     { .name = "DCCSW", .cp = 15, .opc1 = 0, .crn = 7, .crm = 10, .opc2 = 2,
       .type = ARM_CP_NOP, .access = PL1_W, .accessfn = access_tsw },
     { .name = "DCCMVAU", .cp = 15, .opc1 = 0, .crn = 7, .crm = 11, .opc2 = 1,
-      .type = ARM_CP_NOP, .access = PL1_W },
+      .type = ARM_CP_NOP, .access = PL1_W, .accessfn = aa64_cacheop_pou_access },
     { .name = "DCCIMVAC", .cp = 15, .opc1 = 0, .crn = 7, .crm = 14, .opc2 = 1,
       .type = ARM_CP_NOP, .access = PL1_W, .accessfn = aa64_cacheop_poc_access },
     { .name = "DCCISW", .cp = 15, .opc1 = 0, .crn = 7, .crm = 14, .opc2 = 2,
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

This bit traps EL1 access to TLB maintenance insns.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200229012811.24129-12-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.c | 85 +++++++++++++++++++++++++++++----------------
 1 file changed, 55 insertions(+), 30 deletions(-)
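
For readers without the QEMU tree at hand, here is a minimal standalone
sketch of the decision that access_ttlb makes. The Sketch* names are
hypothetical stand-ins for QEMU's CPAccessResult and the value returned by
arm_hcr_el2_eff(); the bit position (HCR_EL2 bit 25) is taken from the Arm
architecture rather than from this patch.

```c
#include <stdint.h>

/* Assumption: HCR_EL2.TTLB is bit 25; the patch itself only names HCR_TTLB. */
#define SKETCH_HCR_TTLB (UINT64_C(1) << 25)

typedef enum {
    SKETCH_ACCESS_OK,
    SKETCH_ACCESS_TRAP_EL2,
} SketchAccessResult;

/* Trap only accesses made from EL1 while the effective TTLB bit is set. */
static SketchAccessResult sketch_access_ttlb(int current_el,
                                             uint64_t hcr_el2_eff)
{
    if (current_el == 1 && (hcr_el2_eff & SKETCH_HCR_TTLB)) {
        return SKETCH_ACCESS_TRAP_EL2;
    }
    return SKETCH_ACCESS_OK;
}
```

Accesses from EL2 or EL3 pass through unchanged, which is why the check
compares the current EL against 1 rather than testing the bit alone.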

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static CPAccessResult access_tacr(CPUARMState *env, const ARMCPRegInfo *ri,
     return CP_ACCESS_OK;
 }
 
+/* Check for traps from EL1 due to HCR_EL2.TTLB. */
+static CPAccessResult access_ttlb(CPUARMState *env, const ARMCPRegInfo *ri,
+                                  bool isread)
+{
+    if (arm_current_el(env) == 1 && (arm_hcr_el2_eff(env) & HCR_TTLB)) {
+        return CP_ACCESS_TRAP_EL2;
+    }
+    return CP_ACCESS_OK;
+}
+
 static void dacr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
 {
     ARMCPU *cpu = env_archcpu(env);
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
       .type = ARM_CP_NO_RAW, .access = PL1_R, .readfn = isr_read },
     /* 32 bit ITLB invalidates */
     { .name = "ITLBIALL", .cp = 15, .opc1 = 0, .crn = 8, .crm = 5, .opc2 = 0,
-      .type = ARM_CP_NO_RAW, .access = PL1_W, .writefn = tlbiall_write },
+      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
+      .writefn = tlbiall_write },
     { .name = "ITLBIMVA", .cp = 15, .opc1 = 0, .crn = 8, .crm = 5, .opc2 = 1,
-      .type = ARM_CP_NO_RAW, .access = PL1_W, .writefn = tlbimva_write },
+      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
+      .writefn = tlbimva_write },
     { .name = "ITLBIASID", .cp = 15, .opc1 = 0, .crn = 8, .crm = 5, .opc2 = 2,
-      .type = ARM_CP_NO_RAW, .access = PL1_W, .writefn = tlbiasid_write },
+      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
+      .writefn = tlbiasid_write },
     /* 32 bit DTLB invalidates */
     { .name = "DTLBIALL", .cp = 15, .opc1 = 0, .crn = 8, .crm = 6, .opc2 = 0,
-      .type = ARM_CP_NO_RAW, .access = PL1_W, .writefn = tlbiall_write },
+      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
+      .writefn = tlbiall_write },
     { .name = "DTLBIMVA", .cp = 15, .opc1 = 0, .crn = 8, .crm = 6, .opc2 = 1,
-      .type = ARM_CP_NO_RAW, .access = PL1_W, .writefn = tlbimva_write },
+      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
+      .writefn = tlbimva_write },
     { .name = "DTLBIASID", .cp = 15, .opc1 = 0, .crn = 8, .crm = 6, .opc2 = 2,
-      .type = ARM_CP_NO_RAW, .access = PL1_W, .writefn = tlbiasid_write },
+      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
+      .writefn = tlbiasid_write },
     /* 32 bit TLB invalidates */
     { .name = "TLBIALL", .cp = 15, .opc1 = 0, .crn = 8, .crm = 7, .opc2 = 0,
-      .type = ARM_CP_NO_RAW, .access = PL1_W, .writefn = tlbiall_write },
+      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
+      .writefn = tlbiall_write },
     { .name = "TLBIMVA", .cp = 15, .opc1 = 0, .crn = 8, .crm = 7, .opc2 = 1,
-      .type = ARM_CP_NO_RAW, .access = PL1_W, .writefn = tlbimva_write },
+      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
+      .writefn = tlbimva_write },
     { .name = "TLBIASID", .cp = 15, .opc1 = 0, .crn = 8, .crm = 7, .opc2 = 2,
-      .type = ARM_CP_NO_RAW, .access = PL1_W, .writefn = tlbiasid_write },
+      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
+      .writefn = tlbiasid_write },
     { .name = "TLBIMVAA", .cp = 15, .opc1 = 0, .crn = 8, .crm = 7, .opc2 = 3,
-      .type = ARM_CP_NO_RAW, .access = PL1_W, .writefn = tlbimvaa_write },
+      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
+      .writefn = tlbimvaa_write },
     REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo v7mp_cp_reginfo[] = {
     /* 32 bit TLB invalidates, Inner Shareable */
     { .name = "TLBIALLIS", .cp = 15, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 0,
-      .type = ARM_CP_NO_RAW, .access = PL1_W, .writefn = tlbiall_is_write },
+      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
+      .writefn = tlbiall_is_write },
     { .name = "TLBIMVAIS", .cp = 15, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 1,
-      .type = ARM_CP_NO_RAW, .access = PL1_W, .writefn = tlbimva_is_write },
+      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
+      .writefn = tlbimva_is_write },
     { .name = "TLBIASIDIS", .cp = 15, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 2,
-      .type = ARM_CP_NO_RAW, .access = PL1_W,
+      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
       .writefn = tlbiasid_is_write },
     { .name = "TLBIMVAAIS", .cp = 15, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 3,
-      .type = ARM_CP_NO_RAW, .access = PL1_W,
+      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
       .writefn = tlbimvaa_is_write },
     REGINFO_SENTINEL
 };
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
     /* TLBI operations */
     { .name = "TLBI_VMALLE1IS", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 0,
-      .access = PL1_W, .type = ARM_CP_NO_RAW,
+      .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
       .writefn = tlbi_aa64_vmalle1is_write },
     { .name = "TLBI_VAE1IS", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 1,
-      .access = PL1_W, .type = ARM_CP_NO_RAW,
+      .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
       .writefn = tlbi_aa64_vae1is_write },
     { .name = "TLBI_ASIDE1IS", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 2,
-      .access = PL1_W, .type = ARM_CP_NO_RAW,
+      .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
       .writefn = tlbi_aa64_vmalle1is_write },
     { .name = "TLBI_VAAE1IS", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 3,
-      .access = PL1_W, .type = ARM_CP_NO_RAW,
+      .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
       .writefn = tlbi_aa64_vae1is_write },
     { .name = "TLBI_VALE1IS", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 5,
-      .access = PL1_W, .type = ARM_CP_NO_RAW,
+      .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
       .writefn = tlbi_aa64_vae1is_write },
     { .name = "TLBI_VAALE1IS", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 7,
-      .access = PL1_W, .type = ARM_CP_NO_RAW,
+      .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
       .writefn = tlbi_aa64_vae1is_write },
     { .name = "TLBI_VMALLE1", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 7, .opc2 = 0,
-      .access = PL1_W, .type = ARM_CP_NO_RAW,
+      .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
       .writefn = tlbi_aa64_vmalle1_write },
     { .name = "TLBI_VAE1", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 7, .opc2 = 1,
-      .access = PL1_W, .type = ARM_CP_NO_RAW,
+      .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
       .writefn = tlbi_aa64_vae1_write },
     { .name = "TLBI_ASIDE1", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 7, .opc2 = 2,
-      .access = PL1_W, .type = ARM_CP_NO_RAW,
+      .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
       .writefn = tlbi_aa64_vmalle1_write },
     { .name = "TLBI_VAAE1", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 7, .opc2 = 3,
-      .access = PL1_W, .type = ARM_CP_NO_RAW,
+      .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
       .writefn = tlbi_aa64_vae1_write },
     { .name = "TLBI_VALE1", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 7, .opc2 = 5,
-      .access = PL1_W, .type = ARM_CP_NO_RAW,
+      .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
       .writefn = tlbi_aa64_vae1_write },
     { .name = "TLBI_VAALE1", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 8, .crm = 7, .opc2 = 7,
-      .access = PL1_W, .type = ARM_CP_NO_RAW,
+      .access = PL1_W, .accessfn = access_ttlb, .type = ARM_CP_NO_RAW,
       .writefn = tlbi_aa64_vae1_write },
     { .name = "TLBI_IPAS2E1IS", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 4, .crn = 8, .crm = 0, .opc2 = 1,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
 #endif
     /* TLB invalidate last level of translation table walk */
     { .name = "TLBIMVALIS", .cp = 15, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 5,
-      .type = ARM_CP_NO_RAW, .access = PL1_W, .writefn = tlbimva_is_write },
+      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
+      .writefn = tlbimva_is_write },
     { .name = "TLBIMVAALIS", .cp = 15, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 7,
-      .type = ARM_CP_NO_RAW, .access = PL1_W,
+      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
       .writefn = tlbimvaa_is_write },
     { .name = "TLBIMVAL", .cp = 15, .opc1 = 0, .crn = 8, .crm = 7, .opc2 = 5,
-      .type = ARM_CP_NO_RAW, .access = PL1_W, .writefn = tlbimva_write },
+      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
+      .writefn = tlbimva_write },
     { .name = "TLBIMVAAL", .cp = 15, .opc1 = 0, .crn = 8, .crm = 7, .opc2 = 7,
-      .type = ARM_CP_NO_RAW, .access = PL1_W, .writefn = tlbimvaa_write },
+      .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
+      .writefn = tlbimvaa_write },
     { .name = "TLBIMVALH", .cp = 15, .opc1 = 4, .crn = 8, .crm = 7, .opc2 = 5,
       .type = ARM_CP_NO_RAW, .access = PL2_W,
       .writefn = tlbimva_hyp_write },
-- 
2.20.1

From: Niek Linnenbank <nieklinnenbank@gmail.com>

The Cubieboard is a single-board computer with an Allwinner A10 System-on-Chip [1].
As documented in the Allwinner A10 User Manual V1.5 [2], the SoC has an ARM
Cortex-A8 processor. Currently the Cubieboard machine definition specifies the
ARM Cortex-A9 in its description and as the default CPU.

This patch corrects the Cubieboard machine definition to use the ARM Cortex-A8.

The only user-visible effect of this bug was that our textual description
of the machine was wrong, because hw/arm/allwinner-a10.c always creates a
Cortex-A8 CPU regardless of the default value in the MachineClass struct.

[1] http://docs.cubieboard.org/products/start#cubieboard1
[2] https://linux-sunxi.org/File:Allwinner_A10_User_manual_V1.5.pdf

Fixes: 8a863c8120994981a099
Signed-off-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Message-id: 20200227220149.6845-2-nieklinnenbank@gmail.com
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[note in commit message that the bug didn't have much visible effect]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/cubieboard.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/arm/cubieboard.c b/hw/arm/cubieboard.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/cubieboard.c
+++ b/hw/arm/cubieboard.c
@@ -XXX,XX +XXX,XX @@ static void cubieboard_init(MachineState *machine)
 
 static void cubieboard_machine_init(MachineClass *mc)
 {
-    mc->desc = "cubietech cubieboard (Cortex-A9)";
-    mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-a9");
+    mc->desc = "cubietech cubieboard (Cortex-A8)";
+    mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-a8");
     mc->init = cubieboard_init;
     mc->block_default_type = IF_IDE;
     mc->units_per_default_bus = 1;
-- 
2.20.1

From: Niek Linnenbank <nieklinnenbank@gmail.com>

The Cubieboard has an ARM Cortex-A8.  Instead of simply ignoring a
bogus -cpu option provided by the user, give them an error message so
they know their command line is wrong.

Signed-off-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Message-id: 20200227220149.6845-3-nieklinnenbank@gmail.com
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: tweaked commit message]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/cubieboard.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)
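
A minimal sketch of the validation added here, written as a standalone
predicate. The string "cortex-a8-arm-cpu" is an assumption about what
ARM_CPU_TYPE_NAME("cortex-a8") expands to, and sketch_cpu_type_ok is a
hypothetical name used only for illustration.

```c
#include <stdbool.h>
#include <string.h>

/* Assumed expansion of ARM_CPU_TYPE_NAME("cortex-a8"). */
#define SKETCH_CORTEX_A8_TYPE "cortex-a8-arm-cpu"

/* Accept only the one CPU type that the Allwinner A10 SoC contains. */
static bool sketch_cpu_type_ok(const char *cpu_type)
{
    return strcmp(cpu_type, SKETCH_CORTEX_A8_TYPE) == 0;
}
```

In the patch a failed check calls error_report() and exits, so a wrong
-cpu value is rejected instead of being silently ignored.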

diff --git a/hw/arm/cubieboard.c b/hw/arm/cubieboard.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/cubieboard.c
+++ b/hw/arm/cubieboard.c
@@ -XXX,XX +XXX,XX @@ static struct arm_boot_info cubieboard_binfo = {
 
 static void cubieboard_init(MachineState *machine)
 {
-    AwA10State *a10 = AW_A10(object_new(TYPE_AW_A10));
+    AwA10State *a10;
     Error *err = NULL;
 
+    /* Only allow Cortex-A8 for this board */
+    if (strcmp(machine->cpu_type, ARM_CPU_TYPE_NAME("cortex-a8")) != 0) {
+        error_report("This board can only be used with cortex-a8 CPU");
+        exit(1);
+    }
+
+    a10 = AW_A10(object_new(TYPE_AW_A10));
+
     object_property_set_int(OBJECT(&a10->emac), 1, "phy-addr", &err);
     if (err != NULL) {
         error_reportf_err(err, "Couldn't set phy address: ");
-- 
2.20.1

From: Niek Linnenbank <nieklinnenbank@gmail.com>

The Cubieboard contains either 512MiB or 1GiB of onboard RAM [1].
Prevent changing RAM to a different size, which could break user programs.

[1] http://linux-sunxi.org/Cubieboard

Signed-off-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Message-id: 20200227220149.6845-4-nieklinnenbank@gmail.com
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/cubieboard.c | 8 ++++++++
 1 file changed, 8 insertions(+)
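
The size check can be sketched as a standalone predicate. SKETCH_MiB and
SKETCH_GiB are hypothetical local stand-ins for the MiB and GiB constants
used in the patch, and sketch_ram_size_ok is an illustrative name.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MiB (UINT64_C(1) << 20)
#define SKETCH_GiB (UINT64_C(1) << 30)

/* The board ships with either 512MiB or 1GiB; reject everything else. */
static bool sketch_ram_size_ok(uint64_t ram_size)
{
    return ram_size == 512 * SKETCH_MiB || ram_size == 1 * SKETCH_GiB;
}
```

Because default_ram_size is set to 1GiB in the same patch, a command line
without -m still passes this check.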

diff --git a/hw/arm/cubieboard.c b/hw/arm/cubieboard.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/cubieboard.c
+++ b/hw/arm/cubieboard.c
@@ -XXX,XX +XXX,XX @@ static void cubieboard_init(MachineState *machine)
     AwA10State *a10;
     Error *err = NULL;
 
+    /* This board has fixed size RAM (512MiB or 1GiB) */
+    if (machine->ram_size != 512 * MiB &&
+        machine->ram_size != 1 * GiB) {
+        error_report("This machine can only be used with 512MiB or 1GiB RAM");
+        exit(1);
+    }
+
     /* Only allow Cortex-A8 for this board */
     if (strcmp(machine->cpu_type, ARM_CPU_TYPE_NAME("cortex-a8")) != 0) {
         error_report("This board can only be used with cortex-a8 CPU");
@@ -XXX,XX +XXX,XX @@ static void cubieboard_machine_init(MachineClass *mc)
 {
     mc->desc = "cubietech cubieboard (Cortex-A8)";
     mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-a8");
+    mc->default_ram_size = 1 * GiB;
     mc->init = cubieboard_init;
     mc->block_default_type = IF_IDE;
     mc->units_per_default_bus = 1;
-- 
2.20.1

From: Niek Linnenbank <nieklinnenbank@gmail.com>

The Cubieboard machine does not support the -bios argument.
Report an error when -bios is used and exit immediately.

Signed-off-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Message-id: 20200227220149.6845-5-nieklinnenbank@gmail.com
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/cubieboard.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/hw/arm/cubieboard.c b/hw/arm/cubieboard.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/cubieboard.c
+++ b/hw/arm/cubieboard.c
@@ -XXX,XX +XXX,XX @@
 #include "exec/address-spaces.h"
 #include "qapi/error.h"
 #include "cpu.h"
+#include "sysemu/sysemu.h"
 #include "hw/sysbus.h"
 #include "hw/boards.h"
 #include "hw/arm/allwinner-a10.h"
@@ -XXX,XX +XXX,XX @@ static void cubieboard_init(MachineState *machine)
     AwA10State *a10;
     Error *err = NULL;
 
+    /* BIOS is not supported by this board */
+    if (bios_name) {
+        error_report("BIOS not supported for this machine");
+        exit(1);
+    }
+
     /* This board has fixed size RAM (512MiB or 1GiB) */
     if (machine->ram_size != 512 * MiB &&
         machine->ram_size != 1 * GiB) {
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

Replicate the single TBI bit from TCR_EL2 and TCR_EL3 so that
we can unconditionally use pointer bit 55 to index into our
composite TBI1:TBI0 field.
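
A minimal sketch of why multiplying by 3 helps, assuming plain shifts in
place of QEMU's extract32() and extract64(). All sketch_* names are
hypothetical, and the TBI0/TBI1 positions for the two-range case (TCR bits
37 and 38) are an assumption that this patch does not show.

```c
#include <stdbool.h>
#include <stdint.h>

/* Single-range regime: one TBI bit at TCR bit 20, copied into both slots. */
static int sketch_tbi_single_range(uint64_t tcr)
{
    return (int)((tcr >> 20) & 1) * 3;
}

/* Two-range regime: TBI0 and TBI1 read together as a 2-bit field. */
static int sketch_tbi_two_ranges(uint64_t tcr)
{
    return (int)((tcr >> 37) & 3);
}

/* Pointer bit 55 selects slot 0 (TBI0) or slot 1 (TBI1) of the field. */
static bool sketch_top_byte_ignored(int tbi, uint64_t addr)
{
    return (tbi >> ((addr >> 55) & 1)) & 1;
}
```

With the bit replicated, the lookup by bit 55 gives the same answer for
either half of the address space in a single-range regime, so callers need
not special-case EL2 or EL3.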

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static int aa64_va_parameter_tbi(uint64_t tcr, ARMMMUIdx mmu_idx)
     } else if (mmu_idx == ARMMMUIdx_Stage2) {
         return 0; /* VTCR_EL2 */
     } else {
-        return extract32(tcr, 20, 1);
+        /* Replicate the single TBI bit so we always have 2 bits.  */
+        return extract32(tcr, 20, 1) * 3;
     }
 }
 
@@ -XXX,XX +XXX,XX @@ static int aa64_va_parameter_tbid(uint64_t tcr, ARMMMUIdx mmu_idx)
     } else if (mmu_idx == ARMMMUIdx_Stage2) {
         return 0; /* VTCR_EL2 */
     } else {
-        return extract32(tcr, 29, 1);
+        /* Replicate the single TBID bit so we always have 2 bits.  */
+        return extract32(tcr, 29, 1) * 3;
     }
 }
 
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

We now cache the core mmu_idx in env->hflags.  Rather than recompute
from scratch, extract the field.  All of the uses of cpu_mmu_index
within target/arm are within helpers, and env->hflags is always stable
within a translation block from which helpers are called.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20200302175829.2183-3-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h    | 23 +++++++++++++----------
 target/arm/helper.c |  5 -----
 2 files changed, 13 insertions(+), 15 deletions(-)
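
A minimal sketch of reading a cached field out of a flags word, which is
all the new inline cpu_mmu_index has to do. The shift and length below are
hypothetical placeholders; the real values come from the
FIELD(TBFLAG_ANY, MMUIDX, ...) definition, which this patch does not show.

```c
#include <stdint.h>

/* Hypothetical field position, for illustration only. */
#define SKETCH_MMUIDX_SHIFT  24
#define SKETCH_MMUIDX_LENGTH 4

/* Shift the field down and mask it; length is assumed to be below 32. */
static uint32_t sketch_field_ex32(uint32_t value, unsigned shift,
                                  unsigned length)
{
    return (value >> shift) & ((UINT32_C(1) << length) - 1);
}

/* Return the cached core mmu index instead of recomputing it. */
static int sketch_cpu_mmu_index(uint32_t hflags)
{
    return (int)sketch_field_ex32(hflags, SKETCH_MMUIDX_SHIFT,
                                  SKETCH_MMUIDX_LENGTH);
}
```

The extraction is a shift and a mask, which is cheap enough to justify
moving the function into the header as a static inline.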

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ typedef enum ARMMMUIdxBit {
 
 #define MMU_USER_IDX 0
 
-/**
- * cpu_mmu_index:
- * @env: The cpu environment
- * @ifetch: True for code access, false for data access.
- *
- * Return the core mmu index for the current translation regime.
- * This function is used by generic TCG code paths.
- */
-int cpu_mmu_index(CPUARMState *env, bool ifetch);
-
 /* Indexes used when registering address spaces with cpu_address_space_init */
 typedef enum ARMASIdx {
     ARMASIdx_NS = 0,
@@ -XXX,XX +XXX,XX @@ FIELD(TBFLAG_A64, BTYPE, 10, 2)         /* Not cached. */
 FIELD(TBFLAG_A64, TBID, 12, 2)
 FIELD(TBFLAG_A64, UNPRIV, 14, 1)
 
+/**
+ * cpu_mmu_index:
+ * @env: The cpu environment
+ * @ifetch: True for code access, false for data access.
+ *
+ * Return the core mmu index for the current translation regime.
+ * This function is used by generic TCG code paths.
+ */
+static inline int cpu_mmu_index(CPUARMState *env, bool ifetch)
+{
+    return FIELD_EX32(env->hflags, TBFLAG_ANY, MMUIDX);
+}
+
 static inline bool bswap_code(bool sctlr_b)
 {
 #ifdef CONFIG_USER_ONLY
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ ARMMMUIdx arm_mmu_idx(CPUARMState *env)
     return arm_mmu_idx_el(env, arm_current_el(env));
 }
 
-int cpu_mmu_index(CPUARMState *env, bool ifetch)
-{
-    return arm_to_core_mmu_idx(arm_mmu_idx(env));
-}
-
 #ifndef CONFIG_USER_ONLY
 ARMMMUIdx arm_stage1_mmu_idx(CPUARMState *env)
 {
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

If by context we know that we're in AArch64 mode, we need not
test for M-profile when reconstructing the full ARMMMUIdx.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20200302175829.2183-4-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/internals.h     | 6 ++++++
 target/arm/translate-a64.c | 2 +-
 2 files changed, 7 insertions(+), 1 deletion(-)
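
A minimal sketch contrasting the general conversion with the AArch64-only
shortcut. The tag values 0x10 and 0x40 are assumptions standing in for
ARM_MMU_IDX_A and ARM_MMU_IDX_M, and the boolean parameter is a
hypothetical replacement for the M-profile feature test on env.

```c
#include <stdbool.h>

#define SKETCH_MMU_IDX_A 0x10 /* assumed A-profile tag */
#define SKETCH_MMU_IDX_M 0x40 /* assumed M-profile tag */

/* General case: the profile has to be tested before tagging the index. */
static int sketch_core_to_arm_mmu_idx(bool is_m_profile, int core_idx)
{
    return core_idx | (is_m_profile ? SKETCH_MMU_IDX_M : SKETCH_MMU_IDX_A);
}

/* AArch64 is always A-profile, so no test and no env pointer are needed. */
static int sketch_core_to_aa64_mmu_idx(int core_idx)
{
    return core_idx | SKETCH_MMU_IDX_A;
}
```

For any A-profile CPU the two functions agree, so the translator can use
the cheaper one whenever it already knows it is decoding A64 code.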

diff --git a/target/arm/internals.h b/target/arm/internals.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -XXX,XX +XXX,XX @@ static inline ARMMMUIdx core_to_arm_mmu_idx(CPUARMState *env, int mmu_idx)
     }
 }
 
+static inline ARMMMUIdx core_to_aa64_mmu_idx(int mmu_idx)
+{
+    /* AArch64 is always a-profile. */
+    return mmu_idx | ARM_MMU_IDX_A;
+}
+
 int arm_mmu_idx_to_el(ARMMMUIdx mmu_idx);
 
 /*
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_init_disas_context(DisasContextBase *dcbase,
     dc->condexec_mask = 0;
     dc->condexec_cond = 0;
     core_mmu_idx = FIELD_EX32(tb_flags, TBFLAG_ANY, MMUIDX);
-    dc->mmu_idx = core_to_arm_mmu_idx(env, core_mmu_idx);
+    dc->mmu_idx = core_to_aa64_mmu_idx(core_mmu_idx);
     dc->tbii = FIELD_EX32(tb_flags, TBFLAG_A64, TBII);
     dc->tbid = FIELD_EX32(tb_flags, TBFLAG_A64, TBID);
     dc->current_el = arm_mmu_idx_to_el(dc->mmu_idx);
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

We missed this case: AArch64.ExceptionReturn applies TBI to the return address.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20200302175829.2183-5-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper-a64.c | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)
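
A minimal standalone sketch of the address cleaning done on exception
return. It avoids QEMU's sextract64() and extract64() and uses explicit
masks instead; sketch_apply_tbi and its parameters are hypothetical names,
with tbii standing for the 2-bit composite field and two_ranges for the
result of regime_has_2_ranges().

```c
#include <stdbool.h>
#include <stdint.h>

static uint64_t sketch_apply_tbi(uint64_t pc, int tbii, bool two_ranges)
{
    const uint64_t low56 = (UINT64_C(1) << 56) - 1;
    unsigned bit55 = (unsigned)((pc >> 55) & 1);

    /* Bit 55 picks which of the two TBI bits governs this address. */
    if (!((tbii >> bit55) & 1)) {
        return pc;              /* TBI disabled: leave the address alone */
    }
    if (two_ranges && bit55) {
        return pc | ~low56;     /* upper range: sign-extend from bit 55 */
    }
    return pc & low56;          /* otherwise clear the top byte */
}
```

The distinction matters for regimes with two address ranges, where a
cleaned upper-half address must keep all-ones in its top byte.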

diff --git a/target/arm/helper-a64.c b/target/arm/helper-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper-a64.c
+++ b/target/arm/helper-a64.c
@@ -XXX,XX +XXX,XX @@ void HELPER(exception_return)(CPUARMState *env, uint64_t new_pc)
                       "AArch32 EL%d PC 0x%" PRIx32 "\n",
                       cur_el, new_el, env->regs[15]);
     } else {
+        int tbii;
+
         env->aarch64 = 1;
         spsr &= aarch64_pstate_valid_mask(&env_archcpu(env)->isar);
         pstate_write(env, spsr);
@@ -XXX,XX +XXX,XX @@ void HELPER(exception_return)(CPUARMState *env, uint64_t new_pc)
             env->pstate &= ~PSTATE_SS;
         }
         aarch64_restore_sp(env, new_el);
-        env->pc = new_pc;
         helper_rebuild_hflags_a64(env, new_el);
+
+        /*
+         * Apply TBI to the exception return address.  We had to delay this
+         * until after we selected the new EL, so that we could select the
+         * correct TBI+TBID bits.  This is made easier by waiting until after
+         * the hflags rebuild, since we can pull the composite TBII field
+         * from there.
+         */
+        tbii = FIELD_EX32(env->hflags, TBFLAG_A64, TBII);
+        if ((tbii >> extract64(new_pc, 55, 1)) & 1) {
+            /* TBI is enabled. */
+            int core_mmu_idx = cpu_mmu_index(env, false);
+            if (regime_has_2_ranges(core_to_aa64_mmu_idx(core_mmu_idx))) {
+                new_pc = sextract64(new_pc, 0, 56);
+            } else {
+                new_pc = extract64(new_pc, 0, 56);
+            }
+        }
+        env->pc = new_pc;
+
         qemu_log_mask(CPU_LOG_INT, "Exception return from AArch64 EL%d to "
                       "AArch64 EL%d PC 0x%" PRIx64 "\n",
                       cur_el, new_el, env->pc);
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

This is an aarch64-only function.  Move it out of the shared file.
This patch is code movement only.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20200302175829.2183-6-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper-a64.h |  1 +
 target/arm/helper.h     |  1 -
 target/arm/helper-a64.c | 91 ++++++++++++++++++++++++++++++++++++++++
 target/arm/op_helper.c  | 93 -----------------------------------------
 4 files changed, 92 insertions(+), 94 deletions(-)
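
Although this patch only moves code, the block arithmetic at the top of
the helper is easy to misread, so here is a minimal sketch of it. The
sketch_* names are hypothetical, and the page size is passed in as a
parameter instead of using TARGET_PAGE_SIZE.

```c
#include <stdint.h>

/* dcz_blocksize is log2 of the block size in 4-byte words. */
static uint64_t sketch_dcz_blocklen(unsigned dcz_blocksize)
{
    return UINT64_C(4) << dcz_blocksize;
}

/* Round the input address down to the start of its block. */
static uint64_t sketch_dcz_align(uint64_t vaddr_in, uint64_t blocklen)
{
    return vaddr_in & ~(blocklen - 1);
}

/* Number of target pages the block covers, rounded up. */
static unsigned sketch_dcz_pages(uint64_t blocklen, uint64_t page_size)
{
    return (unsigned)((blocklen + page_size - 1) / page_size);
}
```

With a 2KiB block and a 1KiB minimum page size the block covers two pages,
which matches the two-entry hostaddr[] array described in the comment.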

diff --git a/target/arm/helper-a64.h b/target/arm/helper-a64.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper-a64.h
+++ b/target/arm/helper-a64.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_2(advsimd_f16touinth, i32, f16, ptr)
 DEF_HELPER_2(sqrt_f16, f16, f16, ptr)
 
 DEF_HELPER_2(exception_return, void, env, i64)
+DEF_HELPER_2(dc_zva, void, env, i64)
 
 DEF_HELPER_FLAGS_3(pacia, TCG_CALL_NO_WG, i64, env, i64, i64)
 DEF_HELPER_FLAGS_3(pacib, TCG_CALL_NO_WG, i64, env, i64, i64)
diff --git a/target/arm/helper.h b/target/arm/helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(crypto_sm4ekey, TCG_CALL_NO_RWG, void, ptr, ptr, ptr)
 
 DEF_HELPER_FLAGS_3(crc32, TCG_CALL_NO_RWG_SE, i32, i32, i32, i32)
 DEF_HELPER_FLAGS_3(crc32c, TCG_CALL_NO_RWG_SE, i32, i32, i32, i32)
-DEF_HELPER_2(dc_zva, void, env, i64)
 
 DEF_HELPER_FLAGS_5(gvec_qrdmlah_s16, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/helper-a64.c b/target/arm/helper-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper-a64.c
+++ b/target/arm/helper-a64.c
@@ -XXX,XX +XXX,XX @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/units.h"
 #include "cpu.h"
 #include "exec/gdbstub.h"
 #include "exec/helper-proto.h"
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(sqrt_f16)(uint32_t a, void *fpstp)
     return float16_sqrt(a, s);
 }
 
+void HELPER(dc_zva)(CPUARMState *env, uint64_t vaddr_in)
+{
+    /*
+     * Implement DC ZVA, which zeroes a fixed-length block of memory.
+     * Note that we do not implement the (architecturally mandated)
+     * alignment fault for attempts to use this on Device memory
+     * (which matches the usual QEMU behaviour of not implementing either
+     * alignment faults or any memory attribute handling).
+     */
 
+    ARMCPU *cpu = env_archcpu(env);
+    uint64_t blocklen = 4 << cpu->dcz_blocksize;
+    uint64_t vaddr = vaddr_in & ~(blocklen - 1);
+
+#ifndef CONFIG_USER_ONLY
+    {
+        /*
+         * Slightly awkwardly, QEMU's TARGET_PAGE_SIZE may be less than
+         * the block size so we might have to do more than one TLB lookup.
+         * We know that in fact for any v8 CPU the page size is at least 4K
+         * and the block size must be 2K or less, but TARGET_PAGE_SIZE is only
+         * 1K as an artefact of legacy v5 subpage support being present in the
+         * same QEMU executable. So in practice the hostaddr[] array has
+         * two entries, given the current setting of TARGET_PAGE_BITS_MIN.
+         */
+        int maxidx = DIV_ROUND_UP(blocklen, TARGET_PAGE_SIZE);
+        void *hostaddr[DIV_ROUND_UP(2 * KiB, 1 << TARGET_PAGE_BITS_MIN)];
+        int try, i;
+        unsigned mmu_idx = cpu_mmu_index(env, false);
+        TCGMemOpIdx oi = make_memop_idx(MO_UB, mmu_idx);
+
+        assert(maxidx <= ARRAY_SIZE(hostaddr));
+
+        for (try = 0; try < 2; try++) {
+
+            for (i = 0; i < maxidx; i++) {
+                hostaddr[i] = tlb_vaddr_to_host(env,
+                                                vaddr + TARGET_PAGE_SIZE * i,
+                                                1, mmu_idx);
+                if (!hostaddr[i]) {
+                    break;
+                }
+            }
+            if (i == maxidx) {
+                /*
+                 * If it's all in the TLB it's fair game for just writing to;
+                 * we know we don't need to update dirty status, etc.
+                 */
+                for (i = 0; i < maxidx - 1; i++) {
+                    memset(hostaddr[i], 0, TARGET_PAGE_SIZE);
+                }
+                memset(hostaddr[i], 0, blocklen - (i * TARGET_PAGE_SIZE));
+                return;
+            }
+            /*
+             * OK, try a store and see if we can populate the tlb. This
+             * might cause an exception if the memory isn't writable,
+             * in which case we will longjmp out of here. We must for
+             * this purpose use the actual register value passed to us
+             * so that we get the fault address right.
+             */
+            helper_ret_stb_mmu(env, vaddr_in, 0, oi, GETPC());
+            /* Now we can populate the other TLB entries, if any */
+            for (i = 0; i < maxidx; i++) {
+                uint64_t va = vaddr + TARGET_PAGE_SIZE * i;
+                if (va != (vaddr_in & TARGET_PAGE_MASK)) {
+                    helper_ret_stb_mmu(env, va, 0, oi, GETPC());
+                }
+            }
+        }
+
+        /*
+         * Slow path (probably attempt to do this to an I/O device or
+         * similar, or clearing of a block of code we have translations
+         * cached for). Just do a series of byte writes as the architecture
+         * demands. It's not worth trying to use a cpu_physical_memory_map(),
+         * memset(), unmap() sequence here because:
+         *  + we'd need to account for the blocksize being larger than a page
+         *  + the direct-RAM access case is almost always going to be dealt
+         *    with in the fastpath code above, so there's no speed benefit
+         *  + we would have to deal with the map returning NULL because the
+         *    bounce buffer was in use
+         */
+        for (i = 0; i < blocklen; i++) {
+            helper_ret_stb_mmu(env, vaddr + i, 0, oi, GETPC());
+        }
+    }
+#else
+    memset(g2h(vaddr), 0, blocklen);
+#endif
+}
diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/op_helper.c
+++ b/target/arm/op_helper.c
@@ -XXX,XX +XXX,XX @@
  * License along with this library; if not, see <http://www.gnu.org/licenses/>.
  */
 #include "qemu/osdep.h"
-#include "qemu/units.h"
 #include "qemu/log.h"
 #include "qemu/main-loop.h"
 #include "cpu.h"
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(ror_cc)(CPUARMState *env, uint32_t x, uint32_t i)
         return ((uint32_t)x >> shift) | (x << (32 - shift));
     }
 }
-
-void HELPER(dc_zva)(CPUARMState *env, uint64_t vaddr_in)
-{
-    /*
-     * Implement DC ZVA, which zeroes a fixed-length block of memory.
-     * Note that we do not implement the (architecturally mandated)
-     * alignment fault for attempts to use this on Device memory
-     * (which matches the usual QEMU behaviour of not implementing either
-     * alignment faults or any memory attribute handling).
-     */
-
-    ARMCPU *cpu = env_archcpu(env);
-    uint64_t blocklen = 4 << cpu->dcz_blocksize;
-    uint64_t vaddr = vaddr_in & ~(blocklen - 1);
-
-#ifndef CONFIG_USER_ONLY
-    {
-        /*
-         * Slightly awkwardly, QEMU's TARGET_PAGE_SIZE may be less than
-         * the block size so we might have to do more than one TLB lookup.
-         * We know that in fact for any v8 CPU the page size is at least 4K
-         * and the block size must be 2K or less, but TARGET_PAGE_SIZE is only
-         * 1K as an artefact of legacy v5 subpage support being present in the
-         * same QEMU executable. So in practice the hostaddr[] array has
-         * two entries, given the current setting of TARGET_PAGE_BITS_MIN.
-         */
-        int maxidx = DIV_ROUND_UP(blocklen, TARGET_PAGE_SIZE);
-        void *hostaddr[DIV_ROUND_UP(2 * KiB, 1 << TARGET_PAGE_BITS_MIN)];
-        int try, i;
-        unsigned mmu_idx = cpu_mmu_index(env, false);
-        TCGMemOpIdx oi = make_memop_idx(MO_UB, mmu_idx);
-
-        assert(maxidx <= ARRAY_SIZE(hostaddr));
-
-        for (try = 0; try < 2; try++) {
-
-            for (i = 0; i < maxidx; i++) {
-                hostaddr[i] = tlb_vaddr_to_host(env,
-                                                vaddr + TARGET_PAGE_SIZE * i,
-                                                1, mmu_idx);
-                if (!hostaddr[i]) {
-                    break;
-                }
-            }
-            if (i == maxidx) {
-                /*
-                 * If it's all in the TLB it's fair game for just writing to;
-                 * we know we don't need to update dirty status, etc.
-                 */
-                for (i = 0; i < maxidx - 1; i++) {
-                    memset(hostaddr[i], 0, TARGET_PAGE_SIZE);
-                }
-                memset(hostaddr[i], 0, blocklen - (i * TARGET_PAGE_SIZE));
-                return;
-            }
-            /*
-             * OK, try a store and see if we can populate the tlb. This
-             * might cause an exception if the memory isn't writable,
-             * in which case we will longjmp out of here. We must for
-             * this purpose use the actual register value passed to us
-             * so that we get the fault address right.
-             */
-            helper_ret_stb_mmu(env, vaddr_in, 0, oi, GETPC());
-            /* Now we can populate the other TLB entries, if any */
-            for (i = 0; i < maxidx; i++) {
-                uint64_t va = vaddr + TARGET_PAGE_SIZE * i;
-                if (va != (vaddr_in & TARGET_PAGE_MASK)) {
-                    helper_ret_stb_mmu(env, va, 0, oi, GETPC());
-                }
-            }
-        }
-
-        /*
-         * Slow path (probably attempt to do this to an I/O device or
-         * similar, or clearing of a block of code we have translations
-         * cached for). Just do a series of byte writes as the architecture
-         * demands. It's not worth trying to use a cpu_physical_memory_map(),
-         * memset(), unmap() sequence here because:
-         *  + we'd need to account for the blocksize being larger than a page
-         *  + the direct-RAM access case is almost always going to be dealt
-         *    with in the fastpath code above, so there's no speed benefit
-         *  + we would have to deal with the map returning NULL because the
-         *    bounce buffer was in use
-         */
-        for (i = 0; i < blocklen; i++) {
-            helper_ret_stb_mmu(env, vaddr + i, 0, oi, GETPC());
-        }
-    }
-#else
-    memset(g2h(vaddr), 0, blocklen);
-#endif
-}
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

The function does not write registers, and only reads them by
implication via the exception path.

From: Richard Henderson <richard.henderson@linaro.org>

This data access was forgotten when we added support for cleaning
addresses of TBI information.
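
As a rough illustration of what "cleaning" an address means here, the
sketch below strips the tag byte from a 64-bit virtual address and then
aligns it down to the DC ZVA block. It is not QEMU's clean_data_tbi():
strip_tag_byte() and zva_block_base() are hypothetical names, and the
sketch assumes a two-range translation regime in which bit 55 is
replicated into bits [63:56] when TBI is enabled.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the QEMU implementation): drop the top-byte
 * tag by copying bit 55 into bits [63:56]. Assumes a two-range regime.
 */
static uint64_t strip_tag_byte(uint64_t addr, bool tbi_enabled)
{
    uint64_t low;

    if (!tbi_enabled) {
        /* Without TBI the top byte is part of the address. */
        return addr;
    }
    low = addr & 0x00ffffffffffffffULL;
    if (addr & (1ULL << 55)) {
        low |= 0xff00000000000000ULL;
    }
    return low;
}

/*
 * Align an address down to the start of its DC ZVA block.
 * blocklen must be a power of two (the helper uses 4 << dcz_blocksize).
 */
static uint64_t zva_block_base(uint64_t addr, uint64_t blocklen)
{
    return addr & ~(blocklen - 1);
}
```

With the tag left in place, the helper would be asked to zero a block at
an address whose top byte still carries the tag rather than at the
address the guest actually meant.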

Fixes: 3a471103ac1823ba
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200302175829.2183-8-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate-a64.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
         return;
     case ARM_CP_DC_ZVA:
         /* Writes clear the aligned block of memory which rt points into. */
-        tcg_rt = cpu_reg(s, rt);
+        tcg_rt = clean_data_tbi(s, cpu_reg(s, rt));
         gen_helper_dc_zva(cpu_env, tcg_rt);
         return;
     default:
-- 
2.20.1

Most of this is the Neon decodetree patches, followed by Edgar's versal cleanups.

thanks
-- PMM

The following changes since commit 2ef486e76d64436be90f7359a3071fb2a56ce835:

Merge remote-tracking branch 'remotes/marcel/tags/rdma-pull-request' into staging (2020-05-03 14:12:56 +0100)

are available in the Git repository at:

https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20200504

for you to fetch changes up to 9aefc6cf9b73f66062d2f914a0136756e7a28211:

target/arm: Move gen_ function typedefs to translate.h (2020-05-04 12:59:26 +0100)

----------------------------------------------------------------
target-arm queue:
 * Start of conversion of Neon insns to decodetree
 * versal board: support SD and RTC
 * Implement ARMv8.2-TTS2UXN
 * Make VQDMULL undefined when U=1
 * Some minor code cleanups

----------------------------------------------------------------
Edgar E. Iglesias (11):
      hw/arm: versal: Remove inclusion of arm_gicv3_common.h
      hw/arm: versal: Move misplaced comment
      hw/arm: versal-virt: Fix typo xlnx-ve -> xlnx-versal
      hw/arm: versal: Embed the UARTs into the SoC type
      hw/arm: versal: Embed the GEMs into the SoC type
      hw/arm: versal: Embed the ADMAs into the SoC type
      hw/arm: versal: Embed the APUs into the SoC type
      hw/arm: versal: Add support for SD
      hw/arm: versal: Add support for the RTC
      hw/arm: versal-virt: Add support for SD
      hw/arm: versal-virt: Add support for the RTC

Fredrik Strupe (1):
      target/arm: Make VQDMULL undefined when U=1

Peter Maydell (25):
      target/arm: Don't use a TLB for ARMMMUIdx_Stage2
      target/arm: Use enum constant in get_phys_addr_lpae() call
      target/arm: Add new 's1_is_el0' argument to get_phys_addr_lpae()
      target/arm: Implement ARMv8.2-TTS2UXN
      target/arm: Use correct variable for setting 'max' cpu's ID_AA64DFR0
      target/arm/translate-vfp.inc.c: Remove duplicate simd_r32 check
      target/arm: Don't allow Thumb Neon insns without FEATURE_NEON
      target/arm: Add stubs for AArch32 Neon decodetree
      target/arm: Convert VCMLA (vector) to decodetree
      target/arm: Convert VCADD (vector) to decodetree
      target/arm: Convert V[US]DOT (vector) to decodetree
      target/arm: Convert VFM[AS]L (vector) to decodetree
      target/arm: Convert VCMLA (scalar) to decodetree
      target/arm: Convert V[US]DOT (scalar) to decodetree
      target/arm: Convert VFM[AS]L (scalar) to decodetree
      target/arm: Convert Neon load/store multiple structures to decodetree
      target/arm: Convert Neon 'load single structure to all lanes' to decodetree
      target/arm: Convert Neon 'load/store single structure' to decodetree
      target/arm: Convert Neon 3-reg-same VADD/VSUB to decodetree
      target/arm: Convert Neon 3-reg-same logic ops to decodetree
      target/arm: Convert Neon 3-reg-same VMAX/VMIN to decodetree
      target/arm: Convert Neon 3-reg-same comparisons to decodetree
      target/arm: Convert Neon 3-reg-same VQADD/VQSUB to decodetree
      target/arm: Convert Neon 3-reg-same VMUL, VMLA, VMLS, VSHL to decodetree
      target/arm: Move gen_ function typedefs to translate.h

Philippe Mathieu-Daudé (2):
      hw/arm/mps2-tz: Use TYPE_IOTKIT instead of hardcoded string
      target/arm: Use uint64_t for midr field in CPU state struct

include/hw/arm/xlnx-versal.h    |  31 +-
 target/arm/cpu-param.h          |   2 +-
 target/arm/cpu.h                |  38 ++-
 target/arm/translate-a64.h      |   9 -
 target/arm/translate.h          |  26 ++
 target/arm/neon-dp.decode       |  86 +++++
 target/arm/neon-ls.decode       |  52 +++
 target/arm/neon-shared.decode   |  66 ++++
 hw/arm/mps2-tz.c                |   2 +-
 hw/arm/xlnx-versal-virt.c       |  74 ++++-
 hw/arm/xlnx-versal.c            | 115 +++++--
 target/arm/cpu.c                |   3 +-
 target/arm/cpu64.c              |   8 +-
 target/arm/helper.c             | 183 ++++------
 target/arm/translate-a64.c      |  17 -
 target/arm/translate-neon.inc.c | 714 +++++++++++++++++++++++++++++++++++++++
 target/arm/translate-vfp.inc.c  |   6 -
 target/arm/translate.c          | 716 +++-------------------------------------
 target/arm/Makefile.objs        |  18 +
 19 files changed, 1302 insertions(+), 864 deletions(-)
 create mode 100644 target/arm/neon-dp.decode
 create mode 100644 target/arm/neon-ls.decode
 create mode 100644 target/arm/neon-shared.decode
 create mode 100644 target/arm/translate-neon.inc.c

From: Fredrik Strupe <fredrik@strupe.net>

According to the Arm ARM, VQDMULL is only valid when U=0; the
encoding with U=1 is unallocated.
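
The one-digit change in the table below can be read with the following
sketch. It assumes the last column is a bitmask where bits 0..2 mean
"UNDEF if size == 0/1/2" and bit 3 means "UNDEF if U == 1"; that reading
and the name neon_3diff_undef() are assumptions made for illustration,
not text taken from this patch.

```c
#include <stdbool.h>

/* Assumed meaning of the fourth table column. */
enum {
    UNDEF_SIZE0 = 1 << 0, /* UNDEF if size == 0 */
    UNDEF_SIZE1 = 1 << 1, /* UNDEF if size == 1 */
    UNDEF_SIZE2 = 1 << 2, /* UNDEF if size == 2 */
    UNDEF_U1    = 1 << 3, /* UNDEF if U == 1 */
};

/* Hypothetical check: should this size/U combination UNDEF? */
static bool neon_3diff_undef(unsigned undefreq, unsigned size, unsigned u)
{
    if (size < 3 && (undefreq & (1u << size))) {
        return true;
    }
    if (u && (undefreq & UNDEF_U1)) {
        return true;
    }
    return false;
}
```

Under that reading, the old value 1 rejected only size == 0, so a U=1
encoding with size 1 or 2 was accepted; the new value 9 rejects U=1 as
well.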

Signed-off-by: Fredrik Strupe <fredrik@strupe.net>
Fixes: 695272dcb976 ("target-arm: Handle UNDEF cases for Neon 3-regs-different-widths")
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                     {0, 0, 0, 0}, /* VMLSL */
                     {0, 0, 0, 9}, /* VQDMLSL */
                     {0, 0, 0, 0}, /* Integer VMULL */
-                    {0, 0, 0, 1}, /* VQDMULL */
+                    {0, 0, 0, 9}, /* VQDMULL */
                     {0, 0, 0, 0xa}, /* Polynomial VMULL */
                     {0, 0, 0, 7}, /* Reserved: always UNDEF */
                 };
-- 
2.20.1

From: Philippe Mathieu-Daudé <f4bug@amsat.org>

By using the TYPE_* definitions for devices, we can:
 - quickly find where devices are used with 'git-grep'
 - easily rename a device (one-line change).

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20200428154650.21991-1-f4bug@amsat.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/mps2-tz.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/arm/mps2-tz.c b/hw/arm/mps2-tz.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/mps2-tz.c
+++ b/hw/arm/mps2-tz.c
@@ -XXX,XX +XXX,XX @@ static void mps2tz_common_init(MachineState *machine)
         exit(EXIT_FAILURE);
     }
 
-    sysbus_init_child_obj(OBJECT(machine), "iotkit", &mms->iotkit,
+    sysbus_init_child_obj(OBJECT(machine), TYPE_IOTKIT, &mms->iotkit,
                           sizeof(mms->iotkit), mmc->armsse_type);
     iotkitdev = DEVICE(&mms->iotkit);
     object_property_set_link(OBJECT(&mms->iotkit), OBJECT(system_memory),
-- 
2.20.1

We define ARMMMUIdx_Stage2 as being an MMU index which uses a QEMU
TLB.  However we never actually use the TLB -- all stage 2 lookups
are done by direct calls to get_phys_addr_lpae() followed by a
physical address load via address_space_ld*().

Remove Stage2 from the list of ARM MMU indexes which correspond to
real core MMU indexes, and instead put it in the set of "NOTLB" ARM
MMU indexes.

This allows us to drop NB_MMU_MODES to 11.  It also means we can
safely add support for the ARMv8.2-TTS2UXN extension, which adds
permission bits to the stage 2 descriptors which define execute
permission separately for EL0 and EL1; supporting that while keeping
Stage2 in a QEMU TLB would require us to use separate TLBs for
"Stage2 for an EL0 access" and "Stage2 for an EL1 access", which is a
lot of extra complication given we aren't even using the QEMU TLB.

In the process of updating the comment on our MMU index use,
fix a couple of other minor errors:
 * NS EL2 EL2&0 was missing from the list in the comment
 * some text hadn't been updated from when we bumped NB_MMU_MODES
   above 8

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200330210400.11724-2-peter.maydell@linaro.org
---
 target/arm/cpu-param.h |   2 +-
 target/arm/cpu.h       |  21 +++++---
 target/arm/helper.c    | 112 ++++-------------------------------------
 3 files changed, 27 insertions(+), 108 deletions(-)

diff --git a/target/arm/cpu-param.h b/target/arm/cpu-param.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu-param.h
+++ b/target/arm/cpu-param.h
@@ -XXX,XX +XXX,XX @@
 # define TARGET_PAGE_BITS_MIN  10
 #endif
 
-#define NB_MMU_MODES 12
+#define NB_MMU_MODES 11
 
 #endif
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ bool write_cpustate_to_list(ARMCPU *cpu, bool kvm_sync);
  *     handling via the TLB. The only way to do a stage 1 translation without
  *     the immediate stage 2 translation is via the ATS or AT system insns,
  *     which can be slow-pathed and always do a page table walk.
+ *     The only use of stage 2 translations is either as part of an s1+2
+ *     lookup or when loading the descriptors during a stage 1 page table walk,
+ *     and in both those cases we don't use the TLB.
  *  4. we can also safely fold together the "32 bit EL3" and "64 bit EL3"
  *     translation regimes, because they map reasonably well to each other
  *     and they can't both be active at the same time.
@@ -XXX,XX +XXX,XX @@ bool write_cpustate_to_list(ARMCPU *cpu, bool kvm_sync);
  * NS EL1 EL1&0 stage 1+2 (aka NS PL1)
  * NS EL1 EL1&0 stage 1+2 +PAN
  * NS EL0 EL2&0
+ * NS EL2 EL2&0
  * NS EL2 EL2&0 +PAN
  * NS EL2 (aka NS PL2)
  * S EL0 EL1&0 (aka S PL0)
  * S EL1 EL1&0 (not used if EL3 is 32 bit)
  * S EL1 EL1&0 +PAN
  * S EL3 (aka S PL1)
- * NS EL1&0 stage 2
  *
- * for a total of 12 different mmu_idx.
+ * for a total of 11 different mmu_idx.
  *
  * R profile CPUs have an MPU, but can use the same set of MMU indexes
  * as A profile. They only need to distinguish NS EL0 and NS EL1 (and
@@ -XXX,XX +XXX,XX @@ bool write_cpustate_to_list(ARMCPU *cpu, bool kvm_sync);
  * are not quite the same -- different CPU types (most notably M profile
  * vs A/R profile) would like to use MMU indexes with different semantics,
  * but since we don't ever need to use all of those in a single CPU we
- * can avoid setting NB_MMU_MODES to more than 8. The lower bits of
+ * can avoid having to set NB_MMU_MODES to "total number of A profile MMU
+ * modes + total number of M profile MMU modes". The lower bits of
  * ARMMMUIdx are the core TLB mmu index, and the higher bits are always
  * the same for any particular CPU.
  * Variables of type ARMMUIdx are always full values, and the core
@@ -XXX,XX +XXX,XX @@ typedef enum ARMMMUIdx {
     ARMMMUIdx_SE10_1_PAN = 9 | ARM_MMU_IDX_A,
     ARMMMUIdx_SE3        = 10 | ARM_MMU_IDX_A,
 
-    ARMMMUIdx_Stage2     = 11 | ARM_MMU_IDX_A,
-
     /*
      * These are not allocated TLBs and are used only for AT system
      * instructions or for the first stage of an S12 page table walk.
@@ -XXX,XX +XXX,XX @@ typedef enum ARMMMUIdx {
     ARMMMUIdx_Stage1_E0 = 0 | ARM_MMU_IDX_NOTLB,
     ARMMMUIdx_Stage1_E1 = 1 | ARM_MMU_IDX_NOTLB,
     ARMMMUIdx_Stage1_E1_PAN = 2 | ARM_MMU_IDX_NOTLB,
+    /*
+     * Not allocated a TLB: used only for second stage of an S12 page
+     * table walk, or for descriptor loads during first stage of an S1
+     * page table walk. Note that if we ever want to have a TLB for this
+     * then various TLB flush insns which currently are no-ops or flush
+     * only stage 1 MMU indexes will need to change to flush stage 2.
+     */
+    ARMMMUIdx_Stage2     = 3 | ARM_MMU_IDX_NOTLB,
 
     /*
      * M-profile.
@@ -XXX,XX +XXX,XX @@ typedef enum ARMMMUIdxBit {
     TO_CORE_BIT(SE10_1),
     TO_CORE_BIT(SE10_1_PAN),
     TO_CORE_BIT(SE3),
-    TO_CORE_BIT(Stage2),
 
     TO_CORE_BIT(MUser),
     TO_CORE_BIT(MPriv),
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void tlbiall_nsnh_write(CPUARMState *env, const ARMCPRegInfo *ri,
     tlb_flush_by_mmuidx(cs,
                         ARMMMUIdxBit_E10_1 |
                         ARMMMUIdxBit_E10_1_PAN |
-                        ARMMMUIdxBit_E10_0 |
-                        ARMMMUIdxBit_Stage2);
+                        ARMMMUIdxBit_E10_0);
 }
 
 static void tlbiall_nsnh_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -XXX,XX +XXX,XX @@ static void tlbiall_nsnh_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
     tlb_flush_by_mmuidx_all_cpus_synced(cs,
                                         ARMMMUIdxBit_E10_1 |
                                         ARMMMUIdxBit_E10_1_PAN |
-                                        ARMMMUIdxBit_E10_0 |
-                                        ARMMMUIdxBit_Stage2);
+                                        ARMMMUIdxBit_E10_0);
 }
 
-static void tlbiipas2_write(CPUARMState *env, const ARMCPRegInfo *ri,
-                            uint64_t value)
-{
-    /* Invalidate by IPA. This has to invalidate any structures that
-     * contain only stage 2 translation information, but does not need
-     * to apply to structures that contain combined stage 1 and stage 2
-     * translation information.
-     * This must NOP if EL2 isn't implemented or SCR_EL3.NS is zero.
-     */
-    CPUState *cs = env_cpu(env);
-    uint64_t pageaddr;
-
-    if (!arm_feature(env, ARM_FEATURE_EL2) || !(env->cp15.scr_el3 & SCR_NS)) {
-        return;
-    }
-
-    pageaddr = sextract64(value << 12, 0, 40);
-
-    tlb_flush_page_by_mmuidx(cs, pageaddr, ARMMMUIdxBit_Stage2);
-}
-
-static void tlbiipas2_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
-                               uint64_t value)
-{
-    CPUState *cs = env_cpu(env);
-    uint64_t pageaddr;
-
-    if (!arm_feature(env, ARM_FEATURE_EL2) || !(env->cp15.scr_el3 & SCR_NS)) {
-        return;
-    }
-
-    pageaddr = sextract64(value << 12, 0, 40);
-
-    tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr,
-                                             ARMMMUIdxBit_Stage2);
-}
 
 static void tlbiall_hyp_write(CPUARMState *env, const ARMCPRegInfo *ri,
                               uint64_t value)
@@ -XXX,XX +XXX,XX @@ static void vttbr_write(CPUARMState *env, const ARMCPRegInfo *ri,
         tlb_flush_by_mmuidx(cs,
                             ARMMMUIdxBit_E10_1 |
                             ARMMMUIdxBit_E10_1_PAN |
-                            ARMMMUIdxBit_E10_0 |
-                            ARMMMUIdxBit_Stage2);
+                            ARMMMUIdxBit_E10_0);
         raw_write(env, ri, value);
     }
 }
@@ -XXX,XX +XXX,XX @@ static int alle1_tlbmask(CPUARMState *env)
         return ARMMMUIdxBit_SE10_1 |
                ARMMMUIdxBit_SE10_1_PAN |
                ARMMMUIdxBit_SE10_0;
-    } else if (arm_feature(env, ARM_FEATURE_EL2)) {
-        return ARMMMUIdxBit_E10_1 |
-               ARMMMUIdxBit_E10_1_PAN |
-               ARMMMUIdxBit_E10_0 |
-               ARMMMUIdxBit_Stage2;
     } else {
         return ARMMMUIdxBit_E10_1 |
                ARMMMUIdxBit_E10_1_PAN |
@@ -XXX,XX +XXX,XX @@ static void tlbi_aa64_vae3is_write(CPUARMState *env, const ARMCPRegInfo *ri,
                                              ARMMMUIdxBit_SE3);
 }
 
-static void tlbi_aa64_ipas2e1_write(CPUARMState *env, const ARMCPRegInfo *ri,
-                                    uint64_t value)
-{
-    /* Invalidate by IPA. This has to invalidate any structures that
-     * contain only stage 2 translation information, but does not need
-     * to apply to structures that contain combined stage 1 and stage 2
-     * translation information.
-     * This must NOP if EL2 isn't implemented or SCR_EL3.NS is zero.
-     */
-    ARMCPU *cpu = env_archcpu(env);
-    CPUState *cs = CPU(cpu);
-    uint64_t pageaddr;
-
-    if (!arm_feature(env, ARM_FEATURE_EL2) || !(env->cp15.scr_el3 & SCR_NS)) {
-        return;
-    }
-
-    pageaddr = sextract64(value << 12, 0, 48);
-
-    tlb_flush_page_by_mmuidx(cs, pageaddr, ARMMMUIdxBit_Stage2);
-}
-
-static void tlbi_aa64_ipas2e1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
-                                      uint64_t value)
-{
-    CPUState *cs = env_cpu(env);
-    uint64_t pageaddr;
-
-    if (!arm_feature(env, ARM_FEATURE_EL2) || !(env->cp15.scr_el3 & SCR_NS)) {
-        return;
-    }
-
-    pageaddr = sextract64(value << 12, 0, 48);
-
-    tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr,
-                                             ARMMMUIdxBit_Stage2);
-}
-
 static CPAccessResult aa64_zva_access(CPUARMState *env, const ARMCPRegInfo *ri,
                                       bool isread)
 {
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
       .writefn = tlbi_aa64_vae1_write },
     { .name = "TLBI_IPAS2E1IS", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 4, .crn = 8, .crm = 0, .opc2 = 1,
-      .access = PL2_W, .type = ARM_CP_NO_RAW,
-      .writefn = tlbi_aa64_ipas2e1is_write },
+      .access = PL2_W, .type = ARM_CP_NOP },
     { .name = "TLBI_IPAS2LE1IS", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 4, .crn = 8, .crm = 0, .opc2 = 5,
-      .access = PL2_W, .type = ARM_CP_NO_RAW,
-      .writefn = tlbi_aa64_ipas2e1is_write },
+      .access = PL2_W, .type = ARM_CP_NOP },
     { .name = "TLBI_ALLE1IS", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 4, .crn = 8, .crm = 3, .opc2 = 4,
       .access = PL2_W, .type = ARM_CP_NO_RAW,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
       .writefn = tlbi_aa64_alle1is_write },
     { .name = "TLBI_IPAS2E1", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 4, .crn = 8, .crm = 4, .opc2 = 1,
-      .access = PL2_W, .type = ARM_CP_NO_RAW,
-      .writefn = tlbi_aa64_ipas2e1_write },
+      .access = PL2_W, .type = ARM_CP_NOP },
     { .name = "TLBI_IPAS2LE1", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 4, .crn = 8, .crm = 4, .opc2 = 5,
-      .access = PL2_W, .type = ARM_CP_NO_RAW,
-      .writefn = tlbi_aa64_ipas2e1_write },
+      .access = PL2_W, .type = ARM_CP_NOP },
     { .name = "TLBI_ALLE1", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 4, .crn = 8, .crm = 7, .opc2 = 4,
       .access = PL2_W, .type = ARM_CP_NO_RAW,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
       .writefn = tlbimva_hyp_is_write },
     { .name = "TLBIIPAS2",
       .cp = 15, .opc1 = 4, .crn = 8, .crm = 4, .opc2 = 1,
-      .type = ARM_CP_NO_RAW, .access = PL2_W,
-      .writefn = tlbiipas2_write },
+      .type = ARM_CP_NOP, .access = PL2_W },
     { .name = "TLBIIPAS2IS",
       .cp = 15, .opc1 = 4, .crn = 8, .crm = 0, .opc2 = 1,
-      .type = ARM_CP_NO_RAW, .access = PL2_W,
-      .writefn = tlbiipas2_is_write },
+      .type = ARM_CP_NOP, .access = PL2_W },
     { .name = "TLBIIPAS2L",
       .cp = 15, .opc1 = 4, .crn = 8, .crm = 4, .opc2 = 5,
-      .type = ARM_CP_NO_RAW, .access = PL2_W,
-      .writefn = tlbiipas2_write },
+      .type = ARM_CP_NOP, .access = PL2_W },
     { .name = "TLBIIPAS2LIS",
       .cp = 15, .opc1 = 4, .crn = 8, .crm = 0, .opc2 = 5,
-      .type = ARM_CP_NO_RAW, .access = PL2_W,
-      .writefn = tlbiipas2_is_write },
+      .type = ARM_CP_NOP, .access = PL2_W },
     /* 32 bit cache operations */
     { .name = "ICIALLUIS", .cp = 15, .opc1 = 0, .crn = 7, .crm = 1, .opc2 = 0,
       .type = ARM_CP_NOP, .access = PL1_W, .accessfn = aa64_cacheop_pou_access },
-- 
2.20.1

The access_type argument to get_phys_addr_lpae() is an MMUAccessType;
use the enum constant MMU_DATA_LOAD rather than a literal 0 when we
call it in S1_ptw_translate().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200330210400.11724-3-peter.maydell@linaro.org
---
 target/arm/helper.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static hwaddr S1_ptw_translate(CPUARMState *env, ARMMMUIdx mmu_idx,
             pcacheattrs = &cacheattrs;
         }
 
-        ret = get_phys_addr_lpae(env, addr, 0, ARMMMUIdx_Stage2, &s2pa,
-                                 &txattrs, &s2prot, &s2size, fi, pcacheattrs);
+        ret = get_phys_addr_lpae(env, addr, MMU_DATA_LOAD, ARMMMUIdx_Stage2,
+                                 &s2pa, &txattrs, &s2prot, &s2size, fi,
+                                 pcacheattrs);
         if (ret) {
             assert(fi->type != ARMFault_None);
             fi->s2addr = addr;
-- 
2.20.1

For ARMv8.2-TTS2UXN, the stage 2 page table walk wants to know
whether the stage 1 access is for EL0 or not, because whether
exec permission is given can depend on whether this is an EL0
or EL1 access. Add a new argument to get_phys_addr_lpae() so
the call sites can pass this information in.

Since get_phys_addr_lpae() doesn't already have a doc comment,
add one so we have a place to put the documentation of the
semantics of the new s1_is_el0 argument.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200330210400.11724-4-peter.maydell@linaro.org
---
 target/arm/helper.c | 29 ++++++++++++++++++++++++++++-
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@
 
 static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
                                MMUAccessType access_type, ARMMMUIdx mmu_idx,
+                               bool s1_is_el0,
                                hwaddr *phys_ptr, MemTxAttrs *txattrs, int *prot,
                                target_ulong *page_size_ptr,
                                ARMMMUFaultInfo *fi, ARMCacheAttrs *cacheattrs);
@@ -XXX,XX +XXX,XX @@ static hwaddr S1_ptw_translate(CPUARMState *env, ARMMMUIdx mmu_idx,
         }
 
         ret = get_phys_addr_lpae(env, addr, MMU_DATA_LOAD, ARMMMUIdx_Stage2,
+                                 false,
                                  &s2pa, &txattrs, &s2prot, &s2size, fi,
                                  pcacheattrs);
         if (ret) {
@@ -XXX,XX +XXX,XX @@ static ARMVAParameters aa32_va_parameters(CPUARMState *env, uint32_t va,
     };
 }
 
+/**
+ * get_phys_addr_lpae: perform one stage of page table walk, LPAE format
+ *
+ * Returns false if the translation was successful. Otherwise, phys_ptr, attrs,
+ * prot and page_size may not be filled in, and the populated fsr value provides
+ * information on why the translation aborted, in the format of a long-format
+ * DFSR/IFSR fault register, with the following caveats:
+ *  * the WnR bit is never set (the caller must do this).
+ *
+ * @env: CPUARMState
+ * @address: virtual address to get physical address for
+ * @access_type: MMU_DATA_LOAD, MMU_DATA_STORE or MMU_INST_FETCH
+ * @mmu_idx: MMU index indicating required translation regime
+ * @s1_is_el0: if @mmu_idx is ARMMMUIdx_Stage2 (so this is a stage 2 page table
+ *             walk), must be true if this is stage 2 of a stage 1+2 walk for an
+ *             EL0 access. If @mmu_idx is anything else, @s1_is_el0 is ignored.
+ * @phys_ptr: set to the physical address corresponding to the virtual address
+ * @attrs: set to the memory transaction attributes to use
+ * @prot: set to the permissions for the page containing phys_ptr
+ * @page_size_ptr: set to the size of the page containing phys_ptr
+ * @fi: set to fault info if the translation fails
+ * @cacheattrs: (if non-NULL) set to the cacheability/shareability attributes
+ */
 static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
                                MMUAccessType access_type, ARMMMUIdx mmu_idx,
+                               bool s1_is_el0,
                                hwaddr *phys_ptr, MemTxAttrs *txattrs, int *prot,
                                target_ulong *page_size_ptr,
                                ARMMMUFaultInfo *fi, ARMCacheAttrs *cacheattrs)
@@ -XXX,XX +XXX,XX @@ bool get_phys_addr(CPUARMState *env, target_ulong address,
 
             /* S1 is done. Now do S2 translation.  */
             ret = get_phys_addr_lpae(env, ipa, access_type, ARMMMUIdx_Stage2,
+                                     mmu_idx == ARMMMUIdx_E10_0,
                                      phys_ptr, attrs, &s2_prot,
                                      page_size, fi,
                                      cacheattrs != NULL ? &cacheattrs2 : NULL);
@@ -XXX,XX +XXX,XX @@ bool get_phys_addr(CPUARMState *env, target_ulong address,
     }
 
     if (regime_using_lpae_format(env, mmu_idx)) {
-        return get_phys_addr_lpae(env, address, access_type, mmu_idx,
+        return get_phys_addr_lpae(env, address, access_type, mmu_idx, false,
                                   phys_ptr, attrs, prot, page_size,
                                   fi, cacheattrs);
     } else if (regime_sctlr(env, mmu_idx) & SCTLR_XP) {
-- 
2.20.1

The ARMv8.2-TTS2UXN feature extends the XN field in stage 2
translation table descriptors from just bit [54] to bits [54:53],
allowing stage 2 to control execution permissions separately for EL0
and EL1. Implement the new semantics of the XN field and enable
the feature for our 'max' CPU.
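
The two-bit XN semantics implemented by this patch can be summarised
with the sketch below. The function name s2_exec_allowed() is
hypothetical; the case mapping mirrors the switch added to get_S2prot()
in the diff that follows, and it leaves out the pre-TTS2UXN path (which
looks only at the upper XN bit and has an extra readability condition
for AArch32 EL2).

```c
#include <stdbool.h>

/*
 * Hypothetical summary of the stage 2 XN[1:0] encoding when
 * TTS2UXN is implemented. s1_is_el0 is true when the stage 1
 * access being translated comes from EL0.
 */
static bool s2_exec_allowed(unsigned xn, bool s1_is_el0)
{
    switch (xn & 3) {
    case 0:
        return true;        /* executable at EL1 and EL0 */
    case 1:
        return s1_is_el0;   /* executable at EL0 only */
    case 2:
        return false;       /* never executable */
    default:
        return !s1_is_el0;  /* xn == 3: executable at EL1 only */
    }
}
```

This is why the stage 2 walk needs the s1_is_el0 argument: the same
descriptor can grant or deny execute permission depending on which
exception level made the stage 1 access.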

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200330210400.11724-5-peter.maydell@linaro.org
---
 target/arm/cpu.h    | 15 +++++++++++++++
 target/arm/cpu.c    |  1 +
 target/arm/cpu64.c  |  2 ++
 target/arm/helper.c | 37 +++++++++++++++++++++++++++++++------
 4 files changed, 49 insertions(+), 6 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa32_ccidx(const ARMISARegisters *id)
     return FIELD_EX32(id->id_mmfr4, ID_MMFR4, CCIDX) != 0;
 }
 
+static inline bool isar_feature_aa32_tts2uxn(const ARMISARegisters *id)
+{
+    return FIELD_EX32(id->id_mmfr4, ID_MMFR4, XNX) != 0;
+}
+
 /*
  * 64-bit feature tests via id registers.
  */
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa64_ccidx(const ARMISARegisters *id)
     return FIELD_EX64(id->id_aa64mmfr2, ID_AA64MMFR2, CCIDX) != 0;
 }
 
+static inline bool isar_feature_aa64_tts2uxn(const ARMISARegisters *id)
+{
+    return FIELD_EX64(id->id_aa64mmfr1, ID_AA64MMFR1, XNX) != 0;
+}
+
 /*
  * Feature tests for "does this exist in either 32-bit or 64-bit?"
  */
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_any_ccidx(const ARMISARegisters *id)
     return isar_feature_aa64_ccidx(id) || isar_feature_aa32_ccidx(id);
 }
 
+static inline bool isar_feature_any_tts2uxn(const ARMISARegisters *id)
+{
+    return isar_feature_aa64_tts2uxn(id) || isar_feature_aa32_tts2uxn(id);
+}
+
 /*
  * Forward to the above feature tests given an ARMCPU pointer.
  */
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void arm_max_initfn(Object *obj)
             t = FIELD_DP32(t, ID_MMFR4, HPDS, 1); /* AA32HPD */
             t = FIELD_DP32(t, ID_MMFR4, AC2, 1); /* ACTLR2, HACTLR2 */
             t = FIELD_DP32(t, ID_MMFR4, CNP, 1); /* TTCNP */
+            t = FIELD_DP32(t, ID_MMFR4, XNX, 1); /* TTS2UXN */
             cpu->isar.id_mmfr4 = t;
         }
 #endif
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
         t = FIELD_DP64(t, ID_AA64MMFR1, VH, 1);
         t = FIELD_DP64(t, ID_AA64MMFR1, PAN, 2); /* ATS1E1 */
         t = FIELD_DP64(t, ID_AA64MMFR1, VMIDBITS, 2); /* VMID16 */
+        t = FIELD_DP64(t, ID_AA64MMFR1, XNX, 1); /* TTS2UXN */
         cpu->isar.id_aa64mmfr1 = t;
 
         t = cpu->isar.id_aa64mmfr2;
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
         u = FIELD_DP32(u, ID_MMFR4, HPDS, 1); /* AA32HPD */
         u = FIELD_DP32(u, ID_MMFR4, AC2, 1); /* ACTLR2, HACTLR2 */
         u = FIELD_DP32(u, ID_MMFR4, CNP, 1); /* TTCNP */
+        u = FIELD_DP32(u, ID_MMFR4, XNX, 1); /* TTS2UXN */
         cpu->isar.id_mmfr4 = u;
 
         u = cpu->isar.id_aa64dfr0;
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ simple_ap_to_rw_prot(CPUARMState *env, ARMMMUIdx mmu_idx, int ap)
  *
  * @env:     CPUARMState
  * @s2ap:    The 2-bit stage2 access permissions (S2AP)
- * @xn:      XN (execute-never) bit
+ * @xn:      XN (execute-never) bits
+ * @s1_is_el0: true if this is S2 of an S1+2 walk for EL0
  */
-static int get_S2prot(CPUARMState *env, int s2ap, int xn)
+static int get_S2prot(CPUARMState *env, int s2ap, int xn, bool s1_is_el0)
 {
     int prot = 0;
 
@@ -XXX,XX +XXX,XX @@ static int get_S2prot(CPUARMState *env, int s2ap, int xn)
     if (s2ap & 2) {
         prot |= PAGE_WRITE;
     }
-    if (!xn) {
-        if (arm_el_is_aa64(env, 2) || prot & PAGE_READ) {
+
+    if (cpu_isar_feature(any_tts2uxn, env_archcpu(env))) {
+        switch (xn) {
+        case 0:
             prot |= PAGE_EXEC;
+            break;
+        case 1:
+            if (s1_is_el0) {
+                prot |= PAGE_EXEC;
+            }
+            break;
+        case 2:
+            break;
+        case 3:
+            if (!s1_is_el0) {
+                prot |= PAGE_EXEC;
+            }
+            break;
+        default:
+            g_assert_not_reached();
+        }
+    } else {
+        if (!extract32(xn, 1, 1)) {
+            if (arm_el_is_aa64(env, 2) || prot & PAGE_READ) {
+                prot |= PAGE_EXEC;
+            }
         }
     }
     return prot;
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
     }
 
     ap = extract32(attrs, 4, 2);
-    xn = extract32(attrs, 12, 1);
 
     if (mmu_idx == ARMMMUIdx_Stage2) {
         ns = true;
-        *prot = get_S2prot(env, ap, xn);
+        xn = extract32(attrs, 11, 2);
+        *prot = get_S2prot(env, ap, xn, s1_is_el0);
     } else {
         ns = extract32(attrs, 3, 1);
+        xn = extract32(attrs, 12, 1);
         pxn = extract32(attrs, 11, 1);
         *prot = get_S1prot(env, mmu_idx, aarch64, ap, ns, xn, pxn);
     }
-- 
2.20.1
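
The XN decode in the get_S2prot() hunk above can be read as a small
truth table.  The following is a minimal standalone sketch of it, not
the QEMU code itself: the SK_* flags, sketch_s2prot() and the two
boolean parameters (have_tts2uxn, el2_is_aa64) are hypothetical
stand-ins for PAGE_*, the isar feature test and arm_el_is_aa64(env, 2).
The S2AP bit 0 to read-permission mapping is assumed from the part of
the function that the hunk does not show.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for QEMU's PAGE_* flags; the values are arbitrary. */
enum { SK_READ = 1, SK_WRITE = 2, SK_EXEC = 4 };

/*
 * Sketch of the stage 2 permission decode.
 * s2ap: 2-bit S2AP field; xn: 2-bit stage 2 XN field.
 * have_tts2uxn / el2_is_aa64 replace the CPU state lookups (assumption).
 */
static int sketch_s2prot(int s2ap, int xn, bool s1_is_el0,
                         bool have_tts2uxn, bool el2_is_aa64)
{
    int prot = 0;

    if (s2ap & 1) {
        prot |= SK_READ;
    }
    if (s2ap & 2) {
        prot |= SK_WRITE;
    }

    if (have_tts2uxn) {
        switch (xn & 3) {
        case 0:                 /* executable at EL1 and EL0 */
            prot |= SK_EXEC;
            break;
        case 1:                 /* executable at EL0 only */
            if (s1_is_el0) {
                prot |= SK_EXEC;
            }
            break;
        case 2:                 /* never executable */
            break;
        case 3:                 /* executable at EL1 only */
            if (!s1_is_el0) {
                prot |= SK_EXEC;
            }
            break;
        default:
            assert(0);
        }
    } else if (!(xn & 2)) {
        /* Without TTS2UXN only the upper XN bit is significant. */
        if (el2_is_aa64 || (prot & SK_READ)) {
            prot |= SK_EXEC;
        }
    }
    return prot;
}
```

For example, sketch_s2prot(3, 1, true, true, true) grants execute
permission, while the same call with s1_is_el0 false does not.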

In aarch64_max_initfn() we update both 32-bit and 64-bit ID
registers.  The intended pattern is that for 64-bit ID registers we
use FIELD_DP64 and the uint64_t 't' register, while 32-bit ID
registers use FIELD_DP32 and the uint32_t 'u' register.  For
ID_AA64DFR0 we accidentally used 'u', meaning that the top 32 bits of
this 64-bit ID register would end up always zero.  Luckily at the
moment that's what they should be anyway, so this bug has no visible
effects.

Use the right-sized variable.
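
As an illustration of the failure mode, here is a minimal standalone
sketch (not QEMU code).  sketch_deposit64() is a hypothetical helper
standing in for FIELD_DP64, and the field position used (bits [11:8],
as for PMUVER) is hard-coded for the example.  Routing the value
through a uint32_t silently discards bits [63:32]:

```c
#include <stdint.h>

/* Hypothetical helper: write 'field' into value[start +: length] (1 <= length <= 64). */
static uint64_t sketch_deposit64(uint64_t value, unsigned start,
                                 unsigned length, uint64_t field)
{
    uint64_t mask = (~0ULL >> (64 - length)) << start;

    return (value & ~mask) | ((field << start) & mask);
}

/* Buggy pattern: the 64-bit register passes through a 32-bit variable. */
static uint64_t sketch_update_via_u32(uint64_t reg)
{
    uint32_t u = (uint32_t)reg;               /* top 32 bits lost here */

    u = (uint32_t)sketch_deposit64(u, 8, 4, 5);
    return u;                                 /* zero-extended on the way back */
}

/* Fixed pattern: a 64-bit variable keeps the whole register. */
static uint64_t sketch_update_via_u64(uint64_t reg)
{
    uint64_t t = reg;

    t = sketch_deposit64(t, 8, 4, 5);
    return t;
}
```

With a register value that has any of bits [63:32] set, the two
functions return different results; with the top half zero they agree,
which is why the bug was not visible.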

Fixes: 3bec78447a958d481991
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20200423110915.10527-1-peter.maydell@linaro.org
---
 target/arm/cpu64.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
         u = FIELD_DP32(u, ID_MMFR4, XNX, 1); /* TTS2UXN */
         cpu->isar.id_mmfr4 = u;
 
-        u = cpu->isar.id_aa64dfr0;
-        u = FIELD_DP64(u, ID_AA64DFR0, PMUVER, 5); /* v8.4-PMU */
-        cpu->isar.id_aa64dfr0 = u;
+        t = cpu->isar.id_aa64dfr0;
+        t = FIELD_DP64(t, ID_AA64DFR0, PMUVER, 5); /* v8.4-PMU */
+        cpu->isar.id_aa64dfr0 = t;
 
         u = cpu->isar.id_dfr0;
         u = FIELD_DP32(u, ID_DFR0, PERFMON, 5); /* v8.4-PMU */
-- 
2.20.1

From: Philippe Mathieu-Daudé <f4bug@amsat.org>

MIDR_EL1 is a 64-bit system register with the top 32 bits being RES0.
Represent it in QEMU's ARMCPU struct with a uint64_t, not a
uint32_t.

This fixes an error when compiling with -Werror=conversion
because we were manipulating the register value using a
local uint64_t variable:

target/arm/cpu64.c: In function ‘aarch64_max_initfn’:
  target/arm/cpu64.c:628:21: error: conversion from ‘uint64_t’ {aka ‘long unsigned int’} to ‘uint32_t’ {aka ‘unsigned int’} may change value [-Werror=conversion]
    628 |         cpu->midr = t;
        |                     ^

and future-proofs us against a possible future architecture
change using some of the top 32 bits.

Suggested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Message-id: 20200428172634.29707-1-f4bug@amsat.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h | 2 +-
 target/arm/cpu.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ struct ARMCPU {
         uint64_t id_aa64dfr0;
         uint64_t id_aa64dfr1;
     } isar;
-    uint32_t midr;
+    uint64_t midr;
     uint32_t revidr;
     uint32_t reset_fpsid;
     uint32_t ctr;
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static const ARMCPUInfo arm_cpus[] = {
 static Property arm_cpu_properties[] = {
     DEFINE_PROP_BOOL("start-powered-off", ARMCPU, start_powered_off, false),
     DEFINE_PROP_UINT32("psci-conduit", ARMCPU, psci_conduit, 0),
-    DEFINE_PROP_UINT32("midr", ARMCPU, midr, 0),
+    DEFINE_PROP_UINT64("midr", ARMCPU, midr, 0),
     DEFINE_PROP_UINT64("mp-affinity", ARMCPU,
                         mp_affinity, ARM64_AFFINITY_INVALID),
     DEFINE_PROP_INT32("node-id", ARMCPU, node_id, CPU_UNSET_NUMA_NODE_ID),
-- 
2.20.1

From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>

Move misplaced comment.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Luc Michel <luc.michel@greensocs.com>
Message-id: 20200427181649.26851-3-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/xlnx-versal.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/arm/xlnx-versal.c b/hw/arm/xlnx-versal.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/xlnx-versal.c
+++ b/hw/arm/xlnx-versal.c
@@ -XXX,XX +XXX,XX @@ static void versal_create_apu_cpus(Versal *s)
 
         obj = object_new(XLNX_VERSAL_ACPU_TYPE);
         if (!obj) {
-            /* Secondary CPUs start in PSCI powered-down state */
             error_report("Unable to create apu.cpu[%d] of type %s",
                          i, XLNX_VERSAL_ACPU_TYPE);
             exit(EXIT_FAILURE);
@@ -XXX,XX +XXX,XX @@ static void versal_create_apu_cpus(Versal *s)
         object_property_set_int(obj, s->cfg.psci_conduit,
                                 "psci-conduit", &error_abort);
         if (i) {
+            /* Secondary CPUs start in PSCI powered-down state */
             object_property_set_bool(obj, true,
                                      "start-powered-off", &error_abort);
         }
-- 
2.20.1

From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>

Fix typo xlnx-ve -> xlnx-versal.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Luc Michel <luc.michel@greensocs.com>
Message-id: 20200427181649.26851-4-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/xlnx-versal-virt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/arm/xlnx-versal-virt.c b/hw/arm/xlnx-versal-virt.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/xlnx-versal-virt.c
+++ b/hw/arm/xlnx-versal-virt.c
@@ -XXX,XX +XXX,XX @@ static void versal_virt_init(MachineState *machine)
         psci_conduit = QEMU_PSCI_CONDUIT_SMC;
     }
 
-    sysbus_init_child_obj(OBJECT(machine), "xlnx-ve", &s->soc,
+    sysbus_init_child_obj(OBJECT(machine), "xlnx-versal", &s->soc,
                           sizeof(s->soc), TYPE_XLNX_VERSAL);
     object_property_set_link(OBJECT(&s->soc), OBJECT(machine->ram),
                              "ddr", &error_abort);
-- 
2.20.1

From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>

Embed the UARTs into the SoC type.

Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Luc Michel <luc.michel@greensocs.com>
Message-id: 20200427181649.26851-5-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/arm/xlnx-versal.h |  3 ++-
 hw/arm/xlnx-versal.c         | 12 ++++++------
 2 files changed, 8 insertions(+), 7 deletions(-)

From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>

Embed the GEMs into the SoC type.

Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Luc Michel <luc.michel@greensocs.com>
Message-id: 20200427181649.26851-6-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/arm/xlnx-versal.h |  3 ++-
 hw/arm/xlnx-versal.c         | 15 ++++++++-------
 2 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/include/hw/arm/xlnx-versal.h b/include/hw/arm/xlnx-versal.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/arm/xlnx-versal.h
+++ b/include/hw/arm/xlnx-versal.h
@@ -XXX,XX +XXX,XX @@
 #include "hw/arm/boot.h"
 #include "hw/intc/arm_gicv3.h"
 #include "hw/char/pl011.h"
+#include "hw/net/cadence_gem.h"
 
 #define TYPE_XLNX_VERSAL "xlnx-versal"
 #define XLNX_VERSAL(obj) OBJECT_CHECK(Versal, (obj), TYPE_XLNX_VERSAL)
@@ -XXX,XX +XXX,XX @@ typedef struct Versal {
 
         struct {
             PL011State uart[XLNX_VERSAL_NR_UARTS];
-            SysBusDevice *gem[XLNX_VERSAL_NR_GEMS];
+            CadenceGEMState gem[XLNX_VERSAL_NR_GEMS];
             SysBusDevice *adma[XLNX_VERSAL_NR_ADMAS];
         } iou;
     } lpd;
diff --git a/hw/arm/xlnx-versal.c b/hw/arm/xlnx-versal.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/xlnx-versal.c
+++ b/hw/arm/xlnx-versal.c
@@ -XXX,XX +XXX,XX @@ static void versal_create_gems(Versal *s, qemu_irq *pic)
         DeviceState *dev;
         MemoryRegion *mr;
 
-        dev = qdev_create(NULL, "cadence_gem");
-        s->lpd.iou.gem[i] = SYS_BUS_DEVICE(dev);
-        object_property_add_child(OBJECT(s), name, OBJECT(dev), &error_fatal);
+        sysbus_init_child_obj(OBJECT(s), name,
+                              &s->lpd.iou.gem[i], sizeof(s->lpd.iou.gem[i]),
+                              TYPE_CADENCE_GEM);
+        dev = DEVICE(&s->lpd.iou.gem[i]);
         if (nd->used) {
             qemu_check_nic_model(nd, "cadence_gem");
             qdev_set_nic_properties(dev, nd);
         }
-        object_property_set_int(OBJECT(s->lpd.iou.gem[i]),
+        object_property_set_int(OBJECT(dev),
                                 2, "num-priority-queues",
                                 &error_abort);
-        object_property_set_link(OBJECT(s->lpd.iou.gem[i]),
+        object_property_set_link(OBJECT(dev),
                                  OBJECT(&s->mr_ps), "dma",
                                  &error_abort);
         qdev_init_nofail(dev);
 
-        mr = sysbus_mmio_get_region(s->lpd.iou.gem[i], 0);
+        mr = sysbus_mmio_get_region(SYS_BUS_DEVICE(dev), 0);
         memory_region_add_subregion(&s->mr_ps, addrs[i], mr);
 
-        sysbus_connect_irq(s->lpd.iou.gem[i], 0, pic[irqs[i]]);
+        sysbus_connect_irq(SYS_BUS_DEVICE(dev), 0, pic[irqs[i]]);
         g_free(name);
     }
 }
-- 
2.20.1

From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>

Embed the ADMAs into the SoC type.

Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Luc Michel <luc.michel@greensocs.com>
Message-id: 20200427181649.26851-7-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/arm/xlnx-versal.h |  3 ++-
 hw/arm/xlnx-versal.c         | 14 +++++++-------
 2 files changed, 9 insertions(+), 8 deletions(-)

From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>

Embed the APUs into the SoC type.

Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Luc Michel <luc.michel@greensocs.com>
Message-id: 20200427181649.26851-8-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/arm/xlnx-versal.h |  2 +-
 hw/arm/xlnx-versal-virt.c    |  4 ++--
 hw/arm/xlnx-versal.c         | 19 +++++--------------
 3 files changed, 8 insertions(+), 17 deletions(-)

diff --git a/include/hw/arm/xlnx-versal.h b/include/hw/arm/xlnx-versal.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/arm/xlnx-versal.h
+++ b/include/hw/arm/xlnx-versal.h
@@ -XXX,XX +XXX,XX @@ typedef struct Versal {
     struct {
         struct {
             MemoryRegion mr;
-            ARMCPU *cpu[XLNX_VERSAL_NR_ACPUS];
+            ARMCPU cpu[XLNX_VERSAL_NR_ACPUS];
             GICv3State gic;
         } apu;
     } fpd;
diff --git a/hw/arm/xlnx-versal-virt.c b/hw/arm/xlnx-versal-virt.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/xlnx-versal-virt.c
+++ b/hw/arm/xlnx-versal-virt.c
@@ -XXX,XX +XXX,XX @@ static void versal_virt_init(MachineState *machine)
     s->binfo.get_dtb = versal_virt_get_dtb;
     s->binfo.modify_dtb = versal_virt_modify_dtb;
     if (machine->kernel_filename) {
-        arm_load_kernel(s->soc.fpd.apu.cpu[0], machine, &s->binfo);
+        arm_load_kernel(&s->soc.fpd.apu.cpu[0], machine, &s->binfo);
     } else {
-        AddressSpace *as = arm_boot_address_space(s->soc.fpd.apu.cpu[0],
+        AddressSpace *as = arm_boot_address_space(&s->soc.fpd.apu.cpu[0],
                                                   &s->binfo);
         /* Some boot-loaders (e.g u-boot) don't like blobs at address 0 (NULL).
          * Offset things by 4K.  */
diff --git a/hw/arm/xlnx-versal.c b/hw/arm/xlnx-versal.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/xlnx-versal.c
+++ b/hw/arm/xlnx-versal.c
@@ -XXX,XX +XXX,XX @@ static void versal_create_apu_cpus(Versal *s)
 
     for (i = 0; i < ARRAY_SIZE(s->fpd.apu.cpu); i++) {
         Object *obj;
-        char *name;
-
-        obj = object_new(XLNX_VERSAL_ACPU_TYPE);
-        if (!obj) {
-            error_report("Unable to create apu.cpu[%d] of type %s",
-                         i, XLNX_VERSAL_ACPU_TYPE);
-            exit(EXIT_FAILURE);
-        }
-
-        name = g_strdup_printf("apu-cpu[%d]", i);
-        object_property_add_child(OBJECT(s), name, obj, &error_fatal);
-        g_free(name);
 
+        object_initialize_child(OBJECT(s), "apu-cpu[*]",
+                                &s->fpd.apu.cpu[i], sizeof(s->fpd.apu.cpu[i]),
+                                XLNX_VERSAL_ACPU_TYPE, &error_abort, NULL);
+        obj = OBJECT(&s->fpd.apu.cpu[i]);
         object_property_set_int(obj, s->cfg.psci_conduit,
                                 "psci-conduit", &error_abort);
         if (i) {
@@ -XXX,XX +XXX,XX @@ static void versal_create_apu_cpus(Versal *s)
         object_property_set_link(obj, OBJECT(&s->fpd.apu.mr), "memory",
                                  &error_abort);
         object_property_set_bool(obj, true, "realized", &error_fatal);
-        s->fpd.apu.cpu[i] = ARM_CPU(obj);
     }
 }
 
@@ -XXX,XX +XXX,XX @@ static void versal_create_apu_gic(Versal *s, qemu_irq *pic)
     }
 
     for (i = 0; i < nr_apu_cpus; i++) {
-        DeviceState *cpudev = DEVICE(s->fpd.apu.cpu[i]);
+        DeviceState *cpudev = DEVICE(&s->fpd.apu.cpu[i]);
         int ppibase = XLNX_VERSAL_NR_IRQS + i * GIC_INTERNAL + GIC_NR_SGIS;
         qemu_irq maint_irq;
         int ti;
-- 
2.20.1

From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>

Add support for SD.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Luc Michel <luc.michel@greensocs.com>
Message-id: 20200427181649.26851-9-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/arm/xlnx-versal.h | 12 ++++++++++++
 hw/arm/xlnx-versal.c         | 31 +++++++++++++++++++++++++++++++
 2 files changed, 43 insertions(+)

diff --git a/include/hw/arm/xlnx-versal.h b/include/hw/arm/xlnx-versal.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/arm/xlnx-versal.h
+++ b/include/hw/arm/xlnx-versal.h
@@ -XXX,XX +XXX,XX @@
 
 #include "hw/sysbus.h"
 #include "hw/arm/boot.h"
+#include "hw/sd/sdhci.h"
 #include "hw/intc/arm_gicv3.h"
 #include "hw/char/pl011.h"
 #include "hw/dma/xlnx-zdma.h"
@@ -XXX,XX +XXX,XX @@
 #define XLNX_VERSAL_NR_UARTS   2
 #define XLNX_VERSAL_NR_GEMS    2
 #define XLNX_VERSAL_NR_ADMAS   8
+#define XLNX_VERSAL_NR_SDS     2
 #define XLNX_VERSAL_NR_IRQS    192
 
 typedef struct Versal {
@@ -XXX,XX +XXX,XX @@ typedef struct Versal {
         } iou;
     } lpd;
 
+    /* The Platform Management Controller subsystem.  */
+    struct {
+        struct {
+            SDHCIState sd[XLNX_VERSAL_NR_SDS];
+        } iou;
+    } pmc;
+
     struct {
         MemoryRegion *mr_ddr;
         uint32_t psci_conduit;
@@ -XXX,XX +XXX,XX @@ typedef struct Versal {
 #define VERSAL_GEM1_IRQ_0          58
 #define VERSAL_GEM1_WAKE_IRQ_0     59
 #define VERSAL_ADMA_IRQ_0          60
+#define VERSAL_SD0_IRQ_0           126
 
 /* Architecturally reserved IRQs suitable for virtualization.  */
 #define VERSAL_RSVD_IRQ_FIRST 111
@@ -XXX,XX +XXX,XX @@ typedef struct Versal {
 #define MM_FPD_CRF                  0xfd1a0000U
 #define MM_FPD_CRF_SIZE             0x140000
 
+#define MM_PMC_SD0                  0xf1040000U
+#define MM_PMC_SD0_SIZE             0x10000
 #define MM_PMC_CRP                  0xf1260000U
 #define MM_PMC_CRP_SIZE             0x10000
 #endif
diff --git a/hw/arm/xlnx-versal.c b/hw/arm/xlnx-versal.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/xlnx-versal.c
+++ b/hw/arm/xlnx-versal.c
@@ -XXX,XX +XXX,XX @@ static void versal_create_admas(Versal *s, qemu_irq *pic)
     }
 }
 
+#define SDHCI_CAPABILITIES  0x280737ec6481 /* Same as on ZynqMP.  */
+static void versal_create_sds(Versal *s, qemu_irq *pic)
+{
+    int i;
+
+    for (i = 0; i < ARRAY_SIZE(s->pmc.iou.sd); i++) {
+        DeviceState *dev;
+        MemoryRegion *mr;
+
+        sysbus_init_child_obj(OBJECT(s), "sd[*]",
+                              &s->pmc.iou.sd[i], sizeof(s->pmc.iou.sd[i]),
+                              TYPE_SYSBUS_SDHCI);
+        dev = DEVICE(&s->pmc.iou.sd[i]);
+
+        object_property_set_uint(OBJECT(dev),
+                                 3, "sd-spec-version", &error_fatal);
+        object_property_set_uint(OBJECT(dev), SDHCI_CAPABILITIES, "capareg",
+                                 &error_fatal);
+        object_property_set_uint(OBJECT(dev), UHS_I, "uhs", &error_fatal);
+        qdev_init_nofail(dev);
+
+        mr = sysbus_mmio_get_region(SYS_BUS_DEVICE(dev), 0);
+        memory_region_add_subregion(&s->mr_ps,
+                                    MM_PMC_SD0 + i * MM_PMC_SD0_SIZE, mr);
+
+        sysbus_connect_irq(SYS_BUS_DEVICE(dev), 0,
+                           pic[VERSAL_SD0_IRQ_0 + i * 2]);
+    }
+}
+
 /* This takes the board allocated linear DDR memory and creates aliases
  * for each split DDR range/aperture on the Versal address map.
  */
@@ -XXX,XX +XXX,XX @@ static void versal_realize(DeviceState *dev, Error **errp)
     versal_create_uarts(s, pic);
     versal_create_gems(s, pic);
     versal_create_admas(s, pic);
+    versal_create_sds(s, pic);
     versal_map_ddr(s);
     versal_unimp(s);
 
-- 
2.20.1

From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>

hw/arm: versal: Add support for the RTC.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Luc Michel <luc.michel@greensocs.com>
Message-id: 20200427181649.26851-10-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/arm/xlnx-versal.h |  8 ++++++++
 hw/arm/xlnx-versal.c         | 21 +++++++++++++++++++++
 2 files changed, 29 insertions(+)

diff --git a/include/hw/arm/xlnx-versal.h b/include/hw/arm/xlnx-versal.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/arm/xlnx-versal.h
+++ b/include/hw/arm/xlnx-versal.h
@@ -XXX,XX +XXX,XX @@
 #include "hw/char/pl011.h"
 #include "hw/dma/xlnx-zdma.h"
 #include "hw/net/cadence_gem.h"
+#include "hw/rtc/xlnx-zynqmp-rtc.h"
 
 #define TYPE_XLNX_VERSAL "xlnx-versal"
 #define XLNX_VERSAL(obj) OBJECT_CHECK(Versal, (obj), TYPE_XLNX_VERSAL)
@@ -XXX,XX +XXX,XX @@ typedef struct Versal {
         struct {
             SDHCIState sd[XLNX_VERSAL_NR_SDS];
         } iou;
+
+        XlnxZynqMPRTC rtc;
     } pmc;
 
     struct {
@@ -XXX,XX +XXX,XX @@ typedef struct Versal {
 #define VERSAL_GEM1_IRQ_0          58
 #define VERSAL_GEM1_WAKE_IRQ_0     59
 #define VERSAL_ADMA_IRQ_0          60
+#define VERSAL_RTC_APB_ERR_IRQ     121
 #define VERSAL_SD0_IRQ_0           126
+#define VERSAL_RTC_ALARM_IRQ       142
+#define VERSAL_RTC_SECONDS_IRQ     143
 
 /* Architecturally reserved IRQs suitable for virtualization.  */
 #define VERSAL_RSVD_IRQ_FIRST 111
@@ -XXX,XX +XXX,XX @@ typedef struct Versal {
 #define MM_PMC_SD0_SIZE             0x10000
 #define MM_PMC_CRP                  0xf1260000U
 #define MM_PMC_CRP_SIZE             0x10000
+#define MM_PMC_RTC                  0xf12a0000
+#define MM_PMC_RTC_SIZE             0x10000
 #endif
diff --git a/hw/arm/xlnx-versal.c b/hw/arm/xlnx-versal.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/xlnx-versal.c
+++ b/hw/arm/xlnx-versal.c
@@ -XXX,XX +XXX,XX @@ static void versal_create_sds(Versal *s, qemu_irq *pic)
     }
 }
 
+static void versal_create_rtc(Versal *s, qemu_irq *pic)
+{
+    SysBusDevice *sbd;
+    MemoryRegion *mr;
+
+    sysbus_init_child_obj(OBJECT(s), "rtc", &s->pmc.rtc, sizeof(s->pmc.rtc),
+                          TYPE_XLNX_ZYNQMP_RTC);
+    sbd = SYS_BUS_DEVICE(&s->pmc.rtc);
+    qdev_init_nofail(DEVICE(sbd));
+
+    mr = sysbus_mmio_get_region(sbd, 0);
+    memory_region_add_subregion(&s->mr_ps, MM_PMC_RTC, mr);
+
+    /*
+     * TODO: Connect the ALARM and SECONDS interrupts once our RTC model
+     * supports them.
+     */
+    sysbus_connect_irq(sbd, 1, pic[VERSAL_RTC_APB_ERR_IRQ]);
+}
+
 /* This takes the board allocated linear DDR memory and creates aliases
  * for each split DDR range/aperture on the Versal address map.
  */
@@ -XXX,XX +XXX,XX @@ static void versal_realize(DeviceState *dev, Error **errp)
     versal_create_gems(s, pic);
     versal_create_admas(s, pic);
     versal_create_sds(s, pic);
+    versal_create_rtc(s, pic);
     versal_map_ddr(s);
     versal_unimp(s);
 
-- 
2.20.1

From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>

Add support for SD.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Luc Michel <luc.michel@greensocs.com>
Message-id: 20200427181649.26851-11-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/xlnx-versal-virt.c | 46 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 46 insertions(+)

diff --git a/hw/arm/xlnx-versal-virt.c b/hw/arm/xlnx-versal-virt.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/xlnx-versal-virt.c
+++ b/hw/arm/xlnx-versal-virt.c
@@ -XXX,XX +XXX,XX @@
 #include "hw/arm/sysbus-fdt.h"
 #include "hw/arm/fdt.h"
 #include "cpu.h"
+#include "hw/qdev-properties.h"
 #include "hw/arm/xlnx-versal.h"
 
 #define TYPE_XLNX_VERSAL_VIRT_MACHINE MACHINE_TYPE_NAME("xlnx-versal-virt")
@@ -XXX,XX +XXX,XX @@ static void fdt_add_zdma_nodes(VersalVirt *s)
     }
 }
 
+static void fdt_add_sd_nodes(VersalVirt *s)
+{
+    const char clocknames[] = "clk_xin\0clk_ahb";
+    const char compat[] = "arasan,sdhci-8.9a";
+    int i;
+
+    for (i = ARRAY_SIZE(s->soc.pmc.iou.sd) - 1; i >= 0; i--) {
+        uint64_t addr = MM_PMC_SD0 + MM_PMC_SD0_SIZE * i;
+        char *name = g_strdup_printf("/sdhci@%" PRIx64, addr);
+
+        qemu_fdt_add_subnode(s->fdt, name);
+
+        qemu_fdt_setprop_cells(s->fdt, name, "clocks",
+                               s->phandle.clk_25Mhz, s->phandle.clk_25Mhz);
+        qemu_fdt_setprop(s->fdt, name, "clock-names",
+                         clocknames, sizeof(clocknames));
+        qemu_fdt_setprop_cells(s->fdt, name, "interrupts",
+                               GIC_FDT_IRQ_TYPE_SPI, VERSAL_SD0_IRQ_0 + i * 2,
+                               GIC_FDT_IRQ_FLAGS_LEVEL_HI);
+        qemu_fdt_setprop_sized_cells(s->fdt, name, "reg",
+                                     2, addr, 2, MM_PMC_SD0_SIZE);
+        qemu_fdt_setprop(s->fdt, name, "compatible", compat, sizeof(compat));
+        g_free(name);
+    }
+}
+
 static void fdt_nop_memory_nodes(void *fdt, Error **errp)
 {
     Error *err = NULL;
@@ -XXX,XX +XXX,XX @@ static void create_virtio_regions(VersalVirt *s)
     }
 }
 
+static void sd_plugin_card(SDHCIState *sd, DriveInfo *di)
+{
+    BlockBackend *blk = di ? blk_by_legacy_dinfo(di) : NULL;
+    DeviceState *card;
+
+    card = qdev_create(qdev_get_child_bus(DEVICE(sd), "sd-bus"), TYPE_SD_CARD);
+    object_property_add_child(OBJECT(sd), "card[*]", OBJECT(card),
+                              &error_fatal);
+    qdev_prop_set_drive(card, "drive", blk, &error_fatal);
+    object_property_set_bool(OBJECT(card), true, "realized", &error_fatal);
+}
+
 static void versal_virt_init(MachineState *machine)
 {
     VersalVirt *s = XLNX_VERSAL_VIRT_MACHINE(machine);
     int psci_conduit = QEMU_PSCI_CONDUIT_DISABLED;
+    int i;
 
     /*
      * If the user provides an Operating System to be loaded, we expect them
@@ -XXX,XX +XXX,XX @@ static void versal_virt_init(MachineState *machine)
     fdt_add_gic_nodes(s);
     fdt_add_timer_nodes(s);
     fdt_add_zdma_nodes(s);
+    fdt_add_sd_nodes(s);
     fdt_add_cpu_nodes(s, psci_conduit);
     fdt_add_clk_node(s, "/clk125", 125000000, s->phandle.clk_125Mhz);
     fdt_add_clk_node(s, "/clk25", 25000000, s->phandle.clk_25Mhz);
@@ -XXX,XX +XXX,XX @@ static void versal_virt_init(MachineState *machine)
     memory_region_add_subregion_overlap(get_system_memory(),
                                         0, &s->soc.fpd.apu.mr, 0);
 
+    /* Plugin SD cards.  */
+    for (i = 0; i < ARRAY_SIZE(s->soc.pmc.iou.sd); i++) {
+        sd_plugin_card(&s->soc.pmc.iou.sd[i], drive_get_next(IF_SD));
+    }
+
     s->binfo.ram_size = machine->ram_size;
     s->binfo.loader_start = 0x0;
     s->binfo.get_dtb = versal_virt_get_dtb;
-- 
2.20.1

From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>

Add support for the RTC.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Luc Michel <luc.michel@greensocs.com>
Message-id: 20200427181649.26851-12-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/xlnx-versal-virt.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/hw/arm/xlnx-versal-virt.c b/hw/arm/xlnx-versal-virt.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/xlnx-versal-virt.c
+++ b/hw/arm/xlnx-versal-virt.c
@@ -XXX,XX +XXX,XX @@ static void fdt_add_sd_nodes(VersalVirt *s)
     }
 }
 
+static void fdt_add_rtc_node(VersalVirt *s)
+{
+    const char compat[] = "xlnx,zynqmp-rtc";
+    const char interrupt_names[] = "alarm\0sec";
+    char *name = g_strdup_printf("/rtc@%x", MM_PMC_RTC);
+
+    qemu_fdt_add_subnode(s->fdt, name);
+
+    qemu_fdt_setprop_cells(s->fdt, name, "interrupts",
+                           GIC_FDT_IRQ_TYPE_SPI, VERSAL_RTC_ALARM_IRQ,
+                           GIC_FDT_IRQ_FLAGS_LEVEL_HI,
+                           GIC_FDT_IRQ_TYPE_SPI, VERSAL_RTC_SECONDS_IRQ,
+                           GIC_FDT_IRQ_FLAGS_LEVEL_HI);
+    qemu_fdt_setprop(s->fdt, name, "interrupt-names",
+                     interrupt_names, sizeof(interrupt_names));
+    qemu_fdt_setprop_sized_cells(s->fdt, name, "reg",
+                                 2, MM_PMC_RTC, 2, MM_PMC_RTC_SIZE);
+    qemu_fdt_setprop(s->fdt, name, "compatible", compat, sizeof(compat));
+    g_free(name);
+}
+
 static void fdt_nop_memory_nodes(void *fdt, Error **errp)
 {
     Error *err = NULL;
@@ -XXX,XX +XXX,XX @@ static void versal_virt_init(MachineState *machine)
     fdt_add_timer_nodes(s);
     fdt_add_zdma_nodes(s);
     fdt_add_sd_nodes(s);
+    fdt_add_rtc_node(s);
     fdt_add_cpu_nodes(s, psci_conduit);
     fdt_add_clk_node(s, "/clk125", 125000000, s->phandle.clk_125Mhz);
     fdt_add_clk_node(s, "/clk25", 25000000, s->phandle.clk_25Mhz);
-- 
2.20.1

Somewhere along the line we accidentally added a duplicate
"using D16-D31 when they don't exist" check to do_vfm_dp()
(probably an artifact of a patch series rebase). Remove it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20200430181003.21682-2-peter.maydell@linaro.org
---
 target/arm/translate-vfp.inc.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-vfp.inc.c
+++ b/target/arm/translate-vfp.inc.c
@@ -XXX,XX +XXX,XX @@ static bool do_vfm_dp(DisasContext *s, arg_VFMA_dp *a, bool neg_n, bool neg_d)
         return false;
     }
 
-    /* UNDEF accesses to D16-D31 if they don't exist. */
-    if (!dc_isar_feature(aa32_simd_r32, s) &&
-        ((a->vd | a->vn | a->vm) & 0x10)) {
-        return false;
-    }
-
     if (!vfp_access_check(s)) {
         return true;
     }
-- 
2.20.1

We were accidentally permitting decode of Thumb Neon insns even if
the CPU didn't have the FEATURE_NEON bit set, because the feature
check was being done before the call to disas_neon_data_insn() and
disas_neon_ls_insn() in the Arm decoder but was omitted from the
Thumb decoder.  Push the feature bit check down into the called
functions so it is done for both Arm and Thumb encodings.
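
A minimal sketch of the resulting shape, assuming a simplified context
struct: SketchCtx and the sketch_* functions are hypothetical names,
and the real decode is elided.  The feature test lives in the callee,
so neither caller can forget it:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-in for DisasContext. */
typedef struct SketchCtx {
    bool has_neon;
} SketchCtx;

/* Returns nonzero for "UNDEF", following the legacy decoder convention. */
static int sketch_neon_data_insn(const SketchCtx *s, uint32_t insn)
{
    (void)insn;
    if (!s->has_neon) {
        return 1;           /* feature check now lives in the callee */
    }
    return 0;               /* real decode would go here */
}

/* Both entry points inherit the check instead of duplicating it. */
static bool sketch_arm_path_undef(const SketchCtx *s, uint32_t insn)
{
    return sketch_neon_data_insn(s, insn) != 0;
}

static bool sketch_thumb_path_undef(const SketchCtx *s, uint32_t insn)
{
    return sketch_neon_data_insn(s, insn) != 0;
}
```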

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20200430181003.21682-3-peter.maydell@linaro.org
---
 target/arm/translate.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
     TCGv_i32 tmp2;
     TCGv_i64 tmp64;
 
+    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+        return 1;
+    }
+
     /* FIXME: this access check should not take precedence over UNDEF
      * for invalid encodings; we will generate incorrect syndrome information
      * for attempts to execute invalid vfp/neon encodings with FP disabled.
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
     TCGv_ptr ptr1, ptr2, ptr3;
     TCGv_i64 tmp64;
 
+    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+        return 1;
+    }
+
     /* FIXME: this access check should not take precedence over UNDEF
      * for invalid encodings; we will generate incorrect syndrome information
      * for attempts to execute invalid vfp/neon encodings with FP disabled.
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
 
         if (((insn >> 25) & 7) == 1) {
             /* NEON Data processing.  */
-            if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
-                goto illegal_op;
-            }
-
             if (disas_neon_data_insn(s, insn)) {
                 goto illegal_op;
             }
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
         }
         if ((insn & 0x0f100000) == 0x04000000) {
             /* NEON load/store.  */
-            if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
-                goto illegal_op;
-            }
-
             if (disas_neon_ls_insn(s, insn)) {
                 goto illegal_op;
             }
-- 
2.20.1

Add the infrastructure for building and invoking a decodetree decoder
for the AArch32 Neon encodings.  At the moment the new decoder covers
nothing, so we always fall back to the existing hand-written decode.

We follow the same pattern we did for the VFP decodetree conversion
(commit 78e138bc1f672c145ef6ace74617d and following): code that deals
with Neon will be moving gradually out to translate-neon.inc.c,
which we #include into translate.c.

In order to share the decode files between A32 and T32, we
split Neon into 3 parts:
 * data-processing
 * load-store
 * 'shared' encodings

The first two groups of instructions have similar but not identical
A32 and T32 encodings, so we need to manually transform the T32
encoding into the A32 one before calling the decoder; the third group
covers the Neon instructions which are identical in A32 and T32.
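
As an illustration only, the two T32-to-A32 transformations described
above can be written as standalone helpers.  This is a sketch that
mirrors the masks used in disas_thumb2_insn() in this patch; the
helper names (is_t32_neon_dp, t32_to_a32_neon_dp and so on) are
hypothetical and do not exist in the tree, where the transformation is
open-coded at the call site.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: T32 Neon data-processing is 0b111p_1111_... */
static inline bool is_t32_neon_dp(uint32_t insn)
{
    return (insn & 0xef000000u) == 0xef000000u;
}

/*
 * Hypothetical helper: rewrite T32 0b111p_1111_qqqq... as
 * A32 0b1111_001p_qqqq... by moving 'p' from bit 28 to bit 24
 * and forcing the top nibble to 0b1111.
 */
static inline uint32_t t32_to_a32_neon_dp(uint32_t insn)
{
    return (insn & 0xe2ffffffu) |
           ((insn & (1u << 28)) >> 4) |
           (1u << 28);
}

/* Hypothetical helper: T32 Neon load/store is 0b1111_1001_xxx0_... */
static inline bool is_t32_neon_ls(uint32_t insn)
{
    return (insn & 0xff100000u) == 0xf9000000u;
}

/*
 * Hypothetical helper: rewrite T32 0b1111_1001_... as
 * A32 0b1111_0100_..., leaving the low 24 bits untouched.
 */
static inline uint32_t t32_to_a32_neon_ls(uint32_t insn)
{
    return (insn & 0x00ffffffu) | 0xf4000000u;
}
```

Only the top byte changes in either case, which is why a single
decode file per group can serve both instruction sets.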

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200430181003.21682-4-peter.maydell@linaro.org
---
 target/arm/neon-dp.decode       | 29 ++++++++++++++++++++++++++
 target/arm/neon-ls.decode       | 29 ++++++++++++++++++++++++++
 target/arm/neon-shared.decode   | 27 +++++++++++++++++++++++++
 target/arm/translate-neon.inc.c | 32 +++++++++++++++++++++++++++++
 target/arm/translate.c          | 36 +++++++++++++++++++++++++++++++--
 target/arm/Makefile.objs        | 18 +++++++++++++++++
 6 files changed, 169 insertions(+), 2 deletions(-)
 create mode 100644 target/arm/neon-dp.decode
 create mode 100644 target/arm/neon-ls.decode
 create mode 100644 target/arm/neon-shared.decode
 create mode 100644 target/arm/translate-neon.inc.c

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/target/arm/neon-dp.decode
@@ -XXX,XX +XXX,XX @@
+# AArch32 Neon data-processing instruction descriptions
+#
+#  Copyright (c) 2020 Linaro, Ltd
+#
+# This library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2 of the License, or (at your option) any later version.
+#
+# This library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with this library; if not, see <http://www.gnu.org/licenses/>.
+
+#
+# This file is processed by scripts/decodetree.py
+#
+
+# Encodings for Neon data processing instructions where the T32 encoding
+# is a simple transformation of the A32 encoding.
+# More specifically, this file covers instructions where the A32 encoding is
+#   0b1111_001p_qqqq_qqqq_qqqq_qqqq_qqqq_qqqq
+# and the T32 encoding is
+#   0b111p_1111_qqqq_qqqq_qqqq_qqqq_qqqq_qqqq
+# This file works on the A32 encoding only; calling code for T32 has to
+# transform the insn into the A32 version first.
diff --git a/target/arm/neon-ls.decode b/target/arm/neon-ls.decode
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/target/arm/neon-ls.decode
@@ -XXX,XX +XXX,XX @@
+# AArch32 Neon load/store instruction descriptions
+#
+#  Copyright (c) 2020 Linaro, Ltd
+#
+# This library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2 of the License, or (at your option) any later version.
+#
+# This library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with this library; if not, see <http://www.gnu.org/licenses/>.
+
+#
+# This file is processed by scripts/decodetree.py
+#
+
+# Encodings for Neon load/store instructions where the T32 encoding
+# is a simple transformation of the A32 encoding.
+# More specifically, this file covers instructions where the A32 encoding is
+#   0b1111_0100_xxx0_xxxx_xxxx_xxxx_xxxx_xxxx
+# and the T32 encoding is
+#   0b1111_1001_xxx0_xxxx_xxxx_xxxx_xxxx_xxxx
+# This file works on the A32 encoding only; calling code for T32 has to
+# transform the insn into the A32 version first.
diff --git a/target/arm/neon-shared.decode b/target/arm/neon-shared.decode
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/target/arm/neon-shared.decode
@@ -XXX,XX +XXX,XX @@
+# AArch32 Neon instruction descriptions
+#
+#  Copyright (c) 2020 Linaro, Ltd
+#
+# This library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2 of the License, or (at your option) any later version.
+#
+# This library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with this library; if not, see <http://www.gnu.org/licenses/>.
+
+#
+# This file is processed by scripts/decodetree.py
+#
+
+# Encodings for Neon instructions whose encoding is the same for
+# both A32 and T32.
+
+# More specifically, this covers:
+# 2reg scalar ext: 0b1111_1110_xxxx_xxxx_xxxx_1x0x_xxxx_xxxx
+# 3same ext:       0b1111_110x_xxxx_xxxx_xxxx_1x0x_xxxx_xxxx
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@
+/*
+ *  ARM translation: AArch32 Neon instructions
+ *
+ *  Copyright (c) 2003 Fabrice Bellard
+ *  Copyright (c) 2005-2007 CodeSourcery
+ *  Copyright (c) 2007 OpenedHand, Ltd.
+ *  Copyright (c) 2020 Linaro, Ltd.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+/*
+ * This file is intended to be included from translate.c; it uses
+ * some macros and definitions provided by that file.
+ * It might be possible to convert it to a standalone .c file eventually.
+ */
+
+/* Include the generated Neon decoder */
+#include "decode-neon-dp.inc.c"
+#include "decode-neon-ls.inc.c"
+#include "decode-neon-shared.inc.c"
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static TCGv_ptr vfp_reg_ptr(bool dp, int reg)
 
 #define ARM_CP_RW_BIT   (1 << 20)
 
-/* Include the VFP decoder */
+/* Include the VFP and Neon decoders */
 #include "translate-vfp.inc.c"
+#include "translate-neon.inc.c"
 
 static inline void iwmmxt_load_reg(TCGv_i64 var, int reg)
 {
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
         /* Unconditional instructions.  */
         /* TODO: Perhaps merge these into one decodetree output file.  */
         if (disas_a32_uncond(s, insn) ||
-            disas_vfp_uncond(s, insn)) {
+            disas_vfp_uncond(s, insn) ||
+            disas_neon_dp(s, insn) ||
+            disas_neon_ls(s, insn) ||
+            disas_neon_shared(s, insn)) {
             return;
         }
         /* fall back to legacy decoder */
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
         ARCH(6T2);
     }
 
+    if ((insn & 0xef000000) == 0xef000000) {
+        /*
+         * T32 encodings 0b111p_1111_qqqq_qqqq_qqqq_qqqq_qqqq_qqqq
+         * transform into
+         * A32 encodings 0b1111_001p_qqqq_qqqq_qqqq_qqqq_qqqq_qqqq
+         */
+        uint32_t a32_insn = (insn & 0xe2ffffff) |
+            ((insn & (1 << 28)) >> 4) | (1 << 28);
+
+        if (disas_neon_dp(s, a32_insn)) {
+            return;
+        }
+    }
+
+    if ((insn & 0xff100000) == 0xf9000000) {
+        /*
+         * T32 encodings 0b1111_1001_ppp0_qqqq_qqqq_qqqq_qqqq_qqqq
+         * transform into
+         * A32 encodings 0b1111_0100_ppp0_qqqq_qqqq_qqqq_qqqq_qqqq
+         */
+        uint32_t a32_insn = (insn & 0x00ffffff) | 0xf4000000;
+
+        if (disas_neon_ls(s, a32_insn)) {
+            return;
+        }
+    }
+
     /*
      * TODO: Perhaps merge these into one decodetree output file.
      * Note disas_vfp is written for a32 with cond field in the
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
      */
     if (disas_t32(s, insn) ||
         disas_vfp_uncond(s, insn) ||
+        disas_neon_shared(s, insn) ||
         ((insn >> 28) == 0xe && disas_vfp(s, insn))) {
         return;
     }
diff --git a/target/arm/Makefile.objs b/target/arm/Makefile.objs
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/Makefile.objs
+++ b/target/arm/Makefile.objs
@@ -XXX,XX +XXX,XX @@ target/arm/decode-sve.inc.c: $(SRC_PATH)/target/arm/sve.decode $(DECODETREE)
 	  $(PYTHON) $(DECODETREE) --decode disas_sve -o $@ $<,\
 	  "GEN", $(TARGET_DIR)$@)
 
+target/arm/decode-neon-shared.inc.c: $(SRC_PATH)/target/arm/neon-shared.decode $(DECODETREE)
+	$(call quiet-command,\
+	  $(PYTHON) $(DECODETREE) --static-decode disas_neon_shared -o $@ $<,\
+	  "GEN", $(TARGET_DIR)$@)
+
+target/arm/decode-neon-dp.inc.c: $(SRC_PATH)/target/arm/neon-dp.decode $(DECODETREE)
+	$(call quiet-command,\
+	  $(PYTHON) $(DECODETREE) --static-decode disas_neon_dp -o $@ $<,\
+	  "GEN", $(TARGET_DIR)$@)
+
+target/arm/decode-neon-ls.inc.c: $(SRC_PATH)/target/arm/neon-ls.decode $(DECODETREE)
+	$(call quiet-command,\
+	  $(PYTHON) $(DECODETREE) --static-decode disas_neon_ls -o $@ $<,\
+	  "GEN", $(TARGET_DIR)$@)
+
 target/arm/decode-vfp.inc.c: $(SRC_PATH)/target/arm/vfp.decode $(DECODETREE)
 	$(call quiet-command,\
 	  $(PYTHON) $(DECODETREE) --static-decode disas_vfp -o $@ $<,\
@@ -XXX,XX +XXX,XX @@ target/arm/decode-t16.inc.c: $(SRC_PATH)/target/arm/t16.decode $(DECODETREE)
 	  "GEN", $(TARGET_DIR)$@)
 
 target/arm/translate-sve.o: target/arm/decode-sve.inc.c
+target/arm/translate.o: target/arm/decode-neon-shared.inc.c
+target/arm/translate.o: target/arm/decode-neon-dp.inc.c
+target/arm/translate.o: target/arm/decode-neon-ls.inc.c
 target/arm/translate.o: target/arm/decode-vfp.inc.c
 target/arm/translate.o: target/arm/decode-vfp-uncond.inc.c
 target/arm/translate.o: target/arm/decode-a32.inc.c
-- 
2.20.1

Convert the VCMLA (vector) insns in the 3same extension group to
decodetree.
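
For readers unfamiliar with decodetree field definitions, the
following sketch shows what a multi-part field such as
"%vd_dp 22:1 12:4" or "%vd_sp 12:4 22:1" is expected to extract: the
listed bit ranges are concatenated with the first one most
significant.  The function names here are hypothetical and written by
hand for illustration; the real extraction code is generated by
scripts/decodetree.py and may look different.

```c
#include <stdint.h>

/* Extract 'len' bits starting at bit 'pos' (illustrative helper). */
static inline uint32_t insn_bits(uint32_t insn, int pos, int len)
{
    return (insn >> pos) & ((1u << len) - 1);
}

/* %vd_dp 22:1 12:4 : D register number is D:Vd (5 bits) */
static inline uint32_t sketch_vd_dp(uint32_t insn)
{
    return (insn_bits(insn, 22, 1) << 4) | insn_bits(insn, 12, 4);
}

/* %vd_sp 12:4 22:1 : S register number is Vd:D (5 bits) */
static inline uint32_t sketch_vd_sp(uint32_t insn)
{
    return (insn_bits(insn, 12, 4) << 1) | insn_bits(insn, 22, 1);
}
```

The same bits therefore yield a different register number depending
on whether the operand is treated as a D or an S register.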

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200430181003.21682-5-peter.maydell@linaro.org
---
 target/arm/neon-shared.decode   | 11 ++++++++++
 target/arm/translate-neon.inc.c | 37 +++++++++++++++++++++++++++++++++
 target/arm/translate.c          | 11 +---------
 3 files changed, 49 insertions(+), 10 deletions(-)

diff --git a/target/arm/neon-shared.decode b/target/arm/neon-shared.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/neon-shared.decode
+++ b/target/arm/neon-shared.decode
@@ -XXX,XX +XXX,XX @@
 # More specifically, this covers:
 # 2reg scalar ext: 0b1111_1110_xxxx_xxxx_xxxx_1x0x_xxxx_xxxx
 # 3same ext:       0b1111_110x_xxxx_xxxx_xxxx_1x0x_xxxx_xxxx
+
+# VFP/Neon register fields; same as vfp.decode
+%vm_dp  5:1 0:4
+%vm_sp  0:4 5:1
+%vn_dp  7:1 16:4
+%vn_sp  16:4 7:1
+%vd_dp  22:1 12:4
+%vd_sp  12:4 22:1
+
+VCMLA          1111 110 rot:2 . 1 size:1 .... .... 1000 . q:1 . 0 .... \
+               vm=%vm_dp vn=%vn_dp vd=%vd_dp
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@
 #include "decode-neon-dp.inc.c"
 #include "decode-neon-ls.inc.c"
 #include "decode-neon-shared.inc.c"
+
+static bool trans_VCMLA(DisasContext *s, arg_VCMLA *a)
+{
+    int opr_sz;
+    TCGv_ptr fpst;
+    gen_helper_gvec_3_ptr *fn_gvec_ptr;
+
+    if (!dc_isar_feature(aa32_vcma, s)
+        || (!a->size && !dc_isar_feature(aa32_fp16_arith, s))) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vd | a->vn | a->vm) & 0x10)) {
+        return false;
+    }
+
+    if ((a->vn | a->vm | a->vd) & a->q) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    opr_sz = (1 + a->q) * 8;
+    fpst = get_fpstatus_ptr(1);
+    fn_gvec_ptr = a->size ? gen_helper_gvec_fcmlas : gen_helper_gvec_fcmlah;
+    tcg_gen_gvec_3_ptr(vfp_reg_offset(1, a->vd),
+                       vfp_reg_offset(1, a->vn),
+                       vfp_reg_offset(1, a->vm),
+                       fpst, opr_sz, opr_sz, a->rot,
+                       fn_gvec_ptr);
+    tcg_temp_free_ptr(fpst);
+    return true;
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
     bool is_long = false, q = extract32(insn, 6, 1);
     bool ptr_is_env = false;
 
-    if ((insn & 0xfe200f10) == 0xfc200800) {
-        /* VCMLA -- 1111 110R R.1S .... .... 1000 ...0 .... */
-        int size = extract32(insn, 20, 1);
-        data = extract32(insn, 23, 2); /* rot */
-        if (!dc_isar_feature(aa32_vcma, s)
-            || (!size && !dc_isar_feature(aa32_fp16_arith, s))) {
-            return 1;
-        }
-        fn_gvec_ptr = size ? gen_helper_gvec_fcmlas : gen_helper_gvec_fcmlah;
-    } else if ((insn & 0xfea00f10) == 0xfc800800) {
+    if ((insn & 0xfea00f10) == 0xfc800800) {
         /* VCADD -- 1111 110R 1.0S .... .... 1000 ...0 .... */
         int size = extract32(insn, 20, 1);
         data = extract32(insn, 24, 1); /* rot */
-- 
2.20.1

Convert the VCADD (vector) insns to decodetree.
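
The register checks that trans_VCADD() shares with trans_VCMLA() can
be summarised by the sketch below.  It is illustrative only: the
function names and the have_32_dregs parameter are hypothetical
stand-ins (the real code queries dc_isar_feature(aa32_simd_r32, s)
and open-codes the tests in each trans function).

```c
#include <stdbool.h>

/*
 * Hypothetical summary of the operand checks: returns false where
 * the trans function would return false (i.e. UNDEF).
 */
static bool sketch_3same_regs_ok(int vd, int vn, int vm, int q,
                                 bool have_32_dregs)
{
    /* D16-D31 are only valid if the CPU has 32 D registers. */
    if (!have_32_dregs && ((vd | vn | vm) & 0x10)) {
        return false;
    }
    /* Q (128-bit) operations need even-numbered D registers. */
    if ((vn | vm | vd) & q) {
        return false;
    }
    return true;
}

/* Operation size in bytes: 8 for a D register, 16 for a Q register. */
static int sketch_opr_sz(int q)
{
    return (1 + q) * 8;
}
```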

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200430181003.21682-6-peter.maydell@linaro.org
---
 target/arm/neon-shared.decode   |  3 +++
 target/arm/translate-neon.inc.c | 37 +++++++++++++++++++++++++++++++++
 target/arm/translate.c          | 11 +---------
 3 files changed, 41 insertions(+), 10 deletions(-)

diff --git a/target/arm/neon-shared.decode b/target/arm/neon-shared.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/neon-shared.decode
+++ b/target/arm/neon-shared.decode
@@ -XXX,XX +XXX,XX @@
 
 VCMLA          1111 110 rot:2 . 1 size:1 .... .... 1000 . q:1 . 0 .... \
                vm=%vm_dp vn=%vn_dp vd=%vd_dp
+
+VCADD          1111 110 rot:1 1 . 0 size:1 .... .... 1000 . q:1 . 0 .... \
+               vm=%vm_dp vn=%vn_dp vd=%vd_dp
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VCMLA(DisasContext *s, arg_VCMLA *a)
     tcg_temp_free_ptr(fpst);
     return true;
 }
+
+static bool trans_VCADD(DisasContext *s, arg_VCADD *a)
+{
+    int opr_sz;
+    TCGv_ptr fpst;
+    gen_helper_gvec_3_ptr *fn_gvec_ptr;
+
+    if (!dc_isar_feature(aa32_vcma, s)
+        || (!a->size && !dc_isar_feature(aa32_fp16_arith, s))) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vd | a->vn | a->vm) & 0x10)) {
+        return false;
+    }
+
+    if ((a->vn | a->vm | a->vd) & a->q) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    opr_sz = (1 + a->q) * 8;
+    fpst = get_fpstatus_ptr(1);
+    fn_gvec_ptr = a->size ? gen_helper_gvec_fcadds : gen_helper_gvec_fcaddh;
+    tcg_gen_gvec_3_ptr(vfp_reg_offset(1, a->vd),
+                       vfp_reg_offset(1, a->vn),
+                       vfp_reg_offset(1, a->vm),
+                       fpst, opr_sz, opr_sz, a->rot,
+                       fn_gvec_ptr);
+    tcg_temp_free_ptr(fpst);
+    return true;
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
     bool is_long = false, q = extract32(insn, 6, 1);
     bool ptr_is_env = false;
 
-    if ((insn & 0xfea00f10) == 0xfc800800) {
-        /* VCADD -- 1111 110R 1.0S .... .... 1000 ...0 .... */
-        int size = extract32(insn, 20, 1);
-        data = extract32(insn, 24, 1); /* rot */
-        if (!dc_isar_feature(aa32_vcma, s)
-            || (!size && !dc_isar_feature(aa32_fp16_arith, s))) {
-            return 1;
-        }
-        fn_gvec_ptr = size ? gen_helper_gvec_fcadds : gen_helper_gvec_fcaddh;
-    } else if ((insn & 0xfeb00f00) == 0xfc200d00) {
+    if ((insn & 0xfeb00f00) == 0xfc200d00) {
         /* V[US]DOT -- 1111 1100 0.10 .... .... 1101 .Q.U .... */
         bool u = extract32(insn, 4, 1);
         if (!dc_isar_feature(aa32_dp, s)) {
-- 
2.20.1

Convert the V[US]DOT (vector) insns to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200430181003.21682-7-peter.maydell@linaro.org
---
 target/arm/neon-shared.decode   |  4 ++++
 target/arm/translate-neon.inc.c | 32 ++++++++++++++++++++++++++++++++
 target/arm/translate.c          |  9 +--------
 3 files changed, 37 insertions(+), 8 deletions(-)

Convert the VFM[AS]L (vector) insns to decodetree.  This is the last
insn in the legacy decoder for the 3same_ext group, so we can
delete the legacy decoder function for the group entirely.

Note that in disas_thumb2_insn() the parts of this encoding space
where the decodetree decoder returns false will correctly be directed
to illegal_op by the "(insn & (1 << 28))" check so they won't fall
into disas_coproc_insn() by mistake.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200430181003.21682-8-peter.maydell@linaro.org
---
 target/arm/neon-shared.decode   |  6 +++
 target/arm/translate-neon.inc.c | 31 +++++++++++
 target/arm/translate.c          | 92 +--------------------------------
 3 files changed, 38 insertions(+), 91 deletions(-)

diff --git a/target/arm/neon-shared.decode b/target/arm/neon-shared.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/neon-shared.decode
+++ b/target/arm/neon-shared.decode
@@ -XXX,XX +XXX,XX @@ VCADD          1111 110 rot:1 1 . 0 size:1 .... .... 1000 . q:1 . 0 .... \
 # VUDOT and VSDOT
 VDOT           1111 110 00 . 10 .... .... 1101 . q:1 . u:1 .... \
                vm=%vm_dp vn=%vn_dp vd=%vd_dp
+
+# VFM[AS]L
+VFML           1111 110 0 s:1 . 10 .... .... 1000 . 0 . 1 .... \
+               vm=%vm_sp vn=%vn_sp vd=%vd_dp q=0
+VFML           1111 110 0 s:1 . 10 .... .... 1000 . 1 . 1 .... \
+               vm=%vm_dp vn=%vn_dp vd=%vd_dp q=1
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VDOT(DisasContext *s, arg_VDOT *a)
                        opr_sz, opr_sz, 0, fn_gvec);
     return true;
 }
+
+static bool trans_VFML(DisasContext *s, arg_VFML *a)
+{
+    int opr_sz;
+
+    if (!dc_isar_feature(aa32_fhm, s)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+        (a->vd & 0x10)) {
+        return false;
+    }
+
+    if (a->vd & a->q) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    opr_sz = (1 + a->q) * 8;
+    tcg_gen_gvec_3_ptr(vfp_reg_offset(1, a->vd),
+                       vfp_reg_offset(a->q, a->vn),
+                       vfp_reg_offset(a->q, a->vm),
+                       cpu_env, opr_sz, opr_sz, a->s, /* is_2 == 0 */
+                       gen_helper_gvec_fmlal_a32);
+    return true;
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
     return 0;
 }
 
-/* Advanced SIMD three registers of the same length extension.
- *  31           25    23  22    20   16   12  11   10   9    8        3     0
- * +---------------+-----+---+-----+----+----+---+----+---+----+---------+----+
- * | 1 1 1 1 1 1 0 | op1 | D | op2 | Vn | Vd | 1 | o3 | 0 | o4 | N Q M U | Vm |
- * +---------------+-----+---+-----+----+----+---+----+---+----+---------+----+
- */
-static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
-{
-    gen_helper_gvec_3 *fn_gvec = NULL;
-    gen_helper_gvec_3_ptr *fn_gvec_ptr = NULL;
-    int rd, rn, rm, opr_sz;
-    int data = 0;
-    int off_rn, off_rm;
-    bool is_long = false, q = extract32(insn, 6, 1);
-    bool ptr_is_env = false;
-
-    if ((insn & 0xff300f10) == 0xfc200810) {
-        /* VFM[AS]L -- 1111 1100 S.10 .... .... 1000 .Q.1 .... */
-        int is_s = extract32(insn, 23, 1);
-        if (!dc_isar_feature(aa32_fhm, s)) {
-            return 1;
-        }
-        is_long = true;
-        data = is_s; /* is_2 == 0 */
-        fn_gvec_ptr = gen_helper_gvec_fmlal_a32;
-        ptr_is_env = true;
-    } else {
-        return 1;
-    }
-
-    VFP_DREG_D(rd, insn);
-    if (rd & q) {
-        return 1;
-    }
-    if (q || !is_long) {
-        VFP_DREG_N(rn, insn);
-        VFP_DREG_M(rm, insn);
-        if ((rn | rm) & q & !is_long) {
-            return 1;
-        }
-        off_rn = vfp_reg_offset(1, rn);
-        off_rm = vfp_reg_offset(1, rm);
-    } else {
-        rn = VFP_SREG_N(insn);
-        rm = VFP_SREG_M(insn);
-        off_rn = vfp_reg_offset(0, rn);
-        off_rm = vfp_reg_offset(0, rm);
-    }
-
-    if (s->fp_excp_el) {
-        gen_exception_insn(s, s->pc_curr, EXCP_UDEF,
-                           syn_simd_access_trap(1, 0xe, false), s->fp_excp_el);
-        return 0;
-    }
-    if (!s->vfp_enabled) {
-        return 1;
-    }
-
-    opr_sz = (1 + q) * 8;
-    if (fn_gvec_ptr) {
-        TCGv_ptr ptr;
-        if (ptr_is_env) {
-            ptr = cpu_env;
-        } else {
-            ptr = get_fpstatus_ptr(1);
-        }
-        tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd), off_rn, off_rm, ptr,
-                           opr_sz, opr_sz, data, fn_gvec_ptr);
-        if (!ptr_is_env) {
-            tcg_temp_free_ptr(ptr);
-        }
-    } else {
-        tcg_gen_gvec_3_ool(vfp_reg_offset(1, rd), off_rn, off_rm,
-                           opr_sz, opr_sz, data, fn_gvec);
-    }
-    return 0;
-}
-
 /* Advanced SIMD two registers and a scalar extension.
  *  31             24   23  22   20   16   12  11   10   9    8        3     0
  * +-----------------+----+---+----+----+----+---+----+---+----+---------+----+
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
                     }
                 }
             }
-        } else if ((insn & 0x0e000a00) == 0x0c000800
-                   && arm_dc_feature(s, ARM_FEATURE_V8)) {
-            if (disas_neon_insn_3same_ext(s, insn)) {
-                goto illegal_op;
-            }
-            return;
         } else if ((insn & 0x0f000a00) == 0x0e000800
                    && arm_dc_feature(s, ARM_FEATURE_V8)) {
             if (disas_neon_insn_2reg_scalar_ext(s, insn)) {
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
             }
             break;
         }
-        if ((insn & 0xfe000a00) == 0xfc000800
+        if ((insn & 0xff000a00) == 0xfe000800
             && arm_dc_feature(s, ARM_FEATURE_V8)) {
             /* The Thumb2 and ARM encodings are identical.  */
-            if (disas_neon_insn_3same_ext(s, insn)) {
-                goto illegal_op;
-            }
-        } else if ((insn & 0xff000a00) == 0xfe000800
-                   && arm_dc_feature(s, ARM_FEATURE_V8)) {
-            /* The Thumb2 and ARM encodings are identical.  */
             if (disas_neon_insn_2reg_scalar_ext(s, insn)) {
                 goto illegal_op;
             }
-- 
2.20.1

Convert VCMLA (scalar) in the 2reg-scalar-ext group to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200430181003.21682-9-peter.maydell@linaro.org
---
 target/arm/neon-shared.decode   |  5 +++++
 target/arm/translate-neon.inc.c | 40 +++++++++++++++++++++++++++++++++
 target/arm/translate.c          | 26 +--------------------
 3 files changed, 46 insertions(+), 25 deletions(-)

diff --git a/target/arm/neon-shared.decode b/target/arm/neon-shared.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/neon-shared.decode
+++ b/target/arm/neon-shared.decode
@@ -XXX,XX +XXX,XX @@ VFML           1111 110 0 s:1 . 10 .... .... 1000 . 0 . 1 .... \
                vm=%vm_sp vn=%vn_sp vd=%vd_dp q=0
 VFML           1111 110 0 s:1 . 10 .... .... 1000 . 1 . 1 .... \
                vm=%vm_dp vn=%vn_dp vd=%vd_dp q=1
+
+VCMLA_scalar   1111 1110 0 . rot:2 .... .... 1000 . q:1 index:1 0 vm:4 \
+               vn=%vn_dp vd=%vd_dp size=0
+VCMLA_scalar   1111 1110 1 . rot:2 .... .... 1000 . q:1 . 0 .... \
+               vm=%vm_dp vn=%vn_dp vd=%vd_dp size=1 index=0
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VFML(DisasContext *s, arg_VFML *a)
                        gen_helper_gvec_fmlal_a32);
     return true;
 }
+
+static bool trans_VCMLA_scalar(DisasContext *s, arg_VCMLA_scalar *a)
+{
+    gen_helper_gvec_3_ptr *fn_gvec_ptr;
+    int opr_sz;
+    TCGv_ptr fpst;
+
+    if (!dc_isar_feature(aa32_vcma, s)) {
+        return false;
+    }
+    if (a->size == 0 && !dc_isar_feature(aa32_fp16_arith, s)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vd | a->vn | a->vm) & 0x10)) {
+        return false;
+    }
+
+    if ((a->vd | a->vn) & a->q) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    fn_gvec_ptr = (a->size ? gen_helper_gvec_fcmlas_idx
+                   : gen_helper_gvec_fcmlah_idx);
+    opr_sz = (1 + a->q) * 8;
+    fpst = get_fpstatus_ptr(1);
+    tcg_gen_gvec_3_ptr(vfp_reg_offset(1, a->vd),
+                       vfp_reg_offset(1, a->vn),
+                       vfp_reg_offset(1, a->vm),
+                       fpst, opr_sz, opr_sz,
+                       (a->index << 2) | a->rot, fn_gvec_ptr);
+    tcg_temp_free_ptr(fpst);
+    return true;
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
     bool is_long = false, q = extract32(insn, 6, 1);
     bool ptr_is_env = false;
 
-    if ((insn & 0xff000f10) == 0xfe000800) {
-        /* VCMLA (indexed) -- 1111 1110 S.RR .... .... 1000 ...0 .... */
-        int rot = extract32(insn, 20, 2);
-        int size = extract32(insn, 23, 1);
-        int index;
-
-        if (!dc_isar_feature(aa32_vcma, s)) {
-            return 1;
-        }
-        if (size == 0) {
-            if (!dc_isar_feature(aa32_fp16_arith, s)) {
-                return 1;
-            }
-            /* For fp16, rm is just Vm, and index is M.  */
-            rm = extract32(insn, 0, 4);
-            index = extract32(insn, 5, 1);
-        } else {
-            /* For fp32, rm is the usual M:Vm, and index is 0.  */
-            VFP_DREG_M(rm, insn);
-            index = 0;
-        }
-        data = (index << 2) | rot;
-        fn_gvec_ptr = (size ? gen_helper_gvec_fcmlas_idx
-                       : gen_helper_gvec_fcmlah_idx);
-    } else if ((insn & 0xffb00f00) == 0xfe200d00) {
+    if ((insn & 0xffb00f00) == 0xfe200d00) {
         /* V[US]DOT -- 1111 1110 0.10 .... .... 1101 .Q.U .... */
         int u = extract32(insn, 4, 1);
 
-- 
2.20.1

Convert the V[US]DOT (scalar) insns in the 2reg-scalar-ext group
to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200430181003.21682-10-peter.maydell@linaro.org
---
 target/arm/neon-shared.decode   |  3 +++
 target/arm/translate-neon.inc.c | 35 +++++++++++++++++++++++++++++++++
 target/arm/translate.c          | 13 +-----------
 3 files changed, 39 insertions(+), 12 deletions(-)

Convert the VFM[AS]L (scalar) insns in the 2reg-scalar-ext group
to decodetree. These are the last ones in the group so we can remove
all the legacy decode for the group.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200430181003.21682-11-peter.maydell@linaro.org
---
 target/arm/neon-shared.decode   |   7 +++
 target/arm/translate-neon.inc.c |  32 ++++++++++
 target/arm/translate.c          | 107 +-------------------------------
 3 files changed, 40 insertions(+), 106 deletions(-)

diff --git a/target/arm/neon-shared.decode b/target/arm/neon-shared.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/neon-shared.decode
+++ b/target/arm/neon-shared.decode
@@ -XXX,XX +XXX,XX @@ VCMLA_scalar   1111 1110 1 . rot:2 .... .... 1000 . q:1 . 0 .... \
 
 VDOT_scalar    1111 1110 0 . 10 .... .... 1101 . q:1 index:1 u:1 rm:4 \
                vm=%vm_dp vn=%vn_dp vd=%vd_dp
+
+%vfml_scalar_q0_rm 0:3 5:1
+%vfml_scalar_q1_index 5:1 3:1
+VFML_scalar    1111 1110 0 . 0 s:1 .... .... 1000 . 0 . 1 index:1 ... \
+               rm=%vfml_scalar_q0_rm vn=%vn_sp vd=%vd_dp q=0
+VFML_scalar    1111 1110 0 . 0 s:1 .... .... 1000 . 1 . 1 . rm:3 \
+               index=%vfml_scalar_q1_index vn=%vn_dp vd=%vd_dp q=1
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VDOT_scalar(DisasContext *s, arg_VDOT_scalar *a)
     tcg_temp_free_ptr(fpst);
     return true;
 }
+
+static bool trans_VFML_scalar(DisasContext *s, arg_VFML_scalar *a)
+{
+    int opr_sz;
+
+    if (!dc_isar_feature(aa32_fhm, s)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vd & 0x10) || (a->q && (a->vn & 0x10)))) {
+        return false;
+    }
+
+    if (a->vd & a->q) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    opr_sz = (1 + a->q) * 8;
+    tcg_gen_gvec_3_ptr(vfp_reg_offset(1, a->vd),
+                       vfp_reg_offset(a->q, a->vn),
+                       vfp_reg_offset(a->q, a->rm),
+                       cpu_env, opr_sz, opr_sz,
+                       (a->index << 2) | a->s, /* is_2 == 0 */
+                       gen_helper_gvec_fmlal_idx_a32);
+    return true;
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_dsp_insn(DisasContext *s, uint32_t insn)
 }
 
 #define VFP_REG_SHR(x, n) (((n) > 0) ? (x) >> (n) : (x) << -(n))
-#define VFP_SREG(insn, bigbit, smallbit) \
-  ((VFP_REG_SHR(insn, bigbit - 1) & 0x1e) | (((insn) >> (smallbit)) & 1))
 #define VFP_DREG(reg, insn, bigbit, smallbit) do { \
     if (dc_isar_feature(aa32_simd_r32, s)) { \
         reg = (((insn) >> (bigbit)) & 0x0f) \
@@ -XXX,XX +XXX,XX @@ static int disas_dsp_insn(DisasContext *s, uint32_t insn)
         reg = ((insn) >> (bigbit)) & 0x0f; \
     }} while (0)
 
-#define VFP_SREG_D(insn) VFP_SREG(insn, 12, 22)
 #define VFP_DREG_D(reg, insn) VFP_DREG(reg, insn, 12, 22)
-#define VFP_SREG_N(insn) VFP_SREG(insn, 16,  7)
 #define VFP_DREG_N(reg, insn) VFP_DREG(reg, insn, 16,  7)
-#define VFP_SREG_M(insn) VFP_SREG(insn,  0,  5)
 #define VFP_DREG_M(reg, insn) VFP_DREG(reg, insn,  0,  5)
 
 static void gen_neon_dup_low16(TCGv_i32 var)
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
     return 0;
 }
 
-/* Advanced SIMD two registers and a scalar extension.
- *  31             24   23  22   20   16   12  11   10   9    8        3     0
- * +-----------------+----+---+----+----+----+---+----+---+----+---------+----+
- * | 1 1 1 1 1 1 1 0 | o1 | D | o2 | Vn | Vd | 1 | o3 | 0 | o4 | N Q M U | Vm |
- * +-----------------+----+---+----+----+----+---+----+---+----+---------+----+
- *
- */
-
-static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
-{
-    gen_helper_gvec_3 *fn_gvec = NULL;
-    gen_helper_gvec_3_ptr *fn_gvec_ptr = NULL;
-    int rd, rn, rm, opr_sz, data;
-    int off_rn, off_rm;
-    bool is_long = false, q = extract32(insn, 6, 1);
-    bool ptr_is_env = false;
-
-    if ((insn & 0xffa00f10) == 0xfe000810) {
-        /* VFM[AS]L -- 1111 1110 0.0S .... .... 1000 .Q.1 .... */
-        int is_s = extract32(insn, 20, 1);
-        int vm20 = extract32(insn, 0, 3);
-        int vm3 = extract32(insn, 3, 1);
-        int m = extract32(insn, 5, 1);
-        int index;
-
-        if (!dc_isar_feature(aa32_fhm, s)) {
-            return 1;
-        }
-        if (q) {
-            rm = vm20;
-            index = m * 2 + vm3;
-        } else {
-            rm = vm20 * 2 + m;
-            index = vm3;
-        }
-        is_long = true;
-        data = (index << 2) | is_s; /* is_2 == 0 */
-        fn_gvec_ptr = gen_helper_gvec_fmlal_idx_a32;
-        ptr_is_env = true;
-    } else {
-        return 1;
-    }
-
-    VFP_DREG_D(rd, insn);
-    if (rd & q) {
-        return 1;
-    }
-    if (q || !is_long) {
-        VFP_DREG_N(rn, insn);
-        if (rn & q & !is_long) {
-            return 1;
-        }
-        off_rn = vfp_reg_offset(1, rn);
-        off_rm = vfp_reg_offset(1, rm);
-    } else {
-        rn = VFP_SREG_N(insn);
-        off_rn = vfp_reg_offset(0, rn);
-        off_rm = vfp_reg_offset(0, rm);
-    }
-    if (s->fp_excp_el) {
-        gen_exception_insn(s, s->pc_curr, EXCP_UDEF,
-                           syn_simd_access_trap(1, 0xe, false), s->fp_excp_el);
-        return 0;
-    }
-    if (!s->vfp_enabled) {
-        return 1;
-    }
-
-    opr_sz = (1 + q) * 8;
-    if (fn_gvec_ptr) {
-        TCGv_ptr ptr;
-        if (ptr_is_env) {
-            ptr = cpu_env;
-        } else {
-            ptr = get_fpstatus_ptr(1);
-        }
-        tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd), off_rn, off_rm, ptr,
-                           opr_sz, opr_sz, data, fn_gvec_ptr);
-        if (!ptr_is_env) {
-            tcg_temp_free_ptr(ptr);
-        }
-    } else {
-        tcg_gen_gvec_3_ool(vfp_reg_offset(1, rd), off_rn, off_rm,
-                           opr_sz, opr_sz, data, fn_gvec);
-    }
-    return 0;
-}
-
 static int disas_coproc_insn(DisasContext *s, uint32_t insn)
 {
     int cpnum, is64, crn, crm, opc1, opc2, isread, rt, rt2;
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
                     }
                 }
             }
-        } else if ((insn & 0x0f000a00) == 0x0e000800
-                   && arm_dc_feature(s, ARM_FEATURE_V8)) {
-            if (disas_neon_insn_2reg_scalar_ext(s, insn)) {
-                goto illegal_op;
-            }
-            return;
         }
         goto illegal_op;
     }
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
             }
             break;
         }
-        if ((insn & 0xff000a00) == 0xfe000800
-            && arm_dc_feature(s, ARM_FEATURE_V8)) {
-            /* The Thumb2 and ARM encodings are identical.  */
-            if (disas_neon_insn_2reg_scalar_ext(s, insn)) {
-                goto illegal_op;
-            }
-        } else if (((insn >> 24) & 3) == 3) {
+        if (((insn >> 24) & 3) == 3) {
             /* Translate into the equivalent ARM encoding.  */
             insn = (insn & 0xe2ffffff) | ((insn & (1 << 28)) >> 4) | (1 << 28);
             if (disas_neon_data_insn(s, insn)) {
-- 
2.20.1

Convert the Neon "load/store multiple structures" insns to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200430181003.21682-12-peter.maydell@linaro.org
---
 target/arm/neon-ls.decode       |   7 ++
 target/arm/translate-neon.inc.c | 124 ++++++++++++++++++++++++++++++++
 target/arm/translate.c          |  91 +----------------------
 3 files changed, 133 insertions(+), 89 deletions(-)
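
Reviewer note: the UNDEF checks and the writeback amount in
trans_VLDST_multiple() depend only on itype, size and align, so they can be
exercised in isolation. The sketch below is a standalone approximation with
hypothetical function names; it reuses the table values from the patch and
leaves out the feature and register-bank checks.

```c
#include <assert.h>
#include <stdbool.h>

/* Same values as neon_ls_element_type[] in the patch. */
static const struct {
    int nregs;
    int interleave;
    int spacing;
} ls_element_type[11] = {
    {1, 4, 1}, {1, 4, 2}, {4, 1, 1}, {2, 2, 2}, {1, 3, 1}, {1, 3, 2},
    {3, 1, 1}, {1, 1, 1}, {1, 2, 1}, {1, 2, 2}, {2, 1, 1}
};

/* Hypothetical helper: true if the encoding passes the UNDEF checks. */
static bool vldst_multiple_encoding_ok(int itype, int size, int align)
{
    assert(itype >= 0 && itype <= 15);
    assert(size >= 0 && size <= 3 && align >= 0 && align <= 3);

    if (itype > 10) {
        return false;
    }
    /* Bad values of the align field */
    switch (itype & 0xc) {
    case 4:
        if (align >= 2) {
            return false;
        }
        break;
    case 8:
        if (align == 3) {
            return false;
        }
        break;
    default:
        break;
    }
    /* 64-bit elements are only valid without interleaving or spacing */
    if (size == 3 &&
        (ls_element_type[itype].interleave |
         ls_element_type[itype].spacing) != 1) {
        return false;
    }
    return true;
}

/* Hypothetical helper: immediate added to the base register when rm == 13. */
static int vldst_multiple_stride(int itype)
{
    assert(itype >= 0 && itype <= 10);
    return ls_element_type[itype].nregs *
           ls_element_type[itype].interleave * 8;
}
```

With these definitions itype 7 moves 8 bytes and itype 2 moves 32 bytes,
matching the "nregs * interleave * 8" passed to gen_neon_ldst_base_update().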

diff --git a/target/arm/neon-ls.decode b/target/arm/neon-ls.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/neon-ls.decode
+++ b/target/arm/neon-ls.decode
@@ -XXX,XX +XXX,XX @@
 #   0b1111_1001_xxx0_xxxx_xxxx_xxxx_xxxx_xxxx
 # This file works on the A32 encoding only; calling code for T32 has to
 # transform the insn into the A32 version first.
+
+%vd_dp  22:1 12:4
+
+# Neon load/store multiple structures
+
+VLDST_multiple 1111 0100 0 . l:1 0 rn:4 .... itype:4 size:2 align:2 rm:4 \
+               vd=%vd_dp
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VFML_scalar(DisasContext *s, arg_VFML_scalar *a)
                        gen_helper_gvec_fmlal_idx_a32);
     return true;
 }
+
+static struct {
+    int nregs;
+    int interleave;
+    int spacing;
+} const neon_ls_element_type[11] = {
+    {1, 4, 1},
+    {1, 4, 2},
+    {4, 1, 1},
+    {2, 2, 2},
+    {1, 3, 1},
+    {1, 3, 2},
+    {3, 1, 1},
+    {1, 1, 1},
+    {1, 2, 1},
+    {1, 2, 2},
+    {2, 1, 1}
+};
+
+static void gen_neon_ldst_base_update(DisasContext *s, int rm, int rn,
+                                      int stride)
+{
+    if (rm != 15) {
+        TCGv_i32 base;
+
+        base = load_reg(s, rn);
+        if (rm == 13) {
+            tcg_gen_addi_i32(base, base, stride);
+        } else {
+            TCGv_i32 index;
+            index = load_reg(s, rm);
+            tcg_gen_add_i32(base, base, index);
+            tcg_temp_free_i32(index);
+        }
+        store_reg(s, rn, base);
+    }
+}
+
+static bool trans_VLDST_multiple(DisasContext *s, arg_VLDST_multiple *a)
+{
+    /* Neon load/store multiple structures */
+    int nregs, interleave, spacing, reg, n;
+    MemOp endian = s->be_data;
+    int mmu_idx = get_mem_index(s);
+    int size = a->size;
+    TCGv_i64 tmp64;
+    TCGv_i32 addr, tmp;
+
+    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist */
+    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vd & 0x10)) {
+        return false;
+    }
+    if (a->itype > 10) {
+        return false;
+    }
+    /* Catch UNDEF cases for bad values of align field */
+    switch (a->itype & 0xc) {
+    case 4:
+        if (a->align >= 2) {
+            return false;
+        }
+        break;
+    case 8:
+        if (a->align == 3) {
+            return false;
+        }
+        break;
+    default:
+        break;
+    }
+    nregs = neon_ls_element_type[a->itype].nregs;
+    interleave = neon_ls_element_type[a->itype].interleave;
+    spacing = neon_ls_element_type[a->itype].spacing;
+    if (size == 3 && (interleave | spacing) != 1) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    /* For our purposes, bytes are always little-endian.  */
+    if (size == 0) {
+        endian = MO_LE;
+    }
+    /*
+     * Consecutive little-endian elements from a single register
+     * can be promoted to a larger little-endian operation.
+     */
+    if (interleave == 1 && endian == MO_LE) {
+        size = 3;
+    }
+    tmp64 = tcg_temp_new_i64();
+    addr = tcg_temp_new_i32();
+    tmp = tcg_const_i32(1 << size);
+    load_reg_var(s, addr, a->rn);
+    for (reg = 0; reg < nregs; reg++) {
+        for (n = 0; n < 8 >> size; n++) {
+            int xs;
+            for (xs = 0; xs < interleave; xs++) {
+                int tt = a->vd + reg + spacing * xs;
+
+                if (a->l) {
+                    gen_aa32_ld_i64(s, tmp64, addr, mmu_idx, endian | size);
+                    neon_store_element64(tt, n, size, tmp64);
+                } else {
+                    neon_load_element64(tmp64, tt, n, size);
+                    gen_aa32_st_i64(s, tmp64, addr, mmu_idx, endian | size);
+                }
+                tcg_gen_add_i32(addr, addr, tmp);
+            }
+        }
+    }
+    tcg_temp_free_i32(addr);
+    tcg_temp_free_i32(tmp);
+    tcg_temp_free_i64(tmp64);
+
+    gen_neon_ldst_base_update(s, a->rm, a->rn, nregs * interleave * 8);
+    return true;
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_neon_trn_u16(TCGv_i32 t0, TCGv_i32 t1)
 }
 
 
-static struct {
-    int nregs;
-    int interleave;
-    int spacing;
-} const neon_ls_element_type[11] = {
-    {1, 4, 1},
-    {1, 4, 2},
-    {4, 1, 1},
-    {2, 2, 2},
-    {1, 3, 1},
-    {1, 3, 2},
-    {3, 1, 1},
-    {1, 1, 1},
-    {1, 2, 1},
-    {1, 2, 2},
-    {2, 1, 1}
-};
-
 /* Translate a NEON load/store element instruction.  Return nonzero if the
    instruction is invalid.  */
 static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
 {
     int rd, rn, rm;
-    int op;
     int nregs;
-    int interleave;
-    int spacing;
     int stride;
     int size;
     int reg;
     int load;
-    int n;
     int vec_size;
-    int mmu_idx;
-    MemOp endian;
     TCGv_i32 addr;
     TCGv_i32 tmp;
-    TCGv_i32 tmp2;
-    TCGv_i64 tmp64;
 
     if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
         return 1;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
     rn = (insn >> 16) & 0xf;
     rm = insn & 0xf;
     load = (insn & (1 << 21)) != 0;
-    endian = s->be_data;
-    mmu_idx = get_mem_index(s);
     if ((insn & (1 << 23)) == 0) {
-        /* Load store all elements.  */
-        op = (insn >> 8) & 0xf;
-        size = (insn >> 6) & 3;
-        if (op > 10)
-            return 1;
-        /* Catch UNDEF cases for bad values of align field */
-        switch (op & 0xc) {
-        case 4:
-            if (((insn >> 5) & 1) == 1) {
-                return 1;
-            }
-            break;
-        case 8:
-            if (((insn >> 4) & 3) == 3) {
-                return 1;
-            }
-            break;
-        default:
-            break;
-        }
-        nregs = neon_ls_element_type[op].nregs;
-        interleave = neon_ls_element_type[op].interleave;
-        spacing = neon_ls_element_type[op].spacing;
-        if (size == 3 && (interleave | spacing) != 1) {
-            return 1;
-        }
-        /* For our purposes, bytes are always little-endian.  */
-        if (size == 0) {
-            endian = MO_LE;
-        }
-        /* Consecutive little-endian elements from a single register
-         * can be promoted to a larger little-endian operation.
-         */
-        if (interleave == 1 && endian == MO_LE) {
-            size = 3;
-        }
-        tmp64 = tcg_temp_new_i64();
-        addr = tcg_temp_new_i32();
-        tmp2 = tcg_const_i32(1 << size);
-        load_reg_var(s, addr, rn);
-        for (reg = 0; reg < nregs; reg++) {
-            for (n = 0; n < 8 >> size; n++) {
-                int xs;
-                for (xs = 0; xs < interleave; xs++) {
-                    int tt = rd + reg + spacing * xs;
-
-                    if (load) {
-                        gen_aa32_ld_i64(s, tmp64, addr, mmu_idx, endian | size);
-                        neon_store_element64(tt, n, size, tmp64);
-                    } else {
-                        neon_load_element64(tmp64, tt, n, size);
-                        gen_aa32_st_i64(s, tmp64, addr, mmu_idx, endian | size);
-                    }
-                    tcg_gen_add_i32(addr, addr, tmp2);
-                }
-            }
-        }
-        tcg_temp_free_i32(addr);
-        tcg_temp_free_i32(tmp2);
-        tcg_temp_free_i64(tmp64);
-        stride = nregs * interleave * 8;
+        /* Load store all elements -- handled already by decodetree */
+        return 1;
     } else {
         size = (insn >> 10) & 3;
         if (size == 3) {
-- 
2.20.1

Convert the Neon "load single structure to all lanes" insns to
decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200430181003.21682-13-peter.maydell@linaro.org
---
 target/arm/neon-ls.decode       |  5 +++
 target/arm/translate-neon.inc.c | 73 +++++++++++++++++++++++++++++++++
 target/arm/translate.c          | 55 +------------------------
 3 files changed, 80 insertions(+), 53 deletions(-)

Convert the Neon "load/store single structure to one lane" insns to
decodetree.

As this is the last set of insns in the neon load/store group,
we can remove the whole disas_neon_ls_insn() function.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200430181003.21682-14-peter.maydell@linaro.org
---
 target/arm/neon-ls.decode       |  11 +++
 target/arm/translate-neon.inc.c |  89 +++++++++++++++++++
 target/arm/translate.c          | 147 --------------------------------
 3 files changed, 100 insertions(+), 147 deletions(-)
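
Reviewer note: the three VLDST_single patterns differ only in how bits [7:4]
are divided between reg_idx, align and the stride bit. The sketch below
shows that split as plain C; it is not QEMU code, the names are
hypothetical, and it assumes the caller has already matched the fixed bits
of the pattern.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical holder for the per-size fields; not a QEMU type. */
struct vldst_single_fields {
    int size;
    int reg_idx;
    int align;
    int stride;
};

static struct vldst_single_fields vldst_single_decode(uint32_t insn)
{
    struct vldst_single_fields f;

    f.size = (insn >> 10) & 3;
    /* size == 3 is the "all lanes" form, matched by VLD_all_lanes instead. */
    assert(f.size != 3);

    switch (f.size) {
    case 0:
        f.reg_idx = (insn >> 5) & 7;   /* reg_idx:3 */
        f.align = (insn >> 4) & 1;     /* align:1 */
        f.stride = 1;
        break;
    case 1:
        f.reg_idx = (insn >> 6) & 3;   /* reg_idx:2 */
        f.align = (insn >> 4) & 3;     /* align:2 */
        f.stride = ((insn >> 5) & 1) + 1;   /* %imm1_5_p1 */
        break;
    default:
        f.reg_idx = (insn >> 7) & 1;   /* reg_idx:1 */
        f.align = (insn >> 4) & 7;     /* align:3 */
        f.stride = ((insn >> 6) & 1) + 1;   /* %imm1_6_p1 */
        break;
    }
    return f;
}

/* Hypothetical helper: immediate added to the base register when rm == 13. */
static int vldst_single_writeback(int size, int nregs)
{
    return (1 << size) * nregs;
}
```

Note that for size 1 and size 2 the stride bit also lies inside the align
field, which is why the UNDEF checks can test a->align where the old code
tested the raw idx bits.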

diff --git a/target/arm/neon-ls.decode b/target/arm/neon-ls.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/neon-ls.decode
+++ b/target/arm/neon-ls.decode
@@ -XXX,XX +XXX,XX @@ VLDST_multiple 1111 0100 0 . l:1 0 rn:4 .... itype:4 size:2 align:2 rm:4 \
 
 VLD_all_lanes  1111 0100 1 . 1 0 rn:4 .... 11 n:2 size:2 t:1 a:1 rm:4 \
                vd=%vd_dp
+
+# Neon load/store single structure to one lane
+%imm1_5_p1 5:1 !function=plus1
+%imm1_6_p1 6:1 !function=plus1
+
+VLDST_single   1111 0100 1 . l:1 0 rn:4 .... 00 n:2 reg_idx:3 align:1 rm:4 \
+               vd=%vd_dp size=0 stride=1
+VLDST_single   1111 0100 1 . l:1 0 rn:4 .... 01 n:2 reg_idx:2 align:2 rm:4 \
+               vd=%vd_dp size=1 stride=%imm1_5_p1
+VLDST_single   1111 0100 1 . l:1 0 rn:4 .... 10 n:2 reg_idx:1 align:3 rm:4 \
+               vd=%vd_dp size=2 stride=%imm1_6_p1
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@
  * It might be possible to convert it to a standalone .c file eventually.
  */
 
+static inline int plus1(DisasContext *s, int x)
+{
+    return x + 1;
+}
+
 /* Include the generated Neon decoder */
 #include "decode-neon-dp.inc.c"
 #include "decode-neon-ls.inc.c"
@@ -XXX,XX +XXX,XX @@ static bool trans_VLD_all_lanes(DisasContext *s, arg_VLD_all_lanes *a)
 
     return true;
 }
+
+static bool trans_VLDST_single(DisasContext *s, arg_VLDST_single *a)
+{
+    /* Neon load/store single structure to one lane */
+    int reg;
+    int nregs = a->n + 1;
+    int vd = a->vd;
+    TCGv_i32 addr, tmp;
+
+    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist */
+    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vd & 0x10)) {
+        return false;
+    }
+
+    /* Catch the UNDEF cases. This is unavoidably a bit messy. */
+    switch (nregs) {
+    case 1:
+        if (((a->align & (1 << a->size)) != 0) ||
+            (a->size == 2 && ((a->align & 3) == 1 || (a->align & 3) == 2))) {
+            return false;
+        }
+        break;
+    case 3:
+        if ((a->align & 1) != 0) {
+            return false;
+        }
+        /* fall through */
+    case 2:
+        if (a->size == 2 && (a->align & 2) != 0) {
+            return false;
+        }
+        break;
+    case 4:
+        if ((a->size == 2) && ((a->align & 3) == 3)) {
+            return false;
+        }
+        break;
+    default:
+        abort();
+    }
+    if ((vd + a->stride * (nregs - 1)) > 31) {
+        /*
+         * Attempts to write off the end of the register file are
+         * UNPREDICTABLE; we choose to UNDEF because otherwise we would
+         * access off the end of the array that holds the register data.
+         */
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    tmp = tcg_temp_new_i32();
+    addr = tcg_temp_new_i32();
+    load_reg_var(s, addr, a->rn);
+    /*
+     * TODO: if we implemented alignment exceptions, we should check
+     * addr against the alignment encoded in a->align here.
+     */
+    for (reg = 0; reg < nregs; reg++) {
+        if (a->l) {
+            gen_aa32_ld_i32(s, tmp, addr, get_mem_index(s),
+                            s->be_data | a->size);
+            neon_store_element(vd, a->reg_idx, a->size, tmp);
+        } else { /* Store */
+            neon_load_element(tmp, vd, a->reg_idx, a->size);
+            gen_aa32_st_i32(s, tmp, addr, get_mem_index(s),
+                            s->be_data | a->size);
+        }
+        vd += a->stride;
+        tcg_gen_addi_i32(addr, addr, 1 << a->size);
+    }
+    tcg_temp_free_i32(addr);
+    tcg_temp_free_i32(tmp);
+
+    gen_neon_ldst_base_update(s, a->rm, a->rn, (1 << a->size) * nregs);
+
+    return true;
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_neon_trn_u16(TCGv_i32 t0, TCGv_i32 t1)
     tcg_temp_free_i32(rd);
 }
 
-
-/* Translate a NEON load/store element instruction.  Return nonzero if the
-   instruction is invalid.  */
-static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
-{
-    int rd, rn, rm;
-    int nregs;
-    int stride;
-    int size;
-    int reg;
-    int load;
-    TCGv_i32 addr;
-    TCGv_i32 tmp;
-
-    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
-        return 1;
-    }
-
-    /* FIXME: this access check should not take precedence over UNDEF
-     * for invalid encodings; we will generate incorrect syndrome information
-     * for attempts to execute invalid vfp/neon encodings with FP disabled.
-     */
-    if (s->fp_excp_el) {
-        gen_exception_insn(s, s->pc_curr, EXCP_UDEF,
-                           syn_simd_access_trap(1, 0xe, false), s->fp_excp_el);
-        return 0;
-    }
-
-    if (!s->vfp_enabled)
-      return 1;
-    VFP_DREG_D(rd, insn);
-    rn = (insn >> 16) & 0xf;
-    rm = insn & 0xf;
-    load = (insn & (1 << 21)) != 0;
-    if ((insn & (1 << 23)) == 0) {
-        /* Load store all elements -- handled already by decodetree */
-        return 1;
-    } else {
-        size = (insn >> 10) & 3;
-        if (size == 3) {
-            /* Load single element to all lanes -- handled by decodetree  */
-            return 1;
-        } else {
-            /* Single element.  */
-            int idx = (insn >> 4) & 0xf;
-            int reg_idx;
-            switch (size) {
-            case 0:
-                reg_idx = (insn >> 5) & 7;
-                stride = 1;
-                break;
-            case 1:
-                reg_idx = (insn >> 6) & 3;
-                stride = (insn & (1 << 5)) ? 2 : 1;
-                break;
-            case 2:
-                reg_idx = (insn >> 7) & 1;
-                stride = (insn & (1 << 6)) ? 2 : 1;
-                break;
-            default:
-                abort();
-            }
-            nregs = ((insn >> 8) & 3) + 1;
-            /* Catch the UNDEF cases. This is unavoidably a bit messy. */
-            switch (nregs) {
-            case 1:
-                if (((idx & (1 << size)) != 0) ||
-                    (size == 2 && ((idx & 3) == 1 || (idx & 3) == 2))) {
-                    return 1;
-                }
-                break;
-            case 3:
-                if ((idx & 1) != 0) {
-                    return 1;
-                }
-                /* fall through */
-            case 2:
-                if (size == 2 && (idx & 2) != 0) {
-                    return 1;
-                }
-                break;
-            case 4:
-                if ((size == 2) && ((idx & 3) == 3)) {
-                    return 1;
-                }
-                break;
-            default:
-                abort();
-            }
-            if ((rd + stride * (nregs - 1)) > 31) {
-                /* Attempts to write off the end of the register file
-                 * are UNPREDICTABLE; we choose to UNDEF because otherwise
-                 * the neon_load_reg() would write off the end of the array.
-                 */
-                return 1;
-            }
-            tmp = tcg_temp_new_i32();
-            addr = tcg_temp_new_i32();
-            load_reg_var(s, addr, rn);
-            for (reg = 0; reg < nregs; reg++) {
-                if (load) {
-                    gen_aa32_ld_i32(s, tmp, addr, get_mem_index(s),
-                                    s->be_data | size);
-                    neon_store_element(rd, reg_idx, size, tmp);
-                } else { /* Store */
-                    neon_load_element(tmp, rd, reg_idx, size);
-                    gen_aa32_st_i32(s, tmp, addr, get_mem_index(s),
-                                    s->be_data | size);
-                }
-                rd += stride;
-                tcg_gen_addi_i32(addr, addr, 1 << size);
-            }
-            tcg_temp_free_i32(addr);
-            tcg_temp_free_i32(tmp);
-            stride = nregs * (1 << size);
-        }
-    }
-    if (rm != 15) {
-        TCGv_i32 base;
-
-        base = load_reg(s, rn);
-        if (rm == 13) {
-            tcg_gen_addi_i32(base, base, stride);
-        } else {
-            TCGv_i32 index;
-            index = load_reg(s, rm);
-            tcg_gen_add_i32(base, base, index);
-            tcg_temp_free_i32(index);
-        }
-        store_reg(s, rn, base);
-    }
-    return 0;
-}
-
 static inline void gen_neon_narrow(int size, TCGv_i32 dest, TCGv_i64 src)
 {
     switch (size) {
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
             }
             return;
         }
-        if ((insn & 0x0f100000) == 0x04000000) {
-            /* NEON load/store.  */
-            if (disas_neon_ls_insn(s, insn)) {
-                goto illegal_op;
-            }
-            return;
-        }
         if ((insn & 0x0e000f00) == 0x0c000100) {
             if (arm_dc_feature(s, ARM_FEATURE_IWMMXT)) {
                 /* iWMMXt register transfer.  */
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
         }
         break;
     case 12:
-        if ((insn & 0x01100000) == 0x01000000) {
-            if (disas_neon_ls_insn(s, insn)) {
-                goto illegal_op;
-            }
-            break;
-        }
         goto illegal_op;
     default:
     illegal_op:
-- 
2.20.1

Convert the Neon 3-reg-same VADD and VSUB insns to decodetree.

Note that we don't need the neon_3r_sizes[op] check here because all
size values are OK for VADD and VSUB; we'll add this when we convert
the first insn that has size restrictions.

For this we need one of the GVecGen*Fn typedefs currently in
translate-a64.h; move them all to translate.h as a block so they
are visible to the 32-bit decoder.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200430181003.21682-15-peter.maydell@linaro.org
---
 target/arm/translate-a64.h      |  9 --------
 target/arm/translate.h          |  9 ++++++++
 target/arm/neon-dp.decode       | 17 +++++++++++++++
 target/arm/translate-neon.inc.c | 38 +++++++++++++++++++++++++++++++++
 target/arm/translate.c          | 14 ++++--------
 5 files changed, 68 insertions(+), 19 deletions(-)
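
Reviewer note: the register checks in do_3same() can be read as a pure
function of the decoded fields. The sketch below restates them outside
QEMU; the function names are hypothetical, and has_d32 stands in for
dc_isar_feature(aa32_simd_r32, s).

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for the register checks in do_3same().
 * vd, vn and vm are 5-bit D-register numbers; q is the Q bit.
 */
static bool regs_3same_ok(int vd, int vn, int vm, int q, bool has_d32)
{
    assert(q == 0 || q == 1);

    /* UNDEF accesses to D16-D31 if they don't exist. */
    if (!has_d32 && ((vd | vn | vm) & 0x10)) {
        return false;
    }
    /* With Q set, every register number must be even. */
    if ((vn | vm | vd) & q) {
        return false;
    }
    return true;
}

/* Operand size in bytes handed to the gvec expander. */
static int vec_size_3same(int q)
{
    return q ? 16 : 8;
}
```

An odd register number is rejected only when Q is set, and a register
number of 16 or above is rejected only when the upper bank is absent.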

diff --git a/target/arm/translate-a64.h b/target/arm/translate-a64.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.h
+++ b/target/arm/translate-a64.h
@@ -XXX,XX +XXX,XX @@ static inline int vec_full_reg_size(DisasContext *s)
 
 bool disas_sve(DisasContext *, uint32_t);
 
-/* Note that the gvec expanders operate on offsets + sizes.  */
-typedef void GVecGen2Fn(unsigned, uint32_t, uint32_t, uint32_t, uint32_t);
-typedef void GVecGen2iFn(unsigned, uint32_t, uint32_t, int64_t,
-                         uint32_t, uint32_t);
-typedef void GVecGen3Fn(unsigned, uint32_t, uint32_t,
-                        uint32_t, uint32_t, uint32_t);
-typedef void GVecGen4Fn(unsigned, uint32_t, uint32_t, uint32_t,
-                        uint32_t, uint32_t, uint32_t);
-
 #endif /* TARGET_ARM_TRANSLATE_A64_H */
diff --git a/target/arm/translate.h b/target/arm/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ void gen_sshl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
 #define dc_isar_feature(name, ctx) \
     ({ DisasContext *ctx_ = (ctx); isar_feature_##name(ctx_->isar); })
 
+/* Note that the gvec expanders operate on offsets + sizes.  */
+typedef void GVecGen2Fn(unsigned, uint32_t, uint32_t, uint32_t, uint32_t);
+typedef void GVecGen2iFn(unsigned, uint32_t, uint32_t, int64_t,
+                         uint32_t, uint32_t);
+typedef void GVecGen3Fn(unsigned, uint32_t, uint32_t,
+                        uint32_t, uint32_t, uint32_t);
+typedef void GVecGen4Fn(unsigned, uint32_t, uint32_t, uint32_t,
+                        uint32_t, uint32_t, uint32_t);
+
 #endif /* TARGET_ARM_TRANSLATE_H */
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -XXX,XX +XXX,XX @@
 #
 # This file is processed by scripts/decodetree.py
 #
+# VFP/Neon register fields; same as vfp.decode
+%vm_dp  5:1 0:4
+%vn_dp  7:1 16:4
+%vd_dp  22:1 12:4
 
 # Encodings for Neon data processing instructions where the T32 encoding
 # is a simple transformation of the A32 encoding.
@@ -XXX,XX +XXX,XX @@
 #   0b111p_1111_qqqq_qqqq_qqqq_qqqq_qqqq_qqqq
 # This file works on the A32 encoding only; calling code for T32 has to
 # transform the insn into the A32 version first.
+
+######################################################################
+# 3-reg-same grouping:
+# 1111 001 U 0 D sz:2 Vn:4 Vd:4 opc:4 N Q M op Vm:4
+######################################################################
+
+&3same vm vn vd q size
+
+@3same           .... ... . . . size:2 .... .... .... . q:1 . . .... \
+                 &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp
+
+VADD_3s          1111 001 0 0 . .. .... .... 1000 . . . 0 .... @3same
+VSUB_3s          1111 001 1 0 . .. .... .... 1000 . . . 0 .... @3same
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VLDST_single(DisasContext *s, arg_VLDST_single *a)
 
     return true;
 }
+
+static bool do_3same(DisasContext *s, arg_3same *a, GVecGen3Fn fn)
+{
+    int vec_size = a->q ? 16 : 8;
+    int rd_ofs = neon_reg_offset(a->vd, 0);
+    int rn_ofs = neon_reg_offset(a->vn, 0);
+    int rm_ofs = neon_reg_offset(a->vm, 0);
+
+    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vd | a->vn | a->vm) & 0x10)) {
+        return false;
+    }
+
+    if ((a->vn | a->vm | a->vd) & a->q) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    fn(a->size, rd_ofs, rn_ofs, rm_ofs, vec_size, vec_size);
+    return true;
+}
+
+#define DO_3SAME(INSN, FUNC)                                            \
+    static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a)        \
+    {                                                                   \
+        return do_3same(s, a, FUNC);                                    \
+    }
+
+DO_3SAME(VADD, tcg_gen_gvec_add)
+DO_3SAME(VSUB, tcg_gen_gvec_sub)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
             }
             return 0;
 
-        case NEON_3R_VADD_VSUB:
-            if (u) {
-                tcg_gen_gvec_sub(size, rd_ofs, rn_ofs, rm_ofs,
-                                 vec_size, vec_size);
-            } else {
-                tcg_gen_gvec_add(size, rd_ofs, rn_ofs, rm_ofs,
-                                 vec_size, vec_size);
-            }
-            return 0;
-
         case NEON_3R_VQADD:
             tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
                            rn_ofs, rm_ofs, vec_size, vec_size,
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
             tcg_gen_gvec_3(rd_ofs, rm_ofs, rn_ofs, vec_size, vec_size,
                            u ? &ushl_op[size] : &sshl_op[size]);
             return 0;
+
+        case NEON_3R_VADD_VSUB:
+            /* Already handled by decodetree */
+            return 1;
         }
 
         if (size == 3) {
-- 
2.20.1

Convert the Neon logic ops in the 3-reg-same grouping to decodetree.
Note that for the logic ops the 'size' field forms part of their
decode and the actual operations are always bitwise.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200430181003.21682-16-peter.maydell@linaro.org
---
 target/arm/neon-dp.decode       | 12 +++++++++++
 target/arm/translate-neon.inc.c | 19 +++++++++++++++++
 target/arm/translate.c          | 38 +--------------------------------
 3 files changed, 32 insertions(+), 37 deletions(-)

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -XXX,XX +XXX,XX @@
 @3same           .... ... . . . size:2 .... .... .... . q:1 . . .... \
                  &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp
 
+@3same_logic     .... ... . . . .. .... .... .... . q:1 .. .... \
+                 &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp size=0
+
+VAND_3s          1111 001 0 0 . 00 .... .... 0001 ... 1 .... @3same_logic
+VBIC_3s          1111 001 0 0 . 01 .... .... 0001 ... 1 .... @3same_logic
+VORR_3s          1111 001 0 0 . 10 .... .... 0001 ... 1 .... @3same_logic
+VORN_3s          1111 001 0 0 . 11 .... .... 0001 ... 1 .... @3same_logic
+VEOR_3s          1111 001 1 0 . 00 .... .... 0001 ... 1 .... @3same_logic
+VBSL_3s          1111 001 1 0 . 01 .... .... 0001 ... 1 .... @3same_logic
+VBIT_3s          1111 001 1 0 . 10 .... .... 0001 ... 1 .... @3same_logic
+VBIF_3s          1111 001 1 0 . 11 .... .... 0001 ... 1 .... @3same_logic
+
 VADD_3s          1111 001 0 0 . .. .... .... 1000 . . . 0 .... @3same
 VSUB_3s          1111 001 1 0 . .. .... .... 1000 . . . 0 .... @3same
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool do_3same(DisasContext *s, arg_3same *a, GVecGen3Fn fn)
 
 DO_3SAME(VADD, tcg_gen_gvec_add)
 DO_3SAME(VSUB, tcg_gen_gvec_sub)
+DO_3SAME(VAND, tcg_gen_gvec_and)
+DO_3SAME(VBIC, tcg_gen_gvec_andc)
+DO_3SAME(VORR, tcg_gen_gvec_or)
+DO_3SAME(VORN, tcg_gen_gvec_orc)
+DO_3SAME(VEOR, tcg_gen_gvec_xor)
+
+/* These insns are all gvec_bitsel but with the inputs in various orders. */
+#define DO_3SAME_BITSEL(INSN, O1, O2, O3)                               \
+    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
+                                uint32_t rn_ofs, uint32_t rm_ofs,       \
+                                uint32_t oprsz, uint32_t maxsz)         \
+    {                                                                   \
+        tcg_gen_gvec_bitsel(vece, rd_ofs, O1, O2, O3, oprsz, maxsz);    \
+    }                                                                   \
+    DO_3SAME(INSN, gen_##INSN##_3s)
+
+DO_3SAME_BITSEL(VBSL, rd_ofs, rn_ofs, rm_ofs)
+DO_3SAME_BITSEL(VBIT, rm_ofs, rn_ofs, rd_ofs)
+DO_3SAME_BITSEL(VBIF, rm_ofs, rd_ofs, rn_ofs)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
             }
             return 1;
 
-        case NEON_3R_LOGIC: /* Logic ops.  */
-            switch ((u << 2) | size) {
-            case 0: /* VAND */
-                tcg_gen_gvec_and(0, rd_ofs, rn_ofs, rm_ofs,
-                                 vec_size, vec_size);
-                break;
-            case 1: /* VBIC */
-                tcg_gen_gvec_andc(0, rd_ofs, rn_ofs, rm_ofs,
-                                  vec_size, vec_size);
-                break;
-            case 2: /* VORR */
-                tcg_gen_gvec_or(0, rd_ofs, rn_ofs, rm_ofs,
-                                vec_size, vec_size);
-                break;
-            case 3: /* VORN */
-                tcg_gen_gvec_orc(0, rd_ofs, rn_ofs, rm_ofs,
-                                 vec_size, vec_size);
-                break;
-            case 4: /* VEOR */
-                tcg_gen_gvec_xor(0, rd_ofs, rn_ofs, rm_ofs,
-                                 vec_size, vec_size);
-                break;
-            case 5: /* VBSL */
-                tcg_gen_gvec_bitsel(MO_8, rd_ofs, rd_ofs, rn_ofs, rm_ofs,
-                                    vec_size, vec_size);
-                break;
-            case 6: /* VBIT */
-                tcg_gen_gvec_bitsel(MO_8, rd_ofs, rm_ofs, rn_ofs, rd_ofs,
-                                    vec_size, vec_size);
-                break;
-            case 7: /* VBIF */
-                tcg_gen_gvec_bitsel(MO_8, rd_ofs, rm_ofs, rd_ofs, rn_ofs,
-                                    vec_size, vec_size);
-                break;
-            }
-            return 0;
-
         case NEON_3R_VQADD:
             tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
                            rn_ofs, rm_ofs, vec_size, vec_size,
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
             return 0;
 
         case NEON_3R_VADD_VSUB:
+        case NEON_3R_LOGIC:
             /* Already handled by decodetree */
             return 1;
         }
-- 
2.20.1

Convert the Neon 3-reg-same VMAX and VMIN insns to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200430181003.21682-17-peter.maydell@linaro.org
---
 target/arm/neon-dp.decode       |  5 +++++
 target/arm/translate-neon.inc.c | 14 ++++++++++++++
 target/arm/translate.c          | 21 ++-------------------
 3 files changed, 21 insertions(+), 19 deletions(-)

Convert the Neon comparison ops in the 3-reg-same grouping
to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200430181003.21682-18-peter.maydell@linaro.org
---
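Reviewer note, not part of the commit: a standalone per-element model of
the comparison results, assuming 8-bit elements. Each comparison yields
all ones when true and all zeroes when false, and VTST is true when the
two inputs share at least one set bit. The function names are
hypothetical and only mirror the TCG_COND_* choices made by DO_3SAME_CMP.

```c
#include <stdint.h>

/* Signed and unsigned "greater than" (TCG_COND_GT / TCG_COND_GTU). */
static inline uint8_t model_vcgt_s8(int8_t n, int8_t m)
{
    return n > m ? 0xff : 0x00;
}

static inline uint8_t model_vcgt_u8(uint8_t n, uint8_t m)
{
    return n > m ? 0xff : 0x00;
}

/* Signed and unsigned "greater or equal" (TCG_COND_GE / TCG_COND_GEU). */
static inline uint8_t model_vcge_s8(int8_t n, int8_t m)
{
    return n >= m ? 0xff : 0x00;
}

static inline uint8_t model_vcge_u8(uint8_t n, uint8_t m)
{
    return n >= m ? 0xff : 0x00;
}

/* Equality (TCG_COND_EQ). */
static inline uint8_t model_vceq8(uint8_t n, uint8_t m)
{
    return n == m ? 0xff : 0x00;
}

/* VTST: true if any bit is set in both operands. */
static inline uint8_t model_vtst8(uint8_t n, uint8_t m)
{
    return (n & m) != 0 ? 0xff : 0x00;
}
```

The signed and unsigned variants differ only for inputs with the top bit
set: the bit pattern 0xff is -1 for model_vcgt_s8 but 255 for
model_vcgt_u8.
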
 target/arm/neon-dp.decode       |  8 ++++++++
 target/arm/translate-neon.inc.c | 22 ++++++++++++++++++++++
 target/arm/translate.c          | 23 +++--------------------
 3 files changed, 33 insertions(+), 20 deletions(-)

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -XXX,XX +XXX,XX @@ VBSL_3s          1111 001 1 0 . 01 .... .... 0001 ... 1 .... @3same_logic
 VBIT_3s          1111 001 1 0 . 10 .... .... 0001 ... 1 .... @3same_logic
 VBIF_3s          1111 001 1 0 . 11 .... .... 0001 ... 1 .... @3same_logic
 
+VCGT_S_3s        1111 001 0 0 . .. .... .... 0011 . . . 0 .... @3same
+VCGT_U_3s        1111 001 1 0 . .. .... .... 0011 . . . 0 .... @3same
+VCGE_S_3s        1111 001 0 0 . .. .... .... 0011 . . . 1 .... @3same
+VCGE_U_3s        1111 001 1 0 . .. .... .... 0011 . . . 1 .... @3same
+
 VMAX_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 0 .... @3same
 VMAX_U_3s        1111 001 1 0 . .. .... .... 0110 . . . 0 .... @3same
 VMIN_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 1 .... @3same
@@ -XXX,XX +XXX,XX @@ VMIN_U_3s        1111 001 1 0 . .. .... .... 0110 . . . 1 .... @3same
 
 VADD_3s          1111 001 0 0 . .. .... .... 1000 . . . 0 .... @3same
 VSUB_3s          1111 001 1 0 . .. .... .... 1000 . . . 0 .... @3same
+
+VTST_3s          1111 001 0 0 . .. .... .... 1000 . . . 1 .... @3same
+VCEQ_3s          1111 001 1 0 . .. .... .... 1000 . . . 1 .... @3same
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_3SAME_NO_SZ_3(VMAX_S, tcg_gen_gvec_smax)
 DO_3SAME_NO_SZ_3(VMAX_U, tcg_gen_gvec_umax)
 DO_3SAME_NO_SZ_3(VMIN_S, tcg_gen_gvec_smin)
 DO_3SAME_NO_SZ_3(VMIN_U, tcg_gen_gvec_umin)
+
+#define DO_3SAME_CMP(INSN, COND)                                        \
+    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
+                                uint32_t rn_ofs, uint32_t rm_ofs,       \
+                                uint32_t oprsz, uint32_t maxsz)         \
+    {                                                                   \
+        tcg_gen_gvec_cmp(COND, vece, rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz); \
+    }                                                                   \
+    DO_3SAME_NO_SZ_3(INSN, gen_##INSN##_3s)
+
+DO_3SAME_CMP(VCGT_S, TCG_COND_GT)
+DO_3SAME_CMP(VCGT_U, TCG_COND_GTU)
+DO_3SAME_CMP(VCGE_S, TCG_COND_GE)
+DO_3SAME_CMP(VCGE_U, TCG_COND_GEU)
+DO_3SAME_CMP(VCEQ, TCG_COND_EQ)
+
+static void gen_VTST_3s(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                         uint32_t rm_ofs, uint32_t oprsz, uint32_t maxsz)
+{
+    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &cmtst_op[vece]);
+}
+DO_3SAME_NO_SZ_3(VTST, gen_VTST_3s)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                            u ? &mls_op[size] : &mla_op[size]);
             return 0;
 
-        case NEON_3R_VTST_VCEQ:
-            if (u) { /* VCEQ */
-                tcg_gen_gvec_cmp(TCG_COND_EQ, size, rd_ofs, rn_ofs, rm_ofs,
-                                 vec_size, vec_size);
-            } else { /* VTST */
-                tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,
-                               vec_size, vec_size, &cmtst_op[size]);
-            }
-            return 0;
-
-        case NEON_3R_VCGT:
-            tcg_gen_gvec_cmp(u ? TCG_COND_GTU : TCG_COND_GT, size,
-                             rd_ofs, rn_ofs, rm_ofs, vec_size, vec_size);
-            return 0;
-
-        case NEON_3R_VCGE:
-            tcg_gen_gvec_cmp(u ? TCG_COND_GEU : TCG_COND_GE, size,
-                             rd_ofs, rn_ofs, rm_ofs, vec_size, vec_size);
-            return 0;
-
         case NEON_3R_VSHL:
             /* Note the operation is vshl vd,vm,vn */
             tcg_gen_gvec_3(rd_ofs, rm_ofs, rn_ofs, vec_size, vec_size,
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
         case NEON_3R_LOGIC:
         case NEON_3R_VMAX:
         case NEON_3R_VMIN:
+        case NEON_3R_VTST_VCEQ:
+        case NEON_3R_VCGT:
+        case NEON_3R_VCGE:
             /* Already handled by decodetree */
             return 1;
         }
-- 
2.20.1

Convert the Neon VQADD/VQSUB insns in the 3-reg-same grouping
to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200430181003.21682-19-peter.maydell@linaro.org
---
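Reviewer note, not part of the commit: DO_3SAME_GVEC4 passes the offset
of vfp.qc as an extra operand so that saturation can be recorded
alongside the result. The sketch below is a scalar model of that idea for
8-bit elements, assuming the flag is sticky (once set it stays set). The
type SatResult8 and the model_* functions are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result pair: the element value plus the updated QC flag. */
typedef struct {
    int val;
    bool qc;
} SatResult8;

/* Signed saturating add: clamp to [-128, 127] and set QC on saturation. */
static inline SatResult8 model_sqadd8(int8_t n, int8_t m, bool qc_in)
{
    int sum = (int)n + (int)m;
    SatResult8 r = { sum, qc_in };

    if (sum > INT8_MAX) {
        r.val = INT8_MAX;
        r.qc = true;
    } else if (sum < INT8_MIN) {
        r.val = INT8_MIN;
        r.qc = true;
    }
    return r;
}

/* Unsigned saturating subtract: clamp at 0 and set QC on saturation. */
static inline SatResult8 model_uqsub8(uint8_t n, uint8_t m, bool qc_in)
{
    int diff = (int)n - (int)m;
    SatResult8 r = { diff, qc_in };

    if (diff < 0) {
        r.val = 0;
        r.qc = true;
    }
    return r;
}
```

Passing the previous flag in as qc_in models the cumulative behaviour: a
non-saturating operation leaves an already set flag unchanged.
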
 target/arm/neon-dp.decode       |  6 ++++++
 target/arm/translate-neon.inc.c | 15 +++++++++++++++
 target/arm/translate.c          | 14 ++------------
 3 files changed, 23 insertions(+), 12 deletions(-)

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -XXX,XX +XXX,XX @@
 @3same           .... ... . . . size:2 .... .... .... . q:1 . . .... \
                  &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp
 
+VQADD_S_3s       1111 001 0 0 . .. .... .... 0000 . . . 1 .... @3same
+VQADD_U_3s       1111 001 1 0 . .. .... .... 0000 . . . 1 .... @3same
+
 @3same_logic     .... ... . . . .. .... .... .... . q:1 .. .... \
                  &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp size=0
 
@@ -XXX,XX +XXX,XX @@ VBSL_3s          1111 001 1 0 . 01 .... .... 0001 ... 1 .... @3same_logic
 VBIT_3s          1111 001 1 0 . 10 .... .... 0001 ... 1 .... @3same_logic
 VBIF_3s          1111 001 1 0 . 11 .... .... 0001 ... 1 .... @3same_logic
 
+VQSUB_S_3s       1111 001 0 0 . .. .... .... 0010 . . . 1 .... @3same
+VQSUB_U_3s       1111 001 1 0 . .. .... .... 0010 . . . 1 .... @3same
+
 VCGT_S_3s        1111 001 0 0 . .. .... .... 0011 . . . 0 .... @3same
 VCGT_U_3s        1111 001 1 0 . .. .... .... 0011 . . . 0 .... @3same
 VCGE_S_3s        1111 001 0 0 . .. .... .... 0011 . . . 1 .... @3same
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static void gen_VTST_3s(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
     tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &cmtst_op[vece]);
 }
 DO_3SAME_NO_SZ_3(VTST, gen_VTST_3s)
+
+#define DO_3SAME_GVEC4(INSN, OPARRAY)                                   \
+    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
+                                uint32_t rn_ofs, uint32_t rm_ofs,       \
+                                uint32_t oprsz, uint32_t maxsz)         \
+    {                                                                   \
+        tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),           \
+                       rn_ofs, rm_ofs, oprsz, maxsz, &OPARRAY[vece]);   \
+    }                                                                   \
+    DO_3SAME(INSN, gen_##INSN##_3s)
+
+DO_3SAME_GVEC4(VQADD_S, sqadd_op)
+DO_3SAME_GVEC4(VQADD_U, uqadd_op)
+DO_3SAME_GVEC4(VQSUB_S, sqsub_op)
+DO_3SAME_GVEC4(VQSUB_U, uqsub_op)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
             }
             return 1;
 
-        case NEON_3R_VQADD:
-            tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
-                           rn_ofs, rm_ofs, vec_size, vec_size,
-                           (u ? uqadd_op : sqadd_op) + size);
-            return 0;
-
-        case NEON_3R_VQSUB:
-            tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
-                           rn_ofs, rm_ofs, vec_size, vec_size,
-                           (u ? uqsub_op : sqsub_op) + size);
-            return 0;
-
         case NEON_3R_VMUL: /* VMUL */
             if (u) {
                 /* Polynomial case allows only P8.  */
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
         case NEON_3R_VTST_VCEQ:
         case NEON_3R_VCGT:
         case NEON_3R_VCGE:
+        case NEON_3R_VQADD:
+        case NEON_3R_VQSUB:
             /* Already handled by decodetree */
             return 1;
         }
-- 
2.20.1

Convert the Neon VMUL, VMLA, VMLS and VSHL insns in the
3-reg-same grouping to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200430181003.21682-20-peter.maydell@linaro.org
---
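Reviewer note, not part of the commit: two details of this patch are easy
to misread, so here is a standalone scalar model of each for 8-bit
elements. The polynomial VMUL is only valid for size 0 and multiplies
over GF(2), keeping the low 8 bits. VSHL shifts the Vm element by a
signed count taken from the Vn element, which is why rm_ofs and rn_ofs
are swapped in DO_3SAME_GVEC3_SHIFT. Both function names are hypothetical
and the shift model covers only the unsigned form.

```c
#include <stdint.h>

/* Carry-less (polynomial) multiply of two bytes, truncated to 8 bits. */
static inline uint8_t model_pmul8(uint8_t n, uint8_t m)
{
    uint8_t r = 0;

    for (int i = 0; i < 8; i++) {
        if (m & (1u << i)) {
            r ^= (uint8_t)(n << i);   /* XOR instead of add: no carries */
        }
    }
    return r;
}

/*
 * Unsigned VSHL on one byte: 'value' comes from Vm, 'shift' from Vn.
 * A negative count shifts right; counts of 8 or more in either
 * direction give zero.
 */
static inline uint8_t model_ushl8(uint8_t value, int8_t shift)
{
    if (shift >= 8 || shift <= -8) {
        return 0;
    }
    if (shift >= 0) {
        return (uint8_t)(value << shift);
    }
    return (uint8_t)(value >> -shift);
}
```

For instance, model_pmul8(3, 3) squares the polynomial x + 1 and gives
x^2 + 1, since the cross terms cancel under XOR.
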
 target/arm/neon-dp.decode       |  9 +++++++
 target/arm/translate-neon.inc.c | 44 +++++++++++++++++++++++++++++++++
 target/arm/translate.c          | 28 +++------------------
 3 files changed, 56 insertions(+), 25 deletions(-)

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -XXX,XX +XXX,XX @@ VCGT_U_3s        1111 001 1 0 . .. .... .... 0011 . . . 0 .... @3same
 VCGE_S_3s        1111 001 0 0 . .. .... .... 0011 . . . 1 .... @3same
 VCGE_U_3s        1111 001 1 0 . .. .... .... 0011 . . . 1 .... @3same
 
+VSHL_S_3s        1111 001 0 0 . .. .... .... 0100 . . . 0 .... @3same
+VSHL_U_3s        1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same
+
 VMAX_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 0 .... @3same
 VMAX_U_3s        1111 001 1 0 . .. .... .... 0110 . . . 0 .... @3same
 VMIN_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 1 .... @3same
@@ -XXX,XX +XXX,XX @@ VSUB_3s          1111 001 1 0 . .. .... .... 1000 . . . 0 .... @3same
 
 VTST_3s          1111 001 0 0 . .. .... .... 1000 . . . 1 .... @3same
 VCEQ_3s          1111 001 1 0 . .. .... .... 1000 . . . 1 .... @3same
+
+VMLA_3s          1111 001 0 0 . .. .... .... 1001 . . . 0 .... @3same
+VMLS_3s          1111 001 1 0 . .. .... .... 1001 . . . 0 .... @3same
+
+VMUL_3s          1111 001 0 0 . .. .... .... 1001 . . . 1 .... @3same
+VMUL_p_3s        1111 001 1 0 . .. .... .... 1001 . . . 1 .... @3same
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_3SAME_NO_SZ_3(VMAX_S, tcg_gen_gvec_smax)
 DO_3SAME_NO_SZ_3(VMAX_U, tcg_gen_gvec_umax)
 DO_3SAME_NO_SZ_3(VMIN_S, tcg_gen_gvec_smin)
 DO_3SAME_NO_SZ_3(VMIN_U, tcg_gen_gvec_umin)
+DO_3SAME_NO_SZ_3(VMUL, tcg_gen_gvec_mul)
 
 #define DO_3SAME_CMP(INSN, COND)                                        \
     static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
@@ -XXX,XX +XXX,XX @@ DO_3SAME_GVEC4(VQADD_S, sqadd_op)
 DO_3SAME_GVEC4(VQADD_U, uqadd_op)
 DO_3SAME_GVEC4(VQSUB_S, sqsub_op)
 DO_3SAME_GVEC4(VQSUB_U, uqsub_op)
+
+static void gen_VMUL_p_3s(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                           uint32_t rm_ofs, uint32_t oprsz, uint32_t maxsz)
+{
+    tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz,
+                       0, gen_helper_gvec_pmul_b);
+}
+
+static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
+{
+    if (a->size != 0) {
+        return false;
+    }
+    return do_3same(s, a, gen_VMUL_p_3s);
+}
+
+#define DO_3SAME_GVEC3_NO_SZ_3(INSN, OPARRAY)                           \
+    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
+                                uint32_t rn_ofs, uint32_t rm_ofs,       \
+                                uint32_t oprsz, uint32_t maxsz)         \
+    {                                                                   \
+        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,                          \
+                       oprsz, maxsz, &OPARRAY[vece]);                   \
+    }                                                                   \
+    DO_3SAME_NO_SZ_3(INSN, gen_##INSN##_3s)
+
+
+DO_3SAME_GVEC3_NO_SZ_3(VMLA, mla_op)
+DO_3SAME_GVEC3_NO_SZ_3(VMLS, mls_op)
+
+#define DO_3SAME_GVEC3_SHIFT(INSN, OPARRAY)                             \
+    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
+                                uint32_t rn_ofs, uint32_t rm_ofs,       \
+                                uint32_t oprsz, uint32_t maxsz)         \
+    {                                                                   \
+        /* Note the operation is vshl vd,vm,vn */                       \
+        tcg_gen_gvec_3(rd_ofs, rm_ofs, rn_ofs,                          \
+                       oprsz, maxsz, &OPARRAY[vece]);                   \
+    }                                                                   \
+    DO_3SAME(INSN, gen_##INSN##_3s)
+
+DO_3SAME_GVEC3_SHIFT(VSHL_S, sshl_op)
+DO_3SAME_GVEC3_SHIFT(VSHL_U, ushl_op)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
             }
             return 1;
 
-        case NEON_3R_VMUL: /* VMUL */
-            if (u) {
-                /* Polynomial case allows only P8.  */
-                if (size != 0) {
-                    return 1;
-                }
-                tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, vec_size, vec_size,
-                                   0, gen_helper_gvec_pmul_b);
-            } else {
-                tcg_gen_gvec_mul(size, rd_ofs, rn_ofs, rm_ofs,
-                                 vec_size, vec_size);
-            }
-            return 0;
-
-        case NEON_3R_VML: /* VMLA, VMLS */
-            tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, vec_size, vec_size,
-                           u ? &mls_op[size] : &mla_op[size]);
-            return 0;
-
-        case NEON_3R_VSHL:
-            /* Note the operation is vshl vd,vm,vn */
-            tcg_gen_gvec_3(rd_ofs, rm_ofs, rn_ofs, vec_size, vec_size,
-                           u ? &ushl_op[size] : &sshl_op[size]);
-            return 0;
-
         case NEON_3R_VADD_VSUB:
         case NEON_3R_LOGIC:
         case NEON_3R_VMAX:
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
         case NEON_3R_VCGE:
         case NEON_3R_VQADD:
         case NEON_3R_VQSUB:
+        case NEON_3R_VMUL:
+        case NEON_3R_VML:
+        case NEON_3R_VSHL:
             /* Already handled by decodetree */
             return 1;
         }
-- 
2.20.1

We're going to want at least some of the NeonGen* typedefs
for the refactored 32-bit Neon decoder, so move them all
to translate.h since it makes more sense to keep them in
one group.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200430181003.21682-23-peter.maydell@linaro.org
---
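Reviewer note, not part of the commit: the NeonGen* names are typedefs of
function types, not of function pointers, so users declare parameters as
"NeonGenTwoOpFn *fn". The standalone sketch below shows the same C idiom
with a hypothetical BinOpFn type and helpers that are not QEMU code.

```c
#include <stdint.h>

/* A typedef of a function type, in the same style as NeonGenTwoOpFn. */
typedef int32_t BinOpFn(int32_t, int32_t);

static inline int32_t op_add(int32_t a, int32_t b)
{
    return a + b;
}

static inline int32_t op_sub(int32_t a, int32_t b)
{
    return a - b;
}

/* Callers take a pointer to the function type and invoke it directly. */
static inline int32_t apply_binop(BinOpFn *fn, int32_t a, int32_t b)
{
    return fn(a, b);
}
```

Keeping such typedefs in one shared header lets several translation
units pass generator functions around with a single agreed signature.
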
 target/arm/translate.h     | 17 +++++++++++++++++
 target/arm/translate-a64.c | 17 -----------------
 2 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/target/arm/translate.h b/target/arm/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ typedef void GVecGen3Fn(unsigned, uint32_t, uint32_t,
 typedef void GVecGen4Fn(unsigned, uint32_t, uint32_t, uint32_t,
                         uint32_t, uint32_t, uint32_t);
 
+/* Function prototype for gen_ functions for calling Neon helpers */
+typedef void NeonGenOneOpEnvFn(TCGv_i32, TCGv_ptr, TCGv_i32);
+typedef void NeonGenTwoOpFn(TCGv_i32, TCGv_i32, TCGv_i32);
+typedef void NeonGenTwoOpEnvFn(TCGv_i32, TCGv_ptr, TCGv_i32, TCGv_i32);
+typedef void NeonGenTwo64OpFn(TCGv_i64, TCGv_i64, TCGv_i64);
+typedef void NeonGenTwo64OpEnvFn(TCGv_i64, TCGv_ptr, TCGv_i64, TCGv_i64);
+typedef void NeonGenNarrowFn(TCGv_i32, TCGv_i64);
+typedef void NeonGenNarrowEnvFn(TCGv_i32, TCGv_ptr, TCGv_i64);
+typedef void NeonGenWidenFn(TCGv_i64, TCGv_i32);
+typedef void NeonGenTwoSingleOPFn(TCGv_i32, TCGv_i32, TCGv_i32, TCGv_ptr);
+typedef void NeonGenTwoDoubleOPFn(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_ptr);
+typedef void NeonGenOneOpFn(TCGv_i64, TCGv_i64);
+typedef void CryptoTwoOpFn(TCGv_ptr, TCGv_ptr);
+typedef void CryptoThreeOpIntFn(TCGv_ptr, TCGv_ptr, TCGv_i32);
+typedef void CryptoThreeOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr);
+typedef void AtomicThreeOpFn(TCGv_i64, TCGv_i64, TCGv_i64, TCGArg, MemOp);
+
 #endif /* TARGET_ARM_TRANSLATE_H */
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ typedef struct AArch64DecodeTable {
     AArch64DecodeFn *disas_fn;
 } AArch64DecodeTable;
 
-/* Function prototype for gen_ functions for calling Neon helpers */
-typedef void NeonGenOneOpEnvFn(TCGv_i32, TCGv_ptr, TCGv_i32);
-typedef void NeonGenTwoOpFn(TCGv_i32, TCGv_i32, TCGv_i32);
-typedef void NeonGenTwoOpEnvFn(TCGv_i32, TCGv_ptr, TCGv_i32, TCGv_i32);
-typedef void NeonGenTwo64OpFn(TCGv_i64, TCGv_i64, TCGv_i64);
-typedef void NeonGenTwo64OpEnvFn(TCGv_i64, TCGv_ptr, TCGv_i64, TCGv_i64);
-typedef void NeonGenNarrowFn(TCGv_i32, TCGv_i64);
-typedef void NeonGenNarrowEnvFn(TCGv_i32, TCGv_ptr, TCGv_i64);
-typedef void NeonGenWidenFn(TCGv_i64, TCGv_i32);
-typedef void NeonGenTwoSingleOPFn(TCGv_i32, TCGv_i32, TCGv_i32, TCGv_ptr);
-typedef void NeonGenTwoDoubleOPFn(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_ptr);
-typedef void NeonGenOneOpFn(TCGv_i64, TCGv_i64);
-typedef void CryptoTwoOpFn(TCGv_ptr, TCGv_ptr);
-typedef void CryptoThreeOpIntFn(TCGv_ptr, TCGv_ptr, TCGv_i32);
-typedef void CryptoThreeOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr);
-typedef void AtomicThreeOpFn(TCGv_i64, TCGv_i64, TCGv_i64, TCGArg, MemOp);
-
 /* initialize TCG globals.  */
 void a64_translate_init(void)
 {
-- 
2.20.1