Series comparison

-[Qemu-devel] [PULL 00/16] target-arm queue
+[PULL 00/39] target-arm queue
-The following changes since commit adf2e451f357e993f173ba9b4176dbf3e65fee7e:
+Most of this is the Neon decodetree patches, followed by Edgar's versal cleanups.
-  Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging (2019-02-26 19:04:47 +0000)
+thanks
 -- PMM
 The following changes since commit 2ef486e76d64436be90f7359a3071fb2a56ce835:
   Merge remote-tracking branch 'remotes/marcel/tags/rdma-pull-request' into staging (2020-05-03 14:12:56 +0100)
 are available in the Git repository at:
-  https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20190228-1
+  https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20200504
-for you to fetch changes up to 1c9af3a9e05c1607a36df4943f8f5393d7621a91:
+for you to fetch changes up to 9aefc6cf9b73f66062d2f914a0136756e7a28211:
-  linux-user: Enable HWCAP_ASIMDFHM, HWCAP_JSCVT (2019-02-28 11:03:05 +0000)
+  target/arm: Move gen_ function typedefs to translate.h (2020-05-04 12:59:26 +0100)
 ----------------------------------------------------------------
 target-arm queue:
- * add MHU and dual-core support to Musca boards
+ * Start of conversion of Neon insns to decodetree
- * refactor some VFP insns to be gated by ID registers
+ * versal board: support SD and RTC
- * Revert "arm: Allow system registers for KVM guests to be changed by QEMU code"
+ * Implement ARMv8.2-TTS2UXN
- * Implement ARMv8.2-FHM extension
+ * Make VQDMULL undefined when U=1
- * Advertise JSCVT via HWCAP for linux-user
+ * Some minor code cleanups
 ----------------------------------------------------------------
-Peter Maydell (11):
+Edgar E. Iglesias (11):
-      hw/misc/armsse-mhu.c: Model the SSE-200 Message Handling Unit
+      hw/arm: versal: Remove inclusion of arm_gicv3_common.h
-      hw/arm/armsse: Wire up the MHUs
+      hw/arm: versal: Move misplaced comment
-      target/arm/cpu: Allow init-svtor property to be set after realize
+      hw/arm: versal-virt: Fix typo xlnx-ve -> xlnx-versal
-      target/arm/arm-powerctl: Add new arm_set_cpu_on_and_reset()
+      hw/arm: versal: Embed the UARTs into the SoC type
-      hw/misc/iotkit-sysctl: Correct typo in INITSVTOR0 register name
+      hw/arm: versal: Embed the GEMs into the SoC type
-      hw/arm/iotkit-sysctl: Add SSE-200 registers
+      hw/arm: versal: Embed the ADMAs into the SoC type
-      hw/arm/iotkit-sysctl: Implement CPUWAIT and INITSVTOR*
+      hw/arm: versal: Embed the APUs into the SoC type
-      hw/arm/armsse: Unify init-svtor and cpuwait handling
+      hw/arm: versal: Add support for SD
-      target/arm: Use MVFR1 feature bits to gate A32/T32 FP16 instructions
+      hw/arm: versal: Add support for the RTC
-      target/arm: Gate "miscellaneous FP" insns by ID register field
+      hw/arm: versal-virt: Add support for SD
-      Revert "arm: Allow system registers for KVM guests to be changed by QEMU code"
+      hw/arm: versal-virt: Add support for the RTC
-Richard Henderson (5):
+Fredrik Strupe (1):
-      target/arm: Add helpers for FMLAL
+      target/arm: Make VQDMULL undefined when U=1
       target/arm: Implement FMLAL and FMLSL for aarch64
       target/arm: Implement VFMAL and VFMSL for aarch32
       target/arm: Enable ARMv8.2-FHM for -cpu max
       linux-user: Enable HWCAP_ASIMDFHM, HWCAP_JSCVT
- hw/misc/Makefile.objs           |   1 +
+Peter Maydell (25):
- include/hw/arm/armsse.h         |   3 +-
+      target/arm: Don't use a TLB for ARMMMUIdx_Stage2
- include/hw/misc/armsse-mhu.h    |  44 ++++++
+      target/arm: Use enum constant in get_phys_addr_lpae() call
- include/hw/misc/iotkit-sysctl.h |  25 +++-
+      target/arm: Add new 's1_is_el0' argument to get_phys_addr_lpae()
- target/arm/arm-powerctl.h       |  16 +++
+      target/arm: Implement ARMv8.2-TTS2UXN
- target/arm/cpu.h                |  76 +++++++++--
+      target/arm: Use correct variable for setting 'max' cpu's ID_AA64DFR0
- target/arm/helper.h             |   9 ++
+      target/arm/translate-vfp.inc.c: Remove duplicate simd_r32 check
- hw/arm/armsse.c                 |  91 +++++++++----
+      target/arm: Don't allow Thumb Neon insns without FEATURE_NEON
- hw/misc/armsse-mhu.c            | 198 +++++++++++++++++++++++++++
+      target/arm: Add stubs for AArch32 Neon decodetree
- hw/misc/iotkit-sysctl.c         | 294 ++++++++++++++++++++++++++++++++++++++--
+      target/arm: Convert VCMLA (vector) to decodetree
- linux-user/elfload.c            |   2 +
+      target/arm: Convert VCADD (vector) to decodetree
- target/arm/arm-powerctl.c       |  56 ++++++++
+      target/arm: Convert V[US]DOT (vector) to decodetree
- target/arm/cpu.c                |  32 ++++-
+      target/arm: Convert VFM[AS]L (vector) to decodetree
- target/arm/cpu64.c              |   2 +
+      target/arm: Convert VCMLA (scalar) to decodetree
- target/arm/helper.c             |  27 +---
+      target/arm: Convert V[US]DOT (scalar) to decodetree
- target/arm/kvm32.c              |  23 +++-
+      target/arm: Convert VFM[AS]L (scalar) to decodetree
- target/arm/kvm64.c              |   2 -
+      target/arm: Convert Neon load/store multiple structures to decodetree
- target/arm/machine.c            |   2 +-
+      target/arm: Convert Neon 'load single structure to all lanes' to decodetree
- target/arm/translate-a64.c      |  49 ++++++-
+      target/arm: Convert Neon 'load/store single structure' to decodetree
- target/arm/translate.c          | 180 ++++++++++++++++--------
+      target/arm: Convert Neon 3-reg-same VADD/VSUB to decodetree
- target/arm/vec_helper.c         | 148 ++++++++++++++++++++
+      target/arm: Convert Neon 3-reg-same logic ops to decodetree
- MAINTAINERS                     |   2 +
+      target/arm: Convert Neon 3-reg-same VMAX/VMIN to decodetree
- default-configs/arm-softmmu.mak |   1 +
+      target/arm: Convert Neon 3-reg-same comparisons to decodetree
- hw/misc/trace-events            |   4 +
+      target/arm: Convert Neon 3-reg-same VQADD/VQSUB to decodetree
-files changed, 1139 insertions(+), 148 deletions(-)
+      target/arm: Convert Neon 3-reg-same VMUL, VMLA, VMLS, VSHL to decodetree
- create mode 100644 include/hw/misc/armsse-mhu.h
+      target/arm: Move gen_ function typedefs to translate.h
  create mode 100644 hw/misc/armsse-mhu.c
+Philippe Mathieu-Daudé (2):
+      hw/arm/mps2-tz: Use TYPE_IOTKIT instead of hardcoded string
+      target/arm: Use uint64_t for midr field in CPU state struct
+ include/hw/arm/xlnx-versal.h    |  31 +-
+ target/arm/cpu-param.h          |   2 +-
+ target/arm/cpu.h                |  38 ++-
+ target/arm/translate-a64.h      |   9 -
+ target/arm/translate.h          |  26 ++
+ target/arm/neon-dp.decode       |  86 +++++
+ target/arm/neon-ls.decode       |  52 +++
+ target/arm/neon-shared.decode   |  66 ++++
+ hw/arm/mps2-tz.c                |   2 +-
+ hw/arm/xlnx-versal-virt.c       |  74 ++++-
+ hw/arm/xlnx-versal.c            | 115 +++++--
+ target/arm/cpu.c                |   3 +-
+ target/arm/cpu64.c              |   8 +-
+ target/arm/helper.c             | 183 ++++------
+ target/arm/translate-a64.c      |  17 -
+ target/arm/translate-neon.inc.c | 714 +++++++++++++++++++++++++++++++++++++++
+ target/arm/translate-vfp.inc.c  |   6 -
+ target/arm/translate.c          | 716 +++-------------------------------------
+ target/arm/Makefile.objs        |  18 +
+files changed, 1302 insertions(+), 864 deletions(-)
+ create mode 100644 target/arm/neon-dp.decode
+ create mode 100644 target/arm/neon-ls.decode
+ create mode 100644 target/arm/neon-shared.decode
+ create mode 100644 target/arm/translate-neon.inc.c

-New patch
+[PULL 01/39] target/arm: Make VQDMULL undefined when U=1
+From: Fredrik Strupe <fredrik@strupe.net>
+According to the Arm ARM, VQDMULL is only valid when U=0; the encoding
+with U=1 is unallocated.
+Signed-off-by: Fredrik Strupe <fredrik@strupe.net>
+Fixes: 695272dcb976 ("target-arm: Handle UNDEF cases for Neon 3-regs-different-widths")
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+---
+ target/arm/translate.c | 2 +-
+file changed, 1 insertion(+), 1 deletion(-)
+diff --git a/target/arm/translate.c b/target/arm/translate.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/translate.c
++++ b/target/arm/translate.c
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+                     {0, 0, 0, 0}, /* VMLSL */
+                     {0, 0, 0, 9}, /* VQDMLSL */
+                     {0, 0, 0, 0}, /* Integer VMULL */
+-                    {0, 0, 0, 1}, /* VQDMULL */
++                    {0, 0, 0, 9}, /* VQDMULL */
+                     {0, 0, 0, 0xa}, /* Polynomial VMULL */
+                     {0, 0, 0, 7}, /* Reserved: always UNDEF */
+                 };
+--
+.20.1
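
The one-line table change above is easier to read with the encoding of the fourth column in mind. The sketch below assumes, as the neighbouring entries suggest (9 for VQDMLSL, 7 for "always UNDEF"), that this column is a bitmask in which bits 0 to 2 request UNDEF for size 0, 1 and 2, and bit 3 requests UNDEF when U=1. The names `undefreq` and `neon_3diff_is_undef` are hypothetical and are not taken from the patch.

```c
#include <stdbool.h>

/*
 * Illustrative only. Assumed meaning of the fourth table column:
 *   bit 0: UNDEF if size == 0
 *   bit 1: UNDEF if size == 1
 *   bit 2: UNDEF if size == 2
 *   bit 3: UNDEF if U == 1
 */
static bool neon_3diff_is_undef(unsigned undefreq, unsigned size, bool u)
{
    /* Only bits 0..2 are size checks; mask so that size 3 cannot hit bit 3. */
    if (undefreq & 7u & (1u << size)) {
        return true;
    }
    /* Bit 3 rejects every encoding that has U set. */
    return (undefreq & 8u) && u;
}
```

Under that reading, the old value 1 rejected only size 0, so VQDMULL with U=1 was accepted for sizes 1 and 2. The new value 9 rejects every U=1 encoding as well.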

-New patch
+[PULL 02/39] hw/arm/mps2-tz: Use TYPE_IOTKIT instead of hardcoded string
+From: Philippe Mathieu-Daudé <f4bug@amsat.org>
+By using the TYPE_* definitions for devices, we can:
+ - quickly find where devices are used with 'git-grep'
+ - easily rename a device (one-line change).
+Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
+Message-id: 20200428154650.21991-1-f4bug@amsat.org
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+---
+ hw/arm/mps2-tz.c | 2 +-
+file changed, 1 insertion(+), 1 deletion(-)
+diff --git a/hw/arm/mps2-tz.c b/hw/arm/mps2-tz.c
+index XXXXXXX..XXXXXXX 100644
+--- a/hw/arm/mps2-tz.c
++++ b/hw/arm/mps2-tz.c
+@@ -XXX,XX +XXX,XX @@ static void mps2tz_common_init(MachineState *machine)
+         exit(EXIT_FAILURE);
+     }
+-    sysbus_init_child_obj(OBJECT(machine), "iotkit", &mms->iotkit,
++    sysbus_init_child_obj(OBJECT(machine), TYPE_IOTKIT, &mms->iotkit,
+                           sizeof(mms->iotkit), mmc->armsse_type);
+     iotkitdev = DEVICE(&mms->iotkit);
+     object_property_set_link(OBJECT(&mms->iotkit), OBJECT(system_memory),
+--
+.20.1

-[Qemu-devel] [PULL 11/16] Revert "arm: Allow system registers for KVM guests to be changed by QEMU code"
+[PULL 03/39] target/arm: Don't use a TLB for ARMMMUIdx_Stage2
-This reverts commit 823e1b3818f9b10b824ddcd756983b6e2fa68730,
+We define ARMMMUIdx_Stage2 as being an MMU index which uses a QEMU
-which introduces a regression running EDK2 guest firmware
+TLB.  However we never actually use the TLB -- all stage 2 lookups
-under KVM:
+are done by direct calls to get_phys_addr_lpae() followed by a
+physical address load via address_space_ld*().
-error: kvm run failed Function not implemented
- PC=000000013f5a6208 X00=00000000404003c4 X01=000000000000003a
+Remove Stage2 from the list of ARM MMU indexes which correspond to
-X02=0000000000000000 X03=00000000404003c4 X04=0000000000000000
+real core MMU indexes, and instead put it in the set of "NOTLB" ARM
-X05=0000000096000046 X06=000000013d2ef270 X07=000000013e3d1710
+MMU indexes.
-X08=09010755ffaf8ba8 X09=ffaf8b9cfeeb5468 X10=feeb546409010756
-X11=09010757ffaf8b90 X12=feeb50680903068b X13=090306a1ffaf8bc0
+This allows us to drop NB_MMU_MODES to 11.  It also means we can
-X14=0000000000000000 X15=0000000000000000 X16=000000013f872da0
+safely add support for the ARMv8.2-TTS2UXN extension, which adds
-X17=00000000ffffa6ab X18=0000000000000000 X19=000000013f5a92d0
+permission bits to the stage 2 descriptors which define execute
-X20=000000013f5a7a78 X21=000000000000003a X22=000000013f5a7ab2
+permission separately for EL0 and EL1; supporting that while keeping
-X23=000000013f5a92e8 X24=000000013f631090 X25=0000000000000010
+Stage2 in a QEMU TLB would require us to use separate TLBs for
-X26=0000000000000100 X27=000000013f89501b X28=000000013e3d14e0
+"Stage2 for an EL0 access" and "Stage2 for an EL1 access", which is a
-X29=000000013e3d12a0 X30=000000013f5a2518  SP=000000013b7be0b0
+lot of extra complication given we aren't even using the QEMU TLB.
-PSTATE=404003c4 -Z-- EL1t
+In the process of updating the comment on our MMU index use,
-with
+fix a couple of other minor errors:
-[ 3507.926571] kvm [35042]: load/store instruction decoding not implemented
+ * NS EL2 EL2&0 was missing from the list in the comment
-in the host dmesg.
+ * some text hadn't been updated from when we bumped NB_MMU_MODES
+   above 8
-Revert the change for the moment until we can investigate the
 cause of the regression.
 Reported-by: Eric Auger <eric.auger@redhat.com>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20200330210400.11724-2-peter.maydell@linaro.org
 ---
- target/arm/cpu.h     |  9 +--------
+ target/arm/cpu-param.h |   2 +-
- target/arm/helper.c  | 27 ++-------------------------
+ target/arm/cpu.h       |  21 +++++---
- target/arm/kvm32.c   | 20 ++++++++++++++++++--
+ target/arm/helper.c    | 112 ++++-------------------------------------
- target/arm/kvm64.c   |  2 --
+files changed, 27 insertions(+), 108 deletions(-)
- target/arm/machine.c |  2 +-
-files changed, 22 insertions(+), 38 deletions(-)
+diff --git a/target/arm/cpu-param.h b/target/arm/cpu-param.h
+index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu-param.h
 +++ b/target/arm/cpu-param.h
@@ -XXX,XX +XXX,XX @@
  # define TARGET_PAGE_BITS_MIN  10
  #endif
 -#define NB_MMU_MODES 12
 +#define NB_MMU_MODES 11
  #endif
 diff --git a/target/arm/cpu.h b/target/arm/cpu.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu.h
 +++ b/target/arm/cpu.h
-@@ -XXX,XX +XXX,XX @@ bool write_list_to_cpustate(ARMCPU *cpu);
+@@ -XXX,XX +XXX,XX @@ bool write_cpustate_to_list(ARMCPU *cpu, bool kvm_sync);
- /**
+  *     handling via the TLB. The only way to do a stage 1 translation without
-  * write_cpustate_to_list:
+  *     the immediate stage 2 translation is via the ATS or AT system insns,
-  * @cpu: ARMCPU
+  *     which can be slow-pathed and always do a page table walk.
-- * @kvm_sync: true if this is for syncing back to KVM
++ *     The only use of stage 2 translations is either as part of an s1+2
 + *     lookup or when loading the descriptors during a stage 1 page table walk,
 + *     and in both those cases we don't use the TLB.
   *  4. we can also safely fold together the "32 bit EL3" and "64 bit EL3"
   *     translation regimes, because they map reasonably well to each other
   *     and they can't both be active at the same time.
@@ -XXX,XX +XXX,XX @@ bool write_cpustate_to_list(ARMCPU *cpu, bool kvm_sync);
   * NS EL1 EL1&0 stage 1+2 (aka NS PL1)
   * NS EL1 EL1&0 stage 1+2 +PAN
   * NS EL0 EL2&0
 + * NS EL2 EL2&0
   * NS EL2 EL2&0 +PAN
   * NS EL2 (aka NS PL2)
   * S EL0 EL1&0 (aka S PL0)
   * S EL1 EL1&0 (not used if EL3 is 32 bit)
   * S EL1 EL1&0 +PAN
   * S EL3 (aka S PL1)
 - * NS EL1&0 stage 2
   *
-  * For each register listed in the ARMCPU cpreg_indexes list, write
+- * for a total of 12 different mmu_idx.
-  * its value from the ARMCPUState structure into the cpreg_values list.
++ * for a total of 11 different mmu_idx.
   * This is used to copy info from TCG's working data structures into
   * KVM or for outbound migration.
   *
-- * @kvm_sync is true if we are doing this in order to sync the
+  * R profile CPUs have an MPU, but can use the same set of MMU indexes
-- * register state back to KVM. In this case we will only update
+  * as A profile. They only need to distinguish NS EL0 and NS EL1 (and
-- * values in the list if the previous list->cpustate sync actually
+@@ -XXX,XX +XXX,XX @@ bool write_cpustate_to_list(ARMCPU *cpu, bool kvm_sync);
-- * successfully wrote the CPU state. Otherwise we will keep the value
+  * are not quite the same -- different CPU types (most notably M profile
-- * that is in the list.
+  * vs A/R profile) would like to use MMU indexes with different semantics,
-- *
+  * but since we don't ever need to use all of those in a single CPU we
-  * Returns: true if all register values were read correctly,
+- * can avoid setting NB_MMU_MODES to more than 8. The lower bits of
-  * false if some register was unknown or could not be read.
++ * can avoid having to set NB_MMU_MODES to "total number of A profile MMU
-  * Note that we do not stop early on failure -- we will attempt
++ * modes + total number of M profile MMU modes". The lower bits of
-  * reading all registers in the list.
+  * ARMMMUIdx are the core TLB mmu index, and the higher bits are always
-  */
+  * the same for any particular CPU.
--bool write_cpustate_to_list(ARMCPU *cpu, bool kvm_sync);
+  * Variables of type ARMMUIdx are always full values, and the core
-+bool write_cpustate_to_list(ARMCPU *cpu);
+@@ -XXX,XX +XXX,XX @@ typedef enum ARMMMUIdx {
+     ARMMMUIdx_SE10_1_PAN = 9 | ARM_MMU_IDX_A,
- #define ARM_CPUID_TI915T      0x54029152
+     ARMMMUIdx_SE3        = 10 | ARM_MMU_IDX_A,
- #define ARM_CPUID_TI925T      0x54029252
 -    ARMMMUIdx_Stage2     = 11 | ARM_MMU_IDX_A,
 -
      /*
       * These are not allocated TLBs and are used only for AT system
       * instructions or for the first stage of an S12 page table walk.
@@ -XXX,XX +XXX,XX @@ typedef enum ARMMMUIdx {
      ARMMMUIdx_Stage1_E0 = 0 | ARM_MMU_IDX_NOTLB,
      ARMMMUIdx_Stage1_E1 = 1 | ARM_MMU_IDX_NOTLB,
      ARMMMUIdx_Stage1_E1_PAN = 2 | ARM_MMU_IDX_NOTLB,
 +    /*
 +     * Not allocated a TLB: used only for second stage of an S12 page
 +     * table walk, or for descriptor loads during first stage of an S1
 +     * page table walk. Note that if we ever want to have a TLB for this
 +     * then various TLB flush insns which currently are no-ops or flush
 +     * only stage 1 MMU indexes will need to change to flush stage 2.
 +     */
 +    ARMMMUIdx_Stage2     = 3 | ARM_MMU_IDX_NOTLB,
      /*
       * M-profile.
@@ -XXX,XX +XXX,XX @@ typedef enum ARMMMUIdxBit {
      TO_CORE_BIT(SE10_1),
      TO_CORE_BIT(SE10_1_PAN),
      TO_CORE_BIT(SE3),
 -    TO_CORE_BIT(Stage2),
      TO_CORE_BIT(MUser),
      TO_CORE_BIT(MPriv),
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.c
 +++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ static bool raw_accessors_invalid(const ARMCPRegInfo *ri)
+@@ -XXX,XX +XXX,XX @@ static void tlbiall_nsnh_write(CPUARMState *env, const ARMCPRegInfo *ri,
-     return true;
+     tlb_flush_by_mmuidx(cs,
                          ARMMMUIdxBit_E10_1 |
                          ARMMMUIdxBit_E10_1_PAN |
 -                        ARMMMUIdxBit_E10_0 |
 -                        ARMMMUIdxBit_Stage2);
 +                        ARMMMUIdxBit_E10_0);
  }
--bool write_cpustate_to_list(ARMCPU *cpu, bool kvm_sync)
+ static void tlbiall_nsnh_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
-+bool write_cpustate_to_list(ARMCPU *cpu)
+@@ -XXX,XX +XXX,XX @@ static void tlbiall_nsnh_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
      tlb_flush_by_mmuidx_all_cpus_synced(cs,
                                          ARMMMUIdxBit_E10_1 |
                                          ARMMMUIdxBit_E10_1_PAN |
 -                                        ARMMMUIdxBit_E10_0 |
 -                                        ARMMMUIdxBit_Stage2);
 +                                        ARMMMUIdxBit_E10_0);
  }
 -static void tlbiipas2_write(CPUARMState *env, const ARMCPRegInfo *ri,
 -                            uint64_t value)
 -{
 -    /* Invalidate by IPA. This has to invalidate any structures that
 -     * contain only stage 2 translation information, but does not need
 -     * to apply to structures that contain combined stage 1 and stage 2
 -     * translation information.
 -     * This must NOP if EL2 isn't implemented or SCR_EL3.NS is zero.
 -     */
 -    CPUState *cs = env_cpu(env);
 -    uint64_t pageaddr;
 -
 -    if (!arm_feature(env, ARM_FEATURE_EL2) || !(env->cp15.scr_el3 & SCR_NS)) {
 -        return;
 -    }
 -
 -    pageaddr = sextract64(value << 12, 0, 40);
 -
 -    tlb_flush_page_by_mmuidx(cs, pageaddr, ARMMMUIdxBit_Stage2);
 -}
 -
 -static void tlbiipas2_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
 -                               uint64_t value)
 -{
 -    CPUState *cs = env_cpu(env);
 -    uint64_t pageaddr;
 -
 -    if (!arm_feature(env, ARM_FEATURE_EL2) || !(env->cp15.scr_el3 & SCR_NS)) {
 -        return;
 -    }
 -
 -    pageaddr = sextract64(value << 12, 0, 40);
 -
 -    tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr,
 -                                             ARMMMUIdxBit_Stage2);
 -}
  static void tlbiall_hyp_write(CPUARMState *env, const ARMCPRegInfo *ri,
                                uint64_t value)
@@ -XXX,XX +XXX,XX @@ static void vttbr_write(CPUARMState *env, const ARMCPRegInfo *ri,
          tlb_flush_by_mmuidx(cs,
                              ARMMMUIdxBit_E10_1 |
                              ARMMMUIdxBit_E10_1_PAN |
 -                            ARMMMUIdxBit_E10_0 |
 -                            ARMMMUIdxBit_Stage2);
 +                            ARMMMUIdxBit_E10_0);
          raw_write(env, ri, value);
      }
  }
@@ -XXX,XX +XXX,XX @@ static int alle1_tlbmask(CPUARMState *env)
          return ARMMMUIdxBit_SE10_1 |
                 ARMMMUIdxBit_SE10_1_PAN |
                 ARMMMUIdxBit_SE10_0;
 -    } else if (arm_feature(env, ARM_FEATURE_EL2)) {
 -        return ARMMMUIdxBit_E10_1 |
 -               ARMMMUIdxBit_E10_1_PAN |
 -               ARMMMUIdxBit_E10_0 |
 -               ARMMMUIdxBit_Stage2;
      } else {
          return ARMMMUIdxBit_E10_1 |
                 ARMMMUIdxBit_E10_1_PAN |
@@ -XXX,XX +XXX,XX @@ static void tlbi_aa64_vae3is_write(CPUARMState *env, const ARMCPRegInfo *ri,
                                               ARMMMUIdxBit_SE3);
  }
 -static void tlbi_aa64_ipas2e1_write(CPUARMState *env, const ARMCPRegInfo *ri,
 -                                    uint64_t value)
 -{
 -    /* Invalidate by IPA. This has to invalidate any structures that
 -     * contain only stage 2 translation information, but does not need
 -     * to apply to structures that contain combined stage 1 and stage 2
 -     * translation information.
 -     * This must NOP if EL2 isn't implemented or SCR_EL3.NS is zero.
 -     */
 -    ARMCPU *cpu = env_archcpu(env);
 -    CPUState *cs = CPU(cpu);
 -    uint64_t pageaddr;
 -
 -    if (!arm_feature(env, ARM_FEATURE_EL2) || !(env->cp15.scr_el3 & SCR_NS)) {
 -        return;
 -    }
 -
 -    pageaddr = sextract64(value << 12, 0, 48);
 -
 -    tlb_flush_page_by_mmuidx(cs, pageaddr, ARMMMUIdxBit_Stage2);
 -}
 -
 -static void tlbi_aa64_ipas2e1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
 -                                      uint64_t value)
 -{
 -    CPUState *cs = env_cpu(env);
 -    uint64_t pageaddr;
 -
 -    if (!arm_feature(env, ARM_FEATURE_EL2) || !(env->cp15.scr_el3 & SCR_NS)) {
 -        return;
 -    }
 -
 -    pageaddr = sextract64(value << 12, 0, 48);
 -
 -    tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr,
 -                                             ARMMMUIdxBit_Stage2);
 -}
 -
  static CPAccessResult aa64_zva_access(CPUARMState *env, const ARMCPRegInfo *ri,
                                        bool isread)
  {
-     /* Write the coprocessor state from cpu->env to the (index,value) list. */
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
-     int i;
+       .writefn = tlbi_aa64_vae1_write },
-@@ -XXX,XX +XXX,XX @@ bool write_cpustate_to_list(ARMCPU *cpu, bool kvm_sync)
+     { .name = "TLBI_IPAS2E1IS", .state = ARM_CP_STATE_AA64,
-     for (i = 0; i < cpu->cpreg_array_len; i++) {
+       .opc0 = 1, .opc1 = 4, .crn = 8, .crm = 0, .opc2 = 1,
-         uint32_t regidx = kvm_to_cpreg_id(cpu->cpreg_indexes[i]);
+-      .access = PL2_W, .type = ARM_CP_NO_RAW,
-         const ARMCPRegInfo *ri;
+-      .writefn = tlbi_aa64_ipas2e1is_write },
--        uint64_t newval;
++      .access = PL2_W, .type = ARM_CP_NOP },
+     { .name = "TLBI_IPAS2LE1IS", .state = ARM_CP_STATE_AA64,
-         ri = get_arm_cp_reginfo(cpu->cp_regs, regidx);
+       .opc0 = 1, .opc1 = 4, .crn = 8, .crm = 0, .opc2 = 5,
-         if (!ri) {
+-      .access = PL2_W, .type = ARM_CP_NO_RAW,
-@@ -XXX,XX +XXX,XX @@ bool write_cpustate_to_list(ARMCPU *cpu, bool kvm_sync)
+-      .writefn = tlbi_aa64_ipas2e1is_write },
-         if (ri->type & ARM_CP_NO_RAW) {
++      .access = PL2_W, .type = ARM_CP_NOP },
-             continue;
+     { .name = "TLBI_ALLE1IS", .state = ARM_CP_STATE_AA64,
-         }
+       .opc0 = 1, .opc1 = 4, .crn = 8, .crm = 3, .opc2 = 4,
--
+       .access = PL2_W, .type = ARM_CP_NO_RAW,
--        newval = read_raw_cp_reg(&cpu->env, ri);
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
--        if (kvm_sync) {
+       .writefn = tlbi_aa64_alle1is_write },
--            /*
+     { .name = "TLBI_IPAS2E1", .state = ARM_CP_STATE_AA64,
--             * Only sync if the previous list->cpustate sync succeeded.
+       .opc0 = 1, .opc1 = 4, .crn = 8, .crm = 4, .opc2 = 1,
--             * Rather than tracking the success/failure state for every
+-      .access = PL2_W, .type = ARM_CP_NO_RAW,
--             * item in the list, we just recheck "does the raw write we must
+-      .writefn = tlbi_aa64_ipas2e1_write },
--             * have made in write_list_to_cpustate() read back OK" here.
++      .access = PL2_W, .type = ARM_CP_NOP },
--             */
+     { .name = "TLBI_IPAS2LE1", .state = ARM_CP_STATE_AA64,
--            uint64_t oldval = cpu->cpreg_values[i];
+       .opc0 = 1, .opc1 = 4, .crn = 8, .crm = 4, .opc2 = 5,
--
+-      .access = PL2_W, .type = ARM_CP_NO_RAW,
--            if (oldval == newval) {
+-      .writefn = tlbi_aa64_ipas2e1_write },
--                continue;
++      .access = PL2_W, .type = ARM_CP_NOP },
--            }
+     { .name = "TLBI_ALLE1", .state = ARM_CP_STATE_AA64,
--
+       .opc0 = 1, .opc1 = 4, .crn = 8, .crm = 7, .opc2 = 4,
--            write_raw_cp_reg(&cpu->env, ri, oldval);
+       .access = PL2_W, .type = ARM_CP_NO_RAW,
--            if (read_raw_cp_reg(&cpu->env, ri) != oldval) {
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
--                continue;
+       .writefn = tlbimva_hyp_is_write },
--            }
+     { .name = "TLBIIPAS2",
--
+       .cp = 15, .opc1 = 4, .crn = 8, .crm = 4, .opc2 = 1,
--            write_raw_cp_reg(&cpu->env, ri, newval);
+-      .type = ARM_CP_NO_RAW, .access = PL2_W,
--        }
+-      .writefn = tlbiipas2_write },
--        cpu->cpreg_values[i] = newval;
++      .type = ARM_CP_NOP, .access = PL2_W },
-+        cpu->cpreg_values[i] = read_raw_cp_reg(&cpu->env, ri);
+     { .name = "TLBIIPAS2IS",
-     }
+       .cp = 15, .opc1 = 4, .crn = 8, .crm = 0, .opc2 = 1,
-     return ok;
+-      .type = ARM_CP_NO_RAW, .access = PL2_W,
- }
+-      .writefn = tlbiipas2_is_write },
-diff --git a/target/arm/kvm32.c b/target/arm/kvm32.c
++      .type = ARM_CP_NOP, .access = PL2_W },
-index XXXXXXX..XXXXXXX 100644
+     { .name = "TLBIIPAS2L",
---- a/target/arm/kvm32.c
+       .cp = 15, .opc1 = 4, .crn = 8, .crm = 4, .opc2 = 5,
-+++ b/target/arm/kvm32.c
+-      .type = ARM_CP_NO_RAW, .access = PL2_W,
-@@ -XXX,XX +XXX,XX @@ int kvm_arch_put_registers(CPUState *cs, int level)
+-      .writefn = tlbiipas2_write },
-         return ret;
++      .type = ARM_CP_NOP, .access = PL2_W },
-     }
+     { .name = "TLBIIPAS2LIS",
+       .cp = 15, .opc1 = 4, .crn = 8, .crm = 0, .opc2 = 5,
--    write_cpustate_to_list(cpu, true);
+-      .type = ARM_CP_NO_RAW, .access = PL2_W,
--
+-      .writefn = tlbiipas2_is_write },
-+    /* Note that we do not call write_cpustate_to_list()
++      .type = ARM_CP_NOP, .access = PL2_W },
-+     * here, so we are only writing the tuple list back to
+     /* 32 bit cache operations */
-+     * KVM. This is safe because nothing can change the
+     { .name = "ICIALLUIS", .cp = 15, .opc1 = 0, .crn = 7, .crm = 1, .opc2 = 0,
-+     * CPUARMState cp15 fields (in particular gdb accesses cannot)
+       .type = ARM_CP_NOP, .access = PL1_W, .accessfn = aa64_cacheop_pou_access },
 +     * and so there are no changes to sync. In fact syncing would
 +     * be wrong at this point: for a constant register where TCG and
 +     * KVM disagree about its value, the preceding write_list_to_cpustate()
 +     * would not have had any effect on the CPUARMState value (since the
 +     * register is read-only), and a write_cpustate_to_list() here would
 +     * then try to write the TCG value back into KVM -- this would either
 +     * fail or incorrectly change the value the guest sees.
 +     *
 +     * If we ever want to allow the user to modify cp15 registers via
 +     * the gdb stub, we would need to be more clever here (for instance
 +     * tracking the set of registers kvm_arch_get_registers() successfully
 +     * managed to update the CPUARMState with, and only allowing those
 +     * to be written back up into the kernel).
 +     */
      if (!write_list_to_kvmstate(cpu, level)) {
          return EINVAL;
      }
 diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/kvm64.c
 +++ b/target/arm/kvm64.c
@@ -XXX,XX +XXX,XX @@ int kvm_arch_put_registers(CPUState *cs, int level)
          return ret;
      }
 -    write_cpustate_to_list(cpu, true);
 -
      if (!write_list_to_kvmstate(cpu, level)) {
          return EINVAL;
      }
 diff --git a/target/arm/machine.c b/target/arm/machine.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/machine.c
 +++ b/target/arm/machine.c
@@ -XXX,XX +XXX,XX @@ static int cpu_pre_save(void *opaque)
              abort();
          }
      } else {
 -        if (!write_cpustate_to_list(cpu, false)) {
 +        if (!write_cpustate_to_list(cpu)) {
              /* This should never fail. */
              abort();
          }
 --
 .20.1
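
The core of the Stage2 change is that `ARMMMUIdx_Stage2` moves from the set of indexes backed by a core TLB to the "NOTLB" set, so it no longer counts towards `NB_MMU_MODES`. The sketch below illustrates that split. The numeric values of the type tags and of the core-index mask are assumptions chosen for illustration, the `Sketch`-prefixed names are hypothetical, and only two enum members from the patch are mirrored.

```c
#include <stdbool.h>

/* Assumed tag values; the patch does not show the real definitions. */
#define SKETCH_MMU_IDX_A        0x10  /* A-profile index backed by a core TLB */
#define SKETCH_MMU_IDX_NOTLB    0x20  /* index with no TLB behind it */
#define SKETCH_MMU_IDX_COREMASK 0x0f  /* low bits: core TLB index */

typedef enum SketchMMUIdx {
    /* Highest TLB-backed A-profile index shown in the patch. */
    SketchMMUIdx_SE3    = 10 | SKETCH_MMU_IDX_A,
    /* Was 11 | A before the patch; now in the NOTLB set. */
    SketchMMUIdx_Stage2 = 3 | SKETCH_MMU_IDX_NOTLB,
} SketchMMUIdx;

/* True if the index is backed by a core TLB and therefore needs a slot. */
static bool sketch_has_core_tlb(SketchMMUIdx idx)
{
    return (idx & SKETCH_MMU_IDX_NOTLB) == 0;
}

/* Core TLB slot number; only meaningful when sketch_has_core_tlb() is true. */
static int sketch_core_index(SketchMMUIdx idx)
{
    return idx & SKETCH_MMU_IDX_COREMASK;
}
```

With Stage2 out of the TLB-backed set, the highest A-profile core index shown in the patch is 10, which is why 11 modes are enough. It is also why the TLBI IPAS2* registers can become ARM_CP_NOP: there is no stage 2 TLB left for them to flush.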

-New patch
+[PULL 04/39] target/arm: Use enum constant in get_phys_addr_lpae() call
+The access_type argument to get_phys_addr_lpae() is an MMUAccessType;
+use the enum constant MMU_DATA_LOAD rather than a literal 0 when we
+call it in S1_ptw_translate().
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20200330210400.11724-3-peter.maydell@linaro.org
+---
+ target/arm/helper.c | 5 +++--
+file changed, 3 insertions(+), 2 deletions(-)
+diff --git a/target/arm/helper.c b/target/arm/helper.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/helper.c
++++ b/target/arm/helper.c
+@@ -XXX,XX +XXX,XX @@ static hwaddr S1_ptw_translate(CPUARMState *env, ARMMMUIdx mmu_idx,
+             pcacheattrs = &cacheattrs;
+         }
+-        ret = get_phys_addr_lpae(env, addr, 0, ARMMMUIdx_Stage2, &s2pa,
+-                                 &txattrs, &s2prot, &s2size, fi, pcacheattrs);
++        ret = get_phys_addr_lpae(env, addr, MMU_DATA_LOAD, ARMMMUIdx_Stage2,
++                                 &s2pa, &txattrs, &s2prot, &s2size, fi,
++                                 pcacheattrs);
+         if (ret) {
+             assert(fi->type != ARMFault_None);
+             fi->s2addr = addr;
+--
+.20.1
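
The literal 0 and the enum constant are interchangeable to the compiler, which is why the old call worked. A minimal sketch of the point follows, assuming that `MMU_DATA_LOAD` is the zero-valued member of the access-type enum (the patch implies this but does not show the definition). The `Sketch` names and the member order are hypothetical.

```c
/* Assumed layout of the access-type enum; values are an assumption. */
typedef enum SketchAccessType {
    SKETCH_DATA_LOAD  = 0,
    SKETCH_DATA_STORE = 1,
    SKETCH_INST_FETCH = 2,
} SketchAccessType;

/* Hypothetical stand-in for a walker that branches on the access type. */
static const char *sketch_describe(SketchAccessType t)
{
    switch (t) {
    case SKETCH_DATA_LOAD:
        return "load";
    case SKETCH_DATA_STORE:
        return "store";
    case SKETCH_INST_FETCH:
        return "fetch";
    }
    return "unknown";
}
```

Passing 0 and passing `SKETCH_DATA_LOAD` select the same case. The named constant only makes the intent visible at the call site.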

-New patch
+[PULL 05/39] target/arm: Add new 's1_is_el0' argument to get_phys_addr_lpae()
+For ARMv8.2-TTS2UXN, the stage 2 page table walk wants to know
+whether the stage 1 access is for EL0 or not, because whether
+exec permission is given can depend on whether this is an EL0
+or EL1 access. Add a new argument to get_phys_addr_lpae() so
+the call sites can pass this information in.
+Since get_phys_addr_lpae() doesn't already have a doc comment,
+add one so we have a place to put the documentation of the
+semantics of the new s1_is_el0 argument.
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20200330210400.11724-4-peter.maydell@linaro.org
+---
+ target/arm/helper.c | 29 ++++++++++++++++++++++++++++-
+ 1 file changed, 28 insertions(+), 1 deletion(-)
+diff --git a/target/arm/helper.c b/target/arm/helper.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/helper.c
++++ b/target/arm/helper.c
+@@ -XXX,XX +XXX,XX @@
+ static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
+                                MMUAccessType access_type, ARMMMUIdx mmu_idx,
++                               bool s1_is_el0,
+                                hwaddr *phys_ptr, MemTxAttrs *txattrs, int *prot,
+                                target_ulong *page_size_ptr,
+                                ARMMMUFaultInfo *fi, ARMCacheAttrs *cacheattrs);
+@@ -XXX,XX +XXX,XX @@ static hwaddr S1_ptw_translate(CPUARMState *env, ARMMMUIdx mmu_idx,
+         }
+         ret = get_phys_addr_lpae(env, addr, MMU_DATA_LOAD, ARMMMUIdx_Stage2,
++                                 false,
+                                  &s2pa, &txattrs, &s2prot, &s2size, fi,
+                                  pcacheattrs);
+         if (ret) {
+@@ -XXX,XX +XXX,XX @@ static ARMVAParameters aa32_va_parameters(CPUARMState *env, uint32_t va,
+     };
+ }
++/**
++ * get_phys_addr_lpae: perform one stage of page table walk, LPAE format
++ *
++ * Returns false if the translation was successful. Otherwise, phys_ptr, attrs,
++ * prot and page_size may not be filled in, and the populated fsr value provides
++ * information on why the translation aborted, in the format of a long-format
++ * DFSR/IFSR fault register, with the following caveats:
++ *  * the WnR bit is never set (the caller must do this).
++ *
++ * @env: CPUARMState
++ * @address: virtual address to get physical address for
++ * @access_type: MMU_DATA_LOAD, MMU_DATA_STORE or MMU_INST_FETCH
++ * @mmu_idx: MMU index indicating required translation regime
++ * @s1_is_el0: if @mmu_idx is ARMMMUIdx_Stage2 (so this is a stage 2 page table
++ *             walk), must be true if this is stage 2 of a stage 1+2 walk for an
++ *             EL0 access. If @mmu_idx is anything else, @s1_is_el0 is ignored.
++ * @phys_ptr: set to the physical address corresponding to the virtual address
++ * @attrs: set to the memory transaction attributes to use
++ * @prot: set to the permissions for the page containing phys_ptr
++ * @page_size_ptr: set to the size of the page containing phys_ptr
++ * @fi: set to fault info if the translation fails
++ * @cacheattrs: (if non-NULL) set to the cacheability/shareability attributes
++ */
+ static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
+                                MMUAccessType access_type, ARMMMUIdx mmu_idx,
++                               bool s1_is_el0,
+                                hwaddr *phys_ptr, MemTxAttrs *txattrs, int *prot,
+                                target_ulong *page_size_ptr,
+                                ARMMMUFaultInfo *fi, ARMCacheAttrs *cacheattrs)
+@@ -XXX,XX +XXX,XX @@ bool get_phys_addr(CPUARMState *env, target_ulong address,
+             /* S1 is done. Now do S2 translation.  */
+             ret = get_phys_addr_lpae(env, ipa, access_type, ARMMMUIdx_Stage2,
++                                     mmu_idx == ARMMMUIdx_E10_0,
+                                      phys_ptr, attrs, &s2_prot,
+                                      page_size, fi,
+                                      cacheattrs != NULL ? &cacheattrs2 : NULL);
+@@ -XXX,XX +XXX,XX @@ bool get_phys_addr(CPUARMState *env, target_ulong address,
+     }
+     if (regime_using_lpae_format(env, mmu_idx)) {
+-        return get_phys_addr_lpae(env, address, access_type, mmu_idx,
++        return get_phys_addr_lpae(env, address, access_type, mmu_idx, false,
+                                   phys_ptr, attrs, prot, page_size,
+                                   fi, cacheattrs);
+     } else if (regime_sctlr(env, mmu_idx) & SCTLR_XP) {
+--
+2.20.1

-[Qemu-devel] [PULL 09/16] target/arm: Use MVFR1 feature bits to gate A32/T32 FP16 instructions
+[PULL 06/39] target/arm: Implement ARMv8.2-TTS2UXN
-Instead of gating the A32/T32 FP16 conversion instructions on
+The ARMv8.2-TTS2UXN feature extends the XN field in stage 2
-the ARM_FEATURE_VFP_FP16 flag, switch to our new approach of
+translation table descriptors from just bit [54] to bits [54:53],
-looking at ID register bits. In this case MVFR1 fields FPHP
+allowing stage 2 to control execution permissions separately for EL0
-and SIMDHP indicate the presence of these insns.
+and EL1. Implement the new semantics of the XN field and enable
+the feature for our 'max' CPU.
 This change doesn't alter behaviour for any of our CPUs.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190222170936.13268-2-peter.maydell@linaro.org
+Message-id: 20200330210400.11724-5-peter.maydell@linaro.org
 ---
- target/arm/cpu.h       | 37 ++++++++++++++++++++++++++++++++++++-
+ target/arm/cpu.h    | 15 +++++++++++++++
- target/arm/cpu.c       |  2 --
+ target/arm/cpu.c    |  1 +
- target/arm/kvm32.c     |  3 ---
+ target/arm/cpu64.c  |  2 ++
- target/arm/translate.c | 26 ++++++++++++++++++--------
+ target/arm/helper.c | 37 +++++++++++++++++++++++++++++++------
- 4 files changed, 54 insertions(+), 14 deletions(-)
+ 4 files changed, 49 insertions(+), 6 deletions(-)
 diff --git a/target/arm/cpu.h b/target/arm/cpu.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu.h
 +++ b/target/arm/cpu.h
-@@ -XXX,XX +XXX,XX @@ FIELD(ID_DFR0, MPROFDBG, 20, 4)
+@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa32_ccidx(const ARMISARegisters *id)
- FIELD(ID_DFR0, PERFMON, 24, 4)
+     return FIELD_EX32(id->id_mmfr4, ID_MMFR4, CCIDX) != 0;
  FIELD(ID_DFR0, TRACEFILT, 28, 4)
 +FIELD(MVFR0, SIMDREG, 0, 4)
 +FIELD(MVFR0, FPSP, 4, 4)
 +FIELD(MVFR0, FPDP, 8, 4)
 +FIELD(MVFR0, FPTRAP, 12, 4)
 +FIELD(MVFR0, FPDIVIDE, 16, 4)
 +FIELD(MVFR0, FPSQRT, 20, 4)
 +FIELD(MVFR0, FPSHVEC, 24, 4)
 +FIELD(MVFR0, FPROUND, 28, 4)
 +
 +FIELD(MVFR1, FPFTZ, 0, 4)
 +FIELD(MVFR1, FPDNAN, 4, 4)
 +FIELD(MVFR1, SIMDLS, 8, 4)
 +FIELD(MVFR1, SIMDINT, 12, 4)
 +FIELD(MVFR1, SIMDSP, 16, 4)
 +FIELD(MVFR1, SIMDHP, 20, 4)
 +FIELD(MVFR1, FPHP, 24, 4)
 +FIELD(MVFR1, SIMDFMAC, 28, 4)
 +
 +FIELD(MVFR2, SIMDMISC, 0, 4)
 +FIELD(MVFR2, FPMISC, 4, 4)
 +
  QEMU_BUILD_BUG_ON(ARRAY_SIZE(((ARMCPU *)0)->ccsidr) <= R_V7M_CSSELR_INDEX_MASK);
  /* If adding a feature bit which corresponds to a Linux ELF
@@ -XXX,XX +XXX,XX @@ enum arm_features {
      ARM_FEATURE_THUMB2,
      ARM_FEATURE_PMSA,   /* no MMU; may have Memory Protection Unit */
      ARM_FEATURE_VFP3,
 -    ARM_FEATURE_VFP_FP16,
      ARM_FEATURE_NEON,
      ARM_FEATURE_M, /* Microcontroller profile.  */
      ARM_FEATURE_OMAPCP, /* OMAP specific CP15 ops handling.  */
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa32_fp16_arith(const ARMISARegisters *id)
      return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, FP) == 1;
  }
-+/*
++static inline bool isar_feature_aa32_tts2uxn(const ARMISARegisters *id)
 + * We always set the FP and SIMD FP16 fields to indicate identical
 + * levels of support (assuming SIMD is implemented at all), so
 + * we only need one set of accessors.
 + */
 +static inline bool isar_feature_aa32_fp16_spconv(const ARMISARegisters *id)
 +{
-+    return FIELD_EX64(id->mvfr1, MVFR1, FPHP) > 0;
++    return FIELD_EX32(id->id_mmfr4, ID_MMFR4, XNX) != 0;
 +}
 +
 +static inline bool isar_feature_aa32_fp16_dpconv(const ARMISARegisters *id)
 +{
 +    return FIELD_EX64(id->mvfr1, MVFR1, FPHP) > 1;
 +}
 +
  /*
   * 64-bit feature tests via id registers.
   */
+@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa64_ccidx(const ARMISARegisters *id)
+     return FIELD_EX64(id->id_aa64mmfr2, ID_AA64MMFR2, CCIDX) != 0;
+ }
++static inline bool isar_feature_aa64_tts2uxn(const ARMISARegisters *id)
++{
++    return FIELD_EX64(id->id_aa64mmfr1, ID_AA64MMFR1, XNX) != 0;
++}
++
+ /*
+  * Feature tests for "does this exist in either 32-bit or 64-bit?"
+  */
+@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_any_ccidx(const ARMISARegisters *id)
+     return isar_feature_aa64_ccidx(id) || isar_feature_aa32_ccidx(id);
+ }
++static inline bool isar_feature_any_tts2uxn(const ARMISARegisters *id)
++{
++    return isar_feature_aa64_tts2uxn(id) || isar_feature_aa32_tts2uxn(id);
++}
++
+ /*
+  * Forward to the above feature tests given an ARMCPU pointer.
+  */
 diff --git a/target/arm/cpu.c b/target/arm/cpu.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu.c
 +++ b/target/arm/cpu.c
-@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
+@@ -XXX,XX +XXX,XX @@ static void arm_max_initfn(Object *obj)
              t = FIELD_DP32(t, ID_MMFR4, HPDS, 1); /* AA32HPD */
              t = FIELD_DP32(t, ID_MMFR4, AC2, 1); /* ACTLR2, HACTLR2 */
              t = FIELD_DP32(t, ID_MMFR4, CNP, 1); /* TTCNP */
 +            t = FIELD_DP32(t, ID_MMFR4, XNX, 1); /* TTS2UXN */
              cpu->isar.id_mmfr4 = t;
          }
  #endif
 diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu64.c
 +++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
          t = FIELD_DP64(t, ID_AA64MMFR1, VH, 1);
          t = FIELD_DP64(t, ID_AA64MMFR1, PAN, 2); /* ATS1E1 */
          t = FIELD_DP64(t, ID_AA64MMFR1, VMIDBITS, 2); /* VMID16 */
 +        t = FIELD_DP64(t, ID_AA64MMFR1, XNX, 1); /* TTS2UXN */
          cpu->isar.id_aa64mmfr1 = t;
          t = cpu->isar.id_aa64mmfr2;
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
          u = FIELD_DP32(u, ID_MMFR4, HPDS, 1); /* AA32HPD */
          u = FIELD_DP32(u, ID_MMFR4, AC2, 1); /* ACTLR2, HACTLR2 */
          u = FIELD_DP32(u, ID_MMFR4, CNP, 1); /* TTCNP */
 +        u = FIELD_DP32(u, ID_MMFR4, XNX, 1); /* TTS2UXN */
          cpu->isar.id_mmfr4 = u;
          u = cpu->isar.id_aa64dfr0;
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.c
 +++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ simple_ap_to_rw_prot(CPUARMState *env, ARMMMUIdx mmu_idx, int ap)
   *
   * @env:     CPUARMState
   * @s2ap:    The 2-bit stage2 access permissions (S2AP)
 - * @xn:      XN (execute-never) bit
 + * @xn:      XN (execute-never) bits
 + * @s1_is_el0: true if this is S2 of an S1+2 walk for EL0
   */
 -static int get_S2prot(CPUARMState *env, int s2ap, int xn)
 +static int get_S2prot(CPUARMState *env, int s2ap, int xn, bool s1_is_el0)
  {
      int prot = 0;
@@ -XXX,XX +XXX,XX @@ static int get_S2prot(CPUARMState *env, int s2ap, int xn)
      if (s2ap & 2) {
          prot |= PAGE_WRITE;
      }
-     if (arm_feature(env, ARM_FEATURE_VFP4)) {
+-    if (!xn) {
-         set_feature(env, ARM_FEATURE_VFP3);
+-        if (arm_el_is_aa64(env, 2) || prot & PAGE_READ) {
--        set_feature(env, ARM_FEATURE_VFP_FP16);
++
 +    if (cpu_isar_feature(any_tts2uxn, env_archcpu(env))) {
 +        switch (xn) {
 +        case 0:
              prot |= PAGE_EXEC;
 +            break;
 +        case 1:
 +            if (s1_is_el0) {
 +                prot |= PAGE_EXEC;
 +            }
 +            break;
 +        case 2:
 +            break;
 +        case 3:
 +            if (!s1_is_el0) {
 +                prot |= PAGE_EXEC;
 +            }
 +            break;
 +        default:
 +            g_assert_not_reached();
 +        }
 +    } else {
 +        if (!extract32(xn, 1, 1)) {
 +            if (arm_el_is_aa64(env, 2) || prot & PAGE_READ) {
 +                prot |= PAGE_EXEC;
 +            }
          }
      }
-     if (arm_feature(env, ARM_FEATURE_VFP3)) {
+     return prot;
-         set_feature(env, ARM_FEATURE_VFP);
+@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
@@ -XXX,XX +XXX,XX @@ static void cortex_a9_initfn(Object *obj)
      cpu->dtb_compatible = "arm,cortex-a9";
      set_feature(&cpu->env, ARM_FEATURE_V7);
      set_feature(&cpu->env, ARM_FEATURE_VFP3);
 -    set_feature(&cpu->env, ARM_FEATURE_VFP_FP16);
      set_feature(&cpu->env, ARM_FEATURE_NEON);
      set_feature(&cpu->env, ARM_FEATURE_THUMB2EE);
      set_feature(&cpu->env, ARM_FEATURE_EL3);
 diff --git a/target/arm/kvm32.c b/target/arm/kvm32.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/kvm32.c
 +++ b/target/arm/kvm32.c
@@ -XXX,XX +XXX,XX @@ bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
      if (extract32(id_pfr0, 12, 4) == 1) {
          set_feature(&features, ARM_FEATURE_THUMB2EE);
      }
--    if (extract32(ahcf->isar.mvfr1, 20, 4) == 1) {
--        set_feature(&features, ARM_FEATURE_VFP_FP16);
+     ap = extract32(attrs, 4, 2);
--    }
+-    xn = extract32(attrs, 12, 1);
-     if (extract32(ahcf->isar.mvfr1, 12, 4) == 1) {
-         set_feature(&features, ARM_FEATURE_NEON);
+     if (mmu_idx == ARMMMUIdx_Stage2) {
          ns = true;
 -        *prot = get_S2prot(env, ap, xn);
 +        xn = extract32(attrs, 11, 2);
 +        *prot = get_S2prot(env, ap, xn, s1_is_el0);
      } else {
          ns = extract32(attrs, 3, 1);
 +        xn = extract32(attrs, 12, 1);
          pxn = extract32(attrs, 11, 1);
          *prot = get_S1prot(env, mmu_idx, aarch64, ap, ns, xn, pxn);
      }
-diff --git a/target/arm/translate.c b/target/arm/translate.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.c
-+++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
-                      * UNPREDICTABLE if bit 8 is set prior to ARMv8
-                      * (we choose to UNDEF)
-                      */
--                    if ((dp && !arm_dc_feature(s, ARM_FEATURE_V8)) ||
--                        !arm_dc_feature(s, ARM_FEATURE_VFP_FP16)) {
--                        return 1;
-+                    if (dp) {
-+                        if (!dc_isar_feature(aa32_fp16_dpconv, s)) {
-+                            return 1;
-+                        }
-+                    } else {
-+                        if (!dc_isar_feature(aa32_fp16_spconv, s)) {
-+                            return 1;
-+                        }
-                     }
-                     rm_is_dp = false;
-                     break;
-                 case 0x06: /* vcvtb.f16.f32, vcvtb.f16.f64 */
-                 case 0x07: /* vcvtt.f16.f32, vcvtt.f16.f64 */
--                    if ((dp && !arm_dc_feature(s, ARM_FEATURE_V8)) ||
--                        !arm_dc_feature(s, ARM_FEATURE_VFP_FP16)) {
--                        return 1;
-+                    if (dp) {
-+                        if (!dc_isar_feature(aa32_fp16_dpconv, s)) {
-+                            return 1;
-+                        }
-+                    } else {
-+                        if (!dc_isar_feature(aa32_fp16_spconv, s)) {
-+                            return 1;
-+                        }
-                     }
-                     rd_is_dp = false;
-                     break;
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-                     TCGv_ptr fpst;
-                     TCGv_i32 ahp;
--                    if (!arm_dc_feature(s, ARM_FEATURE_VFP_FP16) ||
-+                    if (!dc_isar_feature(aa32_fp16_spconv, s) ||
-                         q || (rm & 1)) {
-                         return 1;
-                     }
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-                 {
-                     TCGv_ptr fpst;
-                     TCGv_i32 ahp;
--                    if (!arm_dc_feature(s, ARM_FEATURE_VFP_FP16) ||
-+                    if (!dc_isar_feature(aa32_fp16_spconv, s) ||
-                         q || (rd & 1)) {
-                         return 1;
-                     }
 --
 2.20.1
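
The XN decode that this patch adds to get_S2prot() can be read as a small truth table. The following is a minimal stand-alone sketch of that table, not QEMU code: the function names are hypothetical, and the legacy path deliberately omits the extra `arm_el_is_aa64(env, 2) || prot & PAGE_READ` condition that the real code keeps.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of the stage 2 XN[1:0] decode with TTS2UXN.
 * 'xn' is the 2-bit field taken from descriptor bits [54:53];
 * 's1_is_el0' says whether the stage 1 access was made from EL0.
 */
static bool s2_exec_allowed(unsigned xn, bool s1_is_el0)
{
    switch (xn) {
    case 0:
        return true;            /* executable from EL0 and EL1 */
    case 1:
        return s1_is_el0;       /* executable from EL0 only */
    case 2:
        return false;           /* never executable */
    case 3:
        return !s1_is_el0;      /* executable from EL1 only */
    default:
        assert(0 && "xn must be a 2-bit value");
        return false;
    }
}

/*
 * Without the feature only XN[1] (descriptor bit 54) is looked at.
 * Simplification: the real code also checks AArch64 EL2 / PAGE_READ.
 */
static bool s2_exec_allowed_legacy(unsigned xn)
{
    return ((xn >> 1) & 1) == 0;
}
```

Under these assumptions, xn == 1 gives an EL0 access exec permission but denies it to an EL1 access, while the legacy decode treats xn == 1 like xn == 0 because bit 53 is ignored.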

-[Qemu-devel] [PULL 15/16] target/arm: Enable ARMv8.2-FHM for -cpu max
+[PULL 07/39] target/arm: Use correct variable for setting 'max' cpu's ID_AA64DFR0
-From: Richard Henderson <richard.henderson@linaro.org>
+In aarch64_max_initfn() we update both 32-bit and 64-bit ID
 registers.  The intended pattern is that for 64-bit ID registers we
 use FIELD_DP64 and the uint64_t 't' register, while 32-bit ID
 registers use FIELD_DP32 and the uint32_t 'u' register.  For
 ID_AA64DFR0 we accidentally used 'u', meaning that the top 32 bits of
 this 64-bit ID register would end up always zero.  Luckily at the
 moment that's what they should be anyway, so this bug has no visible
 effects.
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Use the right-sized variable.
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190219222952.22183-5-richard.henderson@linaro.org
+Fixes: 3bec78447a958d481991
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com>
+Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
+Message-id: 20200423110915.10527-1-peter.maydell@linaro.org
 ---
- target/arm/cpu.c   | 1 +
+ target/arm/cpu64.c | 6 +++---
- target/arm/cpu64.c | 2 ++
+ 1 file changed, 3 insertions(+), 3 deletions(-)
 files changed, 3 insertions(+)
-diff --git a/target/arm/cpu.c b/target/arm/cpu.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.c
-+++ b/target/arm/cpu.c
-@@ -XXX,XX +XXX,XX @@ static void arm_max_initfn(Object *obj)
-             t = cpu->isar.id_isar6;
-             t = FIELD_DP32(t, ID_ISAR6, JSCVT, 1);
-             t = FIELD_DP32(t, ID_ISAR6, DP, 1);
-+            t = FIELD_DP32(t, ID_ISAR6, FHM, 1);
-             cpu->isar.id_isar6 = t;
-             t = cpu->id_mmfr4;
 diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu64.c
 +++ b/target/arm/cpu64.c
 @@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
-         t = FIELD_DP64(t, ID_AA64ISAR0, SM3, 1);
+         u = FIELD_DP32(u, ID_MMFR4, XNX, 1); /* TTS2UXN */
-         t = FIELD_DP64(t, ID_AA64ISAR0, SM4, 1);
+         cpu->isar.id_mmfr4 = u;
-         t = FIELD_DP64(t, ID_AA64ISAR0, DP, 1);
-+        t = FIELD_DP64(t, ID_AA64ISAR0, FHM, 1);
+-        u = cpu->isar.id_aa64dfr0;
-         cpu->isar.id_aa64isar0 = t;
+-        u = FIELD_DP64(u, ID_AA64DFR0, PMUVER, 5); /* v8.4-PMU */
+-        cpu->isar.id_aa64dfr0 = u;
-         t = cpu->isar.id_aa64isar1;
++        t = cpu->isar.id_aa64dfr0;
-@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
++        t = FIELD_DP64(t, ID_AA64DFR0, PMUVER, 5); /* v8.4-PMU */
-         u = cpu->isar.id_isar6;
++        cpu->isar.id_aa64dfr0 = t;
-         u = FIELD_DP32(u, ID_ISAR6, JSCVT, 1);
-         u = FIELD_DP32(u, ID_ISAR6, DP, 1);
+         u = cpu->isar.id_dfr0;
-+        u = FIELD_DP32(u, ID_ISAR6, FHM, 1);
+         u = FIELD_DP32(u, ID_DFR0, PERFMON, 5); /* v8.4-PMU */
          cpu->isar.id_isar6 = u;
          /*
 --
 2.20.1
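
To see why the wrong-sized temporary matters, here is a minimal stand-alone sketch of the bug class this patch fixes. It is not QEMU code: `deposit64_sketch()` is a hypothetical simplified stand-in for FIELD_DP64, and the field position used in the usage note (bit 8, length 4) is an assumption for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: write 'val' into bits [pos, pos+len) of 'reg'. */
static uint64_t deposit64_sketch(uint64_t reg, unsigned pos, unsigned len,
                                 uint64_t val)
{
    assert(len > 0 && len < 64 && pos + len <= 64);
    uint64_t mask = ((UINT64_C(1) << len) - 1) << pos;
    return (reg & ~mask) | ((val << pos) & mask);
}

/* Buggy pattern: a 64-bit register round-trips through a uint32_t. */
static uint64_t update_via_u32(uint64_t reg, unsigned pos, unsigned len,
                               uint64_t val)
{
    uint32_t u = (uint32_t)reg;     /* top 32 bits are dropped here */
    u = (uint32_t)deposit64_sketch(u, pos, len, val);
    return u;                       /* zero-extended on the way back */
}

/* Fixed pattern: the temporary is as wide as the register. */
static uint64_t update_via_u64(uint64_t reg, unsigned pos, unsigned len,
                               uint64_t val)
{
    uint64_t t = reg;
    t = deposit64_sketch(t, pos, len, val);
    return t;
}
```

With a register value that has bit 32 set, `update_via_u32(reg, 8, 4, 5)` loses that bit while `update_via_u64(reg, 8, 4, 5)` preserves it. As the commit message notes, the real register currently has nothing in its top half, so the bug had no visible effect.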

-[Qemu-devel] [PULL 13/16] target/arm: Implement FMLAL and FMLSL for aarch64
+[PULL 08/39] target/arm: Use uint64_t for midr field in CPU state struct
-From: Richard Henderson <richard.henderson@linaro.org>
+From: Philippe Mathieu-Daudé <f4bug@amsat.org>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+MIDR_EL1 is a 64-bit system register with the top 32 bits being RES0.
-Message-id: 20190219222952.22183-3-richard.henderson@linaro.org
+Represent it in QEMU's ARMCPU struct with a uint64_t, not a
 uint32_t.
 This fixes an error when compiling with -Werror=conversion
 because we were manipulating the register value using a
 local uint64_t variable:
   target/arm/cpu64.c: In function ‘aarch64_max_initfn’:
   target/arm/cpu64.c:628:21: error: conversion from ‘uint64_t’ {aka ‘long unsigned int’} to ‘uint32_t’ {aka ‘unsigned int’} may change value [-Werror=conversion]
    628 |         cpu->midr = t;
         |                     ^
 and future-proofs us against a possible future architecture
 change using some of the top 32 bits.
 Suggested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
 Suggested-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
 Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com>
 Message-id: 20200428172634.29707-1-f4bug@amsat.org
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/cpu.h           |  5 ++++
+ target/arm/cpu.h | 2 +-
- target/arm/translate-a64.c | 49 +++++++++++++++++++++++++++++++++++++-
+ target/arm/cpu.c | 2 +-
- 2 files changed, 53 insertions(+), 1 deletion(-)
+ 2 files changed, 2 insertions(+), 2 deletions(-)
 diff --git a/target/arm/cpu.h b/target/arm/cpu.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu.h
 +++ b/target/arm/cpu.h
-@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa64_dp(const ARMISARegisters *id)
+@@ -XXX,XX +XXX,XX @@ struct ARMCPU {
-     return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, DP) != 0;
+         uint64_t id_aa64dfr0;
- }
+         uint64_t id_aa64dfr1;
+     } isar;
-+static inline bool isar_feature_aa64_fhm(const ARMISARegisters *id)
+-    uint32_t midr;
-+{
++    uint64_t midr;
-+    return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, FHM) != 0;
+     uint32_t revidr;
-+}
+     uint32_t reset_fpsid;
-+
+     uint32_t ctr;
- static inline bool isar_feature_aa64_jscvt(const ARMISARegisters *id)
+diff --git a/target/arm/cpu.c b/target/arm/cpu.c
  {
      return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, JSCVT) != 0;
 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-a64.c
+--- a/target/arm/cpu.c
-+++ b/target/arm/translate-a64.c
++++ b/target/arm/cpu.c
-@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static const ARMCPUInfo arm_cpus[] = {
-         if (!fp_access_check(s)) {
+ static Property arm_cpu_properties[] = {
-             return;
+     DEFINE_PROP_BOOL("start-powered-off", ARMCPU, start_powered_off, false),
-         }
+     DEFINE_PROP_UINT32("psci-conduit", ARMCPU, psci_conduit, 0),
--
+-    DEFINE_PROP_UINT32("midr", ARMCPU, midr, 0),
-         handle_3same_float(s, size, elements, fpopcode, rd, rn, rm);
++    DEFINE_PROP_UINT64("midr", ARMCPU, midr, 0),
-         return;
+     DEFINE_PROP_UINT64("mp-affinity", ARMCPU,
-+
+                         mp_affinity, ARM64_AFFINITY_INVALID),
-+    case 0x1d: /* FMLAL  */
+     DEFINE_PROP_INT32("node-id", ARMCPU, node_id, CPU_UNSET_NUMA_NODE_ID),
 +    case 0x3d: /* FMLSL  */
 +    case 0x59: /* FMLAL2 */
 +    case 0x79: /* FMLSL2 */
 +        if (size & 1 || !dc_isar_feature(aa64_fhm, s)) {
 +            unallocated_encoding(s);
 +            return;
 +        }
 +        if (fp_access_check(s)) {
 +            int is_s = extract32(insn, 23, 1);
 +            int is_2 = extract32(insn, 29, 1);
 +            int data = (is_2 << 1) | is_s;
 +            tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, rd),
 +                               vec_full_reg_offset(s, rn),
 +                               vec_full_reg_offset(s, rm), cpu_env,
 +                               is_q ? 16 : 8, vec_full_reg_size(s),
 +                               data, gen_helper_gvec_fmlal_a64);
 +        }
 +        return;
 +
      default:
          unallocated_encoding(s);
          return;
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
          }
          is_fp = 2;
          break;
 +    case 0x00: /* FMLAL */
 +    case 0x04: /* FMLSL */
 +    case 0x18: /* FMLAL2 */
 +    case 0x1c: /* FMLSL2 */
 +        if (is_scalar || size != MO_32 || !dc_isar_feature(aa64_fhm, s)) {
 +            unallocated_encoding(s);
 +            return;
 +        }
 +        size = MO_16;
 +        /* is_fp, but we pass cpu_env not fp_status.  */
 +        break;
      default:
          unallocated_encoding(s);
          return;
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
              tcg_temp_free_ptr(fpst);
          }
          return;
 +
 +    case 0x00: /* FMLAL */
 +    case 0x04: /* FMLSL */
 +    case 0x18: /* FMLAL2 */
 +    case 0x1c: /* FMLSL2 */
 +        {
 +            int is_s = extract32(opcode, 2, 1);
 +            int is_2 = u;
 +            int data = (index << 2) | (is_2 << 1) | is_s;
 +            tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, rd),
 +                               vec_full_reg_offset(s, rn),
 +                               vec_full_reg_offset(s, rm), cpu_env,
 +                               is_q ? 16 : 8, vec_full_reg_size(s),
 +                               data, gen_helper_gvec_fmlal_idx_a64);
 +        }
 +        return;
      }
      if (size == 3) {
 --
 2.20.1

-New patch
+[PULL 09/39] hw/arm: versal: Remove inclusion of arm_gicv3_common.h
+From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
+Remove inclusion of arm_gicv3_common.h; this already gets
+included via xlnx-versal.h.
+Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
+Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
+Reviewed-by: Luc Michel <luc.michel@greensocs.com>
+Message-id: 20200427181649.26851-2-edgar.iglesias@gmail.com
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+---
+ hw/arm/xlnx-versal.c | 1 -
+ 1 file changed, 1 deletion(-)
+diff --git a/hw/arm/xlnx-versal.c b/hw/arm/xlnx-versal.c
+index XXXXXXX..XXXXXXX 100644
+--- a/hw/arm/xlnx-versal.c
++++ b/hw/arm/xlnx-versal.c
+@@ -XXX,XX +XXX,XX @@
+ #include "hw/arm/boot.h"
+ #include "kvm_arm.h"
+ #include "hw/misc/unimp.h"
+-#include "hw/intc/arm_gicv3_common.h"
+ #include "hw/arm/xlnx-versal.h"
+ #include "hw/char/pl011.h"
+--
+2.20.1

-New patch
+[PULL 10/39] hw/arm: versal: Move misplaced comment
+From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
+Move misplaced comment.
+Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
+Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
+Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
+Reviewed-by: Luc Michel <luc.michel@greensocs.com>
+Message-id: 20200427181649.26851-3-edgar.iglesias@gmail.com
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+---
+ hw/arm/xlnx-versal.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+diff --git a/hw/arm/xlnx-versal.c b/hw/arm/xlnx-versal.c
+index XXXXXXX..XXXXXXX 100644
+--- a/hw/arm/xlnx-versal.c
++++ b/hw/arm/xlnx-versal.c
+@@ -XXX,XX +XXX,XX @@ static void versal_create_apu_cpus(Versal *s)
+         obj = object_new(XLNX_VERSAL_ACPU_TYPE);
+         if (!obj) {
+-            /* Secondary CPUs start in PSCI powered-down state */
+             error_report("Unable to create apu.cpu[%d] of type %s",
+                          i, XLNX_VERSAL_ACPU_TYPE);
+             exit(EXIT_FAILURE);
+@@ -XXX,XX +XXX,XX @@ static void versal_create_apu_cpus(Versal *s)
+         object_property_set_int(obj, s->cfg.psci_conduit,
+                                 "psci-conduit", &error_abort);
+         if (i) {
++            /* Secondary CPUs start in PSCI powered-down state */
+             object_property_set_bool(obj, true,
+                                      "start-powered-off", &error_abort);
+         }
+--
+2.20.1

-New patch
+[PULL 11/39] hw/arm: versal-virt: Fix typo xlnx-ve -> xlnx-versal
+From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
+Fix typo xlnx-ve -> xlnx-versal.
+Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
+Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
+Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
+Reviewed-by: Luc Michel <luc.michel@greensocs.com>
+Message-id: 20200427181649.26851-4-edgar.iglesias@gmail.com
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+---
+ hw/arm/xlnx-versal-virt.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+diff --git a/hw/arm/xlnx-versal-virt.c b/hw/arm/xlnx-versal-virt.c
+index XXXXXXX..XXXXXXX 100644
+--- a/hw/arm/xlnx-versal-virt.c
++++ b/hw/arm/xlnx-versal-virt.c
+@@ -XXX,XX +XXX,XX @@ static void versal_virt_init(MachineState *machine)
+         psci_conduit = QEMU_PSCI_CONDUIT_SMC;
+     }
+-    sysbus_init_child_obj(OBJECT(machine), "xlnx-ve", &s->soc,
++    sysbus_init_child_obj(OBJECT(machine), "xlnx-versal", &s->soc,
+                           sizeof(s->soc), TYPE_XLNX_VERSAL);
+     object_property_set_link(OBJECT(&s->soc), OBJECT(machine->ram),
+                              "ddr", &error_abort);
+--
+2.20.1

-New patch
+[PULL 12/39] hw/arm: versal: Embed the UARTs into the SoC type
+From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
+Embed the UARTs into the SoC type.
+Suggested-by: Peter Maydell <peter.maydell@linaro.org>
+Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
+Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
+Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
+Reviewed-by: Luc Michel <luc.michel@greensocs.com>
+Message-id: 20200427181649.26851-5-edgar.iglesias@gmail.com
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+---
+ include/hw/arm/xlnx-versal.h |  3 ++-
+ hw/arm/xlnx-versal.c         | 12 ++++++------
+ 2 files changed, 8 insertions(+), 7 deletions(-)
+diff --git a/include/hw/arm/xlnx-versal.h b/include/hw/arm/xlnx-versal.h
+index XXXXXXX..XXXXXXX 100644
+--- a/include/hw/arm/xlnx-versal.h
++++ b/include/hw/arm/xlnx-versal.h
+@@ -XXX,XX +XXX,XX @@
+ #include "hw/sysbus.h"
+ #include "hw/arm/boot.h"
+ #include "hw/intc/arm_gicv3.h"
++#include "hw/char/pl011.h"
+ #define TYPE_XLNX_VERSAL "xlnx-versal"
+ #define XLNX_VERSAL(obj) OBJECT_CHECK(Versal, (obj), TYPE_XLNX_VERSAL)
+@@ -XXX,XX +XXX,XX @@ typedef struct Versal {
+         MemoryRegion mr_ocm;
+         struct {
+-            SysBusDevice *uart[XLNX_VERSAL_NR_UARTS];
++            PL011State uart[XLNX_VERSAL_NR_UARTS];
+             SysBusDevice *gem[XLNX_VERSAL_NR_GEMS];
+             SysBusDevice *adma[XLNX_VERSAL_NR_ADMAS];
+         } iou;
+diff --git a/hw/arm/xlnx-versal.c b/hw/arm/xlnx-versal.c
+index XXXXXXX..XXXXXXX 100644
+--- a/hw/arm/xlnx-versal.c
++++ b/hw/arm/xlnx-versal.c
+@@ -XXX,XX +XXX,XX @@
+ #include "kvm_arm.h"
+ #include "hw/misc/unimp.h"
+ #include "hw/arm/xlnx-versal.h"
+-#include "hw/char/pl011.h"
+ #define XLNX_VERSAL_ACPU_TYPE ARM_CPU_TYPE_NAME("cortex-a72")
+ #define GEM_REVISION        0x40070106
+@@ -XXX,XX +XXX,XX @@ static void versal_create_uarts(Versal *s, qemu_irq *pic)
+         DeviceState *dev;
+         MemoryRegion *mr;
+-        dev = qdev_create(NULL, TYPE_PL011);
+-        s->lpd.iou.uart[i] = SYS_BUS_DEVICE(dev);
++        sysbus_init_child_obj(OBJECT(s), name,
++                              &s->lpd.iou.uart[i], sizeof(s->lpd.iou.uart[i]),
++                              TYPE_PL011);
++        dev = DEVICE(&s->lpd.iou.uart[i]);
+         qdev_prop_set_chr(dev, "chardev", serial_hd(i));
+-        object_property_add_child(OBJECT(s), name, OBJECT(dev), &error_fatal);
+         qdev_init_nofail(dev);
+-        mr = sysbus_mmio_get_region(s->lpd.iou.uart[i], 0);
++        mr = sysbus_mmio_get_region(SYS_BUS_DEVICE(dev), 0);
+         memory_region_add_subregion(&s->mr_ps, addrs[i], mr);
+-        sysbus_connect_irq(s->lpd.iou.uart[i], 0, pic[irqs[i]]);
++        sysbus_connect_irq(SYS_BUS_DEVICE(dev), 0, pic[irqs[i]]);
+         g_free(name);
+     }
+ }
+--
+2.20.1
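
The "embed into the SoC type" change swaps an array of pointers to separately allocated devices for an array of device structs stored inside the SoC struct and initialised in place. The following is a plain C sketch of that idea only. It does not use QOM or any QEMU API; every type and function name is hypothetical, and the IRQ numbers are arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define NR_UARTS_SKETCH 2

/* Hypothetical child device state. */
typedef struct UartSketch {
    int irq;
    char name[16];
} UartSketch;

/* The parent owns the storage for its children (embedded, not pointers). */
typedef struct SocSketch {
    UartSketch uart[NR_UARTS_SKETCH];
} SocSketch;

/*
 * Initialise a child in caller-provided storage. Passing the size lets
 * the callee check that the storage matches the type it expects, loosely
 * mirroring the size argument of sysbus_init_child_obj() in the patch.
 */
static void uart_init_in_place(UartSketch *u, size_t size,
                               const char *name, int irq)
{
    assert(size == sizeof(*u));
    memset(u, 0, size);
    strncpy(u->name, name, sizeof(u->name) - 1);
    u->irq = irq;
}

static void soc_create_uarts(SocSketch *s)
{
    static const int irqs[NR_UARTS_SKETCH] = { 18, 19 }; /* arbitrary */

    for (int i = 0; i < NR_UARTS_SKETCH; i++) {
        char name[16];

        snprintf(name, sizeof(name), "uart%d", i);
        uart_init_in_place(&s->uart[i], sizeof(s->uart[i]), name, irqs[i]);
    }
}
```

One consequence of this layout is that the children share the parent's lifetime and need no separate allocation, and the parent's header has to include each child's type definition. That is consistent with the pl011.h include moving from xlnx-versal.c to xlnx-versal.h in this patch.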

-New patch
+[PULL 13/39] hw/arm: versal: Embed the GEMs into the SoC type
+From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
+Embed the GEMs into the SoC type.
+Suggested-by: Peter Maydell <peter.maydell@linaro.org>
+Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
+Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
+Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
+Reviewed-by: Luc Michel <luc.michel@greensocs.com>
+Message-id: 20200427181649.26851-6-edgar.iglesias@gmail.com
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+---
+ include/hw/arm/xlnx-versal.h |  3 ++-
+ hw/arm/xlnx-versal.c         | 15 ++++++++-------
+ 2 files changed, 10 insertions(+), 8 deletions(-)
+diff --git a/include/hw/arm/xlnx-versal.h b/include/hw/arm/xlnx-versal.h
+index XXXXXXX..XXXXXXX 100644
+--- a/include/hw/arm/xlnx-versal.h
++++ b/include/hw/arm/xlnx-versal.h
+@@ -XXX,XX +XXX,XX @@
+ #include "hw/arm/boot.h"
+ #include "hw/intc/arm_gicv3.h"
+ #include "hw/char/pl011.h"
++#include "hw/net/cadence_gem.h"
+ #define TYPE_XLNX_VERSAL "xlnx-versal"
+ #define XLNX_VERSAL(obj) OBJECT_CHECK(Versal, (obj), TYPE_XLNX_VERSAL)
+@@ -XXX,XX +XXX,XX @@ typedef struct Versal {
+         struct {
+             PL011State uart[XLNX_VERSAL_NR_UARTS];
+-            SysBusDevice *gem[XLNX_VERSAL_NR_GEMS];
++            CadenceGEMState gem[XLNX_VERSAL_NR_GEMS];
+             SysBusDevice *adma[XLNX_VERSAL_NR_ADMAS];
+         } iou;
+     } lpd;
+diff --git a/hw/arm/xlnx-versal.c b/hw/arm/xlnx-versal.c
+index XXXXXXX..XXXXXXX 100644
+--- a/hw/arm/xlnx-versal.c
++++ b/hw/arm/xlnx-versal.c
+@@ -XXX,XX +XXX,XX @@ static void versal_create_gems(Versal *s, qemu_irq *pic)
+         DeviceState *dev;
+         MemoryRegion *mr;
+-        dev = qdev_create(NULL, "cadence_gem");
+-        s->lpd.iou.gem[i] = SYS_BUS_DEVICE(dev);
+-        object_property_add_child(OBJECT(s), name, OBJECT(dev), &error_fatal);
++        sysbus_init_child_obj(OBJECT(s), name,
++                              &s->lpd.iou.gem[i], sizeof(s->lpd.iou.gem[i]),
++                              TYPE_CADENCE_GEM);
++        dev = DEVICE(&s->lpd.iou.gem[i]);
+         if (nd->used) {
+             qemu_check_nic_model(nd, "cadence_gem");
+             qdev_set_nic_properties(dev, nd);
+         }
+-        object_property_set_int(OBJECT(s->lpd.iou.gem[i]),
++        object_property_set_int(OBJECT(dev),
+, "num-priority-queues",
+                                 &error_abort);
+-        object_property_set_link(OBJECT(s->lpd.iou.gem[i]),
++        object_property_set_link(OBJECT(dev),
+                                  OBJECT(&s->mr_ps), "dma",
+                                  &error_abort);
+         qdev_init_nofail(dev);
+-        mr = sysbus_mmio_get_region(s->lpd.iou.gem[i], 0);
++        mr = sysbus_mmio_get_region(SYS_BUS_DEVICE(dev), 0);
+         memory_region_add_subregion(&s->mr_ps, addrs[i], mr);
+-        sysbus_connect_irq(s->lpd.iou.gem[i], 0, pic[irqs[i]]);
++        sysbus_connect_irq(SYS_BUS_DEVICE(dev), 0, pic[irqs[i]]);
+         g_free(name);
+     }
+ }
+--
+.20.1

-New patch
+[PULL 14/39] hw/arm: versal: Embed the ADMAs into the SoC type
+From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
+Embed the ADMAs into the SoC type.
+Suggested-by: Peter Maydell <peter.maydell@linaro.org>
+Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
+Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
+Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
+Reviewed-by: Luc Michel <luc.michel@greensocs.com>
+Message-id: 20200427181649.26851-7-edgar.iglesias@gmail.com
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+---
+ include/hw/arm/xlnx-versal.h |  3 ++-
+ hw/arm/xlnx-versal.c         | 14 +++++++-------
+files changed, 9 insertions(+), 8 deletions(-)
+diff --git a/include/hw/arm/xlnx-versal.h b/include/hw/arm/xlnx-versal.h
+index XXXXXXX..XXXXXXX 100644
+--- a/include/hw/arm/xlnx-versal.h
++++ b/include/hw/arm/xlnx-versal.h
+@@ -XXX,XX +XXX,XX @@
+ #include "hw/arm/boot.h"
+ #include "hw/intc/arm_gicv3.h"
+ #include "hw/char/pl011.h"
++#include "hw/dma/xlnx-zdma.h"
+ #include "hw/net/cadence_gem.h"
+ #define TYPE_XLNX_VERSAL "xlnx-versal"
+@@ -XXX,XX +XXX,XX @@ typedef struct Versal {
+         struct {
+             PL011State uart[XLNX_VERSAL_NR_UARTS];
+             CadenceGEMState gem[XLNX_VERSAL_NR_GEMS];
+-            SysBusDevice *adma[XLNX_VERSAL_NR_ADMAS];
++            XlnxZDMA adma[XLNX_VERSAL_NR_ADMAS];
+         } iou;
+     } lpd;
+diff --git a/hw/arm/xlnx-versal.c b/hw/arm/xlnx-versal.c
+index XXXXXXX..XXXXXXX 100644
+--- a/hw/arm/xlnx-versal.c
++++ b/hw/arm/xlnx-versal.c
+@@ -XXX,XX +XXX,XX @@ static void versal_create_admas(Versal *s, qemu_irq *pic)
+         DeviceState *dev;
+         MemoryRegion *mr;
+-        dev = qdev_create(NULL, "xlnx.zdma");
+-        s->lpd.iou.adma[i] = SYS_BUS_DEVICE(dev);
+-        object_property_set_int(OBJECT(s->lpd.iou.adma[i]), 128, "bus-width",
+-                                &error_abort);
+-        object_property_add_child(OBJECT(s), name, OBJECT(dev), &error_fatal);
++        sysbus_init_child_obj(OBJECT(s), name,
++                              &s->lpd.iou.adma[i], sizeof(s->lpd.iou.adma[i]),
++                              TYPE_XLNX_ZDMA);
++        dev = DEVICE(&s->lpd.iou.adma[i]);
++        object_property_set_int(OBJECT(dev), 128, "bus-width", &error_abort);
+         qdev_init_nofail(dev);
+-        mr = sysbus_mmio_get_region(s->lpd.iou.adma[i], 0);
++        mr = sysbus_mmio_get_region(SYS_BUS_DEVICE(dev), 0);
+         memory_region_add_subregion(&s->mr_ps,
+                                     MM_ADMA_CH0 + i * MM_ADMA_CH0_SIZE, mr);
+-        sysbus_connect_irq(s->lpd.iou.adma[i], 0, pic[VERSAL_ADMA_IRQ_0 + i]);
++        sysbus_connect_irq(SYS_BUS_DEVICE(dev), 0, pic[VERSAL_ADMA_IRQ_0 + i]);
+         g_free(name);
+     }
+ }
+--
+.20.1
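Patches 12 to 15 all apply the same change: a child device that used to be heap-allocated and reached through a pointer is now stored by value in the SoC struct and initialised in place. Below is a minimal standalone sketch of that idea. It does not use the QEMU object model; `DemoChild`, `DemoSoC`, `demo_init_child` and `demo_create_admas` are hypothetical names chosen for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a device state struct (not a QEMU type). */
typedef struct DemoChild {
    const char *type_name;
    int bus_width;
} DemoChild;

#define DEMO_NR_CHILDREN 8

/* The parent owns the children's storage: an array of structs, not pointers. */
typedef struct DemoSoC {
    DemoChild adma[DEMO_NR_CHILDREN];
} DemoSoC;

/*
 * Initialise a child in caller-provided memory. Passing the size lets the
 * initialiser check that the storage matches the type it is about to set up.
 */
static void demo_init_child(void *mem, size_t size, const char *type_name)
{
    DemoChild *c = mem;

    assert(size == sizeof(*c));
    memset(c, 0, size);
    c->type_name = type_name;
}

/* Initialise every embedded child and return how many were set up. */
static size_t demo_create_admas(DemoSoC *s)
{
    size_t i;

    for (i = 0; i < DEMO_NR_CHILDREN; i++) {
        demo_init_child(&s->adma[i], sizeof(s->adma[i]), "demo-dma");
        s->adma[i].bus_width = 128;
    }
    return i;
}
```

Because the children live inside the parent, they need no separate allocation or free, and code that previously dereferenced a stored pointer takes the address of the array element instead.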

-New patch
+[PULL 15/39] hw/arm: versal: Embed the APUs into the SoC type
+From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
+Embed the APUs into the SoC type.
+Suggested-by: Peter Maydell <peter.maydell@linaro.org>
+Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
+Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
+Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
+Reviewed-by: Luc Michel <luc.michel@greensocs.com>
+Message-id: 20200427181649.26851-8-edgar.iglesias@gmail.com
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+---
+ include/hw/arm/xlnx-versal.h |  2 +-
+ hw/arm/xlnx-versal-virt.c    |  4 ++--
+ hw/arm/xlnx-versal.c         | 19 +++++--------------
+files changed, 8 insertions(+), 17 deletions(-)
+diff --git a/include/hw/arm/xlnx-versal.h b/include/hw/arm/xlnx-versal.h
+index XXXXXXX..XXXXXXX 100644
+--- a/include/hw/arm/xlnx-versal.h
++++ b/include/hw/arm/xlnx-versal.h
+@@ -XXX,XX +XXX,XX @@ typedef struct Versal {
+     struct {
+         struct {
+             MemoryRegion mr;
+-            ARMCPU *cpu[XLNX_VERSAL_NR_ACPUS];
++            ARMCPU cpu[XLNX_VERSAL_NR_ACPUS];
+             GICv3State gic;
+         } apu;
+     } fpd;
+diff --git a/hw/arm/xlnx-versal-virt.c b/hw/arm/xlnx-versal-virt.c
+index XXXXXXX..XXXXXXX 100644
+--- a/hw/arm/xlnx-versal-virt.c
++++ b/hw/arm/xlnx-versal-virt.c
+@@ -XXX,XX +XXX,XX @@ static void versal_virt_init(MachineState *machine)
+     s->binfo.get_dtb = versal_virt_get_dtb;
+     s->binfo.modify_dtb = versal_virt_modify_dtb;
+     if (machine->kernel_filename) {
+-        arm_load_kernel(s->soc.fpd.apu.cpu[0], machine, &s->binfo);
++        arm_load_kernel(&s->soc.fpd.apu.cpu[0], machine, &s->binfo);
+     } else {
+-        AddressSpace *as = arm_boot_address_space(s->soc.fpd.apu.cpu[0],
++        AddressSpace *as = arm_boot_address_space(&s->soc.fpd.apu.cpu[0],
+                                                   &s->binfo);
+         /* Some boot-loaders (e.g u-boot) don't like blobs at address 0 (NULL).
+          * Offset things by 4K.  */
+diff --git a/hw/arm/xlnx-versal.c b/hw/arm/xlnx-versal.c
+index XXXXXXX..XXXXXXX 100644
+--- a/hw/arm/xlnx-versal.c
++++ b/hw/arm/xlnx-versal.c
+@@ -XXX,XX +XXX,XX @@ static void versal_create_apu_cpus(Versal *s)
+     for (i = 0; i < ARRAY_SIZE(s->fpd.apu.cpu); i++) {
+         Object *obj;
+-        char *name;
+-
+-        obj = object_new(XLNX_VERSAL_ACPU_TYPE);
+-        if (!obj) {
+-            error_report("Unable to create apu.cpu[%d] of type %s",
+-                         i, XLNX_VERSAL_ACPU_TYPE);
+-            exit(EXIT_FAILURE);
+-        }
+-
+-        name = g_strdup_printf("apu-cpu[%d]", i);
+-        object_property_add_child(OBJECT(s), name, obj, &error_fatal);
+-        g_free(name);
++        object_initialize_child(OBJECT(s), "apu-cpu[*]",
++                                &s->fpd.apu.cpu[i], sizeof(s->fpd.apu.cpu[i]),
++                                XLNX_VERSAL_ACPU_TYPE, &error_abort, NULL);
++        obj = OBJECT(&s->fpd.apu.cpu[i]);
+         object_property_set_int(obj, s->cfg.psci_conduit,
+                                 "psci-conduit", &error_abort);
+         if (i) {
+@@ -XXX,XX +XXX,XX @@ static void versal_create_apu_cpus(Versal *s)
+         object_property_set_link(obj, OBJECT(&s->fpd.apu.mr), "memory",
+                                  &error_abort);
+         object_property_set_bool(obj, true, "realized", &error_fatal);
+-        s->fpd.apu.cpu[i] = ARM_CPU(obj);
+     }
+ }
+@@ -XXX,XX +XXX,XX @@ static void versal_create_apu_gic(Versal *s, qemu_irq *pic)
+     }
+     for (i = 0; i < nr_apu_cpus; i++) {
+-        DeviceState *cpudev = DEVICE(s->fpd.apu.cpu[i]);
++        DeviceState *cpudev = DEVICE(&s->fpd.apu.cpu[i]);
+         int ppibase = XLNX_VERSAL_NR_IRQS + i * GIC_INTERNAL + GIC_NR_SGIS;
+         qemu_irq maint_irq;
+         int ti;
+--
+.20.1

-New patch
+[PULL 16/39] hw/arm: versal: Add support for SD
+From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
+Add support for SD.
+Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
+Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
+Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
+Reviewed-by: Luc Michel <luc.michel@greensocs.com>
+Message-id: 20200427181649.26851-9-edgar.iglesias@gmail.com
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+---
+ include/hw/arm/xlnx-versal.h | 12 ++++++++++++
+ hw/arm/xlnx-versal.c         | 31 +++++++++++++++++++++++++++++++
+files changed, 43 insertions(+)
+diff --git a/include/hw/arm/xlnx-versal.h b/include/hw/arm/xlnx-versal.h
+index XXXXXXX..XXXXXXX 100644
+--- a/include/hw/arm/xlnx-versal.h
++++ b/include/hw/arm/xlnx-versal.h
+@@ -XXX,XX +XXX,XX @@
+ #include "hw/sysbus.h"
+ #include "hw/arm/boot.h"
++#include "hw/sd/sdhci.h"
+ #include "hw/intc/arm_gicv3.h"
+ #include "hw/char/pl011.h"
+ #include "hw/dma/xlnx-zdma.h"
+@@ -XXX,XX +XXX,XX @@
+ #define XLNX_VERSAL_NR_UARTS   2
+ #define XLNX_VERSAL_NR_GEMS    2
+ #define XLNX_VERSAL_NR_ADMAS   8
++#define XLNX_VERSAL_NR_SDS     2
+ #define XLNX_VERSAL_NR_IRQS    192
+ typedef struct Versal {
+@@ -XXX,XX +XXX,XX @@ typedef struct Versal {
+         } iou;
+     } lpd;
++    /* The Platform Management Controller subsystem.  */
++    struct {
++        struct {
++            SDHCIState sd[XLNX_VERSAL_NR_SDS];
++        } iou;
++    } pmc;
++
+     struct {
+         MemoryRegion *mr_ddr;
+         uint32_t psci_conduit;
+@@ -XXX,XX +XXX,XX @@ typedef struct Versal {
+ #define VERSAL_GEM1_IRQ_0          58
+ #define VERSAL_GEM1_WAKE_IRQ_0     59
+ #define VERSAL_ADMA_IRQ_0          60
++#define VERSAL_SD0_IRQ_0           126
+ /* Architecturally reserved IRQs suitable for virtualization.  */
+ #define VERSAL_RSVD_IRQ_FIRST 111
+@@ -XXX,XX +XXX,XX @@ typedef struct Versal {
+ #define MM_FPD_CRF                  0xfd1a0000U
+ #define MM_FPD_CRF_SIZE             0x140000
++#define MM_PMC_SD0                  0xf1040000U
++#define MM_PMC_SD0_SIZE             0x10000
+ #define MM_PMC_CRP                  0xf1260000U
+ #define MM_PMC_CRP_SIZE             0x10000
+ #endif
+diff --git a/hw/arm/xlnx-versal.c b/hw/arm/xlnx-versal.c
+index XXXXXXX..XXXXXXX 100644
+--- a/hw/arm/xlnx-versal.c
++++ b/hw/arm/xlnx-versal.c
+@@ -XXX,XX +XXX,XX @@ static void versal_create_admas(Versal *s, qemu_irq *pic)
+     }
+ }
++#define SDHCI_CAPABILITIES  0x280737ec6481 /* Same as on ZynqMP.  */
++static void versal_create_sds(Versal *s, qemu_irq *pic)
++{
++    int i;
++
++    for (i = 0; i < ARRAY_SIZE(s->pmc.iou.sd); i++) {
++        DeviceState *dev;
++        MemoryRegion *mr;
++
++        sysbus_init_child_obj(OBJECT(s), "sd[*]",
++                              &s->pmc.iou.sd[i], sizeof(s->pmc.iou.sd[i]),
++                              TYPE_SYSBUS_SDHCI);
++        dev = DEVICE(&s->pmc.iou.sd[i]);
++
++        object_property_set_uint(OBJECT(dev),
++                                 3, "sd-spec-version", &error_fatal);
++        object_property_set_uint(OBJECT(dev), SDHCI_CAPABILITIES, "capareg",
++                                 &error_fatal);
++        object_property_set_uint(OBJECT(dev), UHS_I, "uhs", &error_fatal);
++        qdev_init_nofail(dev);
++
++        mr = sysbus_mmio_get_region(SYS_BUS_DEVICE(dev), 0);
++        memory_region_add_subregion(&s->mr_ps,
++                                    MM_PMC_SD0 + i * MM_PMC_SD0_SIZE, mr);
++
++        sysbus_connect_irq(SYS_BUS_DEVICE(dev), 0,
++                           pic[VERSAL_SD0_IRQ_0 + i * 2]);
++    }
++}
++
+ /* This takes the board allocated linear DDR memory and creates aliases
+  * for each split DDR range/aperture on the Versal address map.
+  */
+@@ -XXX,XX +XXX,XX @@ static void versal_realize(DeviceState *dev, Error **errp)
+     versal_create_uarts(s, pic);
+     versal_create_gems(s, pic);
+     versal_create_admas(s, pic);
++    versal_create_sds(s, pic);
+     versal_map_ddr(s);
+     versal_unimp(s);
+--
+.20.1
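The SD patch places controller i at MM_PMC_SD0 + i * MM_PMC_SD0_SIZE and wires it to pic[VERSAL_SD0_IRQ_0 + i * 2]. The arithmetic can be checked with a small standalone sketch. The three constants are copied from the patch; the helper names `demo_sd_base` and `demo_sd_irq` are hypothetical and do not exist in QEMU.

```c
#include <stdint.h>

/* Values copied from the patch above. */
#define MM_PMC_SD0        0xf1040000U
#define MM_PMC_SD0_SIZE   0x10000
#define VERSAL_SD0_IRQ_0  126

/* MMIO base address of SD controller i (hypothetical helper). */
static uint64_t demo_sd_base(int i)
{
    return (uint64_t)MM_PMC_SD0 + (uint64_t)i * MM_PMC_SD0_SIZE;
}

/* Interrupt line of SD controller i; consecutive controllers are 2 apart. */
static int demo_sd_irq(int i)
{
    return VERSAL_SD0_IRQ_0 + i * 2;
}
```

With XLNX_VERSAL_NR_SDS set to 2, this gives bases 0xf1040000 and 0xf1050000 with interrupt lines 126 and 128.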

-[Qemu-devel] [PULL 16/16] linux-user: Enable HWCAP_ASIMDFHM, HWCAP_JSCVT
+[PULL 17/39] hw/arm: versal: Add support for the RTC
-From: Richard Henderson <richard.henderson@linaro.org>
+From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+hw/arm: versal: Add support for the RTC.
-Message-id: 20190219222952.22183-6-richard.henderson@linaro.org
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
 Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
 Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
 Reviewed-by: Luc Michel <luc.michel@greensocs.com>
 Message-id: 20200427181649.26851-10-edgar.iglesias@gmail.com
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- linux-user/elfload.c | 2 ++
+ include/hw/arm/xlnx-versal.h |  8 ++++++++
-file changed, 2 insertions(+)
+ hw/arm/xlnx-versal.c         | 21 +++++++++++++++++++++
 files changed, 29 insertions(+)
-diff --git a/linux-user/elfload.c b/linux-user/elfload.c
+diff --git a/include/hw/arm/xlnx-versal.h b/include/hw/arm/xlnx-versal.h
 index XXXXXXX..XXXXXXX 100644
---- a/linux-user/elfload.c
+--- a/include/hw/arm/xlnx-versal.h
-+++ b/linux-user/elfload.c
++++ b/include/hw/arm/xlnx-versal.h
-@@ -XXX,XX +XXX,XX @@ static uint32_t get_elf_hwcap(void)
+@@ -XXX,XX +XXX,XX @@
-     GET_FEATURE_ID(aa64_fcma, ARM_HWCAP_A64_FCMA);
+ #include "hw/char/pl011.h"
-     GET_FEATURE_ID(aa64_sve, ARM_HWCAP_A64_SVE);
+ #include "hw/dma/xlnx-zdma.h"
-     GET_FEATURE_ID(aa64_pauth, ARM_HWCAP_A64_PACA | ARM_HWCAP_A64_PACG);
+ #include "hw/net/cadence_gem.h"
-+    GET_FEATURE_ID(aa64_fhm, ARM_HWCAP_A64_ASIMDFHM);
++#include "hw/rtc/xlnx-zynqmp-rtc.h"
-+    GET_FEATURE_ID(aa64_jscvt, ARM_HWCAP_A64_JSCVT);
+ #define TYPE_XLNX_VERSAL "xlnx-versal"
- #undef GET_FEATURE_ID
+ #define XLNX_VERSAL(obj) OBJECT_CHECK(Versal, (obj), TYPE_XLNX_VERSAL)
@@ -XXX,XX +XXX,XX @@ typedef struct Versal {
          struct {
              SDHCIState sd[XLNX_VERSAL_NR_SDS];
          } iou;
 +
 +        XlnxZynqMPRTC rtc;
      } pmc;
      struct {
@@ -XXX,XX +XXX,XX @@ typedef struct Versal {
  #define VERSAL_GEM1_IRQ_0          58
  #define VERSAL_GEM1_WAKE_IRQ_0     59
  #define VERSAL_ADMA_IRQ_0          60
 +#define VERSAL_RTC_APB_ERR_IRQ     121
  #define VERSAL_SD0_IRQ_0           126
 +#define VERSAL_RTC_ALARM_IRQ       142
 +#define VERSAL_RTC_SECONDS_IRQ     143
  /* Architecturally reserved IRQs suitable for virtualization.  */
  #define VERSAL_RSVD_IRQ_FIRST 111
@@ -XXX,XX +XXX,XX @@ typedef struct Versal {
  #define MM_PMC_SD0_SIZE             0x10000
  #define MM_PMC_CRP                  0xf1260000U
  #define MM_PMC_CRP_SIZE             0x10000
 +#define MM_PMC_RTC                  0xf12a0000
 +#define MM_PMC_RTC_SIZE             0x10000
  #endif
 diff --git a/hw/arm/xlnx-versal.c b/hw/arm/xlnx-versal.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/arm/xlnx-versal.c
 +++ b/hw/arm/xlnx-versal.c
@@ -XXX,XX +XXX,XX @@ static void versal_create_sds(Versal *s, qemu_irq *pic)
      }
  }
 +static void versal_create_rtc(Versal *s, qemu_irq *pic)
 +{
 +    SysBusDevice *sbd;
 +    MemoryRegion *mr;
 +
 +    sysbus_init_child_obj(OBJECT(s), "rtc", &s->pmc.rtc, sizeof(s->pmc.rtc),
 +                          TYPE_XLNX_ZYNQMP_RTC);
 +    sbd = SYS_BUS_DEVICE(&s->pmc.rtc);
 +    qdev_init_nofail(DEVICE(sbd));
 +
 +    mr = sysbus_mmio_get_region(sbd, 0);
 +    memory_region_add_subregion(&s->mr_ps, MM_PMC_RTC, mr);
 +
 +    /*
 +     * TODO: Connect the ALARM and SECONDS interrupts once our RTC model
 +     * supports them.
 +     */
 +    sysbus_connect_irq(sbd, 1, pic[VERSAL_RTC_APB_ERR_IRQ]);
 +}
 +
  /* This takes the board allocated linear DDR memory and creates aliases
   * for each split DDR range/aperture on the Versal address map.
   */
@@ -XXX,XX +XXX,XX @@ static void versal_realize(DeviceState *dev, Error **errp)
      versal_create_gems(s, pic);
      versal_create_admas(s, pic);
      versal_create_sds(s, pic);
 +    versal_create_rtc(s, pic);
      versal_map_ddr(s);
      versal_unimp(s);
 --
 .20.1

-[Qemu-devel] [PULL 12/16] target/arm: Add helpers for FMLAL
+[PULL 18/39] hw/arm: versal-virt: Add support for SD
-From: Richard Henderson <richard.henderson@linaro.org>
+From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
-Note that float16_to_float32 rightly squashes SNaN to QNaN.
+Add support for SD.
 But of course pickNaNMulAdd, for ARM, selects SNaNs first.
 So we have to preserve SNaN long enough for the correct NaN
 to be selected.  Thus float16_to_float32_by_bits.
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
-Message-id: 20190219222952.22183-2-richard.henderson@linaro.org
+Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Luc Michel <luc.michel@greensocs.com>
 Message-id: 20200427181649.26851-11-edgar.iglesias@gmail.com
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/helper.h     |   9 +++
+ hw/arm/xlnx-versal-virt.c | 46 +++++++++++++++++++++++++++++++++++++++
- target/arm/vec_helper.c | 148 ++++++++++++++++++++++++++++++++++++++++
+file changed, 46 insertions(+)
 files changed, 157 insertions(+)
-diff --git a/target/arm/helper.h b/target/arm/helper.h
+diff --git a/hw/arm/xlnx-versal-virt.c b/hw/arm/xlnx-versal-virt.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.h
+--- a/hw/arm/xlnx-versal-virt.c
-+++ b/target/arm/helper.h
++++ b/hw/arm/xlnx-versal-virt.c
-@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_sqsub_s, TCG_CALL_NO_RWG,
+@@ -XXX,XX +XXX,XX @@
- DEF_HELPER_FLAGS_5(gvec_sqsub_d, TCG_CALL_NO_RWG,
+ #include "hw/arm/sysbus-fdt.h"
-                    void, ptr, ptr, ptr, ptr, i32)
+ #include "hw/arm/fdt.h"
+ #include "cpu.h"
-+DEF_HELPER_FLAGS_5(gvec_fmlal_a32, TCG_CALL_NO_RWG,
++#include "hw/qdev-properties.h"
-+                   void, ptr, ptr, ptr, ptr, i32)
+ #include "hw/arm/xlnx-versal.h"
-+DEF_HELPER_FLAGS_5(gvec_fmlal_a64, TCG_CALL_NO_RWG,
-+                   void, ptr, ptr, ptr, ptr, i32)
+ #define TYPE_XLNX_VERSAL_VIRT_MACHINE MACHINE_TYPE_NAME("xlnx-versal-virt")
-+DEF_HELPER_FLAGS_5(gvec_fmlal_idx_a32, TCG_CALL_NO_RWG,
+@@ -XXX,XX +XXX,XX @@ static void fdt_add_zdma_nodes(VersalVirt *s)
-+                   void, ptr, ptr, ptr, ptr, i32)
+     }
-+DEF_HELPER_FLAGS_5(gvec_fmlal_idx_a64, TCG_CALL_NO_RWG,
+ }
-+                   void, ptr, ptr, ptr, ptr, i32)
 +static void fdt_add_sd_nodes(VersalVirt *s)
 +{
 +    const char clocknames[] = "clk_xin\0clk_ahb";
 +    const char compat[] = "arasan,sdhci-8.9a";
 +    int i;
 +
- #ifdef TARGET_AARCH64
++    for (i = ARRAY_SIZE(s->soc.pmc.iou.sd) - 1; i >= 0; i--) {
- #include "helper-a64.h"
++        uint64_t addr = MM_PMC_SD0 + MM_PMC_SD0_SIZE * i;
- #include "helper-sve.h"
++        char *name = g_strdup_printf("/sdhci@%" PRIx64, addr);
 diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/vec_helper.c
 +++ b/target/arm/vec_helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_sqsub_d)(void *vd, void *vq, void *vn,
      }
      clear_tail(d, oprsz, simd_maxsz(desc));
  }
 +
-+/*
++        qemu_fdt_add_subnode(s->fdt, name);
 + * Convert float16 to float32, raising no exceptions and
 + * preserving exceptional values, including SNaN.
 + * This is effectively an unpack+repack operation.
 + */
 +static float32 float16_to_float32_by_bits(uint32_t f16, bool fz16)
 +{
 +    const int f16_bias = 15;
 +    const int f32_bias = 127;
 +    uint32_t sign = extract32(f16, 15, 1);
 +    uint32_t exp = extract32(f16, 10, 5);
 +    uint32_t frac = extract32(f16, 0, 10);
 +
-+    if (exp == 0x1f) {
++        qemu_fdt_setprop_cells(s->fdt, name, "clocks",
-+        /* Inf or NaN */
++                               s->phandle.clk_25Mhz, s->phandle.clk_25Mhz);
-+        exp = 0xff;
++        qemu_fdt_setprop(s->fdt, name, "clock-names",
-+    } else if (exp == 0) {
++                         clocknames, sizeof(clocknames));
-+        /* Zero or denormal.  */
++        qemu_fdt_setprop_cells(s->fdt, name, "interrupts",
-+        if (frac != 0) {
++                               GIC_FDT_IRQ_TYPE_SPI, VERSAL_SD0_IRQ_0 + i * 2,
-+            if (fz16) {
++                               GIC_FDT_IRQ_FLAGS_LEVEL_HI);
-+                frac = 0;
++        qemu_fdt_setprop_sized_cells(s->fdt, name, "reg",
-+            } else {
++                                     2, addr, 2, MM_PMC_SD0_SIZE);
-+                /*
++        qemu_fdt_setprop(s->fdt, name, "compatible", compat, sizeof(compat));
-+                 * Denormal; these are all normal float32.
++        g_free(name);
 +                 * Shift the fraction so that the msb is at bit 11,
 +                 * then remove bit 11 as the implicit bit of the
 +                 * normalized float32.  Note that we still go through
 +                 * the shift for normal numbers below, to put the
 +                 * float32 fraction at the right place.
 +                 */
 +                int shift = clz32(frac) - 21;
 +                frac = (frac << shift) & 0x3ff;
 +                exp = f32_bias - f16_bias - shift + 1;
 +            }
 +        }
 +    } else {
 +        /* Normal number; adjust the bias.  */
 +        exp += f32_bias - f16_bias;
 +    }
-+    sign <<= 31;
-+    exp <<= 23;
-+    frac <<= 23 - 10;
-+
-+    return sign | exp | frac;
 +}
 +
-+static uint64_t load4_f16(uint64_t *ptr, int is_q, int is_2)
+ static void fdt_nop_memory_nodes(void *fdt, Error **errp)
  {
      Error *err = NULL;
@@ -XXX,XX +XXX,XX @@ static void create_virtio_regions(VersalVirt *s)
      }
  }
 +static void sd_plugin_card(SDHCIState *sd, DriveInfo *di)
 +{
-+    /*
++    BlockBackend *blk = di ? blk_by_legacy_dinfo(di) : NULL;
-+     * Branchless load of u32[0], u64[0], u32[1], or u64[1].
++    DeviceState *card;
-+     * Load the 2nd qword iff is_q & is_2.
++
-+     * Shift to the 2nd dword iff !is_q & is_2.
++    card = qdev_create(qdev_get_child_bus(DEVICE(sd), "sd-bus"), TYPE_SD_CARD);
-+     * For !is_q & !is_2, the upper bits of the result are garbage.
++    object_property_add_child(OBJECT(sd), "card[*]", OBJECT(card),
-+     */
++                              &error_fatal);
-+    return ptr[is_q & is_2] >> ((is_2 & ~is_q) << 5);
++    qdev_prop_set_drive(card, "drive", blk, &error_fatal);
 +    object_property_set_bool(OBJECT(card), true, "realized", &error_fatal);
 +}
 +
-+/*
+ static void versal_virt_init(MachineState *machine)
-+ * Note that FMLAL requires oprsz == 8 or oprsz == 16,
+ {
-+ * as there is not yet SVE versions that might use blocking.
+     VersalVirt *s = XLNX_VERSAL_VIRT_MACHINE(machine);
-+ */
+     int psci_conduit = QEMU_PSCI_CONDUIT_DISABLED;
-+
++    int i;
-+static void do_fmlal(float32 *d, void *vn, void *vm, float_status *fpst,
-+                     uint32_t desc, bool fz16)
+     /*
-+{
+      * If the user provides an Operating System to be loaded, we expect them
-+    intptr_t i, oprsz = simd_oprsz(desc);
+@@ -XXX,XX +XXX,XX @@ static void versal_virt_init(MachineState *machine)
-+    int is_s = extract32(desc, SIMD_DATA_SHIFT, 1);
+     fdt_add_gic_nodes(s);
-+    int is_2 = extract32(desc, SIMD_DATA_SHIFT + 1, 1);
+     fdt_add_timer_nodes(s);
-+    int is_q = oprsz == 16;
+     fdt_add_zdma_nodes(s);
-+    uint64_t n_4, m_4;
++    fdt_add_sd_nodes(s);
-+
+     fdt_add_cpu_nodes(s, psci_conduit);
-+    /* Pre-load all of the f16 data, avoiding overlap issues.  */
+     fdt_add_clk_node(s, "/clk125", 125000000, s->phandle.clk_125Mhz);
-+    n_4 = load4_f16(vn, is_q, is_2);
+     fdt_add_clk_node(s, "/clk25", 25000000, s->phandle.clk_25Mhz);
-+    m_4 = load4_f16(vm, is_q, is_2);
+@@ -XXX,XX +XXX,XX @@ static void versal_virt_init(MachineState *machine)
-+
+     memory_region_add_subregion_overlap(get_system_memory(),
-+    /* Negate all inputs for FMLSL at once.  */
+, &s->soc.fpd.apu.mr, 0);
-+    if (is_s) {
-+        n_4 ^= 0x8000800080008000ull;
++    /* Plugin SD cards.  */
 +    for (i = 0; i < ARRAY_SIZE(s->soc.pmc.iou.sd); i++) {
 +        sd_plugin_card(&s->soc.pmc.iou.sd[i], drive_get_next(IF_SD));
 +    }
 +
-+    for (i = 0; i < oprsz / 4; i++) {
+     s->binfo.ram_size = machine->ram_size;
-+        float32 n_1 = float16_to_float32_by_bits(n_4 >> (i * 16), fz16);
+     s->binfo.loader_start = 0x0;
-+        float32 m_1 = float16_to_float32_by_bits(m_4 >> (i * 16), fz16);
+     s->binfo.get_dtb = versal_virt_get_dtb;
 +        d[H4(i)] = float32_muladd(n_1, m_1, d[H4(i)], 0, fpst);
 +    }
 +    clear_tail(d, oprsz, simd_maxsz(desc));
 +}
 +
 +void HELPER(gvec_fmlal_a32)(void *vd, void *vn, void *vm,
 +                            void *venv, uint32_t desc)
 +{
 +    CPUARMState *env = venv;
 +    do_fmlal(vd, vn, vm, &env->vfp.standard_fp_status, desc,
 +             get_flush_inputs_to_zero(&env->vfp.fp_status_f16));
 +}
 +
 +void HELPER(gvec_fmlal_a64)(void *vd, void *vn, void *vm,
 +                            void *venv, uint32_t desc)
 +{
 +    CPUARMState *env = venv;
 +    do_fmlal(vd, vn, vm, &env->vfp.fp_status, desc,
 +             get_flush_inputs_to_zero(&env->vfp.fp_status_f16));
 +}
 +
 +static void do_fmlal_idx(float32 *d, void *vn, void *vm, float_status *fpst,
 +                         uint32_t desc, bool fz16)
 +{
 +    intptr_t i, oprsz = simd_oprsz(desc);
 +    int is_s = extract32(desc, SIMD_DATA_SHIFT, 1);
 +    int is_2 = extract32(desc, SIMD_DATA_SHIFT + 1, 1);
 +    int index = extract32(desc, SIMD_DATA_SHIFT + 2, 3);
 +    int is_q = oprsz == 16;
 +    uint64_t n_4;
 +    float32 m_1;
 +
 +    /* Pre-load all of the f16 data, avoiding overlap issues.  */
 +    n_4 = load4_f16(vn, is_q, is_2);
 +
 +    /* Negate all inputs for FMLSL at once.  */
 +    if (is_s) {
 +        n_4 ^= 0x8000800080008000ull;
 +    }
 +
 +    m_1 = float16_to_float32_by_bits(((float16 *)vm)[H2(index)], fz16);
 +
 +    for (i = 0; i < oprsz / 4; i++) {
 +        float32 n_1 = float16_to_float32_by_bits(n_4 >> (i * 16), fz16);
 +        d[H4(i)] = float32_muladd(n_1, m_1, d[H4(i)], 0, fpst);
 +    }
 +    clear_tail(d, oprsz, simd_maxsz(desc));
 +}
 +
 +void HELPER(gvec_fmlal_idx_a32)(void *vd, void *vn, void *vm,
 +                                void *venv, uint32_t desc)
 +{
 +    CPUARMState *env = venv;
 +    do_fmlal_idx(vd, vn, vm, &env->vfp.standard_fp_status, desc,
 +                 get_flush_inputs_to_zero(&env->vfp.fp_status_f16));
 +}
 +
 +void HELPER(gvec_fmlal_idx_a64)(void *vd, void *vn, void *vm,
 +                                void *venv, uint32_t desc)
 +{
 +    CPUARMState *env = venv;
 +    do_fmlal_idx(vd, vn, vm, &env->vfp.fp_status, desc,
 +                 get_flush_inputs_to_zero(&env->vfp.fp_status_f16));
 +}
 --
 .20.1

-New patch
+[PULL 19/39] hw/arm: versal-virt: Add support for the RTC
+From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
+Add support for the RTC.
+Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
+Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
+Reviewed-by: Luc Michel <luc.michel@greensocs.com>
+Message-id: 20200427181649.26851-12-edgar.iglesias@gmail.com
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+---
+ hw/arm/xlnx-versal-virt.c | 22 ++++++++++++++++++++++
+file changed, 22 insertions(+)
+diff --git a/hw/arm/xlnx-versal-virt.c b/hw/arm/xlnx-versal-virt.c
+index XXXXXXX..XXXXXXX 100644
+--- a/hw/arm/xlnx-versal-virt.c
++++ b/hw/arm/xlnx-versal-virt.c
+@@ -XXX,XX +XXX,XX @@ static void fdt_add_sd_nodes(VersalVirt *s)
+     }
+ }
++static void fdt_add_rtc_node(VersalVirt *s)
++{
++    const char compat[] = "xlnx,zynqmp-rtc";
++    const char interrupt_names[] = "alarm\0sec";
++    char *name = g_strdup_printf("/rtc@%x", MM_PMC_RTC);
++
++    qemu_fdt_add_subnode(s->fdt, name);
++
++    qemu_fdt_setprop_cells(s->fdt, name, "interrupts",
++                           GIC_FDT_IRQ_TYPE_SPI, VERSAL_RTC_ALARM_IRQ,
++                           GIC_FDT_IRQ_FLAGS_LEVEL_HI,
++                           GIC_FDT_IRQ_TYPE_SPI, VERSAL_RTC_SECONDS_IRQ,
++                           GIC_FDT_IRQ_FLAGS_LEVEL_HI);
++    qemu_fdt_setprop(s->fdt, name, "interrupt-names",
++                     interrupt_names, sizeof(interrupt_names));
++    qemu_fdt_setprop_sized_cells(s->fdt, name, "reg",
++                                 2, MM_PMC_RTC, 2, MM_PMC_RTC_SIZE);
++    qemu_fdt_setprop(s->fdt, name, "compatible", compat, sizeof(compat));
++    g_free(name);
++}
++
+ static void fdt_nop_memory_nodes(void *fdt, Error **errp)
+ {
+     Error *err = NULL;
+@@ -XXX,XX +XXX,XX @@ static void versal_virt_init(MachineState *machine)
+     fdt_add_timer_nodes(s);
+     fdt_add_zdma_nodes(s);
+     fdt_add_sd_nodes(s);
++    fdt_add_rtc_node(s);
+     fdt_add_cpu_nodes(s, psci_conduit);
+     fdt_add_clk_node(s, "/clk125", 125000000, s->phandle.clk_125Mhz);
+     fdt_add_clk_node(s, "/clk25", 25000000, s->phandle.clk_25Mhz);
+--
+.20.1
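Both fdt_add_sd_nodes() and fdt_add_rtc_node() build device-tree string lists as a single char array with embedded NULs ("clk_xin\0clk_ahb", "alarm\0sec") and pass sizeof() as the length, so the implicit trailing NUL terminates the last entry. The following standalone sketch shows how such a buffer breaks down into entries. `demo_stringlist_count` is a hypothetical helper, not a QEMU or libfdt function, and it assumes the buffer ends with a NUL byte.

```c
#include <stddef.h>
#include <string.h>

/*
 * Count the entries in a NUL-separated string list occupying `size` bytes.
 * Assumes the final byte of the buffer is a NUL terminator.
 */
static size_t demo_stringlist_count(const char *list, size_t size)
{
    size_t n = 0;
    size_t off = 0;

    while (off < size) {
        /* Skip one entry plus its terminating NUL. */
        off += strlen(list + off) + 1;
        n++;
    }
    return n;
}
```

For "alarm\0sec" the array is 10 bytes (5 + 1 + 3 + 1) and holds two entries. Using strlen() for the length instead of sizeof() would stop at the first NUL and drop the second name.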

-New patch
+[PULL 20/39] target/arm/translate-vfp.inc.c: Remove duplicate simd_r32 check
+Somewhere along the line we accidentally added a duplicate
+"using D16-D31 when they don't exist" check to do_vfm_dp()
+(probably an artifact of a patchseries rebase). Remove it.
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
+Message-id: 20200430181003.21682-2-peter.maydell@linaro.org
+---
+ target/arm/translate-vfp.inc.c | 6 ------
+file changed, 6 deletions(-)
+diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/translate-vfp.inc.c
++++ b/target/arm/translate-vfp.inc.c
+@@ -XXX,XX +XXX,XX @@ static bool do_vfm_dp(DisasContext *s, arg_VFMA_dp *a, bool neg_n, bool neg_d)
+         return false;
+     }
+-    /* UNDEF accesses to D16-D31 if they don't exist. */
+-    if (!dc_isar_feature(aa32_simd_r32, s) &&
+-        ((a->vd | a->vn | a->vm) & 0x10)) {
+-        return false;
+-    }
+-
+     if (!vfp_access_check(s)) {
+         return true;
+     }
+--
+.20.1
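The check removed above tests whether any of the three register numbers has bit 4 set, that is, whether the instruction names a register in the D16-D31 range. It is restated here as a standalone sketch. The function names are hypothetical, and the boolean parameter stands in for the dc_isar_feature(aa32_simd_r32, s) test in the real code.

```c
#include <stdbool.h>

/* True if any of the register numbers falls in D16-D31 (bit 4 set). */
static bool demo_uses_high_dregs(int vd, int vn, int vm)
{
    return ((vd | vn | vm) & 0x10) != 0;
}

/*
 * True if the instruction should be treated as undefined: the CPU lacks
 * the upper sixteen D registers but the encoding refers to one of them.
 */
static bool demo_must_undef(bool has_simd_r32, int vd, int vn, int vm)
{
    return !has_simd_r32 && demo_uses_high_dregs(vd, vn, vm);
}
```

ORing the three indices first means a single mask test covers all operands. Evaluating the same predicate twice in one function has no additional effect, which is why the duplicate could be dropped safely.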

-[Qemu-devel] [PULL 14/16] target/arm: Implement VFMAL and VFMSL for aarch32
+[PULL 21/39] target/arm: Don't allow Thumb Neon insns without FEATURE_NEON
-From: Richard Henderson <richard.henderson@linaro.org>
+We were accidentally permitting decode of Thumb Neon insns even if
 the CPU didn't have the FEATURE_NEON bit set, because the feature
 check was being done before the call to disas_neon_data_insn() and
 disas_neon_ls_insn() in the Arm decoder but was omitted from the
 Thumb decoder.  Push the feature bit check down into the called
 functions so it is done for both Arm and Thumb encodings.
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190219222952.22183-4-richard.henderson@linaro.org
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
+Message-id: 20200430181003.21682-3-peter.maydell@linaro.org
 ---
- target/arm/cpu.h       |   5 ++
+ target/arm/translate.c | 16 ++++++++--------
- target/arm/translate.c | 129 ++++++++++++++++++++++++++++++-----------
+file changed, 8 insertions(+), 8 deletions(-)
 files changed, 101 insertions(+), 33 deletions(-)
-diff --git a/target/arm/cpu.h b/target/arm/cpu.h
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.h
-+++ b/target/arm/cpu.h
-@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa32_dp(const ARMISARegisters *id)
-     return FIELD_EX32(id->id_isar6, ID_ISAR6, DP) != 0;
- }
-+static inline bool isar_feature_aa32_fhm(const ARMISARegisters *id)
-+{
-+    return FIELD_EX32(id->id_isar6, ID_ISAR6, FHM) != 0;
-+}
-+
- static inline bool isar_feature_aa32_fp16_arith(const ARMISARegisters *id)
- {
-     /*
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
-     gen_helper_gvec_3_ptr *fn_gvec_ptr = NULL;
+     TCGv_i32 tmp2;
-     int rd, rn, rm, opr_sz;
+     TCGv_i64 tmp64;
-     int data = 0;
--    bool q;
++    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
 -
 -    q = extract32(insn, 6, 1);
 -    VFP_DREG_D(rd, insn);
 -    VFP_DREG_N(rn, insn);
 -    VFP_DREG_M(rm, insn);
 -    if ((rd | rn | rm) & q) {
 -        return 1;
 -    }
 +    int off_rn, off_rm;
 +    bool is_long = false, q = extract32(insn, 6, 1);
 +    bool ptr_is_env = false;
      if ((insn & 0xfe200f10) == 0xfc200800) {
          /* VCMLA -- 1111 110R R.1S .... .... 1000 ...0 .... */
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
              return 1;
          }
          fn_gvec = u ? gen_helper_gvec_udot_b : gen_helper_gvec_sdot_b;
 +    } else if ((insn & 0xff300f10) == 0xfc200810) {
 +        /* VFM[AS]L -- 1111 1100 S.10 .... .... 1000 .Q.1 .... */
 +        int is_s = extract32(insn, 23, 1);
 +        if (!dc_isar_feature(aa32_fhm, s)) {
 +            return 1;
 +        }
 +        is_long = true;
 +        data = is_s; /* is_2 == 0 */
 +        fn_gvec_ptr = gen_helper_gvec_fmlal_a32;
 +        ptr_is_env = true;
      } else {
          return 1;
      }
 +    VFP_DREG_D(rd, insn);
 +    if (rd & q) {
 +        return 1;
 +    }
-+    if (q || !is_long) {
++
-+        VFP_DREG_N(rn, insn);
+     /* FIXME: this access check should not take precedence over UNDEF
-+        VFP_DREG_M(rm, insn);
+      * for invalid encodings; we will generate incorrect syndrome information
-+        if ((rn | rm) & q & !is_long) {
+      * for attempts to execute invalid vfp/neon encodings with FP disabled.
-+            return 1;
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-+        }
+     TCGv_ptr ptr1, ptr2, ptr3;
-+        off_rn = vfp_reg_offset(1, rn);
+     TCGv_i64 tmp64;
-+        off_rm = vfp_reg_offset(1, rm);
-+    } else {
++    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
-+        rn = VFP_SREG_N(insn);
++        return 1;
 +        rm = VFP_SREG_M(insn);
 +        off_rn = vfp_reg_offset(0, rn);
 +        off_rm = vfp_reg_offset(0, rm);
 +    }
 +
-     if (s->fp_excp_el) {
+     /* FIXME: this access check should not take precedence over UNDEF
-         gen_exception_insn(s, 4, EXCP_UDEF,
+      * for invalid encodings; we will generate incorrect syndrome information
-                            syn_simd_access_trap(1, 0xe, false), s->fp_excp_el);
+      * for attempts to execute invalid vfp/neon encodings with FP disabled.
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
-     opr_sz = (1 + q) * 8;
+         if (((insn >> 25) & 7) == 1) {
-     if (fn_gvec_ptr) {
+             /* NEON Data processing.  */
--        TCGv_ptr fpst = get_fpstatus_ptr(1);
+-            if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
--        tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd),
+-                goto illegal_op;
--                           vfp_reg_offset(1, rn),
+-            }
 -                           vfp_reg_offset(1, rm), fpst,
 +        TCGv_ptr ptr;
 +        if (ptr_is_env) {
 +            ptr = cpu_env;
 +        } else {
 +            ptr = get_fpstatus_ptr(1);
 +        }
 +        tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd), off_rn, off_rm, ptr,
                             opr_sz, opr_sz, data, fn_gvec_ptr);
 -        tcg_temp_free_ptr(fpst);
 +        if (!ptr_is_env) {
 +            tcg_temp_free_ptr(ptr);
 +        }
      } else {
 -        tcg_gen_gvec_3_ool(vfp_reg_offset(1, rd),
 -                           vfp_reg_offset(1, rn),
 -                           vfp_reg_offset(1, rm),
 +        tcg_gen_gvec_3_ool(vfp_reg_offset(1, rd), off_rn, off_rm,
                             opr_sz, opr_sz, data, fn_gvec);
      }
      return 0;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
      gen_helper_gvec_3 *fn_gvec = NULL;
      gen_helper_gvec_3_ptr *fn_gvec_ptr = NULL;
      int rd, rn, rm, opr_sz, data;
 -    bool q;
 -
--    q = extract32(insn, 6, 1);
+             if (disas_neon_data_insn(s, insn)) {
--    VFP_DREG_D(rd, insn);
+                 goto illegal_op;
--    VFP_DREG_N(rn, insn);
+             }
--    if ((rd | rn) & q) {
+@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
 -        return 1;
 -    }
 +    int off_rn, off_rm;
 +    bool is_long = false, q = extract32(insn, 6, 1);
 +    bool ptr_is_env = false;
      if ((insn & 0xff000f10) == 0xfe000800) {
          /* VCMLA (indexed) -- 1111 1110 S.RR .... .... 1000 ...0 .... */
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
      } else if ((insn & 0xffb00f00) == 0xfe200d00) {
          /* V[US]DOT -- 1111 1110 0.10 .... .... 1101 .Q.U .... */
          int u = extract32(insn, 4, 1);
 +
          if (!dc_isar_feature(aa32_dp, s)) {
              return 1;
          }
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
+         if ((insn & 0x0f100000) == 0x04000000) {
-         /* rm is just Vm, and index is M.  */
+             /* NEON load/store.  */
-         data = extract32(insn, 5, 1); /* index */
+-            if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
-         rm = extract32(insn, 0, 4);
+-                goto illegal_op;
-+    } else if ((insn & 0xffa00f10) == 0xfe000810) {
+-            }
-+        /* VFM[AS]L -- 1111 1110 0.0S .... .... 1000 .Q.1 .... */
+-
-+        int is_s = extract32(insn, 20, 1);
+             if (disas_neon_ls_insn(s, insn)) {
-+        int vm20 = extract32(insn, 0, 3);
+                 goto illegal_op;
-+        int vm3 = extract32(insn, 3, 1);
+             }
 +        int m = extract32(insn, 5, 1);
 +        int index;
 +
 +        if (!dc_isar_feature(aa32_fhm, s)) {
 +            return 1;
 +        }
 +        if (q) {
 +            rm = vm20;
 +            index = m * 2 + vm3;
 +        } else {
 +            rm = vm20 * 2 + m;
 +            index = vm3;
 +        }
 +        is_long = true;
 +        data = (index << 2) | is_s; /* is_2 == 0 */
 +        fn_gvec_ptr = gen_helper_gvec_fmlal_idx_a32;
 +        ptr_is_env = true;
      } else {
          return 1;
      }
 +    VFP_DREG_D(rd, insn);
 +    if (rd & q) {
 +        return 1;
 +    }
 +    if (q || !is_long) {
 +        VFP_DREG_N(rn, insn);
 +        if (rn & q & !is_long) {
 +            return 1;
 +        }
 +        off_rn = vfp_reg_offset(1, rn);
 +        off_rm = vfp_reg_offset(1, rm);
 +    } else {
 +        rn = VFP_SREG_N(insn);
 +        off_rn = vfp_reg_offset(0, rn);
 +        off_rm = vfp_reg_offset(0, rm);
 +    }
      if (s->fp_excp_el) {
          gen_exception_insn(s, 4, EXCP_UDEF,
                             syn_simd_access_trap(1, 0xe, false), s->fp_excp_el);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
      opr_sz = (1 + q) * 8;
      if (fn_gvec_ptr) {
 -        TCGv_ptr fpst = get_fpstatus_ptr(1);
 -        tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd),
 -                           vfp_reg_offset(1, rn),
 -                           vfp_reg_offset(1, rm), fpst,
 +        TCGv_ptr ptr;
 +        if (ptr_is_env) {
 +            ptr = cpu_env;
 +        } else {
 +            ptr = get_fpstatus_ptr(1);
 +        }
 +        tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd), off_rn, off_rm, ptr,
                             opr_sz, opr_sz, data, fn_gvec_ptr);
 -        tcg_temp_free_ptr(fpst);
 +        if (!ptr_is_env) {
 +            tcg_temp_free_ptr(ptr);
 +        }
      } else {
 -        tcg_gen_gvec_3_ool(vfp_reg_offset(1, rd),
 -                           vfp_reg_offset(1, rn),
 -                           vfp_reg_offset(1, rm),
 +        tcg_gen_gvec_3_ool(vfp_reg_offset(1, rd), off_rn, off_rm,
                             opr_sz, opr_sz, data, fn_gvec);
      }
      return 0;
 --
 .20.1

-New patch
+[PULL 22/39] target/arm: Add stubs for AArch32 Neon decodetree
+Add the infrastructure for building and invoking a decodetree decoder
 for the AArch32 Neon encodings.  At the moment the new decoder covers
 nothing, so we always fall back to the existing hand-written decode.
 We follow the same pattern we did for the VFP decodetree conversion
 (commit 78e138bc1f672c145ef6ace74617d and following): code that deals
 with Neon will be moving gradually out to translate-neon.inc.c,
 which we #include into translate.c.
 In order to share the decode files between A32 and T32, we
 split Neon into 3 parts:
  * data-processing
  * load-store
  * 'shared' encodings
 The first two groups of instructions have similar but not identical
 A32 and T32 encodings, so we need to manually transform the T32
 encoding into the A32 one before calling the decoder; the third group
 covers the Neon instructions which are identical in A32 and T32.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
 Message-id: 20200430181003.21682-4-peter.maydell@linaro.org
 ---
  target/arm/neon-dp.decode       | 29 ++++++++++++++++++++++++++
  target/arm/neon-ls.decode       | 29 ++++++++++++++++++++++++++
  target/arm/neon-shared.decode   | 27 +++++++++++++++++++++++++
  target/arm/translate-neon.inc.c | 32 +++++++++++++++++++++++++++++
  target/arm/translate.c          | 36 +++++++++++++++++++++++++++++++--
  target/arm/Makefile.objs        | 18 +++++++++++++++++
 files changed, 169 insertions(+), 2 deletions(-)
  create mode 100644 target/arm/neon-dp.decode
  create mode 100644 target/arm/neon-ls.decode
  create mode 100644 target/arm/neon-shared.decode
  create mode 100644 target/arm/translate-neon.inc.c
 diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 new file mode 100644
 index XXXXXXX..XXXXXXX
 --- /dev/null
 +++ b/target/arm/neon-dp.decode
@@ -XXX,XX +XXX,XX @@
 +# AArch32 Neon data-processing instruction descriptions
 +#
 +#  Copyright (c) 2020 Linaro, Ltd
 +#
 +# This library is free software; you can redistribute it and/or
 +# modify it under the terms of the GNU Lesser General Public
 +# License as published by the Free Software Foundation; either
 +# version 2 of the License, or (at your option) any later version.
 +#
 +# This library is distributed in the hope that it will be useful,
 +# but WITHOUT ANY WARRANTY; without even the implied warranty of
 +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 +# Lesser General Public License for more details.
 +#
 +# You should have received a copy of the GNU Lesser General Public
 +# License along with this library; if not, see <http://www.gnu.org/licenses/>.
 +
 +#
 +# This file is processed by scripts/decodetree.py
 +#
 +
 +# Encodings for Neon data processing instructions where the T32 encoding
 +# is a simple transformation of the A32 encoding.
 +# More specifically, this file covers instructions where the A32 encoding is
 +#   0b1111_001p_qqqq_qqqq_qqqq_qqqq_qqqq_qqqq
 +# and the T32 encoding is
 +#   0b111p_1111_qqqq_qqqq_qqqq_qqqq_qqqq_qqqq
 +# This file works on the A32 encoding only; calling code for T32 has to
 +# transform the insn into the A32 version first.
 diff --git a/target/arm/neon-ls.decode b/target/arm/neon-ls.decode
 new file mode 100644
 index XXXXXXX..XXXXXXX
 --- /dev/null
 +++ b/target/arm/neon-ls.decode
@@ -XXX,XX +XXX,XX @@
 +# AArch32 Neon load/store instruction descriptions
 +#
 +#  Copyright (c) 2020 Linaro, Ltd
 +#
 +# This library is free software; you can redistribute it and/or
 +# modify it under the terms of the GNU Lesser General Public
 +# License as published by the Free Software Foundation; either
 +# version 2 of the License, or (at your option) any later version.
 +#
 +# This library is distributed in the hope that it will be useful,
 +# but WITHOUT ANY WARRANTY; without even the implied warranty of
 +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 +# Lesser General Public License for more details.
 +#
 +# You should have received a copy of the GNU Lesser General Public
 +# License along with this library; if not, see <http://www.gnu.org/licenses/>.
 +
 +#
 +# This file is processed by scripts/decodetree.py
 +#
 +
 +# Encodings for Neon load/store instructions where the T32 encoding
 +# is a simple transformation of the A32 encoding.
 +# More specifically, this file covers instructions where the A32 encoding is
 +#   0b1111_0100_xxx0_xxxx_xxxx_xxxx_xxxx_xxxx
 +# and the T32 encoding is
 +#   0b1111_1001_xxx0_xxxx_xxxx_xxxx_xxxx_xxxx
 +# This file works on the A32 encoding only; calling code for T32 has to
 +# transform the insn into the A32 version first.
 diff --git a/target/arm/neon-shared.decode b/target/arm/neon-shared.decode
 new file mode 100644
 index XXXXXXX..XXXXXXX
 --- /dev/null
 +++ b/target/arm/neon-shared.decode
@@ -XXX,XX +XXX,XX @@
 +# AArch32 Neon instruction descriptions
 +#
 +#  Copyright (c) 2020 Linaro, Ltd
 +#
 +# This library is free software; you can redistribute it and/or
 +# modify it under the terms of the GNU Lesser General Public
 +# License as published by the Free Software Foundation; either
 +# version 2 of the License, or (at your option) any later version.
 +#
 +# This library is distributed in the hope that it will be useful,
 +# but WITHOUT ANY WARRANTY; without even the implied warranty of
 +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 +# Lesser General Public License for more details.
 +#
 +# You should have received a copy of the GNU Lesser General Public
 +# License along with this library; if not, see <http://www.gnu.org/licenses/>.
 +
 +#
 +# This file is processed by scripts/decodetree.py
 +#
 +
 +# Encodings for Neon instructions whose encoding is the same for
 +# both A32 and T32.
 +
 +# More specifically, this covers:
 +# 2reg scalar ext: 0b1111_1110_xxxx_xxxx_xxxx_1x0x_xxxx_xxxx
 +# 3same ext:       0b1111_110x_xxxx_xxxx_xxxx_1x0x_xxxx_xxxx
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 new file mode 100644
 index XXXXXXX..XXXXXXX
 --- /dev/null
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@
 +/*
 + *  ARM translation: AArch32 Neon instructions
 + *
 + *  Copyright (c) 2003 Fabrice Bellard
 + *  Copyright (c) 2005-2007 CodeSourcery
 + *  Copyright (c) 2007 OpenedHand, Ltd.
 + *  Copyright (c) 2020 Linaro, Ltd.
 + *
 + * This library is free software; you can redistribute it and/or
 + * modify it under the terms of the GNU Lesser General Public
 + * License as published by the Free Software Foundation; either
 + * version 2 of the License, or (at your option) any later version.
 + *
 + * This library is distributed in the hope that it will be useful,
 + * but WITHOUT ANY WARRANTY; without even the implied warranty of
 + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 + * Lesser General Public License for more details.
 + *
 + * You should have received a copy of the GNU Lesser General Public
 + * License along with this library; if not, see <http://www.gnu.org/licenses/>.
 + */
 +
 +/*
 + * This file is intended to be included from translate.c; it uses
 + * some macros and definitions provided by that file.
 + * It might be possible to convert it to a standalone .c file eventually.
 + */
 +
 +/* Include the generated Neon decoder */
 +#include "decode-neon-dp.inc.c"
 +#include "decode-neon-ls.inc.c"
 +#include "decode-neon-shared.inc.c"
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static TCGv_ptr vfp_reg_ptr(bool dp, int reg)
  #define ARM_CP_RW_BIT   (1 << 20)
 -/* Include the VFP decoder */
 +/* Include the VFP and Neon decoders */
  #include "translate-vfp.inc.c"
 +#include "translate-neon.inc.c"
  static inline void iwmmxt_load_reg(TCGv_i64 var, int reg)
  {
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
          /* Unconditional instructions.  */
          /* TODO: Perhaps merge these into one decodetree output file.  */
          if (disas_a32_uncond(s, insn) ||
 -            disas_vfp_uncond(s, insn)) {
 +            disas_vfp_uncond(s, insn) ||
 +            disas_neon_dp(s, insn) ||
 +            disas_neon_ls(s, insn) ||
 +            disas_neon_shared(s, insn)) {
              return;
          }
          /* fall back to legacy decoder */
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
          ARCH(6T2);
      }
 +    if ((insn & 0xef000000) == 0xef000000) {
 +        /*
 +         * T32 encodings 0b111p_1111_qqqq_qqqq_qqqq_qqqq_qqqq_qqqq
 +         * transform into
 +         * A32 encodings 0b1111_001p_qqqq_qqqq_qqqq_qqqq_qqqq_qqqq
 +         */
 +        uint32_t a32_insn = (insn & 0xe2ffffff) |
 +            ((insn & (1 << 28)) >> 4) | (1 << 28);
 +
 +        if (disas_neon_dp(s, a32_insn)) {
 +            return;
 +        }
 +    }
 +
 +    if ((insn & 0xff100000) == 0xf9000000) {
 +        /*
 +         * T32 encodings 0b1111_1001_ppp0_qqqq_qqqq_qqqq_qqqq_qqqq
 +         * transform into
 +         * A32 encodings 0b1111_0100_ppp0_qqqq_qqqq_qqqq_qqqq_qqqq
 +         */
 +        uint32_t a32_insn = (insn & 0x00ffffff) | 0xf4000000;
 +
 +        if (disas_neon_ls(s, a32_insn)) {
 +            return;
 +        }
 +    }
 +
      /*
       * TODO: Perhaps merge these into one decodetree output file.
       * Note disas_vfp is written for a32 with cond field in the
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
       */
      if (disas_t32(s, insn) ||
          disas_vfp_uncond(s, insn) ||
 +        disas_neon_shared(s, insn) ||
          ((insn >> 28) == 0xe && disas_vfp(s, insn))) {
          return;
      }
 diff --git a/target/arm/Makefile.objs b/target/arm/Makefile.objs
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/Makefile.objs
 +++ b/target/arm/Makefile.objs
@@ -XXX,XX +XXX,XX @@ target/arm/decode-sve.inc.c: $(SRC_PATH)/target/arm/sve.decode $(DECODETREE)
        $(PYTHON) $(DECODETREE) --decode disas_sve -o $@ $<,\
        "GEN", $(TARGET_DIR)$@)
 +target/arm/decode-neon-shared.inc.c: $(SRC_PATH)/target/arm/neon-shared.decode $(DECODETREE)
 +    $(call quiet-command,\
 +      $(PYTHON) $(DECODETREE) --static-decode disas_neon_shared -o $@ $<,\
 +      "GEN", $(TARGET_DIR)$@)
 +
 +target/arm/decode-neon-dp.inc.c: $(SRC_PATH)/target/arm/neon-dp.decode $(DECODETREE)
 +    $(call quiet-command,\
 +      $(PYTHON) $(DECODETREE) --static-decode disas_neon_dp -o $@ $<,\
 +      "GEN", $(TARGET_DIR)$@)
 +
 +target/arm/decode-neon-ls.inc.c: $(SRC_PATH)/target/arm/neon-ls.decode $(DECODETREE)
 +    $(call quiet-command,\
 +      $(PYTHON) $(DECODETREE) --static-decode disas_neon_ls -o $@ $<,\
 +      "GEN", $(TARGET_DIR)$@)
 +
  target/arm/decode-vfp.inc.c: $(SRC_PATH)/target/arm/vfp.decode $(DECODETREE)
      $(call quiet-command,\
        $(PYTHON) $(DECODETREE) --static-decode disas_vfp -o $@ $<,\
@@ -XXX,XX +XXX,XX @@ target/arm/decode-t16.inc.c: $(SRC_PATH)/target/arm/t16.decode $(DECODETREE)
        "GEN", $(TARGET_DIR)$@)
  target/arm/translate-sve.o: target/arm/decode-sve.inc.c
 +target/arm/translate.o: target/arm/decode-neon-shared.inc.c
 +target/arm/translate.o: target/arm/decode-neon-dp.inc.c
 +target/arm/translate.o: target/arm/decode-neon-ls.inc.c
  target/arm/translate.o: target/arm/decode-vfp.inc.c
  target/arm/translate.o: target/arm/decode-vfp-uncond.inc.c
  target/arm/translate.o: target/arm/decode-a32.inc.c
 --
 .20.1

-New patch
+[PULL 23/39] target/arm: Convert VCMLA (vector) to decodetree
+Convert the VCMLA (vector) insns in the 3same extension group to
+decodetree.
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20200430181003.21682-5-peter.maydell@linaro.org
+---
+ target/arm/neon-shared.decode   | 11 ++++++++++
+ target/arm/translate-neon.inc.c | 37 +++++++++++++++++++++++++++++++++
+ target/arm/translate.c          | 11 +---------
+files changed, 49 insertions(+), 10 deletions(-)
+diff --git a/target/arm/neon-shared.decode b/target/arm/neon-shared.decode
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/neon-shared.decode
++++ b/target/arm/neon-shared.decode
+@@ -XXX,XX +XXX,XX @@
+ # More specifically, this covers:
+ # 2reg scalar ext: 0b1111_1110_xxxx_xxxx_xxxx_1x0x_xxxx_xxxx
+ # 3same ext:       0b1111_110x_xxxx_xxxx_xxxx_1x0x_xxxx_xxxx
++
++# VFP/Neon register fields; same as vfp.decode
++%vm_dp  5:1 0:4
++%vm_sp  0:4 5:1
++%vn_dp  7:1 16:4
++%vn_sp  16:4 7:1
++%vd_dp  22:1 12:4
++%vd_sp  12:4 22:1
++
++VCMLA          1111 110 rot:2 . 1 size:1 .... .... 1000 . q:1 . 0 .... \
++               vm=%vm_dp vn=%vn_dp vd=%vd_dp
+diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/translate-neon.inc.c
++++ b/target/arm/translate-neon.inc.c
+@@ -XXX,XX +XXX,XX @@
+ #include "decode-neon-dp.inc.c"
+ #include "decode-neon-ls.inc.c"
+ #include "decode-neon-shared.inc.c"
++
++static bool trans_VCMLA(DisasContext *s, arg_VCMLA *a)
++{
++    int opr_sz;
++    TCGv_ptr fpst;
++    gen_helper_gvec_3_ptr *fn_gvec_ptr;
++
++    if (!dc_isar_feature(aa32_vcma, s)
++        || (!a->size && !dc_isar_feature(aa32_fp16_arith, s))) {
++        return false;
++    }
++
++    /* UNDEF accesses to D16-D31 if they don't exist. */
++    if (!dc_isar_feature(aa32_simd_r32, s) &&
++        ((a->vd | a->vn | a->vm) & 0x10)) {
++        return false;
++    }
++
++    if ((a->vn | a->vm | a->vd) & a->q) {
++        return false;
++    }
++
++    if (!vfp_access_check(s)) {
++        return true;
++    }
++
++    opr_sz = (1 + a->q) * 8;
++    fpst = get_fpstatus_ptr(1);
++    fn_gvec_ptr = a->size ? gen_helper_gvec_fcmlas : gen_helper_gvec_fcmlah;
++    tcg_gen_gvec_3_ptr(vfp_reg_offset(1, a->vd),
++                       vfp_reg_offset(1, a->vn),
++                       vfp_reg_offset(1, a->vm),
++                       fpst, opr_sz, opr_sz, a->rot,
++                       fn_gvec_ptr);
++    tcg_temp_free_ptr(fpst);
++    return true;
++}
+diff --git a/target/arm/translate.c b/target/arm/translate.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/translate.c
++++ b/target/arm/translate.c
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
+     bool is_long = false, q = extract32(insn, 6, 1);
+     bool ptr_is_env = false;
+-    if ((insn & 0xfe200f10) == 0xfc200800) {
+-        /* VCMLA -- 1111 110R R.1S .... .... 1000 ...0 .... */
+-        int size = extract32(insn, 20, 1);
+-        data = extract32(insn, 23, 2); /* rot */
+-        if (!dc_isar_feature(aa32_vcma, s)
+-            || (!size && !dc_isar_feature(aa32_fp16_arith, s))) {
+-            return 1;
+-        }
+-        fn_gvec_ptr = size ? gen_helper_gvec_fcmlas : gen_helper_gvec_fcmlah;
+-    } else if ((insn & 0xfea00f10) == 0xfc800800) {
++    if ((insn & 0xfea00f10) == 0xfc800800) {
+         /* VCADD -- 1111 110R 1.0S .... .... 1000 ...0 .... */
+         int size = extract32(insn, 20, 1);
+         data = extract32(insn, 24, 1); /* rot */
+--
+.20.1
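
The %vm_dp-style field definitions and the register checks in trans_VCMLA() may be easier to follow with a standalone sketch. In decodetree a field built from several subfields is their concatenation, with the first-listed subfield most significant, so "%vm_dp 5:1 0:4" is bit 5 followed by bits [3:0]. The helper names below (bits, vm_dp, vm_sp, vn_dp, vd_dp, neon_3same_regs_ok) are hypothetical and written for this note; the real extraction code is generated by scripts/decodetree.py.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Extract 'len' bits of 'value' starting at bit 'start'. */
static inline uint32_t bits(uint32_t value, int start, int len)
{
    assert(start >= 0 && len > 0 && len < 32 && start + len <= 32);
    return (value >> start) & ((1u << len) - 1);
}

/* %vm_dp  5:1 0:4   -> M:Vm, a D register number 0..31 */
static inline int vm_dp(uint32_t insn)
{
    return (int)((bits(insn, 5, 1) << 4) | bits(insn, 0, 4));
}

/* %vm_sp  0:4 5:1   -> Vm:M, an S register number 0..31 */
static inline int vm_sp(uint32_t insn)
{
    return (int)((bits(insn, 0, 4) << 1) | bits(insn, 5, 1));
}

/* %vn_dp  7:1 16:4  -> N:Vn */
static inline int vn_dp(uint32_t insn)
{
    return (int)((bits(insn, 7, 1) << 4) | bits(insn, 16, 4));
}

/* %vd_dp  22:1 12:4 -> D:Vd */
static inline int vd_dp(uint32_t insn)
{
    return (int)((bits(insn, 22, 1) << 4) | bits(insn, 12, 4));
}

/*
 * The two register checks from trans_VCMLA(): D16-D31 must exist if
 * any operand names them, and with Q=1 every register number must be
 * even. Returns false where the trans function would return false.
 */
static inline bool neon_3same_regs_ok(int vd, int vn, int vm, int q,
                                      bool has_simd_r32)
{
    if (!has_simd_r32 && ((vd | vn | vm) & 0x10)) {
        return false;
    }
    if ((vn | vm | vd) & q) {
        return false;
    }
    return true;
}
```

Because q is 0 or 1, "(vn | vm | vd) & q" is nonzero exactly when Q is set and at least one register number is odd.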

-New patch
+[PULL 24/39] target/arm: Convert VCADD (vector) to decodetree
+Convert the VCADD (vector) insns to decodetree.
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20200430181003.21682-6-peter.maydell@linaro.org
+---
+ target/arm/neon-shared.decode   |  3 +++
+ target/arm/translate-neon.inc.c | 37 +++++++++++++++++++++++++++++++++
+ target/arm/translate.c          | 11 +---------
+files changed, 41 insertions(+), 10 deletions(-)
+diff --git a/target/arm/neon-shared.decode b/target/arm/neon-shared.decode
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/neon-shared.decode
++++ b/target/arm/neon-shared.decode
+@@ -XXX,XX +XXX,XX @@
+ VCMLA          1111 110 rot:2 . 1 size:1 .... .... 1000 . q:1 . 0 .... \
+                vm=%vm_dp vn=%vn_dp vd=%vd_dp
++
++VCADD          1111 110 rot:1 1 . 0 size:1 .... .... 1000 . q:1 . 0 .... \
++               vm=%vm_dp vn=%vn_dp vd=%vd_dp
+diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/translate-neon.inc.c
++++ b/target/arm/translate-neon.inc.c
+@@ -XXX,XX +XXX,XX @@ static bool trans_VCMLA(DisasContext *s, arg_VCMLA *a)
+     tcg_temp_free_ptr(fpst);
+     return true;
+ }
++
++static bool trans_VCADD(DisasContext *s, arg_VCADD *a)
++{
++    int opr_sz;
++    TCGv_ptr fpst;
++    gen_helper_gvec_3_ptr *fn_gvec_ptr;
++
++    if (!dc_isar_feature(aa32_vcma, s)
++        || (!a->size && !dc_isar_feature(aa32_fp16_arith, s))) {
++        return false;
++    }
++
++    /* UNDEF accesses to D16-D31 if they don't exist. */
++    if (!dc_isar_feature(aa32_simd_r32, s) &&
++        ((a->vd | a->vn | a->vm) & 0x10)) {
++        return false;
++    }
++
++    if ((a->vn | a->vm | a->vd) & a->q) {
++        return false;
++    }
++
++    if (!vfp_access_check(s)) {
++        return true;
++    }
++
++    opr_sz = (1 + a->q) * 8;
++    fpst = get_fpstatus_ptr(1);
++    fn_gvec_ptr = a->size ? gen_helper_gvec_fcadds : gen_helper_gvec_fcaddh;
++    tcg_gen_gvec_3_ptr(vfp_reg_offset(1, a->vd),
++                       vfp_reg_offset(1, a->vn),
++                       vfp_reg_offset(1, a->vm),
++                       fpst, opr_sz, opr_sz, a->rot,
++                       fn_gvec_ptr);
++    tcg_temp_free_ptr(fpst);
++    return true;
++}
+diff --git a/target/arm/translate.c b/target/arm/translate.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/translate.c
++++ b/target/arm/translate.c
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
+     bool is_long = false, q = extract32(insn, 6, 1);
+     bool ptr_is_env = false;
+-    if ((insn & 0xfea00f10) == 0xfc800800) {
+-        /* VCADD -- 1111 110R 1.0S .... .... 1000 ...0 .... */
+-        int size = extract32(insn, 20, 1);
+-        data = extract32(insn, 24, 1); /* rot */
+-        if (!dc_isar_feature(aa32_vcma, s)
+-            || (!size && !dc_isar_feature(aa32_fp16_arith, s))) {
+-            return 1;
+-        }
+-        fn_gvec_ptr = size ? gen_helper_gvec_fcadds : gen_helper_gvec_fcaddh;
+-    } else if ((insn & 0xfeb00f00) == 0xfc200d00) {
++    if ((insn & 0xfeb00f00) == 0xfc200d00) {
+         /* V[US]DOT -- 1111 1100 0.10 .... .... 1101 .Q.U .... */
+         bool u = extract32(insn, 4, 1);
+         if (!dc_isar_feature(aa32_dp, s)) {
+--
+.20.1

-New patch
+[PULL 25/39] target/arm: Convert V[US]DOT (vector) to decodetree
+Convert the V[US]DOT (vector) insns to decodetree.
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20200430181003.21682-7-peter.maydell@linaro.org
+---
+ target/arm/neon-shared.decode   |  4 ++++
+ target/arm/translate-neon.inc.c | 32 ++++++++++++++++++++++++++++++++
+ target/arm/translate.c          |  9 +--------
+files changed, 37 insertions(+), 8 deletions(-)
+diff --git a/target/arm/neon-shared.decode b/target/arm/neon-shared.decode
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/neon-shared.decode
++++ b/target/arm/neon-shared.decode
+@@ -XXX,XX +XXX,XX @@ VCMLA          1111 110 rot:2 . 1 size:1 .... .... 1000 . q:1 . 0 .... \
+ VCADD          1111 110 rot:1 1 . 0 size:1 .... .... 1000 . q:1 . 0 .... \
+                vm=%vm_dp vn=%vn_dp vd=%vd_dp
++
++# VUDOT and VSDOT
++VDOT           1111 110 00 . 10 .... .... 1101 . q:1 . u:1 .... \
++               vm=%vm_dp vn=%vn_dp vd=%vd_dp
+diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/translate-neon.inc.c
++++ b/target/arm/translate-neon.inc.c
+@@ -XXX,XX +XXX,XX @@ static bool trans_VCADD(DisasContext *s, arg_VCADD *a)
+     tcg_temp_free_ptr(fpst);
+     return true;
+ }
++
++static bool trans_VDOT(DisasContext *s, arg_VDOT *a)
++{
++    int opr_sz;
++    gen_helper_gvec_3 *fn_gvec;
++
++    if (!dc_isar_feature(aa32_dp, s)) {
++        return false;
++    }
++
++    /* UNDEF accesses to D16-D31 if they don't exist. */
++    if (!dc_isar_feature(aa32_simd_r32, s) &&
++        ((a->vd | a->vn | a->vm) & 0x10)) {
++        return false;
++    }
++
++    if ((a->vn | a->vm | a->vd) & a->q) {
++        return false;
++    }
++
++    if (!vfp_access_check(s)) {
++        return true;
++    }
++
++    opr_sz = (1 + a->q) * 8;
++    fn_gvec = a->u ? gen_helper_gvec_udot_b : gen_helper_gvec_sdot_b;
++    tcg_gen_gvec_3_ool(vfp_reg_offset(1, a->vd),
++                       vfp_reg_offset(1, a->vn),
++                       vfp_reg_offset(1, a->vm),
++                       opr_sz, opr_sz, 0, fn_gvec);
++    return true;
++}
+diff --git a/target/arm/translate.c b/target/arm/translate.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/translate.c
++++ b/target/arm/translate.c
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
+     bool is_long = false, q = extract32(insn, 6, 1);
+     bool ptr_is_env = false;
+-    if ((insn & 0xfeb00f00) == 0xfc200d00) {
+-        /* V[US]DOT -- 1111 1100 0.10 .... .... 1101 .Q.U .... */
+-        bool u = extract32(insn, 4, 1);
+-        if (!dc_isar_feature(aa32_dp, s)) {
+-            return 1;
+-        }
+-        fn_gvec = u ? gen_helper_gvec_udot_b : gen_helper_gvec_sdot_b;
+-    } else if ((insn & 0xff300f10) == 0xfc200810) {
++    if ((insn & 0xff300f10) == 0xfc200810) {
+         /* VFM[AS]L -- 1111 1100 S.10 .... .... 1000 .Q.1 .... */
+         int is_s = extract32(insn, 23, 1);
+         if (!dc_isar_feature(aa32_fhm, s)) {
+--
+.20.1
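
For readers unfamiliar with what trans_VDOT() hands off to, here is a plain C sketch of the operation as I understand it from the architecture: each 32-bit lane of the destination accumulates the sum of four byte-wise products. The toy_udot_b() and toy_sdot_b() functions are illustrative reimplementations written for this note and are assumptions, not the bodies of gen_helper_gvec_udot_b or gen_helper_gvec_sdot_b; neon_opr_sz() mirrors the "opr_sz = (1 + a->q) * 8" line from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Operand size in bytes: 8 for a D register (Q=0), 16 for a Q register. */
static inline size_t neon_opr_sz(int q)
{
    return (size_t)(1 + q) * 8;
}

/* Unsigned dot product: d[i] += sum over j of n[4i+j] * m[4i+j]. */
static inline void toy_udot_b(uint32_t *d, const uint8_t *n,
                              const uint8_t *m, size_t opr_sz)
{
    assert(opr_sz == 8 || opr_sz == 16);
    for (size_t i = 0; i < opr_sz / 4; i++) {
        uint32_t acc = d[i];
        for (size_t j = 0; j < 4; j++) {
            acc += (uint32_t)n[4 * i + j] * (uint32_t)m[4 * i + j];
        }
        d[i] = acc;
    }
}

/*
 * Signed dot product. The accumulator is kept as uint32_t so that
 * wraparound is well defined in C; the bit pattern is the same as a
 * two's-complement signed accumulate.
 */
static inline void toy_sdot_b(uint32_t *d, const int8_t *n,
                              const int8_t *m, size_t opr_sz)
{
    assert(opr_sz == 8 || opr_sz == 16);
    for (size_t i = 0; i < opr_sz / 4; i++) {
        uint32_t acc = d[i];
        for (size_t j = 0; j < 4; j++) {
            int32_t prod = (int32_t)n[4 * i + j] * (int32_t)m[4 * i + j];
            acc += (uint32_t)prod;
        }
        d[i] = acc;
    }
}
```

The only difference between the two variants is how the byte operands are extended before multiplying, which is what the u bit in the VDOT pattern selects.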

-[Qemu-devel] [PULL 10/16] target/arm: Gate "miscellaneous FP" insns by ID register field
+[PULL 26/39] target/arm: Convert VFM[AS]L (vector) to decodetree
-There is a set of VFP instructions which we implement in
+Convert the VFM[AS]L (vector) insns to decodetree.  This is the last
-disas_vfp_v8_insn() and gate on the ARM_FEATURE_V8 bit.
+insn in the legacy decoder for the 3same_ext group, so we can
-These were all first introduced in v8 for A-profile, but in
+delete the legacy decoder function for the group entirely.
-M-profile they appeared in v7M. Gate them on the MVFR2
-FPMisc field instead, and rename the function appropriately.
+Note that in disas_thumb2_insn() the parts of this encoding space
 where the decodetree decoder returns false will correctly be directed
 to illegal_op by the "(insn & (1 << 28))" check so they won't fall
 into disas_coproc_insn() by mistake.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190222170936.13268-3-peter.maydell@linaro.org
+Message-id: 20200430181003.21682-8-peter.maydell@linaro.org
 ---
- target/arm/cpu.h       | 20 ++++++++++++++++++++
+ target/arm/neon-shared.decode   |  6 +++
- target/arm/translate.c | 25 +++++++++++++------------
+ target/arm/translate-neon.inc.c | 31 +++++++++++
-files changed, 33 insertions(+), 12 deletions(-)
+ target/arm/translate.c          | 92 +--------------------------------
 files changed, 38 insertions(+), 91 deletions(-)
-diff --git a/target/arm/cpu.h b/target/arm/cpu.h
+diff --git a/target/arm/neon-shared.decode b/target/arm/neon-shared.decode
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.h
+--- a/target/arm/neon-shared.decode
-+++ b/target/arm/cpu.h
++++ b/target/arm/neon-shared.decode
-@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa32_fp16_dpconv(const ARMISARegisters *id)
+@@ -XXX,XX +XXX,XX @@ VCADD          1111 110 rot:1 1 . 0 size:1 .... .... 1000 . q:1 . 0 .... \
-     return FIELD_EX64(id->mvfr1, MVFR1, FPHP) > 1;
+ # VUDOT and VSDOT
  VDOT           1111 110 00 . 10 .... .... 1101 . q:1 . u:1 .... \
                 vm=%vm_dp vn=%vn_dp vd=%vd_dp
 +
 +# VFM[AS]L
 +VFML           1111 110 0 s:1 . 10 .... .... 1000 . 0 . 1 .... \
 +               vm=%vm_sp vn=%vn_sp vd=%vd_dp q=0
 +VFML           1111 110 0 s:1 . 10 .... .... 1000 . 1 . 1 .... \
 +               vm=%vm_dp vn=%vn_dp vd=%vd_dp q=1
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VDOT(DisasContext *s, arg_VDOT *a)
                         opr_sz, opr_sz, 0, fn_gvec);
      return true;
  }
++
-+static inline bool isar_feature_aa32_vsel(const ARMISARegisters *id)
++static bool trans_VFML(DisasContext *s, arg_VFML *a)
 +{
-+    return FIELD_EX64(id->mvfr2, MVFR2, FPMISC) >= 1;
++    int opr_sz;
 +
 +    if (!dc_isar_feature(aa32_fhm, s)) {
 +        return false;
 +    }
 +
 +    /* UNDEF accesses to D16-D31 if they don't exist. */
 +    if (!dc_isar_feature(aa32_simd_r32, s) &&
 +        (a->vd & 0x10)) {
 +        return false;
 +    }
 +
 +    if (a->vd & a->q) {
 +        return false;
 +    }
 +
 +    if (!vfp_access_check(s)) {
 +        return true;
 +    }
 +
 +    opr_sz = (1 + a->q) * 8;
 +    tcg_gen_gvec_3_ptr(vfp_reg_offset(1, a->vd),
 +                       vfp_reg_offset(a->q, a->vn),
 +                       vfp_reg_offset(a->q, a->vm),
 +                       cpu_env, opr_sz, opr_sz, a->s, /* is_2 == 0 */
 +                       gen_helper_gvec_fmlal_a32);
 +    return true;
 +}
-+
-+static inline bool isar_feature_aa32_vcvt_dr(const ARMISARegisters *id)
-+{
-+    return FIELD_EX64(id->mvfr2, MVFR2, FPMISC) >= 2;
-+}
-+
-+static inline bool isar_feature_aa32_vrint(const ARMISARegisters *id)
-+{
-+    return FIELD_EX64(id->mvfr2, MVFR2, FPMISC) >= 3;
-+}
-+
-+static inline bool isar_feature_aa32_vminmaxnm(const ARMISARegisters *id)
-+{
-+    return FIELD_EX64(id->mvfr2, MVFR2, FPMISC) >= 4;
-+}
-+
- /*
-  * 64-bit feature tests via id registers.
-  */
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static const uint8_t fp_decode_rm[] = {
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-     FPROUNDING_NEGINF,
+     return 0;
- };
+ }
--static int disas_vfp_v8_insn(DisasContext *s, uint32_t insn)
+-/* Advanced SIMD three registers of the same length extension.
-+static int disas_vfp_misc_insn(DisasContext *s, uint32_t insn)
+- *  31           25    23  22    20   16   12  11   10   9    8        3     0
- {
+- * +---------------+-----+---+-----+----+----+---+----+---+----+---------+----+
-     uint32_t rd, rn, rm, dp = extract32(insn, 8, 1);
+- * | 1 1 1 1 1 1 0 | op1 | D | op2 | Vn | Vd | 1 | o3 | 0 | o4 | N Q M U | Vm |
+- * +---------------+-----+---+-----+----+----+---+----+---+----+---------+----+
--    if (!arm_dc_feature(s, ARM_FEATURE_V8)) {
+- */
 -static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
 -{
 -    gen_helper_gvec_3 *fn_gvec = NULL;
 -    gen_helper_gvec_3_ptr *fn_gvec_ptr = NULL;
 -    int rd, rn, rm, opr_sz;
 -    int data = 0;
 -    int off_rn, off_rm;
 -    bool is_long = false, q = extract32(insn, 6, 1);
 -    bool ptr_is_env = false;
 -
 -    if ((insn & 0xff300f10) == 0xfc200810) {
 -        /* VFM[AS]L -- 1111 1100 S.10 .... .... 1000 .Q.1 .... */
 -        int is_s = extract32(insn, 23, 1);
 -        if (!dc_isar_feature(aa32_fhm, s)) {
 -            return 1;
 -        }
 -        is_long = true;
 -        data = is_s; /* is_2 == 0 */
 -        fn_gvec_ptr = gen_helper_gvec_fmlal_a32;
 -        ptr_is_env = true;
 -    } else {
 -        return 1;
 -    }
 -
-     if (dp) {
+-    VFP_DREG_D(rd, insn);
-         VFP_DREG_D(rd, insn);
+-    if (rd & q) {
-         VFP_DREG_N(rn, insn);
+-        return 1;
-@@ -XXX,XX +XXX,XX @@ static int disas_vfp_v8_insn(DisasContext *s, uint32_t insn)
+-    }
-         rm = VFP_SREG_M(insn);
+-    if (q || !is_long) {
-     }
+-        VFP_DREG_N(rn, insn);
+-        VFP_DREG_M(rm, insn);
--    if ((insn & 0x0f800e50) == 0x0e000a00) {
+-        if ((rn | rm) & q & !is_long) {
-+    if ((insn & 0x0f800e50) == 0x0e000a00 && dc_isar_feature(aa32_vsel, s)) {
+-            return 1;
-         return handle_vsel(insn, rd, rn, rm, dp);
+-        }
--    } else if ((insn & 0x0fb00e10) == 0x0e800a00) {
+-        off_rn = vfp_reg_offset(1, rn);
-+    } else if ((insn & 0x0fb00e10) == 0x0e800a00 &&
+-        off_rm = vfp_reg_offset(1, rm);
-+               dc_isar_feature(aa32_vminmaxnm, s)) {
+-    } else {
-         return handle_vminmaxnm(insn, rd, rn, rm, dp);
+-        rn = VFP_SREG_N(insn);
--    } else if ((insn & 0x0fbc0ed0) == 0x0eb80a40) {
+-        rm = VFP_SREG_M(insn);
-+    } else if ((insn & 0x0fbc0ed0) == 0x0eb80a40 &&
+-        off_rn = vfp_reg_offset(0, rn);
-+               dc_isar_feature(aa32_vrint, s)) {
+-        off_rm = vfp_reg_offset(0, rm);
-         /* VRINTA, VRINTN, VRINTP, VRINTM */
+-    }
-         int rounding = fp_decode_rm[extract32(insn, 16, 2)];
+-
-         return handle_vrint(insn, rd, rm, dp, rounding);
+-    if (s->fp_excp_el) {
--    } else if ((insn & 0x0fbc0e50) == 0x0ebc0a40) {
+-        gen_exception_insn(s, s->pc_curr, EXCP_UDEF,
-+    } else if ((insn & 0x0fbc0e50) == 0x0ebc0a40 &&
+-                           syn_simd_access_trap(1, 0xe, false), s->fp_excp_el);
-+               dc_isar_feature(aa32_vcvt_dr, s)) {
+-        return 0;
-         /* VCVTA, VCVTN, VCVTP, VCVTM */
+-    }
-         int rounding = fp_decode_rm[extract32(insn, 16, 2)];
+-    if (!s->vfp_enabled) {
-         return handle_vcvt(insn, rd, rm, dp, rounding);
+-        return 1;
-@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
+-    }
-     }
+-
+-    opr_sz = (1 + q) * 8;
-     if (extract32(insn, 28, 4) == 0xf) {
+-    if (fn_gvec_ptr) {
--        /* Encodings with T=1 (Thumb) or unconditional (ARM):
+-        TCGv_ptr ptr;
--         * only used in v8 and above.
+-        if (ptr_is_env) {
-+        /*
+-            ptr = cpu_env;
-+         * Encodings with T=1 (Thumb) or unconditional (ARM):
+-        } else {
-+         * only used for the "miscellaneous VFP features" added in v8A
+-            ptr = get_fpstatus_ptr(1);
-+         * and v7M (and gated on the MVFR2.FPMisc field).
+-        }
-          */
+-        tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd), off_rn, off_rm, ptr,
--        return disas_vfp_v8_insn(s, insn);
+-                           opr_sz, opr_sz, data, fn_gvec_ptr);
-+        return disas_vfp_misc_insn(s, insn);
+-        if (!ptr_is_env) {
-     }
+-            tcg_temp_free_ptr(ptr);
+-        }
-     dp = ((insn & 0xf00) == 0xb00);
+-    } else {
 -        tcg_gen_gvec_3_ool(vfp_reg_offset(1, rd), off_rn, off_rm,
 -                           opr_sz, opr_sz, data, fn_gvec);
 -    }
 -    return 0;
 -}
 -
  /* Advanced SIMD two registers and a scalar extension.
   *  31             24   23  22   20   16   12  11   10   9    8        3     0
   * +-----------------+----+---+----+----+----+---+----+---+----+---------+----+
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
                      }
                  }
              }
 -        } else if ((insn & 0x0e000a00) == 0x0c000800
 -                   && arm_dc_feature(s, ARM_FEATURE_V8)) {
 -            if (disas_neon_insn_3same_ext(s, insn)) {
 -                goto illegal_op;
 -            }
 -            return;
          } else if ((insn & 0x0f000a00) == 0x0e000800
                     && arm_dc_feature(s, ARM_FEATURE_V8)) {
              if (disas_neon_insn_2reg_scalar_ext(s, insn)) {
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
              }
              break;
          }
 -        if ((insn & 0xfe000a00) == 0xfc000800
 +        if ((insn & 0xff000a00) == 0xfe000800
              && arm_dc_feature(s, ARM_FEATURE_V8)) {
              /* The Thumb2 and ARM encodings are identical.  */
 -            if (disas_neon_insn_3same_ext(s, insn)) {
 -                goto illegal_op;
 -            }
 -        } else if ((insn & 0xff000a00) == 0xfe000800
 -                   && arm_dc_feature(s, ARM_FEATURE_V8)) {
 -            /* The Thumb2 and ARM encodings are identical.  */
              if (disas_neon_insn_2reg_scalar_ext(s, insn)) {
                  goto illegal_op;
              }
 --
 .20.1
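
To illustrate why the VFML patterns in this patch are split on Q: with q=0 the two source operands are single-precision registers (%vm_sp, %vn_sp), while with q=1 they are double-precision registers. The following is only a minimal standalone sketch, not QEMU code. The helper names sreg(), dreg(), decode_vfml(), vfml_opr_sz() and vfml_is_undef() are hypothetical; the bit positions follow the legacy VFP_SREG_*/VFP_DREG_* macros, and the sketch assumes all 32 D registers exist.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: S register number Vx:x (4-bit field at 'big', low bit at 'small'). */
static unsigned sreg(uint32_t insn, int big, int small)
{
    return (((insn >> big) & 0xf) << 1) | ((insn >> small) & 1);
}

/* Hypothetical helper: D register number x:Vx (assumes D16-D31 exist). */
static unsigned dreg(uint32_t insn, int big, int small)
{
    return ((insn >> big) & 0xf) | (((insn >> small) & 1) << 4);
}

typedef struct {
    unsigned q, s, vd, vn, vm;
} VFMLFields;

/* Split a VFM[AS]L (vector) encoding into the fields trans_VFML() consumes. */
static VFMLFields decode_vfml(uint32_t insn)
{
    VFMLFields f;

    /* Same mask/value test as the removed legacy decoder. */
    assert((insn & 0xff300f10) == 0xfc200810);

    f.q = (insn >> 6) & 1;
    f.s = (insn >> 23) & 1;
    f.vd = dreg(insn, 12, 22);          /* destination is always a D/Q register */
    if (f.q) {
        f.vn = dreg(insn, 16, 7);       /* q=1: sources are D registers */
        f.vm = dreg(insn, 0, 5);
    } else {
        f.vn = sreg(insn, 16, 7);       /* q=0: sources are S registers */
        f.vm = sreg(insn, 0, 5);
    }
    return f;
}

/* Operand size in bytes: 8 for a D register, 16 for a Q register. */
static int vfml_opr_sz(unsigned q)
{
    return (1 + q) * 8;
}

/* A Q-register destination must be an even D register number. */
static int vfml_is_undef(VFMLFields f)
{
    return (f.vd & f.q) != 0;
}
```

The same encoding with only the Q bit flipped therefore yields different source register numbers, which is presumably why the patch uses two decodetree patterns rather than one.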

-[Qemu-devel] [PULL 08/16] hw/arm/armsse: Unify init-svtor and cpuwait handling
+[PULL 27/39] target/arm: Convert VCMLA (scalar) to decodetree
-At the moment the handling of init-svtor and cpuwait initial
+Convert VCMLA (scalar) in the 2reg-scalar-ext group to decodetree.
 values is split between armsse.c and iotkit-sysctl.c:
 the code in armsse.c sets the initial state of the CPU
 object by setting the init-svtor and start-powered-off
 properties, but the iotkit-sysctl.c code has its own
 code setting the reset values of its registers (which are
 then used when updating the CPU when the guest makes
 runtime changes).
 Clean this up by making the armsse.c code set properties on the
 iotkit-sysctl object to define the initial values of the
 registers, so they always match the initial CPU state,
 and update the comments in armsse.c accordingly.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190219125808.25174-9-peter.maydell@linaro.org
+Message-id: 20200430181003.21682-9-peter.maydell@linaro.org
 ---
- include/hw/misc/iotkit-sysctl.h |  3 ++
+ target/arm/neon-shared.decode   |  5 +++++
- hw/arm/armsse.c                 | 49 +++++++++++++++++++++------------
+ target/arm/translate-neon.inc.c | 40 +++++++++++++++++++++++++++++++++
- hw/misc/iotkit-sysctl.c         | 20 ++++++--------
+ target/arm/translate.c          | 26 +--------------------
-files changed, 42 insertions(+), 30 deletions(-)
+files changed, 46 insertions(+), 25 deletions(-)
-diff --git a/include/hw/misc/iotkit-sysctl.h b/include/hw/misc/iotkit-sysctl.h
+diff --git a/target/arm/neon-shared.decode b/target/arm/neon-shared.decode
 index XXXXXXX..XXXXXXX 100644
---- a/include/hw/misc/iotkit-sysctl.h
+--- a/target/arm/neon-shared.decode
-+++ b/include/hw/misc/iotkit-sysctl.h
++++ b/target/arm/neon-shared.decode
-@@ -XXX,XX +XXX,XX @@ typedef struct IoTKitSysCtl {
+@@ -XXX,XX +XXX,XX @@ VFML           1111 110 0 s:1 . 10 .... .... 1000 . 0 . 1 .... \
+                vm=%vm_sp vn=%vn_sp vd=%vd_dp q=0
-     /* Properties */
+ VFML           1111 110 0 s:1 . 10 .... .... 1000 . 1 . 1 .... \
-     uint32_t sys_version;
+                vm=%vm_dp vn=%vn_dp vd=%vd_dp q=1
-+    uint32_t cpuwait_rst;
++
-+    uint32_t initsvtor0_rst;
++VCMLA_scalar   1111 1110 0 . rot:2 .... .... 1000 . q:1 index:1 0 vm:4 \
-+    uint32_t initsvtor1_rst;
++               vn=%vn_dp vd=%vd_dp size=0
++VCMLA_scalar   1111 1110 1 . rot:2 .... .... 1000 . q:1 . 0 .... \
-     bool is_sse200;
++               vm=%vm_dp vn=%vn_dp vd=%vd_dp size=1 index=0
- } IoTKitSysCtl;
+diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 diff --git a/hw/arm/armsse.c b/hw/arm/armsse.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/armsse.c
+--- a/target/arm/translate-neon.inc.c
-+++ b/hw/arm/armsse.c
++++ b/target/arm/translate-neon.inc.c
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ static bool trans_VFML(DisasContext *s, arg_VFML *a)
+                        gen_helper_gvec_fmlal_a32);
- #include "qemu/osdep.h"
+     return true;
- #include "qemu/log.h"
+ }
-+#include "qemu/bitops.h"
++
- #include "qapi/error.h"
++static bool trans_VCMLA_scalar(DisasContext *s, arg_VCMLA_scalar *a)
- #include "trace.h"
++{
- #include "hw/sysbus.h"
++    gen_helper_gvec_3_ptr *fn_gvec_ptr;
-@@ -XXX,XX +XXX,XX @@ struct ARMSSEInfo {
++    int opr_sz;
-     int sram_banks;
++    TCGv_ptr fpst;
-     int num_cpus;
++
-     uint32_t sys_version;
++    if (!dc_isar_feature(aa32_vcma, s)) {
-+    uint32_t cpuwait_rst;
++        return false;
-     SysConfigFormat sys_config_format;
++    }
-     bool has_mhus;
++    if (a->size == 0 && !dc_isar_feature(aa32_fp16_arith, s)) {
-     bool has_ppus;
++        return false;
-@@ -XXX,XX +XXX,XX @@ static const ARMSSEInfo armsse_variants[] = {
++    }
-         .sram_banks = 1,
++
-         .num_cpus = 1,
++    /* UNDEF accesses to D16-D31 if they don't exist. */
-         .sys_version = 0x41743,
++    if (!dc_isar_feature(aa32_simd_r32, s) &&
-+        .cpuwait_rst = 0,
++        ((a->vd | a->vn | a->vm) & 0x10)) {
-         .sys_config_format = IoTKitFormat,
++        return false;
-         .has_mhus = false,
++    }
-         .has_ppus = false,
++
-@@ -XXX,XX +XXX,XX @@ static const ARMSSEInfo armsse_variants[] = {
++    if ((a->vd | a->vn) & a->q) {
-         .sram_banks = 4,
++        return false;
-         .num_cpus = 2,
++    }
-         .sys_version = 0x22041743,
++
-+        .cpuwait_rst = 2,
++    if (!vfp_access_check(s)) {
-         .sys_config_format = SSE200Format,
++        return true;
-         .has_mhus = true,
++    }
-         .has_ppus = true,
++
-@@ -XXX,XX +XXX,XX @@ static void armsse_realize(DeviceState *dev, Error **errp)
++    fn_gvec_ptr = (a->size ? gen_helper_gvec_fcmlas_idx
++                   : gen_helper_gvec_fcmlah_idx);
-         qdev_prop_set_uint32(cpudev, "num-irq", s->exp_numirq + 32);
++    opr_sz = (1 + a->q) * 8;
-         /*
++    fpst = get_fpstatus_ptr(1);
--         * In real hardware the initial Secure VTOR is set from the INITSVTOR0
++    tcg_gen_gvec_3_ptr(vfp_reg_offset(1, a->vd),
--         * register in the IoT Kit System Control Register block, and the
++                       vfp_reg_offset(1, a->vn),
--         * initial value of that is in turn specifiable by the FPGA that
++                       vfp_reg_offset(1, a->vm),
--         * instantiates the IoT Kit. In QEMU we don't implement this wrinkle,
++                       fpst, opr_sz, opr_sz,
--         * and simply set the CPU's init-svtor to the IoT Kit default value.
++                       (a->index << 2) | a->rot, fn_gvec_ptr);
--         * In SSE-200 the situation is similar, except that the default value
++    tcg_temp_free_ptr(fpst);
--         * is a reset-time signal input. Typically a board using the SSE-200
++    return true;
--         * will have a system control processor whose boot firmware initializes
++}
--         * the INITSVTOR* registers before powering up the CPUs in any case,
+diff --git a/target/arm/translate.c b/target/arm/translate.c
 -         * so the hardware's default value doesn't matter. QEMU doesn't emulate
 +         * In real hardware the initial Secure VTOR is set from the INITSVTOR*
 +         * registers in the IoT Kit System Control Register block. In QEMU
 +         * we set the initial value here, and also the reset value of the
 +         * sysctl register, from this object's QOM init-svtor property.
 +         * If the guest changes the INITSVTOR* registers at runtime then the
 +         * code in iotkit-sysctl.c will update the CPU init-svtor property
 +         * (which will then take effect on the next CPU warm-reset).
 +         *
 +         * Note that typically a board using the SSE-200 will have a system
 +         * control processor whose boot firmware initializes the INITSVTOR*
 +         * registers before powering up the CPUs. QEMU doesn't emulate
           * the control processor, so instead we behave in the way that the
 -         * firmware does. The initial value is configurable by the board code
 -         * to match whatever its firmware does.
 +         * firmware does: the initial value should be set by the board code
 +         * (using the init-svtor property on the ARMSSE object) to match
 +         * whatever its firmware does.
           */
          qdev_prop_set_uint32(cpudev, "init-svtor", s->init_svtor);
          /*
 -         * Start all CPUs except CPU0 powered down. In real hardware it is
 -         * a configurable property of the SSE-200 which CPUs start powered up
 -         * (via the CPUWAIT0_RST and CPUWAIT1_RST parameters), but since all
 -         * the boards we care about start CPU0 and leave CPU1 powered off,
 -         * we hard-code that for now. We can add QOM properties for this
 +         * CPUs start powered down if the corresponding bit in the CPUWAIT
 +         * register is 1. In real hardware the CPUWAIT register reset value is
 +         * a configurable property of the SSE-200 (via the CPUWAIT0_RST and
 +         * CPUWAIT1_RST parameters), but since all the boards we care about
 +         * start CPU0 and leave CPU1 powered off, we hard-code that in
 +         * info->cpuwait_rst for now. We can add QOM properties for this
           * later if necessary.
           */
 -        if (i > 0) {
 +        if (extract32(info->cpuwait_rst, i, 1)) {
              object_property_set_bool(cpuobj, true, "start-powered-off", &err);
              if (err) {
                  error_propagate(errp, err);
@@ -XXX,XX +XXX,XX @@ static void armsse_realize(DeviceState *dev, Error **errp)
      /* System control registers */
      object_property_set_int(OBJECT(&s->sysctl), info->sys_version,
                              "SYS_VERSION", &err);
 +    object_property_set_int(OBJECT(&s->sysctl), info->cpuwait_rst,
 +                            "CPUWAIT_RST", &err);
 +    object_property_set_int(OBJECT(&s->sysctl), s->init_svtor,
 +                            "INITSVTOR0_RST", &err);
 +    object_property_set_int(OBJECT(&s->sysctl), s->init_svtor,
 +                            "INITSVTOR1_RST", &err);
      object_property_set_bool(OBJECT(&s->sysctl), true, "realized", &err);
      if (err) {
          error_propagate(errp, err);
 diff --git a/hw/misc/iotkit-sysctl.c b/hw/misc/iotkit-sysctl.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/misc/iotkit-sysctl.c
+--- a/target/arm/translate.c
-+++ b/hw/misc/iotkit-sysctl.c
++++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static void iotkit_sysctl_reset(DeviceState *dev)
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
-     s->reset_syndrome = 1;
+     bool is_long = false, q = extract32(insn, 6, 1);
-     s->reset_mask = 0;
+     bool ptr_is_env = false;
-     s->gretreg = 0;
--    s->initsvtor0 = 0x10000000;
+-    if ((insn & 0xff000f10) == 0xfe000800) {
--    s->initsvtor1 = 0x10000000;
+-        /* VCMLA (indexed) -- 1111 1110 S.RR .... .... 1000 ...0 .... */
--    if (s->is_sse200) {
+-        int rot = extract32(insn, 20, 2);
--        /*
+-        int size = extract32(insn, 23, 1);
--         * CPU 0 starts on, CPU 1 starts off. In real hardware this is
+-        int index;
--         * configurable by the SoC integrator as a verilog parameter.
+-
--         */
+-        if (!dc_isar_feature(aa32_vcma, s)) {
--        s->cpuwait = 2;
+-            return 1;
--    } else {
+-        }
--        /* CPU 0 starts on */
+-        if (size == 0) {
--        s->cpuwait = 0;
+-            if (!dc_isar_feature(aa32_fp16_arith, s)) {
--    }
+-                return 1;
-+    s->initsvtor0 = s->initsvtor0_rst;
+-            }
-+    s->initsvtor1 = s->initsvtor1_rst;
+-            /* For fp16, rm is just Vm, and index is M.  */
-+    s->cpuwait = s->cpuwait_rst;
+-            rm = extract32(insn, 0, 4);
-     s->wicctrl = 0;
+-            index = extract32(insn, 5, 1);
-     s->scsecctrl = 0;
+-        } else {
-     s->fclk_div = 0;
+-            /* For fp32, rm is the usual M:Vm, and index is 0.  */
-@@ -XXX,XX +XXX,XX @@ static const VMStateDescription iotkit_sysctl_vmstate = {
+-            VFP_DREG_M(rm, insn);
+-            index = 0;
- static Property iotkit_sysctl_props[] = {
+-        }
-     DEFINE_PROP_UINT32("SYS_VERSION", IoTKitSysCtl, sys_version, 0),
+-        data = (index << 2) | rot;
-+    DEFINE_PROP_UINT32("CPUWAIT_RST", IoTKitSysCtl, cpuwait_rst, 0),
+-        fn_gvec_ptr = (size ? gen_helper_gvec_fcmlas_idx
-+    DEFINE_PROP_UINT32("INITSVTOR0_RST", IoTKitSysCtl, initsvtor0_rst,
+-                       : gen_helper_gvec_fcmlah_idx);
-+                       0x10000000),
+-    } else if ((insn & 0xffb00f00) == 0xfe200d00) {
-+    DEFINE_PROP_UINT32("INITSVTOR1_RST", IoTKitSysCtl, initsvtor1_rst,
++    if ((insn & 0xffb00f00) == 0xfe200d00) {
-+                       0x10000000),
+         /* V[US]DOT -- 1111 1110 0.10 .... .... 1101 .Q.U .... */
-     DEFINE_PROP_END_OF_LIST()
+         int u = extract32(insn, 4, 1);
  };
 --
 .20.1
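
The VCMLA (scalar) conversion keeps the legacy rule that fp16 and fp32 forms read Vm and the index from different bits, then packs index and rotation into the helper's data argument. A minimal standalone sketch of that rule follows, based on the removed legacy decoder. It is not QEMU code: bits() is a local stand-in for extract32(), and the type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for QEMU's extract32(): 'length' bits starting at bit 'start'. */
static uint32_t bits(uint32_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 32 - start);
    return (value >> start) & (~0u >> (32 - length));
}

typedef struct {
    unsigned size;   /* 0 = fp16, 1 = fp32 */
    unsigned rot;    /* rotation selector, 2 bits */
    unsigned index;  /* element index within Vm */
    unsigned vm;     /* scalar source register number */
} VCMLAScalar;

static VCMLAScalar decode_vcmla_scalar(uint32_t insn)
{
    VCMLAScalar a;

    /* VCMLA (indexed) -- 1111 1110 S.RR .... .... 1000 ...0 .... */
    assert((insn & 0xff000f10) == 0xfe000800);

    a.rot = bits(insn, 20, 2);
    a.size = bits(insn, 23, 1);
    if (a.size == 0) {
        /* fp16: the register is just Vm, and M is the index. */
        a.vm = bits(insn, 0, 4);
        a.index = bits(insn, 5, 1);
    } else {
        /* fp32: the register is the usual M:Vm, and the index is 0. */
        a.vm = bits(insn, 0, 4) | (bits(insn, 5, 1) << 4);
        a.index = 0;
    }
    return a;
}

/* Value passed as the 'data' argument of the gvec helper. */
static uint32_t vcmla_scalar_data(VCMLAScalar a)
{
    return (a.index << 2) | a.rot;
}
```

In the decodetree version the same split shows up as two VCMLA_scalar patterns, one with index:1 and vm:4 taken directly from the encoding and one with vm=%vm_dp and index=0.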

-New patch
+[PULL 28/39] target/arm: Convert V[US]DOT (scalar) to decodetree
+Convert the V[US]DOT (scalar) insns in the 2reg-scalar-ext group
+to decodetree.
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20200430181003.21682-10-peter.maydell@linaro.org
+---
+ target/arm/neon-shared.decode   |  3 +++
+ target/arm/translate-neon.inc.c | 35 +++++++++++++++++++++++++++++++++
+ target/arm/translate.c          | 13 +-----------
+files changed, 39 insertions(+), 12 deletions(-)
+diff --git a/target/arm/neon-shared.decode b/target/arm/neon-shared.decode
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/neon-shared.decode
++++ b/target/arm/neon-shared.decode
+@@ -XXX,XX +XXX,XX @@ VCMLA_scalar   1111 1110 0 . rot:2 .... .... 1000 . q:1 index:1 0 vm:4 \
+                vn=%vn_dp vd=%vd_dp size=0
+ VCMLA_scalar   1111 1110 1 . rot:2 .... .... 1000 . q:1 . 0 .... \
+                vm=%vm_dp vn=%vn_dp vd=%vd_dp size=1 index=0
++
++VDOT_scalar    1111 1110 0 . 10 .... .... 1101 . q:1 index:1 u:1 rm:4 \
++               vm=%vm_dp vn=%vn_dp vd=%vd_dp
+diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/translate-neon.inc.c
++++ b/target/arm/translate-neon.inc.c
+@@ -XXX,XX +XXX,XX @@ static bool trans_VCMLA_scalar(DisasContext *s, arg_VCMLA_scalar *a)
+     tcg_temp_free_ptr(fpst);
+     return true;
+ }
++
++static bool trans_VDOT_scalar(DisasContext *s, arg_VDOT_scalar *a)
++{
++    gen_helper_gvec_3 *fn_gvec;
++    int opr_sz;
++    TCGv_ptr fpst;
++
++    if (!dc_isar_feature(aa32_dp, s)) {
++        return false;
++    }
++
++    /* UNDEF accesses to D16-D31 if they don't exist. */
++    if (!dc_isar_feature(aa32_simd_r32, s) &&
++        ((a->vd | a->vn) & 0x10)) {
++        return false;
++    }
++
++    if ((a->vd | a->vn) & a->q) {
++        return false;
++    }
++
++    if (!vfp_access_check(s)) {
++        return true;
++    }
++
++    fn_gvec = a->u ? gen_helper_gvec_udot_idx_b : gen_helper_gvec_sdot_idx_b;
++    opr_sz = (1 + a->q) * 8;
++    fpst = get_fpstatus_ptr(1);
++    tcg_gen_gvec_3_ool(vfp_reg_offset(1, a->vd),
++                       vfp_reg_offset(1, a->vn),
++                       vfp_reg_offset(1, a->rm),
++                       opr_sz, opr_sz, a->index, fn_gvec);
++    tcg_temp_free_ptr(fpst);
++    return true;
++}
+diff --git a/target/arm/translate.c b/target/arm/translate.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/translate.c
++++ b/target/arm/translate.c
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
+     bool is_long = false, q = extract32(insn, 6, 1);
+     bool ptr_is_env = false;
+-    if ((insn & 0xffb00f00) == 0xfe200d00) {
+-        /* V[US]DOT -- 1111 1110 0.10 .... .... 1101 .Q.U .... */
+-        int u = extract32(insn, 4, 1);
+-
+-        if (!dc_isar_feature(aa32_dp, s)) {
+-            return 1;
+-        }
+-        fn_gvec = u ? gen_helper_gvec_udot_idx_b : gen_helper_gvec_sdot_idx_b;
+-        /* rm is just Vm, and index is M.  */
+-        data = extract32(insn, 5, 1); /* index */
+-        rm = extract32(insn, 0, 4);
+-    } else if ((insn & 0xffa00f10) == 0xfe000810) {
++    if ((insn & 0xffa00f10) == 0xfe000810) {
+         /* VFM[AS]L -- 1111 1110 0.0S .... .... 1000 .Q.1 .... */
+         int is_s = extract32(insn, 20, 1);
+         int vm20 = extract32(insn, 0, 3);
+--
+.20.1
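
The legacy decoder tests such as "(insn & 0xffb00f00) == 0xfe200d00" and the bit-pattern comments next to them describe the same thing: fixed bits contribute to both mask and value, and every other character is a don't-care. A minimal standalone sketch that derives the mask and value from a pattern string is shown below. It is not how decodetree itself is implemented, and the names BitPattern, pattern_from_string() and pattern_matches() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t mask;   /* 1 where the pattern fixes a bit */
    uint32_t value;  /* required value of the fixed bits */
} BitPattern;

/*
 * Build mask/value from a 32-bit pattern written most significant bit
 * first. '0' and '1' are fixed bits, spaces are ignored, and any other
 * character (such as '.', 'Q' or 'U') is treated as a don't-care bit.
 */
static BitPattern pattern_from_string(const char *p)
{
    BitPattern bp = { 0, 0 };
    int nbits = 0;

    for (; *p; p++) {
        if (*p == ' ') {
            continue;
        }
        bp.mask <<= 1;
        bp.value <<= 1;
        if (*p == '0' || *p == '1') {
            bp.mask |= 1;
            bp.value |= (*p == '1');
        }
        nbits++;
    }
    assert(nbits == 32);
    return bp;
}

/* Equivalent of the "(insn & mask) == value" tests in the legacy decoder. */
static int pattern_matches(BitPattern bp, uint32_t insn)
{
    return (insn & bp.mask) == bp.value;
}
```

For example, feeding in the V[US]DOT comment string "1111 1110 0.10 .... .... 1101 .Q.U ...." gives back the mask 0xffb00f00 and value 0xfe200d00 used by the removed check.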

-New patch
+[PULL 29/39] target/arm: Convert VFM[AS]L (scalar) to decodetree
+Convert the VFM[AS]L (scalar) insns in the 2reg-scalar-ext group
 to decodetree. These are the last ones in the group so we can remove
 all the legacy decode for the group.
 Note that in disas_thumb2_insn() the parts of this encoding space
 where the decodetree decoder returns false will correctly be directed
 to illegal_op by the "(insn & (1 << 28))" check so they won't fall
 into disas_coproc_insn() by mistake.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
 Message-id: 20200430181003.21682-11-peter.maydell@linaro.org
 ---
  target/arm/neon-shared.decode   |   7 +++
  target/arm/translate-neon.inc.c |  32 ++++++++++
  target/arm/translate.c          | 107 +-------------------------------
 files changed, 40 insertions(+), 106 deletions(-)
 diff --git a/target/arm/neon-shared.decode b/target/arm/neon-shared.decode
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/neon-shared.decode
 +++ b/target/arm/neon-shared.decode
@@ -XXX,XX +XXX,XX @@ VCMLA_scalar   1111 1110 1 . rot:2 .... .... 1000 . q:1 . 0 .... \
  VDOT_scalar    1111 1110 0 . 10 .... .... 1101 . q:1 index:1 u:1 rm:4 \
                 vm=%vm_dp vn=%vn_dp vd=%vd_dp
 +
 +%vfml_scalar_q0_rm 0:3 5:1
 +%vfml_scalar_q1_index 5:1 3:1
 +VFML_scalar    1111 1110 0 . 0 s:1 .... .... 1000 . 0 . 1 index:1 ... \
 +               rm=%vfml_scalar_q0_rm vn=%vn_sp vd=%vd_dp q=0
 +VFML_scalar    1111 1110 0 . 0 s:1 .... .... 1000 . 1 . 1 . rm:3 \
 +               index=%vfml_scalar_q1_index vn=%vn_dp vd=%vd_dp q=1
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VDOT_scalar(DisasContext *s, arg_VDOT_scalar *a)
      tcg_temp_free_ptr(fpst);
      return true;
  }
 +
 +static bool trans_VFML_scalar(DisasContext *s, arg_VFML_scalar *a)
 +{
 +    int opr_sz;
 +
 +    if (!dc_isar_feature(aa32_fhm, s)) {
 +        return false;
 +    }
 +
 +    /* UNDEF accesses to D16-D31 if they don't exist. */
 +    if (!dc_isar_feature(aa32_simd_r32, s) &&
 +        ((a->vd & 0x10) || (a->q && (a->vn & 0x10)))) {
 +        return false;
 +    }
 +
 +    if (a->vd & a->q) {
 +        return false;
 +    }
 +
 +    if (!vfp_access_check(s)) {
 +        return true;
 +    }
 +
 +    opr_sz = (1 + a->q) * 8;
 +    tcg_gen_gvec_3_ptr(vfp_reg_offset(1, a->vd),
 +                       vfp_reg_offset(a->q, a->vn),
 +                       vfp_reg_offset(a->q, a->rm),
 +                       cpu_env, opr_sz, opr_sz,
 +                       (a->index << 2) | a->s, /* is_2 == 0 */
 +                       gen_helper_gvec_fmlal_idx_a32);
 +    return true;
 +}
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_dsp_insn(DisasContext *s, uint32_t insn)
  }
  #define VFP_REG_SHR(x, n) (((n) > 0) ? (x) >> (n) : (x) << -(n))
 -#define VFP_SREG(insn, bigbit, smallbit) \
 -  ((VFP_REG_SHR(insn, bigbit - 1) & 0x1e) | (((insn) >> (smallbit)) & 1))
  #define VFP_DREG(reg, insn, bigbit, smallbit) do { \
      if (dc_isar_feature(aa32_simd_r32, s)) { \
          reg = (((insn) >> (bigbit)) & 0x0f) \
@@ -XXX,XX +XXX,XX @@ static int disas_dsp_insn(DisasContext *s, uint32_t insn)
          reg = ((insn) >> (bigbit)) & 0x0f; \
      }} while (0)
 -#define VFP_SREG_D(insn) VFP_SREG(insn, 12, 22)
  #define VFP_DREG_D(reg, insn) VFP_DREG(reg, insn, 12, 22)
 -#define VFP_SREG_N(insn) VFP_SREG(insn, 16,  7)
  #define VFP_DREG_N(reg, insn) VFP_DREG(reg, insn, 16,  7)
 -#define VFP_SREG_M(insn) VFP_SREG(insn,  0,  5)
  #define VFP_DREG_M(reg, insn) VFP_DREG(reg, insn,  0,  5)
  static void gen_neon_dup_low16(TCGv_i32 var)
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
      return 0;
  }
 -/* Advanced SIMD two registers and a scalar extension.
 - *  31             24   23  22   20   16   12  11   10   9    8        3     0
 - * +-----------------+----+---+----+----+----+---+----+---+----+---------+----+
 - * | 1 1 1 1 1 1 1 0 | o1 | D | o2 | Vn | Vd | 1 | o3 | 0 | o4 | N Q M U | Vm |
 - * +-----------------+----+---+----+----+----+---+----+---+----+---------+----+
 - *
 - */
 -
 -static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
 -{
 -    gen_helper_gvec_3 *fn_gvec = NULL;
 -    gen_helper_gvec_3_ptr *fn_gvec_ptr = NULL;
 -    int rd, rn, rm, opr_sz, data;
 -    int off_rn, off_rm;
 -    bool is_long = false, q = extract32(insn, 6, 1);
 -    bool ptr_is_env = false;
 -
 -    if ((insn & 0xffa00f10) == 0xfe000810) {
 -        /* VFM[AS]L -- 1111 1110 0.0S .... .... 1000 .Q.1 .... */
 -        int is_s = extract32(insn, 20, 1);
 -        int vm20 = extract32(insn, 0, 3);
 -        int vm3 = extract32(insn, 3, 1);
 -        int m = extract32(insn, 5, 1);
 -        int index;
 -
 -        if (!dc_isar_feature(aa32_fhm, s)) {
 -            return 1;
 -        }
 -        if (q) {
 -            rm = vm20;
 -            index = m * 2 + vm3;
 -        } else {
 -            rm = vm20 * 2 + m;
 -            index = vm3;
 -        }
 -        is_long = true;
 -        data = (index << 2) | is_s; /* is_2 == 0 */
 -        fn_gvec_ptr = gen_helper_gvec_fmlal_idx_a32;
 -        ptr_is_env = true;
 -    } else {
 -        return 1;
 -    }
 -
 -    VFP_DREG_D(rd, insn);
 -    if (rd & q) {
 -        return 1;
 -    }
 -    if (q || !is_long) {
 -        VFP_DREG_N(rn, insn);
 -        if (rn & q & !is_long) {
 -            return 1;
 -        }
 -        off_rn = vfp_reg_offset(1, rn);
 -        off_rm = vfp_reg_offset(1, rm);
 -    } else {
 -        rn = VFP_SREG_N(insn);
 -        off_rn = vfp_reg_offset(0, rn);
 -        off_rm = vfp_reg_offset(0, rm);
 -    }
 -    if (s->fp_excp_el) {
 -        gen_exception_insn(s, s->pc_curr, EXCP_UDEF,
 -                           syn_simd_access_trap(1, 0xe, false), s->fp_excp_el);
 -        return 0;
 -    }
 -    if (!s->vfp_enabled) {
 -        return 1;
 -    }
 -
 -    opr_sz = (1 + q) * 8;
 -    if (fn_gvec_ptr) {
 -        TCGv_ptr ptr;
 -        if (ptr_is_env) {
 -            ptr = cpu_env;
 -        } else {
 -            ptr = get_fpstatus_ptr(1);
 -        }
 -        tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd), off_rn, off_rm, ptr,
 -                           opr_sz, opr_sz, data, fn_gvec_ptr);
 -        if (!ptr_is_env) {
 -            tcg_temp_free_ptr(ptr);
 -        }
 -    } else {
 -        tcg_gen_gvec_3_ool(vfp_reg_offset(1, rd), off_rn, off_rm,
 -                           opr_sz, opr_sz, data, fn_gvec);
 -    }
 -    return 0;
 -}
 -
  static int disas_coproc_insn(DisasContext *s, uint32_t insn)
  {
      int cpnum, is64, crn, crm, opc1, opc2, isread, rt, rt2;
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
                      }
                  }
              }
 -        } else if ((insn & 0x0f000a00) == 0x0e000800
 -                   && arm_dc_feature(s, ARM_FEATURE_V8)) {
 -            if (disas_neon_insn_2reg_scalar_ext(s, insn)) {
 -                goto illegal_op;
 -            }
 -            return;
          }
          goto illegal_op;
      }
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
              }
              break;
          }
 -        if ((insn & 0xff000a00) == 0xfe000800
 -            && arm_dc_feature(s, ARM_FEATURE_V8)) {
 -            /* The Thumb2 and ARM encodings are identical.  */
 -            if (disas_neon_insn_2reg_scalar_ext(s, insn)) {
 -                goto illegal_op;
 -            }
 -        } else if (((insn >> 24) & 3) == 3) {
 +        if (((insn >> 24) & 3) == 3) {
              /* Translate into the equivalent ARM encoding.  */
              insn = (insn & 0xe2ffffff) | ((insn & (1 << 28)) >> 4) | (1 << 28);
              if (disas_neon_data_insn(s, insn)) {
 --
 .20.1
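
Note on the removed disas_neon_insn_2reg_scalar_ext(): the field extraction it performed for VFM[AS]L (scalar) can be modelled as a standalone C sketch, which may help when checking the decodetree replacement against the old behaviour. The names extract_bits(), struct vfml_scalar_fields and decode_vfml_scalar() are hypothetical and not part of QEMU; only the mask, the bit positions and the rm/index/data arithmetic are taken from the deleted code above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for QEMU's extract32(): `length` bits from bit `start`. */
static uint32_t extract_bits(uint32_t value, int start, int length)
{
    return (value >> start) & ((1u << length) - 1u);
}

/* Hypothetical container for the decoded fields (not a QEMU type). */
struct vfml_scalar_fields {
    int q;      /* 1 = quadword form */
    int is_s;   /* 1 = VFMSL, 0 = VFMAL */
    int rm;     /* register number holding the scalar */
    int index;  /* element index within that register */
    int data;   /* value the old code passed to the gvec helper */
};

/* Returns false when the pattern does not match (the old code returned 1). */
static bool decode_vfml_scalar(uint32_t insn, struct vfml_scalar_fields *out)
{
    /* VFM[AS]L -- 1111 1110 0.0S .... .... 1000 .Q.1 .... */
    if ((insn & 0xffa00f10u) != 0xfe000810u) {
        return false;
    }

    int vm20 = (int)extract_bits(insn, 0, 3);
    int vm3 = (int)extract_bits(insn, 3, 1);
    int m = (int)extract_bits(insn, 5, 1);

    out->q = (int)extract_bits(insn, 6, 1);
    out->is_s = (int)extract_bits(insn, 20, 1);
    if (out->q) {
        out->rm = vm20;
        out->index = m * 2 + vm3;
    } else {
        out->rm = vm20 * 2 + m;
        out->index = vm3;
    }
    /* is_2 == 0, as in the original comment */
    out->data = (out->index << 2) | out->is_s;
    return true;
}
```

With this sketch, 0xfe10087d (Q=1, S=1, M=1, Vm=0b1101) decodes to rm=5, index=3, data=13, and the same word with Q clear (0xfe10083d) decodes to rm=11, index=1, data=5.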

-[Qemu-devel] [PULL 07/16] hw/arm/iotkit-sysctl: Implement CPUWAIT and INITSVTOR*
+[PULL 30/39] target/arm: Convert Neon load/store multiple structures to decodetree
-The CPUWAIT register acts as a sort of power-control: if a bit
+Convert the Neon "load/store multiple structures" insns to decodetree.
 in it is 1 then the CPU will have been forced into waiting
 when the system was reset (which in QEMU we model as the
 CPU starting powered off). Writing a 0 to the register will
 allow the CPU to boot (for QEMU, we model this as powering
 it on). Note that writing 0 to the register does not power
 off a CPU.
 For this to work correctly we need to also honour the
 INITSVTOR* registers, which let the guest control where the
 CPU will load its SP and PC from when it comes out of reset.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190219125808.25174-8-peter.maydell@linaro.org
+Message-id: 20200430181003.21682-12-peter.maydell@linaro.org
 ---
- hw/misc/iotkit-sysctl.c | 41 +++++++++++++++++++++++++++++++++++++----
+ target/arm/neon-ls.decode       |   7 ++
-file changed, 37 insertions(+), 4 deletions(-)
+ target/arm/translate-neon.inc.c | 124 ++++++++++++++++++++++++++++++++
+ target/arm/translate.c          |  91 +----------------------
-diff --git a/hw/misc/iotkit-sysctl.c b/hw/misc/iotkit-sysctl.c
+files changed, 133 insertions(+), 89 deletions(-)
 diff --git a/target/arm/neon-ls.decode b/target/arm/neon-ls.decode
 index XXXXXXX..XXXXXXX 100644
---- a/hw/misc/iotkit-sysctl.c
+--- a/target/arm/neon-ls.decode
-+++ b/hw/misc/iotkit-sysctl.c
++++ b/target/arm/neon-ls.decode
 @@ -XXX,XX +XXX,XX @@
- #include "hw/sysbus.h"
+ #   0b1111_1001_xxx0_xxxx_xxxx_xxxx_xxxx_xxxx
- #include "hw/registerfields.h"
+ # This file works on the A32 encoding only; calling code for T32 has to
- #include "hw/misc/iotkit-sysctl.h"
+ # transform the insn into the A32 version first.
-+#include "target/arm/arm-powerctl.h"
++
-+#include "target/arm/cpu.h"
++%vd_dp  22:1 12:4
++
- REG32(SECDBGSTAT, 0x0)
++# Neon load/store multiple structures
- REG32(SECDBGSET, 0x4)
++
-@@ -XXX,XX +XXX,XX @@ static const int sysctl_id[] = {
++VLDST_multiple 1111 0100 0 . l:1 0 rn:4 .... itype:4 size:2 align:2 rm:4 \
-x0d, 0xf0, 0x05, 0xb1, /* CID0..CID3 */
++               vd=%vd_dp
- };
+diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
+index XXXXXXX..XXXXXXX 100644
-+/*
+--- a/target/arm/translate-neon.inc.c
-+ * Set the initial secure vector table offset address for the core.
++++ b/target/arm/translate-neon.inc.c
-+ * This will take effect when the CPU next resets.
+@@ -XXX,XX +XXX,XX @@ static bool trans_VFML_scalar(DisasContext *s, arg_VFML_scalar *a)
-+ */
+                        gen_helper_gvec_fmlal_idx_a32);
-+static void set_init_vtor(uint64_t cpuid, uint32_t vtor)
+     return true;
  }
 +
 +static struct {
 +    int nregs;
 +    int interleave;
 +    int spacing;
 +} const neon_ls_element_type[11] = {
 +    {1, 4, 1},
 +    {1, 4, 2},
 +    {4, 1, 1},
 +    {2, 2, 2},
 +    {1, 3, 1},
 +    {1, 3, 2},
 +    {3, 1, 1},
 +    {1, 1, 1},
 +    {1, 2, 1},
 +    {1, 2, 2},
 +    {2, 1, 1}
 +};
 +
 +static void gen_neon_ldst_base_update(DisasContext *s, int rm, int rn,
 +                                      int stride)
 +{
-+    Object *cpuobj = OBJECT(arm_get_cpu_by_id(cpuid));
++    if (rm != 15) {
-+
++        TCGv_i32 base;
-+    if (cpuobj) {
++
-+        if (object_property_find(cpuobj, "init-svtor", NULL)) {
++        base = load_reg(s, rn);
-+            object_property_set_uint(cpuobj, vtor, "init-svtor", &error_abort);
++        if (rm == 13) {
-+        }
++            tcg_gen_addi_i32(base, base, stride);
 +        } else {
 +            TCGv_i32 index;
 +            index = load_reg(s, rm);
 +            tcg_gen_add_i32(base, base, index);
 +            tcg_temp_free_i32(index);
 +        }
 +        store_reg(s, rn, base);
 +    }
 +}
 +
- static uint64_t iotkit_sysctl_read(void *opaque, hwaddr offset,
++static bool trans_VLDST_multiple(DisasContext *s, arg_VLDST_multiple *a)
-                                     unsigned size)
++{
 +    /* Neon load/store multiple structures */
 +    int nregs, interleave, spacing, reg, n;
 +    MemOp endian = s->be_data;
 +    int mmu_idx = get_mem_index(s);
 +    int size = a->size;
 +    TCGv_i64 tmp64;
 +    TCGv_i32 addr, tmp;
 +
 +    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
 +        return false;
 +    }
 +
 +    /* UNDEF accesses to D16-D31 if they don't exist */
 +    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vd & 0x10)) {
 +        return false;
 +    }
 +    if (a->itype > 10) {
 +        return false;
 +    }
 +    /* Catch UNDEF cases for bad values of align field */
 +    switch (a->itype & 0xc) {
 +    case 4:
 +        if (a->align >= 2) {
 +            return false;
 +        }
 +        break;
 +    case 8:
 +        if (a->align == 3) {
 +            return false;
 +        }
 +        break;
 +    default:
 +        break;
 +    }
 +    nregs = neon_ls_element_type[a->itype].nregs;
 +    interleave = neon_ls_element_type[a->itype].interleave;
 +    spacing = neon_ls_element_type[a->itype].spacing;
 +    if (size == 3 && (interleave | spacing) != 1) {
 +        return false;
 +    }
 +
 +    if (!vfp_access_check(s)) {
 +        return true;
 +    }
 +
 +    /* For our purposes, bytes are always little-endian.  */
 +    if (size == 0) {
 +        endian = MO_LE;
 +    }
 +    /*
 +     * Consecutive little-endian elements from a single register
 +     * can be promoted to a larger little-endian operation.
 +     */
 +    if (interleave == 1 && endian == MO_LE) {
 +        size = 3;
 +    }
 +    tmp64 = tcg_temp_new_i64();
 +    addr = tcg_temp_new_i32();
 +    tmp = tcg_const_i32(1 << size);
 +    load_reg_var(s, addr, a->rn);
 +    for (reg = 0; reg < nregs; reg++) {
 +        for (n = 0; n < 8 >> size; n++) {
 +            int xs;
 +            for (xs = 0; xs < interleave; xs++) {
 +                int tt = a->vd + reg + spacing * xs;
 +
 +                if (a->l) {
 +                    gen_aa32_ld_i64(s, tmp64, addr, mmu_idx, endian | size);
 +                    neon_store_element64(tt, n, size, tmp64);
 +                } else {
 +                    neon_load_element64(tmp64, tt, n, size);
 +                    gen_aa32_st_i64(s, tmp64, addr, mmu_idx, endian | size);
 +                }
 +                tcg_gen_add_i32(addr, addr, tmp);
 +            }
 +        }
 +    }
 +    tcg_temp_free_i32(addr);
 +    tcg_temp_free_i32(tmp);
 +    tcg_temp_free_i64(tmp64);
 +
 +    gen_neon_ldst_base_update(s, a->rm, a->rn, nregs * interleave * 8);
 +    return true;
 +}
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_neon_trn_u16(TCGv_i32 t0, TCGv_i32 t1)
  }
 -static struct {
 -    int nregs;
 -    int interleave;
 -    int spacing;
 -} const neon_ls_element_type[11] = {
 -    {1, 4, 1},
 -    {1, 4, 2},
 -    {4, 1, 1},
 -    {2, 2, 2},
 -    {1, 3, 1},
 -    {1, 3, 2},
 -    {3, 1, 1},
 -    {1, 1, 1},
 -    {1, 2, 1},
 -    {1, 2, 2},
 -    {2, 1, 1}
 -};
 -
  /* Translate a NEON load/store element instruction.  Return nonzero if the
     instruction is invalid.  */
  static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
  {
-@@ -XXX,XX +XXX,XX @@ static void iotkit_sysctl_write(void *opaque, hwaddr offset,
+     int rd, rn, rm;
-         s->gretreg = value;
+-    int op;
-         break;
+     int nregs;
-     case A_INITSVTOR0:
+-    int interleave;
--        qemu_log_mask(LOG_UNIMP, "IoTKit SysCtl INITSVTOR0 unimplemented\n");
+-    int spacing;
-         s->initsvtor0 = value;
+     int stride;
-+        set_init_vtor(0, s->initsvtor0);
+     int size;
-         break;
+     int reg;
-     case A_CPUWAIT:
+     int load;
--        qemu_log_mask(LOG_UNIMP, "IoTKit SysCtl CPUWAIT unimplemented\n");
+-    int n;
-+        if ((s->cpuwait & 1) && !(value & 1)) {
+     int vec_size;
-+            /* Powering up CPU 0 */
+-    int mmu_idx;
-+            arm_set_cpu_on_and_reset(0);
+-    MemOp endian;
-+        }
+     TCGv_i32 addr;
-+        if ((s->cpuwait & 2) && !(value & 2)) {
+     TCGv_i32 tmp;
-+            /* Powering up CPU 1 */
+-    TCGv_i32 tmp2;
-+            arm_set_cpu_on_and_reset(1);
+-    TCGv_i64 tmp64;
-+        }
-         s->cpuwait = value;
+     if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
-         break;
+         return 1;
-     case A_WICCTRL:
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
-@@ -XXX,XX +XXX,XX @@ static void iotkit_sysctl_write(void *opaque, hwaddr offset,
+     rn = (insn >> 16) & 0xf;
-         if (!s->is_sse200) {
+     rm = insn & 0xf;
-             goto bad_offset;
+     load = (insn & (1 << 21)) != 0;
-         }
+-    endian = s->be_data;
--        qemu_log_mask(LOG_UNIMP, "IoTKit SysCtl INITSVTOR1 unimplemented\n");
+-    mmu_idx = get_mem_index(s);
-         s->initsvtor1 = value;
+     if ((insn & (1 << 23)) == 0) {
-+        set_init_vtor(1, s->initsvtor1);
+-        /* Load store all elements.  */
-         break;
+-        op = (insn >> 8) & 0xf;
-     case A_EWCTRL:
+-        size = (insn >> 6) & 3;
-         if (!s->is_sse200) {
+-        if (op > 10)
-@@ -XXX,XX +XXX,XX @@ static void iotkit_sysctl_reset(DeviceState *dev)
+-            return 1;
-     s->gretreg = 0;
+-        /* Catch UNDEF cases for bad values of align field */
-     s->initsvtor0 = 0x10000000;
+-        switch (op & 0xc) {
-     s->initsvtor1 = 0x10000000;
+-        case 4:
--    s->cpuwait = 0;
+-            if (((insn >> 5) & 1) == 1) {
-+    if (s->is_sse200) {
+-                return 1;
-+        /*
+-            }
-+         * CPU 0 starts on, CPU 1 starts off. In real hardware this is
+-            break;
-+         * configurable by the SoC integrator as a verilog parameter.
+-        case 8:
-+         */
+-            if (((insn >> 4) & 3) == 3) {
-+        s->cpuwait = 2;
+-                return 1;
-+    } else {
+-            }
-+        /* CPU 0 starts on */
+-            break;
-+        s->cpuwait = 0;
+-        default:
-+    }
+-            break;
-     s->wicctrl = 0;
+-        }
-     s->scsecctrl = 0;
+-        nregs = neon_ls_element_type[op].nregs;
-     s->fclk_div = 0;
+-        interleave = neon_ls_element_type[op].interleave;
 -        spacing = neon_ls_element_type[op].spacing;
 -        if (size == 3 && (interleave | spacing) != 1) {
 -            return 1;
 -        }
 -        /* For our purposes, bytes are always little-endian.  */
 -        if (size == 0) {
 -            endian = MO_LE;
 -        }
 -        /* Consecutive little-endian elements from a single register
 -         * can be promoted to a larger little-endian operation.
 -         */
 -        if (interleave == 1 && endian == MO_LE) {
 -            size = 3;
 -        }
 -        tmp64 = tcg_temp_new_i64();
 -        addr = tcg_temp_new_i32();
 -        tmp2 = tcg_const_i32(1 << size);
 -        load_reg_var(s, addr, rn);
 -        for (reg = 0; reg < nregs; reg++) {
 -            for (n = 0; n < 8 >> size; n++) {
 -                int xs;
 -                for (xs = 0; xs < interleave; xs++) {
 -                    int tt = rd + reg + spacing * xs;
 -
 -                    if (load) {
 -                        gen_aa32_ld_i64(s, tmp64, addr, mmu_idx, endian | size);
 -                        neon_store_element64(tt, n, size, tmp64);
 -                    } else {
 -                        neon_load_element64(tmp64, tt, n, size);
 -                        gen_aa32_st_i64(s, tmp64, addr, mmu_idx, endian | size);
 -                    }
 -                    tcg_gen_add_i32(addr, addr, tmp2);
 -                }
 -            }
 -        }
 -        tcg_temp_free_i32(addr);
 -        tcg_temp_free_i32(tmp2);
 -        tcg_temp_free_i64(tmp64);
 -        stride = nregs * interleave * 8;
 +        /* Load store all elements -- handled already by decodetree */
 +        return 1;
      } else {
          size = (insn >> 10) & 3;
          if (size == 3) {
 --
 .20.1
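
Note on trans_VLDST_multiple(): the order in which D registers are touched, and the base writeback amount, follow directly from the neon_ls_element_type table and the three nested loops. The following is a minimal standalone model, assuming the encoded size is used as-is (the little-endian promotion to size 3, the D16-D31 check and the FP access check are deliberately left out). struct ls_plan and plan_vldst_multiple() are hypothetical names used only for illustration.

```c
#include <assert.h>

struct ls_element_type {
    int nregs;
    int interleave;
    int spacing;
};

/* Same values as the neon_ls_element_type[] table in the patch. */
static const struct ls_element_type ls_element_type[11] = {
    {1, 4, 1}, {1, 4, 2}, {4, 1, 1}, {2, 2, 2}, {1, 3, 1}, {1, 3, 2},
    {3, 1, 1}, {1, 1, 1}, {1, 2, 1}, {1, 2, 2}, {2, 1, 1}
};

/* Worst case: byte elements, 8 per D register, 4 registers in total. */
#define MAX_ACCESSES 32

/* Hypothetical record of what the generated code would access. */
struct ls_plan {
    int count;                  /* number of element accesses */
    int dreg[MAX_ACCESSES];     /* D register touched by each access */
    int element[MAX_ACCESSES];  /* element index within that register */
    int writeback;              /* bytes added to Rn when Rm == 13 */
};

/* Returns 0 on success, -1 where the patch returns false (UNDEF). */
static int plan_vldst_multiple(int itype, int size, int align, int vd,
                               struct ls_plan *plan)
{
    if (itype < 0 || itype > 10) {
        return -1;
    }
    /* Catch UNDEF cases for bad values of the align field */
    switch (itype & 0xc) {
    case 4:
        if (align >= 2) {
            return -1;
        }
        break;
    case 8:
        if (align == 3) {
            return -1;
        }
        break;
    default:
        break;
    }

    const struct ls_element_type *t = &ls_element_type[itype];
    if (size == 3 && (t->interleave | t->spacing) != 1) {
        return -1;
    }

    plan->count = 0;
    for (int reg = 0; reg < t->nregs; reg++) {
        for (int n = 0; n < 8 >> size; n++) {
            for (int xs = 0; xs < t->interleave; xs++) {
                plan->dreg[plan->count] = vd + reg + t->spacing * xs;
                plan->element[plan->count] = n;
                plan->count++;
            }
        }
    }
    plan->writeback = t->nregs * t->interleave * 8;
    return 0;
}
```

For itype 8 (two interleaved registers, spacing 1) with 32-bit elements and vd 0, the model visits D0, D1, D0, D1 and reports a writeback of 16 bytes.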

-[Qemu-devel] [PULL 04/16] target/arm/arm-powerctl: Add new arm_set_cpu_on_and_reset()
+[PULL 31/39] target/arm: Convert Neon 'load single structure to all lanes' to decodetree
-Currently the Arm arm-powerctl.h APIs allow:
+Convert the Neon "load single structure to all lanes" insns to
- * arm_set_cpu_on(), which powers on a CPU and sets its
+decodetree.
    initial PC and other startup state
  * arm_reset_cpu(), which resets a CPU which is already on
    (and fails if the CPU is powered off)
 but there is no way to say "power on a CPU as if it had
 just come out of reset and don't do anything else to it".
 Add a new function arm_set_cpu_on_and_reset(), which does this.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190219125808.25174-5-peter.maydell@linaro.org
+Message-id: 20200430181003.21682-13-peter.maydell@linaro.org
 ---
- target/arm/arm-powerctl.h | 16 +++++++++++
+ target/arm/neon-ls.decode       |  5 +++
- target/arm/arm-powerctl.c | 56 +++++++++++++++++++++++++++++++++++++++
+ target/arm/translate-neon.inc.c | 73 +++++++++++++++++++++++++++++++++
-files changed, 72 insertions(+)
+ target/arm/translate.c          | 55 +------------------------
 files changed, 80 insertions(+), 53 deletions(-)
-diff --git a/target/arm/arm-powerctl.h b/target/arm/arm-powerctl.h
+diff --git a/target/arm/neon-ls.decode b/target/arm/neon-ls.decode
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/arm-powerctl.h
+--- a/target/arm/neon-ls.decode
-+++ b/target/arm/arm-powerctl.h
++++ b/target/arm/neon-ls.decode
-@@ -XXX,XX +XXX,XX @@ int arm_set_cpu_off(uint64_t cpuid);
+@@ -XXX,XX +XXX,XX @@
-  */
- int arm_reset_cpu(uint64_t cpuid);
+ VLDST_multiple 1111 0100 0 . l:1 0 rn:4 .... itype:4 size:2 align:2 rm:4 \
+                vd=%vd_dp
 +/*
 + * arm_set_cpu_on_and_reset:
 + * @cpuid: the id of the CPU we want to start
 + *
 + * Start the cpu designated by @cpuid and put it through its normal
 + * CPU reset process. The CPU will start in the way it is architected
 + * to start after a power-on reset.
 + *
 + * Returns: QEMU_ARM_POWERCTL_RET_SUCCESS on success.
 + * QEMU_ARM_POWERCTL_INVALID_PARAM if there is no CPU with that ID.
 + * QEMU_ARM_POWERCTL_ALREADY_ON if the CPU is already on.
 + * QEMU_ARM_POWERCTL_ON_PENDING if the CPU is already partway through
 + * powering on.
 + */
 +int arm_set_cpu_on_and_reset(uint64_t cpuid);
 +
- #endif
++# Neon load single element to all lanes
-diff --git a/target/arm/arm-powerctl.c b/target/arm/arm-powerctl.c
++
 +VLD_all_lanes  1111 0100 1 . 1 0 rn:4 .... 11 n:2 size:2 t:1 a:1 rm:4 \
 +               vd=%vd_dp
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/arm-powerctl.c
+--- a/target/arm/translate-neon.inc.c
-+++ b/target/arm/arm-powerctl.c
++++ b/target/arm/translate-neon.inc.c
-@@ -XXX,XX +XXX,XX @@ int arm_set_cpu_on(uint64_t cpuid, uint64_t entry, uint64_t context_id,
+@@ -XXX,XX +XXX,XX @@ static bool trans_VLDST_multiple(DisasContext *s, arg_VLDST_multiple *a)
-     return QEMU_ARM_POWERCTL_RET_SUCCESS;
+     gen_neon_ldst_base_update(s, a->rm, a->rn, nregs * interleave * 8);
      return true;
  }
++
-+static void arm_set_cpu_on_and_reset_async_work(CPUState *target_cpu_state,
++static bool trans_VLD_all_lanes(DisasContext *s, arg_VLD_all_lanes *a)
 +                                                run_on_cpu_data data)
 +{
-+    ARMCPU *target_cpu = ARM_CPU(target_cpu_state);
++    /* Neon load single structure to all lanes */
 +    int reg, stride, vec_size;
 +    int vd = a->vd;
 +    int size = a->size;
 +    int nregs = a->n + 1;
 +    TCGv_i32 addr, tmp;
 +
-+    /* Initialize the cpu we are turning on */
++    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
-+    cpu_reset(target_cpu_state);
++        return false;
 +    target_cpu_state->halted = 0;
 +
 +    /* Finally set the power status */
 +    assert(qemu_mutex_iothread_locked());
 +    target_cpu->power_state = PSCI_ON;
 +}
 +
 +int arm_set_cpu_on_and_reset(uint64_t cpuid)
 +{
 +    CPUState *target_cpu_state;
 +    ARMCPU *target_cpu;
 +
 +    assert(qemu_mutex_iothread_locked());
 +
 +    /* Retrieve the cpu we are powering up */
 +    target_cpu_state = arm_get_cpu_by_id(cpuid);
 +    if (!target_cpu_state) {
 +        /* The cpu was not found */
 +        return QEMU_ARM_POWERCTL_INVALID_PARAM;
 +    }
 +
-+    target_cpu = ARM_CPU(target_cpu_state);
++    /* UNDEF accesses to D16-D31 if they don't exist */
-+    if (target_cpu->power_state == PSCI_ON) {
++    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vd & 0x10)) {
-+        qemu_log_mask(LOG_GUEST_ERROR,
++        return false;
-+                      "[ARM]%s: CPU %" PRId64 " is already on\n",
++    }
-+                      __func__, cpuid);
++
-+        return QEMU_ARM_POWERCTL_ALREADY_ON;
++    if (size == 3) {
 +        if (nregs != 4 || a->a == 0) {
 +            return false;
 +        }
 +        /* For VLD4 size == 3 a == 1 means 32 bits at 16 byte alignment */
 +        size = 2;
 +    }
 +    if (nregs == 1 && a->a == 1 && size == 0) {
 +        return false;
 +    }
 +    if (nregs == 3 && a->a == 1) {
 +        return false;
 +    }
 +
 +    if (!vfp_access_check(s)) {
 +        return true;
 +    }
 +
 +    /*
-+     * If another CPU has powered the target on we are in the state
++     * VLD1 to all lanes: T bit indicates how many Dregs to write.
-+     * ON_PENDING and additional attempts to power on the CPU should
++     * VLD2/3/4 to all lanes: T bit indicates register stride.
 +     * fail (see 6.6 Implementation CPU_ON/CPU_OFF races in the PSCI
 +     * spec)
 +     */
-+    if (target_cpu->power_state == PSCI_ON_PENDING) {
++    stride = a->t ? 2 : 1;
-+        qemu_log_mask(LOG_GUEST_ERROR,
++    vec_size = nregs == 1 ? stride * 8 : 8;
-+                      "[ARM]%s: CPU %" PRId64 " is already powering on\n",
++
-+                      __func__, cpuid);
++    tmp = tcg_temp_new_i32();
-+        return QEMU_ARM_POWERCTL_ON_PENDING;
++    addr = tcg_temp_new_i32();
 +    load_reg_var(s, addr, a->rn);
 +    for (reg = 0; reg < nregs; reg++) {
 +        gen_aa32_ld_i32(s, tmp, addr, get_mem_index(s),
 +                        s->be_data | size);
 +        if ((vd & 1) && vec_size == 16) {
 +            /*
 +             * We cannot write 16 bytes at once because the
 +             * destination is unaligned.
 +             */
 +            tcg_gen_gvec_dup_i32(size, neon_reg_offset(vd, 0),
 +                                 8, 8, tmp);
 +            tcg_gen_gvec_mov(0, neon_reg_offset(vd + 1, 0),
 +                             neon_reg_offset(vd, 0), 8, 8);
 +        } else {
 +            tcg_gen_gvec_dup_i32(size, neon_reg_offset(vd, 0),
 +                                 vec_size, vec_size, tmp);
 +        }
 +        tcg_gen_addi_i32(addr, addr, 1 << size);
 +        vd += stride;
 +    }
++    tcg_temp_free_i32(tmp);
++    tcg_temp_free_i32(addr);
 +
-+    async_run_on_cpu(target_cpu_state, arm_set_cpu_on_and_reset_async_work,
++    gen_neon_ldst_base_update(s, a->rm, a->rn, (1 << size) * nregs);
 +                     RUN_ON_CPU_NULL);
 +
-+    /* We are good to go */
++    return true;
 +    return QEMU_ARM_POWERCTL_RET_SUCCESS;
 +}
-+
+diff --git a/target/arm/translate.c b/target/arm/translate.c
- static void arm_set_cpu_off_async_work(CPUState *target_cpu_state,
+index XXXXXXX..XXXXXXX 100644
-                                        run_on_cpu_data data)
+--- a/target/arm/translate.c
- {
++++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
      int size;
      int reg;
      int load;
 -    int vec_size;
      TCGv_i32 addr;
      TCGv_i32 tmp;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
      } else {
          size = (insn >> 10) & 3;
          if (size == 3) {
 -            /* Load single element to all lanes.  */
 -            int a = (insn >> 4) & 1;
 -            if (!load) {
 -                return 1;
 -            }
 -            size = (insn >> 6) & 3;
 -            nregs = ((insn >> 8) & 3) + 1;
 -
 -            if (size == 3) {
 -                if (nregs != 4 || a == 0) {
 -                    return 1;
 -                }
 -                /* For VLD4 size==3 a == 1 means 32 bits at 16 byte alignment */
 -                size = 2;
 -            }
 -            if (nregs == 1 && a == 1 && size == 0) {
 -                return 1;
 -            }
 -            if (nregs == 3 && a == 1) {
 -                return 1;
 -            }
 -            addr = tcg_temp_new_i32();
 -            load_reg_var(s, addr, rn);
 -
 -            /* VLD1 to all lanes: bit 5 indicates how many Dregs to write.
 -             * VLD2/3/4 to all lanes: bit 5 indicates register stride.
 -             */
 -            stride = (insn & (1 << 5)) ? 2 : 1;
 -            vec_size = nregs == 1 ? stride * 8 : 8;
 -
 -            tmp = tcg_temp_new_i32();
 -            for (reg = 0; reg < nregs; reg++) {
 -                gen_aa32_ld_i32(s, tmp, addr, get_mem_index(s),
 -                                s->be_data | size);
 -                if ((rd & 1) && vec_size == 16) {
 -                    /* We cannot write 16 bytes at once because the
 -                     * destination is unaligned.
 -                     */
 -                    tcg_gen_gvec_dup_i32(size, neon_reg_offset(rd, 0),
 -                                         8, 8, tmp);
 -                    tcg_gen_gvec_mov(0, neon_reg_offset(rd + 1, 0),
 -                                     neon_reg_offset(rd, 0), 8, 8);
 -                } else {
 -                    tcg_gen_gvec_dup_i32(size, neon_reg_offset(rd, 0),
 -                                         vec_size, vec_size, tmp);
 -                }
 -                tcg_gen_addi_i32(addr, addr, 1 << size);
 -                rd += stride;
 -            }
 -            tcg_temp_free_i32(tmp);
 -            tcg_temp_free_i32(addr);
 -            stride = (1 << size) * nregs;
 +            /* Load single element to all lanes -- handled by decodetree  */
 +            return 1;
          } else {
              /* Single element.  */
              int idx = (insn >> 4) & 0xf;
 --
 .20.1
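
Note on trans_VLD_all_lanes(): the UNDEF checks and the derivation of stride, vec_size and the writeback amount depend only on the n, size, t and a fields, so they can be exercised in isolation. The sketch below is a standalone model under that assumption; struct all_lanes_plan and plan_vld_all_lanes() are hypothetical names, and the Neon feature, D16-D31 and FP access checks are omitted.

```c
#include <assert.h>

/* Hypothetical summary of what the translator would generate. */
struct all_lanes_plan {
    int nregs;      /* number of structure elements loaded */
    int size;       /* log2 of the element size after adjustment */
    int stride;     /* register stride between destinations */
    int vec_size;   /* bytes written per destination (8 or 16) */
    int writeback;  /* bytes added to Rn when Rm == 13 */
};

/* Returns 0 on success, -1 where the patch returns false (UNDEF). */
static int plan_vld_all_lanes(int n, int size, int t, int a,
                              struct all_lanes_plan *plan)
{
    int nregs = n + 1;

    if (size == 3) {
        if (nregs != 4 || a == 0) {
            return -1;
        }
        /* For VLD4 size == 3 a == 1 means 32 bits at 16 byte alignment */
        size = 2;
    }
    if (nregs == 1 && a == 1 && size == 0) {
        return -1;
    }
    if (nregs == 3 && a == 1) {
        return -1;
    }

    /*
     * VLD1 to all lanes: T bit indicates how many Dregs to write.
     * VLD2/3/4 to all lanes: T bit indicates register stride.
     */
    plan->nregs = nregs;
    plan->size = size;
    plan->stride = t ? 2 : 1;
    plan->vec_size = nregs == 1 ? plan->stride * 8 : 8;
    plan->writeback = (1 << size) * nregs;
    return 0;
}
```

In this model a VLD1 with t set and 32-bit elements yields vec_size 16 and a 4-byte writeback, while a VLD4 with size 3 and a set is treated as 32-bit elements with a 16-byte writeback.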

-[Qemu-devel] [PULL 03/16] target/arm/cpu: Allow init-svtor property to be set after realize
+[PULL 32/39] target/arm: Convert Neon 'load/store single structure' to decodetree
-Make the M-profile "init-svtor" property be settable after realize.
+Convert the Neon "load/store single structure to one lane" insns to
-This matches the hardware, where this is a config signal which
+decodetree.
-is sampled on CPU reset and can thus be changed between one
-reset and another. To do this we have to change the API we
+As this is the last set of insns in the neon load/store group,
-use to add the property.
+we can remove the whole disas_neon_ls_insn() function.
 (We will need this capability for the SSE-200.)
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190219125808.25174-4-peter.maydell@linaro.org
+Message-id: 20200430181003.21682-14-peter.maydell@linaro.org
 ---
- target/arm/cpu.c | 29 ++++++++++++++++++++++++-----
+ target/arm/neon-ls.decode       |  11 +++
-file changed, 24 insertions(+), 5 deletions(-)
+ target/arm/translate-neon.inc.c |  89 +++++++++++++++++++
+ target/arm/translate.c          | 147 --------------------------------
-diff --git a/target/arm/cpu.c b/target/arm/cpu.c
+files changed, 100 insertions(+), 147 deletions(-)
 diff --git a/target/arm/neon-ls.decode b/target/arm/neon-ls.decode
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.c
+--- a/target/arm/neon-ls.decode
-+++ b/target/arm/cpu.c
++++ b/target/arm/neon-ls.decode
@@ -XXX,XX +XXX,XX @@ VLDST_multiple 1111 0100 0 . l:1 0 rn:4 .... itype:4 size:2 align:2 rm:4 \
  VLD_all_lanes  1111 0100 1 . 1 0 rn:4 .... 11 n:2 size:2 t:1 a:1 rm:4 \
                 vd=%vd_dp
 +
 +# Neon load/store single structure to one lane
 +%imm1_5_p1 5:1 !function=plus1
 +%imm1_6_p1 6:1 !function=plus1
 +
 +VLDST_single   1111 0100 1 . l:1 0 rn:4 .... 00 n:2 reg_idx:3 align:1 rm:4 \
 +               vd=%vd_dp size=0 stride=1
 +VLDST_single   1111 0100 1 . l:1 0 rn:4 .... 01 n:2 reg_idx:2 align:2 rm:4 \
 +               vd=%vd_dp size=1 stride=%imm1_5_p1
 +VLDST_single   1111 0100 1 . l:1 0 rn:4 .... 10 n:2 reg_idx:1 align:3 rm:4 \
 +               vd=%vd_dp size=2 stride=%imm1_6_p1
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
 @@ -XXX,XX +XXX,XX @@
- #include "target/arm/idau.h"
+  * It might be possible to convert it to a standalone .c file eventually.
- #include "qemu/error-report.h"
+  */
- #include "qapi/error.h"
-+#include "qapi/visitor.h"
++static inline int plus1(DisasContext *s, int x)
  #include "cpu.h"
  #include "internals.h"
  #include "qemu-common.h"
@@ -XXX,XX +XXX,XX @@ static Property arm_cpu_pmsav7_dregion_property =
                                             pmsav7_dregion,
                                             qdev_prop_uint32, uint32_t);
 -/* M profile: initial value of the Secure VTOR */
 -static Property arm_cpu_initsvtor_property =
 -            DEFINE_PROP_UINT32("init-svtor", ARMCPU, init_svtor, 0);
 +static void arm_get_init_svtor(Object *obj, Visitor *v, const char *name,
 +                               void *opaque, Error **errp)
 +{
-+    ARMCPU *cpu = ARM_CPU(obj);
++    return x + 1;
 +
 +    visit_type_uint32(v, name, &cpu->init_svtor, errp);
 +}
 +
-+static void arm_set_init_svtor(Object *obj, Visitor *v, const char *name,
+ /* Include the generated Neon decoder */
-+                               void *opaque, Error **errp)
+ #include "decode-neon-dp.inc.c"
  #include "decode-neon-ls.inc.c"
@@ -XXX,XX +XXX,XX @@ static bool trans_VLD_all_lanes(DisasContext *s, arg_VLD_all_lanes *a)
      return true;
  }
 +
 +static bool trans_VLDST_single(DisasContext *s, arg_VLDST_single *a)
 +{
-+    ARMCPU *cpu = ARM_CPU(obj);
++    /* Neon load/store single structure to one lane */
-+
++    int reg;
-+    visit_type_uint32(v, name, &cpu->init_svtor, errp);
++    int nregs = a->n + 1;
 +    int vd = a->vd;
 +    TCGv_i32 addr, tmp;
 +
 +    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
 +        return false;
 +    }
 +
 +    /* UNDEF accesses to D16-D31 if they don't exist */
 +    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vd & 0x10)) {
 +        return false;
 +    }
 +
 +    /* Catch the UNDEF cases. This is unavoidably a bit messy. */
 +    switch (nregs) {
 +    case 1:
 +        if (((a->align & (1 << a->size)) != 0) ||
 +            (a->size == 2 && ((a->align & 3) == 1 || (a->align & 3) == 2))) {
 +            return false;
 +        }
 +        break;
 +    case 3:
 +        if ((a->align & 1) != 0) {
 +            return false;
 +        }
 +        /* fall through */
 +    case 2:
 +        if (a->size == 2 && (a->align & 2) != 0) {
 +            return false;
 +        }
 +        break;
 +    case 4:
 +        if ((a->size == 2) && ((a->align & 3) == 3)) {
 +            return false;
 +        }
 +        break;
 +    default:
 +        abort();
 +    }
 +    if ((vd + a->stride * (nregs - 1)) > 31) {
 +        /*
 +         * Attempts to write off the end of the register file are
 +         * UNPREDICTABLE; we choose to UNDEF because otherwise we would
 +         * access off the end of the array that holds the register data.
 +         */
 +        return false;
 +    }
 +
 +    if (!vfp_access_check(s)) {
 +        return true;
 +    }
 +
 +    tmp = tcg_temp_new_i32();
 +    addr = tcg_temp_new_i32();
 +    load_reg_var(s, addr, a->rn);
 +    /*
 +     * TODO: if we implemented alignment exceptions, we should check
 +     * addr against the alignment encoded in a->align here.
 +     */
 +    for (reg = 0; reg < nregs; reg++) {
 +        if (a->l) {
 +            gen_aa32_ld_i32(s, tmp, addr, get_mem_index(s),
 +                            s->be_data | a->size);
 +            neon_store_element(vd, a->reg_idx, a->size, tmp);
 +        } else { /* Store */
 +            neon_load_element(tmp, vd, a->reg_idx, a->size);
 +            gen_aa32_st_i32(s, tmp, addr, get_mem_index(s),
 +                            s->be_data | a->size);
 +        }
 +        vd += a->stride;
 +        tcg_gen_addi_i32(addr, addr, 1 << a->size);
 +    }
 +    tcg_temp_free_i32(addr);
 +    tcg_temp_free_i32(tmp);
 +
 +    gen_neon_ldst_base_update(s, a->rm, a->rn, (1 << a->size) * nregs);
 +
 +    return true;
 +}
+diff --git a/target/arm/translate.c b/target/arm/translate.c
- void arm_cpu_post_init(Object *obj)
+index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_neon_trn_u16(TCGv_i32 t0, TCGv_i32 t1)
      tcg_temp_free_i32(rd);
  }
 -
 -/* Translate a NEON load/store element instruction.  Return nonzero if the
 -   instruction is invalid.  */
 -static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
 -{
 -    int rd, rn, rm;
 -    int nregs;
 -    int stride;
 -    int size;
 -    int reg;
 -    int load;
 -    TCGv_i32 addr;
 -    TCGv_i32 tmp;
 -
 -    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
 -        return 1;
 -    }
 -
 -    /* FIXME: this access check should not take precedence over UNDEF
 -     * for invalid encodings; we will generate incorrect syndrome information
 -     * for attempts to execute invalid vfp/neon encodings with FP disabled.
 -     */
 -    if (s->fp_excp_el) {
 -        gen_exception_insn(s, s->pc_curr, EXCP_UDEF,
 -                           syn_simd_access_trap(1, 0xe, false), s->fp_excp_el);
 -        return 0;
 -    }
 -
 -    if (!s->vfp_enabled)
 -      return 1;
 -    VFP_DREG_D(rd, insn);
 -    rn = (insn >> 16) & 0xf;
 -    rm = insn & 0xf;
 -    load = (insn & (1 << 21)) != 0;
 -    if ((insn & (1 << 23)) == 0) {
 -        /* Load store all elements -- handled already by decodetree */
 -        return 1;
 -    } else {
 -        size = (insn >> 10) & 3;
 -        if (size == 3) {
 -            /* Load single element to all lanes -- handled by decodetree  */
 -            return 1;
 -        } else {
 -            /* Single element.  */
 -            int idx = (insn >> 4) & 0xf;
 -            int reg_idx;
 -            switch (size) {
 -            case 0:
 -                reg_idx = (insn >> 5) & 7;
 -                stride = 1;
 -                break;
 -            case 1:
 -                reg_idx = (insn >> 6) & 3;
 -                stride = (insn & (1 << 5)) ? 2 : 1;
 -                break;
 -            case 2:
 -                reg_idx = (insn >> 7) & 1;
 -                stride = (insn & (1 << 6)) ? 2 : 1;
 -                break;
 -            default:
 -                abort();
 -            }
 -            nregs = ((insn >> 8) & 3) + 1;
 -            /* Catch the UNDEF cases. This is unavoidably a bit messy. */
 -            switch (nregs) {
 -            case 1:
 -                if (((idx & (1 << size)) != 0) ||
 -                    (size == 2 && ((idx & 3) == 1 || (idx & 3) == 2))) {
 -                    return 1;
 -                }
 -                break;
 -            case 3:
 -                if ((idx & 1) != 0) {
 -                    return 1;
 -                }
 -                /* fall through */
 -            case 2:
 -                if (size == 2 && (idx & 2) != 0) {
 -                    return 1;
 -                }
 -                break;
 -            case 4:
 -                if ((size == 2) && ((idx & 3) == 3)) {
 -                    return 1;
 -                }
 -                break;
 -            default:
 -                abort();
 -            }
 -            if ((rd + stride * (nregs - 1)) > 31) {
 -                /* Attempts to write off the end of the register file
 -                 * are UNPREDICTABLE; we choose to UNDEF because otherwise
 -                 * the neon_load_reg() would write off the end of the array.
 -                 */
 -                return 1;
 -            }
 -            tmp = tcg_temp_new_i32();
 -            addr = tcg_temp_new_i32();
 -            load_reg_var(s, addr, rn);
 -            for (reg = 0; reg < nregs; reg++) {
 -                if (load) {
 -                    gen_aa32_ld_i32(s, tmp, addr, get_mem_index(s),
 -                                    s->be_data | size);
 -                    neon_store_element(rd, reg_idx, size, tmp);
 -                } else { /* Store */
 -                    neon_load_element(tmp, rd, reg_idx, size);
 -                    gen_aa32_st_i32(s, tmp, addr, get_mem_index(s),
 -                                    s->be_data | size);
 -                }
 -                rd += stride;
 -                tcg_gen_addi_i32(addr, addr, 1 << size);
 -            }
 -            tcg_temp_free_i32(addr);
 -            tcg_temp_free_i32(tmp);
 -            stride = nregs * (1 << size);
 -        }
 -    }
 -    if (rm != 15) {
 -        TCGv_i32 base;
 -
 -        base = load_reg(s, rn);
 -        if (rm == 13) {
 -            tcg_gen_addi_i32(base, base, stride);
 -        } else {
 -            TCGv_i32 index;
 -            index = load_reg(s, rm);
 -            tcg_gen_add_i32(base, base, index);
 -            tcg_temp_free_i32(index);
 -        }
 -        store_reg(s, rn, base);
 -    }
 -    return 0;
 -}
 -
  static inline void gen_neon_narrow(int size, TCGv_i32 dest, TCGv_i64 src)
  {
-@@ -XXX,XX +XXX,XX @@ void arm_cpu_post_init(Object *obj)
+     switch (size) {
-                                  qdev_prop_allow_set_link_before_realize,
+@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
-                                  OBJ_PROP_LINK_STRONG,
+             }
-                                  &error_abort);
+             return;
--        qdev_property_add_static(DEVICE(obj), &arm_cpu_initsvtor_property,
+         }
--                                 &error_abort);
+-        if ((insn & 0x0f100000) == 0x04000000) {
-+        /*
+-            /* NEON load/store.  */
-+         * M profile: initial value of the Secure VTOR. We can't just use
+-            if (disas_neon_ls_insn(s, insn)) {
-+         * a simple DEFINE_PROP_UINT32 for this because we want to permit
+-                goto illegal_op;
-+         * the property to be set after realize.
+-            }
-+         */
+-            return;
-+        object_property_add(obj, "init-svtor", "uint32",
+-        }
-+                            arm_get_init_svtor, arm_set_init_svtor,
+         if ((insn & 0x0e000f00) == 0x0c000100) {
-+                            NULL, NULL, &error_abort);
+             if (arm_dc_feature(s, ARM_FEATURE_IWMMXT)) {
-     }
+                 /* iWMMXt register transfer.  */
+@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
-     qdev_property_add_static(DEVICE(obj), &arm_cpu_cfgend_property,
+         }
          break;
      case 12:
 -        if ((insn & 0x01100000) == 0x01000000) {
 -            if (disas_neon_ls_insn(s, insn)) {
 -                goto illegal_op;
 -            }
 -            break;
 -        }
          goto illegal_op;
      default:
      illegal_op:
 --
 .20.1
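
Reading aid for the single-element load/store conversion above: a standalone C sketch of the per-register loop and the base-register writeback rule, modelled on plain arrays. The names ToyState and toy_vldst_single and the flat byte-array memory are assumptions made for illustration; this is not QEMU code and it ignores endianness, alignment and exception checks.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy state: 32 D registers of 8 bytes, 16 core registers. */
typedef struct {
    uint8_t dreg[32][8];
    uint32_t r[16];
    uint8_t mem[64];
} ToyState;

/*
 * One element of (1 << size) bytes is transferred per register. The
 * register number advances by 'stride' and the address by the element
 * size, mirroring the loop in the patch.
 */
static void toy_vldst_single(ToyState *st, int load, int vd, int reg_idx,
                             int size, int stride, int nregs,
                             int rn, int rm)
{
    uint32_t esize = 1u << size;
    uint32_t addr = st->r[rn];

    /* The caller must not run off the end of the register file. */
    assert(vd + stride * (nregs - 1) <= 31);

    for (int reg = 0; reg < nregs; reg++) {
        uint8_t *lane = &st->dreg[vd][reg_idx * esize];
        if (load) {
            memcpy(lane, &st->mem[addr], esize);
        } else { /* Store */
            memcpy(&st->mem[addr], lane, esize);
        }
        vd += stride;
        addr += esize;
    }

    /*
     * Base update, following the removed code: rm == 15 means no
     * writeback, rm == 13 means post-increment by the number of bytes
     * transferred, any other rm adds that index register.
     */
    if (rm != 15) {
        if (rm == 13) {
            st->r[rn] += esize * nregs;
        } else {
            st->r[rn] += st->r[rm];
        }
    }
}
```

With size = 1, nregs = 2 and stride = 2, a load starting at vd = 1 fills the selected 16-bit lane of D1 and D3 from consecutive halfwords, and rm == 13 advances the base register by 4.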

-[Qemu-devel] [PULL 01/16] hw/misc/armsse-mhu.c: Model the SSE-200 Message Handling Unit
+[PULL 33/39] target/arm: Convert Neon 3-reg-same VADD/VSUB to decodetree
-Implement a model of the Message Handling Unit (MHU) found in
+Convert the Neon 3-reg-same VADD and VSUB insns to decodetree.
-the Arm SSE-200. This is a simple device which just contains
-some registers which allow the two cores of the SSE-200
+Note that we don't need the neon_3r_sizes[op] check here because all
-to raise interrupts on each other.
+size values are OK for VADD and VSUB; we'll add this when we convert
 the first insn that has size restrictions.
 For this we need one of the GVecGen*Fn typedefs currently in
 translate-a64.h; move them all to translate.h as a block so they
 are visible to the 32-bit decoder.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190219125808.25174-2-peter.maydell@linaro.org
+Message-id: 20200430181003.21682-15-peter.maydell@linaro.org
 ---
- hw/misc/Makefile.objs           |   1 +
+ target/arm/translate-a64.h      |  9 --------
- include/hw/misc/armsse-mhu.h    |  44 +++++++
+ target/arm/translate.h          |  9 ++++++++
- hw/misc/armsse-mhu.c            | 198 ++++++++++++++++++++++++++++++++
+ target/arm/neon-dp.decode       | 17 +++++++++++++++
- MAINTAINERS                     |   2 +
+ target/arm/translate-neon.inc.c | 38 +++++++++++++++++++++++++++++++++
- default-configs/arm-softmmu.mak |   1 +
+ target/arm/translate.c          | 14 ++++--------
- hw/misc/trace-events            |   4 +
+files changed, 68 insertions(+), 19 deletions(-)
 files changed, 250 insertions(+)
  create mode 100644 include/hw/misc/armsse-mhu.h
  create mode 100644 hw/misc/armsse-mhu.c
-diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs
+diff --git a/target/arm/translate-a64.h b/target/arm/translate-a64.h
 index XXXXXXX..XXXXXXX 100644
---- a/hw/misc/Makefile.objs
+--- a/target/arm/translate-a64.h
-+++ b/hw/misc/Makefile.objs
++++ b/target/arm/translate-a64.h
-@@ -XXX,XX +XXX,XX @@ obj-$(CONFIG_IOTKIT_SECCTL) += iotkit-secctl.o
+@@ -XXX,XX +XXX,XX @@ static inline int vec_full_reg_size(DisasContext *s)
- obj-$(CONFIG_IOTKIT_SYSCTL) += iotkit-sysctl.o
- obj-$(CONFIG_IOTKIT_SYSINFO) += iotkit-sysinfo.o
+ bool disas_sve(DisasContext *, uint32_t);
- obj-$(CONFIG_ARMSSE_CPUID) += armsse-cpuid.o
-+obj-$(CONFIG_ARMSSE_MHU) += armsse-mhu.o
+-/* Note that the gvec expanders operate on offsets + sizes.  */
+-typedef void GVecGen2Fn(unsigned, uint32_t, uint32_t, uint32_t, uint32_t);
- obj-$(CONFIG_PVPANIC) += pvpanic.o
+-typedef void GVecGen2iFn(unsigned, uint32_t, uint32_t, int64_t,
- obj-$(CONFIG_AUX) += auxbus.o
+-                         uint32_t, uint32_t);
-diff --git a/include/hw/misc/armsse-mhu.h b/include/hw/misc/armsse-mhu.h
+-typedef void GVecGen3Fn(unsigned, uint32_t, uint32_t,
-new file mode 100644
+-                        uint32_t, uint32_t, uint32_t);
-index XXXXXXX..XXXXXXX
+-typedef void GVecGen4Fn(unsigned, uint32_t, uint32_t, uint32_t,
---- /dev/null
+-                        uint32_t, uint32_t, uint32_t);
-+++ b/include/hw/misc/armsse-mhu.h
+-
  #endif /* TARGET_ARM_TRANSLATE_A64_H */
 diff --git a/target/arm/translate.h b/target/arm/translate.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.h
 +++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ void gen_sshl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
  #define dc_isar_feature(name, ctx) \
      ({ DisasContext *ctx_ = (ctx); isar_feature_##name(ctx_->isar); })
 +/* Note that the gvec expanders operate on offsets + sizes.  */
 +typedef void GVecGen2Fn(unsigned, uint32_t, uint32_t, uint32_t, uint32_t);
 +typedef void GVecGen2iFn(unsigned, uint32_t, uint32_t, int64_t,
 +                         uint32_t, uint32_t);
 +typedef void GVecGen3Fn(unsigned, uint32_t, uint32_t,
 +                        uint32_t, uint32_t, uint32_t);
 +typedef void GVecGen4Fn(unsigned, uint32_t, uint32_t, uint32_t,
 +                        uint32_t, uint32_t, uint32_t);
 +
  #endif /* TARGET_ARM_TRANSLATE_H */
 diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/neon-dp.decode
 +++ b/target/arm/neon-dp.decode
 @@ -XXX,XX +XXX,XX @@
-+/*
+ #
-+ * ARM SSE-200 Message Handling Unit (MHU)
+ # This file is processed by scripts/decodetree.py
-+ *
+ #
-+ * Copyright (c) 2019 Linaro Limited
++# VFP/Neon register fields; same as vfp.decode
-+ * Written by Peter Maydell
++%vm_dp  5:1 0:4
-+ *
++%vn_dp  7:1 16:4
-+ *  This program is free software; you can redistribute it and/or modify
++%vd_dp  22:1 12:4
-+ *  it under the terms of the GNU General Public License version 2 or
-+ *  (at your option) any later version.
+ # Encodings for Neon data processing instructions where the T32 encoding
-+ */
+ # is a simple transformation of the A32 encoding.
@@ -XXX,XX +XXX,XX @@
  #   0b111p_1111_qqqq_qqqq_qqqq_qqqq_qqqq_qqqq
  # This file works on the A32 encoding only; calling code for T32 has to
  # transform the insn into the A32 version first.
 +
-+/*
++######################################################################
-+ * This is a model of the Message Handling Unit (MHU) which is part of the
++# 3-reg-same grouping:
-+ * Arm SSE-200 and documented in
++# 1111 001 U 0 D sz:2 Vn:4 Vd:4 opc:4 N Q M op Vm:4
-+ * http://infocenter.arm.com/help/topic/com.arm.doc.101104_0100_00_en/corelink_sse200_subsystem_for_embedded_technical_reference_manual_101104_0100_00_en.pdf
++######################################################################
 + *
 + * QEMU interface:
 + *  + sysbus MMIO region 0: the system information register bank
 + *  + sysbus IRQ 0: interrupt for CPU 0
 + *  + sysbus IRQ 1: interrupt for CPU 1
 + */
 +
-+#ifndef HW_MISC_SSE_MHU_H
++&3same vm vn vd q size
 +#define HW_MISC_SSE_MHU_H
 +
-+#include "hw/sysbus.h"
++@3same           .... ... . . . size:2 .... .... .... . q:1 . . .... \
 +                 &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp
 +
-+#define TYPE_ARMSSE_MHU "armsse-mhu"
++VADD_3s          1111 001 0 0 . .. .... .... 1000 . . . 0 .... @3same
-+#define ARMSSE_MHU(obj) OBJECT_CHECK(ARMSSEMHU, (obj), TYPE_ARMSSE_MHU)
++VSUB_3s          1111 001 1 0 . .. .... .... 1000 . . . 0 .... @3same
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VLDST_single(DisasContext *s, arg_VLDST_single *a)
      return true;
  }
 +
-+typedef struct ARMSSEMHU {
++static bool do_3same(DisasContext *s, arg_3same *a, GVecGen3Fn fn)
-+    /*< private >*/
++{
-+    SysBusDevice parent_obj;
++    int vec_size = a->q ? 16 : 8;
 +    int rd_ofs = neon_reg_offset(a->vd, 0);
 +    int rn_ofs = neon_reg_offset(a->vn, 0);
 +    int rm_ofs = neon_reg_offset(a->vm, 0);
 +
-+    /*< public >*/
++    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
-+    MemoryRegion iomem;
++        return false;
-+    qemu_irq cpu0irq;
++    }
 +    qemu_irq cpu1irq;
 +
-+    uint32_t cpu0intr;
++    /* UNDEF accesses to D16-D31 if they don't exist. */
-+    uint32_t cpu1intr;
++    if (!dc_isar_feature(aa32_simd_r32, s) &&
-+} ARMSSEMHU;
++        ((a->vd | a->vn | a->vm) & 0x10)) {
 +        return false;
 +    }
 +
-+#endif
++    if ((a->vn | a->vm | a->vd) & a->q) {
-diff --git a/hw/misc/armsse-mhu.c b/hw/misc/armsse-mhu.c
++        return false;
-new file mode 100644
++    }
 index XXXXXXX..XXXXXXX
 --- /dev/null
 +++ b/hw/misc/armsse-mhu.c
@@ -XXX,XX +XXX,XX @@
 +/*
 + * ARM SSE-200 Message Handling Unit (MHU)
 + *
 + * Copyright (c) 2019 Linaro Limited
 + * Written by Peter Maydell
 + *
 + *  This program is free software; you can redistribute it and/or modify
 + *  it under the terms of the GNU General Public License version 2 or
 + *  (at your option) any later version.
 + */
 +
-+/*
++    if (!vfp_access_check(s)) {
-+ * This is a model of the Message Handling Unit (MHU) which is part of the
++        return true;
-+ * Arm SSE-200 and documented in
++    }
 + * http://infocenter.arm.com/help/topic/com.arm.doc.101104_0100_00_en/corelink_sse200_subsystem_for_embedded_technical_reference_manual_101104_0100_00_en.pdf
 + */
 +
-+#include "qemu/osdep.h"
++    fn(a->size, rd_ofs, rn_ofs, rm_ofs, vec_size, vec_size);
-+#include "qemu/log.h"
++    return true;
 +#include "trace.h"
 +#include "qapi/error.h"
 +#include "sysemu/sysemu.h"
 +#include "hw/sysbus.h"
 +#include "hw/registerfields.h"
 +#include "hw/misc/armsse-mhu.h"
 +
 +REG32(CPU0INTR_STAT, 0x0)
 +REG32(CPU0INTR_SET, 0x4)
 +REG32(CPU0INTR_CLR, 0x8)
 +REG32(CPU1INTR_STAT, 0x10)
 +REG32(CPU1INTR_SET, 0x14)
 +REG32(CPU1INTR_CLR, 0x18)
 +REG32(PID4, 0xfd0)
 +REG32(PID5, 0xfd4)
 +REG32(PID6, 0xfd8)
 +REG32(PID7, 0xfdc)
 +REG32(PID0, 0xfe0)
 +REG32(PID1, 0xfe4)
 +REG32(PID2, 0xfe8)
 +REG32(PID3, 0xfec)
 +REG32(CID0, 0xff0)
 +REG32(CID1, 0xff4)
 +REG32(CID2, 0xff8)
 +REG32(CID3, 0xffc)
 +
 +/* Valid bits in the interrupt registers. If any are set the IRQ is raised */
 +#define INTR_MASK 0xf
 +
 +/* PID/CID values */
 +static const int armsse_mhu_id[] = {
 +    0x04, 0x00, 0x00, 0x00, /* PID4..PID7 */
 +    0x56, 0xb8, 0x0b, 0x00, /* PID0..PID3 */
 +    0x0d, 0xf0, 0x05, 0xb1, /* CID0..CID3 */
 +};
 +
 +static void armsse_mhu_update(ARMSSEMHU *s)
 +{
 +    qemu_set_irq(s->cpu0irq, s->cpu0intr != 0);
 +    qemu_set_irq(s->cpu1irq, s->cpu1intr != 0);
 +}
 +
-+static uint64_t armsse_mhu_read(void *opaque, hwaddr offset, unsigned size)
++#define DO_3SAME(INSN, FUNC)                                            \
-+{
++    static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a)        \
-+    ARMSSEMHU *s = ARMSSE_MHU(opaque);
++    {                                                                   \
-+    uint64_t r;
++        return do_3same(s, a, FUNC);                                    \
 +
 +    switch (offset) {
 +    case A_CPU0INTR_STAT:
 +        r = s->cpu0intr;
 +        break;
 +
 +    case A_CPU1INTR_STAT:
 +        r = s->cpu1intr;
 +        break;
 +
 +    case A_PID4 ... A_CID3:
 +        r = armsse_mhu_id[(offset - A_PID4) / 4];
 +        break;
 +
 +    case A_CPU0INTR_SET:
 +    case A_CPU0INTR_CLR:
 +    case A_CPU1INTR_SET:
 +    case A_CPU1INTR_CLR:
 +        qemu_log_mask(LOG_GUEST_ERROR,
 +                      "SSE MHU: read of write-only register at offset 0x%x\n",
 +                      (int)offset);
 +        r = 0;
 +        break;
 +
 +    default:
 +        qemu_log_mask(LOG_GUEST_ERROR,
 +                      "SSE MHU read: bad offset 0x%x\n", (int)offset);
 +        r = 0;
 +        break;
 +    }
 +    trace_armsse_mhu_read(offset, r, size);
 +    return r;
 +}
 +
 +static void armsse_mhu_write(void *opaque, hwaddr offset,
 +                             uint64_t value, unsigned size)
 +{
 +    ARMSSEMHU *s = ARMSSE_MHU(opaque);
 +
 +    trace_armsse_mhu_write(offset, value, size);
 +
 +    switch (offset) {
 +    case A_CPU0INTR_SET:
 +        s->cpu0intr |= (value & INTR_MASK);
 +        break;
 +    case A_CPU0INTR_CLR:
 +        s->cpu0intr &= ~(value & INTR_MASK);
 +        break;
 +    case A_CPU1INTR_SET:
 +        s->cpu1intr |= (value & INTR_MASK);
 +        break;
 +    case A_CPU1INTR_CLR:
 +        s->cpu1intr &= ~(value & INTR_MASK);
 +        break;
 +
 +    case A_CPU0INTR_STAT:
 +    case A_CPU1INTR_STAT:
 +    case A_PID4 ... A_CID3:
 +        qemu_log_mask(LOG_GUEST_ERROR,
 +                      "SSE MHU: write to read-only register at offset 0x%x\n",
 +                      (int)offset);
 +        break;
 +
 +    default:
 +        qemu_log_mask(LOG_GUEST_ERROR,
 +                      "SSE MHU write: bad offset 0x%x\n", (int)offset);
 +        break;
 +    }
 +
-+    armsse_mhu_update(s);
++DO_3SAME(VADD, tcg_gen_gvec_add)
-+}
++DO_3SAME(VSUB, tcg_gen_gvec_sub)
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
              }
              return 0;
 -        case NEON_3R_VADD_VSUB:
 -            if (u) {
 -                tcg_gen_gvec_sub(size, rd_ofs, rn_ofs, rm_ofs,
 -                                 vec_size, vec_size);
 -            } else {
 -                tcg_gen_gvec_add(size, rd_ofs, rn_ofs, rm_ofs,
 -                                 vec_size, vec_size);
 -            }
 -            return 0;
 -
          case NEON_3R_VQADD:
              tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
                             rn_ofs, rm_ofs, vec_size, vec_size,
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
              tcg_gen_gvec_3(rd_ofs, rm_ofs, rn_ofs, vec_size, vec_size,
                             u ? &ushl_op[size] : &sshl_op[size]);
              return 0;
 +
-+static const MemoryRegionOps armsse_mhu_ops = {
++        case NEON_3R_VADD_VSUB:
-+    .read = armsse_mhu_read,
++            /* Already handled by decodetree */
-+    .write = armsse_mhu_write,
++            return 1;
-+    .endianness = DEVICE_LITTLE_ENDIAN,
+         }
-+    .valid.min_access_size = 4,
-+    .valid.max_access_size = 4,
+         if (size == 3) {
 +};
 +
 +static void armsse_mhu_reset(DeviceState *dev)
 +{
 +    ARMSSEMHU *s = ARMSSE_MHU(dev);
 +
 +    s->cpu0intr = 0;
 +    s->cpu1intr = 0;
 +}
 +
 +static const VMStateDescription armsse_mhu_vmstate = {
 +    .name = "armsse-mhu",
 +    .version_id = 1,
 +    .minimum_version_id = 1,
 +    .fields = (VMStateField[]) {
 +        VMSTATE_UINT32(cpu0intr, ARMSSEMHU),
 +        VMSTATE_UINT32(cpu1intr, ARMSSEMHU),
 +        VMSTATE_END_OF_LIST()
 +    },
 +};
 +
 +static void armsse_mhu_init(Object *obj)
 +{
 +    SysBusDevice *sbd = SYS_BUS_DEVICE(obj);
 +    ARMSSEMHU *s = ARMSSE_MHU(obj);
 +
 +    memory_region_init_io(&s->iomem, obj, &armsse_mhu_ops,
 +                          s, "armsse-mhu", 0x1000);
 +    sysbus_init_mmio(sbd, &s->iomem);
 +    sysbus_init_irq(sbd, &s->cpu0irq);
 +    sysbus_init_irq(sbd, &s->cpu1irq);
 +}
 +
 +static void armsse_mhu_class_init(ObjectClass *klass, void *data)
 +{
 +    DeviceClass *dc = DEVICE_CLASS(klass);
 +
 +    dc->reset = armsse_mhu_reset;
 +    dc->vmsd = &armsse_mhu_vmstate;
 +}
 +
 +static const TypeInfo armsse_mhu_info = {
 +    .name = TYPE_ARMSSE_MHU,
 +    .parent = TYPE_SYS_BUS_DEVICE,
 +    .instance_size = sizeof(ARMSSEMHU),
 +    .instance_init = armsse_mhu_init,
 +    .class_init = armsse_mhu_class_init,
 +};
 +
 +static void armsse_mhu_register_types(void)
 +{
 +    type_register_static(&armsse_mhu_info);
 +}
 +
 +type_init(armsse_mhu_register_types);
 diff --git a/MAINTAINERS b/MAINTAINERS
 index XXXXXXX..XXXXXXX 100644
 --- a/MAINTAINERS
 +++ b/MAINTAINERS
@@ -XXX,XX +XXX,XX @@ F: hw/misc/iotkit-sysinfo.c
  F: include/hw/misc/iotkit-sysinfo.h
  F: hw/misc/armsse-cpuid.c
  F: include/hw/misc/armsse-cpuid.h
 +F: hw/misc/armsse-mhu.c
 +F: include/hw/misc/armsse-mhu.h
  Musca
  M: Peter Maydell <peter.maydell@linaro.org>
 diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
 index XXXXXXX..XXXXXXX 100644
 --- a/default-configs/arm-softmmu.mak
 +++ b/default-configs/arm-softmmu.mak
@@ -XXX,XX +XXX,XX @@ CONFIG_IOTKIT_SECCTL=y
  CONFIG_IOTKIT_SYSCTL=y
  CONFIG_IOTKIT_SYSINFO=y
  CONFIG_ARMSSE_CPUID=y
 +CONFIG_ARMSSE_MHU=y
  CONFIG_VERSATILE=y
  CONFIG_VERSATILE_PCI=y
 diff --git a/hw/misc/trace-events b/hw/misc/trace-events
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/misc/trace-events
 +++ b/hw/misc/trace-events
@@ -XXX,XX +XXX,XX @@ iotkit_sysctl_reset(void) "IoTKit SysCtl: reset"
  # hw/misc/armsse-cpuid.c
  armsse_cpuid_read(uint64_t offset, uint64_t data, unsigned size) "SSE-200 CPU_IDENTITY read: offset 0x%" PRIx64 " data 0x%" PRIx64 " size %u"
  armsse_cpuid_write(uint64_t offset, uint64_t data, unsigned size) "SSE-200 CPU_IDENTITY write: offset 0x%" PRIx64 " data 0x%" PRIx64 " size %u"
 +
 +# hw/misc/armsse-mhu.c
 +armsse_mhu_read(uint64_t offset, uint64_t data, unsigned size) "SSE-200 MHU read: offset 0x%" PRIx64 " data 0x%" PRIx64 " size %u"
 +armsse_mhu_write(uint64_t offset, uint64_t data, unsigned size) "SSE-200 MHU write: offset 0x%" PRIx64 " data 0x%" PRIx64 " size %u"
 --
 .20.1
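
Reading aid for the 3-reg-same VADD/VSUB conversion above: a standalone C sketch of how the %vm_dp, %vn_dp and %vd_dp field definitions and the @3same format map onto instruction bits, plus the register checks made in do_3same(). It assumes that decodetree concatenates the listed sub-fields with the first one most significant. Toy3Same, bits(), toy_decode_3same() and toy_3same_regs_ok() are hypothetical names for illustration only, not the code scripts/decodetree.py generates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Extract 'len' bits starting at bit 'pos' (local helper, not QEMU's). */
static uint32_t bits(uint32_t insn, int pos, int len)
{
    return (insn >> pos) & ((1u << len) - 1);
}

/* Hypothetical stand-in for the generated arg_3same structure. */
typedef struct {
    int vm, vn, vd, q, size;
} Toy3Same;

/*
 * %vm_dp 5:1 0:4, %vn_dp 7:1 16:4, %vd_dp 22:1 12:4:
 * the single bit (M, N, D) becomes bit 4 of the register number.
 * In the @3same format, size sits in bits [21:20] and q in bit 6.
 */
static Toy3Same toy_decode_3same(uint32_t insn)
{
    Toy3Same a;
    a.vm = (int)((bits(insn, 5, 1) << 4) | bits(insn, 0, 4));
    a.vn = (int)((bits(insn, 7, 1) << 4) | bits(insn, 16, 4));
    a.vd = (int)((bits(insn, 22, 1) << 4) | bits(insn, 12, 4));
    a.size = (int)bits(insn, 20, 2);
    a.q = (int)bits(insn, 6, 1);
    return a;
}

/* Register checks from do_3same(); has_d32 models aa32_simd_r32. */
static bool toy_3same_regs_ok(const Toy3Same *a, bool has_d32)
{
    /* D16-D31 are not accessible if they do not exist. */
    if (!has_d32 && ((a->vd | a->vn | a->vm) & 0x10)) {
        return false;
    }
    /* With Q set, every register number must be even. */
    if ((a->vn | a->vm | a->vd) & a->q) {
        return false;
    }
    return true;
}
```

The check (vn | vm | vd) & q works because q is 0 or 1: with q == 1 it tests bit 0 of each register number, so any odd D-register index for a Q-sized operation is rejected.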

-New patch
+[PULL 34/39] target/arm: Convert Neon 3-reg-same logic ops to decodetree
+Convert the Neon logic ops in the 3-reg-same grouping to decodetree.
+Note that for the logic ops the 'size' field forms part of their
+decode and the actual operations are always bitwise.
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20200430181003.21682-16-peter.maydell@linaro.org
+---
+ target/arm/neon-dp.decode       | 12 +++++++++++
+ target/arm/translate-neon.inc.c | 19 +++++++++++++++++
+ target/arm/translate.c          | 38 +--------------------------------
+files changed, 32 insertions(+), 37 deletions(-)
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/neon-dp.decode
++++ b/target/arm/neon-dp.decode
+@@ -XXX,XX +XXX,XX @@
+ @3same           .... ... . . . size:2 .... .... .... . q:1 . . .... \
+                  &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp
++@3same_logic     .... ... . . . .. .... .... .... . q:1 .. .... \
++                 &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp size=0
++
++VAND_3s          1111 001 0 0 . 00 .... .... 0001 ... 1 .... @3same_logic
++VBIC_3s          1111 001 0 0 . 01 .... .... 0001 ... 1 .... @3same_logic
++VORR_3s          1111 001 0 0 . 10 .... .... 0001 ... 1 .... @3same_logic
++VORN_3s          1111 001 0 0 . 11 .... .... 0001 ... 1 .... @3same_logic
++VEOR_3s          1111 001 1 0 . 00 .... .... 0001 ... 1 .... @3same_logic
++VBSL_3s          1111 001 1 0 . 01 .... .... 0001 ... 1 .... @3same_logic
++VBIT_3s          1111 001 1 0 . 10 .... .... 0001 ... 1 .... @3same_logic
++VBIF_3s          1111 001 1 0 . 11 .... .... 0001 ... 1 .... @3same_logic
++
+ VADD_3s          1111 001 0 0 . .. .... .... 1000 . . . 0 .... @3same
+ VSUB_3s          1111 001 1 0 . .. .... .... 1000 . . . 0 .... @3same
+diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/translate-neon.inc.c
++++ b/target/arm/translate-neon.inc.c
+@@ -XXX,XX +XXX,XX @@ static bool do_3same(DisasContext *s, arg_3same *a, GVecGen3Fn fn)
+ DO_3SAME(VADD, tcg_gen_gvec_add)
+ DO_3SAME(VSUB, tcg_gen_gvec_sub)
++DO_3SAME(VAND, tcg_gen_gvec_and)
++DO_3SAME(VBIC, tcg_gen_gvec_andc)
++DO_3SAME(VORR, tcg_gen_gvec_or)
++DO_3SAME(VORN, tcg_gen_gvec_orc)
++DO_3SAME(VEOR, tcg_gen_gvec_xor)
++
++/* These insns are all gvec_bitsel but with the inputs in various orders. */
++#define DO_3SAME_BITSEL(INSN, O1, O2, O3)                               \
++    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
++                                uint32_t rn_ofs, uint32_t rm_ofs,       \
++                                uint32_t oprsz, uint32_t maxsz)         \
++    {                                                                   \
++        tcg_gen_gvec_bitsel(vece, rd_ofs, O1, O2, O3, oprsz, maxsz);    \
++    }                                                                   \
++    DO_3SAME(INSN, gen_##INSN##_3s)
++
++DO_3SAME_BITSEL(VBSL, rd_ofs, rn_ofs, rm_ofs)
++DO_3SAME_BITSEL(VBIT, rm_ofs, rn_ofs, rd_ofs)
++DO_3SAME_BITSEL(VBIF, rm_ofs, rd_ofs, rn_ofs)
+diff --git a/target/arm/translate.c b/target/arm/translate.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/translate.c
++++ b/target/arm/translate.c
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+             }
+             return 1;
+-        case NEON_3R_LOGIC: /* Logic ops.  */
+-            switch ((u << 2) | size) {
+-            case 0: /* VAND */
+-                tcg_gen_gvec_and(0, rd_ofs, rn_ofs, rm_ofs,
+-                                 vec_size, vec_size);
+-                break;
+-            case 1: /* VBIC */
+-                tcg_gen_gvec_andc(0, rd_ofs, rn_ofs, rm_ofs,
+-                                  vec_size, vec_size);
+-                break;
+-            case 2: /* VORR */
+-                tcg_gen_gvec_or(0, rd_ofs, rn_ofs, rm_ofs,
+-                                vec_size, vec_size);
+-                break;
+-            case 3: /* VORN */
+-                tcg_gen_gvec_orc(0, rd_ofs, rn_ofs, rm_ofs,
+-                                 vec_size, vec_size);
+-                break;
+-            case 4: /* VEOR */
+-                tcg_gen_gvec_xor(0, rd_ofs, rn_ofs, rm_ofs,
+-                                 vec_size, vec_size);
+-                break;
+-            case 5: /* VBSL */
+-                tcg_gen_gvec_bitsel(MO_8, rd_ofs, rd_ofs, rn_ofs, rm_ofs,
+-                                    vec_size, vec_size);
+-                break;
+-            case 6: /* VBIT */
+-                tcg_gen_gvec_bitsel(MO_8, rd_ofs, rm_ofs, rn_ofs, rd_ofs,
+-                                    vec_size, vec_size);
+-                break;
+-            case 7: /* VBIF */
+-                tcg_gen_gvec_bitsel(MO_8, rd_ofs, rm_ofs, rd_ofs, rn_ofs,
+-                                    vec_size, vec_size);
+-                break;
+-            }
+-            return 0;
+-
+         case NEON_3R_VQADD:
+             tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
+                            rn_ofs, rm_ofs, vec_size, vec_size,
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+             return 0;
+         case NEON_3R_VADD_VSUB:
++        case NEON_3R_LOGIC:
+             /* Already handled by decodetree */
+             return 1;
+         }
+--
+.20.1
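
Reading aid for the logic-op conversion above: VBSL, VBIT and VBIF are the same bitwise select with the operands passed in different orders. The C sketch below models that on 64-bit values, assuming the select computes (b & a) | (c & ~a), with the first operand acting as the mask. toy_bitsel() and the toy_v* wrappers are hypothetical names, not the TCG API.

```c
#include <assert.h>
#include <stdint.h>

/* Bitwise select: take bits of b where a is 1, bits of c where a is 0. */
static uint64_t toy_bitsel(uint64_t a, uint64_t b, uint64_t c)
{
    return (b & a) | (c & ~a);
}

/* Operand orders follow DO_3SAME_BITSEL(INSN, O1, O2, O3) in the patch. */
static uint64_t toy_vbsl(uint64_t rd, uint64_t rn, uint64_t rm)
{
    return toy_bitsel(rd, rn, rm); /* old rd is the mask */
}

static uint64_t toy_vbit(uint64_t rd, uint64_t rn, uint64_t rm)
{
    return toy_bitsel(rm, rn, rd); /* insert rn where rm is 1 */
}

static uint64_t toy_vbif(uint64_t rd, uint64_t rn, uint64_t rm)
{
    return toy_bitsel(rm, rd, rn); /* insert rn where rm is 0 */
}
```

Because the operation is purely bitwise, the element size passed to the expander does not affect the result, which is why the @3same_logic format can fix size=0 and reuse the size bits for decode.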

-New patch
+[PULL 35/39] target/arm: Convert Neon 3-reg-same VMAX/VMIN to decodetree
+Convert the Neon 3-reg-same VMAX and VMIN insns to decodetree.
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20200430181003.21682-17-peter.maydell@linaro.org
+---
+ target/arm/neon-dp.decode       |  5 +++++
+ target/arm/translate-neon.inc.c | 14 ++++++++++++++
+ target/arm/translate.c          | 21 ++-------------------
+files changed, 21 insertions(+), 19 deletions(-)
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/neon-dp.decode
++++ b/target/arm/neon-dp.decode
+@@ -XXX,XX +XXX,XX @@ VBSL_3s          1111 001 1 0 . 01 .... .... 0001 ... 1 .... @3same_logic
+ VBIT_3s          1111 001 1 0 . 10 .... .... 0001 ... 1 .... @3same_logic
+ VBIF_3s          1111 001 1 0 . 11 .... .... 0001 ... 1 .... @3same_logic
++VMAX_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 0 .... @3same
++VMAX_U_3s        1111 001 1 0 . .. .... .... 0110 . . . 0 .... @3same
++VMIN_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 1 .... @3same
++VMIN_U_3s        1111 001 1 0 . .. .... .... 0110 . . . 1 .... @3same
++
+ VADD_3s          1111 001 0 0 . .. .... .... 1000 . . . 0 .... @3same
+ VSUB_3s          1111 001 1 0 . .. .... .... 1000 . . . 0 .... @3same
+diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/translate-neon.inc.c
++++ b/target/arm/translate-neon.inc.c
+@@ -XXX,XX +XXX,XX @@ DO_3SAME(VEOR, tcg_gen_gvec_xor)
+ DO_3SAME_BITSEL(VBSL, rd_ofs, rn_ofs, rm_ofs)
+ DO_3SAME_BITSEL(VBIT, rm_ofs, rn_ofs, rd_ofs)
+ DO_3SAME_BITSEL(VBIF, rm_ofs, rd_ofs, rn_ofs)
++
++#define DO_3SAME_NO_SZ_3(INSN, FUNC)                                    \
++    static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a)        \
++    {                                                                   \
++        if (a->size == 3) {                                             \
++            return false;                                               \
++        }                                                               \
++        return do_3same(s, a, FUNC);                                    \
++    }
++
++DO_3SAME_NO_SZ_3(VMAX_S, tcg_gen_gvec_smax)
++DO_3SAME_NO_SZ_3(VMAX_U, tcg_gen_gvec_umax)
++DO_3SAME_NO_SZ_3(VMIN_S, tcg_gen_gvec_smin)
++DO_3SAME_NO_SZ_3(VMIN_U, tcg_gen_gvec_umin)
+diff --git a/target/arm/translate.c b/target/arm/translate.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/translate.c
++++ b/target/arm/translate.c
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+                              rd_ofs, rn_ofs, rm_ofs, vec_size, vec_size);
+             return 0;
+-        case NEON_3R_VMAX:
+-            if (u) {
+-                tcg_gen_gvec_umax(size, rd_ofs, rn_ofs, rm_ofs,
+-                                  vec_size, vec_size);
+-            } else {
+-                tcg_gen_gvec_smax(size, rd_ofs, rn_ofs, rm_ofs,
+-                                  vec_size, vec_size);
+-            }
+-            return 0;
+-        case NEON_3R_VMIN:
+-            if (u) {
+-                tcg_gen_gvec_umin(size, rd_ofs, rn_ofs, rm_ofs,
+-                                  vec_size, vec_size);
+-            } else {
+-                tcg_gen_gvec_smin(size, rd_ofs, rn_ofs, rm_ofs,
+-                                  vec_size, vec_size);
+-            }
+-            return 0;
+-
+         case NEON_3R_VSHL:
+             /* Note the operation is vshl vd,vm,vn */
+             tcg_gen_gvec_3(rd_ofs, rm_ofs, rn_ofs, vec_size, vec_size,
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+         case NEON_3R_VADD_VSUB:
+         case NEON_3R_LOGIC:
++        case NEON_3R_VMAX:
++        case NEON_3R_VMIN:
+             /* Already handled by decodetree */
+             return 1;
+         }
+--
+.20.1
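
Reading aid for the VMAX/VMIN conversion above: the DO_3SAME_NO_SZ_3 macro stamps out one trans function per insn, each rejecting size == 3 before handing off to the common helper. The C sketch below shows the same pattern on scalar values. ToyArgs, ToyOp and the toy_* functions are hypothetical stand-ins and do not model the gvec expanders.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical argument block: element size code plus scalar operands. */
typedef struct {
    int size;
    int32_t n, m, d;
} ToyArgs;

typedef int32_t ToyOp(int32_t, int32_t);

static int32_t toy_smax(int32_t a, int32_t b) { return a > b ? a : b; }
static int32_t toy_smin(int32_t a, int32_t b) { return a < b ? a : b; }

/* Common helper: apply the operation and report the insn as handled. */
static bool toy_do_3same(ToyArgs *a, ToyOp *fn)
{
    a->d = fn(a->n, a->m);
    return true;
}

/*
 * Generate a trans function that returns false (not handled, so the
 * caller can treat the encoding as undefined) when size == 3.
 */
#define TOY_3SAME_NO_SZ_3(INSN, FUNC)                   \
    static bool toy_trans_##INSN(ToyArgs *a)            \
    {                                                   \
        if (a->size == 3) {                             \
            return false;                               \
        }                                               \
        return toy_do_3same(a, FUNC);                   \
    }

TOY_3SAME_NO_SZ_3(VMAX_S, toy_smax)
TOY_3SAME_NO_SZ_3(VMIN_S, toy_smin)
```

Keeping the size restriction in the macro rather than in the shared helper leaves insns such as VADD and VSUB, which accept every size value, on the unrestricted DO_3SAME path.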

-New patch
+[PULL 36/39] target/arm: Convert Neon 3-reg-same comparisons to decodetree
+Convert the Neon comparison ops in the 3-reg-same grouping
+to decodetree.
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20200430181003.21682-18-peter.maydell@linaro.org
+---
+ target/arm/neon-dp.decode       |  8 ++++++++
+ target/arm/translate-neon.inc.c | 22 ++++++++++++++++++++++
+ target/arm/translate.c          | 23 +++--------------------
+files changed, 33 insertions(+), 20 deletions(-)
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/neon-dp.decode
++++ b/target/arm/neon-dp.decode
+@@ -XXX,XX +XXX,XX @@ VBSL_3s          1111 001 1 0 . 01 .... .... 0001 ... 1 .... @3same_logic
+ VBIT_3s          1111 001 1 0 . 10 .... .... 0001 ... 1 .... @3same_logic
+ VBIF_3s          1111 001 1 0 . 11 .... .... 0001 ... 1 .... @3same_logic
++VCGT_S_3s        1111 001 0 0 . .. .... .... 0011 . . . 0 .... @3same
++VCGT_U_3s        1111 001 1 0 . .. .... .... 0011 . . . 0 .... @3same
++VCGE_S_3s        1111 001 0 0 . .. .... .... 0011 . . . 1 .... @3same
++VCGE_U_3s        1111 001 1 0 . .. .... .... 0011 . . . 1 .... @3same
++
+ VMAX_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 0 .... @3same
+ VMAX_U_3s        1111 001 1 0 . .. .... .... 0110 . . . 0 .... @3same
+ VMIN_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 1 .... @3same
+@@ -XXX,XX +XXX,XX @@ VMIN_U_3s        1111 001 1 0 . .. .... .... 0110 . . . 1 .... @3same
+ VADD_3s          1111 001 0 0 . .. .... .... 1000 . . . 0 .... @3same
+ VSUB_3s          1111 001 1 0 . .. .... .... 1000 . . . 0 .... @3same
++
++VTST_3s          1111 001 0 0 . .. .... .... 1000 . . . 1 .... @3same
++VCEQ_3s          1111 001 1 0 . .. .... .... 1000 . . . 1 .... @3same
+diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/translate-neon.inc.c
++++ b/target/arm/translate-neon.inc.c
+@@ -XXX,XX +XXX,XX @@ DO_3SAME_NO_SZ_3(VMAX_S, tcg_gen_gvec_smax)
+ DO_3SAME_NO_SZ_3(VMAX_U, tcg_gen_gvec_umax)
+ DO_3SAME_NO_SZ_3(VMIN_S, tcg_gen_gvec_smin)
+ DO_3SAME_NO_SZ_3(VMIN_U, tcg_gen_gvec_umin)
++
++#define DO_3SAME_CMP(INSN, COND)                                        \
++    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
++                                uint32_t rn_ofs, uint32_t rm_ofs,       \
++                                uint32_t oprsz, uint32_t maxsz)         \
++    {                                                                   \
++        tcg_gen_gvec_cmp(COND, vece, rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz); \
++    }                                                                   \
++    DO_3SAME_NO_SZ_3(INSN, gen_##INSN##_3s)
++
++DO_3SAME_CMP(VCGT_S, TCG_COND_GT)
++DO_3SAME_CMP(VCGT_U, TCG_COND_GTU)
++DO_3SAME_CMP(VCGE_S, TCG_COND_GE)
++DO_3SAME_CMP(VCGE_U, TCG_COND_GEU)
++DO_3SAME_CMP(VCEQ, TCG_COND_EQ)
++
++static void gen_VTST_3s(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
++                         uint32_t rm_ofs, uint32_t oprsz, uint32_t maxsz)
++{
++    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &cmtst_op[vece]);
++}
++DO_3SAME_NO_SZ_3(VTST, gen_VTST_3s)
+diff --git a/target/arm/translate.c b/target/arm/translate.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/translate.c
++++ b/target/arm/translate.c
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+                            u ? &mls_op[size] : &mla_op[size]);
+             return 0;
+-        case NEON_3R_VTST_VCEQ:
+-            if (u) { /* VCEQ */
+-                tcg_gen_gvec_cmp(TCG_COND_EQ, size, rd_ofs, rn_ofs, rm_ofs,
+-                                 vec_size, vec_size);
+-            } else { /* VTST */
+-                tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,
+-                               vec_size, vec_size, &cmtst_op[size]);
+-            }
+-            return 0;
+-
+-        case NEON_3R_VCGT:
+-            tcg_gen_gvec_cmp(u ? TCG_COND_GTU : TCG_COND_GT, size,
+-                             rd_ofs, rn_ofs, rm_ofs, vec_size, vec_size);
+-            return 0;
+-
+-        case NEON_3R_VCGE:
+-            tcg_gen_gvec_cmp(u ? TCG_COND_GEU : TCG_COND_GE, size,
+-                             rd_ofs, rn_ofs, rm_ofs, vec_size, vec_size);
+-            return 0;
+-
+         case NEON_3R_VSHL:
+             /* Note the operation is vshl vd,vm,vn */
+             tcg_gen_gvec_3(rd_ofs, rm_ofs, rn_ofs, vec_size, vec_size,
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+         case NEON_3R_LOGIC:
+         case NEON_3R_VMAX:
+         case NEON_3R_VMIN:
++        case NEON_3R_VTST_VCEQ:
++        case NEON_3R_VCGT:
++        case NEON_3R_VCGE:
+             /* Already handled by decodetree */
+             return 1;
+         }
+--
+.20.1
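
Illustration for 36/39 (not part of the patch): DO_3SAME_CMP stamps out one small wrapper per instruction, each binding a fixed condition to the generic vector compare. The sketch below models that pattern in plain C on 8-bit lanes, assuming the Neon convention that a true lane is all-ones. The names Cond, vec_cmp8, DEFINE_CMP, vcgt_s8, vcgt_u8, vceq_8 and vtst_8 are hypothetical stand-ins and do not exist in QEMU.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the TCG_COND_* values; not QEMU's enum. */
typedef enum { COND_EQ, COND_GT, COND_GTU, COND_GE, COND_GEU } Cond;

/* Generic lane-wise compare: a lane becomes 0xff when true, 0x00 when false. */
static void vec_cmp8(Cond c, uint8_t *d, const uint8_t *n, const uint8_t *m,
                     size_t lanes)
{
    for (size_t i = 0; i < lanes; i++) {
        /* Signed views of the same lanes (two's complement assumed). */
        int8_t sn = (int8_t)n[i], sm = (int8_t)m[i];
        int t;

        switch (c) {
        case COND_EQ:  t = n[i] == m[i]; break;
        case COND_GT:  t = sn > sm;      break;
        case COND_GTU: t = n[i] > m[i];  break;
        case COND_GE:  t = sn >= sm;     break;
        default:       t = n[i] >= m[i]; break; /* COND_GEU */
        }
        d[i] = t ? 0xff : 0x00;
    }
}

/* One wrapper per instruction, each fixing the condition argument. */
#define DEFINE_CMP(NAME, COND)                                          \
    static void NAME(uint8_t *d, const uint8_t *n, const uint8_t *m,    \
                     size_t lanes)                                      \
    {                                                                   \
        vec_cmp8(COND, d, n, m, lanes);                                 \
    }

DEFINE_CMP(vcgt_s8, COND_GT)
DEFINE_CMP(vcgt_u8, COND_GTU)
DEFINE_CMP(vceq_8, COND_EQ)

/* VTST is not a plain compare: a lane is set when (n & m) is non-zero. */
static void vtst_8(uint8_t *d, const uint8_t *n, const uint8_t *m, size_t lanes)
{
    for (size_t i = 0; i < lanes; i++) {
        d[i] = (n[i] & m[i]) ? 0xff : 0x00;
    }
}
```

The signed and unsigned variants differ only in the condition passed in, which is why the old "u ? TCG_COND_GTU : TCG_COND_GT" switch can become two separate decode patterns.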

-[Qemu-devel] [PULL 02/16] hw/arm/armsse: Wire up the MHUs
+[PULL 37/39] target/arm: Convert Neon 3-reg-same VQADD/VQSUB to decodetree
-Create and connect the MHUs in the SSE-200.
+Convert the Neon VQADD/VQSUB insns in the 3-reg-same grouping
 to decodetree.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190219125808.25174-3-peter.maydell@linaro.org
+Message-id: 20200430181003.21682-19-peter.maydell@linaro.org
 ---
- include/hw/arm/armsse.h |  3 ++-
+ target/arm/neon-dp.decode       |  6 ++++++
- hw/arm/armsse.c         | 40 ++++++++++++++++++++++++++++++----------
+ target/arm/translate-neon.inc.c | 15 +++++++++++++++
- 2 files changed, 32 insertions(+), 11 deletions(-)
+ target/arm/translate.c          | 14 ++------------
  3 files changed, 23 insertions(+), 12 deletions(-)
-diff --git a/include/hw/arm/armsse.h b/include/hw/arm/armsse.h
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
---- a/include/hw/arm/armsse.h
+--- a/target/arm/neon-dp.decode
-+++ b/include/hw/arm/armsse.h
++++ b/target/arm/neon-dp.decode
 @@ -XXX,XX +XXX,XX @@
- #include "hw/misc/iotkit-sysctl.h"
+ @3same           .... ... . . . size:2 .... .... .... . q:1 . . .... \
- #include "hw/misc/iotkit-sysinfo.h"
+                  &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp
- #include "hw/misc/armsse-cpuid.h"
-+#include "hw/misc/armsse-mhu.h"
++VQADD_S_3s       1111 001 0 0 . .. .... .... 0000 . . . 1 .... @3same
- #include "hw/misc/unimp.h"
++VQADD_U_3s       1111 001 1 0 . .. .... .... 0000 . . . 1 .... @3same
- #include "hw/or-irq.h"
++
- #include "hw/core/split-irq.h"
+ @3same_logic     .... ... . . . .. .... .... .... . q:1 .. .... \
-@@ -XXX,XX +XXX,XX @@ typedef struct ARMSSE {
+                  &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp size=0
-     IoTKitSysCtl sysctl;
-     IoTKitSysCtl sysinfo;
+@@ -XXX,XX +XXX,XX @@ VBSL_3s          1111 001 1 0 . 01 .... .... 0001 ... 1 .... @3same_logic
+ VBIT_3s          1111 001 1 0 . 10 .... .... 0001 ... 1 .... @3same_logic
--    UnimplementedDeviceState mhu[2];
+ VBIF_3s          1111 001 1 0 . 11 .... .... 0001 ... 1 .... @3same_logic
-+    ARMSSEMHU mhu[2];
-     UnimplementedDeviceState ppu[NUM_PPUS];
++VQSUB_S_3s       1111 001 0 0 . .. .... .... 0010 . . . 1 .... @3same
-     UnimplementedDeviceState cachectrl[SSE_MAX_CPUS];
++VQSUB_U_3s       1111 001 1 0 . .. .... .... 0010 . . . 1 .... @3same
-     UnimplementedDeviceState cpusecctrl[SSE_MAX_CPUS];
++
-diff --git a/hw/arm/armsse.c b/hw/arm/armsse.c
+ VCGT_S_3s        1111 001 0 0 . .. .... .... 0011 . . . 0 .... @3same
  VCGT_U_3s        1111 001 1 0 . .. .... .... 0011 . . . 0 .... @3same
  VCGE_S_3s        1111 001 0 0 . .. .... .... 0011 . . . 1 .... @3same
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/armsse.c
+--- a/target/arm/translate-neon.inc.c
-+++ b/hw/arm/armsse.c
++++ b/target/arm/translate-neon.inc.c
-@@ -XXX,XX +XXX,XX @@ static void armsse_init(Object *obj)
+@@ -XXX,XX +XXX,XX @@ static void gen_VTST_3s(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
-                           sizeof(s->sysinfo), TYPE_IOTKIT_SYSINFO);
+     tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &cmtst_op[vece]);
-     if (info->has_mhus) {
+ }
-         sysbus_init_child_obj(obj, "mhu0", &s->mhu[0], sizeof(s->mhu[0]),
+ DO_3SAME_NO_SZ_3(VTST, gen_VTST_3s)
 -                              TYPE_UNIMPLEMENTED_DEVICE);
 +                              TYPE_ARMSSE_MHU);
          sysbus_init_child_obj(obj, "mhu1", &s->mhu[1], sizeof(s->mhu[1]),
 -                              TYPE_UNIMPLEMENTED_DEVICE);
 +                              TYPE_ARMSSE_MHU);
      }
      if (info->has_ppus) {
          for (i = 0; i < info->num_cpus; i++) {
@@ -XXX,XX +XXX,XX @@ static void armsse_realize(DeviceState *dev, Error **errp)
      }
      if (info->has_mhus) {
 -        for (i = 0; i < ARRAY_SIZE(s->mhu); i++) {
 -            char *name;
 -            char *port;
 +        /*
 +         * An SSE-200 with only one CPU should have only one MHU created,
 +         * with the region where the second MHU usually is being RAZ/WI.
 +         * We don't implement that SSE-200 config; if we want to support
 +         * it then this code needs to be enhanced to handle creating the
 +         * RAZ/WI region instead of the second MHU.
 +         */
 +        assert(info->num_cpus == ARRAY_SIZE(s->mhu));
 +
-+        for (i = 0; i < ARRAY_SIZE(s->mhu); i++) {
++#define DO_3SAME_GVEC4(INSN, OPARRAY)                                   \
-+            char *port;
++    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
-+            int cpunum;
++                                uint32_t rn_ofs, uint32_t rm_ofs,       \
-+            SysBusDevice *mhu_sbd = SYS_BUS_DEVICE(&s->mhu[i]);
++                                uint32_t oprsz, uint32_t maxsz)         \
++    {                                                                   \
--            name = g_strdup_printf("MHU%d", i);
++        tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),           \
--            qdev_prop_set_string(DEVICE(&s->mhu[i]), "name", name);
++                       rn_ofs, rm_ofs, oprsz, maxsz, &OPARRAY[vece]);   \
--            qdev_prop_set_uint64(DEVICE(&s->mhu[i]), "size", 0x1000);
++    }                                                                   \
-             object_property_set_bool(OBJECT(&s->mhu[i]), true,
++    DO_3SAME(INSN, gen_##INSN##_3s)
-                                      "realized", &err);
++
--            g_free(name);
++DO_3SAME_GVEC4(VQADD_S, sqadd_op)
-             if (err) {
++DO_3SAME_GVEC4(VQADD_U, uqadd_op)
-                 error_propagate(errp, err);
++DO_3SAME_GVEC4(VQSUB_S, sqsub_op)
-                 return;
++DO_3SAME_GVEC4(VQSUB_U, uqsub_op)
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
              }
-             port = g_strdup_printf("port[%d]", i + 3);
+             return 1;
--            mr = sysbus_mmio_get_region(SYS_BUS_DEVICE(&s->mhu[i]), 0);
-+            mr = sysbus_mmio_get_region(mhu_sbd, 0);
+-        case NEON_3R_VQADD:
-             object_property_set_link(OBJECT(&s->apb_ppc0), OBJECT(mr),
+-            tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
-                                      port, &err);
+-                           rn_ofs, rm_ofs, vec_size, vec_size,
-             g_free(port);
+-                           (u ? uqadd_op : sqadd_op) + size);
-@@ -XXX,XX +XXX,XX @@ static void armsse_realize(DeviceState *dev, Error **errp)
+-            return 0;
-                 error_propagate(errp, err);
+-
-                 return;
+-        case NEON_3R_VQSUB:
-             }
+-            tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
-+
+-                           rn_ofs, rm_ofs, vec_size, vec_size,
-+            /*
+-                           (u ? uqsub_op : sqsub_op) + size);
-+             * Each MHU has an irq line for each CPU:
+-            return 0;
-+             *  MHU 0 irq line 0 -> CPU 0 IRQ 6
+-
-+             *  MHU 0 irq line 1 -> CPU 1 IRQ 6
+         case NEON_3R_VMUL: /* VMUL */
-+             *  MHU 1 irq line 0 -> CPU 0 IRQ 7
+             if (u) {
-+             *  MHU 1 irq line 1 -> CPU 1 IRQ 7
+                 /* Polynomial case allows only P8.  */
-+             */
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-+            for (cpunum = 0; cpunum < info->num_cpus; cpunum++) {
+         case NEON_3R_VTST_VCEQ:
-+                DeviceState *cpudev = DEVICE(&s->armv7m[cpunum]);
+         case NEON_3R_VCGT:
-+
+         case NEON_3R_VCGE:
-+                sysbus_connect_irq(mhu_sbd, cpunum,
++        case NEON_3R_VQADD:
-+                                   qdev_get_gpio_in(cpudev, 6 + i));
++        case NEON_3R_VQSUB:
-+            }
+             /* Already handled by decodetree */
              return 1;
          }
-     }
 --
 .20.1
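
Illustration for 37/39 (not part of the patch): the VQADD/VQSUB expanders pass offsetof(CPUARMState, vfp.qc) as an extra operand so that the operation can record saturation alongside the result. The sketch below is a scalar C model of that idea for signed 8-bit lanes, assuming a sticky flag that is only ever ORed into. The name sqadd8_qc and the uint32_t flag are hypothetical; this is not how sqadd_op is implemented in QEMU.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Signed saturating add on 8-bit lanes.
 * *qc is sticky: it is set when any lane saturates and is never cleared here.
 */
static void sqadd8_qc(int8_t *d, uint32_t *qc, const int8_t *n,
                      const int8_t *m, size_t lanes)
{
    for (size_t i = 0; i < lanes; i++) {
        int sum = n[i] + m[i];   /* exact result in a wider type */
        int sat = sum;

        if (sat > INT8_MAX) {
            sat = INT8_MAX;
        } else if (sat < INT8_MIN) {
            sat = INT8_MIN;
        }
        /* Saturation happened exactly when clamping changed the value. */
        *qc |= (uint32_t)(sat != sum);
        d[i] = (int8_t)sat;
    }
}
```

Because the flag is an accumulated side effect rather than a per-lane result, the operation needs four operands (destination, flag, two sources) instead of three.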

-[Qemu-devel] [PULL 06/16] hw/arm/iotkit-sysctl: Add SSE-200 registers
+[PULL 38/39] target/arm: Convert Neon 3-reg-same VMUL, VMLA, VMLS, VSHL to decodetree
-The SYSCTL block in the SSE-200 has some extra registers that
+Convert the Neon VMUL, VMLA, VMLS and VSHL insns in the
-are not present in the IoTKit version. Add these registers
+3-reg-same grouping to decodetree.
 (as reads-as-written stubs), enabled by a new QOM property.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190219125808.25174-7-peter.maydell@linaro.org
+Message-id: 20200430181003.21682-20-peter.maydell@linaro.org
 ---
- include/hw/misc/iotkit-sysctl.h |  20 +++
+ target/arm/neon-dp.decode       |  9 +++++++
- hw/arm/armsse.c                 |   2 +
+ target/arm/translate-neon.inc.c | 44 +++++++++++++++++++++++++++++++++
- hw/misc/iotkit-sysctl.c         | 245 +++++++++++++++++++++++++++++++-
+ target/arm/translate.c          | 28 +++------------------
- 3 files changed, 262 insertions(+), 5 deletions(-)
+ 3 files changed, 56 insertions(+), 25 deletions(-)
-diff --git a/include/hw/misc/iotkit-sysctl.h b/include/hw/misc/iotkit-sysctl.h
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
---- a/include/hw/misc/iotkit-sysctl.h
+--- a/target/arm/neon-dp.decode
-+++ b/include/hw/misc/iotkit-sysctl.h
++++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ VCGT_U_3s        1111 001 1 0 . .. .... .... 0011 . . . 0 .... @3same
-  * "system control register" blocks.
+ VCGE_S_3s        1111 001 0 0 . .. .... .... 0011 . . . 1 .... @3same
-  *
+ VCGE_U_3s        1111 001 1 0 . .. .... .... 0011 . . . 1 .... @3same
-  * QEMU interface:
-+ *  + QOM property "SYS_VERSION": value of the SYS_VERSION register of the
++VSHL_S_3s        1111 001 0 0 . .. .... .... 0100 . . . 0 .... @3same
-+ *    system information block of the SSE
++VSHL_U_3s        1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same
 + *    (used to identify whether to provide SSE-200-only registers)
   *  + sysbus MMIO region 0: the system information register bank
   *  + sysbus MMIO region 1: the system control register bank
   */
@@ -XXX,XX +XXX,XX @@ typedef struct IoTKitSysCtl {
      uint32_t initsvtor0;
      uint32_t cpuwait;
      uint32_t wicctrl;
 +    uint32_t scsecctrl;
 +    uint32_t fclk_div;
 +    uint32_t sysclk_div;
 +    uint32_t clock_force;
 +    uint32_t initsvtor1;
 +    uint32_t nmi_enable;
 +    uint32_t ewctrl;
 +    uint32_t pdcm_pd_sys_sense;
 +    uint32_t pdcm_pd_sram0_sense;
 +    uint32_t pdcm_pd_sram1_sense;
 +    uint32_t pdcm_pd_sram2_sense;
 +    uint32_t pdcm_pd_sram3_sense;
 +
-+    /* Properties */
+ VMAX_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 0 .... @3same
-+    uint32_t sys_version;
+ VMAX_U_3s        1111 001 1 0 . .. .... .... 0110 . . . 0 .... @3same
  VMIN_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 1 .... @3same
@@ -XXX,XX +XXX,XX @@ VSUB_3s          1111 001 1 0 . .. .... .... 1000 . . . 0 .... @3same
  VTST_3s          1111 001 0 0 . .. .... .... 1000 . . . 1 .... @3same
  VCEQ_3s          1111 001 1 0 . .. .... .... 1000 . . . 1 .... @3same
 +
-+    bool is_sse200;
++VMLA_3s          1111 001 0 0 . .. .... .... 1001 . . . 0 .... @3same
- } IoTKitSysCtl;
++VMLS_3s          1111 001 1 0 . .. .... .... 1001 . . . 0 .... @3same
++
- #endif
++VMUL_3s          1111 001 0 0 . .. .... .... 1001 . . . 1 .... @3same
-diff --git a/hw/arm/armsse.c b/hw/arm/armsse.c
++VMUL_p_3s        1111 001 1 0 . .. .... .... 1001 . . . 1 .... @3same
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/armsse.c
+--- a/target/arm/translate-neon.inc.c
-+++ b/hw/arm/armsse.c
++++ b/target/arm/translate-neon.inc.c
-@@ -XXX,XX +XXX,XX @@ static void armsse_realize(DeviceState *dev, Error **errp)
+@@ -XXX,XX +XXX,XX @@ DO_3SAME_NO_SZ_3(VMAX_S, tcg_gen_gvec_smax)
-     /* System information registers */
+ DO_3SAME_NO_SZ_3(VMAX_U, tcg_gen_gvec_umax)
-     sysbus_mmio_map(SYS_BUS_DEVICE(&s->sysinfo), 0, 0x40020000);
+ DO_3SAME_NO_SZ_3(VMIN_S, tcg_gen_gvec_smin)
-     /* System control registers */
+ DO_3SAME_NO_SZ_3(VMIN_U, tcg_gen_gvec_umin)
-+    object_property_set_int(OBJECT(&s->sysctl), info->sys_version,
++DO_3SAME_NO_SZ_3(VMUL, tcg_gen_gvec_mul)
-+                            "SYS_VERSION", &err);
-     object_property_set_bool(OBJECT(&s->sysctl), true, "realized", &err);
+ #define DO_3SAME_CMP(INSN, COND)                                        \
-     if (err) {
+     static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
-         error_propagate(errp, err);
+@@ -XXX,XX +XXX,XX @@ DO_3SAME_GVEC4(VQADD_S, sqadd_op)
-diff --git a/hw/misc/iotkit-sysctl.c b/hw/misc/iotkit-sysctl.c
+ DO_3SAME_GVEC4(VQADD_U, uqadd_op)
-index XXXXXXX..XXXXXXX 100644
+ DO_3SAME_GVEC4(VQSUB_S, sqsub_op)
---- a/hw/misc/iotkit-sysctl.c
+ DO_3SAME_GVEC4(VQSUB_U, uqsub_op)
-+++ b/hw/misc/iotkit-sysctl.c
++
-@@ -XXX,XX +XXX,XX @@
++static void gen_VMUL_p_3s(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
-  */
++                           uint32_t rm_ofs, uint32_t oprsz, uint32_t maxsz)
  #include "qemu/osdep.h"
 +#include "qemu/bitops.h"
  #include "qemu/log.h"
  #include "trace.h"
  #include "qapi/error.h"
@@ -XXX,XX +XXX,XX @@
  REG32(SECDBGSTAT, 0x0)
  REG32(SECDBGSET, 0x4)
  REG32(SECDBGCLR, 0x8)
 +REG32(SCSECCTRL, 0xc)
 +REG32(FCLK_DIV, 0x10)
 +REG32(SYSCLK_DIV, 0x14)
 +REG32(CLOCK_FORCE, 0x18)
  REG32(RESET_SYNDROME, 0x100)
  REG32(RESET_MASK, 0x104)
  REG32(SWRESET, 0x108)
      FIELD(SWRESET, SWRESETREQ, 9, 1)
  REG32(GRETREG, 0x10c)
  REG32(INITSVTOR0, 0x110)
 +REG32(INITSVTOR1, 0x114)
  REG32(CPUWAIT, 0x118)
 -REG32(BUSWAIT, 0x11c)
 +REG32(NMI_ENABLE, 0x11c) /* BUSWAIT in IoTKit */
  REG32(WICCTRL, 0x120)
 +REG32(EWCTRL, 0x124)
 +REG32(PDCM_PD_SYS_SENSE, 0x200)
 +REG32(PDCM_PD_SRAM0_SENSE, 0x20c)
 +REG32(PDCM_PD_SRAM1_SENSE, 0x210)
 +REG32(PDCM_PD_SRAM2_SENSE, 0x214)
 +REG32(PDCM_PD_SRAM3_SENSE, 0x218)
  REG32(PID4, 0xfd0)
  REG32(PID5, 0xfd4)
  REG32(PID6, 0xfd8)
@@ -XXX,XX +XXX,XX @@ static uint64_t iotkit_sysctl_read(void *opaque, hwaddr offset,
      case A_SECDBGSTAT:
          r = s->secure_debug;
          break;
 +    case A_SCSECCTRL:
 +        if (!s->is_sse200) {
 +            goto bad_offset;
 +        }
 +        r = s->scsecctrl;
 +        break;
 +    case A_FCLK_DIV:
 +        if (!s->is_sse200) {
 +            goto bad_offset;
 +        }
 +        r = s->fclk_div;
 +        break;
 +    case A_SYSCLK_DIV:
 +        if (!s->is_sse200) {
 +            goto bad_offset;
 +        }
 +        r = s->sysclk_div;
 +        break;
 +    case A_CLOCK_FORCE:
 +        if (!s->is_sse200) {
 +            goto bad_offset;
 +        }
 +        r = s->clock_force;
 +        break;
      case A_RESET_SYNDROME:
          r = s->reset_syndrome;
          break;
@@ -XXX,XX +XXX,XX @@ static uint64_t iotkit_sysctl_read(void *opaque, hwaddr offset,
      case A_INITSVTOR0:
          r = s->initsvtor0;
          break;
 +    case A_INITSVTOR1:
 +        if (!s->is_sse200) {
 +            goto bad_offset;
 +        }
 +        r = s->initsvtor1;
 +        break;
      case A_CPUWAIT:
          r = s->cpuwait;
          break;
 -    case A_BUSWAIT:
 -        /* In IoTKit BUSWAIT is reserved, R/O, zero */
 -        r = 0;
 +    case A_NMI_ENABLE:
 +        /* In IoTKit this is named BUSWAIT but is marked reserved, R/O, zero */
 +        if (!s->is_sse200) {
 +            r = 0;
 +            break;
 +        }
 +        r = s->nmi_enable;
          break;
      case A_WICCTRL:
          r = s->wicctrl;
          break;
 +    case A_EWCTRL:
 +        if (!s->is_sse200) {
 +            goto bad_offset;
 +        }
 +        r = s->ewctrl;
 +        break;
 +    case A_PDCM_PD_SYS_SENSE:
 +        if (!s->is_sse200) {
 +            goto bad_offset;
 +        }
 +        r = s->pdcm_pd_sys_sense;
 +        break;
 +    case A_PDCM_PD_SRAM0_SENSE:
 +        if (!s->is_sse200) {
 +            goto bad_offset;
 +        }
 +        r = s->pdcm_pd_sram0_sense;
 +        break;
 +    case A_PDCM_PD_SRAM1_SENSE:
 +        if (!s->is_sse200) {
 +            goto bad_offset;
 +        }
 +        r = s->pdcm_pd_sram1_sense;
 +        break;
 +    case A_PDCM_PD_SRAM2_SENSE:
 +        if (!s->is_sse200) {
 +            goto bad_offset;
 +        }
 +        r = s->pdcm_pd_sram2_sense;
 +        break;
 +    case A_PDCM_PD_SRAM3_SENSE:
 +        if (!s->is_sse200) {
 +            goto bad_offset;
 +        }
 +        r = s->pdcm_pd_sram3_sense;
 +        break;
      case A_PID4 ... A_CID3:
          r = sysctl_id[(offset - A_PID4) / 4];
          break;
@@ -XXX,XX +XXX,XX @@ static uint64_t iotkit_sysctl_read(void *opaque, hwaddr offset,
          r = 0;
          break;
      default:
 +    bad_offset:
          qemu_log_mask(LOG_GUEST_ERROR,
                        "IoTKit SysCtl read: bad offset %x\n", (int)offset);
          r = 0;
@@ -XXX,XX +XXX,XX @@ static void iotkit_sysctl_write(void *opaque, hwaddr offset,
              qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET);
          }
          break;
 -    case A_BUSWAIT:        /* In IoTKit BUSWAIT is reserved, R/O, zero */
 +    case A_SCSECCTRL:
 +        if (!s->is_sse200) {
 +            goto bad_offset;
 +        }
 +        qemu_log_mask(LOG_UNIMP, "IoTKit SysCtl SCSECCTRL unimplemented\n");
 +        s->scsecctrl = value;
 +        break;
 +    case A_FCLK_DIV:
 +        if (!s->is_sse200) {
 +            goto bad_offset;
 +        }
 +        qemu_log_mask(LOG_UNIMP, "IoTKit SysCtl FCLK_DIV unimplemented\n");
 +        s->fclk_div = value;
 +        break;
 +    case A_SYSCLK_DIV:
 +        if (!s->is_sse200) {
 +            goto bad_offset;
 +        }
 +        qemu_log_mask(LOG_UNIMP, "IoTKit SysCtl SYSCLK_DIV unimplemented\n");
 +        s->sysclk_div = value;
 +        break;
 +    case A_CLOCK_FORCE:
 +        if (!s->is_sse200) {
 +            goto bad_offset;
 +        }
 +        qemu_log_mask(LOG_UNIMP, "IoTKit SysCtl CLOCK_FORCE unimplemented\n");
 +        s->clock_force = value;
 +        break;
 +    case A_INITSVTOR1:
 +        if (!s->is_sse200) {
 +            goto bad_offset;
 +        }
 +        qemu_log_mask(LOG_UNIMP, "IoTKit SysCtl INITSVTOR1 unimplemented\n");
 +        s->initsvtor1 = value;
 +        break;
 +    case A_EWCTRL:
 +        if (!s->is_sse200) {
 +            goto bad_offset;
 +        }
 +        qemu_log_mask(LOG_UNIMP, "IoTKit SysCtl EWCTRL unimplemented\n");
 +        s->ewctrl = value;
 +        break;
 +    case A_PDCM_PD_SYS_SENSE:
 +        if (!s->is_sse200) {
 +            goto bad_offset;
 +        }
 +        qemu_log_mask(LOG_UNIMP,
 +                      "IoTKit SysCtl PDCM_PD_SYS_SENSE unimplemented\n");
 +        s->pdcm_pd_sys_sense = value;
 +        break;
 +    case A_PDCM_PD_SRAM0_SENSE:
 +        if (!s->is_sse200) {
 +            goto bad_offset;
 +        }
 +        qemu_log_mask(LOG_UNIMP,
 +                      "IoTKit SysCtl PDCM_PD_SRAM0_SENSE unimplemented\n");
 +        s->pdcm_pd_sram0_sense = value;
 +        break;
 +    case A_PDCM_PD_SRAM1_SENSE:
 +        if (!s->is_sse200) {
 +            goto bad_offset;
 +        }
 +        qemu_log_mask(LOG_UNIMP,
 +                      "IoTKit SysCtl PDCM_PD_SRAM1_SENSE unimplemented\n");
 +        s->pdcm_pd_sram1_sense = value;
 +        break;
 +    case A_PDCM_PD_SRAM2_SENSE:
 +        if (!s->is_sse200) {
 +            goto bad_offset;
 +        }
 +        qemu_log_mask(LOG_UNIMP,
 +                      "IoTKit SysCtl PDCM_PD_SRAM2_SENSE unimplemented\n");
 +        s->pdcm_pd_sram2_sense = value;
 +        break;
 +    case A_PDCM_PD_SRAM3_SENSE:
 +        if (!s->is_sse200) {
 +            goto bad_offset;
 +        }
 +        qemu_log_mask(LOG_UNIMP,
 +                      "IoTKit SysCtl PDCM_PD_SRAM3_SENSE unimplemented\n");
 +        s->pdcm_pd_sram3_sense = value;
 +        break;
 +    case A_NMI_ENABLE:
 +        /* In IoTKit this is BUSWAIT: reserved, R/O, zero */
 +        if (!s->is_sse200) {
 +            goto ro_offset;
 +        }
 +        qemu_log_mask(LOG_UNIMP, "IoTKit SysCtl NMI_ENABLE unimplemented\n");
 +        s->nmi_enable = value;
 +        break;
      case A_SECDBGSTAT:
      case A_PID4 ... A_CID3:
 +    ro_offset:
          qemu_log_mask(LOG_GUEST_ERROR,
                        "IoTKit SysCtl write: write of RO offset %x\n",
                        (int)offset);
          break;
      default:
 +    bad_offset:
          qemu_log_mask(LOG_GUEST_ERROR,
                        "IoTKit SysCtl write: bad offset %x\n", (int)offset);
          break;
@@ -XXX,XX +XXX,XX @@ static void iotkit_sysctl_reset(DeviceState *dev)
      s->reset_mask = 0;
      s->gretreg = 0;
      s->initsvtor0 = 0x10000000;
 +    s->initsvtor1 = 0x10000000;
      s->cpuwait = 0;
      s->wicctrl = 0;
 +    s->scsecctrl = 0;
 +    s->fclk_div = 0;
 +    s->sysclk_div = 0;
 +    s->clock_force = 0;
 +    s->nmi_enable = 0;
 +    s->ewctrl = 0;
 +    s->pdcm_pd_sys_sense = 0x7f;
 +    s->pdcm_pd_sram0_sense = 0;
 +    s->pdcm_pd_sram1_sense = 0;
 +    s->pdcm_pd_sram2_sense = 0;
 +    s->pdcm_pd_sram3_sense = 0;
  }
  static void iotkit_sysctl_init(Object *obj)
@@ -XXX,XX +XXX,XX @@ static void iotkit_sysctl_init(Object *obj)
      sysbus_init_mmio(sbd, &s->iomem);
  }
 +static void iotkit_sysctl_realize(DeviceState *dev, Error **errp)
 +{
-+    IoTKitSysCtl *s = IOTKIT_SYSCTL(dev);
++    tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz,
-+
++                       0, gen_helper_gvec_pmul_b);
 +    /* The top 4 bits of the SYS_VERSION register tell us if we're an SSE-200 */
 +    if (extract32(s->sys_version, 28, 4) == 2) {
 +        s->is_sse200 = true;
 +    }
 +}
 +
-+static bool sse200_needed(void *opaque)
++static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
 +{
-+    IoTKitSysCtl *s = IOTKIT_SYSCTL(opaque);
++    if (a->size != 0) {
-+
++        return false;
-+    return s->is_sse200;
++    }
 +    return do_3same(s, a, gen_VMUL_p_3s);
 +}
 +
-+static const VMStateDescription iotkit_sysctl_sse200_vmstate = {
++#define DO_3SAME_GVEC3_NO_SZ_3(INSN, OPARRAY)                           \
-+    .name = "iotkit-sysctl/sse-200",
++    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
-+    .version_id = 1,
++                                uint32_t rn_ofs, uint32_t rm_ofs,       \
-+    .minimum_version_id = 1,
++                                uint32_t oprsz, uint32_t maxsz)         \
-+    .needed = sse200_needed,
++    {                                                                   \
-+    .fields = (VMStateField[]) {
++        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,                          \
-+        VMSTATE_UINT32(scsecctrl, IoTKitSysCtl),
++                       oprsz, maxsz, &OPARRAY[vece]);                   \
-+        VMSTATE_UINT32(fclk_div, IoTKitSysCtl),
++    }                                                                   \
-+        VMSTATE_UINT32(sysclk_div, IoTKitSysCtl),
++    DO_3SAME_NO_SZ_3(INSN, gen_##INSN##_3s)
 +        VMSTATE_UINT32(clock_force, IoTKitSysCtl),
 +        VMSTATE_UINT32(initsvtor1, IoTKitSysCtl),
 +        VMSTATE_UINT32(nmi_enable, IoTKitSysCtl),
 +        VMSTATE_UINT32(pdcm_pd_sys_sense, IoTKitSysCtl),
 +        VMSTATE_UINT32(pdcm_pd_sram0_sense, IoTKitSysCtl),
 +        VMSTATE_UINT32(pdcm_pd_sram1_sense, IoTKitSysCtl),
 +        VMSTATE_UINT32(pdcm_pd_sram2_sense, IoTKitSysCtl),
 +        VMSTATE_UINT32(pdcm_pd_sram3_sense, IoTKitSysCtl),
 +        VMSTATE_END_OF_LIST()
 +    }
 +};
 +
- static const VMStateDescription iotkit_sysctl_vmstate = {
-     .name = "iotkit-sysctl",
-     .version_id = 1,
-@@ -XXX,XX +XXX,XX @@ static const VMStateDescription iotkit_sysctl_vmstate = {
-         VMSTATE_UINT32(cpuwait, IoTKitSysCtl),
-         VMSTATE_UINT32(wicctrl, IoTKitSysCtl),
-         VMSTATE_END_OF_LIST()
-+    },
-+    .subsections = (const VMStateDescription*[]) {
-+        &iotkit_sysctl_sse200_vmstate,
-+        NULL
-     }
- };
-+static Property iotkit_sysctl_props[] = {
-+    DEFINE_PROP_UINT32("SYS_VERSION", IoTKitSysCtl, sys_version, 0),
-+    DEFINE_PROP_END_OF_LIST()
-+};
 +
- static void iotkit_sysctl_class_init(ObjectClass *klass, void *data)
++DO_3SAME_GVEC3_NO_SZ_3(VMLA, mla_op)
- {
++DO_3SAME_GVEC3_NO_SZ_3(VMLS, mls_op)
-     DeviceClass *dc = DEVICE_CLASS(klass);
++
++#define DO_3SAME_GVEC3_SHIFT(INSN, OPARRAY)                             \
-     dc->vmsd = &iotkit_sysctl_vmstate;
++    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
-     dc->reset = iotkit_sysctl_reset;
++                                uint32_t rn_ofs, uint32_t rm_ofs,       \
-+    dc->props = iotkit_sysctl_props;
++                                uint32_t oprsz, uint32_t maxsz)         \
-+    dc->realize = iotkit_sysctl_realize;
++    {                                                                   \
- }
++        /* Note the operation is vshl vd,vm,vn */                       \
++        tcg_gen_gvec_3(rd_ofs, rm_ofs, rn_ofs,                          \
- static const TypeInfo iotkit_sysctl_info = {
++                       oprsz, maxsz, &OPARRAY[vece]);                   \
 +    }                                                                   \
 +    DO_3SAME(INSN, gen_##INSN##_3s)
 +
 +DO_3SAME_GVEC3_SHIFT(VSHL_S, sshl_op)
 +DO_3SAME_GVEC3_SHIFT(VSHL_U, ushl_op)
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
              }
              return 1;
 -        case NEON_3R_VMUL: /* VMUL */
 -            if (u) {
 -                /* Polynomial case allows only P8.  */
 -                if (size != 0) {
 -                    return 1;
 -                }
 -                tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, vec_size, vec_size,
 -                                   0, gen_helper_gvec_pmul_b);
 -            } else {
 -                tcg_gen_gvec_mul(size, rd_ofs, rn_ofs, rm_ofs,
 -                                 vec_size, vec_size);
 -            }
 -            return 0;
 -
 -        case NEON_3R_VML: /* VMLA, VMLS */
 -            tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, vec_size, vec_size,
 -                           u ? &mls_op[size] : &mla_op[size]);
 -            return 0;
 -
 -        case NEON_3R_VSHL:
 -            /* Note the operation is vshl vd,vm,vn */
 -            tcg_gen_gvec_3(rd_ofs, rm_ofs, rn_ofs, vec_size, vec_size,
 -                           u ? &ushl_op[size] : &sshl_op[size]);
 -            return 0;
 -
          case NEON_3R_VADD_VSUB:
          case NEON_3R_LOGIC:
          case NEON_3R_VMAX:
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          case NEON_3R_VCGE:
          case NEON_3R_VQADD:
          case NEON_3R_VQSUB:
 +        case NEON_3R_VMUL:
 +        case NEON_3R_VML:
 +        case NEON_3R_VSHL:
              /* Already handled by decodetree */
              return 1;
          }
 --
 .20.1
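
Illustration for 38/39 (not part of the patch): trans_VMUL_p_3s rejects any size other than 0 because the polynomial form of VMUL only exists for 8-bit elements. The sketch below is a scalar C model of an 8-bit polynomial (carry-less) multiply that keeps the low 8 bits of each product. The names pmul8 and vmul_p8 are hypothetical and this is not the body of gen_helper_gvec_pmul_b.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Multiply two 8-bit polynomials over GF(2): partial products are
 * combined with XOR instead of addition, and only the low 8 bits
 * of the result are kept.
 */
static uint8_t pmul8(uint8_t n, uint8_t m)
{
    uint8_t r = 0;

    for (int bit = 0; bit < 8; bit++) {
        if (m & (1u << bit)) {
            r ^= (uint8_t)(n << bit);
        }
    }
    return r;
}

/* Lane-wise form, one product per byte element. */
static void vmul_p8(uint8_t *d, const uint8_t *n, const uint8_t *m, size_t lanes)
{
    for (size_t i = 0; i < lanes; i++) {
        d[i] = pmul8(n[i], m[i]);
    }
}
```

For example, 0x03 times 0x03 is 0x05 here, since (x + 1)^2 = x^2 + 1 when coefficients are taken modulo 2.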

-[Qemu-devel] [PULL 05/16] hw/misc/iotkit-sysctl: Correct typo in INITSVTOR0 register name
+[PULL 39/39] target/arm: Move gen_ function typedefs to translate.h
-The iotkit-sysctl device has a register it names INITSVRTOR0.
+We're going to want at least some of the NeonGen* typedefs
-This is actually a typo present in the IoTKit documentation
+for the refactored 32-bit Neon decoder, so move them all
-and also in part of the SSE-200 documentation:  it should be
+to translate.h since it makes more sense to keep them in
-INITSVTOR0 because it is specifying the initial value of the
+one group.
 Secure VTOR register in the CPU. Correct the typo.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190219125808.25174-6-peter.maydell@linaro.org
+Message-id: 20200430181003.21682-23-peter.maydell@linaro.org
 ---
- include/hw/misc/iotkit-sysctl.h |  2 +-
+ target/arm/translate.h     | 17 +++++++++++++++++
- hw/misc/iotkit-sysctl.c         | 16 ++++++++--------
+ target/arm/translate-a64.c | 17 -----------------
-files changed, 9 insertions(+), 9 deletions(-)
+files changed, 17 insertions(+), 17 deletions(-)
-diff --git a/include/hw/misc/iotkit-sysctl.h b/include/hw/misc/iotkit-sysctl.h
+diff --git a/target/arm/translate.h b/target/arm/translate.h
 index XXXXXXX..XXXXXXX 100644
---- a/include/hw/misc/iotkit-sysctl.h
+--- a/target/arm/translate.h
-+++ b/include/hw/misc/iotkit-sysctl.h
++++ b/target/arm/translate.h
-@@ -XXX,XX +XXX,XX @@ typedef struct IoTKitSysCtl {
+@@ -XXX,XX +XXX,XX @@ typedef void GVecGen3Fn(unsigned, uint32_t, uint32_t,
-     uint32_t reset_syndrome;
+ typedef void GVecGen4Fn(unsigned, uint32_t, uint32_t, uint32_t,
-     uint32_t reset_mask;
+                         uint32_t, uint32_t, uint32_t);
-     uint32_t gretreg;
--    uint32_t initsvrtor0;
++/* Function prototype for gen_ functions for calling Neon helpers */
-+    uint32_t initsvtor0;
++typedef void NeonGenOneOpEnvFn(TCGv_i32, TCGv_ptr, TCGv_i32);
-     uint32_t cpuwait;
++typedef void NeonGenTwoOpFn(TCGv_i32, TCGv_i32, TCGv_i32);
-     uint32_t wicctrl;
++typedef void NeonGenTwoOpEnvFn(TCGv_i32, TCGv_ptr, TCGv_i32, TCGv_i32);
- } IoTKitSysCtl;
++typedef void NeonGenTwo64OpFn(TCGv_i64, TCGv_i64, TCGv_i64);
-diff --git a/hw/misc/iotkit-sysctl.c b/hw/misc/iotkit-sysctl.c
++typedef void NeonGenTwo64OpEnvFn(TCGv_i64, TCGv_ptr, TCGv_i64, TCGv_i64);
 +typedef void NeonGenNarrowFn(TCGv_i32, TCGv_i64);
 +typedef void NeonGenNarrowEnvFn(TCGv_i32, TCGv_ptr, TCGv_i64);
 +typedef void NeonGenWidenFn(TCGv_i64, TCGv_i32);
 +typedef void NeonGenTwoSingleOPFn(TCGv_i32, TCGv_i32, TCGv_i32, TCGv_ptr);
 +typedef void NeonGenTwoDoubleOPFn(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_ptr);
 +typedef void NeonGenOneOpFn(TCGv_i64, TCGv_i64);
 +typedef void CryptoTwoOpFn(TCGv_ptr, TCGv_ptr);
 +typedef void CryptoThreeOpIntFn(TCGv_ptr, TCGv_ptr, TCGv_i32);
 +typedef void CryptoThreeOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr);
 +typedef void AtomicThreeOpFn(TCGv_i64, TCGv_i64, TCGv_i64, TCGArg, MemOp);
 +
  #endif /* TARGET_ARM_TRANSLATE_H */
 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/misc/iotkit-sysctl.c
+--- a/target/arm/translate-a64.c
-+++ b/hw/misc/iotkit-sysctl.c
++++ b/target/arm/translate-a64.c
-@@ -XXX,XX +XXX,XX @@ REG32(RESET_MASK, 0x104)
+@@ -XXX,XX +XXX,XX @@ typedef struct AArch64DecodeTable {
- REG32(SWRESET, 0x108)
+     AArch64DecodeFn *disas_fn;
-     FIELD(SWRESET, SWRESETREQ, 9, 1)
+ } AArch64DecodeTable;
- REG32(GRETREG, 0x10c)
--REG32(INITSVRTOR0, 0x110)
+-/* Function prototype for gen_ functions for calling Neon helpers */
-+REG32(INITSVTOR0, 0x110)
+-typedef void NeonGenOneOpEnvFn(TCGv_i32, TCGv_ptr, TCGv_i32);
- REG32(CPUWAIT, 0x118)
+-typedef void NeonGenTwoOpFn(TCGv_i32, TCGv_i32, TCGv_i32);
- REG32(BUSWAIT, 0x11c)
+-typedef void NeonGenTwoOpEnvFn(TCGv_i32, TCGv_ptr, TCGv_i32, TCGv_i32);
- REG32(WICCTRL, 0x120)
+-typedef void NeonGenTwo64OpFn(TCGv_i64, TCGv_i64, TCGv_i64);
-@@ -XXX,XX +XXX,XX @@ static uint64_t iotkit_sysctl_read(void *opaque, hwaddr offset,
+-typedef void NeonGenTwo64OpEnvFn(TCGv_i64, TCGv_ptr, TCGv_i64, TCGv_i64);
-     case A_GRETREG:
+-typedef void NeonGenNarrowFn(TCGv_i32, TCGv_i64);
-         r = s->gretreg;
+-typedef void NeonGenNarrowEnvFn(TCGv_i32, TCGv_ptr, TCGv_i64);
-         break;
+-typedef void NeonGenWidenFn(TCGv_i64, TCGv_i32);
--    case A_INITSVRTOR0:
+-typedef void NeonGenTwoSingleOPFn(TCGv_i32, TCGv_i32, TCGv_i32, TCGv_ptr);
--        r = s->initsvrtor0;
+-typedef void NeonGenTwoDoubleOPFn(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_ptr);
-+    case A_INITSVTOR0:
+-typedef void NeonGenOneOpFn(TCGv_i64, TCGv_i64);
-+        r = s->initsvtor0;
+-typedef void CryptoTwoOpFn(TCGv_ptr, TCGv_ptr);
-         break;
+-typedef void CryptoThreeOpIntFn(TCGv_ptr, TCGv_ptr, TCGv_i32);
-     case A_CPUWAIT:
+-typedef void CryptoThreeOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr);
-         r = s->cpuwait;
+-typedef void AtomicThreeOpFn(TCGv_i64, TCGv_i64, TCGv_i64, TCGArg, MemOp);
-@@ -XXX,XX +XXX,XX @@ static void iotkit_sysctl_write(void *opaque, hwaddr offset,
+-
-          */
+ /* initialize TCG globals.  */
-         s->gretreg = value;
+ void a64_translate_init(void)
-         break;
+ {
 -    case A_INITSVRTOR0:
 -        qemu_log_mask(LOG_UNIMP, "IoTKit SysCtl INITSVRTOR0 unimplemented\n");
 -        s->initsvrtor0 = value;
 +    case A_INITSVTOR0:
 +        qemu_log_mask(LOG_UNIMP, "IoTKit SysCtl INITSVTOR0 unimplemented\n");
 +        s->initsvtor0 = value;
          break;
      case A_CPUWAIT:
          qemu_log_mask(LOG_UNIMP, "IoTKit SysCtl CPUWAIT unimplemented\n");
@@ -XXX,XX +XXX,XX @@ static void iotkit_sysctl_reset(DeviceState *dev)
      s->reset_syndrome = 1;
      s->reset_mask = 0;
      s->gretreg = 0;
 -    s->initsvrtor0 = 0x10000000;
 +    s->initsvtor0 = 0x10000000;
      s->cpuwait = 0;
      s->wicctrl = 0;
  }
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription iotkit_sysctl_vmstate = {
          VMSTATE_UINT32(reset_syndrome, IoTKitSysCtl),
          VMSTATE_UINT32(reset_mask, IoTKitSysCtl),
          VMSTATE_UINT32(gretreg, IoTKitSysCtl),
 -        VMSTATE_UINT32(initsvrtor0, IoTKitSysCtl),
 +        VMSTATE_UINT32(initsvtor0, IoTKitSysCtl),
          VMSTATE_UINT32(cpuwait, IoTKitSysCtl),
          VMSTATE_UINT32(wicctrl, IoTKitSysCtl),
          VMSTATE_END_OF_LIST()
 --
 2.20.1

The following changes since commit adf2e451f357e993f173ba9b4176dbf3e65fee7e:

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging (2019-02-26 19:04:47 +0000)

are available in the Git repository at:

https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20190228-1

for you to fetch changes up to 1c9af3a9e05c1607a36df4943f8f5393d7621a91:

linux-user: Enable HWCAP_ASIMDFHM, HWCAP_JSCVT (2019-02-28 11:03:05 +0000)

----------------------------------------------------------------
target-arm queue:
 * add MHU and dual-core support to Musca boards
 * refactor some VFP insns to be gated by ID registers
 * Revert "arm: Allow system registers for KVM guests to be changed by QEMU code"
 * Implement ARMv8.2-FHM extension
 * Advertise JSCVT via HWCAP for linux-user

----------------------------------------------------------------
Peter Maydell (11):
      hw/misc/armsse-mhu.c: Model the SSE-200 Message Handling Unit
      hw/arm/armsse: Wire up the MHUs
      target/arm/cpu: Allow init-svtor property to be set after realize
      target/arm/arm-powerctl: Add new arm_set_cpu_on_and_reset()
      hw/misc/iotkit-sysctl: Correct typo in INITSVTOR0 register name
      hw/arm/iotkit-sysctl: Add SSE-200 registers
      hw/arm/iotkit-sysctl: Implement CPUWAIT and INITSVTOR*
      hw/arm/armsse: Unify init-svtor and cpuwait handling
      target/arm: Use MVFR1 feature bits to gate A32/T32 FP16 instructions
      target/arm: Gate "miscellaneous FP" insns by ID register field
      Revert "arm: Allow system registers for KVM guests to be changed by QEMU code"

Richard Henderson (5):
      target/arm: Add helpers for FMLAL
      target/arm: Implement FMLAL and FMLSL for aarch64
      target/arm: Implement VFMAL and VFMSL for aarch32
      target/arm: Enable ARMv8.2-FHM for -cpu max
      linux-user: Enable HWCAP_ASIMDFHM, HWCAP_JSCVT

Implement a model of the Message Handling Unit (MHU) found in
the Arm SSE-200. This is a simple device which just contains
some registers which allow the two cores of the SSE-200
to raise interrupts on each other.
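
As a rough illustration of the register semantics described above, the
set/clear/status behaviour can be sketched outside QEMU as follows. This is
a standalone model, not the device code: MhuModel, mhu_set(), mhu_clr() and
mhu_irq_level() are hypothetical names, and the four-bit mask mirrors the
INTR_MASK value used in the patch below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Only the low four bits of each interrupt register are valid. */
#define MHU_INTR_MASK 0xfu

/* Hypothetical stand-in for the device state; not the QEMU ARMSSEMHU type. */
typedef struct {
    uint32_t cpu0intr;
    uint32_t cpu1intr;
} MhuModel;

/* Write to CPUnINTR_SET: OR the valid bits of value into the pending set. */
static void mhu_set(uint32_t *intr, uint32_t value)
{
    assert(intr);
    *intr |= value & MHU_INTR_MASK;
}

/* Write to CPUnINTR_CLR: clear the valid bits named in value. */
static void mhu_clr(uint32_t *intr, uint32_t value)
{
    assert(intr);
    *intr &= ~(value & MHU_INTR_MASK);
}

/* The IRQ line to a CPU stays asserted while any pending bit is set. */
static bool mhu_irq_level(uint32_t intr)
{
    return intr != 0;
}
```

In this model one core would call mhu_set() on the other core's register to
raise an interrupt, and the receiving core would call mhu_clr() to
acknowledge it.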

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190219125808.25174-2-peter.maydell@linaro.org
---
 hw/misc/Makefile.objs           |   1 +
 include/hw/misc/armsse-mhu.h    |  44 +++++++
 hw/misc/armsse-mhu.c            | 198 ++++++++++++++++++++++++++++++++
 MAINTAINERS                     |   2 +
 default-configs/arm-softmmu.mak |   1 +
 hw/misc/trace-events            |   4 +
 6 files changed, 250 insertions(+)
 create mode 100644 include/hw/misc/armsse-mhu.h
 create mode 100644 hw/misc/armsse-mhu.c

diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs
index XXXXXXX..XXXXXXX 100644
--- a/hw/misc/Makefile.objs
+++ b/hw/misc/Makefile.objs
@@ -XXX,XX +XXX,XX @@ obj-$(CONFIG_IOTKIT_SECCTL) += iotkit-secctl.o
 obj-$(CONFIG_IOTKIT_SYSCTL) += iotkit-sysctl.o
 obj-$(CONFIG_IOTKIT_SYSINFO) += iotkit-sysinfo.o
 obj-$(CONFIG_ARMSSE_CPUID) += armsse-cpuid.o
+obj-$(CONFIG_ARMSSE_MHU) += armsse-mhu.o
 
 obj-$(CONFIG_PVPANIC) += pvpanic.o
 obj-$(CONFIG_AUX) += auxbus.o
diff --git a/include/hw/misc/armsse-mhu.h b/include/hw/misc/armsse-mhu.h
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/include/hw/misc/armsse-mhu.h
@@ -XXX,XX +XXX,XX @@
+/*
+ * ARM SSE-200 Message Handling Unit (MHU)
+ *
+ * Copyright (c) 2019 Linaro Limited
+ * Written by Peter Maydell
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License version 2 or
+ *  (at your option) any later version.
+ */
+
+/*
+ * This is a model of the Message Handling Unit (MHU) which is part of the
+ * Arm SSE-200 and documented in
+ * http://infocenter.arm.com/help/topic/com.arm.doc.101104_0100_00_en/corelink_sse200_subsystem_for_embedded_technical_reference_manual_101104_0100_00_en.pdf
+ *
+ * QEMU interface:
+ *  + sysbus MMIO region 0: the MHU register bank
+ *  + sysbus IRQ 0: interrupt for CPU 0
+ *  + sysbus IRQ 1: interrupt for CPU 1
+ */
+
+#ifndef HW_MISC_SSE_MHU_H
+#define HW_MISC_SSE_MHU_H
+
+#include "hw/sysbus.h"
+
+#define TYPE_ARMSSE_MHU "armsse-mhu"
+#define ARMSSE_MHU(obj) OBJECT_CHECK(ARMSSEMHU, (obj), TYPE_ARMSSE_MHU)
+
+typedef struct ARMSSEMHU {
+    /*< private >*/
+    SysBusDevice parent_obj;
+
+    /*< public >*/
+    MemoryRegion iomem;
+    qemu_irq cpu0irq;
+    qemu_irq cpu1irq;
+
+    uint32_t cpu0intr;
+    uint32_t cpu1intr;
+} ARMSSEMHU;
+
+#endif
diff --git a/hw/misc/armsse-mhu.c b/hw/misc/armsse-mhu.c
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/hw/misc/armsse-mhu.c
@@ -XXX,XX +XXX,XX @@
+/*
+ * ARM SSE-200 Message Handling Unit (MHU)
+ *
+ * Copyright (c) 2019 Linaro Limited
+ * Written by Peter Maydell
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License version 2 or
+ *  (at your option) any later version.
+ */
+
+/*
+ * This is a model of the Message Handling Unit (MHU) which is part of the
+ * Arm SSE-200 and documented in
+ * http://infocenter.arm.com/help/topic/com.arm.doc.101104_0100_00_en/corelink_sse200_subsystem_for_embedded_technical_reference_manual_101104_0100_00_en.pdf
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "trace.h"
+#include "qapi/error.h"
+#include "sysemu/sysemu.h"
+#include "hw/sysbus.h"
+#include "hw/registerfields.h"
+#include "hw/misc/armsse-mhu.h"
+
+REG32(CPU0INTR_STAT, 0x0)
+REG32(CPU0INTR_SET, 0x4)
+REG32(CPU0INTR_CLR, 0x8)
+REG32(CPU1INTR_STAT, 0x10)
+REG32(CPU1INTR_SET, 0x14)
+REG32(CPU1INTR_CLR, 0x18)
+REG32(PID4, 0xfd0)
+REG32(PID5, 0xfd4)
+REG32(PID6, 0xfd8)
+REG32(PID7, 0xfdc)
+REG32(PID0, 0xfe0)
+REG32(PID1, 0xfe4)
+REG32(PID2, 0xfe8)
+REG32(PID3, 0xfec)
+REG32(CID0, 0xff0)
+REG32(CID1, 0xff4)
+REG32(CID2, 0xff8)
+REG32(CID3, 0xffc)
+
+/* Valid bits in the interrupt registers. If any are set the IRQ is raised */
+#define INTR_MASK 0xf
+
+/* PID/CID values */
+static const int armsse_mhu_id[] = {
+    0x04, 0x00, 0x00, 0x00, /* PID4..PID7 */
+    0x56, 0xb8, 0x0b, 0x00, /* PID0..PID3 */
+    0x0d, 0xf0, 0x05, 0xb1, /* CID0..CID3 */
+};
+
+static void armsse_mhu_update(ARMSSEMHU *s)
+{
+    qemu_set_irq(s->cpu0irq, s->cpu0intr != 0);
+    qemu_set_irq(s->cpu1irq, s->cpu1intr != 0);
+}
+
+static uint64_t armsse_mhu_read(void *opaque, hwaddr offset, unsigned size)
+{
+    ARMSSEMHU *s = ARMSSE_MHU(opaque);
+    uint64_t r;
+
+    switch (offset) {
+    case A_CPU0INTR_STAT:
+        r = s->cpu0intr;
+        break;
+
+    case A_CPU1INTR_STAT:
+        r = s->cpu1intr;
+        break;
+
+    case A_PID4 ... A_CID3:
+        r = armsse_mhu_id[(offset - A_PID4) / 4];
+        break;
+
+    case A_CPU0INTR_SET:
+    case A_CPU0INTR_CLR:
+    case A_CPU1INTR_SET:
+    case A_CPU1INTR_CLR:
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "SSE MHU: read of write-only register at offset 0x%x\n",
+                      (int)offset);
+        r = 0;
+        break;
+
+    default:
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "SSE MHU read: bad offset 0x%x\n", (int)offset);
+        r = 0;
+        break;
+    }
+    trace_armsse_mhu_read(offset, r, size);
+    return r;
+}
+
+static void armsse_mhu_write(void *opaque, hwaddr offset,
+                             uint64_t value, unsigned size)
+{
+    ARMSSEMHU *s = ARMSSE_MHU(opaque);
+
+    trace_armsse_mhu_write(offset, value, size);
+
+    switch (offset) {
+    case A_CPU0INTR_SET:
+        s->cpu0intr |= (value & INTR_MASK);
+        break;
+    case A_CPU0INTR_CLR:
+        s->cpu0intr &= ~(value & INTR_MASK);
+        break;
+    case A_CPU1INTR_SET:
+        s->cpu1intr |= (value & INTR_MASK);
+        break;
+    case A_CPU1INTR_CLR:
+        s->cpu1intr &= ~(value & INTR_MASK);
+        break;
+
+    case A_CPU0INTR_STAT:
+    case A_CPU1INTR_STAT:
+    case A_PID4 ... A_CID3:
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "SSE MHU: write to read-only register at offset 0x%x\n",
+                      (int)offset);
+        break;
+
+    default:
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "SSE MHU write: bad offset 0x%x\n", (int)offset);
+        break;
+    }
+
+    armsse_mhu_update(s);
+}
+
+static const MemoryRegionOps armsse_mhu_ops = {
+    .read = armsse_mhu_read,
+    .write = armsse_mhu_write,
+    .endianness = DEVICE_LITTLE_ENDIAN,
+    .valid.min_access_size = 4,
+    .valid.max_access_size = 4,
+};
+
+static void armsse_mhu_reset(DeviceState *dev)
+{
+    ARMSSEMHU *s = ARMSSE_MHU(dev);
+
+    s->cpu0intr = 0;
+    s->cpu1intr = 0;
+}
+
+static const VMStateDescription armsse_mhu_vmstate = {
+    .name = "armsse-mhu",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .fields = (VMStateField[]) {
+        VMSTATE_UINT32(cpu0intr, ARMSSEMHU),
+        VMSTATE_UINT32(cpu1intr, ARMSSEMHU),
+        VMSTATE_END_OF_LIST()
+    },
+};
+
+static void armsse_mhu_init(Object *obj)
+{
+    SysBusDevice *sbd = SYS_BUS_DEVICE(obj);
+    ARMSSEMHU *s = ARMSSE_MHU(obj);
+
+    memory_region_init_io(&s->iomem, obj, &armsse_mhu_ops,
+                          s, "armsse-mhu", 0x1000);
+    sysbus_init_mmio(sbd, &s->iomem);
+    sysbus_init_irq(sbd, &s->cpu0irq);
+    sysbus_init_irq(sbd, &s->cpu1irq);
+}
+
+static void armsse_mhu_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+
+    dc->reset = armsse_mhu_reset;
+    dc->vmsd = &armsse_mhu_vmstate;
+}
+
+static const TypeInfo armsse_mhu_info = {
+    .name = TYPE_ARMSSE_MHU,
+    .parent = TYPE_SYS_BUS_DEVICE,
+    .instance_size = sizeof(ARMSSEMHU),
+    .instance_init = armsse_mhu_init,
+    .class_init = armsse_mhu_class_init,
+};
+
+static void armsse_mhu_register_types(void)
+{
+    type_register_static(&armsse_mhu_info);
+}
+
+type_init(armsse_mhu_register_types);
diff --git a/MAINTAINERS b/MAINTAINERS
index XXXXXXX..XXXXXXX 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -XXX,XX +XXX,XX @@ F: hw/misc/iotkit-sysinfo.c
 F: include/hw/misc/iotkit-sysinfo.h
 F: hw/misc/armsse-cpuid.c
 F: include/hw/misc/armsse-cpuid.h
+F: hw/misc/armsse-mhu.c
+F: include/hw/misc/armsse-mhu.h
 
 Musca
 M: Peter Maydell <peter.maydell@linaro.org>
diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
index XXXXXXX..XXXXXXX 100644
--- a/default-configs/arm-softmmu.mak
+++ b/default-configs/arm-softmmu.mak
@@ -XXX,XX +XXX,XX @@ CONFIG_IOTKIT_SECCTL=y
 CONFIG_IOTKIT_SYSCTL=y
 CONFIG_IOTKIT_SYSINFO=y
 CONFIG_ARMSSE_CPUID=y
+CONFIG_ARMSSE_MHU=y
 
 CONFIG_VERSATILE=y
 CONFIG_VERSATILE_PCI=y
diff --git a/hw/misc/trace-events b/hw/misc/trace-events
index XXXXXXX..XXXXXXX 100644
--- a/hw/misc/trace-events
+++ b/hw/misc/trace-events
@@ -XXX,XX +XXX,XX @@ iotkit_sysctl_reset(void) "IoTKit SysCtl: reset"
 # hw/misc/armsse-cpuid.c
 armsse_cpuid_read(uint64_t offset, uint64_t data, unsigned size) "SSE-200 CPU_IDENTITY read: offset 0x%" PRIx64 " data 0x%" PRIx64 " size %u"
 armsse_cpuid_write(uint64_t offset, uint64_t data, unsigned size) "SSE-200 CPU_IDENTITY write: offset 0x%" PRIx64 " data 0x%" PRIx64 " size %u"
+
+# hw/misc/armsse-mhu.c
+armsse_mhu_read(uint64_t offset, uint64_t data, unsigned size) "SSE-200 MHU read: offset 0x%" PRIx64 " data 0x%" PRIx64 " size %u"
+armsse_mhu_write(uint64_t offset, uint64_t data, unsigned size) "SSE-200 MHU write: offset 0x%" PRIx64 " data 0x%" PRIx64 " size %u"
-- 
2.20.1

Create and connect the MHUs in the SSE-200.
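
The wiring added in armsse_realize() below connects MHU i to input IRQ
6 + i on every CPU. A minimal standalone sketch of that mapping, using the
hypothetical names MODEL_MHU_IRQ_BASE and model_mhu_cpu_irq() (neither is
part of QEMU):

```c
#include <assert.h>

/* First CPU input IRQ used by the MHUs, taken from the patch's comment. */
#define MODEL_MHU_IRQ_BASE 6

/*
 * Return the CPU input IRQ number driven by MHU 'mhu_index'.
 * The same number is used on CPU 0 and CPU 1; only the MHU output
 * line (0 or 1) selects which CPU receives it.
 */
static int model_mhu_cpu_irq(int mhu_index)
{
    assert(mhu_index == 0 || mhu_index == 1);
    return MODEL_MHU_IRQ_BASE + mhu_index;
}
```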

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190219125808.25174-3-peter.maydell@linaro.org
---
 include/hw/arm/armsse.h |  3 ++-
 hw/arm/armsse.c         | 40 ++++++++++++++++++++++++++++++----------
 2 files changed, 32 insertions(+), 11 deletions(-)

diff --git a/include/hw/arm/armsse.h b/include/hw/arm/armsse.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/arm/armsse.h
+++ b/include/hw/arm/armsse.h
@@ -XXX,XX +XXX,XX @@
 #include "hw/misc/iotkit-sysctl.h"
 #include "hw/misc/iotkit-sysinfo.h"
 #include "hw/misc/armsse-cpuid.h"
+#include "hw/misc/armsse-mhu.h"
 #include "hw/misc/unimp.h"
 #include "hw/or-irq.h"
 #include "hw/core/split-irq.h"
@@ -XXX,XX +XXX,XX @@ typedef struct ARMSSE {
     IoTKitSysCtl sysctl;
     IoTKitSysCtl sysinfo;
 
-    UnimplementedDeviceState mhu[2];
+    ARMSSEMHU mhu[2];
     UnimplementedDeviceState ppu[NUM_PPUS];
     UnimplementedDeviceState cachectrl[SSE_MAX_CPUS];
     UnimplementedDeviceState cpusecctrl[SSE_MAX_CPUS];
diff --git a/hw/arm/armsse.c b/hw/arm/armsse.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/armsse.c
+++ b/hw/arm/armsse.c
@@ -XXX,XX +XXX,XX @@ static void armsse_init(Object *obj)
                           sizeof(s->sysinfo), TYPE_IOTKIT_SYSINFO);
     if (info->has_mhus) {
         sysbus_init_child_obj(obj, "mhu0", &s->mhu[0], sizeof(s->mhu[0]),
-                              TYPE_UNIMPLEMENTED_DEVICE);
+                              TYPE_ARMSSE_MHU);
         sysbus_init_child_obj(obj, "mhu1", &s->mhu[1], sizeof(s->mhu[1]),
-                              TYPE_UNIMPLEMENTED_DEVICE);
+                              TYPE_ARMSSE_MHU);
     }
     if (info->has_ppus) {
         for (i = 0; i < info->num_cpus; i++) {
@@ -XXX,XX +XXX,XX @@ static void armsse_realize(DeviceState *dev, Error **errp)
     }
 
     if (info->has_mhus) {
-        for (i = 0; i < ARRAY_SIZE(s->mhu); i++) {
-            char *name;
-            char *port;
+        /*
+         * An SSE-200 with only one CPU should have only one MHU created,
+         * with the region where the second MHU usually is being RAZ/WI.
+         * We don't implement that SSE-200 config; if we want to support
+         * it then this code needs to be enhanced to handle creating the
+         * RAZ/WI region instead of the second MHU.
+         */
+        assert(info->num_cpus == ARRAY_SIZE(s->mhu));
+
+        for (i = 0; i < ARRAY_SIZE(s->mhu); i++) {
+            char *port;
+            int cpunum;
+            SysBusDevice *mhu_sbd = SYS_BUS_DEVICE(&s->mhu[i]);
 
-            name = g_strdup_printf("MHU%d", i);
-            qdev_prop_set_string(DEVICE(&s->mhu[i]), "name", name);
-            qdev_prop_set_uint64(DEVICE(&s->mhu[i]), "size", 0x1000);
             object_property_set_bool(OBJECT(&s->mhu[i]), true,
                                      "realized", &err);
-            g_free(name);
             if (err) {
                 error_propagate(errp, err);
                 return;
             }
             port = g_strdup_printf("port[%d]", i + 3);
-            mr = sysbus_mmio_get_region(SYS_BUS_DEVICE(&s->mhu[i]), 0);
+            mr = sysbus_mmio_get_region(mhu_sbd, 0);
             object_property_set_link(OBJECT(&s->apb_ppc0), OBJECT(mr),
                                      port, &err);
             g_free(port);
@@ -XXX,XX +XXX,XX @@ static void armsse_realize(DeviceState *dev, Error **errp)
                 error_propagate(errp, err);
                 return;
             }
+
+            /*
+             * Each MHU has an irq line for each CPU:
+             *  MHU 0 irq line 0 -> CPU 0 IRQ 6
+             *  MHU 0 irq line 1 -> CPU 1 IRQ 6
+             *  MHU 1 irq line 0 -> CPU 0 IRQ 7
+             *  MHU 1 irq line 1 -> CPU 1 IRQ 7
+             */
+            for (cpunum = 0; cpunum < info->num_cpus; cpunum++) {
+                DeviceState *cpudev = DEVICE(&s->armv7m[cpunum]);
+
+                sysbus_connect_irq(mhu_sbd, cpunum,
+                                   qdev_get_gpio_in(cpudev, 6 + i));
+            }
         }
     }
 
-- 
2.20.1

Make the M-profile "init-svtor" property be settable after realize.
This matches the hardware, where this is a config signal which
is sampled on CPU reset and can thus be changed between one
reset and another. To do this we have to change the API we
use to add the property.

(We will need this capability for the SSE-200.)
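
The intended behaviour can be pictured with a small standalone model: the
property is a configuration input that may be rewritten at any time, but
its value only reaches the live register when the CPU is reset. This is a
sketch under assumptions, not the QEMU implementation; ModelMCpu,
model_set_init_svtor() and model_cpu_reset() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant CPU state. */
typedef struct {
    uint32_t init_svtor; /* configuration input; may change at any time */
    uint32_t vtor_s;     /* live Secure VTOR; loaded only at reset */
} ModelMCpu;

/* Property setter: allowed whether or not the CPU is already "realized". */
static void model_set_init_svtor(ModelMCpu *cpu, uint32_t value)
{
    assert(cpu);
    cpu->init_svtor = value;
}

/* Reset samples the configuration input into the live register. */
static void model_cpu_reset(ModelMCpu *cpu)
{
    assert(cpu);
    cpu->vtor_s = cpu->init_svtor;
}
```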

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190219125808.25174-4-peter.maydell@linaro.org
---
 target/arm/cpu.c | 29 ++++++++++++++++++++++++-----
 1 file changed, 24 insertions(+), 5 deletions(-)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@
 #include "target/arm/idau.h"
 #include "qemu/error-report.h"
 #include "qapi/error.h"
+#include "qapi/visitor.h"
 #include "cpu.h"
 #include "internals.h"
 #include "qemu-common.h"
@@ -XXX,XX +XXX,XX @@ static Property arm_cpu_pmsav7_dregion_property =
                                            pmsav7_dregion,
                                            qdev_prop_uint32, uint32_t);
 
-/* M profile: initial value of the Secure VTOR */
-static Property arm_cpu_initsvtor_property =
-            DEFINE_PROP_UINT32("init-svtor", ARMCPU, init_svtor, 0);
+static void arm_get_init_svtor(Object *obj, Visitor *v, const char *name,
+                               void *opaque, Error **errp)
+{
+    ARMCPU *cpu = ARM_CPU(obj);
+
+    visit_type_uint32(v, name, &cpu->init_svtor, errp);
+}
+
+static void arm_set_init_svtor(Object *obj, Visitor *v, const char *name,
+                               void *opaque, Error **errp)
+{
+    ARMCPU *cpu = ARM_CPU(obj);
+
+    visit_type_uint32(v, name, &cpu->init_svtor, errp);
+}
 
 void arm_cpu_post_init(Object *obj)
 {
@@ -XXX,XX +XXX,XX @@ void arm_cpu_post_init(Object *obj)
                                  qdev_prop_allow_set_link_before_realize,
                                  OBJ_PROP_LINK_STRONG,
                                  &error_abort);
-        qdev_property_add_static(DEVICE(obj), &arm_cpu_initsvtor_property,
-                                 &error_abort);
+        /*
+         * M profile: initial value of the Secure VTOR. We can't just use
+         * a simple DEFINE_PROP_UINT32 for this because we want to permit
+         * the property to be set after realize.
+         */
+        object_property_add(obj, "init-svtor", "uint32",
+                            arm_get_init_svtor, arm_set_init_svtor,
+                            NULL, NULL, &error_abort);
     }
 
     qdev_property_add_static(DEVICE(obj), &arm_cpu_cfgend_property,
-- 
2.20.1

Currently the Arm arm-powerctl.h APIs allow:
 * arm_set_cpu_on(), which powers on a CPU and sets its
   initial PC and other startup state
 * arm_reset_cpu(), which resets a CPU which is already on
   (and fails if the CPU is powered off)

but there is no way to say "power on a CPU as if it had
just come out of reset and don't do anything else to it".

Add a new function arm_set_cpu_on_and_reset(), which does this.
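
The decision logic of the new function can be summarised with a standalone
sketch. The types and names here (ModelCpu, ModelRet,
model_cpu_on_and_reset()) are hypothetical stand-ins for the QEMU ones, and
the reset is done inline, whereas the real code defers it to asynchronous
work on the target CPU.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the PSCI power states and return codes. */
typedef enum { MODEL_OFF, MODEL_ON_PENDING, MODEL_ON } ModelPowerState;

typedef enum {
    MODEL_RET_SUCCESS,
    MODEL_RET_INVALID_PARAM,
    MODEL_RET_ALREADY_ON,
    MODEL_RET_ON_PENDING
} ModelRet;

typedef struct {
    ModelPowerState power_state;
    int halted;
    unsigned reset_count; /* counts resets, for illustration only */
} ModelCpu;

/* Power on a CPU as if it had just come out of reset; touch nothing else. */
static ModelRet model_cpu_on_and_reset(ModelCpu *cpu)
{
    if (cpu == NULL) {
        return MODEL_RET_INVALID_PARAM;   /* no CPU with that ID */
    }
    if (cpu->power_state == MODEL_ON) {
        return MODEL_RET_ALREADY_ON;
    }
    if (cpu->power_state == MODEL_ON_PENDING) {
        return MODEL_RET_ON_PENDING;      /* another power-on is in flight */
    }

    /* Reset, un-halt, and only then mark the CPU as on. */
    cpu->reset_count++;
    cpu->halted = 0;
    cpu->power_state = MODEL_ON;
    return MODEL_RET_SUCCESS;
}
```

Unlike arm_set_cpu_on(), no entry point or context ID is passed in: the CPU
starts wherever its architected reset state says it should.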

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190219125808.25174-5-peter.maydell@linaro.org
---
 target/arm/arm-powerctl.h | 16 +++++++++++
 target/arm/arm-powerctl.c | 56 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 72 insertions(+)

diff --git a/target/arm/arm-powerctl.h b/target/arm/arm-powerctl.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/arm-powerctl.h
+++ b/target/arm/arm-powerctl.h
@@ -XXX,XX +XXX,XX @@ int arm_set_cpu_off(uint64_t cpuid);
  */
 int arm_reset_cpu(uint64_t cpuid);
 
+/*
+ * arm_set_cpu_on_and_reset:
+ * @cpuid: the id of the CPU we want to start
+ *
+ * Start the cpu designated by @cpuid and put it through its normal
+ * CPU reset process. The CPU will start in the way it is architected
+ * to start after a power-on reset.
+ *
+ * Returns: QEMU_ARM_POWERCTL_RET_SUCCESS on success.
+ * QEMU_ARM_POWERCTL_INVALID_PARAM if there is no CPU with that ID.
+ * QEMU_ARM_POWERCTL_ALREADY_ON if the CPU is already on.
+ * QEMU_ARM_POWERCTL_ON_PENDING if the CPU is already partway through
+ * powering on.
+ */
+int arm_set_cpu_on_and_reset(uint64_t cpuid);
+
 #endif
diff --git a/target/arm/arm-powerctl.c b/target/arm/arm-powerctl.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/arm-powerctl.c
+++ b/target/arm/arm-powerctl.c
@@ -XXX,XX +XXX,XX @@ int arm_set_cpu_on(uint64_t cpuid, uint64_t entry, uint64_t context_id,
     return QEMU_ARM_POWERCTL_RET_SUCCESS;
 }
 
+static void arm_set_cpu_on_and_reset_async_work(CPUState *target_cpu_state,
+                                                run_on_cpu_data data)
+{
+    ARMCPU *target_cpu = ARM_CPU(target_cpu_state);
+
+    /* Initialize the cpu we are turning on */
+    cpu_reset(target_cpu_state);
+    target_cpu_state->halted = 0;
+
+    /* Finally set the power status */
+    assert(qemu_mutex_iothread_locked());
+    target_cpu->power_state = PSCI_ON;
+}
+
+int arm_set_cpu_on_and_reset(uint64_t cpuid)
+{
+    CPUState *target_cpu_state;
+    ARMCPU *target_cpu;
+
+    assert(qemu_mutex_iothread_locked());
+
+    /* Retrieve the cpu we are powering up */
+    target_cpu_state = arm_get_cpu_by_id(cpuid);
+    if (!target_cpu_state) {
+        /* The cpu was not found */
+        return QEMU_ARM_POWERCTL_INVALID_PARAM;
+    }
+
+    target_cpu = ARM_CPU(target_cpu_state);
+    if (target_cpu->power_state == PSCI_ON) {
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "[ARM]%s: CPU %" PRId64 " is already on\n",
+                      __func__, cpuid);
+        return QEMU_ARM_POWERCTL_ALREADY_ON;
+    }
+
+    /*
+     * If another CPU has powered the target on we are in the state
+     * ON_PENDING and additional attempts to power on the CPU should
+     * fail (see 6.6 Implementation CPU_ON/CPU_OFF races in the PSCI
+     * spec)
+     */
+    if (target_cpu->power_state == PSCI_ON_PENDING) {
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "[ARM]%s: CPU %" PRId64 " is already powering on\n",
+                      __func__, cpuid);
+        return QEMU_ARM_POWERCTL_ON_PENDING;
+    }
+
+    async_run_on_cpu(target_cpu_state, arm_set_cpu_on_and_reset_async_work,
+                     RUN_ON_CPU_NULL);
+
+    /* We are good to go */
+    return QEMU_ARM_POWERCTL_RET_SUCCESS;
+}
+
 static void arm_set_cpu_off_async_work(CPUState *target_cpu_state,
                                        run_on_cpu_data data)
 {
-- 
2.20.1

The iotkit-sysctl device has a register it names INITSVRTOR0.
This is actually a typo present in the IoTKit documentation
and also in part of the SSE-200 documentation:  it should be
INITSVTOR0 because it is specifying the initial value of the
Secure VTOR register in the CPU. Correct the typo.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190219125808.25174-6-peter.maydell@linaro.org
---
 include/hw/misc/iotkit-sysctl.h |  2 +-
 hw/misc/iotkit-sysctl.c         | 16 ++++++++--------
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/include/hw/misc/iotkit-sysctl.h b/include/hw/misc/iotkit-sysctl.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/misc/iotkit-sysctl.h
+++ b/include/hw/misc/iotkit-sysctl.h
@@ -XXX,XX +XXX,XX @@ typedef struct IoTKitSysCtl {
     uint32_t reset_syndrome;
     uint32_t reset_mask;
     uint32_t gretreg;
-    uint32_t initsvrtor0;
+    uint32_t initsvtor0;
     uint32_t cpuwait;
     uint32_t wicctrl;
 } IoTKitSysCtl;
diff --git a/hw/misc/iotkit-sysctl.c b/hw/misc/iotkit-sysctl.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/misc/iotkit-sysctl.c
+++ b/hw/misc/iotkit-sysctl.c
@@ -XXX,XX +XXX,XX @@ REG32(RESET_MASK, 0x104)
 REG32(SWRESET, 0x108)
     FIELD(SWRESET, SWRESETREQ, 9, 1)
 REG32(GRETREG, 0x10c)
-REG32(INITSVRTOR0, 0x110)
+REG32(INITSVTOR0, 0x110)
 REG32(CPUWAIT, 0x118)
 REG32(BUSWAIT, 0x11c)
 REG32(WICCTRL, 0x120)
@@ -XXX,XX +XXX,XX @@ static uint64_t iotkit_sysctl_read(void *opaque, hwaddr offset,
     case A_GRETREG:
         r = s->gretreg;
         break;
-    case A_INITSVRTOR0:
-        r = s->initsvrtor0;
+    case A_INITSVTOR0:
+        r = s->initsvtor0;
         break;
     case A_CPUWAIT:
         r = s->cpuwait;
@@ -XXX,XX +XXX,XX @@ static void iotkit_sysctl_write(void *opaque, hwaddr offset,
          */
         s->gretreg = value;
         break;
-    case A_INITSVRTOR0:
-        qemu_log_mask(LOG_UNIMP, "IoTKit SysCtl INITSVRTOR0 unimplemented\n");
-        s->initsvrtor0 = value;
+    case A_INITSVTOR0:
+        qemu_log_mask(LOG_UNIMP, "IoTKit SysCtl INITSVTOR0 unimplemented\n");
+        s->initsvtor0 = value;
         break;
     case A_CPUWAIT:
         qemu_log_mask(LOG_UNIMP, "IoTKit SysCtl CPUWAIT unimplemented\n");
@@ -XXX,XX +XXX,XX @@ static void iotkit_sysctl_reset(DeviceState *dev)
     s->reset_syndrome = 1;
     s->reset_mask = 0;
     s->gretreg = 0;
-    s->initsvrtor0 = 0x10000000;
+    s->initsvtor0 = 0x10000000;
     s->cpuwait = 0;
     s->wicctrl = 0;
 }
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription iotkit_sysctl_vmstate = {
         VMSTATE_UINT32(reset_syndrome, IoTKitSysCtl),
         VMSTATE_UINT32(reset_mask, IoTKitSysCtl),
         VMSTATE_UINT32(gretreg, IoTKitSysCtl),
-        VMSTATE_UINT32(initsvrtor0, IoTKitSysCtl),
+        VMSTATE_UINT32(initsvtor0, IoTKitSysCtl),
         VMSTATE_UINT32(cpuwait, IoTKitSysCtl),
         VMSTATE_UINT32(wicctrl, IoTKitSysCtl),
         VMSTATE_END_OF_LIST()
-- 
2.20.1

The SYSCTL block in the SSE-200 has some extra registers that
are not present in the IoTKit version. Add these registers
(as reads-as-written stubs), enabled by a new QOM property.
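
A "reads-as-written stub" here means a register that stores whatever the
guest writes and returns it on read, with no other side effect. The sketch
below shows one such register gated on a variant flag. It is a standalone
illustration, not the device code: ModelSysCtl and the two accessor
functions are hypothetical, and how is_sse200 is derived from SYS_VERSION
is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the device state. */
typedef struct {
    uint32_t initsvtor1; /* one of the SSE-200-only registers */
    bool is_sse200;      /* assumed to be set up from the QOM property */
} ModelSysCtl;

/* Returns false if the register does not exist on this variant. */
static bool model_sysctl_write_initsvtor1(ModelSysCtl *s, uint32_t value)
{
    assert(s);
    if (!s->is_sse200) {
        return false;    /* IoTKit: treated as a bad offset */
    }
    s->initsvtor1 = value; /* stub: store only, no side effect */
    return true;
}

/* Reads return the last value written; false if the register is absent. */
static bool model_sysctl_read_initsvtor1(const ModelSysCtl *s, uint32_t *out)
{
    assert(s && out);
    if (!s->is_sse200) {
        return false;
    }
    *out = s->initsvtor1;
    return true;
}
```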

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190219125808.25174-7-peter.maydell@linaro.org
---
 include/hw/misc/iotkit-sysctl.h |  20 +++
 hw/arm/armsse.c                 |   2 +
 hw/misc/iotkit-sysctl.c         | 245 +++++++++++++++++++++++++++++++-
 3 files changed, 262 insertions(+), 5 deletions(-)

diff --git a/include/hw/misc/iotkit-sysctl.h b/include/hw/misc/iotkit-sysctl.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/misc/iotkit-sysctl.h
+++ b/include/hw/misc/iotkit-sysctl.h
@@ -XXX,XX +XXX,XX @@
  * "system control register" blocks.
  *
  * QEMU interface:
+ *  + QOM property "SYS_VERSION": value of the SYS_VERSION register of the
+ *    system information block of the SSE
+ *    (used to identify whether to provide SSE-200-only registers)
  *  + sysbus MMIO region 0: the system information register bank
  *  + sysbus MMIO region 1: the system control register bank
  */
@@ -XXX,XX +XXX,XX @@ typedef struct IoTKitSysCtl {
     uint32_t initsvtor0;
     uint32_t cpuwait;
     uint32_t wicctrl;
+    uint32_t scsecctrl;
+    uint32_t fclk_div;
+    uint32_t sysclk_div;
+    uint32_t clock_force;
+    uint32_t initsvtor1;
+    uint32_t nmi_enable;
+    uint32_t ewctrl;
+    uint32_t pdcm_pd_sys_sense;
+    uint32_t pdcm_pd_sram0_sense;
+    uint32_t pdcm_pd_sram1_sense;
+    uint32_t pdcm_pd_sram2_sense;
+    uint32_t pdcm_pd_sram3_sense;
+
+    /* Properties */
+    uint32_t sys_version;
+
+    bool is_sse200;
 } IoTKitSysCtl;
 
 #endif
diff --git a/hw/arm/armsse.c b/hw/arm/armsse.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/armsse.c
+++ b/hw/arm/armsse.c
@@ -XXX,XX +XXX,XX @@ static void armsse_realize(DeviceState *dev, Error **errp)
     /* System information registers */
     sysbus_mmio_map(SYS_BUS_DEVICE(&s->sysinfo), 0, 0x40020000);
     /* System control registers */
+    object_property_set_int(OBJECT(&s->sysctl), info->sys_version,
+                            "SYS_VERSION", &err);
     object_property_set_bool(OBJECT(&s->sysctl), true, "realized", &err);
     if (err) {
         error_propagate(errp, err);
diff --git a/hw/misc/iotkit-sysctl.c b/hw/misc/iotkit-sysctl.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/misc/iotkit-sysctl.c
+++ b/hw/misc/iotkit-sysctl.c
@@ -XXX,XX +XXX,XX @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/bitops.h"
 #include "qemu/log.h"
 #include "trace.h"
 #include "qapi/error.h"
@@ -XXX,XX +XXX,XX @@
 REG32(SECDBGSTAT, 0x0)
 REG32(SECDBGSET, 0x4)
 REG32(SECDBGCLR, 0x8)
+REG32(SCSECCTRL, 0xc)
+REG32(FCLK_DIV, 0x10)
+REG32(SYSCLK_DIV, 0x14)
+REG32(CLOCK_FORCE, 0x18)
 REG32(RESET_SYNDROME, 0x100)
 REG32(RESET_MASK, 0x104)
 REG32(SWRESET, 0x108)
     FIELD(SWRESET, SWRESETREQ, 9, 1)
 REG32(GRETREG, 0x10c)
 REG32(INITSVTOR0, 0x110)
+REG32(INITSVTOR1, 0x114)
 REG32(CPUWAIT, 0x118)
-REG32(BUSWAIT, 0x11c)
+REG32(NMI_ENABLE, 0x11c) /* BUSWAIT in IoTKit */
 REG32(WICCTRL, 0x120)
+REG32(EWCTRL, 0x124)
+REG32(PDCM_PD_SYS_SENSE, 0x200)
+REG32(PDCM_PD_SRAM0_SENSE, 0x20c)
+REG32(PDCM_PD_SRAM1_SENSE, 0x210)
+REG32(PDCM_PD_SRAM2_SENSE, 0x214)
+REG32(PDCM_PD_SRAM3_SENSE, 0x218)
 REG32(PID4, 0xfd0)
 REG32(PID5, 0xfd4)
 REG32(PID6, 0xfd8)
@@ -XXX,XX +XXX,XX @@ static uint64_t iotkit_sysctl_read(void *opaque, hwaddr offset,
     case A_SECDBGSTAT:
         r = s->secure_debug;
         break;
+    case A_SCSECCTRL:
+        if (!s->is_sse200) {
+            goto bad_offset;
+        }
+        r = s->scsecctrl;
+        break;
+    case A_FCLK_DIV:
+        if (!s->is_sse200) {
+            goto bad_offset;
+        }
+        r = s->fclk_div;
+        break;
+    case A_SYSCLK_DIV:
+        if (!s->is_sse200) {
+            goto bad_offset;
+        }
+        r = s->sysclk_div;
+        break;
+    case A_CLOCK_FORCE:
+        if (!s->is_sse200) {
+            goto bad_offset;
+        }
+        r = s->clock_force;
+        break;
     case A_RESET_SYNDROME:
         r = s->reset_syndrome;
         break;
@@ -XXX,XX +XXX,XX @@ static uint64_t iotkit_sysctl_read(void *opaque, hwaddr offset,
     case A_INITSVTOR0:
         r = s->initsvtor0;
         break;
+    case A_INITSVTOR1:
+        if (!s->is_sse200) {
+            goto bad_offset;
+        }
+        r = s->initsvtor1;
+        break;
     case A_CPUWAIT:
         r = s->cpuwait;
         break;
-    case A_BUSWAIT:
-        /* In IoTKit BUSWAIT is reserved, R/O, zero */
-        r = 0;
+    case A_NMI_ENABLE:
+        /* In IoTKit this is named BUSWAIT but is marked reserved, R/O, zero */
+        if (!s->is_sse200) {
+            r = 0;
+            break;
+        }
+        r = s->nmi_enable;
         break;
     case A_WICCTRL:
         r = s->wicctrl;
         break;
+    case A_EWCTRL:
+        if (!s->is_sse200) {
+            goto bad_offset;
+        }
+        r = s->ewctrl;
+        break;
+    case A_PDCM_PD_SYS_SENSE:
+        if (!s->is_sse200) {
+            goto bad_offset;
+        }
+        r = s->pdcm_pd_sys_sense;
+        break;
+    case A_PDCM_PD_SRAM0_SENSE:
+        if (!s->is_sse200) {
+            goto bad_offset;
+        }
+        r = s->pdcm_pd_sram0_sense;
+        break;
+    case A_PDCM_PD_SRAM1_SENSE:
+        if (!s->is_sse200) {
+            goto bad_offset;
+        }
+        r = s->pdcm_pd_sram1_sense;
+        break;
+    case A_PDCM_PD_SRAM2_SENSE:
+        if (!s->is_sse200) {
+            goto bad_offset;
+        }
+        r = s->pdcm_pd_sram2_sense;
+        break;
+    case A_PDCM_PD_SRAM3_SENSE:
+        if (!s->is_sse200) {
+            goto bad_offset;
+        }
+        r = s->pdcm_pd_sram3_sense;
+        break;
     case A_PID4 ... A_CID3:
         r = sysctl_id[(offset - A_PID4) / 4];
         break;
@@ -XXX,XX +XXX,XX @@ static uint64_t iotkit_sysctl_read(void *opaque, hwaddr offset,
         r = 0;
         break;
     default:
+    bad_offset:
         qemu_log_mask(LOG_GUEST_ERROR,
                       "IoTKit SysCtl read: bad offset %x\n", (int)offset);
         r = 0;
@@ -XXX,XX +XXX,XX @@ static void iotkit_sysctl_write(void *opaque, hwaddr offset,
             qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET);
         }
         break;
-    case A_BUSWAIT:        /* In IoTKit BUSWAIT is reserved, R/O, zero */
+    case A_SCSECCTRL:
+        if (!s->is_sse200) {
+            goto bad_offset;
+        }
+        qemu_log_mask(LOG_UNIMP, "IoTKit SysCtl SCSECCTRL unimplemented\n");
+        s->scsecctrl = value;
+        break;
+    case A_FCLK_DIV:
+        if (!s->is_sse200) {
+            goto bad_offset;
+        }
+        qemu_log_mask(LOG_UNIMP, "IoTKit SysCtl FCLK_DIV unimplemented\n");
+        s->fclk_div = value;
+        break;
+    case A_SYSCLK_DIV:
+        if (!s->is_sse200) {
+            goto bad_offset;
+        }
+        qemu_log_mask(LOG_UNIMP, "IoTKit SysCtl SYSCLK_DIV unimplemented\n");
+        s->sysclk_div = value;
+        break;
+    case A_CLOCK_FORCE:
+        if (!s->is_sse200) {
+            goto bad_offset;
+        }
+        qemu_log_mask(LOG_UNIMP, "IoTKit SysCtl CLOCK_FORCE unimplemented\n");
+        s->clock_force = value;
+        break;
+    case A_INITSVTOR1:
+        if (!s->is_sse200) {
+            goto bad_offset;
+        }
+        qemu_log_mask(LOG_UNIMP, "IoTKit SysCtl INITSVTOR1 unimplemented\n");
+        s->initsvtor1 = value;
+        break;
+    case A_EWCTRL:
+        if (!s->is_sse200) {
+            goto bad_offset;
+        }
+        qemu_log_mask(LOG_UNIMP, "IoTKit SysCtl EWCTRL unimplemented\n");
+        s->ewctrl = value;
+        break;
+    case A_PDCM_PD_SYS_SENSE:
+        if (!s->is_sse200) {
+            goto bad_offset;
+        }
+        qemu_log_mask(LOG_UNIMP,
+                      "IoTKit SysCtl PDCM_PD_SYS_SENSE unimplemented\n");
+        s->pdcm_pd_sys_sense = value;
+        break;
+    case A_PDCM_PD_SRAM0_SENSE:
+        if (!s->is_sse200) {
+            goto bad_offset;
+        }
+        qemu_log_mask(LOG_UNIMP,
+                      "IoTKit SysCtl PDCM_PD_SRAM0_SENSE unimplemented\n");
+        s->pdcm_pd_sram0_sense = value;
+        break;
+    case A_PDCM_PD_SRAM1_SENSE:
+        if (!s->is_sse200) {
+            goto bad_offset;
+        }
+        qemu_log_mask(LOG_UNIMP,
+                      "IoTKit SysCtl PDCM_PD_SRAM1_SENSE unimplemented\n");
+        s->pdcm_pd_sram1_sense = value;
+        break;
+    case A_PDCM_PD_SRAM2_SENSE:
+        if (!s->is_sse200) {
+            goto bad_offset;
+        }
+        qemu_log_mask(LOG_UNIMP,
+                      "IoTKit SysCtl PDCM_PD_SRAM2_SENSE unimplemented\n");
+        s->pdcm_pd_sram2_sense = value;
+        break;
+    case A_PDCM_PD_SRAM3_SENSE:
+        if (!s->is_sse200) {
+            goto bad_offset;
+        }
+        qemu_log_mask(LOG_UNIMP,
+                      "IoTKit SysCtl PDCM_PD_SRAM3_SENSE unimplemented\n");
+        s->pdcm_pd_sram3_sense = value;
+        break;
+    case A_NMI_ENABLE:
+        /* In IoTKit this is BUSWAIT: reserved, R/O, zero */
+        if (!s->is_sse200) {
+            goto ro_offset;
+        }
+        qemu_log_mask(LOG_UNIMP, "IoTKit SysCtl NMI_ENABLE unimplemented\n");
+        s->nmi_enable = value;
+        break;
     case A_SECDBGSTAT:
     case A_PID4 ... A_CID3:
+    ro_offset:
         qemu_log_mask(LOG_GUEST_ERROR,
                       "IoTKit SysCtl write: write of RO offset %x\n",
                       (int)offset);
         break;
     default:
+    bad_offset:
         qemu_log_mask(LOG_GUEST_ERROR,
                       "IoTKit SysCtl write: bad offset %x\n", (int)offset);
         break;
@@ -XXX,XX +XXX,XX @@ static void iotkit_sysctl_reset(DeviceState *dev)
     s->reset_mask = 0;
     s->gretreg = 0;
     s->initsvtor0 = 0x10000000;
+    s->initsvtor1 = 0x10000000;
     s->cpuwait = 0;
     s->wicctrl = 0;
+    s->scsecctrl = 0;
+    s->fclk_div = 0;
+    s->sysclk_div = 0;
+    s->clock_force = 0;
+    s->nmi_enable = 0;
+    s->ewctrl = 0;
+    s->pdcm_pd_sys_sense = 0x7f;
+    s->pdcm_pd_sram0_sense = 0;
+    s->pdcm_pd_sram1_sense = 0;
+    s->pdcm_pd_sram2_sense = 0;
+    s->pdcm_pd_sram3_sense = 0;
 }
 
 static void iotkit_sysctl_init(Object *obj)
@@ -XXX,XX +XXX,XX @@ static void iotkit_sysctl_init(Object *obj)
     sysbus_init_mmio(sbd, &s->iomem);
 }
 
+static void iotkit_sysctl_realize(DeviceState *dev, Error **errp)
+{
+    IoTKitSysCtl *s = IOTKIT_SYSCTL(dev);
+
+    /* The top 4 bits of the SYS_VERSION register tell us if we're an SSE-200 */
+    if (extract32(s->sys_version, 28, 4) == 2) {
+        s->is_sse200 = true;
+    }
+}
+
+static bool sse200_needed(void *opaque)
+{
+    IoTKitSysCtl *s = IOTKIT_SYSCTL(opaque);
+
+    return s->is_sse200;
+}
+
+static const VMStateDescription iotkit_sysctl_sse200_vmstate = {
+    .name = "iotkit-sysctl/sse-200",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .needed = sse200_needed,
+    .fields = (VMStateField[]) {
+        VMSTATE_UINT32(scsecctrl, IoTKitSysCtl),
+        VMSTATE_UINT32(fclk_div, IoTKitSysCtl),
+        VMSTATE_UINT32(sysclk_div, IoTKitSysCtl),
+        VMSTATE_UINT32(clock_force, IoTKitSysCtl),
+        VMSTATE_UINT32(initsvtor1, IoTKitSysCtl),
+        VMSTATE_UINT32(nmi_enable, IoTKitSysCtl),
+        VMSTATE_UINT32(pdcm_pd_sys_sense, IoTKitSysCtl),
+        VMSTATE_UINT32(pdcm_pd_sram0_sense, IoTKitSysCtl),
+        VMSTATE_UINT32(pdcm_pd_sram1_sense, IoTKitSysCtl),
+        VMSTATE_UINT32(pdcm_pd_sram2_sense, IoTKitSysCtl),
+        VMSTATE_UINT32(pdcm_pd_sram3_sense, IoTKitSysCtl),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
 static const VMStateDescription iotkit_sysctl_vmstate = {
     .name = "iotkit-sysctl",
     .version_id = 1,
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription iotkit_sysctl_vmstate = {
         VMSTATE_UINT32(cpuwait, IoTKitSysCtl),
         VMSTATE_UINT32(wicctrl, IoTKitSysCtl),
         VMSTATE_END_OF_LIST()
+    },
+    .subsections = (const VMStateDescription*[]) {
+        &iotkit_sysctl_sse200_vmstate,
+        NULL
     }
 };
 
+static Property iotkit_sysctl_props[] = {
+    DEFINE_PROP_UINT32("SYS_VERSION", IoTKitSysCtl, sys_version, 0),
+    DEFINE_PROP_END_OF_LIST()
+};
+
 static void iotkit_sysctl_class_init(ObjectClass *klass, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(klass);
 
     dc->vmsd = &iotkit_sysctl_vmstate;
     dc->reset = iotkit_sysctl_reset;
+    dc->props = iotkit_sysctl_props;
+    dc->realize = iotkit_sysctl_realize;
 }
 
 static const TypeInfo iotkit_sysctl_info = {
-- 
2.20.1

The CPUWAIT register acts as a sort of power-control: if a CPU's
bit in it is 1 then that CPU is held waiting when the system comes
out of reset (which in QEMU we model as the CPU starting powered
off). Writing a 0 to the bit allows the CPU to boot (in QEMU we
model this as powering it on). Note that writing a 1 to a bit
does not power off a running CPU.

For this to work correctly we need to also honour the
INITSVTOR* registers, which let the guest control where the
CPU will load its SP and PC from when it comes out of reset.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190219125808.25174-8-peter.maydell@linaro.org
---
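The write-side rule described above (only a 1 -> 0 transition of a
CPUWAIT bit releases a CPU) can be sketched on its own as below.
This is an illustration under assumptions, not the patch's code: the
patch checks bits 0 and 1 explicitly, whereas the hypothetical helper
cpuwait_released() generalizes to num_cpus bits and only computes the
mask, leaving the actual power-on call to the caller.

```c
#include <stdint.h>

/*
 * Return a bitmask of the CPUs released by a CPUWAIT write: bits that
 * were 1 in the old register value and are 0 in the newly written one.
 * Bits at or above num_cpus are ignored. Writing a 1 to a bit never
 * appears in the result, so it cannot power a CPU off.
 */
static uint32_t cpuwait_released(uint32_t old_val, uint32_t new_val,
                                 unsigned num_cpus)
{
    uint32_t cpu_mask = (num_cpus >= 32) ? ~0U : ((1U << num_cpus) - 1);

    return old_val & ~new_val & cpu_mask;
}
```

A caller would then power on each CPU whose bit is set in the
returned mask before storing the new register value.
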
 hw/misc/iotkit-sysctl.c | 41 +++++++++++++++++++++++++++++++++++++----
 1 file changed, 37 insertions(+), 4 deletions(-)

diff --git a/hw/misc/iotkit-sysctl.c b/hw/misc/iotkit-sysctl.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/misc/iotkit-sysctl.c
+++ b/hw/misc/iotkit-sysctl.c
@@ -XXX,XX +XXX,XX @@
 #include "hw/sysbus.h"
 #include "hw/registerfields.h"
 #include "hw/misc/iotkit-sysctl.h"
+#include "target/arm/arm-powerctl.h"
+#include "target/arm/cpu.h"
 
 REG32(SECDBGSTAT, 0x0)
 REG32(SECDBGSET, 0x4)
@@ -XXX,XX +XXX,XX @@ static const int sysctl_id[] = {
     0x0d, 0xf0, 0x05, 0xb1, /* CID0..CID3 */
 };
 
+/*
+ * Set the initial secure vector table offset address for the core.
+ * This will take effect when the CPU next resets.
+ */
+static void set_init_vtor(uint64_t cpuid, uint32_t vtor)
+{
+    Object *cpuobj = OBJECT(arm_get_cpu_by_id(cpuid));
+
+    if (cpuobj) {
+        if (object_property_find(cpuobj, "init-svtor", NULL)) {
+            object_property_set_uint(cpuobj, vtor, "init-svtor", &error_abort);
+        }
+    }
+}
+
 static uint64_t iotkit_sysctl_read(void *opaque, hwaddr offset,
                                     unsigned size)
 {
@@ -XXX,XX +XXX,XX @@ static void iotkit_sysctl_write(void *opaque, hwaddr offset,
         s->gretreg = value;
         break;
     case A_INITSVTOR0:
-        qemu_log_mask(LOG_UNIMP, "IoTKit SysCtl INITSVTOR0 unimplemented\n");
         s->initsvtor0 = value;
+        set_init_vtor(0, s->initsvtor0);
         break;
     case A_CPUWAIT:
-        qemu_log_mask(LOG_UNIMP, "IoTKit SysCtl CPUWAIT unimplemented\n");
+        if ((s->cpuwait & 1) && !(value & 1)) {
+            /* Powering up CPU 0 */
+            arm_set_cpu_on_and_reset(0);
+        }
+        if ((s->cpuwait & 2) && !(value & 2)) {
+            /* Powering up CPU 1 */
+            arm_set_cpu_on_and_reset(1);
+        }
         s->cpuwait = value;
         break;
     case A_WICCTRL:
@@ -XXX,XX +XXX,XX @@ static void iotkit_sysctl_write(void *opaque, hwaddr offset,
         if (!s->is_sse200) {
             goto bad_offset;
         }
-        qemu_log_mask(LOG_UNIMP, "IoTKit SysCtl INITSVTOR1 unimplemented\n");
         s->initsvtor1 = value;
+        set_init_vtor(1, s->initsvtor1);
         break;
     case A_EWCTRL:
         if (!s->is_sse200) {
@@ -XXX,XX +XXX,XX @@ static void iotkit_sysctl_reset(DeviceState *dev)
     s->gretreg = 0;
     s->initsvtor0 = 0x10000000;
     s->initsvtor1 = 0x10000000;
-    s->cpuwait = 0;
+    if (s->is_sse200) {
+        /*
+         * CPU 0 starts on, CPU 1 starts off. In real hardware this is
+         * configurable by the SoC integrator as a verilog parameter.
+         */
+        s->cpuwait = 2;
+    } else {
+        /* CPU 0 starts on */
+        s->cpuwait = 0;
+    }
     s->wicctrl = 0;
     s->scsecctrl = 0;
     s->fclk_div = 0;
-- 
2.20.1

At the moment the handling of the init-svtor and cpuwait initial
values is split between armsse.c and iotkit-sysctl.c. The code
in armsse.c sets the initial state of the CPU object via its
init-svtor and start-powered-off properties, while iotkit-sysctl.c
separately hard-codes the reset values of its own registers
(which are then used to update the CPU when the guest makes
runtime changes).

Clean this up by making the armsse.c code set properties on the
iotkit-sysctl object to define the initial values of the
registers, so they always match the initial CPU state,
and update the comments in armsse.c accordingly.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190219125808.25174-9-peter.maydell@linaro.org
---
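To make the single-source-of-truth idea above concrete, here is a
small standalone sketch. SysCtlModel is a hypothetical, cut-down
model and not the real IoTKitSysCtl struct; the *_rst fields play the
role of the CPUWAIT_RST and INITSVTOR*_RST properties, and
cpu_starts_powered_off() mirrors the per-CPU bit test that armsse.c
applies to info->cpuwait_rst.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cut-down register model; not the QEMU struct. */
typedef struct {
    /* Guest-visible register state */
    uint32_t cpuwait;
    uint32_t initsvtor0;
    uint32_t initsvtor1;
    /* Reset values, set once by the container before realize */
    uint32_t cpuwait_rst;
    uint32_t initsvtor0_rst;
    uint32_t initsvtor1_rst;
} SysCtlModel;

/* Reset copies the configured reset values rather than hard-coding them. */
static void sysctl_model_reset(SysCtlModel *s)
{
    s->initsvtor0 = s->initsvtor0_rst;
    s->initsvtor1 = s->initsvtor1_rst;
    s->cpuwait = s->cpuwait_rst;
}

/* A CPU starts powered off if its bit in the CPUWAIT reset value is 1. */
static bool cpu_starts_powered_off(uint32_t cpuwait_rst, unsigned cpu)
{
    return cpu < 32 && ((cpuwait_rst >> cpu) & 1);
}
```

Because both the CPU's initial power state and the register reset
value derive from the same cpuwait_rst, the two cannot drift apart.
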
 include/hw/misc/iotkit-sysctl.h |  3 ++
 hw/arm/armsse.c                 | 49 +++++++++++++++++++++------------
 hw/misc/iotkit-sysctl.c         | 20 ++++++--------
 3 files changed, 42 insertions(+), 30 deletions(-)

diff --git a/include/hw/misc/iotkit-sysctl.h b/include/hw/misc/iotkit-sysctl.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/misc/iotkit-sysctl.h
+++ b/include/hw/misc/iotkit-sysctl.h
@@ -XXX,XX +XXX,XX @@ typedef struct IoTKitSysCtl {
 
     /* Properties */
     uint32_t sys_version;
+    uint32_t cpuwait_rst;
+    uint32_t initsvtor0_rst;
+    uint32_t initsvtor1_rst;
 
     bool is_sse200;
 } IoTKitSysCtl;
diff --git a/hw/arm/armsse.c b/hw/arm/armsse.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/armsse.c
+++ b/hw/arm/armsse.c
@@ -XXX,XX +XXX,XX @@
 
 #include "qemu/osdep.h"
 #include "qemu/log.h"
+#include "qemu/bitops.h"
 #include "qapi/error.h"
 #include "trace.h"
 #include "hw/sysbus.h"
@@ -XXX,XX +XXX,XX @@ struct ARMSSEInfo {
     int sram_banks;
     int num_cpus;
     uint32_t sys_version;
+    uint32_t cpuwait_rst;
     SysConfigFormat sys_config_format;
     bool has_mhus;
     bool has_ppus;
@@ -XXX,XX +XXX,XX @@ static const ARMSSEInfo armsse_variants[] = {
         .sram_banks = 1,
         .num_cpus = 1,
         .sys_version = 0x41743,
+        .cpuwait_rst = 0,
         .sys_config_format = IoTKitFormat,
         .has_mhus = false,
         .has_ppus = false,
@@ -XXX,XX +XXX,XX @@ static const ARMSSEInfo armsse_variants[] = {
         .sram_banks = 4,
         .num_cpus = 2,
         .sys_version = 0x22041743,
+        .cpuwait_rst = 2,
         .sys_config_format = SSE200Format,
         .has_mhus = true,
         .has_ppus = true,
@@ -XXX,XX +XXX,XX @@ static void armsse_realize(DeviceState *dev, Error **errp)
 
         qdev_prop_set_uint32(cpudev, "num-irq", s->exp_numirq + 32);
         /*
-         * In real hardware the initial Secure VTOR is set from the INITSVTOR0
-         * register in the IoT Kit System Control Register block, and the
-         * initial value of that is in turn specifiable by the FPGA that
-         * instantiates the IoT Kit. In QEMU we don't implement this wrinkle,
-         * and simply set the CPU's init-svtor to the IoT Kit default value.
-         * In SSE-200 the situation is similar, except that the default value
-         * is a reset-time signal input. Typically a board using the SSE-200
-         * will have a system control processor whose boot firmware initializes
-         * the INITSVTOR* registers before powering up the CPUs in any case,
-         * so the hardware's default value doesn't matter. QEMU doesn't emulate
+         * In real hardware the initial Secure VTOR is set from the INITSVTOR*
+         * registers in the IoT Kit System Control Register block. In QEMU
+         * we set the initial value here, and also the reset value of the
+         * sysctl register, from this object's QOM init-svtor property.
+         * If the guest changes the INITSVTOR* registers at runtime then the
+         * code in iotkit-sysctl.c will update the CPU init-svtor property
+         * (which will then take effect on the next CPU warm-reset).
+         *
+         * Note that typically a board using the SSE-200 will have a system
+         * control processor whose boot firmware initializes the INITSVTOR*
+         * registers before powering up the CPUs. QEMU doesn't emulate
          * the control processor, so instead we behave in the way that the
-         * firmware does. The initial value is configurable by the board code
-         * to match whatever its firmware does.
+         * firmware does: the initial value should be set by the board code
+         * (using the init-svtor property on the ARMSSE object) to match
+         * whatever its firmware does.
          */
         qdev_prop_set_uint32(cpudev, "init-svtor", s->init_svtor);
         /*
-         * Start all CPUs except CPU0 powered down. In real hardware it is
-         * a configurable property of the SSE-200 which CPUs start powered up
-         * (via the CPUWAIT0_RST and CPUWAIT1_RST parameters), but since all
-         * the boards we care about start CPU0 and leave CPU1 powered off,
-         * we hard-code that for now. We can add QOM properties for this
+         * CPUs start powered down if the corresponding bit in the CPUWAIT
+         * register is 1. In real hardware the CPUWAIT register reset value is
+         * a configurable property of the SSE-200 (via the CPUWAIT0_RST and
+         * CPUWAIT1_RST parameters), but since all the boards we care about
+         * start CPU0 and leave CPU1 powered off, we hard-code that in
+         * info->cpuwait_rst for now. We can add QOM properties for this
          * later if necessary.
          */
-        if (i > 0) {
+        if (extract32(info->cpuwait_rst, i, 1)) {
             object_property_set_bool(cpuobj, true, "start-powered-off", &err);
             if (err) {
                 error_propagate(errp, err);
@@ -XXX,XX +XXX,XX @@ static void armsse_realize(DeviceState *dev, Error **errp)
     /* System control registers */
     object_property_set_int(OBJECT(&s->sysctl), info->sys_version,
                             "SYS_VERSION", &err);
+    object_property_set_int(OBJECT(&s->sysctl), info->cpuwait_rst,
+                            "CPUWAIT_RST", &err);
+    object_property_set_int(OBJECT(&s->sysctl), s->init_svtor,
+                            "INITSVTOR0_RST", &err);
+    object_property_set_int(OBJECT(&s->sysctl), s->init_svtor,
+                            "INITSVTOR1_RST", &err);
     object_property_set_bool(OBJECT(&s->sysctl), true, "realized", &err);
     if (err) {
         error_propagate(errp, err);
diff --git a/hw/misc/iotkit-sysctl.c b/hw/misc/iotkit-sysctl.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/misc/iotkit-sysctl.c
+++ b/hw/misc/iotkit-sysctl.c
@@ -XXX,XX +XXX,XX @@ static void iotkit_sysctl_reset(DeviceState *dev)
     s->reset_syndrome = 1;
     s->reset_mask = 0;
     s->gretreg = 0;
-    s->initsvtor0 = 0x10000000;
-    s->initsvtor1 = 0x10000000;
-    if (s->is_sse200) {
-        /*
-         * CPU 0 starts on, CPU 1 starts off. In real hardware this is
-         * configurable by the SoC integrator as a verilog parameter.
-         */
-        s->cpuwait = 2;
-    } else {
-        /* CPU 0 starts on */
-        s->cpuwait = 0;
-    }
+    s->initsvtor0 = s->initsvtor0_rst;
+    s->initsvtor1 = s->initsvtor1_rst;
+    s->cpuwait = s->cpuwait_rst;
     s->wicctrl = 0;
     s->scsecctrl = 0;
     s->fclk_div = 0;
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription iotkit_sysctl_vmstate = {
 
 static Property iotkit_sysctl_props[] = {
     DEFINE_PROP_UINT32("SYS_VERSION", IoTKitSysCtl, sys_version, 0),
+    DEFINE_PROP_UINT32("CPUWAIT_RST", IoTKitSysCtl, cpuwait_rst, 0),
+    DEFINE_PROP_UINT32("INITSVTOR0_RST", IoTKitSysCtl, initsvtor0_rst,
+                       0x10000000),
+    DEFINE_PROP_UINT32("INITSVTOR1_RST", IoTKitSysCtl, initsvtor1_rst,
+                       0x10000000),
     DEFINE_PROP_END_OF_LIST()
 };
 
-- 
2.20.1

Instead of gating the A32/T32 FP16 conversion instructions on
the ARM_FEATURE_VFP_FP16 flag, switch to our new approach of
looking at ID register bits. In this case MVFR1 fields FPHP
and SIMDHP indicate the presence of these insns.

This change doesn't alter behaviour for any of our CPUs.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190222170936.13268-2-peter.maydell@linaro.org
---
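The ID-register gating described above boils down to reading one
4-bit field. The sketch below is standalone and uses hypothetical
helper names (mvfr1_fphp(), has_fp16_spconv(), has_fp16_dpconv())
rather than QEMU's FIELD_EX64 accessors. The field position (bits
[27:24]) and the thresholds (> 0 for single-precision conversions,
> 1 for double-precision) are taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* MVFR1.FPHP occupies bits [27:24], matching FIELD(MVFR1, FPHP, 24, 4). */
static uint32_t mvfr1_fphp(uint32_t mvfr1)
{
    return (mvfr1 >> 24) & 0xf;
}

/* Half <-> single precision conversions: FPHP > 0. */
static bool has_fp16_spconv(uint32_t mvfr1)
{
    return mvfr1_fphp(mvfr1) > 0;
}

/* Half <-> double precision conversions: FPHP > 1. */
static bool has_fp16_dpconv(uint32_t mvfr1)
{
    return mvfr1_fphp(mvfr1) > 1;
}
```

For example, a made-up MVFR1 value of 0x01000000 (FPHP == 1) enables
only the single-precision conversions, while 0x02000000 (FPHP == 2)
enables both.
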
 target/arm/cpu.h       | 37 ++++++++++++++++++++++++++++++++++++-
 target/arm/cpu.c       |  2 --
 target/arm/kvm32.c     |  3 ---
 target/arm/translate.c | 26 ++++++++++++++++++--------
 4 files changed, 54 insertions(+), 14 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ FIELD(ID_DFR0, MPROFDBG, 20, 4)
 FIELD(ID_DFR0, PERFMON, 24, 4)
 FIELD(ID_DFR0, TRACEFILT, 28, 4)
 
+FIELD(MVFR0, SIMDREG, 0, 4)
+FIELD(MVFR0, FPSP, 4, 4)
+FIELD(MVFR0, FPDP, 8, 4)
+FIELD(MVFR0, FPTRAP, 12, 4)
+FIELD(MVFR0, FPDIVIDE, 16, 4)
+FIELD(MVFR0, FPSQRT, 20, 4)
+FIELD(MVFR0, FPSHVEC, 24, 4)
+FIELD(MVFR0, FPROUND, 28, 4)
+
+FIELD(MVFR1, FPFTZ, 0, 4)
+FIELD(MVFR1, FPDNAN, 4, 4)
+FIELD(MVFR1, SIMDLS, 8, 4)
+FIELD(MVFR1, SIMDINT, 12, 4)
+FIELD(MVFR1, SIMDSP, 16, 4)
+FIELD(MVFR1, SIMDHP, 20, 4)
+FIELD(MVFR1, FPHP, 24, 4)
+FIELD(MVFR1, SIMDFMAC, 28, 4)
+
+FIELD(MVFR2, SIMDMISC, 0, 4)
+FIELD(MVFR2, FPMISC, 4, 4)
+
 QEMU_BUILD_BUG_ON(ARRAY_SIZE(((ARMCPU *)0)->ccsidr) <= R_V7M_CSSELR_INDEX_MASK);
 
 /* If adding a feature bit which corresponds to a Linux ELF
@@ -XXX,XX +XXX,XX @@ enum arm_features {
     ARM_FEATURE_THUMB2,
     ARM_FEATURE_PMSA,   /* no MMU; may have Memory Protection Unit */
     ARM_FEATURE_VFP3,
-    ARM_FEATURE_VFP_FP16,
     ARM_FEATURE_NEON,
     ARM_FEATURE_M, /* Microcontroller profile.  */
     ARM_FEATURE_OMAPCP, /* OMAP specific CP15 ops handling.  */
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa32_fp16_arith(const ARMISARegisters *id)
     return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, FP) == 1;
 }
 
+/*
+ * We always set the FP and SIMD FP16 fields to indicate identical
+ * levels of support (assuming SIMD is implemented at all), so
+ * we only need one set of accessors.
+ */
+static inline bool isar_feature_aa32_fp16_spconv(const ARMISARegisters *id)
+{
+    return FIELD_EX64(id->mvfr1, MVFR1, FPHP) > 0;
+}
+
+static inline bool isar_feature_aa32_fp16_dpconv(const ARMISARegisters *id)
+{
+    return FIELD_EX64(id->mvfr1, MVFR1, FPHP) > 1;
+}
+
 /*
  * 64-bit feature tests via id registers.
  */
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
     }
     if (arm_feature(env, ARM_FEATURE_VFP4)) {
         set_feature(env, ARM_FEATURE_VFP3);
-        set_feature(env, ARM_FEATURE_VFP_FP16);
     }
     if (arm_feature(env, ARM_FEATURE_VFP3)) {
         set_feature(env, ARM_FEATURE_VFP);
@@ -XXX,XX +XXX,XX @@ static void cortex_a9_initfn(Object *obj)
     cpu->dtb_compatible = "arm,cortex-a9";
     set_feature(&cpu->env, ARM_FEATURE_V7);
     set_feature(&cpu->env, ARM_FEATURE_VFP3);
-    set_feature(&cpu->env, ARM_FEATURE_VFP_FP16);
     set_feature(&cpu->env, ARM_FEATURE_NEON);
     set_feature(&cpu->env, ARM_FEATURE_THUMB2EE);
     set_feature(&cpu->env, ARM_FEATURE_EL3);
diff --git a/target/arm/kvm32.c b/target/arm/kvm32.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/kvm32.c
+++ b/target/arm/kvm32.c
@@ -XXX,XX +XXX,XX @@ bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
     if (extract32(id_pfr0, 12, 4) == 1) {
         set_feature(&features, ARM_FEATURE_THUMB2EE);
     }
-    if (extract32(ahcf->isar.mvfr1, 20, 4) == 1) {
-        set_feature(&features, ARM_FEATURE_VFP_FP16);
-    }
     if (extract32(ahcf->isar.mvfr1, 12, 4) == 1) {
         set_feature(&features, ARM_FEATURE_NEON);
     }
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
                      * UNPREDICTABLE if bit 8 is set prior to ARMv8
                      * (we choose to UNDEF)
                      */
-                    if ((dp && !arm_dc_feature(s, ARM_FEATURE_V8)) ||
-                        !arm_dc_feature(s, ARM_FEATURE_VFP_FP16)) {
-                        return 1;
+                    if (dp) {
+                        if (!dc_isar_feature(aa32_fp16_dpconv, s)) {
+                            return 1;
+                        }
+                    } else {
+                        if (!dc_isar_feature(aa32_fp16_spconv, s)) {
+                            return 1;
+                        }
                     }
                     rm_is_dp = false;
                     break;
                 case 0x06: /* vcvtb.f16.f32, vcvtb.f16.f64 */
                 case 0x07: /* vcvtt.f16.f32, vcvtt.f16.f64 */
-                    if ((dp && !arm_dc_feature(s, ARM_FEATURE_V8)) ||
-                        !arm_dc_feature(s, ARM_FEATURE_VFP_FP16)) {
-                        return 1;
+                    if (dp) {
+                        if (!dc_isar_feature(aa32_fp16_dpconv, s)) {
+                            return 1;
+                        }
+                    } else {
+                        if (!dc_isar_feature(aa32_fp16_spconv, s)) {
+                            return 1;
+                        }
                     }
                     rd_is_dp = false;
                     break;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                     TCGv_ptr fpst;
                     TCGv_i32 ahp;
 
-                    if (!arm_dc_feature(s, ARM_FEATURE_VFP_FP16) ||
+                    if (!dc_isar_feature(aa32_fp16_spconv, s) ||
                         q || (rm & 1)) {
                         return 1;
                     }
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                 {
                     TCGv_ptr fpst;
                     TCGv_i32 ahp;
-                    if (!arm_dc_feature(s, ARM_FEATURE_VFP_FP16) ||
+                    if (!dc_isar_feature(aa32_fp16_spconv, s) ||
                         q || (rd & 1)) {
                         return 1;
                     }
-- 
2.20.1

There is a set of VFP instructions which we implement in
disas_vfp_v8_insn() and gate on the ARM_FEATURE_V8 bit.
These were all first introduced in v8 for A-profile, but in
M-profile they appeared in v7M. Gate them on the MVFR2
FPMisc field instead, and rename the function appropriately.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190222170936.13268-3-peter.maydell@linaro.org
---
 target/arm/cpu.h       | 20 ++++++++++++++++++++
 target/arm/translate.c | 25 +++++++++++++------------
 2 files changed, 33 insertions(+), 12 deletions(-)

This reverts commit 823e1b3818f9b10b824ddcd756983b6e2fa68730,
which introduced a regression when running EDK2 guest firmware
under KVM:

error: kvm run failed Function not implemented
 PC=000000013f5a6208 X00=00000000404003c4 X01=000000000000003a
X02=0000000000000000 X03=00000000404003c4 X04=0000000000000000
X05=0000000096000046 X06=000000013d2ef270 X07=000000013e3d1710
X08=09010755ffaf8ba8 X09=ffaf8b9cfeeb5468 X10=feeb546409010756
X11=09010757ffaf8b90 X12=feeb50680903068b X13=090306a1ffaf8bc0
X14=0000000000000000 X15=0000000000000000 X16=000000013f872da0
X17=00000000ffffa6ab X18=0000000000000000 X19=000000013f5a92d0
X20=000000013f5a7a78 X21=000000000000003a X22=000000013f5a7ab2
X23=000000013f5a92e8 X24=000000013f631090 X25=0000000000000010
X26=0000000000000100 X27=000000013f89501b X28=000000013e3d14e0
X29=000000013e3d12a0 X30=000000013f5a2518  SP=000000013b7be0b0
PSTATE=404003c4 -Z-- EL1t

with
[ 3507.926571] kvm [35042]: load/store instruction decoding not implemented
in the host dmesg.

Revert the change for the moment until we can investigate the
cause of the regression.

Reported-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h     |  9 +--------
 target/arm/helper.c  | 27 ++-------------------------
 target/arm/kvm32.c   | 20 ++++++++++++++++++--
 target/arm/kvm64.c   |  2 --
 target/arm/machine.c |  2 +-
 5 files changed, 22 insertions(+), 38 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ bool write_list_to_cpustate(ARMCPU *cpu);
 /**
  * write_cpustate_to_list:
  * @cpu: ARMCPU
- * @kvm_sync: true if this is for syncing back to KVM
  *
  * For each register listed in the ARMCPU cpreg_indexes list, write
  * its value from the ARMCPUState structure into the cpreg_values list.
  * This is used to copy info from TCG's working data structures into
  * KVM or for outbound migration.
  *
- * @kvm_sync is true if we are doing this in order to sync the
- * register state back to KVM. In this case we will only update
- * values in the list if the previous list->cpustate sync actually
- * successfully wrote the CPU state. Otherwise we will keep the value
- * that is in the list.
- *
  * Returns: true if all register values were read correctly,
  * false if some register was unknown or could not be read.
  * Note that we do not stop early on failure -- we will attempt
  * reading all registers in the list.
  */
-bool write_cpustate_to_list(ARMCPU *cpu, bool kvm_sync);
+bool write_cpustate_to_list(ARMCPU *cpu);
 
 #define ARM_CPUID_TI915T      0x54029152
 #define ARM_CPUID_TI925T      0x54029252
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static bool raw_accessors_invalid(const ARMCPRegInfo *ri)
     return true;
 }
 
-bool write_cpustate_to_list(ARMCPU *cpu, bool kvm_sync)
+bool write_cpustate_to_list(ARMCPU *cpu)
 {
     /* Write the coprocessor state from cpu->env to the (index,value) list. */
     int i;
@@ -XXX,XX +XXX,XX @@ bool write_cpustate_to_list(ARMCPU *cpu, bool kvm_sync)
     for (i = 0; i < cpu->cpreg_array_len; i++) {
         uint32_t regidx = kvm_to_cpreg_id(cpu->cpreg_indexes[i]);
         const ARMCPRegInfo *ri;
-        uint64_t newval;
 
         ri = get_arm_cp_reginfo(cpu->cp_regs, regidx);
         if (!ri) {
@@ -XXX,XX +XXX,XX @@ bool write_cpustate_to_list(ARMCPU *cpu, bool kvm_sync)
         if (ri->type & ARM_CP_NO_RAW) {
             continue;
         }
-
-        newval = read_raw_cp_reg(&cpu->env, ri);
-        if (kvm_sync) {
-            /*
-             * Only sync if the previous list->cpustate sync succeeded.
-             * Rather than tracking the success/failure state for every
-             * item in the list, we just recheck "does the raw write we must
-             * have made in write_list_to_cpustate() read back OK" here.
-             */
-            uint64_t oldval = cpu->cpreg_values[i];
-
-            if (oldval == newval) {
-                continue;
-            }
-
-            write_raw_cp_reg(&cpu->env, ri, oldval);
-            if (read_raw_cp_reg(&cpu->env, ri) != oldval) {
-                continue;
-            }
-
-            write_raw_cp_reg(&cpu->env, ri, newval);
-        }
-        cpu->cpreg_values[i] = newval;
+        cpu->cpreg_values[i] = read_raw_cp_reg(&cpu->env, ri);
     }
     return ok;
 }
diff --git a/target/arm/kvm32.c b/target/arm/kvm32.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/kvm32.c
+++ b/target/arm/kvm32.c
@@ -XXX,XX +XXX,XX @@ int kvm_arch_put_registers(CPUState *cs, int level)
         return ret;
     }
 
-    write_cpustate_to_list(cpu, true);
-
+    /* Note that we do not call write_cpustate_to_list()
+     * here, so we are only writing the tuple list back to
+     * KVM. This is safe because nothing can change the
+     * CPUARMState cp15 fields (in particular gdb accesses cannot)
+     * and so there are no changes to sync. In fact syncing would
+     * be wrong at this point: for a constant register where TCG and
+     * KVM disagree about its value, the preceding write_list_to_cpustate()
+     * would not have had any effect on the CPUARMState value (since the
+     * register is read-only), and a write_cpustate_to_list() here would
+     * then try to write the TCG value back into KVM -- this would either
+     * fail or incorrectly change the value the guest sees.
+     *
+     * If we ever want to allow the user to modify cp15 registers via
+     * the gdb stub, we would need to be more clever here (for instance
+     * tracking the set of registers kvm_arch_get_registers() successfully
+     * managed to update the CPUARMState with, and only allowing those
+     * to be written back up into the kernel).
+     */
     if (!write_list_to_kvmstate(cpu, level)) {
         return EINVAL;
     }
diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/kvm64.c
+++ b/target/arm/kvm64.c
@@ -XXX,XX +XXX,XX @@ int kvm_arch_put_registers(CPUState *cs, int level)
         return ret;
     }
 
-    write_cpustate_to_list(cpu, true);
-
     if (!write_list_to_kvmstate(cpu, level)) {
         return EINVAL;
     }
diff --git a/target/arm/machine.c b/target/arm/machine.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/machine.c
+++ b/target/arm/machine.c
@@ -XXX,XX +XXX,XX @@ static int cpu_pre_save(void *opaque)
             abort();
         }
     } else {
-        if (!write_cpustate_to_list(cpu, false)) {
+        if (!write_cpustate_to_list(cpu)) {
             /* This should never fail. */
             abort();
         }
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

Note that float16_to_float32 rightly squashes SNaN to QNaN.
But of course pickNaNMulAdd, for ARM, selects SNaNs first.
So we have to preserve SNaN long enough for the correct NaN
to be selected.  Thus float16_to_float32_by_bits.
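
To make the "preserve SNaN" point concrete, the following standalone
sketch mirrors the unpack and repack logic of
float16_to_float32_by_bits from the patch below, using plain shifts
in place of extract32() and a simple loop in place of clz32(). The
f16_to_f32_bits and clz32_sketch names are hypothetical, chosen only
for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Count leading zeros of a non-zero 32-bit value. */
static int clz32_sketch(uint32_t x)
{
    int n = 0;
    while (!(x & 0x80000000u)) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Widen float16 bits to float32 bits without touching the NaN payload. */
static uint32_t f16_to_f32_bits(uint32_t f16, bool fz16)
{
    uint32_t sign = (f16 >> 15) & 1;
    uint32_t exp = (f16 >> 10) & 0x1f;
    uint32_t frac = f16 & 0x3ff;

    if (exp == 0x1f) {
        exp = 0xff;                 /* Inf or NaN: fraction kept as-is */
    } else if (exp == 0) {
        if (frac != 0) {
            if (fz16) {
                frac = 0;           /* flush the denormal input to zero */
            } else {
                /* Normalize: move the top set bit up, then drop it. */
                int shift = clz32_sketch(frac) - 21;
                frac = (frac << shift) & 0x3ff;
                exp = 127 - 15 - shift + 1;
            }
        }
    } else {
        exp += 127 - 15;            /* normal number: rebias only */
    }
    return (sign << 31) | (exp << 23) | (frac << 13);
}
```

For the signalling NaN 0x7d00 this yields 0x7fa00000, which still has
the quiet bit (bit 22) clear, whereas a conversion that quiets NaNs
would set it. That is what lets the later muladd see the SNaN and
pick it first.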

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190219222952.22183-2-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.h     |   9 +++
 target/arm/vec_helper.c | 148 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 157 insertions(+)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_sqsub_s, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(gvec_sqsub_d, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_5(gvec_fmlal_a32, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fmlal_a64, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fmlal_idx_a32, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fmlal_idx_a64, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
 #ifdef TARGET_AARCH64
 #include "helper-a64.h"
 #include "helper-sve.h"
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_sqsub_d)(void *vd, void *vq, void *vn,
     }
     clear_tail(d, oprsz, simd_maxsz(desc));
 }
+
+/*
+ * Convert float16 to float32, raising no exceptions and
+ * preserving exceptional values, including SNaN.
+ * This is effectively an unpack+repack operation.
+ */
+static float32 float16_to_float32_by_bits(uint32_t f16, bool fz16)
+{
+    const int f16_bias = 15;
+    const int f32_bias = 127;
+    uint32_t sign = extract32(f16, 15, 1);
+    uint32_t exp = extract32(f16, 10, 5);
+    uint32_t frac = extract32(f16, 0, 10);
+
+    if (exp == 0x1f) {
+        /* Inf or NaN */
+        exp = 0xff;
+    } else if (exp == 0) {
+        /* Zero or denormal.  */
+        if (frac != 0) {
+            if (fz16) {
+                frac = 0;
+            } else {
+                /*
+                 * Denormal; these are all normal float32.
+                 * Shift the fraction so that the msb is at bit 11,
+                 * then remove bit 11 as the implicit bit of the
+                 * normalized float32.  Note that we still go through
+                 * the shift for normal numbers below, to put the
+                 * float32 fraction at the right place.
+                 */
+                int shift = clz32(frac) - 21;
+                frac = (frac << shift) & 0x3ff;
+                exp = f32_bias - f16_bias - shift + 1;
+            }
+        }
+    } else {
+        /* Normal number; adjust the bias.  */
+        exp += f32_bias - f16_bias;
+    }
+    sign <<= 31;
+    exp <<= 23;
+    frac <<= 23 - 10;
+
+    return sign | exp | frac;
+}
+
+static uint64_t load4_f16(uint64_t *ptr, int is_q, int is_2)
+{
+    /*
+     * Branchless load of u32[0], u64[0], u32[1], or u64[1].
+     * Load the 2nd qword iff is_q & is_2.
+     * Shift to the 2nd dword iff !is_q & is_2.
+     * For !is_q & !is_2, the upper bits of the result are garbage.
+     */
+    return ptr[is_q & is_2] >> ((is_2 & ~is_q) << 5);
+}
+
+/*
+ * Note that FMLAL requires oprsz == 8 or oprsz == 16,
+ * as there are not yet SVE versions that might use blocking.
+ */
+
+static void do_fmlal(float32 *d, void *vn, void *vm, float_status *fpst,
+                     uint32_t desc, bool fz16)
+{
+    intptr_t i, oprsz = simd_oprsz(desc);
+    int is_s = extract32(desc, SIMD_DATA_SHIFT, 1);
+    int is_2 = extract32(desc, SIMD_DATA_SHIFT + 1, 1);
+    int is_q = oprsz == 16;
+    uint64_t n_4, m_4;
+
+    /* Pre-load all of the f16 data, avoiding overlap issues.  */
+    n_4 = load4_f16(vn, is_q, is_2);
+    m_4 = load4_f16(vm, is_q, is_2);
+
+    /* Negate all inputs for FMLSL at once.  */
+    if (is_s) {
+        n_4 ^= 0x8000800080008000ull;
+    }
+
+    for (i = 0; i < oprsz / 4; i++) {
+        float32 n_1 = float16_to_float32_by_bits(n_4 >> (i * 16), fz16);
+        float32 m_1 = float16_to_float32_by_bits(m_4 >> (i * 16), fz16);
+        d[H4(i)] = float32_muladd(n_1, m_1, d[H4(i)], 0, fpst);
+    }
+    clear_tail(d, oprsz, simd_maxsz(desc));
+}
+
+void HELPER(gvec_fmlal_a32)(void *vd, void *vn, void *vm,
+                            void *venv, uint32_t desc)
+{
+    CPUARMState *env = venv;
+    do_fmlal(vd, vn, vm, &env->vfp.standard_fp_status, desc,
+             get_flush_inputs_to_zero(&env->vfp.fp_status_f16));
+}
+
+void HELPER(gvec_fmlal_a64)(void *vd, void *vn, void *vm,
+                            void *venv, uint32_t desc)
+{
+    CPUARMState *env = venv;
+    do_fmlal(vd, vn, vm, &env->vfp.fp_status, desc,
+             get_flush_inputs_to_zero(&env->vfp.fp_status_f16));
+}
+
+static void do_fmlal_idx(float32 *d, void *vn, void *vm, float_status *fpst,
+                         uint32_t desc, bool fz16)
+{
+    intptr_t i, oprsz = simd_oprsz(desc);
+    int is_s = extract32(desc, SIMD_DATA_SHIFT, 1);
+    int is_2 = extract32(desc, SIMD_DATA_SHIFT + 1, 1);
+    int index = extract32(desc, SIMD_DATA_SHIFT + 2, 3);
+    int is_q = oprsz == 16;
+    uint64_t n_4;
+    float32 m_1;
+
+    /* Pre-load all of the f16 data, avoiding overlap issues.  */
+    n_4 = load4_f16(vn, is_q, is_2);
+
+    /* Negate all inputs for FMLSL at once.  */
+    if (is_s) {
+        n_4 ^= 0x8000800080008000ull;
+    }
+
+    m_1 = float16_to_float32_by_bits(((float16 *)vm)[H2(index)], fz16);
+
+    for (i = 0; i < oprsz / 4; i++) {
+        float32 n_1 = float16_to_float32_by_bits(n_4 >> (i * 16), fz16);
+        d[H4(i)] = float32_muladd(n_1, m_1, d[H4(i)], 0, fpst);
+    }
+    clear_tail(d, oprsz, simd_maxsz(desc));
+}
+
+void HELPER(gvec_fmlal_idx_a32)(void *vd, void *vn, void *vm,
+                                void *venv, uint32_t desc)
+{
+    CPUARMState *env = venv;
+    do_fmlal_idx(vd, vn, vm, &env->vfp.standard_fp_status, desc,
+                 get_flush_inputs_to_zero(&env->vfp.fp_status_f16));
+}
+
+void HELPER(gvec_fmlal_idx_a64)(void *vd, void *vn, void *vm,
+                                void *venv, uint32_t desc)
+{
+    CPUARMState *env = venv;
+    do_fmlal_idx(vd, vn, vm, &env->vfp.fp_status, desc,
+                 get_flush_inputs_to_zero(&env->vfp.fp_status_f16));
+}
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190219222952.22183-3-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h           |  5 ++++
 target/arm/translate-a64.c | 49 +++++++++++++++++++++++++++++++++++++-
 2 files changed, 53 insertions(+), 1 deletion(-)

From: Richard Henderson <richard.henderson@linaro.org>

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190219222952.22183-4-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h       |   5 ++
 target/arm/translate.c | 129 ++++++++++++++++++++++++++++++-----------
 2 files changed, 101 insertions(+), 33 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa32_dp(const ARMISARegisters *id)
     return FIELD_EX32(id->id_isar6, ID_ISAR6, DP) != 0;
 }
 
+static inline bool isar_feature_aa32_fhm(const ARMISARegisters *id)
+{
+    return FIELD_EX32(id->id_isar6, ID_ISAR6, FHM) != 0;
+}
+
 static inline bool isar_feature_aa32_fp16_arith(const ARMISARegisters *id)
 {
     /*
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
     gen_helper_gvec_3_ptr *fn_gvec_ptr = NULL;
     int rd, rn, rm, opr_sz;
     int data = 0;
-    bool q;
-
-    q = extract32(insn, 6, 1);
-    VFP_DREG_D(rd, insn);
-    VFP_DREG_N(rn, insn);
-    VFP_DREG_M(rm, insn);
-    if ((rd | rn | rm) & q) {
-        return 1;
-    }
+    int off_rn, off_rm;
+    bool is_long = false, q = extract32(insn, 6, 1);
+    bool ptr_is_env = false;
 
     if ((insn & 0xfe200f10) == 0xfc200800) {
         /* VCMLA -- 1111 110R R.1S .... .... 1000 ...0 .... */
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
             return 1;
         }
         fn_gvec = u ? gen_helper_gvec_udot_b : gen_helper_gvec_sdot_b;
+    } else if ((insn & 0xff300f10) == 0xfc200810) {
+        /* VFM[AS]L -- 1111 1100 S.10 .... .... 1000 .Q.1 .... */
+        int is_s = extract32(insn, 23, 1);
+        if (!dc_isar_feature(aa32_fhm, s)) {
+            return 1;
+        }
+        is_long = true;
+        data = is_s; /* is_2 == 0 */
+        fn_gvec_ptr = gen_helper_gvec_fmlal_a32;
+        ptr_is_env = true;
     } else {
         return 1;
     }
 
+    VFP_DREG_D(rd, insn);
+    if (rd & q) {
+        return 1;
+    }
+    if (q || !is_long) {
+        VFP_DREG_N(rn, insn);
+        VFP_DREG_M(rm, insn);
+        if ((rn | rm) & q & !is_long) {
+            return 1;
+        }
+        off_rn = vfp_reg_offset(1, rn);
+        off_rm = vfp_reg_offset(1, rm);
+    } else {
+        rn = VFP_SREG_N(insn);
+        rm = VFP_SREG_M(insn);
+        off_rn = vfp_reg_offset(0, rn);
+        off_rm = vfp_reg_offset(0, rm);
+    }
+
     if (s->fp_excp_el) {
         gen_exception_insn(s, 4, EXCP_UDEF,
                            syn_simd_access_trap(1, 0xe, false), s->fp_excp_el);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
 
     opr_sz = (1 + q) * 8;
     if (fn_gvec_ptr) {
-        TCGv_ptr fpst = get_fpstatus_ptr(1);
-        tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd),
-                           vfp_reg_offset(1, rn),
-                           vfp_reg_offset(1, rm), fpst,
+        TCGv_ptr ptr;
+        if (ptr_is_env) {
+            ptr = cpu_env;
+        } else {
+            ptr = get_fpstatus_ptr(1);
+        }
+        tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd), off_rn, off_rm, ptr,
                            opr_sz, opr_sz, data, fn_gvec_ptr);
-        tcg_temp_free_ptr(fpst);
+        if (!ptr_is_env) {
+            tcg_temp_free_ptr(ptr);
+        }
     } else {
-        tcg_gen_gvec_3_ool(vfp_reg_offset(1, rd),
-                           vfp_reg_offset(1, rn),
-                           vfp_reg_offset(1, rm),
+        tcg_gen_gvec_3_ool(vfp_reg_offset(1, rd), off_rn, off_rm,
                            opr_sz, opr_sz, data, fn_gvec);
     }
     return 0;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
     gen_helper_gvec_3 *fn_gvec = NULL;
     gen_helper_gvec_3_ptr *fn_gvec_ptr = NULL;
     int rd, rn, rm, opr_sz, data;
-    bool q;
-
-    q = extract32(insn, 6, 1);
-    VFP_DREG_D(rd, insn);
-    VFP_DREG_N(rn, insn);
-    if ((rd | rn) & q) {
-        return 1;
-    }
+    int off_rn, off_rm;
+    bool is_long = false, q = extract32(insn, 6, 1);
+    bool ptr_is_env = false;
 
     if ((insn & 0xff000f10) == 0xfe000800) {
         /* VCMLA (indexed) -- 1111 1110 S.RR .... .... 1000 ...0 .... */
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
     } else if ((insn & 0xffb00f00) == 0xfe200d00) {
         /* V[US]DOT -- 1111 1110 0.10 .... .... 1101 .Q.U .... */
         int u = extract32(insn, 4, 1);
+
         if (!dc_isar_feature(aa32_dp, s)) {
             return 1;
         }
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
         /* rm is just Vm, and index is M.  */
         data = extract32(insn, 5, 1); /* index */
         rm = extract32(insn, 0, 4);
+    } else if ((insn & 0xffa00f10) == 0xfe000810) {
+        /* VFM[AS]L -- 1111 1110 0.0S .... .... 1000 .Q.1 .... */
+        int is_s = extract32(insn, 20, 1);
+        int vm20 = extract32(insn, 0, 3);
+        int vm3 = extract32(insn, 3, 1);
+        int m = extract32(insn, 5, 1);
+        int index;
+
+        if (!dc_isar_feature(aa32_fhm, s)) {
+            return 1;
+        }
+        if (q) {
+            rm = vm20;
+            index = m * 2 + vm3;
+        } else {
+            rm = vm20 * 2 + m;
+            index = vm3;
+        }
+        is_long = true;
+        data = (index << 2) | is_s; /* is_2 == 0 */
+        fn_gvec_ptr = gen_helper_gvec_fmlal_idx_a32;
+        ptr_is_env = true;
     } else {
         return 1;
     }
 
+    VFP_DREG_D(rd, insn);
+    if (rd & q) {
+        return 1;
+    }
+    if (q || !is_long) {
+        VFP_DREG_N(rn, insn);
+        if (rn & q & !is_long) {
+            return 1;
+        }
+        off_rn = vfp_reg_offset(1, rn);
+        off_rm = vfp_reg_offset(1, rm);
+    } else {
+        rn = VFP_SREG_N(insn);
+        off_rn = vfp_reg_offset(0, rn);
+        off_rm = vfp_reg_offset(0, rm);
+    }
     if (s->fp_excp_el) {
         gen_exception_insn(s, 4, EXCP_UDEF,
                            syn_simd_access_trap(1, 0xe, false), s->fp_excp_el);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
 
     opr_sz = (1 + q) * 8;
     if (fn_gvec_ptr) {
-        TCGv_ptr fpst = get_fpstatus_ptr(1);
-        tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd),
-                           vfp_reg_offset(1, rn),
-                           vfp_reg_offset(1, rm), fpst,
+        TCGv_ptr ptr;
+        if (ptr_is_env) {
+            ptr = cpu_env;
+        } else {
+            ptr = get_fpstatus_ptr(1);
+        }
+        tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd), off_rn, off_rm, ptr,
                            opr_sz, opr_sz, data, fn_gvec_ptr);
-        tcg_temp_free_ptr(fpst);
+        if (!ptr_is_env) {
+            tcg_temp_free_ptr(ptr);
+        }
     } else {
-        tcg_gen_gvec_3_ool(vfp_reg_offset(1, rd),
-                           vfp_reg_offset(1, rn),
-                           vfp_reg_offset(1, rm),
+        tcg_gen_gvec_3_ool(vfp_reg_offset(1, rd), off_rn, off_rm,
                            opr_sz, opr_sz, data, fn_gvec);
     }
     return 0;
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190219222952.22183-5-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.c   | 1 +
 target/arm/cpu64.c | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void arm_max_initfn(Object *obj)
             t = cpu->isar.id_isar6;
             t = FIELD_DP32(t, ID_ISAR6, JSCVT, 1);
             t = FIELD_DP32(t, ID_ISAR6, DP, 1);
+            t = FIELD_DP32(t, ID_ISAR6, FHM, 1);
             cpu->isar.id_isar6 = t;
 
             t = cpu->id_mmfr4;
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
         t = FIELD_DP64(t, ID_AA64ISAR0, SM3, 1);
         t = FIELD_DP64(t, ID_AA64ISAR0, SM4, 1);
         t = FIELD_DP64(t, ID_AA64ISAR0, DP, 1);
+        t = FIELD_DP64(t, ID_AA64ISAR0, FHM, 1);
         cpu->isar.id_aa64isar0 = t;
 
         t = cpu->isar.id_aa64isar1;
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
         u = cpu->isar.id_isar6;
         u = FIELD_DP32(u, ID_ISAR6, JSCVT, 1);
         u = FIELD_DP32(u, ID_ISAR6, DP, 1);
+        u = FIELD_DP32(u, ID_ISAR6, FHM, 1);
         cpu->isar.id_isar6 = u;
 
         /*
-- 
2.20.1

Most of this is the Neon decodetree patches, followed by Edgar's versal cleanups.

thanks
-- PMM

The following changes since commit 2ef486e76d64436be90f7359a3071fb2a56ce835:

Merge remote-tracking branch 'remotes/marcel/tags/rdma-pull-request' into staging (2020-05-03 14:12:56 +0100)

are available in the Git repository at:

https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20200504

for you to fetch changes up to 9aefc6cf9b73f66062d2f914a0136756e7a28211:

target/arm: Move gen_ function typedefs to translate.h (2020-05-04 12:59:26 +0100)

----------------------------------------------------------------
target-arm queue:
 * Start of conversion of Neon insns to decodetree
 * versal board: support SD and RTC
 * Implement ARMv8.2-TTS2UXN
 * Make VQDMULL undefined when U=1
 * Some minor code cleanups

----------------------------------------------------------------
Edgar E. Iglesias (11):
      hw/arm: versal: Remove inclusion of arm_gicv3_common.h
      hw/arm: versal: Move misplaced comment
      hw/arm: versal-virt: Fix typo xlnx-ve -> xlnx-versal
      hw/arm: versal: Embed the UARTs into the SoC type
      hw/arm: versal: Embed the GEMs into the SoC type
      hw/arm: versal: Embed the ADMAs into the SoC type
      hw/arm: versal: Embed the APUs into the SoC type
      hw/arm: versal: Add support for SD
      hw/arm: versal: Add support for the RTC
      hw/arm: versal-virt: Add support for SD
      hw/arm: versal-virt: Add support for the RTC

Fredrik Strupe (1):
      target/arm: Make VQDMULL undefined when U=1

Peter Maydell (25):
      target/arm: Don't use a TLB for ARMMMUIdx_Stage2
      target/arm: Use enum constant in get_phys_addr_lpae() call
      target/arm: Add new 's1_is_el0' argument to get_phys_addr_lpae()
      target/arm: Implement ARMv8.2-TTS2UXN
      target/arm: Use correct variable for setting 'max' cpu's ID_AA64DFR0
      target/arm/translate-vfp.inc.c: Remove duplicate simd_r32 check
      target/arm: Don't allow Thumb Neon insns without FEATURE_NEON
      target/arm: Add stubs for AArch32 Neon decodetree
      target/arm: Convert VCMLA (vector) to decodetree
      target/arm: Convert VCADD (vector) to decodetree
      target/arm: Convert V[US]DOT (vector) to decodetree
      target/arm: Convert VFM[AS]L (vector) to decodetree
      target/arm: Convert VCMLA (scalar) to decodetree
      target/arm: Convert V[US]DOT (scalar) to decodetree
      target/arm: Convert VFM[AS]L (scalar) to decodetree
      target/arm: Convert Neon load/store multiple structures to decodetree
      target/arm: Convert Neon 'load single structure to all lanes' to decodetree
      target/arm: Convert Neon 'load/store single structure' to decodetree
      target/arm: Convert Neon 3-reg-same VADD/VSUB to decodetree
      target/arm: Convert Neon 3-reg-same logic ops to decodetree
      target/arm: Convert Neon 3-reg-same VMAX/VMIN to decodetree
      target/arm: Convert Neon 3-reg-same comparisons to decodetree
      target/arm: Convert Neon 3-reg-same VQADD/VQSUB to decodetree
      target/arm: Convert Neon 3-reg-same VMUL, VMLA, VMLS, VSHL to decodetree
      target/arm: Move gen_ function typedefs to translate.h

Philippe Mathieu-Daudé (2):
      hw/arm/mps2-tz: Use TYPE_IOTKIT instead of hardcoded string
      target/arm: Use uint64_t for midr field in CPU state struct

 include/hw/arm/xlnx-versal.h    |  31 +-
 target/arm/cpu-param.h          |   2 +-
 target/arm/cpu.h                |  38 ++-
 target/arm/translate-a64.h      |   9 -
 target/arm/translate.h          |  26 ++
 target/arm/neon-dp.decode       |  86 +++++
 target/arm/neon-ls.decode       |  52 +++
 target/arm/neon-shared.decode   |  66 ++++
 hw/arm/mps2-tz.c                |   2 +-
 hw/arm/xlnx-versal-virt.c       |  74 ++++-
 hw/arm/xlnx-versal.c            | 115 +++++--
 target/arm/cpu.c                |   3 +-
 target/arm/cpu64.c              |   8 +-
 target/arm/helper.c             | 183 ++++------
 target/arm/translate-a64.c      |  17 -
 target/arm/translate-neon.inc.c | 714 +++++++++++++++++++++++++++++++++++++++
 target/arm/translate-vfp.inc.c  |   6 -
 target/arm/translate.c          | 716 +++-------------------------------------
 target/arm/Makefile.objs        |  18 +
 19 files changed, 1302 insertions(+), 864 deletions(-)
 create mode 100644 target/arm/neon-dp.decode
 create mode 100644 target/arm/neon-ls.decode
 create mode 100644 target/arm/neon-shared.decode
 create mode 100644 target/arm/translate-neon.inc.c

From: Fredrik Strupe <fredrik@strupe.net>

According to the Arm ARM, VQDMULL is only valid when U=0; the
U=1 encoding is unallocated.
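
For readers unfamiliar with the table being patched, the sketch below
shows how the last column could be interpreted. It assumes the
encoding in which bits 0 to 2 of the value mean "UNDEF if size is 0,
1 or 2" respectively and bit 3 means "UNDEF if U is 1"; the function
name is hypothetical.

```c
#include <stdbool.h>

/*
 * Assumed meaning of the fourth table column ("undefreq"):
 *   bit 0: UNDEF if size == 0
 *   bit 1: UNDEF if size == 1
 *   bit 2: UNDEF if size == 2
 *   bit 3: UNDEF if U == 1
 * size is expected to be in the range 0..2 here.
 */
static bool sketch_3reg_wide_undef(int undefreq, int size, int u)
{
    return (undefreq & (1 << size)) || ((undefreq & 8) && u);
}
```

Under that reading, the old value 1 rejected only size == 0, so a
U=1 encoding with size 1 or 2 was accepted. The new value 9 also
sets bit 3, matching the existing VQDMLSL entry in the same table.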

Signed-off-by: Fredrik Strupe <fredrik@strupe.net>
Fixes: 695272dcb976 ("target-arm: Handle UNDEF cases for Neon 3-regs-different-widths")
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                     {0, 0, 0, 0}, /* VMLSL */
                     {0, 0, 0, 9}, /* VQDMLSL */
                     {0, 0, 0, 0}, /* Integer VMULL */
-                    {0, 0, 0, 1}, /* VQDMULL */
+                    {0, 0, 0, 9}, /* VQDMULL */
                     {0, 0, 0, 0xa}, /* Polynomial VMULL */
                     {0, 0, 0, 7}, /* Reserved: always UNDEF */
                 };
-- 
2.20.1

From: Philippe Mathieu-Daudé <f4bug@amsat.org>

By using the TYPE_* definitions for devices, we can:
 - quickly find where devices are used with 'git-grep'
 - easily rename a device (one-line change).

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20200428154650.21991-1-f4bug@amsat.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/mps2-tz.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/arm/mps2-tz.c b/hw/arm/mps2-tz.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/mps2-tz.c
+++ b/hw/arm/mps2-tz.c
@@ -XXX,XX +XXX,XX @@ static void mps2tz_common_init(MachineState *machine)
         exit(EXIT_FAILURE);
     }
 
-    sysbus_init_child_obj(OBJECT(machine), "iotkit", &mms->iotkit,
+    sysbus_init_child_obj(OBJECT(machine), TYPE_IOTKIT, &mms->iotkit,
                           sizeof(mms->iotkit), mmc->armsse_type);
     iotkitdev = DEVICE(&mms->iotkit);
     object_property_set_link(OBJECT(&mms->iotkit), OBJECT(system_memory),
-- 
2.20.1

We define ARMMMUIdx_Stage2 as being an MMU index which uses a QEMU
TLB.  However we never actually use the TLB -- all stage 2 lookups
are done by direct calls to get_phys_addr_lpae() followed by a
physical address load via address_space_ld*().

Remove Stage2 from the list of ARM MMU indexes which correspond to
real core MMU indexes, and instead put it in the set of "NOTLB" ARM
MMU indexes.

This allows us to drop NB_MMU_MODES to 11.  It also means we can
safely add support for the ARMv8.2-TTS2UXN extension, which adds
permission bits to the stage 2 descriptors which define execute
permission separately for EL0 and EL1; supporting that while keeping
Stage2 in a QEMU TLB would require us to use separate TLBs for
"Stage2 for an EL0 access" and "Stage2 for an EL1 access", which is a
lot of extra complication given we aren't even using the QEMU TLB.
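
The effect of moving Stage2 between the two groups can be sketched
as follows. The tag values 0x10 and 0x20 and the masks are
assumptions made for this illustration, as are all of the SKETCH_
and sketch_ names; only the index numbers 10, 11 and 3 come from
the diff below.

```c
#include <stdbool.h>

/* Assumed type tags kept in the upper bits of an MMU index value. */
#define SKETCH_IDX_A      0x10  /* A-profile index backed by a core TLB */
#define SKETCH_IDX_NOTLB  0x20  /* index with no core TLB behind it */
#define SKETCH_TYPE_MASK  0x70
#define SKETCH_CORE_MASK  0x0f

enum {
    SKETCH_SE3        = 10 | SKETCH_IDX_A,     /* highest remaining TLB index */
    SKETCH_STAGE2_OLD = 11 | SKETCH_IDX_A,     /* before this patch */
    SKETCH_STAGE2_NEW = 3 | SKETCH_IDX_NOTLB,  /* after this patch */
};

/* True if the index is one that occupies a core TLB slot. */
static bool sketch_has_core_tlb(int idx)
{
    return (idx & SKETCH_TYPE_MASK) != SKETCH_IDX_NOTLB;
}

/* The low bits select the core TLB slot for TLB-backed indexes. */
static int sketch_core_index(int idx)
{
    return idx & SKETCH_CORE_MASK;
}
```

With Stage2 retagged, the highest core slot in use is 10, so eleven
slots (0 to 10) are enough, which is where NB_MMU_MODES 11 comes
from.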

In the process of updating the comment on our MMU index use,
fix a couple of other minor errors:
 * NS EL2 EL2&0 was missing from the list in the comment
 * some text hadn't been updated from when we bumped NB_MMU_MODES
   above 8

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200330210400.11724-2-peter.maydell@linaro.org
---
 target/arm/cpu-param.h |   2 +-
 target/arm/cpu.h       |  21 +++++---
 target/arm/helper.c    | 112 ++++-------------------------------------
 3 files changed, 27 insertions(+), 108 deletions(-)

diff --git a/target/arm/cpu-param.h b/target/arm/cpu-param.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu-param.h
+++ b/target/arm/cpu-param.h
@@ -XXX,XX +XXX,XX @@
 # define TARGET_PAGE_BITS_MIN  10
 #endif
 
-#define NB_MMU_MODES 12
+#define NB_MMU_MODES 11
 
 #endif
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ bool write_cpustate_to_list(ARMCPU *cpu, bool kvm_sync);
  *     handling via the TLB. The only way to do a stage 1 translation without
  *     the immediate stage 2 translation is via the ATS or AT system insns,
  *     which can be slow-pathed and always do a page table walk.
+ *     The only use of stage 2 translations is either as part of an s1+2
+ *     lookup or when loading the descriptors during a stage 1 page table walk,
+ *     and in both those cases we don't use the TLB.
  *  4. we can also safely fold together the "32 bit EL3" and "64 bit EL3"
  *     translation regimes, because they map reasonably well to each other
  *     and they can't both be active at the same time.
@@ -XXX,XX +XXX,XX @@ bool write_cpustate_to_list(ARMCPU *cpu, bool kvm_sync);
  * NS EL1 EL1&0 stage 1+2 (aka NS PL1)
  * NS EL1 EL1&0 stage 1+2 +PAN
  * NS EL0 EL2&0
+ * NS EL2 EL2&0
  * NS EL2 EL2&0 +PAN
  * NS EL2 (aka NS PL2)
  * S EL0 EL1&0 (aka S PL0)
  * S EL1 EL1&0 (not used if EL3 is 32 bit)
  * S EL1 EL1&0 +PAN
  * S EL3 (aka S PL1)
- * NS EL1&0 stage 2
  *
- * for a total of 12 different mmu_idx.
+ * for a total of 11 different mmu_idx.
  *
  * R profile CPUs have an MPU, but can use the same set of MMU indexes
  * as A profile. They only need to distinguish NS EL0 and NS EL1 (and
@@ -XXX,XX +XXX,XX @@ bool write_cpustate_to_list(ARMCPU *cpu, bool kvm_sync);
  * are not quite the same -- different CPU types (most notably M profile
  * vs A/R profile) would like to use MMU indexes with different semantics,
  * but since we don't ever need to use all of those in a single CPU we
- * can avoid setting NB_MMU_MODES to more than 8. The lower bits of
+ * can avoid having to set NB_MMU_MODES to "total number of A profile MMU
+ * modes + total number of M profile MMU modes". The lower bits of
  * ARMMMUIdx are the core TLB mmu index, and the higher bits are always
  * the same for any particular CPU.
  * Variables of type ARMMUIdx are always full values, and the core
@@ -XXX,XX +XXX,XX @@ typedef enum ARMMMUIdx {
     ARMMMUIdx_SE10_1_PAN = 9 | ARM_MMU_IDX_A,
     ARMMMUIdx_SE3        = 10 | ARM_MMU_IDX_A,
 
-    ARMMMUIdx_Stage2     = 11 | ARM_MMU_IDX_A,
-
     /*
      * These are not allocated TLBs and are used only for AT system
      * instructions or for the first stage of an S12 page table walk.
@@ -XXX,XX +XXX,XX @@ typedef enum ARMMMUIdx {
     ARMMMUIdx_Stage1_E0 = 0 | ARM_MMU_IDX_NOTLB,
     ARMMMUIdx_Stage1_E1 = 1 | ARM_MMU_IDX_NOTLB,
     ARMMMUIdx_Stage1_E1_PAN = 2 | ARM_MMU_IDX_NOTLB,
+    /*
+     * Not allocated a TLB: used only for second stage of an S12 page
+     * table walk, or for descriptor loads during first stage of an S1
+     * page table walk. Note that if we ever want to have a TLB for this
+     * then various TLB flush insns which currently are no-ops or flush
+     * only stage 1 MMU indexes will need to change to flush stage 2.
+     */
+    ARMMMUIdx_Stage2     = 3 | ARM_MMU_IDX_NOTLB,
 
     /*
      * M-profile.
@@ -XXX,XX +XXX,XX @@ typedef enum ARMMMUIdxBit {
     TO_CORE_BIT(SE10_1),
     TO_CORE_BIT(SE10_1_PAN),
     TO_CORE_BIT(SE3),
-    TO_CORE_BIT(Stage2),
 
     TO_CORE_BIT(MUser),
     TO_CORE_BIT(MPriv),
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void tlbiall_nsnh_write(CPUARMState *env, const ARMCPRegInfo *ri,
     tlb_flush_by_mmuidx(cs,
                         ARMMMUIdxBit_E10_1 |
                         ARMMMUIdxBit_E10_1_PAN |
-                        ARMMMUIdxBit_E10_0 |
-                        ARMMMUIdxBit_Stage2);
+                        ARMMMUIdxBit_E10_0);
 }
 
 static void tlbiall_nsnh_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -XXX,XX +XXX,XX @@ static void tlbiall_nsnh_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
     tlb_flush_by_mmuidx_all_cpus_synced(cs,
                                         ARMMMUIdxBit_E10_1 |
                                         ARMMMUIdxBit_E10_1_PAN |
-                                        ARMMMUIdxBit_E10_0 |
-                                        ARMMMUIdxBit_Stage2);
+                                        ARMMMUIdxBit_E10_0);
 }
 
-static void tlbiipas2_write(CPUARMState *env, const ARMCPRegInfo *ri,
-                            uint64_t value)
-{
-    /* Invalidate by IPA. This has to invalidate any structures that
-     * contain only stage 2 translation information, but does not need
-     * to apply to structures that contain combined stage 1 and stage 2
-     * translation information.
-     * This must NOP if EL2 isn't implemented or SCR_EL3.NS is zero.
-     */
-    CPUState *cs = env_cpu(env);
-    uint64_t pageaddr;
-
-    if (!arm_feature(env, ARM_FEATURE_EL2) || !(env->cp15.scr_el3 & SCR_NS)) {
-        return;
-    }
-
-    pageaddr = sextract64(value << 12, 0, 40);
-
-    tlb_flush_page_by_mmuidx(cs, pageaddr, ARMMMUIdxBit_Stage2);
-}
-
-static void tlbiipas2_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
-                               uint64_t value)
-{
-    CPUState *cs = env_cpu(env);
-    uint64_t pageaddr;
-
-    if (!arm_feature(env, ARM_FEATURE_EL2) || !(env->cp15.scr_el3 & SCR_NS)) {
-        return;
-    }
-
-    pageaddr = sextract64(value << 12, 0, 40);
-
-    tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr,
-                                             ARMMMUIdxBit_Stage2);
-}
 
 static void tlbiall_hyp_write(CPUARMState *env, const ARMCPRegInfo *ri,
                               uint64_t value)
@@ -XXX,XX +XXX,XX @@ static void vttbr_write(CPUARMState *env, const ARMCPRegInfo *ri,
         tlb_flush_by_mmuidx(cs,
                             ARMMMUIdxBit_E10_1 |
                             ARMMMUIdxBit_E10_1_PAN |
-                            ARMMMUIdxBit_E10_0 |
-                            ARMMMUIdxBit_Stage2);
+                            ARMMMUIdxBit_E10_0);
         raw_write(env, ri, value);
     }
 }
@@ -XXX,XX +XXX,XX @@ static int alle1_tlbmask(CPUARMState *env)
         return ARMMMUIdxBit_SE10_1 |
                ARMMMUIdxBit_SE10_1_PAN |
                ARMMMUIdxBit_SE10_0;
-    } else if (arm_feature(env, ARM_FEATURE_EL2)) {
-        return ARMMMUIdxBit_E10_1 |
-               ARMMMUIdxBit_E10_1_PAN |
-               ARMMMUIdxBit_E10_0 |
-               ARMMMUIdxBit_Stage2;
     } else {
         return ARMMMUIdxBit_E10_1 |
                ARMMMUIdxBit_E10_1_PAN |
@@ -XXX,XX +XXX,XX @@ static void tlbi_aa64_vae3is_write(CPUARMState *env, const ARMCPRegInfo *ri,
                                              ARMMMUIdxBit_SE3);
 }
 
-static void tlbi_aa64_ipas2e1_write(CPUARMState *env, const ARMCPRegInfo *ri,
-                                    uint64_t value)
-{
-    /* Invalidate by IPA. This has to invalidate any structures that
-     * contain only stage 2 translation information, but does not need
-     * to apply to structures that contain combined stage 1 and stage 2
-     * translation information.
-     * This must NOP if EL2 isn't implemented or SCR_EL3.NS is zero.
-     */
-    ARMCPU *cpu = env_archcpu(env);
-    CPUState *cs = CPU(cpu);
-    uint64_t pageaddr;
-
-    if (!arm_feature(env, ARM_FEATURE_EL2) || !(env->cp15.scr_el3 & SCR_NS)) {
-        return;
-    }
-
-    pageaddr = sextract64(value << 12, 0, 48);
-
-    tlb_flush_page_by_mmuidx(cs, pageaddr, ARMMMUIdxBit_Stage2);
-}
-
-static void tlbi_aa64_ipas2e1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
-                                      uint64_t value)
-{
-    CPUState *cs = env_cpu(env);
-    uint64_t pageaddr;
-
-    if (!arm_feature(env, ARM_FEATURE_EL2) || !(env->cp15.scr_el3 & SCR_NS)) {
-        return;
-    }
-
-    pageaddr = sextract64(value << 12, 0, 48);
-
-    tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr,
-                                             ARMMMUIdxBit_Stage2);
-}
-
 static CPAccessResult aa64_zva_access(CPUARMState *env, const ARMCPRegInfo *ri,
                                       bool isread)
 {
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
       .writefn = tlbi_aa64_vae1_write },
     { .name = "TLBI_IPAS2E1IS", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 4, .crn = 8, .crm = 0, .opc2 = 1,
-      .access = PL2_W, .type = ARM_CP_NO_RAW,
-      .writefn = tlbi_aa64_ipas2e1is_write },
+      .access = PL2_W, .type = ARM_CP_NOP },
     { .name = "TLBI_IPAS2LE1IS", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 4, .crn = 8, .crm = 0, .opc2 = 5,
-      .access = PL2_W, .type = ARM_CP_NO_RAW,
-      .writefn = tlbi_aa64_ipas2e1is_write },
+      .access = PL2_W, .type = ARM_CP_NOP },
     { .name = "TLBI_ALLE1IS", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 4, .crn = 8, .crm = 3, .opc2 = 4,
       .access = PL2_W, .type = ARM_CP_NO_RAW,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
       .writefn = tlbi_aa64_alle1is_write },
     { .name = "TLBI_IPAS2E1", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 4, .crn = 8, .crm = 4, .opc2 = 1,
-      .access = PL2_W, .type = ARM_CP_NO_RAW,
-      .writefn = tlbi_aa64_ipas2e1_write },
+      .access = PL2_W, .type = ARM_CP_NOP },
     { .name = "TLBI_IPAS2LE1", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 4, .crn = 8, .crm = 4, .opc2 = 5,
-      .access = PL2_W, .type = ARM_CP_NO_RAW,
-      .writefn = tlbi_aa64_ipas2e1_write },
+      .access = PL2_W, .type = ARM_CP_NOP },
     { .name = "TLBI_ALLE1", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 4, .crn = 8, .crm = 7, .opc2 = 4,
       .access = PL2_W, .type = ARM_CP_NO_RAW,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
       .writefn = tlbimva_hyp_is_write },
     { .name = "TLBIIPAS2",
       .cp = 15, .opc1 = 4, .crn = 8, .crm = 4, .opc2 = 1,
-      .type = ARM_CP_NO_RAW, .access = PL2_W,
-      .writefn = tlbiipas2_write },
+      .type = ARM_CP_NOP, .access = PL2_W },
     { .name = "TLBIIPAS2IS",
       .cp = 15, .opc1 = 4, .crn = 8, .crm = 0, .opc2 = 1,
-      .type = ARM_CP_NO_RAW, .access = PL2_W,
-      .writefn = tlbiipas2_is_write },
+      .type = ARM_CP_NOP, .access = PL2_W },
     { .name = "TLBIIPAS2L",
       .cp = 15, .opc1 = 4, .crn = 8, .crm = 4, .opc2 = 5,
-      .type = ARM_CP_NO_RAW, .access = PL2_W,
-      .writefn = tlbiipas2_write },
+      .type = ARM_CP_NOP, .access = PL2_W },
     { .name = "TLBIIPAS2LIS",
       .cp = 15, .opc1 = 4, .crn = 8, .crm = 0, .opc2 = 5,
-      .type = ARM_CP_NO_RAW, .access = PL2_W,
-      .writefn = tlbiipas2_is_write },
+      .type = ARM_CP_NOP, .access = PL2_W },
     /* 32 bit cache operations */
     { .name = "ICIALLUIS", .cp = 15, .opc1 = 0, .crn = 7, .crm = 1, .opc2 = 0,
       .type = ARM_CP_NOP, .access = PL1_W, .accessfn = aa64_cacheop_pou_access },
-- 
2.20.1

The access_type argument to get_phys_addr_lpae() is an MMUAccessType;
use the enum constant MMU_DATA_LOAD rather than a literal 0 when we
call it in S1_ptw_translate().
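
As a rough illustration of why the named constant reads better than a
bare 0, here is a minimal sketch. The enum below is a stand-in, not a
copy of QEMU's header: only MMU_DATA_LOAD == 0 follows from this patch
(the change is a no-op), the other values are assumed, and
access_type_name() is a hypothetical helper.

```c
#include <string.h>

/*
 * Illustrative stand-in for MMUAccessType. Only the fact that
 * MMU_DATA_LOAD is 0 follows from the patch; the remaining values
 * are assumptions made for this sketch.
 */
typedef enum {
    MMU_DATA_LOAD = 0,
    MMU_DATA_STORE = 1,
    MMU_INST_FETCH = 2,
} MMUAccessType;

/* Hypothetical helper: human-readable name for an access type. */
static const char *access_type_name(MMUAccessType t)
{
    switch (t) {
    case MMU_DATA_LOAD:
        return "load";
    case MMU_DATA_STORE:
        return "store";
    case MMU_INST_FETCH:
        return "fetch";
    }
    return "unknown";
}

/* A literal 0 and MMU_DATA_LOAD select the same case. */
static int literal_zero_is_load(void)
{
    return strcmp(access_type_name((MMUAccessType)0),
                  access_type_name(MMU_DATA_LOAD)) == 0;
}
```

The behaviour is identical either way; the enum constant just makes
the intent of the argument visible at the call site.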

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200330210400.11724-3-peter.maydell@linaro.org
---
 target/arm/helper.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static hwaddr S1_ptw_translate(CPUARMState *env, ARMMMUIdx mmu_idx,
             pcacheattrs = &cacheattrs;
         }
 
-        ret = get_phys_addr_lpae(env, addr, 0, ARMMMUIdx_Stage2, &s2pa,
-                                 &txattrs, &s2prot, &s2size, fi, pcacheattrs);
+        ret = get_phys_addr_lpae(env, addr, MMU_DATA_LOAD, ARMMMUIdx_Stage2,
+                                 &s2pa, &txattrs, &s2prot, &s2size, fi,
+                                 pcacheattrs);
         if (ret) {
             assert(fi->type != ARMFault_None);
             fi->s2addr = addr;
-- 
2.20.1

For ARMv8.2-TTS2UXN, the stage 2 page table walk wants to know
whether the stage 1 access is for EL0 or not, because whether
exec permission is given can depend on whether this is an EL0
or EL1 access. Add a new argument to get_phys_addr_lpae() so
the call sites can pass this information in.

Since get_phys_addr_lpae() doesn't already have a doc comment,
add one so we have a place to put the documentation of the
semantics of the new s1_is_el0 argument.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200330210400.11724-4-peter.maydell@linaro.org
---
 target/arm/helper.c | 29 ++++++++++++++++++++++++++++-
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@
 
 static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
                                MMUAccessType access_type, ARMMMUIdx mmu_idx,
+                               bool s1_is_el0,
                                hwaddr *phys_ptr, MemTxAttrs *txattrs, int *prot,
                                target_ulong *page_size_ptr,
                                ARMMMUFaultInfo *fi, ARMCacheAttrs *cacheattrs);
@@ -XXX,XX +XXX,XX @@ static hwaddr S1_ptw_translate(CPUARMState *env, ARMMMUIdx mmu_idx,
         }
 
         ret = get_phys_addr_lpae(env, addr, MMU_DATA_LOAD, ARMMMUIdx_Stage2,
+                                 false,
                                  &s2pa, &txattrs, &s2prot, &s2size, fi,
                                  pcacheattrs);
         if (ret) {
@@ -XXX,XX +XXX,XX @@ static ARMVAParameters aa32_va_parameters(CPUARMState *env, uint32_t va,
     };
 }
 
+/**
+ * get_phys_addr_lpae: perform one stage of page table walk, LPAE format
+ *
+ * Returns false if the translation was successful. Otherwise, phys_ptr, attrs,
+ * prot and page_size may not be filled in, and the populated fsr value provides
+ * information on why the translation aborted, in the format of a long-format
+ * DFSR/IFSR fault register, with the following caveats:
+ *  * the WnR bit is never set (the caller must do this).
+ *
+ * @env: CPUARMState
+ * @address: virtual address to get physical address for
+ * @access_type: MMU_DATA_LOAD, MMU_DATA_STORE or MMU_INST_FETCH
+ * @mmu_idx: MMU index indicating required translation regime
+ * @s1_is_el0: if @mmu_idx is ARMMMUIdx_Stage2 (so this is a stage 2 page table
+ *             walk), must be true if this is stage 2 of a stage 1+2 walk for an
+ *             EL0 access. If @mmu_idx is anything else, @s1_is_el0 is ignored.
+ * @phys_ptr: set to the physical address corresponding to the virtual address
+ * @attrs: set to the memory transaction attributes to use
+ * @prot: set to the permissions for the page containing phys_ptr
+ * @page_size_ptr: set to the size of the page containing phys_ptr
+ * @fi: set to fault info if the translation fails
+ * @cacheattrs: (if non-NULL) set to the cacheability/shareability attributes
+ */
 static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
                                MMUAccessType access_type, ARMMMUIdx mmu_idx,
+                               bool s1_is_el0,
                                hwaddr *phys_ptr, MemTxAttrs *txattrs, int *prot,
                                target_ulong *page_size_ptr,
                                ARMMMUFaultInfo *fi, ARMCacheAttrs *cacheattrs)
@@ -XXX,XX +XXX,XX @@ bool get_phys_addr(CPUARMState *env, target_ulong address,
 
             /* S1 is done. Now do S2 translation.  */
             ret = get_phys_addr_lpae(env, ipa, access_type, ARMMMUIdx_Stage2,
+                                     mmu_idx == ARMMMUIdx_E10_0,
                                      phys_ptr, attrs, &s2_prot,
                                      page_size, fi,
                                      cacheattrs != NULL ? &cacheattrs2 : NULL);
@@ -XXX,XX +XXX,XX @@ bool get_phys_addr(CPUARMState *env, target_ulong address,
     }
 
     if (regime_using_lpae_format(env, mmu_idx)) {
-        return get_phys_addr_lpae(env, address, access_type, mmu_idx,
+        return get_phys_addr_lpae(env, address, access_type, mmu_idx, false,
                                   phys_ptr, attrs, prot, page_size,
                                   fi, cacheattrs);
     } else if (regime_sctlr(env, mmu_idx) & SCTLR_XP) {
-- 
2.20.1

The ARMv8.2-TTS2UXN feature extends the XN field in stage 2
translation table descriptors from just bit [54] to bits [54:53],
allowing stage 2 to control execution permissions separately for EL0
and EL1. Implement the new semantics of the XN field and enable
the feature for our 'max' CPU.
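
The two-bit XN semantics can be sketched roughly as below. This is a
standalone illustration, not the code from this patch: s2_xn(),
s2_exec_permitted() and the have_tts2uxn flag are hypothetical names,
and the encoding table in the comment is our reading of the
architecture and should be checked against the Arm ARM.

```c
#include <stdbool.h>
#include <stdint.h>

/* Extract XN[1:0] from bits [54:53] of a stage 2 descriptor. */
static unsigned s2_xn(uint64_t descriptor)
{
    return (unsigned)((descriptor >> 53) & 0x3);
}

/*
 * Return true if stage 2 permits instruction execution.
 *
 * Assumed encoding with TTS2UXN:
 *   0b00: executable at EL1 and EL0
 *   0b01: executable at EL0 only
 *   0b10: not executable
 *   0b11: executable at EL1 only
 * Without the feature only bit [54] is treated as XN.
 */
static bool s2_exec_permitted(uint64_t descriptor, bool have_tts2uxn,
                              bool s1_is_el0)
{
    unsigned xn = s2_xn(descriptor);

    if (!have_tts2uxn) {
        return (xn & 2) == 0;
    }

    switch (xn) {
    case 0:
        return true;
    case 1:
        return s1_is_el0;
    case 2:
        return false;
    default: /* 3 */
        return !s1_is_el0;
    }
}
```

The s1_is_el0 argument here plays the same role as the one added to
get_phys_addr_lpae() earlier in the series: stage 2 cannot decide the
0b01 and 0b11 cases without knowing which EL made the stage 1 access.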

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200330210400.11724-5-peter.maydell@linaro.org
---
 target/arm/cpu.h    | 15 +++++++++++++++
 target/arm/cpu.c    |  1 +
 target/arm/cpu64.c  |  2 ++
 target/arm/helper.c | 37 +++++++++++++++++++++++++++++++------
 4 files changed, 49 insertions(+), 6 deletions(-)

In aarch64_max_initfn() we update both 32-bit and 64-bit ID
registers.  The intended pattern is that for 64-bit ID registers we
use FIELD_DP64 and the uint64_t 't' register, while 32-bit ID
registers use FIELD_DP32 and the uint32_t 'u' register.  For
ID_AA64DFR0 we accidentally used 'u', meaning that the top 32 bits of
this 64-bit ID register would end up always zero.  Luckily at the
moment that's what they should be anyway, so this bug has no visible
effects.

Use the right-sized variable.
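
To make the failure mode concrete, here is a small standalone sketch.
deposit64_sketch() is a hypothetical stand-in for the field-deposit
macro, and the field position (bits [11:8] for PMUVER) is an
assumption made for the example rather than something taken from the
QEMU headers.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a 64-bit field deposit: replace 'length'
 * bits of 'value' starting at bit 'start' with 'field'.
 */
static uint64_t deposit64_sketch(uint64_t value, unsigned start,
                                 unsigned length, uint64_t field)
{
    uint64_t mask;

    assert(length >= 1 && length <= 63 && start + length <= 64);
    mask = ((1ULL << length) - 1) << start;
    return (value & ~mask) | ((field << start) & mask);
}

/* Buggy pattern: a 64-bit register round-trips through a uint32_t. */
static uint64_t update_via_u32(uint64_t reg, uint64_t pmuver)
{
    uint32_t u = (uint32_t)reg;   /* top 32 bits are dropped here */

    u = (uint32_t)deposit64_sketch(u, 8, 4, pmuver);
    return u;
}

/* Fixed pattern: the temporary is as wide as the register. */
static uint64_t update_via_u64(uint64_t reg, uint64_t pmuver)
{
    uint64_t t = reg;

    t = deposit64_sketch(t, 8, 4, pmuver);
    return t;
}
```

With a register value whose top half is non-zero, update_via_u32()
returns a value with the top 32 bits cleared, while update_via_u64()
preserves them. When the top half is zero to begin with, the two
agree, which is why the bug had no visible effect.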

Fixes: 3bec78447a958d481991
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20200423110915.10527-1-peter.maydell@linaro.org
---
 target/arm/cpu64.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
         u = FIELD_DP32(u, ID_MMFR4, XNX, 1); /* TTS2UXN */
         cpu->isar.id_mmfr4 = u;
 
-        u = cpu->isar.id_aa64dfr0;
-        u = FIELD_DP64(u, ID_AA64DFR0, PMUVER, 5); /* v8.4-PMU */
-        cpu->isar.id_aa64dfr0 = u;
+        t = cpu->isar.id_aa64dfr0;
+        t = FIELD_DP64(t, ID_AA64DFR0, PMUVER, 5); /* v8.4-PMU */
+        cpu->isar.id_aa64dfr0 = t;
 
         u = cpu->isar.id_dfr0;
         u = FIELD_DP32(u, ID_DFR0, PERFMON, 5); /* v8.4-PMU */
-- 
2.20.1

From: Philippe Mathieu-Daudé <f4bug@amsat.org>

MIDR_EL1 is a 64-bit system register with the top 32 bits being RES0.
Represent it in QEMU's ARMCPU struct with a uint64_t, not a
uint32_t.

This fixes an error when compiling with -Werror=conversion
because we were manipulating the register value using a
local uint64_t variable:

target/arm/cpu64.c: In function ‘aarch64_max_initfn’:
  target/arm/cpu64.c:628:21: error: conversion from ‘uint64_t’ {aka ‘long unsigned int’} to ‘uint32_t’ {aka ‘unsigned int’} may change value [-Werror=conversion]
    628 |         cpu->midr = t;
        |                     ^

and future-proofs us against a possible future architecture
change using some of the top 32 bits.

Suggested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Message-id: 20200428172634.29707-1-f4bug@amsat.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h | 2 +-
 target/arm/cpu.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ struct ARMCPU {
         uint64_t id_aa64dfr0;
         uint64_t id_aa64dfr1;
     } isar;
-    uint32_t midr;
+    uint64_t midr;
     uint32_t revidr;
     uint32_t reset_fpsid;
     uint32_t ctr;
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static const ARMCPUInfo arm_cpus[] = {
 static Property arm_cpu_properties[] = {
     DEFINE_PROP_BOOL("start-powered-off", ARMCPU, start_powered_off, false),
     DEFINE_PROP_UINT32("psci-conduit", ARMCPU, psci_conduit, 0),
-    DEFINE_PROP_UINT32("midr", ARMCPU, midr, 0),
+    DEFINE_PROP_UINT64("midr", ARMCPU, midr, 0),
     DEFINE_PROP_UINT64("mp-affinity", ARMCPU,
                         mp_affinity, ARM64_AFFINITY_INVALID),
     DEFINE_PROP_INT32("node-id", ARMCPU, node_id, CPU_UNSET_NUMA_NODE_ID),
-- 
2.20.1