Series comparison

-[Qemu-devel] [PULL 00/29] target-arm queue
+[PULL 00/52] target-arm queue
-First arm pullreq of 4.2...
+Big pullreq this week, though none of the new features are
 particularly earthshaking. Most of the bulk is from code cleanup
 patches from me or rth.
 thanks
 -- PMM
-The following changes since commit 27608c7c66bd923eb5e5faab80e795408cbe2b51:
+The following changes since commit b651b80822fa8cb66ca30087ac7fbc75507ae5d2:
-  Merge remote-tracking branch 'remotes/dgilbert/tags/pull-migration-20190814a' into staging (2019-08-16 12:00:18 +0100)
+  Merge remote-tracking branch 'remotes/vivier2/tags/linux-user-for-5.0-pull-request' into staging (2020-02-20 17:35:42 +0000)
 are available in the Git repository at:
-  https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20190816
+  https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20200221
-for you to fetch changes up to 664b7e3b97d6376f3329986c465b3782458b0f8b:
+for you to fetch changes up to 270a679b3f950d7c4c600f324aab8bff292d0971:
-  target/arm: Use tcg_gen_extrh_i64_i32 to extract the high word (2019-08-16 14:02:53 +0100)
+  target/arm: Add missing checks for fpsp_v2 (2020-02-21 12:54:25 +0000)
 ----------------------------------------------------------------
 target-arm queue:
- * target/arm: generate a custom MIDR for -cpu max
+ * aspeed/scu: Implement chip ID register
- * hw/misc/zynq_slcr: refactor to use standard register definition
+ * hw/misc/iotkit-secctl: Fix writing to 'PPC Interrupt Clear' register
- * Set ENET_BD_BDU in I.MX FEC controller
+ * mainstone: Make providing flash images non-mandatory
- * target/arm: Fix routing of singlestep exceptions
+ * z2: Make providing flash images non-mandatory
- * refactor a32/t32 decoder handling of PC
+ * Fix failures to flush SVE high bits after AdvSIMD INS/ZIP/UZP/TRN/TBL/TBX/EXT
- * minor optimisations/cleanups of some a32/t32 codegen
+ * Minor performance improvement: spend less time recalculating hflags values
- * target/arm/cpu64: Ensure kvm really supports aarch64=off
+ * Code cleanup to isar_feature function tests
- * target/arm/cpu: Ensure we can use the pmu with kvm
+ * Implement ARMv8.1-PMU and ARMv8.4-PMU extensions
- * target/arm: Minor cleanups preparatory to KVM SVE support
+ * Bugfix: correct handling of PMCR_EL0.LC bit
  * Bugfix: correct definition of PMCRDP
  * Correctly implement ACTLR2, HACTLR2
  * allwinner: Wire up USB ports
  * Vectorize emulation of USHL, SSHL, PMUL*
  * xilinx_spips: Correct the number of dummy cycles for the FAST_READ_4 cmd
  * sh4: Fix PCI ISA IO memory subregion
  * Code cleanup to use more isar_feature tests and fewer ARM_FEATURE_* tests
 ----------------------------------------------------------------
-Aaron Hill (1):
+Francisco Iglesias (1):
-      Set ENET_BD_BDU in I.MX FEC controller
+      xilinx_spips: Correct the number of dummy cycles for the FAST_READ_4 cmd
-Alex Bennée (1):
+Guenter Roeck (6):
-      target/arm: generate a custom MIDR for -cpu max
+      mainstone: Make providing flash images non-mandatory
       z2: Make providing flash images non-mandatory
       hw: usb: hcd-ohci: Move OHCISysBusState and TYPE_SYSBUS_OHCI to include file
       hcd-ehci: Introduce "companion-enable" sysbus property
       arm: allwinner: Wire up USB ports
       sh4: Fix PCI ISA IO memory subregion
-Andrew Jones (6):
+Joel Stanley (2):
-      target/arm/cpu64: Ensure kvm really supports aarch64=off
+      aspeed/scu: Create separate write callbacks
-      target/arm/cpu: Ensure we can use the pmu with kvm
+      aspeed/scu: Implement chip ID register
       target/arm/helper: zcr: Add build bug next to value range assumption
       target/arm/cpu: Use div-round-up to determine predicate register array size
       target/arm/kvm64: Fix error returns
       target/arm/kvm64: Move the get/put of fpsimd registers out
-Damien Hedde (1):
+Peter Maydell (21):
-      hw/misc/zynq_slcr: use standard register definition
+      target/arm: Add _aa32_ to isar_feature functions testing 32-bit ID registers
       target/arm: Check aa32_pan in take_aarch32_exception(), not aa64_pan
       target/arm: Add isar_feature_any_fp16 and document naming/usage conventions
       target/arm: Define and use any_predinv isar_feature test
       target/arm: Factor out PMU register definitions
       target/arm: Add and use FIELD definitions for ID_AA64DFR0_EL1
       target/arm: Use FIELD macros for clearing ID_DFR0 PERFMON field
       target/arm: Define an aa32_pmu_8_1 isar feature test function
       target/arm: Add _aa64_ and _any_ versions of pmu_8_1 isar checks
       target/arm: Stop assuming DBGDIDR always exists
       target/arm: Move DBGDIDR into ARMISARegisters
       target/arm: Read debug-related ID registers from KVM
       target/arm: Implement ARMv8.1-PMU extension
       target/arm: Implement ARMv8.4-PMU extension
       target/arm: Provide ARMv8.4-PMU in '-cpu max'
       target/arm: Correct definition of PMCRDP
       target/arm: Correct handling of PMCR_EL0.LC bit
       target/arm: Test correct register in aa32_pan and aa32_ats1e1 checks
       target/arm: Use isar_feature function for testing AA32HPD feature
       target/arm: Use FIELD_EX32 for testing 32-bit fields
       target/arm: Correctly implement ACTLR2, HACTLR2
-Peter Maydell (2):
+Philippe Mathieu-Daudé (1):
-      target/arm: Factor out 'generate singlestep exception' function
+      hw/misc/iotkit-secctl: Fix writing to 'PPC Interrupt Clear' register
       target/arm: Fix routing of singlestep exceptions
-Richard Henderson (18):
+Richard Henderson (21):
-      target/arm: Pass in pc to thumb_insn_is_16bit
+      target/arm: Flush high bits of sve register after AdvSIMD EXT
-      target/arm: Introduce pc_curr
+      target/arm: Flush high bits of sve register after AdvSIMD TBL/TBX
-      target/arm: Introduce read_pc
+      target/arm: Flush high bits of sve register after AdvSIMD ZIP/UZP/TRN
-      target/arm: Introduce add_reg_for_lit
+      target/arm: Flush high bits of sve register after AdvSIMD INS
-      target/arm: Remove redundant s->pc & ~1
+      target/arm: Use bit 55 explicitly for pauth
-      target/arm: Replace s->pc with s->base.pc_next
+      target/arm: Fix select for aa64_va_parameters_both
-      target/arm: Replace offset with pc in gen_exception_insn
+      target/arm: Remove ttbr1_valid check from get_phys_addr_lpae
-      target/arm: Replace offset with pc in gen_exception_internal_insn
+      target/arm: Split out aa64_va_parameter_tbi, aa64_va_parameter_tbid
-      target/arm: Remove offset argument to gen_exception_bkpt_insn
+      target/arm: Vectorize USHL and SSHL
-      target/arm: Use unallocated_encoding for aarch32
+      target/arm: Convert PMUL.8 to gvec
-      target/arm: Remove helper_double_saturate
+      target/arm: Convert PMULL.64 to gvec
-      target/arm: Use tcg_gen_extract_i32 for shifter_out_im
+      target/arm: Convert PMULL.8 to gvec
-      target/arm: Use tcg_gen_deposit_i32 for PKHBT, PKHTB
+      target/arm: Rename isar_feature_aa32_simd_r32
-      target/arm: Remove redundant shift tests
+      target/arm: Use isar_feature_aa32_simd_r32 more places
-      target/arm: Use ror32 instead of open-coding the operation
+      target/arm: Set MVFR0.FPSP for ARMv5 cpus
-      target/arm: Use tcg_gen_rotri_i32 for gen_swap_half
+      target/arm: Add isar_feature_aa32_simd_r16
-      target/arm: Simplify SMMLA, SMMLAR, SMMLS, SMMLSR
+      target/arm: Rename isar_feature_aa32_fpdp_v2
-      target/arm: Use tcg_gen_extrh_i64_i32 to extract the high word
+      target/arm: Add isar_feature_aa32_{fpsp_v2, fpsp_v3, fpdp_v3}
       target/arm: Perform fpdp_v2 check first
       target/arm: Replace ARM_FEATURE_VFP3 checks with fp{sp, dp}_v3
       target/arm: Add missing checks for fpsp_v2
- target/arm/cpu.h               |  13 +-
+ hw/usb/hcd-ohci.h              |  16 ++
- target/arm/helper.h            |   1 -
+ include/hw/arm/allwinner-a10.h |   6 +
- target/arm/kvm_arm.h           |  28 ++
+ target/arm/cpu.h               | 173 ++++++++++++---
- target/arm/translate-a64.h     |   4 +-
+ target/arm/helper-sve.h        |   2 +
- target/arm/translate.h         |  39 ++-
+ target/arm/helper.h            |  21 +-
- hw/misc/zynq_slcr.c            | 450 ++++++++++++++++----------------
+ target/arm/internals.h         |  47 +++-
- hw/net/imx_fec.c               |   4 +
+ target/arm/translate.h         |   6 +
- target/arm/cpu.c               |  30 ++-
+ hw/arm/allwinner-a10.c         |  43 ++++
- target/arm/cpu64.c             |  31 ++-
+ hw/arm/mainstone.c             |  11 +-
- target/arm/helper.c            |   7 +
+ hw/arm/z2.c                    |   6 -
- target/arm/kvm.c               |   7 +
+ hw/intc/armv7m_nvic.c          |  30 +--
- target/arm/kvm64.c             | 161 +++++++-----
+ hw/misc/aspeed_scu.c           |  93 ++++++--
- target/arm/op_helper.c         |  15 --
+ hw/misc/iotkit-secctl.c        |   2 +-
- target/arm/translate-a64.c     | 130 ++++------
+ hw/sh4/sh_pci.c                |  11 +-
- target/arm/translate-vfp.inc.c |  45 +---
+ hw/ssi/xilinx_spips.c          |   2 +-
- target/arm/translate.c         | 572 +++++++++++++++++------------------------
+ hw/usb/hcd-ehci-sysbus.c       |   2 +
-files changed, 771 insertions(+), 766 deletions(-)
+ hw/usb/hcd-ohci.c              |  15 --
  linux-user/arm/signal.c        |   4 +-
  linux-user/elfload.c           |   4 +-
  target/arm/arch_dump.c         |  11 +-
  target/arm/cpu.c               | 175 +++++++--------
  target/arm/cpu64.c             |  58 +++--
  target/arm/debug_helper.c      |   6 +-
  target/arm/helper.c            | 472 +++++++++++++++++++++++------------------
  target/arm/kvm32.c             |  25 +++
  target/arm/kvm64.c             |  46 ++++
  target/arm/m_helper.c          |  11 +-
  target/arm/machine.c           |   3 +-
  target/arm/neon_helper.c       | 117 ----------
  target/arm/pauth_helper.c      |   3 +-
  target/arm/translate-a64.c     |  92 ++++----
  target/arm/translate-vfp.inc.c | 263 ++++++++++++++---------
  target/arm/translate.c         | 356 ++++++++++++++++++++++++++-----
  target/arm/vec_helper.c        | 211 ++++++++++++++++++
  target/arm/vfp_helper.c        |   2 +-
 files changed, 1564 insertions(+), 781 deletions(-)

-[Qemu-devel] [PULL 22/29] target/arm/kvm64: Move the get/put of fpsimd registers out
+[PULL 01/52] aspeed/scu: Create separate write callbacks
-From: Andrew Jones <drjones@redhat.com>
+From: Joel Stanley <joel@jms.id.au>
-Move the getting/putting of the fpsimd registers out of
+This splits the common write callback into separate ast2400 and ast2500
-kvm_arch_get/put_registers() into their own helper functions
+implementations. This makes it clearer when implementing differing
-to prepare for alternatively getting/putting SVE registers.
+behaviour.
-No functional change.
+Signed-off-by: Joel Stanley <joel@jms.id.au>
+Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
-Signed-off-by: Andrew Jones <drjones@redhat.com>
+Reviewed-by: Cédric Le Goater <clg@kaod.org>
-Reviewed-by: Eric Auger <eric.auger@redhat.com>
+Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20200121013302.43839-2-joel@jms.id.au
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/kvm64.c | 148 +++++++++++++++++++++++++++------------------
+ hw/misc/aspeed_scu.c | 80 +++++++++++++++++++++++++++++++-------------
-file changed, 88 insertions(+), 60 deletions(-)
+file changed, 57 insertions(+), 23 deletions(-)
-diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
+diff --git a/hw/misc/aspeed_scu.c b/hw/misc/aspeed_scu.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/kvm64.c
+--- a/hw/misc/aspeed_scu.c
-+++ b/target/arm/kvm64.c
++++ b/hw/misc/aspeed_scu.c
-@@ -XXX,XX +XXX,XX @@ int kvm_arm_cpreg_level(uint64_t regidx)
+@@ -XXX,XX +XXX,XX @@ static uint64_t aspeed_scu_read(void *opaque, hwaddr offset, unsigned size)
- #define AARCH64_SIMD_CTRL_REG(x)   (KVM_REG_ARM64 | KVM_REG_SIZE_U32 | \
+     return s->regs[reg];
-                  KVM_REG_ARM_CORE | KVM_REG_ARM_CORE_REG(x))
+ }
-+static int kvm_arch_put_fpsimd(CPUState *cs)
+-static void aspeed_scu_write(void *opaque, hwaddr offset, uint64_t data,
 -                             unsigned size)
 +static void aspeed_ast2400_scu_write(void *opaque, hwaddr offset,
 +                                     uint64_t data, unsigned size)
 +{
-+    ARMCPU *cpu = ARM_CPU(cs);
++    AspeedSCUState *s = ASPEED_SCU(opaque);
-+    CPUARMState *env = &cpu->env;
++    int reg = TO_REG(offset);
 +    struct kvm_one_reg reg;
 +    uint32_t fpr;
 +    int i, ret;
 +
-+    for (i = 0; i < 32; i++) {
++    if (reg >= ASPEED_SCU_NR_REGS) {
-+        uint64_t *q = aa64_vfp_qreg(env, i);
++        qemu_log_mask(LOG_GUEST_ERROR,
-+#ifdef HOST_WORDS_BIGENDIAN
++                      "%s: Out-of-bounds write at offset 0x%" HWADDR_PRIx "\n",
-+        uint64_t fp_val[2] = { q[1], q[0] };
++                      __func__, offset);
-+        reg.addr = (uintptr_t)fp_val;
++        return;
 +#else
 +        reg.addr = (uintptr_t)q;
 +#endif
 +        reg.id = AARCH64_SIMD_CORE_REG(fp_regs.vregs[i]);
 +        ret = kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, &reg);
 +        if (ret) {
 +            return ret;
 +        }
 +    }
 +
-+    reg.addr = (uintptr_t)(&fpr);
++    if (reg > PROT_KEY && reg < CPU2_BASE_SEG1 &&
-+    fpr = vfp_get_fpsr(env);
++            !s->regs[PROT_KEY]) {
-+    reg.id = AARCH64_SIMD_CTRL_REG(fp_regs.fpsr);
++        qemu_log_mask(LOG_GUEST_ERROR, "%s: SCU is locked!\n", __func__);
 +    ret = kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, &reg);
 +    if (ret) {
 +        return ret;
 +    }
 +
-+    reg.addr = (uintptr_t)(&fpr);
++    trace_aspeed_scu_write(offset, size, data);
-+    fpr = vfp_get_fpcr(env);
++
-+    reg.id = AARCH64_SIMD_CTRL_REG(fp_regs.fpcr);
++    switch (reg) {
-+    ret = kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, &reg);
++    case PROT_KEY:
-+    if (ret) {
++        s->regs[reg] = (data == ASPEED_SCU_PROT_KEY) ? 1 : 0;
-+        return ret;
++        return;
 +    case SILICON_REV:
 +    case FREQ_CNTR_EVAL:
 +    case VGA_SCRATCH1 ... VGA_SCRATCH8:
 +    case RNG_DATA:
 +    case FREE_CNTR4:
 +    case FREE_CNTR4_EXT:
 +        qemu_log_mask(LOG_GUEST_ERROR,
 +                      "%s: Write to read-only offset 0x%" HWADDR_PRIx "\n",
 +                      __func__, offset);
 +        return;
 +    }
 +
-+    return 0;
++    s->regs[reg] = data;
 +}
 +
- int kvm_arch_put_registers(CPUState *cs, int level)
++static void aspeed_ast2500_scu_write(void *opaque, hwaddr offset,
 +                                     uint64_t data, unsigned size)
  {
-     struct kvm_one_reg reg;
+     AspeedSCUState *s = ASPEED_SCU(opaque);
--    uint32_t fpr;
+     int reg = TO_REG(offset);
-     uint64_t val;
+@@ -XXX,XX +XXX,XX @@ static void aspeed_scu_write(void *opaque, hwaddr offset, uint64_t data,
--    int i;
+     case PROT_KEY:
--    int ret;
+         s->regs[reg] = (data == ASPEED_SCU_PROT_KEY) ? 1 : 0;
-+    int i, ret;
+         return;
-     unsigned int el;
+-    case CLK_SEL:
+-        s->regs[reg] = data;
-     ARMCPU *cpu = ARM_CPU(cs);
+-        break;
-@@ -XXX,XX +XXX,XX @@ int kvm_arch_put_registers(CPUState *cs, int level)
+     case HW_STRAP1:
-         }
+-        if (ASPEED_IS_AST2500(s->regs[SILICON_REV])) {
-     }
+-            s->regs[HW_STRAP1] |= data;
+-            return;
 -    /* Advanced SIMD and FP registers. */
 -    for (i = 0; i < 32; i++) {
 -        uint64_t *q = aa64_vfp_qreg(env, i);
 -#ifdef HOST_WORDS_BIGENDIAN
 -        uint64_t fp_val[2] = { q[1], q[0] };
 -        reg.addr = (uintptr_t)fp_val;
 -#else
 -        reg.addr = (uintptr_t)q;
 -#endif
 -        reg.id = AARCH64_SIMD_CORE_REG(fp_regs.vregs[i]);
 -        ret = kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, &reg);
 -        if (ret) {
 -            return ret;
 -        }
--    }
+-        /* Jump to assignment below */
--
+-        break;
--    reg.addr = (uintptr_t)(&fpr);
++        s->regs[HW_STRAP1] |= data;
--    fpr = vfp_get_fpsr(env);
++        return;
--    reg.id = AARCH64_SIMD_CTRL_REG(fp_regs.fpsr);
+     case SILICON_REV:
--    ret = kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, &reg);
+-        if (ASPEED_IS_AST2500(s->regs[SILICON_REV])) {
--    if (ret) {
+-            s->regs[HW_STRAP1] &= ~data;
--        return ret;
+-        } else {
--    }
+-            qemu_log_mask(LOG_GUEST_ERROR,
--
+-                          "%s: Write to read-only offset 0x%" HWADDR_PRIx "\n",
--    fpr = vfp_get_fpcr(env);
+-                          __func__, offset);
--    reg.id = AARCH64_SIMD_CTRL_REG(fp_regs.fpcr);
+-        }
--    ret = kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, &reg);
+-        /* Avoid assignment below, we've handled everything */
-+    ret = kvm_arch_put_fpsimd(cs);
++        s->regs[HW_STRAP1] &= ~data;
-     if (ret) {
+         return;
-         return ret;
+     case FREQ_CNTR_EVAL:
-     }
+     case VGA_SCRATCH1 ... VGA_SCRATCH8:
-@@ -XXX,XX +XXX,XX @@ int kvm_arch_put_registers(CPUState *cs, int level)
+@@ -XXX,XX +XXX,XX @@ static void aspeed_scu_write(void *opaque, hwaddr offset, uint64_t data,
-     return ret;
+     s->regs[reg] = data;
  }
-+static int kvm_arch_get_fpsimd(CPUState *cs)
+-static const MemoryRegionOps aspeed_scu_ops = {
-+{
++static const MemoryRegionOps aspeed_ast2400_scu_ops = {
-+    ARMCPU *cpu = ARM_CPU(cs);
+     .read = aspeed_scu_read,
-+    CPUARMState *env = &cpu->env;
+-    .write = aspeed_scu_write,
-+    struct kvm_one_reg reg;
++    .write = aspeed_ast2400_scu_write,
-+    uint32_t fpr;
++    .endianness = DEVICE_LITTLE_ENDIAN,
-+    int i, ret;
++    .valid.min_access_size = 4,
 +    .valid.max_access_size = 4,
 +    .valid.unaligned = false,
 +};
 +
-+    for (i = 0; i < 32; i++) {
++static const MemoryRegionOps aspeed_ast2500_scu_ops = {
-+        uint64_t *q = aa64_vfp_qreg(env, i);
++    .read = aspeed_scu_read,
-+        reg.id = AARCH64_SIMD_CORE_REG(fp_regs.vregs[i]);
++    .write = aspeed_ast2500_scu_write,
-+        reg.addr = (uintptr_t)q;
+     .endianness = DEVICE_LITTLE_ENDIAN,
-+        ret = kvm_vcpu_ioctl(cs, KVM_GET_ONE_REG, &reg);
+     .valid.min_access_size = 4,
-+        if (ret) {
+     .valid.max_access_size = 4,
-+            return ret;
+@@ -XXX,XX +XXX,XX @@ static void aspeed_2400_scu_class_init(ObjectClass *klass, void *data)
-+        } else {
+     asc->calc_hpll = aspeed_2400_scu_calc_hpll;
-+#ifdef HOST_WORDS_BIGENDIAN
+     asc->apb_divider = 2;
-+            uint64_t t;
+     asc->nr_regs = ASPEED_SCU_NR_REGS;
-+            t = q[0], q[0] = q[1], q[1] = t;
+-    asc->ops = &aspeed_scu_ops;
-+#endif
++    asc->ops = &aspeed_ast2400_scu_ops;
-+        }
+ }
-+    }
-+
+ static const TypeInfo aspeed_2400_scu_info = {
-+    reg.addr = (uintptr_t)(&fpr);
+@@ -XXX,XX +XXX,XX @@ static void aspeed_2500_scu_class_init(ObjectClass *klass, void *data)
-+    reg.id = AARCH64_SIMD_CTRL_REG(fp_regs.fpsr);
+     asc->calc_hpll = aspeed_2500_scu_calc_hpll;
-+    ret = kvm_vcpu_ioctl(cs, KVM_GET_ONE_REG, &reg);
+     asc->apb_divider = 4;
-+    if (ret) {
+     asc->nr_regs = ASPEED_SCU_NR_REGS;
-+        return ret;
+-    asc->ops = &aspeed_scu_ops;
-+    }
++    asc->ops = &aspeed_ast2500_scu_ops;
-+    vfp_set_fpsr(env, fpr);
+ }
-+
-+    reg.addr = (uintptr_t)(&fpr);
+ static const TypeInfo aspeed_2500_scu_info = {
 +    reg.id = AARCH64_SIMD_CTRL_REG(fp_regs.fpcr);
 +    ret = kvm_vcpu_ioctl(cs, KVM_GET_ONE_REG, &reg);
 +    if (ret) {
 +        return ret;
 +    }
 +    vfp_set_fpcr(env, fpr);
 +
 +    return 0;
 +}
 +
  int kvm_arch_get_registers(CPUState *cs)
  {
      struct kvm_one_reg reg;
      uint64_t val;
 -    uint32_t fpr;
      unsigned int el;
 -    int i;
 -    int ret;
 +    int i, ret;
      ARMCPU *cpu = ARM_CPU(cs);
      CPUARMState *env = &cpu->env;
@@ -XXX,XX +XXX,XX @@ int kvm_arch_get_registers(CPUState *cs)
          env->spsr = env->banked_spsr[i];
      }
 -    /* Advanced SIMD and FP registers */
 -    for (i = 0; i < 32; i++) {
 -        uint64_t *q = aa64_vfp_qreg(env, i);
 -        reg.id = AARCH64_SIMD_CORE_REG(fp_regs.vregs[i]);
 -        reg.addr = (uintptr_t)q;
 -        ret = kvm_vcpu_ioctl(cs, KVM_GET_ONE_REG, &reg);
 -        if (ret) {
 -            return ret;
 -        } else {
 -#ifdef HOST_WORDS_BIGENDIAN
 -            uint64_t t;
 -            t = q[0], q[0] = q[1], q[1] = t;
 -#endif
 -        }
 -    }
 -
 -    reg.addr = (uintptr_t)(&fpr);
 -    reg.id = AARCH64_SIMD_CTRL_REG(fp_regs.fpsr);
 -    ret = kvm_vcpu_ioctl(cs, KVM_GET_ONE_REG, &reg);
 +    ret = kvm_arch_get_fpsimd(cs);
      if (ret) {
          return ret;
      }
 -    vfp_set_fpsr(env, fpr);
 -
 -    reg.id = AARCH64_SIMD_CTRL_REG(fp_regs.fpcr);
 -    ret = kvm_vcpu_ioctl(cs, KVM_GET_ONE_REG, &reg);
 -    if (ret) {
 -        return ret;
 -    }
 -    vfp_set_fpcr(env, fpr);
      ret = kvm_get_vcpu_events(cpu);
      if (ret) {
 --
 .20.1
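
Editorial note on [PULL 01/52]: the commit message describes replacing one shared write callback with a callback per SoC variant. The block below is a minimal standalone sketch of that pattern, not the QEMU code. Every name in it (`ScuSketch`, `scu_common`, `scu_write_2400`, `scu_write_2500`, `SKETCH_KEY`, the register indices) is a hypothetical stand-in. The shared prologue is factored into a helper here for brevity, whereas the patch keeps those checks inline in each callback. The sketch also assumes that a write to a locked SCU is dropped, which the hunks above do not show unambiguously.

```c
#include <assert.h>   /* used only by the checks in the usage note */
#include <stdint.h>

/* Hypothetical register indices and unlock key; not QEMU's real values. */
enum { PROT_KEY, SILICON_REV, HW_STRAP1, CLK_SEL, NR_REGS };
#define SKETCH_KEY 0xA5A5A5A5u

typedef struct ScuSketch ScuSketch;
typedef void (*scu_write_fn)(ScuSketch *s, unsigned reg, uint32_t data);

struct ScuSketch {
    uint32_t regs[NR_REGS];
    scu_write_fn write;      /* selected once per SoC variant */
};

/*
 * Shared prologue: bounds check, key register, lock check.
 * Returns 1 when the write has been fully handled, 0 otherwise.
 */
int scu_common(ScuSketch *s, unsigned reg, uint32_t data)
{
    if (reg >= NR_REGS) {
        return 1;                       /* out-of-bounds write, ignored */
    }
    if (reg == PROT_KEY) {
        s->regs[reg] = (data == SKETCH_KEY) ? 1 : 0;
        return 1;
    }
    if (!s->regs[PROT_KEY]) {
        return 1;                       /* assumption: locked writes are dropped */
    }
    return 0;
}

/* Older variant: SILICON_REV is read-only, everything else is stored. */
void scu_write_2400(ScuSketch *s, unsigned reg, uint32_t data)
{
    if (scu_common(s, reg, data)) {
        return;
    }
    if (reg == SILICON_REV) {
        return;                         /* read-only on this variant */
    }
    s->regs[reg] = data;
}

/* Newer variant: HW_STRAP1 sets strap bits, SILICON_REV clears them. */
void scu_write_2500(ScuSketch *s, unsigned reg, uint32_t data)
{
    if (scu_common(s, reg, data)) {
        return;
    }
    switch (reg) {
    case HW_STRAP1:
        s->regs[HW_STRAP1] |= data;
        return;
    case SILICON_REV:
        s->regs[HW_STRAP1] &= ~data;
        return;
    default:
        s->regs[reg] = data;
    }
}
```

A caller would initialise `ScuSketch` with the callback for its variant and route every guest write through `s.write(&s, reg, data)`, so variant-specific behaviour never needs a runtime revision test inside the callback.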

-New patch
+[PULL 02/52] aspeed/scu: Implement chip ID register
+From: Joel Stanley <joel@jms.id.au>
+This returns a fixed but non-zero value for the chip id.
+Signed-off-by: Joel Stanley <joel@jms.id.au>
+Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
+Reviewed-by: Cédric Le Goater <clg@kaod.org>
+Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+Message-id: 20200121013302.43839-3-joel@jms.id.au
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+---
+ hw/misc/aspeed_scu.c | 13 +++++++++++++
+file changed, 13 insertions(+)
+diff --git a/hw/misc/aspeed_scu.c b/hw/misc/aspeed_scu.c
+index XXXXXXX..XXXXXXX 100644
+--- a/hw/misc/aspeed_scu.c
++++ b/hw/misc/aspeed_scu.c
+@@ -XXX,XX +XXX,XX @@
+ #define CPU2_BASE_SEG4       TO_REG(0x110)
+ #define CPU2_BASE_SEG5       TO_REG(0x114)
+ #define CPU2_CACHE_CTRL      TO_REG(0x118)
++#define CHIP_ID0             TO_REG(0x150)
++#define CHIP_ID1             TO_REG(0x154)
+ #define UART_HPLL_CLK        TO_REG(0x160)
+ #define PCIE_CTRL            TO_REG(0x180)
+ #define BMC_MMIO_CTRL        TO_REG(0x184)
+@@ -XXX,XX +XXX,XX @@
+ #define AST2600_HW_STRAP2_PROT    TO_REG(0x518)
+ #define AST2600_RNG_CTRL          TO_REG(0x524)
+ #define AST2600_RNG_DATA          TO_REG(0x540)
++#define AST2600_CHIP_ID0          TO_REG(0x5B0)
++#define AST2600_CHIP_ID1          TO_REG(0x5B4)
+ #define AST2600_CLK TO_REG(0x40)
+@@ -XXX,XX +XXX,XX @@ static const uint32_t ast2500_a1_resets[ASPEED_SCU_NR_REGS] = {
+      [CPU2_BASE_SEG1]  = 0x80000000U,
+      [CPU2_BASE_SEG4]  = 0x1E600000U,
+      [CPU2_BASE_SEG5]  = 0xC0000000U,
++     [CHIP_ID0]        = 0x1234ABCDU,
++     [CHIP_ID1]        = 0x88884444U,
+      [UART_HPLL_CLK]   = 0x00001903U,
+      [PCIE_CTRL]       = 0x0000007BU,
+      [BMC_DEV_ID]      = 0x00002402U
+@@ -XXX,XX +XXX,XX @@ static void aspeed_ast2500_scu_write(void *opaque, hwaddr offset,
+     case RNG_DATA:
+     case FREE_CNTR4:
+     case FREE_CNTR4_EXT:
++    case CHIP_ID0:
++    case CHIP_ID1:
+         qemu_log_mask(LOG_GUEST_ERROR,
+                       "%s: Write to read-only offset 0x%" HWADDR_PRIx "\n",
+                       __func__, offset);
+@@ -XXX,XX +XXX,XX @@ static void aspeed_ast2600_scu_write(void *opaque, hwaddr offset,
+     case AST2600_RNG_DATA:
+     case AST2600_SILICON_REV:
+     case AST2600_SILICON_REV2:
++    case AST2600_CHIP_ID0:
++    case AST2600_CHIP_ID1:
+         /* Add read only registers here */
+         qemu_log_mask(LOG_GUEST_ERROR,
+                       "%s: Write to read-only offset 0x%" HWADDR_PRIx "\n",
+@@ -XXX,XX +XXX,XX @@ static const uint32_t ast2600_a0_resets[ASPEED_AST2600_SCU_NR_REGS] = {
+     [AST2600_CLK_STOP_CTRL2]    = 0xFFF0FFF0,
+     [AST2600_SDRAM_HANDSHAKE]   = 0x00000040,  /* SoC completed DRAM init */
+     [AST2600_HPLL_PARAM]        = 0x1000405F,
++    [AST2600_CHIP_ID0]          = 0x1234ABCD,
++    [AST2600_CHIP_ID1]          = 0x88884444,
++
+ };
+ static void aspeed_ast2600_scu_reset(DeviceState *dev)
+--
+.20.1
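
Editorial note on [PULL 02/52]: the patch makes the chip ID registers read back a fixed non-zero value by giving them entries in the reset tables and listing them as read-only in the write callbacks. The block below is a standalone sketch of the reset-table half of that idea. `sketch_resets`, `sketch_reset`, `sketch_read` and `SKETCH_NR_REGS` are hypothetical names, and the `TO_REG` definition (byte offset divided by four) is an assumption, since the macro body is not shown in the hunks. Only the offsets 0x150/0x154 and the two ID values come from the patch.

```c
#include <assert.h>   /* used only by the checks in the usage note */
#include <stdint.h>
#include <string.h>

#define TO_REG(offset)  ((offset) >> 2)   /* assumed: byte offset to word index */
#define CHIP_ID0        TO_REG(0x150)
#define CHIP_ID1        TO_REG(0x154)
#define SKETCH_NR_REGS  (0x1A8 >> 2)      /* hypothetical register-file size */

/* Reset values; entries not listed are zero-initialised. */
static const uint32_t sketch_resets[SKETCH_NR_REGS] = {
    [CHIP_ID0] = 0x1234ABCDU,
    [CHIP_ID1] = 0x88884444U,
};

/* Load the reset values into a register file of SKETCH_NR_REGS words. */
void sketch_reset(uint32_t *regs)
{
    memcpy(regs, sketch_resets, sizeof(sketch_resets));
}

/* Read a register by byte offset; out-of-range offsets read as zero. */
uint32_t sketch_read(const uint32_t *regs, uint32_t offset)
{
    unsigned reg = TO_REG(offset);

    return reg < SKETCH_NR_REGS ? regs[reg] : 0;
}
```

After `sketch_reset(regs)`, a read at byte offset 0x150 returns 0x1234ABCD and a read at 0x154 returns 0x88884444, which is the "fixed but non-zero" behaviour the commit message describes.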

-New patch
+[PULL 03/52] hw/misc/iotkit-secctl: Fix writing to 'PPC Interrupt Clear' register
+From: Philippe Mathieu-Daudé <f4bug@amsat.org>
+Fix warning reported by Clang static code analyzer:
+    CC      hw/misc/iotkit-secctl.o
+  hw/misc/iotkit-secctl.c:343:9: warning: Value stored to 'value' is never read
+          value &= 0x00f000f3;
+          ^        ~~~~~~~~~~
+Fixes: b3717c23e1c
+Reported-by: Clang Static Analyzer
+Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
+Message-id: 20200217132922.24607-1-f4bug@amsat.org
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+---
+ hw/misc/iotkit-secctl.c | 2 +-
+file changed, 1 insertion(+), 1 deletion(-)
+diff --git a/hw/misc/iotkit-secctl.c b/hw/misc/iotkit-secctl.c
+index XXXXXXX..XXXXXXX 100644
+--- a/hw/misc/iotkit-secctl.c
++++ b/hw/misc/iotkit-secctl.c
+@@ -XXX,XX +XXX,XX @@ static MemTxResult iotkit_secctl_s_write(void *opaque, hwaddr addr,
+         qemu_set_irq(s->sec_resp_cfg, s->secrespcfg);
+         break;
+     case A_SECPPCINTCLR:
+-        value &= 0x00f000f3;
++        s->secppcintstat &= ~(value & 0x00f000f3);
+         foreach_ppc(s, iotkit_secctl_ppc_update_irq_clear);
+         break;
+     case A_SECPPCINTEN:
+--
+.20.1
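
Editorial note on [PULL 03/52]: before the fix, the masked `value` was computed and then discarded, so the guest write had no effect on the latched status. The replacement line is a write-one-to-clear update. The block below is a minimal standalone sketch of that idiom, not the QEMU code. `w1c_clear` and `PPC_INT_MASK` are hypothetical names introduced here, while the mask value 0x00f000f3 and the shape of the expression are taken from the patch.

```c
#include <assert.h>   /* used only by the checks in the usage note */
#include <stdint.h>

/* Bits of the clear register that are implemented; value from the patch. */
#define PPC_INT_MASK 0x00f000f3u

/*
 * Hypothetical helper: return the new status after the guest writes
 * 'value' to a write-one-to-clear register. Status bits whose position
 * is set in 'value' and lies inside the mask are cleared; all other
 * status bits are left unchanged.
 */
uint32_t w1c_clear(uint32_t status, uint32_t value)
{
    return status & ~(value & PPC_INT_MASK);
}
```

For example, `assert(w1c_clear(0x00f000f3u, 0x00000003u) == 0x00f000f0u)` holds: the two low status bits are cleared and the rest remain pending. Writing zero leaves the status untouched, and bits outside the mask cannot be cleared by any write.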

-New patch
+[PULL 04/52] mainstone: Make providing flash images non-mandatory
+From: Guenter Roeck <linux@roeck-us.net>
+Up to now, the mainstone machine only boots if two flash images are
+provided. This is not really necessary; the machine can boot from initrd
+or from SD without it. At the same time, having to provide dummy flash
+images is a nuisance and does not add any real value. Make it optional.
+Signed-off-by: Guenter Roeck <linux@roeck-us.net>
+Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+Message-id: 20200217210824.18513-1-linux@roeck-us.net
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+---
+ hw/arm/mainstone.c | 11 +----------
+file changed, 1 insertion(+), 10 deletions(-)
+diff --git a/hw/arm/mainstone.c b/hw/arm/mainstone.c
+index XXXXXXX..XXXXXXX 100644
+--- a/hw/arm/mainstone.c
++++ b/hw/arm/mainstone.c
+@@ -XXX,XX +XXX,XX @@ static void mainstone_common_init(MemoryRegion *address_space_mem,
+     /* There are two 32MiB flash devices on the board */
+     for (i = 0; i < 2; i ++) {
+         dinfo = drive_get(IF_PFLASH, 0, i);
+-        if (!dinfo) {
+-            if (qtest_enabled()) {
+-                break;
+-            }
+-            error_report("Two flash images must be given with the "
+-                         "'pflash' parameter");
+-            exit(1);
+-        }
+-
+         if (!pflash_cfi01_register(mainstone_flash_base[i],
+                                    i ? "mainstone.flash1" : "mainstone.flash0",
+                                    MAINSTONE_FLASH,
+-                                   blk_by_legacy_dinfo(dinfo),
++                                   dinfo ? blk_by_legacy_dinfo(dinfo) : NULL,
+                                    sector_len, 4, 0, 0, 0, 0, be)) {
+             error_report("Error registering flash memory");
+             exit(1);
+--
+.20.1

-New patch
+[PULL 05/52] z2: Make providing flash images non-mandatory
+From: Guenter Roeck <linux@roeck-us.net>
+Up to now, the z2 machine only boots if a flash image is provided.
+This is not really necessary; the machine can boot from initrd or from
+SD without it. At the same time, having to provide dummy flash images
+is a nuisance and does not add any real value. Make it optional.
+Signed-off-by: Guenter Roeck <linux@roeck-us.net>
+Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+Message-id: 20200217210903.18602-1-linux@roeck-us.net
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+---
+ hw/arm/z2.c | 6 ------
+file changed, 6 deletions(-)
+diff --git a/hw/arm/z2.c b/hw/arm/z2.c
+index XXXXXXX..XXXXXXX 100644
+--- a/hw/arm/z2.c
++++ b/hw/arm/z2.c
+@@ -XXX,XX +XXX,XX @@ static void z2_init(MachineState *machine)
+     be = 0;
+ #endif
+     dinfo = drive_get(IF_PFLASH, 0, 0);
+-    if (!dinfo && !qtest_enabled()) {
+-        error_report("Flash image must be given with the "
+-                     "'pflash' parameter");
+-        exit(1);
+-    }
+-
+     if (!pflash_cfi01_register(Z2_FLASH_BASE, "z2.flash0", Z2_FLASH_SIZE,
+                                dinfo ? blk_by_legacy_dinfo(dinfo) : NULL,
+                                sector_len, 4, 0, 0, 0, 0, be)) {
+--
+.20.1

-[Qemu-devel] [PULL 29/29] target/arm: Use tcg_gen_extrh_i64_i32 to extract the high word
+[PULL 06/52] target/arm: Flush high bits of sve register after AdvSIMD EXT
 From: Richard Henderson <richard.henderson@linaro.org>
-Separate shift + extract low will result in one extra insn
+Writes to AdvSIMD registers flush the bits above 128.
 for hosts like RISC-V, MIPS, and Sparc.
+Buglink: https://bugs.launchpad.net/bugs/1863247
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190808202616.13782-8-richard.henderson@linaro.org
+Message-id: 20200214194643.23317-2-richard.henderson@linaro.org
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate.c | 18 ++++++------------
+ target/arm/translate-a64.c | 1 +
-file changed, 6 insertions(+), 12 deletions(-)
+file changed, 1 insertion(+)
-diff --git a/target/arm/translate.c b/target/arm/translate.c
+diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.c
+--- a/target/arm/translate-a64.c
-+++ b/target/arm/translate.c
++++ b/target/arm/translate-a64.c
-@@ -XXX,XX +XXX,XX @@ static int disas_iwmmxt_insn(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static void disas_simd_ext(DisasContext *s, uint32_t insn)
-             if (insn & ARM_CP_RW_BIT) {                         /* TMRRC */
+     tcg_temp_free_i64(tcg_resl);
-                 iwmmxt_load_reg(cpu_V0, wrd);
+     write_vec_element(s, tcg_resh, rd, 1, MO_64);
-                 tcg_gen_extrl_i64_i32(cpu_R[rdlo], cpu_V0);
+     tcg_temp_free_i64(tcg_resh);
--                tcg_gen_shri_i64(cpu_V0, cpu_V0, 32);
++    clear_vec_high(s, true, rd);
 -                tcg_gen_extrl_i64_i32(cpu_R[rdhi], cpu_V0);
 +                tcg_gen_extrh_i64_i32(cpu_R[rdhi], cpu_V0);
              } else {                                    /* TMCRR */
                  tcg_gen_concat_i32_i64(cpu_V0, cpu_R[rdlo], cpu_R[rdhi]);
                  iwmmxt_store_reg(cpu_V0, wrd);
@@ -XXX,XX +XXX,XX @@ static int disas_dsp_insn(DisasContext *s, uint32_t insn)
          if (insn & ARM_CP_RW_BIT) {                     /* MRA */
              iwmmxt_load_reg(cpu_V0, acc);
              tcg_gen_extrl_i64_i32(cpu_R[rdlo], cpu_V0);
 -            tcg_gen_shri_i64(cpu_V0, cpu_V0, 32);
 -            tcg_gen_extrl_i64_i32(cpu_R[rdhi], cpu_V0);
 +            tcg_gen_extrh_i64_i32(cpu_R[rdhi], cpu_V0);
              tcg_gen_andi_i32(cpu_R[rdhi], cpu_R[rdhi], (1 << (40 - 32)) - 1);
          } else {                                        /* MAR */
              tcg_gen_concat_i32_i64(cpu_V0, cpu_R[rdlo], cpu_R[rdhi]);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                                  gen_helper_neon_narrow_high_u16(tmp, cpu_V0);
                                  break;
                              case 2:
 -                                tcg_gen_shri_i64(cpu_V0, cpu_V0, 32);
 -                                tcg_gen_extrl_i64_i32(tmp, cpu_V0);
 +                                tcg_gen_extrh_i64_i32(tmp, cpu_V0);
                                  break;
                              default: abort();
                              }
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                                  break;
                              case 2:
                                  tcg_gen_addi_i64(cpu_V0, cpu_V0, 1u << 31);
 -                                tcg_gen_shri_i64(cpu_V0, cpu_V0, 32);
 -                                tcg_gen_extrl_i64_i32(tmp, cpu_V0);
 +                                tcg_gen_extrh_i64_i32(tmp, cpu_V0);
                                  break;
                              default: abort();
                              }
@@ -XXX,XX +XXX,XX @@ static int disas_coproc_insn(DisasContext *s, uint32_t insn)
                  tmp = tcg_temp_new_i32();
                  tcg_gen_extrl_i64_i32(tmp, tmp64);
                  store_reg(s, rt, tmp);
 -                tcg_gen_shri_i64(tmp64, tmp64, 32);
                  tmp = tcg_temp_new_i32();
 -                tcg_gen_extrl_i64_i32(tmp, tmp64);
 +                tcg_gen_extrh_i64_i32(tmp, tmp64);
                  tcg_temp_free_i64(tmp64);
                  store_reg(s, rt2, tmp);
              } else {
@@ -XXX,XX +XXX,XX @@ static void gen_storeq_reg(DisasContext *s, int rlow, int rhigh, TCGv_i64 val)
      tcg_gen_extrl_i64_i32(tmp, val);
      store_reg(s, rlow, tmp);
      tmp = tcg_temp_new_i32();
 -    tcg_gen_shri_i64(val, val, 32);
 -    tcg_gen_extrl_i64_i32(tmp, val);
 +    tcg_gen_extrh_i64_i32(tmp, val);
      store_reg(s, rhigh, tmp);
  }
+ /* TBL/TBX
 --
 .20.1

-[Qemu-devel] [PULL 28/29] target/arm: Simplify SMMLA, SMMLAR, SMMLS, SMMLSR
+[PULL 07/52] target/arm: Flush high bits of sve register after AdvSIMD TBL/TBX
 From: Richard Henderson <richard.henderson@linaro.org>
-All of the inputs to these instructions are 32-bits.  Rather than
+Writes to AdvSIMD registers flush the bits above 128.
 extend each input to 64-bits and then extract the high 32-bits of
 the output, use tcg_gen_muls2_i32 and other 32-bit generator functions.
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190808202616.13782-7-richard.henderson@linaro.org
+Message-id: 20200214194643.23317-3-richard.henderson@linaro.org
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate.c | 72 +++++++++++++++---------------------------
+ target/arm/translate-a64.c | 1 +
-file changed, 26 insertions(+), 46 deletions(-)
+file changed, 1 insertion(+)
-diff --git a/target/arm/translate.c b/target/arm/translate.c
+diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.c
+--- a/target/arm/translate-a64.c
-+++ b/target/arm/translate.c
++++ b/target/arm/translate-a64.c
-@@ -XXX,XX +XXX,XX @@ static void gen_revsh(TCGv_i32 var)
+@@ -XXX,XX +XXX,XX @@ static void disas_simd_tb(DisasContext *s, uint32_t insn)
-     tcg_gen_ext16s_i32(var, var);
+     tcg_temp_free_i64(tcg_resl);
      write_vec_element(s, tcg_resh, rd, 1, MO_64);
      tcg_temp_free_i64(tcg_resh);
 +    clear_vec_high(s, true, rd);
  }
--/* Return (b << 32) + a. Mark inputs as dead */
+ /* ZIP/UZP/TRN
 -static TCGv_i64 gen_addq_msw(TCGv_i64 a, TCGv_i32 b)
 -{
 -    TCGv_i64 tmp64 = tcg_temp_new_i64();
 -
 -    tcg_gen_extu_i32_i64(tmp64, b);
 -    tcg_temp_free_i32(b);
 -    tcg_gen_shli_i64(tmp64, tmp64, 32);
 -    tcg_gen_add_i64(a, tmp64, a);
 -
 -    tcg_temp_free_i64(tmp64);
 -    return a;
 -}
 -
 -/* Return (b << 32) - a. Mark inputs as dead. */
 -static TCGv_i64 gen_subq_msw(TCGv_i64 a, TCGv_i32 b)
 -{
 -    TCGv_i64 tmp64 = tcg_temp_new_i64();
 -
 -    tcg_gen_extu_i32_i64(tmp64, b);
 -    tcg_temp_free_i32(b);
 -    tcg_gen_shli_i64(tmp64, tmp64, 32);
 -    tcg_gen_sub_i64(a, tmp64, a);
 -
 -    tcg_temp_free_i64(tmp64);
 -    return a;
 -}
 -
  /* 32x32->64 multiply.  Marks inputs as dead.  */
  static TCGv_i64 gen_mulu_i64_i32(TCGv_i32 a, TCGv_i32 b)
  {
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
                             (SMMUL, SMMLA, SMMLS) */
                          tmp = load_reg(s, rm);
                          tmp2 = load_reg(s, rs);
 -                        tmp64 = gen_muls_i64_i32(tmp, tmp2);
 +                        tcg_gen_muls2_i32(tmp2, tmp, tmp, tmp2);
                          if (rd != 15) {
 -                            tmp = load_reg(s, rd);
 +                            tmp3 = load_reg(s, rd);
                              if (insn & (1 << 6)) {
 -                                tmp64 = gen_subq_msw(tmp64, tmp);
 +                                tcg_gen_sub_i32(tmp, tmp, tmp3);
                              } else {
 -                                tmp64 = gen_addq_msw(tmp64, tmp);
 +                                tcg_gen_add_i32(tmp, tmp, tmp3);
                              }
 +                            tcg_temp_free_i32(tmp3);
                          }
                          if (insn & (1 << 5)) {
 -                            tcg_gen_addi_i64(tmp64, tmp64, 0x80000000u);
 +                            /*
 +                             * Adding 0x80000000 to the 64-bit quantity
 +                             * means that we have carry in to the high
 +                             * word when the low word has the high bit set.
 +                             */
 +                            tcg_gen_shri_i32(tmp2, tmp2, 31);
 +                            tcg_gen_add_i32(tmp, tmp, tmp2);
                          }
 -                        tcg_gen_shri_i64(tmp64, tmp64, 32);
 -                        tmp = tcg_temp_new_i32();
 -                        tcg_gen_extrl_i64_i32(tmp, tmp64);
 -                        tcg_temp_free_i64(tmp64);
 +                        tcg_temp_free_i32(tmp2);
                          store_reg(s, rn, tmp);
                          break;
                      case 0:
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
                    }
                  break;
              case 5: case 6: /* 32 * 32 -> 32msb (SMMUL, SMMLA, SMMLS) */
 -                tmp64 = gen_muls_i64_i32(tmp, tmp2);
 +                tcg_gen_muls2_i32(tmp2, tmp, tmp, tmp2);
                  if (rs != 15) {
 -                    tmp = load_reg(s, rs);
 +                    tmp3 = load_reg(s, rs);
                      if (insn & (1 << 20)) {
 -                        tmp64 = gen_addq_msw(tmp64, tmp);
 +                        tcg_gen_add_i32(tmp, tmp, tmp3);
                      } else {
 -                        tmp64 = gen_subq_msw(tmp64, tmp);
 +                        tcg_gen_sub_i32(tmp, tmp, tmp3);
                      }
 +                    tcg_temp_free_i32(tmp3);
                  }
                  if (insn & (1 << 4)) {
 -                    tcg_gen_addi_i64(tmp64, tmp64, 0x80000000u);
 +                    /*
 +                     * Adding 0x80000000 to the 64-bit quantity
 +                     * means that we have carry in to the high
 +                     * word when the low word has the high bit set.
 +                     */
 +                    tcg_gen_shri_i32(tmp2, tmp2, 31);
 +                    tcg_gen_add_i32(tmp, tmp, tmp2);
                  }
 -                tcg_gen_shri_i64(tmp64, tmp64, 32);
 -                tmp = tcg_temp_new_i32();
 -                tcg_gen_extrl_i64_i32(tmp, tmp64);
 -                tcg_temp_free_i64(tmp64);
 +                tcg_temp_free_i32(tmp2);
                  break;
              case 7: /* Unsigned sum of absolute differences.  */
                  gen_helper_usad8(tmp, tmp, tmp2);
 --
 .20.1
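
The rounding comment in the SMMLAR/SMMLSR hunks above can be checked in plain C. The sketch below is not QEMU code: both function names are hypothetical, and it only models the arithmetic claim that adding 0x80000000 to the 64-bit value carries into the high word exactly when bit 31 of the low word is set.

```c
#include <stdint.h>

/*
 * Reference form (hypothetical helper): add the rounding constant to the
 * full 64-bit quantity, then keep the high 32 bits.
 */
static uint32_t round_high_64(uint32_t hi, uint32_t lo)
{
    uint64_t v = ((uint64_t)hi << 32) | lo;

    v += 0x80000000u;
    return (uint32_t)(v >> 32);
}

/*
 * 32-bit form (hypothetical helper): the carry into the high word is
 * simply bit 31 of the low word, so no 64-bit arithmetic is needed.
 */
static uint32_t round_high_32(uint32_t hi, uint32_t lo)
{
    return hi + (lo >> 31);
}
```

The two functions should agree for every input, including the case where the high word wraps around.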

-[Qemu-devel] [PULL 27/29] target/arm: Use tcg_gen_rotri_i32 for gen_swap_half
+[PULL 08/52] target/arm: Flush high bits of sve register after AdvSIMD ZIP/UZP/TRN
 From: Richard Henderson <richard.henderson@linaro.org>
-Rotate is the more compact and obvious way to swap 16-bit
+Writes to AdvSIMD registers flush the bits above 128.
 elements of a 32-bit word.
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190808202616.13782-6-richard.henderson@linaro.org
+Message-id: 20200214194643.23317-4-richard.henderson@linaro.org
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate.c | 6 +-----
+ target/arm/translate-a64.c | 1 +
-file changed, 1 insertion(+), 5 deletions(-)
+file changed, 1 insertion(+)
-diff --git a/target/arm/translate.c b/target/arm/translate.c
+diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.c
+--- a/target/arm/translate-a64.c
-+++ b/target/arm/translate.c
++++ b/target/arm/translate-a64.c
-@@ -XXX,XX +XXX,XX @@ static TCGv_i64 gen_muls_i64_i32(TCGv_i32 a, TCGv_i32 b)
+@@ -XXX,XX +XXX,XX @@ static void disas_simd_zip_trn(DisasContext *s, uint32_t insn)
- /* Swap low and high halfwords.  */
+     tcg_temp_free_i64(tcg_resl);
- static void gen_swap_half(TCGv_i32 var)
+     write_vec_element(s, tcg_resh, rd, 1, MO_64);
- {
+     tcg_temp_free_i64(tcg_resh);
--    TCGv_i32 tmp = tcg_temp_new_i32();
++    clear_vec_high(s, true, rd);
 -    tcg_gen_shri_i32(tmp, var, 16);
 -    tcg_gen_shli_i32(var, var, 16);
 -    tcg_gen_or_i32(var, var, tmp);
 -    tcg_temp_free_i32(tmp);
 +    tcg_gen_rotri_i32(var, var, 16);
  }
- /* Dual 16-bit add.  Result placed in t0 and t1 is marked as dead.
+ /*
 --
 .20.1
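
As a rough illustration of the gen_swap_half change above, the following plain C sketch compares the shift-and-or form with a rotate by 16. The names swap_half_shifts and rotr32 are stand-ins chosen for this example, not QEMU functions; rotr32 also guards the rotate-by-zero case, which an open-coded `(x >> n) | (x << (32 - n))` would have to special-case.

```c
#include <stdint.h>

/* Old style: swap the 16-bit halves with two shifts and an OR. */
static uint32_t swap_half_shifts(uint32_t x)
{
    return (x >> 16) | (x << 16);
}

/*
 * Stand-in rotate-right helper (hypothetical). A shift by 32 is undefined
 * in C, so a rotate count of zero is handled separately.
 */
static uint32_t rotr32(uint32_t x, unsigned n)
{
    n &= 31;
    return n ? (x >> n) | (x << (32 - n)) : x;
}
```

Rotating by 16 gives the same result as the shift-and-or sequence without needing a temporary.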

-[Qemu-devel] [PULL 14/29] target/arm: Remove offset argument to gen_exception_bkpt_insn
+[PULL 09/52] target/arm: Flush high bits of sve register after AdvSIMD INS
 From: Richard Henderson <richard.henderson@linaro.org>
-Unlike the other more generic gen_exception{,_internal}_insn
+Writes to AdvSIMD registers flush the bits above 128.
 interfaces, breakpoints always refer to the current instruction.
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20200214194643.23317-5-richard.henderson@linaro.org
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
-Message-id: 20190807045335.1361-10-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate-a64.c | 7 +++----
+ target/arm/translate-a64.c | 6 ++++++
- target/arm/translate.c     | 8 ++++----
+file changed, 6 insertions(+)
 files changed, 7 insertions(+), 8 deletions(-)
 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-a64.c
 +++ b/target/arm/translate-a64.c
-@@ -XXX,XX +XXX,XX @@ static void gen_exception_insn(DisasContext *s, uint64_t pc, int excp,
+@@ -XXX,XX +XXX,XX @@ static void handle_simd_inse(DisasContext *s, int rd, int rn,
-     s->base.is_jmp = DISAS_NORETURN;
+     write_vec_element(s, tmp, rd, dst_index, size);
      tcg_temp_free_i64(tmp);
 +
 +    /* INS is considered a 128-bit write for SVE. */
 +    clear_vec_high(s, true, rd);
  }
--static void gen_exception_bkpt_insn(DisasContext *s, int offset,
--                                    uint32_t syndrome)
+@@ -XXX,XX +XXX,XX @@ static void handle_simd_insg(DisasContext *s, int rd, int rn, int imm5)
-+static void gen_exception_bkpt_insn(DisasContext *s, uint32_t syndrome)
- {
+     idx = extract32(imm5, 1 + size, 4 - size);
-     TCGv_i32 tcg_syn;
+     write_vec_element(s, cpu_reg(s, rn), rd, idx, size);
++
--    gen_a64_set_pc_im(s->base.pc_next - offset);
++    /* INS is considered a 128-bit write for SVE. */
-+    gen_a64_set_pc_im(s->pc_curr);
++    clear_vec_high(s, true, rd);
      tcg_syn = tcg_const_i32(syndrome);
      gen_helper_exception_bkpt_insn(cpu_env, tcg_syn);
      tcg_temp_free_i32(tcg_syn);
@@ -XXX,XX +XXX,XX @@ static void disas_exc(DisasContext *s, uint32_t insn)
              break;
          }
          /* BRK */
 -        gen_exception_bkpt_insn(s, 4, syn_aa64_bkpt(imm16));
 +        gen_exception_bkpt_insn(s, syn_aa64_bkpt(imm16));
          break;
      case 2:
          if (op2_ll != 0) {
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_exception_insn(DisasContext *s, uint32_t pc, int excp,
      s->base.is_jmp = DISAS_NORETURN;
  }
--static void gen_exception_bkpt_insn(DisasContext *s, int offset, uint32_t syn)
+ /*
 +static void gen_exception_bkpt_insn(DisasContext *s, uint32_t syn)
  {
      TCGv_i32 tcg_syn;
      gen_set_condexec(s);
 -    gen_set_pc_im(s, s->base.pc_next - offset);
 +    gen_set_pc_im(s, s->pc_curr);
      tcg_syn = tcg_const_i32(syn);
      gen_helper_exception_bkpt_insn(cpu_env, tcg_syn);
      tcg_temp_free_i32(tcg_syn);
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
              case 1:
                  /* bkpt */
                  ARCH(5);
 -                gen_exception_bkpt_insn(s, 4, syn_aa32_bkpt(imm16, false));
 +                gen_exception_bkpt_insn(s, syn_aa32_bkpt(imm16, false));
                  break;
              case 2:
                  /* Hypervisor call (v7) */
@@ -XXX,XX +XXX,XX @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
          {
              int imm8 = extract32(insn, 0, 8);
              ARCH(5);
 -            gen_exception_bkpt_insn(s, 2, syn_aa32_bkpt(imm8, true));
 +            gen_exception_bkpt_insn(s, syn_aa32_bkpt(imm8, true));
              break;
          }
 --
 .20.1

-New patch
+[PULL 10/52] target/arm: Use bit 55 explicitly for pauth
+From: Richard Henderson <richard.henderson@linaro.org>
+The pseudocode in aarch64/functions/pac/auth/Auth and
+aarch64/functions/pac/strip/Strip always uses bit 55 for
+extfield and does not consider whether the current regime has 2 ranges.
+Suggested-by: Peter Maydell <peter.maydell@linaro.org>
+Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Message-id: 20200216194343.21331-2-richard.henderson@linaro.org
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+---
+ target/arm/pauth_helper.c | 3 ++-
+file changed, 2 insertions(+), 1 deletion(-)
+diff --git a/target/arm/pauth_helper.c b/target/arm/pauth_helper.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/pauth_helper.c
++++ b/target/arm/pauth_helper.c
+@@ -XXX,XX +XXX,XX @@ static uint64_t pauth_addpac(CPUARMState *env, uint64_t ptr, uint64_t modifier,
+ static uint64_t pauth_original_ptr(uint64_t ptr, ARMVAParameters param)
+ {
+-    uint64_t extfield = -param.select;
++    /* Note that bit 55 is used whether or not the regime has 2 ranges. */
++    uint64_t extfield = sextract64(ptr, 55, 1);
+     int bot_pac_bit = 64 - param.tsz;
+     int top_pac_bit = 64 - 8 * param.tbi;
+--
+.20.1
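
A minimal sketch of what the new extfield computation produces, written in portable C rather than with QEMU's sextract64. The function name extfield_from_bit55 is an assumption made for this example; it replicates bit 55 of the pointer across all 64 bits, which is the effect a one-bit signed extract at position 55 would have.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for sextract64(ptr, 55, 1): returns all ones when
 * bit 55 of ptr is set and zero otherwise, regardless of any other bits.
 */
static uint64_t extfield_from_bit55(uint64_t ptr)
{
    return (uint64_t)0 - ((ptr >> 55) & 1);
}
```

Because only bit 55 is consulted, tag bits above it do not influence the result.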

-[Qemu-devel] [PULL 19/29] target/arm/helper: zcr: Add build bug next to value range assumption
+[PULL 11/52] target/arm: Fix select for aa64_va_parameters_both
-From: Andrew Jones <drjones@redhat.com>
+From: Richard Henderson <richard.henderson@linaro.org>
-The current implementation of ZCR_ELx matches the architecture, only
+Select should always be 0 for a regime with one range.
 implementing the lower four bits, with the rest RAZ/WI. This puts
 a strict limit on ARM_MAX_VQ of 16. Make sure we don't let ARM_MAX_VQ
 grow without a corresponding update here.
-Suggested-by: Dave Martin <Dave.Martin@arm.com>
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Andrew Jones <drjones@redhat.com>
+Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20200216194343.21331-3-richard.henderson@linaro.org
 Reviewed-by: Eric Auger <eric.auger@redhat.com>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/helper.c | 1 +
+ target/arm/helper.c | 46 +++++++++++++++++++++++----------------------
-file changed, 1 insertion(+)
+file changed, 24 insertions(+), 22 deletions(-)
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.c
 +++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ static void zcr_write(CPUARMState *env, const ARMCPRegInfo *ri,
+@@ -XXX,XX +XXX,XX @@ ARMVAParameters aa64_va_parameters_both(CPUARMState *env, uint64_t va,
-     int new_len;
+     bool tbi, tbid, epd, hpd, using16k, using64k;
+     int select, tsz;
-     /* Bits other than [3:0] are RAZ/WI.  */
-+    QEMU_BUILD_BUG_ON(ARM_MAX_VQ > 16);
+-    /*
-     raw_write(env, ri, value & 0xf);
+-     * Bit 55 is always between the two regions, and is canonical for
+-     * determining if address tagging is enabled.
-     /*
+-     */
 -    select = extract64(va, 55, 1);
 -
      if (!regime_has_2_ranges(mmu_idx)) {
 +        select = 0;
          tsz = extract32(tcr, 0, 6);
          using64k = extract32(tcr, 14, 1);
          using16k = extract32(tcr, 15, 1);
@@ -XXX,XX +XXX,XX @@ ARMVAParameters aa64_va_parameters_both(CPUARMState *env, uint64_t va,
              tbid = extract32(tcr, 29, 1);
          }
          epd = false;
 -    } else if (!select) {
 -        tsz = extract32(tcr, 0, 6);
 -        epd = extract32(tcr, 7, 1);
 -        using64k = extract32(tcr, 14, 1);
 -        using16k = extract32(tcr, 15, 1);
 -        tbi = extract64(tcr, 37, 1);
 -        hpd = extract64(tcr, 41, 1);
 -        tbid = extract64(tcr, 51, 1);
      } else {
 -        int tg = extract32(tcr, 30, 2);
 -        using16k = tg == 1;
 -        using64k = tg == 3;
 -        tsz = extract32(tcr, 16, 6);
 -        epd = extract32(tcr, 23, 1);
 -        tbi = extract64(tcr, 38, 1);
 -        hpd = extract64(tcr, 42, 1);
 -        tbid = extract64(tcr, 52, 1);
 +        /*
 +         * Bit 55 is always between the two regions, and is canonical for
 +         * determining if address tagging is enabled.
 +         */
 +        select = extract64(va, 55, 1);
 +        if (!select) {
 +            tsz = extract32(tcr, 0, 6);
 +            epd = extract32(tcr, 7, 1);
 +            using64k = extract32(tcr, 14, 1);
 +            using16k = extract32(tcr, 15, 1);
 +            tbi = extract64(tcr, 37, 1);
 +            hpd = extract64(tcr, 41, 1);
 +            tbid = extract64(tcr, 51, 1);
 +        } else {
 +            int tg = extract32(tcr, 30, 2);
 +            using16k = tg == 1;
 +            using64k = tg == 3;
 +            tsz = extract32(tcr, 16, 6);
 +            epd = extract32(tcr, 23, 1);
 +            tbi = extract64(tcr, 38, 1);
 +            hpd = extract64(tcr, 42, 1);
 +            tbid = extract64(tcr, 52, 1);
 +        }
      }
      tsz = MIN(tsz, 39);  /* TODO: ARMv8.4-TTST */
      tsz = MAX(tsz, 16);  /* TODO: ARMv8.2-LVA  */
 --
 .20.1
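
The select rule introduced above can be summarised in a few lines of standalone C. This is only a sketch: va_select is a hypothetical name, and the two_ranges flag stands in for the result of regime_has_2_ranges(mmu_idx).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the select computation: a regime with a single
 * range always selects range 0; otherwise bit 55 of the virtual address
 * picks the lower (0) or upper (1) range.
 */
static unsigned va_select(uint64_t va, bool two_ranges)
{
    return two_ranges ? (unsigned)((va >> 55) & 1) : 0;
}
```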

-[Qemu-devel] [PULL 26/29] target/arm: Use ror32 instead of open-coding the operation
+[PULL 12/52] target/arm: Remove ttbr1_valid check from get_phys_addr_lpae
 From: Richard Henderson <richard.henderson@linaro.org>
-The helper function is more documentary, and also already
+Now that aa64_va_parameters_both sets select based on the number
-handles the case of rotate by zero.
+of ranges in the regime, the ttbr1_valid check is redundant.
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190808202616.13782-5-richard.henderson@linaro.org
+Message-id: 20200216194343.21331-4-richard.henderson@linaro.org
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate.c | 7 ++-----
+ target/arm/helper.c | 6 +-----
-file changed, 2 insertions(+), 5 deletions(-)
+file changed, 1 insertion(+), 5 deletions(-)
-diff --git a/target/arm/translate.c b/target/arm/translate.c
+diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.c
+--- a/target/arm/helper.c
-+++ b/target/arm/translate.c
++++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
+@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
-                 /* CPSR = immediate */
+     TCR *tcr = regime_tcr(env, mmu_idx);
-                 val = insn & 0xff;
+     int ap, ns, xn, pxn;
-                 shift = ((insn >> 8) & 0xf) * 2;
+     uint32_t el = regime_el(env, mmu_idx);
--                if (shift)
+-    bool ttbr1_valid;
--                    val = (val >> shift) | (val << (32 - shift));
+     uint64_t descaddrmask;
-+                val = ror32(val, shift);
+     bool aarch64 = arm_el_is_aa64(env, el);
-                 i = ((insn & (1 << 22)) != 0);
+     bool guarded = false;
-                 if (gen_set_psr_im(s, msr_mask(s, (insn >> 16) & 0xf, i),
+@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
-                                    i, val)) {
+         param = aa64_va_parameters(env, address, mmu_idx,
-@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
+                                    access_type != MMU_INST_FETCH);
-             /* immediate operand */
+         level = 0;
-             val = insn & 0xff;
+-        ttbr1_valid = regime_has_2_ranges(mmu_idx);
-             shift = ((insn >> 8) & 0xf) * 2;
+         addrsize = 64 - 8 * param.tbi;
--            if (shift) {
+         inputsize = 64 - param.tsz;
--                val = (val >> shift) | (val << (32 - shift));
+     } else {
--            }
+         param = aa32_va_parameters(env, address, mmu_idx);
-+            val = ror32(val, shift);
+         level = 1;
-             tmp2 = tcg_temp_new_i32();
+-        /* There is no TTBR1 for EL2 */
-             tcg_gen_movi_i32(tmp2, val);
+-        ttbr1_valid = (el != 2);
-             if (logic_cc && shift) {
+         addrsize = (mmu_idx == ARMMMUIdx_Stage2 ? 40 : 32);
          inputsize = addrsize - param.tsz;
      }
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
      if (inputsize < addrsize) {
          target_ulong top_bits = sextract64(address, inputsize,
                                             addrsize - inputsize);
 -        if (-top_bits != param.select || (param.select && !ttbr1_valid)) {
 +        if (-top_bits != param.select) {
              /* The gap between the two regions is a Translation fault */
              fault_type = ARMFault_Translation;
              goto do_fault;
 --
 .20.1
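
The simplified gap check in get_phys_addr_lpae above can be modelled without sign extension. The sketch below is an assumption-laden illustration, not the QEMU implementation: in_gap is a hypothetical name, and it assumes 0 < addrsize - inputsize < 64. It reports a fault unless the address bits between inputsize and addrsize are all zeros for select 0 or all ones for select 1, which is the condition `-top_bits != param.select` expresses.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the "gap between the two regions" test.
 * Assumes inputsize < addrsize and addrsize - inputsize < 64.
 */
static bool in_gap(uint64_t address, unsigned inputsize, unsigned addrsize,
                   unsigned select)
{
    unsigned len = addrsize - inputsize;
    uint64_t mask = (UINT64_C(1) << len) - 1;
    uint64_t top = (address >> inputsize) & mask;
    uint64_t want = select ? mask : 0;

    return top != want;
}
```

With select forced to 0 for single-range regimes, an address with any of the top bits set fails this test on its own, so a separate TTBR1 validity flag is not needed in this model.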

-New patch
+[PULL 13/52] target/arm: Split out aa64_va_parameter_tbi, aa64_va_parameter_tbid
+From: Richard Henderson <richard.henderson@linaro.org>
+For the purpose of rebuild_hflags_a64, we do not need to compute
+all of the va parameters, only tbi.  Moreover, we can compute them
+in a form that is more useful for storing in hflags.
+This eliminates the need for aa64_va_parameters_both, so fold that
+in to aa64_va_parameters.  The remaining calls to aa64_va_parameters
+are in get_phys_addr_lpae and in pauth_helper.c.
+This reduces the total cpu consumption of aa64_va_parameters in a
+kernel boot plus a kvm guest kernel boot from 3% to 0.5%.
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20200216194343.21331-5-richard.henderson@linaro.org
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+---
+ target/arm/internals.h |  3 --
+ target/arm/helper.c    | 68 +++++++++++++++++++++++-------------------
+files changed, 37 insertions(+), 34 deletions(-)
+diff --git a/target/arm/internals.h b/target/arm/internals.h
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/internals.h
++++ b/target/arm/internals.h
+@@ -XXX,XX +XXX,XX @@ typedef struct ARMVAParameters {
+     unsigned tsz    : 8;
+     unsigned select : 1;
+     bool tbi        : 1;
+-    bool tbid       : 1;
+     bool epd        : 1;
+     bool hpd        : 1;
+     bool using16k   : 1;
+     bool using64k   : 1;
+ } ARMVAParameters;
+-ARMVAParameters aa64_va_parameters_both(CPUARMState *env, uint64_t va,
+-                                        ARMMMUIdx mmu_idx);
+ ARMVAParameters aa64_va_parameters(CPUARMState *env, uint64_t va,
+                                    ARMMMUIdx mmu_idx, bool data);
+diff --git a/target/arm/helper.c b/target/arm/helper.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/helper.c
++++ b/target/arm/helper.c
+@@ -XXX,XX +XXX,XX @@ static uint8_t convert_stage2_attrs(CPUARMState *env, uint8_t s2attrs)
+ }
+ #endif /* !CONFIG_USER_ONLY */
+-ARMVAParameters aa64_va_parameters_both(CPUARMState *env, uint64_t va,
+-                                        ARMMMUIdx mmu_idx)
++static int aa64_va_parameter_tbi(uint64_t tcr, ARMMMUIdx mmu_idx)
++{
++    if (regime_has_2_ranges(mmu_idx)) {
++        return extract64(tcr, 37, 2);
++    } else if (mmu_idx == ARMMMUIdx_Stage2) {
++        return 0; /* VTCR_EL2 */
++    } else {
++        return extract32(tcr, 20, 1);
++    }
++}
++
++static int aa64_va_parameter_tbid(uint64_t tcr, ARMMMUIdx mmu_idx)
++{
++    if (regime_has_2_ranges(mmu_idx)) {
++        return extract64(tcr, 51, 2);
++    } else if (mmu_idx == ARMMMUIdx_Stage2) {
++        return 0; /* VTCR_EL2 */
++    } else {
++        return extract32(tcr, 29, 1);
++    }
++}
++
++ARMVAParameters aa64_va_parameters(CPUARMState *env, uint64_t va,
++                                   ARMMMUIdx mmu_idx, bool data)
+ {
+     uint64_t tcr = regime_tcr(env, mmu_idx)->raw_tcr;
+-    bool tbi, tbid, epd, hpd, using16k, using64k;
+-    int select, tsz;
++    bool epd, hpd, using16k, using64k;
++    int select, tsz, tbi;
+     if (!regime_has_2_ranges(mmu_idx)) {
+         select = 0;
+@@ -XXX,XX +XXX,XX @@ ARMVAParameters aa64_va_parameters_both(CPUARMState *env, uint64_t va,
+         using16k = extract32(tcr, 15, 1);
+         if (mmu_idx == ARMMMUIdx_Stage2) {
+             /* VTCR_EL2 */
+-            tbi = tbid = hpd = false;
++            hpd = false;
+         } else {
+-            tbi = extract32(tcr, 20, 1);
+             hpd = extract32(tcr, 24, 1);
+-            tbid = extract32(tcr, 29, 1);
+         }
+         epd = false;
+     } else {
+@@ -XXX,XX +XXX,XX @@ ARMVAParameters aa64_va_parameters_both(CPUARMState *env, uint64_t va,
+             epd = extract32(tcr, 7, 1);
+             using64k = extract32(tcr, 14, 1);
+             using16k = extract32(tcr, 15, 1);
+-            tbi = extract64(tcr, 37, 1);
+             hpd = extract64(tcr, 41, 1);
+-            tbid = extract64(tcr, 51, 1);
+         } else {
+             int tg = extract32(tcr, 30, 2);
+             using16k = tg == 1;
+             using64k = tg == 3;
+             tsz = extract32(tcr, 16, 6);
+             epd = extract32(tcr, 23, 1);
+-            tbi = extract64(tcr, 38, 1);
+             hpd = extract64(tcr, 42, 1);
+-            tbid = extract64(tcr, 52, 1);
+         }
+     }
+     tsz = MIN(tsz, 39);  /* TODO: ARMv8.4-TTST */
+     tsz = MAX(tsz, 16);  /* TODO: ARMv8.2-LVA  */
++    /* Present TBI as a composite with TBID.  */
++    tbi = aa64_va_parameter_tbi(tcr, mmu_idx);
++    if (!data) {
++        tbi &= ~aa64_va_parameter_tbid(tcr, mmu_idx);
++    }
++    tbi = (tbi >> select) & 1;
++
+     return (ARMVAParameters) {
+         .tsz = tsz,
+         .select = select,
+         .tbi = tbi,
+-        .tbid = tbid,
+         .epd = epd,
+         .hpd = hpd,
+         .using16k = using16k,
+@@ -XXX,XX +XXX,XX @@ ARMVAParameters aa64_va_parameters_both(CPUARMState *env, uint64_t va,
+     };
+ }
+-ARMVAParameters aa64_va_parameters(CPUARMState *env, uint64_t va,
+-                                   ARMMMUIdx mmu_idx, bool data)
+-{
+-    ARMVAParameters ret = aa64_va_parameters_both(env, va, mmu_idx);
+-
+-    /* Present TBI as a composite with TBID.  */
+-    ret.tbi &= (data || !ret.tbid);
+-    return ret;
+-}
+-
+ #ifndef CONFIG_USER_ONLY
+ static ARMVAParameters aa32_va_parameters(CPUARMState *env, uint32_t va,
+                                           ARMMMUIdx mmu_idx)
+@@ -XXX,XX +XXX,XX @@ static uint32_t rebuild_hflags_a64(CPUARMState *env, int el, int fp_el,
+ {
+     uint32_t flags = rebuild_hflags_aprofile(env);
+     ARMMMUIdx stage1 = stage_1_mmu_idx(mmu_idx);
+-    ARMVAParameters p0 = aa64_va_parameters_both(env, 0, stage1);
++    uint64_t tcr = regime_tcr(env, mmu_idx)->raw_tcr;
+     uint64_t sctlr;
+     int tbii, tbid;
+     flags = FIELD_DP32(flags, TBFLAG_ANY, AARCH64_STATE, 1);
+     /* Get control bits for tagged addresses.  */
+-    if (regime_has_2_ranges(mmu_idx)) {
+-        ARMVAParameters p1 = aa64_va_parameters_both(env, -1, stage1);
+-        tbid = (p1.tbi << 1) | p0.tbi;
+-        tbii = tbid & ~((p1.tbid << 1) | p0.tbid);
+-    } else {
+-        tbid = p0.tbi;
+-        tbii = tbid & !p0.tbid;
+-    }
++    tbid = aa64_va_parameter_tbi(tcr, mmu_idx);
++    tbii = tbid & ~aa64_va_parameter_tbid(tcr, mmu_idx);
+     flags = FIELD_DP32(flags, TBFLAG_A64, TBII, tbii);
+     flags = FIELD_DP32(flags, TBFLAG_A64, TBID, tbid);
+--
+.20.1
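
To make the composite TBI computation above easier to follow, here is a small standalone sketch. The function name effective_tbi is hypothetical; it assumes, as the patch does for two-range regimes, that tbi and tbid are 2-bit fields with bit 0 describing the lower range and bit 1 the upper range.

```c
#include <stdbool.h>

/*
 * Hypothetical model of "present TBI as a composite with TBID":
 * for instruction accesses (data == false), TBID masks out TBI for the
 * ranges where it is set; the bit for the selected range is then returned.
 */
static bool effective_tbi(unsigned tbi, unsigned tbid, unsigned select,
                          bool data)
{
    if (!data) {
        tbi &= ~tbid;
    }
    return (tbi >> select) & 1;
}
```

For example, with TBI enabled for both ranges and TBID set only for the upper range, data accesses to the upper range ignore the top byte while instruction fetches from it do not.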

-[Qemu-devel] [PULL 05/29] target/arm: Fix routing of singlestep exceptions
+[PULL 14/52] target/arm: Add _aa32_ to isar_feature functions testing 32-bit ID registers
-When generating an architectural single-step exception we were
+Enforce a convention that an isar_feature function that tests a
-routing it to the "default exception level", which is to say
+32-bit ID register always has _aa32_ in its name, and one that
-the same exception level we execute at except that EL0 exceptions
+tests a 64-bit ID register always has _aa64_ in its name.
-go to EL1. This is incorrect because the debug exception level
+We already follow this except for three cases: thumb_div,
-can be configured by the guest for situations such as single
+arm_div and jazelle, which all need _aa32_ adding.
 stepping of EL0 and EL1 code by EL2.
-We have to track the target debug exception level in the TB
+(As noted in the comment, isar_feature_aa32_fp16_arith()
-flags, because it is dependent on CPU state like HCR_EL2.TGE
+is an exception in that it currently tests ID_AA64PFR0_EL1,
-and MDCR_EL2.TDE. (That we were previously calling the
+but will switch to MVFR1 once we've properly implemented
-arm_debug_target_el() function to determine dc->ss_same_el
+FP16 for AArch32.)
 is itself a bug, though one that would only have manifested
 as incorrect syndrome information.) Since we are out of TB
 flag bits unless we want to expand into the cs_base field,
 we share some bits with the M-profile only HANDLER and
 STACKCHECK bits, since only A-profile has this singlestep.
-Fixes: https://bugs.launchpad.net/qemu/+bug/1838913
+Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
+Message-id: 20200214175116.9164-2-peter.maydell@linaro.org
 Tested-by: Alex Bennée <alex.bennee@linaro.org>
 Message-id: 20190805130952.4415-3-peter.maydell@linaro.org
 ---
- target/arm/cpu.h           |  5 +++++
+ target/arm/cpu.h       | 13 ++++++++++---
- target/arm/translate.h     | 15 +++++++++++----
+ target/arm/internals.h |  2 +-
- target/arm/helper.c        |  6 ++++++
+ linux-user/elfload.c   |  4 ++--
- target/arm/translate-a64.c |  2 +-
+ target/arm/cpu.c       |  6 ++++--
- target/arm/translate.c     |  4 +++-
+ target/arm/helper.c    |  2 +-
-files changed, 26 insertions(+), 6 deletions(-)
+ target/arm/translate.c |  6 +++---
 files changed, 21 insertions(+), 12 deletions(-)
 diff --git a/target/arm/cpu.h b/target/arm/cpu.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu.h
 +++ b/target/arm/cpu.h
-@@ -XXX,XX +XXX,XX @@ FIELD(TBFLAG_ANY, PSTATE_SS, 26, 1)
+@@ -XXX,XX +XXX,XX @@ static inline uint64_t *aa64_vfp_qreg(CPUARMState *env, unsigned regno)
- /* Target EL if we take a floating-point-disabled exception */
+ /* Shared between translate-sve.c and sve_helper.c.  */
- FIELD(TBFLAG_ANY, FPEXC_EL, 24, 2)
+ extern const uint64_t pred_esz_masks[4];
- FIELD(TBFLAG_ANY, BE_DATA, 23, 1)
 +/*
-+ * For A-profile only, target EL for debug exceptions.
++ * Naming convention for isar_feature functions:
-+ * Note that this overlaps with the M-profile-only HANDLER and STACKCHECK bits.
++ * Functions which test 32-bit ID registers should have _aa32_ in
 + * their name. Functions which test 64-bit ID registers should have
 + * _aa64_ in their name.
 + */
-+FIELD(TBFLAG_ANY, DEBUG_TARGET_EL, 21, 2)
++
+ /*
- /* Bit usage when in AArch32 state: */
+  * 32-bit feature tests via id registers.
- FIELD(TBFLAG_A32, THUMB, 0, 1)
+  */
-diff --git a/target/arm/translate.h b/target/arm/translate.h
+-static inline bool isar_feature_thumb_div(const ARMISARegisters *id)
 +static inline bool isar_feature_aa32_thumb_div(const ARMISARegisters *id)
  {
      return FIELD_EX32(id->id_isar0, ID_ISAR0, DIVIDE) != 0;
  }
 -static inline bool isar_feature_arm_div(const ARMISARegisters *id)
 +static inline bool isar_feature_aa32_arm_div(const ARMISARegisters *id)
  {
      return FIELD_EX32(id->id_isar0, ID_ISAR0, DIVIDE) > 1;
  }
 -static inline bool isar_feature_jazelle(const ARMISARegisters *id)
 +static inline bool isar_feature_aa32_jazelle(const ARMISARegisters *id)
  {
      return FIELD_EX32(id->id_isar1, ID_ISAR1, JAZELLE) != 0;
  }
 diff --git a/target/arm/internals.h b/target/arm/internals.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.h
+--- a/target/arm/internals.h
-+++ b/target/arm/translate.h
++++ b/target/arm/internals.h
-@@ -XXX,XX +XXX,XX @@ typedef struct DisasContext {
+@@ -XXX,XX +XXX,XX @@ static inline uint32_t aarch32_cpsr_valid_mask(uint64_t features,
-     uint32_t svc_imm;
+     if ((features >> ARM_FEATURE_THUMB2) & 1) {
-     int aarch64;
+         valid |= CPSR_IT;
-     int current_el;
+     }
-+    /* Debug target exception level for single-step exceptions */
+-    if (isar_feature_jazelle(id)) {
-+    int debug_target_el;
++    if (isar_feature_aa32_jazelle(id)) {
-     GHashTable *cp_regs;
+         valid |= CPSR_J;
-     uint64_t features; /* CPU features bits */
+     }
-     /* Because unallocated encodings generate different exception syndrome
+     if (isar_feature_aa32_pan(id)) {
-@@ -XXX,XX +XXX,XX @@ typedef struct DisasContext {
+diff --git a/linux-user/elfload.c b/linux-user/elfload.c
-      * ie A64 LDX*, LDAX*, A32/T32 LDREX*, LDAEX*.
+index XXXXXXX..XXXXXXX 100644
-      */
+--- a/linux-user/elfload.c
-     bool is_ldex;
++++ b/linux-user/elfload.c
--    /* True if a single-step exception will be taken to the current EL */
+@@ -XXX,XX +XXX,XX @@ static uint32_t get_elf_hwcap(void)
--    bool ss_same_el;
+     GET_FEATURE(ARM_FEATURE_VFP3, ARM_HWCAP_ARM_VFPv3);
-     /* True if v8.3-PAuth is active.  */
+     GET_FEATURE(ARM_FEATURE_V6K, ARM_HWCAP_ARM_TLS);
-     bool pauth_active;
+     GET_FEATURE(ARM_FEATURE_VFP4, ARM_HWCAP_ARM_VFPv4);
-     /* True with v8.5-BTI and SCTLR_ELx.BT* set.  */
+-    GET_FEATURE_ID(arm_div, ARM_HWCAP_ARM_IDIVA);
-@@ -XXX,XX +XXX,XX @@ static inline void gen_exception(int excp, uint32_t syndrome,
+-    GET_FEATURE_ID(thumb_div, ARM_HWCAP_ARM_IDIVT);
- /* Generate an architectural singlestep exception */
++    GET_FEATURE_ID(aa32_arm_div, ARM_HWCAP_ARM_IDIVA);
- static inline void gen_swstep_exception(DisasContext *s, int isv, int ex)
++    GET_FEATURE_ID(aa32_thumb_div, ARM_HWCAP_ARM_IDIVT);
- {
+     /* All QEMU's VFPv3 CPUs have 32 registers, see VFP_DREG in translate.c.
--    gen_exception(EXCP_UDEF, syn_swstep(s->ss_same_el, isv, ex),
+      * Note that the ARM_HWCAP_ARM_VFPv3D16 bit is always the inverse of
--                  default_exception_el(s));
+      * ARM_HWCAP_ARM_VFPD32 (and so always clear for QEMU); it is unrelated
-+    bool same_el = (s->debug_target_el == s->current_el);
+diff --git a/target/arm/cpu.c b/target/arm/cpu.c
-+
+index XXXXXXX..XXXXXXX 100644
-+    /*
+--- a/target/arm/cpu.c
-+     * If singlestep is targeting a lower EL than the current one,
++++ b/target/arm/cpu.c
-+     * then s->ss_active must be false and we can never get here.
+@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
-+     */
+          * Presence of EL2 itself is ARM_FEATURE_EL2, and of the
-+    assert(s->debug_target_el >= s->current_el);
+          * Security Extensions is ARM_FEATURE_EL3.
-+
+          */
-+    gen_exception(EXCP_UDEF, syn_swstep(same_el, isv, ex), s->debug_target_el);
+-        assert(!tcg_enabled() || no_aa32 || cpu_isar_feature(arm_div, cpu));
- }
++        assert(!tcg_enabled() || no_aa32 ||
++               cpu_isar_feature(aa32_arm_div, cpu));
- /*
+         set_feature(env, ARM_FEATURE_LPAE);
          set_feature(env, ARM_FEATURE_V7);
      }
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
      if (arm_feature(env, ARM_FEATURE_V6)) {
          set_feature(env, ARM_FEATURE_V5);
          if (!arm_feature(env, ARM_FEATURE_M)) {
 -            assert(!tcg_enabled() || no_aa32 || cpu_isar_feature(jazelle, cpu));
 +            assert(!tcg_enabled() || no_aa32 ||
 +                   cpu_isar_feature(aa32_jazelle, cpu));
              set_feature(env, ARM_FEATURE_AUXCR);
          }
      }
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.c
 +++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
+@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
-         }
+     if (arm_feature(env, ARM_FEATURE_LPAE)) {
          define_arm_cp_regs(cpu, lpae_cp_reginfo);
      }
+-    if (cpu_isar_feature(jazelle, cpu)) {
-+    if (!arm_feature(env, ARM_FEATURE_M)) {
++    if (cpu_isar_feature(aa32_jazelle, cpu)) {
-+        int target_el = arm_debug_target_el(env);
+         define_arm_cp_regs(cpu, jazelle_regs);
-+
+     }
-+        flags = FIELD_DP32(flags, TBFLAG_ANY, DEBUG_TARGET_EL, target_el);
+     /* Slightly awkwardly, the OMAP and StrongARM cores need all of
 +    }
 +
      *pflags = flags;
      *cs_base = 0;
  }
 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-a64.c
 +++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_init_disas_context(DisasContextBase *dcbase,
      dc->ss_active = FIELD_EX32(tb_flags, TBFLAG_ANY, SS_ACTIVE);
      dc->pstate_ss = FIELD_EX32(tb_flags, TBFLAG_ANY, PSTATE_SS);
      dc->is_ldex = false;
 -    dc->ss_same_el = (arm_debug_target_el(env) == dc->current_el);
 +    dc->debug_target_el = FIELD_EX32(tb_flags, TBFLAG_ANY, DEBUG_TARGET_EL);
      /* Bound the number of insns to execute to those left on the page.  */
      bound = -(dc->base.pc_first | TARGET_PAGE_MASK) / 4;
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static void arm_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
+@@ -XXX,XX +XXX,XX @@
-     dc->ss_active = FIELD_EX32(tb_flags, TBFLAG_ANY, SS_ACTIVE);
+ #define ENABLE_ARCH_5     arm_dc_feature(s, ARM_FEATURE_V5)
-     dc->pstate_ss = FIELD_EX32(tb_flags, TBFLAG_ANY, PSTATE_SS);
+ /* currently all emulated v5 cores are also v5TE, so don't bother */
-     dc->is_ldex = false;
+ #define ENABLE_ARCH_5TE   arm_dc_feature(s, ARM_FEATURE_V5)
--    dc->ss_same_el = false; /* Can't be true since EL_d must be AArch64 */
+-#define ENABLE_ARCH_5J    dc_isar_feature(jazelle, s)
-+    if (!arm_feature(env, ARM_FEATURE_M)) {
++#define ENABLE_ARCH_5J    dc_isar_feature(aa32_jazelle, s)
-+        dc->debug_target_el = FIELD_EX32(tb_flags, TBFLAG_ANY, DEBUG_TARGET_EL);
+ #define ENABLE_ARCH_6     arm_dc_feature(s, ARM_FEATURE_V6)
-+    }
+ #define ENABLE_ARCH_6K    arm_dc_feature(s, ARM_FEATURE_V6K)
+ #define ENABLE_ARCH_6T2   arm_dc_feature(s, ARM_FEATURE_THUMB2)
-     dc->page_start = dc->base.pc_first & TARGET_PAGE_MASK;
+@@ -XXX,XX +XXX,XX @@ static bool op_div(DisasContext *s, arg_rrr *a, bool u)
      TCGv_i32 t1, t2;
      if (s->thumb
 -        ? !dc_isar_feature(thumb_div, s)
 -        : !dc_isar_feature(arm_div, s)) {
 +        ? !dc_isar_feature(aa32_thumb_div, s)
 +        : !dc_isar_feature(aa32_arm_div, s)) {
          return false;
      }
 --
 .20.1

-New patch
+[PULL 15/52] target/arm: Check aa32_pan in take_aarch32_exception(), not aa64_pan
+In take_aarch32_exception(), we know we are dealing with a CPU that
+has AArch32, so the right isar_feature test is aa32_pan, not aa64_pan.
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20200214175116.9164-3-peter.maydell@linaro.org
+---
+ target/arm/helper.c | 2 +-
+file changed, 1 insertion(+), 1 deletion(-)
+diff --git a/target/arm/helper.c b/target/arm/helper.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/helper.c
++++ b/target/arm/helper.c
+@@ -XXX,XX +XXX,XX @@ static void take_aarch32_exception(CPUARMState *env, int new_mode,
+         env->elr_el[2] = env->regs[15];
+     } else {
+         /* CPSR.PAN is normally preserved preserved unless...  */
+-        if (cpu_isar_feature(aa64_pan, env_archcpu(env))) {
++        if (cpu_isar_feature(aa32_pan, env_archcpu(env))) {
+             switch (new_el) {
+             case 3:
+                 if (!arm_is_secure_below_el3(env)) {
+--
+.20.1

-New patch
+[PULL 16/52] target/arm: Add isar_feature_any_fp16 and document naming/usage conventions
+Our current usage of the isar_feature feature tests almost always
+uses an _aa32_ test when the code path is known to be AArch32
+specific and an _aa64_ test when the code path is known to be
+AArch64 specific. There is just one exception: in the vfp_set_fpscr
+helper we check aa64_fp16 to determine whether the FZ16 bit in
+the FP(S)CR exists, but this code is also used for AArch32.
+There are other places in future where we're likely to want
+a general "does this feature exist for either AArch32 or
+AArch64" check (typically where architecturally the feature exists
+for both CPU states if it exists at all, but the CPU might be
+AArch32-only or AArch64-only, and so only have one set of ID
+registers).
+Introduce a new category of isar_feature_* functions:
+isar_feature_any_foo() should be tested when what we want to
+know is "does this feature exist for either AArch32 or AArch64",
+and always returns the logical OR of isar_feature_aa32_foo()
+and isar_feature_aa64_foo().
+Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Message-id: 20200214175116.9164-4-peter.maydell@linaro.org
+---
+ target/arm/cpu.h        | 19 ++++++++++++++++++-
+ target/arm/vfp_helper.c |  2 +-
+files changed, 19 insertions(+), 2 deletions(-)
+diff --git a/target/arm/cpu.h b/target/arm/cpu.h
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/cpu.h
++++ b/target/arm/cpu.h
+@@ -XXX,XX +XXX,XX @@ extern const uint64_t pred_esz_masks[4];
+  * Naming convention for isar_feature functions:
+  * Functions which test 32-bit ID registers should have _aa32_ in
+  * their name. Functions which test 64-bit ID registers should have
+- * _aa64_ in their name.
++ * _aa64_ in their name. These must only be used in code where we
++ * know for certain that the CPU has AArch32 or AArch64 respectively
++ * or where the correct answer for a CPU which doesn't implement that
++ * CPU state is "false" (eg when generating A32 or A64 code, if adding
++ * system registers that are specific to that CPU state, for "should
++ * we let this system register bit be set" tests where the 32-bit
++ * flavour of the register doesn't have the bit, and so on).
++ * Functions which simply ask "does this feature exist at all" have
++ * _any_ in their name, and always return the logical OR of the _aa64_
++ * and the _aa32_ function.
+  */
+ /*
+@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa64_bti(const ARMISARegisters *id)
+     return FIELD_EX64(id->id_aa64pfr1, ID_AA64PFR1, BT) != 0;
+ }
++/*
++ * Feature tests for "does this exist in either 32-bit or 64-bit?"
++ */
++static inline bool isar_feature_any_fp16(const ARMISARegisters *id)
++{
++    return isar_feature_aa64_fp16(id) || isar_feature_aa32_fp16_arith(id);
++}
++
+ /*
+  * Forward to the above feature tests given an ARMCPU pointer.
+  */
+diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/vfp_helper.c
++++ b/target/arm/vfp_helper.c
+@@ -XXX,XX +XXX,XX @@ uint32_t vfp_get_fpscr(CPUARMState *env)
+ void HELPER(vfp_set_fpscr)(CPUARMState *env, uint32_t val)
+ {
+     /* When ARMv8.2-FP16 is not supported, FZ16 is RES0.  */
+-    if (!cpu_isar_feature(aa64_fp16, env_archcpu(env))) {
++    if (!cpu_isar_feature(any_fp16, env_archcpu(env))) {
+         val &= ~FPCR_FZ16;
+     }
+--
+.20.1
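
A minimal standalone sketch of the _any_ convention this patch introduces, not the QEMU code itself. The type `IdRegsSketch`, its boolean members, and every `sketch_` name are hypothetical stand-ins for ARMISARegisters and the real ID-register field tests. The FZ16 bit position (bit 19) is an assumption based on the architectural FPCR layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: FZ16 is bit 19 of the FP(S)CR. */
#define SKETCH_FPCR_FZ16 (UINT32_C(1) << 19)

/* Hypothetical stand-in for ARMISARegisters: one flag per CPU state. */
typedef struct {
    bool aa32_fp16;   /* what an _aa32_ test would report */
    bool aa64_fp16;   /* what an _aa64_ test would report */
} IdRegsSketch;

static inline bool sketch_feature_aa32_fp16(const IdRegsSketch *id)
{
    assert(id != NULL);
    return id->aa32_fp16;
}

static inline bool sketch_feature_aa64_fp16(const IdRegsSketch *id)
{
    assert(id != NULL);
    return id->aa64_fp16;
}

/* The _any_ flavour is the logical OR of the two state-specific tests. */
static inline bool sketch_feature_any_fp16(const IdRegsSketch *id)
{
    return sketch_feature_aa64_fp16(id) || sketch_feature_aa32_fp16(id);
}

/* Drop FZ16 from a value being written when FP16 exists in neither state. */
static inline uint32_t sketch_filter_fpscr(const IdRegsSketch *id, uint32_t val)
{
    if (!sketch_feature_any_fp16(id)) {
        val &= ~SKETCH_FPCR_FZ16;
    }
    return val;
}
```

With an AArch32-only description that reports FP16, the FZ16 bit survives the filter. A test on the _aa64_ flag alone would have cleared it, which is the situation the commit message describes for vfp_set_fpscr.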

-New patch
+[PULL 17/52] target/arm: Define and use any_predinv isar_feature test
+Instead of open-coding "ARM_FEATURE_AARCH64 ? aa64_predinv: aa32_predinv",
+define and use an any_predinv isar_feature test function.
+Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Message-id: 20200214175116.9164-5-peter.maydell@linaro.org
+---
+ target/arm/cpu.h    | 5 +++++
+ target/arm/helper.c | 9 +--------
+files changed, 6 insertions(+), 8 deletions(-)
+diff --git a/target/arm/cpu.h b/target/arm/cpu.h
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/cpu.h
++++ b/target/arm/cpu.h
+@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_any_fp16(const ARMISARegisters *id)
+     return isar_feature_aa64_fp16(id) || isar_feature_aa32_fp16_arith(id);
+ }
++static inline bool isar_feature_any_predinv(const ARMISARegisters *id)
++{
++    return isar_feature_aa64_predinv(id) || isar_feature_aa32_predinv(id);
++}
++
+ /*
+  * Forward to the above feature tests given an ARMCPU pointer.
+  */
+diff --git a/target/arm/helper.c b/target/arm/helper.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/helper.c
++++ b/target/arm/helper.c
+@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
+ #endif /*CONFIG_USER_ONLY*/
+ #endif
+-    /*
+-     * While all v8.0 cpus support aarch64, QEMU does have configurations
+-     * that do not set ID_AA64ISAR1, e.g. user-only qemu-arm -cpu max,
+-     * which will set ID_ISAR6.
+-     */
+-    if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64)
+-        ? cpu_isar_feature(aa64_predinv, cpu)
+-        : cpu_isar_feature(aa32_predinv, cpu)) {
++    if (cpu_isar_feature(any_predinv, cpu)) {
+         define_arm_cp_regs(cpu, predinv_reginfo);
+     }
+--
+.20.1
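
A small sketch comparing the open-coded ternary removed by this patch with the _any_ form that replaces it. `CpuSketch` and its members are hypothetical. They stand in for ARM_FEATURE_AARCH64 and for the two ID-register tests, which, going by the removed comment, are based on ID_ISAR6 and ID_AA64ISAR1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, flattened view of the inputs to the predinv check. */
typedef struct {
    bool has_aarch64;    /* stand-in for ARM_FEATURE_AARCH64 */
    bool aa32_predinv;   /* stand-in for the 32-bit ID register test */
    bool aa64_predinv;   /* stand-in for the 64-bit ID register test */
} CpuSketch;

/* Old shape: pick exactly one ID register based on the CPU feature bit. */
static inline bool predinv_open_coded(const CpuSketch *cpu)
{
    assert(cpu != NULL);
    return cpu->has_aarch64 ? cpu->aa64_predinv : cpu->aa32_predinv;
}

/* New shape: the feature exists if either CPU state reports it. */
static inline bool predinv_any(const CpuSketch *cpu)
{
    assert(cpu != NULL);
    return cpu->aa64_predinv || cpu->aa32_predinv;
}
```

The two forms agree whenever the ID register for the CPU state that is not selected reads as "feature absent". They differ only for a description that sets the AArch64 feature bit and reports the feature through the 32-bit register alone.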

-New patch
+[PULL 18/52] target/arm: Factor out PMU register definitions
+Pull the code that defines the various PMU registers out
 into its own function, matching the pattern we have
 already for the debug registers.
 Apart from one style fix to a multi-line comment, this
 is purely movement of code with no changes to it.
 Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Message-id: 20200214175116.9164-6-peter.maydell@linaro.org
 ---
  target/arm/helper.c | 158 +++++++++++++++++++++++---------------------
 file changed, 82 insertions(+), 76 deletions(-)
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.c
 +++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void define_debug_regs(ARMCPU *cpu)
      }
  }
 +static void define_pmu_regs(ARMCPU *cpu)
 +{
 +    /*
 +     * v7 performance monitor control register: same implementor
 +     * field as main ID register, and we implement four counters in
 +     * addition to the cycle count register.
 +     */
 +    unsigned int i, pmcrn = 4;
 +    ARMCPRegInfo pmcr = {
 +        .name = "PMCR", .cp = 15, .crn = 9, .crm = 12, .opc1 = 0, .opc2 = 0,
 +        .access = PL0_RW,
 +        .type = ARM_CP_IO | ARM_CP_ALIAS,
 +        .fieldoffset = offsetoflow32(CPUARMState, cp15.c9_pmcr),
 +        .accessfn = pmreg_access, .writefn = pmcr_write,
 +        .raw_writefn = raw_write,
 +    };
 +    ARMCPRegInfo pmcr64 = {
 +        .name = "PMCR_EL0", .state = ARM_CP_STATE_AA64,
 +        .opc0 = 3, .opc1 = 3, .crn = 9, .crm = 12, .opc2 = 0,
 +        .access = PL0_RW, .accessfn = pmreg_access,
 +        .type = ARM_CP_IO,
 +        .fieldoffset = offsetof(CPUARMState, cp15.c9_pmcr),
 +        .resetvalue = (cpu->midr & 0xff000000) | (pmcrn << PMCRN_SHIFT),
 +        .writefn = pmcr_write, .raw_writefn = raw_write,
 +    };
 +    define_one_arm_cp_reg(cpu, &pmcr);
 +    define_one_arm_cp_reg(cpu, &pmcr64);
 +    for (i = 0; i < pmcrn; i++) {
 +        char *pmevcntr_name = g_strdup_printf("PMEVCNTR%d", i);
 +        char *pmevcntr_el0_name = g_strdup_printf("PMEVCNTR%d_EL0", i);
 +        char *pmevtyper_name = g_strdup_printf("PMEVTYPER%d", i);
 +        char *pmevtyper_el0_name = g_strdup_printf("PMEVTYPER%d_EL0", i);
 +        ARMCPRegInfo pmev_regs[] = {
 +            { .name = pmevcntr_name, .cp = 15, .crn = 14,
 +              .crm = 8 | (3 & (i >> 3)), .opc1 = 0, .opc2 = i & 7,
 +              .access = PL0_RW, .type = ARM_CP_IO | ARM_CP_ALIAS,
 +              .readfn = pmevcntr_readfn, .writefn = pmevcntr_writefn,
 +              .accessfn = pmreg_access },
 +            { .name = pmevcntr_el0_name, .state = ARM_CP_STATE_AA64,
 +              .opc0 = 3, .opc1 = 3, .crn = 14, .crm = 8 | (3 & (i >> 3)),
 +              .opc2 = i & 7, .access = PL0_RW, .accessfn = pmreg_access,
 +              .type = ARM_CP_IO,
 +              .readfn = pmevcntr_readfn, .writefn = pmevcntr_writefn,
 +              .raw_readfn = pmevcntr_rawread,
 +              .raw_writefn = pmevcntr_rawwrite },
 +            { .name = pmevtyper_name, .cp = 15, .crn = 14,
 +              .crm = 12 | (3 & (i >> 3)), .opc1 = 0, .opc2 = i & 7,
 +              .access = PL0_RW, .type = ARM_CP_IO | ARM_CP_ALIAS,
 +              .readfn = pmevtyper_readfn, .writefn = pmevtyper_writefn,
 +              .accessfn = pmreg_access },
 +            { .name = pmevtyper_el0_name, .state = ARM_CP_STATE_AA64,
 +              .opc0 = 3, .opc1 = 3, .crn = 14, .crm = 12 | (3 & (i >> 3)),
 +              .opc2 = i & 7, .access = PL0_RW, .accessfn = pmreg_access,
 +              .type = ARM_CP_IO,
 +              .readfn = pmevtyper_readfn, .writefn = pmevtyper_writefn,
 +              .raw_writefn = pmevtyper_rawwrite },
 +            REGINFO_SENTINEL
 +        };
 +        define_arm_cp_regs(cpu, pmev_regs);
 +        g_free(pmevcntr_name);
 +        g_free(pmevcntr_el0_name);
 +        g_free(pmevtyper_name);
 +        g_free(pmevtyper_el0_name);
 +    }
 +    if (FIELD_EX32(cpu->id_dfr0, ID_DFR0, PERFMON) >= 4 &&
 +            FIELD_EX32(cpu->id_dfr0, ID_DFR0, PERFMON) != 0xf) {
 +        ARMCPRegInfo v81_pmu_regs[] = {
 +            { .name = "PMCEID2", .state = ARM_CP_STATE_AA32,
 +              .cp = 15, .opc1 = 0, .crn = 9, .crm = 14, .opc2 = 4,
 +              .access = PL0_R, .accessfn = pmreg_access, .type = ARM_CP_CONST,
 +              .resetvalue = extract64(cpu->pmceid0, 32, 32) },
 +            { .name = "PMCEID3", .state = ARM_CP_STATE_AA32,
 +              .cp = 15, .opc1 = 0, .crn = 9, .crm = 14, .opc2 = 5,
 +              .access = PL0_R, .accessfn = pmreg_access, .type = ARM_CP_CONST,
 +              .resetvalue = extract64(cpu->pmceid1, 32, 32) },
 +            REGINFO_SENTINEL
 +        };
 +        define_arm_cp_regs(cpu, v81_pmu_regs);
 +    }
 +}
 +
  /* We don't know until after realize whether there's a GICv3
   * attached, and that is what registers the gicv3 sysregs.
   * So we have to fill in the GIC fields in ID_PFR/ID_PFR1_EL1/ID_AA64PFR0_EL1
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
          define_arm_cp_regs(cpu, pmovsset_cp_reginfo);
      }
      if (arm_feature(env, ARM_FEATURE_V7)) {
 -        /* v7 performance monitor control register: same implementor
 -         * field as main ID register, and we implement four counters in
 -         * addition to the cycle count register.
 -         */
 -        unsigned int i, pmcrn = 4;
 -        ARMCPRegInfo pmcr = {
 -            .name = "PMCR", .cp = 15, .crn = 9, .crm = 12, .opc1 = 0, .opc2 = 0,
 -            .access = PL0_RW,
 -            .type = ARM_CP_IO | ARM_CP_ALIAS,
 -            .fieldoffset = offsetoflow32(CPUARMState, cp15.c9_pmcr),
 -            .accessfn = pmreg_access, .writefn = pmcr_write,
 -            .raw_writefn = raw_write,
 -        };
 -        ARMCPRegInfo pmcr64 = {
 -            .name = "PMCR_EL0", .state = ARM_CP_STATE_AA64,
 -            .opc0 = 3, .opc1 = 3, .crn = 9, .crm = 12, .opc2 = 0,
 -            .access = PL0_RW, .accessfn = pmreg_access,
 -            .type = ARM_CP_IO,
 -            .fieldoffset = offsetof(CPUARMState, cp15.c9_pmcr),
 -            .resetvalue = (cpu->midr & 0xff000000) | (pmcrn << PMCRN_SHIFT),
 -            .writefn = pmcr_write, .raw_writefn = raw_write,
 -        };
 -        define_one_arm_cp_reg(cpu, &pmcr);
 -        define_one_arm_cp_reg(cpu, &pmcr64);
 -        for (i = 0; i < pmcrn; i++) {
 -            char *pmevcntr_name = g_strdup_printf("PMEVCNTR%d", i);
 -            char *pmevcntr_el0_name = g_strdup_printf("PMEVCNTR%d_EL0", i);
 -            char *pmevtyper_name = g_strdup_printf("PMEVTYPER%d", i);
 -            char *pmevtyper_el0_name = g_strdup_printf("PMEVTYPER%d_EL0", i);
 -            ARMCPRegInfo pmev_regs[] = {
 -                { .name = pmevcntr_name, .cp = 15, .crn = 14,
 -                  .crm = 8 | (3 & (i >> 3)), .opc1 = 0, .opc2 = i & 7,
 -                  .access = PL0_RW, .type = ARM_CP_IO | ARM_CP_ALIAS,
 -                  .readfn = pmevcntr_readfn, .writefn = pmevcntr_writefn,
 -                  .accessfn = pmreg_access },
 -                { .name = pmevcntr_el0_name, .state = ARM_CP_STATE_AA64,
 -                  .opc0 = 3, .opc1 = 3, .crn = 14, .crm = 8 | (3 & (i >> 3)),
 -                  .opc2 = i & 7, .access = PL0_RW, .accessfn = pmreg_access,
 -                  .type = ARM_CP_IO,
 -                  .readfn = pmevcntr_readfn, .writefn = pmevcntr_writefn,
 -                  .raw_readfn = pmevcntr_rawread,
 -                  .raw_writefn = pmevcntr_rawwrite },
 -                { .name = pmevtyper_name, .cp = 15, .crn = 14,
 -                  .crm = 12 | (3 & (i >> 3)), .opc1 = 0, .opc2 = i & 7,
 -                  .access = PL0_RW, .type = ARM_CP_IO | ARM_CP_ALIAS,
 -                  .readfn = pmevtyper_readfn, .writefn = pmevtyper_writefn,
 -                  .accessfn = pmreg_access },
 -                { .name = pmevtyper_el0_name, .state = ARM_CP_STATE_AA64,
 -                  .opc0 = 3, .opc1 = 3, .crn = 14, .crm = 12 | (3 & (i >> 3)),
 -                  .opc2 = i & 7, .access = PL0_RW, .accessfn = pmreg_access,
 -                  .type = ARM_CP_IO,
 -                  .readfn = pmevtyper_readfn, .writefn = pmevtyper_writefn,
 -                  .raw_writefn = pmevtyper_rawwrite },
 -                REGINFO_SENTINEL
 -            };
 -            define_arm_cp_regs(cpu, pmev_regs);
 -            g_free(pmevcntr_name);
 -            g_free(pmevcntr_el0_name);
 -            g_free(pmevtyper_name);
 -            g_free(pmevtyper_el0_name);
 -        }
          ARMCPRegInfo clidr = {
              .name = "CLIDR", .state = ARM_CP_STATE_BOTH,
              .opc0 = 3, .crn = 0, .crm = 0, .opc1 = 1, .opc2 = 1,
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
          define_one_arm_cp_reg(cpu, &clidr);
          define_arm_cp_regs(cpu, v7_cp_reginfo);
          define_debug_regs(cpu);
 +        define_pmu_regs(cpu);
      } else {
          define_arm_cp_regs(cpu, not_v7_cp_reginfo);
      }
 -    if (FIELD_EX32(cpu->id_dfr0, ID_DFR0, PERFMON) >= 4 &&
 -            FIELD_EX32(cpu->id_dfr0, ID_DFR0, PERFMON) != 0xf) {
 -        ARMCPRegInfo v81_pmu_regs[] = {
 -            { .name = "PMCEID2", .state = ARM_CP_STATE_AA32,
 -              .cp = 15, .opc1 = 0, .crn = 9, .crm = 14, .opc2 = 4,
 -              .access = PL0_R, .accessfn = pmreg_access, .type = ARM_CP_CONST,
 -              .resetvalue = extract64(cpu->pmceid0, 32, 32) },
 -            { .name = "PMCEID3", .state = ARM_CP_STATE_AA32,
 -              .cp = 15, .opc1 = 0, .crn = 9, .crm = 14, .opc2 = 5,
 -              .access = PL0_R, .accessfn = pmreg_access, .type = ARM_CP_CONST,
 -              .resetvalue = extract64(cpu->pmceid1, 32, 32) },
 -            REGINFO_SENTINEL
 -        };
 -        define_arm_cp_regs(cpu, v81_pmu_regs);
 -    }
      if (arm_feature(env, ARM_FEATURE_V8)) {
          /* AArch64 ID registers, which all have impdef reset values.
           * Note that within the ID register ranges the unused slots
 --
 .20.1
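
The per-counter register definitions moved by this patch derive the CRm and opc2 encoding fields from the counter index `i`. The following sketch isolates that arithmetic under assumptions: `PmevEncodingSketch` and the two function names are hypothetical, and the `i < 32` bound is only what the two-bit group field can represent, not a statement about how many counters QEMU implements.

```c
#include <assert.h>

/* Hypothetical container for the two encoding fields that depend on i. */
typedef struct {
    unsigned crm;
    unsigned opc2;
} PmevEncodingSketch;

/* PMEVCNTR<i>: CRm = 8 | (bits [4:3] of i), opc2 = bits [2:0] of i. */
static inline PmevEncodingSketch pmevcntr_encoding(unsigned i)
{
    assert(i < 32);
    PmevEncodingSketch e = { .crm = 8 | (3 & (i >> 3)), .opc2 = i & 7 };
    return e;
}

/* PMEVTYPER<i>: same split of i, with a base CRm of 12 instead of 8. */
static inline PmevEncodingSketch pmevtyper_encoding(unsigned i)
{
    assert(i < 32);
    PmevEncodingSketch e = { .crm = 12 | (3 & (i >> 3)), .opc2 = i & 7 };
    return e;
}
```

With `pmcrn = 4`, as in the patch, only indices 0 to 3 are registered, so CRm stays at 8 for the counters and 12 for the type registers, and opc2 runs from 0 to 3.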

-New patch
+[PULL 19/52] target/arm: Add and use FIELD definitions for ID_AA64DFR0_EL1
+Add FIELD() definitions for the ID_AA64DFR0_EL1 and use them
+where we currently have hard-coded bit values.
+Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Message-id: 20200214175116.9164-7-peter.maydell@linaro.org
+---
+ target/arm/cpu.h    | 10 ++++++++++
+ target/arm/cpu.c    |  2 +-
+ target/arm/helper.c |  6 +++---
+files changed, 14 insertions(+), 4 deletions(-)
+diff --git a/target/arm/cpu.h b/target/arm/cpu.h
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/cpu.h
++++ b/target/arm/cpu.h
+@@ -XXX,XX +XXX,XX @@ FIELD(ID_AA64MMFR2, BBM, 52, 4)
+ FIELD(ID_AA64MMFR2, EVT, 56, 4)
+ FIELD(ID_AA64MMFR2, E0PD, 60, 4)
++FIELD(ID_AA64DFR0, DEBUGVER, 0, 4)
++FIELD(ID_AA64DFR0, TRACEVER, 4, 4)
++FIELD(ID_AA64DFR0, PMUVER, 8, 4)
++FIELD(ID_AA64DFR0, BRPS, 12, 4)
++FIELD(ID_AA64DFR0, WRPS, 20, 4)
++FIELD(ID_AA64DFR0, CTX_CMPS, 28, 4)
++FIELD(ID_AA64DFR0, PMSVER, 32, 4)
++FIELD(ID_AA64DFR0, DOUBLELOCK, 36, 4)
++FIELD(ID_AA64DFR0, TRACEFILT, 40, 4)
++
+ FIELD(ID_DFR0, COPDBG, 0, 4)
+ FIELD(ID_DFR0, COPSDBG, 4, 4)
+ FIELD(ID_DFR0, MMAPDBG, 8, 4)
+diff --git a/target/arm/cpu.c b/target/arm/cpu.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/cpu.c
++++ b/target/arm/cpu.c
+@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
+                 cpu);
+ #endif
+     } else {
+-        cpu->id_aa64dfr0 &= ~0xf00;
++        cpu->id_aa64dfr0 = FIELD_DP64(cpu->id_aa64dfr0, ID_AA64DFR0, PMUVER, 0);
+         cpu->id_dfr0 &= ~(0xf << 24);
+         cpu->pmceid0 = 0;
+         cpu->pmceid1 = 0;
+diff --git a/target/arm/helper.c b/target/arm/helper.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/helper.c
++++ b/target/arm/helper.c
+@@ -XXX,XX +XXX,XX @@ static void define_debug_regs(ARMCPU *cpu)
+      * check that if they both exist then they agree.
+      */
+     if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64)) {
+-        assert(extract32(cpu->id_aa64dfr0, 12, 4) == brps);
+-        assert(extract32(cpu->id_aa64dfr0, 20, 4) == wrps);
+-        assert(extract32(cpu->id_aa64dfr0, 28, 4) == ctx_cmps);
++        assert(FIELD_EX64(cpu->id_aa64dfr0, ID_AA64DFR0, BRPS) == brps);
++        assert(FIELD_EX64(cpu->id_aa64dfr0, ID_AA64DFR0, WRPS) == wrps);
++        assert(FIELD_EX64(cpu->id_aa64dfr0, ID_AA64DFR0, CTX_CMPS) == ctx_cmps);
+     }
+     define_one_arm_cp_reg(cpu, &dbgdidr);
+--
+.20.1
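
To illustrate why the named-field form in this patch is equivalent to the hard-coded bit values it replaces, here is a standalone sketch. The `sketch_extract64` and `sketch_deposit64` helpers and the `SKETCH_` constants are hypothetical simplifications, not QEMU's FIELD_EX64/FIELD_DP64 macros. The shift and length values are copied from the FIELD() lines added above.

```c
#include <assert.h>
#include <stdint.h>

/* Field positions copied from the FIELD(ID_AA64DFR0, ...) definitions. */
enum {
    SKETCH_PMUVER_SHIFT = 8,    SKETCH_PMUVER_LENGTH = 4,
    SKETCH_BRPS_SHIFT = 12,     SKETCH_BRPS_LENGTH = 4,
    SKETCH_WRPS_SHIFT = 20,     SKETCH_WRPS_LENGTH = 4,
    SKETCH_CTX_CMPS_SHIFT = 28, SKETCH_CTX_CMPS_LENGTH = 4
};

/* Read `length` bits of `value`, starting at bit `shift`. */
static inline uint64_t sketch_extract64(uint64_t value, unsigned shift,
                                        unsigned length)
{
    assert(length > 0 && length < 64 && shift + length <= 64);
    return (value >> shift) & ((UINT64_C(1) << length) - 1);
}

/* Return `value` with the field at [shift, shift + length) set to `field`. */
static inline uint64_t sketch_deposit64(uint64_t value, unsigned shift,
                                        unsigned length, uint64_t field)
{
    assert(length > 0 && length < 64 && shift + length <= 64);
    uint64_t mask = ((UINT64_C(1) << length) - 1) << shift;
    return (value & ~mask) | ((field << shift) & mask);
}
```

Depositing 0 into the PMUVER field clears bits [11:8], which is what the old `&= ~0xf00` did. The benefit of the named form is that the register layout is stated once, next to the other FIELD() definitions.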

-New patch
+[PULL 20/52] target/arm: Use FIELD macros for clearing ID_DFR0 PERFMON field
+We already define FIELD macros for ID_DFR0, so use them in the
+one place where we're doing direct bit value manipulation.
+Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Message-id: 20200214175116.9164-8-peter.maydell@linaro.org
+---
+ target/arm/cpu.c | 2 +-
+file changed, 1 insertion(+), 1 deletion(-)
+diff --git a/target/arm/cpu.c b/target/arm/cpu.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/cpu.c
++++ b/target/arm/cpu.c
+@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
+ #endif
+     } else {
+         cpu->id_aa64dfr0 = FIELD_DP64(cpu->id_aa64dfr0, ID_AA64DFR0, PMUVER, 0);
+-        cpu->id_dfr0 &= ~(0xf << 24);
++        cpu->id_dfr0 = FIELD_DP32(cpu->id_dfr0, ID_DFR0, PERFMON, 0);
+         cpu->pmceid0 = 0;
+         cpu->pmceid1 = 0;
+     }
+--
+.20.1

-New patch
+[PULL 21/52] target/arm: Define an aa32_pmu_8_1 isar feature test function
+Instead of open-coding a check on the ID_DFR0 PerfMon ID register
 field, create a standardly-named isar_feature for "does AArch32 have
 a v8.1 PMUv3" and use it.
 This entails moving the id_dfr0 field into the ARMISARegisters struct.
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Message-id: 20200214175116.9164-9-peter.maydell@linaro.org
 ---
  target/arm/cpu.h      |  9 ++++++++-
  hw/intc/armv7m_nvic.c |  2 +-
  target/arm/cpu.c      | 28 ++++++++++++++--------------
  target/arm/cpu64.c    |  6 +++---
  target/arm/helper.c   |  5 ++---
 files changed, 28 insertions(+), 22 deletions(-)
 diff --git a/target/arm/cpu.h b/target/arm/cpu.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu.h
 +++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ struct ARMCPU {
          uint32_t mvfr0;
          uint32_t mvfr1;
          uint32_t mvfr2;
 +        uint32_t id_dfr0;
          uint64_t id_aa64isar0;
          uint64_t id_aa64isar1;
          uint64_t id_aa64pfr0;
@@ -XXX,XX +XXX,XX @@ struct ARMCPU {
      uint32_t reset_sctlr;
      uint32_t id_pfr0;
      uint32_t id_pfr1;
 -    uint32_t id_dfr0;
      uint64_t pmceid0;
      uint64_t pmceid1;
      uint32_t id_afr0;
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa32_ats1e1(const ARMISARegisters *id)
      return FIELD_EX64(id->mvfr0, ID_MMFR3, PAN) >= 2;
  }
 +static inline bool isar_feature_aa32_pmu_8_1(const ARMISARegisters *id)
 +{
 +    /* 0xf means "non-standard IMPDEF PMU" */
 +    return FIELD_EX32(id->id_dfr0, ID_DFR0, PERFMON) >= 4 &&
 +        FIELD_EX32(id->id_dfr0, ID_DFR0, PERFMON) != 0xf;
 +}
 +
  /*
   * 64-bit feature tests via id registers.
   */
 diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/intc/armv7m_nvic.c
 +++ b/hw/intc/armv7m_nvic.c
@@ -XXX,XX +XXX,XX @@ static uint32_t nvic_readl(NVICState *s, uint32_t offset, MemTxAttrs attrs)
      case 0xd44: /* PFR1.  */
          return cpu->id_pfr1;
      case 0xd48: /* DFR0.  */
 -        return cpu->id_dfr0;
 +        return cpu->isar.id_dfr0;
      case 0xd4c: /* AFR0.  */
          return cpu->id_afr0;
      case 0xd50: /* MMFR0.  */
 diff --git a/target/arm/cpu.c b/target/arm/cpu.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu.c
 +++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
  #endif
      } else {
          cpu->id_aa64dfr0 = FIELD_DP64(cpu->id_aa64dfr0, ID_AA64DFR0, PMUVER, 0);
 -        cpu->id_dfr0 = FIELD_DP32(cpu->id_dfr0, ID_DFR0, PERFMON, 0);
 +        cpu->isar.id_dfr0 = FIELD_DP32(cpu->isar.id_dfr0, ID_DFR0, PERFMON, 0);
          cpu->pmceid0 = 0;
          cpu->pmceid1 = 0;
      }
@@ -XXX,XX +XXX,XX @@ static void arm1136_r2_initfn(Object *obj)
      cpu->reset_sctlr = 0x00050078;
      cpu->id_pfr0 = 0x111;
      cpu->id_pfr1 = 0x1;
 -    cpu->id_dfr0 = 0x2;
 +    cpu->isar.id_dfr0 = 0x2;
      cpu->id_afr0 = 0x3;
      cpu->id_mmfr0 = 0x01130003;
      cpu->id_mmfr1 = 0x10030302;
@@ -XXX,XX +XXX,XX @@ static void arm1136_initfn(Object *obj)
      cpu->reset_sctlr = 0x00050078;
      cpu->id_pfr0 = 0x111;
      cpu->id_pfr1 = 0x1;
 -    cpu->id_dfr0 = 0x2;
 +    cpu->isar.id_dfr0 = 0x2;
      cpu->id_afr0 = 0x3;
      cpu->id_mmfr0 = 0x01130003;
      cpu->id_mmfr1 = 0x10030302;
@@ -XXX,XX +XXX,XX @@ static void arm1176_initfn(Object *obj)
      cpu->reset_sctlr = 0x00050078;
      cpu->id_pfr0 = 0x111;
      cpu->id_pfr1 = 0x11;
 -    cpu->id_dfr0 = 0x33;
 +    cpu->isar.id_dfr0 = 0x33;
      cpu->id_afr0 = 0;
      cpu->id_mmfr0 = 0x01130003;
      cpu->id_mmfr1 = 0x10030302;
@@ -XXX,XX +XXX,XX @@ static void arm11mpcore_initfn(Object *obj)
      cpu->ctr = 0x1d192992; /* 32K icache 32K dcache */
      cpu->id_pfr0 = 0x111;
      cpu->id_pfr1 = 0x1;
 -    cpu->id_dfr0 = 0;
 +    cpu->isar.id_dfr0 = 0;
      cpu->id_afr0 = 0x2;
      cpu->id_mmfr0 = 0x01100103;
      cpu->id_mmfr1 = 0x10020302;
@@ -XXX,XX +XXX,XX @@ static void cortex_m3_initfn(Object *obj)
      cpu->pmsav7_dregion = 8;
      cpu->id_pfr0 = 0x00000030;
      cpu->id_pfr1 = 0x00000200;
 -    cpu->id_dfr0 = 0x00100000;
 +    cpu->isar.id_dfr0 = 0x00100000;
      cpu->id_afr0 = 0x00000000;
      cpu->id_mmfr0 = 0x00000030;
      cpu->id_mmfr1 = 0x00000000;
@@ -XXX,XX +XXX,XX @@ static void cortex_m4_initfn(Object *obj)
      cpu->isar.mvfr2 = 0x00000000;
      cpu->id_pfr0 = 0x00000030;
      cpu->id_pfr1 = 0x00000200;
 -    cpu->id_dfr0 = 0x00100000;
 +    cpu->isar.id_dfr0 = 0x00100000;
      cpu->id_afr0 = 0x00000000;
      cpu->id_mmfr0 = 0x00000030;
      cpu->id_mmfr1 = 0x00000000;
@@ -XXX,XX +XXX,XX @@ static void cortex_m7_initfn(Object *obj)
      cpu->isar.mvfr2 = 0x00000040;
      cpu->id_pfr0 = 0x00000030;
      cpu->id_pfr1 = 0x00000200;
 -    cpu->id_dfr0 = 0x00100000;
 +    cpu->isar.id_dfr0 = 0x00100000;
      cpu->id_afr0 = 0x00000000;
      cpu->id_mmfr0 = 0x00100030;
      cpu->id_mmfr1 = 0x00000000;
@@ -XXX,XX +XXX,XX @@ static void cortex_m33_initfn(Object *obj)
      cpu->isar.mvfr2 = 0x00000040;
      cpu->id_pfr0 = 0x00000030;
      cpu->id_pfr1 = 0x00000210;
 -    cpu->id_dfr0 = 0x00200000;
 +    cpu->isar.id_dfr0 = 0x00200000;
      cpu->id_afr0 = 0x00000000;
      cpu->id_mmfr0 = 0x00101F40;
      cpu->id_mmfr1 = 0x00000000;
@@ -XXX,XX +XXX,XX @@ static void cortex_r5_initfn(Object *obj)
      cpu->midr = 0x411fc153; /* r1p3 */
      cpu->id_pfr0 = 0x0131;
      cpu->id_pfr1 = 0x001;
 -    cpu->id_dfr0 = 0x010400;
 +    cpu->isar.id_dfr0 = 0x010400;
      cpu->id_afr0 = 0x0;
      cpu->id_mmfr0 = 0x0210030;
      cpu->id_mmfr1 = 0x00000000;
@@ -XXX,XX +XXX,XX @@ static void cortex_a8_initfn(Object *obj)
      cpu->reset_sctlr = 0x00c50078;
      cpu->id_pfr0 = 0x1031;
      cpu->id_pfr1 = 0x11;
 -    cpu->id_dfr0 = 0x400;
 +    cpu->isar.id_dfr0 = 0x400;
      cpu->id_afr0 = 0;
      cpu->id_mmfr0 = 0x31100003;
      cpu->id_mmfr1 = 0x20000000;
@@ -XXX,XX +XXX,XX @@ static void cortex_a9_initfn(Object *obj)
      cpu->reset_sctlr = 0x00c50078;
      cpu->id_pfr0 = 0x1031;
      cpu->id_pfr1 = 0x11;
 -    cpu->id_dfr0 = 0x000;
 +    cpu->isar.id_dfr0 = 0x000;
      cpu->id_afr0 = 0;
      cpu->id_mmfr0 = 0x00100103;
      cpu->id_mmfr1 = 0x20000000;
@@ -XXX,XX +XXX,XX @@ static void cortex_a7_initfn(Object *obj)
      cpu->reset_sctlr = 0x00c50078;
      cpu->id_pfr0 = 0x00001131;
      cpu->id_pfr1 = 0x00011011;
 -    cpu->id_dfr0 = 0x02010555;
 +    cpu->isar.id_dfr0 = 0x02010555;
      cpu->id_afr0 = 0x00000000;
      cpu->id_mmfr0 = 0x10101105;
      cpu->id_mmfr1 = 0x40000000;
@@ -XXX,XX +XXX,XX @@ static void cortex_a15_initfn(Object *obj)
      cpu->reset_sctlr = 0x00c50078;
      cpu->id_pfr0 = 0x00001131;
      cpu->id_pfr1 = 0x00011011;
 -    cpu->id_dfr0 = 0x02010555;
 +    cpu->isar.id_dfr0 = 0x02010555;
      cpu->id_afr0 = 0x00000000;
      cpu->id_mmfr0 = 0x10201105;
      cpu->id_mmfr1 = 0x20000000;
 diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu64.c
 +++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_a57_initfn(Object *obj)
      cpu->reset_sctlr = 0x00c50838;
      cpu->id_pfr0 = 0x00000131;
      cpu->id_pfr1 = 0x00011011;
 -    cpu->id_dfr0 = 0x03010066;
 +    cpu->isar.id_dfr0 = 0x03010066;
      cpu->id_afr0 = 0x00000000;
      cpu->id_mmfr0 = 0x10101105;
      cpu->id_mmfr1 = 0x40000000;
@@ -XXX,XX +XXX,XX @@ static void aarch64_a53_initfn(Object *obj)
      cpu->reset_sctlr = 0x00c50838;
      cpu->id_pfr0 = 0x00000131;
      cpu->id_pfr1 = 0x00011011;
 -    cpu->id_dfr0 = 0x03010066;
 +    cpu->isar.id_dfr0 = 0x03010066;
      cpu->id_afr0 = 0x00000000;
      cpu->id_mmfr0 = 0x10101105;
      cpu->id_mmfr1 = 0x40000000;
@@ -XXX,XX +XXX,XX @@ static void aarch64_a72_initfn(Object *obj)
      cpu->reset_sctlr = 0x00c50838;
      cpu->id_pfr0 = 0x00000131;
      cpu->id_pfr1 = 0x00011011;
 -    cpu->id_dfr0 = 0x03010066;
 +    cpu->isar.id_dfr0 = 0x03010066;
      cpu->id_afr0 = 0x00000000;
      cpu->id_mmfr0 = 0x10201105;
      cpu->id_mmfr1 = 0x40000000;
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.c
 +++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void define_pmu_regs(ARMCPU *cpu)
          g_free(pmevtyper_name);
          g_free(pmevtyper_el0_name);
      }
 -    if (FIELD_EX32(cpu->id_dfr0, ID_DFR0, PERFMON) >= 4 &&
 -            FIELD_EX32(cpu->id_dfr0, ID_DFR0, PERFMON) != 0xf) {
 +    if (cpu_isar_feature(aa32_pmu_8_1, cpu)) {
          ARMCPRegInfo v81_pmu_regs[] = {
              { .name = "PMCEID2", .state = ARM_CP_STATE_AA32,
                .cp = 15, .opc1 = 0, .crn = 9, .crm = 14, .opc2 = 4,
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
                .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 1, .opc2 = 2,
                .access = PL1_R, .type = ARM_CP_CONST,
                .accessfn = access_aa32_tid3,
 -              .resetvalue = cpu->id_dfr0 },
 +              .resetvalue = cpu->isar.id_dfr0 },
              { .name = "ID_AFR0", .state = ARM_CP_STATE_BOTH,
                .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 1, .opc2 = 3,
                .access = PL1_R, .type = ARM_CP_CONST,
 --
 .20.1

-[Qemu-devel] [PULL 18/29] target/arm/cpu: Ensure we can use the pmu with kvm
+[PULL 22/52] target/arm: Add _aa64_ and _any_ versions of pmu_8_1 isar checks
-From: Andrew Jones <drjones@redhat.com>
+Add the 64-bit version of the "is this a v8.1 PMUv3?"
+ID register check function, and the _any_ version that
+checks for either AArch32 or AArch64 support. We'll use
+this in a later commit.
-We first convert the pmu property from a static property to one with
+We don't (yet) do any isar_feature checks on ID_AA64DFR1_EL1,
-its own accessors. Then we use the set accessor to check if the PMU is
+but we move id_aa64dfr1 into the ARMISARegisters struct with
-supported when using KVM. Indeed a 32-bit KVM host does not support
+id_aa64dfr0, for consistency.
-the PMU, so this check will catch an attempt to use it at property-set
-time.
-Signed-off-by: Andrew Jones <drjones@redhat.com>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Reviewed-by: Eric Auger <eric.auger@redhat.com>
+Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Message-id: 20200214175116.9164-10-peter.maydell@linaro.org
 ---
- target/arm/kvm_arm.h | 14 ++++++++++++++
+ target/arm/cpu.h    | 15 +++++++++++++--
- target/arm/cpu.c     | 30 +++++++++++++++++++++++++-----
+ target/arm/cpu.c    |  3 ++-
- target/arm/kvm.c     |  7 +++++++
+ target/arm/cpu64.c  |  6 +++---
-files changed, 46 insertions(+), 5 deletions(-)
+ target/arm/helper.c | 12 +++++++-----
 files changed, 25 insertions(+), 11 deletions(-)
-diff --git a/target/arm/kvm_arm.h b/target/arm/kvm_arm.h
+diff --git a/target/arm/cpu.h b/target/arm/cpu.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/kvm_arm.h
+--- a/target/arm/cpu.h
-+++ b/target/arm/kvm_arm.h
++++ b/target/arm/cpu.h
-@@ -XXX,XX +XXX,XX @@ void kvm_arm_set_cpu_features_from_host(ARMCPU *cpu);
+@@ -XXX,XX +XXX,XX @@ struct ARMCPU {
-  */
+         uint64_t id_aa64mmfr0;
- bool kvm_arm_aarch32_supported(CPUState *cs);
+         uint64_t id_aa64mmfr1;
+         uint64_t id_aa64mmfr2;
-+/**
++        uint64_t id_aa64dfr0;
-+ * bool kvm_arm_pmu_supported:
++        uint64_t id_aa64dfr1;
-+ * @cs: CPUState
+     } isar;
-+ *
+     uint32_t midr;
-+ * Returns: true if the KVM VCPU can enable its PMU
+     uint32_t revidr;
-+ * and false otherwise.
+@@ -XXX,XX +XXX,XX @@ struct ARMCPU {
-+ */
+     uint32_t id_mmfr2;
-+bool kvm_arm_pmu_supported(CPUState *cs);
+     uint32_t id_mmfr3;
-+
+     uint32_t id_mmfr4;
- /**
+-    uint64_t id_aa64dfr0;
-  * kvm_arm_get_max_vm_ipa_size - Returns the number of bits in the
+-    uint64_t id_aa64dfr1;
-  * IPA address space supported by KVM
+     uint64_t id_aa64afr0;
-@@ -XXX,XX +XXX,XX @@ static inline bool kvm_arm_aarch32_supported(CPUState *cs)
+     uint64_t id_aa64afr1;
-     return false;
+     uint32_t dbgdidr;
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa64_bti(const ARMISARegisters *id)
      return FIELD_EX64(id->id_aa64pfr1, ID_AA64PFR1, BT) != 0;
  }
-+static inline bool kvm_arm_pmu_supported(CPUState *cs)
++static inline bool isar_feature_aa64_pmu_8_1(const ARMISARegisters *id)
 +{
-+    return false;
++    return FIELD_EX64(id->id_aa64dfr0, ID_AA64DFR0, PMUVER) >= 4 &&
 +        FIELD_EX64(id->id_aa64dfr0, ID_AA64DFR0, PMUVER) != 0xf;
 +}
 +
- static inline int kvm_arm_get_max_vm_ipa_size(MachineState *ms)
+ /*
- {
+  * Feature tests for "does this exist in either 32-bit or 64-bit?"
-     return -ENOENT;
+  */
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_any_predinv(const ARMISARegisters *id)
      return isar_feature_aa64_predinv(id) || isar_feature_aa32_predinv(id);
  }
 +static inline bool isar_feature_any_pmu_8_1(const ARMISARegisters *id)
 +{
 +    return isar_feature_aa64_pmu_8_1(id) || isar_feature_aa32_pmu_8_1(id);
 +}
 +
  /*
   * Forward to the above feature tests given an ARMCPU pointer.
   */
 diff --git a/target/arm/cpu.c b/target/arm/cpu.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu.c
 +++ b/target/arm/cpu.c
-@@ -XXX,XX +XXX,XX @@ static Property arm_cpu_has_el3_property =
+@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
- static Property arm_cpu_cfgend_property =
+                 cpu);
-             DEFINE_PROP_BOOL("cfgend", ARMCPU, cfgend, false);
+ #endif
+     } else {
--/* use property name "pmu" to match other archs and virt tools */
+-        cpu->id_aa64dfr0 = FIELD_DP64(cpu->id_aa64dfr0, ID_AA64DFR0, PMUVER, 0);
--static Property arm_cpu_has_pmu_property =
++        cpu->isar.id_aa64dfr0 =
--            DEFINE_PROP_BOOL("pmu", ARMCPU, has_pmu, true);
++            FIELD_DP64(cpu->isar.id_aa64dfr0, ID_AA64DFR0, PMUVER, 0);
--
+         cpu->isar.id_dfr0 = FIELD_DP32(cpu->isar.id_dfr0, ID_DFR0, PERFMON, 0);
- static Property arm_cpu_has_vfp_property =
+         cpu->pmceid0 = 0;
-             DEFINE_PROP_BOOL("vfp", ARMCPU, has_vfp, true);
+         cpu->pmceid1 = 0;
+diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
-@@ -XXX,XX +XXX,XX @@ static Property arm_cpu_pmsav7_dregion_property =
+index XXXXXXX..XXXXXXX 100644
-                                            pmsav7_dregion,
+--- a/target/arm/cpu64.c
-                                            qdev_prop_uint32, uint32_t);
++++ b/target/arm/cpu64.c
+@@ -XXX,XX +XXX,XX @@ static void aarch64_a57_initfn(Object *obj)
-+static bool arm_get_pmu(Object *obj, Error **errp)
+     cpu->isar.id_isar5 = 0x00011121;
-+{
+     cpu->isar.id_isar6 = 0;
-+    ARMCPU *cpu = ARM_CPU(obj);
+     cpu->isar.id_aa64pfr0 = 0x00002222;
-+
+-    cpu->id_aa64dfr0 = 0x10305106;
-+    return cpu->has_pmu;
++    cpu->isar.id_aa64dfr0 = 0x10305106;
-+}
+     cpu->isar.id_aa64isar0 = 0x00011120;
-+
+     cpu->isar.id_aa64mmfr0 = 0x00001124;
-+static void arm_set_pmu(Object *obj, bool value, Error **errp)
+     cpu->dbgdidr = 0x3516d000;
-+{
+@@ -XXX,XX +XXX,XX @@ static void aarch64_a53_initfn(Object *obj)
-+    ARMCPU *cpu = ARM_CPU(obj);
+     cpu->isar.id_isar5 = 0x00011121;
-+
+     cpu->isar.id_isar6 = 0;
-+    if (value) {
+     cpu->isar.id_aa64pfr0 = 0x00002222;
-+        if (kvm_enabled() && !kvm_arm_pmu_supported(CPU(cpu))) {
+-    cpu->id_aa64dfr0 = 0x10305106;
-+            error_setg(errp, "'pmu' feature not supported by KVM on this host");
++    cpu->isar.id_aa64dfr0 = 0x10305106;
-+            return;
+     cpu->isar.id_aa64isar0 = 0x00011120;
-+        }
+     cpu->isar.id_aa64mmfr0 = 0x00001122; /* 40 bit physical addr */
-+        set_feature(&cpu->env, ARM_FEATURE_PMU);
+     cpu->dbgdidr = 0x3516d000;
-+    } else {
+@@ -XXX,XX +XXX,XX @@ static void aarch64_a72_initfn(Object *obj)
-+        unset_feature(&cpu->env, ARM_FEATURE_PMU);
+     cpu->isar.id_isar4 = 0x00011142;
-+    }
+     cpu->isar.id_isar5 = 0x00011121;
-+    cpu->has_pmu = value;
+     cpu->isar.id_aa64pfr0 = 0x00002222;
-+}
+-    cpu->id_aa64dfr0 = 0x10305106;
-+
++    cpu->isar.id_aa64dfr0 = 0x10305106;
- static void arm_get_init_svtor(Object *obj, Visitor *v, const char *name,
+     cpu->isar.id_aa64isar0 = 0x00011120;
-                                void *opaque, Error **errp)
+     cpu->isar.id_aa64mmfr0 = 0x00001124;
- {
+     cpu->dbgdidr = 0x3516d000;
-@@ -XXX,XX +XXX,XX @@ void arm_cpu_post_init(Object *obj)
+diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.c
 +++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@
  #include "hw/semihosting/semihost.h"
  #include "sysemu/cpus.h"
  #include "sysemu/kvm.h"
 +#include "sysemu/tcg.h"
  #include "qemu/range.h"
  #include "qapi/qapi-commands-machine-target.h"
  #include "qapi/error.h"
@@ -XXX,XX +XXX,XX @@ static void define_debug_regs(ARMCPU *cpu)
       * check that if they both exist then they agree.
       */
      if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64)) {
 -        assert(FIELD_EX64(cpu->id_aa64dfr0, ID_AA64DFR0, BRPS) == brps);
 -        assert(FIELD_EX64(cpu->id_aa64dfr0, ID_AA64DFR0, WRPS) == wrps);
 -        assert(FIELD_EX64(cpu->id_aa64dfr0, ID_AA64DFR0, CTX_CMPS) == ctx_cmps);
 +        assert(FIELD_EX64(cpu->isar.id_aa64dfr0, ID_AA64DFR0, BRPS) == brps);
 +        assert(FIELD_EX64(cpu->isar.id_aa64dfr0, ID_AA64DFR0, WRPS) == wrps);
 +        assert(FIELD_EX64(cpu->isar.id_aa64dfr0, ID_AA64DFR0, CTX_CMPS)
 +               == ctx_cmps);
      }
-     if (arm_feature(&cpu->env, ARM_FEATURE_PMU)) {
+     define_one_arm_cp_reg(cpu, &dbgdidr);
--        qdev_property_add_static(DEVICE(obj), &arm_cpu_has_pmu_property,
+@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
-+        cpu->has_pmu = true;
+               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 5, .opc2 = 0,
-+        object_property_add_bool(obj, "pmu", arm_get_pmu, arm_set_pmu,
+               .access = PL1_R, .type = ARM_CP_CONST,
-                                  &error_abort);
+               .accessfn = access_aa64_tid3,
-     }
+-              .resetvalue = cpu->id_aa64dfr0 },
++              .resetvalue = cpu->isar.id_aa64dfr0 },
-diff --git a/target/arm/kvm.c b/target/arm/kvm.c
+             { .name = "ID_AA64DFR1_EL1", .state = ARM_CP_STATE_AA64,
-index XXXXXXX..XXXXXXX 100644
+               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 5, .opc2 = 1,
---- a/target/arm/kvm.c
+               .access = PL1_R, .type = ARM_CP_CONST,
-+++ b/target/arm/kvm.c
+               .accessfn = access_aa64_tid3,
-@@ -XXX,XX +XXX,XX @@ void kvm_arm_set_cpu_features_from_host(ARMCPU *cpu)
+-              .resetvalue = cpu->id_aa64dfr1 },
-     env->features = arm_host_cpu_features.features;
++              .resetvalue = cpu->isar.id_aa64dfr1 },
- }
+             { .name = "ID_AA64DFR2_EL1_RESERVED", .state = ARM_CP_STATE_AA64,
+               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 5, .opc2 = 2,
-+bool kvm_arm_pmu_supported(CPUState *cpu)
+               .access = PL1_R, .type = ARM_CP_CONST,
 +{
 +    KVMState *s = KVM_STATE(current_machine->accelerator);
 +
 +    return kvm_check_extension(s, KVM_CAP_ARM_PMU_V3);
 +}
 +
  int kvm_arm_get_max_vm_ipa_size(MachineState *ms)
  {
      KVMState *s = KVM_STATE(ms->accelerator);
 --
 .20.1
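
For readers following the pmu_8_1 checks in this patch, here is a minimal
standalone sketch of the same logic outside QEMU. It is not the QEMU code:
field_ex() and the IdRegs struct are hypothetical stand-ins for FIELD_EX32/
FIELD_EX64 and ARMISARegisters, and the PMUVER position (bits [11:8] of
ID_AA64DFR0_EL1) is taken from the Arm architecture rather than from this
diff. The PERFMON position (bits [27:24] of ID_DFR0) matches the FIELD()
definitions in target/arm/cpu.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for QEMU's FIELD_EX32()/FIELD_EX64() macros. */
static inline uint64_t field_ex(uint64_t reg, unsigned shift, unsigned len)
{
    return (reg >> shift) & ((UINT64_C(1) << len) - 1);
}

/* Hypothetical cut-down version of the ARMISARegisters struct. */
typedef struct IdRegs {
    uint32_t id_dfr0;
    uint64_t id_aa64dfr0;
} IdRegs;

/*
 * A field value of 4 or more means "PMUv3 for v8.1 or later", except
 * that 0xf means an IMPLEMENTATION DEFINED PMU and must not match.
 */
static bool pmu_field_is_8_1(uint64_t field)
{
    return field >= 4 && field != 0xf;
}

/* AArch32 view: ID_DFR0.PERFMON, bits [27:24]. */
static bool feature_aa32_pmu_8_1(const IdRegs *id)
{
    return pmu_field_is_8_1(field_ex(id->id_dfr0, 24, 4));
}

/* AArch64 view: ID_AA64DFR0_EL1.PMUVER, bits [11:8] (assumed position). */
static bool feature_aa64_pmu_8_1(const IdRegs *id)
{
    return pmu_field_is_8_1(field_ex(id->id_aa64dfr0, 8, 4));
}

/* "Does this exist in either 32-bit or 64-bit?" */
static bool feature_any_pmu_8_1(const IdRegs *id)
{
    return feature_aa64_pmu_8_1(id) || feature_aa32_pmu_8_1(id);
}
```

With the Cortex-A57 values quoted in the diff (id_dfr0 = 0x03010066,
id_aa64dfr0 = 0x10305106) the two fields are 3 and 1, so all three checks
return false.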

-New patch
+[PULL 23/52] target/arm: Stop assuming DBGDIDR always exists
+The AArch32 DBGDIDR defines properties like the number of
+breakpoints, watchpoints and context-matching comparators.  On an
+AArch64 CPU, the register may not even exist if AArch32 is not
+supported at EL1.
+Currently we hard-code use of DBGDIDR to identify the number of
+breakpoints etc; this works for all our TCG CPUs, but will break if
+we ever add an AArch64-only CPU.  We also have an assert() that the
+AArch32 and AArch64 registers match, which currently works only by
+luck for KVM because we don't populate either of these ID registers
+from the KVM vCPU and so they are both zero.
+Clean this up so we have functions for finding the number
+of breakpoints, watchpoints and context comparators which look
+in the appropriate ID register.
+This allows us to drop the "check that AArch64 and AArch32 agree
+on the number of breakpoints etc" asserts:
+ * we no longer look at the AArch32 versions unless that's the
+   right place to be looking
+ * it's valid to have a CPU (eg AArch64-only) where they don't match
+ * we shouldn't have been asserting the validity of ID registers
+   in a codepath used with KVM anyway
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20200214175116.9164-11-peter.maydell@linaro.org
+---
+ target/arm/cpu.h          |  7 +++++++
+ target/arm/internals.h    | 42 +++++++++++++++++++++++++++++++++++++++
+ target/arm/debug_helper.c |  6 +++---
+ target/arm/helper.c       | 21 +++++---------------
+files changed, 57 insertions(+), 19 deletions(-)
+diff --git a/target/arm/cpu.h b/target/arm/cpu.h
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/cpu.h
++++ b/target/arm/cpu.h
+@@ -XXX,XX +XXX,XX @@ FIELD(ID_DFR0, MPROFDBG, 20, 4)
+ FIELD(ID_DFR0, PERFMON, 24, 4)
+ FIELD(ID_DFR0, TRACEFILT, 28, 4)
++FIELD(DBGDIDR, SE_IMP, 12, 1)
++FIELD(DBGDIDR, NSUHD_IMP, 14, 1)
++FIELD(DBGDIDR, VERSION, 16, 4)
++FIELD(DBGDIDR, CTX_CMPS, 20, 4)
++FIELD(DBGDIDR, BRPS, 24, 4)
++FIELD(DBGDIDR, WRPS, 28, 4)
++
+ FIELD(MVFR0, SIMDREG, 0, 4)
+ FIELD(MVFR0, FPSP, 4, 4)
+ FIELD(MVFR0, FPDP, 8, 4)
+diff --git a/target/arm/internals.h b/target/arm/internals.h
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/internals.h
++++ b/target/arm/internals.h
+@@ -XXX,XX +XXX,XX @@ static inline uint32_t arm_debug_exception_fsr(CPUARMState *env)
+     }
+ }
++/**
++ * arm_num_brps: Return number of implemented breakpoints.
++ * Note that the ID register BRPS field is "number of bps - 1",
++ * and we return the actual number of breakpoints.
++ */
++static inline int arm_num_brps(ARMCPU *cpu)
++{
++    if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64)) {
++        return FIELD_EX64(cpu->isar.id_aa64dfr0, ID_AA64DFR0, BRPS) + 1;
++    } else {
++        return FIELD_EX32(cpu->dbgdidr, DBGDIDR, BRPS) + 1;
++    }
++}
++
++/**
++ * arm_num_wrps: Return number of implemented watchpoints.
++ * Note that the ID register WRPS field is "number of wps - 1",
++ * and we return the actual number of watchpoints.
++ */
++static inline int arm_num_wrps(ARMCPU *cpu)
++{
++    if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64)) {
++        return FIELD_EX64(cpu->isar.id_aa64dfr0, ID_AA64DFR0, WRPS) + 1;
++    } else {
++        return FIELD_EX32(cpu->dbgdidr, DBGDIDR, WRPS) + 1;
++    }
++}
++
++/**
++ * arm_num_ctx_cmps: Return number of implemented context comparators.
++ * Note that the ID register CTX_CMPS field is "number of cmps - 1",
++ * and we return the actual number of comparators.
++ */
++static inline int arm_num_ctx_cmps(ARMCPU *cpu)
++{
++    if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64)) {
++        return FIELD_EX64(cpu->isar.id_aa64dfr0, ID_AA64DFR0, CTX_CMPS) + 1;
++    } else {
++        return FIELD_EX32(cpu->dbgdidr, DBGDIDR, CTX_CMPS) + 1;
++    }
++}
++
+ /* Note make_memop_idx reserves 4 bits for mmu_idx, and MO_BSWAP is bit 3.
+  * Thus a TCGMemOpIdx, without any MO_ALIGN bits, fits in 8 bits.
+  */
+diff --git a/target/arm/debug_helper.c b/target/arm/debug_helper.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/debug_helper.c
++++ b/target/arm/debug_helper.c
+@@ -XXX,XX +XXX,XX @@ static bool linked_bp_matches(ARMCPU *cpu, int lbn)
+ {
+     CPUARMState *env = &cpu->env;
+     uint64_t bcr = env->cp15.dbgbcr[lbn];
+-    int brps = extract32(cpu->dbgdidr, 24, 4);
+-    int ctx_cmps = extract32(cpu->dbgdidr, 20, 4);
++    int brps = arm_num_brps(cpu);
++    int ctx_cmps = arm_num_ctx_cmps(cpu);
+     int bt;
+     uint32_t contextidr;
+     uint64_t hcr_el2;
+@@ -XXX,XX +XXX,XX @@ static bool linked_bp_matches(ARMCPU *cpu, int lbn)
+      * case DBGWCR<n>_EL1.LBN must indicate that breakpoint).
+      * We choose the former.
+      */
+-    if (lbn > brps || lbn < (brps - ctx_cmps)) {
++    if (lbn >= brps || lbn < (brps - ctx_cmps)) {
+         return false;
+     }
+diff --git a/target/arm/helper.c b/target/arm/helper.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/helper.c
++++ b/target/arm/helper.c
+@@ -XXX,XX +XXX,XX @@ static void define_debug_regs(ARMCPU *cpu)
+     };
+     /* Note that all these register fields hold "number of Xs minus 1". */
+-    brps = extract32(cpu->dbgdidr, 24, 4);
+-    wrps = extract32(cpu->dbgdidr, 28, 4);
+-    ctx_cmps = extract32(cpu->dbgdidr, 20, 4);
++    brps = arm_num_brps(cpu);
++    wrps = arm_num_wrps(cpu);
++    ctx_cmps = arm_num_ctx_cmps(cpu);
+     assert(ctx_cmps <= brps);
+-    /* The DBGDIDR and ID_AA64DFR0_EL1 define various properties
+-     * of the debug registers such as number of breakpoints;
+-     * check that if they both exist then they agree.
+-     */
+-    if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64)) {
+-        assert(FIELD_EX64(cpu->isar.id_aa64dfr0, ID_AA64DFR0, BRPS) == brps);
+-        assert(FIELD_EX64(cpu->isar.id_aa64dfr0, ID_AA64DFR0, WRPS) == wrps);
+-        assert(FIELD_EX64(cpu->isar.id_aa64dfr0, ID_AA64DFR0, CTX_CMPS)
+-               == ctx_cmps);
+-    }
+-
+     define_one_arm_cp_reg(cpu, &dbgdidr);
+     define_arm_cp_regs(cpu, debug_cp_reginfo);
+@@ -XXX,XX +XXX,XX @@ static void define_debug_regs(ARMCPU *cpu)
+         define_arm_cp_regs(cpu, debug_lpae_cp_reginfo);
+     }
+-    for (i = 0; i < brps + 1; i++) {
++    for (i = 0; i < brps; i++) {
+         ARMCPRegInfo dbgregs[] = {
+             { .name = "DBGBVR", .state = ARM_CP_STATE_BOTH,
+               .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 0, .crm = i, .opc2 = 4,
+@@ -XXX,XX +XXX,XX @@ static void define_debug_regs(ARMCPU *cpu)
+         define_arm_cp_regs(cpu, dbgregs);
+     }
+-    for (i = 0; i < wrps + 1; i++) {
++    for (i = 0; i < wrps; i++) {
+         ARMCPRegInfo dbgregs[] = {
+             { .name = "DBGWVR", .state = ARM_CP_STATE_BOTH,
+               .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 0, .crm = i, .opc2 = 6,
+--
+.20.1
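
The "field holds N - 1, helper returns N" convention in this patch is easy to
get wrong, so here is a small standalone sketch of it. It is not the QEMU
code: field_ex32() is a hypothetical stand-in for FIELD_EX32(), the helpers
take a raw DBGDIDR value instead of an ARMCPU pointer, and
lbn_is_context_aware() is an illustrative name for the range check inside
linked_bp_matches(). The bit positions follow the FIELD(DBGDIDR, ...)
definitions added above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for QEMU's FIELD_EX32() macro. */
static inline uint32_t field_ex32(uint32_t reg, unsigned shift, unsigned len)
{
    return (reg >> shift) & ((1u << len) - 1);
}

/* Each DBGDIDR field stores "count - 1"; return the real count. */
static int num_ctx_cmps(uint32_t dbgdidr)
{
    return field_ex32(dbgdidr, 20, 4) + 1;   /* CTX_CMPS, bits [23:20] */
}

static int num_brps(uint32_t dbgdidr)
{
    return field_ex32(dbgdidr, 24, 4) + 1;   /* BRPS, bits [27:24] */
}

static int num_wrps(uint32_t dbgdidr)
{
    return field_ex32(dbgdidr, 28, 4) + 1;   /* WRPS, bits [31:28] */
}

/*
 * The context-aware breakpoints are the highest-numbered ones, so a
 * linked breakpoint number is acceptable only in the half-open range
 * [brps - ctx_cmps, brps).
 */
static bool lbn_is_context_aware(uint32_t dbgdidr, int lbn)
{
    int brps = num_brps(dbgdidr);
    int ctx_cmps = num_ctx_cmps(dbgdidr);

    return !(lbn >= brps || lbn < (brps - ctx_cmps));
}
```

For the Cortex-A57 DBGDIDR value 0x3516d000 this gives 6 breakpoints,
4 watchpoints and 2 context comparators, so only breakpoints 4 and 5 can be
linked. The accepted range is the same as before the patch: the old test used
the raw "minus 1" values, and (N - 1) - (C - 1) equals N - C, while
"lbn > N - 1" equals "lbn >= N".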

-New patch
+[PULL 24/52] target/arm: Move DBGDIDR into ARMISARegisters
+We're going to want to read the DBGDIDR register from KVM in
+a subsequent commit, which means it needs to be in the
+ARMISARegisters sub-struct. Move it.
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20200214175116.9164-12-peter.maydell@linaro.org
+---
+ target/arm/cpu.h       | 2 +-
+ target/arm/internals.h | 6 +++---
+ target/arm/cpu.c       | 8 ++++----
+ target/arm/cpu64.c     | 6 +++---
+ target/arm/helper.c    | 2 +-
+files changed, 12 insertions(+), 12 deletions(-)
+diff --git a/target/arm/cpu.h b/target/arm/cpu.h
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/cpu.h
++++ b/target/arm/cpu.h
+@@ -XXX,XX +XXX,XX @@ struct ARMCPU {
+         uint32_t mvfr1;
+         uint32_t mvfr2;
+         uint32_t id_dfr0;
++        uint32_t dbgdidr;
+         uint64_t id_aa64isar0;
+         uint64_t id_aa64isar1;
+         uint64_t id_aa64pfr0;
+@@ -XXX,XX +XXX,XX @@ struct ARMCPU {
+     uint32_t id_mmfr4;
+     uint64_t id_aa64afr0;
+     uint64_t id_aa64afr1;
+-    uint32_t dbgdidr;
+     uint32_t clidr;
+     uint64_t mp_affinity; /* MP ID without feature bits */
+     /* The elements of this array are the CCSIDR values for each cache,
+diff --git a/target/arm/internals.h b/target/arm/internals.h
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/internals.h
++++ b/target/arm/internals.h
+@@ -XXX,XX +XXX,XX @@ static inline int arm_num_brps(ARMCPU *cpu)
+     if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64)) {
+         return FIELD_EX64(cpu->isar.id_aa64dfr0, ID_AA64DFR0, BRPS) + 1;
+     } else {
+-        return FIELD_EX32(cpu->dbgdidr, DBGDIDR, BRPS) + 1;
++        return FIELD_EX32(cpu->isar.dbgdidr, DBGDIDR, BRPS) + 1;
+     }
+ }
+@@ -XXX,XX +XXX,XX @@ static inline int arm_num_wrps(ARMCPU *cpu)
+     if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64)) {
+         return FIELD_EX64(cpu->isar.id_aa64dfr0, ID_AA64DFR0, WRPS) + 1;
+     } else {
+-        return FIELD_EX32(cpu->dbgdidr, DBGDIDR, WRPS) + 1;
++        return FIELD_EX32(cpu->isar.dbgdidr, DBGDIDR, WRPS) + 1;
+     }
+ }
+@@ -XXX,XX +XXX,XX @@ static inline int arm_num_ctx_cmps(ARMCPU *cpu)
+     if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64)) {
+         return FIELD_EX64(cpu->isar.id_aa64dfr0, ID_AA64DFR0, CTX_CMPS) + 1;
+     } else {
+-        return FIELD_EX32(cpu->dbgdidr, DBGDIDR, CTX_CMPS) + 1;
++        return FIELD_EX32(cpu->isar.dbgdidr, DBGDIDR, CTX_CMPS) + 1;
+     }
+ }
+diff --git a/target/arm/cpu.c b/target/arm/cpu.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/cpu.c
++++ b/target/arm/cpu.c
+@@ -XXX,XX +XXX,XX @@ static void cortex_a8_initfn(Object *obj)
+     cpu->isar.id_isar2 = 0x21232031;
+     cpu->isar.id_isar3 = 0x11112131;
+     cpu->isar.id_isar4 = 0x00111142;
+-    cpu->dbgdidr = 0x15141000;
++    cpu->isar.dbgdidr = 0x15141000;
+     cpu->clidr = (1 << 27) | (2 << 24) | 3;
+     cpu->ccsidr[0] = 0xe007e01a; /* 16k L1 dcache. */
+     cpu->ccsidr[1] = 0x2007e01a; /* 16k L1 icache. */
+@@ -XXX,XX +XXX,XX @@ static void cortex_a9_initfn(Object *obj)
+     cpu->isar.id_isar2 = 0x21232041;
+     cpu->isar.id_isar3 = 0x11112131;
+     cpu->isar.id_isar4 = 0x00111142;
+-    cpu->dbgdidr = 0x35141000;
++    cpu->isar.dbgdidr = 0x35141000;
+     cpu->clidr = (1 << 27) | (1 << 24) | 3;
+     cpu->ccsidr[0] = 0xe00fe019; /* 16k L1 dcache. */
+     cpu->ccsidr[1] = 0x200fe019; /* 16k L1 icache. */
+@@ -XXX,XX +XXX,XX @@ static void cortex_a7_initfn(Object *obj)
+     cpu->isar.id_isar2 = 0x21232041;
+     cpu->isar.id_isar3 = 0x11112131;
+     cpu->isar.id_isar4 = 0x10011142;
+-    cpu->dbgdidr = 0x3515f005;
++    cpu->isar.dbgdidr = 0x3515f005;
+     cpu->clidr = 0x0a200023;
+     cpu->ccsidr[0] = 0x701fe00a; /* 32K L1 dcache */
+     cpu->ccsidr[1] = 0x201fe00a; /* 32K L1 icache */
+@@ -XXX,XX +XXX,XX @@ static void cortex_a15_initfn(Object *obj)
+     cpu->isar.id_isar2 = 0x21232041;
+     cpu->isar.id_isar3 = 0x11112131;
+     cpu->isar.id_isar4 = 0x10011142;
+-    cpu->dbgdidr = 0x3515f021;
++    cpu->isar.dbgdidr = 0x3515f021;
+     cpu->clidr = 0x0a200023;
+     cpu->ccsidr[0] = 0x701fe00a; /* 32K L1 dcache */
+     cpu->ccsidr[1] = 0x201fe00a; /* 32K L1 icache */
+diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/cpu64.c
++++ b/target/arm/cpu64.c
+@@ -XXX,XX +XXX,XX @@ static void aarch64_a57_initfn(Object *obj)
+     cpu->isar.id_aa64dfr0 = 0x10305106;
+     cpu->isar.id_aa64isar0 = 0x00011120;
+     cpu->isar.id_aa64mmfr0 = 0x00001124;
+-    cpu->dbgdidr = 0x3516d000;
++    cpu->isar.dbgdidr = 0x3516d000;
+     cpu->clidr = 0x0a200023;
+     cpu->ccsidr[0] = 0x701fe00a; /* 32KB L1 dcache */
+     cpu->ccsidr[1] = 0x201fe012; /* 48KB L1 icache */
+@@ -XXX,XX +XXX,XX @@ static void aarch64_a53_initfn(Object *obj)
+     cpu->isar.id_aa64dfr0 = 0x10305106;
+     cpu->isar.id_aa64isar0 = 0x00011120;
+     cpu->isar.id_aa64mmfr0 = 0x00001122; /* 40 bit physical addr */
+-    cpu->dbgdidr = 0x3516d000;
++    cpu->isar.dbgdidr = 0x3516d000;
+     cpu->clidr = 0x0a200023;
+     cpu->ccsidr[0] = 0x700fe01a; /* 32KB L1 dcache */
+     cpu->ccsidr[1] = 0x201fe00a; /* 32KB L1 icache */
+@@ -XXX,XX +XXX,XX @@ static void aarch64_a72_initfn(Object *obj)
+     cpu->isar.id_aa64dfr0 = 0x10305106;
+     cpu->isar.id_aa64isar0 = 0x00011120;
+     cpu->isar.id_aa64mmfr0 = 0x00001124;
+-    cpu->dbgdidr = 0x3516d000;
++    cpu->isar.dbgdidr = 0x3516d000;
+     cpu->clidr = 0x0a200023;
+     cpu->ccsidr[0] = 0x701fe00a; /* 32KB L1 dcache */
+     cpu->ccsidr[1] = 0x201fe012; /* 48KB L1 icache */
+diff --git a/target/arm/helper.c b/target/arm/helper.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/helper.c
++++ b/target/arm/helper.c
+@@ -XXX,XX +XXX,XX @@ static void define_debug_regs(ARMCPU *cpu)
+     ARMCPRegInfo dbgdidr = {
+         .name = "DBGDIDR", .cp = 14, .crn = 0, .crm = 0, .opc1 = 0, .opc2 = 0,
+         .access = PL0_R, .accessfn = access_tda,
+-        .type = ARM_CP_CONST, .resetvalue = cpu->dbgdidr,
++        .type = ARM_CP_CONST, .resetvalue = cpu->isar.dbgdidr,
+     };
+     /* Note that all these register fields hold "number of Xs minus 1". */
+--
+.20.1

-[Qemu-devel] [PULL 21/29] target/arm/kvm64: Fix error returns
+[PULL 25/52] target/arm: Read debug-related ID registers from KVM
-From: Andrew Jones <drjones@redhat.com>
+Now we have isar_feature test functions that look at fields in the
+ID_AA64DFR0_EL1 and ID_DFR0 ID registers, add the code that reads
+these register values from KVM so that the checks behave correctly
+when we're using KVM.
-A couple return -EINVAL's forgot their '-'s.
+No isar_feature function tests ID_AA64DFR1_EL1 or DBGDIDR yet, but we
+add it to maintain the invariant that every field in the
+ARMISARegisters struct is populated for a KVM CPU and can be relied
+on.  This requirement isn't actually written down yet, so add a note
+to the relevant comment.
-Signed-off-by: Andrew Jones <drjones@redhat.com>
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Eric Auger <eric.auger@redhat.com>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Message-id: 20200214175116.9164-13-peter.maydell@linaro.org
 ---
- target/arm/kvm64.c | 4 ++--
+ target/arm/cpu.h   |  5 +++++
-file changed, 2 insertions(+), 2 deletions(-)
+ target/arm/kvm32.c |  8 ++++++++
  target/arm/kvm64.c | 36 ++++++++++++++++++++++++++++++++++++
 files changed, 49 insertions(+)
+diff --git a/target/arm/cpu.h b/target/arm/cpu.h
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/cpu.h
++++ b/target/arm/cpu.h
+@@ -XXX,XX +XXX,XX @@ struct ARMCPU {
+      * prefix means a constant register.
+      * Some of these registers are split out into a substructure that
+      * is shared with the translators to control the ISA.
++     *
++     * Note that if you add an ID register to the ARMISARegisters struct
++     * you need to also update the 32-bit and 64-bit versions of the
++     * kvm_arm_get_host_cpu_features() function to correctly populate the
++     * field by reading the value from the KVM vCPU.
+      */
+     struct ARMISARegisters {
+         uint32_t id_isar0;
+diff --git a/target/arm/kvm32.c b/target/arm/kvm32.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/kvm32.c
++++ b/target/arm/kvm32.c
+@@ -XXX,XX +XXX,XX @@ bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
+         ahcf->isar.id_isar6 = 0;
+     }
++    err |= read_sys_reg32(fdarray[2], &ahcf->isar.id_dfr0,
++                          ARM_CP15_REG32(0, 0, 1, 2));
++
+     err |= read_sys_reg32(fdarray[2], &ahcf->isar.mvfr0,
+                           KVM_REG_ARM | KVM_REG_SIZE_U32 |
+                           KVM_REG_ARM_VFP | KVM_REG_ARM_VFP_MVFR0);
+@@ -XXX,XX +XXX,XX @@ bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
+      * Fortunately there is not yet anything in there that affects migration.
+      */
++    /*
++     * There is no way to read DBGDIDR, because currently 32-bit KVM
++     * doesn't implement debug at all. Leave it at zero.
++     */
++
+     kvm_arm_destroy_scratch_host_vcpu(fdarray);
+     if (err < 0) {
 diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/kvm64.c
 +++ b/target/arm/kvm64.c
-@@ -XXX,XX +XXX,XX @@ int kvm_arch_put_registers(CPUState *cs, int level)
+@@ -XXX,XX +XXX,XX @@ bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
-     write_cpustate_to_list(cpu, true);
+     } else {
+         err |= read_sys_reg64(fdarray[2], &ahcf->isar.id_aa64pfr1,
-     if (!write_list_to_kvmstate(cpu, level)) {
+                               ARM64_SYS_REG(3, 0, 0, 4, 1));
--        return EINVAL;
++        err |= read_sys_reg64(fdarray[2], &ahcf->isar.id_aa64dfr0,
-+        return -EINVAL;
++                              ARM64_SYS_REG(3, 0, 0, 5, 0));
 +        err |= read_sys_reg64(fdarray[2], &ahcf->isar.id_aa64dfr1,
 +                              ARM64_SYS_REG(3, 0, 0, 5, 1));
          err |= read_sys_reg64(fdarray[2], &ahcf->isar.id_aa64isar0,
                                ARM64_SYS_REG(3, 0, 0, 6, 0));
          err |= read_sys_reg64(fdarray[2], &ahcf->isar.id_aa64isar1,
@@ -XXX,XX +XXX,XX @@ bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
           * than skipping the reads and leaving 0, as we must avoid
           * considering the values in every case.
           */
 +        err |= read_sys_reg32(fdarray[2], &ahcf->isar.id_dfr0,
 +                              ARM64_SYS_REG(3, 0, 0, 1, 2));
          err |= read_sys_reg32(fdarray[2], &ahcf->isar.id_isar0,
                                ARM64_SYS_REG(3, 0, 0, 2, 0));
          err |= read_sys_reg32(fdarray[2], &ahcf->isar.id_isar1,
@@ -XXX,XX +XXX,XX @@ bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
                                ARM64_SYS_REG(3, 0, 0, 3, 1));
          err |= read_sys_reg32(fdarray[2], &ahcf->isar.mvfr2,
                                ARM64_SYS_REG(3, 0, 0, 3, 2));
 +
 +        /*
 +         * DBGDIDR is a bit complicated because the kernel doesn't
 +         * provide an accessor for it in 64-bit mode, which is what this
 +         * scratch VM is in, and there's no architected "64-bit sysreg
 +         * which reads the same as the 32-bit register" the way there is
 +         * for other ID registers. Instead we synthesize a value from the
 +         * AArch64 ID_AA64DFR0, the same way the kernel code in
 +         * arch/arm64/kvm/sys_regs.c:trap_dbgidr() does.
 +         * We only do this if the CPU supports AArch32 at EL1.
 +         */
 +        if (FIELD_EX64(ahcf->isar.id_aa64pfr0, ID_AA64PFR0, EL1) >= 2) {
 +            int wrps = FIELD_EX64(ahcf->isar.id_aa64dfr0, ID_AA64DFR0, WRPS);
 +            int brps = FIELD_EX64(ahcf->isar.id_aa64dfr0, ID_AA64DFR0, BRPS);
 +            int ctx_cmps =
 +                FIELD_EX64(ahcf->isar.id_aa64dfr0, ID_AA64DFR0, CTX_CMPS);
 +            int version = 6; /* ARMv8 debug architecture */
 +            bool has_el3 =
 +                !!FIELD_EX64(ahcf->isar.id_aa64pfr0, ID_AA64PFR0, EL3);
 +            uint32_t dbgdidr = 0;
 +
 +            dbgdidr = FIELD_DP32(dbgdidr, DBGDIDR, WRPS, wrps);
 +            dbgdidr = FIELD_DP32(dbgdidr, DBGDIDR, BRPS, brps);
 +            dbgdidr = FIELD_DP32(dbgdidr, DBGDIDR, CTX_CMPS, ctx_cmps);
 +            dbgdidr = FIELD_DP32(dbgdidr, DBGDIDR, VERSION, version);
 +            dbgdidr = FIELD_DP32(dbgdidr, DBGDIDR, NSUHD_IMP, has_el3);
 +            dbgdidr = FIELD_DP32(dbgdidr, DBGDIDR, SE_IMP, has_el3);
 +            dbgdidr |= (1 << 15); /* RES1 bit */
 +            ahcf->isar.dbgdidr = dbgdidr;
 +        }
      }
-     kvm_arm_sync_mpstate_to_kvm(cpu);
+     sve_supported = ioctl(fdarray[0], KVM_CHECK_EXTENSION, KVM_CAP_ARM_SVE) > 0;
@@ -XXX,XX +XXX,XX @@ int kvm_arch_get_registers(CPUState *cs)
      }
      if (!write_kvmstate_to_list(cpu)) {
 -        return EINVAL;
 +        return -EINVAL;
      }
      /* Note that it's OK to have registers which aren't in CPUState,
       * so we can ignore a failure return here.
 --
 .20.1
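
The DBGDIDR synthesis in the kvm64.c hunk above can be tried out in isolation. What follows is a minimal standalone sketch, not the QEMU code: the helpers `field_ex64`, `field_dp32` and `synth_dbgdidr` are hypothetical stand-ins for QEMU's `FIELD_EX64`/`FIELD_DP32` macros, and the bit positions are hard-coded from my reading of the Arm architecture (ID_AA64DFR0: BRPS [15:12], WRPS [23:20], CTX_CMPS [31:28]; ID_AA64PFR0: EL3 [15:12]; DBGDIDR: WRPS [31:28], BRPS [27:24], CTX_CMPS [23:20], Version [19:16], NSUHD_imp bit 14, SE_imp bit 12). Check them against the Arm ARM before reusing. Unlike the patch, the sketch does not gate on AArch32 support at EL1.

```c
#include <assert.h>
#include <stdint.h>

/* Extract 'len' bits starting at bit 'shift' from a 64-bit value. */
static inline uint64_t field_ex64(uint64_t v, unsigned shift, unsigned len)
{
    assert(len > 0 && len < 64 && shift + len <= 64);
    return (v >> shift) & ((UINT64_C(1) << len) - 1);
}

/* Deposit 'val' into the 'len'-bit field of 'reg' that starts at 'shift'. */
static inline uint32_t field_dp32(uint32_t reg, unsigned shift, unsigned len,
                                  uint32_t val)
{
    assert(len > 0 && len < 32 && shift + len <= 32);
    uint32_t mask = ((UINT32_C(1) << len) - 1) << shift;
    return (reg & ~mask) | ((val << shift) & mask);
}

/* Hypothetical helper: build a DBGDIDR value from the AArch64 ID registers. */
static uint32_t synth_dbgdidr(uint64_t id_aa64dfr0, uint64_t id_aa64pfr0)
{
    uint32_t wrps = (uint32_t)field_ex64(id_aa64dfr0, 20, 4);
    uint32_t brps = (uint32_t)field_ex64(id_aa64dfr0, 12, 4);
    uint32_t ctx_cmps = (uint32_t)field_ex64(id_aa64dfr0, 28, 4);
    uint32_t has_el3 = field_ex64(id_aa64pfr0, 12, 4) != 0;
    uint32_t dbgdidr = 0;

    dbgdidr = field_dp32(dbgdidr, 28, 4, wrps);
    dbgdidr = field_dp32(dbgdidr, 24, 4, brps);
    dbgdidr = field_dp32(dbgdidr, 20, 4, ctx_cmps);
    dbgdidr = field_dp32(dbgdidr, 16, 4, 6);        /* ARMv8 debug architecture */
    dbgdidr = field_dp32(dbgdidr, 14, 1, has_el3);  /* NSUHD_imp */
    dbgdidr = field_dp32(dbgdidr, 12, 1, has_el3);  /* SE_imp */
    dbgdidr |= UINT32_C(1) << 15;                   /* RES1 bit */
    return dbgdidr;
}
```

For example, with ID_AA64DFR0 = 0x10305106 and ID_AA64PFR0 = 0x2222 (values chosen purely for illustration), the sketch yields 0x3516d000; with EL3 absent, the SE_imp and NSUHD_imp bits drop out and the result is 0x35168000.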

-New patch
+[PULL 26/52] target/arm: Implement ARMv8.1-PMU extension
+The ARMv8.1-PMU extension requires:
+ * the evtCount field in PMETYPER<n>_EL0 is 16 bits, not 10
+ * MDCR_EL2.HPMD allows event counting to be disabled at EL2
+ * two new required events, STALL_FRONTEND and STALL_BACKEND
+ * ID register bits in ID_AA64DFR0_EL1 and ID_DFR0
+We already implement the 16-bit evtCount field and the
+HPMD bit, so all that is missing is the two new events:
+  STALL_FRONTEND
+   "counts every cycle counted by the CPU_CYCLES event on which no
+    operation was issued because there are no operations available
+    to issue to this PE from the frontend"
+  STALL_BACKEND
+   "counts every cycle counted by the CPU_CYCLES event on which no
+    operation was issued because the backend is unable to accept
+    any available operations from the frontend"
+QEMU never stalls in this sense, so our implementation is trivial:
+always return a zero count.
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Message-id: 20200214175116.9164-14-peter.maydell@linaro.org
+---
+ target/arm/helper.c | 32 ++++++++++++++++++++++++++++++--
+file changed, 30 insertions(+), 2 deletions(-)
+diff --git a/target/arm/helper.c b/target/arm/helper.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/helper.c
++++ b/target/arm/helper.c
+@@ -XXX,XX +XXX,XX @@ static int64_t instructions_ns_per(uint64_t icount)
+ }
+ #endif
++static bool pmu_8_1_events_supported(CPUARMState *env)
++{
++    /* For events which are supported in any v8.1 PMU */
++    return cpu_isar_feature(any_pmu_8_1, env_archcpu(env));
++}
++
++static uint64_t zero_event_get_count(CPUARMState *env)
++{
++    /* For events which on QEMU never fire, so their count is always zero */
++    return 0;
++}
++
++static int64_t zero_event_ns_per(uint64_t cycles)
++{
++    /* An event which never fires can never overflow */
++    return -1;
++}
++
+ static const pm_event pm_events[] = {
+     { .number = 0x000, /* SW_INCR */
+       .supported = event_always_supported,
+@@ -XXX,XX +XXX,XX @@ static const pm_event pm_events[] = {
+       .supported = event_always_supported,
+       .get_count = cycles_get_count,
+       .ns_per_count = cycles_ns_per,
+-    }
++    },
+ #endif
++    { .number = 0x023, /* STALL_FRONTEND */
++      .supported = pmu_8_1_events_supported,
++      .get_count = zero_event_get_count,
++      .ns_per_count = zero_event_ns_per,
++    },
++    { .number = 0x024, /* STALL_BACKEND */
++      .supported = pmu_8_1_events_supported,
++      .get_count = zero_event_get_count,
++      .ns_per_count = zero_event_ns_per,
++    },
+ };
+ /*
+@@ -XXX,XX +XXX,XX @@ static const pm_event pm_events[] = {
+  * should first be updated to something sparse instead of the current
+  * supported_event_map[] array.
+  */
+-#define MAX_EVENT_ID 0x11
++#define MAX_EVENT_ID 0x24
+ #define UNSUPPORTED_EVENT UINT16_MAX
+ static uint16_t supported_event_map[MAX_EVENT_ID + 1];
+--
+.20.1
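
The hunk above raises MAX_EVENT_ID because supported_event_map[] is a dense array indexed by event number, so the highest registered event number determines its size. The sketch below illustrates that relationship under simplified assumptions; `demo_event`, `demo_events`, `event_map` and `demo_init_map` are hypothetical names, and the callbacks take no CPU state, unlike the real `pm_event` entries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_EVENT_ID 0x24
#define UNSUPPORTED_EVENT UINT16_MAX

/* Hypothetical, simplified event descriptor (no CPU state argument). */
typedef struct {
    uint16_t number;
    bool (*supported)(void);
    uint64_t (*get_count)(void);
} demo_event;

static bool always(void) { return true; }
static bool never(void) { return false; }
/* An event that never fires always reports a count of zero. */
static uint64_t zero_count(void) { return 0; }

static const demo_event demo_events[] = {
    { 0x000, always, zero_count },
    { 0x023, always, zero_count },  /* STALL_FRONTEND */
    { 0x024, never,  zero_count },  /* STALL_BACKEND, pretend unsupported */
};

/* Dense map: event number -> index into demo_events[], or UNSUPPORTED_EVENT. */
static uint16_t event_map[MAX_EVENT_ID + 1];

static void demo_init_map(void)
{
    size_t n = sizeof(demo_events) / sizeof(demo_events[0]);

    for (size_t i = 0; i <= MAX_EVENT_ID; i++) {
        event_map[i] = UNSUPPORTED_EVENT;
    }
    for (size_t i = 0; i < n; i++) {
        const demo_event *e = &demo_events[i];

        /* Every table entry must fit in the dense map. */
        assert(e->number <= MAX_EVENT_ID);
        if (e->supported()) {
            event_map[e->number] = (uint16_t)i;
        }
    }
}
```

With MAX_EVENT_ID left at 0x11, the 0x023 and 0x024 entries would not fit in the map, which is why adding an event with a higher number has to come with a bump of the limit.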

-[Qemu-devel] [PULL 20/29] target/arm/cpu: Use div-round-up to determine predicate register array size
+[PULL 27/52] target/arm: Implement ARMv8.4-PMU extension
-From: Andrew Jones <drjones@redhat.com>
+The ARMv8.4-PMU extension adds:
  * one new required event, STALL
  * one new system register PMMIR_EL1
-Unless we're guaranteed to always increase ARM_MAX_VQ by a multiple of
+(There are also some more L1-cache related events, but since
-four, then we should use DIV_ROUND_UP to ensure we get an appropriate
+we don't implement any cache we don't provide these, in the
-array size.
+same way we don't provide the base-PMUv3 cache events.)
-Signed-off-by: Andrew Jones <drjones@redhat.com>
+The STALL event "counts every attributable cycle on which no
 attributable instruction or operation was sent for execution on this
 PE".  QEMU doesn't stall in this sense, so this is another
 always-reads-zero event.
 The PMMIR_EL1 register is a read-only register providing
 implementation-specific information about the PMU; currently it has
 only one field, SLOTS, which defines behaviour of the STALL_SLOT PMU
 event.  Since QEMU doesn't implement the STALL_SLOT event, we can
 validly make the register read zero.
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Message-id: 20200214175116.9164-15-peter.maydell@linaro.org
 ---
- target/arm/cpu.h | 2 +-
+ target/arm/cpu.h    | 18 ++++++++++++++++++
-file changed, 1 insertion(+), 1 deletion(-)
+ target/arm/helper.c | 22 +++++++++++++++++++++-
 files changed, 39 insertions(+), 1 deletion(-)
 diff --git a/target/arm/cpu.h b/target/arm/cpu.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu.h
 +++ b/target/arm/cpu.h
-@@ -XXX,XX +XXX,XX @@ typedef struct ARMVectorReg {
+@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa32_pmu_8_1(const ARMISARegisters *id)
- #ifdef TARGET_AARCH64
+         FIELD_EX32(id->id_dfr0, ID_DFR0, PERFMON) != 0xf;
- /* In AArch32 mode, predicate registers do not exist at all.  */
+ }
- typedef struct ARMPredicateReg {
--    uint64_t p[2 * ARM_MAX_VQ / 8] QEMU_ALIGNED(16);
++static inline bool isar_feature_aa32_pmu_8_4(const ARMISARegisters *id)
-+    uint64_t p[DIV_ROUND_UP(2 * ARM_MAX_VQ, 8)] QEMU_ALIGNED(16);
++{
- } ARMPredicateReg;
++    /* 0xf means "non-standard IMPDEF PMU" */
++    return FIELD_EX32(id->id_dfr0, ID_DFR0, PERFMON) >= 5 &&
- /* In AArch32 mode, PAC keys do not exist at all.  */
++        FIELD_EX32(id->id_dfr0, ID_DFR0, PERFMON) != 0xf;
 +}
 +
  /*
   * 64-bit feature tests via id registers.
   */
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa64_pmu_8_1(const ARMISARegisters *id)
          FIELD_EX64(id->id_aa64dfr0, ID_AA64DFR0, PMUVER) != 0xf;
  }
 +static inline bool isar_feature_aa64_pmu_8_4(const ARMISARegisters *id)
 +{
 +    return FIELD_EX64(id->id_aa64dfr0, ID_AA64DFR0, PMUVER) >= 5 &&
 +        FIELD_EX64(id->id_aa64dfr0, ID_AA64DFR0, PMUVER) != 0xf;
 +}
 +
  /*
   * Feature tests for "does this exist in either 32-bit or 64-bit?"
   */
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_any_pmu_8_1(const ARMISARegisters *id)
      return isar_feature_aa64_pmu_8_1(id) || isar_feature_aa32_pmu_8_1(id);
  }
 +static inline bool isar_feature_any_pmu_8_4(const ARMISARegisters *id)
 +{
 +    return isar_feature_aa64_pmu_8_4(id) || isar_feature_aa32_pmu_8_4(id);
 +}
 +
  /*
   * Forward to the above feature tests given an ARMCPU pointer.
   */
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.c
 +++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static bool pmu_8_1_events_supported(CPUARMState *env)
      return cpu_isar_feature(any_pmu_8_1, env_archcpu(env));
  }
 +static bool pmu_8_4_events_supported(CPUARMState *env)
 +{
 +    /* For events which are supported in any v8.4 PMU */
 +    return cpu_isar_feature(any_pmu_8_4, env_archcpu(env));
 +}
 +
  static uint64_t zero_event_get_count(CPUARMState *env)
  {
      /* For events which on QEMU never fire, so their count is always zero */
@@ -XXX,XX +XXX,XX @@ static const pm_event pm_events[] = {
        .get_count = zero_event_get_count,
        .ns_per_count = zero_event_ns_per,
      },
 +    { .number = 0x03c, /* STALL */
 +      .supported = pmu_8_4_events_supported,
 +      .get_count = zero_event_get_count,
 +      .ns_per_count = zero_event_ns_per,
 +    },
  };
  /*
@@ -XXX,XX +XXX,XX @@ static const pm_event pm_events[] = {
   * should first be updated to something sparse instead of the current
   * supported_event_map[] array.
   */
 -#define MAX_EVENT_ID 0x24
 +#define MAX_EVENT_ID 0x3c
  #define UNSUPPORTED_EVENT UINT16_MAX
  static uint16_t supported_event_map[MAX_EVENT_ID + 1];
@@ -XXX,XX +XXX,XX @@ static void define_pmu_regs(ARMCPU *cpu)
          };
          define_arm_cp_regs(cpu, v81_pmu_regs);
      }
 +    if (cpu_isar_feature(any_pmu_8_4, cpu)) {
 +        static const ARMCPRegInfo v84_pmmir = {
 +            .name = "PMMIR_EL1", .state = ARM_CP_STATE_BOTH,
 +            .opc0 = 3, .opc1 = 0, .crn = 9, .crm = 14, .opc2 = 6,
 +            .access = PL1_R, .accessfn = pmreg_access, .type = ARM_CP_CONST,
 +            .resetvalue = 0
 +        };
 +        define_one_arm_cp_reg(cpu, &v84_pmmir);
 +    }
  }
  /* We don't know until after realize whether there's a GICv3
 --
 .20.1
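
The feature tests added above treat the PMU version field as an ordered value with one reserved encoding. A standalone sketch of that check follows. The helper names are hypothetical, the field position (ID_AA64DFR0.PMUVer at bits [11:8]) is hard-coded from my reading of the architecture, and the threshold of 4 for ARMv8.1-PMU is an assumption, since the hunk only shows the 8.4 threshold of 5.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PMUVER_SHIFT 8
#define PMUVER_LEN   4
static_assert(PMUVER_SHIFT + PMUVER_LEN <= 64, "field must fit in 64 bits");

/* Extract the PMUVer field from an ID_AA64DFR0 value. */
static inline unsigned pmuver(uint64_t id_aa64dfr0)
{
    return (unsigned)((id_aa64dfr0 >> PMUVER_SHIFT) &
                      ((UINT64_C(1) << PMUVER_LEN) - 1));
}

/* 0xf means "non-standard IMPDEF PMU", so it never counts as ">= N". */
static inline bool has_pmu_8_1(uint64_t id_aa64dfr0)
{
    unsigned v = pmuver(id_aa64dfr0);
    return v >= 4 && v != 0xf;   /* threshold 4 is an assumption here */
}

static inline bool has_pmu_8_4(uint64_t id_aa64dfr0)
{
    unsigned v = pmuver(id_aa64dfr0);
    return v >= 5 && v != 0xf;
}
```

Because the comparison is ordered, a register advertising version 5 passes both tests, so a CPU that provides ARMv8.4-PMU is also treated as providing ARMv8.1-PMU, while the IMPDEF encoding 0xf passes neither.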

-New patch
+[PULL 28/52] target/arm: Provide ARMv8.4-PMU in '-cpu max'
+Set the ID register bits to provide ARMv8.4-PMU (and implicitly
+also ARMv8.1-PMU) in the 'max' CPU.
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Message-id: 20200214175116.9164-16-peter.maydell@linaro.org
+---
+ target/arm/cpu64.c | 8 ++++++++
+file changed, 8 insertions(+)
+diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/cpu64.c
++++ b/target/arm/cpu64.c
+@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
+         u = FIELD_DP32(u, ID_MMFR3, PAN, 2); /* ATS1E1 */
+         cpu->id_mmfr3 = u;
++        u = cpu->isar.id_aa64dfr0;
++        u = FIELD_DP64(u, ID_AA64DFR0, PMUVER, 5); /* v8.4-PMU */
++        cpu->isar.id_aa64dfr0 = u;
++
++        u = cpu->isar.id_dfr0;
++        u = FIELD_DP32(u, ID_DFR0, PERFMON, 5); /* v8.4-PMU */
++        cpu->isar.id_dfr0 = u;
++
+         /*
+          * FIXME: We do not yet support ARMv8.2-fp16 for AArch32 yet,
+          * so do not set MVFR1.FPHP.  Strictly speaking this is not legal,
+--
+.20.1

-New patch
+[PULL 29/52] target/arm: Correct definition of PMCRDP
+The PMCR_EL0.DP bit is bit 5, which is 0x20, not 0x10.  0x10 is 'X'.
+Correct our #define of PMCRDP and add the missing PMCRX.
+We do have the correct behaviour for handling the DP bit being
+set, so this fixes a guest-visible bug.
+Fixes: 033614c47de
+Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Message-id: 20200214175116.9164-17-peter.maydell@linaro.org
+---
+ target/arm/helper.c | 3 ++-
+file changed, 2 insertions(+), 1 deletion(-)
+diff --git a/target/arm/helper.c b/target/arm/helper.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/helper.c
++++ b/target/arm/helper.c
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v6_cp_reginfo[] = {
+ #define PMCRN_MASK  0xf800
+ #define PMCRN_SHIFT 11
+ #define PMCRLC  0x40
+-#define PMCRDP  0x10
++#define PMCRDP  0x20
++#define PMCRX   0x10
+ #define PMCRD   0x8
+ #define PMCRC   0x4
+ #define PMCRP   0x2
+--
+.20.1

-New patch
+[PULL 30/52] target/arm: Correct handling of PMCR_EL0.LC bit
+The LC bit in the PMCR_EL0 register is supposed to be:
+ * read/write
+ * RES1 on an AArch64-only implementation
+ * an architecturally UNKNOWN value on reset
+(and use of LC==0 by software is deprecated).
+We were implementing it incorrectly as read-only always zero,
+though we do have all the code needed to test it and behave
+accordingly.
+Instead make it a read-write bit which resets to 1 always, which
+satisfies all the architectural requirements above.
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Message-id: 20200214175116.9164-18-peter.maydell@linaro.org
+---
+ target/arm/helper.c | 13 +++++++++----
+file changed, 9 insertions(+), 4 deletions(-)
+diff --git a/target/arm/helper.c b/target/arm/helper.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/helper.c
++++ b/target/arm/helper.c
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v6_cp_reginfo[] = {
+ #define PMCRC   0x4
+ #define PMCRP   0x2
+ #define PMCRE   0x1
++/*
++ * Mask of PMCR bits writeable by guest (not including WO bits like C, P,
++ * which can be written as 1 to trigger behaviour but which stay RAZ).
++ */
++#define PMCR_WRITEABLE_MASK (PMCRLC | PMCRDP | PMCRX | PMCRD | PMCRE)
+ #define PMXEVTYPER_P          0x80000000
+ #define PMXEVTYPER_U          0x40000000
+@@ -XXX,XX +XXX,XX @@ static void pmcr_write(CPUARMState *env, const ARMCPRegInfo *ri,
+         }
+     }
+-    /* only the DP, X, D and E bits are writable */
+-    env->cp15.c9_pmcr &= ~0x39;
+-    env->cp15.c9_pmcr |= (value & 0x39);
++    env->cp15.c9_pmcr &= ~PMCR_WRITEABLE_MASK;
++    env->cp15.c9_pmcr |= (value & PMCR_WRITEABLE_MASK);
+     pmu_op_finish(env);
+ }
+@@ -XXX,XX +XXX,XX @@ static void define_pmu_regs(ARMCPU *cpu)
+         .access = PL0_RW, .accessfn = pmreg_access,
+         .type = ARM_CP_IO,
+         .fieldoffset = offsetof(CPUARMState, cp15.c9_pmcr),
+-        .resetvalue = (cpu->midr & 0xff000000) | (pmcrn << PMCRN_SHIFT),
++        .resetvalue = (cpu->midr & 0xff000000) | (pmcrn << PMCRN_SHIFT) |
++                      PMCRLC,
+         .writefn = pmcr_write, .raw_writefn = raw_write,
+     };
+     define_one_arm_cp_reg(cpu, &pmcr);
+--
+.20.1
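
The masked update in pmcr_write() above can be exercised on its own. The sketch below copies the #defines from the two PMCR patches and wraps the two-line update in a hypothetical helper, `pmcr_apply_write`; it leaves out the counter reset side effects of the C and P bits and the pmu_op_start()/pmu_op_finish() bracketing.

```c
#include <assert.h>
#include <stdint.h>

#define PMCRLC  0x40
#define PMCRDP  0x20
#define PMCRX   0x10
#define PMCRD   0x8
#define PMCRC   0x4
#define PMCRP   0x2
#define PMCRE   0x1
/* Bits the guest can change; C and P are write-only triggers that stay RAZ. */
#define PMCR_WRITEABLE_MASK (PMCRLC | PMCRDP | PMCRX | PMCRD | PMCRE)

/* The old literal mask 0x39 plus the newly writable LC bit. */
static_assert(PMCR_WRITEABLE_MASK == (0x39 | PMCRLC), "mask should be 0x79");

/* Hypothetical helper: new register value after a guest write of 'value'. */
static uint32_t pmcr_apply_write(uint32_t old, uint32_t value)
{
    return (old & ~(uint32_t)PMCR_WRITEABLE_MASK) |
           (value & PMCR_WRITEABLE_MASK);
}
```

Starting from a reset value such as 0x41003040 (an illustrative implementer byte and counter count, with LC set), writing all ones gives 0x41003079: the read-only fields are preserved and the C and P bits still read as zero. Writing zero clears LC along with the other writable bits, giving 0x41003000.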

-[Qemu-devel] [PULL 17/29] target/arm/cpu64: Ensure kvm really supports aarch64=off
+[PULL 31/52] target/arm: Test correct register in aa32_pan and aa32_ats1e1 checks
-From: Andrew Jones <drjones@redhat.com>
+The isar_feature_aa32_pan and isar_feature_aa32_ats1e1 functions
 are supposed to be testing fields in ID_MMFR3; but a cut-and-paste
 error meant we were looking at MVFR0 instead.
-If -cpu <cpu>,aarch64=off is used then KVM must also be used, and it
+Fix the functions to look at the right register; this requires
-and the host must support running the vcpu in 32-bit mode. Also, if
+us to move at least id_mmfr3 to the ARMISARegisters struct; we
--cpu <cpu>,aarch64=on is used, then it doesn't matter if kvm is
+choose to move all the ID_MMFRn registers for consistency.
 enabled or not.
-Signed-off-by: Andrew Jones <drjones@redhat.com>
+Fixes: 3d6ad6bb466f
 Reviewed-by: Eric Auger <eric.auger@redhat.com>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20200214175116.9164-19-peter.maydell@linaro.org
 ---
- target/arm/kvm_arm.h | 14 ++++++++++++++
+ target/arm/cpu.h      |  14 +++---
- target/arm/cpu64.c   | 12 ++++++------
+ hw/intc/armv7m_nvic.c |   8 ++--
- target/arm/kvm64.c   |  9 +++++++++
+ target/arm/cpu.c      | 104 +++++++++++++++++++++---------------------
-files changed, 29 insertions(+), 6 deletions(-)
+ target/arm/cpu64.c    |  28 ++++++------
  target/arm/helper.c   |  12 ++---
  target/arm/kvm32.c    |  17 +++++++
  target/arm/kvm64.c    |  10 ++++
 files changed, 110 insertions(+), 83 deletions(-)
-diff --git a/target/arm/kvm_arm.h b/target/arm/kvm_arm.h
+diff --git a/target/arm/cpu.h b/target/arm/cpu.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/kvm_arm.h
+--- a/target/arm/cpu.h
-+++ b/target/arm/kvm_arm.h
++++ b/target/arm/cpu.h
-@@ -XXX,XX +XXX,XX @@ bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf);
+@@ -XXX,XX +XXX,XX @@ struct ARMCPU {
-  */
+         uint32_t id_isar4;
- void kvm_arm_set_cpu_features_from_host(ARMCPU *cpu);
+         uint32_t id_isar5;
+         uint32_t id_isar6;
-+/**
++        uint32_t id_mmfr0;
-+ * kvm_arm_aarch32_supported:
++        uint32_t id_mmfr1;
-+ * @cs: CPUState
++        uint32_t id_mmfr2;
-+ *
++        uint32_t id_mmfr3;
-+ * Returns: true if the KVM VCPU can enable AArch32 mode
++        uint32_t id_mmfr4;
-+ * and false otherwise.
+         uint32_t mvfr0;
-+ */
+         uint32_t mvfr1;
-+bool kvm_arm_aarch32_supported(CPUState *cs);
+         uint32_t mvfr2;
-+
+@@ -XXX,XX +XXX,XX @@ struct ARMCPU {
- /**
+     uint64_t pmceid0;
-  * kvm_arm_get_max_vm_ipa_size - Returns the number of bits in the
+     uint64_t pmceid1;
-  * IPA address space supported by KVM
+     uint32_t id_afr0;
-@@ -XXX,XX +XXX,XX @@ static inline void kvm_arm_set_cpu_features_from_host(ARMCPU *cpu)
+-    uint32_t id_mmfr0;
-     cpu->host_cpu_probe_failed = true;
+-    uint32_t id_mmfr1;
 -    uint32_t id_mmfr2;
 -    uint32_t id_mmfr3;
 -    uint32_t id_mmfr4;
      uint64_t id_aa64afr0;
      uint64_t id_aa64afr1;
      uint32_t clidr;
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa32_vminmaxnm(const ARMISARegisters *id)
  static inline bool isar_feature_aa32_pan(const ARMISARegisters *id)
  {
 -    return FIELD_EX64(id->mvfr0, ID_MMFR3, PAN) != 0;
 +    return FIELD_EX32(id->id_mmfr3, ID_MMFR3, PAN) != 0;
  }
-+static inline bool kvm_arm_aarch32_supported(CPUState *cs)
+ static inline bool isar_feature_aa32_ats1e1(const ARMISARegisters *id)
 +{
 +    return false;
 +}
 +
  static inline int kvm_arm_get_max_vm_ipa_size(MachineState *ms)
  {
-     return -ENOENT;
+-    return FIELD_EX64(id->mvfr0, ID_MMFR3, PAN) >= 2;
 +    return FIELD_EX32(id->id_mmfr3, ID_MMFR3, PAN) >= 2;
  }
  static inline bool isar_feature_aa32_pmu_8_1(const ARMISARegisters *id)
 diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/intc/armv7m_nvic.c
 +++ b/hw/intc/armv7m_nvic.c
@@ -XXX,XX +XXX,XX @@ static uint32_t nvic_readl(NVICState *s, uint32_t offset, MemTxAttrs attrs)
      case 0xd4c: /* AFR0.  */
          return cpu->id_afr0;
      case 0xd50: /* MMFR0.  */
 -        return cpu->id_mmfr0;
 +        return cpu->isar.id_mmfr0;
      case 0xd54: /* MMFR1.  */
 -        return cpu->id_mmfr1;
 +        return cpu->isar.id_mmfr1;
      case 0xd58: /* MMFR2.  */
 -        return cpu->id_mmfr2;
 +        return cpu->isar.id_mmfr2;
      case 0xd5c: /* MMFR3.  */
 -        return cpu->id_mmfr3;
 +        return cpu->isar.id_mmfr3;
      case 0xd60: /* ISAR0.  */
          return cpu->isar.id_isar0;
      case 0xd64: /* ISAR1.  */
 diff --git a/target/arm/cpu.c b/target/arm/cpu.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu.c
 +++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void arm1136_r2_initfn(Object *obj)
      cpu->id_pfr1 = 0x1;
      cpu->isar.id_dfr0 = 0x2;
      cpu->id_afr0 = 0x3;
 -    cpu->id_mmfr0 = 0x01130003;
 -    cpu->id_mmfr1 = 0x10030302;
 -    cpu->id_mmfr2 = 0x01222110;
 +    cpu->isar.id_mmfr0 = 0x01130003;
 +    cpu->isar.id_mmfr1 = 0x10030302;
 +    cpu->isar.id_mmfr2 = 0x01222110;
      cpu->isar.id_isar0 = 0x00140011;
      cpu->isar.id_isar1 = 0x12002111;
      cpu->isar.id_isar2 = 0x11231111;
@@ -XXX,XX +XXX,XX @@ static void arm1136_initfn(Object *obj)
      cpu->id_pfr1 = 0x1;
      cpu->isar.id_dfr0 = 0x2;
      cpu->id_afr0 = 0x3;
 -    cpu->id_mmfr0 = 0x01130003;
 -    cpu->id_mmfr1 = 0x10030302;
 -    cpu->id_mmfr2 = 0x01222110;
 +    cpu->isar.id_mmfr0 = 0x01130003;
 +    cpu->isar.id_mmfr1 = 0x10030302;
 +    cpu->isar.id_mmfr2 = 0x01222110;
      cpu->isar.id_isar0 = 0x00140011;
      cpu->isar.id_isar1 = 0x12002111;
      cpu->isar.id_isar2 = 0x11231111;
@@ -XXX,XX +XXX,XX @@ static void arm1176_initfn(Object *obj)
      cpu->id_pfr1 = 0x11;
      cpu->isar.id_dfr0 = 0x33;
      cpu->id_afr0 = 0;
 -    cpu->id_mmfr0 = 0x01130003;
 -    cpu->id_mmfr1 = 0x10030302;
 -    cpu->id_mmfr2 = 0x01222100;
 +    cpu->isar.id_mmfr0 = 0x01130003;
 +    cpu->isar.id_mmfr1 = 0x10030302;
 +    cpu->isar.id_mmfr2 = 0x01222100;
      cpu->isar.id_isar0 = 0x0140011;
      cpu->isar.id_isar1 = 0x12002111;
      cpu->isar.id_isar2 = 0x11231121;
@@ -XXX,XX +XXX,XX @@ static void arm11mpcore_initfn(Object *obj)
      cpu->id_pfr1 = 0x1;
      cpu->isar.id_dfr0 = 0;
      cpu->id_afr0 = 0x2;
 -    cpu->id_mmfr0 = 0x01100103;
 -    cpu->id_mmfr1 = 0x10020302;
 -    cpu->id_mmfr2 = 0x01222000;
 +    cpu->isar.id_mmfr0 = 0x01100103;
 +    cpu->isar.id_mmfr1 = 0x10020302;
 +    cpu->isar.id_mmfr2 = 0x01222000;
      cpu->isar.id_isar0 = 0x00100011;
      cpu->isar.id_isar1 = 0x12002111;
      cpu->isar.id_isar2 = 0x11221011;
@@ -XXX,XX +XXX,XX @@ static void cortex_m3_initfn(Object *obj)
      cpu->id_pfr1 = 0x00000200;
      cpu->isar.id_dfr0 = 0x00100000;
      cpu->id_afr0 = 0x00000000;
 -    cpu->id_mmfr0 = 0x00000030;
 -    cpu->id_mmfr1 = 0x00000000;
 -    cpu->id_mmfr2 = 0x00000000;
 -    cpu->id_mmfr3 = 0x00000000;
 +    cpu->isar.id_mmfr0 = 0x00000030;
 +    cpu->isar.id_mmfr1 = 0x00000000;
 +    cpu->isar.id_mmfr2 = 0x00000000;
 +    cpu->isar.id_mmfr3 = 0x00000000;
      cpu->isar.id_isar0 = 0x01141110;
      cpu->isar.id_isar1 = 0x02111000;
      cpu->isar.id_isar2 = 0x21112231;
@@ -XXX,XX +XXX,XX @@ static void cortex_m4_initfn(Object *obj)
      cpu->id_pfr1 = 0x00000200;
      cpu->isar.id_dfr0 = 0x00100000;
      cpu->id_afr0 = 0x00000000;
 -    cpu->id_mmfr0 = 0x00000030;
 -    cpu->id_mmfr1 = 0x00000000;
 -    cpu->id_mmfr2 = 0x00000000;
 -    cpu->id_mmfr3 = 0x00000000;
 +    cpu->isar.id_mmfr0 = 0x00000030;
 +    cpu->isar.id_mmfr1 = 0x00000000;
 +    cpu->isar.id_mmfr2 = 0x00000000;
 +    cpu->isar.id_mmfr3 = 0x00000000;
      cpu->isar.id_isar0 = 0x01141110;
      cpu->isar.id_isar1 = 0x02111000;
      cpu->isar.id_isar2 = 0x21112231;
@@ -XXX,XX +XXX,XX @@ static void cortex_m7_initfn(Object *obj)
      cpu->id_pfr1 = 0x00000200;
      cpu->isar.id_dfr0 = 0x00100000;
      cpu->id_afr0 = 0x00000000;
 -    cpu->id_mmfr0 = 0x00100030;
 -    cpu->id_mmfr1 = 0x00000000;
 -    cpu->id_mmfr2 = 0x01000000;
 -    cpu->id_mmfr3 = 0x00000000;
 +    cpu->isar.id_mmfr0 = 0x00100030;
 +    cpu->isar.id_mmfr1 = 0x00000000;
 +    cpu->isar.id_mmfr2 = 0x01000000;
 +    cpu->isar.id_mmfr3 = 0x00000000;
      cpu->isar.id_isar0 = 0x01101110;
      cpu->isar.id_isar1 = 0x02112000;
      cpu->isar.id_isar2 = 0x20232231;
@@ -XXX,XX +XXX,XX @@ static void cortex_m33_initfn(Object *obj)
      cpu->id_pfr1 = 0x00000210;
      cpu->isar.id_dfr0 = 0x00200000;
      cpu->id_afr0 = 0x00000000;
 -    cpu->id_mmfr0 = 0x00101F40;
 -    cpu->id_mmfr1 = 0x00000000;
 -    cpu->id_mmfr2 = 0x01000000;
 -    cpu->id_mmfr3 = 0x00000000;
 +    cpu->isar.id_mmfr0 = 0x00101F40;
 +    cpu->isar.id_mmfr1 = 0x00000000;
 +    cpu->isar.id_mmfr2 = 0x01000000;
 +    cpu->isar.id_mmfr3 = 0x00000000;
      cpu->isar.id_isar0 = 0x01101110;
      cpu->isar.id_isar1 = 0x02212000;
      cpu->isar.id_isar2 = 0x20232232;
@@ -XXX,XX +XXX,XX @@ static void cortex_r5_initfn(Object *obj)
      cpu->id_pfr1 = 0x001;
      cpu->isar.id_dfr0 = 0x010400;
      cpu->id_afr0 = 0x0;
 -    cpu->id_mmfr0 = 0x0210030;
 -    cpu->id_mmfr1 = 0x00000000;
 -    cpu->id_mmfr2 = 0x01200000;
 -    cpu->id_mmfr3 = 0x0211;
 +    cpu->isar.id_mmfr0 = 0x0210030;
 +    cpu->isar.id_mmfr1 = 0x00000000;
 +    cpu->isar.id_mmfr2 = 0x01200000;
 +    cpu->isar.id_mmfr3 = 0x0211;
      cpu->isar.id_isar0 = 0x02101111;
      cpu->isar.id_isar1 = 0x13112111;
      cpu->isar.id_isar2 = 0x21232141;
@@ -XXX,XX +XXX,XX @@ static void cortex_a8_initfn(Object *obj)
      cpu->id_pfr1 = 0x11;
      cpu->isar.id_dfr0 = 0x400;
      cpu->id_afr0 = 0;
 -    cpu->id_mmfr0 = 0x31100003;
 -    cpu->id_mmfr1 = 0x20000000;
 -    cpu->id_mmfr2 = 0x01202000;
 -    cpu->id_mmfr3 = 0x11;
 +    cpu->isar.id_mmfr0 = 0x31100003;
 +    cpu->isar.id_mmfr1 = 0x20000000;
 +    cpu->isar.id_mmfr2 = 0x01202000;
 +    cpu->isar.id_mmfr3 = 0x11;
      cpu->isar.id_isar0 = 0x00101111;
      cpu->isar.id_isar1 = 0x12112111;
      cpu->isar.id_isar2 = 0x21232031;
@@ -XXX,XX +XXX,XX @@ static void cortex_a9_initfn(Object *obj)
      cpu->id_pfr1 = 0x11;
      cpu->isar.id_dfr0 = 0x000;
      cpu->id_afr0 = 0;
 -    cpu->id_mmfr0 = 0x00100103;
 -    cpu->id_mmfr1 = 0x20000000;
 -    cpu->id_mmfr2 = 0x01230000;
 -    cpu->id_mmfr3 = 0x00002111;
 +    cpu->isar.id_mmfr0 = 0x00100103;
 +    cpu->isar.id_mmfr1 = 0x20000000;
 +    cpu->isar.id_mmfr2 = 0x01230000;
 +    cpu->isar.id_mmfr3 = 0x00002111;
      cpu->isar.id_isar0 = 0x00101111;
      cpu->isar.id_isar1 = 0x13112111;
      cpu->isar.id_isar2 = 0x21232041;
@@ -XXX,XX +XXX,XX @@ static void cortex_a7_initfn(Object *obj)
      cpu->id_pfr1 = 0x00011011;
      cpu->isar.id_dfr0 = 0x02010555;
      cpu->id_afr0 = 0x00000000;
 -    cpu->id_mmfr0 = 0x10101105;
 -    cpu->id_mmfr1 = 0x40000000;
 -    cpu->id_mmfr2 = 0x01240000;
 -    cpu->id_mmfr3 = 0x02102211;
 +    cpu->isar.id_mmfr0 = 0x10101105;
 +    cpu->isar.id_mmfr1 = 0x40000000;
 +    cpu->isar.id_mmfr2 = 0x01240000;
 +    cpu->isar.id_mmfr3 = 0x02102211;
      /* a7_mpcore_r0p5_trm, page 4-4 gives 0x01101110; but
       * table 4-41 gives 0x02101110, which includes the arm div insns.
       */
@@ -XXX,XX +XXX,XX @@ static void cortex_a15_initfn(Object *obj)
      cpu->id_pfr1 = 0x00011011;
      cpu->isar.id_dfr0 = 0x02010555;
      cpu->id_afr0 = 0x00000000;
 -    cpu->id_mmfr0 = 0x10201105;
 -    cpu->id_mmfr1 = 0x20000000;
 -    cpu->id_mmfr2 = 0x01240000;
 -    cpu->id_mmfr3 = 0x02102211;
 +    cpu->isar.id_mmfr0 = 0x10201105;
 +    cpu->isar.id_mmfr1 = 0x20000000;
 +    cpu->isar.id_mmfr2 = 0x01240000;
 +    cpu->isar.id_mmfr3 = 0x02102211;
      cpu->isar.id_isar0 = 0x02101110;
      cpu->isar.id_isar1 = 0x13112111;
      cpu->isar.id_isar2 = 0x21232041;
@@ -XXX,XX +XXX,XX @@ static void arm_max_initfn(Object *obj)
              t = FIELD_DP32(t, MVFR2, FPMISC, 4);   /* FP MaxNum */
              cpu->isar.mvfr2 = t;
 -            t = cpu->id_mmfr3;
 +            t = cpu->isar.id_mmfr3;
              t = FIELD_DP32(t, ID_MMFR3, PAN, 2); /* ATS1E1 */
 -            cpu->id_mmfr3 = t;
 +            cpu->isar.id_mmfr3 = t;
 -            t = cpu->id_mmfr4;
 +            t = cpu->isar.id_mmfr4;
              t = FIELD_DP32(t, ID_MMFR4, HPDS, 1); /* AA32HPD */
 -            cpu->id_mmfr4 = t;
 +            cpu->isar.id_mmfr4 = t;
          }
  #endif
      }
 diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu64.c
 +++ b/target/arm/cpu64.c
-@@ -XXX,XX +XXX,XX @@ static void aarch64_cpu_set_aarch64(Object *obj, bool value, Error **errp)
+@@ -XXX,XX +XXX,XX @@ static void aarch64_a57_initfn(Object *obj)
-      * restriction allows us to avoid fixing up functionality that assumes a
+     cpu->id_pfr1 = 0x00011011;
-      * uniform execution state like do_interrupt.
+     cpu->isar.id_dfr0 = 0x03010066;
      cpu->id_afr0 = 0x00000000;
 -    cpu->id_mmfr0 = 0x10101105;
 -    cpu->id_mmfr1 = 0x40000000;
 -    cpu->id_mmfr2 = 0x01260000;
 -    cpu->id_mmfr3 = 0x02102211;
 +    cpu->isar.id_mmfr0 = 0x10101105;
 +    cpu->isar.id_mmfr1 = 0x40000000;
 +    cpu->isar.id_mmfr2 = 0x01260000;
 +    cpu->isar.id_mmfr3 = 0x02102211;
      cpu->isar.id_isar0 = 0x02101110;
      cpu->isar.id_isar1 = 0x13112111;
      cpu->isar.id_isar2 = 0x21232042;
@@ -XXX,XX +XXX,XX @@ static void aarch64_a53_initfn(Object *obj)
      cpu->id_pfr1 = 0x00011011;
      cpu->isar.id_dfr0 = 0x03010066;
      cpu->id_afr0 = 0x00000000;
 -    cpu->id_mmfr0 = 0x10101105;
 -    cpu->id_mmfr1 = 0x40000000;
 -    cpu->id_mmfr2 = 0x01260000;
 -    cpu->id_mmfr3 = 0x02102211;
 +    cpu->isar.id_mmfr0 = 0x10101105;
 +    cpu->isar.id_mmfr1 = 0x40000000;
 +    cpu->isar.id_mmfr2 = 0x01260000;
 +    cpu->isar.id_mmfr3 = 0x02102211;
      cpu->isar.id_isar0 = 0x02101110;
      cpu->isar.id_isar1 = 0x13112111;
      cpu->isar.id_isar2 = 0x21232042;
@@ -XXX,XX +XXX,XX @@ static void aarch64_a72_initfn(Object *obj)
      cpu->id_pfr1 = 0x00011011;
      cpu->isar.id_dfr0 = 0x03010066;
      cpu->id_afr0 = 0x00000000;
 -    cpu->id_mmfr0 = 0x10201105;
 -    cpu->id_mmfr1 = 0x40000000;
 -    cpu->id_mmfr2 = 0x01260000;
 -    cpu->id_mmfr3 = 0x02102211;
 +    cpu->isar.id_mmfr0 = 0x10201105;
 +    cpu->isar.id_mmfr1 = 0x40000000;
 +    cpu->isar.id_mmfr2 = 0x01260000;
 +    cpu->isar.id_mmfr3 = 0x02102211;
      cpu->isar.id_isar0 = 0x02101110;
      cpu->isar.id_isar1 = 0x13112111;
      cpu->isar.id_isar2 = 0x21232042;
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
          u = FIELD_DP32(u, ID_ISAR6, SPECRES, 1);
          cpu->isar.id_isar6 = u;
 -        u = cpu->id_mmfr3;
 +        u = cpu->isar.id_mmfr3;
          u = FIELD_DP32(u, ID_MMFR3, PAN, 2); /* ATS1E1 */
 -        cpu->id_mmfr3 = u;
 +        cpu->isar.id_mmfr3 = u;
          u = cpu->isar.id_aa64dfr0;
          u = FIELD_DP64(u, ID_AA64DFR0, PMUVER, 5); /* v8.4-PMU */
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.c
 +++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
                .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 1, .opc2 = 4,
                .access = PL1_R, .type = ARM_CP_CONST,
                .accessfn = access_aa32_tid3,
 -              .resetvalue = cpu->id_mmfr0 },
 +              .resetvalue = cpu->isar.id_mmfr0 },
              { .name = "ID_MMFR1", .state = ARM_CP_STATE_BOTH,
                .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 1, .opc2 = 5,
                .access = PL1_R, .type = ARM_CP_CONST,
                .accessfn = access_aa32_tid3,
 -              .resetvalue = cpu->id_mmfr1 },
 +              .resetvalue = cpu->isar.id_mmfr1 },
              { .name = "ID_MMFR2", .state = ARM_CP_STATE_BOTH,
                .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 1, .opc2 = 6,
                .access = PL1_R, .type = ARM_CP_CONST,
                .accessfn = access_aa32_tid3,
 -              .resetvalue = cpu->id_mmfr2 },
 +              .resetvalue = cpu->isar.id_mmfr2 },
              { .name = "ID_MMFR3", .state = ARM_CP_STATE_BOTH,
                .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 1, .opc2 = 7,
                .access = PL1_R, .type = ARM_CP_CONST,
                .accessfn = access_aa32_tid3,
 -              .resetvalue = cpu->id_mmfr3 },
 +              .resetvalue = cpu->isar.id_mmfr3 },
              { .name = "ID_ISAR0", .state = ARM_CP_STATE_BOTH,
                .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 2, .opc2 = 0,
                .access = PL1_R, .type = ARM_CP_CONST,
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
                .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 2, .opc2 = 6,
                .access = PL1_R, .type = ARM_CP_CONST,
                .accessfn = access_aa32_tid3,
 -              .resetvalue = cpu->id_mmfr4 },
 +              .resetvalue = cpu->isar.id_mmfr4 },
              { .name = "ID_ISAR6", .state = ARM_CP_STATE_BOTH,
                .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 2, .opc2 = 7,
                .access = PL1_R, .type = ARM_CP_CONST,
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
          define_arm_cp_regs(cpu, vmsa_pmsa_cp_reginfo);
          define_arm_cp_regs(cpu, vmsa_cp_reginfo);
          /* TTCBR2 is introduced with ARMv8.2-A32HPD.  */
 -        if (FIELD_EX32(cpu->id_mmfr4, ID_MMFR4, HPDS) != 0) {
 +        if (FIELD_EX32(cpu->isar.id_mmfr4, ID_MMFR4, HPDS) != 0) {
              define_one_arm_cp_reg(cpu, &ttbcr2_reginfo);
          }
      }
 diff --git a/target/arm/kvm32.c b/target/arm/kvm32.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/kvm32.c
 +++ b/target/arm/kvm32.c
@@ -XXX,XX +XXX,XX @@ bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
       * Fortunately there is not yet anything in there that affects migration.
       */
--    if (!kvm_enabled()) {
--        error_setg(errp, "'aarch64' feature cannot be disabled "
++    err |= read_sys_reg32(fdarray[2], &ahcf->isar.id_mmfr0,
--                         "unless KVM is enabled");
++                          ARM_CP15_REG32(0, 0, 1, 4));
--        return;
++    err |= read_sys_reg32(fdarray[2], &ahcf->isar.id_mmfr1,
--    }
++                          ARM_CP15_REG32(0, 0, 1, 5));
--
++    err |= read_sys_reg32(fdarray[2], &ahcf->isar.id_mmfr2,
-     if (value == false) {
++                          ARM_CP15_REG32(0, 0, 1, 6));
-+        if (!kvm_enabled() || !kvm_arm_aarch32_supported(CPU(cpu))) {
++    err |= read_sys_reg32(fdarray[2], &ahcf->isar.id_mmfr3,
-+            error_setg(errp, "'aarch64' feature cannot be disabled "
++                          ARM_CP15_REG32(0, 0, 1, 7));
-+                             "unless KVM is enabled and 32-bit EL1 "
++    if (read_sys_reg32(fdarray[2], &ahcf->isar.id_mmfr4,
-+                             "is supported");
++                       ARM_CP15_REG32(0, 0, 2, 6))) {
-+            return;
++        /*
-+        }
++         * Older kernels don't support reading ID_MMFR4 (a new in v8
-         unset_feature(&cpu->env, ARM_FEATURE_AARCH64);
++         * register); assume it's zero.
-     } else {
++         */
-         set_feature(&cpu->env, ARM_FEATURE_AARCH64);
++        ahcf->isar.id_mmfr4 = 0;
 +    }
 +
      /*
       * There is no way to read DBGDIDR, because currently 32-bit KVM
       * doesn't implement debug at all. Leave it at zero.
 diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/kvm64.c
 +++ b/target/arm/kvm64.c
-@@ -XXX,XX +XXX,XX @@
- #include "exec/gdbstub.h"
- #include "sysemu/sysemu.h"
- #include "sysemu/kvm.h"
-+#include "sysemu/kvm_int.h"
- #include "kvm_arm.h"
-+#include "hw/boards.h"
- #include "internals.h"
- static bool have_guest_debug;
 @@ -XXX,XX +XXX,XX @@ bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
-     return true;
+          */
- }
+         err |= read_sys_reg32(fdarray[2], &ahcf->isar.id_dfr0,
+                               ARM64_SYS_REG(3, 0, 0, 1, 2));
-+bool kvm_arm_aarch32_supported(CPUState *cpu)
++        err |= read_sys_reg32(fdarray[2], &ahcf->isar.id_mmfr0,
-+{
++                              ARM64_SYS_REG(3, 0, 0, 1, 4));
-+    KVMState *s = KVM_STATE(current_machine->accelerator);
++        err |= read_sys_reg32(fdarray[2], &ahcf->isar.id_mmfr1,
-+
++                              ARM64_SYS_REG(3, 0, 0, 1, 5));
-+    return kvm_check_extension(s, KVM_CAP_ARM_EL1_32BIT);
++        err |= read_sys_reg32(fdarray[2], &ahcf->isar.id_mmfr2,
-+}
++                              ARM64_SYS_REG(3, 0, 0, 1, 6));
-+
++        err |= read_sys_reg32(fdarray[2], &ahcf->isar.id_mmfr3,
- #define ARM_CPU_ID_MPIDR       3, 0, 0, 0, 5
++                              ARM64_SYS_REG(3, 0, 0, 1, 7));
+         err |= read_sys_reg32(fdarray[2], &ahcf->isar.id_isar0,
- int kvm_arch_init_vcpu(CPUState *cs)
+                               ARM64_SYS_REG(3, 0, 0, 2, 0));
          err |= read_sys_reg32(fdarray[2], &ahcf->isar.id_isar1,
@@ -XXX,XX +XXX,XX @@ bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
                                ARM64_SYS_REG(3, 0, 0, 2, 4));
          err |= read_sys_reg32(fdarray[2], &ahcf->isar.id_isar5,
                                ARM64_SYS_REG(3, 0, 0, 2, 5));
 +        err |= read_sys_reg32(fdarray[2], &ahcf->isar.id_mmfr4,
 +                              ARM64_SYS_REG(3, 0, 0, 2, 6));
          err |= read_sys_reg32(fdarray[2], &ahcf->isar.id_isar6,
                                ARM64_SYS_REG(3, 0, 0, 2, 7));
 --
 .20.1

-New patch
+[PULL 32/52] target/arm: Use isar_feature function for testing AA32HPD feature
+Now we have moved ID_MMFR4 into the ARMISARegisters struct, we
+can define and use an isar_feature for the presence of the
+ARMv8.2-AA32HPD feature, rather than open-coding the test.
+While we're here, correct a comment typo which missed an 'A'
+from the feature name.
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20200214175116.9164-20-peter.maydell@linaro.org
+---
+ target/arm/cpu.h    | 5 +++++
+ target/arm/helper.c | 4 ++--
+files changed, 7 insertions(+), 2 deletions(-)
+diff --git a/target/arm/cpu.h b/target/arm/cpu.h
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/cpu.h
++++ b/target/arm/cpu.h
+@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa32_pmu_8_4(const ARMISARegisters *id)
+         FIELD_EX32(id->id_dfr0, ID_DFR0, PERFMON) != 0xf;
+ }
++static inline bool isar_feature_aa32_hpd(const ARMISARegisters *id)
++{
++    return FIELD_EX32(id->id_mmfr4, ID_MMFR4, HPDS) != 0;
++}
++
+ /*
+  * 64-bit feature tests via id registers.
+  */
+diff --git a/target/arm/helper.c b/target/arm/helper.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/helper.c
++++ b/target/arm/helper.c
+@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
+     } else {
+         define_arm_cp_regs(cpu, vmsa_pmsa_cp_reginfo);
+         define_arm_cp_regs(cpu, vmsa_cp_reginfo);
+-        /* TTCBR2 is introduced with ARMv8.2-A32HPD.  */
+-        if (FIELD_EX32(cpu->isar.id_mmfr4, ID_MMFR4, HPDS) != 0) {
++        /* TTCBR2 is introduced with ARMv8.2-AA32HPD.  */
++        if (cpu_isar_feature(aa32_hpd, cpu)) {
+             define_one_arm_cp_reg(cpu, &ttbcr2_reginfo);
+         }
+     }
+--
+.20.1
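
Editor's sketch (not part of the series): the patch above swaps an
open-coded FIELD_EX32() test for a named predicate over the ID register
struct. The standalone C below illustrates that pattern under stated
assumptions: sketch_field_ex32() and SketchIdRegs are hypothetical
stand-ins for QEMU's FIELD_EX32() and ARMISARegisters, and the HPDS
field position (bits [19:16] of ID_MMFR4) is assumed here rather than
taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for QEMU's ARMISARegisters; only one field is modelled. */
typedef struct {
    uint32_t id_mmfr4;
} SketchIdRegs;

/* Assumed layout: ID_MMFR4.HPDS occupies bits [19:16]. */
#define SKETCH_ID_MMFR4_HPDS_SHIFT  16
#define SKETCH_ID_MMFR4_HPDS_LENGTH 4

/* Extract 'length' bits starting at bit 'start' from a 32-bit value. */
static inline uint32_t sketch_field_ex32(uint32_t value, unsigned start,
                                         unsigned length)
{
    assert(length > 0 && start < 32 && length <= 32 - start);
    return (value >> start) & (UINT32_MAX >> (32 - length));
}

/* Named predicate: callers ask "is the feature present?" not "is this field non-zero?". */
static inline bool sketch_feature_aa32_hpd(const SketchIdRegs *id)
{
    return sketch_field_ex32(id->id_mmfr4, SKETCH_ID_MMFR4_HPDS_SHIFT,
                             SKETCH_ID_MMFR4_HPDS_LENGTH) != 0;
}
```

Keeping the field test behind one predicate means a later change to how
the feature is detected touches a single function rather than every
caller.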

-[Qemu-devel] [PULL 04/29] target/arm: Factor out 'generate singlestep exception' function
+[PULL 33/52] target/arm: Use FIELD_EX32 for testing 32-bit fields
-Factor out code to 'generate a singlestep exception', which is
+Cut-and-paste errors mean we're using FIELD_EX64() to extract fields from
-currently repeated in four places.
+some 32-bit ID register fields. Use FIELD_EX32() instead. (This makes
+no difference in behaviour, it's just more consistent.)
 To do this we need to also pull the identical copies of the
 gen_exception() function out of translate-a64.c and translate.c
 into translate.h.
 (There is a bug in the code: we're taking the exception to the wrong
 target EL.  This will be simpler to fix if there's only one place to
 do it.)
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
+Message-id: 20200214175116.9164-21-peter.maydell@linaro.org
 Message-id: 20190805130952.4415-2-peter.maydell@linaro.org
 ---
- target/arm/translate.h     | 23 +++++++++++++++++++++++
+ target/arm/cpu.h | 18 +++++++++---------
- target/arm/translate-a64.c | 19 ++-----------------
+file changed, 9 insertions(+), 9 deletions(-)
  target/arm/translate.c     | 20 ++------------------
 files changed, 27 insertions(+), 35 deletions(-)
-diff --git a/target/arm/translate.h b/target/arm/translate.h
+diff --git a/target/arm/cpu.h b/target/arm/cpu.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.h
+--- a/target/arm/cpu.h
-+++ b/target/arm/translate.h
++++ b/target/arm/cpu.h
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa32_fp16_arith(const ARMISARegisters *id)
- #define TARGET_ARM_TRANSLATE_H
+ static inline bool isar_feature_aa32_fp_d32(const ARMISARegisters *id)
+ {
- #include "exec/translator.h"
+     /* Return true if D16-D31 are implemented */
-+#include "internals.h"
+-    return FIELD_EX64(id->mvfr0, MVFR0, SIMDREG) >= 2;
++    return FIELD_EX32(id->mvfr0, MVFR0, SIMDREG) >= 2;
  /* internal defines */
@@ -XXX,XX +XXX,XX @@ static inline void gen_ss_advance(DisasContext *s)
      }
  }
-+static inline void gen_exception(int excp, uint32_t syndrome,
+ static inline bool isar_feature_aa32_fpshvec(const ARMISARegisters *id)
-+                                 uint32_t target_el)
+ {
-+{
+-    return FIELD_EX64(id->mvfr0, MVFR0, FPSHVEC) > 0;
-+    TCGv_i32 tcg_excp = tcg_const_i32(excp);
++    return FIELD_EX32(id->mvfr0, MVFR0, FPSHVEC) > 0;
-+    TCGv_i32 tcg_syn = tcg_const_i32(syndrome);
+ }
-+    TCGv_i32 tcg_el = tcg_const_i32(target_el);
-+
+ static inline bool isar_feature_aa32_fpdp(const ARMISARegisters *id)
-+    gen_helper_exception_with_syndrome(cpu_env, tcg_excp,
+ {
-+                                       tcg_syn, tcg_el);
+     /* Return true if CPU supports double precision floating point */
-+
+-    return FIELD_EX64(id->mvfr0, MVFR0, FPDP) > 0;
-+    tcg_temp_free_i32(tcg_el);
++    return FIELD_EX32(id->mvfr0, MVFR0, FPDP) > 0;
-+    tcg_temp_free_i32(tcg_syn);
+ }
-+    tcg_temp_free_i32(tcg_excp);
 +}
 +
 +/* Generate an architectural singlestep exception */
 +static inline void gen_swstep_exception(DisasContext *s, int isv, int ex)
 +{
 +    gen_exception(EXCP_UDEF, syn_swstep(s->ss_same_el, isv, ex),
 +                  default_exception_el(s));
 +}
 +
  /*
-  * Given a VFP floating point constant encoded into an 8 bit immediate in an
+@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa32_fpdp(const ARMISARegisters *id)
-  * instruction, expand it to the actual constant value of the specified
+  */
-diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
+ static inline bool isar_feature_aa32_fp16_spconv(const ARMISARegisters *id)
-index XXXXXXX..XXXXXXX 100644
+ {
---- a/target/arm/translate-a64.c
+-    return FIELD_EX64(id->mvfr1, MVFR1, FPHP) > 0;
-+++ b/target/arm/translate-a64.c
++    return FIELD_EX32(id->mvfr1, MVFR1, FPHP) > 0;
@@ -XXX,XX +XXX,XX @@ static void gen_exception_internal(int excp)
      tcg_temp_free_i32(tcg_excp);
  }
--static void gen_exception(int excp, uint32_t syndrome, uint32_t target_el)
+ static inline bool isar_feature_aa32_fp16_dpconv(const ARMISARegisters *id)
 -{
 -    TCGv_i32 tcg_excp = tcg_const_i32(excp);
 -    TCGv_i32 tcg_syn = tcg_const_i32(syndrome);
 -    TCGv_i32 tcg_el = tcg_const_i32(target_el);
 -
 -    gen_helper_exception_with_syndrome(cpu_env, tcg_excp,
 -                                       tcg_syn, tcg_el);
 -    tcg_temp_free_i32(tcg_el);
 -    tcg_temp_free_i32(tcg_syn);
 -    tcg_temp_free_i32(tcg_excp);
 -}
 -
  static void gen_exception_internal_insn(DisasContext *s, int offset, int excp)
  {
-     gen_a64_set_pc_im(s->pc - offset);
+-    return FIELD_EX64(id->mvfr1, MVFR1, FPHP) > 1;
-@@ -XXX,XX +XXX,XX @@ static void gen_step_complete_exception(DisasContext *s)
++    return FIELD_EX32(id->mvfr1, MVFR1, FPHP) > 1;
       * of the exception, and our syndrome information is always correct.
       */
      gen_ss_advance(s);
 -    gen_exception(EXCP_UDEF, syn_swstep(s->ss_same_el, 1, s->is_ldex),
 -                  default_exception_el(s));
 +    gen_swstep_exception(s, 1, s->is_ldex);
      s->base.is_jmp = DISAS_NORETURN;
  }
-@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
+ static inline bool isar_feature_aa32_vsel(const ARMISARegisters *id)
-          * bits should be zero.
+ {
-          */
+-    return FIELD_EX64(id->mvfr2, MVFR2, FPMISC) >= 1;
-         assert(dc->base.num_insns == 1);
++    return FIELD_EX32(id->mvfr2, MVFR2, FPMISC) >= 1;
 -        gen_exception(EXCP_UDEF, syn_swstep(dc->ss_same_el, 0, 0),
 -                      default_exception_el(dc));
 +        gen_swstep_exception(dc, 0, 0);
          dc->base.is_jmp = DISAS_NORETURN;
      } else {
          disas_a64_insn(env, dc);
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_exception_internal(int excp)
      tcg_temp_free_i32(tcg_excp);
  }
--static void gen_exception(int excp, uint32_t syndrome, uint32_t target_el)
+ static inline bool isar_feature_aa32_vcvt_dr(const ARMISARegisters *id)
 -{
 -    TCGv_i32 tcg_excp = tcg_const_i32(excp);
 -    TCGv_i32 tcg_syn = tcg_const_i32(syndrome);
 -    TCGv_i32 tcg_el = tcg_const_i32(target_el);
 -
 -    gen_helper_exception_with_syndrome(cpu_env, tcg_excp,
 -                                       tcg_syn, tcg_el);
 -
 -    tcg_temp_free_i32(tcg_el);
 -    tcg_temp_free_i32(tcg_syn);
 -    tcg_temp_free_i32(tcg_excp);
 -}
 -
  static void gen_step_complete_exception(DisasContext *s)
  {
-     /* We just completed step of an insn. Move from Active-not-pending
+-    return FIELD_EX64(id->mvfr2, MVFR2, FPMISC) >= 2;
-@@ -XXX,XX +XXX,XX @@ static void gen_step_complete_exception(DisasContext *s)
++    return FIELD_EX32(id->mvfr2, MVFR2, FPMISC) >= 2;
       * of the exception, and our syndrome information is always correct.
       */
      gen_ss_advance(s);
 -    gen_exception(EXCP_UDEF, syn_swstep(s->ss_same_el, 1, s->is_ldex),
 -                  default_exception_el(s));
 +    gen_swstep_exception(s, 1, s->is_ldex);
      s->base.is_jmp = DISAS_NORETURN;
  }
-@@ -XXX,XX +XXX,XX @@ static bool arm_pre_translate_insn(DisasContext *dc)
+ static inline bool isar_feature_aa32_vrint(const ARMISARegisters *id)
-          * bits should be zero.
+ {
-          */
+-    return FIELD_EX64(id->mvfr2, MVFR2, FPMISC) >= 3;
-         assert(dc->base.num_insns == 1);
++    return FIELD_EX32(id->mvfr2, MVFR2, FPMISC) >= 3;
--        gen_exception(EXCP_UDEF, syn_swstep(dc->ss_same_el, 0, 0),
+ }
--                      default_exception_el(dc));
-+        gen_swstep_exception(dc, 0, 0);
+ static inline bool isar_feature_aa32_vminmaxnm(const ARMISARegisters *id)
-         dc->base.is_jmp = DISAS_NORETURN;
+ {
-         return true;
+-    return FIELD_EX64(id->mvfr2, MVFR2, FPMISC) >= 4;
-     }
++    return FIELD_EX32(id->mvfr2, MVFR2, FPMISC) >= 4;
  }
  static inline bool isar_feature_aa32_pan(const ARMISARegisters *id)
 --
 .20.1
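
Editor's sketch (not part of the series): the FIELD_EX32 commit message
says the change makes no difference in behaviour. One way to see why is
that a 32-bit register value zero-extends when passed to a 64-bit
extractor, so any field that lies wholly inside bits [31:0] reads the
same either way. The helpers below are hypothetical stand-ins, not
QEMU's own extract32()/extract64().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Extract 'length' bits starting at 'start' from a 32-bit value. */
static inline uint32_t sketch_extract32(uint32_t value, unsigned start,
                                        unsigned length)
{
    assert(length > 0 && start < 32 && length <= 32 - start);
    return (value >> start) & (UINT32_MAX >> (32 - length));
}

/* Extract 'length' bits starting at 'start' from a 64-bit value. */
static inline uint64_t sketch_extract64(uint64_t value, unsigned start,
                                        unsigned length)
{
    assert(length > 0 && start < 64 && length <= 64 - start);
    return (value >> start) & (UINT64_MAX >> (64 - length));
}

/*
 * True if both extractors return the same field for a 32-bit register.
 * The uint32_t argument is zero-extended when widened to uint64_t, so
 * the upper 32 bits never contribute to a field inside bits [31:0].
 */
static inline bool sketch_extractors_agree(uint32_t reg, unsigned start,
                                           unsigned length)
{
    return sketch_extract64(reg, start, length) ==
           sketch_extract32(reg, start, length);
}
```

The 32-bit form is still preferable for these registers because it
documents the register width at the call site.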

-[Qemu-devel] [PULL 01/29] target/arm: generate a custom MIDR for -cpu max
+[PULL 34/52] target/arm: Correctly implement ACTLR2, HACTLR2
-From: Alex Bennée <alex.bennee@linaro.org>
+The ACTLR2 and HACTLR2 AArch32 system registers didn't exist in ARMv7
 or the original ARMv8.  They were later added as optional registers,
 whose presence is signaled by the ID_MMFR4.AC2 field.  From ARMv8.2
 they are mandatory (ie ID_MMFR4.AC2 must be non-zero).
-While most features are now detected by probing the ID_* registers
+We implemented HACTLR2 in commit 0e0456ab8895a5e85, but we
-kernels can (and do) use MIDR_EL1 for working out if they have to
+incorrectly made it exist for all v8 CPUs, and we didn't implement
-apply errata. This can trip up warnings in the kernel as it tries to
+ACTLR2 at all.
 work out if it should apply workarounds to features that don't
 actually exist in the reported CPU type.
-Avoid this problem by synthesising our own MIDR value.
+Sort this out by implementing both registers only when they are
 supposed to exist, and setting the ID_MMFR4 bit for -cpu max.
-Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
+Note that this removes HACTLR2 from our Cortex-A53, -A57 and -A72
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+CPU models; this is correct, because those CPUs do not implement
 this register.
 Fixes: 0e0456ab8895a5e85
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190726113950.7499-1-alex.bennee@linaro.org
+Message-id: 20200214175116.9164-22-peter.maydell@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/cpu.h   |  6 ++++++
+ target/arm/cpu.h    |  5 +++++
- target/arm/cpu64.c | 19 +++++++++++++++++++
+ target/arm/cpu.c    |  1 +
-files changed, 25 insertions(+)
+ target/arm/cpu64.c  |  4 ++++
  target/arm/helper.c | 32 +++++++++++++++++++++++---------
 files changed, 33 insertions(+), 9 deletions(-)
 diff --git a/target/arm/cpu.h b/target/arm/cpu.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu.h
 +++ b/target/arm/cpu.h
-@@ -XXX,XX +XXX,XX @@ FIELD(V7M_FPCCR, ASPEN, 31, 1)
+@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa32_hpd(const ARMISARegisters *id)
      return FIELD_EX32(id->id_mmfr4, ID_MMFR4, HPDS) != 0;
  }
 +static inline bool isar_feature_aa32_ac2(const ARMISARegisters *id)
 +{
 +    return FIELD_EX32(id->id_mmfr4, ID_MMFR4, AC2) != 0;
 +}
 +
  /*
-  * System register ID fields.
+  * 64-bit feature tests via id registers.
   */
-+FIELD(MIDR_EL1, REVISION, 0, 4)
+diff --git a/target/arm/cpu.c b/target/arm/cpu.c
-+FIELD(MIDR_EL1, PARTNUM, 4, 12)
+index XXXXXXX..XXXXXXX 100644
-+FIELD(MIDR_EL1, ARCHITECTURE, 16, 4)
+--- a/target/arm/cpu.c
-+FIELD(MIDR_EL1, VARIANT, 20, 4)
++++ b/target/arm/cpu.c
-+FIELD(MIDR_EL1, IMPLEMENTER, 24, 8)
+@@ -XXX,XX +XXX,XX @@ static void arm_max_initfn(Object *obj)
-+
- FIELD(ID_ISAR0, SWAP, 0, 4)
+             t = cpu->isar.id_mmfr4;
- FIELD(ID_ISAR0, BITCOUNT, 4, 4)
+             t = FIELD_DP32(t, ID_MMFR4, HPDS, 1); /* AA32HPD */
- FIELD(ID_ISAR0, BITFIELD, 8, 4)
++            t = FIELD_DP32(t, ID_MMFR4, AC2, 1); /* ACTLR2, HACTLR2 */
              cpu->isar.id_mmfr4 = t;
          }
  #endif
 diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu64.c
 +++ b/target/arm/cpu64.c
 @@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
-         uint32_t u;
+         u = FIELD_DP32(u, ID_MMFR3, PAN, 2); /* ATS1E1 */
-         aarch64_a57_initfn(obj);
+         cpu->isar.id_mmfr3 = u;
-+        /*
++        u = cpu->isar.id_mmfr4;
-+         * Reset MIDR so the guest doesn't mistake our 'max' CPU type for a real
++        u = FIELD_DP32(u, ID_MMFR4, AC2, 1); /* ACTLR2, HACTLR2 */
-+         * one and try to apply errata workarounds or use impdef features we
++        cpu->isar.id_mmfr4 = u;
 +         * don't provide.
 +         * An IMPLEMENTER field of 0 means "reserved for software use";
 +         * ARCHITECTURE must be 0xf indicating "v7 or later, check ID registers
 +         * to see which features are present";
 +         * the VARIANT, PARTNUM and REVISION fields are all implementation
 +         * defined and we choose to define PARTNUM just in case guest
 +         * code needs to distinguish this QEMU CPU from other software
 +         * implementations, though this shouldn't be needed.
 +         */
 +        t = FIELD_DP64(0, MIDR_EL1, IMPLEMENTER, 0);
 +        t = FIELD_DP64(t, MIDR_EL1, ARCHITECTURE, 0xf);
 +        t = FIELD_DP64(t, MIDR_EL1, PARTNUM, 'Q');
 +        t = FIELD_DP64(t, MIDR_EL1, VARIANT, 0);
 +        t = FIELD_DP64(t, MIDR_EL1, REVISION, 0);
 +        cpu->midr = t;
 +
-         t = cpu->isar.id_aa64isar0;
+         u = cpu->isar.id_aa64dfr0;
-         t = FIELD_DP64(t, ID_AA64ISAR0, AES, 2); /* AES + PMULL */
+         u = FIELD_DP64(u, ID_AA64DFR0, PMUVER, 5); /* v8.4-PMU */
-         t = FIELD_DP64(t, ID_AA64ISAR0, SHA1, 1);
+         cpu->isar.id_aa64dfr0 = u;
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.c
 +++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo ats1cp_reginfo[] = {
  };
  #endif
 +/*
 + * ACTLR2 and HACTLR2 map to ACTLR_EL1[63:32] and
 + * ACTLR_EL2[63:32]. They exist only if the ID_MMFR4.AC2 field
 + * is non-zero, which is never for ARMv7, optionally in ARMv8
 + * and mandatorily for ARMv8.2 and up.
 + * ACTLR2 is banked for S and NS if EL3 is AArch32. Since QEMU's
 + * implementation is RAZ/WI we can ignore this detail, as we
 + * do for ACTLR.
 + */
 +static const ARMCPRegInfo actlr2_hactlr2_reginfo[] = {
 +    { .name = "ACTLR2", .state = ARM_CP_STATE_AA32,
 +      .cp = 15, .opc1 = 0, .crn = 1, .crm = 0, .opc2 = 3,
 +      .access = PL1_RW, .type = ARM_CP_CONST,
 +      .resetvalue = 0 },
 +    { .name = "HACTLR2", .state = ARM_CP_STATE_AA32,
 +      .cp = 15, .opc1 = 4, .crn = 1, .crm = 0, .opc2 = 3,
 +      .access = PL2_RW, .type = ARM_CP_CONST,
 +      .resetvalue = 0 },
 +    REGINFO_SENTINEL
 +};
 +
  void register_cp_regs_for_features(ARMCPU *cpu)
  {
      /* Register all the coprocessor registers based on feature bits */
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
              REGINFO_SENTINEL
          };
          define_arm_cp_regs(cpu, auxcr_reginfo);
 -        if (arm_feature(env, ARM_FEATURE_V8)) {
 -            /* HACTLR2 maps to ACTLR_EL2[63:32] and is not in ARMv7 */
 -            ARMCPRegInfo hactlr2_reginfo = {
 -                .name = "HACTLR2", .state = ARM_CP_STATE_AA32,
 -                .cp = 15, .opc1 = 4, .crn = 1, .crm = 0, .opc2 = 3,
 -                .access = PL2_RW, .type = ARM_CP_CONST,
 -                .resetvalue = 0
 -            };
 -            define_one_arm_cp_reg(cpu, &hactlr2_reginfo);
 +        if (cpu_isar_feature(aa32_ac2, cpu)) {
 +            define_arm_cp_regs(cpu, actlr2_hactlr2_reginfo);
          }
      }
 --
 .20.1
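
Editor's sketch (not part of the series): the ACTLR2/HACTLR2 patch has
two halves that must agree. The "max" CPU deposits a non-zero value
into ID_MMFR4.AC2, and register setup defines the two registers only
when that field is non-zero. The standalone C below models just that
deposit-then-test round trip. The helper names are hypothetical, and
the field positions (AC2 at bits [7:4], HPDS at bits [19:16]) are
assumptions of this sketch rather than values taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout of the two ID_MMFR4 fields used in this sketch. */
#define SKETCH_AC2_SHIFT   4
#define SKETCH_HPDS_SHIFT  16
#define SKETCH_FIELD_LEN   4

/* Read a bit field out of a 32-bit register value. */
static inline uint32_t sketch_ex32(uint32_t value, unsigned start,
                                   unsigned length)
{
    assert(length > 0 && start < 32 && length <= 32 - start);
    return (value >> start) & (UINT32_MAX >> (32 - length));
}

/* Write 'fieldval' into a bit field, leaving all other bits untouched. */
static inline uint32_t sketch_dp32(uint32_t value, unsigned start,
                                   unsigned length, uint32_t fieldval)
{
    assert(length > 0 && start < 32 && length <= 32 - start);
    uint32_t mask = (UINT32_MAX >> (32 - length)) << start;
    return (value & ~mask) | ((fieldval << start) & mask);
}

/* Gate for defining ACTLR2/HACTLR2: present only when AC2 is non-zero. */
static inline bool sketch_has_ac2(uint32_t id_mmfr4)
{
    return sketch_ex32(id_mmfr4, SKETCH_AC2_SHIFT, SKETCH_FIELD_LEN) != 0;
}
```

A CPU model whose ID_MMFR4 leaves AC2 at zero fails the gate, which is
the behaviour the commit message describes for the older Cortex-A
models.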

-New patch
+[PULL 35/52] hw: usb: hcd-ohci: Move OHCISysBusState and TYPE_SYSBUS_OHCI to include file
+From: Guenter Roeck <linux@roeck-us.net>
+We need to be able to use OHCISysBusState outside hcd-ohci.c, so move it
+to its include file.
+Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
+Signed-off-by: Guenter Roeck <linux@roeck-us.net>
+Tested-by: Niek Linnenbank <nieklinnenbank@gmail.com>
+Message-id: 20200217204812.9857-2-linux@roeck-us.net
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+---
+ hw/usb/hcd-ohci.h | 16 ++++++++++++++++
+ hw/usb/hcd-ohci.c | 15 ---------------
+files changed, 16 insertions(+), 15 deletions(-)
+diff --git a/hw/usb/hcd-ohci.h b/hw/usb/hcd-ohci.h
+index XXXXXXX..XXXXXXX 100644
+--- a/hw/usb/hcd-ohci.h
++++ b/hw/usb/hcd-ohci.h
+@@ -XXX,XX +XXX,XX @@
+ #define HCD_OHCI_H
+ #include "sysemu/dma.h"
++#include "hw/usb.h"
+ /* Number of Downstream Ports on the root hub: */
+ #define OHCI_MAX_PORTS 15
+@@ -XXX,XX +XXX,XX @@ typedef struct OHCIState {
+     void (*ohci_die)(struct OHCIState *ohci);
+ } OHCIState;
++#define TYPE_SYSBUS_OHCI "sysbus-ohci"
++#define SYSBUS_OHCI(obj) OBJECT_CHECK(OHCISysBusState, (obj), TYPE_SYSBUS_OHCI)
++
++typedef struct {
++    /*< private >*/
++    SysBusDevice parent_obj;
++    /*< public >*/
++
++    OHCIState ohci;
++    char *masterbus;
++    uint32_t num_ports;
++    uint32_t firstport;
++    dma_addr_t dma_offset;
++} OHCISysBusState;
++
+ extern const VMStateDescription vmstate_ohci_state;
+ void usb_ohci_init(OHCIState *ohci, DeviceState *dev, uint32_t num_ports,
+diff --git a/hw/usb/hcd-ohci.c b/hw/usb/hcd-ohci.c
+index XXXXXXX..XXXXXXX 100644
+--- a/hw/usb/hcd-ohci.c
++++ b/hw/usb/hcd-ohci.c
+@@ -XXX,XX +XXX,XX @@ void ohci_sysbus_die(struct OHCIState *ohci)
+     ohci_bus_stop(ohci);
+ }
+-#define TYPE_SYSBUS_OHCI "sysbus-ohci"
+-#define SYSBUS_OHCI(obj) OBJECT_CHECK(OHCISysBusState, (obj), TYPE_SYSBUS_OHCI)
+-
+-typedef struct {
+-    /*< private >*/
+-    SysBusDevice parent_obj;
+-    /*< public >*/
+-
+-    OHCIState ohci;
+-    char *masterbus;
+-    uint32_t num_ports;
+-    uint32_t firstport;
+-    dma_addr_t dma_offset;
+-} OHCISysBusState;
+-
+ static void ohci_realize_pxa(DeviceState *dev, Error **errp)
+ {
+     OHCISysBusState *s = SYSBUS_OHCI(dev);
+--
+.20.1

-New patch
+[PULL 36/52] hcd-ehci: Introduce "companion-enable" sysbus property
+From: Guenter Roeck <linux@roeck-us.net>
+We'll use this property in a follow-up patch to instantiate an EHCI
+bus with companion support.
+Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
+Signed-off-by: Guenter Roeck <linux@roeck-us.net>
+Tested-by: Niek Linnenbank <nieklinnenbank@gmail.com>
+Message-id: 20200217204812.9857-3-linux@roeck-us.net
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+---
+ hw/usb/hcd-ehci-sysbus.c | 2 ++
+file changed, 2 insertions(+)
+diff --git a/hw/usb/hcd-ehci-sysbus.c b/hw/usb/hcd-ehci-sysbus.c
+index XXXXXXX..XXXXXXX 100644
+--- a/hw/usb/hcd-ehci-sysbus.c
++++ b/hw/usb/hcd-ehci-sysbus.c
+@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_ehci_sysbus = {
+ static Property ehci_sysbus_properties[] = {
+     DEFINE_PROP_UINT32("maxframes", EHCISysBusState, ehci.maxframes, 128),
++    DEFINE_PROP_BOOL("companion-enable", EHCISysBusState, ehci.companion_enable,
++                     false),
+     DEFINE_PROP_END_OF_LIST(),
+ };
+--
+.20.1

-[Qemu-devel] [PULL 02/29] hw/misc/zynq_slcr: use standard register definition
+[PULL 37/52] arm: allwinner: Wire up USB ports
-From: Damien Hedde <damien.hedde@greensocs.com>
+From: Guenter Roeck <linux@roeck-us.net>
-Replace the zynq_slcr registers enum and macros using the
+Instantiate EHCI and OHCI controllers on Allwinner A10. OHCI ports are
-hw/registerfields.h macros.
+modeled as companions of the respective EHCI ports.
-Signed-off-by: Damien Hedde <damien.hedde@greensocs.com>
+With this patch applied, USB controllers are discovered and instantiated
-Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+when booting the cubieboard machine with a recent Linux kernel.
-Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
-Message-id: 20190729145654.14644-30-damien.hedde@greensocs.com
+ehci-platform 1c14000.usb: EHCI Host Controller
 ehci-platform 1c14000.usb: new USB bus registered, assigned bus number 1
 ehci-platform 1c14000.usb: irq 26, io mem 0x01c14000
 ehci-platform 1c14000.usb: USB 2.0 started, EHCI 1.00
 ehci-platform 1c1c000.usb: EHCI Host Controller
 ehci-platform 1c1c000.usb: new USB bus registered, assigned bus number 2
 ehci-platform 1c1c000.usb: irq 31, io mem 0x01c1c000
 ehci-platform 1c1c000.usb: USB 2.0 started, EHCI 1.00
 ohci-platform 1c14400.usb: Generic Platform OHCI controller
 ohci-platform 1c14400.usb: new USB bus registered, assigned bus number 3
 ohci-platform 1c14400.usb: irq 27, io mem 0x01c14400
 ohci-platform 1c1c400.usb: Generic Platform OHCI controller
 ohci-platform 1c1c400.usb: new USB bus registered, assigned bus number 4
 ohci-platform 1c1c400.usb: irq 32, io mem 0x01c1c400
 usb 2-1: new high-speed USB device number 2 using ehci-platform
 usb-storage 2-1:1.0: USB Mass Storage device detected
 scsi host1: usb-storage 2-1:1.0
 usb 3-1: new full-speed USB device number 2 using ohci-platform
 input: QEMU QEMU USB Mouse as /devices/platform/soc/1c14400.usb/usb3/3-1/3-1:1.0/0003:0627:0001.0001/input/input0
 Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
 Signed-off-by: Guenter Roeck <linux@roeck-us.net>
 Tested-by: Niek Linnenbank <nieklinnenbank@gmail.com>
 Message-id: 20200217204812.9857-4-linux@roeck-us.net
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- hw/misc/zynq_slcr.c | 450 ++++++++++++++++++++++----------------------
+ include/hw/arm/allwinner-a10.h |  6 +++++
-file changed, 225 insertions(+), 225 deletions(-)
+ hw/arm/allwinner-a10.c         | 43 ++++++++++++++++++++++++++++++++++
 files changed, 49 insertions(+)
-diff --git a/hw/misc/zynq_slcr.c b/hw/misc/zynq_slcr.c
+diff --git a/include/hw/arm/allwinner-a10.h b/include/hw/arm/allwinner-a10.h
 index XXXXXXX..XXXXXXX 100644
---- a/hw/misc/zynq_slcr.c
+--- a/include/hw/arm/allwinner-a10.h
-+++ b/hw/misc/zynq_slcr.c
++++ b/include/hw/arm/allwinner-a10.h
 @@ -XXX,XX +XXX,XX @@
+ #include "hw/intc/allwinner-a10-pic.h"
+ #include "hw/net/allwinner_emac.h"
+ #include "hw/ide/ahci.h"
++#include "hw/usb/hcd-ohci.h"
++#include "hw/usb/hcd-ehci.h"
+ #include "target/arm/cpu.h"
+ #define AW_A10_SDRAM_BASE       0x40000000
++#define AW_A10_NUM_USB          2
++
+ #define TYPE_AW_A10 "allwinner-a10"
+ #define AW_A10(obj) OBJECT_CHECK(AwA10State, (obj), TYPE_AW_A10)
+@@ -XXX,XX +XXX,XX @@ typedef struct AwA10State {
+     AwEmacState emac;
+     AllwinnerAHCIState sata;
+     MemoryRegion sram_a;
++    EHCISysBusState ehci[AW_A10_NUM_USB];
++    OHCISysBusState ohci[AW_A10_NUM_USB];
+ } AwA10State;
+ #endif
+diff --git a/hw/arm/allwinner-a10.c b/hw/arm/allwinner-a10.c
+index XXXXXXX..XXXXXXX 100644
+--- a/hw/arm/allwinner-a10.c
++++ b/hw/arm/allwinner-a10.c
+@@ -XXX,XX +XXX,XX @@
+ #include "hw/arm/allwinner-a10.h"
+ #include "hw/misc/unimp.h"
  #include "sysemu/sysemu.h"
- #include "qemu/log.h"
++#include "hw/boards.h"
- #include "qemu/module.h"
++#include "hw/usb/hcd-ohci.h"
-+#include "hw/registerfields.h"
+ #define AW_A10_PIC_REG_BASE     0x01c20400
- #ifndef ZYNQ_SLCR_ERR_DEBUG
+ #define AW_A10_PIT_REG_BASE     0x01c20c00
- #define ZYNQ_SLCR_ERR_DEBUG 0
+ #define AW_A10_UART0_REG_BASE   0x01c28000
-@@ -XXX,XX +XXX,XX @@
+ #define AW_A10_EMAC_BASE        0x01c0b000
- #define XILINX_LOCK_KEY 0x767b
++#define AW_A10_EHCI_BASE        0x01c14000
- #define XILINX_UNLOCK_KEY 0xdf0d
++#define AW_A10_OHCI_BASE        0x01c14400
+ #define AW_A10_SATA_BASE        0x01c18000
--#define R_PSS_RST_CTRL_SOFT_RST 0x1
-+REG32(SCL, 0x000)
+ static void aw_a10_init(Object *obj)
-+REG32(LOCK, 0x004)
+@@ -XXX,XX +XXX,XX @@ static void aw_a10_init(Object *obj)
-+REG32(UNLOCK, 0x008)
-+REG32(LOCKSTA, 0x00c)
+     sysbus_init_child_obj(obj, "sata", &s->sata, sizeof(s->sata),
+                           TYPE_ALLWINNER_AHCI);
--enum {
++
--    SCL             = 0x000 / 4,
++    if (machine_usb(current_machine)) {
--    LOCK,
++        int i;
--    UNLOCK,
++
--    LOCKSTA,
++        for (i = 0; i < AW_A10_NUM_USB; i++) {
-+REG32(ARM_PLL_CTRL, 0x100)
++            sysbus_init_child_obj(obj, "ehci[*]", OBJECT(&s->ehci[i]),
-+REG32(DDR_PLL_CTRL, 0x104)
++                                  sizeof(s->ehci[i]), TYPE_PLATFORM_EHCI);
-+REG32(IO_PLL_CTRL, 0x108)
++            sysbus_init_child_obj(obj, "ohci[*]", OBJECT(&s->ohci[i]),
-+REG32(PLL_STATUS, 0x10c)
++                                  sizeof(s->ohci[i]), TYPE_SYSBUS_OHCI);
-+REG32(ARM_PLL_CFG, 0x110)
++        }
-+REG32(DDR_PLL_CFG, 0x114)
++    }
 +REG32(IO_PLL_CFG, 0x118)
 -    ARM_PLL_CTRL    = 0x100 / 4,
 -    DDR_PLL_CTRL,
 -    IO_PLL_CTRL,
 -    PLL_STATUS,
 -    ARM_PLL_CFG,
 -    DDR_PLL_CFG,
 -    IO_PLL_CFG,
 -
 -    ARM_CLK_CTRL    = 0x120 / 4,
 -    DDR_CLK_CTRL,
 -    DCI_CLK_CTRL,
 -    APER_CLK_CTRL,
 -    USB0_CLK_CTRL,
 -    USB1_CLK_CTRL,
 -    GEM0_RCLK_CTRL,
 -    GEM1_RCLK_CTRL,
 -    GEM0_CLK_CTRL,
 -    GEM1_CLK_CTRL,
 -    SMC_CLK_CTRL,
 -    LQSPI_CLK_CTRL,
 -    SDIO_CLK_CTRL,
 -    UART_CLK_CTRL,
 -    SPI_CLK_CTRL,
 -    CAN_CLK_CTRL,
 -    CAN_MIOCLK_CTRL,
 -    DBG_CLK_CTRL,
 -    PCAP_CLK_CTRL,
 -    TOPSW_CLK_CTRL,
 +REG32(ARM_CLK_CTRL, 0x120)
 +REG32(DDR_CLK_CTRL, 0x124)
 +REG32(DCI_CLK_CTRL, 0x128)
 +REG32(APER_CLK_CTRL, 0x12c)
 +REG32(USB0_CLK_CTRL, 0x130)
 +REG32(USB1_CLK_CTRL, 0x134)
 +REG32(GEM0_RCLK_CTRL, 0x138)
 +REG32(GEM1_RCLK_CTRL, 0x13c)
 +REG32(GEM0_CLK_CTRL, 0x140)
 +REG32(GEM1_CLK_CTRL, 0x144)
 +REG32(SMC_CLK_CTRL, 0x148)
 +REG32(LQSPI_CLK_CTRL, 0x14c)
 +REG32(SDIO_CLK_CTRL, 0x150)
 +REG32(UART_CLK_CTRL, 0x154)
 +REG32(SPI_CLK_CTRL, 0x158)
 +REG32(CAN_CLK_CTRL, 0x15c)
 +REG32(CAN_MIOCLK_CTRL, 0x160)
 +REG32(DBG_CLK_CTRL, 0x164)
 +REG32(PCAP_CLK_CTRL, 0x168)
 +REG32(TOPSW_CLK_CTRL, 0x16c)
  #define FPGA_CTRL_REGS(n, start) \
 -    FPGA ## n ## _CLK_CTRL = (start) / 4, \
 -    FPGA ## n ## _THR_CTRL, \
 -    FPGA ## n ## _THR_CNT, \
 -    FPGA ## n ## _THR_STA,
 -    FPGA_CTRL_REGS(0, 0x170)
 -    FPGA_CTRL_REGS(1, 0x180)
 -    FPGA_CTRL_REGS(2, 0x190)
 -    FPGA_CTRL_REGS(3, 0x1a0)
 +    REG32(FPGA ## n ## _CLK_CTRL, (start)) \
 +    REG32(FPGA ## n ## _THR_CTRL, (start) + 0x4)\
 +    REG32(FPGA ## n ## _THR_CNT,  (start) + 0x8)\
 +    REG32(FPGA ## n ## _THR_STA,  (start) + 0xc)
 +FPGA_CTRL_REGS(0, 0x170)
 +FPGA_CTRL_REGS(1, 0x180)
 +FPGA_CTRL_REGS(2, 0x190)
 +FPGA_CTRL_REGS(3, 0x1a0)
 -    BANDGAP_TRIP    = 0x1b8 / 4,
 -    PLL_PREDIVISOR  = 0x1c0 / 4,
 -    CLK_621_TRUE,
 +REG32(BANDGAP_TRIP, 0x1b8)
 +REG32(PLL_PREDIVISOR, 0x1c0)
 +REG32(CLK_621_TRUE, 0x1c4)
 -    PSS_RST_CTRL    = 0x200 / 4,
 -    DDR_RST_CTRL,
 -    TOPSW_RESET_CTRL,
 -    DMAC_RST_CTRL,
 -    USB_RST_CTRL,
 -    GEM_RST_CTRL,
 -    SDIO_RST_CTRL,
 -    SPI_RST_CTRL,
 -    CAN_RST_CTRL,
 -    I2C_RST_CTRL,
 -    UART_RST_CTRL,
 -    GPIO_RST_CTRL,
 -    LQSPI_RST_CTRL,
 -    SMC_RST_CTRL,
 -    OCM_RST_CTRL,
 -    FPGA_RST_CTRL   = 0x240 / 4,
 -    A9_CPU_RST_CTRL,
 +REG32(PSS_RST_CTRL, 0x200)
 +    FIELD(PSS_RST_CTRL, SOFT_RST, 0, 1)
 +REG32(DDR_RST_CTRL, 0x204)
 +REG32(TOPSW_RESET_CTRL, 0x208)
 +REG32(DMAC_RST_CTRL, 0x20c)
 +REG32(USB_RST_CTRL, 0x210)
 +REG32(GEM_RST_CTRL, 0x214)
 +REG32(SDIO_RST_CTRL, 0x218)
 +REG32(SPI_RST_CTRL, 0x21c)
 +REG32(CAN_RST_CTRL, 0x220)
 +REG32(I2C_RST_CTRL, 0x224)
 +REG32(UART_RST_CTRL, 0x228)
 +REG32(GPIO_RST_CTRL, 0x22c)
 +REG32(LQSPI_RST_CTRL, 0x230)
 +REG32(SMC_RST_CTRL, 0x234)
 +REG32(OCM_RST_CTRL, 0x238)
 +REG32(FPGA_RST_CTRL, 0x240)
 +REG32(A9_CPU_RST_CTRL, 0x244)
 -    RS_AWDT_CTRL    = 0x24c / 4,
 -    RST_REASON,
 +REG32(RS_AWDT_CTRL, 0x24c)
 +REG32(RST_REASON, 0x250)
 -    REBOOT_STATUS   = 0x258 / 4,
 -    BOOT_MODE,
 +REG32(REBOOT_STATUS, 0x258)
 +REG32(BOOT_MODE, 0x25c)
 -    APU_CTRL        = 0x300 / 4,
 -    WDT_CLK_SEL,
 +REG32(APU_CTRL, 0x300)
 +REG32(WDT_CLK_SEL, 0x304)
 -    TZ_DMA_NS       = 0x440 / 4,
 -    TZ_DMA_IRQ_NS,
 -    TZ_DMA_PERIPH_NS,
 +REG32(TZ_DMA_NS, 0x440)
 +REG32(TZ_DMA_IRQ_NS, 0x444)
 +REG32(TZ_DMA_PERIPH_NS, 0x448)
 -    PSS_IDCODE      = 0x530 / 4,
 +REG32(PSS_IDCODE, 0x530)
 -    DDR_URGENT      = 0x600 / 4,
 -    DDR_CAL_START   = 0x60c / 4,
 -    DDR_REF_START   = 0x614 / 4,
 -    DDR_CMD_STA,
 -    DDR_URGENT_SEL,
 -    DDR_DFI_STATUS,
 +REG32(DDR_URGENT, 0x600)
 +REG32(DDR_CAL_START, 0x60c)
 +REG32(DDR_REF_START, 0x614)
 +REG32(DDR_CMD_STA, 0x618)
 +REG32(DDR_URGENT_SEL, 0x61c)
 +REG32(DDR_DFI_STATUS, 0x620)
 -    MIO             = 0x700 / 4,
 +REG32(MIO, 0x700)
  #define MIO_LENGTH 54
 -    MIO_LOOPBACK    = 0x804 / 4,
 -    MIO_MST_TRI0,
 -    MIO_MST_TRI1,
 +REG32(MIO_LOOPBACK, 0x804)
 +REG32(MIO_MST_TRI0, 0x808)
 +REG32(MIO_MST_TRI1, 0x80c)
 -    SD0_WP_CD_SEL   = 0x830 / 4,
 -    SD1_WP_CD_SEL,
 +REG32(SD0_WP_CD_SEL, 0x830)
 +REG32(SD1_WP_CD_SEL, 0x834)
 -    LVL_SHFTR_EN    = 0x900 / 4,
 -    OCM_CFG         = 0x910 / 4,
 +REG32(LVL_SHFTR_EN, 0x900)
 +REG32(OCM_CFG, 0x910)
 -    CPU_RAM         = 0xa00 / 4,
 +REG32(CPU_RAM, 0xa00)
 -    IOU             = 0xa30 / 4,
 +REG32(IOU, 0xa30)
 -    DMAC_RAM        = 0xa50 / 4,
 +REG32(DMAC_RAM, 0xa50)
 -    AFI0            = 0xa60 / 4,
 -    AFI1 = AFI0 + 3,
 -    AFI2 = AFI1 + 3,
 -    AFI3 = AFI2 + 3,
 +REG32(AFI0, 0xa60)
 +REG32(AFI1, 0xa6c)
 +REG32(AFI2, 0xa78)
 +REG32(AFI3, 0xa84)
  #define AFI_LENGTH 3
 -    OCM             = 0xa90 / 4,
 +REG32(OCM, 0xa90)
 -    DEVCI_RAM       = 0xaa0 / 4,
 +REG32(DEVCI_RAM, 0xaa0)
 -    CSG_RAM         = 0xab0 / 4,
 +REG32(CSG_RAM, 0xab0)
 -    GPIOB_CTRL      = 0xb00 / 4,
 -    GPIOB_CFG_CMOS18,
 -    GPIOB_CFG_CMOS25,
 -    GPIOB_CFG_CMOS33,
 -    GPIOB_CFG_HSTL  = 0xb14 / 4,
 -    GPIOB_DRVR_BIAS_CTRL,
 +REG32(GPIOB_CTRL, 0xb00)
 +REG32(GPIOB_CFG_CMOS18, 0xb04)
 +REG32(GPIOB_CFG_CMOS25, 0xb08)
 +REG32(GPIOB_CFG_CMOS33, 0xb0c)
 +REG32(GPIOB_CFG_HSTL, 0xb14)
 +REG32(GPIOB_DRVR_BIAS_CTRL, 0xb18)
 -    DDRIOB          = 0xb40 / 4,
 +REG32(DDRIOB, 0xb40)
  #define DDRIOB_LENGTH 14
 -};
  #define ZYNQ_SLCR_MMIO_SIZE     0x1000
  #define ZYNQ_SLCR_NUM_REGS      (ZYNQ_SLCR_MMIO_SIZE / 4)
@@ -XXX,XX +XXX,XX @@ static void zynq_slcr_reset(DeviceState *d)
      DB_PRINT("RESET\n");
 -    s->regs[LOCKSTA] = 1;
 +    s->regs[R_LOCKSTA] = 1;
      /* 0x100 - 0x11C */
 -    s->regs[ARM_PLL_CTRL]   = 0x0001A008;
 -    s->regs[DDR_PLL_CTRL]   = 0x0001A008;
 -    s->regs[IO_PLL_CTRL]    = 0x0001A008;
 -    s->regs[PLL_STATUS]     = 0x0000003F;
 -    s->regs[ARM_PLL_CFG]    = 0x00014000;
 -    s->regs[DDR_PLL_CFG]    = 0x00014000;
 -    s->regs[IO_PLL_CFG]     = 0x00014000;
 +    s->regs[R_ARM_PLL_CTRL]   = 0x0001A008;
 +    s->regs[R_DDR_PLL_CTRL]   = 0x0001A008;
 +    s->regs[R_IO_PLL_CTRL]    = 0x0001A008;
 +    s->regs[R_PLL_STATUS]     = 0x0000003F;
 +    s->regs[R_ARM_PLL_CFG]    = 0x00014000;
 +    s->regs[R_DDR_PLL_CFG]    = 0x00014000;
 +    s->regs[R_IO_PLL_CFG]     = 0x00014000;
      /* 0x120 - 0x16C */
 -    s->regs[ARM_CLK_CTRL]   = 0x1F000400;
 -    s->regs[DDR_CLK_CTRL]   = 0x18400003;
 -    s->regs[DCI_CLK_CTRL]   = 0x01E03201;
 -    s->regs[APER_CLK_CTRL]  = 0x01FFCCCD;
 -    s->regs[USB0_CLK_CTRL]  = s->regs[USB1_CLK_CTRL]    = 0x00101941;
 -    s->regs[GEM0_RCLK_CTRL] = s->regs[GEM1_RCLK_CTRL]   = 0x00000001;
 -    s->regs[GEM0_CLK_CTRL]  = s->regs[GEM1_CLK_CTRL]    = 0x00003C01;
 -    s->regs[SMC_CLK_CTRL]   = 0x00003C01;
 -    s->regs[LQSPI_CLK_CTRL] = 0x00002821;
 -    s->regs[SDIO_CLK_CTRL]  = 0x00001E03;
 -    s->regs[UART_CLK_CTRL]  = 0x00003F03;
 -    s->regs[SPI_CLK_CTRL]   = 0x00003F03;
 -    s->regs[CAN_CLK_CTRL]   = 0x00501903;
 -    s->regs[DBG_CLK_CTRL]   = 0x00000F03;
 -    s->regs[PCAP_CLK_CTRL]  = 0x00000F01;
 +    s->regs[R_ARM_CLK_CTRL]   = 0x1F000400;
 +    s->regs[R_DDR_CLK_CTRL]   = 0x18400003;
 +    s->regs[R_DCI_CLK_CTRL]   = 0x01E03201;
 +    s->regs[R_APER_CLK_CTRL]  = 0x01FFCCCD;
 +    s->regs[R_USB0_CLK_CTRL]  = s->regs[R_USB1_CLK_CTRL]  = 0x00101941;
 +    s->regs[R_GEM0_RCLK_CTRL] = s->regs[R_GEM1_RCLK_CTRL] = 0x00000001;
 +    s->regs[R_GEM0_CLK_CTRL]  = s->regs[R_GEM1_CLK_CTRL]  = 0x00003C01;
 +    s->regs[R_SMC_CLK_CTRL]   = 0x00003C01;
 +    s->regs[R_LQSPI_CLK_CTRL] = 0x00002821;
 +    s->regs[R_SDIO_CLK_CTRL]  = 0x00001E03;
 +    s->regs[R_UART_CLK_CTRL]  = 0x00003F03;
 +    s->regs[R_SPI_CLK_CTRL]   = 0x00003F03;
 +    s->regs[R_CAN_CLK_CTRL]   = 0x00501903;
 +    s->regs[R_DBG_CLK_CTRL]   = 0x00000F03;
 +    s->regs[R_PCAP_CLK_CTRL]  = 0x00000F01;
      /* 0x170 - 0x1AC */
 -    s->regs[FPGA0_CLK_CTRL] = s->regs[FPGA1_CLK_CTRL] = s->regs[FPGA2_CLK_CTRL]
 -                            = s->regs[FPGA3_CLK_CTRL] = 0x00101800;
 -    s->regs[FPGA0_THR_STA] = s->regs[FPGA1_THR_STA] = s->regs[FPGA2_THR_STA]
 -                           = s->regs[FPGA3_THR_STA] = 0x00010000;
 +    s->regs[R_FPGA0_CLK_CTRL] = s->regs[R_FPGA1_CLK_CTRL]
 +                              = s->regs[R_FPGA2_CLK_CTRL]
 +                              = s->regs[R_FPGA3_CLK_CTRL] = 0x00101800;
 +    s->regs[R_FPGA0_THR_STA] = s->regs[R_FPGA1_THR_STA]
 +                             = s->regs[R_FPGA2_THR_STA]
 +                             = s->regs[R_FPGA3_THR_STA] = 0x00010000;
      /* 0x1B0 - 0x1D8 */
 -    s->regs[BANDGAP_TRIP]   = 0x0000001F;
 -    s->regs[PLL_PREDIVISOR] = 0x00000001;
 -    s->regs[CLK_621_TRUE]   = 0x00000001;
 +    s->regs[R_BANDGAP_TRIP]   = 0x0000001F;
 +    s->regs[R_PLL_PREDIVISOR] = 0x00000001;
 +    s->regs[R_CLK_621_TRUE]   = 0x00000001;
      /* 0x200 - 0x25C */
 -    s->regs[FPGA_RST_CTRL]  = 0x01F33F0F;
 -    s->regs[RST_REASON]     = 0x00000040;
 +    s->regs[R_FPGA_RST_CTRL]  = 0x01F33F0F;
 +    s->regs[R_RST_REASON]     = 0x00000040;
 -    s->regs[BOOT_MODE]      = 0x00000001;
 +    s->regs[R_BOOT_MODE]      = 0x00000001;
      /* 0x700 - 0x7D4 */
      for (i = 0; i < 54; i++) {
 -        s->regs[MIO + i] = 0x00001601;
 +        s->regs[R_MIO + i] = 0x00001601;
      }
      for (i = 2; i <= 8; i++) {
 -        s->regs[MIO + i] = 0x00000601;
 +        s->regs[R_MIO + i] = 0x00000601;
      }
 -    s->regs[MIO_MST_TRI0] = s->regs[MIO_MST_TRI1] = 0xFFFFFFFF;
 +    s->regs[R_MIO_MST_TRI0] = s->regs[R_MIO_MST_TRI1] = 0xFFFFFFFF;
 -    s->regs[CPU_RAM + 0] = s->regs[CPU_RAM + 1] = s->regs[CPU_RAM + 3]
 -                         = s->regs[CPU_RAM + 4] = s->regs[CPU_RAM + 7]
 -                         = 0x00010101;
 -    s->regs[CPU_RAM + 2] = s->regs[CPU_RAM + 5] = 0x01010101;
 -    s->regs[CPU_RAM + 6] = 0x00000001;
 +    s->regs[R_CPU_RAM + 0] = s->regs[R_CPU_RAM + 1] = s->regs[R_CPU_RAM + 3]
 +                           = s->regs[R_CPU_RAM + 4] = s->regs[R_CPU_RAM + 7]
 +                           = 0x00010101;
 +    s->regs[R_CPU_RAM + 2] = s->regs[R_CPU_RAM + 5] = 0x01010101;
 +    s->regs[R_CPU_RAM + 6] = 0x00000001;
 -    s->regs[IOU + 0] = s->regs[IOU + 1] = s->regs[IOU + 2] = s->regs[IOU + 3]
 -                     = 0x09090909;
 -    s->regs[IOU + 4] = s->regs[IOU + 5] = 0x00090909;
 -    s->regs[IOU + 6] = 0x00000909;
 +    s->regs[R_IOU + 0] = s->regs[R_IOU + 1] = s->regs[R_IOU + 2]
 +                       = s->regs[R_IOU + 3] = 0x09090909;
 +    s->regs[R_IOU + 4] = s->regs[R_IOU + 5] = 0x00090909;
 +    s->regs[R_IOU + 6] = 0x00000909;
 -    s->regs[DMAC_RAM] = 0x00000009;
 +    s->regs[R_DMAC_RAM] = 0x00000009;
 -    s->regs[AFI0 + 0] = s->regs[AFI0 + 1] = 0x09090909;
 -    s->regs[AFI1 + 0] = s->regs[AFI1 + 1] = 0x09090909;
 -    s->regs[AFI2 + 0] = s->regs[AFI2 + 1] = 0x09090909;
 -    s->regs[AFI3 + 0] = s->regs[AFI3 + 1] = 0x09090909;
 -    s->regs[AFI0 + 2] = s->regs[AFI1 + 2] = s->regs[AFI2 + 2]
 -                      = s->regs[AFI3 + 2] = 0x00000909;
 +    s->regs[R_AFI0 + 0] = s->regs[R_AFI0 + 1] = 0x09090909;
 +    s->regs[R_AFI1 + 0] = s->regs[R_AFI1 + 1] = 0x09090909;
 +    s->regs[R_AFI2 + 0] = s->regs[R_AFI2 + 1] = 0x09090909;
 +    s->regs[R_AFI3 + 0] = s->regs[R_AFI3 + 1] = 0x09090909;
 +    s->regs[R_AFI0 + 2] = s->regs[R_AFI1 + 2] = s->regs[R_AFI2 + 2]
 +                        = s->regs[R_AFI3 + 2] = 0x00000909;
 -    s->regs[OCM + 0]    = 0x01010101;
 -    s->regs[OCM + 1]    = s->regs[OCM + 2] = 0x09090909;
 +    s->regs[R_OCM + 0] = 0x01010101;
 +    s->regs[R_OCM + 1] = s->regs[R_OCM + 2] = 0x09090909;
 -    s->regs[DEVCI_RAM]  = 0x00000909;
 -    s->regs[CSG_RAM]    = 0x00000001;
 +    s->regs[R_DEVCI_RAM] = 0x00000909;
 +    s->regs[R_CSG_RAM]   = 0x00000001;
 -    s->regs[DDRIOB + 0] = s->regs[DDRIOB + 1] = s->regs[DDRIOB + 2]
 -                        = s->regs[DDRIOB + 3] = 0x00000e00;
 -    s->regs[DDRIOB + 4] = s->regs[DDRIOB + 5] = s->regs[DDRIOB + 6]
 -                        = 0x00000e00;
 -    s->regs[DDRIOB + 12] = 0x00000021;
 +    s->regs[R_DDRIOB + 0] = s->regs[R_DDRIOB + 1] = s->regs[R_DDRIOB + 2]
 +                          = s->regs[R_DDRIOB + 3] = 0x00000e00;
 +    s->regs[R_DDRIOB + 4] = s->regs[R_DDRIOB + 5] = s->regs[R_DDRIOB + 6]
 +                          = 0x00000e00;
 +    s->regs[R_DDRIOB + 12] = 0x00000021;
  }
+ static void aw_a10_realize(DeviceState *dev, Error **errp)
- static bool zynq_slcr_check_offset(hwaddr offset, bool rnw)
+@@ -XXX,XX +XXX,XX @@ static void aw_a10_realize(DeviceState *dev, Error **errp)
- {
+     serial_mm_init(get_system_memory(), AW_A10_UART0_REG_BASE, 2,
-     switch (offset) {
+                    qdev_get_gpio_in(dev, 1),
--    case LOCK:
+, serial_hd(0), DEVICE_NATIVE_ENDIAN);
--    case UNLOCK:
++
--    case DDR_CAL_START:
++    if (machine_usb(current_machine)) {
--    case DDR_REF_START:
++        int i;
-+    case R_LOCK:
++
-+    case R_UNLOCK:
++        for (i = 0; i < AW_A10_NUM_USB; i++) {
-+    case R_DDR_CAL_START:
++            char bus[16];
-+    case R_DDR_REF_START:
++
-         return !rnw; /* Write only */
++            sprintf(bus, "usb-bus.%d", i);
--    case LOCKSTA:
++
--    case FPGA0_THR_STA:
++            object_property_set_bool(OBJECT(&s->ehci[i]), true,
--    case FPGA1_THR_STA:
++                                     "companion-enable", &error_fatal);
--    case FPGA2_THR_STA:
++            object_property_set_bool(OBJECT(&s->ehci[i]), true, "realized",
--    case FPGA3_THR_STA:
++                                     &error_fatal);
--    case BOOT_MODE:
++            sysbus_mmio_map(SYS_BUS_DEVICE(&s->ehci[i]), 0,
--    case PSS_IDCODE:
++                            AW_A10_EHCI_BASE + i * 0x8000);
--    case DDR_CMD_STA:
++            sysbus_connect_irq(SYS_BUS_DEVICE(&s->ehci[i]), 0,
--    case DDR_DFI_STATUS:
++                               qdev_get_gpio_in(dev, 39 + i));
--    case PLL_STATUS:
++
-+    case R_LOCKSTA:
++            object_property_set_str(OBJECT(&s->ohci[i]), bus, "masterbus",
-+    case R_FPGA0_THR_STA:
++                                    &error_fatal);
-+    case R_FPGA1_THR_STA:
++            object_property_set_bool(OBJECT(&s->ohci[i]), true, "realized",
-+    case R_FPGA2_THR_STA:
++                                     &error_fatal);
-+    case R_FPGA3_THR_STA:
++            sysbus_mmio_map(SYS_BUS_DEVICE(&s->ohci[i]), 0,
-+    case R_BOOT_MODE:
++                            AW_A10_OHCI_BASE + i * 0x8000);
-+    case R_PSS_IDCODE:
++            sysbus_connect_irq(SYS_BUS_DEVICE(&s->ohci[i]), 0,
-+    case R_DDR_CMD_STA:
++                               qdev_get_gpio_in(dev, 64 + i));
-+    case R_DDR_DFI_STATUS:
++        }
-+    case R_PLL_STATUS:
++    }
-         return rnw;/* read only */
+ }
--    case SCL:
--    case ARM_PLL_CTRL ... IO_PLL_CTRL:
+ static void aw_a10_class_init(ObjectClass *oc, void *data)
 -    case ARM_PLL_CFG ... IO_PLL_CFG:
 -    case ARM_CLK_CTRL ... TOPSW_CLK_CTRL:
 -    case FPGA0_CLK_CTRL ... FPGA0_THR_CNT:
 -    case FPGA1_CLK_CTRL ... FPGA1_THR_CNT:
 -    case FPGA2_CLK_CTRL ... FPGA2_THR_CNT:
 -    case FPGA3_CLK_CTRL ... FPGA3_THR_CNT:
 -    case BANDGAP_TRIP:
 -    case PLL_PREDIVISOR:
 -    case CLK_621_TRUE:
 -    case PSS_RST_CTRL ... A9_CPU_RST_CTRL:
 -    case RS_AWDT_CTRL:
 -    case RST_REASON:
 -    case REBOOT_STATUS:
 -    case APU_CTRL:
 -    case WDT_CLK_SEL:
 -    case TZ_DMA_NS ... TZ_DMA_PERIPH_NS:
 -    case DDR_URGENT:
 -    case DDR_URGENT_SEL:
 -    case MIO ... MIO + MIO_LENGTH - 1:
 -    case MIO_LOOPBACK ... MIO_MST_TRI1:
 -    case SD0_WP_CD_SEL:
 -    case SD1_WP_CD_SEL:
 -    case LVL_SHFTR_EN:
 -    case OCM_CFG:
 -    case CPU_RAM:
 -    case IOU:
 -    case DMAC_RAM:
 -    case AFI0 ... AFI3 + AFI_LENGTH - 1:
 -    case OCM:
 -    case DEVCI_RAM:
 -    case CSG_RAM:
 -    case GPIOB_CTRL ... GPIOB_CFG_CMOS33:
 -    case GPIOB_CFG_HSTL:
 -    case GPIOB_DRVR_BIAS_CTRL:
 -    case DDRIOB ... DDRIOB + DDRIOB_LENGTH - 1:
 +    case R_SCL:
 +    case R_ARM_PLL_CTRL ... R_IO_PLL_CTRL:
 +    case R_ARM_PLL_CFG ... R_IO_PLL_CFG:
 +    case R_ARM_CLK_CTRL ... R_TOPSW_CLK_CTRL:
 +    case R_FPGA0_CLK_CTRL ... R_FPGA0_THR_CNT:
 +    case R_FPGA1_CLK_CTRL ... R_FPGA1_THR_CNT:
 +    case R_FPGA2_CLK_CTRL ... R_FPGA2_THR_CNT:
 +    case R_FPGA3_CLK_CTRL ... R_FPGA3_THR_CNT:
 +    case R_BANDGAP_TRIP:
 +    case R_PLL_PREDIVISOR:
 +    case R_CLK_621_TRUE:
 +    case R_PSS_RST_CTRL ... R_A9_CPU_RST_CTRL:
 +    case R_RS_AWDT_CTRL:
 +    case R_RST_REASON:
 +    case R_REBOOT_STATUS:
 +    case R_APU_CTRL:
 +    case R_WDT_CLK_SEL:
 +    case R_TZ_DMA_NS ... R_TZ_DMA_PERIPH_NS:
 +    case R_DDR_URGENT:
 +    case R_DDR_URGENT_SEL:
 +    case R_MIO ... R_MIO + MIO_LENGTH - 1:
 +    case R_MIO_LOOPBACK ... R_MIO_MST_TRI1:
 +    case R_SD0_WP_CD_SEL:
 +    case R_SD1_WP_CD_SEL:
 +    case R_LVL_SHFTR_EN:
 +    case R_OCM_CFG:
 +    case R_CPU_RAM:
 +    case R_IOU:
 +    case R_DMAC_RAM:
 +    case R_AFI0 ... R_AFI3 + AFI_LENGTH - 1:
 +    case R_OCM:
 +    case R_DEVCI_RAM:
 +    case R_CSG_RAM:
 +    case R_GPIOB_CTRL ... R_GPIOB_CFG_CMOS33:
 +    case R_GPIOB_CFG_HSTL:
 +    case R_GPIOB_DRVR_BIAS_CTRL:
 +    case R_DDRIOB ... R_DDRIOB + DDRIOB_LENGTH - 1:
          return true;
      default:
          return false;
@@ -XXX,XX +XXX,XX @@ static void zynq_slcr_write(void *opaque, hwaddr offset,
      }
      switch (offset) {
 -    case SCL:
 -        s->regs[SCL] = val & 0x1;
 +    case R_SCL:
 +        s->regs[R_SCL] = val & 0x1;
          return;
 -    case LOCK:
 +    case R_LOCK:
          if ((val & 0xFFFF) == XILINX_LOCK_KEY) {
              DB_PRINT("XILINX LOCK 0xF8000000 + 0x%x <= 0x%x\n", (int)offset,
                  (unsigned)val & 0xFFFF);
 -            s->regs[LOCKSTA] = 1;
 +            s->regs[R_LOCKSTA] = 1;
          } else {
              DB_PRINT("WRONG XILINX LOCK KEY 0xF8000000 + 0x%x <= 0x%x\n",
                  (int)offset, (unsigned)val & 0xFFFF);
          }
          return;
 -    case UNLOCK:
 +    case R_UNLOCK:
          if ((val & 0xFFFF) == XILINX_UNLOCK_KEY) {
              DB_PRINT("XILINX UNLOCK 0xF8000000 + 0x%x <= 0x%x\n", (int)offset,
                  (unsigned)val & 0xFFFF);
 -            s->regs[LOCKSTA] = 0;
 +            s->regs[R_LOCKSTA] = 0;
          } else {
              DB_PRINT("WRONG XILINX UNLOCK KEY 0xF8000000 + 0x%x <= 0x%x\n",
                  (int)offset, (unsigned)val & 0xFFFF);
@@ -XXX,XX +XXX,XX @@ static void zynq_slcr_write(void *opaque, hwaddr offset,
          return;
      }
 -    if (s->regs[LOCKSTA]) {
 +    if (s->regs[R_LOCKSTA]) {
          qemu_log_mask(LOG_GUEST_ERROR,
                        "SCLR registers are locked. Unlock them first\n");
          return;
@@ -XXX,XX +XXX,XX @@ static void zynq_slcr_write(void *opaque, hwaddr offset,
      s->regs[offset] = val;
      switch (offset) {
 -    case PSS_RST_CTRL:
 -        if (val & R_PSS_RST_CTRL_SOFT_RST) {
 +    case R_PSS_RST_CTRL:
 +        if (FIELD_EX32(val, PSS_RST_CTRL, SOFT_RST)) {
              qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET);
          }
          break;
 --
 .20.1
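
Note for readers of the zynq_slcr hunks above: the old enum stored word indices directly (for example `PLL_STATUS` following `ARM_PLL_CTRL = 0x100 / 4`), while the new declarations give byte offsets and the code indexes `s->regs[]` with `R_`-prefixed names. The sketch below shows one way such macros could derive the same word index and extract a one-bit field. The `SKETCH_*` macros and `soft_reset_requested()` are hypothetical stand-ins written for this note; they are not the definitions from `hw/registerfields.h`, which may differ in detail.

```c
#include <stdint.h>

/*
 * Hypothetical stand-ins for illustration only (assumption: the real
 * macros behave similarly, but these are not copied from QEMU).
 */

/* Declare R_<reg> as the 32-bit word index for a byte offset. */
#define SKETCH_REG32(reg, addr) \
    enum { R_ ## reg = (addr) / 4 };

/* Record the bit position and width of a field inside a register. */
#define SKETCH_FIELD(reg, field, shift, length) \
    enum { R_ ## reg ## _ ## field ## _SHIFT = (shift), \
           R_ ## reg ## _ ## field ## _LENGTH = (length) };

/* Extract a field of 1..32 bits from a 32-bit value. */
#define SKETCH_FIELD_EX32(storage, reg, field) \
    (((uint32_t)(storage) >> R_ ## reg ## _ ## field ## _SHIFT) & \
     (UINT32_MAX >> (32 - R_ ## reg ## _ ## field ## _LENGTH)))

/* Offsets taken from the hunks above. */
SKETCH_REG32(PLL_STATUS, 0x10c)
SKETCH_REG32(PSS_RST_CTRL, 0x200)
SKETCH_FIELD(PSS_RST_CTRL, SOFT_RST, 0, 1)

/* Same test the write handler performs on PSS_RST_CTRL writes. */
static int soft_reset_requested(uint32_t val)
{
    return SKETCH_FIELD_EX32(val, PSS_RST_CTRL, SOFT_RST) != 0;
}
```

With these definitions `R_PLL_STATUS` equals `0x10c / 4`, the index the old enum assigned to `PLL_STATUS`, so the reset values and switch ranges keep addressing the same array slots after the rename.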

-[Qemu-devel] [PULL 11/29] target/arm: Replace s->pc with s->base.pc_next
+[PULL 38/52] target/arm: Vectorize USHL and SSHL
 From: Richard Henderson <richard.henderson@linaro.org>
-We must update s->base.pc_next when we return from the translate_insn
+These instructions shift left or right depending on the sign
-hook to the main translator loop.  By incrementing s->base.pc_next
+of the input, and 7 bits are significant to the shift.  This
-immediately after reading the insn word, "pc_next" contains the address
+requires several masks and selects in addition to the actual
-of the next instruction throughout translation.
+shifts to form the complete answer.
-All remaining uses of s->pc are referencing the address of the next insn,
+That said, the operation is still a small improvement even for
-so this is now a simple global replacement.  Remove the "s->pc" field.
+two 64-bit elements -- 13 vector operations instead of 2 * 7
 integer operations.
+Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Message-id: 20200216214232.4230-2-richard.henderson@linaro.org
 Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
 Message-id: 20190807045335.1361-7-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
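Note (not part of the commit message): as a scalar reference for the "shift left or right depending on the sign" behaviour described above, the sketch below restates the 64-bit cases in plain C. It follows the `neon_shl_u64` and `neon_shl_s64` helpers that this patch removes from neon_helper.c, which read the low byte of the shift operand as a signed value. The function names `ushl64_ref` and `sshl64_ref` are made up for this note, and the code assumes the usual two's-complement conversions and arithmetic right shift of signed values, as on GCC and Clang.

```c
#include <stdint.h>

/*
 * Unsigned variant: a positive count shifts left, a negative count
 * shifts right, and a magnitude of 64 or more yields zero.
 */
static uint64_t ushl64_ref(uint64_t val, uint64_t shiftop)
{
    int8_t shift = (int8_t)shiftop;   /* only the low byte is used */

    if (shift >= 64 || shift <= -64) {
        return 0;
    } else if (shift < 0) {
        return val >> -shift;
    } else {
        return val << shift;
    }
}

/*
 * Signed variant: large right shifts collapse to the sign bit
 * replicated across the result instead of zero.
 */
static uint64_t sshl64_ref(uint64_t valop, uint64_t shiftop)
{
    int8_t shift = (int8_t)shiftop;   /* only the low byte is used */
    int64_t val = (int64_t)valop;

    if (shift >= 64) {
        return 0;
    } else if (shift <= -64) {
        return (uint64_t)(val >> 63); /* assumes arithmetic shift */
    } else if (shift < 0) {
        return (uint64_t)(val >> -shift);
    } else {
        return valop << shift;        /* unsigned shift avoids overflow */
    }
}
```

Each branch here corresponds to a mask or select in the vectorized form, which is why the commit message counts several operations beyond the two raw shifts.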
- target/arm/translate.h     |   1 -
+ target/arm/helper.h        |  11 +-
- target/arm/translate-a64.c |  51 +++++++++---------
+ target/arm/translate.h     |   6 +
- target/arm/translate.c     | 103 ++++++++++++++++++-------------------
+ target/arm/neon_helper.c   |  33 ----
-files changed, 72 insertions(+), 83 deletions(-)
+ target/arm/translate-a64.c |  18 +--
  target/arm/translate.c     | 299 +++++++++++++++++++++++++++++++++++--
  target/arm/vec_helper.c    |  88 +++++++++++
 files changed, 389 insertions(+), 66 deletions(-)
+diff --git a/target/arm/helper.h b/target/arm/helper.h
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/helper.h
++++ b/target/arm/helper.h
+@@ -XXX,XX +XXX,XX @@ DEF_HELPER_2(neon_abd_s16, i32, i32, i32)
+ DEF_HELPER_2(neon_abd_u32, i32, i32, i32)
+ DEF_HELPER_2(neon_abd_s32, i32, i32, i32)
+-DEF_HELPER_2(neon_shl_u8, i32, i32, i32)
+-DEF_HELPER_2(neon_shl_s8, i32, i32, i32)
+ DEF_HELPER_2(neon_shl_u16, i32, i32, i32)
+ DEF_HELPER_2(neon_shl_s16, i32, i32, i32)
+-DEF_HELPER_2(neon_shl_u32, i32, i32, i32)
+-DEF_HELPER_2(neon_shl_s32, i32, i32, i32)
+-DEF_HELPER_2(neon_shl_u64, i64, i64, i64)
+-DEF_HELPER_2(neon_shl_s64, i64, i64, i64)
+ DEF_HELPER_2(neon_rshl_u8, i32, i32, i32)
+ DEF_HELPER_2(neon_rshl_s8, i32, i32, i32)
+ DEF_HELPER_2(neon_rshl_u16, i32, i32, i32)
+@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_2(frint64_s, TCG_CALL_NO_RWG, f32, f32, ptr)
+ DEF_HELPER_FLAGS_2(frint32_d, TCG_CALL_NO_RWG, f64, f64, ptr)
+ DEF_HELPER_FLAGS_2(frint64_d, TCG_CALL_NO_RWG, f64, f64, ptr)
++DEF_HELPER_FLAGS_4(gvec_sshl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
++DEF_HELPER_FLAGS_4(gvec_sshl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
++DEF_HELPER_FLAGS_4(gvec_ushl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
++DEF_HELPER_FLAGS_4(gvec_ushl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
++
+ #ifdef TARGET_AARCH64
+ #include "helper-a64.h"
+ #include "helper-sve.h"
 diff --git a/target/arm/translate.h b/target/arm/translate.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.h
 +++ b/target/arm/translate.h
-@@ -XXX,XX +XXX,XX @@ typedef struct DisasContext {
+@@ -XXX,XX +XXX,XX @@ uint64_t vfp_expand_imm(int size, uint8_t imm8);
-     DisasContextBase base;
+ extern const GVecGen3 mla_op[4];
-     const ARMISARegisters *isar;
+ extern const GVecGen3 mls_op[4];
+ extern const GVecGen3 cmtst_op[4];
--    target_ulong pc;
++extern const GVecGen3 sshl_op[4];
-     /* The address of the current instruction being translated. */
++extern const GVecGen3 ushl_op[4];
-     target_ulong pc_curr;
+ extern const GVecGen2i ssra_op[4];
-     target_ulong page_start;
+ extern const GVecGen2i usra_op[4];
  extern const GVecGen2i sri_op[4];
@@ -XXX,XX +XXX,XX @@ extern const GVecGen4 sqadd_op[4];
  extern const GVecGen4 uqsub_op[4];
  extern const GVecGen4 sqsub_op[4];
  void gen_cmtst_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
 +void gen_ushl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b);
 +void gen_sshl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b);
 +void gen_ushl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
 +void gen_sshl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
  /*
   * Forward to the isar_feature_* tests given a DisasContext pointer.
 diff --git a/target/arm/neon_helper.c b/target/arm/neon_helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/neon_helper.c
 +++ b/target/arm/neon_helper.c
@@ -XXX,XX +XXX,XX @@ NEON_VOP(abd_u32, neon_u32, 1)
      } else { \
          dest = src1 << tmp; \
      }} while (0)
 -NEON_VOP(shl_u8, neon_u8, 4)
  NEON_VOP(shl_u16, neon_u16, 2)
 -NEON_VOP(shl_u32, neon_u32, 1)
  #undef NEON_FN
 -uint64_t HELPER(neon_shl_u64)(uint64_t val, uint64_t shiftop)
 -{
 -    int8_t shift = (int8_t)shiftop;
 -    if (shift >= 64 || shift <= -64) {
 -        val = 0;
 -    } else if (shift < 0) {
 -        val >>= -shift;
 -    } else {
 -        val <<= shift;
 -    }
 -    return val;
 -}
 -
  #define NEON_FN(dest, src1, src2) do { \
      int8_t tmp; \
      tmp = (int8_t)src2; \
@@ -XXX,XX +XXX,XX @@ uint64_t HELPER(neon_shl_u64)(uint64_t val, uint64_t shiftop)
      } else { \
          dest = src1 << tmp; \
      }} while (0)
 -NEON_VOP(shl_s8, neon_s8, 4)
  NEON_VOP(shl_s16, neon_s16, 2)
 -NEON_VOP(shl_s32, neon_s32, 1)
  #undef NEON_FN
 -uint64_t HELPER(neon_shl_s64)(uint64_t valop, uint64_t shiftop)
 -{
 -    int8_t shift = (int8_t)shiftop;
 -    int64_t val = valop;
 -    if (shift >= 64) {
 -        val = 0;
 -    } else if (shift <= -64) {
 -        val >>= 63;
 -    } else if (shift < 0) {
 -        val >>= -shift;
 -    } else {
 -        val <<= shift;
 -    }
 -    return val;
 -}
 -
  #define NEON_FN(dest, src1, src2) do { \
      int8_t tmp; \
      tmp = (int8_t)src2; \
 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-a64.c
 +++ b/target/arm/translate-a64.c
-@@ -XXX,XX +XXX,XX @@ static void gen_exception_internal(int excp)
+@@ -XXX,XX +XXX,XX @@ static void handle_3same_64(DisasContext *s, int opcode, bool u,
+         break;
- static void gen_exception_internal_insn(DisasContext *s, int offset, int excp)
+     case 0x8: /* SSHL, USHL */
- {
+         if (u) {
--    gen_a64_set_pc_im(s->pc - offset);
+-            gen_helper_neon_shl_u64(tcg_rd, tcg_rn, tcg_rm);
-+    gen_a64_set_pc_im(s->base.pc_next - offset);
++            gen_ushl_i64(tcg_rd, tcg_rn, tcg_rm);
-     gen_exception_internal(excp);
+         } else {
-     s->base.is_jmp = DISAS_NORETURN;
+-            gen_helper_neon_shl_s64(tcg_rd, tcg_rn, tcg_rm);
- }
++            gen_sshl_i64(tcg_rd, tcg_rn, tcg_rm);
@@ -XXX,XX +XXX,XX @@ static void gen_exception_internal_insn(DisasContext *s, int offset, int excp)
  static void gen_exception_insn(DisasContext *s, int offset, int excp,
                                 uint32_t syndrome, uint32_t target_el)
  {
 -    gen_a64_set_pc_im(s->pc - offset);
 +    gen_a64_set_pc_im(s->base.pc_next - offset);
      gen_exception(excp, syndrome, target_el);
      s->base.is_jmp = DISAS_NORETURN;
  }
@@ -XXX,XX +XXX,XX @@ static void gen_exception_bkpt_insn(DisasContext *s, int offset,
  {
      TCGv_i32 tcg_syn;
 -    gen_a64_set_pc_im(s->pc - offset);
 +    gen_a64_set_pc_im(s->base.pc_next - offset);
      tcg_syn = tcg_const_i32(syndrome);
      gen_helper_exception_bkpt_insn(cpu_env, tcg_syn);
      tcg_temp_free_i32(tcg_syn);
@@ -XXX,XX +XXX,XX @@ static void disas_uncond_b_imm(DisasContext *s, uint32_t insn)
      if (insn & (1U << 31)) {
          /* BL Branch with link */
 -        tcg_gen_movi_i64(cpu_reg(s, 30), s->pc);
 +        tcg_gen_movi_i64(cpu_reg(s, 30), s->base.pc_next);
      }
      /* B Branch / BL Branch with link */
@@ -XXX,XX +XXX,XX @@ static void disas_comp_b_imm(DisasContext *s, uint32_t insn)
      tcg_gen_brcondi_i64(op ? TCG_COND_NE : TCG_COND_EQ,
                          tcg_cmp, 0, label_match);
 -    gen_goto_tb(s, 0, s->pc);
 +    gen_goto_tb(s, 0, s->base.pc_next);
      gen_set_label(label_match);
      gen_goto_tb(s, 1, addr);
  }
@@ -XXX,XX +XXX,XX @@ static void disas_test_b_imm(DisasContext *s, uint32_t insn)
      tcg_gen_brcondi_i64(op ? TCG_COND_NE : TCG_COND_EQ,
                          tcg_cmp, 0, label_match);
      tcg_temp_free_i64(tcg_cmp);
 -    gen_goto_tb(s, 0, s->pc);
 +    gen_goto_tb(s, 0, s->base.pc_next);
      gen_set_label(label_match);
      gen_goto_tb(s, 1, addr);
  }
@@ -XXX,XX +XXX,XX @@ static void disas_cond_b_imm(DisasContext *s, uint32_t insn)
          /* genuinely conditional branches */
          TCGLabel *label_match = gen_new_label();
          arm_gen_test_cc(cond, label_match);
 -        gen_goto_tb(s, 0, s->pc);
 +        gen_goto_tb(s, 0, s->base.pc_next);
          gen_set_label(label_match);
          gen_goto_tb(s, 1, addr);
      } else {
@@ -XXX,XX +XXX,XX @@ static void handle_sync(DisasContext *s, uint32_t insn,
           * any pending interrupts immediately.
           */
          reset_btype(s);
 -        gen_goto_tb(s, 0, s->pc);
 +        gen_goto_tb(s, 0, s->base.pc_next);
          return;
      case 7: /* SB */
@@ -XXX,XX +XXX,XX @@ static void handle_sync(DisasContext *s, uint32_t insn,
           * MB and end the TB instead.
           */
          tcg_gen_mb(TCG_MO_ALL | TCG_BAR_SC);
 -        gen_goto_tb(s, 0, s->pc);
 +        gen_goto_tb(s, 0, s->base.pc_next);
          return;
      default:
@@ -XXX,XX +XXX,XX @@ static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
          gen_a64_set_pc(s, dst);
          /* BLR also needs to load return address */
          if (opc == 1) {
 -            tcg_gen_movi_i64(cpu_reg(s, 30), s->pc);
 +            tcg_gen_movi_i64(cpu_reg(s, 30), s->base.pc_next);
          }
          break;
+     case 0x9: /* SQSHL, UQSHL */
-@@ -XXX,XX +XXX,XX @@ static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
-         gen_a64_set_pc(s, dst);
+                        is_q ? 16 : 8, vec_full_reg_size(s),
-         /* BLRAA also needs to load return address */
+                        (u ? uqsub_op : sqsub_op) + size);
-         if (opc == 9) {
+         return;
--            tcg_gen_movi_i64(cpu_reg(s, 30), s->pc);
++    case 0x08: /* SSHL, USHL */
-+            tcg_gen_movi_i64(cpu_reg(s, 30), s->base.pc_next);
++        gen_gvec_op3(s, is_q, rd, rn, rm,
-         }
++                     u ? &ushl_op[size] : &sshl_op[size]);
-         break;
++        return;
+     case 0x0c: /* SMAX, UMAX */
-@@ -XXX,XX +XXX,XX @@ static void disas_a64_insn(CPUARMState *env, DisasContext *s)
+         if (u) {
- {
+             gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_umax, size);
-     uint32_t insn;
+@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
+                 genfn = fns[size][u];
--    s->pc_curr = s->pc;
+                 break;
--    insn = arm_ldl_code(env, s->pc, s->sctlr_b);
+             }
-+    s->pc_curr = s->base.pc_next;
+-            case 0x8: /* SSHL, USHL */
-+    insn = arm_ldl_code(env, s->base.pc_next, s->sctlr_b);
+-            {
-     s->insn = insn;
+-                static NeonGenTwoOpFn * const fns[3][2] = {
--    s->pc += 4;
+-                    { gen_helper_neon_shl_s8, gen_helper_neon_shl_u8 },
-+    s->base.pc_next += 4;
+-                    { gen_helper_neon_shl_s16, gen_helper_neon_shl_u16 },
+-                    { gen_helper_neon_shl_s32, gen_helper_neon_shl_u32 },
-     s->fp_access_checked = false;
+-                };
+-                genfn = fns[size][u];
-@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_init_disas_context(DisasContextBase *dcbase,
+-                break;
-     int bound, core_mmu_idx;
+-            }
+             case 0x9: /* SQSHL, UQSHL */
-     dc->isar = &arm_cpu->isar;
+             {
--    dc->pc = dc->base.pc_first;
+                 static NeonGenTwoOpEnvFn * const fns[3][2] = {
      dc->condjmp = 0;
      dc->aarch64 = 1;
@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
  {
      DisasContext *dc = container_of(dcbase, DisasContext, base);
 -    tcg_gen_insn_start(dc->pc, 0, 0);
 +    tcg_gen_insn_start(dc->base.pc_next, 0, 0);
      dc->insn_start = tcg_last_op();
  }
@@ -XXX,XX +XXX,XX @@ static bool aarch64_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
      DisasContext *dc = container_of(dcbase, DisasContext, base);
      if (bp->flags & BP_CPU) {
 -        gen_a64_set_pc_im(dc->pc);
 +        gen_a64_set_pc_im(dc->base.pc_next);
          gen_helper_check_breakpoints(cpu_env);
          /* End the TB early; it likely won't be executed */
          dc->base.is_jmp = DISAS_TOO_MANY;
@@ -XXX,XX +XXX,XX @@ static bool aarch64_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
             to for it to be properly cleared -- thus we
             increment the PC here so that the logic setting
             tb->size below does the right thing.  */
 -        dc->pc += 4;
 +        dc->base.pc_next += 4;
          dc->base.is_jmp = DISAS_NORETURN;
      }
@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
          disas_a64_insn(env, dc);
      }
 -    dc->base.pc_next = dc->pc;
      translator_loop_temp_check(&dc->base);
  }
@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_tb_stop(DisasContextBase *dcbase, CPUState *cpu)
           */
          switch (dc->base.is_jmp) {
          default:
 -            gen_a64_set_pc_im(dc->pc);
 +            gen_a64_set_pc_im(dc->base.pc_next);
              /* fall through */
          case DISAS_EXIT:
          case DISAS_JUMP:
@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_tb_stop(DisasContextBase *dcbase, CPUState *cpu)
          switch (dc->base.is_jmp) {
          case DISAS_NEXT:
          case DISAS_TOO_MANY:
 -            gen_goto_tb(dc, 1, dc->pc);
 +            gen_goto_tb(dc, 1, dc->base.pc_next);
              break;
          default:
          case DISAS_UPDATE:
 -            gen_a64_set_pc_im(dc->pc);
 +            gen_a64_set_pc_im(dc->base.pc_next);
              /* fall through */
          case DISAS_EXIT:
              tcg_gen_exit_tb(NULL, 0);
@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_tb_stop(DisasContextBase *dcbase, CPUState *cpu)
          case DISAS_SWI:
              break;
          case DISAS_WFE:
 -            gen_a64_set_pc_im(dc->pc);
 +            gen_a64_set_pc_im(dc->base.pc_next);
              gen_helper_wfe(cpu_env);
              break;
          case DISAS_YIELD:
 -            gen_a64_set_pc_im(dc->pc);
 +            gen_a64_set_pc_im(dc->base.pc_next);
              gen_helper_yield(cpu_env);
              break;
          case DISAS_WFI:
@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_tb_stop(DisasContextBase *dcbase, CPUState *cpu)
               */
              TCGv_i32 tmp = tcg_const_i32(4);
 -            gen_a64_set_pc_im(dc->pc);
 +            gen_a64_set_pc_im(dc->base.pc_next);
              gen_helper_wfi(cpu_env, tmp);
              tcg_temp_free_i32(tmp);
              /* The helper doesn't necessarily throw an exception, but we
@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_tb_stop(DisasContextBase *dcbase, CPUState *cpu)
          }
          }
      }
 -
 -    /* Functions above can change dc->pc, so re-align db->pc_next */
 -    dc->base.pc_next = dc->pc;
  }
  static void aarch64_tr_disas_log(const DisasContextBase *dcbase,
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static inline void gen_blxns(DisasContext *s, int rm)
+@@ -XXX,XX +XXX,XX @@ static inline void gen_neon_shift_narrow(int size, TCGv_i32 var, TCGv_i32 shift,
-      * We do however need to set the PC, because the blxns helper reads it.
+         if (u) {
-      * The blxns helper may throw an exception.
+             switch (size) {
-      */
+             case 1: gen_helper_neon_shl_u16(var, var, shift); break;
--    gen_set_pc_im(s, s->pc);
+-            case 2: gen_helper_neon_shl_u32(var, var, shift); break;
-+    gen_set_pc_im(s, s->base.pc_next);
++            case 2: gen_ushl_i32(var, var, shift); break;
-     gen_helper_v7m_blxns(cpu_env, var);
+             default: abort();
-     tcg_temp_free_i32(var);
+             }
-     s->base.is_jmp = DISAS_EXIT;
+         } else {
-@@ -XXX,XX +XXX,XX @@ static inline void gen_hvc(DisasContext *s, int imm16)
+             switch (size) {
-      * for single stepping.)
+             case 1: gen_helper_neon_shl_s16(var, var, shift); break;
-      */
+-            case 2: gen_helper_neon_shl_s32(var, var, shift); break;
-     s->svc_imm = imm16;
++            case 2: gen_sshl_i32(var, var, shift); break;
--    gen_set_pc_im(s, s->pc);
+             default: abort();
-+    gen_set_pc_im(s, s->base.pc_next);
+             }
-     s->base.is_jmp = DISAS_HVC;
+         }
@@ -XXX,XX +XXX,XX @@ const GVecGen3 cmtst_op[4] = {
        .vece = MO_64 },
  };
 +void gen_ushl_i32(TCGv_i32 dst, TCGv_i32 src, TCGv_i32 shift)
 +{
 +    TCGv_i32 lval = tcg_temp_new_i32();
 +    TCGv_i32 rval = tcg_temp_new_i32();
 +    TCGv_i32 lsh = tcg_temp_new_i32();
 +    TCGv_i32 rsh = tcg_temp_new_i32();
 +    TCGv_i32 zero = tcg_const_i32(0);
 +    TCGv_i32 max = tcg_const_i32(32);
 +
 +    /*
 +     * Rely on the TCG guarantee that out of range shifts produce
 +     * unspecified results, not undefined behaviour (i.e. no trap).
 +     * Discard out-of-range results after the fact.
 +     */
 +    tcg_gen_ext8s_i32(lsh, shift);
 +    tcg_gen_neg_i32(rsh, lsh);
 +    tcg_gen_shl_i32(lval, src, lsh);
 +    tcg_gen_shr_i32(rval, src, rsh);
 +    tcg_gen_movcond_i32(TCG_COND_LTU, dst, lsh, max, lval, zero);
 +    tcg_gen_movcond_i32(TCG_COND_LTU, dst, rsh, max, rval, dst);
 +
 +    tcg_temp_free_i32(lval);
 +    tcg_temp_free_i32(rval);
 +    tcg_temp_free_i32(lsh);
 +    tcg_temp_free_i32(rsh);
 +    tcg_temp_free_i32(zero);
 +    tcg_temp_free_i32(max);
 +}
 +
 +void gen_ushl_i64(TCGv_i64 dst, TCGv_i64 src, TCGv_i64 shift)
 +{
 +    TCGv_i64 lval = tcg_temp_new_i64();
 +    TCGv_i64 rval = tcg_temp_new_i64();
 +    TCGv_i64 lsh = tcg_temp_new_i64();
 +    TCGv_i64 rsh = tcg_temp_new_i64();
 +    TCGv_i64 zero = tcg_const_i64(0);
 +    TCGv_i64 max = tcg_const_i64(64);
 +
 +    /*
 +     * Rely on the TCG guarantee that out of range shifts produce
 +     * unspecified results, not undefined behaviour (i.e. no trap).
 +     * Discard out-of-range results after the fact.
 +     */
 +    tcg_gen_ext8s_i64(lsh, shift);
 +    tcg_gen_neg_i64(rsh, lsh);
 +    tcg_gen_shl_i64(lval, src, lsh);
 +    tcg_gen_shr_i64(rval, src, rsh);
 +    tcg_gen_movcond_i64(TCG_COND_LTU, dst, lsh, max, lval, zero);
 +    tcg_gen_movcond_i64(TCG_COND_LTU, dst, rsh, max, rval, dst);
 +
 +    tcg_temp_free_i64(lval);
 +    tcg_temp_free_i64(rval);
 +    tcg_temp_free_i64(lsh);
 +    tcg_temp_free_i64(rsh);
 +    tcg_temp_free_i64(zero);
 +    tcg_temp_free_i64(max);
 +}
 +
 +static void gen_ushl_vec(unsigned vece, TCGv_vec dst,
 +                         TCGv_vec src, TCGv_vec shift)
 +{
 +    TCGv_vec lval = tcg_temp_new_vec_matching(dst);
 +    TCGv_vec rval = tcg_temp_new_vec_matching(dst);
 +    TCGv_vec lsh = tcg_temp_new_vec_matching(dst);
 +    TCGv_vec rsh = tcg_temp_new_vec_matching(dst);
 +    TCGv_vec msk, max;
 +
 +    tcg_gen_neg_vec(vece, rsh, shift);
 +    if (vece == MO_8) {
 +        tcg_gen_mov_vec(lsh, shift);
 +    } else {
 +        msk = tcg_temp_new_vec_matching(dst);
 +        tcg_gen_dupi_vec(vece, msk, 0xff);
 +        tcg_gen_and_vec(vece, lsh, shift, msk);
 +        tcg_gen_and_vec(vece, rsh, rsh, msk);
 +        tcg_temp_free_vec(msk);
 +    }
 +
 +    /*
 +     * Rely on the TCG guarantee that out of range shifts produce
 +     * unspecified results, not undefined behaviour (i.e. no trap).
 +     * Discard out-of-range results after the fact.
 +     */
 +    tcg_gen_shlv_vec(vece, lval, src, lsh);
 +    tcg_gen_shrv_vec(vece, rval, src, rsh);
 +
 +    max = tcg_temp_new_vec_matching(dst);
 +    tcg_gen_dupi_vec(vece, max, 8 << vece);
 +
 +    /*
 +     * The choice of LT (signed) and GEU (unsigned) are biased toward
 +     * the instructions of the x86_64 host.  For MO_8, the whole byte
 +     * is significant so we must use an unsigned compare; otherwise we
 +     * have already masked to a byte and so a signed compare works.
 +     * Other tcg hosts have a full set of comparisons and do not care.
 +     */
 +    if (vece == MO_8) {
 +        tcg_gen_cmp_vec(TCG_COND_GEU, vece, lsh, lsh, max);
 +        tcg_gen_cmp_vec(TCG_COND_GEU, vece, rsh, rsh, max);
 +        tcg_gen_andc_vec(vece, lval, lval, lsh);
 +        tcg_gen_andc_vec(vece, rval, rval, rsh);
 +    } else {
 +        tcg_gen_cmp_vec(TCG_COND_LT, vece, lsh, lsh, max);
 +        tcg_gen_cmp_vec(TCG_COND_LT, vece, rsh, rsh, max);
 +        tcg_gen_and_vec(vece, lval, lval, lsh);
 +        tcg_gen_and_vec(vece, rval, rval, rsh);
 +    }
 +    tcg_gen_or_vec(vece, dst, lval, rval);
 +
 +    tcg_temp_free_vec(max);
 +    tcg_temp_free_vec(lval);
 +    tcg_temp_free_vec(rval);
 +    tcg_temp_free_vec(lsh);
 +    tcg_temp_free_vec(rsh);
 +}
 +
 +static const TCGOpcode ushl_list[] = {
 +    INDEX_op_neg_vec, INDEX_op_shlv_vec,
 +    INDEX_op_shrv_vec, INDEX_op_cmp_vec, 0
 +};
 +
 +const GVecGen3 ushl_op[4] = {
 +    { .fniv = gen_ushl_vec,
 +      .fno = gen_helper_gvec_ushl_b,
 +      .opt_opc = ushl_list,
 +      .vece = MO_8 },
 +    { .fniv = gen_ushl_vec,
 +      .fno = gen_helper_gvec_ushl_h,
 +      .opt_opc = ushl_list,
 +      .vece = MO_16 },
 +    { .fni4 = gen_ushl_i32,
 +      .fniv = gen_ushl_vec,
 +      .opt_opc = ushl_list,
 +      .vece = MO_32 },
 +    { .fni8 = gen_ushl_i64,
 +      .fniv = gen_ushl_vec,
 +      .opt_opc = ushl_list,
 +      .vece = MO_64 },
 +};
 +
 +void gen_sshl_i32(TCGv_i32 dst, TCGv_i32 src, TCGv_i32 shift)
 +{
 +    TCGv_i32 lval = tcg_temp_new_i32();
 +    TCGv_i32 rval = tcg_temp_new_i32();
 +    TCGv_i32 lsh = tcg_temp_new_i32();
 +    TCGv_i32 rsh = tcg_temp_new_i32();
 +    TCGv_i32 zero = tcg_const_i32(0);
 +    TCGv_i32 max = tcg_const_i32(31);
 +
 +    /*
 +     * Rely on the TCG guarantee that out of range shifts produce
 +     * unspecified results, not undefined behaviour (i.e. no trap).
 +     * Discard out-of-range results after the fact.
 +     */
 +    tcg_gen_ext8s_i32(lsh, shift);
 +    tcg_gen_neg_i32(rsh, lsh);
 +    tcg_gen_shl_i32(lval, src, lsh);
 +    tcg_gen_umin_i32(rsh, rsh, max);
 +    tcg_gen_sar_i32(rval, src, rsh);
 +    tcg_gen_movcond_i32(TCG_COND_LEU, lval, lsh, max, lval, zero);
 +    tcg_gen_movcond_i32(TCG_COND_LT, dst, lsh, zero, rval, lval);
 +
 +    tcg_temp_free_i32(lval);
 +    tcg_temp_free_i32(rval);
 +    tcg_temp_free_i32(lsh);
 +    tcg_temp_free_i32(rsh);
 +    tcg_temp_free_i32(zero);
 +    tcg_temp_free_i32(max);
 +}
 +
 +void gen_sshl_i64(TCGv_i64 dst, TCGv_i64 src, TCGv_i64 shift)
 +{
 +    TCGv_i64 lval = tcg_temp_new_i64();
 +    TCGv_i64 rval = tcg_temp_new_i64();
 +    TCGv_i64 lsh = tcg_temp_new_i64();
 +    TCGv_i64 rsh = tcg_temp_new_i64();
 +    TCGv_i64 zero = tcg_const_i64(0);
 +    TCGv_i64 max = tcg_const_i64(63);
 +
 +    /*
 +     * Rely on the TCG guarantee that out of range shifts produce
 +     * unspecified results, not undefined behaviour (i.e. no trap).
 +     * Discard out-of-range results after the fact.
 +     */
 +    tcg_gen_ext8s_i64(lsh, shift);
 +    tcg_gen_neg_i64(rsh, lsh);
 +    tcg_gen_shl_i64(lval, src, lsh);
 +    tcg_gen_umin_i64(rsh, rsh, max);
 +    tcg_gen_sar_i64(rval, src, rsh);
 +    tcg_gen_movcond_i64(TCG_COND_LEU, lval, lsh, max, lval, zero);
 +    tcg_gen_movcond_i64(TCG_COND_LT, dst, lsh, zero, rval, lval);
 +
 +    tcg_temp_free_i64(lval);
 +    tcg_temp_free_i64(rval);
 +    tcg_temp_free_i64(lsh);
 +    tcg_temp_free_i64(rsh);
 +    tcg_temp_free_i64(zero);
 +    tcg_temp_free_i64(max);
 +}
 +
 +static void gen_sshl_vec(unsigned vece, TCGv_vec dst,
 +                         TCGv_vec src, TCGv_vec shift)
 +{
 +    TCGv_vec lval = tcg_temp_new_vec_matching(dst);
 +    TCGv_vec rval = tcg_temp_new_vec_matching(dst);
 +    TCGv_vec lsh = tcg_temp_new_vec_matching(dst);
 +    TCGv_vec rsh = tcg_temp_new_vec_matching(dst);
 +    TCGv_vec tmp = tcg_temp_new_vec_matching(dst);
 +
 +    /*
 +     * Rely on the TCG guarantee that out of range shifts produce
 +     * unspecified results, not undefined behaviour (i.e. no trap).
 +     * Discard out-of-range results after the fact.
 +     */
 +    tcg_gen_neg_vec(vece, rsh, shift);
 +    if (vece == MO_8) {
 +        tcg_gen_mov_vec(lsh, shift);
 +    } else {
 +        tcg_gen_dupi_vec(vece, tmp, 0xff);
 +        tcg_gen_and_vec(vece, lsh, shift, tmp);
 +        tcg_gen_and_vec(vece, rsh, rsh, tmp);
 +    }
 +
 +    /* Bound rsh so out of bound right shift gets -1.  */
 +    tcg_gen_dupi_vec(vece, tmp, (8 << vece) - 1);
 +    tcg_gen_umin_vec(vece, rsh, rsh, tmp);
 +    tcg_gen_cmp_vec(TCG_COND_GT, vece, tmp, lsh, tmp);
 +
 +    tcg_gen_shlv_vec(vece, lval, src, lsh);
 +    tcg_gen_sarv_vec(vece, rval, src, rsh);
 +
 +    /* Select in-bound left shift.  */
 +    tcg_gen_andc_vec(vece, lval, lval, tmp);
 +
 +    /* Select between left and right shift.  */
 +    if (vece == MO_8) {
 +        tcg_gen_dupi_vec(vece, tmp, 0);
 +        tcg_gen_cmpsel_vec(TCG_COND_LT, vece, dst, lsh, tmp, rval, lval);
 +    } else {
 +        tcg_gen_dupi_vec(vece, tmp, 0x80);
 +        tcg_gen_cmpsel_vec(TCG_COND_LT, vece, dst, lsh, tmp, lval, rval);
 +    }
 +
 +    tcg_temp_free_vec(lval);
 +    tcg_temp_free_vec(rval);
 +    tcg_temp_free_vec(lsh);
 +    tcg_temp_free_vec(rsh);
 +    tcg_temp_free_vec(tmp);
 +}
 +
 +static const TCGOpcode sshl_list[] = {
 +    INDEX_op_neg_vec, INDEX_op_umin_vec, INDEX_op_shlv_vec,
 +    INDEX_op_sarv_vec, INDEX_op_cmp_vec, INDEX_op_cmpsel_vec, 0
 +};
 +
 +const GVecGen3 sshl_op[4] = {
 +    { .fniv = gen_sshl_vec,
 +      .fno = gen_helper_gvec_sshl_b,
 +      .opt_opc = sshl_list,
 +      .vece = MO_8 },
 +    { .fniv = gen_sshl_vec,
 +      .fno = gen_helper_gvec_sshl_h,
 +      .opt_opc = sshl_list,
 +      .vece = MO_16 },
 +    { .fni4 = gen_sshl_i32,
 +      .fniv = gen_sshl_vec,
 +      .opt_opc = sshl_list,
 +      .vece = MO_32 },
 +    { .fni8 = gen_sshl_i64,
 +      .fniv = gen_sshl_vec,
 +      .opt_opc = sshl_list,
 +      .vece = MO_64 },
 +};
 +
  static void gen_uqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
                            TCGv_vec a, TCGv_vec b)
  {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                                    vec_size, vec_size);
              }
              return 0;
 +
 +        case NEON_3R_VSHL:
 +            /* Note the operation is vshl vd,vm,vn */
 +            tcg_gen_gvec_3(rd_ofs, rm_ofs, rn_ofs, vec_size, vec_size,
 +                           u ? &ushl_op[size] : &sshl_op[size]);
 +            return 0;
          }
          if (size == 3) {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                  neon_load_reg64(cpu_V0, rn + pass);
                  neon_load_reg64(cpu_V1, rm + pass);
                  switch (op) {
 -                case NEON_3R_VSHL:
 -                    if (u) {
 -                        gen_helper_neon_shl_u64(cpu_V0, cpu_V1, cpu_V0);
 -                    } else {
 -                        gen_helper_neon_shl_s64(cpu_V0, cpu_V1, cpu_V0);
 -                    }
 -                    break;
                  case NEON_3R_VQSHL:
                      if (u) {
                          gen_helper_neon_qshl_u64(cpu_V0, cpu_env,
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          }
          pairwise = 0;
          switch (op) {
 -        case NEON_3R_VSHL:
          case NEON_3R_VQSHL:
          case NEON_3R_VRSHL:
          case NEON_3R_VQRSHL:
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          case NEON_3R_VHSUB:
              GEN_NEON_INTEGER_OP(hsub);
              break;
 -        case NEON_3R_VSHL:
 -            GEN_NEON_INTEGER_OP(shl);
 -            break;
          case NEON_3R_VQSHL:
              GEN_NEON_INTEGER_OP_ENV(qshl);
              break;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                              }
                          } else {
                              if (input_unsigned) {
 -                                gen_helper_neon_shl_u64(cpu_V0, in, tmp64);
 +                                gen_ushl_i64(cpu_V0, in, tmp64);
                              } else {
 -                                gen_helper_neon_shl_s64(cpu_V0, in, tmp64);
 +                                gen_sshl_i64(cpu_V0, in, tmp64);
                              }
                          }
                          tmp = tcg_temp_new_i32();
 diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/vec_helper.c
 +++ b/target/arm/vec_helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_fmlal_idx_a64)(void *vd, void *vn, void *vm,
      do_fmlal_idx(vd, vn, vm, &env->vfp.fp_status, desc,
                   get_flush_inputs_to_zero(&env->vfp.fp_status_f16));
  }
++
-@@ -XXX,XX +XXX,XX @@ static inline void gen_smc(DisasContext *s)
++void HELPER(gvec_sshl_b)(void *vd, void *vn, void *vm, uint32_t desc)
-     tmp = tcg_const_i32(syn_aa32_smc());
++{
-     gen_helper_pre_smc(cpu_env, tmp);
++    intptr_t i, opr_sz = simd_oprsz(desc);
-     tcg_temp_free_i32(tmp);
++    int8_t *d = vd, *n = vn, *m = vm;
--    gen_set_pc_im(s, s->pc);
++
-+    gen_set_pc_im(s, s->base.pc_next);
++    for (i = 0; i < opr_sz; ++i) {
-     s->base.is_jmp = DISAS_SMC;
++        int8_t mm = m[i];
- }
++        int8_t nn = n[i];
++        int8_t res = 0;
- static void gen_exception_internal_insn(DisasContext *s, int offset, int excp)
++        if (mm >= 0) {
- {
++            if (mm < 8) {
-     gen_set_condexec(s);
++                res = nn << mm;
--    gen_set_pc_im(s, s->pc - offset);
++            }
-+    gen_set_pc_im(s, s->base.pc_next - offset);
++        } else {
-     gen_exception_internal(excp);
++            res = nn >> (mm > -8 ? -mm : 7);
-     s->base.is_jmp = DISAS_NORETURN;
++        }
- }
++        d[i] = res;
-@@ -XXX,XX +XXX,XX @@ static void gen_exception_insn(DisasContext *s, int offset, int excp,
++    }
-                                int syn, uint32_t target_el)
++    clear_tail(d, opr_sz, simd_maxsz(desc));
- {
++}
-     gen_set_condexec(s);
++
--    gen_set_pc_im(s, s->pc - offset);
++void HELPER(gvec_sshl_h)(void *vd, void *vn, void *vm, uint32_t desc)
-+    gen_set_pc_im(s, s->base.pc_next - offset);
++{
-     gen_exception(excp, syn, target_el);
++    intptr_t i, opr_sz = simd_oprsz(desc);
-     s->base.is_jmp = DISAS_NORETURN;
++    int16_t *d = vd, *n = vn, *m = vm;
- }
++
-@@ -XXX,XX +XXX,XX @@ static void gen_exception_bkpt_insn(DisasContext *s, int offset, uint32_t syn)
++    for (i = 0; i < opr_sz / 2; ++i) {
-     TCGv_i32 tcg_syn;
++        int8_t mm = m[i];   /* only 8 bits of shift are significant */
++        int16_t nn = n[i];
-     gen_set_condexec(s);
++        int16_t res = 0;
--    gen_set_pc_im(s, s->pc - offset);
++        if (mm >= 0) {
-+    gen_set_pc_im(s, s->base.pc_next - offset);
++            if (mm < 16) {
-     tcg_syn = tcg_const_i32(syn);
++                res = nn << mm;
-     gen_helper_exception_bkpt_insn(cpu_env, tcg_syn);
++            }
-     tcg_temp_free_i32(tcg_syn);
++        } else {
-@@ -XXX,XX +XXX,XX @@ static void gen_exception_bkpt_insn(DisasContext *s, int offset, uint32_t syn)
++            res = nn >> (mm > -16 ? -mm : 15);
- /* Force a TB lookup after an instruction that changes the CPU state.  */
++        }
- static inline void gen_lookup_tb(DisasContext *s)
++        d[i] = res;
- {
++    }
--    tcg_gen_movi_i32(cpu_R[15], s->pc);
++    clear_tail(d, opr_sz, simd_maxsz(desc));
-+    tcg_gen_movi_i32(cpu_R[15], s->base.pc_next);
++}
-     s->base.is_jmp = DISAS_EXIT;
++
- }
++void HELPER(gvec_ushl_b)(void *vd, void *vn, void *vm, uint32_t desc)
++{
-@@ -XXX,XX +XXX,XX @@ static inline bool use_goto_tb(DisasContext *s, target_ulong dest)
++    intptr_t i, opr_sz = simd_oprsz(desc);
- {
++    uint8_t *d = vd, *n = vn, *m = vm;
- #ifndef CONFIG_USER_ONLY
++
-     return (s->base.tb->pc & TARGET_PAGE_MASK) == (dest & TARGET_PAGE_MASK) ||
++    for (i = 0; i < opr_sz; ++i) {
--           ((s->pc - 1) & TARGET_PAGE_MASK) == (dest & TARGET_PAGE_MASK);
++        int8_t mm = m[i];
-+           ((s->base.pc_next - 1) & TARGET_PAGE_MASK) == (dest & TARGET_PAGE_MASK);
++        uint8_t nn = n[i];
- #else
++        uint8_t res = 0;
-     return true;
++        if (mm >= 0) {
- #endif
++            if (mm < 8) {
-@@ -XXX,XX +XXX,XX @@ static void gen_nop_hint(DisasContext *s, int val)
++                res = nn << mm;
-          */
++            }
-     case 1: /* yield */
++        } else {
-         if (!(tb_cflags(s->base.tb) & CF_PARALLEL)) {
++            if (mm > -8) {
--            gen_set_pc_im(s, s->pc);
++                res = nn >> -mm;
-+            gen_set_pc_im(s, s->base.pc_next);
++            }
-             s->base.is_jmp = DISAS_YIELD;
++        }
-         }
++        d[i] = res;
-         break;
++    }
-     case 3: /* wfi */
++    clear_tail(d, opr_sz, simd_maxsz(desc));
--        gen_set_pc_im(s, s->pc);
++}
-+        gen_set_pc_im(s, s->base.pc_next);
++
-         s->base.is_jmp = DISAS_WFI;
++void HELPER(gvec_ushl_h)(void *vd, void *vn, void *vm, uint32_t desc)
-         break;
++{
-     case 2: /* wfe */
++    intptr_t i, opr_sz = simd_oprsz(desc);
-         if (!(tb_cflags(s->base.tb) & CF_PARALLEL)) {
++    uint16_t *d = vd, *n = vn, *m = vm;
--            gen_set_pc_im(s, s->pc);
++
-+            gen_set_pc_im(s, s->base.pc_next);
++    for (i = 0; i < opr_sz / 2; ++i) {
-             s->base.is_jmp = DISAS_WFE;
++        int8_t mm = m[i];   /* only 8 bits of shift are significant */
-         }
++        uint16_t nn = n[i];
-         break;
++        uint16_t res = 0;
-@@ -XXX,XX +XXX,XX @@ static int disas_coproc_insn(DisasContext *s, uint32_t insn)
++        if (mm >= 0) {
-             if (isread) {
++            if (mm < 16) {
-                 return 1;
++                res = nn << mm;
-             }
++            }
--            gen_set_pc_im(s, s->pc);
++        } else {
-+            gen_set_pc_im(s, s->base.pc_next);
++            if (mm > -16) {
-             s->base.is_jmp = DISAS_WFI;
++                res = nn >> -mm;
-             return 0;
++            }
-         default:
++        }
-@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
++        d[i] = res;
-                  * self-modifying code correctly and also to take
++    }
-                  * any pending interrupts immediately.
++    clear_tail(d, opr_sz, simd_maxsz(desc));
-                  */
++}
 -                gen_goto_tb(s, 0, s->pc);
 +                gen_goto_tb(s, 0, s->base.pc_next);
                  return;
              case 7: /* sb */
                  if ((insn & 0xf) || !dc_isar_feature(aa32_sb, s)) {
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
                   * for TCG; MB and end the TB instead.
                   */
                  tcg_gen_mb(TCG_MO_ALL | TCG_BAR_SC);
 -                gen_goto_tb(s, 0, s->pc);
 +                gen_goto_tb(s, 0, s->base.pc_next);
                  return;
              default:
                  goto illegal_op;
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
              int32_t offset;
              tmp = tcg_temp_new_i32();
 -            tcg_gen_movi_i32(tmp, s->pc);
 +            tcg_gen_movi_i32(tmp, s->base.pc_next);
              store_reg(s, 14, tmp);
              /* Sign-extend the 24-bit offset */
              offset = (((int32_t)insn) << 8) >> 8;
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
              /* branch link/exchange thumb (blx) */
              tmp = load_reg(s, rm);
              tmp2 = tcg_temp_new_i32();
 -            tcg_gen_movi_i32(tmp2, s->pc);
 +            tcg_gen_movi_i32(tmp2, s->base.pc_next);
              store_reg(s, 14, tmp2);
              gen_bx(s, tmp);
              break;
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
                  /* branch (and link) */
                  if (insn & (1 << 24)) {
                      tmp = tcg_temp_new_i32();
 -                    tcg_gen_movi_i32(tmp, s->pc);
 +                    tcg_gen_movi_i32(tmp, s->base.pc_next);
                      store_reg(s, 14, tmp);
                  }
                  offset = sextract32(insn << 2, 0, 26);
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
              break;
          case 0xf:
              /* swi */
 -            gen_set_pc_im(s, s->pc);
 +            gen_set_pc_im(s, s->base.pc_next);
              s->svc_imm = extract32(insn, 0, 24);
              s->base.is_jmp = DISAS_SWI;
              break;
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
                  if (insn & (1 << 14)) {
                      /* Branch and link.  */
 -                    tcg_gen_movi_i32(cpu_R[14], s->pc | 1);
 +                    tcg_gen_movi_i32(cpu_R[14], s->base.pc_next | 1);
                  }
                  offset += read_pc(s);
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
                               * and also to take any pending interrupts
                               * immediately.
                               */
 -                            gen_goto_tb(s, 0, s->pc);
 +                            gen_goto_tb(s, 0, s->base.pc_next);
                              break;
                          case 7: /* sb */
                              if ((insn & 0xf) || !dc_isar_feature(aa32_sb, s)) {
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
                               * for TCG; MB and end the TB instead.
                               */
                              tcg_gen_mb(TCG_MO_ALL | TCG_BAR_SC);
 -                            gen_goto_tb(s, 0, s->pc);
 +                            gen_goto_tb(s, 0, s->base.pc_next);
                              break;
                          default:
                              goto illegal_op;
@@ -XXX,XX +XXX,XX @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
                  /* BLX/BX */
                  tmp = load_reg(s, rm);
                  if (link) {
 -                    val = (uint32_t)s->pc | 1;
 +                    val = (uint32_t)s->base.pc_next | 1;
                      tmp2 = tcg_temp_new_i32();
                      tcg_gen_movi_i32(tmp2, val);
                      store_reg(s, 14, tmp2);
@@ -XXX,XX +XXX,XX @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
          if (cond == 0xf) {
              /* swi */
 -            gen_set_pc_im(s, s->pc);
 +            gen_set_pc_im(s, s->base.pc_next);
              s->svc_imm = extract32(insn, 0, 8);
              s->base.is_jmp = DISAS_SWI;
              break;
@@ -XXX,XX +XXX,XX @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
              tcg_gen_andi_i32(tmp, tmp, 0xfffffffc);
              tmp2 = tcg_temp_new_i32();
 -            tcg_gen_movi_i32(tmp2, s->pc | 1);
 +            tcg_gen_movi_i32(tmp2, s->base.pc_next | 1);
              store_reg(s, 14, tmp2);
              gen_bx(s, tmp);
              break;
@@ -XXX,XX +XXX,XX @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
              tcg_gen_addi_i32(tmp, tmp, offset);
              tmp2 = tcg_temp_new_i32();
 -            tcg_gen_movi_i32(tmp2, s->pc | 1);
 +            tcg_gen_movi_i32(tmp2, s->base.pc_next | 1);
              store_reg(s, 14, tmp2);
              gen_bx(s, tmp);
          } else {
@@ -XXX,XX +XXX,XX @@ undef:
  static bool insn_crosses_page(CPUARMState *env, DisasContext *s)
  {
 -    /* Return true if the insn at dc->pc might cross a page boundary.
 +    /* Return true if the insn at dc->base.pc_next might cross a page boundary.
       * (False positives are OK, false negatives are not.)
       * We know this is a Thumb insn, and our caller ensures we are
 -     * only called if dc->pc is less than 4 bytes from the page
 +     * only called if dc->base.pc_next is less than 4 bytes from the page
       * boundary, so we cross the page if the first 16 bits indicate
       * that this is a 32 bit insn.
       */
 -    uint16_t insn = arm_lduw_code(env, s->pc, s->sctlr_b);
 +    uint16_t insn = arm_lduw_code(env, s->base.pc_next, s->sctlr_b);
 -    return !thumb_insn_is_16bit(s, s->pc, insn);
 +    return !thumb_insn_is_16bit(s, s->base.pc_next, insn);
  }
  static void arm_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
@@ -XXX,XX +XXX,XX @@ static void arm_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
      uint32_t condexec, core_mmu_idx;
      dc->isar = &cpu->isar;
 -    dc->pc = dc->base.pc_first;
      dc->condjmp = 0;
      dc->aarch64 = 0;
@@ -XXX,XX +XXX,XX @@ static void arm_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
  {
      DisasContext *dc = container_of(dcbase, DisasContext, base);
 -    tcg_gen_insn_start(dc->pc,
 +    tcg_gen_insn_start(dc->base.pc_next,
                         (dc->condexec_cond << 4) | (dc->condexec_mask >> 1),
                        0);
      dc->insn_start = tcg_last_op();
@@ -XXX,XX +XXX,XX @@ static bool arm_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
      if (bp->flags & BP_CPU) {
          gen_set_condexec(dc);
 -        gen_set_pc_im(dc, dc->pc);
 +        gen_set_pc_im(dc, dc->base.pc_next);
          gen_helper_check_breakpoints(cpu_env);
          /* End the TB early; it's likely not going to be executed */
          dc->base.is_jmp = DISAS_TOO_MANY;
@@ -XXX,XX +XXX,XX @@ static bool arm_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
             tb->size below does the right thing.  */
          /* TODO: Advance PC by correct instruction length to
           * avoid disassembler error messages */
 -        dc->pc += 2;
 +        dc->base.pc_next += 2;
          dc->base.is_jmp = DISAS_NORETURN;
      }
@@ -XXX,XX +XXX,XX @@ static bool arm_pre_translate_insn(DisasContext *dc)
  {
  #ifdef CONFIG_USER_ONLY
      /* Intercept jump to the magic kernel page.  */
 -    if (dc->pc >= 0xffff0000) {
 +    if (dc->base.pc_next >= 0xffff0000) {
          /* We always get here via a jump, so know we are not in a
             conditional execution block.  */
          gen_exception_internal(EXCP_KERNEL_TRAP);
@@ -XXX,XX +XXX,XX @@ static void arm_post_translate_insn(DisasContext *dc)
          gen_set_label(dc->condlabel);
          dc->condjmp = 0;
      }
 -    dc->base.pc_next = dc->pc;
      translator_loop_temp_check(&dc->base);
  }
@@ -XXX,XX +XXX,XX @@ static void arm_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
          return;
      }
 -    dc->pc_curr = dc->pc;
 -    insn = arm_ldl_code(env, dc->pc, dc->sctlr_b);
 +    dc->pc_curr = dc->base.pc_next;
 +    insn = arm_ldl_code(env, dc->base.pc_next, dc->sctlr_b);
      dc->insn = insn;
 -    dc->pc += 4;
 +    dc->base.pc_next += 4;
      disas_arm_insn(dc, insn);
      arm_post_translate_insn(dc);
@@ -XXX,XX +XXX,XX @@ static void thumb_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
          return;
      }
 -    dc->pc_curr = dc->pc;
 -    insn = arm_lduw_code(env, dc->pc, dc->sctlr_b);
 -    is_16bit = thumb_insn_is_16bit(dc, dc->pc, insn);
 -    dc->pc += 2;
 +    dc->pc_curr = dc->base.pc_next;
 +    insn = arm_lduw_code(env, dc->base.pc_next, dc->sctlr_b);
 +    is_16bit = thumb_insn_is_16bit(dc, dc->base.pc_next, insn);
 +    dc->base.pc_next += 2;
      if (!is_16bit) {
 -        uint32_t insn2 = arm_lduw_code(env, dc->pc, dc->sctlr_b);
 +        uint32_t insn2 = arm_lduw_code(env, dc->base.pc_next, dc->sctlr_b);
          insn = insn << 16 | insn2;
 -        dc->pc += 2;
 +        dc->base.pc_next += 2;
      }
      dc->insn = insn;
@@ -XXX,XX +XXX,XX @@ static void thumb_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
       * but isn't very efficient).
       */
      if (dc->base.is_jmp == DISAS_NEXT
 -        && (dc->pc - dc->page_start >= TARGET_PAGE_SIZE
 -            || (dc->pc - dc->page_start >= TARGET_PAGE_SIZE - 3
 +        && (dc->base.pc_next - dc->page_start >= TARGET_PAGE_SIZE
 +            || (dc->base.pc_next - dc->page_start >= TARGET_PAGE_SIZE - 3
                  && insn_crosses_page(env, dc)))) {
          dc->base.is_jmp = DISAS_TOO_MANY;
      }
@@ -XXX,XX +XXX,XX @@ static void arm_tr_tb_stop(DisasContextBase *dcbase, CPUState *cpu)
          case DISAS_NEXT:
          case DISAS_TOO_MANY:
          case DISAS_UPDATE:
 -            gen_set_pc_im(dc, dc->pc);
 +            gen_set_pc_im(dc, dc->base.pc_next);
              /* fall through */
          default:
              /* FIXME: Single stepping a WFI insn will not halt the CPU. */
@@ -XXX,XX +XXX,XX @@ static void arm_tr_tb_stop(DisasContextBase *dcbase, CPUState *cpu)
          switch(dc->base.is_jmp) {
          case DISAS_NEXT:
          case DISAS_TOO_MANY:
 -            gen_goto_tb(dc, 1, dc->pc);
 +            gen_goto_tb(dc, 1, dc->base.pc_next);
              break;
          case DISAS_JUMP:
              gen_goto_ptr();
              break;
          case DISAS_UPDATE:
 -            gen_set_pc_im(dc, dc->pc);
 +            gen_set_pc_im(dc, dc->base.pc_next);
              /* fall through */
          default:
              /* indicate that the hash table must be used to find the next TB */
@@ -XXX,XX +XXX,XX @@ static void arm_tr_tb_stop(DisasContextBase *dcbase, CPUState *cpu)
          gen_set_label(dc->condlabel);
          gen_set_condexec(dc);
          if (unlikely(is_singlestepping(dc))) {
 -            gen_set_pc_im(dc, dc->pc);
 +            gen_set_pc_im(dc, dc->base.pc_next);
              gen_singlestep_exception(dc);
          } else {
 -            gen_goto_tb(dc, 1, dc->pc);
 +            gen_goto_tb(dc, 1, dc->base.pc_next);
          }
      }
 -
 -    /* Functions above can change dc->pc, so re-align db->pc_next */
 -    dc->base.pc_next = dc->pc;
  }
  static void arm_tr_disas_log(const DisasContextBase *dcbase, CPUState *cpu)
 --
 .20.1

-[Qemu-devel] [PULL 13/29] target/arm: Replace offset with pc in gen_exception_internal_insn
+[PULL 39/52] target/arm: Convert PMUL.8 to gvec
 From: Richard Henderson <richard.henderson@linaro.org>
-The offset is variable depending on the instruction set.
+The gvec form will be needed for implementing SVE2.
 Passing in the actual value is clearer in intent.
+Extend the implementation to operate on uint64_t instead of uint32_t.
+Use a counted inner loop instead of terminating when op1 goes to zero,
+looking toward the required implementation for ARMv8.4-DIT.
+Tested-by: Alex Bennée <alex.bennee@linaro.org>
+Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Message-id: 20200216214232.4230-3-richard.henderson@linaro.org
 Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
 Message-id: 20190807045335.1361-9-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate-a64.c | 8 ++++----
+ target/arm/helper.h        |  3 ++-
- target/arm/translate.c     | 8 ++++----
+ target/arm/neon_helper.c   | 22 ----------------------
-files changed, 8 insertions(+), 8 deletions(-)
+ target/arm/translate-a64.c | 10 +++-------
  target/arm/translate.c     | 11 ++++-------
  target/arm/vec_helper.c    | 30 ++++++++++++++++++++++++++++++
 files changed, 39 insertions(+), 37 deletions(-)
+diff --git a/target/arm/helper.h b/target/arm/helper.h
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/helper.h
++++ b/target/arm/helper.h
+@@ -XXX,XX +XXX,XX @@ DEF_HELPER_2(neon_sub_u8, i32, i32, i32)
+ DEF_HELPER_2(neon_sub_u16, i32, i32, i32)
+ DEF_HELPER_2(neon_mul_u8, i32, i32, i32)
+ DEF_HELPER_2(neon_mul_u16, i32, i32, i32)
+-DEF_HELPER_2(neon_mul_p8, i32, i32, i32)
+ DEF_HELPER_2(neon_mull_p8, i64, i32, i32)
+ DEF_HELPER_2(neon_tst_u8, i32, i32, i32)
+@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(gvec_sshl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+ DEF_HELPER_FLAGS_4(gvec_ushl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+ DEF_HELPER_FLAGS_4(gvec_ushl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
++DEF_HELPER_FLAGS_4(gvec_pmul_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
++
+ #ifdef TARGET_AARCH64
+ #include "helper-a64.h"
+ #include "helper-sve.h"
+diff --git a/target/arm/neon_helper.c b/target/arm/neon_helper.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/neon_helper.c
++++ b/target/arm/neon_helper.c
+@@ -XXX,XX +XXX,XX @@ NEON_VOP(mul_u16, neon_u16, 2)
+ /* Polynomial multiplication is like integer multiplication except the
+    partial products are XORed, not added.  */
+-uint32_t HELPER(neon_mul_p8)(uint32_t op1, uint32_t op2)
+-{
+-    uint32_t mask;
+-    uint32_t result;
+-    result = 0;
+-    while (op1) {
+-        mask = 0;
+-        if (op1 & 1)
+-            mask |= 0xff;
+-        if (op1 & (1 << 8))
+-            mask |= (0xff << 8);
+-        if (op1 & (1 << 16))
+-            mask |= (0xff << 16);
+-        if (op1 & (1 << 24))
+-            mask |= (0xff << 24);
+-        result ^= op2 & mask;
+-        op1 = (op1 >> 1) & 0x7f7f7f7f;
+-        op2 = (op2 << 1) & 0xfefefefe;
+-    }
+-    return result;
+-}
+-
+ uint64_t HELPER(neon_mull_p8)(uint32_t op1, uint32_t op2)
+ {
+     uint64_t result = 0;
 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-a64.c
 +++ b/target/arm/translate-a64.c
-@@ -XXX,XX +XXX,XX @@ static void gen_exception_internal(int excp)
+@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
-     tcg_temp_free_i32(tcg_excp);
+     case 0x13: /* MUL, PMUL */
- }
+         if (!u) { /* MUL */
+             gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_mul, size);
--static void gen_exception_internal_insn(DisasContext *s, int offset, int excp)
+-            return;
-+static void gen_exception_internal_insn(DisasContext *s, uint64_t pc, int excp)
++        } else {  /* PMUL */
- {
++            gen_gvec_op3_ool(s, is_q, rd, rn, rm, 0, gen_helper_gvec_pmul_b);
--    gen_a64_set_pc_im(s->base.pc_next - offset);
+         }
-+    gen_a64_set_pc_im(pc);
+-        break;
-     gen_exception_internal(excp);
++        return;
-     s->base.is_jmp = DISAS_NORETURN;
+     case 0x12: /* MLA, MLS */
- }
+         if (u) {
-@@ -XXX,XX +XXX,XX @@ static void disas_exc(DisasContext *s, uint32_t insn)
+             gen_gvec_op3(s, is_q, rd, rn, rm, &mls_op[size]);
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
                  genfn = fns[size][u];
                  break;
              }
- #endif
+-            case 0x13: /* MUL, PMUL */
--            gen_exception_internal_insn(s, 0, EXCP_SEMIHOST);
+-                assert(u); /* PMUL */
-+            gen_exception_internal_insn(s, s->base.pc_next, EXCP_SEMIHOST);
+-                assert(size == 0);
-         } else {
+-                genfn = gen_helper_neon_mul_p8;
-             unsupported_encoding(s, insn);
+-                break;
-         }
+             case 0x16: /* SQDMULH, SQRDMULH */
-@@ -XXX,XX +XXX,XX @@ static bool aarch64_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
+             {
-         /* End the TB early; it likely won't be executed */
+                 static NeonGenTwoOpEnvFn * const fns[2][2] = {
          dc->base.is_jmp = DISAS_TOO_MANY;
      } else {
 -        gen_exception_internal_insn(dc, 0, EXCP_DEBUG);
 +        gen_exception_internal_insn(dc, dc->base.pc_next, EXCP_DEBUG);
          /* The address covered by the breakpoint must be
             included in [tb->pc, tb->pc + tb->size) in order
             to for it to be properly cleared -- thus we
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static inline void gen_smc(DisasContext *s)
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-     s->base.is_jmp = DISAS_SMC;
          case NEON_3R_VMUL: /* VMUL */
              if (u) {
 -                /* Polynomial case allows only P8 and is handled below.  */
 +                /* Polynomial case allows only P8.  */
                  if (size != 0) {
                      return 1;
                  }
 +                tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, vec_size, vec_size,
 +                                   0, gen_helper_gvec_pmul_b);
              } else {
                  tcg_gen_gvec_mul(size, rd_ofs, rn_ofs, rm_ofs,
                                   vec_size, vec_size);
 -                return 0;
              }
 -            break;
 +            return 0;
          case NEON_3R_VML: /* VMLA, VMLS */
              tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, vec_size, vec_size,
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
              tmp2 = neon_load_reg(rd, pass);
              gen_neon_add(size, tmp, tmp2);
              break;
 -        case NEON_3R_VMUL:
 -            /* VMUL.P8; other cases already eliminated.  */
 -            gen_helper_neon_mul_p8(tmp, tmp, tmp2);
 -            break;
          case NEON_3R_VPMAX:
              GEN_NEON_INTEGER_OP(pmax);
              break;
 diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/vec_helper.c
 +++ b/target/arm/vec_helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_ushl_h)(void *vd, void *vn, void *vm, uint32_t desc)
      }
      clear_tail(d, opr_sz, simd_maxsz(desc));
  }
++
--static void gen_exception_internal_insn(DisasContext *s, int offset, int excp)
++/*
-+static void gen_exception_internal_insn(DisasContext *s, uint32_t pc, int excp)
++ * 8x8->8 polynomial multiply.
- {
++ *
-     gen_set_condexec(s);
++ * Polynomial multiplication is like integer multiplication except the
--    gen_set_pc_im(s, s->base.pc_next - offset);
++ * partial products are XORed, not added.
-+    gen_set_pc_im(s, pc);
++ *
-     gen_exception_internal(excp);
++ * TODO: expose this as a generic vector operation, as it is a common
-     s->base.is_jmp = DISAS_NORETURN;
++ * crypto building block.
- }
++ */
-@@ -XXX,XX +XXX,XX @@ static inline void gen_hlt(DisasContext *s, int imm)
++void HELPER(gvec_pmul_b)(void *vd, void *vn, void *vm, uint32_t desc)
-         s->current_el != 0 &&
++{
- #endif
++    intptr_t i, j, opr_sz = simd_oprsz(desc);
-         (imm == (s->thumb ? 0x3c : 0xf000))) {
++    uint64_t *d = vd, *n = vn, *m = vm;
--        gen_exception_internal_insn(s, 0, EXCP_SEMIHOST);
++
-+        gen_exception_internal_insn(s, s->base.pc_next, EXCP_SEMIHOST);
++    for (i = 0; i < opr_sz / 8; ++i) {
-         return;
++        uint64_t nn = n[i];
-     }
++        uint64_t mm = m[i];
++        uint64_t rr = 0;
-@@ -XXX,XX +XXX,XX @@ static bool arm_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
++
-         /* End the TB early; it's likely not going to be executed */
++        for (j = 0; j < 8; ++j) {
-         dc->base.is_jmp = DISAS_TOO_MANY;
++            uint64_t mask = (nn & 0x0101010101010101ull) * 0xff;
-     } else {
++            rr ^= mm & mask;
--        gen_exception_internal_insn(dc, 0, EXCP_DEBUG);
++            mm = (mm << 1) & 0xfefefefefefefefeull;
-+        gen_exception_internal_insn(dc, dc->base.pc_next, EXCP_DEBUG);
++            nn >>= 1;
-         /* The address covered by the breakpoint must be
++        }
-            included in [tb->pc, tb->pc + tb->size) in order
++        d[i] = rr;
-            to for it to be properly cleared -- thus we
++    }
 +    clear_tail(d, opr_sz, simd_maxsz(desc));
 +}
 --
 .20.1
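
The commit message above describes widening the PMUL.8 computation to uint64_t and using a counted inner loop. A minimal standalone sketch of that idea follows, assuming nothing from QEMU: the names pmul8_ref, pmul8_packed, pmul8_lanes_match and pmul8_selfcheck are hypothetical and exist only for this illustration. It compares the packed eight-lane loop (same mask and shift constants as gvec_pmul_b) against a per-byte reference.

```c
#include <assert.h>
#include <stdint.h>

/* Reference: one 8x8->8 polynomial multiply, keeping the low 8 bits.
 * Partial products are XORed rather than added. */
static uint8_t pmul8_ref(uint8_t a, uint8_t b)
{
    uint8_t r = 0;
    for (int j = 0; j < 8; j++) {
        if (a & (1u << j)) {
            r ^= (uint8_t)(b << j);
        }
    }
    return r;
}

/* Packed form: eight byte lanes per uint64_t, always eight iterations.
 * The loop count does not depend on the operand values. */
static uint64_t pmul8_packed(uint64_t nn, uint64_t mm)
{
    uint64_t rr = 0;
    for (int j = 0; j < 8; j++) {
        /* Expand bit 0 of each byte of nn to a full 0xff byte mask. */
        uint64_t mask = (nn & 0x0101010101010101ull) * 0xff;
        rr ^= mm & mask;
        /* Shift each lane left by one, dropping bits that would
         * spill into the neighbouring lane. */
        mm = (mm << 1) & 0xfefefefefefefefeull;
        /* Bits crossing lanes here never reach bit 0 of another
         * byte within eight iterations, so no masking is needed. */
        nn >>= 1;
    }
    return rr;
}

/* Returns 1 if every lane of the packed result matches the reference. */
static int pmul8_lanes_match(uint64_t nn, uint64_t mm)
{
    uint64_t rr = pmul8_packed(nn, mm);
    for (int k = 0; k < 8; k++) {
        uint8_t a = (uint8_t)(nn >> (8 * k));
        uint8_t b = (uint8_t)(mm >> (8 * k));
        if ((uint8_t)(rr >> (8 * k)) != pmul8_ref(a, b)) {
            return 0;
        }
    }
    return 1;
}

/* Small self-check over a few arbitrary operand pairs. */
static void pmul8_selfcheck(void)
{
    assert(pmul8_lanes_match(0x0123456789abcdefull, 0xfedcba9876543210ull));
    assert(pmul8_lanes_match(0xffffffffffffffffull, 0xffffffffffffffffull));
    assert(pmul8_lanes_match(0, 0x1122334455667788ull));
}
```

Calling pmul8_selfcheck() from a test harness exercises all lanes at once. The fixed trip count is the property the commit message points to when it mentions ARMv8.4-DIT; this sketch makes no claim about the timing behaviour of any particular compiler output.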

-[Qemu-devel] [PULL 16/29] target/arm: Remove helper_double_saturate
+[PULL 40/52] target/arm: Convert PMULL.64 to gvec
 From: Richard Henderson <richard.henderson@linaro.org>
-Replace x = double_saturate(y) with x = add_saturate(y, y).
+The gvec form will be needed for implementing SVE2.
 There is no need for a separate more specialized helper.
+Tested-by: Alex Bennée <alex.bennee@linaro.org>
+Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Message-id: 20200216214232.4230-4-richard.henderson@linaro.org
 Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
 Message-id: 20190807045335.1361-12-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/helper.h    |  1 -
+ target/arm/helper.h        |  4 +---
- target/arm/op_helper.c | 15 ---------------
+ target/arm/neon_helper.c   | 30 ------------------------------
- target/arm/translate.c |  4 ++--
+ target/arm/translate-a64.c | 28 +++-------------------------
-files changed, 2 insertions(+), 18 deletions(-)
+ target/arm/translate.c     | 16 ++--------------
  target/arm/vec_helper.c    | 33 +++++++++++++++++++++++++++++++++
 files changed, 39 insertions(+), 72 deletions(-)
 diff --git a/target/arm/helper.h b/target/arm/helper.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.h
 +++ b/target/arm/helper.h
-@@ -XXX,XX +XXX,XX @@ DEF_HELPER_3(add_saturate, i32, env, i32, i32)
+@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(crc32, TCG_CALL_NO_RWG_SE, i32, i32, i32, i32)
- DEF_HELPER_3(sub_saturate, i32, env, i32, i32)
+ DEF_HELPER_FLAGS_3(crc32c, TCG_CALL_NO_RWG_SE, i32, i32, i32, i32)
- DEF_HELPER_3(add_usaturate, i32, env, i32, i32)
+ DEF_HELPER_2(dc_zva, void, env, i64)
- DEF_HELPER_3(sub_usaturate, i32, env, i32, i32)
--DEF_HELPER_2(double_saturate, i32, env, s32)
+-DEF_HELPER_FLAGS_2(neon_pmull_64_lo, TCG_CALL_NO_RWG_SE, i64, i64, i64)
- DEF_HELPER_FLAGS_2(sdiv, TCG_CALL_NO_RWG_SE, s32, s32, s32)
+-DEF_HELPER_FLAGS_2(neon_pmull_64_hi, TCG_CALL_NO_RWG_SE, i64, i64, i64)
- DEF_HELPER_FLAGS_2(udiv, TCG_CALL_NO_RWG_SE, i32, i32, i32)
+-
- DEF_HELPER_FLAGS_1(rbit, TCG_CALL_NO_RWG_SE, i32, i32)
+ DEF_HELPER_FLAGS_5(gvec_qrdmlah_s16, TCG_CALL_NO_RWG,
-diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
+                    void, ptr, ptr, ptr, ptr, i32)
  DEF_HELPER_FLAGS_5(gvec_qrdmlsh_s16, TCG_CALL_NO_RWG,
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(gvec_ushl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
  DEF_HELPER_FLAGS_4(gvec_ushl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
  DEF_HELPER_FLAGS_4(gvec_pmul_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_4(gvec_pmull_q, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
  #ifdef TARGET_AARCH64
  #include "helper-a64.h"
 diff --git a/target/arm/neon_helper.c b/target/arm/neon_helper.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/op_helper.c
+--- a/target/arm/neon_helper.c
-+++ b/target/arm/op_helper.c
++++ b/target/arm/neon_helper.c
-@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(sub_saturate)(CPUARMState *env, uint32_t a, uint32_t b)
+@@ -XXX,XX +XXX,XX @@ void HELPER(neon_zip16)(void *vd, void *vm)
-     return res;
+     rm[0] = m0;
      rd[0] = d0;
  }
+-
--uint32_t HELPER(double_saturate)(CPUARMState *env, int32_t val)
+-/* Helper function for 64 bit polynomial multiply case:
 - * perform PolynomialMult(op1, op2) and return either the top or
 - * bottom half of the 128 bit result.
 - */
 -uint64_t HELPER(neon_pmull_64_lo)(uint64_t op1, uint64_t op2)
 -{
--    uint32_t res;
+-    int bitnum;
--    if (val >= 0x40000000) {
+-    uint64_t res = 0;
--        res = ~SIGNBIT;
+-
--        env->QF = 1;
+-    for (bitnum = 0; bitnum < 64; bitnum++) {
--    } else if (val <= (int32_t)0xc0000000) {
+-        if (op1 & (1ULL << bitnum)) {
--        res = SIGNBIT;
+-            res ^= op2 << bitnum;
--        env->QF = 1;
+-        }
 -    } else {
 -        res = val << 1;
 -    }
 -    return res;
 -}
+-uint64_t HELPER(neon_pmull_64_hi)(uint64_t op1, uint64_t op2)
+-{
+-    int bitnum;
+-    uint64_t res = 0;
 -
- uint32_t HELPER(add_usaturate)(CPUARMState *env, uint32_t a, uint32_t b)
+-    /* bit 0 of op1 can't influence the high 64 bits at all */
- {
+-    for (bitnum = 1; bitnum < 64; bitnum++) {
-     uint32_t res = a + b;
+-        if (op1 & (1ULL << bitnum)) {
 -            res ^= op2 >> (64 - bitnum);
 -        }
 -    }
 -    return res;
 -}
 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-a64.c
 +++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void handle_3rd_narrowing(DisasContext *s, int is_q, int is_u, int size,
      clear_vec_high(s, is_q, rd);
  }
 -static void handle_pmull_64(DisasContext *s, int is_q, int rd, int rn, int rm)
 -{
 -    /* PMULL of 64 x 64 -> 128 is an odd special case because it
 -     * is the only three-reg-diff instruction which produces a
 -     * 128-bit wide result from a single operation. However since
 -     * it's possible to calculate the two halves more or less
 -     * separately we just use two helper calls.
 -     */
 -    TCGv_i64 tcg_op1 = tcg_temp_new_i64();
 -    TCGv_i64 tcg_op2 = tcg_temp_new_i64();
 -    TCGv_i64 tcg_res = tcg_temp_new_i64();
 -
 -    read_vec_element(s, tcg_op1, rn, is_q, MO_64);
 -    read_vec_element(s, tcg_op2, rm, is_q, MO_64);
 -    gen_helper_neon_pmull_64_lo(tcg_res, tcg_op1, tcg_op2);
 -    write_vec_element(s, tcg_res, rd, 0, MO_64);
 -    gen_helper_neon_pmull_64_hi(tcg_res, tcg_op1, tcg_op2);
 -    write_vec_element(s, tcg_res, rd, 1, MO_64);
 -
 -    tcg_temp_free_i64(tcg_op1);
 -    tcg_temp_free_i64(tcg_op2);
 -    tcg_temp_free_i64(tcg_res);
 -}
 -
  /* AdvSIMD three different
   *   31  30  29 28       24 23  22  21 20  16 15    12 11 10 9    5 4    0
   * +---+---+---+-----------+------+---+------+--------+-----+------+------+
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_diff(DisasContext *s, uint32_t insn)
              if (!fp_access_check(s)) {
                  return;
              }
 -            handle_pmull_64(s, is_q, rd, rn, rm);
 +            /* The Q field specifies lo/hi half input for this insn.  */
 +            gen_gvec_op3_ool(s, true, rd, rn, rm, is_q,
 +                             gen_helper_gvec_pmull_q);
              return;
          }
          goto is_widening;
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-             tmp = load_reg(s, rm);
+                  * outside the loop below as it only performs a single pass.
-             tmp2 = load_reg(s, rn);
+                  */
-             if (op1 & 2)
+                 if (op == 14 && size == 2) {
--                gen_helper_double_saturate(tmp2, cpu_env, tmp2);
+-                    TCGv_i64 tcg_rn, tcg_rm, tcg_rd;
-+                gen_helper_add_saturate(tmp2, cpu_env, tmp2, tmp2);
+-
-             if (op1 & 1)
+                     if (!dc_isar_feature(aa32_pmull, s)) {
-                 gen_helper_sub_saturate(tmp, cpu_env, tmp, tmp2);
+                         return 1;
-             else
+                     }
-@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
+-                    tcg_rn = tcg_temp_new_i64();
-                 tmp = load_reg(s, rn);
+-                    tcg_rm = tcg_temp_new_i64();
-                 tmp2 = load_reg(s, rm);
+-                    tcg_rd = tcg_temp_new_i64();
-                 if (op & 1)
+-                    neon_load_reg64(tcg_rn, rn);
--                    gen_helper_double_saturate(tmp, cpu_env, tmp);
+-                    neon_load_reg64(tcg_rm, rm);
-+                    gen_helper_add_saturate(tmp, cpu_env, tmp, tmp);
+-                    gen_helper_neon_pmull_64_lo(tcg_rd, tcg_rn, tcg_rm);
-                 if (op & 2)
+-                    neon_store_reg64(tcg_rd, rd);
-                     gen_helper_sub_saturate(tmp, cpu_env, tmp2, tmp);
+-                    gen_helper_neon_pmull_64_hi(tcg_rd, tcg_rn, tcg_rm);
-                 else
+-                    neon_store_reg64(tcg_rd, rd + 1);
 -                    tcg_temp_free_i64(tcg_rn);
 -                    tcg_temp_free_i64(tcg_rm);
 -                    tcg_temp_free_i64(tcg_rd);
 +                    tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, 16, 16,
 +                                       0, gen_helper_gvec_pmull_q);
                      return 0;
                  }
 diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/vec_helper.c
 +++ b/target/arm/vec_helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_pmul_b)(void *vd, void *vn, void *vm, uint32_t desc)
      }
      clear_tail(d, opr_sz, simd_maxsz(desc));
  }
 +
 +/*
 + * 64x64->128 polynomial multiply.
 + * Because the lanes are not accessed in strict columns,
 + * this probably cannot be turned into a generic helper.
 + */
 +void HELPER(gvec_pmull_q)(void *vd, void *vn, void *vm, uint32_t desc)
 +{
 +    intptr_t i, j, opr_sz = simd_oprsz(desc);
 +    intptr_t hi = simd_data(desc);
 +    uint64_t *d = vd, *n = vn, *m = vm;
 +
 +    for (i = 0; i < opr_sz / 8; i += 2) {
 +        uint64_t nn = n[i + hi];
 +        uint64_t mm = m[i + hi];
 +        uint64_t rhi = 0;
 +        uint64_t rlo = 0;
 +
 +        /* Bit 0 can only influence the low 64-bit result.  */
 +        if (nn & 1) {
 +            rlo = mm;
 +        }
 +
 +        for (j = 1; j < 64; ++j) {
 +            uint64_t mask = -((nn >> j) & 1);
 +            rlo ^= (mm << j) & mask;
 +            rhi ^= (mm >> (64 - j)) & mask;
 +        }
 +        d[i] = rlo;
 +        d[i + 1] = rhi;
 +    }
 +    clear_tail(d, opr_sz, simd_maxsz(desc));
 +}
 --
 .20.1
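
For readers following the PMULL.64 conversion, here is a rough standalone sketch of the 64x64->128 polynomial multiply and of the lo/hi input selection that the translator passes through the Q field. It is not QEMU code: the u128_parts struct and the functions pmull64, pmull_q_128 and pmull64_selfcheck are hypothetical names for this illustration, and the sketch handles a single 128-bit vector rather than looping over opr_sz.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical holder for the two halves of the 128-bit product. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} u128_parts;

/* 64x64->128 polynomial (carry-less) multiply. */
static u128_parts pmull64(uint64_t nn, uint64_t mm)
{
    u128_parts r = { 0, 0 };

    /* Bit 0 only affects the low half. Handling it separately also
     * avoids computing mm >> 64, which is undefined in C. */
    if (nn & 1) {
        r.lo = mm;
    }
    for (int j = 1; j < 64; j++) {
        /* All-ones if bit j of nn is set, otherwise zero. */
        uint64_t mask = -((nn >> j) & 1);
        r.lo ^= (mm << j) & mask;
        r.hi ^= (mm >> (64 - j)) & mask;
    }
    return r;
}

/* One 128-bit vector: hi selects which 64-bit input element is used
 * (0 for the low element, 1 for the high one), in the same spirit as
 * the simd_data(desc) value in gvec_pmull_q. */
static void pmull_q_128(uint64_t d[2], const uint64_t n[2],
                        const uint64_t m[2], int hi)
{
    u128_parts r = pmull64(n[hi], m[hi]);
    d[0] = r.lo;
    d[1] = r.hi;
}

/* Small self-check using products that are easy to verify by hand. */
static void pmull64_selfcheck(void)
{
    uint64_t n[2] = { 3, 1ull << 63 };
    uint64_t m[2] = { 3, 2 };
    uint64_t d[2];

    pmull_q_128(d, n, m, 0);      /* (x + 1)^2 = x^2 + 1 */
    assert(d[0] == 5 && d[1] == 0);

    pmull_q_128(d, n, m, 1);      /* x^63 * x = x^64 */
    assert(d[0] == 0 && d[1] == 1);
}
```

Because squaring over GF(2) spreads each input bit to the even bit position with twice its index, pmull64 applied to two all-ones operands yields 0x5555555555555555 in both halves, which makes a convenient extra sanity check.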

-[Qemu-devel] [PULL 07/29] target/arm: Introduce pc_curr
+[PULL 41/52] target/arm: Convert PMULL.8 to gvec
 From: Richard Henderson <richard.henderson@linaro.org>
-Add a new field to retain the address of the instruction currently
+We still need two different helpers, since NEON and SVE2 get the
-being translated.  The 32-bit uses are all within subroutines used
+inputs from different locations within the source vector.  However,
-by a32 and t32.  This will become less obvious when t16 support is
+we can convert both to the same internal form for computation.
-merged with a32+t32, and having a clear definition will help.
+The sve2 helper is not used yet, but adding it with this patch
-Convert aarch64 as well for consistency.  Note that there is one
+helps illustrate why the neon changes are helpful.
-instance of a pre-assert fprintf that used the wrong value for the
-address of the current instruction.
+Tested-by: Alex Bennée <alex.bennee@linaro.org>
+Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Message-id: 20200216214232.4230-5-richard.henderson@linaro.org
 Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
 Message-id: 20190807045335.1361-3-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate-a64.h |  2 +-
+ target/arm/helper-sve.h    |  2 ++
- target/arm/translate.h     |  2 ++
+ target/arm/helper.h        |  3 +-
- target/arm/translate-a64.c | 21 +++++++++++----------
+ target/arm/neon_helper.c   | 32 --------------------
- target/arm/translate.c     | 14 ++++++++------
+ target/arm/translate-a64.c | 27 +++++++++++------
-files changed, 22 insertions(+), 17 deletions(-)
+ target/arm/translate.c     | 26 ++++++++---------
+ target/arm/vec_helper.c    | 60 ++++++++++++++++++++++++++++++++++++++
-diff --git a/target/arm/translate-a64.h b/target/arm/translate-a64.h
+files changed, 95 insertions(+), 55 deletions(-)
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-a64.h
+diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
-+++ b/target/arm/translate-a64.h
+index XXXXXXX..XXXXXXX 100644
-@@ -XXX,XX +XXX,XX @@ void unallocated_encoding(DisasContext *s);
+--- a/target/arm/helper-sve.h
-         qemu_log_mask(LOG_UNIMP,                                         \
++++ b/target/arm/helper-sve.h
-                       "%s:%d: unsupported instruction encoding 0x%08x "  \
+@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_6(sve_stdd_le_zd, TCG_CALL_NO_WG,
-                       "at pc=%016" PRIx64 "\n",                          \
+                    void, env, ptr, ptr, ptr, tl, i32)
--                      __FILE__, __LINE__, insn, s->pc - 4);              \
+ DEF_HELPER_FLAGS_6(sve_stdd_be_zd, TCG_CALL_NO_WG,
-+                      __FILE__, __LINE__, insn, s->pc_curr);             \
+                    void, env, ptr, ptr, ptr, tl, i32)
-         unallocated_encoding(s);                                         \
++
-     } while (0)
++DEF_HELPER_FLAGS_4(sve2_pmull_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+diff --git a/target/arm/helper.h b/target/arm/helper.h
-diff --git a/target/arm/translate.h b/target/arm/translate.h
+index XXXXXXX..XXXXXXX 100644
-index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/helper.h
---- a/target/arm/translate.h
++++ b/target/arm/helper.h
-+++ b/target/arm/translate.h
+@@ -XXX,XX +XXX,XX @@ DEF_HELPER_2(neon_sub_u8, i32, i32, i32)
-@@ -XXX,XX +XXX,XX @@ typedef struct DisasContext {
+ DEF_HELPER_2(neon_sub_u16, i32, i32, i32)
-     const ARMISARegisters *isar;
+ DEF_HELPER_2(neon_mul_u8, i32, i32, i32)
+ DEF_HELPER_2(neon_mul_u16, i32, i32, i32)
-     target_ulong pc;
+-DEF_HELPER_2(neon_mull_p8, i64, i32, i32)
-+    /* The address of the current instruction being translated. */
-+    target_ulong pc_curr;
+ DEF_HELPER_2(neon_tst_u8, i32, i32, i32)
-     target_ulong page_start;
+ DEF_HELPER_2(neon_tst_u16, i32, i32, i32)
-     uint32_t insn;
+@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(gvec_ushl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
-     /* Nonzero if this instruction has been conditionally skipped.  */
+ DEF_HELPER_FLAGS_4(gvec_pmul_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
  DEF_HELPER_FLAGS_4(gvec_pmull_q, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_4(neon_pmull_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 +
  #ifdef TARGET_AARCH64
  #include "helper-a64.h"
  #include "helper-sve.h"
 diff --git a/target/arm/neon_helper.c b/target/arm/neon_helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/neon_helper.c
 +++ b/target/arm/neon_helper.c
@@ -XXX,XX +XXX,XX @@ NEON_VOP(mul_u8, neon_u8, 4)
  NEON_VOP(mul_u16, neon_u16, 2)
  #undef NEON_FN
 -/* Polynomial multiplication is like integer multiplication except the
 -   partial products are XORed, not added.  */
 -uint64_t HELPER(neon_mull_p8)(uint32_t op1, uint32_t op2)
 -{
 -    uint64_t result = 0;
 -    uint64_t mask;
 -    uint64_t op2ex = op2;
 -    op2ex = (op2ex & 0xff) |
 -        ((op2ex & 0xff00) << 8) |
 -        ((op2ex & 0xff0000) << 16) |
 -        ((op2ex & 0xff000000) << 24);
 -    while (op1) {
 -        mask = 0;
 -        if (op1 & 1) {
 -            mask |= 0xffff;
 -        }
 -        if (op1 & (1 << 8)) {
 -            mask |= (0xffffU << 16);
 -        }
 -        if (op1 & (1 << 16)) {
 -            mask |= (0xffffULL << 32);
 -        }
 -        if (op1 & (1 << 24)) {
 -            mask |= (0xffffULL << 48);
 -        }
 -        result ^= op2ex & mask;
 -        op1 = (op1 >> 1) & 0x7f7f7f7f;
 -        op2ex <<= 1;
 -    }
 -    return result;
 -}
 -
  #define NEON_FN(dest, src1, src2) dest = (src1 & src2) ? -1 : 0
  NEON_VOP(tst_u8, neon_u8, 4)
  NEON_VOP(tst_u16, neon_u16, 2)
 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-a64.c
 +++ b/target/arm/translate-a64.c
-@@ -XXX,XX +XXX,XX @@ static inline AArch64DecodeFn *lookup_disas_fn(const AArch64DecodeTable *table,
+@@ -XXX,XX +XXX,XX @@ static void handle_3rd_widening(DisasContext *s, int is_q, int is_u, int size,
-  */
+                 gen_helper_neon_addl_saturate_s32(tcg_passres, cpu_env,
- static void disas_uncond_b_imm(DisasContext *s, uint32_t insn)
+                                                   tcg_passres, tcg_passres);
  {
 -    uint64_t addr = s->pc + sextract32(insn, 0, 26) * 4 - 4;
 +    uint64_t addr = s->pc_curr + sextract32(insn, 0, 26) * 4;
      if (insn & (1U << 31)) {
          /* BL Branch with link */
@@ -XXX,XX +XXX,XX @@ static void disas_comp_b_imm(DisasContext *s, uint32_t insn)
      sf = extract32(insn, 31, 1);
      op = extract32(insn, 24, 1); /* 0: CBZ; 1: CBNZ */
      rt = extract32(insn, 0, 5);
 -    addr = s->pc + sextract32(insn, 5, 19) * 4 - 4;
 +    addr = s->pc_curr + sextract32(insn, 5, 19) * 4;
      tcg_cmp = read_cpu_reg(s, rt, sf);
      label_match = gen_new_label();
@@ -XXX,XX +XXX,XX @@ static void disas_test_b_imm(DisasContext *s, uint32_t insn)
      bit_pos = (extract32(insn, 31, 1) << 5) | extract32(insn, 19, 5);
      op = extract32(insn, 24, 1); /* 0: TBZ; 1: TBNZ */
 -    addr = s->pc + sextract32(insn, 5, 14) * 4 - 4;
 +    addr = s->pc_curr + sextract32(insn, 5, 14) * 4;
      rt = extract32(insn, 0, 5);
      tcg_cmp = tcg_temp_new_i64();
@@ -XXX,XX +XXX,XX @@ static void disas_cond_b_imm(DisasContext *s, uint32_t insn)
          unallocated_encoding(s);
          return;
      }
 -    addr = s->pc + sextract32(insn, 5, 19) * 4 - 4;
 +    addr = s->pc_curr + sextract32(insn, 5, 19) * 4;
      cond = extract32(insn, 0, 4);
      reset_btype(s);
@@ -XXX,XX +XXX,XX @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
          TCGv_i32 tcg_syn, tcg_isread;
          uint32_t syndrome;
 -        gen_a64_set_pc_im(s->pc - 4);
 +        gen_a64_set_pc_im(s->pc_curr);
          tmpptr = tcg_const_ptr(ri);
          syndrome = syn_aa64_sysregtrap(op0, op1, op2, crn, crm, rt, isread);
          tcg_syn = tcg_const_i32(syndrome);
@@ -XXX,XX +XXX,XX @@ static void disas_exc(DisasContext *s, uint32_t insn)
              /* The pre HVC helper handles cases when HVC gets trapped
               * as an undefined insn by runtime configuration.
               */
 -            gen_a64_set_pc_im(s->pc - 4);
 +            gen_a64_set_pc_im(s->pc_curr);
              gen_helper_pre_hvc(cpu_env);
              gen_ss_advance(s);
              gen_exception_insn(s, 0, EXCP_HVC, syn_aa64_hvc(imm16), 2);
@@ -XXX,XX +XXX,XX @@ static void disas_exc(DisasContext *s, uint32_t insn)
                  unallocated_encoding(s);
                  break;
-             }
+-            case 14: /* PMULL */
--            gen_a64_set_pc_im(s->pc - 4);
+-                assert(size == 0);
-+            gen_a64_set_pc_im(s->pc_curr);
+-                gen_helper_neon_mull_p8(tcg_passres, tcg_op1, tcg_op2);
-             tmp = tcg_const_i32(syn_aa64_smc(imm16));
+-                break;
              gen_helper_pre_smc(cpu_env, tmp);
              tcg_temp_free_i32(tmp);
@@ -XXX,XX +XXX,XX @@ static void disas_ld_lit(DisasContext *s, uint32_t insn)
      tcg_rt = cpu_reg(s, rt);
 -    clean_addr = tcg_const_i64((s->pc - 4) + imm);
 +    clean_addr = tcg_const_i64(s->pc_curr + imm);
      if (is_vector) {
          do_fp_ld(s, rt, clean_addr, size);
      } else {
@@ -XXX,XX +XXX,XX @@ static void disas_pc_rel_adr(DisasContext *s, uint32_t insn)
      offset = sextract64(insn, 5, 19);
      offset = offset << 2 | extract32(insn, 29, 2);
      rd = extract32(insn, 0, 5);
 -    base = s->pc - 4;
 +    base = s->pc_curr;
      if (page) {
          /* ADRP (page based) */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
                  break;
              default:
-                 fprintf(stderr, "%s: insn %#04x, fpop %#2x @ %#" PRIx64 "\n",
--                        __func__, insn, fpopcode, s->pc);
-+                        __func__, insn, fpopcode, s->pc_curr);
                  g_assert_not_reached();
              }
+@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_diff(DisasContext *s, uint32_t insn)
-@@ -XXX,XX +XXX,XX @@ static void disas_a64_insn(CPUARMState *env, DisasContext *s)
+         handle_3rd_narrowing(s, is_q, is_u, size, opcode, rd, rn, rm);
- {
+         break;
-     uint32_t insn;
+     case 14: /* PMULL, PMULL2 */
+-        if (is_u || size == 1 || size == 2) {
-+    s->pc_curr = s->pc;
++        if (is_u) {
-     insn = arm_ldl_code(env, s->pc, s->sctlr_b);
+             unallocated_encoding(s);
-     s->insn = insn;
+             return;
-     s->pc += 4;
+         }
 -        if (size == 3) {
 +        switch (size) {
 +        case 0: /* PMULL.P8 */
 +            if (!fp_access_check(s)) {
 +                return;
 +            }
 +            /* The Q field specifies lo/hi half input for this insn.  */
 +            gen_gvec_op3_ool(s, true, rd, rn, rm, is_q,
 +                             gen_helper_neon_pmull_h);
 +            break;
 +
 +        case 3: /* PMULL.P64 */
              if (!dc_isar_feature(aa64_pmull, s)) {
                  unallocated_encoding(s);
                  return;
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_diff(DisasContext *s, uint32_t insn)
              /* The Q field specifies lo/hi half input for this insn.  */
              gen_gvec_op3_ool(s, true, rd, rn, rm, is_q,
                               gen_helper_gvec_pmull_q);
 -            return;
 +            break;
 +
 +        default:
 +            unallocated_encoding(s);
 +            break;
          }
 -        goto is_widening;
 +        return;
      case 9: /* SQDMLAL, SQDMLAL2 */
      case 11: /* SQDMLSL, SQDMLSL2 */
      case 13: /* SQDMULL, SQDMULL2 */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_diff(DisasContext *s, uint32_t insn)
              unallocated_encoding(s);
              return;
          }
 -    is_widening:
          if (!fp_access_check(s)) {
              return;
          }
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static inline void gen_hvc(DisasContext *s, int imm16)
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-      * as an undefined insn by runtime configuration (ie before
+                     return 1;
-      * the insn really executes).
+                 }
-      */
--    gen_set_pc_im(s, s->pc - 4);
+-                /* Handle VMULL.P64 (Polynomial 64x64 to 128 bit multiply)
-+    gen_set_pc_im(s, s->pc_curr);
+-                 * outside the loop below as it only performs a single pass.
-     gen_helper_pre_hvc(cpu_env);
+-                 */
-     /* Otherwise we will treat this as a real exception which
+-                if (op == 14 && size == 2) {
-      * happens after execution of the insn. (The distinction matters
+-                    if (!dc_isar_feature(aa32_pmull, s)) {
-@@ -XXX,XX +XXX,XX @@ static inline void gen_smc(DisasContext *s)
+-                        return 1;
-      */
++                /* Handle polynomial VMULL in a single pass.  */
-     TCGv_i32 tmp;
++                if (op == 14) {
++                    if (size == 0) {
--    gen_set_pc_im(s, s->pc - 4);
++                        /* VMULL.P8 */
-+    gen_set_pc_im(s, s->pc_curr);
++                        tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, 16, 16,
-     tmp = tcg_const_i32(syn_aa32_smc());
++                                           0, gen_helper_neon_pmull_h);
-     gen_helper_pre_smc(cpu_env, tmp);
++                    } else {
-     tcg_temp_free_i32(tmp);
++                        /* VMULL.P64 */
-@@ -XXX,XX +XXX,XX @@ static void gen_msr_banked(DisasContext *s, int r, int sysm, int rn)
++                        if (!dc_isar_feature(aa32_pmull, s)) {
++                            return 1;
-     /* Sync state because msr_banked() can raise exceptions */
++                        }
-     gen_set_condexec(s);
++                        tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, 16, 16,
--    gen_set_pc_im(s, s->pc - 4);
++                                           0, gen_helper_gvec_pmull_q);
-+    gen_set_pc_im(s, s->pc_curr);
+                     }
-     tcg_reg = load_reg(s, rn);
+-                    tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, 16, 16,
-     tcg_tgtmode = tcg_const_i32(tgtmode);
+-                                       0, gen_helper_gvec_pmull_q);
-     tcg_regno = tcg_const_i32(regno);
+                     return 0;
-@@ -XXX,XX +XXX,XX @@ static void gen_mrs_banked(DisasContext *s, int r, int sysm, int rn)
+                 }
-     /* Sync state because mrs_banked() can raise exceptions */
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-     gen_set_condexec(s);
+                         /* VMLAL, VQDMLAL, VMLSL, VQDMLSL, VMULL, VQDMULL */
--    gen_set_pc_im(s, s->pc - 4);
+                         gen_neon_mull(cpu_V0, tmp, tmp2, size, u);
-+    gen_set_pc_im(s, s->pc_curr);
+                         break;
-     tcg_reg = tcg_temp_new_i32();
+-                    case 14: /* Polynomial VMULL */
-     tcg_tgtmode = tcg_const_i32(tgtmode);
+-                        gen_helper_neon_mull_p8(cpu_V0, tmp, tmp2);
-     tcg_regno = tcg_const_i32(regno);
+-                        tcg_temp_free_i32(tmp2);
-@@ -XXX,XX +XXX,XX @@ static int disas_coproc_insn(DisasContext *s, uint32_t insn)
+-                        tcg_temp_free_i32(tmp);
-             }
+-                        break;
+                     default: /* 15 is RESERVED: caught earlier  */
-             gen_set_condexec(s);
+                         abort();
--            gen_set_pc_im(s, s->pc - 4);
+                     }
-+            gen_set_pc_im(s, s->pc_curr);
+diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
-             tmpptr = tcg_const_ptr(ri);
+index XXXXXXX..XXXXXXX 100644
-             tcg_syn = tcg_const_i32(syndrome);
+--- a/target/arm/vec_helper.c
-             tcg_isread = tcg_const_i32(isread);
++++ b/target/arm/vec_helper.c
-@@ -XXX,XX +XXX,XX @@ static void gen_srs(DisasContext *s,
+@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_pmull_q)(void *vd, void *vn, void *vm, uint32_t desc)
      tmp = tcg_const_i32(mode);
      /* get_r13_banked() will raise an exception if called from System mode */
      gen_set_condexec(s);
 -    gen_set_pc_im(s, s->pc - 4);
 +    gen_set_pc_im(s, s->pc_curr);
      gen_helper_get_r13_banked(addr, cpu_env, tmp);
      tcg_temp_free_i32(tmp);
      switch (amode) {
@@ -XXX,XX +XXX,XX @@ static void arm_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
          return;
      }
+     clear_tail(d, opr_sz, simd_maxsz(desc));
-+    dc->pc_curr = dc->pc;
+ }
-     insn = arm_ldl_code(env, dc->pc, dc->sctlr_b);
++
-     dc->insn = insn;
++/*
-     dc->pc += 4;
++ * 8x8->16 polynomial multiply.
-@@ -XXX,XX +XXX,XX @@ static void thumb_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
++ *
-         return;
++ * The byte inputs are expanded to (or extracted from) half-words.
-     }
++ * Note that neon and sve2 get the inputs from different positions.
++ * This allows 4 bytes to be processed in parallel with uint64_t.
-+    dc->pc_curr = dc->pc;
++ */
-     insn = arm_lduw_code(env, dc->pc, dc->sctlr_b);
++
-     is_16bit = thumb_insn_is_16bit(dc, dc->pc, insn);
++static uint64_t expand_byte_to_half(uint64_t x)
-     dc->pc += 2;
++{
 +    return  (x & 0x000000ff)
 +         | ((x & 0x0000ff00) << 8)
 +         | ((x & 0x00ff0000) << 16)
 +         | ((x & 0xff000000) << 24);
 +}
 +
 +static uint64_t pmull_h(uint64_t op1, uint64_t op2)
 +{
 +    uint64_t result = 0;
 +    int i;
 +
 +    for (i = 0; i < 8; ++i) {
 +        uint64_t mask = (op1 & 0x0001000100010001ull) * 0xffff;
 +        result ^= op2 & mask;
 +        op1 >>= 1;
 +        op2 <<= 1;
 +    }
 +    return result;
 +}
 +
 +void HELPER(neon_pmull_h)(void *vd, void *vn, void *vm, uint32_t desc)
 +{
 +    int hi = simd_data(desc);
 +    uint64_t *d = vd, *n = vn, *m = vm;
 +    uint64_t nn = n[hi], mm = m[hi];
 +
 +    d[0] = pmull_h(expand_byte_to_half(nn), expand_byte_to_half(mm));
 +    nn >>= 32;
 +    mm >>= 32;
 +    d[1] = pmull_h(expand_byte_to_half(nn), expand_byte_to_half(mm));
 +
 +    clear_tail(d, 16, simd_maxsz(desc));
 +}
 +
 +#ifdef TARGET_AARCH64
 +void HELPER(sve2_pmull_h)(void *vd, void *vn, void *vm, uint32_t desc)
 +{
 +    int shift = simd_data(desc) * 8;
 +    intptr_t i, opr_sz = simd_oprsz(desc);
 +    uint64_t *d = vd, *n = vn, *m = vm;
 +
 +    for (i = 0; i < opr_sz / 8; ++i) {
 +        uint64_t nn = (n[i] >> shift) & 0x00ff00ff00ff00ffull;
 +        uint64_t mm = (m[i] >> shift) & 0x00ff00ff00ff00ffull;
 +
 +        d[i] = pmull_h(nn, mm);
 +    }
 +}
 +#endif
 --
 .20.1
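
The lane-parallel trick in pmull_h() above can be checked against a plain
bit-by-bit carry-less multiply. The following is a minimal sketch, not part
of the patch: expand_byte_to_half() and pmull_h() are copied from the diff,
while pmul8_ref() and pmull_h_selfcheck() are hypothetical names introduced
here for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reference: one 8x8->16 carry-less multiply, bit by bit. */
static uint16_t pmul8_ref(uint8_t a, uint8_t b)
{
    uint16_t result = 0;
    for (int i = 0; i < 8; i++) {
        if (a & (1u << i)) {
            result ^= (uint16_t)(b << i);
        }
    }
    return result;
}

/* From the patch: place each of the low four bytes in its own 16-bit lane. */
static uint64_t expand_byte_to_half(uint64_t x)
{
    return  (x & 0x000000ff)
         | ((x & 0x0000ff00) << 8)
         | ((x & 0x00ff0000) << 16)
         | ((x & 0xff000000) << 24);
}

/* From the patch: four 8x8->16 carry-less multiplies, one per lane. */
static uint64_t pmull_h(uint64_t op1, uint64_t op2)
{
    uint64_t result = 0;
    for (int i = 0; i < 8; ++i) {
        /* 0xffff in every lane whose current low bit of op1 is set. */
        uint64_t mask = (op1 & 0x0001000100010001ull) * 0xffff;
        result ^= op2 & mask;
        op1 >>= 1;
        op2 <<= 1;
    }
    return result;
}

/* Hypothetical check: every lane must agree with the scalar reference. */
static void pmull_h_selfcheck(uint32_t n, uint32_t m)
{
    uint64_t r = pmull_h(expand_byte_to_half(n), expand_byte_to_half(m));
    for (int lane = 0; lane < 4; lane++) {
        uint8_t a = (uint8_t)(n >> (8 * lane));
        uint8_t b = (uint8_t)(m >> (8 * lane));
        assert((uint16_t)(r >> (16 * lane)) == pmul8_ref(a, b));
    }
}
```

The lanes stay independent because each input byte sits in the low half of
its 16-bit lane: shifting op2 left by at most seven bits never reaches the
next lane, and the mask only ever tests bit 0 of each lane of op1.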

-New patch
+[PULL 42/52] xilinx_spips: Correct the number of dummy cycles for the FAST_READ_4 cmd
+From: Francisco Iglesias <francisco.iglesias@xilinx.com>
+Correct the number of dummy cycles required by the FAST_READ_4 command (to
+be eight, one dummy byte).
+Fixes: ef06ca3946 ("xilinx_spips: Add support for RX discard and RX drain")
+Suggested-by: Cédric Le Goater <clg@kaod.org>
+Signed-off-by: Francisco Iglesias <frasse.iglesias@gmail.com>
+Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
+Message-id: 20200218113350.6090-1-frasse.iglesias@gmail.com
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+---
+ hw/ssi/xilinx_spips.c | 2 +-
+file changed, 1 insertion(+), 1 deletion(-)
+diff --git a/hw/ssi/xilinx_spips.c b/hw/ssi/xilinx_spips.c
+index XXXXXXX..XXXXXXX 100644
+--- a/hw/ssi/xilinx_spips.c
++++ b/hw/ssi/xilinx_spips.c
+@@ -XXX,XX +XXX,XX @@ static int xilinx_spips_num_dummies(XilinxQSPIPS *qs, uint8_t command)
+     case FAST_READ:
+     case DOR:
+     case QOR:
++    case FAST_READ_4:
+     case DOR_4:
+     case QOR_4:
+         return 1;
+     case DIOR:
+-    case FAST_READ_4:
+     case DIOR_4:
+         return 2;
+     case QIOR:
+--
+.20.1

-[Qemu-devel] [PULL 03/29] Set ENET_BD_BDU in I.MX FEC controller
+[PULL 43/52] sh4: Fix PCI ISA IO memory subregion
-From: Aaron Hill <aa1ronham@gmail.com>
+From: Guenter Roeck <linux@roeck-us.net>
-This commit properly sets the ENET_BD_BDU flag once the emulated FEC controller
+Booting the r2d machine from flash fails because flash is not discovered.
-has finished processing the last descriptor. This is done for both transmit
+Looking at the flattened memory tree, we see the following.
 and receive descriptors.
-This allows the QNX 7.0.0 BSP for the Sabrelite board (which can be
+FlatView #1
-found at http://blackberry.qnx.com/en/developers/bsp) to properly
+ AS "memory", root: system
-control the FEC. Without this patch, the BSP ethernet driver will never
+ AS "cpu-memory-0", root: system
-re-use FEC descriptors, as the unset ENET_BD_BDU flag will cause
+ AS "sh_pci_host", root: bus master container
-it to believe that the descriptors are still in use by the NIC.
+ Root memory region: system
   0000000000000000-000000000000ffff (prio 0, i/o): io
   0000000000010000-0000000000ffffff (prio 0, i/o): r2d.flash @0000000000010000
-Note that Linux does not appear to use this field at all, and is
+The overlapping memory region is sh_pci.isa, ie the ISA I/O region bridge.
-unaffected by this patch.
+This region is initially assigned to address 0xfe240000, but overwritten
 with a write into the PCIIOBR register. This write is expected to adjust
 the PCI memory window, but not to change the region's base address.
-Without this patch, QNX will think that the NIC is still processing its
+Peter Maydell provided the following detailed explanation.
 transaction descriptors, and won't send any more data over the network.
-For reference:
+"Section 22.3.7 and in particular figure 22.3 (of "SH7751R user's manual:
 hardware") are clear about how this is supposed to work: there is a window
 at 0xfe240000 in the system register space for PCI I/O space. When the CPU
 makes an access into that area, the PCI controller calculates the PCI
 address to use by combining bits 0..17 of the system address with the
 bits 31..18 value that the guest has put into the PCIIOBR. That is, writing
 to the PCIIOBR changes which section of the IO address space is visible in
 the 0xfe240000 window. Instead what QEMU's implementation does is move the
 window to whatever value the guest writes to the PCIIOBR register -- so if
 the guest writes 0 we put the window at 0 in system address space."
-On page 1192 of the I.MX 6DQ reference manual revision (Rev. 5, 06/2018),
+Fix the problem by calling memory_region_set_alias_offset() instead of
-which can be found at https://www.nxp.com/products/processors-and-microcontrollers/arm-based-processors-and-mcus/i.mx-applications-processors/i.mx-6-processors/i.mx-6quad-processors-high-performance-3d-graphics-hd-video-arm-cortex-a9-core:i.MX6Q?&tab=Documentation_Tab&linkline=Application-Note
+removing and re-adding the PCI ISA subregion on writes into PCIIOBR.
 At the same time, in sh_pci_device_realize(), don't set iobr since
 it is overwritten later anyway. Instead, pass the base address to
 memory_region_add_subregion() directly.
-the 'BDU' field is described as follows for the 'Enhanced transmit
+Many thanks to Peter Maydell for the detailed problem analysis, and for
-buffer descriptor':
+providing suggestions on how to fix the problem.
-'Last buffer descriptor update done. Indicates that the last BD data has been updated by
+Signed-off-by: Guenter Roeck <linux@roeck-us.net>
-uDMA. This field is written by the user (=0) and uDMA (=1).'
+Message-id: 20200218201050.15273-1-linux@roeck-us.net
 The same description is used for the receive buffer descriptor.
 Signed-off-by: Aaron Hill <aa1ronham@gmail.com>
 Message-id: 20190805142417.10433-1-aaron.hill@alertinnovation.com
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- hw/net/imx_fec.c | 4 ++++
+ hw/sh4/sh_pci.c | 11 +++--------
-file changed, 4 insertions(+)
+file changed, 3 insertions(+), 8 deletions(-)
-diff --git a/hw/net/imx_fec.c b/hw/net/imx_fec.c
+diff --git a/hw/sh4/sh_pci.c b/hw/sh4/sh_pci.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/net/imx_fec.c
+--- a/hw/sh4/sh_pci.c
-+++ b/hw/net/imx_fec.c
++++ b/hw/sh4/sh_pci.c
-@@ -XXX,XX +XXX,XX @@ static void imx_enet_do_tx(IMXFECState *s, uint32_t index)
+@@ -XXX,XX +XXX,XX @@ static void sh_pci_reg_write (void *p, hwaddr addr, uint64_t val,
-             if (bd.option & ENET_BD_TX_INT) {
+         pcic->mbr = val & 0xff000001;
-                 s->regs[ENET_EIR] |= int_txf;
+         break;
-             }
+     case 0x1c8:
-+            /* Indicate that we've updated the last buffer descriptor. */
+-        if ((val & 0xfffc0000) != (pcic->iobr & 0xfffc0000)) {
-+            bd.last_buffer = ENET_BD_BDU;
+-            memory_region_del_subregion(get_system_memory(), &pcic->isa);
-         }
+-            pcic->iobr = val & 0xfffc0001;
-         if (bd.option & ENET_BD_TX_INT) {
+-            memory_region_add_subregion(get_system_memory(),
-             s->regs[ENET_EIR] |= int_txb;
+-                                        pcic->iobr & 0xfffc0000, &pcic->isa);
-@@ -XXX,XX +XXX,XX @@ static ssize_t imx_enet_receive(NetClientState *nc, const uint8_t *buf,
+-        }
-             /* Last buffer in frame.  */
++        pcic->iobr = val & 0xfffc0001;
-             bd.flags |= flags | ENET_BD_L;
++        memory_region_set_alias_offset(&pcic->isa, val & 0xfffc0000);
-             FEC_PRINTF("rx frame flags %04x\n", bd.flags);
+         break;
-+            /* Indicate that we've updated the last buffer descriptor. */
+     case 0x220:
-+            bd.last_buffer = ENET_BD_BDU;
+         pci_data_write(phb->bus, pcic->par, val, 4);
-             if (bd.option & ENET_BD_RX_INT) {
+@@ -XXX,XX +XXX,XX @@ static void sh_pci_device_realize(DeviceState *dev, Error **errp)
-                 s->regs[ENET_EIR] |= ENET_INT_RXF;
+                              get_system_io(), 0, 0x40000);
-             }
+     sysbus_init_mmio(sbd, &s->memconfig_p4);
      sysbus_init_mmio(sbd, &s->memconfig_a7);
 -    s->iobr = 0xfe240000;
 -    memory_region_add_subregion(get_system_memory(), s->iobr, &s->isa);
 +    memory_region_add_subregion(get_system_memory(), 0xfe240000, &s->isa);
      s->dev = pci_create_simple(phb->bus, PCI_DEVFN(0, 0), "sh_pci_host");
  }
 --
 .20.1
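
The address translation described in the commit message above can be
sketched as follows. This is an illustration under stated assumptions, not
QEMU code: pci_io_address() and the two macros are hypothetical names, the
window base 0xfe240000 and the 0xfffc0000 mask come from the patch, and the
0x40000 window size is assumed from the alias size in the diff context.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; the values are taken from the patch text. */
#define PCI_IO_WINDOW_BASE 0xfe240000u  /* fixed window in system space */
#define PCI_IO_WINDOW_SIZE 0x00040000u  /* 256 KiB, i.e. bits 0..17 */

/*
 * Combine bits 0..17 of the system address with bits 31..18 of PCIIOBR.
 * The window itself never moves; PCIIOBR only selects which part of PCI
 * I/O space is visible through it.
 */
static uint32_t pci_io_address(uint32_t sys_addr, uint32_t pciiobr)
{
    assert(sys_addr - PCI_IO_WINDOW_BASE < PCI_IO_WINDOW_SIZE);
    return (pciiobr & 0xfffc0000u) | (sys_addr & 0x0003ffffu);
}
```

With PCIIOBR set to 0, an access at system address 0xfe240010 reaches PCI
I/O address 0x10, and the window stays at 0xfe240000 instead of landing on
top of the flash region at address 0.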

-[Qemu-devel] [PULL 12/29] target/arm: Replace offset with pc in gen_exception_insn
+[PULL 44/52] target/arm: Rename isar_feature_aa32_simd_r32
 From: Richard Henderson <richard.henderson@linaro.org>
-The offset is variable depending on the instruction set, whereas
+The old name, isar_feature_aa32_fp_d32, does not reflect
-we have stored values for the current pc and the next pc.  Passing
+the MVFR0 field name, SIMDReg.
 in the actual value is clearer in intent.
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+Message-id: 20200214181547.21408-3-richard.henderson@linaro.org
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+[PMM: wrapped one long line]
 Message-id: 20190807045335.1361-8-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate-a64.c     | 25 ++++++++++++++-----------
+ target/arm/cpu.h               |  2 +-
- target/arm/translate-vfp.inc.c |  6 +++---
+ target/arm/translate-vfp.inc.c | 53 +++++++++++++++++-----------------
- target/arm/translate.c         | 31 ++++++++++++++++---------------
+files changed, 28 insertions(+), 27 deletions(-)
-files changed, 33 insertions(+), 29 deletions(-)
+diff --git a/target/arm/cpu.h b/target/arm/cpu.h
 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-a64.c
+--- a/target/arm/cpu.h
-+++ b/target/arm/translate-a64.c
++++ b/target/arm/cpu.h
-@@ -XXX,XX +XXX,XX @@ static void gen_exception_internal_insn(DisasContext *s, int offset, int excp)
+@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa32_fp16_arith(const ARMISARegisters *id)
-     s->base.is_jmp = DISAS_NORETURN;
+     return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, FP) == 1;
  }
--static void gen_exception_insn(DisasContext *s, int offset, int excp,
+-static inline bool isar_feature_aa32_fp_d32(const ARMISARegisters *id)
-+static void gen_exception_insn(DisasContext *s, uint64_t pc, int excp,
++static inline bool isar_feature_aa32_simd_r32(const ARMISARegisters *id)
                                 uint32_t syndrome, uint32_t target_el)
  {
--    gen_a64_set_pc_im(s->base.pc_next - offset);
+     /* Return true if D16-D31 are implemented */
-+    gen_a64_set_pc_im(pc);
+     return FIELD_EX32(id->mvfr0, MVFR0, SIMDREG) >= 2;
      gen_exception(excp, syndrome, target_el);
      s->base.is_jmp = DISAS_NORETURN;
  }
@@ -XXX,XX +XXX,XX @@ static inline void gen_goto_tb(DisasContext *s, int n, uint64_t dest)
  void unallocated_encoding(DisasContext *s)
  {
      /* Unallocated and reserved encodings are uncategorized */
 -    gen_exception_insn(s, 4, EXCP_UDEF, syn_uncategorized(),
 +    gen_exception_insn(s, s->pc_curr, EXCP_UDEF, syn_uncategorized(),
                         default_exception_el(s));
  }
@@ -XXX,XX +XXX,XX @@ static inline bool fp_access_check(DisasContext *s)
          return true;
      }
 -    gen_exception_insn(s, 4, EXCP_UDEF, syn_fp_access_trap(1, 0xe, false),
 -                       s->fp_excp_el);
 +    gen_exception_insn(s, s->pc_curr, EXCP_UDEF,
 +                       syn_fp_access_trap(1, 0xe, false), s->fp_excp_el);
      return false;
  }
@@ -XXX,XX +XXX,XX @@ static inline bool fp_access_check(DisasContext *s)
  bool sve_access_check(DisasContext *s)
  {
      if (s->sve_excp_el) {
 -        gen_exception_insn(s, 4, EXCP_UDEF, syn_sve_access_trap(),
 +        gen_exception_insn(s, s->pc_curr, EXCP_UDEF, syn_sve_access_trap(),
                             s->sve_excp_el);
          return false;
      }
@@ -XXX,XX +XXX,XX @@ static void disas_exc(DisasContext *s, uint32_t insn)
          switch (op2_ll) {
          case 1:                                                     /* SVC */
              gen_ss_advance(s);
 -            gen_exception_insn(s, 0, EXCP_SWI, syn_aa64_svc(imm16),
 -                               default_exception_el(s));
 +            gen_exception_insn(s, s->base.pc_next, EXCP_SWI,
 +                               syn_aa64_svc(imm16), default_exception_el(s));
              break;
          case 2:                                                     /* HVC */
              if (s->current_el == 0) {
@@ -XXX,XX +XXX,XX @@ static void disas_exc(DisasContext *s, uint32_t insn)
              gen_a64_set_pc_im(s->pc_curr);
              gen_helper_pre_hvc(cpu_env);
              gen_ss_advance(s);
 -            gen_exception_insn(s, 0, EXCP_HVC, syn_aa64_hvc(imm16), 2);
 +            gen_exception_insn(s, s->base.pc_next, EXCP_HVC,
 +                               syn_aa64_hvc(imm16), 2);
              break;
          case 3:                                                     /* SMC */
              if (s->current_el == 0) {
@@ -XXX,XX +XXX,XX @@ static void disas_exc(DisasContext *s, uint32_t insn)
              gen_helper_pre_smc(cpu_env, tmp);
              tcg_temp_free_i32(tmp);
              gen_ss_advance(s);
 -            gen_exception_insn(s, 0, EXCP_SMC, syn_aa64_smc(imm16), 3);
 +            gen_exception_insn(s, s->base.pc_next, EXCP_SMC,
 +                               syn_aa64_smc(imm16), 3);
              break;
          default:
              unallocated_encoding(s);
@@ -XXX,XX +XXX,XX @@ static void disas_a64_insn(CPUARMState *env, DisasContext *s)
              if (s->btype != 0
                  && s->guarded_page
                  && !btype_destination_ok(insn, s->bt, s->btype)) {
 -                gen_exception_insn(s, 4, EXCP_UDEF, syn_btitrap(s->btype),
 +                gen_exception_insn(s, s->pc_curr, EXCP_UDEF,
 +                                   syn_btitrap(s->btype),
                                     default_exception_el(s));
                  return;
              }
 diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-vfp.inc.c
 +++ b/target/arm/translate-vfp.inc.c
-@@ -XXX,XX +XXX,XX @@ static bool full_vfp_access_check(DisasContext *s, bool ignore_vfp_enabled)
+@@ -XXX,XX +XXX,XX @@ static bool trans_VSEL(DisasContext *s, arg_VSEL *a)
- {
+     }
-     if (s->fp_excp_el) {
-         if (arm_dc_feature(s, ARM_FEATURE_M)) {
+     /* UNDEF accesses to D16-D31 if they don't exist */
--            gen_exception_insn(s, 4, EXCP_NOCP, syn_uncategorized(),
+-    if (dp && !dc_isar_feature(aa32_fp_d32, s) &&
-+            gen_exception_insn(s, s->pc_curr, EXCP_NOCP, syn_uncategorized(),
++    if (dp && !dc_isar_feature(aa32_simd_r32, s) &&
-                                s->fp_excp_el);
+         ((a->vm | a->vn | a->vd) & 0x10)) {
-         } else {
+         return false;
--            gen_exception_insn(s, 4, EXCP_UDEF,
+     }
-+            gen_exception_insn(s, s->pc_curr, EXCP_UDEF,
+@@ -XXX,XX +XXX,XX @@ static bool trans_VMINMAXNM(DisasContext *s, arg_VMINMAXNM *a)
-                                syn_fp_access_trap(1, 0xe, false),
+     }
-                                s->fp_excp_el);
-         }
+     /* UNDEF accesses to D16-D31 if they don't exist */
-@@ -XXX,XX +XXX,XX @@ static bool full_vfp_access_check(DisasContext *s, bool ignore_vfp_enabled)
+-    if (dp && !dc_isar_feature(aa32_fp_d32, s) &&
++    if (dp && !dc_isar_feature(aa32_simd_r32, s) &&
-     if (!s->vfp_enabled && !ignore_vfp_enabled) {
+         ((a->vm | a->vn | a->vd) & 0x10)) {
-         assert(!arm_dc_feature(s, ARM_FEATURE_M));
+         return false;
--        gen_exception_insn(s, 4, EXCP_UDEF, syn_uncategorized(),
+     }
-+        gen_exception_insn(s, s->pc_curr, EXCP_UDEF, syn_uncategorized(),
+@@ -XXX,XX +XXX,XX @@ static bool trans_VRINT(DisasContext *s, arg_VRINT *a)
-                            default_exception_el(s));
+     }
-         return false;
-     }
+     /* UNDEF accesses to D16-D31 if they don't exist */
-diff --git a/target/arm/translate.c b/target/arm/translate.c
+-    if (dp && !dc_isar_feature(aa32_fp_d32, s) &&
-index XXXXXXX..XXXXXXX 100644
++    if (dp && !dc_isar_feature(aa32_simd_r32, s) &&
---- a/target/arm/translate.c
+         ((a->vm | a->vd) & 0x10)) {
-+++ b/target/arm/translate.c
+         return false;
-@@ -XXX,XX +XXX,XX @@ static void gen_exception_internal_insn(DisasContext *s, int offset, int excp)
+     }
-     s->base.is_jmp = DISAS_NORETURN;
+@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT(DisasContext *s, arg_VCVT *a)
- }
+     }
--static void gen_exception_insn(DisasContext *s, int offset, int excp,
+     /* UNDEF accesses to D16-D31 if they don't exist */
-+static void gen_exception_insn(DisasContext *s, uint32_t pc, int excp,
+-    if (dp && !dc_isar_feature(aa32_fp_d32, s) && (a->vm & 0x10)) {
-                                int syn, uint32_t target_el)
++    if (dp && !dc_isar_feature(aa32_simd_r32, s) && (a->vm & 0x10)) {
- {
+         return false;
-     gen_set_condexec(s);
+     }
--    gen_set_pc_im(s, s->base.pc_next - offset);
-+    gen_set_pc_im(s, pc);
+@@ -XXX,XX +XXX,XX @@ static bool trans_VMOV_to_gp(DisasContext *s, arg_VMOV_to_gp *a)
-     gen_exception(excp, syn, target_el);
+     uint32_t offset;
-     s->base.is_jmp = DISAS_NORETURN;
- }
+     /* UNDEF accesses to D16-D31 if they don't exist */
-@@ -XXX,XX +XXX,XX @@ static inline void gen_hlt(DisasContext *s, int imm)
+-    if (!dc_isar_feature(aa32_fp_d32, s) && (a->vn & 0x10)) {
-         return;
++    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vn & 0x10)) {
-     }
+         return false;
+     }
--    gen_exception_insn(s, s->thumb ? 2 : 4, EXCP_UDEF, syn_uncategorized(),
-+    gen_exception_insn(s, s->pc_curr, EXCP_UDEF, syn_uncategorized(),
+@@ -XXX,XX +XXX,XX @@ static bool trans_VMOV_from_gp(DisasContext *s, arg_VMOV_from_gp *a)
-                        default_exception_el(s));
+     uint32_t offset;
- }
+     /* UNDEF accesses to D16-D31 if they don't exist */
-@@ -XXX,XX +XXX,XX @@ static bool msr_banked_access_decode(DisasContext *s, int r, int sysm, int rn,
+-    if (!dc_isar_feature(aa32_fp_d32, s) && (a->vn & 0x10)) {
++    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vn & 0x10)) {
- undef:
+         return false;
-     /* If we get here then some access check did not pass */
+     }
--    gen_exception_insn(s, 4, EXCP_UDEF, syn_uncategorized(), exc_target);
-+    gen_exception_insn(s, s->pc_curr, EXCP_UDEF,
+@@ -XXX,XX +XXX,XX @@ static bool trans_VDUP(DisasContext *s, arg_VDUP *a)
-+                       syn_uncategorized(), exc_target);
+     }
-     return false;
- }
+     /* UNDEF accesses to D16-D31 if they don't exist */
+-    if (!dc_isar_feature(aa32_fp_d32, s) && (a->vn & 0x10)) {
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
++    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vn & 0x10)) {
-      * for attempts to execute invalid vfp/neon encodings with FP disabled.
+         return false;
      }
@@ -XXX,XX +XXX,XX @@ static bool trans_VMOV_64_dp(DisasContext *s, arg_VMOV_64_dp *a)
       */
-     if (s->fp_excp_el) {
--        gen_exception_insn(s, 4, EXCP_UDEF,
+     /* UNDEF accesses to D16-D31 if they don't exist */
-+        gen_exception_insn(s, s->pc_curr, EXCP_UDEF,
+-    if (!dc_isar_feature(aa32_fp_d32, s) && (a->vm & 0x10)) {
-                            syn_simd_access_trap(1, 0xe, false), s->fp_excp_el);
++    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vm & 0x10)) {
-         return 0;
+         return false;
      }
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-      * for attempts to execute invalid vfp/neon encodings with FP disabled.
+@@ -XXX,XX +XXX,XX @@ static bool trans_VLDR_VSTR_dp(DisasContext *s, arg_VLDR_VSTR_dp *a)
-      */
+     TCGv_i64 tmp;
-     if (s->fp_excp_el) {
--        gen_exception_insn(s, 4, EXCP_UDEF,
+     /* UNDEF accesses to D16-D31 if they don't exist */
-+        gen_exception_insn(s, s->pc_curr, EXCP_UDEF,
+-    if (!dc_isar_feature(aa32_fp_d32, s) && (a->vd & 0x10)) {
-                            syn_simd_access_trap(1, 0xe, false), s->fp_excp_el);
++    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vd & 0x10)) {
-         return 0;
+         return false;
      }
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
-     }
+@@ -XXX,XX +XXX,XX @@ static bool trans_VLDM_VSTM_dp(DisasContext *s, arg_VLDM_VSTM_dp *a)
+     }
-     if (s->fp_excp_el) {
--        gen_exception_insn(s, 4, EXCP_UDEF,
+     /* UNDEF accesses to D16-D31 if they don't exist */
-+        gen_exception_insn(s, s->pc_curr, EXCP_UDEF,
+-    if (!dc_isar_feature(aa32_fp_d32, s) && (a->vd + n) > 16) {
-                            syn_simd_access_trap(1, 0xe, false), s->fp_excp_el);
++    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vd + n) > 16) {
-         return 0;
+         return false;
      }
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
-         off_rm = vfp_reg_offset(0, rm);
+@@ -XXX,XX +XXX,XX @@ static bool do_vfp_3op_dp(DisasContext *s, VFPGen3OpDPFn *fn,
-     }
+     TCGv_ptr fpst;
-     if (s->fp_excp_el) {
--        gen_exception_insn(s, 4, EXCP_UDEF,
+     /* UNDEF accesses to D16-D31 if they don't exist */
-+        gen_exception_insn(s, s->pc_curr, EXCP_UDEF,
+-    if (!dc_isar_feature(aa32_fp_d32, s) && ((vd | vn | vm) & 0x10)) {
-                            syn_simd_access_trap(1, 0xe, false), s->fp_excp_el);
++    if (!dc_isar_feature(aa32_simd_r32, s) && ((vd | vn | vm) & 0x10)) {
-         return 0;
+         return false;
      }
-@@ -XXX,XX +XXX,XX @@ static void gen_srs(DisasContext *s,
-      * For the UNPREDICTABLE cases we choose to UNDEF.
+@@ -XXX,XX +XXX,XX @@ static bool do_vfp_2op_dp(DisasContext *s, VFPGen2OpDPFn *fn, int vd, int vm)
-      */
+     TCGv_i64 f0, fd;
-     if (s->current_el == 1 && !s->ns && mode == ARM_CPU_MODE_MON) {
--        gen_exception_insn(s, 4, EXCP_UDEF, syn_uncategorized(), 3);
+     /* UNDEF accesses to D16-D31 if they don't exist */
-+        gen_exception_insn(s, s->pc_curr, EXCP_UDEF, syn_uncategorized(), 3);
+-    if (!dc_isar_feature(aa32_fp_d32, s) && ((vd | vm) & 0x10)) {
-         return;
++    if (!dc_isar_feature(aa32_simd_r32, s) && ((vd | vm) & 0x10)) {
-     }
+         return false;
+     }
-@@ -XXX,XX +XXX,XX @@ static void gen_srs(DisasContext *s,
-     }
+@@ -XXX,XX +XXX,XX @@ static bool trans_VFM_dp(DisasContext *s, arg_VFM_dp *a)
+     }
-     if (undef) {
--        gen_exception_insn(s, 4, EXCP_UDEF, syn_uncategorized(),
+     /* UNDEF accesses to D16-D31 if they don't exist. */
-+        gen_exception_insn(s, s->pc_curr, EXCP_UDEF, syn_uncategorized(),
+-    if (!dc_isar_feature(aa32_fp_d32, s) && ((a->vd | a->vn | a->vm) & 0x10)) {
-                            default_exception_el(s));
++    if (!dc_isar_feature(aa32_simd_r32, s) &&
-         return;
++        ((a->vd | a->vn | a->vm) & 0x10)) {
-     }
+         return false;
-@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
+     }
-      * UsageFault exception.
-      */
+@@ -XXX,XX +XXX,XX @@ static bool trans_VMOV_imm_dp(DisasContext *s, arg_VMOV_imm_dp *a)
-     if (arm_dc_feature(s, ARM_FEATURE_M)) {
+     vd = a->vd;
--        gen_exception_insn(s, 4, EXCP_INVSTATE, syn_uncategorized(),
-+        gen_exception_insn(s, s->pc_curr, EXCP_INVSTATE, syn_uncategorized(),
+     /* UNDEF accesses to D16-D31 if they don't exist. */
-                            default_exception_el(s));
+-    if (!dc_isar_feature(aa32_fp_d32, s) && (vd & 0x10)) {
-         return;
++    if (!dc_isar_feature(aa32_simd_r32, s) && (vd & 0x10)) {
-     }
+         return false;
-@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
+     }
-             break;
-         default:
+@@ -XXX,XX +XXX,XX @@ static bool trans_VCMP_dp(DisasContext *s, arg_VCMP_dp *a)
-         illegal_op:
+     }
--            gen_exception_insn(s, 4, EXCP_UDEF, syn_uncategorized(),
-+            gen_exception_insn(s, s->pc_curr, EXCP_UDEF, syn_uncategorized(),
+     /* UNDEF accesses to D16-D31 if they don't exist. */
-                                default_exception_el(s));
+-    if (!dc_isar_feature(aa32_fp_d32, s) && ((a->vd | a->vm) & 0x10)) {
-             break;
++    if (!dc_isar_feature(aa32_simd_r32, s) && ((a->vd | a->vm) & 0x10)) {
-         }
+         return false;
-@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
+     }
-             }
+@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_f64_f16(DisasContext *s, arg_VCVT_f64_f16 *a)
-             /* All other insns: NOCP */
+     }
--            gen_exception_insn(s, 4, EXCP_NOCP, syn_uncategorized(),
-+            gen_exception_insn(s, s->pc_curr, EXCP_NOCP, syn_uncategorized(),
+     /* UNDEF accesses to D16-D31 if they don't exist. */
-                                default_exception_el(s));
+-    if (!dc_isar_feature(aa32_fp_d32, s) && (a->vd  & 0x10)) {
-             break;
++    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vd  & 0x10)) {
-         }
+         return false;
-@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
+     }
-     }
-     return;
+@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_f16_f64(DisasContext *s, arg_VCVT_f16_f64 *a)
- illegal_op:
+     }
--    gen_exception_insn(s, 4, EXCP_UDEF, syn_uncategorized(),
-+    gen_exception_insn(s, s->pc_curr, EXCP_UDEF, syn_uncategorized(),
+     /* UNDEF accesses to D16-D31 if they don't exist. */
-                        default_exception_el(s));
+-    if (!dc_isar_feature(aa32_fp_d32, s) && (a->vm  & 0x10)) {
- }
++    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vm  & 0x10)) {
+         return false;
-@@ -XXX,XX +XXX,XX @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
+     }
-     return;
- illegal_op:
+@@ -XXX,XX +XXX,XX @@ static bool trans_VRINTR_dp(DisasContext *s, arg_VRINTR_dp *a)
- undef:
+     }
--    gen_exception_insn(s, 2, EXCP_UDEF, syn_uncategorized(),
-+    gen_exception_insn(s, s->pc_curr, EXCP_UDEF, syn_uncategorized(),
+     /* UNDEF accesses to D16-D31 if they don't exist. */
-                        default_exception_el(s));
+-    if (!dc_isar_feature(aa32_fp_d32, s) && ((a->vd | a->vm) & 0x10)) {
- }
++    if (!dc_isar_feature(aa32_simd_r32, s) && ((a->vd | a->vm) & 0x10)) {
          return false;
      }
@@ -XXX,XX +XXX,XX @@ static bool trans_VRINTZ_dp(DisasContext *s, arg_VRINTZ_dp *a)
      }
      /* UNDEF accesses to D16-D31 if they don't exist. */
 -    if (!dc_isar_feature(aa32_fp_d32, s) && ((a->vd | a->vm) & 0x10)) {
 +    if (!dc_isar_feature(aa32_simd_r32, s) && ((a->vd | a->vm) & 0x10)) {
          return false;
      }
@@ -XXX,XX +XXX,XX @@ static bool trans_VRINTX_dp(DisasContext *s, arg_VRINTX_dp *a)
      }
      /* UNDEF accesses to D16-D31 if they don't exist. */
 -    if (!dc_isar_feature(aa32_fp_d32, s) && ((a->vd | a->vm) & 0x10)) {
 +    if (!dc_isar_feature(aa32_simd_r32, s) && ((a->vd | a->vm) & 0x10)) {
          return false;
      }
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_sp(DisasContext *s, arg_VCVT_sp *a)
      TCGv_i32 vm;
      /* UNDEF accesses to D16-D31 if they don't exist. */
 -    if (!dc_isar_feature(aa32_fp_d32, s) && (a->vd & 0x10)) {
 +    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vd & 0x10)) {
          return false;
      }
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_dp(DisasContext *s, arg_VCVT_dp *a)
      TCGv_i32 vd;
      /* UNDEF accesses to D16-D31 if they don't exist. */
 -    if (!dc_isar_feature(aa32_fp_d32, s) && (a->vm & 0x10)) {
 +    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vm & 0x10)) {
          return false;
      }
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_int_dp(DisasContext *s, arg_VCVT_int_dp *a)
      TCGv_ptr fpst;
      /* UNDEF accesses to D16-D31 if they don't exist. */
 -    if (!dc_isar_feature(aa32_fp_d32, s) && (a->vd & 0x10)) {
 +    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vd & 0x10)) {
          return false;
      }
@@ -XXX,XX +XXX,XX @@ static bool trans_VJCVT(DisasContext *s, arg_VJCVT *a)
      }
      /* UNDEF accesses to D16-D31 if they don't exist. */
 -    if (!dc_isar_feature(aa32_fp_d32, s) && (a->vm & 0x10)) {
 +    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vm & 0x10)) {
          return false;
      }
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_fix_dp(DisasContext *s, arg_VCVT_fix_dp *a)
      }
      /* UNDEF accesses to D16-D31 if they don't exist. */
 -    if (!dc_isar_feature(aa32_fp_d32, s) && (a->vd & 0x10)) {
 +    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vd & 0x10)) {
          return false;
      }
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_dp_int(DisasContext *s, arg_VCVT_dp_int *a)
      TCGv_ptr fpst;
      /* UNDEF accesses to D16-D31 if they don't exist. */
 -    if (!dc_isar_feature(aa32_fp_d32, s) && (a->vm & 0x10)) {
 +    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vm & 0x10)) {
          return false;
      }
 --
 .20.1
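
The guards renamed in the hunks above all have the same shape: OR the 5-bit D-register indices together and test bit 4. A minimal standalone sketch of that check follows. The helper names `uses_upper_dregs` and `dreg_access_ok` are hypothetical and do not exist in QEMU; the `have_d16_d31` flag stands in for the result of `dc_isar_feature(aa32_simd_r32, s)`.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not QEMU code): true if any of the three 5-bit
 * D-register indices names D16-D31, i.e. has bit 4 set.
 */
static bool uses_upper_dregs(unsigned vd, unsigned vn, unsigned vm)
{
    assert(vd < 32 && vn < 32 && vm < 32);
    /* OR-ing first lets a single mask test cover all operands. */
    return ((vd | vn | vm) & 0x10) != 0;
}

/*
 * Same shape as the guard in the hunks above: the access is refused
 * when the upper bank is named but not implemented. In the patch the
 * refusal is "return false" from the trans_* function.
 */
static bool dreg_access_ok(bool have_d16_d31, unsigned vd, unsigned vn,
                           unsigned vm)
{
    return have_d16_d31 || !uses_upper_dregs(vd, vn, vm);
}
```

For two-operand forms such as `(a->vd | a->vm) & 0x10`, passing 0 for the unused index gives the same result.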

-[Qemu-devel] [PULL 25/29] target/arm: Remove redundant shift tests
+[PULL 45/52] target/arm: Use isar_feature_aa32_simd_r32 more places
 From: Richard Henderson <richard.henderson@linaro.org>
-The immediate shift generator functions already test for,
+Many uses of ARM_FEATURE_VFP3 are testing for the number of simd
-and eliminate, the case of a shift by zero.
+registers implemented.  Use the proper test vs MVFR0.SIMDReg.
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190808202616.13782-4-richard.henderson@linaro.org
+Message-id: 20200214181547.21408-4-richard.henderson@linaro.org
 [PMM: fix typo in commit message]
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate.c | 19 +++++++------------
+ target/arm/cpu.c       |  9 ++++-----
-file changed, 7 insertions(+), 12 deletions(-)
+ target/arm/helper.c    | 13 ++++++-------
  target/arm/translate.c |  2 +-
 files changed, 11 insertions(+), 13 deletions(-)
+diff --git a/target/arm/cpu.c b/target/arm/cpu.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/cpu.c
++++ b/target/arm/cpu.c
+@@ -XXX,XX +XXX,XX @@ static void arm_cpu_dump_state(CPUState *cs, FILE *f, int flags)
+     if (flags & CPU_DUMP_FPU) {
+         int numvfpregs = 0;
+-        if (arm_feature(env, ARM_FEATURE_VFP)) {
+-            numvfpregs += 16;
+-        }
+-        if (arm_feature(env, ARM_FEATURE_VFP3)) {
+-            numvfpregs += 16;
++        if (cpu_isar_feature(aa32_simd_r32, cpu)) {
++            numvfpregs = 32;
++        } else if (arm_feature(env, ARM_FEATURE_VFP)) {
++            numvfpregs = 16;
+         }
+         for (i = 0; i < numvfpregs; i++) {
+             uint64_t v = *aa32_vfp_dreg(env, i);
+diff --git a/target/arm/helper.c b/target/arm/helper.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/helper.c
++++ b/target/arm/helper.c
+@@ -XXX,XX +XXX,XX @@ static void switch_mode(CPUARMState *env, int mode);
+ static int vfp_gdb_get_reg(CPUARMState *env, uint8_t *buf, int reg)
+ {
+-    int nregs;
++    ARMCPU *cpu = env_archcpu(env);
++    int nregs = cpu_isar_feature(aa32_simd_r32, cpu) ? 32 : 16;
+     /* VFP data registers are always little-endian.  */
+-    nregs = arm_feature(env, ARM_FEATURE_VFP3) ? 32 : 16;
+     if (reg < nregs) {
+         stq_le_p(buf, *aa32_vfp_dreg(env, reg));
+         return 8;
+@@ -XXX,XX +XXX,XX @@ static int vfp_gdb_get_reg(CPUARMState *env, uint8_t *buf, int reg)
+ static int vfp_gdb_set_reg(CPUARMState *env, uint8_t *buf, int reg)
+ {
+-    int nregs;
++    ARMCPU *cpu = env_archcpu(env);
++    int nregs = cpu_isar_feature(aa32_simd_r32, cpu) ? 32 : 16;
+-    nregs = arm_feature(env, ARM_FEATURE_VFP3) ? 32 : 16;
+     if (reg < nregs) {
+         *aa32_vfp_dreg(env, reg) = ldq_le_p(buf);
+         return 8;
+@@ -XXX,XX +XXX,XX @@ static void cpacr_write(CPUARMState *env, const ARMCPRegInfo *ri,
+             /* VFPv3 and upwards with NEON implement 32 double precision
+              * registers (D0-D31).
+              */
+-            if (!arm_feature(env, ARM_FEATURE_NEON) ||
+-                    !arm_feature(env, ARM_FEATURE_VFP3)) {
++            if (!cpu_isar_feature(aa32_simd_r32, env_archcpu(env))) {
+                 /* D32DIS [30] is RAO/WI if D16-31 are not implemented. */
+                 value |= (1 << 30);
+             }
+@@ -XXX,XX +XXX,XX @@ void arm_cpu_register_gdb_regs_for_features(ARMCPU *cpu)
+     } else if (arm_feature(env, ARM_FEATURE_NEON)) {
+         gdb_register_coprocessor(cs, vfp_gdb_get_reg, vfp_gdb_set_reg,
+, "arm-neon.xml", 0);
+-    } else if (arm_feature(env, ARM_FEATURE_VFP3)) {
++    } else if (cpu_isar_feature(aa32_simd_r32, cpu)) {
+         gdb_register_coprocessor(cs, vfp_gdb_get_reg, vfp_gdb_set_reg,
+, "arm-vfp3.xml", 0);
+     } else if (arm_feature(env, ARM_FEATURE_VFP)) {
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
+@@ -XXX,XX +XXX,XX @@ static int disas_dsp_insn(DisasContext *s, uint32_t insn)
-                         shift = (insn >> 10) & 3;
+ #define VFP_SREG(insn, bigbit, smallbit) \
-                         /* ??? In many cases it's not necessary to do a
+   ((VFP_REG_SHR(insn, bigbit - 1) & 0x1e) | (((insn) >> (smallbit)) & 1))
-                            rotate, a shift is sufficient.  */
+ #define VFP_DREG(reg, insn, bigbit, smallbit) do { \
--                        if (shift != 0)
+-    if (arm_dc_feature(s, ARM_FEATURE_VFP3)) { \
--                            tcg_gen_rotri_i32(tmp, tmp, shift * 8);
++    if (dc_isar_feature(aa32_simd_r32, s)) { \
-+                        tcg_gen_rotri_i32(tmp, tmp, shift * 8);
+         reg = (((insn) >> (bigbit)) & 0x0f) \
-                         op1 = (insn >> 20) & 7;
+               | (((insn) >> ((smallbit) - 4)) & 0x10); \
-                         switch (op1) {
+     } else { \
                          case 0: gen_sxtb16(tmp);  break;
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
              shift = (insn >> 4) & 3;
              /* ??? In many cases it's not necessary to do a
                 rotate, a shift is sufficient.  */
 -            if (shift != 0)
 -                tcg_gen_rotri_i32(tmp, tmp, shift * 8);
 +            tcg_gen_rotri_i32(tmp, tmp, shift * 8);
              op = (insn >> 20) & 7;
              switch (op) {
              case 0: gen_sxth(tmp);   break;
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
                      case 7:
                          goto illegal_op;
                      default: /* Saturate.  */
 -                        if (shift) {
 -                            if (op & 1)
 -                                tcg_gen_sari_i32(tmp, tmp, shift);
 -                            else
 -                                tcg_gen_shli_i32(tmp, tmp, shift);
 +                        if (op & 1) {
 +                            tcg_gen_sari_i32(tmp, tmp, shift);
 +                        } else {
 +                            tcg_gen_shli_i32(tmp, tmp, shift);
                          }
                          tmp2 = tcg_const_i32(imm);
                          if (op & 4) {
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
                      goto illegal_op;
                  }
                  tmp = load_reg(s, rm);
 -                if (shift) {
 -                    tcg_gen_shli_i32(tmp, tmp, shift);
 -                }
 +                tcg_gen_shli_i32(tmp, tmp, shift);
                  tcg_gen_add_i32(addr, addr, tmp);
                  tcg_temp_free_i32(tmp);
                  break;
 --
 .20.1
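
As a rough illustration of what testing MVFR0.SIMDReg for the register count means, the sketch below decodes the field into a number of D registers. It is standalone C, not QEMU code: `mvfr0_simdreg` and `simd_dreg_count` are hypothetical names, and the field position (bits [3:0]) and the threshold of 2 for the 32-register case are assumptions taken from the architectural encoding rather than from this patch, which only shows the `> 0` test for D0-D15.

```c
#include <stdint.h>

/* Assumption: MVFR0.SIMDReg occupies bits [3:0] of the register. */
static unsigned mvfr0_simdreg(uint32_t mvfr0)
{
    return mvfr0 & 0xf;
}

/*
 * Hypothetical helper: number of 64-bit D registers implied by
 * MVFR0.SIMDReg. 0 means no register bank, a nonzero value means
 * D0-D15 exist, and a value of 2 or more (assumed) adds D16-D31.
 */
static int simd_dreg_count(uint32_t mvfr0)
{
    unsigned field = mvfr0_simdreg(mvfr0);

    if (field >= 2) {
        return 32;
    }
    if (field > 0) {
        return 16;
    }
    return 0;
}
```

The gdb accessors in the patch use the simpler `? 32 : 16` form because they are only registered when a VFP register bank exists, so the zero case does not arise there.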

-[Qemu-devel] [PULL 24/29] target/arm: Use tcg_gen_deposit_i32 for PKHBT, PKHTB
+[PULL 46/52] target/arm: Set MVFR0.FPSP for ARMv5 cpus
 From: Richard Henderson <richard.henderson@linaro.org>
-Use deposit as the composit operation to merge the
+We are going to convert FEATURE tests to ISAR tests,
-bits from the two inputs.
+so FPSP needs to be set for these cpus, like we have
 already for FPDP.
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190808202616.13782-3-richard.henderson@linaro.org
+Message-id: 20200214181547.21408-5-richard.henderson@linaro.org
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate.c | 26 ++++++++++----------------
+ target/arm/cpu.c | 10 ++++++----
-file changed, 10 insertions(+), 16 deletions(-)
+file changed, 6 insertions(+), 4 deletions(-)
-diff --git a/target/arm/translate.c b/target/arm/translate.c
+diff --git a/target/arm/cpu.c b/target/arm/cpu.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.c
+--- a/target/arm/cpu.c
-+++ b/target/arm/translate.c
++++ b/target/arm/cpu.c
-@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
+@@ -XXX,XX +XXX,XX @@ static void arm926_initfn(Object *obj)
-                         shift = (insn >> 7) & 0x1f;
+      */
-                         if (insn & (1 << 6)) {
+     cpu->isar.id_isar1 = FIELD_DP32(cpu->isar.id_isar1, ID_ISAR1, JAZELLE, 1);
-                             /* pkhtb */
+     /*
--                            if (shift == 0)
+-     * Similarly, we need to set MVFR0 fields to enable double precision
-+                            if (shift == 0) {
+-     * and short vector support even though ARMv5 doesn't have this register.
-                                 shift = 31;
++     * Similarly, we need to set MVFR0 fields to enable vfp and short vector
-+                            }
++     * support even though ARMv5 doesn't have this register.
-                             tcg_gen_sari_i32(tmp2, tmp2, shift);
+      */
--                            tcg_gen_andi_i32(tmp, tmp, 0xffff0000);
+     cpu->isar.mvfr0 = FIELD_DP32(cpu->isar.mvfr0, MVFR0, FPSHVEC, 1);
--                            tcg_gen_ext16u_i32(tmp2, tmp2);
++    cpu->isar.mvfr0 = FIELD_DP32(cpu->isar.mvfr0, MVFR0, FPSP, 1);
-+                            tcg_gen_deposit_i32(tmp, tmp, tmp2, 0, 16);
+     cpu->isar.mvfr0 = FIELD_DP32(cpu->isar.mvfr0, MVFR0, FPDP, 1);
-                         } else {
+ }
-                             /* pkhbt */
--                            if (shift)
+@@ -XXX,XX +XXX,XX @@ static void arm1026_initfn(Object *obj)
--                                tcg_gen_shli_i32(tmp2, tmp2, shift);
+      */
--                            tcg_gen_ext16u_i32(tmp, tmp);
+     cpu->isar.id_isar1 = FIELD_DP32(cpu->isar.id_isar1, ID_ISAR1, JAZELLE, 1);
--                            tcg_gen_andi_i32(tmp2, tmp2, 0xffff0000);
+     /*
-+                            tcg_gen_shli_i32(tmp2, tmp2, shift);
+-     * Similarly, we need to set MVFR0 fields to enable double precision
-+                            tcg_gen_deposit_i32(tmp, tmp2, tmp, 0, 16);
+-     * and short vector support even though ARMv5 doesn't have this register.
-                         }
++     * Similarly, we need to set MVFR0 fields to enable vfp and short vector
--                        tcg_gen_or_i32(tmp, tmp, tmp2);
++     * support even though ARMv5 doesn't have this register.
-                         tcg_temp_free_i32(tmp2);
+      */
-                         store_reg(s, rd, tmp);
+     cpu->isar.mvfr0 = FIELD_DP32(cpu->isar.mvfr0, MVFR0, FPSHVEC, 1);
-                     } else if ((insn & 0x00200020) == 0x00200000) {
++    cpu->isar.mvfr0 = FIELD_DP32(cpu->isar.mvfr0, MVFR0, FPSP, 1);
-@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
+     cpu->isar.mvfr0 = FIELD_DP32(cpu->isar.mvfr0, MVFR0, FPDP, 1);
-             shift = ((insn >> 10) & 0x1c) | ((insn >> 6) & 0x3);
-             if (insn & (1 << 5)) {
+     {
                  /* pkhtb */
 -                if (shift == 0)
 +                if (shift == 0) {
                      shift = 31;
 +                }
                  tcg_gen_sari_i32(tmp2, tmp2, shift);
 -                tcg_gen_andi_i32(tmp, tmp, 0xffff0000);
 -                tcg_gen_ext16u_i32(tmp2, tmp2);
 +                tcg_gen_deposit_i32(tmp, tmp, tmp2, 0, 16);
              } else {
                  /* pkhbt */
 -                if (shift)
 -                    tcg_gen_shli_i32(tmp2, tmp2, shift);
 -                tcg_gen_ext16u_i32(tmp, tmp);
 -                tcg_gen_andi_i32(tmp2, tmp2, 0xffff0000);
 +                tcg_gen_shli_i32(tmp2, tmp2, shift);
 +                tcg_gen_deposit_i32(tmp, tmp2, tmp, 0, 16);
              }
 -            tcg_gen_or_i32(tmp, tmp, tmp2);
              tcg_temp_free_i32(tmp2);
              store_reg(s, rd, tmp);
          } else {
 --
 .20.1
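
The three `FIELD_DP32` lines in the cpu.c hunks each deposit the value 1 into a 4-bit MVFR0 field. A minimal sketch of that operation in plain C is shown below. `deposit_field32` and `armv5_vfp_mvfr0` are hypothetical stand-ins, not the QEMU macros, and the field offsets (FPSP at bit 4, FPDP at bit 8, FPSHVEC at bit 24) are assumptions based on the architectural MVFR0 layout.

```c
#include <assert.h>
#include <stdint.h>

/* Replace 'length' bits of 'reg' starting at bit 'start' with 'val'. */
static uint32_t deposit_field32(uint32_t reg, int start, int length,
                                uint32_t val)
{
    assert(start >= 0 && length > 0 && length <= 32 - start);
    uint32_t mask = (~0U >> (32 - length)) << start;
    return (reg & ~mask) | ((val << start) & mask);
}

/* Assumed MVFR0 field offsets; every field is 4 bits wide. */
enum {
    MVFR0_FPSP_SHIFT = 4,
    MVFR0_FPDP_SHIFT = 8,
    MVFR0_FPSHVEC_SHIFT = 24,
    MVFR0_FIELD_LEN = 4,
};

/* Mirrors the order of the three deposits in the hunks above. */
static uint32_t armv5_vfp_mvfr0(uint32_t mvfr0)
{
    mvfr0 = deposit_field32(mvfr0, MVFR0_FPSHVEC_SHIFT, MVFR0_FIELD_LEN, 1);
    mvfr0 = deposit_field32(mvfr0, MVFR0_FPSP_SHIFT, MVFR0_FIELD_LEN, 1);
    mvfr0 = deposit_field32(mvfr0, MVFR0_FPDP_SHIFT, MVFR0_FIELD_LEN, 1);
    return mvfr0;
}
```

Because each deposit touches only its own field, the order of the three calls does not affect the result.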

-[Qemu-devel] [PULL 23/29] target/arm: Use tcg_gen_extract_i32 for shifter_out_im
+[PULL 47/52] target/arm: Add isar_feature_aa32_simd_r16
 From: Richard Henderson <richard.henderson@linaro.org>
-Extract is a compact combination of shift + and.
+Use this in the places that were checking ARM_FEATURE_VFP, and
 are obviously testing for the existence of the register set
 as opposed to testing for some particular instruction extension.
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190808202616.13782-2-richard.henderson@linaro.org
+Message-id: 20200214181547.21408-6-richard.henderson@linaro.org
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate.c | 9 +--------
+ target/arm/cpu.h        |  6 ++++++
-file changed, 1 insertion(+), 8 deletions(-)
+ hw/intc/armv7m_nvic.c   | 20 ++++++++++----------
+ linux-user/arm/signal.c |  4 ++--
-diff --git a/target/arm/translate.c b/target/arm/translate.c
+ target/arm/arch_dump.c  | 11 ++++++-----
-index XXXXXXX..XXXXXXX 100644
+ target/arm/cpu.c        |  8 ++++----
---- a/target/arm/translate.c
+ target/arm/helper.c     |  4 ++--
-+++ b/target/arm/translate.c
+ target/arm/m_helper.c   | 11 ++++++-----
-@@ -XXX,XX +XXX,XX @@ static void gen_sar(TCGv_i32 dest, TCGv_i32 t0, TCGv_i32 t1)
+ target/arm/machine.c    |  3 +--
+files changed, 37 insertions(+), 30 deletions(-)
- static void shifter_out_im(TCGv_i32 var, int shift)
 diff --git a/target/arm/cpu.h b/target/arm/cpu.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu.h
 +++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa32_fp16_arith(const ARMISARegisters *id)
      return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, FP) == 1;
  }
 +static inline bool isar_feature_aa32_simd_r16(const ARMISARegisters *id)
 +{
 +    /* Return true if D0-D15 are implemented */
 +    return FIELD_EX32(id->mvfr0, MVFR0, SIMDREG) > 0;
 +}
 +
  static inline bool isar_feature_aa32_simd_r32(const ARMISARegisters *id)
  {
--    if (shift == 0) {
+     /* Return true if D16-D31 are implemented */
--        tcg_gen_andi_i32(cpu_CF, var, 1);
+diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
--    } else {
+index XXXXXXX..XXXXXXX 100644
--        tcg_gen_shri_i32(cpu_CF, var, shift);
+--- a/hw/intc/armv7m_nvic.c
--        if (shift != 31) {
++++ b/hw/intc/armv7m_nvic.c
--            tcg_gen_andi_i32(cpu_CF, cpu_CF, 1);
+@@ -XXX,XX +XXX,XX @@ static uint32_t nvic_readl(NVICState *s, uint32_t offset, MemTxAttrs attrs)
--        }
+     case 0xd84: /* CSSELR */
--    }
+         return cpu->env.v7m.csselr[attrs.secure];
-+    tcg_gen_extract_i32(cpu_CF, var, shift, 1);
+     case 0xd88: /* CPACR */
 -        if (!arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
 +        if (!cpu_isar_feature(aa32_simd_r16, cpu)) {
              return 0;
          }
          return cpu->env.v7m.cpacr[attrs.secure];
      case 0xd8c: /* NSACR */
 -        if (!attrs.secure || !arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
 +        if (!attrs.secure || !cpu_isar_feature(aa32_simd_r16, cpu)) {
              return 0;
          }
          return cpu->env.v7m.nsacr;
@@ -XXX,XX +XXX,XX @@ static uint32_t nvic_readl(NVICState *s, uint32_t offset, MemTxAttrs attrs)
          }
          return cpu->env.v7m.sfar;
      case 0xf34: /* FPCCR */
 -        if (!arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
 +        if (!cpu_isar_feature(aa32_simd_r16, cpu)) {
              return 0;
          }
          if (attrs.secure) {
@@ -XXX,XX +XXX,XX @@ static uint32_t nvic_readl(NVICState *s, uint32_t offset, MemTxAttrs attrs)
              return value;
          }
      case 0xf38: /* FPCAR */
 -        if (!arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
 +        if (!cpu_isar_feature(aa32_simd_r16, cpu)) {
              return 0;
          }
          return cpu->env.v7m.fpcar[attrs.secure];
      case 0xf3c: /* FPDSCR */
 -        if (!arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
 +        if (!cpu_isar_feature(aa32_simd_r16, cpu)) {
              return 0;
          }
          return cpu->env.v7m.fpdscr[attrs.secure];
@@ -XXX,XX +XXX,XX @@ static void nvic_writel(NVICState *s, uint32_t offset, uint32_t value,
          }
          break;
      case 0xd88: /* CPACR */
 -        if (arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
 +        if (cpu_isar_feature(aa32_simd_r16, cpu)) {
              /* We implement only the Floating Point extension's CP10/CP11 */
              cpu->env.v7m.cpacr[attrs.secure] = value & (0xf << 20);
          }
          break;
      case 0xd8c: /* NSACR */
 -        if (attrs.secure && arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
 +        if (attrs.secure && cpu_isar_feature(aa32_simd_r16, cpu)) {
              /* We implement only the Floating Point extension's CP10/CP11 */
              cpu->env.v7m.nsacr = value & (3 << 10);
          }
@@ -XXX,XX +XXX,XX @@ static void nvic_writel(NVICState *s, uint32_t offset, uint32_t value,
          break;
      }
      case 0xf34: /* FPCCR */
 -        if (arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
 +        if (cpu_isar_feature(aa32_simd_r16, cpu)) {
              /* Not all bits here are banked. */
              uint32_t fpccr_s;
@@ -XXX,XX +XXX,XX @@ static void nvic_writel(NVICState *s, uint32_t offset, uint32_t value,
          }
          break;
      case 0xf38: /* FPCAR */
 -        if (arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
 +        if (cpu_isar_feature(aa32_simd_r16, cpu)) {
              value &= ~7;
              cpu->env.v7m.fpcar[attrs.secure] = value;
          }
          break;
      case 0xf3c: /* FPDSCR */
 -        if (arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
 +        if (cpu_isar_feature(aa32_simd_r16, cpu)) {
              value &= 0x07c00000;
              cpu->env.v7m.fpdscr[attrs.secure] = value;
          }
 diff --git a/linux-user/arm/signal.c b/linux-user/arm/signal.c
 index XXXXXXX..XXXXXXX 100644
 --- a/linux-user/arm/signal.c
 +++ b/linux-user/arm/signal.c
@@ -XXX,XX +XXX,XX @@ static void setup_sigframe_v2(struct target_ucontext_v2 *uc,
      setup_sigcontext(&uc->tuc_mcontext, env, set->sig[0]);
      /* Save coprocessor signal frame.  */
      regspace = uc->tuc_regspace;
 -    if (arm_feature(env, ARM_FEATURE_VFP)) {
 +    if (cpu_isar_feature(aa32_simd_r16, env_archcpu(env))) {
          regspace = setup_sigframe_v2_vfp(regspace, env);
      }
      if (arm_feature(env, ARM_FEATURE_IWMMXT)) {
@@ -XXX,XX +XXX,XX @@ static int do_sigframe_return_v2(CPUARMState *env,
      /* Restore coprocessor signal frame */
      regspace = uc->tuc_regspace;
 -    if (arm_feature(env, ARM_FEATURE_VFP)) {
 +    if (cpu_isar_feature(aa32_simd_r16, env_archcpu(env))) {
          regspace = restore_sigframe_v2_vfp(env, regspace);
          if (!regspace) {
              return 1;
 diff --git a/target/arm/arch_dump.c b/target/arm/arch_dump.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/arch_dump.c
 +++ b/target/arm/arch_dump.c
@@ -XXX,XX +XXX,XX @@ int arm_cpu_write_elf32_note(WriteCoreDumpFunction f, CPUState *cs,
                               int cpuid, void *opaque)
  {
      struct arm_note note;
 -    CPUARMState *env = &ARM_CPU(cs)->env;
 +    ARMCPU *cpu = ARM_CPU(cs);
 +    CPUARMState *env = &cpu->env;
      DumpState *s = opaque;
 -    int ret, i, fpvalid = !!arm_feature(env, ARM_FEATURE_VFP);
 +    int ret, i;
 +    bool fpvalid = cpu_isar_feature(aa32_simd_r16, cpu);
      arm_note_init(&note, s, "CORE", 5, NT_PRSTATUS, sizeof(note.prstatus));
@@ -XXX,XX +XXX,XX @@ int cpu_get_dump_info(ArchDumpInfo *info,
  ssize_t cpu_get_note_size(int class, int machine, int nr_cpus)
  {
      ARMCPU *cpu = ARM_CPU(first_cpu);
 -    CPUARMState *env = &cpu->env;
      size_t note_size;
      if (class == ELFCLASS64) {
@@ -XXX,XX +XXX,XX @@ ssize_t cpu_get_note_size(int class, int machine, int nr_cpus)
          note_size += AARCH64_PRFPREG_NOTE_SIZE;
  #ifdef TARGET_AARCH64
          if (cpu_isar_feature(aa64_sve, cpu)) {
 -            note_size += AARCH64_SVE_NOTE_SIZE(env);
 +            note_size += AARCH64_SVE_NOTE_SIZE(&cpu->env);
          }
  #endif
      } else {
          note_size = ARM_PRSTATUS_NOTE_SIZE;
 -        if (arm_feature(env, ARM_FEATURE_VFP)) {
 +        if (cpu_isar_feature(aa32_simd_r16, cpu)) {
              note_size += ARM_VFP_NOTE_SIZE;
          }
      }
 diff --git a/target/arm/cpu.c b/target/arm/cpu.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu.c
 +++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_reset(CPUState *s)
              env->v7m.ccr[M_REG_S] |= R_V7M_CCR_UNALIGN_TRP_MASK;
          }
 -        if (arm_feature(env, ARM_FEATURE_VFP)) {
 +        if (cpu_isar_feature(aa32_simd_r16, cpu)) {
              env->v7m.fpccr[M_REG_NS] = R_V7M_FPCCR_ASPEN_MASK;
              env->v7m.fpccr[M_REG_S] = R_V7M_FPCCR_ASPEN_MASK |
                  R_V7M_FPCCR_LSPEN_MASK | R_V7M_FPCCR_S_MASK;
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_dump_state(CPUState *cs, FILE *f, int flags)
          int numvfpregs = 0;
          if (cpu_isar_feature(aa32_simd_r32, cpu)) {
              numvfpregs = 32;
 -        } else if (arm_feature(env, ARM_FEATURE_VFP)) {
 +        } else if (cpu_isar_feature(aa32_simd_r16, cpu)) {
              numvfpregs = 16;
          }
          for (i = 0; i < numvfpregs; i++) {
@@ -XXX,XX +XXX,XX @@ void arm_cpu_post_init(Object *obj)
       * KVM does not currently allow us to lie to the guest about its
       * ID/feature registers, so the guest always sees what the host has.
       */
 -    if (arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
 +    if (cpu_isar_feature(aa32_simd_r16, cpu)) {
          cpu->has_vfp = true;
          if (!kvm_enabled()) {
              qdev_property_add_static(DEVICE(obj), &arm_cpu_has_vfp_property);
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
       * We rely on no XScale CPU having VFP so we can use the same bits in the
       * TB flags field for VECSTRIDE and XSCALE_CPAR.
       */
 -    assert(!(arm_feature(env, ARM_FEATURE_VFP) &&
 +    assert(!(cpu_isar_feature(aa32_simd_r16, cpu) &&
               arm_feature(env, ARM_FEATURE_XSCALE)));
      if (arm_feature(env, ARM_FEATURE_V7) &&
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.c
 +++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void cpacr_write(CPUARMState *env, const ARMCPRegInfo *ri,
           * ASEDIS [31] and D32DIS [30] are both UNK/SBZP without VFP.
           * TRCDIS [28] is RAZ/WI since we do not implement a trace macrocell.
           */
 -        if (arm_feature(env, ARM_FEATURE_VFP)) {
 +        if (cpu_isar_feature(aa32_simd_r16, env_archcpu(env))) {
              /* VFP coprocessor: cp10 & cp11 [23:20] */
              mask |= (1 << 31) | (1 << 30) | (0xf << 20);
@@ -XXX,XX +XXX,XX @@ void arm_cpu_register_gdb_regs_for_features(ARMCPU *cpu)
      } else if (cpu_isar_feature(aa32_simd_r32, cpu)) {
          gdb_register_coprocessor(cs, vfp_gdb_get_reg, vfp_gdb_set_reg,
 , "arm-vfp3.xml", 0);
 -    } else if (arm_feature(env, ARM_FEATURE_VFP)) {
 +    } else if (cpu_isar_feature(aa32_simd_r16, cpu)) {
          gdb_register_coprocessor(cs, vfp_gdb_get_reg, vfp_gdb_set_reg,
 , "arm-vfp.xml", 0);
      }
 diff --git a/target/arm/m_helper.c b/target/arm/m_helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/m_helper.c
 +++ b/target/arm/m_helper.c
@@ -XXX,XX +XXX,XX @@ static uint32_t v7m_integrity_sig(CPUARMState *env, uint32_t lr)
       */
      uint32_t sig = 0xfefa125a;
 -    if (!arm_feature(env, ARM_FEATURE_VFP) || (lr & R_V7M_EXCRET_FTYPE_MASK)) {
 +    if (!cpu_isar_feature(aa32_simd_r16, env_archcpu(env))
 +        || (lr & R_V7M_EXCRET_FTYPE_MASK)) {
          sig |= 1;
      }
      return sig;
@@ -XXX,XX +XXX,XX @@ static void v7m_exception_taken(ARMCPU *cpu, uint32_t lr, bool dotailchain,
      if (dotailchain) {
          /* Sanitize LR FType and PREFIX bits */
 -        if (!arm_feature(env, ARM_FEATURE_VFP)) {
 +        if (!cpu_isar_feature(aa32_simd_r16, cpu)) {
              lr |= R_V7M_EXCRET_FTYPE_MASK;
          }
          lr = deposit32(lr, 24, 8, 0xff);
@@ -XXX,XX +XXX,XX @@ static void do_v7m_exception_exit(ARMCPU *cpu)
      ftype = excret & R_V7M_EXCRET_FTYPE_MASK;
 -    if (!arm_feature(env, ARM_FEATURE_VFP) && !ftype) {
 +    if (!ftype && !cpu_isar_feature(aa32_simd_r16, cpu)) {
          qemu_log_mask(LOG_GUEST_ERROR, "M profile: zero FTYPE in exception "
                        "exit PC value 0x%" PRIx32 " is UNPREDICTABLE "
                        "if FPU not present\n",
@@ -XXX,XX +XXX,XX @@ void HELPER(v7m_msr)(CPUARMState *env, uint32_t maskreg, uint32_t val)
               * SFPA is RAZ/WI from NS. FPCA is RO if NSACR.CP10 == 0,
               * RES0 if the FPU is not present, and is stored in the S bank
               */
 -            if (arm_feature(env, ARM_FEATURE_VFP) &&
 +            if (cpu_isar_feature(aa32_simd_r16, env_archcpu(env)) &&
                  extract32(env->v7m.nsacr, 10, 1)) {
                  env->v7m.control[M_REG_S] &= ~R_V7M_CONTROL_FPCA_MASK;
                  env->v7m.control[M_REG_S] |= val & R_V7M_CONTROL_FPCA_MASK;
@@ -XXX,XX +XXX,XX @@ void HELPER(v7m_msr)(CPUARMState *env, uint32_t maskreg, uint32_t val)
              env->v7m.control[env->v7m.secure] &= ~R_V7M_CONTROL_NPRIV_MASK;
              env->v7m.control[env->v7m.secure] |= val & R_V7M_CONTROL_NPRIV_MASK;
          }
 -        if (arm_feature(env, ARM_FEATURE_VFP)) {
 +        if (cpu_isar_feature(aa32_simd_r16, env_archcpu(env))) {
              /*
               * SFPA is RAZ/WI from NS or if no FPU.
               * FPCA is RO if NSACR.CP10 == 0, RES0 if the FPU is not present.
 diff --git a/target/arm/machine.c b/target/arm/machine.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/machine.c
 +++ b/target/arm/machine.c
@@ -XXX,XX +XXX,XX @@
  static bool vfp_needed(void *opaque)
  {
      ARMCPU *cpu = opaque;
 -    CPUARMState *env = &cpu->env;
 -    return arm_feature(env, ARM_FEATURE_VFP);
 +    return cpu_isar_feature(aa32_simd_r16, cpu);
  }
- /* Shift by immediate.  Includes special handling for shift == 0.  */
+ static int get_fpscr(QEMUFile *f, void *opaque, size_t size,
 --
 .20.1

-[Qemu-devel] [PULL 10/29] target/arm: Remove redundant s->pc & ~1
+[PULL 48/52] target/arm: Rename isar_feature_aa32_fpdp_v2
 From: Richard Henderson <richard.henderson@linaro.org>
-The thumb bit has already been removed from s->pc, and is always even.
+The old name, isar_feature_aa32_fpdp, does not reflect
 that the test includes VFPv2.  We will introduce further
 feature tests for VFPv3.
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+Message-id: 20200214181547.21408-7-richard.henderson@linaro.org
+[PMM: fixed grammar in commit message]
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
-Message-id: 20190807045335.1361-6-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
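Note on why a rename like this touches every call site: the feature tests are reached through wrappers that paste the short name onto an `isar_feature_` prefix, so `aa32_fpdp` and `aa32_fpdp_v2` resolve to different functions. A minimal sketch, assuming a simplified wrapper macro and cut-down stand-ins for `ARMISARegisters` and `DisasContext` (the real QEMU definitions carry more fields and may differ in detail); `dp_insn_allowed` is a hypothetical caller, not a function from the series:

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins for the QEMU types; only the field used here. */
typedef struct {
    uint32_t mvfr0;
} ARMISARegisters;

typedef struct {
    const ARMISARegisters *isar;
} DisasContext;

/* MVFR0.FPDP occupies bits [11:8]; non-zero means VFPv2 or later. */
static inline bool isar_feature_aa32_fpdp_v2(const ARMISARegisters *id)
{
    return ((id->mvfr0 >> 8) & 0xf) > 0;
}

/* Sketch of the name-pasting wrapper: NAME is appended to isar_feature_. */
#define dc_isar_feature(name, ctx) isar_feature_##name((ctx)->isar)

/* Hypothetical caller, shaped like the trans_* checks in the diff. */
static bool dp_insn_allowed(DisasContext *s, bool dp)
{
    if (dp && !dc_isar_feature(aa32_fpdp_v2, s)) {
        return false; /* the decoder would treat this as UNDEF */
    }
    return true;
}
```

Because the macro argument is pasted into an identifier, a stale `aa32_fpdp` at any call site would fail to compile after the rename, which is why the diffstat shows one changed line per check.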
- target/arm/translate.c | 10 +++++-----
+ target/arm/cpu.h               |  4 ++--
-file changed, 5 insertions(+), 5 deletions(-)
+ target/arm/translate-vfp.inc.c | 40 +++++++++++++++++-----------------
+files changed, 22 insertions(+), 22 deletions(-)
-diff --git a/target/arm/translate.c b/target/arm/translate.c
 diff --git a/target/arm/cpu.h b/target/arm/cpu.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.c
+--- a/target/arm/cpu.h
-+++ b/target/arm/translate.c
++++ b/target/arm/cpu.h
-@@ -XXX,XX +XXX,XX @@ static void gen_exception_bkpt_insn(DisasContext *s, int offset, uint32_t syn)
+@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa32_fpshvec(const ARMISARegisters *id)
- /* Force a TB lookup after an instruction that changes the CPU state.  */
+     return FIELD_EX32(id->mvfr0, MVFR0, FPSHVEC) > 0;
- static inline void gen_lookup_tb(DisasContext *s)
+ }
 -static inline bool isar_feature_aa32_fpdp(const ARMISARegisters *id)
 +static inline bool isar_feature_aa32_fpdp_v2(const ARMISARegisters *id)
  {
--    tcg_gen_movi_i32(cpu_R[15], s->pc & ~1);
+-    /* Return true if CPU supports double precision floating point */
-+    tcg_gen_movi_i32(cpu_R[15], s->pc);
++    /* Return true if CPU supports double precision floating point, VFPv2 */
-     s->base.is_jmp = DISAS_EXIT;
+     return FIELD_EX32(id->mvfr0, MVFR0, FPDP) > 0;
  }
-@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
+diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
-                  * self-modifying code correctly and also to take
+index XXXXXXX..XXXXXXX 100644
-                  * any pending interrupts immediately.
+--- a/target/arm/translate-vfp.inc.c
-                  */
++++ b/target/arm/translate-vfp.inc.c
--                gen_goto_tb(s, 0, s->pc & ~1);
+@@ -XXX,XX +XXX,XX @@ static bool trans_VSEL(DisasContext *s, arg_VSEL *a)
-+                gen_goto_tb(s, 0, s->pc);
+         return false;
-                 return;
+     }
-             case 7: /* sb */
-                 if ((insn & 0xf) || !dc_isar_feature(aa32_sb, s)) {
+-    if (dp && !dc_isar_feature(aa32_fpdp, s)) {
-@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
++    if (dp && !dc_isar_feature(aa32_fpdp_v2, s)) {
-                  * for TCG; MB and end the TB instead.
+         return false;
-                  */
+     }
-                 tcg_gen_mb(TCG_MO_ALL | TCG_BAR_SC);
--                gen_goto_tb(s, 0, s->pc & ~1);
+@@ -XXX,XX +XXX,XX @@ static bool trans_VMINMAXNM(DisasContext *s, arg_VMINMAXNM *a)
-+                gen_goto_tb(s, 0, s->pc);
+         return false;
-                 return;
+     }
-             default:
-                 goto illegal_op;
+-    if (dp && !dc_isar_feature(aa32_fpdp, s)) {
-@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
++    if (dp && !dc_isar_feature(aa32_fpdp_v2, s)) {
-                              * and also to take any pending interrupts
+         return false;
-                              * immediately.
+     }
-                              */
--                            gen_goto_tb(s, 0, s->pc & ~1);
+@@ -XXX,XX +XXX,XX @@ static bool trans_VRINT(DisasContext *s, arg_VRINT *a)
-+                            gen_goto_tb(s, 0, s->pc);
+         return false;
-                             break;
+     }
-                         case 7: /* sb */
-                             if ((insn & 0xf) || !dc_isar_feature(aa32_sb, s)) {
+-    if (dp && !dc_isar_feature(aa32_fpdp, s)) {
-@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
++    if (dp && !dc_isar_feature(aa32_fpdp_v2, s)) {
-                              * for TCG; MB and end the TB instead.
+         return false;
-                              */
+     }
-                             tcg_gen_mb(TCG_MO_ALL | TCG_BAR_SC);
--                            gen_goto_tb(s, 0, s->pc & ~1);
+@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT(DisasContext *s, arg_VCVT *a)
-+                            gen_goto_tb(s, 0, s->pc);
+         return false;
-                             break;
+     }
-                         default:
-                             goto illegal_op;
+-    if (dp && !dc_isar_feature(aa32_fpdp, s)) {
 +    if (dp && !dc_isar_feature(aa32_fpdp_v2, s)) {
          return false;
      }
@@ -XXX,XX +XXX,XX @@ static bool do_vfp_3op_dp(DisasContext *s, VFPGen3OpDPFn *fn,
          return false;
      }
 -    if (!dc_isar_feature(aa32_fpdp, s)) {
 +    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
          return false;
      }
@@ -XXX,XX +XXX,XX @@ static bool do_vfp_2op_dp(DisasContext *s, VFPGen2OpDPFn *fn, int vd, int vm)
          return false;
      }
 -    if (!dc_isar_feature(aa32_fpdp, s)) {
 +    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
          return false;
      }
@@ -XXX,XX +XXX,XX @@ static bool trans_VFM_dp(DisasContext *s, arg_VFM_dp *a)
          return false;
      }
 -    if (!dc_isar_feature(aa32_fpdp, s)) {
 +    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
          return false;
      }
@@ -XXX,XX +XXX,XX @@ static bool trans_VMOV_imm_dp(DisasContext *s, arg_VMOV_imm_dp *a)
          return false;
      }
 -    if (!dc_isar_feature(aa32_fpdp, s)) {
 +    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
          return false;
      }
@@ -XXX,XX +XXX,XX @@ static bool trans_VCMP_dp(DisasContext *s, arg_VCMP_dp *a)
          return false;
      }
 -    if (!dc_isar_feature(aa32_fpdp, s)) {
 +    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
          return false;
      }
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_f64_f16(DisasContext *s, arg_VCVT_f64_f16 *a)
          return false;
      }
 -    if (!dc_isar_feature(aa32_fpdp, s)) {
 +    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
          return false;
      }
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_f16_f64(DisasContext *s, arg_VCVT_f16_f64 *a)
          return false;
      }
 -    if (!dc_isar_feature(aa32_fpdp, s)) {
 +    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
          return false;
      }
@@ -XXX,XX +XXX,XX @@ static bool trans_VRINTR_dp(DisasContext *s, arg_VRINTR_dp *a)
          return false;
      }
 -    if (!dc_isar_feature(aa32_fpdp, s)) {
 +    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
          return false;
      }
@@ -XXX,XX +XXX,XX @@ static bool trans_VRINTZ_dp(DisasContext *s, arg_VRINTZ_dp *a)
          return false;
      }
 -    if (!dc_isar_feature(aa32_fpdp, s)) {
 +    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
          return false;
      }
@@ -XXX,XX +XXX,XX @@ static bool trans_VRINTX_dp(DisasContext *s, arg_VRINTX_dp *a)
          return false;
      }
 -    if (!dc_isar_feature(aa32_fpdp, s)) {
 +    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
          return false;
      }
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_sp(DisasContext *s, arg_VCVT_sp *a)
          return false;
      }
 -    if (!dc_isar_feature(aa32_fpdp, s)) {
 +    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
          return false;
      }
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_dp(DisasContext *s, arg_VCVT_dp *a)
          return false;
      }
 -    if (!dc_isar_feature(aa32_fpdp, s)) {
 +    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
          return false;
      }
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_int_dp(DisasContext *s, arg_VCVT_int_dp *a)
          return false;
      }
 -    if (!dc_isar_feature(aa32_fpdp, s)) {
 +    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
          return false;
      }
@@ -XXX,XX +XXX,XX @@ static bool trans_VJCVT(DisasContext *s, arg_VJCVT *a)
          return false;
      }
 -    if (!dc_isar_feature(aa32_fpdp, s)) {
 +    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
          return false;
      }
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_fix_dp(DisasContext *s, arg_VCVT_fix_dp *a)
          return false;
      }
 -    if (!dc_isar_feature(aa32_fpdp, s)) {
 +    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
          return false;
      }
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_dp_int(DisasContext *s, arg_VCVT_dp_int *a)
          return false;
      }
 -    if (!dc_isar_feature(aa32_fpdp, s)) {
 +    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
          return false;
      }
 --
 .20.1

-[Qemu-devel] [PULL 06/29] target/arm: Pass in pc to thumb_insn_is_16bit
+[PULL 49/52] target/arm: Add isar_feature_aa32_{fpsp_v2, fpsp_v3, fpdp_v3}
 From: Richard Henderson <richard.henderson@linaro.org>
-This function is used in two different contexts, and it will be
+We will shortly use these to test for VFPv2 and VFPv3
-clearer if the function is given the address to which it applies.
+in different situations.
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20200214181547.21408-8-richard.henderson@linaro.org
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
-Message-id: 20190807045335.1361-2-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
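Note on the thresholds used by the new tests: each one reads a 4-bit field of MVFR0 and compares it against a level, with `> 0` standing for VFPv2 and `>= 2` for VFPv3. A standalone sketch of the same comparisons, assuming plain shifts in place of QEMU's `FIELD_EX32` and the architectural field positions FPSP = bits [7:4] and FPDP = bits [11:8]; the helper names below are illustrative, not the ones added by the patch:

```c
#include <stdbool.h>
#include <stdint.h>

/* Field positions within MVFR0 (4 bits each). */
enum { MVFR0_FPSP_SHIFT = 4, MVFR0_FPDP_SHIFT = 8 };

/* Extract a 4-bit ID field starting at bit SHIFT. */
static inline uint32_t id_field(uint32_t reg, unsigned shift)
{
    return (reg >> shift) & 0xf;
}

/* Single precision: any non-zero level implies at least VFPv2. */
static inline bool fpsp_v2(uint32_t mvfr0)
{
    return id_field(mvfr0, MVFR0_FPSP_SHIFT) > 0;
}

/* Single precision: level 2 or higher implies at least VFPv3. */
static inline bool fpsp_v3(uint32_t mvfr0)
{
    return id_field(mvfr0, MVFR0_FPSP_SHIFT) >= 2;
}

/* Double precision: any non-zero level implies at least VFPv2. */
static inline bool fpdp_v2(uint32_t mvfr0)
{
    return id_field(mvfr0, MVFR0_FPDP_SHIFT) > 0;
}

/* Double precision: level 2 or higher implies at least VFPv3. */
static inline bool fpdp_v3(uint32_t mvfr0)
{
    return id_field(mvfr0, MVFR0_FPDP_SHIFT) >= 2;
}
```

With made-up register values: 0x110 passes both v2 tests and fails both v3 tests, 0x220 passes all four, and 0x020 (single precision only) passes the fpsp tests but fails the fpdp ones.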
- target/arm/translate.c | 14 +++++++-------
+ target/arm/cpu.h | 18 ++++++++++++++++++
-file changed, 7 insertions(+), 7 deletions(-)
+file changed, 18 insertions(+)
-diff --git a/target/arm/translate.c b/target/arm/translate.c
+diff --git a/target/arm/cpu.h b/target/arm/cpu.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.c
+--- a/target/arm/cpu.h
-+++ b/target/arm/translate.c
++++ b/target/arm/cpu.h
-@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
+@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa32_fpshvec(const ARMISARegisters *id)
-     }
+     return FIELD_EX32(id->mvfr0, MVFR0, FPSHVEC) > 0;
  }
--static bool thumb_insn_is_16bit(DisasContext *s, uint32_t insn)
++static inline bool isar_feature_aa32_fpsp_v2(const ARMISARegisters *id)
-+static bool thumb_insn_is_16bit(DisasContext *s, uint32_t pc, uint32_t insn)
++{
 +    /* Return true if CPU supports single precision floating point, VFPv2 */
 +    return FIELD_EX32(id->mvfr0, MVFR0, FPSP) > 0;
 +}
 +
 +static inline bool isar_feature_aa32_fpsp_v3(const ARMISARegisters *id)
 +{
 +    /* Return true if CPU supports single precision floating point, VFPv3 */
 +    return FIELD_EX32(id->mvfr0, MVFR0, FPSP) >= 2;
 +}
 +
  static inline bool isar_feature_aa32_fpdp_v2(const ARMISARegisters *id)
  {
--    /* Return true if this is a 16 bit instruction. We must be precise
+     /* Return true if CPU supports double precision floating point, VFPv2 */
--     * about this (matching the decode).  We assume that s->pc still
+     return FIELD_EX32(id->mvfr0, MVFR0, FPDP) > 0;
 -     * points to the first 16 bits of the insn.
 +    /*
 +     * Return true if this is a 16 bit instruction. We must be precise
 +     * about this (matching the decode).
       */
      if ((insn >> 11) < 0x1d) {
          /* Definitely a 16-bit instruction */
@@ -XXX,XX +XXX,XX @@ static bool thumb_insn_is_16bit(DisasContext *s, uint32_t insn)
          return false;
      }
 -    if ((insn >> 11) == 0x1e && s->pc - s->page_start < TARGET_PAGE_SIZE - 3) {
 +    if ((insn >> 11) == 0x1e && pc - s->page_start < TARGET_PAGE_SIZE - 3) {
          /* 0b1111_0xxx_xxxx_xxxx : BL/BLX prefix, and the suffix
           * is not on the next page; we merge this into a 32-bit
           * insn.
@@ -XXX,XX +XXX,XX @@ static bool insn_crosses_page(CPUARMState *env, DisasContext *s)
       */
      uint16_t insn = arm_lduw_code(env, s->pc, s->sctlr_b);
 -    return !thumb_insn_is_16bit(s, insn);
 +    return !thumb_insn_is_16bit(s, s->pc, insn);
  }
- static void arm_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
++static inline bool isar_feature_aa32_fpdp_v3(const ARMISARegisters *id)
-@@ -XXX,XX +XXX,XX @@ static void thumb_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
++{
-     }
++    /* Return true if CPU supports double precision floating point, VFPv3 */
++    return FIELD_EX32(id->mvfr0, MVFR0, FPDP) >= 2;
-     insn = arm_lduw_code(env, dc->pc, dc->sctlr_b);
++}
--    is_16bit = thumb_insn_is_16bit(dc, insn);
++
-+    is_16bit = thumb_insn_is_16bit(dc, dc->pc, insn);
+ /*
-     dc->pc += 2;
+  * We always set the FP and SIMD FP16 fields to indicate identical
-     if (!is_16bit) {
+  * levels of support (assuming SIMD is implemented at all), so
          uint32_t insn2 = arm_lduw_code(env, dc->pc, dc->sctlr_b);
 --
 .20.1

-[Qemu-devel] [PULL 09/29] target/arm: Introduce add_reg_for_lit
+[PULL 50/52] target/arm: Perform fpdp_v2 check first
 From: Richard Henderson <richard.henderson@linaro.org>
-Provide a common routine for the places that require ALIGN(PC, 4)
+Shuffle the order of the checks so that we test the ISA
-as the base address as opposed to plain PC.  The two are always
+before we test anything else, such as the register arguments.
-the same for A32, but the difference is meaningful for thumb mode.
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20200214181547.21408-9-richard.henderson@linaro.org
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
-Message-id: 20190807045335.1361-5-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
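Note on the resulting check order: after this patch each double-precision translate function tests the ISA feature first and only then looks at the register numbers. A small sketch of that shape, assuming a hypothetical `Features` struct in place of the ID-register lookups and a hypothetical `dp_3op_decode_ok` helper in place of the real `trans_*` functions:

```c
#include <stdbool.h>

/* Hypothetical feature summary; QEMU derives these from ID registers. */
typedef struct {
    bool fpdp_v2;   /* double-precision VFPv2 present */
    bool simd_r32;  /* D16-D31 exist */
} Features;

/* Returns false where the decoder would UNDEF a three-operand DP insn. */
static bool dp_3op_decode_ok(const Features *f, int vd, int vn, int vm)
{
    /* ISA check first: no double precision means no decode at all. */
    if (!f->fpdp_v2) {
        return false;
    }

    /* UNDEF accesses to D16-D31 if they don't exist. */
    if (!f->simd_r32 && ((vd | vn | vm) & 0x10)) {
        return false;
    }

    return true;
}
```

In this simplified model both failures return false, so the result is the same in either order; the sketch only illustrates the layout the hunks below converge on.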
- target/arm/translate-vfp.inc.c |  38 ++------
+ target/arm/translate-vfp.inc.c | 144 ++++++++++++++++-----------------
- target/arm/translate.c         | 166 +++++++++++++++------------------
+file changed, 72 insertions(+), 72 deletions(-)
-files changed, 82 insertions(+), 122 deletions(-)
 diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-vfp.inc.c
 +++ b/target/arm/translate-vfp.inc.c
-@@ -XXX,XX +XXX,XX @@ static bool trans_VLDR_VSTR_sp(DisasContext *s, arg_VLDR_VSTR_sp *a)
+@@ -XXX,XX +XXX,XX @@ static bool trans_VSEL(DisasContext *s, arg_VSEL *a)
-         offset = -offset;
+         return false;
      }
--    if (s->thumb && a->rn == 15) {
+-    /* UNDEF accesses to D16-D31 if they don't exist */
--        /* This is actually UNPREDICTABLE */
+-    if (dp && !dc_isar_feature(aa32_simd_r32, s) &&
--        addr = tcg_temp_new_i32();
+-        ((a->vm | a->vn | a->vd) & 0x10)) {
--        tcg_gen_movi_i32(addr, s->pc & ~2);
++    if (dp && !dc_isar_feature(aa32_fpdp_v2, s)) {
--    } else {
+         return false;
--        addr = load_reg(s, a->rn);
+     }
--    }
--    tcg_gen_addi_i32(addr, addr, offset);
+-    if (dp && !dc_isar_feature(aa32_fpdp_v2, s)) {
-+    /* For thumb, use of PC is UNPREDICTABLE.  */
++    /* UNDEF accesses to D16-D31 if they don't exist */
-+    addr = add_reg_for_lit(s, a->rn, offset);
++    if (dp && !dc_isar_feature(aa32_simd_r32, s) &&
-     tmp = tcg_temp_new_i32();
++        ((a->vm | a->vn | a->vd) & 0x10)) {
-     if (a->l) {
+         return false;
-         gen_aa32_ld32u(s, tmp, addr, get_mem_index(s));
+     }
-@@ -XXX,XX +XXX,XX @@ static bool trans_VLDR_VSTR_dp(DisasContext *s, arg_VLDR_VSTR_dp *a)
-         offset = -offset;
+@@ -XXX,XX +XXX,XX @@ static bool trans_VMINMAXNM(DisasContext *s, arg_VMINMAXNM *a)
-     }
+         return false;
+     }
--    if (s->thumb && a->rn == 15) {
--        /* This is actually UNPREDICTABLE */
+-    /* UNDEF accesses to D16-D31 if they don't exist */
--        addr = tcg_temp_new_i32();
+-    if (dp && !dc_isar_feature(aa32_simd_r32, s) &&
--        tcg_gen_movi_i32(addr, s->pc & ~2);
+-        ((a->vm | a->vn | a->vd) & 0x10)) {
--    } else {
++    if (dp && !dc_isar_feature(aa32_fpdp_v2, s)) {
--        addr = load_reg(s, a->rn);
+         return false;
--    }
+     }
--    tcg_gen_addi_i32(addr, addr, offset);
-+    /* For thumb, use of PC is UNPREDICTABLE.  */
+-    if (dp && !dc_isar_feature(aa32_fpdp_v2, s)) {
-+    addr = add_reg_for_lit(s, a->rn, offset);
++    /* UNDEF accesses to D16-D31 if they don't exist */
-     tmp = tcg_temp_new_i64();
++    if (dp && !dc_isar_feature(aa32_simd_r32, s) &&
-     if (a->l) {
++        ((a->vm | a->vn | a->vd) & 0x10)) {
-         gen_aa32_ld64(s, tmp, addr, get_mem_index(s));
+         return false;
-@@ -XXX,XX +XXX,XX @@ static bool trans_VLDM_VSTM_sp(DisasContext *s, arg_VLDM_VSTM_sp *a)
+     }
-         return true;
-     }
+@@ -XXX,XX +XXX,XX @@ static bool trans_VRINT(DisasContext *s, arg_VRINT *a)
+         return false;
--    if (s->thumb && a->rn == 15) {
+     }
--        /* This is actually UNPREDICTABLE */
--        addr = tcg_temp_new_i32();
+-    /* UNDEF accesses to D16-D31 if they don't exist */
--        tcg_gen_movi_i32(addr, s->pc & ~2);
+-    if (dp && !dc_isar_feature(aa32_simd_r32, s) &&
--    } else {
+-        ((a->vm | a->vd) & 0x10)) {
--        addr = load_reg(s, a->rn);
++    if (dp && !dc_isar_feature(aa32_fpdp_v2, s)) {
--    }
+         return false;
-+    /* For thumb, use of PC is UNPREDICTABLE.  */
+     }
-+    addr = add_reg_for_lit(s, a->rn, 0);
-     if (a->p) {
+-    if (dp && !dc_isar_feature(aa32_fpdp_v2, s)) {
-         /* pre-decrement */
++    /* UNDEF accesses to D16-D31 if they don't exist */
-         tcg_gen_addi_i32(addr, addr, -(a->imm << 2));
++    if (dp && !dc_isar_feature(aa32_simd_r32, s) &&
-@@ -XXX,XX +XXX,XX @@ static bool trans_VLDM_VSTM_dp(DisasContext *s, arg_VLDM_VSTM_dp *a)
++        ((a->vm | a->vd) & 0x10)) {
-         return true;
+         return false;
      }
--    if (s->thumb && a->rn == 15) {
+@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT(DisasContext *s, arg_VCVT *a)
--        /* This is actually UNPREDICTABLE */
+         return false;
--        addr = tcg_temp_new_i32();
+     }
--        tcg_gen_movi_i32(addr, s->pc & ~2);
--    } else {
+-    /* UNDEF accesses to D16-D31 if they don't exist */
--        addr = load_reg(s, a->rn);
+-    if (dp && !dc_isar_feature(aa32_simd_r32, s) && (a->vm & 0x10)) {
--    }
++    if (dp && !dc_isar_feature(aa32_fpdp_v2, s)) {
-+    /* For thumb, use of PC is UNPREDICTABLE.  */
+         return false;
-+    addr = add_reg_for_lit(s, a->rn, 0);
+     }
-     if (a->p) {
-         /* pre-decrement */
+-    if (dp && !dc_isar_feature(aa32_fpdp_v2, s)) {
-         tcg_gen_addi_i32(addr, addr, -(a->imm << 2));
++    /* UNDEF accesses to D16-D31 if they don't exist */
-diff --git a/target/arm/translate.c b/target/arm/translate.c
++    if (dp && !dc_isar_feature(aa32_simd_r32, s) && (a->vm & 0x10)) {
-index XXXXXXX..XXXXXXX 100644
+         return false;
---- a/target/arm/translate.c
+     }
-+++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static inline TCGv_i32 load_reg(DisasContext *s, int reg)
+@@ -XXX,XX +XXX,XX @@ static bool do_vfp_3op_dp(DisasContext *s, VFPGen3OpDPFn *fn,
-     return tmp;
+     TCGv_i64 f0, f1, fd;
- }
+     TCGv_ptr fpst;
-+/*
+-    /* UNDEF accesses to D16-D31 if they don't exist */
-+ * Create a new temp, REG + OFS, except PC is ALIGN(PC, 4).
+-    if (!dc_isar_feature(aa32_simd_r32, s) && ((vd | vn | vm) & 0x10)) {
-+ * This is used for load/store for which use of PC implies (literal),
++    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
-+ * or ADD that implies ADR.
+         return false;
-+ */
+     }
-+static TCGv_i32 add_reg_for_lit(DisasContext *s, int reg, int ofs)
-+{
+-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
-+    TCGv_i32 tmp = tcg_temp_new_i32();
++    /* UNDEF accesses to D16-D31 if they don't exist */
-+
++    if (!dc_isar_feature(aa32_simd_r32, s) && ((vd | vn | vm) & 0x10)) {
-+    if (reg == 15) {
+         return false;
-+        tcg_gen_movi_i32(tmp, (read_pc(s) & ~3) + ofs);
+     }
-+    } else {
-+        tcg_gen_addi_i32(tmp, cpu_R[reg], ofs);
+@@ -XXX,XX +XXX,XX @@ static bool do_vfp_2op_dp(DisasContext *s, VFPGen2OpDPFn *fn, int vd, int vm)
-+    }
+     int veclen = s->vec_len;
-+    return tmp;
+     TCGv_i64 f0, fd;
-+}
-+
+-    /* UNDEF accesses to D16-D31 if they don't exist */
- /* Set a CPU register.  The source must be a temporary and will be
+-    if (!dc_isar_feature(aa32_simd_r32, s) && ((vd | vm) & 0x10)) {
-    marked as dead.  */
++    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
- static void store_reg(DisasContext *s, int reg, TCGv_i32 var)
+         return false;
-@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
+     }
-                  */
-                 bool wback = extract32(insn, 21, 1);
+-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
++    /* UNDEF accesses to D16-D31 if they don't exist */
--                if (rn == 15) {
++    if (!dc_isar_feature(aa32_simd_r32, s) && ((vd | vm) & 0x10)) {
--                    if (insn & (1 << 21)) {
+         return false;
--                        /* UNPREDICTABLE */
+     }
--                        goto illegal_op;
--                    }
+@@ -XXX,XX +XXX,XX @@ static bool trans_VFM_dp(DisasContext *s, arg_VFM_dp *a)
--                    addr = tcg_temp_new_i32();
+         return false;
--                    tcg_gen_movi_i32(addr, s->pc & ~3);
+     }
--                } else {
--                    addr = load_reg(s, rn);
+-    /* UNDEF accesses to D16-D31 if they don't exist. */
-+                if (rn == 15 && (insn & (1 << 21))) {
+-    if (!dc_isar_feature(aa32_simd_r32, s) &&
-+                    /* UNPREDICTABLE */
+-        ((a->vd | a->vn | a->vm) & 0x10)) {
-+                    goto illegal_op;
++    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
-                 }
+         return false;
-+
+     }
-+                addr = add_reg_for_lit(s, rn, 0);
-                 offset = (insn & 0xff) * 4;
+-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
-                 if ((insn & (1 << 23)) == 0) {
++    /* UNDEF accesses to D16-D31 if they don't exist. */
-                     offset = -offset;
++    if (!dc_isar_feature(aa32_simd_r32, s) &&
-@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
++        ((a->vd | a->vn | a->vm) & 0x10)) {
-                         store_reg(s, rd, tmp);
+         return false;
-                     } else {
+     }
-                         /* Add/sub 12-bit immediate.  */
--                        if (rn == 15) {
+@@ -XXX,XX +XXX,XX @@ static bool trans_VMOV_imm_dp(DisasContext *s, arg_VMOV_imm_dp *a)
--                            offset = s->pc & ~(uint32_t)3;
--                            if (insn & (1 << 23))
+     vd = a->vd;
--                                offset -= imm;
--                            else
+-    /* UNDEF accesses to D16-D31 if they don't exist. */
--                                offset += imm;
+-    if (!dc_isar_feature(aa32_simd_r32, s) && (vd & 0x10)) {
--                            tmp = tcg_temp_new_i32();
++    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
--                            tcg_gen_movi_i32(tmp, offset);
+         return false;
--                            store_reg(s, rd, tmp);
+     }
-+                        if (insn & (1 << 23)) {
-+                            imm = -imm;
+-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
-+                        }
++    /* UNDEF accesses to D16-D31 if they don't exist. */
-+                        tmp = add_reg_for_lit(s, rn, imm);
++    if (!dc_isar_feature(aa32_simd_r32, s) && (vd & 0x10)) {
-+                        if (rn == 13 && rd == 13) {
+         return false;
-+                            /* ADD SP, SP, imm or SUB SP, SP, imm */
+     }
-+                            store_sp_checked(s, tmp);
-                         } else {
+@@ -XXX,XX +XXX,XX @@ static bool trans_VCMP_dp(DisasContext *s, arg_VCMP_dp *a)
--                            tmp = load_reg(s, rn);
+ {
--                            if (insn & (1 << 23))
+     TCGv_i64 vd, vm;
--                                tcg_gen_subi_i32(tmp, tmp, imm);
--                            else
++    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
--                                tcg_gen_addi_i32(tmp, tmp, imm);
++        return false;
--                            if (rn == 13 && rd == 13) {
++    }
--                                /* ADD SP, SP, imm or SUB SP, SP, imm */
++
--                                store_sp_checked(s, tmp);
+     /* Vm/M bits must be zero for the Z variant */
--                            } else {
+     if (a->z && a->vm != 0) {
--                                store_reg(s, rd, tmp);
+         return false;
--                            }
+@@ -XXX,XX +XXX,XX @@ static bool trans_VCMP_dp(DisasContext *s, arg_VCMP_dp *a)
-+                            store_reg(s, rd, tmp);
+         return false;
-                         }
+     }
-                     }
-                 }
+-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
-@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
+-        return false;
-             }
+-    }
-         }
+-
-         memidx = get_mem_index(s);
+     if (!vfp_access_check(s)) {
--        if (rn == 15) {
+         return true;
--            addr = tcg_temp_new_i32();
+     }
--            /* PC relative.  */
+@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_f64_f16(DisasContext *s, arg_VCVT_f64_f16 *a)
--            /* s->pc has already been incremented by 4.  */
+     TCGv_i32 tmp;
--            imm = s->pc & 0xfffffffc;
+     TCGv_i64 vd;
--            if (insn & (1 << 23))
--                imm += insn & 0xfff;
++    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
--            else
++        return false;
--                imm -= insn & 0xfff;
++    }
--            tcg_gen_movi_i32(addr, imm);
++
-+        imm = insn & 0xfff;
+     if (!dc_isar_feature(aa32_fp16_dpconv, s)) {
-+        if (insn & (1 << 23)) {
+         return false;
-+            /* PC relative or Positive offset.  */
+     }
-+            addr = add_reg_for_lit(s, rn, imm);
+@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_f64_f16(DisasContext *s, arg_VCVT_f64_f16 *a)
-+        } else if (rn == 15) {
+         return false;
-+            /* PC relative with negative offset.  */
+     }
-+            addr = add_reg_for_lit(s, rn, -imm);
-         } else {
+-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
-             addr = load_reg(s, rn);
+-        return false;
--            if (insn & (1 << 23)) {
+-    }
--                /* Positive offset.  */
+-
--                imm = insn & 0xfff;
+     if (!vfp_access_check(s)) {
--                tcg_gen_addi_i32(addr, addr, imm);
+         return true;
--            } else {
+     }
--                imm = insn & 0xff;
+@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_f16_f64(DisasContext *s, arg_VCVT_f16_f64 *a)
--                switch ((insn >> 8) & 0xf) {
+     TCGv_i32 tmp;
--                case 0x0: /* Shifted Register.  */
+     TCGv_i64 vm;
--                    shift = (insn >> 4) & 0xf;
--                    if (shift > 3) {
++    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
--                        tcg_temp_free_i32(addr);
++        return false;
--                        goto illegal_op;
++    }
--                    }
++
--                    tmp = load_reg(s, rm);
+     if (!dc_isar_feature(aa32_fp16_dpconv, s)) {
--                    if (shift)
+         return false;
--                        tcg_gen_shli_i32(tmp, tmp, shift);
+     }
--                    tcg_gen_add_i32(addr, addr, tmp);
+@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_f16_f64(DisasContext *s, arg_VCVT_f16_f64 *a)
--                    tcg_temp_free_i32(tmp);
+         return false;
--                    break;
+     }
--                case 0xc: /* Negative offset.  */
--                    tcg_gen_addi_i32(addr, addr, -imm);
+-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
--                    break;
+-        return false;
--                case 0xe: /* User privilege.  */
+-    }
--                    tcg_gen_addi_i32(addr, addr, imm);
+-
--                    memidx = get_a32_user_mem_index(s);
+     if (!vfp_access_check(s)) {
--                    break;
+         return true;
--                case 0x9: /* Post-decrement.  */
+     }
--                    imm = -imm;
+@@ -XXX,XX +XXX,XX @@ static bool trans_VRINTR_dp(DisasContext *s, arg_VRINTR_dp *a)
--                    /* Fall through.  */
+     TCGv_ptr fpst;
--                case 0xb: /* Post-increment.  */
+     TCGv_i64 tmp;
--                    postinc = 1;
--                    writeback = 1;
++    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
--                    break;
++        return false;
--                case 0xd: /* Pre-decrement.  */
++    }
--                    imm = -imm;
++
--                    /* Fall through.  */
+     if (!dc_isar_feature(aa32_vrint, s)) {
--                case 0xf: /* Pre-increment.  */
+         return false;
--                    writeback = 1;
+     }
--                    break;
+@@ -XXX,XX +XXX,XX @@ static bool trans_VRINTR_dp(DisasContext *s, arg_VRINTR_dp *a)
--                default:
+         return false;
-+            imm = insn & 0xff;
+     }
-+            switch ((insn >> 8) & 0xf) {
-+            case 0x0: /* Shifted Register.  */
+-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
-+                shift = (insn >> 4) & 0xf;
+-        return false;
-+                if (shift > 3) {
+-    }
-                     tcg_temp_free_i32(addr);
+-
-                     goto illegal_op;
+     if (!vfp_access_check(s)) {
-                 }
+         return true;
-+                tmp = load_reg(s, rm);
+     }
-+                if (shift) {
+@@ -XXX,XX +XXX,XX @@ static bool trans_VRINTZ_dp(DisasContext *s, arg_VRINTZ_dp *a)
-+                    tcg_gen_shli_i32(tmp, tmp, shift);
+     TCGv_i64 tmp;
-+                }
+     TCGv_i32 tcg_rmode;
-+                tcg_gen_add_i32(addr, addr, tmp);
-+                tcg_temp_free_i32(tmp);
++    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
-+                break;
++        return false;
-+            case 0xc: /* Negative offset.  */
++    }
-+                tcg_gen_addi_i32(addr, addr, -imm);
++
-+                break;
+     if (!dc_isar_feature(aa32_vrint, s)) {
-+            case 0xe: /* User privilege.  */
+         return false;
-+                tcg_gen_addi_i32(addr, addr, imm);
+     }
-+                memidx = get_a32_user_mem_index(s);
+@@ -XXX,XX +XXX,XX @@ static bool trans_VRINTZ_dp(DisasContext *s, arg_VRINTZ_dp *a)
-+                break;
+         return false;
-+            case 0x9: /* Post-decrement.  */
+     }
-+                imm = -imm;
-+                /* Fall through.  */
+-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
-+            case 0xb: /* Post-increment.  */
+-        return false;
-+                postinc = 1;
+-    }
-+                writeback = 1;
+-
-+                break;
+     if (!vfp_access_check(s)) {
-+            case 0xd: /* Pre-decrement.  */
+         return true;
-+                imm = -imm;
+     }
-+                /* Fall through.  */
+@@ -XXX,XX +XXX,XX @@ static bool trans_VRINTX_dp(DisasContext *s, arg_VRINTX_dp *a)
-+            case 0xf: /* Pre-increment.  */
+     TCGv_ptr fpst;
-+                writeback = 1;
+     TCGv_i64 tmp;
-+                break;
-+            default:
++    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
-+                tcg_temp_free_i32(addr);
++        return false;
-+                goto illegal_op;
++    }
-             }
++
-         }
+     if (!dc_isar_feature(aa32_vrint, s)) {
+         return false;
-@@ -XXX,XX +XXX,XX @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
+     }
-         if (insn & (1 << 11)) {
+@@ -XXX,XX +XXX,XX @@ static bool trans_VRINTX_dp(DisasContext *s, arg_VRINTX_dp *a)
-             rd = (insn >> 8) & 7;
+         return false;
-             /* load pc-relative.  Bit 1 of PC is ignored.  */
+     }
--            val = read_pc(s) + ((insn & 0xff) * 4);
--            val &= ~(uint32_t)2;
+-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
--            addr = tcg_temp_new_i32();
+-        return false;
--            tcg_gen_movi_i32(addr, val);
+-    }
-+            addr = add_reg_for_lit(s, 15, (insn & 0xff) * 4);
+-
-             tmp = tcg_temp_new_i32();
+     if (!vfp_access_check(s)) {
-             gen_aa32_ld32u_iss(s, tmp, addr, get_mem_index(s),
+         return true;
-                                rd | ISSIs16Bit);
+     }
-@@ -XXX,XX +XXX,XX @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_sp(DisasContext *s, arg_VCVT_sp *a)
-          *  - Add PC/SP (immediate)
+     TCGv_i64 vd;
-          */
+     TCGv_i32 vm;
-         rd = (insn >> 8) & 7;
--        if (insn & (1 << 11)) {
+-    /* UNDEF accesses to D16-D31 if they don't exist. */
--            /* SP */
+-    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vd & 0x10)) {
--            tmp = load_reg(s, 13);
++    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
--        } else {
+         return false;
--            /* PC. bit 1 is ignored.  */
+     }
--            tmp = tcg_temp_new_i32();
--            tcg_gen_movi_i32(tmp, read_pc(s) & ~(uint32_t)2);
+-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
--        }
++    /* UNDEF accesses to D16-D31 if they don't exist. */
-         val = (insn & 0xff) * 4;
++    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vd & 0x10)) {
--        tcg_gen_addi_i32(tmp, tmp, val);
+         return false;
-+        tmp = add_reg_for_lit(s, insn & (1 << 11) ? 13 : 15, val);
+     }
-         store_reg(s, rd, tmp);
-         break;
+@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_dp(DisasContext *s, arg_VCVT_dp *a)
      TCGv_i64 vm;
      TCGv_i32 vd;
 -    /* UNDEF accesses to D16-D31 if they don't exist. */
 -    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vm & 0x10)) {
 +    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
          return false;
      }
 -    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
 +    /* UNDEF accesses to D16-D31 if they don't exist. */
 +    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vm & 0x10)) {
          return false;
      }
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_int_dp(DisasContext *s, arg_VCVT_int_dp *a)
      TCGv_i64 vd;
      TCGv_ptr fpst;
 -    /* UNDEF accesses to D16-D31 if they don't exist. */
 -    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vd & 0x10)) {
 +    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
          return false;
      }
 -    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
 +    /* UNDEF accesses to D16-D31 if they don't exist. */
 +    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vd & 0x10)) {
          return false;
      }
@@ -XXX,XX +XXX,XX @@ static bool trans_VJCVT(DisasContext *s, arg_VJCVT *a)
      TCGv_i32 vd;
      TCGv_i64 vm;
 +    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
 +        return false;
 +    }
 +
      if (!dc_isar_feature(aa32_jscvt, s)) {
          return false;
      }
@@ -XXX,XX +XXX,XX @@ static bool trans_VJCVT(DisasContext *s, arg_VJCVT *a)
          return false;
      }
 -    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
 -        return false;
 -    }
 -
      if (!vfp_access_check(s)) {
          return true;
      }
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_fix_dp(DisasContext *s, arg_VCVT_fix_dp *a)
      TCGv_ptr fpst;
      int frac_bits;
 +    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
 +        return false;
 +    }
 +
      if (!arm_dc_feature(s, ARM_FEATURE_VFP3)) {
          return false;
      }
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_fix_dp(DisasContext *s, arg_VCVT_fix_dp *a)
          return false;
      }
 -    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
 -        return false;
 -    }
 -
      if (!vfp_access_check(s)) {
          return true;
      }
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_dp_int(DisasContext *s, arg_VCVT_dp_int *a)
      TCGv_i64 vm;
      TCGv_ptr fpst;
 -    /* UNDEF accesses to D16-D31 if they don't exist. */
 -    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vm & 0x10)) {
 +    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
          return false;
      }
 -    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
 +    /* UNDEF accesses to D16-D31 if they don't exist. */
 +    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vm & 0x10)) {
          return false;
      }
 --
 .20.1

-[Qemu-devel] [PULL 15/29] target/arm: Use unallocated_encoding for aarch32
+[PULL 51/52] target/arm: Replace ARM_FEATURE_VFP3 checks with fp{sp, dp}_v3
 From: Richard Henderson <richard.henderson@linaro.org>
-Promote this function from aarch64 to fully general use.
+Sort this check to the start of a trans_* function.
-Use it to unify the code sequences for generating illegal
+Merge this with any existing test for fpdp_v2.
 opcode exceptions.
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20200214181547.21408-10-richard.henderson@linaro.org
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
-Message-id: 20190807045335.1361-11-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate-a64.h     |  2 --
+ target/arm/translate-vfp.inc.c | 24 ++++++++----------------
- target/arm/translate.h         |  2 ++
+file changed, 8 insertions(+), 16 deletions(-)
  target/arm/translate-a64.c     |  7 -------
  target/arm/translate-vfp.inc.c |  3 +--
  target/arm/translate.c         | 22 ++++++++++++----------
 files changed, 15 insertions(+), 21 deletions(-)
-diff --git a/target/arm/translate-a64.h b/target/arm/translate-a64.h
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-a64.h
-+++ b/target/arm/translate-a64.h
-@@ -XXX,XX +XXX,XX @@
- #ifndef TARGET_ARM_TRANSLATE_A64_H
- #define TARGET_ARM_TRANSLATE_A64_H
--void unallocated_encoding(DisasContext *s);
--
- #define unsupported_encoding(s, insn)                                    \
-     do {                                                                 \
-         qemu_log_mask(LOG_UNIMP,                                         \
-diff --git a/target/arm/translate.h b/target/arm/translate.h
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.h
-+++ b/target/arm/translate.h
-@@ -XXX,XX +XXX,XX @@ typedef struct DisasCompare {
-     bool value_global;
- } DisasCompare;
-+void unallocated_encoding(DisasContext *s);
-+
- /* Share the TCG temporaries common between 32 and 64 bit modes.  */
- extern TCGv_i32 cpu_NF, cpu_ZF, cpu_CF, cpu_VF;
- extern TCGv_i64 cpu_exclusive_addr;
-diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-a64.c
-+++ b/target/arm/translate-a64.c
-@@ -XXX,XX +XXX,XX @@ static inline void gen_goto_tb(DisasContext *s, int n, uint64_t dest)
-     }
- }
--void unallocated_encoding(DisasContext *s)
--{
--    /* Unallocated and reserved encodings are uncategorized */
--    gen_exception_insn(s, s->pc_curr, EXCP_UDEF, syn_uncategorized(),
--                       default_exception_el(s));
--}
--
- static void init_tmp_a64_array(DisasContext *s)
- {
- #ifdef CONFIG_DEBUG_TCG
 diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-vfp.inc.c
 +++ b/target/arm/translate-vfp.inc.c
-@@ -XXX,XX +XXX,XX @@ static bool full_vfp_access_check(DisasContext *s, bool ignore_vfp_enabled)
+@@ -XXX,XX +XXX,XX @@ static bool trans_VMSR_VMRS(DisasContext *s, arg_VMSR_VMRS *a)
+          * VFPv2 allows access to FPSID from userspace; VFPv3 restricts
-     if (!s->vfp_enabled && !ignore_vfp_enabled) {
+          * all ID registers to privileged access only.
-         assert(!arm_dc_feature(s, ARM_FEATURE_M));
+          */
--        gen_exception_insn(s, s->pc_curr, EXCP_UDEF, syn_uncategorized(),
+-        if (IS_USER(s) && arm_dc_feature(s, ARM_FEATURE_VFP3)) {
--                           default_exception_el(s));
++        if (IS_USER(s) && dc_isar_feature(aa32_fpsp_v3, s)) {
-+        unallocated_encoding(s);
+             return false;
          }
          ignore_vfp_enabled = true;
@@ -XXX,XX +XXX,XX @@ static bool trans_VMSR_VMRS(DisasContext *s, arg_VMSR_VMRS *a)
      case ARM_VFP_FPINST:
      case ARM_VFP_FPINST2:
          /* Not present in VFPv3 */
 -        if (IS_USER(s) || arm_dc_feature(s, ARM_FEATURE_VFP3)) {
 +        if (IS_USER(s) || dc_isar_feature(aa32_fpsp_v3, s)) {
              return false;
          }
          break;
@@ -XXX,XX +XXX,XX @@ static bool trans_VMOV_imm_sp(DisasContext *s, arg_VMOV_imm_sp *a)
      vd = a->vd;
 -    if (!dc_isar_feature(aa32_fpshvec, s) &&
 -        (veclen != 0 || s->vec_stride != 0)) {
 +    if (!dc_isar_feature(aa32_fpsp_v3, s)) {
          return false;
      }
-diff --git a/target/arm/translate.c b/target/arm/translate.c
+-    if (!arm_dc_feature(s, ARM_FEATURE_VFP3)) {
-index XXXXXXX..XXXXXXX 100644
++    if (!dc_isar_feature(aa32_fpshvec, s) &&
---- a/target/arm/translate.c
++        (veclen != 0 || s->vec_stride != 0)) {
-+++ b/target/arm/translate.c
+         return false;
@@ -XXX,XX +XXX,XX @@ static void gen_exception_bkpt_insn(DisasContext *s, uint32_t syn)
      s->base.is_jmp = DISAS_NORETURN;
  }
 +void unallocated_encoding(DisasContext *s)
 +{
 +    /* Unallocated and reserved encodings are uncategorized */
 +    gen_exception_insn(s, s->pc_curr, EXCP_UDEF, syn_uncategorized(),
 +                       default_exception_el(s));
 +}
 +
  /* Force a TB lookup after an instruction that changes the CPU state.  */
  static inline void gen_lookup_tb(DisasContext *s)
  {
@@ -XXX,XX +XXX,XX @@ static inline void gen_hlt(DisasContext *s, int imm)
          return;
      }
--    gen_exception_insn(s, s->pc_curr, EXCP_UDEF, syn_uncategorized(),
+@@ -XXX,XX +XXX,XX @@ static bool trans_VMOV_imm_dp(DisasContext *s, arg_VMOV_imm_dp *a)
--                       default_exception_el(s));
-+    unallocated_encoding(s);
+     vd = a->vd;
- }
+-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
- static inline void gen_add_data_offset(DisasContext *s, unsigned int insn,
++    if (!dc_isar_feature(aa32_fpdp_v3, s)) {
-@@ -XXX,XX +XXX,XX @@ static void gen_srs(DisasContext *s,
+         return false;
      }
-     if (undef) {
+@@ -XXX,XX +XXX,XX @@ static bool trans_VMOV_imm_dp(DisasContext *s, arg_VMOV_imm_dp *a)
--        gen_exception_insn(s, s->pc_curr, EXCP_UDEF, syn_uncategorized(),
+         return false;
 -                           default_exception_el(s));
 +        unallocated_encoding(s);
          return;
      }
-@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
+-    if (!arm_dc_feature(s, ARM_FEATURE_VFP3)) {
-             break;
+-        return false;
-         default:
+-    }
-         illegal_op:
+-
--            gen_exception_insn(s, s->pc_curr, EXCP_UDEF, syn_uncategorized(),
+     if (!vfp_access_check(s)) {
--                               default_exception_el(s));
+         return true;
 +            unallocated_encoding(s);
              break;
          }
      }
-@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_fix_sp(DisasContext *s, arg_VCVT_fix_sp *a)
      TCGv_ptr fpst;
      int frac_bits;
 -    if (!arm_dc_feature(s, ARM_FEATURE_VFP3)) {
 +    if (!dc_isar_feature(aa32_fpsp_v3, s)) {
          return false;
      }
-     return;
- illegal_op:
+@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_fix_dp(DisasContext *s, arg_VCVT_fix_dp *a)
--    gen_exception_insn(s, s->pc_curr, EXCP_UDEF, syn_uncategorized(),
+     TCGv_ptr fpst;
--                       default_exception_el(s));
+     int frac_bits;
-+    unallocated_encoding(s);
- }
+-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
+-        return false;
- static void disas_thumb_insn(DisasContext *s, uint32_t insn)
+-    }
-@@ -XXX,XX +XXX,XX @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
+-
-     return;
+-    if (!arm_dc_feature(s, ARM_FEATURE_VFP3)) {
- illegal_op:
++    if (!dc_isar_feature(aa32_fpdp_v3, s)) {
- undef:
+         return false;
--    gen_exception_insn(s, s->pc_curr, EXCP_UDEF, syn_uncategorized(),
+     }
--                       default_exception_el(s));
 +    unallocated_encoding(s);
  }
  static bool insn_crosses_page(CPUARMState *env, DisasContext *s)
 --
 .20.1

-[Qemu-devel] [PULL 08/29] target/arm: Introduce read_pc
+[PULL 52/52] target/arm: Add missing checks for fpsp_v2
 From: Richard Henderson <richard.henderson@linaro.org>
-We currently have 3 different ways of computing the architectural
+We will eventually remove the early ARM_FEATURE_VFP test,
-value of "PC" as seen in the ARM ARM.
+so add a proper test for each trans_* that does not already
+have another ISA test.
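
For illustration only, the check ordering this patch applies to each trans_* function can be modelled in isolation. This is a minimal sketch, not the QEMU code: TransSketch, access_check_sketch() and trans_sketch() are hypothetical names, and the two booleans merely stand in for dc_isar_feature(aa32_fpsp_v2, s) and vfp_access_check(s). It assumes the usual decoder convention that returning false means "not decoded, treat as UNDEF" and returning true means "handled".

```c
#include <stdbool.h>

/* Hypothetical stand-in for the DisasContext state consulted below. */
typedef struct {
    bool has_fpsp_v2;       /* models dc_isar_feature(aa32_fpsp_v2, s) */
    bool fp_enabled;        /* models a successful vfp_access_check(s) */
    bool exception_raised;  /* set when the access check fails */
    bool insn_emitted;      /* set when code generation is reached */
} TransSketch;

/* Models vfp_access_check(): on failure an exception is recorded. */
static inline bool access_check_sketch(TransSketch *s)
{
    if (!s->fp_enabled) {
        s->exception_raised = true;
        return false;
    }
    return true;
}

/* Assumed shape of a trans_* function once the ISA test comes first. */
static inline bool trans_sketch(TransSketch *s)
{
    if (!s->has_fpsp_v2) {
        return false;       /* not decoded: the caller treats it as UNDEF */
    }
    if (!access_check_sketch(s)) {
        return true;        /* handled: the exception is the result */
    }
    s->insn_emitted = true; /* normal code generation would go here */
    return true;
}
```

The point of the ordering is that a CPU without the feature never reaches the access check, so no access exception is recorded for an instruction that does not exist on that CPU.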
 The value of s->pc has been incremented past the current insn,
 but that is all.  Thus for a32, PC = s->pc + 4; for t32, PC = s->pc;
 for t16, PC = s->pc + 2.  These differing computations make it
 impossible at present to unify the various code paths.
 With the newly introduced s->pc_curr, we can compute the correct
 value for all cases, using the formula given in the ARM ARM.
 This changes the behaviour for load_reg() and load_reg_var()
 when called with reg==15 from a 32-bit Thumb instruction:
 previously they would have returned the incorrect value
 of pc_curr + 6, and now they will return the architecturally
 correct value of PC, which is pc_curr + 4. This will not
 affect well-behaved guest software, because all of the places
 we call these functions from T32 code are instructions where
 using r15 is UNPREDICTABLE. Using the architectural PC value
 here is more consistent with the T16 and A32 behaviour.
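
The arithmetic above can be checked with a small standalone sketch. It is not the QEMU implementation: PcSketch, read_pc_sketch() and next_pc_sketch() are hypothetical names, and only the formula pc_curr + (thumb ? 4 : 8) is taken from the patch itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two DisasContext fields used here. */
typedef struct {
    uint32_t pc_curr;  /* address of the instruction being translated */
    bool thumb;        /* true for T16/T32, false for A32 */
} PcSketch;

/* Architectural PC: insn address + 4 in Thumb state, + 8 in ARM state. */
static inline uint32_t read_pc_sketch(const PcSketch *s)
{
    return s->pc_curr + (s->thumb ? 4 : 8);
}

/* Old-style s->pc: already advanced past the current instruction. */
static inline uint32_t next_pc_sketch(const PcSketch *s, unsigned insn_len)
{
    return s->pc_curr + insn_len;
}
```

For an instruction at 0x1000, read_pc_sketch() gives 0x1008 for A32 and 0x1004 for both T16 and T32. These equal s->pc + 4, s->pc + 2 and s->pc respectively, matching the three older computations. The previous Thumb load_reg() formula, s->pc + 2, would give 0x1006 for a 32-bit Thumb instruction, which is the pc_curr + 6 discrepancy described above.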
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20200214181547.21408-11-richard.henderson@linaro.org
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
-Message-id: 20190807045335.1361-4-richard.henderson@linaro.org
-[PMM: added commit message note about UNPREDICTABLE T32 cases]
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate.c | 59 ++++++++++++++++--------------------------
+ target/arm/translate-vfp.inc.c | 78 ++++++++++++++++++++++++++++++----
-file changed, 23 insertions(+), 36 deletions(-)
+file changed, 69 insertions(+), 9 deletions(-)
-diff --git a/target/arm/translate.c b/target/arm/translate.c
+diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.c
+--- a/target/arm/translate-vfp.inc.c
-+++ b/target/arm/translate.c
++++ b/target/arm/translate-vfp.inc.c
-@@ -XXX,XX +XXX,XX @@ static inline void store_cpu_offset(TCGv_i32 var, int offset)
+@@ -XXX,XX +XXX,XX @@ static bool trans_VMOV_to_gp(DisasContext *s, arg_VMOV_to_gp *a)
- #define store_cpu_field(var, name) \
+     int pass;
-     store_cpu_offset(var, offsetof(CPUARMState, name))
+     uint32_t offset;
-+/* The architectural value of PC.  */
++    /* SIZE == 2 is a VFP instruction; otherwise NEON.  */
-+static uint32_t read_pc(DisasContext *s)
++    if (a->size == 2
-+{
++        ? !dc_isar_feature(aa32_fpsp_v2, s)
-+    return s->pc_curr + (s->thumb ? 4 : 8);
++        : !arm_dc_feature(s, ARM_FEATURE_NEON)) {
-+}
++        return false;
-+
++    }
- /* Set a variable to the value of a CPU register.  */
++
- static void load_reg_var(DisasContext *s, TCGv_i32 var, int reg)
+     /* UNDEF accesses to D16-D31 if they don't exist */
      if (!dc_isar_feature(aa32_simd_r32, s) && (a->vn & 0x10)) {
          return false;
@@ -XXX,XX +XXX,XX @@ static bool trans_VMOV_to_gp(DisasContext *s, arg_VMOV_to_gp *a)
      pass = extract32(offset, 2, 1);
      offset = extract32(offset, 0, 2) * 8;
 -    if (a->size != 2 && !arm_dc_feature(s, ARM_FEATURE_NEON)) {
 -        return false;
 -    }
 -
      if (!vfp_access_check(s)) {
          return true;
      }
@@ -XXX,XX +XXX,XX @@ static bool trans_VMOV_from_gp(DisasContext *s, arg_VMOV_from_gp *a)
      int pass;
      uint32_t offset;
 +    /* SIZE == 2 is a VFP instruction; otherwise NEON.  */
 +    if (a->size == 2
 +        ? !dc_isar_feature(aa32_fpsp_v2, s)
 +        : !arm_dc_feature(s, ARM_FEATURE_NEON)) {
 +        return false;
 +    }
 +
      /* UNDEF accesses to D16-D31 if they don't exist */
      if (!dc_isar_feature(aa32_simd_r32, s) && (a->vn & 0x10)) {
          return false;
@@ -XXX,XX +XXX,XX @@ static bool trans_VMOV_from_gp(DisasContext *s, arg_VMOV_from_gp *a)
      pass = extract32(offset, 2, 1);
      offset = extract32(offset, 0, 2) * 8;
 -    if (a->size != 2 && !arm_dc_feature(s, ARM_FEATURE_NEON)) {
 -        return false;
 -    }
 -
      if (!vfp_access_check(s)) {
          return true;
      }
@@ -XXX,XX +XXX,XX @@ static bool trans_VMSR_VMRS(DisasContext *s, arg_VMSR_VMRS *a)
      TCGv_i32 tmp;
      bool ignore_vfp_enabled = false;
 +    if (!dc_isar_feature(aa32_fpsp_v2, s)) {
 +        return false;
 +    }
 +
      if (arm_dc_feature(s, ARM_FEATURE_M)) {
          /*
           * The only M-profile VFP vmrs/vmsr sysreg is FPSCR.
@@ -XXX,XX +XXX,XX @@ static bool trans_VMOV_single(DisasContext *s, arg_VMOV_single *a)
  {
-     if (reg == 15) {
+     TCGv_i32 tmp;
--        uint32_t addr;
--        /* normally, since we updated PC, we need only to add one insn */
++    if (!dc_isar_feature(aa32_fpsp_v2, s)) {
--        if (s->thumb)
++        return false;
--            addr = (long)s->pc + 2;
++    }
--        else
++
--            addr = (long)s->pc + 4;
+     if (!vfp_access_check(s)) {
--        tcg_gen_movi_i32(var, addr);
+         return true;
-+        tcg_gen_movi_i32(var, read_pc(s));
+     }
-     } else {
+@@ -XXX,XX +XXX,XX @@ static bool trans_VMOV_64_sp(DisasContext *s, arg_VMOV_64_sp *a)
-         tcg_gen_mov_i32(var, cpu_R[reg]);
+ {
-     }
+     TCGv_i32 tmp;
-@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
-             /* branch link and change to thumb (blx <offset>) */
++    if (!dc_isar_feature(aa32_fpsp_v2, s)) {
-             int32_t offset;
++        return false;
++    }
--            val = (uint32_t)s->pc;
++
-             tmp = tcg_temp_new_i32();
+     /*
--            tcg_gen_movi_i32(tmp, val);
+      * VMOV between two general-purpose registers and two single precision
-+            tcg_gen_movi_i32(tmp, s->pc);
+      * floating point registers
-             store_reg(s, 14, tmp);
+@@ -XXX,XX +XXX,XX @@ static bool trans_VMOV_64_dp(DisasContext *s, arg_VMOV_64_dp *a)
-             /* Sign-extend the 24-bit offset */
-             offset = (((int32_t)insn) << 8) >> 8;
+     /*
-+            val = read_pc(s);
+      * VMOV between two general-purpose registers and one double precision
-             /* offset * 4 + bit24 * 2 + (thumb bit) */
+-     * floating point register
-             val += (offset << 2) | ((insn >> 23) & 2) | 1;
++     * floating point register.  Note that this does not require support
--            /* pipeline offset */
++     * for double precision arithmetic.
--            val += 4;
+      */
-             /* protected by ARCH(5); above, near the start of uncond block */
++    if (!dc_isar_feature(aa32_fpsp_v2, s)) {
-             gen_bx_im(s, val);
++        return false;
-             return;
++    }
-@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
-                         } else {
+     /* UNDEF accesses to D16-D31 if they don't exist */
-                             /* store */
+     if (!dc_isar_feature(aa32_simd_r32, s) && (a->vm & 0x10)) {
-                             if (i == 15) {
+@@ -XXX,XX +XXX,XX @@ static bool trans_VLDR_VSTR_sp(DisasContext *s, arg_VLDR_VSTR_sp *a)
--                                /* special case: r15 = PC + 8 */
+     uint32_t offset;
--                                val = (long)s->pc + 4;
+     TCGv_i32 addr, tmp;
-                                 tmp = tcg_temp_new_i32();
--                                tcg_gen_movi_i32(tmp, val);
++    if (!dc_isar_feature(aa32_fpsp_v2, s)) {
-+                                tcg_gen_movi_i32(tmp, read_pc(s));
++        return false;
-                             } else if (user) {
++    }
-                                 tmp = tcg_temp_new_i32();
++
-                                 tmp2 = tcg_const_i32(i);
+     if (!vfp_access_check(s)) {
-@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
+         return true;
-                 int32_t offset;
+     }
+@@ -XXX,XX +XXX,XX @@ static bool trans_VLDR_VSTR_dp(DisasContext *s, arg_VLDR_VSTR_dp *a)
-                 /* branch (and link) */
+     TCGv_i32 addr;
--                val = (int32_t)s->pc;
+     TCGv_i64 tmp;
-                 if (insn & (1 << 24)) {
-                     tmp = tcg_temp_new_i32();
++    /* Note that this does not require support for double arithmetic.  */
--                    tcg_gen_movi_i32(tmp, val);
++    if (!dc_isar_feature(aa32_fpsp_v2, s)) {
-+                    tcg_gen_movi_i32(tmp, s->pc);
++        return false;
-                     store_reg(s, 14, tmp);
++    }
-                 }
++
-                 offset = sextract32(insn << 2, 0, 26);
+     /* UNDEF accesses to D16-D31 if they don't exist */
--                val += offset + 4;
+     if (!dc_isar_feature(aa32_simd_r32, s) && (a->vd & 0x10)) {
--                gen_jmp(s, val);
+         return false;
-+                gen_jmp(s, read_pc(s) + offset);
+@@ -XXX,XX +XXX,XX @@ static bool trans_VLDM_VSTM_sp(DisasContext *s, arg_VLDM_VSTM_sp *a)
-             }
+     TCGv_i32 addr, tmp;
-             break;
+     int i, n;
-         case 0xc:
-@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
++    if (!dc_isar_feature(aa32_fpsp_v2, s)) {
-                 tcg_temp_free_i32(addr);
++        return false;
-             } else if ((insn & (7 << 5)) == 0) {
++    }
-                 /* Table Branch.  */
++
--                if (rn == 15) {
+     n = a->imm;
--                    addr = tcg_temp_new_i32();
--                    tcg_gen_movi_i32(addr, s->pc);
+     if (n == 0 || (a->vd + n) > 32) {
--                } else {
+@@ -XXX,XX +XXX,XX @@ static bool trans_VLDM_VSTM_dp(DisasContext *s, arg_VLDM_VSTM_dp *a)
--                    addr = load_reg(s, rn);
+     TCGv_i64 tmp;
--                }
+     int i, n;
-+                addr = load_reg(s, rn);
-                 tmp = load_reg(s, rm);
++    /* Note that this does not require support for double arithmetic.  */
-                 tcg_gen_add_i32(addr, addr, tmp);
++    if (!dc_isar_feature(aa32_fpsp_v2, s)) {
-                 if (insn & (1 << 4)) {
++        return false;
-@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
++    }
-                 }
++
-                 tcg_temp_free_i32(addr);
+     n = a->imm >> 1;
-                 tcg_gen_shli_i32(tmp, tmp, 1);
--                tcg_gen_addi_i32(tmp, tmp, s->pc);
+     if (n == 0 || (a->vd + n) > 32 || n > 16) {
-+                tcg_gen_addi_i32(tmp, tmp, read_pc(s));
+@@ -XXX,XX +XXX,XX @@ static bool do_vfp_3op_sp(DisasContext *s, VFPGen3OpSPFn *fn,
-                 store_reg(s, 15, tmp);
+     TCGv_i32 f0, f1, fd;
-             } else {
+     TCGv_ptr fpst;
-                 bool is_lasr = false;
-@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
++    if (!dc_isar_feature(aa32_fpsp_v2, s)) {
-                     tcg_gen_movi_i32(cpu_R[14], s->pc | 1);
++        return false;
-                 }
++    }
++
--                offset += s->pc;
+     if (!dc_isar_feature(aa32_fpshvec, s) &&
-+                offset += read_pc(s);
+         (veclen != 0 || s->vec_stride != 0)) {
-                 if (insn & (1 << 12)) {
+         return false;
-                     /* b/bl */
+@@ -XXX,XX +XXX,XX @@ static bool do_vfp_2op_sp(DisasContext *s, VFPGen2OpSPFn *fn, int vd, int vm)
-                     gen_jmp(s, offset);
+     int veclen = s->vec_len;
-@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
+     TCGv_i32 f0, fd;
-                 offset |= (insn & (1 << 11)) << 8;
++    if (!dc_isar_feature(aa32_fpsp_v2, s)) {
-                 /* jump to the offset */
++        return false;
--                gen_jmp(s, s->pc + offset);
++    }
-+                gen_jmp(s, read_pc(s) + offset);
++
-             }
+     if (!dc_isar_feature(aa32_fpshvec, s) &&
-         } else {
+         (veclen != 0 || s->vec_stride != 0)) {
-             /*
+         return false;
-@@ -XXX,XX +XXX,XX @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static bool trans_VCMP_sp(DisasContext *s, arg_VCMP_sp *a)
-         if (insn & (1 << 11)) {
+ {
-             rd = (insn >> 8) & 7;
+     TCGv_i32 vd, vm;
-             /* load pc-relative.  Bit 1 of PC is ignored.  */
--            val = s->pc + 2 + ((insn & 0xff) * 4);
++    if (!dc_isar_feature(aa32_fpsp_v2, s)) {
-+            val = read_pc(s) + ((insn & 0xff) * 4);
++        return false;
-             val &= ~(uint32_t)2;
++    }
-             addr = tcg_temp_new_i32();
++
-             tcg_gen_movi_i32(addr, val);
+     /* Vm/M bits must be zero for the Z variant */
-@@ -XXX,XX +XXX,XX @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
+     if (a->z && a->vm != 0) {
-         } else {
+         return false;
-             /* PC. bit 1 is ignored.  */
+@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_int_sp(DisasContext *s, arg_VCVT_int_sp *a)
-             tmp = tcg_temp_new_i32();
+     TCGv_i32 vm;
--            tcg_gen_movi_i32(tmp, (s->pc + 2) & ~(uint32_t)2);
+     TCGv_ptr fpst;
-+            tcg_gen_movi_i32(tmp, read_pc(s) & ~(uint32_t)2);
-         }
++    if (!dc_isar_feature(aa32_fpsp_v2, s)) {
-         val = (insn & 0xff) * 4;
++        return false;
-         tcg_gen_addi_i32(tmp, tmp, val);
++    }
-@@ -XXX,XX +XXX,XX @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
++
-                 tcg_gen_brcondi_i32(TCG_COND_NE, tmp, 0, s->condlabel);
+     if (!vfp_access_check(s)) {
-             tcg_temp_free_i32(tmp);
+         return true;
-             offset = ((insn & 0xf8) >> 2) | (insn & 0x200) >> 3;
+     }
--            val = (uint32_t)s->pc + 2;
+@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_sp_int(DisasContext *s, arg_VCVT_sp_int *a)
--            val += offset;
+     TCGv_i32 vm;
--            gen_jmp(s, val);
+     TCGv_ptr fpst;
-+            gen_jmp(s, read_pc(s) + offset);
-             break;
++    if (!dc_isar_feature(aa32_fpsp_v2, s)) {
++        return false;
-         case 15: /* IT, nop-hint.  */
++    }
-@@ -XXX,XX +XXX,XX @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
++
-         arm_skip_unless(s, cond);
+     if (!vfp_access_check(s)) {
+         return true;
          /* jump to the offset */
 -        val = (uint32_t)s->pc + 2;
 +        val = read_pc(s);
          offset = ((int32_t)insn << 24) >> 24;
          val += offset << 1;
          gen_jmp(s, val);
@@ -XXX,XX +XXX,XX @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
              break;
          }
          /* unconditional branch */
 -        val = (uint32_t)s->pc;
 +        val = read_pc(s);
          offset = ((int32_t)insn << 21) >> 21;
 -        val += (offset << 1) + 2;
 +        val += offset << 1;
          gen_jmp(s, val);
          break;
@@ -XXX,XX +XXX,XX @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
              /* 0b1111_0xxx_xxxx_xxxx : BL/BLX prefix */
              uint32_t uoffset = ((int32_t)insn << 21) >> 9;
 -            tcg_gen_movi_i32(cpu_R[14], s->pc + 2 + uoffset);
 +            tcg_gen_movi_i32(cpu_R[14], read_pc(s) + uoffset);
          }
          break;
      }
 --
 .20.1

First arm pullreq of 4.2...

thanks
-- PMM

The following changes since commit 27608c7c66bd923eb5e5faab80e795408cbe2b51:

Merge remote-tracking branch 'remotes/dgilbert/tags/pull-migration-20190814a' into staging (2019-08-16 12:00:18 +0100)

are available in the Git repository at:

https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20190816

for you to fetch changes up to 664b7e3b97d6376f3329986c465b3782458b0f8b:

target/arm: Use tcg_gen_extrh_i64_i32 to extract the high word (2019-08-16 14:02:53 +0100)

----------------------------------------------------------------
target-arm queue:
 * target/arm: generate a custom MIDR for -cpu max
 * hw/misc/zynq_slcr: refactor to use standard register definition
 * Set ENET_BD_BDU in I.MX FEC controller
 * target/arm: Fix routing of singlestep exceptions
 * refactor a32/t32 decoder handling of PC
 * minor optimisations/cleanups of some a32/t32 codegen
 * target/arm/cpu64: Ensure kvm really supports aarch64=off
 * target/arm/cpu: Ensure we can use the pmu with kvm
 * target/arm: Minor cleanups preparatory to KVM SVE support

----------------------------------------------------------------
Aaron Hill (1):
      Set ENET_BD_BDU in I.MX FEC controller

Alex Bennée (1):
      target/arm: generate a custom MIDR for -cpu max

Andrew Jones (6):
      target/arm/cpu64: Ensure kvm really supports aarch64=off
      target/arm/cpu: Ensure we can use the pmu with kvm
      target/arm/helper: zcr: Add build bug next to value range assumption
      target/arm/cpu: Use div-round-up to determine predicate register array size
      target/arm/kvm64: Fix error returns
      target/arm/kvm64: Move the get/put of fpsimd registers out

Damien Hedde (1):
      hw/misc/zynq_slcr: use standard register definition

Peter Maydell (2):
      target/arm: Factor out 'generate singlestep exception' function
      target/arm: Fix routing of singlestep exceptions

Richard Henderson (18):
      target/arm: Pass in pc to thumb_insn_is_16bit
      target/arm: Introduce pc_curr
      target/arm: Introduce read_pc
      target/arm: Introduce add_reg_for_lit
      target/arm: Remove redundant s->pc & ~1
      target/arm: Replace s->pc with s->base.pc_next
      target/arm: Replace offset with pc in gen_exception_insn
      target/arm: Replace offset with pc in gen_exception_internal_insn
      target/arm: Remove offset argument to gen_exception_bkpt_insn
      target/arm: Use unallocated_encoding for aarch32
      target/arm: Remove helper_double_saturate
      target/arm: Use tcg_gen_extract_i32 for shifter_out_im
      target/arm: Use tcg_gen_deposit_i32 for PKHBT, PKHTB
      target/arm: Remove redundant shift tests
      target/arm: Use ror32 instead of open-coding the operation
      target/arm: Use tcg_gen_rotri_i32 for gen_swap_half
      target/arm: Simplify SMMLA, SMMLAR, SMMLS, SMMLSR
      target/arm: Use tcg_gen_extrh_i64_i32 to extract the high word

From: Alex Bennée <alex.bennee@linaro.org>

While most features are now detected by probing the ID_* registers,
kernels can (and do) use MIDR_EL1 for working out if they have to
apply errata. This can trip up warnings in the kernel as it tries to
work out if it should apply workarounds to features that don't
actually exist in the reported CPU type.

Avoid this problem by synthesising our own MIDR value.
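
As a rough illustration of the field packing involved, the standalone sketch below builds the same value without QEMU's FIELD_DP64 macro. The names deposit_field() and synth_max_midr() are hypothetical and exist only for this example; the field positions and values are the ones given in the diff below.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not QEMU's FIELD_DP64): store 'val' into the
 * 'length'-bit field of 'reg' that starts at bit 'start'.
 */
static inline uint64_t deposit_field(uint64_t reg, unsigned start,
                                     unsigned length, uint64_t val)
{
    assert(length > 0 && length < 64 && start + length <= 64);
    uint64_t mask = ((UINT64_C(1) << length) - 1) << start;
    return (reg & ~mask) | ((val << start) & mask);
}

/* Field layout as declared by the FIELD(MIDR_EL1, ...) lines. */
static inline uint64_t synth_max_midr(void)
{
    uint64_t t = 0;
    t = deposit_field(t, 24, 8, 0);    /* IMPLEMENTER: reserved for software */
    t = deposit_field(t, 16, 4, 0xf);  /* ARCHITECTURE: check ID registers */
    t = deposit_field(t, 4, 12, 'Q');  /* PARTNUM: ASCII 'Q' (0x51) */
    t = deposit_field(t, 20, 4, 0);    /* VARIANT */
    t = deposit_field(t, 0, 4, 0);     /* REVISION */
    return t;
}
```

Under these assumptions synth_max_midr() evaluates to 0x000f0510: 0xf in bits [19:16] and 0x51 in bits [15:4], with every other field zero.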

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190726113950.7499-1-alex.bennee@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h   |  6 ++++++
 target/arm/cpu64.c | 19 +++++++++++++++++++
 2 files changed, 25 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ FIELD(V7M_FPCCR, ASPEN, 31, 1)
 /*
  * System register ID fields.
  */
+FIELD(MIDR_EL1, REVISION, 0, 4)
+FIELD(MIDR_EL1, PARTNUM, 4, 12)
+FIELD(MIDR_EL1, ARCHITECTURE, 16, 4)
+FIELD(MIDR_EL1, VARIANT, 20, 4)
+FIELD(MIDR_EL1, IMPLEMENTER, 24, 8)
+
 FIELD(ID_ISAR0, SWAP, 0, 4)
 FIELD(ID_ISAR0, BITCOUNT, 4, 4)
 FIELD(ID_ISAR0, BITFIELD, 8, 4)
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
         uint32_t u;
         aarch64_a57_initfn(obj);
 
+        /*
+         * Reset MIDR so the guest doesn't mistake our 'max' CPU type for a real
+         * one and try to apply errata workarounds or use impdef features we
+         * don't provide.
+         * An IMPLEMENTER field of 0 means "reserved for software use";
+         * ARCHITECTURE must be 0xf indicating "v7 or later, check ID registers
+         * to see which features are present";
+         * the VARIANT, PARTNUM and REVISION fields are all implementation
+         * defined and we choose to define PARTNUM just in case guest
+         * code needs to distinguish this QEMU CPU from other software
+         * implementations, though this shouldn't be needed.
+         */
+        t = FIELD_DP64(0, MIDR_EL1, IMPLEMENTER, 0);
+        t = FIELD_DP64(t, MIDR_EL1, ARCHITECTURE, 0xf);
+        t = FIELD_DP64(t, MIDR_EL1, PARTNUM, 'Q');
+        t = FIELD_DP64(t, MIDR_EL1, VARIANT, 0);
+        t = FIELD_DP64(t, MIDR_EL1, REVISION, 0);
+        cpu->midr = t;
+
         t = cpu->isar.id_aa64isar0;
         t = FIELD_DP64(t, ID_AA64ISAR0, AES, 2); /* AES + PMULL */
         t = FIELD_DP64(t, ID_AA64ISAR0, SHA1, 1);
-- 
2.20.1

From: Damien Hedde <damien.hedde@greensocs.com>

Replace the zynq_slcr register enum and macros with the
hw/registerfields.h macros.
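
To show what the conversion relies on, here is a small sketch of how
such register macros can work. The macro bodies are simplified
stand-ins written for this note, not copies of hw/registerfields.h,
and soft_reset_requested() is a hypothetical helper. The idea is that
each REG32() yields a byte address (A_ prefix) and a 32-bit word index
(R_ prefix), which is why the s->regs[] indices gain an R_ prefix.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in: A_<reg> is the byte offset, R_<reg> the word index. */
#define REG32(reg, addr) \
    enum { A_ ## reg = (addr) }; \
    enum { R_ ## reg = (addr) / 4 };

/* Simplified stand-in: record shift and length of a field in a register. */
#define FIELD(reg, field, shift, length) \
    enum { R_ ## reg ## _ ## field ## _SHIFT = (shift) }; \
    enum { R_ ## reg ## _ ## field ## _LENGTH = (length) };

/* Simplified stand-in: extract a field; assumes the length is below 32. */
#define FIELD_EX32(storage, reg, field) \
    (((uint32_t)(storage) >> R_ ## reg ## _ ## field ## _SHIFT) & \
     ((UINT32_C(1) << R_ ## reg ## _ ## field ## _LENGTH) - 1))

REG32(LOCKSTA, 0x00c)
REG32(PSS_RST_CTRL, 0x200)
    FIELD(PSS_RST_CTRL, SOFT_RST, 0, 1)

/* Hypothetical helper mirroring the check in zynq_slcr_write(). */
static int soft_reset_requested(uint32_t val)
{
    return FIELD_EX32(val, PSS_RST_CTRL, SOFT_RST) != 0;
}
```

With these definitions R_LOCKSTA is 3 and A_LOCKSTA is 0xc, so the
register array can still be indexed by word while the declarations
carry the byte offsets from the datasheet.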

Signed-off-by: Damien Hedde <damien.hedde@greensocs.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20190729145654.14644-30-damien.hedde@greensocs.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/misc/zynq_slcr.c | 450 ++++++++++++++++++++++----------------------
 1 file changed, 225 insertions(+), 225 deletions(-)

diff --git a/hw/misc/zynq_slcr.c b/hw/misc/zynq_slcr.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/misc/zynq_slcr.c
+++ b/hw/misc/zynq_slcr.c
@@ -XXX,XX +XXX,XX @@
 #include "sysemu/sysemu.h"
 #include "qemu/log.h"
 #include "qemu/module.h"
+#include "hw/registerfields.h"
 
 #ifndef ZYNQ_SLCR_ERR_DEBUG
 #define ZYNQ_SLCR_ERR_DEBUG 0
@@ -XXX,XX +XXX,XX @@
 #define XILINX_LOCK_KEY 0x767b
 #define XILINX_UNLOCK_KEY 0xdf0d
 
-#define R_PSS_RST_CTRL_SOFT_RST 0x1
+REG32(SCL, 0x000)
+REG32(LOCK, 0x004)
+REG32(UNLOCK, 0x008)
+REG32(LOCKSTA, 0x00c)
 
-enum {
-    SCL             = 0x000 / 4,
-    LOCK,
-    UNLOCK,
-    LOCKSTA,
+REG32(ARM_PLL_CTRL, 0x100)
+REG32(DDR_PLL_CTRL, 0x104)
+REG32(IO_PLL_CTRL, 0x108)
+REG32(PLL_STATUS, 0x10c)
+REG32(ARM_PLL_CFG, 0x110)
+REG32(DDR_PLL_CFG, 0x114)
+REG32(IO_PLL_CFG, 0x118)
 
-    ARM_PLL_CTRL    = 0x100 / 4,
-    DDR_PLL_CTRL,
-    IO_PLL_CTRL,
-    PLL_STATUS,
-    ARM_PLL_CFG,
-    DDR_PLL_CFG,
-    IO_PLL_CFG,
-
-    ARM_CLK_CTRL    = 0x120 / 4,
-    DDR_CLK_CTRL,
-    DCI_CLK_CTRL,
-    APER_CLK_CTRL,
-    USB0_CLK_CTRL,
-    USB1_CLK_CTRL,
-    GEM0_RCLK_CTRL,
-    GEM1_RCLK_CTRL,
-    GEM0_CLK_CTRL,
-    GEM1_CLK_CTRL,
-    SMC_CLK_CTRL,
-    LQSPI_CLK_CTRL,
-    SDIO_CLK_CTRL,
-    UART_CLK_CTRL,
-    SPI_CLK_CTRL,
-    CAN_CLK_CTRL,
-    CAN_MIOCLK_CTRL,
-    DBG_CLK_CTRL,
-    PCAP_CLK_CTRL,
-    TOPSW_CLK_CTRL,
+REG32(ARM_CLK_CTRL, 0x120)
+REG32(DDR_CLK_CTRL, 0x124)
+REG32(DCI_CLK_CTRL, 0x128)
+REG32(APER_CLK_CTRL, 0x12c)
+REG32(USB0_CLK_CTRL, 0x130)
+REG32(USB1_CLK_CTRL, 0x134)
+REG32(GEM0_RCLK_CTRL, 0x138)
+REG32(GEM1_RCLK_CTRL, 0x13c)
+REG32(GEM0_CLK_CTRL, 0x140)
+REG32(GEM1_CLK_CTRL, 0x144)
+REG32(SMC_CLK_CTRL, 0x148)
+REG32(LQSPI_CLK_CTRL, 0x14c)
+REG32(SDIO_CLK_CTRL, 0x150)
+REG32(UART_CLK_CTRL, 0x154)
+REG32(SPI_CLK_CTRL, 0x158)
+REG32(CAN_CLK_CTRL, 0x15c)
+REG32(CAN_MIOCLK_CTRL, 0x160)
+REG32(DBG_CLK_CTRL, 0x164)
+REG32(PCAP_CLK_CTRL, 0x168)
+REG32(TOPSW_CLK_CTRL, 0x16c)
 
 #define FPGA_CTRL_REGS(n, start) \
-    FPGA ## n ## _CLK_CTRL = (start) / 4, \
-    FPGA ## n ## _THR_CTRL, \
-    FPGA ## n ## _THR_CNT, \
-    FPGA ## n ## _THR_STA,
-    FPGA_CTRL_REGS(0, 0x170)
-    FPGA_CTRL_REGS(1, 0x180)
-    FPGA_CTRL_REGS(2, 0x190)
-    FPGA_CTRL_REGS(3, 0x1a0)
+    REG32(FPGA ## n ## _CLK_CTRL, (start)) \
+    REG32(FPGA ## n ## _THR_CTRL, (start) + 0x4)\
+    REG32(FPGA ## n ## _THR_CNT,  (start) + 0x8)\
+    REG32(FPGA ## n ## _THR_STA,  (start) + 0xc)
+FPGA_CTRL_REGS(0, 0x170)
+FPGA_CTRL_REGS(1, 0x180)
+FPGA_CTRL_REGS(2, 0x190)
+FPGA_CTRL_REGS(3, 0x1a0)
 
-    BANDGAP_TRIP    = 0x1b8 / 4,
-    PLL_PREDIVISOR  = 0x1c0 / 4,
-    CLK_621_TRUE,
+REG32(BANDGAP_TRIP, 0x1b8)
+REG32(PLL_PREDIVISOR, 0x1c0)
+REG32(CLK_621_TRUE, 0x1c4)
 
-    PSS_RST_CTRL    = 0x200 / 4,
-    DDR_RST_CTRL,
-    TOPSW_RESET_CTRL,
-    DMAC_RST_CTRL,
-    USB_RST_CTRL,
-    GEM_RST_CTRL,
-    SDIO_RST_CTRL,
-    SPI_RST_CTRL,
-    CAN_RST_CTRL,
-    I2C_RST_CTRL,
-    UART_RST_CTRL,
-    GPIO_RST_CTRL,
-    LQSPI_RST_CTRL,
-    SMC_RST_CTRL,
-    OCM_RST_CTRL,
-    FPGA_RST_CTRL   = 0x240 / 4,
-    A9_CPU_RST_CTRL,
+REG32(PSS_RST_CTRL, 0x200)
+    FIELD(PSS_RST_CTRL, SOFT_RST, 0, 1)
+REG32(DDR_RST_CTRL, 0x204)
+REG32(TOPSW_RESET_CTRL, 0x208)
+REG32(DMAC_RST_CTRL, 0x20c)
+REG32(USB_RST_CTRL, 0x210)
+REG32(GEM_RST_CTRL, 0x214)
+REG32(SDIO_RST_CTRL, 0x218)
+REG32(SPI_RST_CTRL, 0x21c)
+REG32(CAN_RST_CTRL, 0x220)
+REG32(I2C_RST_CTRL, 0x224)
+REG32(UART_RST_CTRL, 0x228)
+REG32(GPIO_RST_CTRL, 0x22c)
+REG32(LQSPI_RST_CTRL, 0x230)
+REG32(SMC_RST_CTRL, 0x234)
+REG32(OCM_RST_CTRL, 0x238)
+REG32(FPGA_RST_CTRL, 0x240)
+REG32(A9_CPU_RST_CTRL, 0x244)
 
-    RS_AWDT_CTRL    = 0x24c / 4,
-    RST_REASON,
+REG32(RS_AWDT_CTRL, 0x24c)
+REG32(RST_REASON, 0x250)
 
-    REBOOT_STATUS   = 0x258 / 4,
-    BOOT_MODE,
+REG32(REBOOT_STATUS, 0x258)
+REG32(BOOT_MODE, 0x25c)
 
-    APU_CTRL        = 0x300 / 4,
-    WDT_CLK_SEL,
+REG32(APU_CTRL, 0x300)
+REG32(WDT_CLK_SEL, 0x304)
 
-    TZ_DMA_NS       = 0x440 / 4,
-    TZ_DMA_IRQ_NS,
-    TZ_DMA_PERIPH_NS,
+REG32(TZ_DMA_NS, 0x440)
+REG32(TZ_DMA_IRQ_NS, 0x444)
+REG32(TZ_DMA_PERIPH_NS, 0x448)
 
-    PSS_IDCODE      = 0x530 / 4,
+REG32(PSS_IDCODE, 0x530)
 
-    DDR_URGENT      = 0x600 / 4,
-    DDR_CAL_START   = 0x60c / 4,
-    DDR_REF_START   = 0x614 / 4,
-    DDR_CMD_STA,
-    DDR_URGENT_SEL,
-    DDR_DFI_STATUS,
+REG32(DDR_URGENT, 0x600)
+REG32(DDR_CAL_START, 0x60c)
+REG32(DDR_REF_START, 0x614)
+REG32(DDR_CMD_STA, 0x618)
+REG32(DDR_URGENT_SEL, 0x61c)
+REG32(DDR_DFI_STATUS, 0x620)
 
-    MIO             = 0x700 / 4,
+REG32(MIO, 0x700)
 #define MIO_LENGTH 54
 
-    MIO_LOOPBACK    = 0x804 / 4,
-    MIO_MST_TRI0,
-    MIO_MST_TRI1,
+REG32(MIO_LOOPBACK, 0x804)
+REG32(MIO_MST_TRI0, 0x808)
+REG32(MIO_MST_TRI1, 0x80c)
 
-    SD0_WP_CD_SEL   = 0x830 / 4,
-    SD1_WP_CD_SEL,
+REG32(SD0_WP_CD_SEL, 0x830)
+REG32(SD1_WP_CD_SEL, 0x834)
 
-    LVL_SHFTR_EN    = 0x900 / 4,
-    OCM_CFG         = 0x910 / 4,
+REG32(LVL_SHFTR_EN, 0x900)
+REG32(OCM_CFG, 0x910)
 
-    CPU_RAM         = 0xa00 / 4,
+REG32(CPU_RAM, 0xa00)
 
-    IOU             = 0xa30 / 4,
+REG32(IOU, 0xa30)
 
-    DMAC_RAM        = 0xa50 / 4,
+REG32(DMAC_RAM, 0xa50)
 
-    AFI0            = 0xa60 / 4,
-    AFI1 = AFI0 + 3,
-    AFI2 = AFI1 + 3,
-    AFI3 = AFI2 + 3,
+REG32(AFI0, 0xa60)
+REG32(AFI1, 0xa6c)
+REG32(AFI2, 0xa78)
+REG32(AFI3, 0xa84)
 #define AFI_LENGTH 3
 
-    OCM             = 0xa90 / 4,
+REG32(OCM, 0xa90)
 
-    DEVCI_RAM       = 0xaa0 / 4,
+REG32(DEVCI_RAM, 0xaa0)
 
-    CSG_RAM         = 0xab0 / 4,
+REG32(CSG_RAM, 0xab0)
 
-    GPIOB_CTRL      = 0xb00 / 4,
-    GPIOB_CFG_CMOS18,
-    GPIOB_CFG_CMOS25,
-    GPIOB_CFG_CMOS33,
-    GPIOB_CFG_HSTL  = 0xb14 / 4,
-    GPIOB_DRVR_BIAS_CTRL,
+REG32(GPIOB_CTRL, 0xb00)
+REG32(GPIOB_CFG_CMOS18, 0xb04)
+REG32(GPIOB_CFG_CMOS25, 0xb08)
+REG32(GPIOB_CFG_CMOS33, 0xb0c)
+REG32(GPIOB_CFG_HSTL, 0xb14)
+REG32(GPIOB_DRVR_BIAS_CTRL, 0xb18)
 
-    DDRIOB          = 0xb40 / 4,
+REG32(DDRIOB, 0xb40)
 #define DDRIOB_LENGTH 14
-};
 
 #define ZYNQ_SLCR_MMIO_SIZE     0x1000
 #define ZYNQ_SLCR_NUM_REGS      (ZYNQ_SLCR_MMIO_SIZE / 4)
@@ -XXX,XX +XXX,XX @@ static void zynq_slcr_reset(DeviceState *d)
 
     DB_PRINT("RESET\n");
 
-    s->regs[LOCKSTA] = 1;
+    s->regs[R_LOCKSTA] = 1;
     /* 0x100 - 0x11C */
-    s->regs[ARM_PLL_CTRL]   = 0x0001A008;
-    s->regs[DDR_PLL_CTRL]   = 0x0001A008;
-    s->regs[IO_PLL_CTRL]    = 0x0001A008;
-    s->regs[PLL_STATUS]     = 0x0000003F;
-    s->regs[ARM_PLL_CFG]    = 0x00014000;
-    s->regs[DDR_PLL_CFG]    = 0x00014000;
-    s->regs[IO_PLL_CFG]     = 0x00014000;
+    s->regs[R_ARM_PLL_CTRL]   = 0x0001A008;
+    s->regs[R_DDR_PLL_CTRL]   = 0x0001A008;
+    s->regs[R_IO_PLL_CTRL]    = 0x0001A008;
+    s->regs[R_PLL_STATUS]     = 0x0000003F;
+    s->regs[R_ARM_PLL_CFG]    = 0x00014000;
+    s->regs[R_DDR_PLL_CFG]    = 0x00014000;
+    s->regs[R_IO_PLL_CFG]     = 0x00014000;
 
     /* 0x120 - 0x16C */
-    s->regs[ARM_CLK_CTRL]   = 0x1F000400;
-    s->regs[DDR_CLK_CTRL]   = 0x18400003;
-    s->regs[DCI_CLK_CTRL]   = 0x01E03201;
-    s->regs[APER_CLK_CTRL]  = 0x01FFCCCD;
-    s->regs[USB0_CLK_CTRL]  = s->regs[USB1_CLK_CTRL]    = 0x00101941;
-    s->regs[GEM0_RCLK_CTRL] = s->regs[GEM1_RCLK_CTRL]   = 0x00000001;
-    s->regs[GEM0_CLK_CTRL]  = s->regs[GEM1_CLK_CTRL]    = 0x00003C01;
-    s->regs[SMC_CLK_CTRL]   = 0x00003C01;
-    s->regs[LQSPI_CLK_CTRL] = 0x00002821;
-    s->regs[SDIO_CLK_CTRL]  = 0x00001E03;
-    s->regs[UART_CLK_CTRL]  = 0x00003F03;
-    s->regs[SPI_CLK_CTRL]   = 0x00003F03;
-    s->regs[CAN_CLK_CTRL]   = 0x00501903;
-    s->regs[DBG_CLK_CTRL]   = 0x00000F03;
-    s->regs[PCAP_CLK_CTRL]  = 0x00000F01;
+    s->regs[R_ARM_CLK_CTRL]   = 0x1F000400;
+    s->regs[R_DDR_CLK_CTRL]   = 0x18400003;
+    s->regs[R_DCI_CLK_CTRL]   = 0x01E03201;
+    s->regs[R_APER_CLK_CTRL]  = 0x01FFCCCD;
+    s->regs[R_USB0_CLK_CTRL]  = s->regs[R_USB1_CLK_CTRL]  = 0x00101941;
+    s->regs[R_GEM0_RCLK_CTRL] = s->regs[R_GEM1_RCLK_CTRL] = 0x00000001;
+    s->regs[R_GEM0_CLK_CTRL]  = s->regs[R_GEM1_CLK_CTRL]  = 0x00003C01;
+    s->regs[R_SMC_CLK_CTRL]   = 0x00003C01;
+    s->regs[R_LQSPI_CLK_CTRL] = 0x00002821;
+    s->regs[R_SDIO_CLK_CTRL]  = 0x00001E03;
+    s->regs[R_UART_CLK_CTRL]  = 0x00003F03;
+    s->regs[R_SPI_CLK_CTRL]   = 0x00003F03;
+    s->regs[R_CAN_CLK_CTRL]   = 0x00501903;
+    s->regs[R_DBG_CLK_CTRL]   = 0x00000F03;
+    s->regs[R_PCAP_CLK_CTRL]  = 0x00000F01;
 
     /* 0x170 - 0x1AC */
-    s->regs[FPGA0_CLK_CTRL] = s->regs[FPGA1_CLK_CTRL] = s->regs[FPGA2_CLK_CTRL]
-                            = s->regs[FPGA3_CLK_CTRL] = 0x00101800;
-    s->regs[FPGA0_THR_STA] = s->regs[FPGA1_THR_STA] = s->regs[FPGA2_THR_STA]
-                           = s->regs[FPGA3_THR_STA] = 0x00010000;
+    s->regs[R_FPGA0_CLK_CTRL] = s->regs[R_FPGA1_CLK_CTRL]
+                              = s->regs[R_FPGA2_CLK_CTRL]
+                              = s->regs[R_FPGA3_CLK_CTRL] = 0x00101800;
+    s->regs[R_FPGA0_THR_STA] = s->regs[R_FPGA1_THR_STA]
+                             = s->regs[R_FPGA2_THR_STA]
+                             = s->regs[R_FPGA3_THR_STA] = 0x00010000;
 
     /* 0x1B0 - 0x1D8 */
-    s->regs[BANDGAP_TRIP]   = 0x0000001F;
-    s->regs[PLL_PREDIVISOR] = 0x00000001;
-    s->regs[CLK_621_TRUE]   = 0x00000001;
+    s->regs[R_BANDGAP_TRIP]   = 0x0000001F;
+    s->regs[R_PLL_PREDIVISOR] = 0x00000001;
+    s->regs[R_CLK_621_TRUE]   = 0x00000001;
 
     /* 0x200 - 0x25C */
-    s->regs[FPGA_RST_CTRL]  = 0x01F33F0F;
-    s->regs[RST_REASON]     = 0x00000040;
+    s->regs[R_FPGA_RST_CTRL]  = 0x01F33F0F;
+    s->regs[R_RST_REASON]     = 0x00000040;
 
-    s->regs[BOOT_MODE]      = 0x00000001;
+    s->regs[R_BOOT_MODE]      = 0x00000001;
 
     /* 0x700 - 0x7D4 */
     for (i = 0; i < 54; i++) {
-        s->regs[MIO + i] = 0x00001601;
+        s->regs[R_MIO + i] = 0x00001601;
     }
     for (i = 2; i <= 8; i++) {
-        s->regs[MIO + i] = 0x00000601;
+        s->regs[R_MIO + i] = 0x00000601;
     }
 
-    s->regs[MIO_MST_TRI0] = s->regs[MIO_MST_TRI1] = 0xFFFFFFFF;
+    s->regs[R_MIO_MST_TRI0] = s->regs[R_MIO_MST_TRI1] = 0xFFFFFFFF;
 
-    s->regs[CPU_RAM + 0] = s->regs[CPU_RAM + 1] = s->regs[CPU_RAM + 3]
-                         = s->regs[CPU_RAM + 4] = s->regs[CPU_RAM + 7]
-                         = 0x00010101;
-    s->regs[CPU_RAM + 2] = s->regs[CPU_RAM + 5] = 0x01010101;
-    s->regs[CPU_RAM + 6] = 0x00000001;
+    s->regs[R_CPU_RAM + 0] = s->regs[R_CPU_RAM + 1] = s->regs[R_CPU_RAM + 3]
+                           = s->regs[R_CPU_RAM + 4] = s->regs[R_CPU_RAM + 7]
+                           = 0x00010101;
+    s->regs[R_CPU_RAM + 2] = s->regs[R_CPU_RAM + 5] = 0x01010101;
+    s->regs[R_CPU_RAM + 6] = 0x00000001;
 
-    s->regs[IOU + 0] = s->regs[IOU + 1] = s->regs[IOU + 2] = s->regs[IOU + 3]
-                     = 0x09090909;
-    s->regs[IOU + 4] = s->regs[IOU + 5] = 0x00090909;
-    s->regs[IOU + 6] = 0x00000909;
+    s->regs[R_IOU + 0] = s->regs[R_IOU + 1] = s->regs[R_IOU + 2]
+                       = s->regs[R_IOU + 3] = 0x09090909;
+    s->regs[R_IOU + 4] = s->regs[R_IOU + 5] = 0x00090909;
+    s->regs[R_IOU + 6] = 0x00000909;
 
-    s->regs[DMAC_RAM] = 0x00000009;
+    s->regs[R_DMAC_RAM] = 0x00000009;
 
-    s->regs[AFI0 + 0] = s->regs[AFI0 + 1] = 0x09090909;
-    s->regs[AFI1 + 0] = s->regs[AFI1 + 1] = 0x09090909;
-    s->regs[AFI2 + 0] = s->regs[AFI2 + 1] = 0x09090909;
-    s->regs[AFI3 + 0] = s->regs[AFI3 + 1] = 0x09090909;
-    s->regs[AFI0 + 2] = s->regs[AFI1 + 2] = s->regs[AFI2 + 2]
-                      = s->regs[AFI3 + 2] = 0x00000909;
+    s->regs[R_AFI0 + 0] = s->regs[R_AFI0 + 1] = 0x09090909;
+    s->regs[R_AFI1 + 0] = s->regs[R_AFI1 + 1] = 0x09090909;
+    s->regs[R_AFI2 + 0] = s->regs[R_AFI2 + 1] = 0x09090909;
+    s->regs[R_AFI3 + 0] = s->regs[R_AFI3 + 1] = 0x09090909;
+    s->regs[R_AFI0 + 2] = s->regs[R_AFI1 + 2] = s->regs[R_AFI2 + 2]
+                        = s->regs[R_AFI3 + 2] = 0x00000909;
 
-    s->regs[OCM + 0]    = 0x01010101;
-    s->regs[OCM + 1]    = s->regs[OCM + 2] = 0x09090909;
+    s->regs[R_OCM + 0] = 0x01010101;
+    s->regs[R_OCM + 1] = s->regs[R_OCM + 2] = 0x09090909;
 
-    s->regs[DEVCI_RAM]  = 0x00000909;
-    s->regs[CSG_RAM]    = 0x00000001;
+    s->regs[R_DEVCI_RAM] = 0x00000909;
+    s->regs[R_CSG_RAM]   = 0x00000001;
 
-    s->regs[DDRIOB + 0] = s->regs[DDRIOB + 1] = s->regs[DDRIOB + 2]
-                        = s->regs[DDRIOB + 3] = 0x00000e00;
-    s->regs[DDRIOB + 4] = s->regs[DDRIOB + 5] = s->regs[DDRIOB + 6]
-                        = 0x00000e00;
-    s->regs[DDRIOB + 12] = 0x00000021;
+    s->regs[R_DDRIOB + 0] = s->regs[R_DDRIOB + 1] = s->regs[R_DDRIOB + 2]
+                          = s->regs[R_DDRIOB + 3] = 0x00000e00;
+    s->regs[R_DDRIOB + 4] = s->regs[R_DDRIOB + 5] = s->regs[R_DDRIOB + 6]
+                          = 0x00000e00;
+    s->regs[R_DDRIOB + 12] = 0x00000021;
 }
 
 
 static bool zynq_slcr_check_offset(hwaddr offset, bool rnw)
 {
     switch (offset) {
-    case LOCK:
-    case UNLOCK:
-    case DDR_CAL_START:
-    case DDR_REF_START:
+    case R_LOCK:
+    case R_UNLOCK:
+    case R_DDR_CAL_START:
+    case R_DDR_REF_START:
         return !rnw; /* Write only */
-    case LOCKSTA:
-    case FPGA0_THR_STA:
-    case FPGA1_THR_STA:
-    case FPGA2_THR_STA:
-    case FPGA3_THR_STA:
-    case BOOT_MODE:
-    case PSS_IDCODE:
-    case DDR_CMD_STA:
-    case DDR_DFI_STATUS:
-    case PLL_STATUS:
+    case R_LOCKSTA:
+    case R_FPGA0_THR_STA:
+    case R_FPGA1_THR_STA:
+    case R_FPGA2_THR_STA:
+    case R_FPGA3_THR_STA:
+    case R_BOOT_MODE:
+    case R_PSS_IDCODE:
+    case R_DDR_CMD_STA:
+    case R_DDR_DFI_STATUS:
+    case R_PLL_STATUS:
         return rnw;/* read only */
-    case SCL:
-    case ARM_PLL_CTRL ... IO_PLL_CTRL:
-    case ARM_PLL_CFG ... IO_PLL_CFG:
-    case ARM_CLK_CTRL ... TOPSW_CLK_CTRL:
-    case FPGA0_CLK_CTRL ... FPGA0_THR_CNT:
-    case FPGA1_CLK_CTRL ... FPGA1_THR_CNT:
-    case FPGA2_CLK_CTRL ... FPGA2_THR_CNT:
-    case FPGA3_CLK_CTRL ... FPGA3_THR_CNT:
-    case BANDGAP_TRIP:
-    case PLL_PREDIVISOR:
-    case CLK_621_TRUE:
-    case PSS_RST_CTRL ... A9_CPU_RST_CTRL:
-    case RS_AWDT_CTRL:
-    case RST_REASON:
-    case REBOOT_STATUS:
-    case APU_CTRL:
-    case WDT_CLK_SEL:
-    case TZ_DMA_NS ... TZ_DMA_PERIPH_NS:
-    case DDR_URGENT:
-    case DDR_URGENT_SEL:
-    case MIO ... MIO + MIO_LENGTH - 1:
-    case MIO_LOOPBACK ... MIO_MST_TRI1:
-    case SD0_WP_CD_SEL:
-    case SD1_WP_CD_SEL:
-    case LVL_SHFTR_EN:
-    case OCM_CFG:
-    case CPU_RAM:
-    case IOU:
-    case DMAC_RAM:
-    case AFI0 ... AFI3 + AFI_LENGTH - 1:
-    case OCM:
-    case DEVCI_RAM:
-    case CSG_RAM:
-    case GPIOB_CTRL ... GPIOB_CFG_CMOS33:
-    case GPIOB_CFG_HSTL:
-    case GPIOB_DRVR_BIAS_CTRL:
-    case DDRIOB ... DDRIOB + DDRIOB_LENGTH - 1:
+    case R_SCL:
+    case R_ARM_PLL_CTRL ... R_IO_PLL_CTRL:
+    case R_ARM_PLL_CFG ... R_IO_PLL_CFG:
+    case R_ARM_CLK_CTRL ... R_TOPSW_CLK_CTRL:
+    case R_FPGA0_CLK_CTRL ... R_FPGA0_THR_CNT:
+    case R_FPGA1_CLK_CTRL ... R_FPGA1_THR_CNT:
+    case R_FPGA2_CLK_CTRL ... R_FPGA2_THR_CNT:
+    case R_FPGA3_CLK_CTRL ... R_FPGA3_THR_CNT:
+    case R_BANDGAP_TRIP:
+    case R_PLL_PREDIVISOR:
+    case R_CLK_621_TRUE:
+    case R_PSS_RST_CTRL ... R_A9_CPU_RST_CTRL:
+    case R_RS_AWDT_CTRL:
+    case R_RST_REASON:
+    case R_REBOOT_STATUS:
+    case R_APU_CTRL:
+    case R_WDT_CLK_SEL:
+    case R_TZ_DMA_NS ... R_TZ_DMA_PERIPH_NS:
+    case R_DDR_URGENT:
+    case R_DDR_URGENT_SEL:
+    case R_MIO ... R_MIO + MIO_LENGTH - 1:
+    case R_MIO_LOOPBACK ... R_MIO_MST_TRI1:
+    case R_SD0_WP_CD_SEL:
+    case R_SD1_WP_CD_SEL:
+    case R_LVL_SHFTR_EN:
+    case R_OCM_CFG:
+    case R_CPU_RAM:
+    case R_IOU:
+    case R_DMAC_RAM:
+    case R_AFI0 ... R_AFI3 + AFI_LENGTH - 1:
+    case R_OCM:
+    case R_DEVCI_RAM:
+    case R_CSG_RAM:
+    case R_GPIOB_CTRL ... R_GPIOB_CFG_CMOS33:
+    case R_GPIOB_CFG_HSTL:
+    case R_GPIOB_DRVR_BIAS_CTRL:
+    case R_DDRIOB ... R_DDRIOB + DDRIOB_LENGTH - 1:
         return true;
     default:
         return false;
@@ -XXX,XX +XXX,XX @@ static void zynq_slcr_write(void *opaque, hwaddr offset,
     }
 
     switch (offset) {
-    case SCL:
-        s->regs[SCL] = val & 0x1;
+    case R_SCL:
+        s->regs[R_SCL] = val & 0x1;
         return;
-    case LOCK:
+    case R_LOCK:
         if ((val & 0xFFFF) == XILINX_LOCK_KEY) {
             DB_PRINT("XILINX LOCK 0xF8000000 + 0x%x <= 0x%x\n", (int)offset,
                 (unsigned)val & 0xFFFF);
-            s->regs[LOCKSTA] = 1;
+            s->regs[R_LOCKSTA] = 1;
         } else {
             DB_PRINT("WRONG XILINX LOCK KEY 0xF8000000 + 0x%x <= 0x%x\n",
                 (int)offset, (unsigned)val & 0xFFFF);
         }
         return;
-    case UNLOCK:
+    case R_UNLOCK:
         if ((val & 0xFFFF) == XILINX_UNLOCK_KEY) {
             DB_PRINT("XILINX UNLOCK 0xF8000000 + 0x%x <= 0x%x\n", (int)offset,
                 (unsigned)val & 0xFFFF);
-            s->regs[LOCKSTA] = 0;
+            s->regs[R_LOCKSTA] = 0;
         } else {
             DB_PRINT("WRONG XILINX UNLOCK KEY 0xF8000000 + 0x%x <= 0x%x\n",
                 (int)offset, (unsigned)val & 0xFFFF);
@@ -XXX,XX +XXX,XX @@ static void zynq_slcr_write(void *opaque, hwaddr offset,
         return;
     }
 
-    if (s->regs[LOCKSTA]) {
+    if (s->regs[R_LOCKSTA]) {
         qemu_log_mask(LOG_GUEST_ERROR,
                       "SCLR registers are locked. Unlock them first\n");
         return;
@@ -XXX,XX +XXX,XX @@ static void zynq_slcr_write(void *opaque, hwaddr offset,
     s->regs[offset] = val;
 
     switch (offset) {
-    case PSS_RST_CTRL:
-        if (val & R_PSS_RST_CTRL_SOFT_RST) {
+    case R_PSS_RST_CTRL:
+        if (FIELD_EX32(val, PSS_RST_CTRL, SOFT_RST)) {
             qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET);
         }
         break;
-- 
2.20.1

From: Aaron Hill <aa1ronham@gmail.com>

This commit properly sets the ENET_BD_BDU flag once the emulated FEC controller
has finished processing the last descriptor. This is done for both transmit
and receive descriptors.

This allows the QNX 7.0.0 BSP for the Sabrelite board (which can be
found at http://blackberry.qnx.com/en/developers/bsp) to properly
control the FEC. Without this patch, the BSP ethernet driver will never
re-use FEC descriptors, as the unset ENET_BD_BDU flag will cause
it to believe that the descriptors are still in use by the NIC.

Note that Linux does not appear to use this field at all, and is
unaffected by this patch.

Without this patch, QNX will think that the NIC is still processing its
transaction descriptors, and won't send any more data over the network.

For reference:

On page 1192 of the I.MX 6DQ reference manual (Rev. 5, 06/2018),
which can be found at https://www.nxp.com/products/processors-and-microcontrollers/arm-based-processors-and-mcus/i.mx-applications-processors/i.mx-6-processors/i.mx-6quad-processors-high-performance-3d-graphics-hd-video-arm-cortex-a9-core:i.MX6Q?&tab=Documentation_Tab&linkline=Application-Note

the 'BDU' field is described as follows for the 'Enhanced transmit
buffer descriptor':

'Last buffer descriptor update done. Indicates that the last BD data has been updated by
uDMA. This field is written by the user (=0) and uDMA (=1).'

The same description is used for the receive buffer descriptor.
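
The handshake described above can be sketched as follows. This is only
an illustration of the protocol as quoted from the manual: the struct,
the function names, and the assumption that BDU is bit 31 of the word
are all hypothetical, and the guest-side logic is a guess at what a
driver such as the QNX one might do, not its actual code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit position of BDU, chosen for this sketch. */
#define SKETCH_BD_BDU (UINT32_C(1) << 31)

/* Hypothetical, reduced descriptor: only the word holding BDU. */
struct sketch_bd {
    uint32_t last_buffer;
};

/* Guest driver hands a descriptor to the NIC: the user writes BDU = 0. */
static void driver_submit(struct sketch_bd *bd)
{
    bd->last_buffer = 0;
}

/* NIC (uDMA) finishes updating the descriptor: it writes BDU = 1. */
static void nic_complete(struct sketch_bd *bd)
{
    bd->last_buffer = SKETCH_BD_BDU;
}

/* A driver that honours BDU reuses a descriptor only once the bit is set. */
static int driver_can_reuse(const struct sketch_bd *bd)
{
    return (bd->last_buffer & SKETCH_BD_BDU) != 0;
}
```

If the emulated controller never performs the nic_complete() step, a
driver written this way keeps seeing BDU = 0 and never reuses the
descriptor, which matches the stall described above.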

Signed-off-by: Aaron Hill <aa1ronham@gmail.com>
Message-id: 20190805142417.10433-1-aaron.hill@alertinnovation.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/net/imx_fec.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/hw/net/imx_fec.c b/hw/net/imx_fec.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/net/imx_fec.c
+++ b/hw/net/imx_fec.c
@@ -XXX,XX +XXX,XX @@ static void imx_enet_do_tx(IMXFECState *s, uint32_t index)
             if (bd.option & ENET_BD_TX_INT) {
                 s->regs[ENET_EIR] |= int_txf;
             }
+            /* Indicate that we've updated the last buffer descriptor. */
+            bd.last_buffer = ENET_BD_BDU;
         }
         if (bd.option & ENET_BD_TX_INT) {
             s->regs[ENET_EIR] |= int_txb;
@@ -XXX,XX +XXX,XX @@ static ssize_t imx_enet_receive(NetClientState *nc, const uint8_t *buf,
             /* Last buffer in frame.  */
             bd.flags |= flags | ENET_BD_L;
             FEC_PRINTF("rx frame flags %04x\n", bd.flags);
+            /* Indicate that we've updated the last buffer descriptor. */
+            bd.last_buffer = ENET_BD_BDU;
             if (bd.option & ENET_BD_RX_INT) {
                 s->regs[ENET_EIR] |= ENET_INT_RXF;
             }
-- 
2.20.1

Factor out code to 'generate a singlestep exception', which is
currently repeated in four places.

To do this we need to also pull the identical copies of the
gen_exception() function out of translate-a64.c and translate.c
into translate.h.

(There is a bug in the code: we're taking the exception to the wrong
target EL.  This will be simpler to fix if there's only one place to
do it.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20190805130952.4415-2-peter.maydell@linaro.org
---
 target/arm/translate.h     | 23 +++++++++++++++++++++++
 target/arm/translate-a64.c | 19 ++-----------------
 target/arm/translate.c     | 20 ++------------------
 3 files changed, 27 insertions(+), 35 deletions(-)

diff --git a/target/arm/translate.h b/target/arm/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@
 #define TARGET_ARM_TRANSLATE_H
 
 #include "exec/translator.h"
+#include "internals.h"
 
 
 /* internal defines */
@@ -XXX,XX +XXX,XX @@ static inline void gen_ss_advance(DisasContext *s)
     }
 }
 
+static inline void gen_exception(int excp, uint32_t syndrome,
+                                 uint32_t target_el)
+{
+    TCGv_i32 tcg_excp = tcg_const_i32(excp);
+    TCGv_i32 tcg_syn = tcg_const_i32(syndrome);
+    TCGv_i32 tcg_el = tcg_const_i32(target_el);
+
+    gen_helper_exception_with_syndrome(cpu_env, tcg_excp,
+                                       tcg_syn, tcg_el);
+
+    tcg_temp_free_i32(tcg_el);
+    tcg_temp_free_i32(tcg_syn);
+    tcg_temp_free_i32(tcg_excp);
+}
+
+/* Generate an architectural singlestep exception */
+static inline void gen_swstep_exception(DisasContext *s, int isv, int ex)
+{
+    gen_exception(EXCP_UDEF, syn_swstep(s->ss_same_el, isv, ex),
+                  default_exception_el(s));
+}
+
 /*
  * Given a VFP floating point constant encoded into an 8 bit immediate in an
  * instruction, expand it to the actual constant value of the specified
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void gen_exception_internal(int excp)
     tcg_temp_free_i32(tcg_excp);
 }
 
-static void gen_exception(int excp, uint32_t syndrome, uint32_t target_el)
-{
-    TCGv_i32 tcg_excp = tcg_const_i32(excp);
-    TCGv_i32 tcg_syn = tcg_const_i32(syndrome);
-    TCGv_i32 tcg_el = tcg_const_i32(target_el);
-
-    gen_helper_exception_with_syndrome(cpu_env, tcg_excp,
-                                       tcg_syn, tcg_el);
-    tcg_temp_free_i32(tcg_el);
-    tcg_temp_free_i32(tcg_syn);
-    tcg_temp_free_i32(tcg_excp);
-}
-
 static void gen_exception_internal_insn(DisasContext *s, int offset, int excp)
 {
     gen_a64_set_pc_im(s->pc - offset);
@@ -XXX,XX +XXX,XX @@ static void gen_step_complete_exception(DisasContext *s)
      * of the exception, and our syndrome information is always correct.
      */
     gen_ss_advance(s);
-    gen_exception(EXCP_UDEF, syn_swstep(s->ss_same_el, 1, s->is_ldex),
-                  default_exception_el(s));
+    gen_swstep_exception(s, 1, s->is_ldex);
     s->base.is_jmp = DISAS_NORETURN;
 }
 
@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
          * bits should be zero.
          */
         assert(dc->base.num_insns == 1);
-        gen_exception(EXCP_UDEF, syn_swstep(dc->ss_same_el, 0, 0),
-                      default_exception_el(dc));
+        gen_swstep_exception(dc, 0, 0);
         dc->base.is_jmp = DISAS_NORETURN;
     } else {
         disas_a64_insn(env, dc);
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_exception_internal(int excp)
     tcg_temp_free_i32(tcg_excp);
 }
 
-static void gen_exception(int excp, uint32_t syndrome, uint32_t target_el)
-{
-    TCGv_i32 tcg_excp = tcg_const_i32(excp);
-    TCGv_i32 tcg_syn = tcg_const_i32(syndrome);
-    TCGv_i32 tcg_el = tcg_const_i32(target_el);
-
-    gen_helper_exception_with_syndrome(cpu_env, tcg_excp,
-                                       tcg_syn, tcg_el);
-
-    tcg_temp_free_i32(tcg_el);
-    tcg_temp_free_i32(tcg_syn);
-    tcg_temp_free_i32(tcg_excp);
-}
-
 static void gen_step_complete_exception(DisasContext *s)
 {
     /* We just completed step of an insn. Move from Active-not-pending
@@ -XXX,XX +XXX,XX @@ static void gen_step_complete_exception(DisasContext *s)
      * of the exception, and our syndrome information is always correct.
      */
     gen_ss_advance(s);
-    gen_exception(EXCP_UDEF, syn_swstep(s->ss_same_el, 1, s->is_ldex),
-                  default_exception_el(s));
+    gen_swstep_exception(s, 1, s->is_ldex);
     s->base.is_jmp = DISAS_NORETURN;
 }
 
@@ -XXX,XX +XXX,XX @@ static bool arm_pre_translate_insn(DisasContext *dc)
          * bits should be zero.
          */
         assert(dc->base.num_insns == 1);
-        gen_exception(EXCP_UDEF, syn_swstep(dc->ss_same_el, 0, 0),
-                      default_exception_el(dc));
+        gen_swstep_exception(dc, 0, 0);
         dc->base.is_jmp = DISAS_NORETURN;
         return true;
     }
-- 
2.20.1

When generating an architectural single-step exception we were
routing it to the "default exception level", which is to say
the same exception level we execute at except that EL0 exceptions
go to EL1. This is incorrect because the debug exception level
can be configured by the guest for situations such as single
stepping of EL0 and EL1 code by EL2.

We have to track the target debug exception level in the TB
flags, because it is dependent on CPU state like HCR_EL2.TGE
and MDCR_EL2.TDE. (That we were previously calling the
arm_debug_target_el() function to determine dc->ss_same_el
is itself a bug, though one that would only have manifested
as incorrect syndrome information.) Since we are out of TB
flag bits unless we want to expand into the cs_base field,
we share some bits with the M-profile only HANDLER and
STACKCHECK bits, since only A-profile has this singlestep.
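
As an illustration of the bit sharing, here is a minimal sketch of
packing the debug target EL into a flags word and reading it back at
translate time. The position (bit 21, length 2) matches the FIELD()
added in the diff below; the helper names are hypothetical and stand
in for FIELD_DP32/FIELD_EX32 and the logic in gen_swstep_exception().

```c
#include <assert.h>
#include <stdint.h>

#define DEBUG_TARGET_EL_SHIFT  21
#define DEBUG_TARGET_EL_LENGTH 2
#define DEBUG_TARGET_EL_MASK \
    (((UINT32_C(1) << DEBUG_TARGET_EL_LENGTH) - 1) << DEBUG_TARGET_EL_SHIFT)

/* Store the target EL (0..3) in the flags word, leaving other bits alone. */
static uint32_t pack_debug_target_el(uint32_t flags, unsigned target_el)
{
    return (flags & ~DEBUG_TARGET_EL_MASK) |
           (((uint32_t)target_el << DEBUG_TARGET_EL_SHIFT) &
            DEBUG_TARGET_EL_MASK);
}

/* Recover the target EL from the flags word. */
static unsigned unpack_debug_target_el(uint32_t flags)
{
    return (flags & DEBUG_TARGET_EL_MASK) >> DEBUG_TARGET_EL_SHIFT;
}

/* Derive the "same EL" syndrome bit instead of caching it separately. */
static int swstep_same_el(uint32_t flags, unsigned current_el)
{
    unsigned target_el = unpack_debug_target_el(flags);

    /* Singlestep cannot be active when it targets a lower EL. */
    assert(target_el >= current_el);
    return target_el == current_el;
}
```

Because the translator only sees the flags word, two CPU states that
differ in debug target EL end up with different flags and therefore
different translated code, which is the reason for spending flag bits
on it.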

Fixes: https://bugs.launchpad.net/qemu/+bug/1838913
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20190805130952.4415-3-peter.maydell@linaro.org
---
 target/arm/cpu.h           |  5 +++++
 target/arm/translate.h     | 15 +++++++++++----
 target/arm/helper.c        |  6 ++++++
 target/arm/translate-a64.c |  2 +-
 target/arm/translate.c     |  4 +++-
 5 files changed, 26 insertions(+), 6 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ FIELD(TBFLAG_ANY, PSTATE_SS, 26, 1)
 /* Target EL if we take a floating-point-disabled exception */
 FIELD(TBFLAG_ANY, FPEXC_EL, 24, 2)
 FIELD(TBFLAG_ANY, BE_DATA, 23, 1)
+/*
+ * For A-profile only, target EL for debug exceptions.
+ * Note that this overlaps with the M-profile-only HANDLER and STACKCHECK bits.
+ */
+FIELD(TBFLAG_ANY, DEBUG_TARGET_EL, 21, 2)
 
 /* Bit usage when in AArch32 state: */
 FIELD(TBFLAG_A32, THUMB, 0, 1)
diff --git a/target/arm/translate.h b/target/arm/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ typedef struct DisasContext {
     uint32_t svc_imm;
     int aarch64;
     int current_el;
+    /* Debug target exception level for single-step exceptions */
+    int debug_target_el;
     GHashTable *cp_regs;
     uint64_t features; /* CPU features bits */
     /* Because unallocated encodings generate different exception syndrome
@@ -XXX,XX +XXX,XX @@ typedef struct DisasContext {
      * ie A64 LDX*, LDAX*, A32/T32 LDREX*, LDAEX*.
      */
     bool is_ldex;
-    /* True if a single-step exception will be taken to the current EL */
-    bool ss_same_el;
     /* True if v8.3-PAuth is active.  */
     bool pauth_active;
     /* True with v8.5-BTI and SCTLR_ELx.BT* set.  */
@@ -XXX,XX +XXX,XX @@ static inline void gen_exception(int excp, uint32_t syndrome,
 /* Generate an architectural singlestep exception */
 static inline void gen_swstep_exception(DisasContext *s, int isv, int ex)
 {
-    gen_exception(EXCP_UDEF, syn_swstep(s->ss_same_el, isv, ex),
-                  default_exception_el(s));
+    bool same_el = (s->debug_target_el == s->current_el);
+
+    /*
+     * If singlestep is targeting a lower EL than the current one,
+     * then s->ss_active must be false and we can never get here.
+     */
+    assert(s->debug_target_el >= s->current_el);
+
+    gen_exception(EXCP_UDEF, syn_swstep(same_el, isv, ex), s->debug_target_el);
 }
 
 /*
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
         }
     }
 
+    if (!arm_feature(env, ARM_FEATURE_M)) {
+        int target_el = arm_debug_target_el(env);
+
+        flags = FIELD_DP32(flags, TBFLAG_ANY, DEBUG_TARGET_EL, target_el);
+    }
+
     *pflags = flags;
     *cs_base = 0;
 }
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_init_disas_context(DisasContextBase *dcbase,
     dc->ss_active = FIELD_EX32(tb_flags, TBFLAG_ANY, SS_ACTIVE);
     dc->pstate_ss = FIELD_EX32(tb_flags, TBFLAG_ANY, PSTATE_SS);
     dc->is_ldex = false;
-    dc->ss_same_el = (arm_debug_target_el(env) == dc->current_el);
+    dc->debug_target_el = FIELD_EX32(tb_flags, TBFLAG_ANY, DEBUG_TARGET_EL);
 
     /* Bound the number of insns to execute to those left on the page.  */
     bound = -(dc->base.pc_first | TARGET_PAGE_MASK) / 4;
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void arm_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
     dc->ss_active = FIELD_EX32(tb_flags, TBFLAG_ANY, SS_ACTIVE);
     dc->pstate_ss = FIELD_EX32(tb_flags, TBFLAG_ANY, PSTATE_SS);
     dc->is_ldex = false;
-    dc->ss_same_el = false; /* Can't be true since EL_d must be AArch64 */
+    if (!arm_feature(env, ARM_FEATURE_M)) {
+        dc->debug_target_el = FIELD_EX32(tb_flags, TBFLAG_ANY, DEBUG_TARGET_EL);
+    }
 
     dc->page_start = dc->base.pc_first & TARGET_PAGE_MASK;
 
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

This function is used in two different contexts, and it will be
clearer if the function is given the address to which it applies.

diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
     }
 }
 
-static bool thumb_insn_is_16bit(DisasContext *s, uint32_t insn)
+static bool thumb_insn_is_16bit(DisasContext *s, uint32_t pc, uint32_t insn)
 {
-    /* Return true if this is a 16 bit instruction. We must be precise
-     * about this (matching the decode).  We assume that s->pc still
-     * points to the first 16 bits of the insn.
+    /*
+     * Return true if this is a 16 bit instruction. We must be precise
+     * about this (matching the decode).
      */
     if ((insn >> 11) < 0x1d) {
         /* Definitely a 16-bit instruction */
@@ -XXX,XX +XXX,XX @@ static bool thumb_insn_is_16bit(DisasContext *s, uint32_t insn)
         return false;
     }
 
-    if ((insn >> 11) == 0x1e && s->pc - s->page_start < TARGET_PAGE_SIZE - 3) {
+    if ((insn >> 11) == 0x1e && pc - s->page_start < TARGET_PAGE_SIZE - 3) {
         /* 0b1111_0xxx_xxxx_xxxx : BL/BLX prefix, and the suffix
          * is not on the next page; we merge this into a 32-bit
          * insn.
@@ -XXX,XX +XXX,XX @@ static bool insn_crosses_page(CPUARMState *env, DisasContext *s)
      */
     uint16_t insn = arm_lduw_code(env, s->pc, s->sctlr_b);
 
-    return !thumb_insn_is_16bit(s, insn);
+    return !thumb_insn_is_16bit(s, s->pc, insn);
 }
 
 static void arm_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
@@ -XXX,XX +XXX,XX @@ static void thumb_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
     }
 
     insn = arm_lduw_code(env, dc->pc, dc->sctlr_b);
-    is_16bit = thumb_insn_is_16bit(dc, insn);
+    is_16bit = thumb_insn_is_16bit(dc, dc->pc, insn);
     dc->pc += 2;
     if (!is_16bit) {
         uint32_t insn2 = arm_lduw_code(env, dc->pc, dc->sctlr_b);
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

Add a new field to retain the address of the instruction currently
being translated.  The 32-bit uses are all within subroutines used
by a32 and t32.  This will become less obvious when t16 support is
merged with a32+t32, and having a clear definition will help.

Convert aarch64 as well for consistency.  Note that there is one
instance of a pre-assert fprintf that used the wrong value for the
address of the current instruction.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190807045335.1361-3-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
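Note (illustration only, not part of the patch): the sketch below
shows the idea under simplified assumptions. ToyContext,
toy_begin_insn() and toy_branch_target() are hypothetical stand-ins,
not QEMU code. The address is latched into pc_curr before pc is
advanced, so later users no longer need to undo the increment with
"s->pc - 4".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-in for the two DisasContext fields. */
typedef struct {
    uint64_t pc;      /* address of the next insn to fetch */
    uint64_t pc_curr; /* address of the insn being translated */
} ToyContext;

/* Latch the current address, then advance past the insn. */
static void toy_begin_insn(ToyContext *s, unsigned insn_len)
{
    assert(insn_len == 2 || insn_len == 4);
    s->pc_curr = s->pc;
    s->pc += insn_len;
}

/* A64-style branch target: the offset is relative to the insn itself. */
static uint64_t toy_branch_target(const ToyContext *s, int64_t imm)
{
    return s->pc_curr + (uint64_t)(imm * 4);
}
```

With this shape, "pc_curr + imm * 4" and the old "pc + imm * 4 - 4"
give the same target for a 4-byte insn, but only the first form
stays correct if the insn length ever differs.
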
 target/arm/translate-a64.h |  2 +-
 target/arm/translate.h     |  2 ++
 target/arm/translate-a64.c | 21 +++++++++++----------
 target/arm/translate.c     | 14 ++++++++------
 4 files changed, 22 insertions(+), 17 deletions(-)

diff --git a/target/arm/translate-a64.h b/target/arm/translate-a64.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.h
+++ b/target/arm/translate-a64.h
@@ -XXX,XX +XXX,XX @@ void unallocated_encoding(DisasContext *s);
         qemu_log_mask(LOG_UNIMP,                                         \
                       "%s:%d: unsupported instruction encoding 0x%08x "  \
                       "at pc=%016" PRIx64 "\n",                          \
-                      __FILE__, __LINE__, insn, s->pc - 4);              \
+                      __FILE__, __LINE__, insn, s->pc_curr);             \
         unallocated_encoding(s);                                         \
     } while (0)
 
diff --git a/target/arm/translate.h b/target/arm/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ typedef struct DisasContext {
     const ARMISARegisters *isar;
 
     target_ulong pc;
+    /* The address of the current instruction being translated. */
+    target_ulong pc_curr;
     target_ulong page_start;
     uint32_t insn;
     /* Nonzero if this instruction has been conditionally skipped.  */
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static inline AArch64DecodeFn *lookup_disas_fn(const AArch64DecodeTable *table,
  */
 static void disas_uncond_b_imm(DisasContext *s, uint32_t insn)
 {
-    uint64_t addr = s->pc + sextract32(insn, 0, 26) * 4 - 4;
+    uint64_t addr = s->pc_curr + sextract32(insn, 0, 26) * 4;
 
     if (insn & (1U << 31)) {
         /* BL Branch with link */
@@ -XXX,XX +XXX,XX @@ static void disas_comp_b_imm(DisasContext *s, uint32_t insn)
     sf = extract32(insn, 31, 1);
     op = extract32(insn, 24, 1); /* 0: CBZ; 1: CBNZ */
     rt = extract32(insn, 0, 5);
-    addr = s->pc + sextract32(insn, 5, 19) * 4 - 4;
+    addr = s->pc_curr + sextract32(insn, 5, 19) * 4;
 
     tcg_cmp = read_cpu_reg(s, rt, sf);
     label_match = gen_new_label();
@@ -XXX,XX +XXX,XX @@ static void disas_test_b_imm(DisasContext *s, uint32_t insn)
 
     bit_pos = (extract32(insn, 31, 1) << 5) | extract32(insn, 19, 5);
     op = extract32(insn, 24, 1); /* 0: TBZ; 1: TBNZ */
-    addr = s->pc + sextract32(insn, 5, 14) * 4 - 4;
+    addr = s->pc_curr + sextract32(insn, 5, 14) * 4;
     rt = extract32(insn, 0, 5);
 
     tcg_cmp = tcg_temp_new_i64();
@@ -XXX,XX +XXX,XX @@ static void disas_cond_b_imm(DisasContext *s, uint32_t insn)
         unallocated_encoding(s);
         return;
     }
-    addr = s->pc + sextract32(insn, 5, 19) * 4 - 4;
+    addr = s->pc_curr + sextract32(insn, 5, 19) * 4;
     cond = extract32(insn, 0, 4);
 
     reset_btype(s);
@@ -XXX,XX +XXX,XX @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
         TCGv_i32 tcg_syn, tcg_isread;
         uint32_t syndrome;
 
-        gen_a64_set_pc_im(s->pc - 4);
+        gen_a64_set_pc_im(s->pc_curr);
         tmpptr = tcg_const_ptr(ri);
         syndrome = syn_aa64_sysregtrap(op0, op1, op2, crn, crm, rt, isread);
         tcg_syn = tcg_const_i32(syndrome);
@@ -XXX,XX +XXX,XX @@ static void disas_exc(DisasContext *s, uint32_t insn)
             /* The pre HVC helper handles cases when HVC gets trapped
              * as an undefined insn by runtime configuration.
              */
-            gen_a64_set_pc_im(s->pc - 4);
+            gen_a64_set_pc_im(s->pc_curr);
             gen_helper_pre_hvc(cpu_env);
             gen_ss_advance(s);
             gen_exception_insn(s, 0, EXCP_HVC, syn_aa64_hvc(imm16), 2);
@@ -XXX,XX +XXX,XX @@ static void disas_exc(DisasContext *s, uint32_t insn)
                 unallocated_encoding(s);
                 break;
             }
-            gen_a64_set_pc_im(s->pc - 4);
+            gen_a64_set_pc_im(s->pc_curr);
             tmp = tcg_const_i32(syn_aa64_smc(imm16));
             gen_helper_pre_smc(cpu_env, tmp);
             tcg_temp_free_i32(tmp);
@@ -XXX,XX +XXX,XX @@ static void disas_ld_lit(DisasContext *s, uint32_t insn)
 
     tcg_rt = cpu_reg(s, rt);
 
-    clean_addr = tcg_const_i64((s->pc - 4) + imm);
+    clean_addr = tcg_const_i64(s->pc_curr + imm);
     if (is_vector) {
         do_fp_ld(s, rt, clean_addr, size);
     } else {
@@ -XXX,XX +XXX,XX @@ static void disas_pc_rel_adr(DisasContext *s, uint32_t insn)
     offset = sextract64(insn, 5, 19);
     offset = offset << 2 | extract32(insn, 29, 2);
     rd = extract32(insn, 0, 5);
-    base = s->pc - 4;
+    base = s->pc_curr;
 
     if (page) {
         /* ADRP (page based) */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
                 break;
             default:
                 fprintf(stderr, "%s: insn %#04x, fpop %#2x @ %#" PRIx64 "\n",
-                        __func__, insn, fpopcode, s->pc);
+                        __func__, insn, fpopcode, s->pc_curr);
                 g_assert_not_reached();
             }
 
@@ -XXX,XX +XXX,XX @@ static void disas_a64_insn(CPUARMState *env, DisasContext *s)
 {
     uint32_t insn;
 
+    s->pc_curr = s->pc;
     insn = arm_ldl_code(env, s->pc, s->sctlr_b);
     s->insn = insn;
     s->pc += 4;
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static inline void gen_hvc(DisasContext *s, int imm16)
      * as an undefined insn by runtime configuration (ie before
      * the insn really executes).
      */
-    gen_set_pc_im(s, s->pc - 4);
+    gen_set_pc_im(s, s->pc_curr);
     gen_helper_pre_hvc(cpu_env);
     /* Otherwise we will treat this as a real exception which
      * happens after execution of the insn. (The distinction matters
@@ -XXX,XX +XXX,XX @@ static inline void gen_smc(DisasContext *s)
      */
     TCGv_i32 tmp;
 
-    gen_set_pc_im(s, s->pc - 4);
+    gen_set_pc_im(s, s->pc_curr);
     tmp = tcg_const_i32(syn_aa32_smc());
     gen_helper_pre_smc(cpu_env, tmp);
     tcg_temp_free_i32(tmp);
@@ -XXX,XX +XXX,XX @@ static void gen_msr_banked(DisasContext *s, int r, int sysm, int rn)
 
     /* Sync state because msr_banked() can raise exceptions */
     gen_set_condexec(s);
-    gen_set_pc_im(s, s->pc - 4);
+    gen_set_pc_im(s, s->pc_curr);
     tcg_reg = load_reg(s, rn);
     tcg_tgtmode = tcg_const_i32(tgtmode);
     tcg_regno = tcg_const_i32(regno);
@@ -XXX,XX +XXX,XX @@ static void gen_mrs_banked(DisasContext *s, int r, int sysm, int rn)
 
     /* Sync state because mrs_banked() can raise exceptions */
     gen_set_condexec(s);
-    gen_set_pc_im(s, s->pc - 4);
+    gen_set_pc_im(s, s->pc_curr);
     tcg_reg = tcg_temp_new_i32();
     tcg_tgtmode = tcg_const_i32(tgtmode);
     tcg_regno = tcg_const_i32(regno);
@@ -XXX,XX +XXX,XX @@ static int disas_coproc_insn(DisasContext *s, uint32_t insn)
             }
 
             gen_set_condexec(s);
-            gen_set_pc_im(s, s->pc - 4);
+            gen_set_pc_im(s, s->pc_curr);
             tmpptr = tcg_const_ptr(ri);
             tcg_syn = tcg_const_i32(syndrome);
             tcg_isread = tcg_const_i32(isread);
@@ -XXX,XX +XXX,XX @@ static void gen_srs(DisasContext *s,
     tmp = tcg_const_i32(mode);
     /* get_r13_banked() will raise an exception if called from System mode */
     gen_set_condexec(s);
-    gen_set_pc_im(s, s->pc - 4);
+    gen_set_pc_im(s, s->pc_curr);
     gen_helper_get_r13_banked(addr, cpu_env, tmp);
     tcg_temp_free_i32(tmp);
     switch (amode) {
@@ -XXX,XX +XXX,XX @@ static void arm_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
         return;
     }
 
+    dc->pc_curr = dc->pc;
     insn = arm_ldl_code(env, dc->pc, dc->sctlr_b);
     dc->insn = insn;
     dc->pc += 4;
@@ -XXX,XX +XXX,XX @@ static void thumb_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
         return;
     }
 
+    dc->pc_curr = dc->pc;
     insn = arm_lduw_code(env, dc->pc, dc->sctlr_b);
     is_16bit = thumb_insn_is_16bit(dc, dc->pc, insn);
     dc->pc += 2;
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

We currently have 3 different ways of computing the architectural
value of "PC" as seen in the ARM ARM.

The value of s->pc has been incremented past the current insn,
but that is all.  Thus for a32, PC = s->pc + 4; for t32, PC = s->pc;
for t16, PC = s->pc + 2.  These differing computations make it
impossible at present to unify the various code paths.

With the newly introduced s->pc_curr, we can compute the correct
value for all cases, using the formula given in the ARM ARM.

This changes the behaviour for load_reg() and load_reg_var()
when called with reg==15 from a 32-bit Thumb instruction:
previously they would have returned the incorrect value
of pc_curr + 6, and now they will return the architecturally
correct value of PC, which is pc_curr + 4. This will not
affect well-behaved guest software, because all of the places
we call these functions from T32 code are instructions where
using r15 is UNPREDICTABLE. Using the architectural PC value
here is more consistent with the T16 and A32 behaviour.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190807045335.1361-4-richard.henderson@linaro.org
[PMM: added commit message note about UNPREDICTABLE T32 cases]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
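Note (illustration only, not part of the patch): a small sketch of
the arithmetic in the commit message. toy_read_pc() mirrors the new
formula; toy_old_load_r15() is a hypothetical model of the old
load_reg_var() computation, assuming s->pc == pc_curr + insn length.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* New formula: architectural PC is pc_curr + 4 (Thumb) or + 8 (A32). */
static uint32_t toy_read_pc(uint32_t pc_curr, bool thumb)
{
    return pc_curr + (thumb ? 4 : 8);
}

/*
 * Model of the old reg==15 path: start from the already-incremented
 * pc and add one more "insn", 2 bytes for Thumb or 4 bytes for A32.
 */
static uint32_t toy_old_load_r15(uint32_t pc_curr, bool thumb,
                                 unsigned insn_len)
{
    uint32_t pc_next;

    assert(insn_len == 2 || insn_len == 4);
    pc_next = pc_curr + insn_len;
    return pc_next + (thumb ? 2 : 4);
}
```

For a32 and t16 the two functions agree; for a 4-byte Thumb insn the
old model yields pc_curr + 6 while toy_read_pc() yields pc_curr + 4,
which is the behaviour change described above.
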
 target/arm/translate.c | 59 ++++++++++++++++--------------------------
 1 file changed, 23 insertions(+), 36 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static inline void store_cpu_offset(TCGv_i32 var, int offset)
 #define store_cpu_field(var, name) \
     store_cpu_offset(var, offsetof(CPUARMState, name))
 
+/* The architectural value of PC.  */
+static uint32_t read_pc(DisasContext *s)
+{
+    return s->pc_curr + (s->thumb ? 4 : 8);
+}
+
 /* Set a variable to the value of a CPU register.  */
 static void load_reg_var(DisasContext *s, TCGv_i32 var, int reg)
 {
     if (reg == 15) {
-        uint32_t addr;
-        /* normally, since we updated PC, we need only to add one insn */
-        if (s->thumb)
-            addr = (long)s->pc + 2;
-        else
-            addr = (long)s->pc + 4;
-        tcg_gen_movi_i32(var, addr);
+        tcg_gen_movi_i32(var, read_pc(s));
     } else {
         tcg_gen_mov_i32(var, cpu_R[reg]);
     }
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
             /* branch link and change to thumb (blx <offset>) */
             int32_t offset;
 
-            val = (uint32_t)s->pc;
             tmp = tcg_temp_new_i32();
-            tcg_gen_movi_i32(tmp, val);
+            tcg_gen_movi_i32(tmp, s->pc);
             store_reg(s, 14, tmp);
             /* Sign-extend the 24-bit offset */
             offset = (((int32_t)insn) << 8) >> 8;
+            val = read_pc(s);
             /* offset * 4 + bit24 * 2 + (thumb bit) */
             val += (offset << 2) | ((insn >> 23) & 2) | 1;
-            /* pipeline offset */
-            val += 4;
             /* protected by ARCH(5); above, near the start of uncond block */
             gen_bx_im(s, val);
             return;
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
                         } else {
                             /* store */
                             if (i == 15) {
-                                /* special case: r15 = PC + 8 */
-                                val = (long)s->pc + 4;
                                 tmp = tcg_temp_new_i32();
-                                tcg_gen_movi_i32(tmp, val);
+                                tcg_gen_movi_i32(tmp, read_pc(s));
                             } else if (user) {
                                 tmp = tcg_temp_new_i32();
                                 tmp2 = tcg_const_i32(i);
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
                 int32_t offset;
 
                 /* branch (and link) */
-                val = (int32_t)s->pc;
                 if (insn & (1 << 24)) {
                     tmp = tcg_temp_new_i32();
-                    tcg_gen_movi_i32(tmp, val);
+                    tcg_gen_movi_i32(tmp, s->pc);
                     store_reg(s, 14, tmp);
                 }
                 offset = sextract32(insn << 2, 0, 26);
-                val += offset + 4;
-                gen_jmp(s, val);
+                gen_jmp(s, read_pc(s) + offset);
             }
             break;
         case 0xc:
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
                 tcg_temp_free_i32(addr);
             } else if ((insn & (7 << 5)) == 0) {
                 /* Table Branch.  */
-                if (rn == 15) {
-                    addr = tcg_temp_new_i32();
-                    tcg_gen_movi_i32(addr, s->pc);
-                } else {
-                    addr = load_reg(s, rn);
-                }
+                addr = load_reg(s, rn);
                 tmp = load_reg(s, rm);
                 tcg_gen_add_i32(addr, addr, tmp);
                 if (insn & (1 << 4)) {
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
                 }
                 tcg_temp_free_i32(addr);
                 tcg_gen_shli_i32(tmp, tmp, 1);
-                tcg_gen_addi_i32(tmp, tmp, s->pc);
+                tcg_gen_addi_i32(tmp, tmp, read_pc(s));
                 store_reg(s, 15, tmp);
             } else {
                 bool is_lasr = false;
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
                     tcg_gen_movi_i32(cpu_R[14], s->pc | 1);
                 }
 
-                offset += s->pc;
+                offset += read_pc(s);
                 if (insn & (1 << 12)) {
                     /* b/bl */
                     gen_jmp(s, offset);
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
                 offset |= (insn & (1 << 11)) << 8;
 
                 /* jump to the offset */
-                gen_jmp(s, s->pc + offset);
+                gen_jmp(s, read_pc(s) + offset);
             }
         } else {
             /*
@@ -XXX,XX +XXX,XX @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
         if (insn & (1 << 11)) {
             rd = (insn >> 8) & 7;
             /* load pc-relative.  Bit 1 of PC is ignored.  */
-            val = s->pc + 2 + ((insn & 0xff) * 4);
+            val = read_pc(s) + ((insn & 0xff) * 4);
             val &= ~(uint32_t)2;
             addr = tcg_temp_new_i32();
             tcg_gen_movi_i32(addr, val);
@@ -XXX,XX +XXX,XX @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
         } else {
             /* PC. bit 1 is ignored.  */
             tmp = tcg_temp_new_i32();
-            tcg_gen_movi_i32(tmp, (s->pc + 2) & ~(uint32_t)2);
+            tcg_gen_movi_i32(tmp, read_pc(s) & ~(uint32_t)2);
         }
         val = (insn & 0xff) * 4;
         tcg_gen_addi_i32(tmp, tmp, val);
@@ -XXX,XX +XXX,XX @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
                 tcg_gen_brcondi_i32(TCG_COND_NE, tmp, 0, s->condlabel);
             tcg_temp_free_i32(tmp);
             offset = ((insn & 0xf8) >> 2) | (insn & 0x200) >> 3;
-            val = (uint32_t)s->pc + 2;
-            val += offset;
-            gen_jmp(s, val);
+            gen_jmp(s, read_pc(s) + offset);
             break;
 
         case 15: /* IT, nop-hint.  */
@@ -XXX,XX +XXX,XX @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
         arm_skip_unless(s, cond);
 
         /* jump to the offset */
-        val = (uint32_t)s->pc + 2;
+        val = read_pc(s);
         offset = ((int32_t)insn << 24) >> 24;
         val += offset << 1;
         gen_jmp(s, val);
@@ -XXX,XX +XXX,XX @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
             break;
         }
         /* unconditional branch */
-        val = (uint32_t)s->pc;
+        val = read_pc(s);
         offset = ((int32_t)insn << 21) >> 21;
-        val += (offset << 1) + 2;
+        val += offset << 1;
         gen_jmp(s, val);
         break;
 
@@ -XXX,XX +XXX,XX @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
             /* 0b1111_0xxx_xxxx_xxxx : BL/BLX prefix */
             uint32_t uoffset = ((int32_t)insn << 21) >> 9;
 
-            tcg_gen_movi_i32(cpu_R[14], s->pc + 2 + uoffset);
+            tcg_gen_movi_i32(cpu_R[14], read_pc(s) + uoffset);
         }
         break;
     }
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

Provide a common routine for the places that require ALIGN(PC, 4)
as the base address as opposed to plain PC.  The two are always
the same for A32, but the difference is meaningful for thumb mode.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190807045335.1361-5-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
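Note (illustration only, not part of the patch): the sketch below
models just the reg == 15 case as plain integer arithmetic.
toy_lit_base() is a hypothetical name; the real add_reg_for_lit()
emits TCG ops rather than returning a value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * ALIGN(PC, 4) + ofs, where PC is the architectural value:
 * pc_curr + 4 in Thumb state, pc_curr + 8 in A32 state.
 */
static uint32_t toy_lit_base(uint32_t pc_curr, bool thumb, int32_t ofs)
{
    uint32_t pc;

    /* Insn addresses are at least halfword aligned. */
    assert((pc_curr & 1) == 0);
    pc = pc_curr + (thumb ? 4 : 8);
    return (pc & ~3u) + (uint32_t)ofs;
}
```

For an A32 insn, pc_curr is word aligned, so the mask is a no-op.
For a Thumb insn at an address with bit 1 set, the mask rounds the
base down by 2, which is why the difference only matters there.
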
 target/arm/translate-vfp.inc.c |  38 ++------
 target/arm/translate.c         | 166 +++++++++++++++------------------
 2 files changed, 82 insertions(+), 122 deletions(-)

diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-vfp.inc.c
+++ b/target/arm/translate-vfp.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VLDR_VSTR_sp(DisasContext *s, arg_VLDR_VSTR_sp *a)
         offset = -offset;
     }
 
-    if (s->thumb && a->rn == 15) {
-        /* This is actually UNPREDICTABLE */
-        addr = tcg_temp_new_i32();
-        tcg_gen_movi_i32(addr, s->pc & ~2);
-    } else {
-        addr = load_reg(s, a->rn);
-    }
-    tcg_gen_addi_i32(addr, addr, offset);
+    /* For thumb, use of PC is UNPREDICTABLE.  */
+    addr = add_reg_for_lit(s, a->rn, offset);
     tmp = tcg_temp_new_i32();
     if (a->l) {
         gen_aa32_ld32u(s, tmp, addr, get_mem_index(s));
@@ -XXX,XX +XXX,XX @@ static bool trans_VLDR_VSTR_dp(DisasContext *s, arg_VLDR_VSTR_dp *a)
         offset = -offset;
     }
 
-    if (s->thumb && a->rn == 15) {
-        /* This is actually UNPREDICTABLE */
-        addr = tcg_temp_new_i32();
-        tcg_gen_movi_i32(addr, s->pc & ~2);
-    } else {
-        addr = load_reg(s, a->rn);
-    }
-    tcg_gen_addi_i32(addr, addr, offset);
+    /* For thumb, use of PC is UNPREDICTABLE.  */
+    addr = add_reg_for_lit(s, a->rn, offset);
     tmp = tcg_temp_new_i64();
     if (a->l) {
         gen_aa32_ld64(s, tmp, addr, get_mem_index(s));
@@ -XXX,XX +XXX,XX @@ static bool trans_VLDM_VSTM_sp(DisasContext *s, arg_VLDM_VSTM_sp *a)
         return true;
     }
 
-    if (s->thumb && a->rn == 15) {
-        /* This is actually UNPREDICTABLE */
-        addr = tcg_temp_new_i32();
-        tcg_gen_movi_i32(addr, s->pc & ~2);
-    } else {
-        addr = load_reg(s, a->rn);
-    }
+    /* For thumb, use of PC is UNPREDICTABLE.  */
+    addr = add_reg_for_lit(s, a->rn, 0);
     if (a->p) {
         /* pre-decrement */
         tcg_gen_addi_i32(addr, addr, -(a->imm << 2));
@@ -XXX,XX +XXX,XX @@ static bool trans_VLDM_VSTM_dp(DisasContext *s, arg_VLDM_VSTM_dp *a)
         return true;
     }
 
-    if (s->thumb && a->rn == 15) {
-        /* This is actually UNPREDICTABLE */
-        addr = tcg_temp_new_i32();
-        tcg_gen_movi_i32(addr, s->pc & ~2);
-    } else {
-        addr = load_reg(s, a->rn);
-    }
+    /* For thumb, use of PC is UNPREDICTABLE.  */
+    addr = add_reg_for_lit(s, a->rn, 0);
     if (a->p) {
         /* pre-decrement */
         tcg_gen_addi_i32(addr, addr, -(a->imm << 2));
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static inline TCGv_i32 load_reg(DisasContext *s, int reg)
     return tmp;
 }
 
+/*
+ * Create a new temp, REG + OFS, except PC is ALIGN(PC, 4).
+ * This is used for load/store for which use of PC implies (literal),
+ * or ADD that implies ADR.
+ */
+static TCGv_i32 add_reg_for_lit(DisasContext *s, int reg, int ofs)
+{
+    TCGv_i32 tmp = tcg_temp_new_i32();
+
+    if (reg == 15) {
+        tcg_gen_movi_i32(tmp, (read_pc(s) & ~3) + ofs);
+    } else {
+        tcg_gen_addi_i32(tmp, cpu_R[reg], ofs);
+    }
+    return tmp;
+}
+
 /* Set a CPU register.  The source must be a temporary and will be
    marked as dead.  */
 static void store_reg(DisasContext *s, int reg, TCGv_i32 var)
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
                  */
                 bool wback = extract32(insn, 21, 1);
 
-                if (rn == 15) {
-                    if (insn & (1 << 21)) {
-                        /* UNPREDICTABLE */
-                        goto illegal_op;
-                    }
-                    addr = tcg_temp_new_i32();
-                    tcg_gen_movi_i32(addr, s->pc & ~3);
-                } else {
-                    addr = load_reg(s, rn);
+                if (rn == 15 && (insn & (1 << 21))) {
+                    /* UNPREDICTABLE */
+                    goto illegal_op;
                 }
+
+                addr = add_reg_for_lit(s, rn, 0);
                 offset = (insn & 0xff) * 4;
                 if ((insn & (1 << 23)) == 0) {
                     offset = -offset;
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
                         store_reg(s, rd, tmp);
                     } else {
                         /* Add/sub 12-bit immediate.  */
-                        if (rn == 15) {
-                            offset = s->pc & ~(uint32_t)3;
-                            if (insn & (1 << 23))
-                                offset -= imm;
-                            else
-                                offset += imm;
-                            tmp = tcg_temp_new_i32();
-                            tcg_gen_movi_i32(tmp, offset);
-                            store_reg(s, rd, tmp);
+                        if (insn & (1 << 23)) {
+                            imm = -imm;
+                        }
+                        tmp = add_reg_for_lit(s, rn, imm);
+                        if (rn == 13 && rd == 13) {
+                            /* ADD SP, SP, imm or SUB SP, SP, imm */
+                            store_sp_checked(s, tmp);
                         } else {
-                            tmp = load_reg(s, rn);
-                            if (insn & (1 << 23))
-                                tcg_gen_subi_i32(tmp, tmp, imm);
-                            else
-                                tcg_gen_addi_i32(tmp, tmp, imm);
-                            if (rn == 13 && rd == 13) {
-                                /* ADD SP, SP, imm or SUB SP, SP, imm */
-                                store_sp_checked(s, tmp);
-                            } else {
-                                store_reg(s, rd, tmp);
-                            }
+                            store_reg(s, rd, tmp);
                         }
                     }
                 }
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
             }
         }
         memidx = get_mem_index(s);
-        if (rn == 15) {
-            addr = tcg_temp_new_i32();
-            /* PC relative.  */
-            /* s->pc has already been incremented by 4.  */
-            imm = s->pc & 0xfffffffc;
-            if (insn & (1 << 23))
-                imm += insn & 0xfff;
-            else
-                imm -= insn & 0xfff;
-            tcg_gen_movi_i32(addr, imm);
+        imm = insn & 0xfff;
+        if (insn & (1 << 23)) {
+            /* PC relative or Positive offset.  */
+            addr = add_reg_for_lit(s, rn, imm);
+        } else if (rn == 15) {
+            /* PC relative with negative offset.  */
+            addr = add_reg_for_lit(s, rn, -imm);
         } else {
             addr = load_reg(s, rn);
-            if (insn & (1 << 23)) {
-                /* Positive offset.  */
-                imm = insn & 0xfff;
-                tcg_gen_addi_i32(addr, addr, imm);
-            } else {
-                imm = insn & 0xff;
-                switch ((insn >> 8) & 0xf) {
-                case 0x0: /* Shifted Register.  */
-                    shift = (insn >> 4) & 0xf;
-                    if (shift > 3) {
-                        tcg_temp_free_i32(addr);
-                        goto illegal_op;
-                    }
-                    tmp = load_reg(s, rm);
-                    if (shift)
-                        tcg_gen_shli_i32(tmp, tmp, shift);
-                    tcg_gen_add_i32(addr, addr, tmp);
-                    tcg_temp_free_i32(tmp);
-                    break;
-                case 0xc: /* Negative offset.  */
-                    tcg_gen_addi_i32(addr, addr, -imm);
-                    break;
-                case 0xe: /* User privilege.  */
-                    tcg_gen_addi_i32(addr, addr, imm);
-                    memidx = get_a32_user_mem_index(s);
-                    break;
-                case 0x9: /* Post-decrement.  */
-                    imm = -imm;
-                    /* Fall through.  */
-                case 0xb: /* Post-increment.  */
-                    postinc = 1;
-                    writeback = 1;
-                    break;
-                case 0xd: /* Pre-decrement.  */
-                    imm = -imm;
-                    /* Fall through.  */
-                case 0xf: /* Pre-increment.  */
-                    writeback = 1;
-                    break;
-                default:
+            imm = insn & 0xff;
+            switch ((insn >> 8) & 0xf) {
+            case 0x0: /* Shifted Register.  */
+                shift = (insn >> 4) & 0xf;
+                if (shift > 3) {
                     tcg_temp_free_i32(addr);
                     goto illegal_op;
                 }
+                tmp = load_reg(s, rm);
+                if (shift) {
+                    tcg_gen_shli_i32(tmp, tmp, shift);
+                }
+                tcg_gen_add_i32(addr, addr, tmp);
+                tcg_temp_free_i32(tmp);
+                break;
+            case 0xc: /* Negative offset.  */
+                tcg_gen_addi_i32(addr, addr, -imm);
+                break;
+            case 0xe: /* User privilege.  */
+                tcg_gen_addi_i32(addr, addr, imm);
+                memidx = get_a32_user_mem_index(s);
+                break;
+            case 0x9: /* Post-decrement.  */
+                imm = -imm;
+                /* Fall through.  */
+            case 0xb: /* Post-increment.  */
+                postinc = 1;
+                writeback = 1;
+                break;
+            case 0xd: /* Pre-decrement.  */
+                imm = -imm;
+                /* Fall through.  */
+            case 0xf: /* Pre-increment.  */
+                writeback = 1;
+                break;
+            default:
+                tcg_temp_free_i32(addr);
+                goto illegal_op;
             }
         }
 
@@ -XXX,XX +XXX,XX @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
         if (insn & (1 << 11)) {
             rd = (insn >> 8) & 7;
             /* load pc-relative.  Bit 1 of PC is ignored.  */
-            val = read_pc(s) + ((insn & 0xff) * 4);
-            val &= ~(uint32_t)2;
-            addr = tcg_temp_new_i32();
-            tcg_gen_movi_i32(addr, val);
+            addr = add_reg_for_lit(s, 15, (insn & 0xff) * 4);
             tmp = tcg_temp_new_i32();
             gen_aa32_ld32u_iss(s, tmp, addr, get_mem_index(s),
                                rd | ISSIs16Bit);
@@ -XXX,XX +XXX,XX @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
          *  - Add PC/SP (immediate)
          */
         rd = (insn >> 8) & 7;
-        if (insn & (1 << 11)) {
-            /* SP */
-            tmp = load_reg(s, 13);
-        } else {
-            /* PC. bit 1 is ignored.  */
-            tmp = tcg_temp_new_i32();
-            tcg_gen_movi_i32(tmp, read_pc(s) & ~(uint32_t)2);
-        }
         val = (insn & 0xff) * 4;
-        tcg_gen_addi_i32(tmp, tmp, val);
+        tmp = add_reg_for_lit(s, insn & (1 << 11) ? 13 : 15, val);
         store_reg(s, rd, tmp);
         break;
 
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

The thumb bit has already been removed from s->pc, so s->pc is always even.

diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_exception_bkpt_insn(DisasContext *s, int offset, uint32_t syn)
 /* Force a TB lookup after an instruction that changes the CPU state.  */
 static inline void gen_lookup_tb(DisasContext *s)
 {
-    tcg_gen_movi_i32(cpu_R[15], s->pc & ~1);
+    tcg_gen_movi_i32(cpu_R[15], s->pc);
     s->base.is_jmp = DISAS_EXIT;
 }
 
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
                  * self-modifying code correctly and also to take
                  * any pending interrupts immediately.
                  */
-                gen_goto_tb(s, 0, s->pc & ~1);
+                gen_goto_tb(s, 0, s->pc);
                 return;
             case 7: /* sb */
                 if ((insn & 0xf) || !dc_isar_feature(aa32_sb, s)) {
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
                  * for TCG; MB and end the TB instead.
                  */
                 tcg_gen_mb(TCG_MO_ALL | TCG_BAR_SC);
-                gen_goto_tb(s, 0, s->pc & ~1);
+                gen_goto_tb(s, 0, s->pc);
                 return;
             default:
                 goto illegal_op;
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
                              * and also to take any pending interrupts
                              * immediately.
                              */
-                            gen_goto_tb(s, 0, s->pc & ~1);
+                            gen_goto_tb(s, 0, s->pc);
                             break;
                         case 7: /* sb */
                             if ((insn & 0xf) || !dc_isar_feature(aa32_sb, s)) {
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
                              * for TCG; MB and end the TB instead.
                              */
                             tcg_gen_mb(TCG_MO_ALL | TCG_BAR_SC);
-                            gen_goto_tb(s, 0, s->pc & ~1);
+                            gen_goto_tb(s, 0, s->pc);
                             break;
                         default:
                             goto illegal_op;
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

We must update s->base.pc_next when we return from the translate_insn
hook to the main translator loop.  By incrementing s->base.pc_next
immediately after reading the insn word, "pc_next" contains the address
of the next instruction throughout translation.

All remaining uses of s->pc are referencing the address of the next insn,
so this is now a simple global replacement.  Remove the "s->pc" field.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190807045335.1361-7-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate.h     |   1 -
 target/arm/translate-a64.c |  51 +++++++++---------
 target/arm/translate.c     | 103 ++++++++++++++++++-------------------
 3 files changed, 72 insertions(+), 83 deletions(-)

diff --git a/target/arm/translate.h b/target/arm/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ typedef struct DisasContext {
     DisasContextBase base;
     const ARMISARegisters *isar;
 
-    target_ulong pc;
     /* The address of the current instruction being translated. */
     target_ulong pc_curr;
     target_ulong page_start;
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void gen_exception_internal(int excp)
 
 static void gen_exception_internal_insn(DisasContext *s, int offset, int excp)
 {
-    gen_a64_set_pc_im(s->pc - offset);
+    gen_a64_set_pc_im(s->base.pc_next - offset);
     gen_exception_internal(excp);
     s->base.is_jmp = DISAS_NORETURN;
 }
@@ -XXX,XX +XXX,XX @@ static void gen_exception_internal_insn(DisasContext *s, int offset, int excp)
 static void gen_exception_insn(DisasContext *s, int offset, int excp,
                                uint32_t syndrome, uint32_t target_el)
 {
-    gen_a64_set_pc_im(s->pc - offset);
+    gen_a64_set_pc_im(s->base.pc_next - offset);
     gen_exception(excp, syndrome, target_el);
     s->base.is_jmp = DISAS_NORETURN;
 }
@@ -XXX,XX +XXX,XX @@ static void gen_exception_bkpt_insn(DisasContext *s, int offset,
 {
     TCGv_i32 tcg_syn;
 
-    gen_a64_set_pc_im(s->pc - offset);
+    gen_a64_set_pc_im(s->base.pc_next - offset);
     tcg_syn = tcg_const_i32(syndrome);
     gen_helper_exception_bkpt_insn(cpu_env, tcg_syn);
     tcg_temp_free_i32(tcg_syn);
@@ -XXX,XX +XXX,XX @@ static void disas_uncond_b_imm(DisasContext *s, uint32_t insn)
 
     if (insn & (1U << 31)) {
         /* BL Branch with link */
-        tcg_gen_movi_i64(cpu_reg(s, 30), s->pc);
+        tcg_gen_movi_i64(cpu_reg(s, 30), s->base.pc_next);
     }
 
     /* B Branch / BL Branch with link */
@@ -XXX,XX +XXX,XX @@ static void disas_comp_b_imm(DisasContext *s, uint32_t insn)
     tcg_gen_brcondi_i64(op ? TCG_COND_NE : TCG_COND_EQ,
                         tcg_cmp, 0, label_match);
 
-    gen_goto_tb(s, 0, s->pc);
+    gen_goto_tb(s, 0, s->base.pc_next);
     gen_set_label(label_match);
     gen_goto_tb(s, 1, addr);
 }
@@ -XXX,XX +XXX,XX @@ static void disas_test_b_imm(DisasContext *s, uint32_t insn)
     tcg_gen_brcondi_i64(op ? TCG_COND_NE : TCG_COND_EQ,
                         tcg_cmp, 0, label_match);
     tcg_temp_free_i64(tcg_cmp);
-    gen_goto_tb(s, 0, s->pc);
+    gen_goto_tb(s, 0, s->base.pc_next);
     gen_set_label(label_match);
     gen_goto_tb(s, 1, addr);
 }
@@ -XXX,XX +XXX,XX @@ static void disas_cond_b_imm(DisasContext *s, uint32_t insn)
         /* genuinely conditional branches */
         TCGLabel *label_match = gen_new_label();
         arm_gen_test_cc(cond, label_match);
-        gen_goto_tb(s, 0, s->pc);
+        gen_goto_tb(s, 0, s->base.pc_next);
         gen_set_label(label_match);
         gen_goto_tb(s, 1, addr);
     } else {
@@ -XXX,XX +XXX,XX @@ static void handle_sync(DisasContext *s, uint32_t insn,
          * any pending interrupts immediately.
          */
         reset_btype(s);
-        gen_goto_tb(s, 0, s->pc);
+        gen_goto_tb(s, 0, s->base.pc_next);
         return;
 
     case 7: /* SB */
@@ -XXX,XX +XXX,XX @@ static void handle_sync(DisasContext *s, uint32_t insn,
          * MB and end the TB instead.
          */
         tcg_gen_mb(TCG_MO_ALL | TCG_BAR_SC);
-        gen_goto_tb(s, 0, s->pc);
+        gen_goto_tb(s, 0, s->base.pc_next);
         return;
 
     default:
@@ -XXX,XX +XXX,XX @@ static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
         gen_a64_set_pc(s, dst);
         /* BLR also needs to load return address */
         if (opc == 1) {
-            tcg_gen_movi_i64(cpu_reg(s, 30), s->pc);
+            tcg_gen_movi_i64(cpu_reg(s, 30), s->base.pc_next);
         }
         break;
 
@@ -XXX,XX +XXX,XX @@ static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
         gen_a64_set_pc(s, dst);
         /* BLRAA also needs to load return address */
         if (opc == 9) {
-            tcg_gen_movi_i64(cpu_reg(s, 30), s->pc);
+            tcg_gen_movi_i64(cpu_reg(s, 30), s->base.pc_next);
         }
         break;
 
@@ -XXX,XX +XXX,XX @@ static void disas_a64_insn(CPUARMState *env, DisasContext *s)
 {
     uint32_t insn;
 
-    s->pc_curr = s->pc;
-    insn = arm_ldl_code(env, s->pc, s->sctlr_b);
+    s->pc_curr = s->base.pc_next;
+    insn = arm_ldl_code(env, s->base.pc_next, s->sctlr_b);
     s->insn = insn;
-    s->pc += 4;
+    s->base.pc_next += 4;
 
     s->fp_access_checked = false;
 
@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_init_disas_context(DisasContextBase *dcbase,
     int bound, core_mmu_idx;
 
     dc->isar = &arm_cpu->isar;
-    dc->pc = dc->base.pc_first;
     dc->condjmp = 0;
 
     dc->aarch64 = 1;
@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
 {
     DisasContext *dc = container_of(dcbase, DisasContext, base);
 
-    tcg_gen_insn_start(dc->pc, 0, 0);
+    tcg_gen_insn_start(dc->base.pc_next, 0, 0);
     dc->insn_start = tcg_last_op();
 }
 
@@ -XXX,XX +XXX,XX @@ static bool aarch64_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
     DisasContext *dc = container_of(dcbase, DisasContext, base);
 
     if (bp->flags & BP_CPU) {
-        gen_a64_set_pc_im(dc->pc);
+        gen_a64_set_pc_im(dc->base.pc_next);
         gen_helper_check_breakpoints(cpu_env);
         /* End the TB early; it likely won't be executed */
         dc->base.is_jmp = DISAS_TOO_MANY;
@@ -XXX,XX +XXX,XX @@ static bool aarch64_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
            to for it to be properly cleared -- thus we
            increment the PC here so that the logic setting
            tb->size below does the right thing.  */
-        dc->pc += 4;
+        dc->base.pc_next += 4;
         dc->base.is_jmp = DISAS_NORETURN;
     }
 
@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
         disas_a64_insn(env, dc);
     }
 
-    dc->base.pc_next = dc->pc;
     translator_loop_temp_check(&dc->base);
 }
 
@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_tb_stop(DisasContextBase *dcbase, CPUState *cpu)
          */
         switch (dc->base.is_jmp) {
         default:
-            gen_a64_set_pc_im(dc->pc);
+            gen_a64_set_pc_im(dc->base.pc_next);
             /* fall through */
         case DISAS_EXIT:
         case DISAS_JUMP:
@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_tb_stop(DisasContextBase *dcbase, CPUState *cpu)
         switch (dc->base.is_jmp) {
         case DISAS_NEXT:
         case DISAS_TOO_MANY:
-            gen_goto_tb(dc, 1, dc->pc);
+            gen_goto_tb(dc, 1, dc->base.pc_next);
             break;
         default:
         case DISAS_UPDATE:
-            gen_a64_set_pc_im(dc->pc);
+            gen_a64_set_pc_im(dc->base.pc_next);
             /* fall through */
         case DISAS_EXIT:
             tcg_gen_exit_tb(NULL, 0);
@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_tb_stop(DisasContextBase *dcbase, CPUState *cpu)
         case DISAS_SWI:
             break;
         case DISAS_WFE:
-            gen_a64_set_pc_im(dc->pc);
+            gen_a64_set_pc_im(dc->base.pc_next);
             gen_helper_wfe(cpu_env);
             break;
         case DISAS_YIELD:
-            gen_a64_set_pc_im(dc->pc);
+            gen_a64_set_pc_im(dc->base.pc_next);
             gen_helper_yield(cpu_env);
             break;
         case DISAS_WFI:
@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_tb_stop(DisasContextBase *dcbase, CPUState *cpu)
              */
             TCGv_i32 tmp = tcg_const_i32(4);
 
-            gen_a64_set_pc_im(dc->pc);
+            gen_a64_set_pc_im(dc->base.pc_next);
             gen_helper_wfi(cpu_env, tmp);
             tcg_temp_free_i32(tmp);
             /* The helper doesn't necessarily throw an exception, but we
@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_tb_stop(DisasContextBase *dcbase, CPUState *cpu)
         }
         }
     }
-
-    /* Functions above can change dc->pc, so re-align db->pc_next */
-    dc->base.pc_next = dc->pc;
 }
 
 static void aarch64_tr_disas_log(const DisasContextBase *dcbase,
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static inline void gen_blxns(DisasContext *s, int rm)
      * We do however need to set the PC, because the blxns helper reads it.
      * The blxns helper may throw an exception.
      */
-    gen_set_pc_im(s, s->pc);
+    gen_set_pc_im(s, s->base.pc_next);
     gen_helper_v7m_blxns(cpu_env, var);
     tcg_temp_free_i32(var);
     s->base.is_jmp = DISAS_EXIT;
@@ -XXX,XX +XXX,XX @@ static inline void gen_hvc(DisasContext *s, int imm16)
      * for single stepping.)
      */
     s->svc_imm = imm16;
-    gen_set_pc_im(s, s->pc);
+    gen_set_pc_im(s, s->base.pc_next);
     s->base.is_jmp = DISAS_HVC;
 }
 
@@ -XXX,XX +XXX,XX @@ static inline void gen_smc(DisasContext *s)
     tmp = tcg_const_i32(syn_aa32_smc());
     gen_helper_pre_smc(cpu_env, tmp);
     tcg_temp_free_i32(tmp);
-    gen_set_pc_im(s, s->pc);
+    gen_set_pc_im(s, s->base.pc_next);
     s->base.is_jmp = DISAS_SMC;
 }
 
 static void gen_exception_internal_insn(DisasContext *s, int offset, int excp)
 {
     gen_set_condexec(s);
-    gen_set_pc_im(s, s->pc - offset);
+    gen_set_pc_im(s, s->base.pc_next - offset);
     gen_exception_internal(excp);
     s->base.is_jmp = DISAS_NORETURN;
 }
@@ -XXX,XX +XXX,XX @@ static void gen_exception_insn(DisasContext *s, int offset, int excp,
                                int syn, uint32_t target_el)
 {
     gen_set_condexec(s);
-    gen_set_pc_im(s, s->pc - offset);
+    gen_set_pc_im(s, s->base.pc_next - offset);
     gen_exception(excp, syn, target_el);
     s->base.is_jmp = DISAS_NORETURN;
 }
@@ -XXX,XX +XXX,XX @@ static void gen_exception_bkpt_insn(DisasContext *s, int offset, uint32_t syn)
     TCGv_i32 tcg_syn;
 
     gen_set_condexec(s);
-    gen_set_pc_im(s, s->pc - offset);
+    gen_set_pc_im(s, s->base.pc_next - offset);
     tcg_syn = tcg_const_i32(syn);
     gen_helper_exception_bkpt_insn(cpu_env, tcg_syn);
     tcg_temp_free_i32(tcg_syn);
@@ -XXX,XX +XXX,XX @@ static void gen_exception_bkpt_insn(DisasContext *s, int offset, uint32_t syn)
 /* Force a TB lookup after an instruction that changes the CPU state.  */
 static inline void gen_lookup_tb(DisasContext *s)
 {
-    tcg_gen_movi_i32(cpu_R[15], s->pc);
+    tcg_gen_movi_i32(cpu_R[15], s->base.pc_next);
     s->base.is_jmp = DISAS_EXIT;
 }
 
@@ -XXX,XX +XXX,XX @@ static inline bool use_goto_tb(DisasContext *s, target_ulong dest)
 {
 #ifndef CONFIG_USER_ONLY
     return (s->base.tb->pc & TARGET_PAGE_MASK) == (dest & TARGET_PAGE_MASK) ||
-           ((s->pc - 1) & TARGET_PAGE_MASK) == (dest & TARGET_PAGE_MASK);
+           ((s->base.pc_next - 1) & TARGET_PAGE_MASK) == (dest & TARGET_PAGE_MASK);
 #else
     return true;
 #endif
@@ -XXX,XX +XXX,XX @@ static void gen_nop_hint(DisasContext *s, int val)
          */
     case 1: /* yield */
         if (!(tb_cflags(s->base.tb) & CF_PARALLEL)) {
-            gen_set_pc_im(s, s->pc);
+            gen_set_pc_im(s, s->base.pc_next);
             s->base.is_jmp = DISAS_YIELD;
         }
         break;
     case 3: /* wfi */
-        gen_set_pc_im(s, s->pc);
+        gen_set_pc_im(s, s->base.pc_next);
         s->base.is_jmp = DISAS_WFI;
         break;
     case 2: /* wfe */
         if (!(tb_cflags(s->base.tb) & CF_PARALLEL)) {
-            gen_set_pc_im(s, s->pc);
+            gen_set_pc_im(s, s->base.pc_next);
             s->base.is_jmp = DISAS_WFE;
         }
         break;
@@ -XXX,XX +XXX,XX @@ static int disas_coproc_insn(DisasContext *s, uint32_t insn)
             if (isread) {
                 return 1;
             }
-            gen_set_pc_im(s, s->pc);
+            gen_set_pc_im(s, s->base.pc_next);
             s->base.is_jmp = DISAS_WFI;
             return 0;
         default:
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
                  * self-modifying code correctly and also to take
                  * any pending interrupts immediately.
                  */
-                gen_goto_tb(s, 0, s->pc);
+                gen_goto_tb(s, 0, s->base.pc_next);
                 return;
             case 7: /* sb */
                 if ((insn & 0xf) || !dc_isar_feature(aa32_sb, s)) {
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
                  * for TCG; MB and end the TB instead.
                  */
                 tcg_gen_mb(TCG_MO_ALL | TCG_BAR_SC);
-                gen_goto_tb(s, 0, s->pc);
+                gen_goto_tb(s, 0, s->base.pc_next);
                 return;
             default:
                 goto illegal_op;
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
             int32_t offset;
 
             tmp = tcg_temp_new_i32();
-            tcg_gen_movi_i32(tmp, s->pc);
+            tcg_gen_movi_i32(tmp, s->base.pc_next);
             store_reg(s, 14, tmp);
             /* Sign-extend the 24-bit offset */
             offset = (((int32_t)insn) << 8) >> 8;
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
             /* branch link/exchange thumb (blx) */
             tmp = load_reg(s, rm);
             tmp2 = tcg_temp_new_i32();
-            tcg_gen_movi_i32(tmp2, s->pc);
+            tcg_gen_movi_i32(tmp2, s->base.pc_next);
             store_reg(s, 14, tmp2);
             gen_bx(s, tmp);
             break;
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
                 /* branch (and link) */
                 if (insn & (1 << 24)) {
                     tmp = tcg_temp_new_i32();
-                    tcg_gen_movi_i32(tmp, s->pc);
+                    tcg_gen_movi_i32(tmp, s->base.pc_next);
                     store_reg(s, 14, tmp);
                 }
                 offset = sextract32(insn << 2, 0, 26);
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
             break;
         case 0xf:
             /* swi */
-            gen_set_pc_im(s, s->pc);
+            gen_set_pc_im(s, s->base.pc_next);
             s->svc_imm = extract32(insn, 0, 24);
             s->base.is_jmp = DISAS_SWI;
             break;
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
 
                 if (insn & (1 << 14)) {
                     /* Branch and link.  */
-                    tcg_gen_movi_i32(cpu_R[14], s->pc | 1);
+                    tcg_gen_movi_i32(cpu_R[14], s->base.pc_next | 1);
                 }
 
                 offset += read_pc(s);
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
                              * and also to take any pending interrupts
                              * immediately.
                              */
-                            gen_goto_tb(s, 0, s->pc);
+                            gen_goto_tb(s, 0, s->base.pc_next);
                             break;
                         case 7: /* sb */
                             if ((insn & 0xf) || !dc_isar_feature(aa32_sb, s)) {
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
                              * for TCG; MB and end the TB instead.
                              */
                             tcg_gen_mb(TCG_MO_ALL | TCG_BAR_SC);
-                            gen_goto_tb(s, 0, s->pc);
+                            gen_goto_tb(s, 0, s->base.pc_next);
                             break;
                         default:
                             goto illegal_op;
@@ -XXX,XX +XXX,XX @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
                 /* BLX/BX */
                 tmp = load_reg(s, rm);
                 if (link) {
-                    val = (uint32_t)s->pc | 1;
+                    val = (uint32_t)s->base.pc_next | 1;
                     tmp2 = tcg_temp_new_i32();
                     tcg_gen_movi_i32(tmp2, val);
                     store_reg(s, 14, tmp2);
@@ -XXX,XX +XXX,XX @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
 
         if (cond == 0xf) {
             /* swi */
-            gen_set_pc_im(s, s->pc);
+            gen_set_pc_im(s, s->base.pc_next);
             s->svc_imm = extract32(insn, 0, 8);
             s->base.is_jmp = DISAS_SWI;
             break;
@@ -XXX,XX +XXX,XX @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
             tcg_gen_andi_i32(tmp, tmp, 0xfffffffc);
 
             tmp2 = tcg_temp_new_i32();
-            tcg_gen_movi_i32(tmp2, s->pc | 1);
+            tcg_gen_movi_i32(tmp2, s->base.pc_next | 1);
             store_reg(s, 14, tmp2);
             gen_bx(s, tmp);
             break;
@@ -XXX,XX +XXX,XX @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
             tcg_gen_addi_i32(tmp, tmp, offset);
 
             tmp2 = tcg_temp_new_i32();
-            tcg_gen_movi_i32(tmp2, s->pc | 1);
+            tcg_gen_movi_i32(tmp2, s->base.pc_next | 1);
             store_reg(s, 14, tmp2);
             gen_bx(s, tmp);
         } else {
@@ -XXX,XX +XXX,XX @@ undef:
 
 static bool insn_crosses_page(CPUARMState *env, DisasContext *s)
 {
-    /* Return true if the insn at dc->pc might cross a page boundary.
+    /* Return true if the insn at dc->base.pc_next might cross a page boundary.
      * (False positives are OK, false negatives are not.)
      * We know this is a Thumb insn, and our caller ensures we are
-     * only called if dc->pc is less than 4 bytes from the page
+     * only called if dc->base.pc_next is less than 4 bytes from the page
      * boundary, so we cross the page if the first 16 bits indicate
      * that this is a 32 bit insn.
      */
-    uint16_t insn = arm_lduw_code(env, s->pc, s->sctlr_b);
+    uint16_t insn = arm_lduw_code(env, s->base.pc_next, s->sctlr_b);
 
-    return !thumb_insn_is_16bit(s, s->pc, insn);
+    return !thumb_insn_is_16bit(s, s->base.pc_next, insn);
 }
 
 static void arm_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
@@ -XXX,XX +XXX,XX @@ static void arm_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
     uint32_t condexec, core_mmu_idx;
 
     dc->isar = &cpu->isar;
-    dc->pc = dc->base.pc_first;
     dc->condjmp = 0;
 
     dc->aarch64 = 0;
@@ -XXX,XX +XXX,XX @@ static void arm_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
 {
     DisasContext *dc = container_of(dcbase, DisasContext, base);
 
-    tcg_gen_insn_start(dc->pc,
+    tcg_gen_insn_start(dc->base.pc_next,
                        (dc->condexec_cond << 4) | (dc->condexec_mask >> 1),
                        0);
     dc->insn_start = tcg_last_op();
@@ -XXX,XX +XXX,XX @@ static bool arm_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
 
     if (bp->flags & BP_CPU) {
         gen_set_condexec(dc);
-        gen_set_pc_im(dc, dc->pc);
+        gen_set_pc_im(dc, dc->base.pc_next);
         gen_helper_check_breakpoints(cpu_env);
         /* End the TB early; it's likely not going to be executed */
         dc->base.is_jmp = DISAS_TOO_MANY;
@@ -XXX,XX +XXX,XX @@ static bool arm_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
            tb->size below does the right thing.  */
         /* TODO: Advance PC by correct instruction length to
          * avoid disassembler error messages */
-        dc->pc += 2;
+        dc->base.pc_next += 2;
         dc->base.is_jmp = DISAS_NORETURN;
     }
 
@@ -XXX,XX +XXX,XX @@ static bool arm_pre_translate_insn(DisasContext *dc)
 {
 #ifdef CONFIG_USER_ONLY
     /* Intercept jump to the magic kernel page.  */
-    if (dc->pc >= 0xffff0000) {
+    if (dc->base.pc_next >= 0xffff0000) {
         /* We always get here via a jump, so know we are not in a
            conditional execution block.  */
         gen_exception_internal(EXCP_KERNEL_TRAP);
@@ -XXX,XX +XXX,XX @@ static void arm_post_translate_insn(DisasContext *dc)
         gen_set_label(dc->condlabel);
         dc->condjmp = 0;
     }
-    dc->base.pc_next = dc->pc;
     translator_loop_temp_check(&dc->base);
 }
 
@@ -XXX,XX +XXX,XX @@ static void arm_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
         return;
     }
 
-    dc->pc_curr = dc->pc;
-    insn = arm_ldl_code(env, dc->pc, dc->sctlr_b);
+    dc->pc_curr = dc->base.pc_next;
+    insn = arm_ldl_code(env, dc->base.pc_next, dc->sctlr_b);
     dc->insn = insn;
-    dc->pc += 4;
+    dc->base.pc_next += 4;
     disas_arm_insn(dc, insn);
 
     arm_post_translate_insn(dc);
@@ -XXX,XX +XXX,XX @@ static void thumb_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
         return;
     }
 
-    dc->pc_curr = dc->pc;
-    insn = arm_lduw_code(env, dc->pc, dc->sctlr_b);
-    is_16bit = thumb_insn_is_16bit(dc, dc->pc, insn);
-    dc->pc += 2;
+    dc->pc_curr = dc->base.pc_next;
+    insn = arm_lduw_code(env, dc->base.pc_next, dc->sctlr_b);
+    is_16bit = thumb_insn_is_16bit(dc, dc->base.pc_next, insn);
+    dc->base.pc_next += 2;
     if (!is_16bit) {
-        uint32_t insn2 = arm_lduw_code(env, dc->pc, dc->sctlr_b);
+        uint32_t insn2 = arm_lduw_code(env, dc->base.pc_next, dc->sctlr_b);
 
         insn = insn << 16 | insn2;
-        dc->pc += 2;
+        dc->base.pc_next += 2;
     }
     dc->insn = insn;
 
@@ -XXX,XX +XXX,XX @@ static void thumb_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
      * but isn't very efficient).
      */
     if (dc->base.is_jmp == DISAS_NEXT
-        && (dc->pc - dc->page_start >= TARGET_PAGE_SIZE
-            || (dc->pc - dc->page_start >= TARGET_PAGE_SIZE - 3
+        && (dc->base.pc_next - dc->page_start >= TARGET_PAGE_SIZE
+            || (dc->base.pc_next - dc->page_start >= TARGET_PAGE_SIZE - 3
                 && insn_crosses_page(env, dc)))) {
         dc->base.is_jmp = DISAS_TOO_MANY;
     }
@@ -XXX,XX +XXX,XX @@ static void arm_tr_tb_stop(DisasContextBase *dcbase, CPUState *cpu)
         case DISAS_NEXT:
         case DISAS_TOO_MANY:
         case DISAS_UPDATE:
-            gen_set_pc_im(dc, dc->pc);
+            gen_set_pc_im(dc, dc->base.pc_next);
             /* fall through */
         default:
             /* FIXME: Single stepping a WFI insn will not halt the CPU. */
@@ -XXX,XX +XXX,XX @@ static void arm_tr_tb_stop(DisasContextBase *dcbase, CPUState *cpu)
         switch(dc->base.is_jmp) {
         case DISAS_NEXT:
         case DISAS_TOO_MANY:
-            gen_goto_tb(dc, 1, dc->pc);
+            gen_goto_tb(dc, 1, dc->base.pc_next);
             break;
         case DISAS_JUMP:
             gen_goto_ptr();
             break;
         case DISAS_UPDATE:
-            gen_set_pc_im(dc, dc->pc);
+            gen_set_pc_im(dc, dc->base.pc_next);
             /* fall through */
         default:
             /* indicate that the hash table must be used to find the next TB */
@@ -XXX,XX +XXX,XX @@ static void arm_tr_tb_stop(DisasContextBase *dcbase, CPUState *cpu)
         gen_set_label(dc->condlabel);
         gen_set_condexec(dc);
         if (unlikely(is_singlestepping(dc))) {
-            gen_set_pc_im(dc, dc->pc);
+            gen_set_pc_im(dc, dc->base.pc_next);
             gen_singlestep_exception(dc);
         } else {
-            gen_goto_tb(dc, 1, dc->pc);
+            gen_goto_tb(dc, 1, dc->base.pc_next);
         }
     }
-
-    /* Functions above can change dc->pc, so re-align db->pc_next */
-    dc->base.pc_next = dc->pc;
 }
 
 static void arm_tr_disas_log(const DisasContextBase *dcbase, CPUState *cpu)
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

The offset is variable depending on the instruction set, whereas
we have stored values for the current pc and the next pc.  Passing
in the actual value is clearer in intent.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190807045335.1361-8-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate-a64.c     | 25 ++++++++++++++-----------
 target/arm/translate-vfp.inc.c |  6 +++---
 target/arm/translate.c         | 31 ++++++++++++++++---------------
 3 files changed, 33 insertions(+), 29 deletions(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void gen_exception_internal_insn(DisasContext *s, int offset, int excp)
     s->base.is_jmp = DISAS_NORETURN;
 }
 
-static void gen_exception_insn(DisasContext *s, int offset, int excp,
+static void gen_exception_insn(DisasContext *s, uint64_t pc, int excp,
                                uint32_t syndrome, uint32_t target_el)
 {
-    gen_a64_set_pc_im(s->base.pc_next - offset);
+    gen_a64_set_pc_im(pc);
     gen_exception(excp, syndrome, target_el);
     s->base.is_jmp = DISAS_NORETURN;
 }
@@ -XXX,XX +XXX,XX @@ static inline void gen_goto_tb(DisasContext *s, int n, uint64_t dest)
 void unallocated_encoding(DisasContext *s)
 {
     /* Unallocated and reserved encodings are uncategorized */
-    gen_exception_insn(s, 4, EXCP_UDEF, syn_uncategorized(),
+    gen_exception_insn(s, s->pc_curr, EXCP_UDEF, syn_uncategorized(),
                        default_exception_el(s));
 }
 
@@ -XXX,XX +XXX,XX @@ static inline bool fp_access_check(DisasContext *s)
         return true;
     }
 
-    gen_exception_insn(s, 4, EXCP_UDEF, syn_fp_access_trap(1, 0xe, false),
-                       s->fp_excp_el);
+    gen_exception_insn(s, s->pc_curr, EXCP_UDEF,
+                       syn_fp_access_trap(1, 0xe, false), s->fp_excp_el);
     return false;
 }
 
@@ -XXX,XX +XXX,XX @@ static inline bool fp_access_check(DisasContext *s)
 bool sve_access_check(DisasContext *s)
 {
     if (s->sve_excp_el) {
-        gen_exception_insn(s, 4, EXCP_UDEF, syn_sve_access_trap(),
+        gen_exception_insn(s, s->pc_curr, EXCP_UDEF, syn_sve_access_trap(),
                            s->sve_excp_el);
         return false;
     }
@@ -XXX,XX +XXX,XX @@ static void disas_exc(DisasContext *s, uint32_t insn)
         switch (op2_ll) {
         case 1:                                                     /* SVC */
             gen_ss_advance(s);
-            gen_exception_insn(s, 0, EXCP_SWI, syn_aa64_svc(imm16),
-                               default_exception_el(s));
+            gen_exception_insn(s, s->base.pc_next, EXCP_SWI,
+                               syn_aa64_svc(imm16), default_exception_el(s));
             break;
         case 2:                                                     /* HVC */
             if (s->current_el == 0) {
@@ -XXX,XX +XXX,XX @@ static void disas_exc(DisasContext *s, uint32_t insn)
             gen_a64_set_pc_im(s->pc_curr);
             gen_helper_pre_hvc(cpu_env);
             gen_ss_advance(s);
-            gen_exception_insn(s, 0, EXCP_HVC, syn_aa64_hvc(imm16), 2);
+            gen_exception_insn(s, s->base.pc_next, EXCP_HVC,
+                               syn_aa64_hvc(imm16), 2);
             break;
         case 3:                                                     /* SMC */
             if (s->current_el == 0) {
@@ -XXX,XX +XXX,XX @@ static void disas_exc(DisasContext *s, uint32_t insn)
             gen_helper_pre_smc(cpu_env, tmp);
             tcg_temp_free_i32(tmp);
             gen_ss_advance(s);
-            gen_exception_insn(s, 0, EXCP_SMC, syn_aa64_smc(imm16), 3);
+            gen_exception_insn(s, s->base.pc_next, EXCP_SMC,
+                               syn_aa64_smc(imm16), 3);
             break;
         default:
             unallocated_encoding(s);
@@ -XXX,XX +XXX,XX @@ static void disas_a64_insn(CPUARMState *env, DisasContext *s)
             if (s->btype != 0
                 && s->guarded_page
                 && !btype_destination_ok(insn, s->bt, s->btype)) {
-                gen_exception_insn(s, 4, EXCP_UDEF, syn_btitrap(s->btype),
+                gen_exception_insn(s, s->pc_curr, EXCP_UDEF,
+                                   syn_btitrap(s->btype),
                                    default_exception_el(s));
                 return;
             }
diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-vfp.inc.c
+++ b/target/arm/translate-vfp.inc.c
@@ -XXX,XX +XXX,XX @@ static bool full_vfp_access_check(DisasContext *s, bool ignore_vfp_enabled)
 {
     if (s->fp_excp_el) {
         if (arm_dc_feature(s, ARM_FEATURE_M)) {
-            gen_exception_insn(s, 4, EXCP_NOCP, syn_uncategorized(),
+            gen_exception_insn(s, s->pc_curr, EXCP_NOCP, syn_uncategorized(),
                                s->fp_excp_el);
         } else {
-            gen_exception_insn(s, 4, EXCP_UDEF,
+            gen_exception_insn(s, s->pc_curr, EXCP_UDEF,
                                syn_fp_access_trap(1, 0xe, false),
                                s->fp_excp_el);
         }
@@ -XXX,XX +XXX,XX @@ static bool full_vfp_access_check(DisasContext *s, bool ignore_vfp_enabled)
 
     if (!s->vfp_enabled && !ignore_vfp_enabled) {
         assert(!arm_dc_feature(s, ARM_FEATURE_M));
-        gen_exception_insn(s, 4, EXCP_UDEF, syn_uncategorized(),
+        gen_exception_insn(s, s->pc_curr, EXCP_UDEF, syn_uncategorized(),
                            default_exception_el(s));
         return false;
     }
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_exception_internal_insn(DisasContext *s, int offset, int excp)
     s->base.is_jmp = DISAS_NORETURN;
 }
 
-static void gen_exception_insn(DisasContext *s, int offset, int excp,
+static void gen_exception_insn(DisasContext *s, uint32_t pc, int excp,
                                int syn, uint32_t target_el)
 {
     gen_set_condexec(s);
-    gen_set_pc_im(s, s->base.pc_next - offset);
+    gen_set_pc_im(s, pc);
     gen_exception(excp, syn, target_el);
     s->base.is_jmp = DISAS_NORETURN;
 }
@@ -XXX,XX +XXX,XX @@ static inline void gen_hlt(DisasContext *s, int imm)
         return;
     }
 
-    gen_exception_insn(s, s->thumb ? 2 : 4, EXCP_UDEF, syn_uncategorized(),
+    gen_exception_insn(s, s->pc_curr, EXCP_UDEF, syn_uncategorized(),
                        default_exception_el(s));
 }
 
@@ -XXX,XX +XXX,XX @@ static bool msr_banked_access_decode(DisasContext *s, int r, int sysm, int rn,
 
 undef:
     /* If we get here then some access check did not pass */
-    gen_exception_insn(s, 4, EXCP_UDEF, syn_uncategorized(), exc_target);
+    gen_exception_insn(s, s->pc_curr, EXCP_UDEF,
+                       syn_uncategorized(), exc_target);
     return false;
 }
 
@@ -XXX,XX +XXX,XX @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
      * for attempts to execute invalid vfp/neon encodings with FP disabled.
      */
     if (s->fp_excp_el) {
-        gen_exception_insn(s, 4, EXCP_UDEF,
+        gen_exception_insn(s, s->pc_curr, EXCP_UDEF,
                            syn_simd_access_trap(1, 0xe, false), s->fp_excp_el);
         return 0;
     }
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
      * for attempts to execute invalid vfp/neon encodings with FP disabled.
      */
     if (s->fp_excp_el) {
-        gen_exception_insn(s, 4, EXCP_UDEF,
+        gen_exception_insn(s, s->pc_curr, EXCP_UDEF,
                            syn_simd_access_trap(1, 0xe, false), s->fp_excp_el);
         return 0;
     }
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
     }
 
     if (s->fp_excp_el) {
-        gen_exception_insn(s, 4, EXCP_UDEF,
+        gen_exception_insn(s, s->pc_curr, EXCP_UDEF,
                            syn_simd_access_trap(1, 0xe, false), s->fp_excp_el);
         return 0;
     }
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
         off_rm = vfp_reg_offset(0, rm);
     }
     if (s->fp_excp_el) {
-        gen_exception_insn(s, 4, EXCP_UDEF,
+        gen_exception_insn(s, s->pc_curr, EXCP_UDEF,
                            syn_simd_access_trap(1, 0xe, false), s->fp_excp_el);
         return 0;
     }
@@ -XXX,XX +XXX,XX @@ static void gen_srs(DisasContext *s,
      * For the UNPREDICTABLE cases we choose to UNDEF.
      */
     if (s->current_el == 1 && !s->ns && mode == ARM_CPU_MODE_MON) {
-        gen_exception_insn(s, 4, EXCP_UDEF, syn_uncategorized(), 3);
+        gen_exception_insn(s, s->pc_curr, EXCP_UDEF, syn_uncategorized(), 3);
         return;
     }
 
@@ -XXX,XX +XXX,XX @@ static void gen_srs(DisasContext *s,
     }
 
     if (undef) {
-        gen_exception_insn(s, 4, EXCP_UDEF, syn_uncategorized(),
+        gen_exception_insn(s, s->pc_curr, EXCP_UDEF, syn_uncategorized(),
                            default_exception_el(s));
         return;
     }
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
      * UsageFault exception.
      */
     if (arm_dc_feature(s, ARM_FEATURE_M)) {
-        gen_exception_insn(s, 4, EXCP_INVSTATE, syn_uncategorized(),
+        gen_exception_insn(s, s->pc_curr, EXCP_INVSTATE, syn_uncategorized(),
                            default_exception_el(s));
         return;
     }
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
             break;
         default:
         illegal_op:
-            gen_exception_insn(s, 4, EXCP_UDEF, syn_uncategorized(),
+            gen_exception_insn(s, s->pc_curr, EXCP_UDEF, syn_uncategorized(),
                                default_exception_el(s));
             break;
         }
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
             }
 
             /* All other insns: NOCP */
-            gen_exception_insn(s, 4, EXCP_NOCP, syn_uncategorized(),
+            gen_exception_insn(s, s->pc_curr, EXCP_NOCP, syn_uncategorized(),
                                default_exception_el(s));
             break;
         }
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
     }
     return;
 illegal_op:
-    gen_exception_insn(s, 4, EXCP_UDEF, syn_uncategorized(),
+    gen_exception_insn(s, s->pc_curr, EXCP_UDEF, syn_uncategorized(),
                        default_exception_el(s));
 }
 
@@ -XXX,XX +XXX,XX @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
     return;
 illegal_op:
 undef:
-    gen_exception_insn(s, 2, EXCP_UDEF, syn_uncategorized(),
+    gen_exception_insn(s, s->pc_curr, EXCP_UDEF, syn_uncategorized(),
                        default_exception_el(s));
 }
 
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

The offset is variable depending on the instruction set.
Passing in the actual value is clearer in intent.
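
As a rough illustration of the point above, here is a minimal sketch
using a hypothetical ToyContext (not QEMU's DisasContext; all names
below are invented for the example). With an offset, the caller has to
know the instruction length; with an explicit PC, it simply names the
address it wants.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the translator context. */
typedef struct ToyContext {
    uint32_t pc_curr;   /* address of the insn being translated */
    uint32_t pc_next;   /* address of the following insn */
    bool thumb16;       /* true for a 16-bit Thumb insn */
} ToyContext;

/* Build a context for one insn: 2 bytes for Thumb16, else 4 bytes. */
ToyContext toy_context(uint32_t pc, bool thumb16)
{
    ToyContext s = { pc, pc + (thumb16 ? 2u : 4u), thumb16 };
    return s;
}

/* Old style: caller says how far to step back from pc_next. */
uint32_t exception_pc_from_offset(const ToyContext *s, int offset)
{
    assert(offset >= 0);
    return s->pc_next - (uint32_t)offset;
}

/* New style: caller passes the address itself. */
uint32_t exception_pc_explicit(uint32_t pc)
{
    return pc;
}
```

In this model, an offset of 4 applied to a 16-bit Thumb insn lands
before the insn, while passing pc_curr or pc_next cannot go wrong in
that way.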

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190807045335.1361-9-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate-a64.c | 8 ++++----
 target/arm/translate.c     | 8 ++++----
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void gen_exception_internal(int excp)
     tcg_temp_free_i32(tcg_excp);
 }
 
-static void gen_exception_internal_insn(DisasContext *s, int offset, int excp)
+static void gen_exception_internal_insn(DisasContext *s, uint64_t pc, int excp)
 {
-    gen_a64_set_pc_im(s->base.pc_next - offset);
+    gen_a64_set_pc_im(pc);
     gen_exception_internal(excp);
     s->base.is_jmp = DISAS_NORETURN;
 }
@@ -XXX,XX +XXX,XX @@ static void disas_exc(DisasContext *s, uint32_t insn)
                 break;
             }
 #endif
-            gen_exception_internal_insn(s, 0, EXCP_SEMIHOST);
+            gen_exception_internal_insn(s, s->base.pc_next, EXCP_SEMIHOST);
         } else {
             unsupported_encoding(s, insn);
         }
@@ -XXX,XX +XXX,XX @@ static bool aarch64_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
         /* End the TB early; it likely won't be executed */
         dc->base.is_jmp = DISAS_TOO_MANY;
     } else {
-        gen_exception_internal_insn(dc, 0, EXCP_DEBUG);
+        gen_exception_internal_insn(dc, dc->base.pc_next, EXCP_DEBUG);
         /* The address covered by the breakpoint must be
            included in [tb->pc, tb->pc + tb->size) in order
            to for it to be properly cleared -- thus we
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static inline void gen_smc(DisasContext *s)
     s->base.is_jmp = DISAS_SMC;
 }
 
-static void gen_exception_internal_insn(DisasContext *s, int offset, int excp)
+static void gen_exception_internal_insn(DisasContext *s, uint32_t pc, int excp)
 {
     gen_set_condexec(s);
-    gen_set_pc_im(s, s->base.pc_next - offset);
+    gen_set_pc_im(s, pc);
     gen_exception_internal(excp);
     s->base.is_jmp = DISAS_NORETURN;
 }
@@ -XXX,XX +XXX,XX @@ static inline void gen_hlt(DisasContext *s, int imm)
         s->current_el != 0 &&
 #endif
         (imm == (s->thumb ? 0x3c : 0xf000))) {
-        gen_exception_internal_insn(s, 0, EXCP_SEMIHOST);
+        gen_exception_internal_insn(s, s->base.pc_next, EXCP_SEMIHOST);
         return;
     }
 
@@ -XXX,XX +XXX,XX @@ static bool arm_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
         /* End the TB early; it's likely not going to be executed */
         dc->base.is_jmp = DISAS_TOO_MANY;
     } else {
-        gen_exception_internal_insn(dc, 0, EXCP_DEBUG);
+        gen_exception_internal_insn(dc, dc->base.pc_next, EXCP_DEBUG);
         /* The address covered by the breakpoint must be
            included in [tb->pc, tb->pc + tb->size) in order
            to for it to be properly cleared -- thus we
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

Unlike the other more generic gen_exception{,_internal}_insn
interfaces, breakpoints always refer to the current instruction.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190807045335.1361-10-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate-a64.c | 7 +++----
 target/arm/translate.c     | 8 ++++----
 2 files changed, 7 insertions(+), 8 deletions(-)

From: Richard Henderson <richard.henderson@linaro.org>

Promote this function from aarch64 to fully general use.
Use it to unify the code sequences for generating illegal
opcode exceptions.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190807045335.1361-11-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate-a64.h     |  2 --
 target/arm/translate.h         |  2 ++
 target/arm/translate-a64.c     |  7 -------
 target/arm/translate-vfp.inc.c |  3 +--
 target/arm/translate.c         | 22 ++++++++++++----------
 5 files changed, 15 insertions(+), 21 deletions(-)

diff --git a/target/arm/translate-a64.h b/target/arm/translate-a64.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.h
+++ b/target/arm/translate-a64.h
@@ -XXX,XX +XXX,XX @@
 #ifndef TARGET_ARM_TRANSLATE_A64_H
 #define TARGET_ARM_TRANSLATE_A64_H
 
-void unallocated_encoding(DisasContext *s);
-
 #define unsupported_encoding(s, insn)                                    \
     do {                                                                 \
         qemu_log_mask(LOG_UNIMP,                                         \
diff --git a/target/arm/translate.h b/target/arm/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ typedef struct DisasCompare {
     bool value_global;
 } DisasCompare;
 
+void unallocated_encoding(DisasContext *s);
+
 /* Share the TCG temporaries common between 32 and 64 bit modes.  */
 extern TCGv_i32 cpu_NF, cpu_ZF, cpu_CF, cpu_VF;
 extern TCGv_i64 cpu_exclusive_addr;
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static inline void gen_goto_tb(DisasContext *s, int n, uint64_t dest)
     }
 }
 
-void unallocated_encoding(DisasContext *s)
-{
-    /* Unallocated and reserved encodings are uncategorized */
-    gen_exception_insn(s, s->pc_curr, EXCP_UDEF, syn_uncategorized(),
-                       default_exception_el(s));
-}
-
 static void init_tmp_a64_array(DisasContext *s)
 {
 #ifdef CONFIG_DEBUG_TCG
diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-vfp.inc.c
+++ b/target/arm/translate-vfp.inc.c
@@ -XXX,XX +XXX,XX @@ static bool full_vfp_access_check(DisasContext *s, bool ignore_vfp_enabled)
 
     if (!s->vfp_enabled && !ignore_vfp_enabled) {
         assert(!arm_dc_feature(s, ARM_FEATURE_M));
-        gen_exception_insn(s, s->pc_curr, EXCP_UDEF, syn_uncategorized(),
-                           default_exception_el(s));
+        unallocated_encoding(s);
         return false;
     }
 
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_exception_bkpt_insn(DisasContext *s, uint32_t syn)
     s->base.is_jmp = DISAS_NORETURN;
 }
 
+void unallocated_encoding(DisasContext *s)
+{
+    /* Unallocated and reserved encodings are uncategorized */
+    gen_exception_insn(s, s->pc_curr, EXCP_UDEF, syn_uncategorized(),
+                       default_exception_el(s));
+}
+
 /* Force a TB lookup after an instruction that changes the CPU state.  */
 static inline void gen_lookup_tb(DisasContext *s)
 {
@@ -XXX,XX +XXX,XX @@ static inline void gen_hlt(DisasContext *s, int imm)
         return;
     }
 
-    gen_exception_insn(s, s->pc_curr, EXCP_UDEF, syn_uncategorized(),
-                       default_exception_el(s));
+    unallocated_encoding(s);
 }
 
 static inline void gen_add_data_offset(DisasContext *s, unsigned int insn,
@@ -XXX,XX +XXX,XX @@ static void gen_srs(DisasContext *s,
     }
 
     if (undef) {
-        gen_exception_insn(s, s->pc_curr, EXCP_UDEF, syn_uncategorized(),
-                           default_exception_el(s));
+        unallocated_encoding(s);
         return;
     }
 
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
             break;
         default:
         illegal_op:
-            gen_exception_insn(s, s->pc_curr, EXCP_UDEF, syn_uncategorized(),
-                               default_exception_el(s));
+            unallocated_encoding(s);
             break;
         }
     }
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
     }
     return;
 illegal_op:
-    gen_exception_insn(s, s->pc_curr, EXCP_UDEF, syn_uncategorized(),
-                       default_exception_el(s));
+    unallocated_encoding(s);
 }
 
 static void disas_thumb_insn(DisasContext *s, uint32_t insn)
@@ -XXX,XX +XXX,XX @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
     return;
 illegal_op:
 undef:
-    gen_exception_insn(s, s->pc_curr, EXCP_UDEF, syn_uncategorized(),
-                       default_exception_el(s));
+    unallocated_encoding(s);
 }
 
 static bool insn_crosses_page(CPUARMState *env, DisasContext *s)
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

Replace x = double_saturate(y) with x = add_saturate(y, y).
There is no need for a separate more specialized helper.
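
The equivalence can be checked with a simplified host-side model. This
is only a sketch: the function names mirror the commit message, but the
bodies are written for illustration and leave out any saturation-flag
tracking that the real helpers perform.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp a 64-bit intermediate result to the signed 32-bit range. */
static int32_t clamp_to_int32(int64_t v)
{
    if (v > INT32_MAX) {
        return INT32_MAX;
    }
    if (v < INT32_MIN) {
        return INT32_MIN;
    }
    return (int32_t)v;
}

/* Signed 32-bit saturating add (model only). */
int32_t add_saturate(int32_t a, int32_t b)
{
    return clamp_to_int32((int64_t)a + (int64_t)b);
}

/* Signed 32-bit saturating doubling (model only). */
int32_t double_saturate(int32_t y)
{
    return clamp_to_int32(2 * (int64_t)y);
}

/* Assert that doubling and self-addition agree for one input. */
void check_doubling_matches(int32_t y)
{
    assert(double_saturate(y) == add_saturate(y, y));
}
```

Since y + y and 2 * y are the same 64-bit value before clamping, the
two functions agree for every input, including the saturating cases.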

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190807045335.1361-12-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.h    |  1 -
 target/arm/op_helper.c | 15 ---------------
 target/arm/translate.c |  4 ++--
 3 files changed, 2 insertions(+), 18 deletions(-)

From: Andrew Jones <drjones@redhat.com>

If -cpu <cpu>,aarch64=off is used then KVM must also be used, and both
KVM and the host must support running the vcpu in 32-bit mode. If
-cpu <cpu>,aarch64=on is used, then it doesn't matter whether KVM is
enabled or not.
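
The rule can be summarised by the following sketch. The function and
parameter names are hypothetical and stand in for the real property
setter and its KVM queries; only the decision logic is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Decide whether the requested 'aarch64' value is acceptable.
 * On rejection, *errp receives a message and false is returned.
 */
bool toy_check_aarch64(bool value, bool kvm_on, bool el1_32bit_ok,
                       const char **errp)
{
    assert(errp != NULL);
    *errp = NULL;

    /* Turning the feature on is always fine, with or without KVM. */
    if (value) {
        return true;
    }

    /* Turning it off needs KVM and 32-bit EL1 support. */
    if (!kvm_on || !el1_32bit_ok) {
        *errp = "'aarch64' feature cannot be disabled unless KVM is "
                "enabled and 32-bit EL1 is supported";
        return false;
    }
    return true;
}
```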

Signed-off-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/kvm_arm.h | 14 ++++++++++++++
 target/arm/cpu64.c   | 12 ++++++------
 target/arm/kvm64.c   |  9 +++++++++
 3 files changed, 29 insertions(+), 6 deletions(-)

diff --git a/target/arm/kvm_arm.h b/target/arm/kvm_arm.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/kvm_arm.h
+++ b/target/arm/kvm_arm.h
@@ -XXX,XX +XXX,XX @@ bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf);
  */
 void kvm_arm_set_cpu_features_from_host(ARMCPU *cpu);
 
+/**
+ * kvm_arm_aarch32_supported:
+ * @cs: CPUState
+ *
+ * Returns: true if the KVM VCPU can enable AArch32 mode
+ * and false otherwise.
+ */
+bool kvm_arm_aarch32_supported(CPUState *cs);
+
 /**
  * kvm_arm_get_max_vm_ipa_size - Returns the number of bits in the
  * IPA address space supported by KVM
@@ -XXX,XX +XXX,XX @@ static inline void kvm_arm_set_cpu_features_from_host(ARMCPU *cpu)
     cpu->host_cpu_probe_failed = true;
 }
 
+static inline bool kvm_arm_aarch32_supported(CPUState *cs)
+{
+    return false;
+}
+
 static inline int kvm_arm_get_max_vm_ipa_size(MachineState *ms)
 {
     return -ENOENT;
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_cpu_set_aarch64(Object *obj, bool value, Error **errp)
      * restriction allows us to avoid fixing up functionality that assumes a
      * uniform execution state like do_interrupt.
      */
-    if (!kvm_enabled()) {
-        error_setg(errp, "'aarch64' feature cannot be disabled "
-                         "unless KVM is enabled");
-        return;
-    }
-
     if (value == false) {
+        if (!kvm_enabled() || !kvm_arm_aarch32_supported(CPU(cpu))) {
+            error_setg(errp, "'aarch64' feature cannot be disabled "
+                             "unless KVM is enabled and 32-bit EL1 "
+                             "is supported");
+            return;
+        }
         unset_feature(&cpu->env, ARM_FEATURE_AARCH64);
     } else {
         set_feature(&cpu->env, ARM_FEATURE_AARCH64);
diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/kvm64.c
+++ b/target/arm/kvm64.c
@@ -XXX,XX +XXX,XX @@
 #include "exec/gdbstub.h"
 #include "sysemu/sysemu.h"
 #include "sysemu/kvm.h"
+#include "sysemu/kvm_int.h"
 #include "kvm_arm.h"
+#include "hw/boards.h"
 #include "internals.h"
 
 static bool have_guest_debug;
@@ -XXX,XX +XXX,XX @@ bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
     return true;
 }
 
+bool kvm_arm_aarch32_supported(CPUState *cpu)
+{
+    KVMState *s = KVM_STATE(current_machine->accelerator);
+
+    return kvm_check_extension(s, KVM_CAP_ARM_EL1_32BIT);
+}
+
 #define ARM_CPU_ID_MPIDR       3, 0, 0, 0, 5
 
 int kvm_arch_init_vcpu(CPUState *cs)
-- 
2.20.1

From: Andrew Jones <drjones@redhat.com>

We first convert the pmu property from a static property to one with
its own accessors. Then we use the set accessor to check if the PMU is
supported when using KVM. Indeed a 32-bit KVM host does not support
the PMU, so this check will catch an attempt to use it at property-set
time.
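
A minimal sketch of why a set accessor helps, using a hypothetical
ToyCpu type in place of ARMCPU and plain booleans in place of the KVM
queries: the setter can refuse a value up front and leave the stored
state untouched, which a plain static boolean property cannot do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the CPU object. */
typedef struct ToyCpu {
    bool has_pmu;
} ToyCpu;

/* Get accessor: report the stored value. */
bool toy_get_pmu(const ToyCpu *cpu)
{
    assert(cpu != NULL);
    return cpu->has_pmu;
}

/*
 * Set accessor: returns true on success. When KVM is in use but
 * cannot provide a PMU, enabling is rejected and nothing changes.
 */
bool toy_set_pmu(ToyCpu *cpu, bool value, bool kvm_on, bool kvm_pmu_ok)
{
    assert(cpu != NULL);
    if (value && kvm_on && !kvm_pmu_ok) {
        return false;
    }
    cpu->has_pmu = value;
    return true;
}
```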

Signed-off-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/kvm_arm.h | 14 ++++++++++++++
 target/arm/cpu.c     | 30 +++++++++++++++++++++++++-----
 target/arm/kvm.c     |  7 +++++++
 3 files changed, 46 insertions(+), 5 deletions(-)

diff --git a/target/arm/kvm_arm.h b/target/arm/kvm_arm.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/kvm_arm.h
+++ b/target/arm/kvm_arm.h
@@ -XXX,XX +XXX,XX @@ void kvm_arm_set_cpu_features_from_host(ARMCPU *cpu);
  */
 bool kvm_arm_aarch32_supported(CPUState *cs);
 
+/**
+ * kvm_arm_pmu_supported:
+ * @cs: CPUState
+ *
+ * Returns: true if the KVM VCPU can enable its PMU
+ * and false otherwise.
+ */
+bool kvm_arm_pmu_supported(CPUState *cs);
+
 /**
  * kvm_arm_get_max_vm_ipa_size - Returns the number of bits in the
  * IPA address space supported by KVM
@@ -XXX,XX +XXX,XX @@ static inline bool kvm_arm_aarch32_supported(CPUState *cs)
     return false;
 }
 
+static inline bool kvm_arm_pmu_supported(CPUState *cs)
+{
+    return false;
+}
+
 static inline int kvm_arm_get_max_vm_ipa_size(MachineState *ms)
 {
     return -ENOENT;
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static Property arm_cpu_has_el3_property =
 static Property arm_cpu_cfgend_property =
             DEFINE_PROP_BOOL("cfgend", ARMCPU, cfgend, false);
 
-/* use property name "pmu" to match other archs and virt tools */
-static Property arm_cpu_has_pmu_property =
-            DEFINE_PROP_BOOL("pmu", ARMCPU, has_pmu, true);
-
 static Property arm_cpu_has_vfp_property =
             DEFINE_PROP_BOOL("vfp", ARMCPU, has_vfp, true);
 
@@ -XXX,XX +XXX,XX @@ static Property arm_cpu_pmsav7_dregion_property =
                                            pmsav7_dregion,
                                            qdev_prop_uint32, uint32_t);
 
+static bool arm_get_pmu(Object *obj, Error **errp)
+{
+    ARMCPU *cpu = ARM_CPU(obj);
+
+    return cpu->has_pmu;
+}
+
+static void arm_set_pmu(Object *obj, bool value, Error **errp)
+{
+    ARMCPU *cpu = ARM_CPU(obj);
+
+    if (value) {
+        if (kvm_enabled() && !kvm_arm_pmu_supported(CPU(cpu))) {
+            error_setg(errp, "'pmu' feature not supported by KVM on this host");
+            return;
+        }
+        set_feature(&cpu->env, ARM_FEATURE_PMU);
+    } else {
+        unset_feature(&cpu->env, ARM_FEATURE_PMU);
+    }
+    cpu->has_pmu = value;
+}
+
 static void arm_get_init_svtor(Object *obj, Visitor *v, const char *name,
                                void *opaque, Error **errp)
 {
@@ -XXX,XX +XXX,XX @@ void arm_cpu_post_init(Object *obj)
     }
 
     if (arm_feature(&cpu->env, ARM_FEATURE_PMU)) {
-        qdev_property_add_static(DEVICE(obj), &arm_cpu_has_pmu_property,
+        cpu->has_pmu = true;
+        object_property_add_bool(obj, "pmu", arm_get_pmu, arm_set_pmu,
                                  &error_abort);
     }
 
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -XXX,XX +XXX,XX @@ void kvm_arm_set_cpu_features_from_host(ARMCPU *cpu)
     env->features = arm_host_cpu_features.features;
 }
 
+bool kvm_arm_pmu_supported(CPUState *cpu)
+{
+    KVMState *s = KVM_STATE(current_machine->accelerator);
+
+    return kvm_check_extension(s, KVM_CAP_ARM_PMU_V3);
+}
+
 int kvm_arm_get_max_vm_ipa_size(MachineState *ms)
 {
     KVMState *s = KVM_STATE(ms->accelerator);
-- 
2.20.1

From: Andrew Jones <drjones@redhat.com>

The current implementation of ZCR_ELx matches the architecture, only
implementing the lower four bits, with the rest RAZ/WI. This puts
a strict limit on ARM_MAX_VQ of 16. Make sure we don't let ARM_MAX_VQ
grow without a corresponding update here.
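
The relationship between the 4-bit field and the limit of 16 can be
sketched as below. TOY_MAX_VQ and the toy_* functions are hypothetical
names for this example; the standard C _Static_assert plays the role
that QEMU_BUILD_BUG_ON plays in the patch.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_VQ 16   /* hypothetical stand-in for ARM_MAX_VQ */

/* A 4-bit LEN field holds VQ - 1, so at most 16 is representable. */
_Static_assert(TOY_MAX_VQ <= 16, "LEN field is only 4 bits wide");

/* Model of the register write: bits other than [3:0] are dropped. */
uint64_t toy_zcr_write(uint64_t value)
{
    return value & 0xf;
}

/* Number of 128-bit quadwords selected by a stored register value. */
unsigned toy_zcr_vq(uint64_t zcr)
{
    unsigned vq = (unsigned)(zcr & 0xf) + 1;

    assert(vq <= TOY_MAX_VQ);
    return vq;
}
```

Raising TOY_MAX_VQ above 16 in this sketch stops the build, which is
the intent of the added check.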

Suggested-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void zcr_write(CPUARMState *env, const ARMCPRegInfo *ri,
     int new_len;
 
     /* Bits other than [3:0] are RAZ/WI.  */
+    QEMU_BUILD_BUG_ON(ARM_MAX_VQ > 16);
     raw_write(env, ri, value & 0xf);
 
     /*
-- 
2.20.1

From: Andrew Jones <drjones@redhat.com>

A couple of return statements returned EINVAL where -EINVAL was
intended; add the missing '-' signs.
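
To show why the sign matters, here is a small sketch. It assumes a
caller that treats only negative return values as errors; that caller
and the toy_* functions are hypothetical and not taken from QEMU.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Turn a positive errno constant into a negative error return. */
static int negative_errno(int err)
{
    assert(err > 0);
    return -err;
}

/* Buggy variant: failure is reported as a positive number. */
int toy_put_registers_buggy(bool ok)
{
    return ok ? 0 : EINVAL;
}

/* Fixed variant: failure is reported as a negative errno. */
int toy_put_registers_fixed(bool ok)
{
    return ok ? 0 : negative_errno(EINVAL);
}

/* Hypothetical caller that only checks for negative values. */
bool toy_caller_sees_failure(int ret)
{
    return ret < 0;
}
```

With such a caller, the buggy variant's failure goes unnoticed, while
the fixed variant's is caught.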

Signed-off-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/kvm64.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/kvm64.c
+++ b/target/arm/kvm64.c
@@ -XXX,XX +XXX,XX @@ int kvm_arch_put_registers(CPUState *cs, int level)
     write_cpustate_to_list(cpu, true);
 
     if (!write_list_to_kvmstate(cpu, level)) {
-        return EINVAL;
+        return -EINVAL;
     }
 
     kvm_arm_sync_mpstate_to_kvm(cpu);
@@ -XXX,XX +XXX,XX @@ int kvm_arch_get_registers(CPUState *cs)
     }
 
     if (!write_kvmstate_to_list(cpu)) {
-        return EINVAL;
+        return -EINVAL;
     }
     /* Note that it's OK to have registers which aren't in CPUState,
      * so we can ignore a failure return here.
-- 
2.20.1

From: Andrew Jones <drjones@redhat.com>

Move the getting/putting of the fpsimd registers out of
kvm_arch_get/put_registers() into their own helper functions
to prepare for alternatively getting/putting SVE registers.

No functional change.
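
The per-register step that the new helpers wrap can be sketched in
isolation. This is an illustration only: toy_pack_qreg is a
hypothetical name, and a runtime flag replaces the compile-time
HOST_WORDS_BIGENDIAN check so that both paths can be exercised.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Copy one 128-bit register, held as two 64-bit halves, into the
 * order handed to the kernel. On a big-endian host the halves are
 * swapped; on a little-endian host they are copied unchanged.
 */
void toy_pack_qreg(const uint64_t q[2], uint64_t out[2],
                   bool host_big_endian)
{
    assert(q != NULL && out != NULL && q != out);
    if (host_big_endian) {
        out[0] = q[1];
        out[1] = q[0];
    } else {
        out[0] = q[0];
        out[1] = q[1];
    }
}
```

The swap is its own inverse, so the same step serves both the put and
the get direction.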

Signed-off-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/kvm64.c | 148 +++++++++++++++++++++++++++------------------
 1 file changed, 88 insertions(+), 60 deletions(-)

diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/kvm64.c
+++ b/target/arm/kvm64.c
@@ -XXX,XX +XXX,XX @@ int kvm_arm_cpreg_level(uint64_t regidx)
 #define AARCH64_SIMD_CTRL_REG(x)   (KVM_REG_ARM64 | KVM_REG_SIZE_U32 | \
                  KVM_REG_ARM_CORE | KVM_REG_ARM_CORE_REG(x))
 
+static int kvm_arch_put_fpsimd(CPUState *cs)
+{
+    ARMCPU *cpu = ARM_CPU(cs);
+    CPUARMState *env = &cpu->env;
+    struct kvm_one_reg reg;
+    uint32_t fpr;
+    int i, ret;
+
+    for (i = 0; i < 32; i++) {
+        uint64_t *q = aa64_vfp_qreg(env, i);
+#ifdef HOST_WORDS_BIGENDIAN
+        uint64_t fp_val[2] = { q[1], q[0] };
+        reg.addr = (uintptr_t)fp_val;
+#else
+        reg.addr = (uintptr_t)q;
+#endif
+        reg.id = AARCH64_SIMD_CORE_REG(fp_regs.vregs[i]);
+        ret = kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, &reg);
+        if (ret) {
+            return ret;
+        }
+    }
+
+    reg.addr = (uintptr_t)(&fpr);
+    fpr = vfp_get_fpsr(env);
+    reg.id = AARCH64_SIMD_CTRL_REG(fp_regs.fpsr);
+    ret = kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, &reg);
+    if (ret) {
+        return ret;
+    }
+
+    reg.addr = (uintptr_t)(&fpr);
+    fpr = vfp_get_fpcr(env);
+    reg.id = AARCH64_SIMD_CTRL_REG(fp_regs.fpcr);
+    ret = kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, &reg);
+    if (ret) {
+        return ret;
+    }
+
+    return 0;
+}
+
 int kvm_arch_put_registers(CPUState *cs, int level)
 {
     struct kvm_one_reg reg;
-    uint32_t fpr;
     uint64_t val;
-    int i;
-    int ret;
+    int i, ret;
     unsigned int el;
 
     ARMCPU *cpu = ARM_CPU(cs);
@@ -XXX,XX +XXX,XX @@ int kvm_arch_put_registers(CPUState *cs, int level)
         }
     }
 
-    /* Advanced SIMD and FP registers. */
-    for (i = 0; i < 32; i++) {
-        uint64_t *q = aa64_vfp_qreg(env, i);
-#ifdef HOST_WORDS_BIGENDIAN
-        uint64_t fp_val[2] = { q[1], q[0] };
-        reg.addr = (uintptr_t)fp_val;
-#else
-        reg.addr = (uintptr_t)q;
-#endif
-        reg.id = AARCH64_SIMD_CORE_REG(fp_regs.vregs[i]);
-        ret = kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, &reg);
-        if (ret) {
-            return ret;
-        }
-    }
-
-    reg.addr = (uintptr_t)(&fpr);
-    fpr = vfp_get_fpsr(env);
-    reg.id = AARCH64_SIMD_CTRL_REG(fp_regs.fpsr);
-    ret = kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, &reg);
-    if (ret) {
-        return ret;
-    }
-
-    fpr = vfp_get_fpcr(env);
-    reg.id = AARCH64_SIMD_CTRL_REG(fp_regs.fpcr);
-    ret = kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, &reg);
+    ret = kvm_arch_put_fpsimd(cs);
     if (ret) {
         return ret;
     }
@@ -XXX,XX +XXX,XX @@ int kvm_arch_put_registers(CPUState *cs, int level)
     return ret;
 }
 
+static int kvm_arch_get_fpsimd(CPUState *cs)
+{
+    ARMCPU *cpu = ARM_CPU(cs);
+    CPUARMState *env = &cpu->env;
+    struct kvm_one_reg reg;
+    uint32_t fpr;
+    int i, ret;
+
+    for (i = 0; i < 32; i++) {
+        uint64_t *q = aa64_vfp_qreg(env, i);
+        reg.id = AARCH64_SIMD_CORE_REG(fp_regs.vregs[i]);
+        reg.addr = (uintptr_t)q;
+        ret = kvm_vcpu_ioctl(cs, KVM_GET_ONE_REG, &reg);
+        if (ret) {
+            return ret;
+        } else {
+#ifdef HOST_WORDS_BIGENDIAN
+            uint64_t t;
+            t = q[0], q[0] = q[1], q[1] = t;
+#endif
+        }
+    }
+
+    reg.addr = (uintptr_t)(&fpr);
+    reg.id = AARCH64_SIMD_CTRL_REG(fp_regs.fpsr);
+    ret = kvm_vcpu_ioctl(cs, KVM_GET_ONE_REG, &reg);
+    if (ret) {
+        return ret;
+    }
+    vfp_set_fpsr(env, fpr);
+
+    reg.addr = (uintptr_t)(&fpr);
+    reg.id = AARCH64_SIMD_CTRL_REG(fp_regs.fpcr);
+    ret = kvm_vcpu_ioctl(cs, KVM_GET_ONE_REG, &reg);
+    if (ret) {
+        return ret;
+    }
+    vfp_set_fpcr(env, fpr);
+
+    return 0;
+}
+
 int kvm_arch_get_registers(CPUState *cs)
 {
     struct kvm_one_reg reg;
     uint64_t val;
-    uint32_t fpr;
     unsigned int el;
-    int i;
-    int ret;
+    int i, ret;
 
     ARMCPU *cpu = ARM_CPU(cs);
     CPUARMState *env = &cpu->env;
@@ -XXX,XX +XXX,XX @@ int kvm_arch_get_registers(CPUState *cs)
         env->spsr = env->banked_spsr[i];
     }
 
-    /* Advanced SIMD and FP registers */
-    for (i = 0; i < 32; i++) {
-        uint64_t *q = aa64_vfp_qreg(env, i);
-        reg.id = AARCH64_SIMD_CORE_REG(fp_regs.vregs[i]);
-        reg.addr = (uintptr_t)q;
-        ret = kvm_vcpu_ioctl(cs, KVM_GET_ONE_REG, &reg);
-        if (ret) {
-            return ret;
-        } else {
-#ifdef HOST_WORDS_BIGENDIAN
-            uint64_t t;
-            t = q[0], q[0] = q[1], q[1] = t;
-#endif
-        }
-    }
-
-    reg.addr = (uintptr_t)(&fpr);
-    reg.id = AARCH64_SIMD_CTRL_REG(fp_regs.fpsr);
-    ret = kvm_vcpu_ioctl(cs, KVM_GET_ONE_REG, &reg);
+    ret = kvm_arch_get_fpsimd(cs);
     if (ret) {
         return ret;
     }
-    vfp_set_fpsr(env, fpr);
-
-    reg.id = AARCH64_SIMD_CTRL_REG(fp_regs.fpcr);
-    ret = kvm_vcpu_ioctl(cs, KVM_GET_ONE_REG, &reg);
-    if (ret) {
-        return ret;
-    }
-    vfp_set_fpcr(env, fpr);
 
     ret = kvm_get_vcpu_events(cpu);
     if (ret) {
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

Extract is a compact combination of shift + and.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190808202616.13782-2-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate.c | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)
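
Reviewer note (not part of the commit): the equivalence claimed in the
commit message can be checked on plain integers. This is a minimal
sketch; extract_bits() and carry_old() are hypothetical stand-ins for
the values the TCG ops would compute, not QEMU functions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of an unsigned bitfield extract: shift, then mask.
 * Assumes 1 <= length and start + length <= 32. */
static uint32_t extract_bits(uint32_t value, unsigned start, unsigned length)
{
    uint32_t mask = (length == 32) ? UINT32_MAX
                                   : ((UINT32_C(1) << length) - 1);
    return (value >> start) & mask;
}

/* Carry-out computed the way the removed branches did it. */
static uint32_t carry_old(uint32_t var, unsigned shift)
{
    if (shift == 0) {
        return var & 1;
    }
    uint32_t cf = var >> shift;
    if (shift != 31) {
        cf &= 1;
    }
    return cf;
}

/* Compare both forms for every legal shift amount. */
void check_extract(uint32_t var)
{
    for (unsigned shift = 0; shift < 32; shift++) {
        assert(extract_bits(var, shift, 1) == carry_old(var, shift));
    }
}
```

Calling check_extract() with a few bit patterns exercises all 32 shift
amounts, including the shift == 0 and shift == 31 special cases that the
old code handled separately.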

diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_sar(TCGv_i32 dest, TCGv_i32 t0, TCGv_i32 t1)
 
 static void shifter_out_im(TCGv_i32 var, int shift)
 {
-    if (shift == 0) {
-        tcg_gen_andi_i32(cpu_CF, var, 1);
-    } else {
-        tcg_gen_shri_i32(cpu_CF, var, shift);
-        if (shift != 31) {
-            tcg_gen_andi_i32(cpu_CF, cpu_CF, 1);
-        }
-    }
+    tcg_gen_extract_i32(cpu_CF, var, shift, 1);
 }
 
 /* Shift by immediate.  Includes special handling for shift == 0.  */
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

Use deposit as the composite operation to merge the
bits from the two inputs.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190808202616.13782-3-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate.c | 26 ++++++++++----------------
 1 file changed, 10 insertions(+), 16 deletions(-)
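
Reviewer note (not part of the commit): a minimal sketch of how a
deposit merges the two halfwords, on plain integers. deposit_bits(),
asr32(), pkhbt() and pkhtb() are hypothetical helper names used only
for illustration; the argument order mirrors the calls in the patch
(base value first, inserted field second).

```c
#include <assert.h>
#include <stdint.h>

/* Replace bits [ofs, ofs + len) of base with the low len bits of field.
 * Assumes 1 <= len and ofs + len <= 32. */
static uint32_t deposit_bits(uint32_t base, uint32_t field,
                             unsigned ofs, unsigned len)
{
    uint32_t ones = (len == 32) ? UINT32_MAX : ((UINT32_C(1) << len) - 1);
    uint32_t mask = ones << ofs;
    return (base & ~mask) | ((field << ofs) & mask);
}

/* Portable arithmetic shift right for 0 <= shift <= 31. */
static uint32_t asr32(uint32_t x, unsigned shift)
{
    uint32_t sign_fill = (x & UINT32_C(0x80000000))
                         ? ~(UINT32_MAX >> shift) : 0;
    return (x >> shift) | sign_fill;
}

/* PKHBT: low half from rn, high half from (rm << shift). */
uint32_t pkhbt(uint32_t rn, uint32_t rm, unsigned shift)
{
    return deposit_bits(rm << shift, rn, 0, 16);
}

/* PKHTB: high half from rn, low half from (rm ASR shift).
 * The caller maps an encoded shift of 0 to 31, as the patch does. */
uint32_t pkhtb(uint32_t rn, uint32_t rm, unsigned shift)
{
    return deposit_bits(rn, asr32(rm, shift), 0, 16);
}
```

In both cases a single deposit stands in for the two masking steps and
the final OR of the old sequence.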

diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
                         shift = (insn >> 7) & 0x1f;
                         if (insn & (1 << 6)) {
                             /* pkhtb */
-                            if (shift == 0)
+                            if (shift == 0) {
                                 shift = 31;
+                            }
                             tcg_gen_sari_i32(tmp2, tmp2, shift);
-                            tcg_gen_andi_i32(tmp, tmp, 0xffff0000);
-                            tcg_gen_ext16u_i32(tmp2, tmp2);
+                            tcg_gen_deposit_i32(tmp, tmp, tmp2, 0, 16);
                         } else {
                             /* pkhbt */
-                            if (shift)
-                                tcg_gen_shli_i32(tmp2, tmp2, shift);
-                            tcg_gen_ext16u_i32(tmp, tmp);
-                            tcg_gen_andi_i32(tmp2, tmp2, 0xffff0000);
+                            tcg_gen_shli_i32(tmp2, tmp2, shift);
+                            tcg_gen_deposit_i32(tmp, tmp2, tmp, 0, 16);
                         }
-                        tcg_gen_or_i32(tmp, tmp, tmp2);
                         tcg_temp_free_i32(tmp2);
                         store_reg(s, rd, tmp);
                     } else if ((insn & 0x00200020) == 0x00200000) {
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
             shift = ((insn >> 10) & 0x1c) | ((insn >> 6) & 0x3);
             if (insn & (1 << 5)) {
                 /* pkhtb */
-                if (shift == 0)
+                if (shift == 0) {
                     shift = 31;
+                }
                 tcg_gen_sari_i32(tmp2, tmp2, shift);
-                tcg_gen_andi_i32(tmp, tmp, 0xffff0000);
-                tcg_gen_ext16u_i32(tmp2, tmp2);
+                tcg_gen_deposit_i32(tmp, tmp, tmp2, 0, 16);
             } else {
                 /* pkhbt */
-                if (shift)
-                    tcg_gen_shli_i32(tmp2, tmp2, shift);
-                tcg_gen_ext16u_i32(tmp, tmp);
-                tcg_gen_andi_i32(tmp2, tmp2, 0xffff0000);
+                tcg_gen_shli_i32(tmp2, tmp2, shift);
+                tcg_gen_deposit_i32(tmp, tmp2, tmp, 0, 16);
             }
-            tcg_gen_or_i32(tmp, tmp, tmp2);
             tcg_temp_free_i32(tmp2);
             store_reg(s, rd, tmp);
         } else {
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

The immediate shift generator functions already test for,
and eliminate, the case of a shift by zero.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190808202616.13782-4-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate.c | 19 +++++++------------
 1 file changed, 7 insertions(+), 12 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
                         shift = (insn >> 10) & 3;
                         /* ??? In many cases it's not necessary to do a
                            rotate, a shift is sufficient.  */
-                        if (shift != 0)
-                            tcg_gen_rotri_i32(tmp, tmp, shift * 8);
+                        tcg_gen_rotri_i32(tmp, tmp, shift * 8);
                         op1 = (insn >> 20) & 7;
                         switch (op1) {
                         case 0: gen_sxtb16(tmp);  break;
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
             shift = (insn >> 4) & 3;
             /* ??? In many cases it's not necessary to do a
                rotate, a shift is sufficient.  */
-            if (shift != 0)
-                tcg_gen_rotri_i32(tmp, tmp, shift * 8);
+            tcg_gen_rotri_i32(tmp, tmp, shift * 8);
             op = (insn >> 20) & 7;
             switch (op) {
             case 0: gen_sxth(tmp);   break;
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
                     case 7:
                         goto illegal_op;
                     default: /* Saturate.  */
-                        if (shift) {
-                            if (op & 1)
-                                tcg_gen_sari_i32(tmp, tmp, shift);
-                            else
-                                tcg_gen_shli_i32(tmp, tmp, shift);
+                        if (op & 1) {
+                            tcg_gen_sari_i32(tmp, tmp, shift);
+                        } else {
+                            tcg_gen_shli_i32(tmp, tmp, shift);
                         }
                         tmp2 = tcg_const_i32(imm);
                         if (op & 4) {
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
                     goto illegal_op;
                 }
                 tmp = load_reg(s, rm);
-                if (shift) {
-                    tcg_gen_shli_i32(tmp, tmp, shift);
-                }
+                tcg_gen_shli_i32(tmp, tmp, shift);
                 tcg_gen_add_i32(addr, addr, tmp);
                 tcg_temp_free_i32(tmp);
                 break;
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

The helper function is more self-documenting, and it already
handles the case of rotate by zero.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190808202616.13782-5-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)
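
Reviewer note (not part of the commit): the old open-coded rotate needed
the "if (shift)" guard because a left shift by 32 is undefined in C.
The sketch below shows one rotate formulation that is well defined for a
shift of zero. ror32_sketch() and arm_expand_imm_sketch() are
hypothetical names, written here as an assumption about how such a
helper can be implemented, not a copy of the QEMU helper.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate right; masking both counts keeps every shift in range 0..31. */
static uint32_t ror32_sketch(uint32_t word, unsigned shift)
{
    return (word >> (shift & 31)) | (word << ((32 - shift) & 31));
}

/* Decode an A32 modified immediate: an 8-bit value rotated right by
 * twice the 4-bit rotation field, as in the two hunks of this patch. */
uint32_t arm_expand_imm_sketch(uint32_t insn)
{
    uint32_t val = insn & 0xff;
    unsigned shift = ((insn >> 8) & 0xf) * 2;
    return ror32_sketch(val, shift);
}
```

With the zero case handled inside the helper, the call sites no longer
need their own conditional.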

diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
                 /* CPSR = immediate */
                 val = insn & 0xff;
                 shift = ((insn >> 8) & 0xf) * 2;
-                if (shift)
-                    val = (val >> shift) | (val << (32 - shift));
+                val = ror32(val, shift);
                 i = ((insn & (1 << 22)) != 0);
                 if (gen_set_psr_im(s, msr_mask(s, (insn >> 16) & 0xf, i),
                                    i, val)) {
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
             /* immediate operand */
             val = insn & 0xff;
             shift = ((insn >> 8) & 0xf) * 2;
-            if (shift) {
-                val = (val >> shift) | (val << (32 - shift));
-            }
+            val = ror32(val, shift);
             tmp2 = tcg_temp_new_i32();
             tcg_gen_movi_i32(tmp2, val);
             if (logic_cc && shift) {
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

Rotate is the more compact and obvious way to swap 16-bit
elements of a 32-bit word.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190808202616.13782-6-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static TCGv_i64 gen_muls_i64_i32(TCGv_i32 a, TCGv_i32 b)
 /* Swap low and high halfwords.  */
 static void gen_swap_half(TCGv_i32 var)
 {
-    TCGv_i32 tmp = tcg_temp_new_i32();
-    tcg_gen_shri_i32(tmp, var, 16);
-    tcg_gen_shli_i32(var, var, 16);
-    tcg_gen_or_i32(var, var, tmp);
-    tcg_temp_free_i32(tmp);
+    tcg_gen_rotri_i32(var, var, 16);
 }
 
 /* Dual 16-bit add.  Result placed in t0 and t1 is marked as dead.
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

All of the inputs to these instructions are 32 bits wide.  Rather than
extend each input to 64 bits and then extract the high 32 bits of
the output, use tcg_gen_muls2_i32 and other 32-bit generator functions.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190808202616.13782-7-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate.c | 72 +++++++++++++++---------------------------
 1 file changed, 26 insertions(+), 46 deletions(-)
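
Reviewer note (not part of the commit): the rounding comment in the
patch can be checked on plain integers. This sketch covers only the
multiply and accumulate-by-addition form with rounding; smmlar_wide()
and smmlar_narrow() are hypothetical names, and the conversions to
uint32_t assume the usual two's complement representation.

```c
#include <assert.h>
#include <stdint.h>

/* 64-bit form: (acc << 32) + a * b + 0x80000000, then take the high word. */
uint32_t smmlar_wide(int32_t a, int32_t b, uint32_t acc)
{
    int64_t prod = (int64_t)a * b;
    uint64_t sum = ((uint64_t)acc << 32) + (uint64_t)prod
                   + UINT64_C(0x80000000);
    return (uint32_t)(sum >> 32);
}

/* 32-bit form: add the accumulator to the high word of the product,
 * then add the carry that the rounding constant would have produced,
 * which is exactly bit 31 of the low word. */
uint32_t smmlar_narrow(int32_t a, int32_t b, uint32_t acc)
{
    uint64_t prod = (uint64_t)((int64_t)a * b);
    uint32_t lo = (uint32_t)prod;
    uint32_t hi = (uint32_t)(prod >> 32);
    return hi + acc + (lo >> 31);
}
```

The accumulator only touches the high word, so the low word of the sum
is the low word of the product, and adding 0x80000000 carries out of it
precisely when its top bit is set.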

diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_revsh(TCGv_i32 var)
     tcg_gen_ext16s_i32(var, var);
 }
 
-/* Return (b << 32) + a. Mark inputs as dead */
-static TCGv_i64 gen_addq_msw(TCGv_i64 a, TCGv_i32 b)
-{
-    TCGv_i64 tmp64 = tcg_temp_new_i64();
-
-    tcg_gen_extu_i32_i64(tmp64, b);
-    tcg_temp_free_i32(b);
-    tcg_gen_shli_i64(tmp64, tmp64, 32);
-    tcg_gen_add_i64(a, tmp64, a);
-
-    tcg_temp_free_i64(tmp64);
-    return a;
-}
-
-/* Return (b << 32) - a. Mark inputs as dead. */
-static TCGv_i64 gen_subq_msw(TCGv_i64 a, TCGv_i32 b)
-{
-    TCGv_i64 tmp64 = tcg_temp_new_i64();
-
-    tcg_gen_extu_i32_i64(tmp64, b);
-    tcg_temp_free_i32(b);
-    tcg_gen_shli_i64(tmp64, tmp64, 32);
-    tcg_gen_sub_i64(a, tmp64, a);
-
-    tcg_temp_free_i64(tmp64);
-    return a;
-}
-
 /* 32x32->64 multiply.  Marks inputs as dead.  */
 static TCGv_i64 gen_mulu_i64_i32(TCGv_i32 a, TCGv_i32 b)
 {
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
                            (SMMUL, SMMLA, SMMLS) */
                         tmp = load_reg(s, rm);
                         tmp2 = load_reg(s, rs);
-                        tmp64 = gen_muls_i64_i32(tmp, tmp2);
+                        tcg_gen_muls2_i32(tmp2, tmp, tmp, tmp2);
 
                         if (rd != 15) {
-                            tmp = load_reg(s, rd);
+                            tmp3 = load_reg(s, rd);
                             if (insn & (1 << 6)) {
-                                tmp64 = gen_subq_msw(tmp64, tmp);
+                                tcg_gen_sub_i32(tmp, tmp, tmp3);
                             } else {
-                                tmp64 = gen_addq_msw(tmp64, tmp);
+                                tcg_gen_add_i32(tmp, tmp, tmp3);
                             }
+                            tcg_temp_free_i32(tmp3);
                         }
                         if (insn & (1 << 5)) {
-                            tcg_gen_addi_i64(tmp64, tmp64, 0x80000000u);
+                            /*
+                             * Adding 0x80000000 to the 64-bit quantity
+                             * means that we have carry in to the high
+                             * word when the low word has the high bit set.
+                             */
+                            tcg_gen_shri_i32(tmp2, tmp2, 31);
+                            tcg_gen_add_i32(tmp, tmp, tmp2);
                         }
-                        tcg_gen_shri_i64(tmp64, tmp64, 32);
-                        tmp = tcg_temp_new_i32();
-                        tcg_gen_extrl_i64_i32(tmp, tmp64);
-                        tcg_temp_free_i64(tmp64);
+                        tcg_temp_free_i32(tmp2);
                         store_reg(s, rn, tmp);
                         break;
                     case 0:
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
                   }
                 break;
             case 5: case 6: /* 32 * 32 -> 32msb (SMMUL, SMMLA, SMMLS) */
-                tmp64 = gen_muls_i64_i32(tmp, tmp2);
+                tcg_gen_muls2_i32(tmp2, tmp, tmp, tmp2);
                 if (rs != 15) {
-                    tmp = load_reg(s, rs);
+                    tmp3 = load_reg(s, rs);
                     if (insn & (1 << 20)) {
-                        tmp64 = gen_addq_msw(tmp64, tmp);
+                        tcg_gen_add_i32(tmp, tmp, tmp3);
                     } else {
-                        tmp64 = gen_subq_msw(tmp64, tmp);
+                        tcg_gen_sub_i32(tmp, tmp, tmp3);
                     }
+                    tcg_temp_free_i32(tmp3);
                 }
                 if (insn & (1 << 4)) {
-                    tcg_gen_addi_i64(tmp64, tmp64, 0x80000000u);
+                    /*
+                     * Adding 0x80000000 to the 64-bit quantity
+                     * means that we have carry in to the high
+                     * word when the low word has the high bit set.
+                     */
+                    tcg_gen_shri_i32(tmp2, tmp2, 31);
+                    tcg_gen_add_i32(tmp, tmp, tmp2);
                 }
-                tcg_gen_shri_i64(tmp64, tmp64, 32);
-                tmp = tcg_temp_new_i32();
-                tcg_gen_extrl_i64_i32(tmp, tmp64);
-                tcg_temp_free_i64(tmp64);
+                tcg_temp_free_i32(tmp2);
                 break;
             case 7: /* Unsigned sum of absolute differences.  */
                 gen_helper_usad8(tmp, tmp, tmp2);
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

Separate shift + extract low will result in one extra insn
for hosts like RISC-V, MIPS, and Sparc.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190808202616.13782-8-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate.c | 18 ++++++------------
 1 file changed, 6 insertions(+), 12 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_iwmmxt_insn(DisasContext *s, uint32_t insn)
             if (insn & ARM_CP_RW_BIT) {                         /* TMRRC */
                 iwmmxt_load_reg(cpu_V0, wrd);
                 tcg_gen_extrl_i64_i32(cpu_R[rdlo], cpu_V0);
-                tcg_gen_shri_i64(cpu_V0, cpu_V0, 32);
-                tcg_gen_extrl_i64_i32(cpu_R[rdhi], cpu_V0);
+                tcg_gen_extrh_i64_i32(cpu_R[rdhi], cpu_V0);
             } else {                                    /* TMCRR */
                 tcg_gen_concat_i32_i64(cpu_V0, cpu_R[rdlo], cpu_R[rdhi]);
                 iwmmxt_store_reg(cpu_V0, wrd);
@@ -XXX,XX +XXX,XX @@ static int disas_dsp_insn(DisasContext *s, uint32_t insn)
         if (insn & ARM_CP_RW_BIT) {                     /* MRA */
             iwmmxt_load_reg(cpu_V0, acc);
             tcg_gen_extrl_i64_i32(cpu_R[rdlo], cpu_V0);
-            tcg_gen_shri_i64(cpu_V0, cpu_V0, 32);
-            tcg_gen_extrl_i64_i32(cpu_R[rdhi], cpu_V0);
+            tcg_gen_extrh_i64_i32(cpu_R[rdhi], cpu_V0);
             tcg_gen_andi_i32(cpu_R[rdhi], cpu_R[rdhi], (1 << (40 - 32)) - 1);
         } else {                                        /* MAR */
             tcg_gen_concat_i32_i64(cpu_V0, cpu_R[rdlo], cpu_R[rdhi]);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                                 gen_helper_neon_narrow_high_u16(tmp, cpu_V0);
                                 break;
                             case 2:
-                                tcg_gen_shri_i64(cpu_V0, cpu_V0, 32);
-                                tcg_gen_extrl_i64_i32(tmp, cpu_V0);
+                                tcg_gen_extrh_i64_i32(tmp, cpu_V0);
                                 break;
                             default: abort();
                             }
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                                 break;
                             case 2:
                                 tcg_gen_addi_i64(cpu_V0, cpu_V0, 1u << 31);
-                                tcg_gen_shri_i64(cpu_V0, cpu_V0, 32);
-                                tcg_gen_extrl_i64_i32(tmp, cpu_V0);
+                                tcg_gen_extrh_i64_i32(tmp, cpu_V0);
                                 break;
                             default: abort();
                             }
@@ -XXX,XX +XXX,XX @@ static int disas_coproc_insn(DisasContext *s, uint32_t insn)
                 tmp = tcg_temp_new_i32();
                 tcg_gen_extrl_i64_i32(tmp, tmp64);
                 store_reg(s, rt, tmp);
-                tcg_gen_shri_i64(tmp64, tmp64, 32);
                 tmp = tcg_temp_new_i32();
-                tcg_gen_extrl_i64_i32(tmp, tmp64);
+                tcg_gen_extrh_i64_i32(tmp, tmp64);
                 tcg_temp_free_i64(tmp64);
                 store_reg(s, rt2, tmp);
             } else {
@@ -XXX,XX +XXX,XX @@ static void gen_storeq_reg(DisasContext *s, int rlow, int rhigh, TCGv_i64 val)
     tcg_gen_extrl_i64_i32(tmp, val);
     store_reg(s, rlow, tmp);
     tmp = tcg_temp_new_i32();
-    tcg_gen_shri_i64(val, val, 32);
-    tcg_gen_extrl_i64_i32(tmp, val);
+    tcg_gen_extrh_i64_i32(tmp, val);
     store_reg(s, rhigh, tmp);
 }
 
-- 
2.20.1

Big pullreq this week, though none of the new features are
particularly earthshaking. Most of the bulk is from code cleanup
patches from me or rth.

thanks
-- PMM

The following changes since commit b651b80822fa8cb66ca30087ac7fbc75507ae5d2:

Merge remote-tracking branch 'remotes/vivier2/tags/linux-user-for-5.0-pull-request' into staging (2020-02-20 17:35:42 +0000)

are available in the Git repository at:

https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20200221

for you to fetch changes up to 270a679b3f950d7c4c600f324aab8bff292d0971:

target/arm: Add missing checks for fpsp_v2 (2020-02-21 12:54:25 +0000)

----------------------------------------------------------------
target-arm queue:
 * aspeed/scu: Implement chip ID register
 * hw/misc/iotkit-secctl: Fix writing to 'PPC Interrupt Clear' register
 * mainstone: Make providing flash images non-mandatory
 * z2: Make providing flash images non-mandatory
 * Fix failures to flush SVE high bits after AdvSIMD INS/ZIP/UZP/TRN/TBL/TBX/EXT
 * Minor performance improvement: spend less time recalculating hflags values
 * Code cleanup to isar_feature function tests
 * Implement ARMv8.1-PMU and ARMv8.4-PMU extensions
 * Bugfix: correct handling of PMCR_EL0.LC bit
 * Bugfix: correct definition of PMCRDP
 * Correctly implement ACTLR2, HACTLR2
 * allwinner: Wire up USB ports
 * Vectorize emulation of USHL, SSHL, PMUL*
 * xilinx_spips: Correct the number of dummy cycles for the FAST_READ_4 cmd
 * sh4: Fix PCI ISA IO memory subregion
 * Code cleanup to use more isar_feature tests and fewer ARM_FEATURE_* tests

----------------------------------------------------------------
Francisco Iglesias (1):
      xilinx_spips: Correct the number of dummy cycles for the FAST_READ_4 cmd

Guenter Roeck (6):
      mainstone: Make providing flash images non-mandatory
      z2: Make providing flash images non-mandatory
      hw: usb: hcd-ohci: Move OHCISysBusState and TYPE_SYSBUS_OHCI to include file
      hcd-ehci: Introduce "companion-enable" sysbus property
      arm: allwinner: Wire up USB ports
      sh4: Fix PCI ISA IO memory subregion

Joel Stanley (2):
      aspeed/scu: Create separate write callbacks
      aspeed/scu: Implement chip ID register

Peter Maydell (21):
      target/arm: Add _aa32_ to isar_feature functions testing 32-bit ID registers
      target/arm: Check aa32_pan in take_aarch32_exception(), not aa64_pan
      target/arm: Add isar_feature_any_fp16 and document naming/usage conventions
      target/arm: Define and use any_predinv isar_feature test
      target/arm: Factor out PMU register definitions
      target/arm: Add and use FIELD definitions for ID_AA64DFR0_EL1
      target/arm: Use FIELD macros for clearing ID_DFR0 PERFMON field
      target/arm: Define an aa32_pmu_8_1 isar feature test function
      target/arm: Add _aa64_ and _any_ versions of pmu_8_1 isar checks
      target/arm: Stop assuming DBGDIDR always exists
      target/arm: Move DBGDIDR into ARMISARegisters
      target/arm: Read debug-related ID registers from KVM
      target/arm: Implement ARMv8.1-PMU extension
      target/arm: Implement ARMv8.4-PMU extension
      target/arm: Provide ARMv8.4-PMU in '-cpu max'
      target/arm: Correct definition of PMCRDP
      target/arm: Correct handling of PMCR_EL0.LC bit
      target/arm: Test correct register in aa32_pan and aa32_ats1e1 checks
      target/arm: Use isar_feature function for testing AA32HPD feature
      target/arm: Use FIELD_EX32 for testing 32-bit fields
      target/arm: Correctly implement ACTLR2, HACTLR2

Philippe Mathieu-Daudé (1):
      hw/misc/iotkit-secctl: Fix writing to 'PPC Interrupt Clear' register

Richard Henderson (21):
      target/arm: Flush high bits of sve register after AdvSIMD EXT
      target/arm: Flush high bits of sve register after AdvSIMD TBL/TBX
      target/arm: Flush high bits of sve register after AdvSIMD ZIP/UZP/TRN
      target/arm: Flush high bits of sve register after AdvSIMD INS
      target/arm: Use bit 55 explicitly for pauth
      target/arm: Fix select for aa64_va_parameters_both
      target/arm: Remove ttbr1_valid check from get_phys_addr_lpae
      target/arm: Split out aa64_va_parameter_tbi, aa64_va_parameter_tbid
      target/arm: Vectorize USHL and SSHL
      target/arm: Convert PMUL.8 to gvec
      target/arm: Convert PMULL.64 to gvec
      target/arm: Convert PMULL.8 to gvec
      target/arm: Rename isar_feature_aa32_simd_r32
      target/arm: Use isar_feature_aa32_simd_r32 more places
      target/arm: Set MVFR0.FPSP for ARMv5 cpus
      target/arm: Add isar_feature_aa32_simd_r16
      target/arm: Rename isar_feature_aa32_fpdp_v2
      target/arm: Add isar_feature_aa32_{fpsp_v2, fpsp_v3, fpdp_v3}
      target/arm: Perform fpdp_v2 check first
      target/arm: Replace ARM_FEATURE_VFP3 checks with fp{sp, dp}_v3
      target/arm: Add missing checks for fpsp_v2

From: Joel Stanley <joel@jms.id.au>

This splits the common write callback into separate ast2400 and ast2500
implementations. This makes it clearer when implementing differing
behaviour.

Signed-off-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20200121013302.43839-2-joel@jms.id.au
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/misc/aspeed_scu.c | 80 +++++++++++++++++++++++++++++++-------------
 1 file changed, 57 insertions(+), 23 deletions(-)
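
Reviewer note (not part of the commit): a minimal sketch of the split
described above, reduced to two registers. ScuSketch, the REG_* indices
and the function names are hypothetical; the real callbacks also check
bounds, the protection key and further read-only registers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register indices; the driver itself uses TO_REG(offset). */
enum { REG_SILICON_REV, REG_HW_STRAP1, REG_COUNT };

typedef struct {
    uint32_t regs[REG_COUNT];
} ScuSketch;

typedef void (*scu_write_fn)(ScuSketch *s, int reg, uint32_t data);

/* AST2400 style: SILICON_REV is read-only, HW_STRAP1 is assigned. */
void ast2400_write(ScuSketch *s, int reg, uint32_t data)
{
    if (reg == REG_SILICON_REV) {
        return; /* the real callback logs a guest error here */
    }
    s->regs[reg] = data;
}

/* AST2500 style: one offset sets HW_STRAP1 bits, the other clears them. */
void ast2500_write(ScuSketch *s, int reg, uint32_t data)
{
    switch (reg) {
    case REG_HW_STRAP1:
        s->regs[REG_HW_STRAP1] |= data;
        return;
    case REG_SILICON_REV:
        s->regs[REG_HW_STRAP1] &= ~data;
        return;
    }
    s->regs[reg] = data;
}

/* Pick the callback once per SoC, as the class_init hooks do with ops. */
void check_scu_split(void)
{
    ScuSketch s = { .regs = { [REG_SILICON_REV] = 0x04,
                              [REG_HW_STRAP1] = 0x1 } };
    scu_write_fn wr = ast2500_write;

    wr(&s, REG_HW_STRAP1, 0x6);
    assert(s.regs[REG_HW_STRAP1] == 0x7);
    wr(&s, REG_SILICON_REV, 0x2);
    assert(s.regs[REG_HW_STRAP1] == 0x5);
    assert(s.regs[REG_SILICON_REV] == 0x04);
}
```

Selecting the callback per SoC removes the ASPEED_IS_AST2500() tests
from the write path, which is what lets each variant read straight
through.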

diff --git a/hw/misc/aspeed_scu.c b/hw/misc/aspeed_scu.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/misc/aspeed_scu.c
+++ b/hw/misc/aspeed_scu.c
@@ -XXX,XX +XXX,XX @@ static uint64_t aspeed_scu_read(void *opaque, hwaddr offset, unsigned size)
     return s->regs[reg];
 }
 
-static void aspeed_scu_write(void *opaque, hwaddr offset, uint64_t data,
-                             unsigned size)
+static void aspeed_ast2400_scu_write(void *opaque, hwaddr offset,
+                                     uint64_t data, unsigned size)
+{
+    AspeedSCUState *s = ASPEED_SCU(opaque);
+    int reg = TO_REG(offset);
+
+    if (reg >= ASPEED_SCU_NR_REGS) {
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "%s: Out-of-bounds write at offset 0x%" HWADDR_PRIx "\n",
+                      __func__, offset);
+        return;
+    }
+
+    if (reg > PROT_KEY && reg < CPU2_BASE_SEG1 &&
+            !s->regs[PROT_KEY]) {
+        qemu_log_mask(LOG_GUEST_ERROR, "%s: SCU is locked!\n", __func__);
+    }
+
+    trace_aspeed_scu_write(offset, size, data);
+
+    switch (reg) {
+    case PROT_KEY:
+        s->regs[reg] = (data == ASPEED_SCU_PROT_KEY) ? 1 : 0;
+        return;
+    case SILICON_REV:
+    case FREQ_CNTR_EVAL:
+    case VGA_SCRATCH1 ... VGA_SCRATCH8:
+    case RNG_DATA:
+    case FREE_CNTR4:
+    case FREE_CNTR4_EXT:
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "%s: Write to read-only offset 0x%" HWADDR_PRIx "\n",
+                      __func__, offset);
+        return;
+    }
+
+    s->regs[reg] = data;
+}
+
+static void aspeed_ast2500_scu_write(void *opaque, hwaddr offset,
+                                     uint64_t data, unsigned size)
 {
     AspeedSCUState *s = ASPEED_SCU(opaque);
     int reg = TO_REG(offset);
@@ -XXX,XX +XXX,XX @@ static void aspeed_scu_write(void *opaque, hwaddr offset, uint64_t data,
     case PROT_KEY:
         s->regs[reg] = (data == ASPEED_SCU_PROT_KEY) ? 1 : 0;
         return;
-    case CLK_SEL:
-        s->regs[reg] = data;
-        break;
     case HW_STRAP1:
-        if (ASPEED_IS_AST2500(s->regs[SILICON_REV])) {
-            s->regs[HW_STRAP1] |= data;
-            return;
-        }
-        /* Jump to assignment below */
-        break;
+        s->regs[HW_STRAP1] |= data;
+        return;
     case SILICON_REV:
-        if (ASPEED_IS_AST2500(s->regs[SILICON_REV])) {
-            s->regs[HW_STRAP1] &= ~data;
-        } else {
-            qemu_log_mask(LOG_GUEST_ERROR,
-                          "%s: Write to read-only offset 0x%" HWADDR_PRIx "\n",
-                          __func__, offset);
-        }
-        /* Avoid assignment below, we've handled everything */
+        s->regs[HW_STRAP1] &= ~data;
         return;
     case FREQ_CNTR_EVAL:
     case VGA_SCRATCH1 ... VGA_SCRATCH8:
@@ -XXX,XX +XXX,XX @@ static void aspeed_scu_write(void *opaque, hwaddr offset, uint64_t data,
     s->regs[reg] = data;
 }
 
-static const MemoryRegionOps aspeed_scu_ops = {
+static const MemoryRegionOps aspeed_ast2400_scu_ops = {
     .read = aspeed_scu_read,
-    .write = aspeed_scu_write,
+    .write = aspeed_ast2400_scu_write,
+    .endianness = DEVICE_LITTLE_ENDIAN,
+    .valid.min_access_size = 4,
+    .valid.max_access_size = 4,
+    .valid.unaligned = false,
+};
+
+static const MemoryRegionOps aspeed_ast2500_scu_ops = {
+    .read = aspeed_scu_read,
+    .write = aspeed_ast2500_scu_write,
     .endianness = DEVICE_LITTLE_ENDIAN,
     .valid.min_access_size = 4,
     .valid.max_access_size = 4,
@@ -XXX,XX +XXX,XX @@ static void aspeed_2400_scu_class_init(ObjectClass *klass, void *data)
     asc->calc_hpll = aspeed_2400_scu_calc_hpll;
     asc->apb_divider = 2;
     asc->nr_regs = ASPEED_SCU_NR_REGS;
-    asc->ops = &aspeed_scu_ops;
+    asc->ops = &aspeed_ast2400_scu_ops;
 }
 
 static const TypeInfo aspeed_2400_scu_info = {
@@ -XXX,XX +XXX,XX @@ static void aspeed_2500_scu_class_init(ObjectClass *klass, void *data)
     asc->calc_hpll = aspeed_2500_scu_calc_hpll;
     asc->apb_divider = 4;
     asc->nr_regs = ASPEED_SCU_NR_REGS;
-    asc->ops = &aspeed_scu_ops;
+    asc->ops = &aspeed_ast2500_scu_ops;
 }
 
 static const TypeInfo aspeed_2500_scu_info = {
-- 
2.20.1

From: Joel Stanley <joel@jms.id.au>

This returns a fixed but non-zero value for the chip id.

Signed-off-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20200121013302.43839-3-joel@jms.id.au
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/misc/aspeed_scu.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/hw/misc/aspeed_scu.c b/hw/misc/aspeed_scu.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/misc/aspeed_scu.c
+++ b/hw/misc/aspeed_scu.c
@@ -XXX,XX +XXX,XX @@
 #define CPU2_BASE_SEG4       TO_REG(0x110)
 #define CPU2_BASE_SEG5       TO_REG(0x114)
 #define CPU2_CACHE_CTRL      TO_REG(0x118)
+#define CHIP_ID0             TO_REG(0x150)
+#define CHIP_ID1             TO_REG(0x154)
 #define UART_HPLL_CLK        TO_REG(0x160)
 #define PCIE_CTRL            TO_REG(0x180)
 #define BMC_MMIO_CTRL        TO_REG(0x184)
@@ -XXX,XX +XXX,XX @@
 #define AST2600_HW_STRAP2_PROT    TO_REG(0x518)
 #define AST2600_RNG_CTRL          TO_REG(0x524)
 #define AST2600_RNG_DATA          TO_REG(0x540)
+#define AST2600_CHIP_ID0          TO_REG(0x5B0)
+#define AST2600_CHIP_ID1          TO_REG(0x5B4)
 
 #define AST2600_CLK TO_REG(0x40)
 
@@ -XXX,XX +XXX,XX @@ static const uint32_t ast2500_a1_resets[ASPEED_SCU_NR_REGS] = {
      [CPU2_BASE_SEG1]  = 0x80000000U,
      [CPU2_BASE_SEG4]  = 0x1E600000U,
      [CPU2_BASE_SEG5]  = 0xC0000000U,
+     [CHIP_ID0]        = 0x1234ABCDU,
+     [CHIP_ID1]        = 0x88884444U,
      [UART_HPLL_CLK]   = 0x00001903U,
      [PCIE_CTRL]       = 0x0000007BU,
      [BMC_DEV_ID]      = 0x00002402U
@@ -XXX,XX +XXX,XX @@ static void aspeed_ast2500_scu_write(void *opaque, hwaddr offset,
     case RNG_DATA:
     case FREE_CNTR4:
     case FREE_CNTR4_EXT:
+    case CHIP_ID0:
+    case CHIP_ID1:
         qemu_log_mask(LOG_GUEST_ERROR,
                       "%s: Write to read-only offset 0x%" HWADDR_PRIx "\n",
                       __func__, offset);
@@ -XXX,XX +XXX,XX @@ static void aspeed_ast2600_scu_write(void *opaque, hwaddr offset,
     case AST2600_RNG_DATA:
     case AST2600_SILICON_REV:
     case AST2600_SILICON_REV2:
+    case AST2600_CHIP_ID0:
+    case AST2600_CHIP_ID1:
         /* Add read only registers here */
         qemu_log_mask(LOG_GUEST_ERROR,
                       "%s: Write to read-only offset 0x%" HWADDR_PRIx "\n",
@@ -XXX,XX +XXX,XX @@ static const uint32_t ast2600_a0_resets[ASPEED_AST2600_SCU_NR_REGS] = {
     [AST2600_CLK_STOP_CTRL2]    = 0xFFF0FFF0,
     [AST2600_SDRAM_HANDSHAKE]   = 0x00000040,  /* SoC completed DRAM init */
     [AST2600_HPLL_PARAM]        = 0x1000405F,
+    [AST2600_CHIP_ID0]          = 0x1234ABCD,
+    [AST2600_CHIP_ID1]          = 0x88884444,
+
 };
 
 static void aspeed_ast2600_scu_reset(DeviceState *dev)
-- 
2.20.1

From: Philippe Mathieu-Daudé <f4bug@amsat.org>

Fix warning reported by Clang static code analyzer:

CC      hw/misc/iotkit-secctl.o
  hw/misc/iotkit-secctl.c:343:9: warning: Value stored to 'value' is never read
          value &= 0x00f000f3;
          ^        ~~~~~~~~~~
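
The fix below makes the masked value actually clear bits in the interrupt status, which is the usual write-one-to-clear pattern. A minimal standalone sketch of that pattern follows; w1c_apply and INTCLR_MASK are illustrative names, and only the mask constant comes from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Bits of the status register that software may clear (mask from the patch). */
#define INTCLR_MASK 0x00f000f3u

/* Write-one-to-clear: each 1 bit in 'value' clears the matching status bit. */
static uint32_t w1c_apply(uint32_t status, uint32_t value)
{
    return status & ~(value & INTCLR_MASK);
}
```

Bits of 'value' outside the mask have no effect, so a write of 0x0000000c leaves the status untouched.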

Fixes: b3717c23e1c
Reported-by: Clang Static Analyzer
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20200217132922.24607-1-f4bug@amsat.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/misc/iotkit-secctl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/misc/iotkit-secctl.c b/hw/misc/iotkit-secctl.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/misc/iotkit-secctl.c
+++ b/hw/misc/iotkit-secctl.c
@@ -XXX,XX +XXX,XX @@ static MemTxResult iotkit_secctl_s_write(void *opaque, hwaddr addr,
         qemu_set_irq(s->sec_resp_cfg, s->secrespcfg);
         break;
     case A_SECPPCINTCLR:
-        value &= 0x00f000f3;
+        s->secppcintstat &= ~(value & 0x00f000f3);
         foreach_ppc(s, iotkit_secctl_ppc_update_irq_clear);
         break;
     case A_SECPPCINTEN:
-- 
2.20.1

From: Guenter Roeck <linux@roeck-us.net>

Up to now, the mainstone machine only boots if two flash images are
provided. This is not really necessary; the machine can boot from initrd
or from SD without them. At the same time, having to provide dummy flash
images is a nuisance and does not add any real value. Make it optional.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20200217210824.18513-1-linux@roeck-us.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/mainstone.c | 11 +----------
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/hw/arm/mainstone.c b/hw/arm/mainstone.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/mainstone.c
+++ b/hw/arm/mainstone.c
@@ -XXX,XX +XXX,XX @@ static void mainstone_common_init(MemoryRegion *address_space_mem,
     /* There are two 32MiB flash devices on the board */
     for (i = 0; i < 2; i ++) {
         dinfo = drive_get(IF_PFLASH, 0, i);
-        if (!dinfo) {
-            if (qtest_enabled()) {
-                break;
-            }
-            error_report("Two flash images must be given with the "
-                         "'pflash' parameter");
-            exit(1);
-        }
-
         if (!pflash_cfi01_register(mainstone_flash_base[i],
                                    i ? "mainstone.flash1" : "mainstone.flash0",
                                    MAINSTONE_FLASH,
-                                   blk_by_legacy_dinfo(dinfo),
+                                   dinfo ? blk_by_legacy_dinfo(dinfo) : NULL,
                                    sector_len, 4, 0, 0, 0, 0, be)) {
             error_report("Error registering flash memory");
             exit(1);
-- 
2.20.1

From: Guenter Roeck <linux@roeck-us.net>

Up to now, the z2 machine only boots if a flash image is provided.
This is not really necessary; the machine can boot from initrd or from
SD without it. At the same time, having to provide dummy flash images
is a nuisance and does not add any real value. Make it optional.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20200217210903.18602-1-linux@roeck-us.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/z2.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/hw/arm/z2.c b/hw/arm/z2.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/z2.c
+++ b/hw/arm/z2.c
@@ -XXX,XX +XXX,XX @@ static void z2_init(MachineState *machine)
     be = 0;
 #endif
     dinfo = drive_get(IF_PFLASH, 0, 0);
-    if (!dinfo && !qtest_enabled()) {
-        error_report("Flash image must be given with the "
-                     "'pflash' parameter");
-        exit(1);
-    }
-
     if (!pflash_cfi01_register(Z2_FLASH_BASE, "z2.flash0", Z2_FLASH_SIZE,
                                dinfo ? blk_by_legacy_dinfo(dinfo) : NULL,
                                sector_len, 4, 0, 0, 0, 0, be)) {
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

Writes to AdvSIMD registers flush the bits above 128.
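
To make the rule concrete, here is a small standalone model, assuming a vector register stored as 256 bytes (the SVE architectural maximum) whose low 16 bytes alias the AdvSIMD register. The ZReg type and advsimd_ins_byte are hypothetical and are not the TCG code in the diff.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical register layout: 256 bytes is the SVE architectural
 * maximum (2048 bits); the low 16 bytes alias the AdvSIMD Q register.
 */
#define ZREG_BYTES 256
#define QREG_BYTES 16

typedef struct {
    uint8_t b[ZREG_BYTES];
} ZReg;

/*
 * Model of an AdvSIMD "insert byte element" write: update one lane in
 * the low 128 bits, then zero everything above bit 127.
 */
static void advsimd_ins_byte(ZReg *z, unsigned lane, uint8_t value)
{
    assert(lane < QREG_BYTES);
    z->b[lane] = value;
    memset(z->b + QREG_BYTES, 0, ZREG_BYTES - QREG_BYTES);
}
```

The other lanes of the low 128 bits keep their old contents; only the bytes above them are flushed.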

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200214194643.23317-5-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate-a64.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void handle_simd_inse(DisasContext *s, int rd, int rn,
     write_vec_element(s, tmp, rd, dst_index, size);
 
     tcg_temp_free_i64(tmp);
+
+    /* INS is considered a 128-bit write for SVE. */
+    clear_vec_high(s, true, rd);
 }
 
 
@@ -XXX,XX +XXX,XX @@ static void handle_simd_insg(DisasContext *s, int rd, int rn, int imm5)
 
     idx = extract32(imm5, 1 + size, 4 - size);
     write_vec_element(s, cpu_reg(s, rn), rd, idx, size);
+
+    /* INS is considered a 128-bit write for SVE. */
+    clear_vec_high(s, true, rd);
 }
 
 /*
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

The pseudocode in aarch64/functions/pac/auth/Auth and
aarch64/functions/pac/strip/Strip always uses bit 55 for
extfield and does not consider whether the current regime has 2 ranges.
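
For illustration, replicating bit 55 across a 64-bit value can be written portably as below. pauth_extfield is a hypothetical standalone helper meant to produce the same result as sextract64(ptr, 55, 1) in the diff; it is not QEMU code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Replicate bit 55 of the pointer across all 64 bits, giving either
 * 0 or all-ones. Unsigned subtraction keeps this well defined in C.
 */
static uint64_t pauth_extfield(uint64_t ptr)
{
    uint64_t bit55 = (ptr >> 55) & 1;

    return (uint64_t)0 - bit55;
}
```

The result depends only on bit 55 of the pointer, regardless of how many ranges the translation regime has.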

Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20200216194343.21331-2-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/pauth_helper.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/target/arm/pauth_helper.c b/target/arm/pauth_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/pauth_helper.c
+++ b/target/arm/pauth_helper.c
@@ -XXX,XX +XXX,XX @@ static uint64_t pauth_addpac(CPUARMState *env, uint64_t ptr, uint64_t modifier,
 
 static uint64_t pauth_original_ptr(uint64_t ptr, ARMVAParameters param)
 {
-    uint64_t extfield = -param.select;
+    /* Note that bit 55 is used whether or not the regime has 2 ranges. */
+    uint64_t extfield = sextract64(ptr, 55, 1);
     int bot_pac_bit = 64 - param.tsz;
     int top_pac_bit = 64 - 8 * param.tbi;
 
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

Select should always be 0 for a regime with one range.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200216194343.21331-3-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.c | 46 +++++++++++++++++++++++----------------------
 1 file changed, 24 insertions(+), 22 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ ARMVAParameters aa64_va_parameters_both(CPUARMState *env, uint64_t va,
     bool tbi, tbid, epd, hpd, using16k, using64k;
     int select, tsz;
 
-    /*
-     * Bit 55 is always between the two regions, and is canonical for
-     * determining if address tagging is enabled.
-     */
-    select = extract64(va, 55, 1);
-
     if (!regime_has_2_ranges(mmu_idx)) {
+        select = 0;
         tsz = extract32(tcr, 0, 6);
         using64k = extract32(tcr, 14, 1);
         using16k = extract32(tcr, 15, 1);
@@ -XXX,XX +XXX,XX @@ ARMVAParameters aa64_va_parameters_both(CPUARMState *env, uint64_t va,
             tbid = extract32(tcr, 29, 1);
         }
         epd = false;
-    } else if (!select) {
-        tsz = extract32(tcr, 0, 6);
-        epd = extract32(tcr, 7, 1);
-        using64k = extract32(tcr, 14, 1);
-        using16k = extract32(tcr, 15, 1);
-        tbi = extract64(tcr, 37, 1);
-        hpd = extract64(tcr, 41, 1);
-        tbid = extract64(tcr, 51, 1);
     } else {
-        int tg = extract32(tcr, 30, 2);
-        using16k = tg == 1;
-        using64k = tg == 3;
-        tsz = extract32(tcr, 16, 6);
-        epd = extract32(tcr, 23, 1);
-        tbi = extract64(tcr, 38, 1);
-        hpd = extract64(tcr, 42, 1);
-        tbid = extract64(tcr, 52, 1);
+        /*
+         * Bit 55 is always between the two regions, and is canonical for
+         * determining if address tagging is enabled.
+         */
+        select = extract64(va, 55, 1);
+        if (!select) {
+            tsz = extract32(tcr, 0, 6);
+            epd = extract32(tcr, 7, 1);
+            using64k = extract32(tcr, 14, 1);
+            using16k = extract32(tcr, 15, 1);
+            tbi = extract64(tcr, 37, 1);
+            hpd = extract64(tcr, 41, 1);
+            tbid = extract64(tcr, 51, 1);
+        } else {
+            int tg = extract32(tcr, 30, 2);
+            using16k = tg == 1;
+            using64k = tg == 3;
+            tsz = extract32(tcr, 16, 6);
+            epd = extract32(tcr, 23, 1);
+            tbi = extract64(tcr, 38, 1);
+            hpd = extract64(tcr, 42, 1);
+            tbid = extract64(tcr, 52, 1);
+        }
     }
     tsz = MIN(tsz, 39);  /* TODO: ARMv8.4-TTST */
     tsz = MAX(tsz, 16);  /* TODO: ARMv8.2-LVA  */
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

Now that aa64_va_parameters_both sets select based on the number
of ranges in the regime, the ttbr1_valid check is redundant.
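
A standalone sketch of why the extra check is no longer needed: once select is forced to 0 for a one-range regime, an address whose top bits are all ones already fails the remaining comparison. sign_extract and in_translation_gap are hypothetical helpers that mirror the shape of the check in the diff, not the QEMU functions themselves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Sign-extract 'length' bits starting at bit 'start' (1 <= length < 64).
 * Portable stand-in for QEMU's sextract64().
 */
static uint64_t sign_extract(uint64_t value, unsigned start, unsigned length)
{
    uint64_t mask = (UINT64_C(1) << length) - 1;
    uint64_t field = (value >> start) & mask;

    if (field >> (length - 1)) {
        field |= ~mask;         /* replicate the sign bit upwards */
    }
    return field;
}

/* True if 'address' lies in the gap between the two VA regions. */
static bool in_translation_gap(uint64_t address, unsigned inputsize,
                               unsigned addrsize, unsigned select)
{
    uint64_t top_bits;

    assert(inputsize > 0 && inputsize < addrsize && addrsize <= 64);
    top_bits = sign_extract(address, inputsize, addrsize - inputsize);
    /* All-zero top bits negate to 0, all-one top bits negate to 1. */
    return (uint64_t)0 - top_bits != select;
}
```

With inputsize 48 and addrsize 64, the address 0xffff800000000000 is accepted when select is 1 and reported as a gap when select is 0, which is the one-range case.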

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200216194343.21331-4-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
     TCR *tcr = regime_tcr(env, mmu_idx);
     int ap, ns, xn, pxn;
     uint32_t el = regime_el(env, mmu_idx);
-    bool ttbr1_valid;
     uint64_t descaddrmask;
     bool aarch64 = arm_el_is_aa64(env, el);
     bool guarded = false;
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
         param = aa64_va_parameters(env, address, mmu_idx,
                                    access_type != MMU_INST_FETCH);
         level = 0;
-        ttbr1_valid = regime_has_2_ranges(mmu_idx);
         addrsize = 64 - 8 * param.tbi;
         inputsize = 64 - param.tsz;
     } else {
         param = aa32_va_parameters(env, address, mmu_idx);
         level = 1;
-        /* There is no TTBR1 for EL2 */
-        ttbr1_valid = (el != 2);
         addrsize = (mmu_idx == ARMMMUIdx_Stage2 ? 40 : 32);
         inputsize = addrsize - param.tsz;
     }
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
     if (inputsize < addrsize) {
         target_ulong top_bits = sextract64(address, inputsize,
                                            addrsize - inputsize);
-        if (-top_bits != param.select || (param.select && !ttbr1_valid)) {
+        if (-top_bits != param.select) {
             /* The gap between the two regions is a Translation fault */
             fault_type = ARMFault_Translation;
             goto do_fault;
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

For the purpose of rebuild_hflags_a64, we do not need to compute
all of the va parameters, only tbi.  Moreover, we can compute them
in a form that is more useful for storing in hflags.

This eliminates the need for aa64_va_parameters_both, so fold that
into aa64_va_parameters.  The remaining calls to aa64_va_parameters
are in get_phys_addr_lpae and in pauth_helper.c.

This reduces the total cpu consumption of aa64_va_parameters in a
kernel boot plus a kvm guest kernel boot from 3% to 0.5%.
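
The composite form can be sketched in isolation as follows, assuming a two-range regime with TBI0/TBI1 at TCR bits 37-38 and TBID0/TBID1 at bits 51-52, as in the diff. The helper names tcr_tbi_pair, tcr_tbid_pair and effective_tbi are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Bit positions taken from the two-range branch of the patch:
 * TBI0/TBI1 at bits 37-38, TBID0/TBID1 at bits 51-52.
 */
static unsigned tcr_tbi_pair(uint64_t tcr)
{
    return (unsigned)((tcr >> 37) & 3);
}

static unsigned tcr_tbid_pair(uint64_t tcr)
{
    return (unsigned)((tcr >> 51) & 3);
}

/*
 * Effective TBI for one access: 'select' is address bit 55, 'data' is
 * false for instruction fetches.
 */
static bool effective_tbi(uint64_t tcr, unsigned select, bool data)
{
    unsigned tbi = tcr_tbi_pair(tcr);

    if (!data) {
        tbi &= ~tcr_tbid_pair(tcr);   /* TBID disables TBI for fetches */
    }
    return (tbi >> select) & 1;
}
```

Keeping both range bits together until the final shift is what lets the same two-bit values be stored directly in hflags without calling the full parameter function twice.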

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200216194343.21331-5-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/internals.h |  3 --
 target/arm/helper.c    | 68 +++++++++++++++++++++++-------------------
 2 files changed, 37 insertions(+), 34 deletions(-)

diff --git a/target/arm/internals.h b/target/arm/internals.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -XXX,XX +XXX,XX @@ typedef struct ARMVAParameters {
     unsigned tsz    : 8;
     unsigned select : 1;
     bool tbi        : 1;
-    bool tbid       : 1;
     bool epd        : 1;
     bool hpd        : 1;
     bool using16k   : 1;
     bool using64k   : 1;
 } ARMVAParameters;
 
-ARMVAParameters aa64_va_parameters_both(CPUARMState *env, uint64_t va,
-                                        ARMMMUIdx mmu_idx);
 ARMVAParameters aa64_va_parameters(CPUARMState *env, uint64_t va,
                                    ARMMMUIdx mmu_idx, bool data);
 
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static uint8_t convert_stage2_attrs(CPUARMState *env, uint8_t s2attrs)
 }
 #endif /* !CONFIG_USER_ONLY */
 
-ARMVAParameters aa64_va_parameters_both(CPUARMState *env, uint64_t va,
-                                        ARMMMUIdx mmu_idx)
+static int aa64_va_parameter_tbi(uint64_t tcr, ARMMMUIdx mmu_idx)
+{
+    if (regime_has_2_ranges(mmu_idx)) {
+        return extract64(tcr, 37, 2);
+    } else if (mmu_idx == ARMMMUIdx_Stage2) {
+        return 0; /* VTCR_EL2 */
+    } else {
+        return extract32(tcr, 20, 1);
+    }
+}
+
+static int aa64_va_parameter_tbid(uint64_t tcr, ARMMMUIdx mmu_idx)
+{
+    if (regime_has_2_ranges(mmu_idx)) {
+        return extract64(tcr, 51, 2);
+    } else if (mmu_idx == ARMMMUIdx_Stage2) {
+        return 0; /* VTCR_EL2 */
+    } else {
+        return extract32(tcr, 29, 1);
+    }
+}
+
+ARMVAParameters aa64_va_parameters(CPUARMState *env, uint64_t va,
+                                   ARMMMUIdx mmu_idx, bool data)
 {
     uint64_t tcr = regime_tcr(env, mmu_idx)->raw_tcr;
-    bool tbi, tbid, epd, hpd, using16k, using64k;
-    int select, tsz;
+    bool epd, hpd, using16k, using64k;
+    int select, tsz, tbi;
 
     if (!regime_has_2_ranges(mmu_idx)) {
         select = 0;
@@ -XXX,XX +XXX,XX @@ ARMVAParameters aa64_va_parameters_both(CPUARMState *env, uint64_t va,
         using16k = extract32(tcr, 15, 1);
         if (mmu_idx == ARMMMUIdx_Stage2) {
             /* VTCR_EL2 */
-            tbi = tbid = hpd = false;
+            hpd = false;
         } else {
-            tbi = extract32(tcr, 20, 1);
             hpd = extract32(tcr, 24, 1);
-            tbid = extract32(tcr, 29, 1);
         }
         epd = false;
     } else {
@@ -XXX,XX +XXX,XX @@ ARMVAParameters aa64_va_parameters_both(CPUARMState *env, uint64_t va,
             epd = extract32(tcr, 7, 1);
             using64k = extract32(tcr, 14, 1);
             using16k = extract32(tcr, 15, 1);
-            tbi = extract64(tcr, 37, 1);
             hpd = extract64(tcr, 41, 1);
-            tbid = extract64(tcr, 51, 1);
         } else {
             int tg = extract32(tcr, 30, 2);
             using16k = tg == 1;
             using64k = tg == 3;
             tsz = extract32(tcr, 16, 6);
             epd = extract32(tcr, 23, 1);
-            tbi = extract64(tcr, 38, 1);
             hpd = extract64(tcr, 42, 1);
-            tbid = extract64(tcr, 52, 1);
         }
     }
     tsz = MIN(tsz, 39);  /* TODO: ARMv8.4-TTST */
     tsz = MAX(tsz, 16);  /* TODO: ARMv8.2-LVA  */
 
+    /* Present TBI as a composite with TBID.  */
+    tbi = aa64_va_parameter_tbi(tcr, mmu_idx);
+    if (!data) {
+        tbi &= ~aa64_va_parameter_tbid(tcr, mmu_idx);
+    }
+    tbi = (tbi >> select) & 1;
+
     return (ARMVAParameters) {
         .tsz = tsz,
         .select = select,
         .tbi = tbi,
-        .tbid = tbid,
         .epd = epd,
         .hpd = hpd,
         .using16k = using16k,
@@ -XXX,XX +XXX,XX @@ ARMVAParameters aa64_va_parameters_both(CPUARMState *env, uint64_t va,
     };
 }
 
-ARMVAParameters aa64_va_parameters(CPUARMState *env, uint64_t va,
-                                   ARMMMUIdx mmu_idx, bool data)
-{
-    ARMVAParameters ret = aa64_va_parameters_both(env, va, mmu_idx);
-
-    /* Present TBI as a composite with TBID.  */
-    ret.tbi &= (data || !ret.tbid);
-    return ret;
-}
-
 #ifndef CONFIG_USER_ONLY
 static ARMVAParameters aa32_va_parameters(CPUARMState *env, uint32_t va,
                                           ARMMMUIdx mmu_idx)
@@ -XXX,XX +XXX,XX @@ static uint32_t rebuild_hflags_a64(CPUARMState *env, int el, int fp_el,
 {
     uint32_t flags = rebuild_hflags_aprofile(env);
     ARMMMUIdx stage1 = stage_1_mmu_idx(mmu_idx);
-    ARMVAParameters p0 = aa64_va_parameters_both(env, 0, stage1);
+    uint64_t tcr = regime_tcr(env, mmu_idx)->raw_tcr;
     uint64_t sctlr;
     int tbii, tbid;
 
     flags = FIELD_DP32(flags, TBFLAG_ANY, AARCH64_STATE, 1);
 
     /* Get control bits for tagged addresses.  */
-    if (regime_has_2_ranges(mmu_idx)) {
-        ARMVAParameters p1 = aa64_va_parameters_both(env, -1, stage1);
-        tbid = (p1.tbi << 1) | p0.tbi;
-        tbii = tbid & ~((p1.tbid << 1) | p0.tbid);
-    } else {
-        tbid = p0.tbi;
-        tbii = tbid & !p0.tbid;
-    }
+    tbid = aa64_va_parameter_tbi(tcr, mmu_idx);
+    tbii = tbid & ~aa64_va_parameter_tbid(tcr, mmu_idx);
 
     flags = FIELD_DP32(flags, TBFLAG_A64, TBII, tbii);
     flags = FIELD_DP32(flags, TBFLAG_A64, TBID, tbid);
-- 
2.20.1

Enforce a convention that an isar_feature function that tests a
32-bit ID register always has _aa32_ in its name, and one that
tests a 64-bit ID register always has _aa64_ in its name.
We already follow this except for three cases: thumb_div,
arm_div and jazelle, which all need _aa32_ adding.

(As noted in the comment, isar_feature_aa32_fp16_arith()
is an exception in that it currently tests ID_AA64PFR0_EL1,
but will switch to MVFR1 once we've properly implemented
FP16 for AArch32.)

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20200214175116.9164-2-peter.maydell@linaro.org
---
 target/arm/cpu.h       | 13 ++++++++++---
 target/arm/internals.h |  2 +-
 linux-user/elfload.c   |  4 ++--
 target/arm/cpu.c       |  6 ++++--
 target/arm/helper.c    |  2 +-
 target/arm/translate.c |  6 +++---
 6 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ static inline uint64_t *aa64_vfp_qreg(CPUARMState *env, unsigned regno)
 /* Shared between translate-sve.c and sve_helper.c.  */
 extern const uint64_t pred_esz_masks[4];
 
+/*
+ * Naming convention for isar_feature functions:
+ * Functions which test 32-bit ID registers should have _aa32_ in
+ * their name. Functions which test 64-bit ID registers should have
+ * _aa64_ in their name.
+ */
+
 /*
  * 32-bit feature tests via id registers.
  */
-static inline bool isar_feature_thumb_div(const ARMISARegisters *id)
+static inline bool isar_feature_aa32_thumb_div(const ARMISARegisters *id)
 {
     return FIELD_EX32(id->id_isar0, ID_ISAR0, DIVIDE) != 0;
 }
 
-static inline bool isar_feature_arm_div(const ARMISARegisters *id)
+static inline bool isar_feature_aa32_arm_div(const ARMISARegisters *id)
 {
     return FIELD_EX32(id->id_isar0, ID_ISAR0, DIVIDE) > 1;
 }
 
-static inline bool isar_feature_jazelle(const ARMISARegisters *id)
+static inline bool isar_feature_aa32_jazelle(const ARMISARegisters *id)
 {
     return FIELD_EX32(id->id_isar1, ID_ISAR1, JAZELLE) != 0;
 }
diff --git a/target/arm/internals.h b/target/arm/internals.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -XXX,XX +XXX,XX @@ static inline uint32_t aarch32_cpsr_valid_mask(uint64_t features,
     if ((features >> ARM_FEATURE_THUMB2) & 1) {
         valid |= CPSR_IT;
     }
-    if (isar_feature_jazelle(id)) {
+    if (isar_feature_aa32_jazelle(id)) {
         valid |= CPSR_J;
     }
     if (isar_feature_aa32_pan(id)) {
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index XXXXXXX..XXXXXXX 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -XXX,XX +XXX,XX @@ static uint32_t get_elf_hwcap(void)
     GET_FEATURE(ARM_FEATURE_VFP3, ARM_HWCAP_ARM_VFPv3);
     GET_FEATURE(ARM_FEATURE_V6K, ARM_HWCAP_ARM_TLS);
     GET_FEATURE(ARM_FEATURE_VFP4, ARM_HWCAP_ARM_VFPv4);
-    GET_FEATURE_ID(arm_div, ARM_HWCAP_ARM_IDIVA);
-    GET_FEATURE_ID(thumb_div, ARM_HWCAP_ARM_IDIVT);
+    GET_FEATURE_ID(aa32_arm_div, ARM_HWCAP_ARM_IDIVA);
+    GET_FEATURE_ID(aa32_thumb_div, ARM_HWCAP_ARM_IDIVT);
     /* All QEMU's VFPv3 CPUs have 32 registers, see VFP_DREG in translate.c.
      * Note that the ARM_HWCAP_ARM_VFPv3D16 bit is always the inverse of
      * ARM_HWCAP_ARM_VFPD32 (and so always clear for QEMU); it is unrelated
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
          * Presence of EL2 itself is ARM_FEATURE_EL2, and of the
          * Security Extensions is ARM_FEATURE_EL3.
          */
-        assert(!tcg_enabled() || no_aa32 || cpu_isar_feature(arm_div, cpu));
+        assert(!tcg_enabled() || no_aa32 ||
+               cpu_isar_feature(aa32_arm_div, cpu));
         set_feature(env, ARM_FEATURE_LPAE);
         set_feature(env, ARM_FEATURE_V7);
     }
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
     if (arm_feature(env, ARM_FEATURE_V6)) {
         set_feature(env, ARM_FEATURE_V5);
         if (!arm_feature(env, ARM_FEATURE_M)) {
-            assert(!tcg_enabled() || no_aa32 || cpu_isar_feature(jazelle, cpu));
+            assert(!tcg_enabled() || no_aa32 ||
+                   cpu_isar_feature(aa32_jazelle, cpu));
             set_feature(env, ARM_FEATURE_AUXCR);
         }
     }
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
     if (arm_feature(env, ARM_FEATURE_LPAE)) {
         define_arm_cp_regs(cpu, lpae_cp_reginfo);
     }
-    if (cpu_isar_feature(jazelle, cpu)) {
+    if (cpu_isar_feature(aa32_jazelle, cpu)) {
         define_arm_cp_regs(cpu, jazelle_regs);
     }
     /* Slightly awkwardly, the OMAP and StrongARM cores need all of
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@
 #define ENABLE_ARCH_5     arm_dc_feature(s, ARM_FEATURE_V5)
 /* currently all emulated v5 cores are also v5TE, so don't bother */
 #define ENABLE_ARCH_5TE   arm_dc_feature(s, ARM_FEATURE_V5)
-#define ENABLE_ARCH_5J    dc_isar_feature(jazelle, s)
+#define ENABLE_ARCH_5J    dc_isar_feature(aa32_jazelle, s)
 #define ENABLE_ARCH_6     arm_dc_feature(s, ARM_FEATURE_V6)
 #define ENABLE_ARCH_6K    arm_dc_feature(s, ARM_FEATURE_V6K)
 #define ENABLE_ARCH_6T2   arm_dc_feature(s, ARM_FEATURE_THUMB2)
@@ -XXX,XX +XXX,XX @@ static bool op_div(DisasContext *s, arg_rrr *a, bool u)
     TCGv_i32 t1, t2;
 
     if (s->thumb
-        ? !dc_isar_feature(thumb_div, s)
-        : !dc_isar_feature(arm_div, s)) {
+        ? !dc_isar_feature(aa32_thumb_div, s)
+        : !dc_isar_feature(aa32_arm_div, s)) {
         return false;
     }
 
-- 
2.20.1

Our current usage of the isar_feature feature tests almost always
uses an _aa32_ test when the code path is known to be AArch32
specific and an _aa64_ test when the code path is known to be
AArch64 specific. There is just one exception: in the vfp_set_fpscr
helper we check aa64_fp16 to determine whether the FZ16 bit in
the FP(S)CR exists, but this code is also used for AArch32.
There are other places in future where we're likely to want
a general "does this feature exist for either AArch32 or
AArch64" check (typically where architecturally the feature exists
for both CPU states if it exists at all, but the CPU might be
AArch32-only or AArch64-only, and so only have one set of ID
registers).

Introduce a new category of isar_feature_* functions:
isar_feature_any_foo() should be tested when what we want to
know is "does this feature exist for either AArch32 or AArch64",
and always returns the logical OR of isar_feature_aa32_foo()
and isar_feature_aa64_foo().
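
The intended shape of such a function can be sketched on its own as below. DemoIdRegs, its fields and the demo_feature_* helpers are invented for illustration and do not correspond to real ID register layouts or to ARMISARegisters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical ID register set; field names and encodings are made up. */
typedef struct {
    uint32_t aa32_id;   /* stands in for a 32-bit ID register */
    uint64_t aa64_id;   /* stands in for a 64-bit ID register */
} DemoIdRegs;

static bool demo_feature_aa32_foo(const DemoIdRegs *id)
{
    return (id->aa32_id & 0xf) != 0;
}

static bool demo_feature_aa64_foo(const DemoIdRegs *id)
{
    return (id->aa64_id & 0xf) != 0;
}

/* "Does this feature exist at all": logical OR of the two tests. */
static bool demo_feature_any_foo(const DemoIdRegs *id)
{
    return demo_feature_aa64_foo(id) || demo_feature_aa32_foo(id);
}
```

A CPU model that only fills in the 32-bit ID register still passes the _any_ test, which is the AArch32-only case the commit message is concerned with.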

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20200214175116.9164-4-peter.maydell@linaro.org
---
 target/arm/cpu.h        | 19 ++++++++++++++++++-
 target/arm/vfp_helper.c |  2 +-
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ extern const uint64_t pred_esz_masks[4];
  * Naming convention for isar_feature functions:
  * Functions which test 32-bit ID registers should have _aa32_ in
  * their name. Functions which test 64-bit ID registers should have
- * _aa64_ in their name.
+ * _aa64_ in their name. These must only be used in code where we
+ * know for certain that the CPU has AArch32 or AArch64 respectively
+ * or where the correct answer for a CPU which doesn't implement that
+ * CPU state is "false" (eg when generating A32 or A64 code, if adding
+ * system registers that are specific to that CPU state, for "should
+ * we let this system register bit be set" tests where the 32-bit
+ * flavour of the register doesn't have the bit, and so on).
+ * Functions which simply ask "does this feature exist at all" have
+ * _any_ in their name, and always return the logical OR of the _aa64_
+ * and the _aa32_ function.
  */
 
 /*
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa64_bti(const ARMISARegisters *id)
     return FIELD_EX64(id->id_aa64pfr1, ID_AA64PFR1, BT) != 0;
 }
 
+/*
+ * Feature tests for "does this exist in either 32-bit or 64-bit?"
+ */
+static inline bool isar_feature_any_fp16(const ARMISARegisters *id)
+{
+    return isar_feature_aa64_fp16(id) || isar_feature_aa32_fp16_arith(id);
+}
+
 /*
  * Forward to the above feature tests given an ARMCPU pointer.
  */
diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vfp_helper.c
+++ b/target/arm/vfp_helper.c
@@ -XXX,XX +XXX,XX @@ uint32_t vfp_get_fpscr(CPUARMState *env)
 void HELPER(vfp_set_fpscr)(CPUARMState *env, uint32_t val)
 {
     /* When ARMv8.2-FP16 is not supported, FZ16 is RES0.  */
-    if (!cpu_isar_feature(aa64_fp16, env_archcpu(env))) {
+    if (!cpu_isar_feature(any_fp16, env_archcpu(env))) {
         val &= ~FPCR_FZ16;
     }
 
-- 
2.20.1

Instead of open-coding "ARM_FEATURE_AARCH64 ? aa64_predinv: aa32_predinv",
define and use an any_predinv isar_feature test function.

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20200214175116.9164-5-peter.maydell@linaro.org
---
 target/arm/cpu.h    | 5 +++++
 target/arm/helper.c | 9 +--------
 2 files changed, 6 insertions(+), 8 deletions(-)

Pull the code that defines the various PMU registers out
into its own function, matching the pattern we have
already for the debug registers.

Apart from one style fix to a multi-line comment, this
is purely movement of code with no changes to it.

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20200214175116.9164-6-peter.maydell@linaro.org
---
 target/arm/helper.c | 158 +++++++++++++++++++++++---------------------
 1 file changed, 82 insertions(+), 76 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void define_debug_regs(ARMCPU *cpu)
     }
 }
 
+static void define_pmu_regs(ARMCPU *cpu)
+{
+    /*
+     * v7 performance monitor control register: same implementor
+     * field as main ID register, and we implement four counters in
+     * addition to the cycle count register.
+     */
+    unsigned int i, pmcrn = 4;
+    ARMCPRegInfo pmcr = {
+        .name = "PMCR", .cp = 15, .crn = 9, .crm = 12, .opc1 = 0, .opc2 = 0,
+        .access = PL0_RW,
+        .type = ARM_CP_IO | ARM_CP_ALIAS,
+        .fieldoffset = offsetoflow32(CPUARMState, cp15.c9_pmcr),
+        .accessfn = pmreg_access, .writefn = pmcr_write,
+        .raw_writefn = raw_write,
+    };
+    ARMCPRegInfo pmcr64 = {
+        .name = "PMCR_EL0", .state = ARM_CP_STATE_AA64,
+        .opc0 = 3, .opc1 = 3, .crn = 9, .crm = 12, .opc2 = 0,
+        .access = PL0_RW, .accessfn = pmreg_access,
+        .type = ARM_CP_IO,
+        .fieldoffset = offsetof(CPUARMState, cp15.c9_pmcr),
+        .resetvalue = (cpu->midr & 0xff000000) | (pmcrn << PMCRN_SHIFT),
+        .writefn = pmcr_write, .raw_writefn = raw_write,
+    };
+    define_one_arm_cp_reg(cpu, &pmcr);
+    define_one_arm_cp_reg(cpu, &pmcr64);
+    for (i = 0; i < pmcrn; i++) {
+        char *pmevcntr_name = g_strdup_printf("PMEVCNTR%d", i);
+        char *pmevcntr_el0_name = g_strdup_printf("PMEVCNTR%d_EL0", i);
+        char *pmevtyper_name = g_strdup_printf("PMEVTYPER%d", i);
+        char *pmevtyper_el0_name = g_strdup_printf("PMEVTYPER%d_EL0", i);
+        ARMCPRegInfo pmev_regs[] = {
+            { .name = pmevcntr_name, .cp = 15, .crn = 14,
+              .crm = 8 | (3 & (i >> 3)), .opc1 = 0, .opc2 = i & 7,
+              .access = PL0_RW, .type = ARM_CP_IO | ARM_CP_ALIAS,
+              .readfn = pmevcntr_readfn, .writefn = pmevcntr_writefn,
+              .accessfn = pmreg_access },
+            { .name = pmevcntr_el0_name, .state = ARM_CP_STATE_AA64,
+              .opc0 = 3, .opc1 = 3, .crn = 14, .crm = 8 | (3 & (i >> 3)),
+              .opc2 = i & 7, .access = PL0_RW, .accessfn = pmreg_access,
+              .type = ARM_CP_IO,
+              .readfn = pmevcntr_readfn, .writefn = pmevcntr_writefn,
+              .raw_readfn = pmevcntr_rawread,
+              .raw_writefn = pmevcntr_rawwrite },
+            { .name = pmevtyper_name, .cp = 15, .crn = 14,
+              .crm = 12 | (3 & (i >> 3)), .opc1 = 0, .opc2 = i & 7,
+              .access = PL0_RW, .type = ARM_CP_IO | ARM_CP_ALIAS,
+              .readfn = pmevtyper_readfn, .writefn = pmevtyper_writefn,
+              .accessfn = pmreg_access },
+            { .name = pmevtyper_el0_name, .state = ARM_CP_STATE_AA64,
+              .opc0 = 3, .opc1 = 3, .crn = 14, .crm = 12 | (3 & (i >> 3)),
+              .opc2 = i & 7, .access = PL0_RW, .accessfn = pmreg_access,
+              .type = ARM_CP_IO,
+              .readfn = pmevtyper_readfn, .writefn = pmevtyper_writefn,
+              .raw_writefn = pmevtyper_rawwrite },
+            REGINFO_SENTINEL
+        };
+        define_arm_cp_regs(cpu, pmev_regs);
+        g_free(pmevcntr_name);
+        g_free(pmevcntr_el0_name);
+        g_free(pmevtyper_name);
+        g_free(pmevtyper_el0_name);
+    }
+    if (FIELD_EX32(cpu->id_dfr0, ID_DFR0, PERFMON) >= 4 &&
+            FIELD_EX32(cpu->id_dfr0, ID_DFR0, PERFMON) != 0xf) {
+        ARMCPRegInfo v81_pmu_regs[] = {
+            { .name = "PMCEID2", .state = ARM_CP_STATE_AA32,
+              .cp = 15, .opc1 = 0, .crn = 9, .crm = 14, .opc2 = 4,
+              .access = PL0_R, .accessfn = pmreg_access, .type = ARM_CP_CONST,
+              .resetvalue = extract64(cpu->pmceid0, 32, 32) },
+            { .name = "PMCEID3", .state = ARM_CP_STATE_AA32,
+              .cp = 15, .opc1 = 0, .crn = 9, .crm = 14, .opc2 = 5,
+              .access = PL0_R, .accessfn = pmreg_access, .type = ARM_CP_CONST,
+              .resetvalue = extract64(cpu->pmceid1, 32, 32) },
+            REGINFO_SENTINEL
+        };
+        define_arm_cp_regs(cpu, v81_pmu_regs);
+    }
+}
+
 /* We don't know until after realize whether there's a GICv3
  * attached, and that is what registers the gicv3 sysregs.
  * So we have to fill in the GIC fields in ID_PFR/ID_PFR1_EL1/ID_AA64PFR0_EL1
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
         define_arm_cp_regs(cpu, pmovsset_cp_reginfo);
     }
     if (arm_feature(env, ARM_FEATURE_V7)) {
-        /* v7 performance monitor control register: same implementor
-         * field as main ID register, and we implement four counters in
-         * addition to the cycle count register.
-         */
-        unsigned int i, pmcrn = 4;
-        ARMCPRegInfo pmcr = {
-            .name = "PMCR", .cp = 15, .crn = 9, .crm = 12, .opc1 = 0, .opc2 = 0,
-            .access = PL0_RW,
-            .type = ARM_CP_IO | ARM_CP_ALIAS,
-            .fieldoffset = offsetoflow32(CPUARMState, cp15.c9_pmcr),
-            .accessfn = pmreg_access, .writefn = pmcr_write,
-            .raw_writefn = raw_write,
-        };
-        ARMCPRegInfo pmcr64 = {
-            .name = "PMCR_EL0", .state = ARM_CP_STATE_AA64,
-            .opc0 = 3, .opc1 = 3, .crn = 9, .crm = 12, .opc2 = 0,
-            .access = PL0_RW, .accessfn = pmreg_access,
-            .type = ARM_CP_IO,
-            .fieldoffset = offsetof(CPUARMState, cp15.c9_pmcr),
-            .resetvalue = (cpu->midr & 0xff000000) | (pmcrn << PMCRN_SHIFT),
-            .writefn = pmcr_write, .raw_writefn = raw_write,
-        };
-        define_one_arm_cp_reg(cpu, &pmcr);
-        define_one_arm_cp_reg(cpu, &pmcr64);
-        for (i = 0; i < pmcrn; i++) {
-            char *pmevcntr_name = g_strdup_printf("PMEVCNTR%d", i);
-            char *pmevcntr_el0_name = g_strdup_printf("PMEVCNTR%d_EL0", i);
-            char *pmevtyper_name = g_strdup_printf("PMEVTYPER%d", i);
-            char *pmevtyper_el0_name = g_strdup_printf("PMEVTYPER%d_EL0", i);
-            ARMCPRegInfo pmev_regs[] = {
-                { .name = pmevcntr_name, .cp = 15, .crn = 14,
-                  .crm = 8 | (3 & (i >> 3)), .opc1 = 0, .opc2 = i & 7,
-                  .access = PL0_RW, .type = ARM_CP_IO | ARM_CP_ALIAS,
-                  .readfn = pmevcntr_readfn, .writefn = pmevcntr_writefn,
-                  .accessfn = pmreg_access },
-                { .name = pmevcntr_el0_name, .state = ARM_CP_STATE_AA64,
-                  .opc0 = 3, .opc1 = 3, .crn = 14, .crm = 8 | (3 & (i >> 3)),
-                  .opc2 = i & 7, .access = PL0_RW, .accessfn = pmreg_access,
-                  .type = ARM_CP_IO,
-                  .readfn = pmevcntr_readfn, .writefn = pmevcntr_writefn,
-                  .raw_readfn = pmevcntr_rawread,
-                  .raw_writefn = pmevcntr_rawwrite },
-                { .name = pmevtyper_name, .cp = 15, .crn = 14,
-                  .crm = 12 | (3 & (i >> 3)), .opc1 = 0, .opc2 = i & 7,
-                  .access = PL0_RW, .type = ARM_CP_IO | ARM_CP_ALIAS,
-                  .readfn = pmevtyper_readfn, .writefn = pmevtyper_writefn,
-                  .accessfn = pmreg_access },
-                { .name = pmevtyper_el0_name, .state = ARM_CP_STATE_AA64,
-                  .opc0 = 3, .opc1 = 3, .crn = 14, .crm = 12 | (3 & (i >> 3)),
-                  .opc2 = i & 7, .access = PL0_RW, .accessfn = pmreg_access,
-                  .type = ARM_CP_IO,
-                  .readfn = pmevtyper_readfn, .writefn = pmevtyper_writefn,
-                  .raw_writefn = pmevtyper_rawwrite },
-                REGINFO_SENTINEL
-            };
-            define_arm_cp_regs(cpu, pmev_regs);
-            g_free(pmevcntr_name);
-            g_free(pmevcntr_el0_name);
-            g_free(pmevtyper_name);
-            g_free(pmevtyper_el0_name);
-        }
         ARMCPRegInfo clidr = {
             .name = "CLIDR", .state = ARM_CP_STATE_BOTH,
             .opc0 = 3, .crn = 0, .crm = 0, .opc1 = 1, .opc2 = 1,
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
         define_one_arm_cp_reg(cpu, &clidr);
         define_arm_cp_regs(cpu, v7_cp_reginfo);
         define_debug_regs(cpu);
+        define_pmu_regs(cpu);
     } else {
         define_arm_cp_regs(cpu, not_v7_cp_reginfo);
     }
-    if (FIELD_EX32(cpu->id_dfr0, ID_DFR0, PERFMON) >= 4 &&
-            FIELD_EX32(cpu->id_dfr0, ID_DFR0, PERFMON) != 0xf) {
-        ARMCPRegInfo v81_pmu_regs[] = {
-            { .name = "PMCEID2", .state = ARM_CP_STATE_AA32,
-              .cp = 15, .opc1 = 0, .crn = 9, .crm = 14, .opc2 = 4,
-              .access = PL0_R, .accessfn = pmreg_access, .type = ARM_CP_CONST,
-              .resetvalue = extract64(cpu->pmceid0, 32, 32) },
-            { .name = "PMCEID3", .state = ARM_CP_STATE_AA32,
-              .cp = 15, .opc1 = 0, .crn = 9, .crm = 14, .opc2 = 5,
-              .access = PL0_R, .accessfn = pmreg_access, .type = ARM_CP_CONST,
-              .resetvalue = extract64(cpu->pmceid1, 32, 32) },
-            REGINFO_SENTINEL
-        };
-        define_arm_cp_regs(cpu, v81_pmu_regs);
-    }
     if (arm_feature(env, ARM_FEATURE_V8)) {
         /* AArch64 ID registers, which all have impdef reset values.
          * Note that within the ID register ranges the unused slots
-- 
2.20.1

Add FIELD() definitions for the ID_AA64DFR0_EL1 register and use
them where we currently have hard-coded bit values.

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20200214175116.9164-7-peter.maydell@linaro.org
---
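Note for reviewers: a minimal standalone sketch of why the named-field
accessors are equivalent to the open-coded masks they replace. The
sketch_* helpers are simplified stand-ins written for this note and are
not the real macros from hw/registerfields.h; only the PMUVER and BRPS
positions are taken from this patch, and the sample register value is
the Cortex-A57 id_aa64dfr0 used elsewhere in this series.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-in for a 64-bit field extract (not QEMU's macro). */
static inline uint64_t sketch_extract64(uint64_t value, int shift, int length)
{
    assert(shift >= 0 && length > 0 && length < 64 && shift + length <= 64);
    return (value >> shift) & ((UINT64_C(1) << length) - 1);
}

/* Illustrative stand-in for a 64-bit field deposit (not QEMU's macro). */
static inline uint64_t sketch_deposit64(uint64_t value, int shift, int length,
                                        uint64_t field)
{
    assert(shift >= 0 && length > 0 && length < 64 && shift + length <= 64);
    uint64_t mask = ((UINT64_C(1) << length) - 1) << shift;
    return (value & ~mask) | ((field << shift) & mask);
}

/* Field positions as added to target/arm/cpu.h by this patch. */
enum {
    SKETCH_PMUVER_SHIFT = 8,  SKETCH_PMUVER_LENGTH = 4,
    SKETCH_BRPS_SHIFT   = 12, SKETCH_BRPS_LENGTH   = 4,
};

/* Same effect as the old "id_aa64dfr0 &= ~0xf00", with the field named. */
static inline uint64_t sketch_clear_pmuver(uint64_t id_aa64dfr0)
{
    return sketch_deposit64(id_aa64dfr0, SKETCH_PMUVER_SHIFT,
                            SKETCH_PMUVER_LENGTH, 0);
}

/* Same effect as the old "extract32(id_aa64dfr0, 12, 4)". */
static inline uint64_t sketch_get_brps(uint64_t id_aa64dfr0)
{
    return sketch_extract64(id_aa64dfr0, SKETCH_BRPS_SHIFT,
                            SKETCH_BRPS_LENGTH);
}
```

Because the shift and width live in one place, a reader no longer has to
decode 0xf00 or "12, 4" against the Arm ARM to see which field is meant.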
 target/arm/cpu.h    | 10 ++++++++++
 target/arm/cpu.c    |  2 +-
 target/arm/helper.c |  6 +++---
 3 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ FIELD(ID_AA64MMFR2, BBM, 52, 4)
 FIELD(ID_AA64MMFR2, EVT, 56, 4)
 FIELD(ID_AA64MMFR2, E0PD, 60, 4)
 
+FIELD(ID_AA64DFR0, DEBUGVER, 0, 4)
+FIELD(ID_AA64DFR0, TRACEVER, 4, 4)
+FIELD(ID_AA64DFR0, PMUVER, 8, 4)
+FIELD(ID_AA64DFR0, BRPS, 12, 4)
+FIELD(ID_AA64DFR0, WRPS, 20, 4)
+FIELD(ID_AA64DFR0, CTX_CMPS, 28, 4)
+FIELD(ID_AA64DFR0, PMSVER, 32, 4)
+FIELD(ID_AA64DFR0, DOUBLELOCK, 36, 4)
+FIELD(ID_AA64DFR0, TRACEFILT, 40, 4)
+
 FIELD(ID_DFR0, COPDBG, 0, 4)
 FIELD(ID_DFR0, COPSDBG, 4, 4)
 FIELD(ID_DFR0, MMAPDBG, 8, 4)
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
                 cpu);
 #endif
     } else {
-        cpu->id_aa64dfr0 &= ~0xf00;
+        cpu->id_aa64dfr0 = FIELD_DP64(cpu->id_aa64dfr0, ID_AA64DFR0, PMUVER, 0);
         cpu->id_dfr0 &= ~(0xf << 24);
         cpu->pmceid0 = 0;
         cpu->pmceid1 = 0;
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void define_debug_regs(ARMCPU *cpu)
      * check that if they both exist then they agree.
      */
     if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64)) {
-        assert(extract32(cpu->id_aa64dfr0, 12, 4) == brps);
-        assert(extract32(cpu->id_aa64dfr0, 20, 4) == wrps);
-        assert(extract32(cpu->id_aa64dfr0, 28, 4) == ctx_cmps);
+        assert(FIELD_EX64(cpu->id_aa64dfr0, ID_AA64DFR0, BRPS) == brps);
+        assert(FIELD_EX64(cpu->id_aa64dfr0, ID_AA64DFR0, WRPS) == wrps);
+        assert(FIELD_EX64(cpu->id_aa64dfr0, ID_AA64DFR0, CTX_CMPS) == ctx_cmps);
     }
 
     define_one_arm_cp_reg(cpu, &dbgdidr);
-- 
2.20.1

Instead of open-coding a check on the ID_DFR0 PerfMon ID register
field, create a standardly-named isar_feature for "does AArch32 have
a v8.1 PMUv3" and use it.

This entails moving the id_dfr0 field into the ARMISARegisters struct.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20200214175116.9164-9-peter.maydell@linaro.org
---
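Note for reviewers: a small standalone sketch of the predicate this patch
introduces, assuming only the PERFMON field position (bits [27:24] of
ID_DFR0) from cpu.h. The sketch_* names are hypothetical and exist only
for this note; the real helper takes an ARMISARegisters pointer.

```c
#include <stdbool.h>
#include <stdint.h>

/* ID_DFR0.PerfMon lives in bits [27:24], per FIELD(ID_DFR0, PERFMON, 24, 4). */
static inline uint32_t sketch_id_dfr0_perfmon(uint32_t id_dfr0)
{
    return (id_dfr0 >> 24) & 0xf;
}

/*
 * "Does AArch32 have a v8.1 PMUv3?": the field must be at least 4,
 * but 0xf is excluded because it means "non-standard IMPDEF PMU".
 */
static inline bool sketch_aa32_pmu_8_1(uint32_t id_dfr0)
{
    uint32_t perfmon = sketch_id_dfr0_perfmon(id_dfr0);

    return perfmon >= 4 && perfmon != 0xf;
}
```

With the reset values touched by this patch, Cortex-A15 (0x02010555) and
Cortex-A57 (0x03010066) both have PerfMon below 4, so neither would get
the PMCEID2/PMCEID3 registers.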
 target/arm/cpu.h      |  9 ++++++++-
 hw/intc/armv7m_nvic.c |  2 +-
 target/arm/cpu.c      | 28 ++++++++++++++--------------
 target/arm/cpu64.c    |  6 +++---
 target/arm/helper.c   |  5 ++---
 5 files changed, 28 insertions(+), 22 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ struct ARMCPU {
         uint32_t mvfr0;
         uint32_t mvfr1;
         uint32_t mvfr2;
+        uint32_t id_dfr0;
         uint64_t id_aa64isar0;
         uint64_t id_aa64isar1;
         uint64_t id_aa64pfr0;
@@ -XXX,XX +XXX,XX @@ struct ARMCPU {
     uint32_t reset_sctlr;
     uint32_t id_pfr0;
     uint32_t id_pfr1;
-    uint32_t id_dfr0;
     uint64_t pmceid0;
     uint64_t pmceid1;
     uint32_t id_afr0;
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa32_ats1e1(const ARMISARegisters *id)
     return FIELD_EX64(id->mvfr0, ID_MMFR3, PAN) >= 2;
 }
 
+static inline bool isar_feature_aa32_pmu_8_1(const ARMISARegisters *id)
+{
+    /* 0xf means "non-standard IMPDEF PMU" */
+    return FIELD_EX32(id->id_dfr0, ID_DFR0, PERFMON) >= 4 &&
+        FIELD_EX32(id->id_dfr0, ID_DFR0, PERFMON) != 0xf;
+}
+
 /*
  * 64-bit feature tests via id registers.
  */
diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/intc/armv7m_nvic.c
+++ b/hw/intc/armv7m_nvic.c
@@ -XXX,XX +XXX,XX @@ static uint32_t nvic_readl(NVICState *s, uint32_t offset, MemTxAttrs attrs)
     case 0xd44: /* PFR1.  */
         return cpu->id_pfr1;
     case 0xd48: /* DFR0.  */
-        return cpu->id_dfr0;
+        return cpu->isar.id_dfr0;
     case 0xd4c: /* AFR0.  */
         return cpu->id_afr0;
     case 0xd50: /* MMFR0.  */
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
 #endif
     } else {
         cpu->id_aa64dfr0 = FIELD_DP64(cpu->id_aa64dfr0, ID_AA64DFR0, PMUVER, 0);
-        cpu->id_dfr0 = FIELD_DP32(cpu->id_dfr0, ID_DFR0, PERFMON, 0);
+        cpu->isar.id_dfr0 = FIELD_DP32(cpu->isar.id_dfr0, ID_DFR0, PERFMON, 0);
         cpu->pmceid0 = 0;
         cpu->pmceid1 = 0;
     }
@@ -XXX,XX +XXX,XX @@ static void arm1136_r2_initfn(Object *obj)
     cpu->reset_sctlr = 0x00050078;
     cpu->id_pfr0 = 0x111;
     cpu->id_pfr1 = 0x1;
-    cpu->id_dfr0 = 0x2;
+    cpu->isar.id_dfr0 = 0x2;
     cpu->id_afr0 = 0x3;
     cpu->id_mmfr0 = 0x01130003;
     cpu->id_mmfr1 = 0x10030302;
@@ -XXX,XX +XXX,XX @@ static void arm1136_initfn(Object *obj)
     cpu->reset_sctlr = 0x00050078;
     cpu->id_pfr0 = 0x111;
     cpu->id_pfr1 = 0x1;
-    cpu->id_dfr0 = 0x2;
+    cpu->isar.id_dfr0 = 0x2;
     cpu->id_afr0 = 0x3;
     cpu->id_mmfr0 = 0x01130003;
     cpu->id_mmfr1 = 0x10030302;
@@ -XXX,XX +XXX,XX @@ static void arm1176_initfn(Object *obj)
     cpu->reset_sctlr = 0x00050078;
     cpu->id_pfr0 = 0x111;
     cpu->id_pfr1 = 0x11;
-    cpu->id_dfr0 = 0x33;
+    cpu->isar.id_dfr0 = 0x33;
     cpu->id_afr0 = 0;
     cpu->id_mmfr0 = 0x01130003;
     cpu->id_mmfr1 = 0x10030302;
@@ -XXX,XX +XXX,XX @@ static void arm11mpcore_initfn(Object *obj)
     cpu->ctr = 0x1d192992; /* 32K icache 32K dcache */
     cpu->id_pfr0 = 0x111;
     cpu->id_pfr1 = 0x1;
-    cpu->id_dfr0 = 0;
+    cpu->isar.id_dfr0 = 0;
     cpu->id_afr0 = 0x2;
     cpu->id_mmfr0 = 0x01100103;
     cpu->id_mmfr1 = 0x10020302;
@@ -XXX,XX +XXX,XX @@ static void cortex_m3_initfn(Object *obj)
     cpu->pmsav7_dregion = 8;
     cpu->id_pfr0 = 0x00000030;
     cpu->id_pfr1 = 0x00000200;
-    cpu->id_dfr0 = 0x00100000;
+    cpu->isar.id_dfr0 = 0x00100000;
     cpu->id_afr0 = 0x00000000;
     cpu->id_mmfr0 = 0x00000030;
     cpu->id_mmfr1 = 0x00000000;
@@ -XXX,XX +XXX,XX @@ static void cortex_m4_initfn(Object *obj)
     cpu->isar.mvfr2 = 0x00000000;
     cpu->id_pfr0 = 0x00000030;
     cpu->id_pfr1 = 0x00000200;
-    cpu->id_dfr0 = 0x00100000;
+    cpu->isar.id_dfr0 = 0x00100000;
     cpu->id_afr0 = 0x00000000;
     cpu->id_mmfr0 = 0x00000030;
     cpu->id_mmfr1 = 0x00000000;
@@ -XXX,XX +XXX,XX @@ static void cortex_m7_initfn(Object *obj)
     cpu->isar.mvfr2 = 0x00000040;
     cpu->id_pfr0 = 0x00000030;
     cpu->id_pfr1 = 0x00000200;
-    cpu->id_dfr0 = 0x00100000;
+    cpu->isar.id_dfr0 = 0x00100000;
     cpu->id_afr0 = 0x00000000;
     cpu->id_mmfr0 = 0x00100030;
     cpu->id_mmfr1 = 0x00000000;
@@ -XXX,XX +XXX,XX @@ static void cortex_m33_initfn(Object *obj)
     cpu->isar.mvfr2 = 0x00000040;
     cpu->id_pfr0 = 0x00000030;
     cpu->id_pfr1 = 0x00000210;
-    cpu->id_dfr0 = 0x00200000;
+    cpu->isar.id_dfr0 = 0x00200000;
     cpu->id_afr0 = 0x00000000;
     cpu->id_mmfr0 = 0x00101F40;
     cpu->id_mmfr1 = 0x00000000;
@@ -XXX,XX +XXX,XX @@ static void cortex_r5_initfn(Object *obj)
     cpu->midr = 0x411fc153; /* r1p3 */
     cpu->id_pfr0 = 0x0131;
     cpu->id_pfr1 = 0x001;
-    cpu->id_dfr0 = 0x010400;
+    cpu->isar.id_dfr0 = 0x010400;
     cpu->id_afr0 = 0x0;
     cpu->id_mmfr0 = 0x0210030;
     cpu->id_mmfr1 = 0x00000000;
@@ -XXX,XX +XXX,XX @@ static void cortex_a8_initfn(Object *obj)
     cpu->reset_sctlr = 0x00c50078;
     cpu->id_pfr0 = 0x1031;
     cpu->id_pfr1 = 0x11;
-    cpu->id_dfr0 = 0x400;
+    cpu->isar.id_dfr0 = 0x400;
     cpu->id_afr0 = 0;
     cpu->id_mmfr0 = 0x31100003;
     cpu->id_mmfr1 = 0x20000000;
@@ -XXX,XX +XXX,XX @@ static void cortex_a9_initfn(Object *obj)
     cpu->reset_sctlr = 0x00c50078;
     cpu->id_pfr0 = 0x1031;
     cpu->id_pfr1 = 0x11;
-    cpu->id_dfr0 = 0x000;
+    cpu->isar.id_dfr0 = 0x000;
     cpu->id_afr0 = 0;
     cpu->id_mmfr0 = 0x00100103;
     cpu->id_mmfr1 = 0x20000000;
@@ -XXX,XX +XXX,XX @@ static void cortex_a7_initfn(Object *obj)
     cpu->reset_sctlr = 0x00c50078;
     cpu->id_pfr0 = 0x00001131;
     cpu->id_pfr1 = 0x00011011;
-    cpu->id_dfr0 = 0x02010555;
+    cpu->isar.id_dfr0 = 0x02010555;
     cpu->id_afr0 = 0x00000000;
     cpu->id_mmfr0 = 0x10101105;
     cpu->id_mmfr1 = 0x40000000;
@@ -XXX,XX +XXX,XX @@ static void cortex_a15_initfn(Object *obj)
     cpu->reset_sctlr = 0x00c50078;
     cpu->id_pfr0 = 0x00001131;
     cpu->id_pfr1 = 0x00011011;
-    cpu->id_dfr0 = 0x02010555;
+    cpu->isar.id_dfr0 = 0x02010555;
     cpu->id_afr0 = 0x00000000;
     cpu->id_mmfr0 = 0x10201105;
     cpu->id_mmfr1 = 0x20000000;
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_a57_initfn(Object *obj)
     cpu->reset_sctlr = 0x00c50838;
     cpu->id_pfr0 = 0x00000131;
     cpu->id_pfr1 = 0x00011011;
-    cpu->id_dfr0 = 0x03010066;
+    cpu->isar.id_dfr0 = 0x03010066;
     cpu->id_afr0 = 0x00000000;
     cpu->id_mmfr0 = 0x10101105;
     cpu->id_mmfr1 = 0x40000000;
@@ -XXX,XX +XXX,XX @@ static void aarch64_a53_initfn(Object *obj)
     cpu->reset_sctlr = 0x00c50838;
     cpu->id_pfr0 = 0x00000131;
     cpu->id_pfr1 = 0x00011011;
-    cpu->id_dfr0 = 0x03010066;
+    cpu->isar.id_dfr0 = 0x03010066;
     cpu->id_afr0 = 0x00000000;
     cpu->id_mmfr0 = 0x10101105;
     cpu->id_mmfr1 = 0x40000000;
@@ -XXX,XX +XXX,XX @@ static void aarch64_a72_initfn(Object *obj)
     cpu->reset_sctlr = 0x00c50838;
     cpu->id_pfr0 = 0x00000131;
     cpu->id_pfr1 = 0x00011011;
-    cpu->id_dfr0 = 0x03010066;
+    cpu->isar.id_dfr0 = 0x03010066;
     cpu->id_afr0 = 0x00000000;
     cpu->id_mmfr0 = 0x10201105;
     cpu->id_mmfr1 = 0x40000000;
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void define_pmu_regs(ARMCPU *cpu)
         g_free(pmevtyper_name);
         g_free(pmevtyper_el0_name);
     }
-    if (FIELD_EX32(cpu->id_dfr0, ID_DFR0, PERFMON) >= 4 &&
-            FIELD_EX32(cpu->id_dfr0, ID_DFR0, PERFMON) != 0xf) {
+    if (cpu_isar_feature(aa32_pmu_8_1, cpu)) {
         ARMCPRegInfo v81_pmu_regs[] = {
             { .name = "PMCEID2", .state = ARM_CP_STATE_AA32,
               .cp = 15, .opc1 = 0, .crn = 9, .crm = 14, .opc2 = 4,
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 1, .opc2 = 2,
               .access = PL1_R, .type = ARM_CP_CONST,
               .accessfn = access_aa32_tid3,
-              .resetvalue = cpu->id_dfr0 },
+              .resetvalue = cpu->isar.id_dfr0 },
             { .name = "ID_AFR0", .state = ARM_CP_STATE_BOTH,
               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 1, .opc2 = 3,
               .access = PL1_R, .type = ARM_CP_CONST,
-- 
2.20.1

Add the 64-bit version of the "is this a v8.1 PMUv3?"
ID register check function, and the _any_ version that
checks for either AArch32 or AArch64 support. We'll use
this in a later commit.

We don't (yet) do any isar_feature checks on ID_AA64DFR1_EL1,
but we move id_aa64dfr1 into the ARMISARegisters struct with
id_aa64dfr0, for consistency.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20200214175116.9164-10-peter.maydell@linaro.org
---
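Note for reviewers: a standalone sketch of how the aa32, aa64 and _any_
checks compose. struct sketch_isar is a hypothetical two-member stand-in
for ARMISARegisters, and the sketch_* helpers are written for this note;
only the field positions (PERFMON at [27:24], PMUVER at [11:8]) come from
cpu.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in holding just the two ID registers we look at. */
struct sketch_isar {
    uint32_t id_dfr0;     /* AArch32 debug feature register 0 */
    uint64_t id_aa64dfr0; /* AArch64 debug feature register 0 */
};

static inline bool sketch_aa32_pmu_8_1(const struct sketch_isar *id)
{
    uint32_t perfmon = (id->id_dfr0 >> 24) & 0xf;

    /* 0xf means "non-standard IMPDEF PMU" */
    return perfmon >= 4 && perfmon != 0xf;
}

static inline bool sketch_aa64_pmu_8_1(const struct sketch_isar *id)
{
    uint64_t pmuver = (id->id_aa64dfr0 >> 8) & 0xf;

    /* Same encoding rule as the AArch32 field. */
    return pmuver >= 4 && pmuver != 0xf;
}

/* True if either execution state reports a v8.1 PMUv3. */
static inline bool sketch_any_pmu_8_1(const struct sketch_isar *id)
{
    return sketch_aa64_pmu_8_1(id) || sketch_aa32_pmu_8_1(id);
}
```

The _any_ form matters for CPUs where only one of the two ID registers is
populated, since the other reads as zero and fails its own check.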
 target/arm/cpu.h    | 15 +++++++++++++--
 target/arm/cpu.c    |  3 ++-
 target/arm/cpu64.c  |  6 +++---
 target/arm/helper.c | 12 +++++++-----
 4 files changed, 25 insertions(+), 11 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ struct ARMCPU {
         uint64_t id_aa64mmfr0;
         uint64_t id_aa64mmfr1;
         uint64_t id_aa64mmfr2;
+        uint64_t id_aa64dfr0;
+        uint64_t id_aa64dfr1;
     } isar;
     uint32_t midr;
     uint32_t revidr;
@@ -XXX,XX +XXX,XX @@ struct ARMCPU {
     uint32_t id_mmfr2;
     uint32_t id_mmfr3;
     uint32_t id_mmfr4;
-    uint64_t id_aa64dfr0;
-    uint64_t id_aa64dfr1;
     uint64_t id_aa64afr0;
     uint64_t id_aa64afr1;
     uint32_t dbgdidr;
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa64_bti(const ARMISARegisters *id)
     return FIELD_EX64(id->id_aa64pfr1, ID_AA64PFR1, BT) != 0;
 }
 
+static inline bool isar_feature_aa64_pmu_8_1(const ARMISARegisters *id)
+{
+    return FIELD_EX64(id->id_aa64dfr0, ID_AA64DFR0, PMUVER) >= 4 &&
+        FIELD_EX64(id->id_aa64dfr0, ID_AA64DFR0, PMUVER) != 0xf;
+}
+
 /*
  * Feature tests for "does this exist in either 32-bit or 64-bit?"
  */
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_any_predinv(const ARMISARegisters *id)
     return isar_feature_aa64_predinv(id) || isar_feature_aa32_predinv(id);
 }
 
+static inline bool isar_feature_any_pmu_8_1(const ARMISARegisters *id)
+{
+    return isar_feature_aa64_pmu_8_1(id) || isar_feature_aa32_pmu_8_1(id);
+}
+
 /*
  * Forward to the above feature tests given an ARMCPU pointer.
  */
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
                 cpu);
 #endif
     } else {
-        cpu->id_aa64dfr0 = FIELD_DP64(cpu->id_aa64dfr0, ID_AA64DFR0, PMUVER, 0);
+        cpu->isar.id_aa64dfr0 =
+            FIELD_DP64(cpu->isar.id_aa64dfr0, ID_AA64DFR0, PMUVER, 0);
         cpu->isar.id_dfr0 = FIELD_DP32(cpu->isar.id_dfr0, ID_DFR0, PERFMON, 0);
         cpu->pmceid0 = 0;
         cpu->pmceid1 = 0;
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_a57_initfn(Object *obj)
     cpu->isar.id_isar5 = 0x00011121;
     cpu->isar.id_isar6 = 0;
     cpu->isar.id_aa64pfr0 = 0x00002222;
-    cpu->id_aa64dfr0 = 0x10305106;
+    cpu->isar.id_aa64dfr0 = 0x10305106;
     cpu->isar.id_aa64isar0 = 0x00011120;
     cpu->isar.id_aa64mmfr0 = 0x00001124;
     cpu->dbgdidr = 0x3516d000;
@@ -XXX,XX +XXX,XX @@ static void aarch64_a53_initfn(Object *obj)
     cpu->isar.id_isar5 = 0x00011121;
     cpu->isar.id_isar6 = 0;
     cpu->isar.id_aa64pfr0 = 0x00002222;
-    cpu->id_aa64dfr0 = 0x10305106;
+    cpu->isar.id_aa64dfr0 = 0x10305106;
     cpu->isar.id_aa64isar0 = 0x00011120;
     cpu->isar.id_aa64mmfr0 = 0x00001122; /* 40 bit physical addr */
     cpu->dbgdidr = 0x3516d000;
@@ -XXX,XX +XXX,XX @@ static void aarch64_a72_initfn(Object *obj)
     cpu->isar.id_isar4 = 0x00011142;
     cpu->isar.id_isar5 = 0x00011121;
     cpu->isar.id_aa64pfr0 = 0x00002222;
-    cpu->id_aa64dfr0 = 0x10305106;
+    cpu->isar.id_aa64dfr0 = 0x10305106;
     cpu->isar.id_aa64isar0 = 0x00011120;
     cpu->isar.id_aa64mmfr0 = 0x00001124;
     cpu->dbgdidr = 0x3516d000;
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@
 #include "hw/semihosting/semihost.h"
 #include "sysemu/cpus.h"
 #include "sysemu/kvm.h"
+#include "sysemu/tcg.h"
 #include "qemu/range.h"
 #include "qapi/qapi-commands-machine-target.h"
 #include "qapi/error.h"
@@ -XXX,XX +XXX,XX @@ static void define_debug_regs(ARMCPU *cpu)
      * check that if they both exist then they agree.
      */
     if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64)) {
-        assert(FIELD_EX64(cpu->id_aa64dfr0, ID_AA64DFR0, BRPS) == brps);
-        assert(FIELD_EX64(cpu->id_aa64dfr0, ID_AA64DFR0, WRPS) == wrps);
-        assert(FIELD_EX64(cpu->id_aa64dfr0, ID_AA64DFR0, CTX_CMPS) == ctx_cmps);
+        assert(FIELD_EX64(cpu->isar.id_aa64dfr0, ID_AA64DFR0, BRPS) == brps);
+        assert(FIELD_EX64(cpu->isar.id_aa64dfr0, ID_AA64DFR0, WRPS) == wrps);
+        assert(FIELD_EX64(cpu->isar.id_aa64dfr0, ID_AA64DFR0, CTX_CMPS)
+               == ctx_cmps);
     }
 
     define_one_arm_cp_reg(cpu, &dbgdidr);
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 5, .opc2 = 0,
               .access = PL1_R, .type = ARM_CP_CONST,
               .accessfn = access_aa64_tid3,
-              .resetvalue = cpu->id_aa64dfr0 },
+              .resetvalue = cpu->isar.id_aa64dfr0 },
             { .name = "ID_AA64DFR1_EL1", .state = ARM_CP_STATE_AA64,
               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 5, .opc2 = 1,
               .access = PL1_R, .type = ARM_CP_CONST,
               .accessfn = access_aa64_tid3,
-              .resetvalue = cpu->id_aa64dfr1 },
+              .resetvalue = cpu->isar.id_aa64dfr1 },
             { .name = "ID_AA64DFR2_EL1_RESERVED", .state = ARM_CP_STATE_AA64,
               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 5, .opc2 = 2,
               .access = PL1_R, .type = ARM_CP_CONST,
-- 
2.20.1

The AArch32 DBGDIDR defines properties like the number of
breakpoints, watchpoints and context-matching comparators.  On an
AArch64 CPU, the register may not even exist if AArch32 is not
supported at EL1.

Currently we hard-code use of DBGDIDR to identify the number of
breakpoints etc; this works for all our TCG CPUs, but will break if
we ever add an AArch64-only CPU.  We also have an assert() that the
AArch32 and AArch64 registers match, which currently works only by
luck for KVM because we don't populate either of these ID registers
from the KVM vCPU and so they are both zero.

Clean this up so we have functions for finding the number
of breakpoints, watchpoints and context comparators which look
in the appropriate ID register.

This allows us to drop the "check that AArch64 and AArch32 agree
on the number of breakpoints etc" asserts:
 * we no longer look at the AArch32 versions unless that's the
   right place to be looking
 * it's valid to have a CPU (eg AArch64-only) where they don't match
 * we shouldn't have been asserting the validity of ID registers
   in a codepath used with KVM anyway

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200214175116.9164-11-peter.maydell@linaro.org
---
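Note for reviewers: a standalone sketch of the "minus 1" conversion and
of why the range check in linked_bp_matches() changes from ">" to ">=".
struct sketch_cpu and the sketch_* helpers are hypothetical and written
for this note; the field positions are those of the FIELD() definitions
in cpu.h, and has_aarch64 stands in for the ARM_FEATURE_AARCH64 test.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the parts of ARMCPU used here. */
struct sketch_cpu {
    bool has_aarch64;
    uint64_t id_aa64dfr0;
    uint32_t dbgdidr;
};

/* Both ID registers store "count - 1"; return the actual count. */
static inline int sketch_num_brps(const struct sketch_cpu *cpu)
{
    if (cpu->has_aarch64) {
        return (int)((cpu->id_aa64dfr0 >> 12) & 0xf) + 1; /* BRPS [15:12] */
    }
    return (int)((cpu->dbgdidr >> 24) & 0xf) + 1;         /* BRPS [27:24] */
}

static inline int sketch_num_ctx_cmps(const struct sketch_cpu *cpu)
{
    if (cpu->has_aarch64) {
        return (int)((cpu->id_aa64dfr0 >> 28) & 0xf) + 1; /* CTX_CMPS [31:28] */
    }
    return (int)((cpu->dbgdidr >> 20) & 0xf) + 1;         /* CTX_CMPS [23:20] */
}

/*
 * Context-aware breakpoints are the highest-numbered ones. With real
 * counts, the valid linked indexes are [brps - ctx_cmps, brps), so the
 * upper bound is exclusive.
 */
static inline bool sketch_lbn_in_range(const struct sketch_cpu *cpu, int lbn)
{
    int brps = sketch_num_brps(cpu);
    int ctx_cmps = sketch_num_ctx_cmps(cpu);

    return !(lbn >= brps || lbn < (brps - ctx_cmps));
}
```

For the Cortex-A57 values in this series (id_aa64dfr0 0x10305106) this
gives 6 breakpoints and 2 context comparators, so only breakpoints 4 and
5 are acceptable as linked context breakpoints.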
 target/arm/cpu.h          |  7 +++++++
 target/arm/internals.h    | 42 +++++++++++++++++++++++++++++++++++++++
 target/arm/debug_helper.c |  6 +++---
 target/arm/helper.c       | 21 +++++---------------
 4 files changed, 57 insertions(+), 19 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ FIELD(ID_DFR0, MPROFDBG, 20, 4)
 FIELD(ID_DFR0, PERFMON, 24, 4)
 FIELD(ID_DFR0, TRACEFILT, 28, 4)
 
+FIELD(DBGDIDR, SE_IMP, 12, 1)
+FIELD(DBGDIDR, NSUHD_IMP, 14, 1)
+FIELD(DBGDIDR, VERSION, 16, 4)
+FIELD(DBGDIDR, CTX_CMPS, 20, 4)
+FIELD(DBGDIDR, BRPS, 24, 4)
+FIELD(DBGDIDR, WRPS, 28, 4)
+
 FIELD(MVFR0, SIMDREG, 0, 4)
 FIELD(MVFR0, FPSP, 4, 4)
 FIELD(MVFR0, FPDP, 8, 4)
diff --git a/target/arm/internals.h b/target/arm/internals.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -XXX,XX +XXX,XX @@ static inline uint32_t arm_debug_exception_fsr(CPUARMState *env)
     }
 }
 
+/**
+ * arm_num_brps: Return number of implemented breakpoints.
+ * Note that the ID register BRPS field is "number of bps - 1",
+ * and we return the actual number of breakpoints.
+ */
+static inline int arm_num_brps(ARMCPU *cpu)
+{
+    if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64)) {
+        return FIELD_EX64(cpu->isar.id_aa64dfr0, ID_AA64DFR0, BRPS) + 1;
+    } else {
+        return FIELD_EX32(cpu->dbgdidr, DBGDIDR, BRPS) + 1;
+    }
+}
+
+/**
+ * arm_num_wrps: Return number of implemented watchpoints.
+ * Note that the ID register WRPS field is "number of wps - 1",
+ * and we return the actual number of watchpoints.
+ */
+static inline int arm_num_wrps(ARMCPU *cpu)
+{
+    if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64)) {
+        return FIELD_EX64(cpu->isar.id_aa64dfr0, ID_AA64DFR0, WRPS) + 1;
+    } else {
+        return FIELD_EX32(cpu->dbgdidr, DBGDIDR, WRPS) + 1;
+    }
+}
+
+/**
+ * arm_num_ctx_cmps: Return number of implemented context comparators.
+ * Note that the ID register CTX_CMPS field is "number of cmps - 1",
+ * and we return the actual number of comparators.
+ */
+static inline int arm_num_ctx_cmps(ARMCPU *cpu)
+{
+    if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64)) {
+        return FIELD_EX64(cpu->isar.id_aa64dfr0, ID_AA64DFR0, CTX_CMPS) + 1;
+    } else {
+        return FIELD_EX32(cpu->dbgdidr, DBGDIDR, CTX_CMPS) + 1;
+    }
+}
+
 /* Note make_memop_idx reserves 4 bits for mmu_idx, and MO_BSWAP is bit 3.
  * Thus a TCGMemOpIdx, without any MO_ALIGN bits, fits in 8 bits.
  */
diff --git a/target/arm/debug_helper.c b/target/arm/debug_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/debug_helper.c
+++ b/target/arm/debug_helper.c
@@ -XXX,XX +XXX,XX @@ static bool linked_bp_matches(ARMCPU *cpu, int lbn)
 {
     CPUARMState *env = &cpu->env;
     uint64_t bcr = env->cp15.dbgbcr[lbn];
-    int brps = extract32(cpu->dbgdidr, 24, 4);
-    int ctx_cmps = extract32(cpu->dbgdidr, 20, 4);
+    int brps = arm_num_brps(cpu);
+    int ctx_cmps = arm_num_ctx_cmps(cpu);
     int bt;
     uint32_t contextidr;
     uint64_t hcr_el2;
@@ -XXX,XX +XXX,XX @@ static bool linked_bp_matches(ARMCPU *cpu, int lbn)
      * case DBGWCR<n>_EL1.LBN must indicate that breakpoint).
      * We choose the former.
      */
-    if (lbn > brps || lbn < (brps - ctx_cmps)) {
+    if (lbn >= brps || lbn < (brps - ctx_cmps)) {
         return false;
     }
 
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void define_debug_regs(ARMCPU *cpu)
     };
 
     /* Note that all these register fields hold "number of Xs minus 1". */
-    brps = extract32(cpu->dbgdidr, 24, 4);
-    wrps = extract32(cpu->dbgdidr, 28, 4);
-    ctx_cmps = extract32(cpu->dbgdidr, 20, 4);
+    brps = arm_num_brps(cpu);
+    wrps = arm_num_wrps(cpu);
+    ctx_cmps = arm_num_ctx_cmps(cpu);
 
     assert(ctx_cmps <= brps);
 
-    /* The DBGDIDR and ID_AA64DFR0_EL1 define various properties
-     * of the debug registers such as number of breakpoints;
-     * check that if they both exist then they agree.
-     */
-    if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64)) {
-        assert(FIELD_EX64(cpu->isar.id_aa64dfr0, ID_AA64DFR0, BRPS) == brps);
-        assert(FIELD_EX64(cpu->isar.id_aa64dfr0, ID_AA64DFR0, WRPS) == wrps);
-        assert(FIELD_EX64(cpu->isar.id_aa64dfr0, ID_AA64DFR0, CTX_CMPS)
-               == ctx_cmps);
-    }
-
     define_one_arm_cp_reg(cpu, &dbgdidr);
     define_arm_cp_regs(cpu, debug_cp_reginfo);
 
@@ -XXX,XX +XXX,XX @@ static void define_debug_regs(ARMCPU *cpu)
         define_arm_cp_regs(cpu, debug_lpae_cp_reginfo);
     }
 
-    for (i = 0; i < brps + 1; i++) {
+    for (i = 0; i < brps; i++) {
         ARMCPRegInfo dbgregs[] = {
             { .name = "DBGBVR", .state = ARM_CP_STATE_BOTH,
               .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 0, .crm = i, .opc2 = 4,
@@ -XXX,XX +XXX,XX @@ static void define_debug_regs(ARMCPU *cpu)
         define_arm_cp_regs(cpu, dbgregs);
     }
 
-    for (i = 0; i < wrps + 1; i++) {
+    for (i = 0; i < wrps; i++) {
         ARMCPRegInfo dbgregs[] = {
             { .name = "DBGWVR", .state = ARM_CP_STATE_BOTH,
               .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 0, .crm = i, .opc2 = 6,
-- 
2.20.1

We're going to want to read the DBGDIDR register from KVM in
a subsequent commit, which means it needs to be in the
ARMISARegisters sub-struct. Move it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200214175116.9164-12-peter.maydell@linaro.org
---
 target/arm/cpu.h       | 2 +-
 target/arm/internals.h | 6 +++---
 target/arm/cpu.c       | 8 ++++----
 target/arm/cpu64.c     | 6 +++---
 target/arm/helper.c    | 2 +-
 5 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ struct ARMCPU {
         uint32_t mvfr1;
         uint32_t mvfr2;
         uint32_t id_dfr0;
+        uint32_t dbgdidr;
         uint64_t id_aa64isar0;
         uint64_t id_aa64isar1;
         uint64_t id_aa64pfr0;
@@ -XXX,XX +XXX,XX @@ struct ARMCPU {
     uint32_t id_mmfr4;
     uint64_t id_aa64afr0;
     uint64_t id_aa64afr1;
-    uint32_t dbgdidr;
     uint32_t clidr;
     uint64_t mp_affinity; /* MP ID without feature bits */
     /* The elements of this array are the CCSIDR values for each cache,
diff --git a/target/arm/internals.h b/target/arm/internals.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -XXX,XX +XXX,XX @@ static inline int arm_num_brps(ARMCPU *cpu)
     if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64)) {
         return FIELD_EX64(cpu->isar.id_aa64dfr0, ID_AA64DFR0, BRPS) + 1;
     } else {
-        return FIELD_EX32(cpu->dbgdidr, DBGDIDR, BRPS) + 1;
+        return FIELD_EX32(cpu->isar.dbgdidr, DBGDIDR, BRPS) + 1;
     }
 }
 
@@ -XXX,XX +XXX,XX @@ static inline int arm_num_wrps(ARMCPU *cpu)
     if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64)) {
         return FIELD_EX64(cpu->isar.id_aa64dfr0, ID_AA64DFR0, WRPS) + 1;
     } else {
-        return FIELD_EX32(cpu->dbgdidr, DBGDIDR, WRPS) + 1;
+        return FIELD_EX32(cpu->isar.dbgdidr, DBGDIDR, WRPS) + 1;
     }
 }
 
@@ -XXX,XX +XXX,XX @@ static inline int arm_num_ctx_cmps(ARMCPU *cpu)
     if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64)) {
         return FIELD_EX64(cpu->isar.id_aa64dfr0, ID_AA64DFR0, CTX_CMPS) + 1;
     } else {
-        return FIELD_EX32(cpu->dbgdidr, DBGDIDR, CTX_CMPS) + 1;
+        return FIELD_EX32(cpu->isar.dbgdidr, DBGDIDR, CTX_CMPS) + 1;
     }
 }
 
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void cortex_a8_initfn(Object *obj)
     cpu->isar.id_isar2 = 0x21232031;
     cpu->isar.id_isar3 = 0x11112131;
     cpu->isar.id_isar4 = 0x00111142;
-    cpu->dbgdidr = 0x15141000;
+    cpu->isar.dbgdidr = 0x15141000;
     cpu->clidr = (1 << 27) | (2 << 24) | 3;
     cpu->ccsidr[0] = 0xe007e01a; /* 16k L1 dcache. */
     cpu->ccsidr[1] = 0x2007e01a; /* 16k L1 icache. */
@@ -XXX,XX +XXX,XX @@ static void cortex_a9_initfn(Object *obj)
     cpu->isar.id_isar2 = 0x21232041;
     cpu->isar.id_isar3 = 0x11112131;
     cpu->isar.id_isar4 = 0x00111142;
-    cpu->dbgdidr = 0x35141000;
+    cpu->isar.dbgdidr = 0x35141000;
     cpu->clidr = (1 << 27) | (1 << 24) | 3;
     cpu->ccsidr[0] = 0xe00fe019; /* 16k L1 dcache. */
     cpu->ccsidr[1] = 0x200fe019; /* 16k L1 icache. */
@@ -XXX,XX +XXX,XX @@ static void cortex_a7_initfn(Object *obj)
     cpu->isar.id_isar2 = 0x21232041;
     cpu->isar.id_isar3 = 0x11112131;
     cpu->isar.id_isar4 = 0x10011142;
-    cpu->dbgdidr = 0x3515f005;
+    cpu->isar.dbgdidr = 0x3515f005;
     cpu->clidr = 0x0a200023;
     cpu->ccsidr[0] = 0x701fe00a; /* 32K L1 dcache */
     cpu->ccsidr[1] = 0x201fe00a; /* 32K L1 icache */
@@ -XXX,XX +XXX,XX @@ static void cortex_a15_initfn(Object *obj)
     cpu->isar.id_isar2 = 0x21232041;
     cpu->isar.id_isar3 = 0x11112131;
     cpu->isar.id_isar4 = 0x10011142;
-    cpu->dbgdidr = 0x3515f021;
+    cpu->isar.dbgdidr = 0x3515f021;
     cpu->clidr = 0x0a200023;
     cpu->ccsidr[0] = 0x701fe00a; /* 32K L1 dcache */
     cpu->ccsidr[1] = 0x201fe00a; /* 32K L1 icache */
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_a57_initfn(Object *obj)
     cpu->isar.id_aa64dfr0 = 0x10305106;
     cpu->isar.id_aa64isar0 = 0x00011120;
     cpu->isar.id_aa64mmfr0 = 0x00001124;
-    cpu->dbgdidr = 0x3516d000;
+    cpu->isar.dbgdidr = 0x3516d000;
     cpu->clidr = 0x0a200023;
     cpu->ccsidr[0] = 0x701fe00a; /* 32KB L1 dcache */
     cpu->ccsidr[1] = 0x201fe012; /* 48KB L1 icache */
@@ -XXX,XX +XXX,XX @@ static void aarch64_a53_initfn(Object *obj)
     cpu->isar.id_aa64dfr0 = 0x10305106;
     cpu->isar.id_aa64isar0 = 0x00011120;
     cpu->isar.id_aa64mmfr0 = 0x00001122; /* 40 bit physical addr */
-    cpu->dbgdidr = 0x3516d000;
+    cpu->isar.dbgdidr = 0x3516d000;
     cpu->clidr = 0x0a200023;
     cpu->ccsidr[0] = 0x700fe01a; /* 32KB L1 dcache */
     cpu->ccsidr[1] = 0x201fe00a; /* 32KB L1 icache */
@@ -XXX,XX +XXX,XX @@ static void aarch64_a72_initfn(Object *obj)
     cpu->isar.id_aa64dfr0 = 0x10305106;
     cpu->isar.id_aa64isar0 = 0x00011120;
     cpu->isar.id_aa64mmfr0 = 0x00001124;
-    cpu->dbgdidr = 0x3516d000;
+    cpu->isar.dbgdidr = 0x3516d000;
     cpu->clidr = 0x0a200023;
     cpu->ccsidr[0] = 0x701fe00a; /* 32KB L1 dcache */
     cpu->ccsidr[1] = 0x201fe012; /* 48KB L1 icache */
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void define_debug_regs(ARMCPU *cpu)
     ARMCPRegInfo dbgdidr = {
         .name = "DBGDIDR", .cp = 14, .crn = 0, .crm = 0, .opc1 = 0, .opc2 = 0,
         .access = PL0_R, .accessfn = access_tda,
-        .type = ARM_CP_CONST, .resetvalue = cpu->dbgdidr,
+        .type = ARM_CP_CONST, .resetvalue = cpu->isar.dbgdidr,
     };
 
     /* Note that all these register fields hold "number of Xs minus 1". */
-- 
2.20.1

Now that we have isar_feature test functions that look at fields in the
ID_AA64DFR0_EL1 and ID_DFR0 ID registers, add the code that reads
these register values from KVM so that the checks behave correctly
when we're using KVM.

No isar_feature function tests ID_AA64DFR1_EL1 or DBGDIDR yet, but we
populate them too, to maintain the invariant that every field in the
ARMISARegisters struct is populated for a KVM CPU and can be relied
on.  This requirement isn't actually written down yet, so add a note
to the relevant comment.
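
For the 64-bit case DBGDIDR cannot be read directly, so the patch
synthesizes it from ID_AA64DFR0. The standalone sketch below mirrors
that idea without QEMU's FIELD_EX64()/FIELD_DP32() macros; the function
name synth_dbgdidr and the hard-coded bit positions are assumptions
made for illustration, based on the architectural register layouts.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Build a DBGDIDR value from ID_AA64DFR0.
 * Assumed source fields: BRPs [15:12], WRPs [23:20], CTX_CMPs [31:28].
 * Assumed destination fields: WRPs [31:28], BRPs [27:24],
 * CTX_CMPs [23:20], Version [19:16], RES1 bit 15, nSUHD_imp bit 14,
 * SE_imp bit 12.
 */
static uint32_t synth_dbgdidr(uint64_t id_aa64dfr0, bool has_el3)
{
    uint32_t brps = (uint32_t)(id_aa64dfr0 >> 12) & 0xfu;
    uint32_t wrps = (uint32_t)(id_aa64dfr0 >> 20) & 0xfu;
    uint32_t ctx_cmps = (uint32_t)(id_aa64dfr0 >> 28) & 0xfu;
    uint32_t dbgdidr = 0;

    dbgdidr |= wrps << 28;
    dbgdidr |= brps << 24;
    dbgdidr |= ctx_cmps << 20;
    dbgdidr |= 6u << 16;        /* ARMv8 debug architecture */
    dbgdidr |= 1u << 15;        /* RES1 bit */
    if (has_el3) {
        dbgdidr |= 1u << 14;    /* nSUHD_imp */
        dbgdidr |= 1u << 12;    /* SE_imp */
    }
    return dbgdidr;
}
```

As a sanity check, feeding in the Cortex-A57 ID_AA64DFR0 value
0x10305106 with EL3 present yields 0x3516d000, which is the DBGDIDR
value the TCG cortex-a57 model uses.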

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200214175116.9164-13-peter.maydell@linaro.org
---
 target/arm/cpu.h   |  5 +++++
 target/arm/kvm32.c |  8 ++++++++
 target/arm/kvm64.c | 36 ++++++++++++++++++++++++++++++++++++
 3 files changed, 49 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ struct ARMCPU {
      * prefix means a constant register.
      * Some of these registers are split out into a substructure that
      * is shared with the translators to control the ISA.
+     *
+     * Note that if you add an ID register to the ARMISARegisters struct
+     * you need to also update the 32-bit and 64-bit versions of the
+     * kvm_arm_get_host_cpu_features() function to correctly populate the
+     * field by reading the value from the KVM vCPU.
      */
     struct ARMISARegisters {
         uint32_t id_isar0;
diff --git a/target/arm/kvm32.c b/target/arm/kvm32.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/kvm32.c
+++ b/target/arm/kvm32.c
@@ -XXX,XX +XXX,XX @@ bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
         ahcf->isar.id_isar6 = 0;
     }
 
+    err |= read_sys_reg32(fdarray[2], &ahcf->isar.id_dfr0,
+                          ARM_CP15_REG32(0, 0, 1, 2));
+
     err |= read_sys_reg32(fdarray[2], &ahcf->isar.mvfr0,
                           KVM_REG_ARM | KVM_REG_SIZE_U32 |
                           KVM_REG_ARM_VFP | KVM_REG_ARM_VFP_MVFR0);
@@ -XXX,XX +XXX,XX @@ bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
      * Fortunately there is not yet anything in there that affects migration.
      */
 
+    /*
+     * There is no way to read DBGDIDR, because currently 32-bit KVM
+     * doesn't implement debug at all. Leave it at zero.
+     */
+
     kvm_arm_destroy_scratch_host_vcpu(fdarray);
 
     if (err < 0) {
diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/kvm64.c
+++ b/target/arm/kvm64.c
@@ -XXX,XX +XXX,XX @@ bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
     } else {
         err |= read_sys_reg64(fdarray[2], &ahcf->isar.id_aa64pfr1,
                               ARM64_SYS_REG(3, 0, 0, 4, 1));
+        err |= read_sys_reg64(fdarray[2], &ahcf->isar.id_aa64dfr0,
+                              ARM64_SYS_REG(3, 0, 0, 5, 0));
+        err |= read_sys_reg64(fdarray[2], &ahcf->isar.id_aa64dfr1,
+                              ARM64_SYS_REG(3, 0, 0, 5, 1));
         err |= read_sys_reg64(fdarray[2], &ahcf->isar.id_aa64isar0,
                               ARM64_SYS_REG(3, 0, 0, 6, 0));
         err |= read_sys_reg64(fdarray[2], &ahcf->isar.id_aa64isar1,
@@ -XXX,XX +XXX,XX @@ bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
          * than skipping the reads and leaving 0, as we must avoid
          * considering the values in every case.
          */
+        err |= read_sys_reg32(fdarray[2], &ahcf->isar.id_dfr0,
+                              ARM64_SYS_REG(3, 0, 0, 1, 2));
         err |= read_sys_reg32(fdarray[2], &ahcf->isar.id_isar0,
                               ARM64_SYS_REG(3, 0, 0, 2, 0));
         err |= read_sys_reg32(fdarray[2], &ahcf->isar.id_isar1,
@@ -XXX,XX +XXX,XX @@ bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
                               ARM64_SYS_REG(3, 0, 0, 3, 1));
         err |= read_sys_reg32(fdarray[2], &ahcf->isar.mvfr2,
                               ARM64_SYS_REG(3, 0, 0, 3, 2));
+
+        /*
+         * DBGDIDR is a bit complicated because the kernel doesn't
+         * provide an accessor for it in 64-bit mode, which is what this
+         * scratch VM is in, and there's no architected "64-bit sysreg
+         * which reads the same as the 32-bit register" the way there is
+         * for other ID registers. Instead we synthesize a value from the
+         * AArch64 ID_AA64DFR0, the same way the kernel code in
+         * arch/arm64/kvm/sys_regs.c:trap_dbgidr() does.
+         * We only do this if the CPU supports AArch32 at EL1.
+         */
+        if (FIELD_EX32(ahcf->isar.id_aa64pfr0, ID_AA64PFR0, EL1) >= 2) {
+            int wrps = FIELD_EX64(ahcf->isar.id_aa64dfr0, ID_AA64DFR0, WRPS);
+            int brps = FIELD_EX64(ahcf->isar.id_aa64dfr0, ID_AA64DFR0, BRPS);
+            int ctx_cmps =
+                FIELD_EX64(ahcf->isar.id_aa64dfr0, ID_AA64DFR0, CTX_CMPS);
+            int version = 6; /* ARMv8 debug architecture */
+            bool has_el3 =
+                !!FIELD_EX32(ahcf->isar.id_aa64pfr0, ID_AA64PFR0, EL3);
+            uint32_t dbgdidr = 0;
+
+            dbgdidr = FIELD_DP32(dbgdidr, DBGDIDR, WRPS, wrps);
+            dbgdidr = FIELD_DP32(dbgdidr, DBGDIDR, BRPS, brps);
+            dbgdidr = FIELD_DP32(dbgdidr, DBGDIDR, CTX_CMPS, ctx_cmps);
+            dbgdidr = FIELD_DP32(dbgdidr, DBGDIDR, VERSION, version);
+            dbgdidr = FIELD_DP32(dbgdidr, DBGDIDR, NSUHD_IMP, has_el3);
+            dbgdidr = FIELD_DP32(dbgdidr, DBGDIDR, SE_IMP, has_el3);
+            dbgdidr |= (1 << 15); /* RES1 bit */
+            ahcf->isar.dbgdidr = dbgdidr;
+        }
     }
 
     sve_supported = ioctl(fdarray[0], KVM_CHECK_EXTENSION, KVM_CAP_ARM_SVE) > 0;
-- 
2.20.1

The ARMv8.1-PMU extension requires:
 * the evtCount field in PMETYPER<n>_EL0 is 16 bits, not 10
 * MDCR_EL2.HPMD allows event counting to be disabled at EL2
 * two new required events, STALL_FRONTEND and STALL_BACKEND
 * ID register bits in ID_AA64DFR0_EL1 and ID_DFR0

We already implement the 16-bit evtCount field and the
HPMD bit, so all that is missing is the two new events:
  STALL_FRONTEND
   "counts every cycle counted by the CPU_CYCLES event on which no
    operation was issued because there are no operations available
    to issue to this PE from the frontend"
  STALL_BACKEND
   "counts every cycle counted by the CPU_CYCLES event on which no
    operation was issued because the backend is unable to accept
    any available operations from the frontend"

QEMU never stalls in this sense, so our implementation is trivial:
always return a zero count.
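
The shape of the change can be sketched as a small event table in
which each entry carries a "supported" predicate and a count callback.
This is a simplified stand-in, not the real pm_event structure: the
type sketch_event, the boolean has_pmu_8_1 parameter and the lookup
function sketch_read_event are all hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint16_t number;                        /* architectural event number */
    bool (*supported)(bool has_pmu_8_1);    /* is the event available? */
    uint64_t (*get_count)(void);            /* current event count */
} sketch_event;

static bool needs_pmu_8_1(bool has_pmu_8_1)
{
    return has_pmu_8_1;
}

static uint64_t always_zero(void)
{
    /* The emulator never stalls in this sense, so nothing is counted. */
    return 0;
}

static const sketch_event sketch_events[] = {
    { 0x023, needs_pmu_8_1, always_zero },  /* STALL_FRONTEND */
    { 0x024, needs_pmu_8_1, always_zero },  /* STALL_BACKEND */
};

/* Return true and store the count if the event exists and is supported. */
static bool sketch_read_event(uint16_t number, bool has_pmu_8_1,
                              uint64_t *count)
{
    size_t n = sizeof(sketch_events) / sizeof(sketch_events[0]);

    for (size_t i = 0; i < n; i++) {
        const sketch_event *e = &sketch_events[i];
        if (e->number == number && e->supported(has_pmu_8_1)) {
            *count = e->get_count();
            return true;
        }
    }
    return false;
}
```

On a CPU without ARMv8.1-PMU the lookup fails for both events; with it,
both succeed and report a count of zero.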

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20200214175116.9164-14-peter.maydell@linaro.org
---
 target/arm/helper.c | 32 ++++++++++++++++++++++++++++++--
 1 file changed, 30 insertions(+), 2 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static int64_t instructions_ns_per(uint64_t icount)
 }
 #endif
 
+static bool pmu_8_1_events_supported(CPUARMState *env)
+{
+    /* For events which are supported in any v8.1 PMU */
+    return cpu_isar_feature(any_pmu_8_1, env_archcpu(env));
+}
+
+static uint64_t zero_event_get_count(CPUARMState *env)
+{
+    /* For events which on QEMU never fire, so their count is always zero */
+    return 0;
+}
+
+static int64_t zero_event_ns_per(uint64_t cycles)
+{
+    /* An event which never fires can never overflow */
+    return -1;
+}
+
 static const pm_event pm_events[] = {
     { .number = 0x000, /* SW_INCR */
       .supported = event_always_supported,
@@ -XXX,XX +XXX,XX @@ static const pm_event pm_events[] = {
       .supported = event_always_supported,
       .get_count = cycles_get_count,
       .ns_per_count = cycles_ns_per,
-    }
+    },
 #endif
+    { .number = 0x023, /* STALL_FRONTEND */
+      .supported = pmu_8_1_events_supported,
+      .get_count = zero_event_get_count,
+      .ns_per_count = zero_event_ns_per,
+    },
+    { .number = 0x024, /* STALL_BACKEND */
+      .supported = pmu_8_1_events_supported,
+      .get_count = zero_event_get_count,
+      .ns_per_count = zero_event_ns_per,
+    },
 };
 
 /*
@@ -XXX,XX +XXX,XX @@ static const pm_event pm_events[] = {
  * should first be updated to something sparse instead of the current
  * supported_event_map[] array.
  */
-#define MAX_EVENT_ID 0x11
+#define MAX_EVENT_ID 0x24
 #define UNSUPPORTED_EVENT UINT16_MAX
 static uint16_t supported_event_map[MAX_EVENT_ID + 1];
 
-- 
2.20.1

The ARMv8.4-PMU extension adds:
 * one new required event, STALL
 * one new system register PMMIR_EL1

(There are also some more L1-cache related events, but since
we don't implement any cache we don't provide these, in the
same way we don't provide the base-PMUv3 cache events.)

The STALL event "counts every attributable cycle on which no
attributable instruction or operation was sent for execution on this
PE".  QEMU doesn't stall in this sense, so this is another
always-reads-zero event.

The PMMIR_EL1 register is a read-only register providing
implementation-specific information about the PMU; currently it has
only one field, SLOTS, which defines behaviour of the STALL_SLOT PMU
event.  Since QEMU doesn't implement the STALL_SLOT event, we can
validly make the register read zero.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20200214175116.9164-15-peter.maydell@linaro.org
---
 target/arm/cpu.h    | 18 ++++++++++++++++++
 target/arm/helper.c | 22 +++++++++++++++++++++-
 2 files changed, 39 insertions(+), 1 deletion(-)

Set the ID register bits to provide ARMv8.4-PMU (and implicitly
also ARMv8.1-PMU) in the 'max' CPU.
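
The effect on the ID register values can be sketched with a plain
bitfield deposit. The helpers below are hypothetical equivalents of
FIELD_DP64()/FIELD_DP32(); the field positions (ID_AA64DFR0.PMUVer at
[11:8], ID_DFR0.PerfMon at [27:24]) are assumed from the architecture
and are not stated in this patch.

```c
#include <stdint.h>

/* Replace 'length' bits at 'start' with 'field' (length must be < 64). */
static uint64_t deposit_field64(uint64_t value, unsigned start,
                                unsigned length, uint64_t field)
{
    uint64_t mask = ((UINT64_C(1) << length) - 1) << start;

    return (value & ~mask) | ((field << start) & mask);
}

/* Set ID_AA64DFR0.PMUVer, assumed to live at bits [11:8]. */
static uint64_t set_pmuver(uint64_t id_aa64dfr0, unsigned ver)
{
    return deposit_field64(id_aa64dfr0, 8, 4, ver);
}

/* Set ID_DFR0.PerfMon, assumed to live at bits [27:24]. */
static uint32_t set_perfmon(uint32_t id_dfr0, unsigned ver)
{
    return (uint32_t)deposit_field64(id_dfr0, 24, 4, ver);
}
```

Starting from the Cortex-A57 values 0x10305106 and 0x03010066, setting
both fields to 5 gives 0x10305506 and 0x05010066 respectively.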

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20200214175116.9164-16-peter.maydell@linaro.org
---
 target/arm/cpu64.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
         u = FIELD_DP32(u, ID_MMFR3, PAN, 2); /* ATS1E1 */
         cpu->id_mmfr3 = u;
 
+        u = cpu->isar.id_aa64dfr0;
+        u = FIELD_DP64(u, ID_AA64DFR0, PMUVER, 5); /* v8.4-PMU */
+        cpu->isar.id_aa64dfr0 = u;
+
+        u = cpu->isar.id_dfr0;
+        u = FIELD_DP32(u, ID_DFR0, PERFMON, 5); /* v8.4-PMU */
+        cpu->isar.id_dfr0 = u;
+
         /*
          * FIXME: We do not yet support ARMv8.2-fp16 for AArch32 yet,
          * so do not set MVFR1.FPHP.  Strictly speaking this is not legal,
-- 
2.20.1

The LC bit in the PMCR_EL0 register is supposed to be:
 * read/write
 * RES1 on an AArch64-only implementation
 * an architecturally UNKNOWN value on reset
(and use of LC==0 by software is deprecated).

We were implementing it incorrectly as read-only always zero,
though we do have all the code needed to test it and behave
accordingly.

Instead make it a read-write bit which resets to 1 always, which
satisfies all the architectural requirements above.
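
A minimal sketch of the masked write follows. The SK_ prefixed macros
and sk_pmcr_write are illustrative names only; the bit values for LC,
DP, X, D and E are assumed from the architectural PMCR layout and are
consistent with the old 0x39 constant (DP | X | D | E).

```c
#include <stdint.h>

#define SK_PMCRLC 0x40u
#define SK_PMCRDP 0x20u
#define SK_PMCRX  0x10u
#define SK_PMCRD  0x08u
#define SK_PMCRE  0x01u

/* Bits the guest can change; C (0x4) and P (0x2) are write-only triggers. */
#define SK_PMCR_WRITEABLE_MASK \
    (SK_PMCRLC | SK_PMCRDP | SK_PMCRX | SK_PMCRD | SK_PMCRE)

/* Merge a guest write into the stored value, keeping read-only bits. */
static uint32_t sk_pmcr_write(uint32_t old, uint32_t value)
{
    return (old & ~SK_PMCR_WRITEABLE_MASK) | (value & SK_PMCR_WRITEABLE_MASK);
}
```

With this mask a guest write of all-ones sets LC along with the other
writable bits, while the implementer and counter-number fields in the
upper bits keep their reset values.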

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20200214175116.9164-18-peter.maydell@linaro.org
---
 target/arm/helper.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v6_cp_reginfo[] = {
 #define PMCRC   0x4
 #define PMCRP   0x2
 #define PMCRE   0x1
+/*
+ * Mask of PMCR bits writeable by guest (not including WO bits like C, P,
+ * which can be written as 1 to trigger behaviour but which stay RAZ).
+ */
+#define PMCR_WRITEABLE_MASK (PMCRLC | PMCRDP | PMCRX | PMCRD | PMCRE)
 
 #define PMXEVTYPER_P          0x80000000
 #define PMXEVTYPER_U          0x40000000
@@ -XXX,XX +XXX,XX @@ static void pmcr_write(CPUARMState *env, const ARMCPRegInfo *ri,
         }
     }
 
-    /* only the DP, X, D and E bits are writable */
-    env->cp15.c9_pmcr &= ~0x39;
-    env->cp15.c9_pmcr |= (value & 0x39);
+    env->cp15.c9_pmcr &= ~PMCR_WRITEABLE_MASK;
+    env->cp15.c9_pmcr |= (value & PMCR_WRITEABLE_MASK);
 
     pmu_op_finish(env);
 }
@@ -XXX,XX +XXX,XX @@ static void define_pmu_regs(ARMCPU *cpu)
         .access = PL0_RW, .accessfn = pmreg_access,
         .type = ARM_CP_IO,
         .fieldoffset = offsetof(CPUARMState, cp15.c9_pmcr),
-        .resetvalue = (cpu->midr & 0xff000000) | (pmcrn << PMCRN_SHIFT),
+        .resetvalue = (cpu->midr & 0xff000000) | (pmcrn << PMCRN_SHIFT) |
+                      PMCRLC,
         .writefn = pmcr_write, .raw_writefn = raw_write,
     };
     define_one_arm_cp_reg(cpu, &pmcr);
-- 
2.20.1

The isar_feature_aa32_pan and isar_feature_aa32_ats1e1 functions
are supposed to be testing fields in ID_MMFR3; but a cut-and-paste
error meant we were looking at MVFR0 instead.

Fix the functions to look at the right register; this requires
us to move at least id_mmfr3 to the ARMISARegisters struct; we
choose to move all the ID_MMFRn registers for consistency.
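
The consequence of the cut-and-paste error can be sketched as below.
The struct sketch_id_regs and the helper names are hypothetical; the
PAN field position ([19:16] of ID_MMFR3) is assumed from the
architecture, and the MVFR0 value used in the usage note is only an
illustrative example of a register whose bits [19:16] are nonzero.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t id_mmfr3;
    uint32_t mvfr0;
} sketch_id_regs;

/* ID_MMFR3.PAN is assumed to occupy bits [19:16]. */
static uint32_t pan_field(uint32_t reg)
{
    return (reg >> 16) & 0xfu;
}

/* Buggy version: applies the ID_MMFR3 field layout to MVFR0. */
static bool has_pan_buggy(const sketch_id_regs *id)
{
    return pan_field(id->mvfr0) != 0;
}

/* Fixed version: reads the field from ID_MMFR3 itself. */
static bool has_pan_fixed(const sketch_id_regs *id)
{
    return pan_field(id->id_mmfr3) != 0;
}
```

For a CPU with ID_MMFR3 = 0x02102211 (no PAN) and an MVFR0 such as
0x10110222, the buggy test wrongly reports PAN as present, while the
fixed test correctly reports it as absent.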

Fixes: 3d6ad6bb466f
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200214175116.9164-19-peter.maydell@linaro.org
---
 target/arm/cpu.h      |  14 +++---
 hw/intc/armv7m_nvic.c |   8 ++--
 target/arm/cpu.c      | 104 +++++++++++++++++++++---------------------
 target/arm/cpu64.c    |  28 ++++++------
 target/arm/helper.c   |  12 ++---
 target/arm/kvm32.c    |  17 +++++++
 target/arm/kvm64.c    |  10 ++++
 7 files changed, 110 insertions(+), 83 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ struct ARMCPU {
         uint32_t id_isar4;
         uint32_t id_isar5;
         uint32_t id_isar6;
+        uint32_t id_mmfr0;
+        uint32_t id_mmfr1;
+        uint32_t id_mmfr2;
+        uint32_t id_mmfr3;
+        uint32_t id_mmfr4;
         uint32_t mvfr0;
         uint32_t mvfr1;
         uint32_t mvfr2;
@@ -XXX,XX +XXX,XX @@ struct ARMCPU {
     uint64_t pmceid0;
     uint64_t pmceid1;
     uint32_t id_afr0;
-    uint32_t id_mmfr0;
-    uint32_t id_mmfr1;
-    uint32_t id_mmfr2;
-    uint32_t id_mmfr3;
-    uint32_t id_mmfr4;
     uint64_t id_aa64afr0;
     uint64_t id_aa64afr1;
     uint32_t clidr;
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa32_vminmaxnm(const ARMISARegisters *id)
 
 static inline bool isar_feature_aa32_pan(const ARMISARegisters *id)
 {
-    return FIELD_EX64(id->mvfr0, ID_MMFR3, PAN) != 0;
+    return FIELD_EX32(id->id_mmfr3, ID_MMFR3, PAN) != 0;
 }
 
 static inline bool isar_feature_aa32_ats1e1(const ARMISARegisters *id)
 {
-    return FIELD_EX64(id->mvfr0, ID_MMFR3, PAN) >= 2;
+    return FIELD_EX32(id->id_mmfr3, ID_MMFR3, PAN) >= 2;
 }
 
 static inline bool isar_feature_aa32_pmu_8_1(const ARMISARegisters *id)
diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/intc/armv7m_nvic.c
+++ b/hw/intc/armv7m_nvic.c
@@ -XXX,XX +XXX,XX @@ static uint32_t nvic_readl(NVICState *s, uint32_t offset, MemTxAttrs attrs)
     case 0xd4c: /* AFR0.  */
         return cpu->id_afr0;
     case 0xd50: /* MMFR0.  */
-        return cpu->id_mmfr0;
+        return cpu->isar.id_mmfr0;
     case 0xd54: /* MMFR1.  */
-        return cpu->id_mmfr1;
+        return cpu->isar.id_mmfr1;
     case 0xd58: /* MMFR2.  */
-        return cpu->id_mmfr2;
+        return cpu->isar.id_mmfr2;
     case 0xd5c: /* MMFR3.  */
-        return cpu->id_mmfr3;
+        return cpu->isar.id_mmfr3;
     case 0xd60: /* ISAR0.  */
         return cpu->isar.id_isar0;
     case 0xd64: /* ISAR1.  */
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void arm1136_r2_initfn(Object *obj)
     cpu->id_pfr1 = 0x1;
     cpu->isar.id_dfr0 = 0x2;
     cpu->id_afr0 = 0x3;
-    cpu->id_mmfr0 = 0x01130003;
-    cpu->id_mmfr1 = 0x10030302;
-    cpu->id_mmfr2 = 0x01222110;
+    cpu->isar.id_mmfr0 = 0x01130003;
+    cpu->isar.id_mmfr1 = 0x10030302;
+    cpu->isar.id_mmfr2 = 0x01222110;
     cpu->isar.id_isar0 = 0x00140011;
     cpu->isar.id_isar1 = 0x12002111;
     cpu->isar.id_isar2 = 0x11231111;
@@ -XXX,XX +XXX,XX @@ static void arm1136_initfn(Object *obj)
     cpu->id_pfr1 = 0x1;
     cpu->isar.id_dfr0 = 0x2;
     cpu->id_afr0 = 0x3;
-    cpu->id_mmfr0 = 0x01130003;
-    cpu->id_mmfr1 = 0x10030302;
-    cpu->id_mmfr2 = 0x01222110;
+    cpu->isar.id_mmfr0 = 0x01130003;
+    cpu->isar.id_mmfr1 = 0x10030302;
+    cpu->isar.id_mmfr2 = 0x01222110;
     cpu->isar.id_isar0 = 0x00140011;
     cpu->isar.id_isar1 = 0x12002111;
     cpu->isar.id_isar2 = 0x11231111;
@@ -XXX,XX +XXX,XX @@ static void arm1176_initfn(Object *obj)
     cpu->id_pfr1 = 0x11;
     cpu->isar.id_dfr0 = 0x33;
     cpu->id_afr0 = 0;
-    cpu->id_mmfr0 = 0x01130003;
-    cpu->id_mmfr1 = 0x10030302;
-    cpu->id_mmfr2 = 0x01222100;
+    cpu->isar.id_mmfr0 = 0x01130003;
+    cpu->isar.id_mmfr1 = 0x10030302;
+    cpu->isar.id_mmfr2 = 0x01222100;
     cpu->isar.id_isar0 = 0x0140011;
     cpu->isar.id_isar1 = 0x12002111;
     cpu->isar.id_isar2 = 0x11231121;
@@ -XXX,XX +XXX,XX @@ static void arm11mpcore_initfn(Object *obj)
     cpu->id_pfr1 = 0x1;
     cpu->isar.id_dfr0 = 0;
     cpu->id_afr0 = 0x2;
-    cpu->id_mmfr0 = 0x01100103;
-    cpu->id_mmfr1 = 0x10020302;
-    cpu->id_mmfr2 = 0x01222000;
+    cpu->isar.id_mmfr0 = 0x01100103;
+    cpu->isar.id_mmfr1 = 0x10020302;
+    cpu->isar.id_mmfr2 = 0x01222000;
     cpu->isar.id_isar0 = 0x00100011;
     cpu->isar.id_isar1 = 0x12002111;
     cpu->isar.id_isar2 = 0x11221011;
@@ -XXX,XX +XXX,XX @@ static void cortex_m3_initfn(Object *obj)
     cpu->id_pfr1 = 0x00000200;
     cpu->isar.id_dfr0 = 0x00100000;
     cpu->id_afr0 = 0x00000000;
-    cpu->id_mmfr0 = 0x00000030;
-    cpu->id_mmfr1 = 0x00000000;
-    cpu->id_mmfr2 = 0x00000000;
-    cpu->id_mmfr3 = 0x00000000;
+    cpu->isar.id_mmfr0 = 0x00000030;
+    cpu->isar.id_mmfr1 = 0x00000000;
+    cpu->isar.id_mmfr2 = 0x00000000;
+    cpu->isar.id_mmfr3 = 0x00000000;
     cpu->isar.id_isar0 = 0x01141110;
     cpu->isar.id_isar1 = 0x02111000;
     cpu->isar.id_isar2 = 0x21112231;
@@ -XXX,XX +XXX,XX @@ static void cortex_m4_initfn(Object *obj)
     cpu->id_pfr1 = 0x00000200;
     cpu->isar.id_dfr0 = 0x00100000;
     cpu->id_afr0 = 0x00000000;
-    cpu->id_mmfr0 = 0x00000030;
-    cpu->id_mmfr1 = 0x00000000;
-    cpu->id_mmfr2 = 0x00000000;
-    cpu->id_mmfr3 = 0x00000000;
+    cpu->isar.id_mmfr0 = 0x00000030;
+    cpu->isar.id_mmfr1 = 0x00000000;
+    cpu->isar.id_mmfr2 = 0x00000000;
+    cpu->isar.id_mmfr3 = 0x00000000;
     cpu->isar.id_isar0 = 0x01141110;
     cpu->isar.id_isar1 = 0x02111000;
     cpu->isar.id_isar2 = 0x21112231;
@@ -XXX,XX +XXX,XX @@ static void cortex_m7_initfn(Object *obj)
     cpu->id_pfr1 = 0x00000200;
     cpu->isar.id_dfr0 = 0x00100000;
     cpu->id_afr0 = 0x00000000;
-    cpu->id_mmfr0 = 0x00100030;
-    cpu->id_mmfr1 = 0x00000000;
-    cpu->id_mmfr2 = 0x01000000;
-    cpu->id_mmfr3 = 0x00000000;
+    cpu->isar.id_mmfr0 = 0x00100030;
+    cpu->isar.id_mmfr1 = 0x00000000;
+    cpu->isar.id_mmfr2 = 0x01000000;
+    cpu->isar.id_mmfr3 = 0x00000000;
     cpu->isar.id_isar0 = 0x01101110;
     cpu->isar.id_isar1 = 0x02112000;
     cpu->isar.id_isar2 = 0x20232231;
@@ -XXX,XX +XXX,XX @@ static void cortex_m33_initfn(Object *obj)
     cpu->id_pfr1 = 0x00000210;
     cpu->isar.id_dfr0 = 0x00200000;
     cpu->id_afr0 = 0x00000000;
-    cpu->id_mmfr0 = 0x00101F40;
-    cpu->id_mmfr1 = 0x00000000;
-    cpu->id_mmfr2 = 0x01000000;
-    cpu->id_mmfr3 = 0x00000000;
+    cpu->isar.id_mmfr0 = 0x00101F40;
+    cpu->isar.id_mmfr1 = 0x00000000;
+    cpu->isar.id_mmfr2 = 0x01000000;
+    cpu->isar.id_mmfr3 = 0x00000000;
     cpu->isar.id_isar0 = 0x01101110;
     cpu->isar.id_isar1 = 0x02212000;
     cpu->isar.id_isar2 = 0x20232232;
@@ -XXX,XX +XXX,XX @@ static void cortex_r5_initfn(Object *obj)
     cpu->id_pfr1 = 0x001;
     cpu->isar.id_dfr0 = 0x010400;
     cpu->id_afr0 = 0x0;
-    cpu->id_mmfr0 = 0x0210030;
-    cpu->id_mmfr1 = 0x00000000;
-    cpu->id_mmfr2 = 0x01200000;
-    cpu->id_mmfr3 = 0x0211;
+    cpu->isar.id_mmfr0 = 0x0210030;
+    cpu->isar.id_mmfr1 = 0x00000000;
+    cpu->isar.id_mmfr2 = 0x01200000;
+    cpu->isar.id_mmfr3 = 0x0211;
     cpu->isar.id_isar0 = 0x02101111;
     cpu->isar.id_isar1 = 0x13112111;
     cpu->isar.id_isar2 = 0x21232141;
@@ -XXX,XX +XXX,XX @@ static void cortex_a8_initfn(Object *obj)
     cpu->id_pfr1 = 0x11;
     cpu->isar.id_dfr0 = 0x400;
     cpu->id_afr0 = 0;
-    cpu->id_mmfr0 = 0x31100003;
-    cpu->id_mmfr1 = 0x20000000;
-    cpu->id_mmfr2 = 0x01202000;
-    cpu->id_mmfr3 = 0x11;
+    cpu->isar.id_mmfr0 = 0x31100003;
+    cpu->isar.id_mmfr1 = 0x20000000;
+    cpu->isar.id_mmfr2 = 0x01202000;
+    cpu->isar.id_mmfr3 = 0x11;
     cpu->isar.id_isar0 = 0x00101111;
     cpu->isar.id_isar1 = 0x12112111;
     cpu->isar.id_isar2 = 0x21232031;
@@ -XXX,XX +XXX,XX @@ static void cortex_a9_initfn(Object *obj)
     cpu->id_pfr1 = 0x11;
     cpu->isar.id_dfr0 = 0x000;
     cpu->id_afr0 = 0;
-    cpu->id_mmfr0 = 0x00100103;
-    cpu->id_mmfr1 = 0x20000000;
-    cpu->id_mmfr2 = 0x01230000;
-    cpu->id_mmfr3 = 0x00002111;
+    cpu->isar.id_mmfr0 = 0x00100103;
+    cpu->isar.id_mmfr1 = 0x20000000;
+    cpu->isar.id_mmfr2 = 0x01230000;
+    cpu->isar.id_mmfr3 = 0x00002111;
     cpu->isar.id_isar0 = 0x00101111;
     cpu->isar.id_isar1 = 0x13112111;
     cpu->isar.id_isar2 = 0x21232041;
@@ -XXX,XX +XXX,XX @@ static void cortex_a7_initfn(Object *obj)
     cpu->id_pfr1 = 0x00011011;
     cpu->isar.id_dfr0 = 0x02010555;
     cpu->id_afr0 = 0x00000000;
-    cpu->id_mmfr0 = 0x10101105;
-    cpu->id_mmfr1 = 0x40000000;
-    cpu->id_mmfr2 = 0x01240000;
-    cpu->id_mmfr3 = 0x02102211;
+    cpu->isar.id_mmfr0 = 0x10101105;
+    cpu->isar.id_mmfr1 = 0x40000000;
+    cpu->isar.id_mmfr2 = 0x01240000;
+    cpu->isar.id_mmfr3 = 0x02102211;
     /* a7_mpcore_r0p5_trm, page 4-4 gives 0x01101110; but
      * table 4-41 gives 0x02101110, which includes the arm div insns.
      */
@@ -XXX,XX +XXX,XX @@ static void cortex_a15_initfn(Object *obj)
     cpu->id_pfr1 = 0x00011011;
     cpu->isar.id_dfr0 = 0x02010555;
     cpu->id_afr0 = 0x00000000;
-    cpu->id_mmfr0 = 0x10201105;
-    cpu->id_mmfr1 = 0x20000000;
-    cpu->id_mmfr2 = 0x01240000;
-    cpu->id_mmfr3 = 0x02102211;
+    cpu->isar.id_mmfr0 = 0x10201105;
+    cpu->isar.id_mmfr1 = 0x20000000;
+    cpu->isar.id_mmfr2 = 0x01240000;
+    cpu->isar.id_mmfr3 = 0x02102211;
     cpu->isar.id_isar0 = 0x02101110;
     cpu->isar.id_isar1 = 0x13112111;
     cpu->isar.id_isar2 = 0x21232041;
@@ -XXX,XX +XXX,XX @@ static void arm_max_initfn(Object *obj)
             t = FIELD_DP32(t, MVFR2, FPMISC, 4);   /* FP MaxNum */
             cpu->isar.mvfr2 = t;
 
-            t = cpu->id_mmfr3;
+            t = cpu->isar.id_mmfr3;
             t = FIELD_DP32(t, ID_MMFR3, PAN, 2); /* ATS1E1 */
-            cpu->id_mmfr3 = t;
+            cpu->isar.id_mmfr3 = t;
 
-            t = cpu->id_mmfr4;
+            t = cpu->isar.id_mmfr4;
             t = FIELD_DP32(t, ID_MMFR4, HPDS, 1); /* AA32HPD */
-            cpu->id_mmfr4 = t;
+            cpu->isar.id_mmfr4 = t;
         }
 #endif
     }
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_a57_initfn(Object *obj)
     cpu->id_pfr1 = 0x00011011;
     cpu->isar.id_dfr0 = 0x03010066;
     cpu->id_afr0 = 0x00000000;
-    cpu->id_mmfr0 = 0x10101105;
-    cpu->id_mmfr1 = 0x40000000;
-    cpu->id_mmfr2 = 0x01260000;
-    cpu->id_mmfr3 = 0x02102211;
+    cpu->isar.id_mmfr0 = 0x10101105;
+    cpu->isar.id_mmfr1 = 0x40000000;
+    cpu->isar.id_mmfr2 = 0x01260000;
+    cpu->isar.id_mmfr3 = 0x02102211;
     cpu->isar.id_isar0 = 0x02101110;
     cpu->isar.id_isar1 = 0x13112111;
     cpu->isar.id_isar2 = 0x21232042;
@@ -XXX,XX +XXX,XX @@ static void aarch64_a53_initfn(Object *obj)
     cpu->id_pfr1 = 0x00011011;
     cpu->isar.id_dfr0 = 0x03010066;
     cpu->id_afr0 = 0x00000000;
-    cpu->id_mmfr0 = 0x10101105;
-    cpu->id_mmfr1 = 0x40000000;
-    cpu->id_mmfr2 = 0x01260000;
-    cpu->id_mmfr3 = 0x02102211;
+    cpu->isar.id_mmfr0 = 0x10101105;
+    cpu->isar.id_mmfr1 = 0x40000000;
+    cpu->isar.id_mmfr2 = 0x01260000;
+    cpu->isar.id_mmfr3 = 0x02102211;
     cpu->isar.id_isar0 = 0x02101110;
     cpu->isar.id_isar1 = 0x13112111;
     cpu->isar.id_isar2 = 0x21232042;
@@ -XXX,XX +XXX,XX @@ static void aarch64_a72_initfn(Object *obj)
     cpu->id_pfr1 = 0x00011011;
     cpu->isar.id_dfr0 = 0x03010066;
     cpu->id_afr0 = 0x00000000;
-    cpu->id_mmfr0 = 0x10201105;
-    cpu->id_mmfr1 = 0x40000000;
-    cpu->id_mmfr2 = 0x01260000;
-    cpu->id_mmfr3 = 0x02102211;
+    cpu->isar.id_mmfr0 = 0x10201105;
+    cpu->isar.id_mmfr1 = 0x40000000;
+    cpu->isar.id_mmfr2 = 0x01260000;
+    cpu->isar.id_mmfr3 = 0x02102211;
     cpu->isar.id_isar0 = 0x02101110;
     cpu->isar.id_isar1 = 0x13112111;
     cpu->isar.id_isar2 = 0x21232042;
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
         u = FIELD_DP32(u, ID_ISAR6, SPECRES, 1);
         cpu->isar.id_isar6 = u;
 
-        u = cpu->id_mmfr3;
+        u = cpu->isar.id_mmfr3;
         u = FIELD_DP32(u, ID_MMFR3, PAN, 2); /* ATS1E1 */
-        cpu->id_mmfr3 = u;
+        cpu->isar.id_mmfr3 = u;
 
         u = cpu->isar.id_aa64dfr0;
         u = FIELD_DP64(u, ID_AA64DFR0, PMUVER, 5); /* v8.4-PMU */
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 1, .opc2 = 4,
               .access = PL1_R, .type = ARM_CP_CONST,
               .accessfn = access_aa32_tid3,
-              .resetvalue = cpu->id_mmfr0 },
+              .resetvalue = cpu->isar.id_mmfr0 },
             { .name = "ID_MMFR1", .state = ARM_CP_STATE_BOTH,
               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 1, .opc2 = 5,
               .access = PL1_R, .type = ARM_CP_CONST,
               .accessfn = access_aa32_tid3,
-              .resetvalue = cpu->id_mmfr1 },
+              .resetvalue = cpu->isar.id_mmfr1 },
             { .name = "ID_MMFR2", .state = ARM_CP_STATE_BOTH,
               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 1, .opc2 = 6,
               .access = PL1_R, .type = ARM_CP_CONST,
               .accessfn = access_aa32_tid3,
-              .resetvalue = cpu->id_mmfr2 },
+              .resetvalue = cpu->isar.id_mmfr2 },
             { .name = "ID_MMFR3", .state = ARM_CP_STATE_BOTH,
               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 1, .opc2 = 7,
               .access = PL1_R, .type = ARM_CP_CONST,
               .accessfn = access_aa32_tid3,
-              .resetvalue = cpu->id_mmfr3 },
+              .resetvalue = cpu->isar.id_mmfr3 },
             { .name = "ID_ISAR0", .state = ARM_CP_STATE_BOTH,
               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 2, .opc2 = 0,
               .access = PL1_R, .type = ARM_CP_CONST,
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 2, .opc2 = 6,
               .access = PL1_R, .type = ARM_CP_CONST,
               .accessfn = access_aa32_tid3,
-              .resetvalue = cpu->id_mmfr4 },
+              .resetvalue = cpu->isar.id_mmfr4 },
             { .name = "ID_ISAR6", .state = ARM_CP_STATE_BOTH,
               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 2, .opc2 = 7,
               .access = PL1_R, .type = ARM_CP_CONST,
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
         define_arm_cp_regs(cpu, vmsa_pmsa_cp_reginfo);
         define_arm_cp_regs(cpu, vmsa_cp_reginfo);
         /* TTCBR2 is introduced with ARMv8.2-A32HPD.  */
-        if (FIELD_EX32(cpu->id_mmfr4, ID_MMFR4, HPDS) != 0) {
+        if (FIELD_EX32(cpu->isar.id_mmfr4, ID_MMFR4, HPDS) != 0) {
             define_one_arm_cp_reg(cpu, &ttbcr2_reginfo);
         }
     }
diff --git a/target/arm/kvm32.c b/target/arm/kvm32.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/kvm32.c
+++ b/target/arm/kvm32.c
@@ -XXX,XX +XXX,XX @@ bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
      * Fortunately there is not yet anything in there that affects migration.
      */
 
+    err |= read_sys_reg32(fdarray[2], &ahcf->isar.id_mmfr0,
+                          ARM_CP15_REG32(0, 0, 1, 4));
+    err |= read_sys_reg32(fdarray[2], &ahcf->isar.id_mmfr1,
+                          ARM_CP15_REG32(0, 0, 1, 5));
+    err |= read_sys_reg32(fdarray[2], &ahcf->isar.id_mmfr2,
+                          ARM_CP15_REG32(0, 0, 1, 6));
+    err |= read_sys_reg32(fdarray[2], &ahcf->isar.id_mmfr3,
+                          ARM_CP15_REG32(0, 0, 1, 7));
+    if (read_sys_reg32(fdarray[2], &ahcf->isar.id_mmfr4,
+                       ARM_CP15_REG32(0, 0, 2, 6))) {
+        /*
+         * Older kernels don't support reading ID_MMFR4 (a new in v8
+         * register); assume it's zero.
+         */
+        ahcf->isar.id_mmfr4 = 0;
+    }
+
     /*
      * There is no way to read DBGDIDR, because currently 32-bit KVM
      * doesn't implement debug at all. Leave it at zero.
diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/kvm64.c
+++ b/target/arm/kvm64.c
@@ -XXX,XX +XXX,XX @@ bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
          */
         err |= read_sys_reg32(fdarray[2], &ahcf->isar.id_dfr0,
                               ARM64_SYS_REG(3, 0, 0, 1, 2));
+        err |= read_sys_reg32(fdarray[2], &ahcf->isar.id_mmfr0,
+                              ARM64_SYS_REG(3, 0, 0, 1, 4));
+        err |= read_sys_reg32(fdarray[2], &ahcf->isar.id_mmfr1,
+                              ARM64_SYS_REG(3, 0, 0, 1, 5));
+        err |= read_sys_reg32(fdarray[2], &ahcf->isar.id_mmfr2,
+                              ARM64_SYS_REG(3, 0, 0, 1, 6));
+        err |= read_sys_reg32(fdarray[2], &ahcf->isar.id_mmfr3,
+                              ARM64_SYS_REG(3, 0, 0, 1, 7));
         err |= read_sys_reg32(fdarray[2], &ahcf->isar.id_isar0,
                               ARM64_SYS_REG(3, 0, 0, 2, 0));
         err |= read_sys_reg32(fdarray[2], &ahcf->isar.id_isar1,
@@ -XXX,XX +XXX,XX @@ bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
                               ARM64_SYS_REG(3, 0, 0, 2, 4));
         err |= read_sys_reg32(fdarray[2], &ahcf->isar.id_isar5,
                               ARM64_SYS_REG(3, 0, 0, 2, 5));
+        err |= read_sys_reg32(fdarray[2], &ahcf->isar.id_mmfr4,
+                              ARM64_SYS_REG(3, 0, 0, 2, 6));
         err |= read_sys_reg32(fdarray[2], &ahcf->isar.id_isar6,
                               ARM64_SYS_REG(3, 0, 0, 2, 7));
 
-- 
2.20.1

Now that we have moved ID_MMFR4 into the ARMISARegisters struct, we
can define and use an isar_feature for the presence of the
ARMv8.2-AA32HPD feature, rather than open-coding the test.

While we're here, correct a comment typo which missed an 'A'
from the feature name.
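
Editorial sketch of the isar_feature pattern this message refers to. It is
self-contained and not the patch's code: the struct is cut down to one field,
extract_field32() and the sketch_ prefix are hypothetical names, and the
position of the HPDS field (bits [19:16] of ID_MMFR4) is taken from the Arm
architecture as I understand it, so treat it as an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Cut-down stand-in for QEMU's ARMISARegisters; only the field used here. */
typedef struct {
    uint32_t id_mmfr4;
} SketchISARegisters;

/* Hypothetical helper: extract 'len' bits (1..32) starting at bit 'shift'. */
static uint32_t extract_field32(uint32_t reg, unsigned shift, unsigned len)
{
    return (reg >> shift) & (UINT32_MAX >> (32 - len));
}

/* Assumed layout: ID_MMFR4.HPDS occupies bits [19:16]. */
static bool sketch_isar_feature_aa32_hpd(const SketchISARegisters *id)
{
    return extract_field32(id->id_mmfr4, 16, 4) != 0;
}

/* Spot checks showing how a caller would use the predicate. */
static void sketch_hpd_spot_checks(void)
{
    SketchISARegisters with_hpd = { .id_mmfr4 = 0x00010000 };
    SketchISARegisters without_hpd = { .id_mmfr4 = 0x00000010 };

    assert(sketch_isar_feature_aa32_hpd(&with_hpd));
    assert(!sketch_isar_feature_aa32_hpd(&without_hpd));
}
```

The point of the pattern is that callers name the feature they need instead
of repeating the register and field arithmetic at each use site.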

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200214175116.9164-20-peter.maydell@linaro.org
---
 target/arm/cpu.h    | 5 +++++
 target/arm/helper.c | 4 ++--
 2 files changed, 7 insertions(+), 2 deletions(-)

Cut-and-paste errors mean we're using FIELD_EX64() to extract fields from
some 32-bit ID registers. Use FIELD_EX32() instead. (This makes no
difference in behaviour; it's just more consistent.)
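
Editorial sketch of why the two forms agree. A 32-bit register value is
zero-extended when passed to a 64-bit extractor, so any field that lies
within bits [31:0] comes out the same either way. The two functions below
are simplified, hypothetical stand-ins, not QEMU's FIELD_EX32()/FIELD_EX64()
macros, and the sample register value is made up.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 32-bit extractor: 'len' must be in 1..32. */
static uint32_t sketch_ex32(uint32_t reg, unsigned shift, unsigned len)
{
    return (reg >> shift) & (UINT32_MAX >> (32 - len));
}

/* Hypothetical 64-bit extractor: 'len' must be in 1..64. */
static uint64_t sketch_ex64(uint64_t reg, unsigned shift, unsigned len)
{
    return (reg >> shift) & (UINT64_MAX >> (64 - len));
}

/* Check every 4-bit field of a 32-bit value through both extractors. */
static void sketch_ex_agree(uint32_t reg)
{
    for (unsigned shift = 0; shift < 32; shift += 4) {
        /* 'reg' is zero-extended to 64 bits in the second call. */
        assert(sketch_ex32(reg, shift, 4) == sketch_ex64(reg, shift, 4));
    }
}
```

The 32-bit form is still preferable for a 32-bit register because it states
the register width at the point of use.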

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200214175116.9164-21-peter.maydell@linaro.org
---
 target/arm/cpu.h | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa32_fp16_arith(const ARMISARegisters *id)
 static inline bool isar_feature_aa32_fp_d32(const ARMISARegisters *id)
 {
     /* Return true if D16-D31 are implemented */
-    return FIELD_EX64(id->mvfr0, MVFR0, SIMDREG) >= 2;
+    return FIELD_EX32(id->mvfr0, MVFR0, SIMDREG) >= 2;
 }
 
 static inline bool isar_feature_aa32_fpshvec(const ARMISARegisters *id)
 {
-    return FIELD_EX64(id->mvfr0, MVFR0, FPSHVEC) > 0;
+    return FIELD_EX32(id->mvfr0, MVFR0, FPSHVEC) > 0;
 }
 
 static inline bool isar_feature_aa32_fpdp(const ARMISARegisters *id)
 {
     /* Return true if CPU supports double precision floating point */
-    return FIELD_EX64(id->mvfr0, MVFR0, FPDP) > 0;
+    return FIELD_EX32(id->mvfr0, MVFR0, FPDP) > 0;
 }
 
 /*
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa32_fpdp(const ARMISARegisters *id)
  */
 static inline bool isar_feature_aa32_fp16_spconv(const ARMISARegisters *id)
 {
-    return FIELD_EX64(id->mvfr1, MVFR1, FPHP) > 0;
+    return FIELD_EX32(id->mvfr1, MVFR1, FPHP) > 0;
 }
 
 static inline bool isar_feature_aa32_fp16_dpconv(const ARMISARegisters *id)
 {
-    return FIELD_EX64(id->mvfr1, MVFR1, FPHP) > 1;
+    return FIELD_EX32(id->mvfr1, MVFR1, FPHP) > 1;
 }
 
 static inline bool isar_feature_aa32_vsel(const ARMISARegisters *id)
 {
-    return FIELD_EX64(id->mvfr2, MVFR2, FPMISC) >= 1;
+    return FIELD_EX32(id->mvfr2, MVFR2, FPMISC) >= 1;
 }
 
 static inline bool isar_feature_aa32_vcvt_dr(const ARMISARegisters *id)
 {
-    return FIELD_EX64(id->mvfr2, MVFR2, FPMISC) >= 2;
+    return FIELD_EX32(id->mvfr2, MVFR2, FPMISC) >= 2;
 }
 
 static inline bool isar_feature_aa32_vrint(const ARMISARegisters *id)
 {
-    return FIELD_EX64(id->mvfr2, MVFR2, FPMISC) >= 3;
+    return FIELD_EX32(id->mvfr2, MVFR2, FPMISC) >= 3;
 }
 
 static inline bool isar_feature_aa32_vminmaxnm(const ARMISARegisters *id)
 {
-    return FIELD_EX64(id->mvfr2, MVFR2, FPMISC) >= 4;
+    return FIELD_EX32(id->mvfr2, MVFR2, FPMISC) >= 4;
 }
 
 static inline bool isar_feature_aa32_pan(const ARMISARegisters *id)
-- 
2.20.1

The ACTLR2 and HACTLR2 AArch32 system registers didn't exist in ARMv7
or the original ARMv8.  They were later added as optional registers,
whose presence is signaled by the ID_MMFR4.AC2 field.  From ARMv8.2
they are mandatory (i.e. ID_MMFR4.AC2 must be non-zero).

We implemented HACTLR2 in commit 0e0456ab8895a5e85, but we
incorrectly made it exist for all v8 CPUs, and we didn't implement
ACTLR2 at all.

Sort this out by implementing both registers only when they are
supposed to exist, and setting the ID_MMFR4 bit for -cpu max.

Note that this removes HACTLR2 from our Cortex-A53, -A57 and -A72
CPU models; this is correct, because those CPUs do not implement
this register.

Fixes: 0e0456ab8895a5e85
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200214175116.9164-22-peter.maydell@linaro.org
---
 target/arm/cpu.h    |  5 +++++
 target/arm/cpu.c    |  1 +
 target/arm/cpu64.c  |  4 ++++
 target/arm/helper.c | 32 +++++++++++++++++++++++---------
 4 files changed, 33 insertions(+), 9 deletions(-)

From: Guenter Roeck <linux@roeck-us.net>

We need to be able to use OHCISysBusState outside hcd-ohci.c, so move it
to its include file.

Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Message-id: 20200217204812.9857-2-linux@roeck-us.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/usb/hcd-ohci.h | 16 ++++++++++++++++
 hw/usb/hcd-ohci.c | 15 ---------------
 2 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/hw/usb/hcd-ohci.h b/hw/usb/hcd-ohci.h
index XXXXXXX..XXXXXXX 100644
--- a/hw/usb/hcd-ohci.h
+++ b/hw/usb/hcd-ohci.h
@@ -XXX,XX +XXX,XX @@
 #define HCD_OHCI_H
 
 #include "sysemu/dma.h"
+#include "hw/usb.h"
 
 /* Number of Downstream Ports on the root hub: */
 #define OHCI_MAX_PORTS 15
@@ -XXX,XX +XXX,XX @@ typedef struct OHCIState {
     void (*ohci_die)(struct OHCIState *ohci);
 } OHCIState;
 
+#define TYPE_SYSBUS_OHCI "sysbus-ohci"
+#define SYSBUS_OHCI(obj) OBJECT_CHECK(OHCISysBusState, (obj), TYPE_SYSBUS_OHCI)
+
+typedef struct {
+    /*< private >*/
+    SysBusDevice parent_obj;
+    /*< public >*/
+
+    OHCIState ohci;
+    char *masterbus;
+    uint32_t num_ports;
+    uint32_t firstport;
+    dma_addr_t dma_offset;
+} OHCISysBusState;
+
 extern const VMStateDescription vmstate_ohci_state;
 
 void usb_ohci_init(OHCIState *ohci, DeviceState *dev, uint32_t num_ports,
diff --git a/hw/usb/hcd-ohci.c b/hw/usb/hcd-ohci.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/usb/hcd-ohci.c
+++ b/hw/usb/hcd-ohci.c
@@ -XXX,XX +XXX,XX @@ void ohci_sysbus_die(struct OHCIState *ohci)
     ohci_bus_stop(ohci);
 }
 
-#define TYPE_SYSBUS_OHCI "sysbus-ohci"
-#define SYSBUS_OHCI(obj) OBJECT_CHECK(OHCISysBusState, (obj), TYPE_SYSBUS_OHCI)
-
-typedef struct {
-    /*< private >*/
-    SysBusDevice parent_obj;
-    /*< public >*/
-
-    OHCIState ohci;
-    char *masterbus;
-    uint32_t num_ports;
-    uint32_t firstport;
-    dma_addr_t dma_offset;
-} OHCISysBusState;
-
 static void ohci_realize_pxa(DeviceState *dev, Error **errp)
 {
     OHCISysBusState *s = SYSBUS_OHCI(dev);
-- 
2.20.1

From: Guenter Roeck <linux@roeck-us.net>

Instantiate EHCI and OHCI controllers on Allwinner A10. OHCI ports are
modeled as companions of the respective EHCI ports.

With this patch applied, USB controllers are discovered and instantiated
when booting the cubieboard machine with a recent Linux kernel.

ehci-platform 1c14000.usb: EHCI Host Controller
ehci-platform 1c14000.usb: new USB bus registered, assigned bus number 1
ehci-platform 1c14000.usb: irq 26, io mem 0x01c14000
ehci-platform 1c14000.usb: USB 2.0 started, EHCI 1.00
ehci-platform 1c1c000.usb: EHCI Host Controller
ehci-platform 1c1c000.usb: new USB bus registered, assigned bus number 2
ehci-platform 1c1c000.usb: irq 31, io mem 0x01c1c000
ehci-platform 1c1c000.usb: USB 2.0 started, EHCI 1.00
ohci-platform 1c14400.usb: Generic Platform OHCI controller
ohci-platform 1c14400.usb: new USB bus registered, assigned bus number 3
ohci-platform 1c14400.usb: irq 27, io mem 0x01c14400
ohci-platform 1c1c400.usb: Generic Platform OHCI controller
ohci-platform 1c1c400.usb: new USB bus registered, assigned bus number 4
ohci-platform 1c1c400.usb: irq 32, io mem 0x01c1c400
usb 2-1: new high-speed USB device number 2 using ehci-platform
usb-storage 2-1:1.0: USB Mass Storage device detected
scsi host1: usb-storage 2-1:1.0
usb 3-1: new full-speed USB device number 2 using ohci-platform
input: QEMU QEMU USB Mouse as /devices/platform/soc/1c14400.usb/usb3/3-1/3-1:1.0/0003:0627:0001.0001/input/input0
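
Editorial sketch relating the addresses in the log above to the mapping in
the patch, which places controller i at its base plus i * 0x8000. The two
base values are the ones the patch defines; the stride macro and the function
names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define AW_A10_EHCI_BASE   0x01c14000u  /* value from the patch */
#define AW_A10_OHCI_BASE   0x01c14400u  /* value from the patch */
#define SKETCH_USB_STRIDE  0x8000u      /* hypothetical name for the literal */

/* MMIO base of EHCI controller 'i' (0 or 1 on this SoC model). */
static uint32_t sketch_ehci_addr(unsigned i)
{
    return AW_A10_EHCI_BASE + i * SKETCH_USB_STRIDE;
}

/* MMIO base of the companion OHCI controller 'i'. */
static uint32_t sketch_ohci_addr(unsigned i)
{
    return AW_A10_OHCI_BASE + i * SKETCH_USB_STRIDE;
}

/* The results match the 1c14000/1c1c000 and 1c14400/1c1c400 nodes logged. */
static void sketch_usb_addr_checks(void)
{
    assert(sketch_ehci_addr(0) == 0x01c14000u);
    assert(sketch_ehci_addr(1) == 0x01c1c000u);
    assert(sketch_ohci_addr(0) == 0x01c14400u);
    assert(sketch_ohci_addr(1) == 0x01c1c400u);
}
```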

Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Message-id: 20200217204812.9857-4-linux@roeck-us.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/arm/allwinner-a10.h |  6 +++++
 hw/arm/allwinner-a10.c         | 43 ++++++++++++++++++++++++++++++++++
 2 files changed, 49 insertions(+)

diff --git a/include/hw/arm/allwinner-a10.h b/include/hw/arm/allwinner-a10.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/arm/allwinner-a10.h
+++ b/include/hw/arm/allwinner-a10.h
@@ -XXX,XX +XXX,XX @@
 #include "hw/intc/allwinner-a10-pic.h"
 #include "hw/net/allwinner_emac.h"
 #include "hw/ide/ahci.h"
+#include "hw/usb/hcd-ohci.h"
+#include "hw/usb/hcd-ehci.h"
 
 #include "target/arm/cpu.h"
 
 
 #define AW_A10_SDRAM_BASE       0x40000000
 
+#define AW_A10_NUM_USB          2
+
 #define TYPE_AW_A10 "allwinner-a10"
 #define AW_A10(obj) OBJECT_CHECK(AwA10State, (obj), TYPE_AW_A10)
 
@@ -XXX,XX +XXX,XX @@ typedef struct AwA10State {
     AwEmacState emac;
     AllwinnerAHCIState sata;
     MemoryRegion sram_a;
+    EHCISysBusState ehci[AW_A10_NUM_USB];
+    OHCISysBusState ohci[AW_A10_NUM_USB];
 } AwA10State;
 
 #endif
diff --git a/hw/arm/allwinner-a10.c b/hw/arm/allwinner-a10.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/allwinner-a10.c
+++ b/hw/arm/allwinner-a10.c
@@ -XXX,XX +XXX,XX @@
 #include "hw/arm/allwinner-a10.h"
 #include "hw/misc/unimp.h"
 #include "sysemu/sysemu.h"
+#include "hw/boards.h"
+#include "hw/usb/hcd-ohci.h"
 
 #define AW_A10_PIC_REG_BASE     0x01c20400
 #define AW_A10_PIT_REG_BASE     0x01c20c00
 #define AW_A10_UART0_REG_BASE   0x01c28000
 #define AW_A10_EMAC_BASE        0x01c0b000
+#define AW_A10_EHCI_BASE        0x01c14000
+#define AW_A10_OHCI_BASE        0x01c14400
 #define AW_A10_SATA_BASE        0x01c18000
 
 static void aw_a10_init(Object *obj)
@@ -XXX,XX +XXX,XX @@ static void aw_a10_init(Object *obj)
 
     sysbus_init_child_obj(obj, "sata", &s->sata, sizeof(s->sata),
                           TYPE_ALLWINNER_AHCI);
+
+    if (machine_usb(current_machine)) {
+        int i;
+
+        for (i = 0; i < AW_A10_NUM_USB; i++) {
+            sysbus_init_child_obj(obj, "ehci[*]", OBJECT(&s->ehci[i]),
+                                  sizeof(s->ehci[i]), TYPE_PLATFORM_EHCI);
+            sysbus_init_child_obj(obj, "ohci[*]", OBJECT(&s->ohci[i]),
+                                  sizeof(s->ohci[i]), TYPE_SYSBUS_OHCI);
+        }
+    }
 }
 
 static void aw_a10_realize(DeviceState *dev, Error **errp)
@@ -XXX,XX +XXX,XX @@ static void aw_a10_realize(DeviceState *dev, Error **errp)
     serial_mm_init(get_system_memory(), AW_A10_UART0_REG_BASE, 2,
                    qdev_get_gpio_in(dev, 1),
                    115200, serial_hd(0), DEVICE_NATIVE_ENDIAN);
+
+    if (machine_usb(current_machine)) {
+        int i;
+
+        for (i = 0; i < AW_A10_NUM_USB; i++) {
+            char bus[16];
+
+            sprintf(bus, "usb-bus.%d", i);
+
+            object_property_set_bool(OBJECT(&s->ehci[i]), true,
+                                     "companion-enable", &error_fatal);
+            object_property_set_bool(OBJECT(&s->ehci[i]), true, "realized",
+                                     &error_fatal);
+            sysbus_mmio_map(SYS_BUS_DEVICE(&s->ehci[i]), 0,
+                            AW_A10_EHCI_BASE + i * 0x8000);
+            sysbus_connect_irq(SYS_BUS_DEVICE(&s->ehci[i]), 0,
+                               qdev_get_gpio_in(dev, 39 + i));
+
+            object_property_set_str(OBJECT(&s->ohci[i]), bus, "masterbus",
+                                    &error_fatal);
+            object_property_set_bool(OBJECT(&s->ohci[i]), true, "realized",
+                                     &error_fatal);
+            sysbus_mmio_map(SYS_BUS_DEVICE(&s->ohci[i]), 0,
+                            AW_A10_OHCI_BASE + i * 0x8000);
+            sysbus_connect_irq(SYS_BUS_DEVICE(&s->ohci[i]), 0,
+                               qdev_get_gpio_in(dev, 64 + i));
+        }
+    }
 }
 
 static void aw_a10_class_init(ObjectClass *oc, void *data)
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

These instructions shift left or right depending on the sign
of the input, and 7 bits are significant to the shift.  This
requires several masks and selects in addition to the actual
shifts to form the complete answer.

That said, the operation is still a small improvement even for
two 64-bit elements -- 13 vector operations instead of 2 * 7
integer operations.
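
Editorial sketch of the per-lane semantics being vectorised, for one 32-bit
lane. It is adapted from the 64-bit scalar helpers this patch removes, not
from the new TCG code; the function names are hypothetical. It assumes the
compiler implements signed right shift as an arithmetic shift and narrows to
int8_t by wrapping, as GCC and Clang do.

```c
#include <assert.h>
#include <stdint.h>

/* USHL on one 32-bit lane: only the low byte of 'shiftop' is used, signed. */
static uint32_t ushl32_ref(uint32_t val, uint32_t shiftop)
{
    int8_t shift = (int8_t)(uint8_t)shiftop;

    if (shift >= 32 || shift <= -32) {
        return 0;                /* everything shifted out */
    } else if (shift < 0) {
        return val >> -shift;    /* negative count: logical right shift */
    }
    return val << shift;
}

/* SSHL on one 32-bit lane: as above, but right shifts are arithmetic. */
static int32_t sshl32_ref(int32_t val, uint32_t shiftop)
{
    int8_t shift = (int8_t)(uint8_t)shiftop;

    if (shift >= 32) {
        return 0;
    } else if (shift <= -32) {
        return val >> 31;        /* all sign bits: 0 or -1 */
    } else if (shift < 0) {
        return val >> -shift;
    }
    return (int32_t)((uint32_t)val << shift);
}

/* Spot checks covering both directions and the out-of-range cases. */
static void shl_ref_spot_checks(void)
{
    assert(ushl32_ref(1, 4) == 16);
    assert(ushl32_ref(0x80000000u, 0xff) == 0x40000000u); /* count -1 */
    assert(ushl32_ref(0xffffffffu, 32) == 0);
    assert(sshl32_ref(-8, 0xfe) == -2);                   /* count -2 */
    assert(sshl32_ref(-8, 0x80) == -1);                   /* count -128 */
}
```

The separate range checks in this model are what turn into the extra masks
and selects in the vector expansion.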

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200216214232.4230-2-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.h        |  11 +-
 target/arm/translate.h     |   6 +
 target/arm/neon_helper.c   |  33 ----
 target/arm/translate-a64.c |  18 +--
 target/arm/translate.c     | 299 +++++++++++++++++++++++++++++++++++--
 target/arm/vec_helper.c    |  88 +++++++++++
 6 files changed, 389 insertions(+), 66 deletions(-)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_2(neon_abd_s16, i32, i32, i32)
 DEF_HELPER_2(neon_abd_u32, i32, i32, i32)
 DEF_HELPER_2(neon_abd_s32, i32, i32, i32)
 
-DEF_HELPER_2(neon_shl_u8, i32, i32, i32)
-DEF_HELPER_2(neon_shl_s8, i32, i32, i32)
 DEF_HELPER_2(neon_shl_u16, i32, i32, i32)
 DEF_HELPER_2(neon_shl_s16, i32, i32, i32)
-DEF_HELPER_2(neon_shl_u32, i32, i32, i32)
-DEF_HELPER_2(neon_shl_s32, i32, i32, i32)
-DEF_HELPER_2(neon_shl_u64, i64, i64, i64)
-DEF_HELPER_2(neon_shl_s64, i64, i64, i64)
 DEF_HELPER_2(neon_rshl_u8, i32, i32, i32)
 DEF_HELPER_2(neon_rshl_s8, i32, i32, i32)
 DEF_HELPER_2(neon_rshl_u16, i32, i32, i32)
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_2(frint64_s, TCG_CALL_NO_RWG, f32, f32, ptr)
 DEF_HELPER_FLAGS_2(frint32_d, TCG_CALL_NO_RWG, f64, f64, ptr)
 DEF_HELPER_FLAGS_2(frint64_d, TCG_CALL_NO_RWG, f64, f64, ptr)
 
+DEF_HELPER_FLAGS_4(gvec_sshl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_sshl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_ushl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_ushl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
 #ifdef TARGET_AARCH64
 #include "helper-a64.h"
 #include "helper-sve.h"
diff --git a/target/arm/translate.h b/target/arm/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ uint64_t vfp_expand_imm(int size, uint8_t imm8);
 extern const GVecGen3 mla_op[4];
 extern const GVecGen3 mls_op[4];
 extern const GVecGen3 cmtst_op[4];
+extern const GVecGen3 sshl_op[4];
+extern const GVecGen3 ushl_op[4];
 extern const GVecGen2i ssra_op[4];
 extern const GVecGen2i usra_op[4];
 extern const GVecGen2i sri_op[4];
@@ -XXX,XX +XXX,XX @@ extern const GVecGen4 sqadd_op[4];
 extern const GVecGen4 uqsub_op[4];
 extern const GVecGen4 sqsub_op[4];
 void gen_cmtst_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
+void gen_ushl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b);
+void gen_sshl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b);
+void gen_ushl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
+void gen_sshl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
 
 /*
  * Forward to the isar_feature_* tests given a DisasContext pointer.
diff --git a/target/arm/neon_helper.c b/target/arm/neon_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/neon_helper.c
+++ b/target/arm/neon_helper.c
@@ -XXX,XX +XXX,XX @@ NEON_VOP(abd_u32, neon_u32, 1)
     } else { \
         dest = src1 << tmp; \
     }} while (0)
-NEON_VOP(shl_u8, neon_u8, 4)
 NEON_VOP(shl_u16, neon_u16, 2)
-NEON_VOP(shl_u32, neon_u32, 1)
 #undef NEON_FN
 
-uint64_t HELPER(neon_shl_u64)(uint64_t val, uint64_t shiftop)
-{
-    int8_t shift = (int8_t)shiftop;
-    if (shift >= 64 || shift <= -64) {
-        val = 0;
-    } else if (shift < 0) {
-        val >>= -shift;
-    } else {
-        val <<= shift;
-    }
-    return val;
-}
-
 #define NEON_FN(dest, src1, src2) do { \
     int8_t tmp; \
     tmp = (int8_t)src2; \
@@ -XXX,XX +XXX,XX @@ uint64_t HELPER(neon_shl_u64)(uint64_t val, uint64_t shiftop)
     } else { \
         dest = src1 << tmp; \
     }} while (0)
-NEON_VOP(shl_s8, neon_s8, 4)
 NEON_VOP(shl_s16, neon_s16, 2)
-NEON_VOP(shl_s32, neon_s32, 1)
 #undef NEON_FN
 
-uint64_t HELPER(neon_shl_s64)(uint64_t valop, uint64_t shiftop)
-{
-    int8_t shift = (int8_t)shiftop;
-    int64_t val = valop;
-    if (shift >= 64) {
-        val = 0;
-    } else if (shift <= -64) {
-        val >>= 63;
-    } else if (shift < 0) {
-        val >>= -shift;
-    } else {
-        val <<= shift;
-    }
-    return val;
-}
-
 #define NEON_FN(dest, src1, src2) do { \
     int8_t tmp; \
     tmp = (int8_t)src2; \
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void handle_3same_64(DisasContext *s, int opcode, bool u,
         break;
     case 0x8: /* SSHL, USHL */
         if (u) {
-            gen_helper_neon_shl_u64(tcg_rd, tcg_rn, tcg_rm);
+            gen_ushl_i64(tcg_rd, tcg_rn, tcg_rm);
         } else {
-            gen_helper_neon_shl_s64(tcg_rd, tcg_rn, tcg_rm);
+            gen_sshl_i64(tcg_rd, tcg_rn, tcg_rm);
         }
         break;
     case 0x9: /* SQSHL, UQSHL */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
                        is_q ? 16 : 8, vec_full_reg_size(s),
                        (u ? uqsub_op : sqsub_op) + size);
         return;
+    case 0x08: /* SSHL, USHL */
+        gen_gvec_op3(s, is_q, rd, rn, rm,
+                     u ? &ushl_op[size] : &sshl_op[size]);
+        return;
     case 0x0c: /* SMAX, UMAX */
         if (u) {
             gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_umax, size);
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
                 genfn = fns[size][u];
                 break;
             }
-            case 0x8: /* SSHL, USHL */
-            {
-                static NeonGenTwoOpFn * const fns[3][2] = {
-                    { gen_helper_neon_shl_s8, gen_helper_neon_shl_u8 },
-                    { gen_helper_neon_shl_s16, gen_helper_neon_shl_u16 },
-                    { gen_helper_neon_shl_s32, gen_helper_neon_shl_u32 },
-                };
-                genfn = fns[size][u];
-                break;
-            }
             case 0x9: /* SQSHL, UQSHL */
             {
                 static NeonGenTwoOpEnvFn * const fns[3][2] = {
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static inline void gen_neon_shift_narrow(int size, TCGv_i32 var, TCGv_i32 shift,
         if (u) {
             switch (size) {
             case 1: gen_helper_neon_shl_u16(var, var, shift); break;
-            case 2: gen_helper_neon_shl_u32(var, var, shift); break;
+            case 2: gen_ushl_i32(var, var, shift); break;
             default: abort();
             }
         } else {
             switch (size) {
             case 1: gen_helper_neon_shl_s16(var, var, shift); break;
-            case 2: gen_helper_neon_shl_s32(var, var, shift); break;
+            case 2: gen_sshl_i32(var, var, shift); break;
             default: abort();
             }
         }
@@ -XXX,XX +XXX,XX @@ const GVecGen3 cmtst_op[4] = {
       .vece = MO_64 },
 };
 
+void gen_ushl_i32(TCGv_i32 dst, TCGv_i32 src, TCGv_i32 shift)
+{
+    TCGv_i32 lval = tcg_temp_new_i32();
+    TCGv_i32 rval = tcg_temp_new_i32();
+    TCGv_i32 lsh = tcg_temp_new_i32();
+    TCGv_i32 rsh = tcg_temp_new_i32();
+    TCGv_i32 zero = tcg_const_i32(0);
+    TCGv_i32 max = tcg_const_i32(32);
+
+    /*
+     * Rely on the TCG guarantee that out of range shifts produce
+     * unspecified results, not undefined behaviour (i.e. no trap).
+     * Discard out-of-range results after the fact.
+     */
+    tcg_gen_ext8s_i32(lsh, shift);
+    tcg_gen_neg_i32(rsh, lsh);
+    tcg_gen_shl_i32(lval, src, lsh);
+    tcg_gen_shr_i32(rval, src, rsh);
+    tcg_gen_movcond_i32(TCG_COND_LTU, dst, lsh, max, lval, zero);
+    tcg_gen_movcond_i32(TCG_COND_LTU, dst, rsh, max, rval, dst);
+
+    tcg_temp_free_i32(lval);
+    tcg_temp_free_i32(rval);
+    tcg_temp_free_i32(lsh);
+    tcg_temp_free_i32(rsh);
+    tcg_temp_free_i32(zero);
+    tcg_temp_free_i32(max);
+}
+
+void gen_ushl_i64(TCGv_i64 dst, TCGv_i64 src, TCGv_i64 shift)
+{
+    TCGv_i64 lval = tcg_temp_new_i64();
+    TCGv_i64 rval = tcg_temp_new_i64();
+    TCGv_i64 lsh = tcg_temp_new_i64();
+    TCGv_i64 rsh = tcg_temp_new_i64();
+    TCGv_i64 zero = tcg_const_i64(0);
+    TCGv_i64 max = tcg_const_i64(64);
+
+    /*
+     * Rely on the TCG guarantee that out of range shifts produce
+     * unspecified results, not undefined behaviour (i.e. no trap).
+     * Discard out-of-range results after the fact.
+     */
+    tcg_gen_ext8s_i64(lsh, shift);
+    tcg_gen_neg_i64(rsh, lsh);
+    tcg_gen_shl_i64(lval, src, lsh);
+    tcg_gen_shr_i64(rval, src, rsh);
+    tcg_gen_movcond_i64(TCG_COND_LTU, dst, lsh, max, lval, zero);
+    tcg_gen_movcond_i64(TCG_COND_LTU, dst, rsh, max, rval, dst);
+
+    tcg_temp_free_i64(lval);
+    tcg_temp_free_i64(rval);
+    tcg_temp_free_i64(lsh);
+    tcg_temp_free_i64(rsh);
+    tcg_temp_free_i64(zero);
+    tcg_temp_free_i64(max);
+}
+
+static void gen_ushl_vec(unsigned vece, TCGv_vec dst,
+                         TCGv_vec src, TCGv_vec shift)
+{
+    TCGv_vec lval = tcg_temp_new_vec_matching(dst);
+    TCGv_vec rval = tcg_temp_new_vec_matching(dst);
+    TCGv_vec lsh = tcg_temp_new_vec_matching(dst);
+    TCGv_vec rsh = tcg_temp_new_vec_matching(dst);
+    TCGv_vec msk, max;
+
+    tcg_gen_neg_vec(vece, rsh, shift);
+    if (vece == MO_8) {
+        tcg_gen_mov_vec(lsh, shift);
+    } else {
+        msk = tcg_temp_new_vec_matching(dst);
+        tcg_gen_dupi_vec(vece, msk, 0xff);
+        tcg_gen_and_vec(vece, lsh, shift, msk);
+        tcg_gen_and_vec(vece, rsh, rsh, msk);
+        tcg_temp_free_vec(msk);
+    }
+
+    /*
+     * Rely on the TCG guarantee that out of range shifts produce
+     * unspecified results, not undefined behaviour (i.e. no trap).
+     * Discard out-of-range results after the fact.
+     */
+    tcg_gen_shlv_vec(vece, lval, src, lsh);
+    tcg_gen_shrv_vec(vece, rval, src, rsh);
+
+    max = tcg_temp_new_vec_matching(dst);
+    tcg_gen_dupi_vec(vece, max, 8 << vece);
+
+    /*
+     * The choice of LT (signed) and GEU (unsigned) are biased toward
+     * the instructions of the x86_64 host.  For MO_8, the whole byte
+     * is significant so we must use an unsigned compare; otherwise we
+     * have already masked to a byte and so a signed compare works.
+     * Other tcg hosts have a full set of comparisons and do not care.
+     */
+    if (vece == MO_8) {
+        tcg_gen_cmp_vec(TCG_COND_GEU, vece, lsh, lsh, max);
+        tcg_gen_cmp_vec(TCG_COND_GEU, vece, rsh, rsh, max);
+        tcg_gen_andc_vec(vece, lval, lval, lsh);
+        tcg_gen_andc_vec(vece, rval, rval, rsh);
+    } else {
+        tcg_gen_cmp_vec(TCG_COND_LT, vece, lsh, lsh, max);
+        tcg_gen_cmp_vec(TCG_COND_LT, vece, rsh, rsh, max);
+        tcg_gen_and_vec(vece, lval, lval, lsh);
+        tcg_gen_and_vec(vece, rval, rval, rsh);
+    }
+    tcg_gen_or_vec(vece, dst, lval, rval);
+
+    tcg_temp_free_vec(max);
+    tcg_temp_free_vec(lval);
+    tcg_temp_free_vec(rval);
+    tcg_temp_free_vec(lsh);
+    tcg_temp_free_vec(rsh);
+}
+
+static const TCGOpcode ushl_list[] = {
+    INDEX_op_neg_vec, INDEX_op_shlv_vec,
+    INDEX_op_shrv_vec, INDEX_op_cmp_vec, 0
+};
+
+const GVecGen3 ushl_op[4] = {
+    { .fniv = gen_ushl_vec,
+      .fno = gen_helper_gvec_ushl_b,
+      .opt_opc = ushl_list,
+      .vece = MO_8 },
+    { .fniv = gen_ushl_vec,
+      .fno = gen_helper_gvec_ushl_h,
+      .opt_opc = ushl_list,
+      .vece = MO_16 },
+    { .fni4 = gen_ushl_i32,
+      .fniv = gen_ushl_vec,
+      .opt_opc = ushl_list,
+      .vece = MO_32 },
+    { .fni8 = gen_ushl_i64,
+      .fniv = gen_ushl_vec,
+      .opt_opc = ushl_list,
+      .vece = MO_64 },
+};
+
+void gen_sshl_i32(TCGv_i32 dst, TCGv_i32 src, TCGv_i32 shift)
+{
+    TCGv_i32 lval = tcg_temp_new_i32();
+    TCGv_i32 rval = tcg_temp_new_i32();
+    TCGv_i32 lsh = tcg_temp_new_i32();
+    TCGv_i32 rsh = tcg_temp_new_i32();
+    TCGv_i32 zero = tcg_const_i32(0);
+    TCGv_i32 max = tcg_const_i32(31);
+
+    /*
+     * Rely on the TCG guarantee that out of range shifts produce
+     * unspecified results, not undefined behaviour (i.e. no trap).
+     * Discard out-of-range results after the fact.
+     */
+    tcg_gen_ext8s_i32(lsh, shift);
+    tcg_gen_neg_i32(rsh, lsh);
+    tcg_gen_shl_i32(lval, src, lsh);
+    tcg_gen_umin_i32(rsh, rsh, max);
+    tcg_gen_sar_i32(rval, src, rsh);
+    tcg_gen_movcond_i32(TCG_COND_LEU, lval, lsh, max, lval, zero);
+    tcg_gen_movcond_i32(TCG_COND_LT, dst, lsh, zero, rval, lval);
+
+    tcg_temp_free_i32(lval);
+    tcg_temp_free_i32(rval);
+    tcg_temp_free_i32(lsh);
+    tcg_temp_free_i32(rsh);
+    tcg_temp_free_i32(zero);
+    tcg_temp_free_i32(max);
+}
+
+void gen_sshl_i64(TCGv_i64 dst, TCGv_i64 src, TCGv_i64 shift)
+{
+    TCGv_i64 lval = tcg_temp_new_i64();
+    TCGv_i64 rval = tcg_temp_new_i64();
+    TCGv_i64 lsh = tcg_temp_new_i64();
+    TCGv_i64 rsh = tcg_temp_new_i64();
+    TCGv_i64 zero = tcg_const_i64(0);
+    TCGv_i64 max = tcg_const_i64(63);
+
+    /*
+     * Rely on the TCG guarantee that out of range shifts produce
+     * unspecified results, not undefined behaviour (i.e. no trap).
+     * Discard out-of-range results after the fact.
+     */
+    tcg_gen_ext8s_i64(lsh, shift);
+    tcg_gen_neg_i64(rsh, lsh);
+    tcg_gen_shl_i64(lval, src, lsh);
+    tcg_gen_umin_i64(rsh, rsh, max);
+    tcg_gen_sar_i64(rval, src, rsh);
+    tcg_gen_movcond_i64(TCG_COND_LEU, lval, lsh, max, lval, zero);
+    tcg_gen_movcond_i64(TCG_COND_LT, dst, lsh, zero, rval, lval);
+
+    tcg_temp_free_i64(lval);
+    tcg_temp_free_i64(rval);
+    tcg_temp_free_i64(lsh);
+    tcg_temp_free_i64(rsh);
+    tcg_temp_free_i64(zero);
+    tcg_temp_free_i64(max);
+}
+
+static void gen_sshl_vec(unsigned vece, TCGv_vec dst,
+                         TCGv_vec src, TCGv_vec shift)
+{
+    TCGv_vec lval = tcg_temp_new_vec_matching(dst);
+    TCGv_vec rval = tcg_temp_new_vec_matching(dst);
+    TCGv_vec lsh = tcg_temp_new_vec_matching(dst);
+    TCGv_vec rsh = tcg_temp_new_vec_matching(dst);
+    TCGv_vec tmp = tcg_temp_new_vec_matching(dst);
+
+    /*
+     * Rely on the TCG guarantee that out of range shifts produce
+     * unspecified results, not undefined behaviour (i.e. no trap).
+     * Discard out-of-range results after the fact.
+     */
+    tcg_gen_neg_vec(vece, rsh, shift);
+    if (vece == MO_8) {
+        tcg_gen_mov_vec(lsh, shift);
+    } else {
+        tcg_gen_dupi_vec(vece, tmp, 0xff);
+        tcg_gen_and_vec(vece, lsh, shift, tmp);
+        tcg_gen_and_vec(vece, rsh, rsh, tmp);
+    }
+
+    /* Bound rsh so out of bound right shift gets -1.  */
+    tcg_gen_dupi_vec(vece, tmp, (8 << vece) - 1);
+    tcg_gen_umin_vec(vece, rsh, rsh, tmp);
+    tcg_gen_cmp_vec(TCG_COND_GT, vece, tmp, lsh, tmp);
+
+    tcg_gen_shlv_vec(vece, lval, src, lsh);
+    tcg_gen_sarv_vec(vece, rval, src, rsh);
+
+    /* Select in-bound left shift.  */
+    tcg_gen_andc_vec(vece, lval, lval, tmp);
+
+    /* Select between left and right shift.  */
+    if (vece == MO_8) {
+        tcg_gen_dupi_vec(vece, tmp, 0);
+        tcg_gen_cmpsel_vec(TCG_COND_LT, vece, dst, lsh, tmp, rval, lval);
+    } else {
+        tcg_gen_dupi_vec(vece, tmp, 0x80);
+        tcg_gen_cmpsel_vec(TCG_COND_LT, vece, dst, lsh, tmp, lval, rval);
+    }
+
+    tcg_temp_free_vec(lval);
+    tcg_temp_free_vec(rval);
+    tcg_temp_free_vec(lsh);
+    tcg_temp_free_vec(rsh);
+    tcg_temp_free_vec(tmp);
+}
+
+static const TCGOpcode sshl_list[] = {
+    INDEX_op_neg_vec, INDEX_op_umin_vec, INDEX_op_shlv_vec,
+    INDEX_op_sarv_vec, INDEX_op_cmp_vec, INDEX_op_cmpsel_vec, 0
+};
+
+const GVecGen3 sshl_op[4] = {
+    { .fniv = gen_sshl_vec,
+      .fno = gen_helper_gvec_sshl_b,
+      .opt_opc = sshl_list,
+      .vece = MO_8 },
+    { .fniv = gen_sshl_vec,
+      .fno = gen_helper_gvec_sshl_h,
+      .opt_opc = sshl_list,
+      .vece = MO_16 },
+    { .fni4 = gen_sshl_i32,
+      .fniv = gen_sshl_vec,
+      .opt_opc = sshl_list,
+      .vece = MO_32 },
+    { .fni8 = gen_sshl_i64,
+      .fniv = gen_sshl_vec,
+      .opt_opc = sshl_list,
+      .vece = MO_64 },
+};
+
 static void gen_uqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
                           TCGv_vec a, TCGv_vec b)
 {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                                   vec_size, vec_size);
             }
             return 0;
+
+        case NEON_3R_VSHL:
+            /* Note the operation is vshl vd,vm,vn */
+            tcg_gen_gvec_3(rd_ofs, rm_ofs, rn_ofs, vec_size, vec_size,
+                           u ? &ushl_op[size] : &sshl_op[size]);
+            return 0;
         }
 
         if (size == 3) {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                 neon_load_reg64(cpu_V0, rn + pass);
                 neon_load_reg64(cpu_V1, rm + pass);
                 switch (op) {
-                case NEON_3R_VSHL:
-                    if (u) {
-                        gen_helper_neon_shl_u64(cpu_V0, cpu_V1, cpu_V0);
-                    } else {
-                        gen_helper_neon_shl_s64(cpu_V0, cpu_V1, cpu_V0);
-                    }
-                    break;
                 case NEON_3R_VQSHL:
                     if (u) {
                         gen_helper_neon_qshl_u64(cpu_V0, cpu_env,
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
         }
         pairwise = 0;
         switch (op) {
-        case NEON_3R_VSHL:
         case NEON_3R_VQSHL:
         case NEON_3R_VRSHL:
         case NEON_3R_VQRSHL:
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
         case NEON_3R_VHSUB:
             GEN_NEON_INTEGER_OP(hsub);
             break;
-        case NEON_3R_VSHL:
-            GEN_NEON_INTEGER_OP(shl);
-            break;
         case NEON_3R_VQSHL:
             GEN_NEON_INTEGER_OP_ENV(qshl);
             break;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                             }
                         } else {
                             if (input_unsigned) {
-                                gen_helper_neon_shl_u64(cpu_V0, in, tmp64);
+                                gen_ushl_i64(cpu_V0, in, tmp64);
                             } else {
-                                gen_helper_neon_shl_s64(cpu_V0, in, tmp64);
+                                gen_sshl_i64(cpu_V0, in, tmp64);
                             }
                         }
                         tmp = tcg_temp_new_i32();
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_fmlal_idx_a64)(void *vd, void *vn, void *vm,
     do_fmlal_idx(vd, vn, vm, &env->vfp.fp_status, desc,
                  get_flush_inputs_to_zero(&env->vfp.fp_status_f16));
 }
+
+void HELPER(gvec_sshl_b)(void *vd, void *vn, void *vm, uint32_t desc)
+{
+    intptr_t i, opr_sz = simd_oprsz(desc);
+    int8_t *d = vd, *n = vn, *m = vm;
+
+    for (i = 0; i < opr_sz; ++i) {
+        int8_t mm = m[i];
+        int8_t nn = n[i];
+        int8_t res = 0;
+        if (mm >= 0) {
+            if (mm < 8) {
+                res = nn << mm;
+            }
+        } else {
+            res = nn >> (mm > -8 ? -mm : 7);
+        }
+        d[i] = res;
+    }
+    clear_tail(d, opr_sz, simd_maxsz(desc));
+}
+
+void HELPER(gvec_sshl_h)(void *vd, void *vn, void *vm, uint32_t desc)
+{
+    intptr_t i, opr_sz = simd_oprsz(desc);
+    int16_t *d = vd, *n = vn, *m = vm;
+
+    for (i = 0; i < opr_sz / 2; ++i) {
+        int8_t mm = m[i];   /* only 8 bits of shift are significant */
+        int16_t nn = n[i];
+        int16_t res = 0;
+        if (mm >= 0) {
+            if (mm < 16) {
+                res = nn << mm;
+            }
+        } else {
+            res = nn >> (mm > -16 ? -mm : 15);
+        }
+        d[i] = res;
+    }
+    clear_tail(d, opr_sz, simd_maxsz(desc));
+}
+
+void HELPER(gvec_ushl_b)(void *vd, void *vn, void *vm, uint32_t desc)
+{
+    intptr_t i, opr_sz = simd_oprsz(desc);
+    uint8_t *d = vd, *n = vn, *m = vm;
+
+    for (i = 0; i < opr_sz; ++i) {
+        int8_t mm = m[i];
+        uint8_t nn = n[i];
+        uint8_t res = 0;
+        if (mm >= 0) {
+            if (mm < 8) {
+                res = nn << mm;
+            }
+        } else {
+            if (mm > -8) {
+                res = nn >> -mm;
+            }
+        }
+        d[i] = res;
+    }
+    clear_tail(d, opr_sz, simd_maxsz(desc));
+}
+
+void HELPER(gvec_ushl_h)(void *vd, void *vn, void *vm, uint32_t desc)
+{
+    intptr_t i, opr_sz = simd_oprsz(desc);
+    uint16_t *d = vd, *n = vn, *m = vm;
+
+    for (i = 0; i < opr_sz / 2; ++i) {
+        int8_t mm = m[i];   /* only 8 bits of shift are significant */
+        uint16_t nn = n[i];
+        uint16_t res = 0;
+        if (mm >= 0) {
+            if (mm < 16) {
+                res = nn << mm;
+            }
+        } else {
+            if (mm > -16) {
+                res = nn >> -mm;
+            }
+        }
+        d[i] = res;
+    }
+    clear_tail(d, opr_sz, simd_maxsz(desc));
+}
-- 
2.20.1
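
As a cross-check on the signed-shift semantics that gen_sshl_i64 and the
gvec_sshl_* helpers in the patch above implement, here is a minimal scalar
reference model. It is a sketch, not code from the patch: the name
sshl64_model is hypothetical, and it assumes only what the diff shows (the
low byte of the shift count is sign-extended, oversized left shifts give 0,
oversized right shifts give the sign fill).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Reference model (hypothetical name) of a signed 64-bit shift by a
 * signed amount: only the low byte of 'shift' is significant, a
 * positive amount shifts left and a negative amount shifts right.
 */
static int64_t sshl64_model(int64_t src, uint64_t shift)
{
    /* Sign-extend the low 8 bits of the shift count: range -128..127. */
    int sh = (int)(shift & 0xff);
    if (sh >= 128) {
        sh -= 256;
    }

    if (sh >= 0) {
        /* Left shift; a count of 64 or more leaves no source bits. */
        return sh < 64 ? (int64_t)((uint64_t)src << sh) : 0;
    }

    /* Right shift; clamping the count to 63 yields 0 or -1 (sign fill). */
    int rsh = -sh < 63 ? -sh : 63;
    assert(rsh >= 1 && rsh <= 63);

    /* Arithmetic shift written without shifting a negative value. */
    return src < 0 ? ~(~src >> rsh) : src >> rsh;
}
```

Unlike the TCG version, the model branches on the shift count. The TCG code
computes both shifts and selects afterwards, so it needs no branches.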

From: Richard Henderson <richard.henderson@linaro.org>

The gvec form will be needed for implementing SVE2.

Extend the implementation to operate on uint64_t instead of uint32_t.
Use a counted inner loop instead of terminating when op1 goes to zero,
looking toward the required implementation for ARMv8.4-DIT.
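
A minimal sketch of the counted-loop idea, assuming eight byte lanes packed
into one uint64_t. This is an illustration, not the code from this patch;
the names pmul8_lanes and pmul8_lanes_example are hypothetical. The loop
always runs eight times, so its trip count does not depend on the operand
values.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Polynomial (carry-less) multiply of eight byte lanes at once,
 * keeping the low 8 bits of each per-lane product.
 */
static uint64_t pmul8_lanes(uint64_t n, uint64_t m)
{
    uint64_t r = 0;

    for (int j = 0; j < 8; ++j) {
        /* Bit 0 of every lane of n, widened to 0x00 or 0xff per lane. */
        uint64_t mask = (n & 0x0101010101010101ull) * 0xff;
        r ^= m & mask;
        /* Bring the next multiplier bit of each lane down to bit 0. */
        n >>= 1;
        /* Shift each lane left by one, dropping the bit that would
         * otherwise cross into the neighbouring lane. */
        m = (m << 1) & 0xfefefefefefefefeull;
    }
    return r;
}

void pmul8_lanes_example(void)
{
    /* Lane 0: 3 (x) 3 = 5; lane 1: 2 (x) 3 = 6. */
    assert(pmul8_lanes(0x0203, 0x0303) == 0x0605);
}
```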

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200216214232.4230-3-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.h        |  3 ++-
 target/arm/neon_helper.c   | 22 ----------------------
 target/arm/translate-a64.c | 10 +++-------
 target/arm/translate.c     | 11 ++++-------
 target/arm/vec_helper.c    | 30 ++++++++++++++++++++++++++++++
 5 files changed, 39 insertions(+), 37 deletions(-)

From: Richard Henderson <richard.henderson@linaro.org>

The gvec form will be needed for implementing SVE2.

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200216214232.4230-4-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.h        |  4 +---
 target/arm/neon_helper.c   | 30 ------------------------------
 target/arm/translate-a64.c | 28 +++-------------------------
 target/arm/translate.c     | 16 ++--------------
 target/arm/vec_helper.c    | 33 +++++++++++++++++++++++++++++++++
 5 files changed, 39 insertions(+), 72 deletions(-)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(crc32, TCG_CALL_NO_RWG_SE, i32, i32, i32, i32)
 DEF_HELPER_FLAGS_3(crc32c, TCG_CALL_NO_RWG_SE, i32, i32, i32, i32)
 DEF_HELPER_2(dc_zva, void, env, i64)
 
-DEF_HELPER_FLAGS_2(neon_pmull_64_lo, TCG_CALL_NO_RWG_SE, i64, i64, i64)
-DEF_HELPER_FLAGS_2(neon_pmull_64_hi, TCG_CALL_NO_RWG_SE, i64, i64, i64)
-
 DEF_HELPER_FLAGS_5(gvec_qrdmlah_s16, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(gvec_qrdmlsh_s16, TCG_CALL_NO_RWG,
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(gvec_ushl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(gvec_ushl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 
 DEF_HELPER_FLAGS_4(gvec_pmul_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_pmull_q, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 
 #ifdef TARGET_AARCH64
 #include "helper-a64.h"
diff --git a/target/arm/neon_helper.c b/target/arm/neon_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/neon_helper.c
+++ b/target/arm/neon_helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(neon_zip16)(void *vd, void *vm)
     rm[0] = m0;
     rd[0] = d0;
 }
-
-/* Helper function for 64 bit polynomial multiply case:
- * perform PolynomialMult(op1, op2) and return either the top or
- * bottom half of the 128 bit result.
- */
-uint64_t HELPER(neon_pmull_64_lo)(uint64_t op1, uint64_t op2)
-{
-    int bitnum;
-    uint64_t res = 0;
-
-    for (bitnum = 0; bitnum < 64; bitnum++) {
-        if (op1 & (1ULL << bitnum)) {
-            res ^= op2 << bitnum;
-        }
-    }
-    return res;
-}
-uint64_t HELPER(neon_pmull_64_hi)(uint64_t op1, uint64_t op2)
-{
-    int bitnum;
-    uint64_t res = 0;
-
-    /* bit 0 of op1 can't influence the high 64 bits at all */
-    for (bitnum = 1; bitnum < 64; bitnum++) {
-        if (op1 & (1ULL << bitnum)) {
-            res ^= op2 >> (64 - bitnum);
-        }
-    }
-    return res;
-}
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void handle_3rd_narrowing(DisasContext *s, int is_q, int is_u, int size,
     clear_vec_high(s, is_q, rd);
 }
 
-static void handle_pmull_64(DisasContext *s, int is_q, int rd, int rn, int rm)
-{
-    /* PMULL of 64 x 64 -> 128 is an odd special case because it
-     * is the only three-reg-diff instruction which produces a
-     * 128-bit wide result from a single operation. However since
-     * it's possible to calculate the two halves more or less
-     * separately we just use two helper calls.
-     */
-    TCGv_i64 tcg_op1 = tcg_temp_new_i64();
-    TCGv_i64 tcg_op2 = tcg_temp_new_i64();
-    TCGv_i64 tcg_res = tcg_temp_new_i64();
-
-    read_vec_element(s, tcg_op1, rn, is_q, MO_64);
-    read_vec_element(s, tcg_op2, rm, is_q, MO_64);
-    gen_helper_neon_pmull_64_lo(tcg_res, tcg_op1, tcg_op2);
-    write_vec_element(s, tcg_res, rd, 0, MO_64);
-    gen_helper_neon_pmull_64_hi(tcg_res, tcg_op1, tcg_op2);
-    write_vec_element(s, tcg_res, rd, 1, MO_64);
-
-    tcg_temp_free_i64(tcg_op1);
-    tcg_temp_free_i64(tcg_op2);
-    tcg_temp_free_i64(tcg_res);
-}
-
 /* AdvSIMD three different
  *   31  30  29 28       24 23  22  21 20  16 15    12 11 10 9    5 4    0
  * +---+---+---+-----------+------+---+------+--------+-----+------+------+
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_diff(DisasContext *s, uint32_t insn)
             if (!fp_access_check(s)) {
                 return;
             }
-            handle_pmull_64(s, is_q, rd, rn, rm);
+            /* The Q field specifies lo/hi half input for this insn.  */
+            gen_gvec_op3_ool(s, true, rd, rn, rm, is_q,
+                             gen_helper_gvec_pmull_q);
             return;
         }
         goto is_widening;
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                  * outside the loop below as it only performs a single pass.
                  */
                 if (op == 14 && size == 2) {
-                    TCGv_i64 tcg_rn, tcg_rm, tcg_rd;
-
                     if (!dc_isar_feature(aa32_pmull, s)) {
                         return 1;
                     }
-                    tcg_rn = tcg_temp_new_i64();
-                    tcg_rm = tcg_temp_new_i64();
-                    tcg_rd = tcg_temp_new_i64();
-                    neon_load_reg64(tcg_rn, rn);
-                    neon_load_reg64(tcg_rm, rm);
-                    gen_helper_neon_pmull_64_lo(tcg_rd, tcg_rn, tcg_rm);
-                    neon_store_reg64(tcg_rd, rd);
-                    gen_helper_neon_pmull_64_hi(tcg_rd, tcg_rn, tcg_rm);
-                    neon_store_reg64(tcg_rd, rd + 1);
-                    tcg_temp_free_i64(tcg_rn);
-                    tcg_temp_free_i64(tcg_rm);
-                    tcg_temp_free_i64(tcg_rd);
+                    tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, 16, 16,
+                                       0, gen_helper_gvec_pmull_q);
                     return 0;
                 }
 
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_pmul_b)(void *vd, void *vn, void *vm, uint32_t desc)
     }
     clear_tail(d, opr_sz, simd_maxsz(desc));
 }
+
+/*
+ * 64x64->128 polynomial multiply.
+ * Because the lanes are not accessed in strict columns,
+ * this probably cannot be turned into a generic helper.
+ */
+void HELPER(gvec_pmull_q)(void *vd, void *vn, void *vm, uint32_t desc)
+{
+    intptr_t i, j, opr_sz = simd_oprsz(desc);
+    intptr_t hi = simd_data(desc);
+    uint64_t *d = vd, *n = vn, *m = vm;
+
+    for (i = 0; i < opr_sz / 8; i += 2) {
+        uint64_t nn = n[i + hi];
+        uint64_t mm = m[i + hi];
+        uint64_t rhi = 0;
+        uint64_t rlo = 0;
+
+        /* Bit 0 can only influence the low 64-bit result.  */
+        if (nn & 1) {
+            rlo = mm;
+        }
+
+        for (j = 1; j < 64; ++j) {
+            uint64_t mask = -((nn >> j) & 1);
+            rlo ^= (mm << j) & mask;
+            rhi ^= (mm >> (64 - j)) & mask;
+        }
+        d[i] = rlo;
+        d[i + 1] = rhi;
+    }
+    clear_tail(d, opr_sz, simd_maxsz(desc));
+}
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

We still need two different helpers, since NEON and SVE2 get the
inputs from different locations within the source vector.  However,
we can convert both to the same internal form for computation.
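
The following sketch shows what "the same internal form" could look like,
based on the helpers in the diff below. The wrapper names neon_style_input,
sve2_style_input, pmull_h_model and pmull_inputs_example are hypothetical.
The NEON-style input widens four contiguous bytes, the SVE2-style input
picks the even or odd bytes of a 64-bit word, and both end up as four
16-bit lanes with the byte in the low half.

```c
#include <assert.h>
#include <stdint.h>

/* NEON-style input: four contiguous bytes, each widened to 16 bits. */
static uint64_t neon_style_input(uint32_t x)
{
    uint64_t v = x;
    return (v & 0x000000ff)
        | ((v & 0x0000ff00) << 8)
        | ((v & 0x00ff0000) << 16)
        | ((v & 0xff000000) << 24);
}

/* SVE2-style input: even (odd == 0) or odd (odd == 1) bytes of a word. */
static uint64_t sve2_style_input(uint64_t x, int odd)
{
    return (x >> (odd * 8)) & 0x00ff00ff00ff00ffull;
}

/* Four 8x8->16 carry-less multiplies on the common internal form. */
static uint64_t pmull_h_model(uint64_t op1, uint64_t op2)
{
    uint64_t result = 0;

    for (int i = 0; i < 8; ++i) {
        /* Bit 0 of each 16-bit lane, widened to 0x0000 or 0xffff. */
        uint64_t mask = (op1 & 0x0001000100010001ull) * 0xffff;
        result ^= op2 & mask;
        op1 >>= 1;
        op2 <<= 1;
    }
    return result;
}

void pmull_inputs_example(void)
{
    /* Bytes 0x03 and 0xff reach the same lanes from either layout. */
    uint64_t a = neon_style_input(0x0000ff03);
    uint64_t b = sve2_style_input(0x00ff0003, 0);

    assert(a == b);
    /* Lane 0: 3 (x) 3 = 0x0005; lane 1: 0xff (x) 0xff = 0x5555. */
    assert(pmull_h_model(a, b) == 0x55550005);
}
```

Because every lane starts with a zero high byte, the seven effective left
shifts of op2 never carry bits into the next lane, so no per-step masking is
needed.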

The sve2 helper is not used yet, but adding it with this patch
helps illustrate why the neon changes are helpful.

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200216214232.4230-5-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper-sve.h    |  2 ++
 target/arm/helper.h        |  3 +-
 target/arm/neon_helper.c   | 32 --------------------
 target/arm/translate-a64.c | 27 +++++++++++------
 target/arm/translate.c     | 26 ++++++++---------
 target/arm/vec_helper.c    | 60 ++++++++++++++++++++++++++++++++++++++
 6 files changed, 95 insertions(+), 55 deletions(-)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_6(sve_stdd_le_zd, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
 DEF_HELPER_FLAGS_6(sve_stdd_be_zd, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve2_pmull_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
diff --git a/target/arm/helper.h b/target/arm/helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_2(neon_sub_u8, i32, i32, i32)
 DEF_HELPER_2(neon_sub_u16, i32, i32, i32)
 DEF_HELPER_2(neon_mul_u8, i32, i32, i32)
 DEF_HELPER_2(neon_mul_u16, i32, i32, i32)
-DEF_HELPER_2(neon_mull_p8, i64, i32, i32)
 
 DEF_HELPER_2(neon_tst_u8, i32, i32, i32)
 DEF_HELPER_2(neon_tst_u16, i32, i32, i32)
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(gvec_ushl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(gvec_pmul_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(gvec_pmull_q, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_4(neon_pmull_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
 #ifdef TARGET_AARCH64
 #include "helper-a64.h"
 #include "helper-sve.h"
diff --git a/target/arm/neon_helper.c b/target/arm/neon_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/neon_helper.c
+++ b/target/arm/neon_helper.c
@@ -XXX,XX +XXX,XX @@ NEON_VOP(mul_u8, neon_u8, 4)
 NEON_VOP(mul_u16, neon_u16, 2)
 #undef NEON_FN
 
-/* Polynomial multiplication is like integer multiplication except the
-   partial products are XORed, not added.  */
-uint64_t HELPER(neon_mull_p8)(uint32_t op1, uint32_t op2)
-{
-    uint64_t result = 0;
-    uint64_t mask;
-    uint64_t op2ex = op2;
-    op2ex = (op2ex & 0xff) |
-        ((op2ex & 0xff00) << 8) |
-        ((op2ex & 0xff0000) << 16) |
-        ((op2ex & 0xff000000) << 24);
-    while (op1) {
-        mask = 0;
-        if (op1 & 1) {
-            mask |= 0xffff;
-        }
-        if (op1 & (1 << 8)) {
-            mask |= (0xffffU << 16);
-        }
-        if (op1 & (1 << 16)) {
-            mask |= (0xffffULL << 32);
-        }
-        if (op1 & (1 << 24)) {
-            mask |= (0xffffULL << 48);
-        }
-        result ^= op2ex & mask;
-        op1 = (op1 >> 1) & 0x7f7f7f7f;
-        op2ex <<= 1;
-    }
-    return result;
-}
-
 #define NEON_FN(dest, src1, src2) dest = (src1 & src2) ? -1 : 0
 NEON_VOP(tst_u8, neon_u8, 4)
 NEON_VOP(tst_u16, neon_u16, 2)
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void handle_3rd_widening(DisasContext *s, int is_q, int is_u, int size,
                 gen_helper_neon_addl_saturate_s32(tcg_passres, cpu_env,
                                                   tcg_passres, tcg_passres);
                 break;
-            case 14: /* PMULL */
-                assert(size == 0);
-                gen_helper_neon_mull_p8(tcg_passres, tcg_op1, tcg_op2);
-                break;
             default:
                 g_assert_not_reached();
             }
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_diff(DisasContext *s, uint32_t insn)
         handle_3rd_narrowing(s, is_q, is_u, size, opcode, rd, rn, rm);
         break;
     case 14: /* PMULL, PMULL2 */
-        if (is_u || size == 1 || size == 2) {
+        if (is_u) {
             unallocated_encoding(s);
             return;
         }
-        if (size == 3) {
+        switch (size) {
+        case 0: /* PMULL.P8 */
+            if (!fp_access_check(s)) {
+                return;
+            }
+            /* The Q field specifies lo/hi half input for this insn.  */
+            gen_gvec_op3_ool(s, true, rd, rn, rm, is_q,
+                             gen_helper_neon_pmull_h);
+            break;
+
+        case 3: /* PMULL.P64 */
             if (!dc_isar_feature(aa64_pmull, s)) {
                 unallocated_encoding(s);
                 return;
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_diff(DisasContext *s, uint32_t insn)
             /* The Q field specifies lo/hi half input for this insn.  */
             gen_gvec_op3_ool(s, true, rd, rn, rm, is_q,
                              gen_helper_gvec_pmull_q);
-            return;
+            break;
+
+        default:
+            unallocated_encoding(s);
+            break;
         }
-        goto is_widening;
+        return;
     case 9: /* SQDMLAL, SQDMLAL2 */
     case 11: /* SQDMLSL, SQDMLSL2 */
     case 13: /* SQDMULL, SQDMULL2 */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_diff(DisasContext *s, uint32_t insn)
             unallocated_encoding(s);
             return;
         }
-    is_widening:
         if (!fp_access_check(s)) {
             return;
         }
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                     return 1;
                 }
 
-                /* Handle VMULL.P64 (Polynomial 64x64 to 128 bit multiply)
-                 * outside the loop below as it only performs a single pass.
-                 */
-                if (op == 14 && size == 2) {
-                    if (!dc_isar_feature(aa32_pmull, s)) {
-                        return 1;
+                /* Handle polynomial VMULL in a single pass.  */
+                if (op == 14) {
+                    if (size == 0) {
+                        /* VMULL.P8 */
+                        tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, 16, 16,
+                                           0, gen_helper_neon_pmull_h);
+                    } else {
+                        /* VMULL.P64 */
+                        if (!dc_isar_feature(aa32_pmull, s)) {
+                            return 1;
+                        }
+                        tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, 16, 16,
+                                           0, gen_helper_gvec_pmull_q);
                     }
-                    tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, 16, 16,
-                                       0, gen_helper_gvec_pmull_q);
                     return 0;
                 }
 
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                         /* VMLAL, VQDMLAL, VMLSL, VQDMLSL, VMULL, VQDMULL */
                         gen_neon_mull(cpu_V0, tmp, tmp2, size, u);
                         break;
-                    case 14: /* Polynomial VMULL */
-                        gen_helper_neon_mull_p8(cpu_V0, tmp, tmp2);
-                        tcg_temp_free_i32(tmp2);
-                        tcg_temp_free_i32(tmp);
-                        break;
                     default: /* 15 is RESERVED: caught earlier  */
                         abort();
                     }
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_pmull_q)(void *vd, void *vn, void *vm, uint32_t desc)
     }
     clear_tail(d, opr_sz, simd_maxsz(desc));
 }
+
+/*
+ * 8x8->16 polynomial multiply.
+ *
+ * The byte inputs are expanded to (or extracted from) half-words.
+ * Note that neon and sve2 get the inputs from different positions.
+ * This allows 4 bytes to be processed in parallel with uint64_t.
+ */
+
+static uint64_t expand_byte_to_half(uint64_t x)
+{
+    return  (x & 0x000000ff)
+         | ((x & 0x0000ff00) << 8)
+         | ((x & 0x00ff0000) << 16)
+         | ((x & 0xff000000) << 24);
+}
+
+static uint64_t pmull_h(uint64_t op1, uint64_t op2)
+{
+    uint64_t result = 0;
+    int i;
+
+    for (i = 0; i < 8; ++i) {
+        uint64_t mask = (op1 & 0x0001000100010001ull) * 0xffff;
+        result ^= op2 & mask;
+        op1 >>= 1;
+        op2 <<= 1;
+    }
+    return result;
+}
+
+void HELPER(neon_pmull_h)(void *vd, void *vn, void *vm, uint32_t desc)
+{
+    int hi = simd_data(desc);
+    uint64_t *d = vd, *n = vn, *m = vm;
+    uint64_t nn = n[hi], mm = m[hi];
+
+    d[0] = pmull_h(expand_byte_to_half(nn), expand_byte_to_half(mm));
+    nn >>= 32;
+    mm >>= 32;
+    d[1] = pmull_h(expand_byte_to_half(nn), expand_byte_to_half(mm));
+
+    clear_tail(d, 16, simd_maxsz(desc));
+}
+
+#ifdef TARGET_AARCH64
+void HELPER(sve2_pmull_h)(void *vd, void *vn, void *vm, uint32_t desc)
+{
+    int shift = simd_data(desc) * 8;
+    intptr_t i, opr_sz = simd_oprsz(desc);
+    uint64_t *d = vd, *n = vn, *m = vm;
+
+    for (i = 0; i < opr_sz / 8; ++i) {
+        uint64_t nn = (n[i] >> shift) & 0x00ff00ff00ff00ffull;
+        uint64_t mm = (m[i] >> shift) & 0x00ff00ff00ff00ffull;
+
+        d[i] = pmull_h(nn, mm);
+    }
+}
+#endif
-- 
2.20.1

From: Francisco Iglesias <francisco.iglesias@xilinx.com>

Correct the number of dummy cycles required by the FAST_READ_4 command to
eight (one dummy byte).

Fixes: ef06ca3946 ("xilinx_spips: Add support for RX discard and RX drain")
Suggested-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 20200218113350.6090-1-frasse.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/ssi/xilinx_spips.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/ssi/xilinx_spips.c b/hw/ssi/xilinx_spips.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/ssi/xilinx_spips.c
+++ b/hw/ssi/xilinx_spips.c
@@ -XXX,XX +XXX,XX @@ static int xilinx_spips_num_dummies(XilinxQSPIPS *qs, uint8_t command)
     case FAST_READ:
     case DOR:
     case QOR:
+    case FAST_READ_4:
     case DOR_4:
     case QOR_4:
         return 1;
     case DIOR:
-    case FAST_READ_4:
     case DIOR_4:
         return 2;
     case QIOR:
-- 
2.20.1

From: Guenter Roeck <linux@roeck-us.net>

Booting the r2d machine from flash fails because flash is not discovered.
Looking at the flattened memory tree, we see the following.

FlatView #1
 AS "memory", root: system
 AS "cpu-memory-0", root: system
 AS "sh_pci_host", root: bus master container
 Root memory region: system
  0000000000000000-000000000000ffff (prio 0, i/o): io
  0000000000010000-0000000000ffffff (prio 0, i/o): r2d.flash @0000000000010000

The overlapping memory region is sh_pci.isa, i.e. the ISA I/O region bridge.
This region is initially mapped at address 0xfe240000, but that address is
overwritten by a write to the PCIIOBR register. This write is expected to
adjust the PCI memory window, but not to change the region's base address.

Peter Maydell provided the following detailed explanation.

"Section 22.3.7 and in particular figure 22.3 (of "SH7751R user's manual:
hardware") are clear about how this is supposed to work: there is a window
at 0xfe240000 in the system register space for PCI I/O space. When the CPU
makes an access into that area, the PCI controller calculates the PCI
address to use by combining bits 0..17 of the system address with the
bits 31..18 value that the guest has put into the PCIIOBR. That is, writing
to the PCIIOBR changes which section of the IO address space is visible in
the 0xfe240000 window. Instead what QEMU's implementation does is move the
window to whatever value the guest writes to the PCIIOBR register -- so if
the guest writes 0 we put the window at 0 in system address space."
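
The address calculation described in the quote can be sketched as follows.
The function and macro names are illustrative only. The masks and the window
base are taken from this commit (0xfe240000, 0xfffc0000, and the 0x40000
window size).

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the description above; the names are illustrative. */
#define PCI_IO_WINDOW_BASE  0xfe240000u   /* fixed window in system space */
#define PCI_IO_WINDOW_MASK  0x0003ffffu   /* bits 17..0: offset in window */
#define PCIIOBR_BASE_MASK   0xfffc0000u   /* bits 31..18: PCI I/O base */

static uint32_t pci_io_address(uint32_t sys_addr, uint32_t pciiobr)
{
    /* The access must fall inside the fixed 256 KiB window. */
    assert((sys_addr & ~PCI_IO_WINDOW_MASK) == PCI_IO_WINDOW_BASE);

    /* The window stays put; PCIIOBR only selects what it shows. */
    return (pciiobr & PCIIOBR_BASE_MASK) | (sys_addr & PCI_IO_WINDOW_MASK);
}
```

With PCIIOBR written as 0, an access to 0xfe240010 reaches PCI I/O address
0x10, and the window itself remains at 0xfe240000 in system address space.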

Fix the problem by calling memory_region_set_alias_offset() instead of
removing and re-adding the PCI ISA subregion on writes into PCIIOBR.
At the same time, in sh_pci_device_realize(), don't set iobr since
it is overwritten later anyway. Instead, pass the base address to
memory_region_add_subregion() directly.

Many thanks to Peter Maydell for the detailed problem analysis, and for
providing suggestions on how to fix the problem.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Message-id: 20200218201050.15273-1-linux@roeck-us.net
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/sh4/sh_pci.c | 11 +++--------
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/hw/sh4/sh_pci.c b/hw/sh4/sh_pci.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/sh4/sh_pci.c
+++ b/hw/sh4/sh_pci.c
@@ -XXX,XX +XXX,XX @@ static void sh_pci_reg_write (void *p, hwaddr addr, uint64_t val,
         pcic->mbr = val & 0xff000001;
         break;
     case 0x1c8:
-        if ((val & 0xfffc0000) != (pcic->iobr & 0xfffc0000)) {
-            memory_region_del_subregion(get_system_memory(), &pcic->isa);
-            pcic->iobr = val & 0xfffc0001;
-            memory_region_add_subregion(get_system_memory(),
-                                        pcic->iobr & 0xfffc0000, &pcic->isa);
-        }
+        pcic->iobr = val & 0xfffc0001;
+        memory_region_set_alias_offset(&pcic->isa, val & 0xfffc0000);
         break;
     case 0x220:
         pci_data_write(phb->bus, pcic->par, val, 4);
@@ -XXX,XX +XXX,XX @@ static void sh_pci_device_realize(DeviceState *dev, Error **errp)
                              get_system_io(), 0, 0x40000);
     sysbus_init_mmio(sbd, &s->memconfig_p4);
     sysbus_init_mmio(sbd, &s->memconfig_a7);
-    s->iobr = 0xfe240000;
-    memory_region_add_subregion(get_system_memory(), s->iobr, &s->isa);
+    memory_region_add_subregion(get_system_memory(), 0xfe240000, &s->isa);
 
     s->dev = pci_create_simple(phb->bus, PCI_DEVFN(0, 0), "sh_pci_host");
 }
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

The old name, isar_feature_aa32_fp_d32, does not reflect
the MVFR0 field name, SIMDReg.
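
For readers unfamiliar with the field, here is a standalone sketch of the
check that the renamed feature test guards. It assumes that SIMDReg occupies
MVFR0 bits [3:0] and that a value of 2 or more means D16-D31 exist. The
names simd_r32_model and dreg_exists are hypothetical and do not appear in
QEMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: SIMDReg is MVFR0 bits [3:0]; >= 2 means 32 D registers. */
static bool simd_r32_model(uint32_t mvfr0)
{
    return (mvfr0 & 0xf) >= 2;
}

/* True if the double-precision register with this index is implemented. */
static bool dreg_exists(uint32_t mvfr0, unsigned reg)
{
    assert(reg < 32);   /* D register numbers are 5 bits wide */

    /* Bit 4 of the index selects D16-D31. */
    return simd_r32_model(mvfr0) || !(reg & 0x10);
}
```

The translator performs the same test by ORing the register fields together
and masking with 0x10, so that any operand in D16-D31 makes the instruction
UNDEF when only 16 registers are present.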

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20200214181547.21408-3-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: wrapped one long line]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h               |  2 +-
 target/arm/translate-vfp.inc.c | 53 +++++++++++++++++-----------------
 2 files changed, 28 insertions(+), 27 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa32_fp16_arith(const ARMISARegisters *id)
     return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, FP) == 1;
 }
 
-static inline bool isar_feature_aa32_fp_d32(const ARMISARegisters *id)
+static inline bool isar_feature_aa32_simd_r32(const ARMISARegisters *id)
 {
     /* Return true if D16-D31 are implemented */
     return FIELD_EX32(id->mvfr0, MVFR0, SIMDREG) >= 2;
diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-vfp.inc.c
+++ b/target/arm/translate-vfp.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VSEL(DisasContext *s, arg_VSEL *a)
     }
 
     /* UNDEF accesses to D16-D31 if they don't exist */
-    if (dp && !dc_isar_feature(aa32_fp_d32, s) &&
+    if (dp && !dc_isar_feature(aa32_simd_r32, s) &&
         ((a->vm | a->vn | a->vd) & 0x10)) {
         return false;
     }
@@ -XXX,XX +XXX,XX @@ static bool trans_VMINMAXNM(DisasContext *s, arg_VMINMAXNM *a)
     }
 
     /* UNDEF accesses to D16-D31 if they don't exist */
-    if (dp && !dc_isar_feature(aa32_fp_d32, s) &&
+    if (dp && !dc_isar_feature(aa32_simd_r32, s) &&
         ((a->vm | a->vn | a->vd) & 0x10)) {
         return false;
     }
@@ -XXX,XX +XXX,XX @@ static bool trans_VRINT(DisasContext *s, arg_VRINT *a)
     }
 
     /* UNDEF accesses to D16-D31 if they don't exist */
-    if (dp && !dc_isar_feature(aa32_fp_d32, s) &&
+    if (dp && !dc_isar_feature(aa32_simd_r32, s) &&
         ((a->vm | a->vd) & 0x10)) {
         return false;
     }
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT(DisasContext *s, arg_VCVT *a)
     }
 
     /* UNDEF accesses to D16-D31 if they don't exist */
-    if (dp && !dc_isar_feature(aa32_fp_d32, s) && (a->vm & 0x10)) {
+    if (dp && !dc_isar_feature(aa32_simd_r32, s) && (a->vm & 0x10)) {
         return false;
     }
 
@@ -XXX,XX +XXX,XX @@ static bool trans_VMOV_to_gp(DisasContext *s, arg_VMOV_to_gp *a)
     uint32_t offset;
 
     /* UNDEF accesses to D16-D31 if they don't exist */
-    if (!dc_isar_feature(aa32_fp_d32, s) && (a->vn & 0x10)) {
+    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vn & 0x10)) {
         return false;
     }
 
@@ -XXX,XX +XXX,XX @@ static bool trans_VMOV_from_gp(DisasContext *s, arg_VMOV_from_gp *a)
     uint32_t offset;
 
     /* UNDEF accesses to D16-D31 if they don't exist */
-    if (!dc_isar_feature(aa32_fp_d32, s) && (a->vn & 0x10)) {
+    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vn & 0x10)) {
         return false;
     }
 
@@ -XXX,XX +XXX,XX @@ static bool trans_VDUP(DisasContext *s, arg_VDUP *a)
     }
 
     /* UNDEF accesses to D16-D31 if they don't exist */
-    if (!dc_isar_feature(aa32_fp_d32, s) && (a->vn & 0x10)) {
+    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vn & 0x10)) {
         return false;
     }
 
@@ -XXX,XX +XXX,XX @@ static bool trans_VMOV_64_dp(DisasContext *s, arg_VMOV_64_dp *a)
      */
 
     /* UNDEF accesses to D16-D31 if they don't exist */
-    if (!dc_isar_feature(aa32_fp_d32, s) && (a->vm & 0x10)) {
+    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vm & 0x10)) {
         return false;
     }
 
@@ -XXX,XX +XXX,XX @@ static bool trans_VLDR_VSTR_dp(DisasContext *s, arg_VLDR_VSTR_dp *a)
     TCGv_i64 tmp;
 
     /* UNDEF accesses to D16-D31 if they don't exist */
-    if (!dc_isar_feature(aa32_fp_d32, s) && (a->vd & 0x10)) {
+    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vd & 0x10)) {
         return false;
     }
 
@@ -XXX,XX +XXX,XX @@ static bool trans_VLDM_VSTM_dp(DisasContext *s, arg_VLDM_VSTM_dp *a)
     }
 
     /* UNDEF accesses to D16-D31 if they don't exist */
-    if (!dc_isar_feature(aa32_fp_d32, s) && (a->vd + n) > 16) {
+    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vd + n) > 16) {
         return false;
     }
 
@@ -XXX,XX +XXX,XX @@ static bool do_vfp_3op_dp(DisasContext *s, VFPGen3OpDPFn *fn,
     TCGv_ptr fpst;
 
     /* UNDEF accesses to D16-D31 if they don't exist */
-    if (!dc_isar_feature(aa32_fp_d32, s) && ((vd | vn | vm) & 0x10)) {
+    if (!dc_isar_feature(aa32_simd_r32, s) && ((vd | vn | vm) & 0x10)) {
         return false;
     }
 
@@ -XXX,XX +XXX,XX @@ static bool do_vfp_2op_dp(DisasContext *s, VFPGen2OpDPFn *fn, int vd, int vm)
     TCGv_i64 f0, fd;
 
     /* UNDEF accesses to D16-D31 if they don't exist */
-    if (!dc_isar_feature(aa32_fp_d32, s) && ((vd | vm) & 0x10)) {
+    if (!dc_isar_feature(aa32_simd_r32, s) && ((vd | vm) & 0x10)) {
         return false;
     }
 
@@ -XXX,XX +XXX,XX @@ static bool trans_VFM_dp(DisasContext *s, arg_VFM_dp *a)
     }
 
     /* UNDEF accesses to D16-D31 if they don't exist. */
-    if (!dc_isar_feature(aa32_fp_d32, s) && ((a->vd | a->vn | a->vm) & 0x10)) {
+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vd | a->vn | a->vm) & 0x10)) {
         return false;
     }
 
@@ -XXX,XX +XXX,XX @@ static bool trans_VMOV_imm_dp(DisasContext *s, arg_VMOV_imm_dp *a)
     vd = a->vd;
 
     /* UNDEF accesses to D16-D31 if they don't exist. */
-    if (!dc_isar_feature(aa32_fp_d32, s) && (vd & 0x10)) {
+    if (!dc_isar_feature(aa32_simd_r32, s) && (vd & 0x10)) {
         return false;
     }
 
@@ -XXX,XX +XXX,XX @@ static bool trans_VCMP_dp(DisasContext *s, arg_VCMP_dp *a)
     }
 
     /* UNDEF accesses to D16-D31 if they don't exist. */
-    if (!dc_isar_feature(aa32_fp_d32, s) && ((a->vd | a->vm) & 0x10)) {
+    if (!dc_isar_feature(aa32_simd_r32, s) && ((a->vd | a->vm) & 0x10)) {
         return false;
     }
 
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_f64_f16(DisasContext *s, arg_VCVT_f64_f16 *a)
     }
 
     /* UNDEF accesses to D16-D31 if they don't exist. */
-    if (!dc_isar_feature(aa32_fp_d32, s) && (a->vd  & 0x10)) {
+    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vd  & 0x10)) {
         return false;
     }
 
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_f16_f64(DisasContext *s, arg_VCVT_f16_f64 *a)
     }
 
     /* UNDEF accesses to D16-D31 if they don't exist. */
-    if (!dc_isar_feature(aa32_fp_d32, s) && (a->vm  & 0x10)) {
+    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vm  & 0x10)) {
         return false;
     }
 
@@ -XXX,XX +XXX,XX @@ static bool trans_VRINTR_dp(DisasContext *s, arg_VRINTR_dp *a)
     }
 
     /* UNDEF accesses to D16-D31 if they don't exist. */
-    if (!dc_isar_feature(aa32_fp_d32, s) && ((a->vd | a->vm) & 0x10)) {
+    if (!dc_isar_feature(aa32_simd_r32, s) && ((a->vd | a->vm) & 0x10)) {
         return false;
     }
 
@@ -XXX,XX +XXX,XX @@ static bool trans_VRINTZ_dp(DisasContext *s, arg_VRINTZ_dp *a)
     }
 
     /* UNDEF accesses to D16-D31 if they don't exist. */
-    if (!dc_isar_feature(aa32_fp_d32, s) && ((a->vd | a->vm) & 0x10)) {
+    if (!dc_isar_feature(aa32_simd_r32, s) && ((a->vd | a->vm) & 0x10)) {
         return false;
     }
 
@@ -XXX,XX +XXX,XX @@ static bool trans_VRINTX_dp(DisasContext *s, arg_VRINTX_dp *a)
     }
 
     /* UNDEF accesses to D16-D31 if they don't exist. */
-    if (!dc_isar_feature(aa32_fp_d32, s) && ((a->vd | a->vm) & 0x10)) {
+    if (!dc_isar_feature(aa32_simd_r32, s) && ((a->vd | a->vm) & 0x10)) {
         return false;
     }
 
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_sp(DisasContext *s, arg_VCVT_sp *a)
     TCGv_i32 vm;
 
     /* UNDEF accesses to D16-D31 if they don't exist. */
-    if (!dc_isar_feature(aa32_fp_d32, s) && (a->vd & 0x10)) {
+    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vd & 0x10)) {
         return false;
     }
 
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_dp(DisasContext *s, arg_VCVT_dp *a)
     TCGv_i32 vd;
 
     /* UNDEF accesses to D16-D31 if they don't exist. */
-    if (!dc_isar_feature(aa32_fp_d32, s) && (a->vm & 0x10)) {
+    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vm & 0x10)) {
         return false;
     }
 
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_int_dp(DisasContext *s, arg_VCVT_int_dp *a)
     TCGv_ptr fpst;
 
     /* UNDEF accesses to D16-D31 if they don't exist. */
-    if (!dc_isar_feature(aa32_fp_d32, s) && (a->vd & 0x10)) {
+    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vd & 0x10)) {
         return false;
     }
 
@@ -XXX,XX +XXX,XX @@ static bool trans_VJCVT(DisasContext *s, arg_VJCVT *a)
     }
 
     /* UNDEF accesses to D16-D31 if they don't exist. */
-    if (!dc_isar_feature(aa32_fp_d32, s) && (a->vm & 0x10)) {
+    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vm & 0x10)) {
         return false;
     }
 
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_fix_dp(DisasContext *s, arg_VCVT_fix_dp *a)
     }
 
     /* UNDEF accesses to D16-D31 if they don't exist. */
-    if (!dc_isar_feature(aa32_fp_d32, s) && (a->vd & 0x10)) {
+    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vd & 0x10)) {
         return false;
     }
 
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_dp_int(DisasContext *s, arg_VCVT_dp_int *a)
     TCGv_ptr fpst;
 
     /* UNDEF accesses to D16-D31 if they don't exist. */
-    if (!dc_isar_feature(aa32_fp_d32, s) && (a->vm & 0x10)) {
+    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vm & 0x10)) {
         return false;
     }
 
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

Many uses of ARM_FEATURE_VFP3 are testing for the number of SIMD
registers implemented.  Use the proper test against MVFR0.SIMDReg.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200214181547.21408-4-richard.henderson@linaro.org
[PMM: fix typo in commit message]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.c       |  9 ++++-----
 target/arm/helper.c    | 13 ++++++-------
 target/arm/translate.c |  2 +-
 3 files changed, 11 insertions(+), 13 deletions(-)
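
Note (not part of the commit): a minimal standalone sketch of the
register-count test this patch switches to.  It assumes MVFR0.SIMDReg
occupies bits [3:0] and that a value of 2 or more means D16-D31 exist,
matching the isar_feature_aa32_simd_r32() test.  The helper names
field_ex32(), have_simd_r32() and gdb_vfp_nregs() are hypothetical and
are not QEMU functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: extract `length` bits starting at bit `shift`. */
static inline uint32_t field_ex32(uint32_t reg, unsigned shift,
                                  unsigned length)
{
    assert(length > 0 && length < 32 && shift + length <= 32);
    return (reg >> shift) & ((1u << length) - 1);
}

/* Assumed layout: MVFR0.SIMDReg is bits [3:0]; >= 2 means D16-D31 exist. */
static inline bool have_simd_r32(uint32_t mvfr0)
{
    return field_ex32(mvfr0, 0, 4) >= 2;
}

/* Same shape as the nregs computation in vfp_gdb_get_reg(). */
static inline int gdb_vfp_nregs(uint32_t mvfr0)
{
    return have_simd_r32(mvfr0) ? 32 : 16;
}
```

The point of the sketch is that the register count comes from the ID
register field rather than from a VFP version flag.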

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_dump_state(CPUState *cs, FILE *f, int flags)
 
     if (flags & CPU_DUMP_FPU) {
         int numvfpregs = 0;
-        if (arm_feature(env, ARM_FEATURE_VFP)) {
-            numvfpregs += 16;
-        }
-        if (arm_feature(env, ARM_FEATURE_VFP3)) {
-            numvfpregs += 16;
+        if (cpu_isar_feature(aa32_simd_r32, cpu)) {
+            numvfpregs = 32;
+        } else if (arm_feature(env, ARM_FEATURE_VFP)) {
+            numvfpregs = 16;
         }
         for (i = 0; i < numvfpregs; i++) {
             uint64_t v = *aa32_vfp_dreg(env, i);
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void switch_mode(CPUARMState *env, int mode);
 
 static int vfp_gdb_get_reg(CPUARMState *env, uint8_t *buf, int reg)
 {
-    int nregs;
+    ARMCPU *cpu = env_archcpu(env);
+    int nregs = cpu_isar_feature(aa32_simd_r32, cpu) ? 32 : 16;
 
     /* VFP data registers are always little-endian.  */
-    nregs = arm_feature(env, ARM_FEATURE_VFP3) ? 32 : 16;
     if (reg < nregs) {
         stq_le_p(buf, *aa32_vfp_dreg(env, reg));
         return 8;
@@ -XXX,XX +XXX,XX @@ static int vfp_gdb_get_reg(CPUARMState *env, uint8_t *buf, int reg)
 
 static int vfp_gdb_set_reg(CPUARMState *env, uint8_t *buf, int reg)
 {
-    int nregs;
+    ARMCPU *cpu = env_archcpu(env);
+    int nregs = cpu_isar_feature(aa32_simd_r32, cpu) ? 32 : 16;
 
-    nregs = arm_feature(env, ARM_FEATURE_VFP3) ? 32 : 16;
     if (reg < nregs) {
         *aa32_vfp_dreg(env, reg) = ldq_le_p(buf);
         return 8;
@@ -XXX,XX +XXX,XX @@ static void cpacr_write(CPUARMState *env, const ARMCPRegInfo *ri,
             /* VFPv3 and upwards with NEON implement 32 double precision
              * registers (D0-D31).
              */
-            if (!arm_feature(env, ARM_FEATURE_NEON) ||
-                    !arm_feature(env, ARM_FEATURE_VFP3)) {
+            if (!cpu_isar_feature(aa32_simd_r32, env_archcpu(env))) {
                 /* D32DIS [30] is RAO/WI if D16-31 are not implemented. */
                 value |= (1 << 30);
             }
@@ -XXX,XX +XXX,XX @@ void arm_cpu_register_gdb_regs_for_features(ARMCPU *cpu)
     } else if (arm_feature(env, ARM_FEATURE_NEON)) {
         gdb_register_coprocessor(cs, vfp_gdb_get_reg, vfp_gdb_set_reg,
                                  51, "arm-neon.xml", 0);
-    } else if (arm_feature(env, ARM_FEATURE_VFP3)) {
+    } else if (cpu_isar_feature(aa32_simd_r32, cpu)) {
         gdb_register_coprocessor(cs, vfp_gdb_get_reg, vfp_gdb_set_reg,
                                  35, "arm-vfp3.xml", 0);
     } else if (arm_feature(env, ARM_FEATURE_VFP)) {
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_dsp_insn(DisasContext *s, uint32_t insn)
 #define VFP_SREG(insn, bigbit, smallbit) \
   ((VFP_REG_SHR(insn, bigbit - 1) & 0x1e) | (((insn) >> (smallbit)) & 1))
 #define VFP_DREG(reg, insn, bigbit, smallbit) do { \
-    if (arm_dc_feature(s, ARM_FEATURE_VFP3)) { \
+    if (dc_isar_feature(aa32_simd_r32, s)) { \
         reg = (((insn) >> (bigbit)) & 0x0f) \
               | (((insn) >> ((smallbit) - 4)) & 0x10); \
     } else { \
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

We are going to convert FEATURE tests to ISAR tests,
so FPSP needs to be set for these CPUs, as we already
do for FPDP.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200214181547.21408-5-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)
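
Note (not part of the commit): a minimal standalone sketch of what
the three FIELD_DP32() calls in the ARMv5 init functions amount to.
The bit positions are assumptions taken from the architectural MVFR0
layout (FPSP at [7:4], FPDP at [11:8], FPShVec at [27:24]).  The names
field_dp32() and armv5_vfp_mvfr0() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: deposit `val` into `length` bits at bit `shift`. */
static inline uint32_t field_dp32(uint32_t reg, unsigned shift,
                                  unsigned length, uint32_t val)
{
    uint32_t mask;

    assert(length > 0 && length < 32 && shift + length <= 32);
    mask = ((1u << length) - 1) << shift;
    return (reg & ~mask) | ((val << shift) & mask);
}

/* Assumed field positions within MVFR0; each field is 4 bits wide. */
enum {
    MVFR0_FPSP_SHIFT    = 4,
    MVFR0_FPDP_SHIFT    = 8,
    MVFR0_FPSHVEC_SHIFT = 24,
    MVFR0_FIELD_LEN     = 4,
};

/* Set FPSHVEC, FPSP and FPDP to 1, leaving all other fields untouched. */
static inline uint32_t armv5_vfp_mvfr0(uint32_t mvfr0)
{
    mvfr0 = field_dp32(mvfr0, MVFR0_FPSHVEC_SHIFT, MVFR0_FIELD_LEN, 1);
    mvfr0 = field_dp32(mvfr0, MVFR0_FPSP_SHIFT, MVFR0_FIELD_LEN, 1);
    mvfr0 = field_dp32(mvfr0, MVFR0_FPDP_SHIFT, MVFR0_FIELD_LEN, 1);
    return mvfr0;
}
```

Without the FPSP deposit, a later ISAR-based single-precision test
would report no VFP on these CPUs even though FPDP is set.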

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void arm926_initfn(Object *obj)
      */
     cpu->isar.id_isar1 = FIELD_DP32(cpu->isar.id_isar1, ID_ISAR1, JAZELLE, 1);
     /*
-     * Similarly, we need to set MVFR0 fields to enable double precision
-     * and short vector support even though ARMv5 doesn't have this register.
+     * Similarly, we need to set MVFR0 fields to enable vfp and short vector
+     * support even though ARMv5 doesn't have this register.
      */
     cpu->isar.mvfr0 = FIELD_DP32(cpu->isar.mvfr0, MVFR0, FPSHVEC, 1);
+    cpu->isar.mvfr0 = FIELD_DP32(cpu->isar.mvfr0, MVFR0, FPSP, 1);
     cpu->isar.mvfr0 = FIELD_DP32(cpu->isar.mvfr0, MVFR0, FPDP, 1);
 }
 
@@ -XXX,XX +XXX,XX @@ static void arm1026_initfn(Object *obj)
      */
     cpu->isar.id_isar1 = FIELD_DP32(cpu->isar.id_isar1, ID_ISAR1, JAZELLE, 1);
     /*
-     * Similarly, we need to set MVFR0 fields to enable double precision
-     * and short vector support even though ARMv5 doesn't have this register.
+     * Similarly, we need to set MVFR0 fields to enable vfp and short vector
+     * support even though ARMv5 doesn't have this register.
      */
     cpu->isar.mvfr0 = FIELD_DP32(cpu->isar.mvfr0, MVFR0, FPSHVEC, 1);
+    cpu->isar.mvfr0 = FIELD_DP32(cpu->isar.mvfr0, MVFR0, FPSP, 1);
     cpu->isar.mvfr0 = FIELD_DP32(cpu->isar.mvfr0, MVFR0, FPDP, 1);
 
     {
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

Use this in the places that were checking ARM_FEATURE_VFP, and
are obviously testing for the existence of the register set
as opposed to testing for some particular instruction extension.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200214181547.21408-6-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h        |  6 ++++++
 hw/intc/armv7m_nvic.c   | 20 ++++++++++----------
 linux-user/arm/signal.c |  4 ++--
 target/arm/arch_dump.c  | 11 ++++++-----
 target/arm/cpu.c        |  8 ++++----
 target/arm/helper.c     |  4 ++--
 target/arm/m_helper.c   | 11 ++++++-----
 target/arm/machine.c    |  3 +--
 8 files changed, 37 insertions(+), 30 deletions(-)
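
Note (not part of the commit): a minimal standalone sketch of the
three-way cascade that arm_cpu_dump_state() ends up with once both
simd_r16 and simd_r32 are available.  It assumes MVFR0.SIMDReg is
bits [3:0], with 0 meaning no register bank.  The function name
vfp_dreg_count() is hypothetical.

```c
#include <stdint.h>

/* Number of 64-bit D registers implied by MVFR0.SIMDReg (assumed [3:0]). */
static inline int vfp_dreg_count(uint32_t mvfr0)
{
    uint32_t simdreg = mvfr0 & 0xf;

    if (simdreg >= 2) {
        return 32;      /* D0-D31, the simd_r32 case */
    }
    if (simdreg > 0) {
        return 16;      /* D0-D15 only, the simd_r16 case */
    }
    return 0;           /* no VFP/SIMD register bank */
}
```

The larger bank is tested first because any CPU that satisfies
simd_r32 also satisfies simd_r16.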

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa32_fp16_arith(const ARMISARegisters *id)
     return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, FP) == 1;
 }
 
+static inline bool isar_feature_aa32_simd_r16(const ARMISARegisters *id)
+{
+    /* Return true if D0-D15 are implemented */
+    return FIELD_EX32(id->mvfr0, MVFR0, SIMDREG) > 0;
+}
+
 static inline bool isar_feature_aa32_simd_r32(const ARMISARegisters *id)
 {
     /* Return true if D16-D31 are implemented */
diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/intc/armv7m_nvic.c
+++ b/hw/intc/armv7m_nvic.c
@@ -XXX,XX +XXX,XX @@ static uint32_t nvic_readl(NVICState *s, uint32_t offset, MemTxAttrs attrs)
     case 0xd84: /* CSSELR */
         return cpu->env.v7m.csselr[attrs.secure];
     case 0xd88: /* CPACR */
-        if (!arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
+        if (!cpu_isar_feature(aa32_simd_r16, cpu)) {
             return 0;
         }
         return cpu->env.v7m.cpacr[attrs.secure];
     case 0xd8c: /* NSACR */
-        if (!attrs.secure || !arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
+        if (!attrs.secure || !cpu_isar_feature(aa32_simd_r16, cpu)) {
             return 0;
         }
         return cpu->env.v7m.nsacr;
@@ -XXX,XX +XXX,XX @@ static uint32_t nvic_readl(NVICState *s, uint32_t offset, MemTxAttrs attrs)
         }
         return cpu->env.v7m.sfar;
     case 0xf34: /* FPCCR */
-        if (!arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
+        if (!cpu_isar_feature(aa32_simd_r16, cpu)) {
             return 0;
         }
         if (attrs.secure) {
@@ -XXX,XX +XXX,XX @@ static uint32_t nvic_readl(NVICState *s, uint32_t offset, MemTxAttrs attrs)
             return value;
         }
     case 0xf38: /* FPCAR */
-        if (!arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
+        if (!cpu_isar_feature(aa32_simd_r16, cpu)) {
             return 0;
         }
         return cpu->env.v7m.fpcar[attrs.secure];
     case 0xf3c: /* FPDSCR */
-        if (!arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
+        if (!cpu_isar_feature(aa32_simd_r16, cpu)) {
             return 0;
         }
         return cpu->env.v7m.fpdscr[attrs.secure];
@@ -XXX,XX +XXX,XX @@ static void nvic_writel(NVICState *s, uint32_t offset, uint32_t value,
         }
         break;
     case 0xd88: /* CPACR */
-        if (arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
+        if (cpu_isar_feature(aa32_simd_r16, cpu)) {
             /* We implement only the Floating Point extension's CP10/CP11 */
             cpu->env.v7m.cpacr[attrs.secure] = value & (0xf << 20);
         }
         break;
     case 0xd8c: /* NSACR */
-        if (attrs.secure && arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
+        if (attrs.secure && cpu_isar_feature(aa32_simd_r16, cpu)) {
             /* We implement only the Floating Point extension's CP10/CP11 */
             cpu->env.v7m.nsacr = value & (3 << 10);
         }
@@ -XXX,XX +XXX,XX @@ static void nvic_writel(NVICState *s, uint32_t offset, uint32_t value,
         break;
     }
     case 0xf34: /* FPCCR */
-        if (arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
+        if (cpu_isar_feature(aa32_simd_r16, cpu)) {
             /* Not all bits here are banked. */
             uint32_t fpccr_s;
 
@@ -XXX,XX +XXX,XX @@ static void nvic_writel(NVICState *s, uint32_t offset, uint32_t value,
         }
         break;
     case 0xf38: /* FPCAR */
-        if (arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
+        if (cpu_isar_feature(aa32_simd_r16, cpu)) {
             value &= ~7;
             cpu->env.v7m.fpcar[attrs.secure] = value;
         }
         break;
     case 0xf3c: /* FPDSCR */
-        if (arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
+        if (cpu_isar_feature(aa32_simd_r16, cpu)) {
             value &= 0x07c00000;
             cpu->env.v7m.fpdscr[attrs.secure] = value;
         }
diff --git a/linux-user/arm/signal.c b/linux-user/arm/signal.c
index XXXXXXX..XXXXXXX 100644
--- a/linux-user/arm/signal.c
+++ b/linux-user/arm/signal.c
@@ -XXX,XX +XXX,XX @@ static void setup_sigframe_v2(struct target_ucontext_v2 *uc,
     setup_sigcontext(&uc->tuc_mcontext, env, set->sig[0]);
     /* Save coprocessor signal frame.  */
     regspace = uc->tuc_regspace;
-    if (arm_feature(env, ARM_FEATURE_VFP)) {
+    if (cpu_isar_feature(aa32_simd_r16, env_archcpu(env))) {
         regspace = setup_sigframe_v2_vfp(regspace, env);
     }
     if (arm_feature(env, ARM_FEATURE_IWMMXT)) {
@@ -XXX,XX +XXX,XX @@ static int do_sigframe_return_v2(CPUARMState *env,
 
     /* Restore coprocessor signal frame */
     regspace = uc->tuc_regspace;
-    if (arm_feature(env, ARM_FEATURE_VFP)) {
+    if (cpu_isar_feature(aa32_simd_r16, env_archcpu(env))) {
         regspace = restore_sigframe_v2_vfp(env, regspace);
         if (!regspace) {
             return 1;
diff --git a/target/arm/arch_dump.c b/target/arm/arch_dump.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/arch_dump.c
+++ b/target/arm/arch_dump.c
@@ -XXX,XX +XXX,XX @@ int arm_cpu_write_elf32_note(WriteCoreDumpFunction f, CPUState *cs,
                              int cpuid, void *opaque)
 {
     struct arm_note note;
-    CPUARMState *env = &ARM_CPU(cs)->env;
+    ARMCPU *cpu = ARM_CPU(cs);
+    CPUARMState *env = &cpu->env;
     DumpState *s = opaque;
-    int ret, i, fpvalid = !!arm_feature(env, ARM_FEATURE_VFP);
+    int ret, i;
+    bool fpvalid = cpu_isar_feature(aa32_simd_r16, cpu);
 
     arm_note_init(&note, s, "CORE", 5, NT_PRSTATUS, sizeof(note.prstatus));
 
@@ -XXX,XX +XXX,XX @@ int cpu_get_dump_info(ArchDumpInfo *info,
 ssize_t cpu_get_note_size(int class, int machine, int nr_cpus)
 {
     ARMCPU *cpu = ARM_CPU(first_cpu);
-    CPUARMState *env = &cpu->env;
     size_t note_size;
 
     if (class == ELFCLASS64) {
@@ -XXX,XX +XXX,XX @@ ssize_t cpu_get_note_size(int class, int machine, int nr_cpus)
         note_size += AARCH64_PRFPREG_NOTE_SIZE;
 #ifdef TARGET_AARCH64
         if (cpu_isar_feature(aa64_sve, cpu)) {
-            note_size += AARCH64_SVE_NOTE_SIZE(env);
+            note_size += AARCH64_SVE_NOTE_SIZE(&cpu->env);
         }
 #endif
     } else {
         note_size = ARM_PRSTATUS_NOTE_SIZE;
-        if (arm_feature(env, ARM_FEATURE_VFP)) {
+        if (cpu_isar_feature(aa32_simd_r16, cpu)) {
             note_size += ARM_VFP_NOTE_SIZE;
         }
     }
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_reset(CPUState *s)
             env->v7m.ccr[M_REG_S] |= R_V7M_CCR_UNALIGN_TRP_MASK;
         }
 
-        if (arm_feature(env, ARM_FEATURE_VFP)) {
+        if (cpu_isar_feature(aa32_simd_r16, cpu)) {
             env->v7m.fpccr[M_REG_NS] = R_V7M_FPCCR_ASPEN_MASK;
             env->v7m.fpccr[M_REG_S] = R_V7M_FPCCR_ASPEN_MASK |
                 R_V7M_FPCCR_LSPEN_MASK | R_V7M_FPCCR_S_MASK;
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_dump_state(CPUState *cs, FILE *f, int flags)
         int numvfpregs = 0;
         if (cpu_isar_feature(aa32_simd_r32, cpu)) {
             numvfpregs = 32;
-        } else if (arm_feature(env, ARM_FEATURE_VFP)) {
+        } else if (cpu_isar_feature(aa32_simd_r16, cpu)) {
             numvfpregs = 16;
         }
         for (i = 0; i < numvfpregs; i++) {
@@ -XXX,XX +XXX,XX @@ void arm_cpu_post_init(Object *obj)
      * KVM does not currently allow us to lie to the guest about its
      * ID/feature registers, so the guest always sees what the host has.
      */
-    if (arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
+    if (cpu_isar_feature(aa32_simd_r16, cpu)) {
         cpu->has_vfp = true;
         if (!kvm_enabled()) {
             qdev_property_add_static(DEVICE(obj), &arm_cpu_has_vfp_property);
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
      * We rely on no XScale CPU having VFP so we can use the same bits in the
      * TB flags field for VECSTRIDE and XSCALE_CPAR.
      */
-    assert(!(arm_feature(env, ARM_FEATURE_VFP) &&
+    assert(!(cpu_isar_feature(aa32_simd_r16, cpu) &&
              arm_feature(env, ARM_FEATURE_XSCALE)));
 
     if (arm_feature(env, ARM_FEATURE_V7) &&
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void cpacr_write(CPUARMState *env, const ARMCPRegInfo *ri,
          * ASEDIS [31] and D32DIS [30] are both UNK/SBZP without VFP.
          * TRCDIS [28] is RAZ/WI since we do not implement a trace macrocell.
          */
-        if (arm_feature(env, ARM_FEATURE_VFP)) {
+        if (cpu_isar_feature(aa32_simd_r16, env_archcpu(env))) {
             /* VFP coprocessor: cp10 & cp11 [23:20] */
             mask |= (1 << 31) | (1 << 30) | (0xf << 20);
 
@@ -XXX,XX +XXX,XX @@ void arm_cpu_register_gdb_regs_for_features(ARMCPU *cpu)
     } else if (cpu_isar_feature(aa32_simd_r32, cpu)) {
         gdb_register_coprocessor(cs, vfp_gdb_get_reg, vfp_gdb_set_reg,
                                  35, "arm-vfp3.xml", 0);
-    } else if (arm_feature(env, ARM_FEATURE_VFP)) {
+    } else if (cpu_isar_feature(aa32_simd_r16, cpu)) {
         gdb_register_coprocessor(cs, vfp_gdb_get_reg, vfp_gdb_set_reg,
                                  19, "arm-vfp.xml", 0);
     }
diff --git a/target/arm/m_helper.c b/target/arm/m_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/m_helper.c
+++ b/target/arm/m_helper.c
@@ -XXX,XX +XXX,XX @@ static uint32_t v7m_integrity_sig(CPUARMState *env, uint32_t lr)
      */
     uint32_t sig = 0xfefa125a;
 
-    if (!arm_feature(env, ARM_FEATURE_VFP) || (lr & R_V7M_EXCRET_FTYPE_MASK)) {
+    if (!cpu_isar_feature(aa32_simd_r16, env_archcpu(env))
+        || (lr & R_V7M_EXCRET_FTYPE_MASK)) {
         sig |= 1;
     }
     return sig;
@@ -XXX,XX +XXX,XX @@ static void v7m_exception_taken(ARMCPU *cpu, uint32_t lr, bool dotailchain,
 
     if (dotailchain) {
         /* Sanitize LR FType and PREFIX bits */
-        if (!arm_feature(env, ARM_FEATURE_VFP)) {
+        if (!cpu_isar_feature(aa32_simd_r16, cpu)) {
             lr |= R_V7M_EXCRET_FTYPE_MASK;
         }
         lr = deposit32(lr, 24, 8, 0xff);
@@ -XXX,XX +XXX,XX @@ static void do_v7m_exception_exit(ARMCPU *cpu)
 
     ftype = excret & R_V7M_EXCRET_FTYPE_MASK;
 
-    if (!arm_feature(env, ARM_FEATURE_VFP) && !ftype) {
+    if (!ftype && !cpu_isar_feature(aa32_simd_r16, cpu)) {
         qemu_log_mask(LOG_GUEST_ERROR, "M profile: zero FTYPE in exception "
                       "exit PC value 0x%" PRIx32 " is UNPREDICTABLE "
                       "if FPU not present\n",
@@ -XXX,XX +XXX,XX @@ void HELPER(v7m_msr)(CPUARMState *env, uint32_t maskreg, uint32_t val)
              * SFPA is RAZ/WI from NS. FPCA is RO if NSACR.CP10 == 0,
              * RES0 if the FPU is not present, and is stored in the S bank
              */
-            if (arm_feature(env, ARM_FEATURE_VFP) &&
+            if (cpu_isar_feature(aa32_simd_r16, env_archcpu(env)) &&
                 extract32(env->v7m.nsacr, 10, 1)) {
                 env->v7m.control[M_REG_S] &= ~R_V7M_CONTROL_FPCA_MASK;
                 env->v7m.control[M_REG_S] |= val & R_V7M_CONTROL_FPCA_MASK;
@@ -XXX,XX +XXX,XX @@ void HELPER(v7m_msr)(CPUARMState *env, uint32_t maskreg, uint32_t val)
             env->v7m.control[env->v7m.secure] &= ~R_V7M_CONTROL_NPRIV_MASK;
             env->v7m.control[env->v7m.secure] |= val & R_V7M_CONTROL_NPRIV_MASK;
         }
-        if (arm_feature(env, ARM_FEATURE_VFP)) {
+        if (cpu_isar_feature(aa32_simd_r16, env_archcpu(env))) {
             /*
              * SFPA is RAZ/WI from NS or if no FPU.
              * FPCA is RO if NSACR.CP10 == 0, RES0 if the FPU is not present.
diff --git a/target/arm/machine.c b/target/arm/machine.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/machine.c
+++ b/target/arm/machine.c
@@ -XXX,XX +XXX,XX @@
 static bool vfp_needed(void *opaque)
 {
     ARMCPU *cpu = opaque;
-    CPUARMState *env = &cpu->env;
 
-    return arm_feature(env, ARM_FEATURE_VFP);
+    return cpu_isar_feature(aa32_simd_r16, cpu);
 }
 
 static int get_fpscr(QEMUFile *f, void *opaque, size_t size,
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

The old name, isar_feature_aa32_fpdp, does not reflect
that the test includes VFPv2.  We will introduce further
feature tests for VFPv3.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20200214181547.21408-7-richard.henderson@linaro.org
[PMM: fixed grammar in commit message]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h               |  4 ++--
 target/arm/translate-vfp.inc.c | 40 +++++++++++++++++-----------------
 2 files changed, 22 insertions(+), 22 deletions(-)

From: Richard Henderson <richard.henderson@linaro.org>

We will shortly use these to test for VFPv2 and VFPv3
in different situations.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200214181547.21408-8-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

From: Richard Henderson <richard.henderson@linaro.org>

Shuffle the order of the checks so that we test the ISA
before we test anything else, such as the register arguments.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200214181547.21408-9-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate-vfp.inc.c | 144 ++++++++++++++++-----------------
 1 file changed, 72 insertions(+), 72 deletions(-)
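
Note (not part of the commit): a minimal standalone sketch of the
check order the double-precision trans functions have after this
patch, with the ISA test first and the register-range test second.
The function dp_3op_is_defined() and its boolean parameters are
hypothetical stand-ins for the dc_isar_feature() calls.

```c
#include <stdbool.h>

/*
 * Return false where the real decoder would return false to UNDEF.
 * vd, vn and vm are 5-bit D register numbers; bit 4 (0x10) selects
 * D16-D31.
 */
static bool dp_3op_is_defined(bool have_fpdp_v2, bool have_simd_r32,
                              int vd, int vn, int vm)
{
    /* ISA check first: no double precision means the insn is absent. */
    if (!have_fpdp_v2) {
        return false;
    }

    /* Then the operands: D16-D31 are only valid with 32 registers. */
    if (!have_simd_r32 && ((vd | vn | vm) & 0x10)) {
        return false;
    }

    return true;
}
```

In this reduced form both orderings give the same result.  The
reordering matters for readability and for later patches that refine
the ISA test.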

diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-vfp.inc.c
+++ b/target/arm/translate-vfp.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VSEL(DisasContext *s, arg_VSEL *a)
         return false;
     }
 
-    /* UNDEF accesses to D16-D31 if they don't exist */
-    if (dp && !dc_isar_feature(aa32_simd_r32, s) &&
-        ((a->vm | a->vn | a->vd) & 0x10)) {
+    if (dp && !dc_isar_feature(aa32_fpdp_v2, s)) {
         return false;
     }
 
-    if (dp && !dc_isar_feature(aa32_fpdp_v2, s)) {
+    /* UNDEF accesses to D16-D31 if they don't exist */
+    if (dp && !dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vm | a->vn | a->vd) & 0x10)) {
         return false;
     }
 
@@ -XXX,XX +XXX,XX @@ static bool trans_VMINMAXNM(DisasContext *s, arg_VMINMAXNM *a)
         return false;
     }
 
-    /* UNDEF accesses to D16-D31 if they don't exist */
-    if (dp && !dc_isar_feature(aa32_simd_r32, s) &&
-        ((a->vm | a->vn | a->vd) & 0x10)) {
+    if (dp && !dc_isar_feature(aa32_fpdp_v2, s)) {
         return false;
     }
 
-    if (dp && !dc_isar_feature(aa32_fpdp_v2, s)) {
+    /* UNDEF accesses to D16-D31 if they don't exist */
+    if (dp && !dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vm | a->vn | a->vd) & 0x10)) {
         return false;
     }
 
@@ -XXX,XX +XXX,XX @@ static bool trans_VRINT(DisasContext *s, arg_VRINT *a)
         return false;
     }
 
-    /* UNDEF accesses to D16-D31 if they don't exist */
-    if (dp && !dc_isar_feature(aa32_simd_r32, s) &&
-        ((a->vm | a->vd) & 0x10)) {
+    if (dp && !dc_isar_feature(aa32_fpdp_v2, s)) {
         return false;
     }
 
-    if (dp && !dc_isar_feature(aa32_fpdp_v2, s)) {
+    /* UNDEF accesses to D16-D31 if they don't exist */
+    if (dp && !dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vm | a->vd) & 0x10)) {
         return false;
     }
 
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT(DisasContext *s, arg_VCVT *a)
         return false;
     }
 
-    /* UNDEF accesses to D16-D31 if they don't exist */
-    if (dp && !dc_isar_feature(aa32_simd_r32, s) && (a->vm & 0x10)) {
+    if (dp && !dc_isar_feature(aa32_fpdp_v2, s)) {
         return false;
     }
 
-    if (dp && !dc_isar_feature(aa32_fpdp_v2, s)) {
+    /* UNDEF accesses to D16-D31 if they don't exist */
+    if (dp && !dc_isar_feature(aa32_simd_r32, s) && (a->vm & 0x10)) {
         return false;
     }
 
@@ -XXX,XX +XXX,XX @@ static bool do_vfp_3op_dp(DisasContext *s, VFPGen3OpDPFn *fn,
     TCGv_i64 f0, f1, fd;
     TCGv_ptr fpst;
 
-    /* UNDEF accesses to D16-D31 if they don't exist */
-    if (!dc_isar_feature(aa32_simd_r32, s) && ((vd | vn | vm) & 0x10)) {
+    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
         return false;
     }
 
-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
+    /* UNDEF accesses to D16-D31 if they don't exist */
+    if (!dc_isar_feature(aa32_simd_r32, s) && ((vd | vn | vm) & 0x10)) {
         return false;
     }
 
@@ -XXX,XX +XXX,XX @@ static bool do_vfp_2op_dp(DisasContext *s, VFPGen2OpDPFn *fn, int vd, int vm)
     int veclen = s->vec_len;
     TCGv_i64 f0, fd;
 
-    /* UNDEF accesses to D16-D31 if they don't exist */
-    if (!dc_isar_feature(aa32_simd_r32, s) && ((vd | vm) & 0x10)) {
+    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
         return false;
     }
 
-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
+    /* UNDEF accesses to D16-D31 if they don't exist */
+    if (!dc_isar_feature(aa32_simd_r32, s) && ((vd | vm) & 0x10)) {
         return false;
     }
 
@@ -XXX,XX +XXX,XX @@ static bool trans_VFM_dp(DisasContext *s, arg_VFM_dp *a)
         return false;
     }
 
-    /* UNDEF accesses to D16-D31 if they don't exist. */
-    if (!dc_isar_feature(aa32_simd_r32, s) &&
-        ((a->vd | a->vn | a->vm) & 0x10)) {
+    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
         return false;
     }
 
-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vd | a->vn | a->vm) & 0x10)) {
         return false;
     }
 
@@ -XXX,XX +XXX,XX @@ static bool trans_VMOV_imm_dp(DisasContext *s, arg_VMOV_imm_dp *a)
 
     vd = a->vd;
 
-    /* UNDEF accesses to D16-D31 if they don't exist. */
-    if (!dc_isar_feature(aa32_simd_r32, s) && (vd & 0x10)) {
+    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
         return false;
     }
 
-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) && (vd & 0x10)) {
         return false;
     }
 
@@ -XXX,XX +XXX,XX @@ static bool trans_VCMP_dp(DisasContext *s, arg_VCMP_dp *a)
 {
     TCGv_i64 vd, vm;
 
+    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
+        return false;
+    }
+
     /* Vm/M bits must be zero for the Z variant */
     if (a->z && a->vm != 0) {
         return false;
@@ -XXX,XX +XXX,XX @@ static bool trans_VCMP_dp(DisasContext *s, arg_VCMP_dp *a)
         return false;
     }
 
-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
-        return false;
-    }
-
     if (!vfp_access_check(s)) {
         return true;
     }
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_f64_f16(DisasContext *s, arg_VCVT_f64_f16 *a)
     TCGv_i32 tmp;
     TCGv_i64 vd;
 
+    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
+        return false;
+    }
+
     if (!dc_isar_feature(aa32_fp16_dpconv, s)) {
         return false;
     }
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_f64_f16(DisasContext *s, arg_VCVT_f64_f16 *a)
         return false;
     }
 
-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
-        return false;
-    }
-
     if (!vfp_access_check(s)) {
         return true;
     }
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_f16_f64(DisasContext *s, arg_VCVT_f16_f64 *a)
     TCGv_i32 tmp;
     TCGv_i64 vm;
 
+    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
+        return false;
+    }
+
     if (!dc_isar_feature(aa32_fp16_dpconv, s)) {
         return false;
     }
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_f16_f64(DisasContext *s, arg_VCVT_f16_f64 *a)
         return false;
     }
 
-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
-        return false;
-    }
-
     if (!vfp_access_check(s)) {
         return true;
     }
@@ -XXX,XX +XXX,XX @@ static bool trans_VRINTR_dp(DisasContext *s, arg_VRINTR_dp *a)
     TCGv_ptr fpst;
     TCGv_i64 tmp;
 
+    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
+        return false;
+    }
+
     if (!dc_isar_feature(aa32_vrint, s)) {
         return false;
     }
@@ -XXX,XX +XXX,XX @@ static bool trans_VRINTR_dp(DisasContext *s, arg_VRINTR_dp *a)
         return false;
     }
 
-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
-        return false;
-    }
-
     if (!vfp_access_check(s)) {
         return true;
     }
@@ -XXX,XX +XXX,XX @@ static bool trans_VRINTZ_dp(DisasContext *s, arg_VRINTZ_dp *a)
     TCGv_i64 tmp;
     TCGv_i32 tcg_rmode;
 
+    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
+        return false;
+    }
+
     if (!dc_isar_feature(aa32_vrint, s)) {
         return false;
     }
@@ -XXX,XX +XXX,XX @@ static bool trans_VRINTZ_dp(DisasContext *s, arg_VRINTZ_dp *a)
         return false;
     }
 
-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
-        return false;
-    }
-
     if (!vfp_access_check(s)) {
         return true;
     }
@@ -XXX,XX +XXX,XX @@ static bool trans_VRINTX_dp(DisasContext *s, arg_VRINTX_dp *a)
     TCGv_ptr fpst;
     TCGv_i64 tmp;
 
+    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
+        return false;
+    }
+
     if (!dc_isar_feature(aa32_vrint, s)) {
         return false;
     }
@@ -XXX,XX +XXX,XX @@ static bool trans_VRINTX_dp(DisasContext *s, arg_VRINTX_dp *a)
         return false;
     }
 
-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
-        return false;
-    }
-
     if (!vfp_access_check(s)) {
         return true;
     }
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_sp(DisasContext *s, arg_VCVT_sp *a)
     TCGv_i64 vd;
     TCGv_i32 vm;
 
-    /* UNDEF accesses to D16-D31 if they don't exist. */
-    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vd & 0x10)) {
+    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
         return false;
     }
 
-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vd & 0x10)) {
         return false;
     }
 
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_dp(DisasContext *s, arg_VCVT_dp *a)
     TCGv_i64 vm;
     TCGv_i32 vd;
 
-    /* UNDEF accesses to D16-D31 if they don't exist. */
-    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vm & 0x10)) {
+    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
         return false;
     }
 
-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vm & 0x10)) {
         return false;
     }
 
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_int_dp(DisasContext *s, arg_VCVT_int_dp *a)
     TCGv_i64 vd;
     TCGv_ptr fpst;
 
-    /* UNDEF accesses to D16-D31 if they don't exist. */
-    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vd & 0x10)) {
+    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
         return false;
     }
 
-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vd & 0x10)) {
         return false;
     }
 
@@ -XXX,XX +XXX,XX @@ static bool trans_VJCVT(DisasContext *s, arg_VJCVT *a)
     TCGv_i32 vd;
     TCGv_i64 vm;
 
+    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
+        return false;
+    }
+
     if (!dc_isar_feature(aa32_jscvt, s)) {
         return false;
     }
@@ -XXX,XX +XXX,XX @@ static bool trans_VJCVT(DisasContext *s, arg_VJCVT *a)
         return false;
     }
 
-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
-        return false;
-    }
-
     if (!vfp_access_check(s)) {
         return true;
     }
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_fix_dp(DisasContext *s, arg_VCVT_fix_dp *a)
     TCGv_ptr fpst;
     int frac_bits;
 
+    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
+        return false;
+    }
+
     if (!arm_dc_feature(s, ARM_FEATURE_VFP3)) {
         return false;
     }
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_fix_dp(DisasContext *s, arg_VCVT_fix_dp *a)
         return false;
     }
 
-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
-        return false;
-    }
-
     if (!vfp_access_check(s)) {
         return true;
     }
@@ -XXX,XX +XXX,XX @@ static bool trans_VCVT_dp_int(DisasContext *s, arg_VCVT_dp_int *a)
     TCGv_i64 vm;
     TCGv_ptr fpst;
 
-    /* UNDEF accesses to D16-D31 if they don't exist. */
-    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vm & 0x10)) {
+    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
         return false;
     }
 
-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vm & 0x10)) {
         return false;
     }
 
-- 
2.20.1
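
The reordering in the hunks above can be modelled outside QEMU. The sketch below is a standalone illustration, not code from translate-vfp.inc.c: ModelFeatures and model_3op_dp_decodes are hypothetical names, and the two booleans only stand in for what dc_isar_feature(aa32_fpdp_v2, s) and dc_isar_feature(aa32_simd_r32, s) report. It shows the feature test placed first and the D16-D31 register test second. In this model both tests return false with no side effects, so swapping them should not change the result.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the ISA feature bits seen by a trans_* function. */
typedef struct {
    bool fpdp_v2;   /* models dc_isar_feature(aa32_fpdp_v2, s) */
    bool simd_r32;  /* models dc_isar_feature(aa32_simd_r32, s) */
} ModelFeatures;

/*
 * Model of the check order after the patch: ISA feature first, then
 * register-number validity. Returning false means "UNDEF", following
 * the convention visible in the hunks above.
 */
static bool model_3op_dp_decodes(const ModelFeatures *f,
                                 int vd, int vn, int vm)
{
    if (!f->fpdp_v2) {
        return false;
    }

    /* UNDEF accesses to D16-D31 if they don't exist */
    if (!f->simd_r32 && ((vd | vn | vm) & 0x10)) {
        return false;
    }

    return true;
}

/* Small self-check exercising each path through the model. */
static void model_3op_dp_selfcheck(void)
{
    ModelFeatures full = { .fpdp_v2 = true, .simd_r32 = true };
    ModelFeatures d16 = { .fpdp_v2 = true, .simd_r32 = false };
    ModelFeatures nodp = { .fpdp_v2 = false, .simd_r32 = true };

    assert(model_3op_dp_decodes(&full, 16, 0, 0));
    assert(!model_3op_dp_decodes(&d16, 16, 0, 0));
    assert(model_3op_dp_decodes(&d16, 15, 15, 15));
    assert(!model_3op_dp_decodes(&nodp, 0, 0, 0));
}
```

Calling model_3op_dp_selfcheck() from a test harness exercises all three outcomes: decoded, UNDEF for a missing register bank, and UNDEF for missing double-precision support.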

From: Richard Henderson <richard.henderson@linaro.org>

Sort this check to the start of a trans_* function.
Merge this with any existing test for fpdp_v2.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200214181547.21408-10-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate-vfp.inc.c | 24 ++++++++----------------
 1 file changed, 8 insertions(+), 16 deletions(-)

From: Richard Henderson <richard.henderson@linaro.org>

We will eventually remove the early ARM_FEATURE_VFP test,
so add a proper test for each trans_* that does not already
have another ISA test.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200214181547.21408-11-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate-vfp.inc.c | 78 ++++++++++++++++++++++++++++++----
 1 file changed, 69 insertions(+), 9 deletions(-)
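
As a rough illustration of the rule in this commit message, the standalone sketch below models two handlers. All names (ModelCpu, ModelResult, model_trans_plain_sp, model_trans_vrint_sp) are hypothetical and do not come from translate-vfp.inc.c. One handler has no other ISA test, so it gains an explicit single-precision check. The other already tests a more specific feature and is left alone, on the assumption that the specific feature implies basic VFP support. The three-way result mirrors the convention visible in the earlier hunks: return false for UNDEF, return true without emitting code when the access check fails.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical CPU model; the fields stand in for ISA feature queries. */
typedef struct {
    bool fpsp_v2;      /* basic single-precision VFP present */
    bool vrint;        /* a more specific ISA feature */
    bool vfp_enabled;  /* models the outcome of vfp_access_check() */
} ModelCpu;

typedef enum {
    MODEL_UNDEF,    /* handler returned false: insn not decoded */
    MODEL_TRAPPED,  /* handler returned true, access check failed */
    MODEL_EMITTED   /* handler returned true and generated code */
} ModelResult;

/* No other ISA test existed here, so an explicit fpsp_v2 test is added. */
static ModelResult model_trans_plain_sp(const ModelCpu *cpu)
{
    if (!cpu->fpsp_v2) {
        return MODEL_UNDEF;
    }
    if (!cpu->vfp_enabled) {
        return MODEL_TRAPPED;
    }
    return MODEL_EMITTED;
}

/*
 * Already guarded by another ISA test, so no extra check is added.
 * Assumption in this sketch: vrint is only ever set alongside fpsp_v2.
 */
static ModelResult model_trans_vrint_sp(const ModelCpu *cpu)
{
    if (!cpu->vrint) {
        return MODEL_UNDEF;
    }
    if (!cpu->vfp_enabled) {
        return MODEL_TRAPPED;
    }
    return MODEL_EMITTED;
}
```

With a global early test gone, a CPU model without fpsp_v2 would still get MODEL_UNDEF from model_trans_plain_sp, because the check now lives in the handler itself.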