Series comparison

-[Qemu-devel] [PULL 00/16] target-arm queue
+[PULL 00/29] target-arm queue
-The following changes since commit ad1b4ec39caa5b3f17cbd8160283a03a3dcfe2ae:
+Hi; this mostly contains the first slice of A64 decodetree
 patches, plus some other minor pieces. It also has the
 enablement of MTE for KVM guests.
-  Merge remote-tracking branch 'remotes/kraxel/tags/input-20180515-pull-request' into staging (2018-05-15 12:50:06 +0100)
+thanks
 -- PMM
 The following changes since commit d27e7c359330ba7020bdbed7ed2316cb4cf6ffc1:
   qapi/parser: Drop two bad type hints for now (2023-05-17 10:18:33 -0700)
 are available in the Git repository at:
-  git://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20180515
+  https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20230518
-for you to fetch changes up to ae7651804748c6b479d5ae09aeac4edb9c44f76e:
+for you to fetch changes up to 91608e2a44f36e79cb83f863b8a7bb57d2c98061:
-  tcg: Optionally log FPU state in TCG -d cpu logging (2018-05-15 14:58:44 +0100)
+  docs: Convert u2f.txt to rST (2023-05-18 11:40:32 +0100)
 ----------------------------------------------------------------
 target-arm queue:
- * Fix coverity nit in int_to_float code
+ * Fix vd == vm overlap in sve_ldff1_z
- * Don't set Invalid for float-to-int(MAXINT)
+ * Add support for MTE with KVM guests
- * Fix fp_status_f16 tininess before rounding
+ * Add RAZ/WI handling for DBGDTR[TX|RX]
- * Add various missing insns from the v8.2-FP16 extension
+ * Start of conversion of A64 decoder to decodetree
- * Fix sqrt_f16 exception raising
+ * Saturate L2CTLR_EL1 core count field rather than overflowing
- * sdcard: Correct CRC16 offset in sd_function_switch()
+ * vexpress: Avoid trivial memory leak of 'flashalias'
- * tcg: Optionally log FPU state in TCG -d cpu logging
+ * sbsa-ref: switch default cpu core to Neoverse-N1
  * sbsa-ref: use Bochs graphics card instead of VGA
  * MAINTAINERS: Add Marcin Juszkiewicz to sbsa-ref reviewer list
  * docs: Convert u2f.txt to rST
 ----------------------------------------------------------------
-Alex Bennée (5):
+Alex Bennée (1):
-      fpu/softfloat: int_to_float ensure r fully initialised
+      target/arm: add RAZ/WI handling for DBGDTR[TX|RX]
       target/arm: Implement FCMP for fp16
       target/arm: Implement FCSEL for fp16
       target/arm: Implement FMOV (immediate) for fp16
       target/arm: Fix sqrt_f16 exception raising
-Peter Maydell (3):
+Cornelia Huck (1):
-      fpu/softfloat: Don't set Invalid for float-to-int(MAXINT)
+      arm/kvm: add support for MTE
       target/arm: Fix fp_status_f16 tininess before rounding
       tcg: Optionally log FPU state in TCG -d cpu logging
-Philippe Mathieu-Daudé (1):
+Marcin Juszkiewicz (3):
-      sdcard: Correct CRC16 offset in sd_function_switch()
+      sbsa-ref: switch default cpu core to Neoverse-N1
       Maintainers: add myself as reviewer for sbsa-ref
       sbsa-ref: use Bochs graphics card instead of VGA
-Richard Henderson (7):
+Peter Maydell (14):
-      target/arm: Implement FMOV (general) for fp16
+      target/arm: Create decodetree skeleton for A64
-      target/arm: Early exit after unallocated_encoding in disas_fp_int_conv
+      target/arm: Pull calls to disas_sve() and disas_sme() out of legacy decoder
-      target/arm: Implement FCVT (scalar, integer) for fp16
+      target/arm: Convert Extract instructions to decodetree
-      target/arm: Implement FCVT (scalar, fixed-point) for fp16
+      target/arm: Convert unconditional branch immediate to decodetree
-      target/arm: Introduce and use read_fp_hreg
+      target/arm: Convert CBZ, CBNZ to decodetree
-      target/arm: Implement FP data-processing (2 source) for fp16
+      target/arm: Convert TBZ, TBNZ to decodetree
-      target/arm: Implement FP data-processing (3 source) for fp16
+      target/arm: Convert conditional branch insns to decodetree
       target/arm: Convert BR, BLR, RET to decodetree
       target/arm: Convert BRA[AB]Z, BLR[AB]Z, RETA[AB] to decodetree
       target/arm: Convert BRAA, BRAB, BLRAA, BLRAB to decodetree
       target/arm: Convert ERET, ERETAA, ERETAB to decodetree
       target/arm: Saturate L2CTLR_EL1 core count field rather than overflowing
       hw/arm/vexpress: Avoid trivial memory leak of 'flashalias'
       docs: Convert u2f.txt to rST
- include/qemu/log.h         |   1 +
+Richard Henderson (10):
- target/arm/helper-a64.h    |   2 +
+      target/arm: Fix vd == vm overlap in sve_ldff1_z
- target/arm/helper.h        |   6 +
+      target/arm: Split out disas_a64_legacy
- accel/tcg/cpu-exec.c       |   9 +-
+      target/arm: Convert PC-rel addressing to decodetree
- fpu/softfloat.c            |   6 +-
+      target/arm: Split gen_add_CC and gen_sub_CC
- hw/sd/sd.c                 |   2 +-
+      target/arm: Convert Add/subtract (immediate) to decodetree
- target/arm/cpu.c           |   2 +
+      target/arm: Convert Add/subtract (immediate with tags) to decodetree
- target/arm/helper-a64.c    |  10 ++
+      target/arm: Replace bitmask64 with MAKE_64BIT_MASK
- target/arm/helper.c        |  38 +++-
+      target/arm: Convert Logical (immediate) to decodetree
- target/arm/translate-a64.c | 421 ++++++++++++++++++++++++++++++++++++++-------
+      target/arm: Convert Move wide (immediate) to decodetree
- util/log.c                 |   2 +
+      target/arm: Convert Bitfield to decodetree
 files changed, 428 insertions(+), 71 deletions(-)
+ MAINTAINERS                      |    1 +
+ docs/system/device-emulation.rst |    1 +
+ docs/system/devices/usb-u2f.rst  |   93 +++
+ docs/system/devices/usb.rst      |    2 +-
+ docs/u2f.txt                     |  110 ----
+ target/arm/cpu.h                 |    4 +
+ target/arm/kvm_arm.h             |   19 +
+ target/arm/tcg/translate.h       |    5 +
+ target/arm/tcg/a64.decode        |  152 +++++
+ hw/arm/sbsa-ref.c                |    4 +-
+ hw/arm/vexpress.c                |   40 +-
+ hw/arm/virt.c                    |   73 ++-
+ target/arm/cortex-regs.c         |   11 +-
+ target/arm/cpu.c                 |    9 +-
+ target/arm/debug_helper.c        |   11 +-
+ target/arm/kvm.c                 |   35 +
+ target/arm/kvm64.c               |    5 +
+ target/arm/tcg/sve_helper.c      |    6 +
+ target/arm/tcg/translate-a64.c   | 1321 ++++++++++++++++----------------------
+ target/arm/tcg/meson.build       |    1 +
+files changed, 979 insertions(+), 924 deletions(-)
+ create mode 100644 docs/system/devices/usb-u2f.rst
+ delete mode 100644 docs/u2f.txt
+ create mode 100644 target/arm/tcg/a64.decode

-New patch
+[PULL 01/29] sbsa-ref: switch default cpu core to Neoverse-N1
+From: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
+The world outside moves to newer and newer cpu cores. Let move SBSA
+Reference Platform to something newer as well.
+Signed-off-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
+Reviewed-by: Leif Lindholm <quic_llindhol@quicinc.com>
+Message-id: 20230506183417.1360427-1-marcin.juszkiewicz@linaro.org
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+---
+ hw/arm/sbsa-ref.c | 2 +-
+file changed, 1 insertion(+), 1 deletion(-)
+diff --git a/hw/arm/sbsa-ref.c b/hw/arm/sbsa-ref.c
+index XXXXXXX..XXXXXXX 100644
+--- a/hw/arm/sbsa-ref.c
++++ b/hw/arm/sbsa-ref.c
+@@ -XXX,XX +XXX,XX @@ static void sbsa_ref_class_init(ObjectClass *oc, void *data)
+     mc->init = sbsa_ref_init;
+     mc->desc = "QEMU 'SBSA Reference' ARM Virtual Machine";
+-    mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-a57");
++    mc->default_cpu_type = ARM_CPU_TYPE_NAME("neoverse-n1");
+     mc->max_cpus = 512;
+     mc->pci_allow_0_address = true;
+     mc->minimum_page_bits = 12;
+--
+.34.1

-[Qemu-devel] [PULL 13/16] target/arm: Implement FMOV (immediate) for fp16
+[PULL 02/29] target/arm: Fix vd == vm overlap in sve_ldff1_z
-From: Alex Bennée <alex.bennee@linaro.org>
+From: Richard Henderson <richard.henderson@linaro.org>
-All the hard work is already done by vfp_expand_imm, we just need to
+If vd == vm, copy vm to scratch, so that we can pre-zero
-make sure we pick up the correct size.
+the output and still access the gather indicies.
 Cc: qemu-stable@nongnu.org
+Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1612
+Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20230504104232.1877774-1-richard.henderson@linaro.org
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
-Tested-by: Alex Bennée <alex.bennee@linaro.org>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20180512003217.9105-11-richard.henderson@linaro.org
-[rth: Merge unallocated_encoding check with TCGMemOp conversion.]
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate-a64.c | 20 +++++++++++++++++---
+ target/arm/tcg/sve_helper.c | 6 ++++++
-file changed, 17 insertions(+), 3 deletions(-)
+file changed, 6 insertions(+)
-diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
+diff --git a/target/arm/tcg/sve_helper.c b/target/arm/tcg/sve_helper.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-a64.c
+--- a/target/arm/tcg/sve_helper.c
-+++ b/target/arm/translate-a64.c
++++ b/target/arm/tcg/sve_helper.c
-@@ -XXX,XX +XXX,XX @@ static void disas_fp_imm(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ void sve_ldff1_z(CPUARMState *env, void *vd, uint64_t *vg, void *vm,
- {
+     intptr_t reg_off;
-     int rd = extract32(insn, 0, 5);
+     SVEHostPage info;
-     int imm8 = extract32(insn, 13, 8);
+     target_ulong addr, in_page;
--    int is_double = extract32(insn, 22, 2);
++    ARMVectorReg scratch;
-+    int type = extract32(insn, 22, 2);
-     uint64_t imm;
+     /* Skip to the first true predicate.  */
-     TCGv_i64 tcg_res;
+     reg_off = find_next_active(vg, 0, reg_max, esz);
-+    TCGMemOp sz;
+@@ -XXX,XX +XXX,XX @@ void sve_ldff1_z(CPUARMState *env, void *vd, uint64_t *vg, void *vm,
 -    if (is_double > 1) {
 +    switch (type) {
 +    case 0:
 +        sz = MO_32;
 +        break;
 +    case 1:
 +        sz = MO_64;
 +        break;
 +    case 3:
 +        sz = MO_16;
 +        if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
 +            break;
 +        }
 +        /* fallthru */
 +    default:
          unallocated_encoding(s);
          return;
      }
-@@ -XXX,XX +XXX,XX @@ static void disas_fp_imm(DisasContext *s, uint32_t insn)
-         return;
++    /* Protect against overlap between vd and vm. */
-     }
++    if (unlikely(vd == vm)) {
++        vm = memcpy(&scratch, vm, reg_max);
--    imm = vfp_expand_imm(MO_32 + is_double, imm8);
++    }
-+    imm = vfp_expand_imm(sz, imm8);
++
+     /*
-     tcg_res = tcg_const_i64(imm);
+      * Probe the first element, allowing faults.
-     write_fp_dreg(s, rd, tcg_res);
+      */
 --
-.17.0
+.34.1

-[Qemu-devel] [PULL 12/16] target/arm: Implement FCSEL for fp16
+[PULL 03/29] Maintainers: add myself as reviewer for sbsa-ref
-From: Alex Bennée <alex.bennee@linaro.org>
+From: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
-These were missed out from the rest of the half-precision work.
+At Linaro I work on sbsa-ref, know direction it goes.
-Cc: qemu-stable@nongnu.org
+May not get code details each time.
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
+Signed-off-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
-Tested-by: Alex Bennée <alex.bennee@linaro.org>
+Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20230515143753.365591-1-marcin.juszkiewicz@linaro.org
 Message-id: 20180512003217.9105-10-richard.henderson@linaro.org
 [rth: Fix erroneous check vs type]
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate-a64.c | 31 +++++++++++++++++++++++++------
+ MAINTAINERS | 1 +
-file changed, 25 insertions(+), 6 deletions(-)
+file changed, 1 insertion(+)
-diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
+diff --git a/MAINTAINERS b/MAINTAINERS
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-a64.c
+--- a/MAINTAINERS
-+++ b/target/arm/translate-a64.c
++++ b/MAINTAINERS
-@@ -XXX,XX +XXX,XX @@ static void disas_fp_csel(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ SBSA-REF
-     unsigned int mos, type, rm, cond, rn, rd;
+ M: Radoslaw Biernacki <rad@semihalf.com>
-     TCGv_i64 t_true, t_false, t_zero;
+ M: Peter Maydell <peter.maydell@linaro.org>
-     DisasCompare64 c;
+ R: Leif Lindholm <quic_llindhol@quicinc.com>
-+    TCGMemOp sz;
++R: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
+ L: qemu-arm@nongnu.org
-     mos = extract32(insn, 29, 3);
+ S: Maintained
--    type = extract32(insn, 22, 2); /* 0 = single, 1 = double */
+ F: hw/arm/sbsa-ref.c
 +    type = extract32(insn, 22, 2);
      rm = extract32(insn, 16, 5);
      cond = extract32(insn, 12, 4);
      rn = extract32(insn, 5, 5);
      rd = extract32(insn, 0, 5);
 -    if (mos || type > 1) {
 +    if (mos) {
 +        unallocated_encoding(s);
 +        return;
 +    }
 +
 +    switch (type) {
 +    case 0:
 +        sz = MO_32;
 +        break;
 +    case 1:
 +        sz = MO_64;
 +        break;
 +    case 3:
 +        sz = MO_16;
 +        if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
 +            break;
 +        }
 +        /* fallthru */
 +    default:
          unallocated_encoding(s);
          return;
      }
@@ -XXX,XX +XXX,XX @@ static void disas_fp_csel(DisasContext *s, uint32_t insn)
          return;
      }
 -    /* Zero extend sreg inputs to 64 bits now.  */
 +    /* Zero extend sreg & hreg inputs to 64 bits now.  */
      t_true = tcg_temp_new_i64();
      t_false = tcg_temp_new_i64();
 -    read_vec_element(s, t_true, rn, 0, type ? MO_64 : MO_32);
 -    read_vec_element(s, t_false, rm, 0, type ? MO_64 : MO_32);
 +    read_vec_element(s, t_true, rn, 0, sz);
 +    read_vec_element(s, t_false, rm, 0, sz);
      a64_test_cc(&c, cond);
      t_zero = tcg_const_i64(0);
@@ -XXX,XX +XXX,XX @@ static void disas_fp_csel(DisasContext *s, uint32_t insn)
      tcg_temp_free_i64(t_false);
      a64_free_cc(&c);
 -    /* Note that sregs write back zeros to the high bits,
 +    /* Note that sregs & hregs write back zeros to the high bits,
         and we've already done the zero-extension.  */
      write_fp_dreg(s, rd, t_true);
      tcg_temp_free_i64(t_true);
 --
-.17.0
+.34.1

-[Qemu-devel] [PULL 03/16] target/arm: Fix fp_status_f16 tininess before rounding
+[PULL 04/29] arm/kvm: add support for MTE
-In commit d81ce0ef2c4f105 we added an extra float_status field
+From: Cornelia Huck <cohuck@redhat.com>
-fp_status_fp16 for Arm, but forgot to initialize it correctly
-by setting it to float_tininess_before_rounding. This currently
+Extend the 'mte' property for the virt machine to cover KVM as
-will only cause problems for the new V8_FP16 feature, since the
+well. For KVM, we don't allocate tag memory, but instead enable the
-float-to-float conversion code doesn't use it yet. The effect
+capability.
-would be that we failed to set the Underflow IEEE exception flag
-in all the cases where we should.
+If MTE has been enabled, we need to disable migration, as we do not
+yet have a way to migrate the tags as well. Therefore, MTE will stay
-Add the missing initialization.
+off with KVM unless requested explicitly.
-Fixes: d81ce0ef2c4f105
+Signed-off-by: Cornelia Huck <cohuck@redhat.com>
-Cc: qemu-stable@nongnu.org
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
+Message-id: 20230428095533.21747-2-cohuck@redhat.com
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Message-id: 20180512004311.9299-16-richard.henderson@linaro.org
 ---
- target/arm/cpu.c | 2 ++
+ target/arm/cpu.h     |  4 +++
-file changed, 2 insertions(+)
+ target/arm/kvm_arm.h | 19 ++++++++++++
+ hw/arm/virt.c        | 73 +++++++++++++++++++++++++-------------------
  target/arm/cpu.c     |  9 +++---
  target/arm/kvm.c     | 35 +++++++++++++++++++++
  target/arm/kvm64.c   |  5 +++
 files changed, 109 insertions(+), 36 deletions(-)
 diff --git a/target/arm/cpu.h b/target/arm/cpu.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu.h
 +++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ struct ArchCPU {
       */
      uint32_t psci_conduit;
 +    /* CPU has Memory Tag Extension */
 +    bool has_mte;
 +
      /* For v8M, initial value of the Secure VTOR */
      uint32_t init_svtor;
      /* For v8M, initial value of the Non-secure VTOR */
@@ -XXX,XX +XXX,XX @@ struct ArchCPU {
      bool prop_pauth;
      bool prop_pauth_impdef;
      bool prop_lpa2;
 +    OnOffAuto prop_mte;
      /* DCZ blocksize, in log_2(words), ie low 4 bits of DCZID_EL0 */
      uint32_t dcz_blocksize;
 diff --git a/target/arm/kvm_arm.h b/target/arm/kvm_arm.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/kvm_arm.h
 +++ b/target/arm/kvm_arm.h
@@ -XXX,XX +XXX,XX @@ bool kvm_arm_pmu_supported(void);
   */
  bool kvm_arm_sve_supported(void);
 +/**
 + * kvm_arm_mte_supported:
 + *
 + * Returns: true if KVM can enable MTE, and false otherwise.
 + */
 +bool kvm_arm_mte_supported(void);
 +
  /**
   * kvm_arm_get_max_vm_ipa_size:
   * @ms: Machine state handle
@@ -XXX,XX +XXX,XX @@ void kvm_arm_pvtime_init(CPUState *cs, uint64_t ipa);
  int kvm_arm_set_irq(int cpu, int irqtype, int irq, int level);
 +void kvm_arm_enable_mte(Object *cpuobj, Error **errp);
 +
  #else
  /*
@@ -XXX,XX +XXX,XX @@ static inline bool kvm_arm_steal_time_supported(void)
      return false;
  }
 +static inline bool kvm_arm_mte_supported(void)
 +{
 +    return false;
 +}
 +
  /*
   * These functions should never actually be called without KVM support.
   */
@@ -XXX,XX +XXX,XX @@ static inline uint32_t kvm_arm_sve_get_vls(CPUState *cs)
      g_assert_not_reached();
  }
 +static inline void kvm_arm_enable_mte(Object *cpuobj, Error **errp)
 +{
 +    g_assert_not_reached();
 +}
 +
  #endif
  static inline const char *gic_class_name(void)
 diff --git a/hw/arm/virt.c b/hw/arm/virt.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/arm/virt.c
 +++ b/hw/arm/virt.c
@@ -XXX,XX +XXX,XX @@ static void machvirt_init(MachineState *machine)
          exit(1);
      }
 -    if (vms->mte && (kvm_enabled() || hvf_enabled())) {
 +    if (vms->mte && hvf_enabled()) {
          error_report("mach-virt: %s does not support providing "
                       "MTE to the guest CPU",
                       current_accel_name());
@@ -XXX,XX +XXX,XX @@ static void machvirt_init(MachineState *machine)
          }
          if (vms->mte) {
 -            /* Create the memory region only once, but link to all cpus. */
 -            if (!tag_sysmem) {
 -                /*
 -                 * The property exists only if MemTag is supported.
 -                 * If it is, we must allocate the ram to back that up.
 -                 */
 -                if (!object_property_find(cpuobj, "tag-memory")) {
 -                    error_report("MTE requested, but not supported "
 -                                 "by the guest CPU");
 +            if (tcg_enabled()) {
 +                /* Create the memory region only once, but link to all cpus. */
 +                if (!tag_sysmem) {
 +                    /*
 +                     * The property exists only if MemTag is supported.
 +                     * If it is, we must allocate the ram to back that up.
 +                     */
 +                    if (!object_property_find(cpuobj, "tag-memory")) {
 +                        error_report("MTE requested, but not supported "
 +                                     "by the guest CPU");
 +                        exit(1);
 +                    }
 +
 +                    tag_sysmem = g_new(MemoryRegion, 1);
 +                    memory_region_init(tag_sysmem, OBJECT(machine),
 +                                       "tag-memory", UINT64_MAX / 32);
 +
 +                    if (vms->secure) {
 +                        secure_tag_sysmem = g_new(MemoryRegion, 1);
 +                        memory_region_init(secure_tag_sysmem, OBJECT(machine),
 +                                           "secure-tag-memory",
 +                                           UINT64_MAX / 32);
 +
 +                        /* As with ram, secure-tag takes precedence over tag. */
 +                        memory_region_add_subregion_overlap(secure_tag_sysmem,
 +                                                            0, tag_sysmem, -1);
 +                    }
 +                }
 +
 +                object_property_set_link(cpuobj, "tag-memory",
 +                                         OBJECT(tag_sysmem), &error_abort);
 +                if (vms->secure) {
 +                    object_property_set_link(cpuobj, "secure-tag-memory",
 +                                             OBJECT(secure_tag_sysmem),
 +                                             &error_abort);
 +                }
 +            } else if (kvm_enabled()) {
 +                if (!kvm_arm_mte_supported()) {
 +                    error_report("MTE requested, but not supported by KVM");
                      exit(1);
                  }
 -
 -                tag_sysmem = g_new(MemoryRegion, 1);
 -                memory_region_init(tag_sysmem, OBJECT(machine),
 -                                   "tag-memory", UINT64_MAX / 32);
 -
 -                if (vms->secure) {
 -                    secure_tag_sysmem = g_new(MemoryRegion, 1);
 -                    memory_region_init(secure_tag_sysmem, OBJECT(machine),
 -                                       "secure-tag-memory", UINT64_MAX / 32);
 -
 -                    /* As with ram, secure-tag takes precedence over tag.  */
 -                    memory_region_add_subregion_overlap(secure_tag_sysmem, 0,
 -                                                        tag_sysmem, -1);
 -                }
 -            }
 -
 -            object_property_set_link(cpuobj, "tag-memory", OBJECT(tag_sysmem),
 -                                     &error_abort);
 -            if (vms->secure) {
 -                object_property_set_link(cpuobj, "secure-tag-memory",
 -                                         OBJECT(secure_tag_sysmem),
 -                                         &error_abort);
 +                kvm_arm_enable_mte(cpuobj, &error_abort);
              }
          }
 diff --git a/target/arm/cpu.c b/target/arm/cpu.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu.c
 +++ b/target/arm/cpu.c
-@@ -XXX,XX +XXX,XX @@ static void arm_cpu_reset(CPUState *s)
+@@ -XXX,XX +XXX,XX @@ void arm_cpu_post_init(Object *obj)
-                               &env->vfp.fp_status);
+                                      qdev_prop_allow_set_link_before_realize,
-     set_float_detect_tininess(float_tininess_before_rounding,
+                                      OBJ_PROP_LINK_STRONG);
-                               &env->vfp.standard_fp_status);
+         }
-+    set_float_detect_tininess(float_tininess_before_rounding,
++        cpu->has_mte = true;
-+                              &env->vfp.fp_status_f16);
+     }
  #endif
  }
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
          }
          if (cpu->tag_memory) {
              error_setg(errp,
 -                       "Cannot enable %s when guest CPUs has MTE enabled",
 +                       "Cannot enable %s when guest CPUs has tag memory enabled",
                         current_accel_name());
              return;
          }
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
      }
  #ifndef CONFIG_USER_ONLY
-     if (kvm_enabled()) {
+-    if (cpu->tag_memory == NULL && cpu_isar_feature(aa64_mte, cpu)) {
-         kvm_arm_reset_vcpu(cpu);
++    if (!cpu->has_mte && cpu_isar_feature(aa64_mte, cpu)) {
          /*
 -         * Disable the MTE feature bits if we do not have tag-memory
 -         * provided by the machine.
 +         * Disable the MTE feature bits if we do not have the feature
 +         * setup by the machine.
           */
          cpu->isar.id_aa64pfr1 =
              FIELD_DP64(cpu->isar.id_aa64pfr1, ID_AA64PFR1, MTE, 0);
 diff --git a/target/arm/kvm.c b/target/arm/kvm.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/kvm.c
 +++ b/target/arm/kvm.c
@@ -XXX,XX +XXX,XX @@
  #include "hw/boards.h"
  #include "hw/irq.h"
  #include "qemu/log.h"
 +#include "migration/blocker.h"
  const KVMCapabilityInfo kvm_arch_required_capabilities[] = {
      KVM_CAP_LAST_INFO
@@ -XXX,XX +XXX,XX @@ bool kvm_arch_cpu_check_are_resettable(void)
  void kvm_arch_accel_class_init(ObjectClass *oc)
  {
  }
 +
 +void kvm_arm_enable_mte(Object *cpuobj, Error **errp)
 +{
 +    static bool tried_to_enable;
 +    static bool succeeded_to_enable;
 +    Error *mte_migration_blocker = NULL;
 +    int ret;
 +
 +    if (!tried_to_enable) {
 +        /*
 +         * MTE on KVM is enabled on a per-VM basis (and retrying doesn't make
 +         * sense), and we only want a single migration blocker as well.
 +         */
 +        tried_to_enable = true;
 +
 +        ret = kvm_vm_enable_cap(kvm_state, KVM_CAP_ARM_MTE, 0);
 +        if (ret) {
 +            error_setg_errno(errp, -ret, "Failed to enable KVM_CAP_ARM_MTE");
 +            return;
 +        }
 +
 +        /* TODO: add proper migration support with MTE enabled */
 +        error_setg(&mte_migration_blocker,
 +                   "Live migration disabled due to MTE enabled");
 +        if (migrate_add_blocker(mte_migration_blocker, errp)) {
 +            error_free(mte_migration_blocker);
 +            return;
 +        }
 +        succeeded_to_enable = true;
 +    }
 +    if (succeeded_to_enable) {
 +        object_property_set_bool(cpuobj, "has_mte", true, NULL);
 +    }
 +}
 diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/kvm64.c
 +++ b/target/arm/kvm64.c
@@ -XXX,XX +XXX,XX @@ bool kvm_arm_steal_time_supported(void)
      return kvm_check_extension(kvm_state, KVM_CAP_STEAL_TIME);
  }
 +bool kvm_arm_mte_supported(void)
 +{
 +    return kvm_check_extension(kvm_state, KVM_CAP_ARM_MTE);
 +}
 +
  QEMU_BUILD_BUG_ON(KVM_ARM64_SVE_VQ_MIN != 1);
  uint32_t kvm_arm_sve_get_vls(CPUState *cs)
 --
-.17.0
+.34.1

-[Qemu-devel] [PULL 14/16] target/arm: Fix sqrt_f16 exception raising
+[PULL 05/29] target/arm: add RAZ/WI handling for DBGDTR[TX|RX]
 From: Alex Bennée <alex.bennee@linaro.org>
-We are meant to explicitly pass fpst, not cpu_env.
+The commit b3aa2f2128 (target/arm: provide stubs for more external
 debug registers) was added to handle HyperV's unconditional usage of
 Debug Communications Channel. It turns out that Linux will similarly
 break if you enable CONFIG_HVC_DCC "ARM JTAG DCC console".
-Cc: qemu-stable@nongnu.org
+Extend the registers we RAZ/WI set to avoid this.
 Cc: Anders Roxell <anders.roxell@linaro.org>
 Cc: Evgeny Iakovlev <eiakovlev@linux.microsoft.com>
 Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20230516104420.407912-1-alex.bennee@linaro.org
 Tested-by: Alex Bennée <alex.bennee@linaro.org>
 Message-id: 20180512003217.9105-12-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate-a64.c | 3 ++-
+ target/arm/debug_helper.c | 11 +++++++++--
-file changed, 2 insertions(+), 1 deletion(-)
+file changed, 9 insertions(+), 2 deletions(-)
-diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
+diff --git a/target/arm/debug_helper.c b/target/arm/debug_helper.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-a64.c
+--- a/target/arm/debug_helper.c
-+++ b/target/arm/translate-a64.c
++++ b/target/arm/debug_helper.c
-@@ -XXX,XX +XXX,XX @@ static void handle_fp_1src_half(DisasContext *s, int opcode, int rd, int rn)
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo debug_cp_reginfo[] = {
-         tcg_gen_xori_i32(tcg_res, tcg_op, 0x8000);
+       .access = PL0_R, .accessfn = access_tdcc,
-         break;
+       .type = ARM_CP_CONST, .resetvalue = 0 },
-     case 0x3: /* FSQRT */
+     /*
--        gen_helper_sqrt_f16(tcg_res, tcg_op, cpu_env);
+-     * OSDTRRX_EL1/OSDTRTX_EL1 are used for save and restore of DBGDTRRX_EL0.
-+        fpst = get_fpstatus_ptr(true);
+-     * It is a component of the Debug Communications Channel, which is not implemented.
-+        gen_helper_sqrt_f16(tcg_res, tcg_op, fpst);
++     * These registers belong to the Debug Communications Channel,
-         break;
++     * which is not implemented. However we implement RAZ/WI behaviour
-     case 0x8: /* FRINTN */
++     * with trapping to prevent spurious SIGILLs if the guest OS does
-     case 0x9: /* FRINTP */
++     * access them as the support cannot be probed for.
       */
      { .name = "OSDTRRX_EL1", .state = ARM_CP_STATE_BOTH, .cp = 14,
        .opc0 = 2, .opc1 = 0, .crn = 0, .crm = 0, .opc2 = 2,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo debug_cp_reginfo[] = {
        .opc0 = 2, .opc1 = 0, .crn = 0, .crm = 3, .opc2 = 2,
        .access = PL1_RW, .accessfn = access_tdcc,
        .type = ARM_CP_CONST, .resetvalue = 0 },
 +    /* DBGDTRTX_EL0/DBGDTRRX_EL0 depend on direction */
 +    { .name = "DBGDTR_EL0", .state = ARM_CP_STATE_BOTH, .cp = 14,
 +      .opc0 = 2, .opc1 = 3, .crn = 0, .crm = 5, .opc2 = 0,
 +      .access = PL0_RW, .accessfn = access_tdcc,
 +      .type = ARM_CP_CONST, .resetvalue = 0 },
      /*
       * OSECCR_EL1 provides a mechanism for an operating system
       * to access the contents of EDECCR. EDECCR is not implemented though,
 --
-.17.0
+.34.1

-[Qemu-devel] [PULL 01/16] fpu/softfloat: int_to_float ensure r fully initialised
+[PULL 06/29] sbsa-ref: use Bochs graphics card instead of VGA
-From: Alex Bennée <alex.bennee@linaro.org>
+From: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
-Reported by Coverity (CID1390635). We ensure this for uint_to_float
+Bochs card is normal PCI Express card so it fits better in system with
-later on so we might as well mirror that.
+PCI Express bus. VGA is simple legacy PCI card.
-Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
+Signed-off-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Reviewed-by: Leif Lindholm <quic_llindhol@quicinc.com>
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Message-id: 20230505120936.1097060-1-marcin.juszkiewicz@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- fpu/softfloat.c | 2 +-
+ hw/arm/sbsa-ref.c | 2 +-
 file changed, 1 insertion(+), 1 deletion(-)
-diff --git a/fpu/softfloat.c b/fpu/softfloat.c
+diff --git a/hw/arm/sbsa-ref.c b/hw/arm/sbsa-ref.c
 index XXXXXXX..XXXXXXX 100644
---- a/fpu/softfloat.c
+--- a/hw/arm/sbsa-ref.c
-+++ b/fpu/softfloat.c
++++ b/hw/arm/sbsa-ref.c
-@@ -XXX,XX +XXX,XX @@ FLOAT_TO_UINT(64, 64)
+@@ -XXX,XX +XXX,XX @@ static void create_pcie(SBSAMachineState *sms)
+         }
- static FloatParts int_to_float(int64_t a, float_status *status)
+     }
- {
--    FloatParts r;
+-    pci_create_simple(pci->bus, -1, "VGA");
-+    FloatParts r = {};
++    pci_create_simple(pci->bus, -1, "bochs-display");
-     if (a == 0) {
-         r.cls = float_class_zero;
+     create_smmu(sms, pci->bus);
-         r.sign = false;
+ }
 --
-.17.0
+.34.1

-[Qemu-devel] [PULL 09/16] target/arm: Implement FP data-processing (2 source) for fp16
+[PULL 07/29] target/arm: Split out disas_a64_legacy
 From: Richard Henderson <richard.henderson@linaro.org>
-We missed all of the scalar fp16 binary operations.
+Split out all of the decode stuff from aarch64_tr_translate_insn.
 Call it disas_a64_legacy to indicate it will be replaced.
-Cc: qemu-stable@nongnu.org
-Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Tested-by: Alex Bennée <alex.bennee@linaro.org>
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Message-id: 20180512003217.9105-7-richard.henderson@linaro.org
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Message-id: 20230512144106.3608981-2-peter.maydell@linaro.org
 [PMM: Rebased]
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate-a64.c | 65 ++++++++++++++++++++++++++++++++++++++
+ target/arm/tcg/translate-a64.c | 82 ++++++++++++++++++----------------
-file changed, 65 insertions(+)
+file changed, 44 insertions(+), 38 deletions(-)
-diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
+diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-a64.c
+--- a/target/arm/tcg/translate-a64.c
-+++ b/target/arm/translate-a64.c
++++ b/target/arm/tcg/translate-a64.c
-@@ -XXX,XX +XXX,XX @@ static void handle_fp_2src_double(DisasContext *s, int opcode,
+@@ -XXX,XX +XXX,XX @@ static bool btype_destination_ok(uint32_t insn, bool bt, int btype)
-     tcg_temp_free_i64(tcg_res);
+     return false;
  }
-+/* Floating-point data-processing (2 source) - half precision */
++/* C3.1 A64 instruction index by encoding */
-+static void handle_fp_2src_half(DisasContext *s, int opcode,
++static void disas_a64_legacy(DisasContext *s, uint32_t insn)
 +                                int rd, int rn, int rm)
 +{
-+    TCGv_i32 tcg_op1;
++    switch (extract32(insn, 25, 4)) {
-+    TCGv_i32 tcg_op2;
++    case 0x0:
-+    TCGv_i32 tcg_res;
++        if (!extract32(insn, 31, 1) || !disas_sme(s, insn)) {
-+    TCGv_ptr fpst;
++            unallocated_encoding(s);
-+
++        }
 +    tcg_res = tcg_temp_new_i32();
 +    fpst = get_fpstatus_ptr(true);
 +    tcg_op1 = read_fp_hreg(s, rn);
 +    tcg_op2 = read_fp_hreg(s, rm);
 +
 +    switch (opcode) {
 +    case 0x0: /* FMUL */
 +        gen_helper_advsimd_mulh(tcg_res, tcg_op1, tcg_op2, fpst);
 +        break;
-+    case 0x1: /* FDIV */
++    case 0x1: case 0x3: /* UNALLOCATED */
-+        gen_helper_advsimd_divh(tcg_res, tcg_op1, tcg_op2, fpst);
++        unallocated_encoding(s);
 +        break;
-+    case 0x2: /* FADD */
++    case 0x2:
-+        gen_helper_advsimd_addh(tcg_res, tcg_op1, tcg_op2, fpst);
++        if (!disas_sve(s, insn)) {
 +            unallocated_encoding(s);
 +        }
 +        break;
-+    case 0x3: /* FSUB */
++    case 0x8: case 0x9: /* Data processing - immediate */
-+        gen_helper_advsimd_subh(tcg_res, tcg_op1, tcg_op2, fpst);
++        disas_data_proc_imm(s, insn);
 +        break;
-+    case 0x4: /* FMAX */
++    case 0xa: case 0xb: /* Branch, exception generation and system insns */
-+        gen_helper_advsimd_maxh(tcg_res, tcg_op1, tcg_op2, fpst);
++        disas_b_exc_sys(s, insn);
 +        break;
-+    case 0x5: /* FMIN */
++    case 0x4:
-+        gen_helper_advsimd_minh(tcg_res, tcg_op1, tcg_op2, fpst);
++    case 0x6:
 +    case 0xc:
 +    case 0xe:      /* Loads and stores */
 +        disas_ldst(s, insn);
 +        break;
-+    case 0x6: /* FMAXNM */
++    case 0x5:
-+        gen_helper_advsimd_maxnumh(tcg_res, tcg_op1, tcg_op2, fpst);
++    case 0xd:      /* Data processing - register */
 +        disas_data_proc_reg(s, insn);
 +        break;
-+    case 0x7: /* FMINNM */
++    case 0x7:
-+        gen_helper_advsimd_minnumh(tcg_res, tcg_op1, tcg_op2, fpst);
++    case 0xf:      /* Data processing - SIMD and floating point */
-+        break;
++        disas_data_proc_simd_fp(s, insn);
 +    case 0x8: /* FNMUL */
 +        gen_helper_advsimd_mulh(tcg_res, tcg_op1, tcg_op2, fpst);
 +        tcg_gen_xori_i32(tcg_res, tcg_res, 0x8000);
 +        break;
 +    default:
-+        g_assert_not_reached();
++        assert(FALSE); /* all 15 cases should be handled above */
 +        break;
 +    }
-+
-+    write_fp_sreg(s, rd, tcg_res);
-+
-+    tcg_temp_free_ptr(fpst);
-+    tcg_temp_free_i32(tcg_op1);
-+    tcg_temp_free_i32(tcg_op2);
-+    tcg_temp_free_i32(tcg_res);
 +}
 +
- /* Floating point data-processing (2 source)
+ static void aarch64_tr_init_disas_context(DisasContextBase *dcbase,
-  *   31  30  29 28       24 23  22  21 20  16 15    12 11 10 9    5 4    0
+                                           CPUState *cpu)
-  * +---+---+---+-----------+------+---+------+--------+-----+------+------+
+ {
-@@ -XXX,XX +XXX,XX @@ static void disas_fp_2src(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
-         }
+         disas_sme_fa64(s, insn);
          handle_fp_2src_double(s, opcode, rd, rn, rm);
          break;
 +    case 3:
 +        if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
 +            unallocated_encoding(s);
 +            return;
 +        }
 +        if (!fp_access_check(s)) {
 +            return;
 +        }
 +        handle_fp_2src_half(s, opcode, rd, rn, rm);
 +        break;
      default:
          unallocated_encoding(s);
      }
+-    switch (extract32(insn, 25, 4)) {
+-    case 0x0:
+-        if (!extract32(insn, 31, 1) || !disas_sme(s, insn)) {
+-            unallocated_encoding(s);
+-        }
+-        break;
+-    case 0x1: case 0x3: /* UNALLOCATED */
+-        unallocated_encoding(s);
+-        break;
+-    case 0x2:
+-        if (!disas_sve(s, insn)) {
+-            unallocated_encoding(s);
+-        }
+-        break;
+-    case 0x8: case 0x9: /* Data processing - immediate */
+-        disas_data_proc_imm(s, insn);
+-        break;
+-    case 0xa: case 0xb: /* Branch, exception generation and system insns */
+-        disas_b_exc_sys(s, insn);
+-        break;
+-    case 0x4:
+-    case 0x6:
+-    case 0xc:
+-    case 0xe:      /* Loads and stores */
+-        disas_ldst(s, insn);
+-        break;
+-    case 0x5:
+-    case 0xd:      /* Data processing - register */
+-        disas_data_proc_reg(s, insn);
+-        break;
+-    case 0x7:
+-    case 0xf:      /* Data processing - SIMD and floating point */
+-        disas_data_proc_simd_fp(s, insn);
+-        break;
+-    default:
+-        assert(FALSE); /* all 15 cases should be handled above */
+-        break;
+-    }
++    disas_a64_legacy(s, insn);
+     /*
+      * After execution of most insns, btype is reset to 0.
 --
-.17.0
+.34.1

-New patch
+[PULL 08/29] target/arm: Create decodetree skeleton for A64
+The A64 translator uses a hand-written decoder for everything except
+SVE or SME.  It's fairly well structured, but it's becoming obvious
+that it's still more painful to add instructions to than the A32
+translator, because putting a new instruction into the right place in
+a hand-written decoder is much harder than adding new instruction
+patterns to a decodetree file.
+As the first step in conversion to decodetree, create the skeleton of
+the decodetree decoder; where it does not handle instructions we will
+fall back to the legacy decoder (which will be for everything at the
+moment, since there are no patterns in a64.decode).
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20230512144106.3608981-3-peter.maydell@linaro.org
+---
+ target/arm/tcg/a64.decode      | 20 ++++++++++++++++++++
+ target/arm/tcg/translate-a64.c | 18 +++++++++++-------
+ target/arm/tcg/meson.build     |  1 +
+files changed, 32 insertions(+), 7 deletions(-)
+ create mode 100644 target/arm/tcg/a64.decode
+diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
+new file mode 100644
+index XXXXXXX..XXXXXXX
+--- /dev/null
++++ b/target/arm/tcg/a64.decode
+@@ -XXX,XX +XXX,XX @@
++# AArch64 A64 allowed instruction decoding
++#
++#  Copyright (c) 2023 Linaro, Ltd
++#
++# This library is free software; you can redistribute it and/or
++# modify it under the terms of the GNU Lesser General Public
++# License as published by the Free Software Foundation; either
++# version 2.1 of the License, or (at your option) any later version.
++#
++# This library is distributed in the hope that it will be useful,
++# but WITHOUT ANY WARRANTY; without even the implied warranty of
++# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
++# Lesser General Public License for more details.
++#
++# You should have received a copy of the GNU Lesser General Public
++# License along with this library; if not, see <http://www.gnu.org/licenses/>.
++
++#
++# This file is processed by scripts/decodetree.py
++#
+diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/tcg/translate-a64.c
++++ b/target/arm/tcg/translate-a64.c
+@@ -XXX,XX +XXX,XX @@ enum a64_shift_type {
+     A64_SHIFT_TYPE_ROR = 3
+ };
++/*
++ * Include the generated decoders.
++ */
++
++#include "decode-sme-fa64.c.inc"
++#include "decode-a64.c.inc"
++
+ /* Table based decoder typedefs - used when the relevant bits for decode
+  * are too awkwardly scattered across the instruction (eg SIMD).
+  */
+@@ -XXX,XX +XXX,XX @@ static void disas_data_proc_simd_fp(DisasContext *s, uint32_t insn)
+     }
+ }
+-/*
+- * Include the generated SME FA64 decoder.
+- */
+-
+-#include "decode-sme-fa64.c.inc"
+-
+ static bool trans_OK(DisasContext *s, arg_OK *a)
+ {
+     return true;
+@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
+         disas_sme_fa64(s, insn);
+     }
+-    disas_a64_legacy(s, insn);
++
++    if (!disas_a64(s, insn)) {
++        disas_a64_legacy(s, insn);
++    }
+     /*
+      * After execution of most insns, btype is reset to 0.
+diff --git a/target/arm/tcg/meson.build b/target/arm/tcg/meson.build
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/tcg/meson.build
++++ b/target/arm/tcg/meson.build
+@@ -XXX,XX +XXX,XX @@ gen = [
+   decodetree.process('a32-uncond.decode', extra_args: '--static-decode=disas_a32_uncond'),
+   decodetree.process('t32.decode', extra_args: '--static-decode=disas_t32'),
+   decodetree.process('t16.decode', extra_args: ['-w', '16', '--static-decode=disas_t16']),
++  decodetree.process('a64.decode', extra_args: ['--static-decode=disas_a64']),
+ ]
+ arm_ss.add(gen)
+--
+.34.1

-New patch
+[PULL 09/29] target/arm: Pull calls to disas_sve() and disas_sme() out of legacy decoder
+The SVE and SME decode is already done by decodetree.  Pull the calls
+to these decoders out of the legacy decoder.  This doesn't change
+behaviour because all the patterns in sve.decode and sme.decode
+already require the bits that the legacy decoder is decoding to have
+the correct values.
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20230512144106.3608981-4-peter.maydell@linaro.org
+---
+ target/arm/tcg/translate-a64.c | 20 ++++----------------
+file changed, 4 insertions(+), 16 deletions(-)
+diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/tcg/translate-a64.c
++++ b/target/arm/tcg/translate-a64.c
+@@ -XXX,XX +XXX,XX @@ static bool btype_destination_ok(uint32_t insn, bool bt, int btype)
+ static void disas_a64_legacy(DisasContext *s, uint32_t insn)
+ {
+     switch (extract32(insn, 25, 4)) {
+-    case 0x0:
+-        if (!extract32(insn, 31, 1) || !disas_sme(s, insn)) {
+-            unallocated_encoding(s);
+-        }
+-        break;
+-    case 0x1: case 0x3: /* UNALLOCATED */
+-        unallocated_encoding(s);
+-        break;
+-    case 0x2:
+-        if (!disas_sve(s, insn)) {
+-            unallocated_encoding(s);
+-        }
+-        break;
+     case 0x8: case 0x9: /* Data processing - immediate */
+         disas_data_proc_imm(s, insn);
+         break;
+@@ -XXX,XX +XXX,XX @@ static void disas_a64_legacy(DisasContext *s, uint32_t insn)
+         disas_data_proc_simd_fp(s, insn);
+         break;
+     default:
+-        assert(FALSE); /* all 15 cases should be handled above */
++        unallocated_encoding(s);
+         break;
+     }
+ }
+@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
+         disas_sme_fa64(s, insn);
+     }
+-
+-    if (!disas_a64(s, insn)) {
++    if (!disas_a64(s, insn) &&
++        !disas_sme(s, insn) &&
++        !disas_sve(s, insn)) {
+         disas_a64_legacy(s, insn);
+     }
+--
+.34.1

-[Qemu-devel] [PULL 04/16] target/arm: Implement FMOV (general) for fp16
+[PULL 10/29] target/arm: Convert PC-rel addressing to decodetree
 From: Richard Henderson <richard.henderson@linaro.org>
-Adding the fp16 moves to/from general registers.
+Convert the ADR and ADRP instructions.
-Cc: qemu-stable@nongnu.org
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Tested-by: Alex Bennée <alex.bennee@linaro.org>
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Message-id: 20180512003217.9105-2-richard.henderson@linaro.org
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Message-id: 20230512144106.3608981-5-peter.maydell@linaro.org
 [PMM: Rebased]
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate-a64.c | 21 +++++++++++++++++++++
+ target/arm/tcg/a64.decode      | 13 ++++++++++++
-file changed, 21 insertions(+)
+ target/arm/tcg/translate-a64.c | 38 +++++++++++++---------------------
 files changed, 27 insertions(+), 24 deletions(-)
-diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
+diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-a64.c
+--- a/target/arm/tcg/a64.decode
-+++ b/target/arm/translate-a64.c
++++ b/target/arm/tcg/a64.decode
-@@ -XXX,XX +XXX,XX @@ static void handle_fmov(DisasContext *s, int rd, int rn, int type, bool itof)
+@@ -XXX,XX +XXX,XX @@
-             tcg_gen_st_i64(tcg_rn, cpu_env, fp_reg_hi_offset(s, rd));
+ #
-             clear_vec_high(s, true, rd);
+ # This file is processed by scripts/decodetree.py
-             break;
+ #
-+        case 3:
++
-+            /* 16 bit */
++&ri              rd imm
-+            tmp = tcg_temp_new_i64();
++
-+            tcg_gen_ext16u_i64(tmp, tcg_rn);
++
-+            write_fp_dreg(s, rd, tmp);
++### Data Processing - Immediate
-+            tcg_temp_free_i64(tmp);
++
-+            break;
++# PC-rel addressing
-+        default:
++
-+            g_assert_not_reached();
++%imm_pcrel      5:s19 29:2
-         }
++@pcrel          . .. ..... ................... rd:5     &ri imm=%imm_pcrel
-     } else {
++
-         TCGv_i64 tcg_rd = cpu_reg(s, rd);
++ADR             0 .. 10000 ................... .....    @pcrel
-@@ -XXX,XX +XXX,XX @@ static void handle_fmov(DisasContext *s, int rd, int rn, int type, bool itof)
++ADRP            1 .. 10000 ................... .....    @pcrel
-             /* 64 bits from top half */
+diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
-             tcg_gen_ld_i64(tcg_rd, cpu_env, fp_reg_hi_offset(s, rn));
+index XXXXXXX..XXXXXXX 100644
-             break;
+--- a/target/arm/tcg/translate-a64.c
-+        case 3:
++++ b/target/arm/tcg/translate-a64.c
-+            /* 16 bit */
+@@ -XXX,XX +XXX,XX @@ static void disas_ldst(DisasContext *s, uint32_t insn)
 +            tcg_gen_ld16u_i64(tcg_rd, cpu_env, fp_reg_offset(s, rn, MO_16));
 +            break;
 +        default:
 +            g_assert_not_reached();
          }
      }
  }
-@@ -XXX,XX +XXX,XX @@ static void disas_fp_int_conv(DisasContext *s, uint32_t insn)
-         case 0xa: /* 64 bit */
+-/* PC-rel. addressing
-         case 0xd: /* 64 bit to top half of quad */
+- *   31  30   29 28       24 23                5 4    0
-             break;
+- * +----+-------+-----------+-------------------+------+
-+        case 0x6: /* 16-bit float, 32-bit int */
+- * | op | immlo | 1 0 0 0 0 |       immhi       |  Rd  |
-+        case 0xe: /* 16-bit float, 64-bit int */
+- * +----+-------+-----------+-------------------+------+
-+            if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
++/*
-+                break;
++ * PC-rel. addressing
-+            }
+  */
-+            /* fallthru */
+-static void disas_pc_rel_adr(DisasContext *s, uint32_t insn)
-         default:
++
-             /* all other sf/type/rmode combinations are invalid */
++static bool trans_ADR(DisasContext *s, arg_ri *a)
-             unallocated_encoding(s);
+ {
 -    unsigned int page, rd;
 -    int64_t offset;
 +    gen_pc_plus_diff(s, cpu_reg(s, a->rd), a->imm);
 +    return true;
 +}
 -    page = extract32(insn, 31, 1);
 -    /* SignExtend(immhi:immlo) -> offset */
 -    offset = sextract64(insn, 5, 19);
 -    offset = offset << 2 | extract32(insn, 29, 2);
 -    rd = extract32(insn, 0, 5);
 +static bool trans_ADRP(DisasContext *s, arg_ri *a)
 +{
 +    int64_t offset = (int64_t)a->imm << 12;
 -    if (page) {
 -        /* ADRP (page based) */
 -        offset <<= 12;
 -        /* The page offset is ok for CF_PCREL. */
 -        offset -= s->pc_curr & 0xfff;
 -    }
 -
 -    gen_pc_plus_diff(s, cpu_reg(s, rd), offset);
 +    /* The page offset is ok for CF_PCREL. */
 +    offset -= s->pc_curr & 0xfff;
 +    gen_pc_plus_diff(s, cpu_reg(s, a->rd), offset);
 +    return true;
  }
  /*
@@ -XXX,XX +XXX,XX @@ static void disas_extract(DisasContext *s, uint32_t insn)
  static void disas_data_proc_imm(DisasContext *s, uint32_t insn)
  {
      switch (extract32(insn, 23, 6)) {
 -    case 0x20: case 0x21: /* PC-rel. addressing */
 -        disas_pc_rel_adr(s, insn);
 -        break;
      case 0x22: /* Add/subtract (immediate) */
          disas_add_sub_imm(s, insn);
          break;
 --
-.17.0
+.34.1

-[Qemu-devel] [PULL 06/16] target/arm: Implement FCVT (scalar, integer) for fp16
+[PULL 11/29] target/arm: Split gen_add_CC and gen_sub_CC
 From: Richard Henderson <richard.henderson@linaro.org>
-Cc: qemu-stable@nongnu.org
+Split out specific 32-bit and 64-bit functions.
-Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
+These carry the same signature as tcg_gen_add_i64,
 and so will be easier to pass as callbacks.
 Retain gen_add_CC and gen_sub_CC during conversion.
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Tested-by: Alex Bennée <alex.bennee@linaro.org>
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Message-id: 20180512003217.9105-4-richard.henderson@linaro.org
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Message-id: 20230512144106.3608981-6-peter.maydell@linaro.org
 [PMM: rebased]
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/helper.h        |  6 +++
+ target/arm/tcg/translate-a64.c | 149 +++++++++++++++++++--------------
- target/arm/helper.c        | 38 ++++++++++++++-
+file changed, 84 insertions(+), 65 deletions(-)
  target/arm/translate-a64.c | 96 +++++++++++++++++++++++++++++++-------
 files changed, 122 insertions(+), 18 deletions(-)
-diff --git a/target/arm/helper.h b/target/arm/helper.h
+diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.h
+--- a/target/arm/tcg/translate-a64.c
-+++ b/target/arm/helper.h
++++ b/target/arm/tcg/translate-a64.c
-@@ -XXX,XX +XXX,XX @@ DEF_HELPER_3(vfp_touhd_round_to_zero, i64, f64, i32, ptr)
+@@ -XXX,XX +XXX,XX @@ static inline void gen_logic_CC(int sf, TCGv_i64 result)
  DEF_HELPER_3(vfp_tould_round_to_zero, i64, f64, i32, ptr)
  DEF_HELPER_3(vfp_touhh, i32, f16, i32, ptr)
  DEF_HELPER_3(vfp_toshh, i32, f16, i32, ptr)
 +DEF_HELPER_3(vfp_toulh, i32, f16, i32, ptr)
 +DEF_HELPER_3(vfp_toslh, i32, f16, i32, ptr)
 +DEF_HELPER_3(vfp_touqh, i64, f16, i32, ptr)
 +DEF_HELPER_3(vfp_tosqh, i64, f16, i32, ptr)
  DEF_HELPER_3(vfp_toshs, i32, f32, i32, ptr)
  DEF_HELPER_3(vfp_tosls, i32, f32, i32, ptr)
  DEF_HELPER_3(vfp_tosqs, i64, f32, i32, ptr)
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_3(vfp_ultod, f64, i64, i32, ptr)
  DEF_HELPER_3(vfp_uqtod, f64, i64, i32, ptr)
  DEF_HELPER_3(vfp_sltoh, f16, i32, i32, ptr)
  DEF_HELPER_3(vfp_ultoh, f16, i32, i32, ptr)
 +DEF_HELPER_3(vfp_sqtoh, f16, i64, i32, ptr)
 +DEF_HELPER_3(vfp_uqtoh, f16, i64, i32, ptr)
  DEF_HELPER_FLAGS_2(set_rmode, TCG_CALL_NO_RWG, i32, i32, ptr)
  DEF_HELPER_FLAGS_2(set_neon_rmode, TCG_CALL_NO_RWG, i32, i32, env)
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.c
 +++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ VFP_CONV_FIX_A64(uq, s, 32, 64, uint64)
  #undef VFP_CONV_FIX_A64
  /* Conversion to/from f16 can overflow to infinity before/after scaling.
 - * Therefore we convert to f64 (which does not round), scale,
 - * and then convert f64 to f16 (which may round).
 + * Therefore we convert to f64, scale, and then convert f64 to f16; or
 + * vice versa for conversion to integer.
 + *
 + * For 16- and 32-bit integers, the conversion to f64 never rounds.
 + * For 64-bit integers, any integer that would cause rounding will also
 + * overflow to f16 infinity, so there is no double rounding problem.
   */
  static float16 do_postscale_fp16(float64 f, int shift, float_status *fpst)
@@ -XXX,XX +XXX,XX @@ float16 HELPER(vfp_ultoh)(uint32_t x, uint32_t shift, void *fpst)
      return do_postscale_fp16(uint32_to_float64(x, fpst), shift, fpst);
  }
-+float16 HELPER(vfp_sqtoh)(uint64_t x, uint32_t shift, void *fpst)
+ /* dest = T0 + T1; compute C, N, V and Z flags */
 +static void gen_add64_CC(TCGv_i64 dest, TCGv_i64 t0, TCGv_i64 t1)
 +{
-+    return do_postscale_fp16(int64_to_float64(x, fpst), shift, fpst);
++    TCGv_i64 result, flag, tmp;
 +    result = tcg_temp_new_i64();
 +    flag = tcg_temp_new_i64();
 +    tmp = tcg_temp_new_i64();
 +
 +    tcg_gen_movi_i64(tmp, 0);
 +    tcg_gen_add2_i64(result, flag, t0, tmp, t1, tmp);
 +
 +    tcg_gen_extrl_i64_i32(cpu_CF, flag);
 +
 +    gen_set_NZ64(result);
 +
 +    tcg_gen_xor_i64(flag, result, t0);
 +    tcg_gen_xor_i64(tmp, t0, t1);
 +    tcg_gen_andc_i64(flag, flag, tmp);
 +    tcg_gen_extrh_i64_i32(cpu_VF, flag);
 +
 +    tcg_gen_mov_i64(dest, result);
 +}
 +
-+float16 HELPER(vfp_uqtoh)(uint64_t x, uint32_t shift, void *fpst)
++static void gen_add32_CC(TCGv_i64 dest, TCGv_i64 t0, TCGv_i64 t1)
 +{
-+    return do_postscale_fp16(uint64_to_float64(x, fpst), shift, fpst);
++    TCGv_i32 t0_32 = tcg_temp_new_i32();
 +    TCGv_i32 t1_32 = tcg_temp_new_i32();
 +    TCGv_i32 tmp = tcg_temp_new_i32();
 +
 +    tcg_gen_movi_i32(tmp, 0);
 +    tcg_gen_extrl_i64_i32(t0_32, t0);
 +    tcg_gen_extrl_i64_i32(t1_32, t1);
 +    tcg_gen_add2_i32(cpu_NF, cpu_CF, t0_32, tmp, t1_32, tmp);
 +    tcg_gen_mov_i32(cpu_ZF, cpu_NF);
 +    tcg_gen_xor_i32(cpu_VF, cpu_NF, t0_32);
 +    tcg_gen_xor_i32(tmp, t0_32, t1_32);
 +    tcg_gen_andc_i32(cpu_VF, cpu_VF, tmp);
 +    tcg_gen_extu_i32_i64(dest, cpu_NF);
 +}
 +
- static float64 do_prescale_fp16(float16 f, int shift, float_status *fpst)
+ static void gen_add_CC(int sf, TCGv_i64 dest, TCGv_i64 t0, TCGv_i64 t1)
  {
-     if (unlikely(float16_is_any_nan(f))) {
+     if (sf) {
-@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(vfp_touhh)(float16 x, uint32_t shift, void *fpst)
+-        TCGv_i64 result, flag, tmp;
-     return float64_to_uint16(do_prescale_fp16(x, shift, fpst), fpst);
+-        result = tcg_temp_new_i64();
 -        flag = tcg_temp_new_i64();
 -        tmp = tcg_temp_new_i64();
 -
 -        tcg_gen_movi_i64(tmp, 0);
 -        tcg_gen_add2_i64(result, flag, t0, tmp, t1, tmp);
 -
 -        tcg_gen_extrl_i64_i32(cpu_CF, flag);
 -
 -        gen_set_NZ64(result);
 -
 -        tcg_gen_xor_i64(flag, result, t0);
 -        tcg_gen_xor_i64(tmp, t0, t1);
 -        tcg_gen_andc_i64(flag, flag, tmp);
 -        tcg_gen_extrh_i64_i32(cpu_VF, flag);
 -
 -        tcg_gen_mov_i64(dest, result);
 +        gen_add64_CC(dest, t0, t1);
      } else {
 -        /* 32 bit arithmetic */
 -        TCGv_i32 t0_32 = tcg_temp_new_i32();
 -        TCGv_i32 t1_32 = tcg_temp_new_i32();
 -        TCGv_i32 tmp = tcg_temp_new_i32();
 -
 -        tcg_gen_movi_i32(tmp, 0);
 -        tcg_gen_extrl_i64_i32(t0_32, t0);
 -        tcg_gen_extrl_i64_i32(t1_32, t1);
 -        tcg_gen_add2_i32(cpu_NF, cpu_CF, t0_32, tmp, t1_32, tmp);
 -        tcg_gen_mov_i32(cpu_ZF, cpu_NF);
 -        tcg_gen_xor_i32(cpu_VF, cpu_NF, t0_32);
 -        tcg_gen_xor_i32(tmp, t0_32, t1_32);
 -        tcg_gen_andc_i32(cpu_VF, cpu_VF, tmp);
 -        tcg_gen_extu_i32_i64(dest, cpu_NF);
 +        gen_add32_CC(dest, t0, t1);
      }
  }
-+uint32_t HELPER(vfp_toslh)(float16 x, uint32_t shift, void *fpst)
+ /* dest = T0 - T1; compute C, N, V and Z flags */
 +static void gen_sub64_CC(TCGv_i64 dest, TCGv_i64 t0, TCGv_i64 t1)
 +{
-+    return float64_to_int32(do_prescale_fp16(x, shift, fpst), fpst);
++    /* 64 bit arithmetic */
 +    TCGv_i64 result, flag, tmp;
 +
 +    result = tcg_temp_new_i64();
 +    flag = tcg_temp_new_i64();
 +    tcg_gen_sub_i64(result, t0, t1);
 +
 +    gen_set_NZ64(result);
 +
 +    tcg_gen_setcond_i64(TCG_COND_GEU, flag, t0, t1);
 +    tcg_gen_extrl_i64_i32(cpu_CF, flag);
 +
 +    tcg_gen_xor_i64(flag, result, t0);
 +    tmp = tcg_temp_new_i64();
 +    tcg_gen_xor_i64(tmp, t0, t1);
 +    tcg_gen_and_i64(flag, flag, tmp);
 +    tcg_gen_extrh_i64_i32(cpu_VF, flag);
 +    tcg_gen_mov_i64(dest, result);
 +}
 +
-+uint32_t HELPER(vfp_toulh)(float16 x, uint32_t shift, void *fpst)
++static void gen_sub32_CC(TCGv_i64 dest, TCGv_i64 t0, TCGv_i64 t1)
 +{
-+    return float64_to_uint32(do_prescale_fp16(x, shift, fpst), fpst);
++    /* 32 bit arithmetic */
 +    TCGv_i32 t0_32 = tcg_temp_new_i32();
 +    TCGv_i32 t1_32 = tcg_temp_new_i32();
 +    TCGv_i32 tmp;
 +
 +    tcg_gen_extrl_i64_i32(t0_32, t0);
 +    tcg_gen_extrl_i64_i32(t1_32, t1);
 +    tcg_gen_sub_i32(cpu_NF, t0_32, t1_32);
 +    tcg_gen_mov_i32(cpu_ZF, cpu_NF);
 +    tcg_gen_setcond_i32(TCG_COND_GEU, cpu_CF, t0_32, t1_32);
 +    tcg_gen_xor_i32(cpu_VF, cpu_NF, t0_32);
 +    tmp = tcg_temp_new_i32();
 +    tcg_gen_xor_i32(tmp, t0_32, t1_32);
 +    tcg_gen_and_i32(cpu_VF, cpu_VF, tmp);
 +    tcg_gen_extu_i32_i64(dest, cpu_NF);
 +}
 +
-+uint64_t HELPER(vfp_tosqh)(float16 x, uint32_t shift, void *fpst)
+ static void gen_sub_CC(int sf, TCGv_i64 dest, TCGv_i64 t0, TCGv_i64 t1)
 +{
 +    return float64_to_int64(do_prescale_fp16(x, shift, fpst), fpst);
 +}
 +
 +uint64_t HELPER(vfp_touqh)(float16 x, uint32_t shift, void *fpst)
 +{
 +    return float64_to_uint64(do_prescale_fp16(x, shift, fpst), fpst);
 +}
 +
  /* Set the current fp rounding mode and return the old one.
   * The argument is a softfloat float_round_ value.
   */
 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-a64.c
 +++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void handle_fpfpcvt(DisasContext *s, int rd, int rn, int opcode,
                             bool itof, int rmode, int scale, int sf, int type)
  {
-     bool is_signed = !(opcode & 1);
+     if (sf) {
--    bool is_double = type;
+-        /* 64 bit arithmetic */
-     TCGv_ptr tcg_fpstatus;
+-        TCGv_i64 result, flag, tmp;
--    TCGv_i32 tcg_shift;
+-
-+    TCGv_i32 tcg_shift, tcg_single;
+-        result = tcg_temp_new_i64();
-+    TCGv_i64 tcg_double;
+-        flag = tcg_temp_new_i64();
+-        tcg_gen_sub_i64(result, t0, t1);
--    tcg_fpstatus = get_fpstatus_ptr(false);
+-
-+    tcg_fpstatus = get_fpstatus_ptr(type == 3);
+-        gen_set_NZ64(result);
+-
-     tcg_shift = tcg_const_i32(64 - scale);
+-        tcg_gen_setcond_i64(TCG_COND_GEU, flag, t0, t1);
+-        tcg_gen_extrl_i64_i32(cpu_CF, flag);
-@@ -XXX,XX +XXX,XX @@ static void handle_fpfpcvt(DisasContext *s, int rd, int rn, int opcode,
+-
-             tcg_int = tcg_extend;
+-        tcg_gen_xor_i64(flag, result, t0);
-         }
+-        tmp = tcg_temp_new_i64();
+-        tcg_gen_xor_i64(tmp, t0, t1);
--        if (is_double) {
+-        tcg_gen_and_i64(flag, flag, tmp);
--            TCGv_i64 tcg_double = tcg_temp_new_i64();
+-        tcg_gen_extrh_i64_i32(cpu_VF, flag);
-+        switch (type) {
+-        tcg_gen_mov_i64(dest, result);
-+        case 1: /* float64 */
++        gen_sub64_CC(dest, t0, t1);
 +            tcg_double = tcg_temp_new_i64();
              if (is_signed) {
                  gen_helper_vfp_sqtod(tcg_double, tcg_int,
                                       tcg_shift, tcg_fpstatus);
@@ -XXX,XX +XXX,XX @@ static void handle_fpfpcvt(DisasContext *s, int rd, int rn, int opcode,
              }
              write_fp_dreg(s, rd, tcg_double);
              tcg_temp_free_i64(tcg_double);
 -        } else {
 -            TCGv_i32 tcg_single = tcg_temp_new_i32();
 +            break;
 +
 +        case 0: /* float32 */
 +            tcg_single = tcg_temp_new_i32();
              if (is_signed) {
                  gen_helper_vfp_sqtos(tcg_single, tcg_int,
                                       tcg_shift, tcg_fpstatus);
@@ -XXX,XX +XXX,XX @@ static void handle_fpfpcvt(DisasContext *s, int rd, int rn, int opcode,
              }
              write_fp_sreg(s, rd, tcg_single);
              tcg_temp_free_i32(tcg_single);
 +            break;
 +
 +        case 3: /* float16 */
 +            tcg_single = tcg_temp_new_i32();
 +            if (is_signed) {
 +                gen_helper_vfp_sqtoh(tcg_single, tcg_int,
 +                                     tcg_shift, tcg_fpstatus);
 +            } else {
 +                gen_helper_vfp_uqtoh(tcg_single, tcg_int,
 +                                     tcg_shift, tcg_fpstatus);
 +            }
 +            write_fp_sreg(s, rd, tcg_single);
 +            tcg_temp_free_i32(tcg_single);
 +            break;
 +
 +        default:
 +            g_assert_not_reached();
          }
      } else {
-         TCGv_i64 tcg_int = cpu_reg(s, rd);
+-        /* 32 bit arithmetic */
-@@ -XXX,XX +XXX,XX @@ static void handle_fpfpcvt(DisasContext *s, int rd, int rn, int opcode,
+-        TCGv_i32 t0_32 = tcg_temp_new_i32();
+-        TCGv_i32 t1_32 = tcg_temp_new_i32();
-         gen_helper_set_rmode(tcg_rmode, tcg_rmode, tcg_fpstatus);
+-        TCGv_i32 tmp;
 -        if (is_double) {
 -            TCGv_i64 tcg_double = read_fp_dreg(s, rn);
 +        switch (type) {
 +        case 1: /* float64 */
 +            tcg_double = read_fp_dreg(s, rn);
              if (is_signed) {
                  if (!sf) {
                      gen_helper_vfp_tosld(tcg_int, tcg_double,
@@ -XXX,XX +XXX,XX @@ static void handle_fpfpcvt(DisasContext *s, int rd, int rn, int opcode,
                                           tcg_shift, tcg_fpstatus);
                  }
              }
 +            if (!sf) {
 +                tcg_gen_ext32u_i64(tcg_int, tcg_int);
 +            }
              tcg_temp_free_i64(tcg_double);
 -        } else {
 -            TCGv_i32 tcg_single = read_fp_sreg(s, rn);
 +            break;
 +
 +        case 0: /* float32 */
 +            tcg_single = read_fp_sreg(s, rn);
              if (sf) {
                  if (is_signed) {
                      gen_helper_vfp_tosqs(tcg_int, tcg_single,
@@ -XXX,XX +XXX,XX @@ static void handle_fpfpcvt(DisasContext *s, int rd, int rn, int opcode,
                  tcg_temp_free_i32(tcg_dest);
              }
              tcg_temp_free_i32(tcg_single);
 +            break;
 +
 +        case 3: /* float16 */
 +            tcg_single = read_fp_sreg(s, rn);
 +            if (sf) {
 +                if (is_signed) {
 +                    gen_helper_vfp_tosqh(tcg_int, tcg_single,
 +                                         tcg_shift, tcg_fpstatus);
 +                } else {
 +                    gen_helper_vfp_touqh(tcg_int, tcg_single,
 +                                         tcg_shift, tcg_fpstatus);
 +                }
 +            } else {
 +                TCGv_i32 tcg_dest = tcg_temp_new_i32();
 +                if (is_signed) {
 +                    gen_helper_vfp_toslh(tcg_dest, tcg_single,
 +                                         tcg_shift, tcg_fpstatus);
 +                } else {
 +                    gen_helper_vfp_toulh(tcg_dest, tcg_single,
 +                                         tcg_shift, tcg_fpstatus);
 +                }
 +                tcg_gen_extu_i32_i64(tcg_int, tcg_dest);
 +                tcg_temp_free_i32(tcg_dest);
 +            }
 +            tcg_temp_free_i32(tcg_single);
 +            break;
 +
 +        default:
 +            g_assert_not_reached();
          }
          gen_helper_set_rmode(tcg_rmode, tcg_rmode, tcg_fpstatus);
          tcg_temp_free_i32(tcg_rmode);
 -
--        if (!sf) {
+-        tcg_gen_extrl_i64_i32(t0_32, t0);
--            tcg_gen_ext32u_i64(tcg_int, tcg_int);
+-        tcg_gen_extrl_i64_i32(t1_32, t1);
--        }
+-        tcg_gen_sub_i32(cpu_NF, t0_32, t1_32);
 -        tcg_gen_mov_i32(cpu_ZF, cpu_NF);
 -        tcg_gen_setcond_i32(TCG_COND_GEU, cpu_CF, t0_32, t1_32);
 -        tcg_gen_xor_i32(cpu_VF, cpu_NF, t0_32);
 -        tmp = tcg_temp_new_i32();
 -        tcg_gen_xor_i32(tmp, t0_32, t1_32);
 -        tcg_gen_and_i32(cpu_VF, cpu_VF, tmp);
 -        tcg_gen_extu_i32_i64(dest, cpu_NF);
 +        gen_sub32_CC(dest, t0, t1);
      }
+ }
-     tcg_temp_free_ptr(tcg_fpstatus);
@@ -XXX,XX +XXX,XX @@ static void disas_fp_int_conv(DisasContext *s, uint32_t insn)
          /* actual FP conversions */
          bool itof = extract32(opcode, 1, 1);
 -        if (type > 1 || (rmode != 0 && opcode > 1)) {
 +        if (rmode != 0 && opcode > 1) {
 +            unallocated_encoding(s);
 +            return;
 +        }
 +        switch (type) {
 +        case 0: /* float32 */
 +        case 1: /* float64 */
 +            break;
 +        case 3: /* float16 */
 +            if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
 +                break;
 +            }
 +            /* fallthru */
 +        default:
              unallocated_encoding(s);
              return;
          }
 --
-.17.0
+.34.1

-[Qemu-devel] [PULL 10/16] target/arm: Implement FP data-processing (3 source) for fp16
+[PULL 12/29] target/arm: Convert Add/subtract (immediate) to decodetree
 From: Richard Henderson <richard.henderson@linaro.org>
-We missed all of the scalar fp16 fma operations.
+Convert the ADD and SUB (immediate) instructions.
-Cc: qemu-stable@nongnu.org
-Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Tested-by: Alex Bennée <alex.bennee@linaro.org>
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Message-id: 20180512003217.9105-8-richard.henderson@linaro.org
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Message-id: 20230512144106.3608981-7-peter.maydell@linaro.org
 [PMM: Rebased; adjusted to use translate.h's TRANS macro]
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate-a64.c | 48 ++++++++++++++++++++++++++++++++++++++
+ target/arm/tcg/translate.h     |  5 +++
-file changed, 48 insertions(+)
+ target/arm/tcg/a64.decode      | 17 ++++++++
  target/arm/tcg/translate-a64.c | 73 ++++++++++------------------------
 files changed, 42 insertions(+), 53 deletions(-)
-diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
+diff --git a/target/arm/tcg/translate.h b/target/arm/tcg/translate.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-a64.c
+--- a/target/arm/tcg/translate.h
-+++ b/target/arm/translate-a64.c
++++ b/target/arm/tcg/translate.h
-@@ -XXX,XX +XXX,XX @@ static void handle_fp_3src_double(DisasContext *s, bool o0, bool o1,
+@@ -XXX,XX +XXX,XX @@ static inline int rsub_8(DisasContext *s, int x)
-     tcg_temp_free_i64(tcg_res);
+     return 8 - x;
  }
-+/* Floating-point data-processing (3 source) - half precision */
++static inline int shl_12(DisasContext *s, int x)
 +static void handle_fp_3src_half(DisasContext *s, bool o0, bool o1,
 +                                int rd, int rn, int rm, int ra)
 +{
-+    TCGv_i32 tcg_op1, tcg_op2, tcg_op3;
++    return x << 12;
 +    TCGv_i32 tcg_res = tcg_temp_new_i32();
 +    TCGv_ptr fpst = get_fpstatus_ptr(true);
 +
 +    tcg_op1 = read_fp_hreg(s, rn);
 +    tcg_op2 = read_fp_hreg(s, rm);
 +    tcg_op3 = read_fp_hreg(s, ra);
 +
 +    /* These are fused multiply-add, and must be done as one
 +     * floating point operation with no rounding between the
 +     * multiplication and addition steps.
 +     * NB that doing the negations here as separate steps is
 +     * correct : an input NaN should come out with its sign bit
 +     * flipped if it is a negated-input.
 +     */
 +    if (o1 == true) {
 +        tcg_gen_xori_i32(tcg_op3, tcg_op3, 0x8000);
 +    }
 +
 +    if (o0 != o1) {
 +        tcg_gen_xori_i32(tcg_op1, tcg_op1, 0x8000);
 +    }
 +
 +    gen_helper_advsimd_muladdh(tcg_res, tcg_op1, tcg_op2, tcg_op3, fpst);
 +
 +    write_fp_sreg(s, rd, tcg_res);
 +
 +    tcg_temp_free_ptr(fpst);
 +    tcg_temp_free_i32(tcg_op1);
 +    tcg_temp_free_i32(tcg_op2);
 +    tcg_temp_free_i32(tcg_op3);
 +    tcg_temp_free_i32(tcg_res);
 +}
 +
- /* Floating point data-processing (3 source)
+ static inline int neon_3same_fp_size(DisasContext *s, int x)
-  *   31  30  29 28       24 23  22  21  20  16  15  14  10 9    5 4    0
+ {
-  * +---+---+---+-----------+------+----+------+----+------+------+------+
+     /* Convert 0==fp32, 1==fp16 into a MO_* value */
-@@ -XXX,XX +XXX,XX @@ static void disas_fp_3src(DisasContext *s, uint32_t insn)
+diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
-         }
+index XXXXXXX..XXXXXXX 100644
-         handle_fp_3src_double(s, o0, o1, rd, rn, rm, ra);
+--- a/target/arm/tcg/a64.decode
 +++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@
  #
  &ri              rd imm
 +&rri_sf          rd rn imm sf
  ### Data Processing - Immediate
@@ -XXX,XX +XXX,XX @@
  ADR             0 .. 10000 ................... .....    @pcrel
  ADRP            1 .. 10000 ................... .....    @pcrel
 +
 +# Add/subtract (immediate)
 +
 +%imm12_sh12     10:12 !function=shl_12
 +@addsub_imm     sf:1 .. ...... . imm:12 rn:5 rd:5
 +@addsub_imm12   sf:1 .. ...... . ............ rn:5 rd:5 imm=%imm12_sh12
 +
 +ADD_i           . 00 100010 0 ............ ..... .....  @addsub_imm
 +ADD_i           . 00 100010 1 ............ ..... .....  @addsub_imm12
 +ADDS_i          . 01 100010 0 ............ ..... .....  @addsub_imm
 +ADDS_i          . 01 100010 1 ............ ..... .....  @addsub_imm12
 +
 +SUB_i           . 10 100010 0 ............ ..... .....  @addsub_imm
 +SUB_i           . 10 100010 1 ............ ..... .....  @addsub_imm12
 +SUBS_i          . 11 100010 0 ............ ..... .....  @addsub_imm
 +SUBS_i          . 11 100010 1 ............ ..... .....  @addsub_imm12
 diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/tcg/translate-a64.c
 +++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void disas_ldst(DisasContext *s, uint32_t insn)
      }
  }
 +typedef void ArithTwoOp(TCGv_i64, TCGv_i64, TCGv_i64);
 +
 +static bool gen_rri(DisasContext *s, arg_rri_sf *a,
 +                    bool rd_sp, bool rn_sp, ArithTwoOp *fn)
 +{
 +    TCGv_i64 tcg_rn = rn_sp ? cpu_reg_sp(s, a->rn) : cpu_reg(s, a->rn);
 +    TCGv_i64 tcg_rd = rd_sp ? cpu_reg_sp(s, a->rd) : cpu_reg(s, a->rd);
 +    TCGv_i64 tcg_imm = tcg_constant_i64(a->imm);
 +
 +    fn(tcg_rd, tcg_rn, tcg_imm);
 +    if (!a->sf) {
 +        tcg_gen_ext32u_i64(tcg_rd, tcg_rd);
 +    }
 +    return true;
 +}
 +
  /*
   * PC-rel. addressing
   */
@@ -XXX,XX +XXX,XX @@ static bool trans_ADRP(DisasContext *s, arg_ri *a)
  /*
   * Add/subtract (immediate)
 - *
 - *  31 30 29 28         23 22 21         10 9   5 4   0
 - * +--+--+--+-------------+--+-------------+-----+-----+
 - * |sf|op| S| 1 0 0 0 1 0 |sh|    imm12    |  Rn | Rd  |
 - * +--+--+--+-------------+--+-------------+-----+-----+
 - *
 - *    sf: 0 -> 32bit, 1 -> 64bit
 - *    op: 0 -> add  , 1 -> sub
 - *     S: 1 -> set flags
 - *    sh: 1 -> LSL imm by 12
   */
 -static void disas_add_sub_imm(DisasContext *s, uint32_t insn)
 -{
 -    int rd = extract32(insn, 0, 5);
 -    int rn = extract32(insn, 5, 5);
 -    uint64_t imm = extract32(insn, 10, 12);
 -    bool shift = extract32(insn, 22, 1);
 -    bool setflags = extract32(insn, 29, 1);
 -    bool sub_op = extract32(insn, 30, 1);
 -    bool is_64bit = extract32(insn, 31, 1);
 -
 -    TCGv_i64 tcg_rn = cpu_reg_sp(s, rn);
 -    TCGv_i64 tcg_rd = setflags ? cpu_reg(s, rd) : cpu_reg_sp(s, rd);
 -    TCGv_i64 tcg_result;
 -
 -    if (shift) {
 -        imm <<= 12;
 -    }
 -
 -    tcg_result = tcg_temp_new_i64();
 -    if (!setflags) {
 -        if (sub_op) {
 -            tcg_gen_subi_i64(tcg_result, tcg_rn, imm);
 -        } else {
 -            tcg_gen_addi_i64(tcg_result, tcg_rn, imm);
 -        }
 -    } else {
 -        TCGv_i64 tcg_imm = tcg_constant_i64(imm);
 -        if (sub_op) {
 -            gen_sub_CC(is_64bit, tcg_result, tcg_rn, tcg_imm);
 -        } else {
 -            gen_add_CC(is_64bit, tcg_result, tcg_rn, tcg_imm);
 -        }
 -    }
 -
 -    if (is_64bit) {
 -        tcg_gen_mov_i64(tcg_rd, tcg_result);
 -    } else {
 -        tcg_gen_ext32u_i64(tcg_rd, tcg_result);
 -    }
 -}
 +TRANS(ADD_i, gen_rri, a, 1, 1, tcg_gen_add_i64)
 +TRANS(SUB_i, gen_rri, a, 1, 1, tcg_gen_sub_i64)
 +TRANS(ADDS_i, gen_rri, a, 0, 1, a->sf ? gen_add64_CC : gen_add32_CC)
 +TRANS(SUBS_i, gen_rri, a, 0, 1, a->sf ? gen_sub64_CC : gen_sub32_CC)
  /*
   * Add/subtract (immediate, with tags)
@@ -XXX,XX +XXX,XX @@ static void disas_extract(DisasContext *s, uint32_t insn)
  static void disas_data_proc_imm(DisasContext *s, uint32_t insn)
  {
      switch (extract32(insn, 23, 6)) {
 -    case 0x22: /* Add/subtract (immediate) */
 -        disas_add_sub_imm(s, insn);
 -        break;
      case 0x23: /* Add/subtract (immediate, with tags) */
          disas_add_sub_imm_with_tags(s, insn);
          break;
-+    case 3:
-+        if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
-+            unallocated_encoding(s);
-+            return;
-+        }
-+        if (!fp_access_check(s)) {
-+            return;
-+        }
-+        handle_fp_3src_half(s, o0, o1, rd, rn, rm, ra);
-+        break;
-     default:
-         unallocated_encoding(s);
-     }
 --
-.17.0
+.34.1

-[Qemu-devel] [PULL 15/16] sdcard: Correct CRC16 offset in sd_function_switch()
+[PULL 13/29] target/arm: Convert Add/subtract (immediate with tags) to decodetree
-From: Philippe Mathieu-Daudé <f4bug@amsat.org>
+From: Richard Henderson <richard.henderson@linaro.org>
-Per the Physical Layer Simplified Spec. "4.3.10.4 Switch Function Status":
+Convert the ADDG and SUBG (immediate) instructions.
-  The block length is predefined to 512 bits
+Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-and "4.10.2 SD Status":
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Message-id: 20230512144106.3608981-8-peter.maydell@linaro.org
-  The SD Status contains status bits that are related to the SD Memory Card
+[PMM: Rebased; use TRANS_FEAT()]
   proprietary features and may be used for future application-specific usage.
   The size of the SD Status is one data block of 512 bit. The content of this
   register is transmitted to the Host over the DAT bus along with a 16-bit CRC.
 Thus the 16-bit CRC goes at offset 64.
 Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
 Message-id: 20180509060104.4458-3-f4bug@amsat.org
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- hw/sd/sd.c | 2 +-
+ target/arm/tcg/a64.decode      |  8 +++++++
-file changed, 1 insertion(+), 1 deletion(-)
+ target/arm/tcg/translate-a64.c | 38 ++++++++++------------------------
 files changed, 19 insertions(+), 27 deletions(-)
-diff --git a/hw/sd/sd.c b/hw/sd/sd.c
+diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
 index XXXXXXX..XXXXXXX 100644
---- a/hw/sd/sd.c
+--- a/target/arm/tcg/a64.decode
-+++ b/hw/sd/sd.c
++++ b/target/arm/tcg/a64.decode
-@@ -XXX,XX +XXX,XX @@ static void sd_function_switch(SDState *sd, uint32_t arg)
+@@ -XXX,XX +XXX,XX @@ SUB_i           . 10 100010 0 ............ ..... .....  @addsub_imm
-         sd->data[14 + (i >> 1)] = new_func << ((i * 4) & 4);
+ SUB_i           . 10 100010 1 ............ ..... .....  @addsub_imm12
  SUBS_i          . 11 100010 0 ............ ..... .....  @addsub_imm
  SUBS_i          . 11 100010 1 ............ ..... .....  @addsub_imm12
 +
 +# Add/subtract (immediate with tags)
 +
 +&rri_tag        rd rn uimm6 uimm4
 +@addsub_imm_tag . .. ...... . uimm6:6 .. uimm4:4 rn:5 rd:5 &rri_tag
 +
 +ADDG_i          1 00 100011 0 ...... 00 .... ..... ..... @addsub_imm_tag
 +SUBG_i          1 10 100011 0 ...... 00 .... ..... ..... @addsub_imm_tag
 diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/tcg/translate-a64.c
 +++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ TRANS(SUBS_i, gen_rri, a, 0, 1, a->sf ? gen_sub64_CC : gen_sub32_CC)
  /*
   * Add/subtract (immediate, with tags)
 - *
 - *  31 30 29 28         23 22 21     16 14      10 9   5 4   0
 - * +--+--+--+-------------+--+---------+--+-------+-----+-----+
 - * |sf|op| S| 1 0 0 0 1 1 |o2|  uimm6  |o3| uimm4 |  Rn | Rd  |
 - * +--+--+--+-------------+--+---------+--+-------+-----+-----+
 - *
 - *    op: 0 -> add, 1 -> sub
   */
 -static void disas_add_sub_imm_with_tags(DisasContext *s, uint32_t insn)
 +
 +static bool gen_add_sub_imm_with_tags(DisasContext *s, arg_rri_tag *a,
 +                                      bool sub_op)
  {
 -    int rd = extract32(insn, 0, 5);
 -    int rn = extract32(insn, 5, 5);
 -    int uimm4 = extract32(insn, 10, 4);
 -    int uimm6 = extract32(insn, 16, 6);
 -    bool sub_op = extract32(insn, 30, 1);
      TCGv_i64 tcg_rn, tcg_rd;
      int imm;
 -    /* Test all of sf=1, S=0, o2=0, o3=0.  */
 -    if ((insn & 0xa040c000u) != 0x80000000u ||
 -        !dc_isar_feature(aa64_mte_insn_reg, s)) {
 -        unallocated_encoding(s);
 -        return;
 -    }
 -
 -    imm = uimm6 << LOG2_TAG_GRANULE;
 +    imm = a->uimm6 << LOG2_TAG_GRANULE;
      if (sub_op) {
          imm = -imm;
      }
-     memset(&sd->data[17], 0, 47);
--    stw_be_p(sd->data + 65, sd_crc16(sd->data, 64));
+-    tcg_rn = cpu_reg_sp(s, rn);
-+    stw_be_p(sd->data + 64, sd_crc16(sd->data, 64));
+-    tcg_rd = cpu_reg_sp(s, rd);
 +    tcg_rn = cpu_reg_sp(s, a->rn);
 +    tcg_rd = cpu_reg_sp(s, a->rd);
      if (s->ata) {
          gen_helper_addsubg(tcg_rd, cpu_env, tcg_rn,
                             tcg_constant_i32(imm),
 -                           tcg_constant_i32(uimm4));
 +                           tcg_constant_i32(a->uimm4));
      } else {
          tcg_gen_addi_i64(tcg_rd, tcg_rn, imm);
          gen_address_with_allocation_tag0(tcg_rd, tcg_rd);
      }
 +    return true;
  }
- static inline bool sd_wp_addr(SDState *sd, uint64_t addr)
++TRANS_FEAT(ADDG_i, aa64_mte_insn_reg, gen_add_sub_imm_with_tags, a, false)
 +TRANS_FEAT(SUBG_i, aa64_mte_insn_reg, gen_add_sub_imm_with_tags, a, true)
 +
  /* The input should be a value in the bottom e bits (with higher
   * bits zero); returns that value replicated into every element
   * of size e in a 64 bit integer.
@@ -XXX,XX +XXX,XX @@ static void disas_extract(DisasContext *s, uint32_t insn)
  static void disas_data_proc_imm(DisasContext *s, uint32_t insn)
  {
      switch (extract32(insn, 23, 6)) {
 -    case 0x23: /* Add/subtract (immediate, with tags) */
 -        disas_add_sub_imm_with_tags(s, insn);
 -        break;
      case 0x24: /* Logical (immediate) */
          disas_logic_imm(s, insn);
          break;
 --
-.17.0
+.34.1

-New patch
+[PULL 14/29] target/arm: Replace bitmask64 with MAKE_64BIT_MASK
+From: Richard Henderson <richard.henderson@linaro.org>
+Use the bitops.h macro rather than rolling our own here.
+Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Message-id: 20230512144106.3608981-9-peter.maydell@linaro.org
+---
+ target/arm/tcg/translate-a64.c | 11 ++---------
+file changed, 2 insertions(+), 9 deletions(-)
+diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/tcg/translate-a64.c
++++ b/target/arm/tcg/translate-a64.c
+@@ -XXX,XX +XXX,XX @@ static uint64_t bitfield_replicate(uint64_t mask, unsigned int e)
+     return mask;
+ }
+-/* Return a value with the bottom len bits set (where 0 < len <= 64) */
+-static inline uint64_t bitmask64(unsigned int length)
+-{
+-    assert(length > 0 && length <= 64);
+-    return ~0ULL >> (64 - length);
+-}
+-
+ /* Simplified variant of pseudocode DecodeBitMasks() for the case where we
+  * only require the wmask. Returns false if the imms/immr/immn are a reserved
+  * value (ie should cause a guest UNDEF exception), and true if they are
+@@ -XXX,XX +XXX,XX @@ bool logic_imm_decode_wmask(uint64_t *result, unsigned int immn,
+     /* Create the value of one element: s+1 set bits rotated
+      * by r within the element (which is e bits wide)...
+      */
+-    mask = bitmask64(s + 1);
++    mask = MAKE_64BIT_MASK(0, s + 1);
+     if (r) {
+         mask = (mask >> r) | (mask << (e - r));
+-        mask &= bitmask64(e);
++        mask &= MAKE_64BIT_MASK(0, e);
+     }
+     /* ...then replicate the element over the whole 64 bit value */
+     mask = bitfield_replicate(mask, e);
+--
+.34.1

-[Qemu-devel] [PULL 07/16] target/arm: Implement FCVT (scalar, fixed-point) for fp16
+[PULL 15/29] target/arm: Convert Logical (immediate) to decodetree
 From: Richard Henderson <richard.henderson@linaro.org>
-Cc: qemu-stable@nongnu.org
+Convert the ADD, ORR, EOR, ANDS (immediate) instructions.
-Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Tested-by: Alex Bennée <alex.bennee@linaro.org>
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Message-id: 20180512003217.9105-5-richard.henderson@linaro.org
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Message-id: 20230512144106.3608981-10-peter.maydell@linaro.org
 [PMM: rebased]
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate-a64.c | 17 +++++++++++++++--
+ target/arm/tcg/a64.decode      | 15 ++++++
-file changed, 15 insertions(+), 2 deletions(-)
+ target/arm/tcg/translate-a64.c | 94 +++++++++++-----------------------
 files changed, 44 insertions(+), 65 deletions(-)
-diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
+diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-a64.c
+--- a/target/arm/tcg/a64.decode
-+++ b/target/arm/translate-a64.c
++++ b/target/arm/tcg/a64.decode
-@@ -XXX,XX +XXX,XX @@ static void disas_fp_fixed_conv(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ SUBS_i          . 11 100010 1 ............ ..... .....  @addsub_imm12
-     bool sf = extract32(insn, 31, 1);
-     bool itof;
+ ADDG_i          1 00 100011 0 ...... 00 .... ..... ..... @addsub_imm_tag
+ SUBG_i          1 10 100011 0 ...... 00 .... ..... ..... @addsub_imm_tag
--    if (sbit || (type > 1)
++
--        || (!sf && scale < 32)) {
++# Logical (immediate)
-+    if (sbit || (!sf && scale < 32)) {
++
-+        unallocated_encoding(s);
++&rri_log        rd rn sf dbm
-+        return;
++@logic_imm_64   1 .. ...... dbm:13 rn:5 rd:5            &rri_log sf=1
 +@logic_imm_32   0 .. ...... 0 dbm:12 rn:5 rd:5          &rri_log sf=0
 +
 +AND_i           . 00 100100 . ...... ...... ..... ..... @logic_imm_64
 +AND_i           . 00 100100 . ...... ...... ..... ..... @logic_imm_32
 +ORR_i           . 01 100100 . ...... ...... ..... ..... @logic_imm_64
 +ORR_i           . 01 100100 . ...... ...... ..... ..... @logic_imm_32
 +EOR_i           . 10 100100 . ...... ...... ..... ..... @logic_imm_64
 +EOR_i           . 10 100100 . ...... ...... ..... ..... @logic_imm_32
 +ANDS_i          . 11 100100 . ...... ...... ..... ..... @logic_imm_64
 +ANDS_i          . 11 100100 . ...... ...... ..... ..... @logic_imm_32
 diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/tcg/translate-a64.c
 +++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static uint64_t bitfield_replicate(uint64_t mask, unsigned int e)
      return mask;
  }
 -/* Simplified variant of pseudocode DecodeBitMasks() for the case where we
 +/*
 + * Logical (immediate)
 + */
 +
 +/*
 + * Simplified variant of pseudocode DecodeBitMasks() for the case where we
   * only require the wmask. Returns false if the imms/immr/immn are a reserved
   * value (ie should cause a guest UNDEF exception), and true if they are
   * valid, in which case the decoded bit pattern is written to result.
@@ -XXX,XX +XXX,XX @@ bool logic_imm_decode_wmask(uint64_t *result, unsigned int immn,
      return true;
  }
 -/* Logical (immediate)
 - *   31  30 29 28         23 22  21  16 15  10 9    5 4    0
 - * +----+-----+-------------+---+------+------+------+------+
 - * | sf | opc | 1 0 0 1 0 0 | N | immr | imms |  Rn  |  Rd  |
 - * +----+-----+-------------+---+------+------+------+------+
 - */
 -static void disas_logic_imm(DisasContext *s, uint32_t insn)
 +static bool gen_rri_log(DisasContext *s, arg_rri_log *a, bool set_cc,
 +                        void (*fn)(TCGv_i64, TCGv_i64, int64_t))
  {
 -    unsigned int sf, opc, is_n, immr, imms, rn, rd;
      TCGv_i64 tcg_rd, tcg_rn;
 -    uint64_t wmask;
 -    bool is_and = false;
 +    uint64_t imm;
 -    sf = extract32(insn, 31, 1);
 -    opc = extract32(insn, 29, 2);
 -    is_n = extract32(insn, 22, 1);
 -    immr = extract32(insn, 16, 6);
 -    imms = extract32(insn, 10, 6);
 -    rn = extract32(insn, 5, 5);
 -    rd = extract32(insn, 0, 5);
 -
 -    if (!sf && is_n) {
 -        unallocated_encoding(s);
 -        return;
 +    /* Some immediate field values are reserved. */
 +    if (!logic_imm_decode_wmask(&imm, extract32(a->dbm, 12, 1),
 +                                extract32(a->dbm, 0, 6),
 +                                extract32(a->dbm, 6, 6))) {
 +        return false;
 +    }
++    if (!a->sf) {
++        imm &= 0xffffffffull;
+     }
+-    if (opc == 0x3) { /* ANDS */
+-        tcg_rd = cpu_reg(s, rd);
+-    } else {
+-        tcg_rd = cpu_reg_sp(s, rd);
+-    }
+-    tcg_rn = cpu_reg(s, rn);
++    tcg_rd = set_cc ? cpu_reg(s, a->rd) : cpu_reg_sp(s, a->rd);
++    tcg_rn = cpu_reg(s, a->rn);
+-    if (!logic_imm_decode_wmask(&wmask, is_n, imms, immr)) {
+-        /* some immediate field values are reserved */
+-        unallocated_encoding(s);
+-        return;
++    fn(tcg_rd, tcg_rn, imm);
++    if (set_cc) {
++        gen_logic_CC(a->sf, tcg_rd);
+     }
+-
+-    if (!sf) {
+-        wmask &= 0xffffffff;
+-    }
+-
+-    switch (opc) {
+-    case 0x3: /* ANDS */
+-    case 0x0: /* AND */
+-        tcg_gen_andi_i64(tcg_rd, tcg_rn, wmask);
+-        is_and = true;
+-        break;
+-    case 0x1: /* ORR */
+-        tcg_gen_ori_i64(tcg_rd, tcg_rn, wmask);
+-        break;
+-    case 0x2: /* EOR */
+-        tcg_gen_xori_i64(tcg_rd, tcg_rn, wmask);
+-        break;
+-    default:
+-        assert(FALSE); /* must handle all above */
+-        break;
+-    }
+-
+-    if (!sf && !is_and) {
+-        /* zero extend final result; we know we can skip this for AND
+-         * since the immediate had the high 32 bits clear.
+-         */
++    if (!a->sf) {
+         tcg_gen_ext32u_i64(tcg_rd, tcg_rd);
+     }
+-
+-    if (opc == 3) { /* ANDS */
+-        gen_logic_CC(sf, tcg_rd);
+-    }
++    return true;
+ }
++TRANS(AND_i, gen_rri_log, a, false, tcg_gen_andi_i64)
++TRANS(ORR_i, gen_rri_log, a, false, tcg_gen_ori_i64)
++TRANS(EOR_i, gen_rri_log, a, false, tcg_gen_xori_i64)
++TRANS(ANDS_i, gen_rri_log, a, true, tcg_gen_andi_i64)
 +
-+    switch (type) {
+ /*
-+    case 0: /* float32 */
+  * Move wide (immediate)
-+    case 1: /* float64 */
+  *
-+        break;
+@@ -XXX,XX +XXX,XX @@ static void disas_extract(DisasContext *s, uint32_t insn)
-+    case 3: /* float16 */
+ static void disas_data_proc_imm(DisasContext *s, uint32_t insn)
-+        if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+ {
-+            break;
+     switch (extract32(insn, 23, 6)) {
-+        }
+-    case 0x24: /* Logical (immediate) */
-+        /* fallthru */
+-        disas_logic_imm(s, insn);
-+    default:
+-        break;
-         unallocated_encoding(s);
+     case 0x25: /* Move wide (immediate) */
-         return;
+         disas_movw_imm(s, insn);
-     }
+         break;
 --
-.17.0
+.34.1

-[Qemu-devel] [PULL 05/16] target/arm: Early exit after unallocated_encoding in disas_fp_int_conv
+[PULL 16/29] target/arm: Convert Move wide (immediate) to decodetree
 From: Richard Henderson <richard.henderson@linaro.org>
-No sense in emitting code after the exception.
+Convert the MON, MOVZ, MOVK instructions.
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Tested-by: Alex Bennée <alex.bennee@linaro.org>
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Message-id: 20180512003217.9105-3-richard.henderson@linaro.org
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Message-id: 20230512144106.3608981-11-peter.maydell@linaro.org
 [PMM: Rebased]
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate-a64.c | 2 +-
+ target/arm/tcg/a64.decode      | 13 ++++++
-file changed, 1 insertion(+), 1 deletion(-)
+ target/arm/tcg/translate-a64.c | 73 ++++++++++++++--------------------
 files changed, 42 insertions(+), 44 deletions(-)
-diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
+diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-a64.c
+--- a/target/arm/tcg/a64.decode
-+++ b/target/arm/translate-a64.c
++++ b/target/arm/tcg/a64.decode
-@@ -XXX,XX +XXX,XX @@ static void disas_fp_int_conv(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ EOR_i           . 10 100100 . ...... ...... ..... ..... @logic_imm_64
-         default:
+ EOR_i           . 10 100100 . ...... ...... ..... ..... @logic_imm_32
-             /* all other sf/type/rmode combinations are invalid */
+ ANDS_i          . 11 100100 . ...... ...... ..... ..... @logic_imm_64
-             unallocated_encoding(s);
+ ANDS_i          . 11 100100 . ...... ...... ..... ..... @logic_imm_32
--            break;
++
-+            return;
++# Move wide (immediate)
-         }
++
++&movw           rd sf imm hw
-         if (!fp_access_check(s)) {
++@movw_64        1 .. ...... hw:2   imm:16 rd:5          &movw sf=1
 +@movw_32        0 .. ...... 0 hw:1 imm:16 rd:5          &movw sf=0
 +
 +MOVN            . 00 100101 .. ................ .....   @movw_64
 +MOVN            . 00 100101 .. ................ .....   @movw_32
 +MOVZ            . 10 100101 .. ................ .....   @movw_64
 +MOVZ            . 10 100101 .. ................ .....   @movw_32
 +MOVK            . 11 100101 .. ................ .....   @movw_64
 +MOVK            . 11 100101 .. ................ .....   @movw_32
 diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/tcg/translate-a64.c
 +++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ TRANS(ANDS_i, gen_rri_log, a, true, tcg_gen_andi_i64)
  /*
   * Move wide (immediate)
 - *
 - *  31 30 29 28         23 22 21 20             5 4    0
 - * +--+-----+-------------+-----+----------------+------+
 - * |sf| opc | 1 0 0 1 0 1 |  hw |  imm16         |  Rd  |
 - * +--+-----+-------------+-----+----------------+------+
 - *
 - * sf: 0 -> 32 bit, 1 -> 64 bit
 - * opc: 00 -> N, 10 -> Z, 11 -> K
 - * hw: shift/16 (0,16, and sf only 32, 48)
   */
 -static void disas_movw_imm(DisasContext *s, uint32_t insn)
 +
 +static bool trans_MOVZ(DisasContext *s, arg_movw *a)
  {
 -    int rd = extract32(insn, 0, 5);
 -    uint64_t imm = extract32(insn, 5, 16);
 -    int sf = extract32(insn, 31, 1);
 -    int opc = extract32(insn, 29, 2);
 -    int pos = extract32(insn, 21, 2) << 4;
 -    TCGv_i64 tcg_rd = cpu_reg(s, rd);
 +    int pos = a->hw << 4;
 +    tcg_gen_movi_i64(cpu_reg(s, a->rd), (uint64_t)a->imm << pos);
 +    return true;
 +}
 -    if (!sf && (pos >= 32)) {
 -        unallocated_encoding(s);
 -        return;
 -    }
 +static bool trans_MOVN(DisasContext *s, arg_movw *a)
 +{
 +    int pos = a->hw << 4;
 +    uint64_t imm = a->imm;
 -    switch (opc) {
 -    case 0: /* MOVN */
 -    case 2: /* MOVZ */
 -        imm <<= pos;
 -        if (opc == 0) {
 -            imm = ~imm;
 -        }
 -        if (!sf) {
 -            imm &= 0xffffffffu;
 -        }
 -        tcg_gen_movi_i64(tcg_rd, imm);
 -        break;
 -    case 3: /* MOVK */
 -        tcg_gen_deposit_i64(tcg_rd, tcg_rd, tcg_constant_i64(imm), pos, 16);
 -        if (!sf) {
 -            tcg_gen_ext32u_i64(tcg_rd, tcg_rd);
 -        }
 -        break;
 -    default:
 -        unallocated_encoding(s);
 -        break;
 +    imm = ~(imm << pos);
 +    if (!a->sf) {
 +        imm = (uint32_t)imm;
      }
 +    tcg_gen_movi_i64(cpu_reg(s, a->rd), imm);
 +    return true;
 +}
 +
 +static bool trans_MOVK(DisasContext *s, arg_movw *a)
 +{
 +    int pos = a->hw << 4;
 +    TCGv_i64 tcg_rd, tcg_im;
 +
 +    tcg_rd = cpu_reg(s, a->rd);
 +    tcg_im = tcg_constant_i64(a->imm);
 +    tcg_gen_deposit_i64(tcg_rd, tcg_rd, tcg_im, pos, 16);
 +    if (!a->sf) {
 +        tcg_gen_ext32u_i64(tcg_rd, tcg_rd);
 +    }
 +    return true;
  }
  /* Bitfield
@@ -XXX,XX +XXX,XX @@ static void disas_extract(DisasContext *s, uint32_t insn)
  static void disas_data_proc_imm(DisasContext *s, uint32_t insn)
  {
      switch (extract32(insn, 23, 6)) {
 -    case 0x25: /* Move wide (immediate) */
 -        disas_movw_imm(s, insn);
 -        break;
      case 0x26: /* Bitfield */
          disas_bitfield(s, insn);
          break;
 --
-.17.0
+.34.1

-[Qemu-devel] [PULL 08/16] target/arm: Introduce and use read_fp_hreg
+[PULL 17/29] target/arm: Convert Bitfield to decodetree
 From: Richard Henderson <richard.henderson@linaro.org>
-Cc: qemu-stable@nongnu.org
+Convert the BFM, SBFM, UBFM instructions.
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Tested-by: Alex Bennée <alex.bennee@linaro.org>
+Message-id: 20230512144106.3608981-12-peter.maydell@linaro.org
-Message-id: 20180512003217.9105-6-richard.henderson@linaro.org
+[PMM: Rebased]
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate-a64.c | 30 ++++++++++++++----------------
+ target/arm/tcg/a64.decode      |  13 +++
-file changed, 14 insertions(+), 16 deletions(-)
+ target/arm/tcg/translate-a64.c | 144 ++++++++++++++++++---------------
 files changed, 94 insertions(+), 63 deletions(-)
-diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
+diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-a64.c
+--- a/target/arm/tcg/a64.decode
-+++ b/target/arm/translate-a64.c
++++ b/target/arm/tcg/a64.decode
-@@ -XXX,XX +XXX,XX @@ static TCGv_i32 read_fp_sreg(DisasContext *s, int reg)
+@@ -XXX,XX +XXX,XX @@ MOVZ            . 10 100101 .. ................ .....   @movw_64
-     return v;
+ MOVZ            . 10 100101 .. ................ .....   @movw_32
  MOVK            . 11 100101 .. ................ .....   @movw_64
  MOVK            . 11 100101 .. ................ .....   @movw_32
 +
 +# Bitfield
 +
 +&bitfield       rd rn sf immr imms
 +@bitfield_64    1 .. ...... 1 immr:6 imms:6 rn:5 rd:5      &bitfield sf=1
 +@bitfield_32    0 .. ...... 0 0 immr:5 0 imms:5 rn:5 rd:5  &bitfield sf=0
 +
 +SBFM            . 00 100110 . ...... ...... ..... ..... @bitfield_64
 +SBFM            . 00 100110 . ...... ...... ..... ..... @bitfield_32
 +BFM             . 01 100110 . ...... ...... ..... ..... @bitfield_64
 +BFM             . 01 100110 . ...... ...... ..... ..... @bitfield_32
 +UBFM            . 10 100110 . ...... ...... ..... ..... @bitfield_64
 +UBFM            . 10 100110 . ...... ...... ..... ..... @bitfield_32
 diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/tcg/translate-a64.c
 +++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static bool trans_MOVK(DisasContext *s, arg_movw *a)
      return true;
  }
-+static TCGv_i32 read_fp_hreg(DisasContext *s, int reg)
+-/* Bitfield
 - *   31  30 29 28         23 22  21  16 15  10 9    5 4    0
 - * +----+-----+-------------+---+------+------+------+------+
 - * | sf | opc | 1 0 0 1 1 0 | N | immr | imms |  Rn  |  Rd  |
 - * +----+-----+-------------+---+------+------+------+------+
 +/*
 + * Bitfield
   */
 -static void disas_bitfield(DisasContext *s, uint32_t insn)
 +
 +static bool trans_SBFM(DisasContext *s, arg_SBFM *a)
  {
 -    unsigned int sf, n, opc, ri, si, rn, rd, bitsize, pos, len;
 -    TCGv_i64 tcg_rd, tcg_tmp;
 +    TCGv_i64 tcg_rd = cpu_reg(s, a->rd);
 +    TCGv_i64 tcg_tmp = read_cpu_reg(s, a->rn, 1);
 +    unsigned int bitsize = a->sf ? 64 : 32;
 +    unsigned int ri = a->immr;
 +    unsigned int si = a->imms;
 +    unsigned int pos, len;
 -    sf = extract32(insn, 31, 1);
 -    opc = extract32(insn, 29, 2);
 -    n = extract32(insn, 22, 1);
 -    ri = extract32(insn, 16, 6);
 -    si = extract32(insn, 10, 6);
 -    rn = extract32(insn, 5, 5);
 -    rd = extract32(insn, 0, 5);
 -    bitsize = sf ? 64 : 32;
 -
 -    if (sf != n || ri >= bitsize || si >= bitsize || opc > 2) {
 -        unallocated_encoding(s);
 -        return;
 -    }
 -
 -    tcg_rd = cpu_reg(s, rd);
 -
 -    /* Suppress the zero-extend for !sf.  Since RI and SI are constrained
 -       to be smaller than bitsize, we'll never reference data outside the
 -       low 32-bits anyway.  */
 -    tcg_tmp = read_cpu_reg(s, rn, 1);
 -
 -    /* Recognize simple(r) extractions.  */
      if (si >= ri) {
          /* Wd<s-r:0> = Wn<s:r> */
          len = (si - ri) + 1;
 -        if (opc == 0) { /* SBFM: ASR, SBFX, SXTB, SXTH, SXTW */
 -            tcg_gen_sextract_i64(tcg_rd, tcg_tmp, ri, len);
 -            goto done;
 -        } else if (opc == 2) { /* UBFM: UBFX, LSR, UXTB, UXTH */
 -            tcg_gen_extract_i64(tcg_rd, tcg_tmp, ri, len);
 -            return;
 +        tcg_gen_sextract_i64(tcg_rd, tcg_tmp, ri, len);
 +        if (!a->sf) {
 +            tcg_gen_ext32u_i64(tcg_rd, tcg_rd);
          }
 -        /* opc == 1, BFXIL fall through to deposit */
 +    } else {
 +        /* Wd<32+s-r,32-r> = Wn<s:0> */
 +        len = si + 1;
 +        pos = (bitsize - ri) & (bitsize - 1);
 +
 +        if (len < ri) {
 +            /*
 +             * Sign extend the destination field from len to fill the
 +             * balance of the word.  Let the deposit below insert all
 +             * of those sign bits.
 +             */
 +            tcg_gen_sextract_i64(tcg_tmp, tcg_tmp, 0, len);
 +            len = ri;
 +        }
 +
 +        /*
 +         * We start with zero, and we haven't modified any bits outside
 +         * bitsize, therefore no final zero-extension is unneeded for !sf.
 +         */
 +        tcg_gen_deposit_z_i64(tcg_rd, tcg_tmp, pos, len);
 +    }
 +    return true;
 +}
 +
 +static bool trans_UBFM(DisasContext *s, arg_UBFM *a)
 +{
-+    TCGv_i32 v = tcg_temp_new_i32();
++    TCGv_i64 tcg_rd = cpu_reg(s, a->rd);
-+
++    TCGv_i64 tcg_tmp = read_cpu_reg(s, a->rn, 1);
-+    tcg_gen_ld16u_i32(v, cpu_env, fp_reg_offset(s, reg, MO_16));
++    unsigned int bitsize = a->sf ? 64 : 32;
-+    return v;
++    unsigned int ri = a->immr;
 +    unsigned int si = a->imms;
 +    unsigned int pos, len;
 +
 +    tcg_rd = cpu_reg(s, a->rd);
 +    tcg_tmp = read_cpu_reg(s, a->rn, 1);
 +
 +    if (si >= ri) {
 +        /* Wd<s-r:0> = Wn<s:r> */
 +        len = (si - ri) + 1;
 +        tcg_gen_extract_i64(tcg_rd, tcg_tmp, ri, len);
 +    } else {
 +        /* Wd<32+s-r,32-r> = Wn<s:0> */
 +        len = si + 1;
 +        pos = (bitsize - ri) & (bitsize - 1);
 +        tcg_gen_deposit_z_i64(tcg_rd, tcg_tmp, pos, len);
 +    }
 +    return true;
 +}
 +
- /* Clear the bits above an N-bit vector, for N = (is_q ? 128 : 64).
++static bool trans_BFM(DisasContext *s, arg_BFM *a)
-  * If SVE is not enabled, then there are only 128 bits in the vector.
++{
-  */
++    TCGv_i64 tcg_rd = cpu_reg(s, a->rd);
-@@ -XXX,XX +XXX,XX @@ static void disas_fp_csel(DisasContext *s, uint32_t insn)
++    TCGv_i64 tcg_tmp = read_cpu_reg(s, a->rn, 1);
- static void handle_fp_1src_half(DisasContext *s, int opcode, int rd, int rn)
++    unsigned int bitsize = a->sf ? 64 : 32;
 +    unsigned int ri = a->immr;
 +    unsigned int si = a->imms;
 +    unsigned int pos, len;
 +
 +    tcg_rd = cpu_reg(s, a->rd);
 +    tcg_tmp = read_cpu_reg(s, a->rn, 1);
 +
 +    if (si >= ri) {
 +        /* Wd<s-r:0> = Wn<s:r> */
          tcg_gen_shri_i64(tcg_tmp, tcg_tmp, ri);
 +        len = (si - ri) + 1;
          pos = 0;
      } else {
 -        /* Handle the ri > si case with a deposit
 -         * Wd<32+s-r,32-r> = Wn<s:0>
 -         */
 +        /* Wd<32+s-r,32-r> = Wn<s:0> */
          len = si + 1;
          pos = (bitsize - ri) & (bitsize - 1);
      }
 -    if (opc == 0 && len < ri) {
 -        /* SBFM: sign extend the destination field from len to fill
 -           the balance of the word.  Let the deposit below insert all
 -           of those sign bits.  */
 -        tcg_gen_sextract_i64(tcg_tmp, tcg_tmp, 0, len);
 -        len = ri;
 -    }
 -
 -    if (opc == 1) { /* BFM, BFXIL */
 -        tcg_gen_deposit_i64(tcg_rd, tcg_rd, tcg_tmp, pos, len);
 -    } else {
 -        /* SBFM or UBFM: We start with zero, and we haven't modified
 -           any bits outside bitsize, therefore the zero-extension
 -           below is unneeded.  */
 -        tcg_gen_deposit_z_i64(tcg_rd, tcg_tmp, pos, len);
 -        return;
 -    }
 -
 - done:
 -    if (!sf) { /* zero extend final result */
 +    tcg_gen_deposit_i64(tcg_rd, tcg_rd, tcg_tmp, pos, len);
 +    if (!a->sf) {
          tcg_gen_ext32u_i64(tcg_rd, tcg_rd);
      }
 +    return true;
  }
  /* Extract
@@ -XXX,XX +XXX,XX @@ static void disas_extract(DisasContext *s, uint32_t insn)
  static void disas_data_proc_imm(DisasContext *s, uint32_t insn)
  {
-     TCGv_ptr fpst = NULL;
+     switch (extract32(insn, 23, 6)) {
--    TCGv_i32 tcg_op = tcg_temp_new_i32();
+-    case 0x26: /* Bitfield */
-+    TCGv_i32 tcg_op = read_fp_hreg(s, rn);
+-        disas_bitfield(s, insn);
-     TCGv_i32 tcg_res = tcg_temp_new_i32();
+-        break;
+     case 0x27: /* Extract */
--    read_vec_element_i32(s, tcg_op, rn, 0, MO_16);
+         disas_extract(s, insn);
--
+         break;
      switch (opcode) {
      case 0x0: /* FMOV */
          tcg_gen_mov_i32(tcg_res, tcg_op);
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_diff(DisasContext *s, uint32_t insn)
          tcg_temp_free_i64(tcg_op2);
          tcg_temp_free_i64(tcg_res);
      } else {
 -        TCGv_i32 tcg_op1 = tcg_temp_new_i32();
 -        TCGv_i32 tcg_op2 = tcg_temp_new_i32();
 +        TCGv_i32 tcg_op1 = read_fp_hreg(s, rn);
 +        TCGv_i32 tcg_op2 = read_fp_hreg(s, rm);
          TCGv_i64 tcg_res = tcg_temp_new_i64();
 -        read_vec_element_i32(s, tcg_op1, rn, 0, MO_16);
 -        read_vec_element_i32(s, tcg_op2, rm, 0, MO_16);
 -
          gen_helper_neon_mull_s16(tcg_res, tcg_op1, tcg_op2);
          gen_helper_neon_addl_saturate_s32(tcg_res, cpu_env, tcg_res, tcg_res);
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_same_fp16(DisasContext *s,
      fpst = get_fpstatus_ptr(true);
 -    tcg_op1 = tcg_temp_new_i32();
 -    tcg_op2 = tcg_temp_new_i32();
 +    tcg_op1 = read_fp_hreg(s, rn);
 +    tcg_op2 = read_fp_hreg(s, rm);
      tcg_res = tcg_temp_new_i32();
 -    read_vec_element_i32(s, tcg_op1, rn, 0, MO_16);
 -    read_vec_element_i32(s, tcg_op2, rm, 0, MO_16);
 -
      switch (fpopcode) {
      case 0x03: /* FMULX */
          gen_helper_advsimd_mulxh(tcg_res, tcg_op1, tcg_op2, fpst);
@@ -XXX,XX +XXX,XX @@ static void disas_simd_two_reg_misc_fp16(DisasContext *s, uint32_t insn)
      }
      if (is_scalar) {
 -        TCGv_i32 tcg_op = tcg_temp_new_i32();
 +        TCGv_i32 tcg_op = read_fp_hreg(s, rn);
          TCGv_i32 tcg_res = tcg_temp_new_i32();
 -        read_vec_element_i32(s, tcg_op, rn, 0, MO_16);
 -
          switch (fpop) {
          case 0x1a: /* FCVTNS */
          case 0x1b: /* FCVTMS */
 --
-.17.0
+.34.1

-New patch
+[PULL 18/29] target/arm: Convert Extract instructions to decodetree
+Convert the EXTR instruction to decodetree (this is the
+only one in the 'Extract" class). This is the last of
+the dp-immediate insns in the legacy decoder, so we
+can now remove disas_data_proc_imm().
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20230512144106.3608981-13-peter.maydell@linaro.org
+---
+ target/arm/tcg/a64.decode      |  7 +++
+ target/arm/tcg/translate-a64.c | 94 +++++++++++-----------------------
+files changed, 36 insertions(+), 65 deletions(-)
+diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/tcg/a64.decode
++++ b/target/arm/tcg/a64.decode
+@@ -XXX,XX +XXX,XX @@ BFM             . 01 100110 . ...... ...... ..... ..... @bitfield_64
+ BFM             . 01 100110 . ...... ...... ..... ..... @bitfield_32
+ UBFM            . 10 100110 . ...... ...... ..... ..... @bitfield_64
+ UBFM            . 10 100110 . ...... ...... ..... ..... @bitfield_32
++
++# Extract
++
++&extract        rd rn rm imm sf
++
++EXTR            1 00 100111 1 0 rm:5 imm:6 rn:5 rd:5     &extract sf=1
++EXTR            0 00 100111 0 0 rm:5 0 imm:5 rn:5 rd:5   &extract sf=0
+diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/tcg/translate-a64.c
++++ b/target/arm/tcg/translate-a64.c
+@@ -XXX,XX +XXX,XX @@ static bool trans_BFM(DisasContext *s, arg_BFM *a)
+     return true;
+ }
+-/* Extract
+- *   31  30  29 28         23 22   21  20  16 15    10 9    5 4    0
+- * +----+------+-------------+---+----+------+--------+------+------+
+- * | sf | op21 | 1 0 0 1 1 1 | N | o0 |  Rm  |  imms  |  Rn  |  Rd  |
+- * +----+------+-------------+---+----+------+--------+------+------+
+- */
+-static void disas_extract(DisasContext *s, uint32_t insn)
++static bool trans_EXTR(DisasContext *s, arg_extract *a)
+ {
+-    unsigned int sf, n, rm, imm, rn, rd, bitsize, op21, op0;
++    TCGv_i64 tcg_rd, tcg_rm, tcg_rn;
+-    sf = extract32(insn, 31, 1);
+-    n = extract32(insn, 22, 1);
+-    rm = extract32(insn, 16, 5);
+-    imm = extract32(insn, 10, 6);
+-    rn = extract32(insn, 5, 5);
+-    rd = extract32(insn, 0, 5);
+-    op21 = extract32(insn, 29, 2);
+-    op0 = extract32(insn, 21, 1);
+-    bitsize = sf ? 64 : 32;
++    tcg_rd = cpu_reg(s, a->rd);
+-    if (sf != n || op21 || op0 || imm >= bitsize) {
+-        unallocated_encoding(s);
+-    } else {
+-        TCGv_i64 tcg_rd, tcg_rm, tcg_rn;
+-
+-        tcg_rd = cpu_reg(s, rd);
+-
+-        if (unlikely(imm == 0)) {
+-            /* tcg shl_i32/shl_i64 is undefined for 32/64 bit shifts,
+-             * so an extract from bit 0 is a special case.
+-             */
+-            if (sf) {
+-                tcg_gen_mov_i64(tcg_rd, cpu_reg(s, rm));
+-            } else {
+-                tcg_gen_ext32u_i64(tcg_rd, cpu_reg(s, rm));
+-            }
++    if (unlikely(a->imm == 0)) {
++        /*
++         * tcg shl_i32/shl_i64 is undefined for 32/64 bit shifts,
++         * so an extract from bit 0 is a special case.
++         */
++        if (a->sf) {
++            tcg_gen_mov_i64(tcg_rd, cpu_reg(s, a->rm));
+         } else {
+-            tcg_rm = cpu_reg(s, rm);
+-            tcg_rn = cpu_reg(s, rn);
++            tcg_gen_ext32u_i64(tcg_rd, cpu_reg(s, a->rm));
++        }
++    } else {
++        tcg_rm = cpu_reg(s, a->rm);
++        tcg_rn = cpu_reg(s, a->rn);
+-            if (sf) {
+-                /* Specialization to ROR happens in EXTRACT2.  */
+-                tcg_gen_extract2_i64(tcg_rd, tcg_rm, tcg_rn, imm);
++        if (a->sf) {
++            /* Specialization to ROR happens in EXTRACT2.  */
++            tcg_gen_extract2_i64(tcg_rd, tcg_rm, tcg_rn, a->imm);
++        } else {
++            TCGv_i32 t0 = tcg_temp_new_i32();
++
++            tcg_gen_extrl_i64_i32(t0, tcg_rm);
++            if (a->rm == a->rn) {
++                tcg_gen_rotri_i32(t0, t0, a->imm);
+             } else {
+-                TCGv_i32 t0 = tcg_temp_new_i32();
+-
+-                tcg_gen_extrl_i64_i32(t0, tcg_rm);
+-                if (rm == rn) {
+-                    tcg_gen_rotri_i32(t0, t0, imm);
+-                } else {
+-                    TCGv_i32 t1 = tcg_temp_new_i32();
+-                    tcg_gen_extrl_i64_i32(t1, tcg_rn);
+-                    tcg_gen_extract2_i32(t0, t0, t1, imm);
+-                }
+-                tcg_gen_extu_i32_i64(tcg_rd, t0);
++                TCGv_i32 t1 = tcg_temp_new_i32();
++                tcg_gen_extrl_i64_i32(t1, tcg_rn);
++                tcg_gen_extract2_i32(t0, t0, t1, a->imm);
+             }
++            tcg_gen_extu_i32_i64(tcg_rd, t0);
+         }
+     }
+-}
+-
+-/* Data processing - immediate */
+-static void disas_data_proc_imm(DisasContext *s, uint32_t insn)
+-{
+-    switch (extract32(insn, 23, 6)) {
+-    case 0x27: /* Extract */
+-        disas_extract(s, insn);
+-        break;
+-    default:
+-        unallocated_encoding(s);
+-        break;
+-    }
++    return true;
+ }
+ /* Shift a TCGv src by TCGv shift_amount, put result in dst.
+@@ -XXX,XX +XXX,XX @@ static bool btype_destination_ok(uint32_t insn, bool bt, int btype)
+ static void disas_a64_legacy(DisasContext *s, uint32_t insn)
+ {
+     switch (extract32(insn, 25, 4)) {
+-    case 0x8: case 0x9: /* Data processing - immediate */
+-        disas_data_proc_imm(s, insn);
+-        break;
+     case 0xa: case 0xb: /* Branch, exception generation and system insns */
+         disas_b_exc_sys(s, insn);
+         break;
+--
+.34.1

-New patch
+[PULL 19/29] target/arm: Convert unconditional branch immediate to decodetree
+Convert the unconditional branch immediate insns B and BL to
+decodetree.
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20230512144106.3608981-14-peter.maydell@linaro.org
+---
+ target/arm/tcg/a64.decode      |  9 +++++++++
+ target/arm/tcg/translate-a64.c | 31 +++++++++++--------------------
+files changed, 20 insertions(+), 20 deletions(-)
+diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/tcg/a64.decode
++++ b/target/arm/tcg/a64.decode
+@@ -XXX,XX +XXX,XX @@
+ &ri              rd imm
+ &rri_sf          rd rn imm sf
++&i               imm
+ ### Data Processing - Immediate
+@@ -XXX,XX +XXX,XX @@ UBFM            . 10 100110 . ...... ...... ..... ..... @bitfield_32
+ EXTR            1 00 100111 1 0 rm:5 imm:6 rn:5 rd:5     &extract sf=1
+ EXTR            0 00 100111 0 0 rm:5 0 imm:5 rn:5 rd:5   &extract sf=0
++
++# Branches
++
++%imm26   0:s26 !function=times_4
++@branch         . ..... .......................... &i imm=%imm26
++
++B               0 00101 .......................... @branch
++BL              1 00101 .......................... @branch
+diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/tcg/translate-a64.c
++++ b/target/arm/tcg/translate-a64.c
+@@ -XXX,XX +XXX,XX @@ static inline AArch64DecodeFn *lookup_disas_fn(const AArch64DecodeTable *table,
+  * match up with those in the manual.
+  */
+-/* Unconditional branch (immediate)
+- *   31  30       26 25                                  0
+- * +----+-----------+-------------------------------------+
+- * | op | 0 0 1 0 1 |                 imm26               |
+- * +----+-----------+-------------------------------------+
+- */
+-static void disas_uncond_b_imm(DisasContext *s, uint32_t insn)
++static bool trans_B(DisasContext *s, arg_i *a)
+ {
+-    int64_t diff = sextract32(insn, 0, 26) * 4;
+-
+-    if (insn & (1U << 31)) {
+-        /* BL Branch with link */
+-        gen_pc_plus_diff(s, cpu_reg(s, 30), curr_insn_len(s));
+-    }
+-
+-    /* B Branch / BL Branch with link */
+     reset_btype(s);
+-    gen_goto_tb(s, 0, diff);
++    gen_goto_tb(s, 0, a->imm);
++    return true;
++}
++
++static bool trans_BL(DisasContext *s, arg_i *a)
++{
++    gen_pc_plus_diff(s, cpu_reg(s, 30), curr_insn_len(s));
++    reset_btype(s);
++    gen_goto_tb(s, 0, a->imm);
++    return true;
+ }
+ /* Compare and branch (immediate)
+@@ -XXX,XX +XXX,XX @@ static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
+ static void disas_b_exc_sys(DisasContext *s, uint32_t insn)
+ {
+     switch (extract32(insn, 25, 7)) {
+-    case 0x0a: case 0x0b:
+-    case 0x4a: case 0x4b: /* Unconditional branch (immediate) */
+-        disas_uncond_b_imm(s, insn);
+-        break;
+     case 0x1a: case 0x5a: /* Compare & branch (immediate) */
+         disas_comp_b_imm(s, insn);
+         break;
+--
+.34.1

-New patch
+[PULL 20/29] target/arm: Convert CBZ, CBNZ to decodetree
+Convert the compare-and-branch-immediate insns CBZ and CBNZ
+to decodetree.
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20230512144106.3608981-15-peter.maydell@linaro.org
+---
+ target/arm/tcg/a64.decode      |  5 +++++
+ target/arm/tcg/translate-a64.c | 26 ++++++--------------------
+files changed, 11 insertions(+), 20 deletions(-)
+diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/tcg/a64.decode
++++ b/target/arm/tcg/a64.decode
+@@ -XXX,XX +XXX,XX @@ EXTR            0 00 100111 0 0 rm:5 0 imm:5 rn:5 rd:5   &extract sf=0
+ B               0 00101 .......................... @branch
+ BL              1 00101 .......................... @branch
++
++%imm19   5:s19 !function=times_4
++&cbz     rt imm sf nz
++
++CBZ             sf:1 011010 nz:1 ................... rt:5 &cbz imm=%imm19
+diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/tcg/translate-a64.c
++++ b/target/arm/tcg/translate-a64.c
+@@ -XXX,XX +XXX,XX @@ static bool trans_BL(DisasContext *s, arg_i *a)
+     return true;
+ }
+-/* Compare and branch (immediate)
+- *   31  30         25  24  23                  5 4      0
+- * +----+-------------+----+---------------------+--------+
+- * | sf | 0 1 1 0 1 0 | op |         imm19       |   Rt   |
+- * +----+-------------+----+---------------------+--------+
+- */
+-static void disas_comp_b_imm(DisasContext *s, uint32_t insn)
++
++static bool trans_CBZ(DisasContext *s, arg_cbz *a)
+ {
+-    unsigned int sf, op, rt;
+-    int64_t diff;
+     DisasLabel match;
+     TCGv_i64 tcg_cmp;
+-    sf = extract32(insn, 31, 1);
+-    op = extract32(insn, 24, 1); /* 0: CBZ; 1: CBNZ */
+-    rt = extract32(insn, 0, 5);
+-    diff = sextract32(insn, 5, 19) * 4;
+-
+-    tcg_cmp = read_cpu_reg(s, rt, sf);
++    tcg_cmp = read_cpu_reg(s, a->rt, a->sf);
+     reset_btype(s);
+     match = gen_disas_label(s);
+-    tcg_gen_brcondi_i64(op ? TCG_COND_NE : TCG_COND_EQ,
++    tcg_gen_brcondi_i64(a->nz ? TCG_COND_NE : TCG_COND_EQ,
+                         tcg_cmp, 0, match.label);
+     gen_goto_tb(s, 0, 4);
+     set_disas_label(s, match);
+-    gen_goto_tb(s, 1, diff);
++    gen_goto_tb(s, 1, a->imm);
++    return true;
+ }
+ /* Test and branch (immediate)
+@@ -XXX,XX +XXX,XX @@ static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
+ static void disas_b_exc_sys(DisasContext *s, uint32_t insn)
+ {
+     switch (extract32(insn, 25, 7)) {
+-    case 0x1a: case 0x5a: /* Compare & branch (immediate) */
+-        disas_comp_b_imm(s, insn);
+-        break;
+     case 0x1b: case 0x5b: /* Test & branch (immediate) */
+         disas_test_b_imm(s, insn);
+         break;
+--
+.34.1

-New patch
+[PULL 21/29] target/arm: Convert TBZ, TBNZ to decodetree
+Convert the test-and-branch-immediate insns TBZ and TBNZ
+to decodetree.
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20230512144106.3608981-16-peter.maydell@linaro.org
+---
+ target/arm/tcg/a64.decode      |  6 ++++++
+ target/arm/tcg/translate-a64.c | 25 +++++--------------------
+files changed, 11 insertions(+), 20 deletions(-)
+diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/tcg/a64.decode
++++ b/target/arm/tcg/a64.decode
+@@ -XXX,XX +XXX,XX @@ BL              1 00101 .......................... @branch
+ &cbz     rt imm sf nz
+ CBZ             sf:1 011010 nz:1 ................... rt:5 &cbz imm=%imm19
++
++%imm14     5:s14 !function=times_4
++%imm31_19  31:1 19:5
++&tbz       rt imm nz bitpos
++
++TBZ             . 011011 nz:1 ..... .............. rt:5 &tbz  imm=%imm14 bitpos=%imm31_19
+diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/tcg/translate-a64.c
++++ b/target/arm/tcg/translate-a64.c
+@@ -XXX,XX +XXX,XX @@ static bool trans_CBZ(DisasContext *s, arg_cbz *a)
+     return true;
+ }
+-/* Test and branch (immediate)
+- *   31  30         25  24  23   19 18          5 4    0
+- * +----+-------------+----+-------+-------------+------+
+- * | b5 | 0 1 1 0 1 1 | op |  b40  |    imm14    |  Rt  |
+- * +----+-------------+----+-------+-------------+------+
+- */
+-static void disas_test_b_imm(DisasContext *s, uint32_t insn)
++static bool trans_TBZ(DisasContext *s, arg_tbz *a)
+ {
+-    unsigned int bit_pos, op, rt;
+-    int64_t diff;
+     DisasLabel match;
+     TCGv_i64 tcg_cmp;
+-    bit_pos = (extract32(insn, 31, 1) << 5) | extract32(insn, 19, 5);
+-    op = extract32(insn, 24, 1); /* 0: TBZ; 1: TBNZ */
+-    diff = sextract32(insn, 5, 14) * 4;
+-    rt = extract32(insn, 0, 5);
+-
+     tcg_cmp = tcg_temp_new_i64();
+-    tcg_gen_andi_i64(tcg_cmp, cpu_reg(s, rt), (1ULL << bit_pos));
++    tcg_gen_andi_i64(tcg_cmp, cpu_reg(s, a->rt), 1ULL << a->bitpos);
+     reset_btype(s);
+     match = gen_disas_label(s);
+-    tcg_gen_brcondi_i64(op ? TCG_COND_NE : TCG_COND_EQ,
++    tcg_gen_brcondi_i64(a->nz ? TCG_COND_NE : TCG_COND_EQ,
+                         tcg_cmp, 0, match.label);
+     gen_goto_tb(s, 0, 4);
+     set_disas_label(s, match);
+-    gen_goto_tb(s, 1, diff);
++    gen_goto_tb(s, 1, a->imm);
++    return true;
+ }
+ /* Conditional branch (immediate)
+@@ -XXX,XX +XXX,XX @@ static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
+ static void disas_b_exc_sys(DisasContext *s, uint32_t insn)
+ {
+     switch (extract32(insn, 25, 7)) {
+-    case 0x1b: case 0x5b: /* Test & branch (immediate) */
+-        disas_test_b_imm(s, insn);
+-        break;
+     case 0x2a: /* Conditional branch (immediate) */
+         disas_cond_b_imm(s, insn);
+         break;
+--
+.34.1

-New patch
+[PULL 22/29] target/arm: Convert conditional branch insns to decodetree
+Convert the immediate conditional branch insn B.cond to
+decodetree.
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20230512144106.3608981-17-peter.maydell@linaro.org
+---
+ target/arm/tcg/a64.decode      |  2 ++
+ target/arm/tcg/translate-a64.c | 30 ++++++------------------------
+files changed, 8 insertions(+), 24 deletions(-)
+diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/tcg/a64.decode
++++ b/target/arm/tcg/a64.decode
+@@ -XXX,XX +XXX,XX @@ CBZ             sf:1 011010 nz:1 ................... rt:5 &cbz imm=%imm19
+ &tbz       rt imm nz bitpos
+ TBZ             . 011011 nz:1 ..... .............. rt:5 &tbz  imm=%imm14 bitpos=%imm31_19
++
++B_cond          0101010 0 ................... 0 cond:4 imm=%imm19
+diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/tcg/translate-a64.c
++++ b/target/arm/tcg/translate-a64.c
+@@ -XXX,XX +XXX,XX @@ static bool trans_TBZ(DisasContext *s, arg_tbz *a)
+     return true;
+ }
+-/* Conditional branch (immediate)
+- *  31           25  24  23                  5   4  3    0
+- * +---------------+----+---------------------+----+------+
+- * | 0 1 0 1 0 1 0 | o1 |         imm19       | o0 | cond |
+- * +---------------+----+---------------------+----+------+
+- */
+-static void disas_cond_b_imm(DisasContext *s, uint32_t insn)
++static bool trans_B_cond(DisasContext *s, arg_B_cond *a)
+ {
+-    unsigned int cond;
+-    int64_t diff;
+-
+-    if ((insn & (1 << 4)) || (insn & (1 << 24))) {
+-        unallocated_encoding(s);
+-        return;
+-    }
+-    diff = sextract32(insn, 5, 19) * 4;
+-    cond = extract32(insn, 0, 4);
+-
+     reset_btype(s);
+-    if (cond < 0x0e) {
++    if (a->cond < 0x0e) {
+         /* genuinely conditional branches */
+         DisasLabel match = gen_disas_label(s);
+-        arm_gen_test_cc(cond, match.label);
++        arm_gen_test_cc(a->cond, match.label);
+         gen_goto_tb(s, 0, 4);
+         set_disas_label(s, match);
+-        gen_goto_tb(s, 1, diff);
++        gen_goto_tb(s, 1, a->imm);
+     } else {
+         /* 0xe and 0xf are both "always" conditions */
+-        gen_goto_tb(s, 0, diff);
++        gen_goto_tb(s, 0, a->imm);
+     }
++    return true;
+ }
+ /* HINT instruction group, including various allocated HINTs */
+@@ -XXX,XX +XXX,XX @@ static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
+ static void disas_b_exc_sys(DisasContext *s, uint32_t insn)
+ {
+     switch (extract32(insn, 25, 7)) {
+-    case 0x2a: /* Conditional branch (immediate) */
+-        disas_cond_b_imm(s, insn);
+-        break;
+     case 0x6a: /* Exception generation / System */
+         if (insn & (1 << 24)) {
+             if (extract32(insn, 22, 2) == 0) {
+--
+.34.1

-New patch
+[PULL 23/29] target/arm: Convert BR, BLR, RET to decodetree
+Convert the simple (non-pointer-auth) BR, BLR and RET insns
+to decodetree.
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20230512144106.3608981-18-peter.maydell@linaro.org
+---
+ target/arm/tcg/a64.decode      |  5 ++++
+ target/arm/tcg/translate-a64.c | 55 ++++++++++++++++++++++++++++++----
+files changed, 54 insertions(+), 6 deletions(-)
+diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/tcg/a64.decode
++++ b/target/arm/tcg/a64.decode
+@@ -XXX,XX +XXX,XX @@
+ # This file is processed by scripts/decodetree.py
+ #
++&r               rn
+ &ri              rd imm
+ &rri_sf          rd rn imm sf
+ &i               imm
+@@ -XXX,XX +XXX,XX @@ CBZ             sf:1 011010 nz:1 ................... rt:5 &cbz imm=%imm19
+ TBZ             . 011011 nz:1 ..... .............. rt:5 &tbz  imm=%imm14 bitpos=%imm31_19
+ B_cond          0101010 0 ................... 0 cond:4 imm=%imm19
++
++BR              1101011 0000 11111 000000 rn:5 00000 &r
++BLR             1101011 0001 11111 000000 rn:5 00000 &r
++RET             1101011 0010 11111 000000 rn:5 00000 &r
+diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/tcg/translate-a64.c
++++ b/target/arm/tcg/translate-a64.c
+@@ -XXX,XX +XXX,XX @@ static bool trans_B_cond(DisasContext *s, arg_B_cond *a)
+     return true;
+ }
++static void set_btype_for_br(DisasContext *s, int rn)
++{
++    if (dc_isar_feature(aa64_bti, s)) {
++        /* BR to {x16,x17} or !guard -> 1, else 3.  */
++        set_btype(s, rn == 16 || rn == 17 || !s->guarded_page ? 1 : 3);
++    }
++}
++
++static void set_btype_for_blr(DisasContext *s)
++{
++    if (dc_isar_feature(aa64_bti, s)) {
++        /* BLR sets BTYPE to 2, regardless of source guarded page.  */
++        set_btype(s, 2);
++    }
++}
++
++static bool trans_BR(DisasContext *s, arg_r *a)
++{
++    gen_a64_set_pc(s, cpu_reg(s, a->rn));
++    set_btype_for_br(s, a->rn);
++    s->base.is_jmp = DISAS_JUMP;
++    return true;
++}
++
++static bool trans_BLR(DisasContext *s, arg_r *a)
++{
++    TCGv_i64 dst = cpu_reg(s, a->rn);
++    TCGv_i64 lr = cpu_reg(s, 30);
++    if (dst == lr) {
++        TCGv_i64 tmp = tcg_temp_new_i64();
++        tcg_gen_mov_i64(tmp, dst);
++        dst = tmp;
++    }
++    gen_pc_plus_diff(s, lr, curr_insn_len(s));
++    gen_a64_set_pc(s, dst);
++    set_btype_for_blr(s);
++    s->base.is_jmp = DISAS_JUMP;
++    return true;
++}
++
++static bool trans_RET(DisasContext *s, arg_r *a)
++{
++    gen_a64_set_pc(s, cpu_reg(s, a->rn));
++    s->base.is_jmp = DISAS_JUMP;
++    return true;
++}
++
+ /* HINT instruction group, including various allocated HINTs */
+ static void handle_hint(DisasContext *s, uint32_t insn,
+                         unsigned int op1, unsigned int op2, unsigned int crm)
+@@ -XXX,XX +XXX,XX @@ static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
+         btype_mod = opc;
+         switch (op3) {
+         case 0:
+-            /* BR, BLR, RET */
+-            if (op4 != 0) {
+-                goto do_unallocated;
+-            }
+-            dst = cpu_reg(s, rn);
+-            break;
++            /* BR, BLR, RET : handled in decodetree */
++            goto do_unallocated;
+         case 2:
+         case 3:
+--
+.34.1

-New patch
+[PULL 24/29] target/arm: Convert BRA[AB]Z, BLR[AB]Z, RETA[AB] to decodetree
+Convert the single-register pointer-authentication variants of BR,
+BLR, RET to decodetree. (BRAA/BLRAA are in a different branch of
+the legacy decoder and will be dealt with in the next commit.)
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20230512144106.3608981-19-peter.maydell@linaro.org
+---
+ target/arm/tcg/a64.decode      |   7 ++
+ target/arm/tcg/translate-a64.c | 132 +++++++++++++++++++--------------
+files changed, 84 insertions(+), 55 deletions(-)
+diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/tcg/a64.decode
++++ b/target/arm/tcg/a64.decode
+@@ -XXX,XX +XXX,XX @@ B_cond          0101010 0 ................... 0 cond:4 imm=%imm19
+ BR              1101011 0000 11111 000000 rn:5 00000 &r
+ BLR             1101011 0001 11111 000000 rn:5 00000 &r
+ RET             1101011 0010 11111 000000 rn:5 00000 &r
++
++&braz       rn m
++BRAZ            1101011 0000 11111 00001 m:1 rn:5 11111 &braz   # BRAAZ, BRABZ
++BLRAZ           1101011 0001 11111 00001 m:1 rn:5 11111 &braz   # BLRAAZ, BLRABZ
++
++&reta       m
++RETA            1101011 0010 11111 00001 m:1 11111 11111 &reta  # RETAA, RETAB
+diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/tcg/translate-a64.c
++++ b/target/arm/tcg/translate-a64.c
+@@ -XXX,XX +XXX,XX @@ static bool trans_RET(DisasContext *s, arg_r *a)
+     return true;
+ }
++static TCGv_i64 auth_branch_target(DisasContext *s, TCGv_i64 dst,
++                                   TCGv_i64 modifier, bool use_key_a)
++{
++    TCGv_i64 truedst;
++    /*
++     * Return the branch target for a BRAA/RETA/etc, which is either
++     * just the destination dst, or that value with the pauth check
++     * done and the code removed from the high bits.
++     */
++    if (!s->pauth_active) {
++        return dst;
++    }
++
++    truedst = tcg_temp_new_i64();
++    if (use_key_a) {
++        gen_helper_autia(truedst, cpu_env, dst, modifier);
++    } else {
++        gen_helper_autib(truedst, cpu_env, dst, modifier);
++    }
++    return truedst;
++}
++
++static bool trans_BRAZ(DisasContext *s, arg_braz *a)
++{
++    TCGv_i64 dst;
++
++    if (!dc_isar_feature(aa64_pauth, s)) {
++        return false;
++    }
++
++    dst = auth_branch_target(s, cpu_reg(s, a->rn), tcg_constant_i64(0), !a->m);
++    gen_a64_set_pc(s, dst);
++    set_btype_for_br(s, a->rn);
++    s->base.is_jmp = DISAS_JUMP;
++    return true;
++}
++
++static bool trans_BLRAZ(DisasContext *s, arg_braz *a)
++{
++    TCGv_i64 dst, lr;
++
++    if (!dc_isar_feature(aa64_pauth, s)) {
++        return false;
++    }
++
++    dst = auth_branch_target(s, cpu_reg(s, a->rn), tcg_constant_i64(0), !a->m);
++    lr = cpu_reg(s, 30);
++    if (dst == lr) {
++        TCGv_i64 tmp = tcg_temp_new_i64();
++        tcg_gen_mov_i64(tmp, dst);
++        dst = tmp;
++    }
++    gen_pc_plus_diff(s, lr, curr_insn_len(s));
++    gen_a64_set_pc(s, dst);
++    set_btype_for_blr(s);
++    s->base.is_jmp = DISAS_JUMP;
++    return true;
++}
++
++static bool trans_RETA(DisasContext *s, arg_reta *a)
++{
++    TCGv_i64 dst;
++
++    dst = auth_branch_target(s, cpu_reg(s, 30), cpu_X[31], !a->m);
++    gen_a64_set_pc(s, dst);
++    s->base.is_jmp = DISAS_JUMP;
++    return true;
++}
++
+ /* HINT instruction group, including various allocated HINTs */
+ static void handle_hint(DisasContext *s, uint32_t insn,
+                         unsigned int op1, unsigned int op2, unsigned int crm)
+@@ -XXX,XX +XXX,XX @@ static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
+     }
+     switch (opc) {
+-    case 0: /* BR */
+-    case 1: /* BLR */
+-    case 2: /* RET */
+-        btype_mod = opc;
+-        switch (op3) {
+-        case 0:
+-            /* BR, BLR, RET : handled in decodetree */
+-            goto do_unallocated;
+-
+-        case 2:
+-        case 3:
+-            if (!dc_isar_feature(aa64_pauth, s)) {
+-                goto do_unallocated;
+-            }
+-            if (opc == 2) {
+-                /* RETAA, RETAB */
+-                if (rn != 0x1f || op4 != 0x1f) {
+-                    goto do_unallocated;
+-                }
+-                rn = 30;
+-                modifier = cpu_X[31];
+-            } else {
+-                /* BRAAZ, BRABZ, BLRAAZ, BLRABZ */
+-                if (op4 != 0x1f) {
+-                    goto do_unallocated;
+-                }
+-                modifier = tcg_constant_i64(0);
+-            }
+-            if (s->pauth_active) {
+-                dst = tcg_temp_new_i64();
+-                if (op3 == 2) {
+-                    gen_helper_autia(dst, cpu_env, cpu_reg(s, rn), modifier);
+-                } else {
+-                    gen_helper_autib(dst, cpu_env, cpu_reg(s, rn), modifier);
+-                }
+-            } else {
+-                dst = cpu_reg(s, rn);
+-            }
+-            break;
+-
+-        default:
+-            goto do_unallocated;
+-        }
+-        /* BLR also needs to load return address */
+-        if (opc == 1) {
+-            TCGv_i64 lr = cpu_reg(s, 30);
+-            if (dst == lr) {
+-                TCGv_i64 tmp = tcg_temp_new_i64();
+-                tcg_gen_mov_i64(tmp, dst);
+-                dst = tmp;
+-            }
+-            gen_pc_plus_diff(s, lr, curr_insn_len(s));
+-        }
+-        gen_a64_set_pc(s, dst);
+-        break;
++    case 0:
++    case 1:
++    case 2:
++        /*
++         * BR, BLR, RET, RETAA, RETAB, BRAAZ, BRABZ, BLRAAZ, BLRABZ:
++         * handled in decodetree
++         */
++        goto do_unallocated;
+     case 8: /* BRAA */
+     case 9: /* BLRAA */
+--
+.34.1

-[Qemu-devel] [PULL 11/16] target/arm: Implement FCMP for fp16
+[PULL 25/29] target/arm: Convert BRAA, BRAB, BLRAA, BLRAB to decodetree
-From: Alex Bennée <alex.bennee@linaro.org>
+Convert the last four BR-with-pointer-auth insns to decodetree.
 The remaining cases in the outer switch in disas_uncond_b_reg()
 all return early rather than leaving the case statement, so we
 can delete the now-unused code at the end of that function.
-These where missed out from the rest of the half-precision work.
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
 Message-id: 20230512144106.3608981-20-peter.maydell@linaro.org
 ---
  target/arm/tcg/a64.decode      |  4 ++
  target/arm/tcg/translate-a64.c | 97 ++++++++++++++--------------------
 files changed, 43 insertions(+), 58 deletions(-)
-Cc: qemu-stable@nongnu.org
+diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
 Tested-by: Alex Bennée <alex.bennee@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
 Message-id: 20180512003217.9105-9-richard.henderson@linaro.org
 [rth: Diagnose lack of FP16 before fp_access_check]
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
  target/arm/helper-a64.h    |  2 +
  target/arm/helper-a64.c    | 10 +++++
  target/arm/translate-a64.c | 88 ++++++++++++++++++++++++++++++--------
 files changed, 83 insertions(+), 17 deletions(-)
 diff --git a/target/arm/helper-a64.h b/target/arm/helper-a64.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper-a64.h
+--- a/target/arm/tcg/a64.decode
-+++ b/target/arm/helper-a64.h
++++ b/target/arm/tcg/a64.decode
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ BLRAZ           1101011 0001 11111 00001 m:1 rn:5 11111 &braz   # BLRAAZ, BLRABZ
- DEF_HELPER_FLAGS_2(udiv64, TCG_CALL_NO_RWG_SE, i64, i64, i64)
- DEF_HELPER_FLAGS_2(sdiv64, TCG_CALL_NO_RWG_SE, s64, s64, s64)
+ &reta       m
- DEF_HELPER_FLAGS_1(rbit64, TCG_CALL_NO_RWG_SE, i64, i64)
+ RETA            1101011 0010 11111 00001 m:1 11111 11111 &reta  # RETAA, RETAB
-+DEF_HELPER_3(vfp_cmph_a64, i64, f16, f16, ptr)
++
-+DEF_HELPER_3(vfp_cmpeh_a64, i64, f16, f16, ptr)
++&bra        rn rm m
- DEF_HELPER_3(vfp_cmps_a64, i64, f32, f32, ptr)
++BRA             1101011 1000 11111 00001 m:1 rn:5 rm:5 &bra # BRAA, BRAB
- DEF_HELPER_3(vfp_cmpes_a64, i64, f32, f32, ptr)
++BLRA            1101011 1001 11111 00001 m:1 rn:5 rm:5 &bra # BLRAA, BLRAB
- DEF_HELPER_3(vfp_cmpd_a64, i64, f64, f64, ptr)
+diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
 diff --git a/target/arm/helper-a64.c b/target/arm/helper-a64.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper-a64.c
+--- a/target/arm/tcg/translate-a64.c
-+++ b/target/arm/helper-a64.c
++++ b/target/arm/tcg/translate-a64.c
-@@ -XXX,XX +XXX,XX @@ static inline uint32_t float_rel_to_flags(int res)
+@@ -XXX,XX +XXX,XX @@ static bool trans_RETA(DisasContext *s, arg_reta *a)
-     return flags;
+     return true;
  }
-+uint64_t HELPER(vfp_cmph_a64)(float16 x, float16 y, void *fp_status)
++static bool trans_BRA(DisasContext *s, arg_bra *a)
 +{
-+    return float_rel_to_flags(float16_compare_quiet(x, y, fp_status));
++    TCGv_i64 dst;
 +
 +    if (!dc_isar_feature(aa64_pauth, s)) {
 +        return false;
 +    }
 +    dst = auth_branch_target(s, cpu_reg(s,a->rn), cpu_reg_sp(s, a->rm), !a->m);
 +    gen_a64_set_pc(s, dst);
 +    set_btype_for_br(s, a->rn);
 +    s->base.is_jmp = DISAS_JUMP;
 +    return true;
 +}
 +
-+uint64_t HELPER(vfp_cmpeh_a64)(float16 x, float16 y, void *fp_status)
++static bool trans_BLRA(DisasContext *s, arg_bra *a)
 +{
-+    return float_rel_to_flags(float16_compare(x, y, fp_status));
++    TCGv_i64 dst, lr;
 +
 +    if (!dc_isar_feature(aa64_pauth, s)) {
 +        return false;
 +    }
 +    dst = auth_branch_target(s, cpu_reg(s, a->rn), cpu_reg_sp(s, a->rm), !a->m);
 +    lr = cpu_reg(s, 30);
 +    if (dst == lr) {
 +        TCGv_i64 tmp = tcg_temp_new_i64();
 +        tcg_gen_mov_i64(tmp, dst);
 +        dst = tmp;
 +    }
 +    gen_pc_plus_diff(s, lr, curr_insn_len(s));
 +    gen_a64_set_pc(s, dst);
 +    set_btype_for_blr(s);
 +    s->base.is_jmp = DISAS_JUMP;
 +    return true;
 +}
 +
- uint64_t HELPER(vfp_cmps_a64)(float32 x, float32 y, void *fp_status)
+ /* HINT instruction group, including various allocated HINTs */
  static void handle_hint(DisasContext *s, uint32_t insn,
                          unsigned int op1, unsigned int op2, unsigned int crm)
@@ -XXX,XX +XXX,XX @@ static void disas_exc(DisasContext *s, uint32_t insn)
  static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
  {
-     return float_rel_to_flags(float32_compare_quiet(x, y, fp_status));
+     unsigned int opc, op2, op3, rn, op4;
-diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
+-    unsigned btype_mod = 2;   /* 0: BR, 1: BLR, 2: other */
-index XXXXXXX..XXXXXXX 100644
+     TCGv_i64 dst;
---- a/target/arm/translate-a64.c
+     TCGv_i64 modifier;
-+++ b/target/arm/translate-a64.c
-@@ -XXX,XX +XXX,XX @@ static void disas_data_proc_reg(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
-     }
+     case 0:
- }
+     case 1:
+     case 2:
--static void handle_fp_compare(DisasContext *s, bool is_double,
++    case 8:
-+static void handle_fp_compare(DisasContext *s, int size,
++    case 9:
-                               unsigned int rn, unsigned int rm,
+         /*
-                               bool cmp_with_zero, bool signal_all_nans)
+-         * BR, BLR, RET, RETAA, RETAB, BRAAZ, BRABZ, BLRAAZ, BLRABZ:
- {
+-         * handled in decodetree
-     TCGv_i64 tcg_flags = tcg_temp_new_i64();
++         * BR, BLR, RET, RETAA, RETAB, BRAAZ, BRABZ, BLRAAZ, BLRABZ,
--    TCGv_ptr fpst = get_fpstatus_ptr(false);
++         * BRAA, BLRAA: handled in decodetree
-+    TCGv_ptr fpst = get_fpstatus_ptr(size == MO_16);
+          */
+         goto do_unallocated;
--    if (is_double) {
-+    if (size == MO_64) {
+-    case 8: /* BRAA */
-         TCGv_i64 tcg_vn, tcg_vm;
+-    case 9: /* BLRAA */
+-        if (!dc_isar_feature(aa64_pauth, s)) {
-         tcg_vn = read_fp_dreg(s, rn);
+-            goto do_unallocated;
-@@ -XXX,XX +XXX,XX @@ static void handle_fp_compare(DisasContext *s, bool is_double,
+-        }
-         tcg_temp_free_i64(tcg_vn);
+-        if ((op3 & ~1) != 2) {
-         tcg_temp_free_i64(tcg_vm);
+-            goto do_unallocated;
-     } else {
+-        }
--        TCGv_i32 tcg_vn, tcg_vm;
+-        btype_mod = opc & 1;
-+        TCGv_i32 tcg_vn = tcg_temp_new_i32();
+-        if (s->pauth_active) {
-+        TCGv_i32 tcg_vm = tcg_temp_new_i32();
+-            dst = tcg_temp_new_i64();
+-            modifier = cpu_reg_sp(s, op4);
--        tcg_vn = read_fp_sreg(s, rn);
+-            if (op3 == 2) {
-+        read_vec_element_i32(s, tcg_vn, rn, 0, size);
+-                gen_helper_autia(dst, cpu_env, cpu_reg(s, rn), modifier);
-         if (cmp_with_zero) {
+-            } else {
--            tcg_vm = tcg_const_i32(0);
+-                gen_helper_autib(dst, cpu_env, cpu_reg(s, rn), modifier);
-+            tcg_gen_movi_i32(tcg_vm, 0);
+-            }
          } else {
 -            tcg_vm = read_fp_sreg(s, rm);
 +            read_vec_element_i32(s, tcg_vm, rm, 0, size);
          }
 -        if (signal_all_nans) {
 -            gen_helper_vfp_cmpes_a64(tcg_flags, tcg_vn, tcg_vm, fpst);
 -        } else {
--            gen_helper_vfp_cmps_a64(tcg_flags, tcg_vn, tcg_vm, fpst);
+-            dst = cpu_reg(s, rn);
-+
+-        }
-+        switch (size) {
+-        /* BLRAA also needs to load return address */
-+        case MO_32:
+-        if (opc == 9) {
-+            if (signal_all_nans) {
+-            TCGv_i64 lr = cpu_reg(s, 30);
-+                gen_helper_vfp_cmpes_a64(tcg_flags, tcg_vn, tcg_vm, fpst);
+-            if (dst == lr) {
-+            } else {
+-                TCGv_i64 tmp = tcg_temp_new_i64();
-+                gen_helper_vfp_cmps_a64(tcg_flags, tcg_vn, tcg_vm, fpst);
+-                tcg_gen_mov_i64(tmp, dst);
-+            }
+-                dst = tmp;
-+            break;
+-            }
-+        case MO_16:
+-            gen_pc_plus_diff(s, lr, curr_insn_len(s));
-+            if (signal_all_nans) {
+-        }
-+                gen_helper_vfp_cmpeh_a64(tcg_flags, tcg_vn, tcg_vm, fpst);
+-        gen_a64_set_pc(s, dst);
-+            } else {
+-        break;
-+                gen_helper_vfp_cmph_a64(tcg_flags, tcg_vn, tcg_vm, fpst);
+-
-+            }
+     case 4: /* ERET */
-+            break;
+         if (s->current_el == 0) {
-+        default:
+             goto do_unallocated;
-+            g_assert_not_reached();
+@@ -XXX,XX +XXX,XX @@ static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
          }
 +
          tcg_temp_free_i32(tcg_vn);
          tcg_temp_free_i32(tcg_vm);
      }
@@ -XXX,XX +XXX,XX @@ static void handle_fp_compare(DisasContext *s, bool is_double,
  static void disas_fp_compare(DisasContext *s, uint32_t insn)
  {
      unsigned int mos, type, rm, op, rn, opc, op2r;
 +    int size;
      mos = extract32(insn, 29, 3);
 -    type = extract32(insn, 22, 2); /* 0 = single, 1 = double */
 +    type = extract32(insn, 22, 2);
      rm = extract32(insn, 16, 5);
      op = extract32(insn, 14, 2);
      rn = extract32(insn, 5, 5);
      opc = extract32(insn, 3, 2);
      op2r = extract32(insn, 0, 3);
 -    if (mos || op || op2r || type > 1) {
 +    if (mos || op || op2r) {
 +        unallocated_encoding(s);
 +        return;
 +    }
 +
 +    switch (type) {
 +    case 0:
 +        size = MO_32;
 +        break;
 +    case 1:
 +        size = MO_64;
 +        break;
 +    case 3:
 +        size = MO_16;
 +        if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
 +            break;
 +        }
 +        /* fallthru */
 +    default:
          unallocated_encoding(s);
          return;
      }
-@@ -XXX,XX +XXX,XX @@ static void disas_fp_compare(DisasContext *s, uint32_t insn)
+-
-         return;
+-    switch (btype_mod) {
-     }
+-    case 0: /* BR */
+-        if (dc_isar_feature(aa64_bti, s)) {
--    handle_fp_compare(s, type, rn, rm, opc & 1, opc & 2);
+-            /* BR to {x16,x17} or !guard -> 1, else 3.  */
-+    handle_fp_compare(s, size, rn, rm, opc & 1, opc & 2);
+-            set_btype(s, rn == 16 || rn == 17 || !s->guarded_page ? 1 : 3);
 -        }
 -        break;
 -
 -    case 1: /* BLR */
 -        if (dc_isar_feature(aa64_bti, s)) {
 -            /* BLR sets BTYPE to 2, regardless of source guarded page.  */
 -            set_btype(s, 2);
 -        }
 -        break;
 -
 -    default: /* RET or none of the above.  */
 -        /* BTYPE will be set to 0 by normal end-of-insn processing.  */
 -        break;
 -    }
 -
 -    s->base.is_jmp = DISAS_JUMP;
  }
- /* Floating point conditional compare
+ /* Branches, exception generating and system instructions */
@@ -XXX,XX +XXX,XX @@ static void disas_fp_ccomp(DisasContext *s, uint32_t insn)
      unsigned int mos, type, rm, cond, rn, op, nzcv;
      TCGv_i64 tcg_flags;
      TCGLabel *label_continue = NULL;
 +    int size;
      mos = extract32(insn, 29, 3);
 -    type = extract32(insn, 22, 2); /* 0 = single, 1 = double */
 +    type = extract32(insn, 22, 2);
      rm = extract32(insn, 16, 5);
      cond = extract32(insn, 12, 4);
      rn = extract32(insn, 5, 5);
      op = extract32(insn, 4, 1);
      nzcv = extract32(insn, 0, 4);
 -    if (mos || type > 1) {
 +    if (mos) {
 +        unallocated_encoding(s);
 +        return;
 +    }
 +
 +    switch (type) {
 +    case 0:
 +        size = MO_32;
 +        break;
 +    case 1:
 +        size = MO_64;
 +        break;
 +    case 3:
 +        size = MO_16;
 +        if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
 +            break;
 +        }
 +        /* fallthru */
 +    default:
          unallocated_encoding(s);
          return;
      }
@@ -XXX,XX +XXX,XX @@ static void disas_fp_ccomp(DisasContext *s, uint32_t insn)
          gen_set_label(label_match);
      }
 -    handle_fp_compare(s, type, rn, rm, false, op);
 +    handle_fp_compare(s, size, rn, rm, false, op);
      if (cond < 0x0e) {
          gen_set_label(label_continue);
 --
-.17.0
+.34.1

-New patch
+[PULL 26/29] target/arm: Convert ERET, ERETAA, ERETAB to decodetree
+Convert the exception-return insns ERET, ERETA and ERETB to
+decodetree. These were the last insns left in the legacy
+decoder function disas_uncond_reg_b(), which allows us to
+remove it.
+The old decoder explicitly decoded the DRPS instruction,
+only in order to call unallocated_encoding() on it, exactly
+as would have happened if it hadn't decoded it. This is
+because this insn always UNDEFs unless the CPU is in
+halting-debug state, which we don't emulate. So we list
+the pattern in a comment in a64.decode, but don't actively
+decode it.
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20230512144106.3608981-21-peter.maydell@linaro.org
+---
+ target/arm/tcg/a64.decode      |   8 ++
+ target/arm/tcg/translate-a64.c | 163 +++++++++++----------------------
+files changed, 63 insertions(+), 108 deletions(-)
+diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/tcg/a64.decode
++++ b/target/arm/tcg/a64.decode
+@@ -XXX,XX +XXX,XX @@ RETA            1101011 0010 11111 00001 m:1 11111 11111 &reta  # RETAA, RETAB
+ &bra        rn rm m
+ BRA             1101011 1000 11111 00001 m:1 rn:5 rm:5 &bra # BRAA, BRAB
+ BLRA            1101011 1001 11111 00001 m:1 rn:5 rm:5 &bra # BLRAA, BLRAB
++
++ERET            1101011 0100 11111 000000 11111 00000
++ERETA           1101011 0100 11111 00001 m:1 11111 11111 &reta  # ERETAA, ERETAB
++
++# We don't need to decode DRPS because it always UNDEFs except when
++# the processor is in halting debug state (which we don't implement).
++# The pattern is listed here as documentation.
++# DRPS            1101011 0101 11111 000000 11111 00000
+diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/tcg/translate-a64.c
++++ b/target/arm/tcg/translate-a64.c
+@@ -XXX,XX +XXX,XX @@ static bool trans_BLRA(DisasContext *s, arg_bra *a)
+     return true;
+ }
++static bool trans_ERET(DisasContext *s, arg_ERET *a)
++{
++    TCGv_i64 dst;
++
++    if (s->current_el == 0) {
++        return false;
++    }
++    if (s->fgt_eret) {
++        gen_exception_insn_el(s, 0, EXCP_UDEF, 0, 2);
++        return true;
++    }
++    dst = tcg_temp_new_i64();
++    tcg_gen_ld_i64(dst, cpu_env,
++                   offsetof(CPUARMState, elr_el[s->current_el]));
++
++    if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
++        gen_io_start();
++    }
++
++    gen_helper_exception_return(cpu_env, dst);
++    /* Must exit loop to check un-masked IRQs */
++    s->base.is_jmp = DISAS_EXIT;
++    return true;
++}
++
++static bool trans_ERETA(DisasContext *s, arg_reta *a)
++{
++    TCGv_i64 dst;
++
++    if (!dc_isar_feature(aa64_pauth, s)) {
++        return false;
++    }
++    if (s->current_el == 0) {
++        return false;
++    }
++    /* The FGT trap takes precedence over an auth trap. */
++    if (s->fgt_eret) {
++        gen_exception_insn_el(s, 0, EXCP_UDEF, a->m ? 3 : 2, 2);
++        return true;
++    }
++    dst = tcg_temp_new_i64();
++    tcg_gen_ld_i64(dst, cpu_env,
++                   offsetof(CPUARMState, elr_el[s->current_el]));
++
++    dst = auth_branch_target(s, dst, cpu_X[31], !a->m);
++    if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
++        gen_io_start();
++    }
++
++    gen_helper_exception_return(cpu_env, dst);
++    /* Must exit loop to check un-masked IRQs */
++    s->base.is_jmp = DISAS_EXIT;
++    return true;
++}
++
+ /* HINT instruction group, including various allocated HINTs */
+ static void handle_hint(DisasContext *s, uint32_t insn,
+                         unsigned int op1, unsigned int op2, unsigned int crm)
+@@ -XXX,XX +XXX,XX @@ static void disas_exc(DisasContext *s, uint32_t insn)
+     }
+ }
+-/* Unconditional branch (register)
+- *  31           25 24   21 20   16 15   10 9    5 4     0
+- * +---------------+-------+-------+-------+------+-------+
+- * | 1 1 0 1 0 1 1 |  opc  |  op2  |  op3  |  Rn  |  op4  |
+- * +---------------+-------+-------+-------+------+-------+
+- */
+-static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
+-{
+-    unsigned int opc, op2, op3, rn, op4;
+-    TCGv_i64 dst;
+-    TCGv_i64 modifier;
+-
+-    opc = extract32(insn, 21, 4);
+-    op2 = extract32(insn, 16, 5);
+-    op3 = extract32(insn, 10, 6);
+-    rn = extract32(insn, 5, 5);
+-    op4 = extract32(insn, 0, 5);
+-
+-    if (op2 != 0x1f) {
+-        goto do_unallocated;
+-    }
+-
+-    switch (opc) {
+-    case 0:
+-    case 1:
+-    case 2:
+-    case 8:
+-    case 9:
+-        /*
+-         * BR, BLR, RET, RETAA, RETAB, BRAAZ, BRABZ, BLRAAZ, BLRABZ,
+-         * BRAA, BLRAA: handled in decodetree
+-         */
+-        goto do_unallocated;
+-
+-    case 4: /* ERET */
+-        if (s->current_el == 0) {
+-            goto do_unallocated;
+-        }
+-        switch (op3) {
+-        case 0: /* ERET */
+-            if (op4 != 0) {
+-                goto do_unallocated;
+-            }
+-            if (s->fgt_eret) {
+-                gen_exception_insn_el(s, 0, EXCP_UDEF, syn_erettrap(op3), 2);
+-                return;
+-            }
+-            dst = tcg_temp_new_i64();
+-            tcg_gen_ld_i64(dst, cpu_env,
+-                           offsetof(CPUARMState, elr_el[s->current_el]));
+-            break;
+-
+-        case 2: /* ERETAA */
+-        case 3: /* ERETAB */
+-            if (!dc_isar_feature(aa64_pauth, s)) {
+-                goto do_unallocated;
+-            }
+-            if (rn != 0x1f || op4 != 0x1f) {
+-                goto do_unallocated;
+-            }
+-            /* The FGT trap takes precedence over an auth trap. */
+-            if (s->fgt_eret) {
+-                gen_exception_insn_el(s, 0, EXCP_UDEF, syn_erettrap(op3), 2);
+-                return;
+-            }
+-            dst = tcg_temp_new_i64();
+-            tcg_gen_ld_i64(dst, cpu_env,
+-                           offsetof(CPUARMState, elr_el[s->current_el]));
+-            if (s->pauth_active) {
+-                modifier = cpu_X[31];
+-                if (op3 == 2) {
+-                    gen_helper_autia(dst, cpu_env, dst, modifier);
+-                } else {
+-                    gen_helper_autib(dst, cpu_env, dst, modifier);
+-                }
+-            }
+-            break;
+-
+-        default:
+-            goto do_unallocated;
+-        }
+-        if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
+-            gen_io_start();
+-        }
+-
+-        gen_helper_exception_return(cpu_env, dst);
+-        /* Must exit loop to check un-masked IRQs */
+-        s->base.is_jmp = DISAS_EXIT;
+-        return;
+-
+-    case 5: /* DRPS */
+-        if (op3 != 0 || op4 != 0 || rn != 0x1f) {
+-            goto do_unallocated;
+-        } else {
+-            unallocated_encoding(s);
+-        }
+-        return;
+-
+-    default:
+-    do_unallocated:
+-        unallocated_encoding(s);
+-        return;
+-    }
+-}
+-
+ /* Branches, exception generating and system instructions */
+ static void disas_b_exc_sys(DisasContext *s, uint32_t insn)
+ {
+@@ -XXX,XX +XXX,XX @@ static void disas_b_exc_sys(DisasContext *s, uint32_t insn)
+             disas_exc(s, insn);
+         }
+         break;
+-    case 0x6b: /* Unconditional branch (register) */
+-        disas_uncond_b_reg(s, insn);
+-        break;
+     default:
+         unallocated_encoding(s);
+         break;
+--
+.34.1

-[Qemu-devel] [PULL 16/16] tcg: Optionally log FPU state in TCG -d cpu logging
+[PULL 27/29] target/arm: Saturate L2CTLR_EL1 core count field rather than overflowing
-Usually the logging of the CPU state produced by -d cpu is sufficient
+The IMPDEF sysreg L2CTLR_EL1 found on the Cortex-A35, A53, A57, A72
-to diagnose problems, but sometimes you want to see the state of
+and which we (arguably dubiously) also provide in '-cpu max' has a
-the floating point registers as well. We don't want to enable that
+bit field for the number of processors in the cluster. On real
-by default as it adds a lot of extra data to the log; instead,
+hardware this must be sufficient because it can only be configured
-allow it to be optionally enabled via -d fpu.
+with up to 4 CPUs in the cluster. However on QEMU if the board code
 does not explicitly configure the code into clusters with the right
 CPU count we default to "give the value assuming that all CPUs in
 the system are in a single cluster", which might be too big to fit
 in the field.
 Instead of just overflowing this 2-bit field, saturate to 3 (meaning
 "4 CPUs", so at least we don't overwrite other fields in the register.
 It's unlikely that any guest code really cares about the value in
 this field; at least, if it does it probably also wants the system
 to be more closely matching real hardware, i.e. not to have more
 than 4 CPUs.
 This issue has been present since the L2CTLR was first added in
 commit 377a44ec8f2fac5b back in 2014. It was only noticed because
 Coverity complains (CID 1509227) that the shift might overflow 32 bits
 and inadvertently sign extend into the top half of the 64 bit value.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20180510130024.31678-1-peter.maydell@linaro.org
+Message-id: 20230512170223.3801643-2-peter.maydell@linaro.org
 ---
- include/qemu/log.h   | 1 +
+ target/arm/cortex-regs.c | 11 +++++++++--
- accel/tcg/cpu-exec.c | 9 ++++++---
+file changed, 9 insertions(+), 2 deletions(-)
  util/log.c           | 2 ++
 files changed, 9 insertions(+), 3 deletions(-)
-diff --git a/include/qemu/log.h b/include/qemu/log.h
+diff --git a/target/arm/cortex-regs.c b/target/arm/cortex-regs.c
 index XXXXXXX..XXXXXXX 100644
---- a/include/qemu/log.h
+--- a/target/arm/cortex-regs.c
-+++ b/include/qemu/log.h
++++ b/target/arm/cortex-regs.c
-@@ -XXX,XX +XXX,XX @@ static inline bool qemu_log_separate(void)
+@@ -XXX,XX +XXX,XX @@ static uint64_t l2ctlr_read(CPUARMState *env, const ARMCPRegInfo *ri)
- #define CPU_LOG_PAGE       (1 << 14)
+ {
- /* LOG_TRACE (1 << 15) is defined in log-for-trace.h */
+     ARMCPU *cpu = env_archcpu(env);
- #define CPU_LOG_TB_OP_IND  (1 << 16)
-+#define CPU_LOG_TB_FPU     (1 << 17)
+-    /* Number of cores is in [25:24]; otherwise we RAZ */
+-    return (cpu->core_count - 1) << 24;
- /* Lock output for a series of related logs.  Since this is not needed
++    /*
-  * for a single qemu_log / qemu_log_mask / qemu_log_mask_and_addr, we
++     * Number of cores is in [25:24]; otherwise we RAZ.
-diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
++     * If the board didn't configure the CPUs into clusters,
-index XXXXXXX..XXXXXXX 100644
++     * we default to "all CPUs in one cluster", which might be
---- a/accel/tcg/cpu-exec.c
++     * more than the 4 that the hardware permits and which is
-+++ b/accel/tcg/cpu-exec.c
++     * all you can report in this two-bit field. Saturate to
-@@ -XXX,XX +XXX,XX @@ static inline tcg_target_ulong cpu_tb_exec(CPUState *cpu, TranslationBlock *itb)
++     * 0b11 (== 4 CPUs) rather than overflowing the field.
-     if (qemu_loglevel_mask(CPU_LOG_TB_CPU)
++     */
-         && qemu_log_in_addr_range(itb->pc)) {
++    return MIN(cpu->core_count - 1, 3) << 24;
-         qemu_log_lock();
+ }
-+        int flags = 0;
-+        if (qemu_loglevel_mask(CPU_LOG_TB_FPU)) {
+ static const ARMCPRegInfo cortex_a72_a57_a53_cp_reginfo[] = {
 +            flags |= CPU_DUMP_FPU;
 +        }
  #if defined(TARGET_I386)
 -        log_cpu_state(cpu, CPU_DUMP_CCOP);
 -#else
 -        log_cpu_state(cpu, 0);
 +        flags |= CPU_DUMP_CCOP;
  #endif
 +        log_cpu_state(cpu, flags);
          qemu_log_unlock();
      }
  #endif /* DEBUG_DISAS */
 diff --git a/util/log.c b/util/log.c
 index XXXXXXX..XXXXXXX 100644
 --- a/util/log.c
 +++ b/util/log.c
@@ -XXX,XX +XXX,XX @@ const QEMULogItem qemu_log_items[] = {
        "show trace before each executed TB (lots of logs)" },
      { CPU_LOG_TB_CPU, "cpu",
        "show CPU registers before entering a TB (lots of logs)" },
 +    { CPU_LOG_TB_FPU, "fpu",
 +      "include FPU registers in the 'cpu' logging" },
      { CPU_LOG_MMU, "mmu",
        "log MMU-related activities" },
      { CPU_LOG_PCALL, "pcall",
 --
-.17.0
+.34.1

-[Qemu-devel] [PULL 02/16] fpu/softfloat: Don't set Invalid for float-to-int(MAXINT)
+[PULL 28/29] hw/arm/vexpress: Avoid trivial memory leak of 'flashalias'
-In float-to-integer conversion, if the floating point input
+In the vexpress board code, we allocate a new MemoryRegion at the top
-converts exactly to the largest or smallest integer that
+of vexpress_common_init() but only set it up and use it inside the
-fits in to the result type, this is not an overflow.
+"if (map[VE_NORFLASHALIAS] != -1)" conditional, so we leak it if not.
-In this situation we were producing the correct result value,
+This isn't a very interesting leak as it's a tiny amount of memory
-but were incorrectly setting the Invalid flag.
+once at startup, but it's easy to fix.
 For example for Arm A64, "FCVTAS w0, d0" on an input of
 x41dfffffffc00000 should produce 0x7fffffff and set no flags.
-Fix the boundary case to take the right half of the if()
+We could silence Coverity simply by moving the g_new() into the
-statements.
+if() block, but this use of g_new(MemoryRegion, 1) is a legacy from
 when this board model was originally written; we wouldn't do that
 if we wrote it today. The MemoryRegions are conceptually a part of
 the board and must not go away until the whole board is done with
 (at the end of the simulation), so they belong in its state struct.
-This fixes a regression from 2.11 introduced by the softfloat
+This machine already has a VexpressMachineState struct that extends
-refactoring.
+MachineState, so statically put the MemoryRegions in there instead of
 dynamically allocating them separately at runtime.
-Cc: qemu-stable@nongnu.org
+Spotted by Coverity (CID 1509083).
-Fixes: ab52f973a50
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20180510140141.12120-1-peter.maydell@linaro.org
+Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
 Message-id: 20230512170223.3801643-3-peter.maydell@linaro.org
 ---
- fpu/softfloat.c | 4 ++--
+ hw/arm/vexpress.c | 40 ++++++++++++++++++++--------------------
-file changed, 2 insertions(+), 2 deletions(-)
+file changed, 20 insertions(+), 20 deletions(-)
-diff --git a/fpu/softfloat.c b/fpu/softfloat.c
+diff --git a/hw/arm/vexpress.c b/hw/arm/vexpress.c
 index XXXXXXX..XXXXXXX 100644
---- a/fpu/softfloat.c
+--- a/hw/arm/vexpress.c
-+++ b/fpu/softfloat.c
++++ b/hw/arm/vexpress.c
-@@ -XXX,XX +XXX,XX @@ static int64_t round_to_int_and_pack(FloatParts in, int rmode,
+@@ -XXX,XX +XXX,XX @@ struct VexpressMachineClass {
-             r = UINT64_MAX;
-         }
+ struct VexpressMachineState {
-         if (p.sign) {
+     MachineState parent;
--            if (r < -(uint64_t) min) {
++    MemoryRegion vram;
-+            if (r <= -(uint64_t) min) {
++    MemoryRegion sram;
-                 return -r;
++    MemoryRegion flashalias;
-             } else {
++    MemoryRegion lowram;
-                 s->float_exception_flags = orig_flags | float_flag_invalid;
++    MemoryRegion a15sram;
-                 return min;
+     bool secure;
-             }
+     bool virt;
-         } else {
+ };
--            if (r < max) {
+@@ -XXX,XX +XXX,XX @@ struct VexpressMachineState {
-+            if (r <= max) {
+ #define TYPE_VEXPRESS_A15_MACHINE   MACHINE_TYPE_NAME("vexpress-a15")
-                 return r;
+ OBJECT_DECLARE_TYPE(VexpressMachineState, VexpressMachineClass, VEXPRESS_MACHINE)
-             } else {
-                 s->float_exception_flags = orig_flags | float_flag_invalid;
+-typedef void DBoardInitFn(const VexpressMachineState *machine,
 +typedef void DBoardInitFn(VexpressMachineState *machine,
                            ram_addr_t ram_size,
                            const char *cpu_type,
                            qemu_irq *pic);
@@ -XXX,XX +XXX,XX @@ static void init_cpus(MachineState *ms, const char *cpu_type,
      }
  }
 -static void a9_daughterboard_init(const VexpressMachineState *vms,
 +static void a9_daughterboard_init(VexpressMachineState *vms,
                                    ram_addr_t ram_size,
                                    const char *cpu_type,
                                    qemu_irq *pic)
  {
      MachineState *machine = MACHINE(vms);
      MemoryRegion *sysmem = get_system_memory();
 -    MemoryRegion *lowram = g_new(MemoryRegion, 1);
      ram_addr_t low_ram_size;
      if (ram_size > 0x40000000) {
@@ -XXX,XX +XXX,XX @@ static void a9_daughterboard_init(const VexpressMachineState *vms,
       * address space should in theory be remappable to various
       * things including ROM or RAM; we always map the RAM there.
       */
 -    memory_region_init_alias(lowram, NULL, "vexpress.lowmem", machine->ram,
 -                             0, low_ram_size);
 -    memory_region_add_subregion(sysmem, 0x0, lowram);
 +    memory_region_init_alias(&vms->lowram, NULL, "vexpress.lowmem",
 +                             machine->ram, 0, low_ram_size);
 +    memory_region_add_subregion(sysmem, 0x0, &vms->lowram);
      memory_region_add_subregion(sysmem, 0x60000000, machine->ram);
      /* 0x1e000000 A9MPCore (SCU) private memory region */
@@ -XXX,XX +XXX,XX @@ static VEDBoardInfo a9_daughterboard = {
      .init = a9_daughterboard_init,
  };
 -static void a15_daughterboard_init(const VexpressMachineState *vms,
 +static void a15_daughterboard_init(VexpressMachineState *vms,
                                     ram_addr_t ram_size,
                                     const char *cpu_type,
                                     qemu_irq *pic)
  {
      MachineState *machine = MACHINE(vms);
      MemoryRegion *sysmem = get_system_memory();
 -    MemoryRegion *sram = g_new(MemoryRegion, 1);
      {
          /* We have to use a separate 64 bit variable here to avoid the gcc
@@ -XXX,XX +XXX,XX @@ static void a15_daughterboard_init(const VexpressMachineState *vms,
      /* 0x2b060000: SP805 watchdog: not modelled */
      /* 0x2b0a0000: PL341 dynamic memory controller: not modelled */
      /* 0x2e000000: system SRAM */
 -    memory_region_init_ram(sram, NULL, "vexpress.a15sram", 0x10000,
 +    memory_region_init_ram(&vms->a15sram, NULL, "vexpress.a15sram", 0x10000,
                             &error_fatal);
 -    memory_region_add_subregion(sysmem, 0x2e000000, sram);
 +    memory_region_add_subregion(sysmem, 0x2e000000, &vms->a15sram);
      /* 0x7ffb0000: DMA330 DMA controller: not modelled */
      /* 0x7ffd0000: PL354 static memory controller: not modelled */
@@ -XXX,XX +XXX,XX @@ static void vexpress_common_init(MachineState *machine)
      I2CBus *i2c;
      ram_addr_t vram_size, sram_size;
      MemoryRegion *sysmem = get_system_memory();
 -    MemoryRegion *vram = g_new(MemoryRegion, 1);
 -    MemoryRegion *sram = g_new(MemoryRegion, 1);
 -    MemoryRegion *flashalias = g_new(MemoryRegion, 1);
 -    MemoryRegion *flash0mem;
      const hwaddr *map = daughterboard->motherboard_map;
      int i;
@@ -XXX,XX +XXX,XX @@ static void vexpress_common_init(MachineState *machine)
      if (map[VE_NORFLASHALIAS] != -1) {
          /* Map flash 0 as an alias into low memory */
 +        MemoryRegion *flash0mem;
          flash0mem = sysbus_mmio_get_region(SYS_BUS_DEVICE(pflash0), 0);
 -        memory_region_init_alias(flashalias, NULL, "vexpress.flashalias",
 +        memory_region_init_alias(&vms->flashalias, NULL, "vexpress.flashalias",
                                   flash0mem, 0, VEXPRESS_FLASH_SIZE);
 -        memory_region_add_subregion(sysmem, map[VE_NORFLASHALIAS], flashalias);
 +        memory_region_add_subregion(sysmem, map[VE_NORFLASHALIAS], &vms->flashalias);
      }
      dinfo = drive_get(IF_PFLASH, 0, 1);
      ve_pflash_cfi01_register(map[VE_NORFLASH1], "vexpress.flash1", dinfo);
      sram_size = 0x2000000;
 -    memory_region_init_ram(sram, NULL, "vexpress.sram", sram_size,
 +    memory_region_init_ram(&vms->sram, NULL, "vexpress.sram", sram_size,
                             &error_fatal);
 -    memory_region_add_subregion(sysmem, map[VE_SRAM], sram);
 +    memory_region_add_subregion(sysmem, map[VE_SRAM], &vms->sram);
      vram_size = 0x800000;
 -    memory_region_init_ram(vram, NULL, "vexpress.vram", vram_size,
 +    memory_region_init_ram(&vms->vram, NULL, "vexpress.vram", vram_size,
                             &error_fatal);
 -    memory_region_add_subregion(sysmem, map[VE_VIDEORAM], vram);
 +    memory_region_add_subregion(sysmem, map[VE_VIDEORAM], &vms->vram);
      /* 0x4e000000 LAN9118 Ethernet */
      if (nd_table[0].used) {
 --
-.17.0
+.34.1

-New patch
+[PULL 29/29] docs: Convert u2f.txt to rST
+Convert the u2f.txt file to rST, and place it in the right place
 in our manual layout. The old text didn't fit very well into our
 manual style, so the new version ends up looking like a rewrite,
 although some of the original text is preserved:
  * the 'building' section of the old file is removed, since we
    generally assume that users have already built QEMU
  * some rather verbose text has been cut back
  * document the passthrough device first, on the assumption
    that's most likely to be of interest to users
  * cut back on the duplication of text between sections
  * format example command lines etc with rST
 As it's a short document it seemed simplest to do this all
 in one go rather than try to do a minimal syntactic conversion
 and then clean up the wording and layout.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Thomas Huth <thuth@redhat.com>
 Message-id: 20230421163734.1152076-1-peter.maydell@linaro.org
 ---
  docs/system/device-emulation.rst |   1 +
  docs/system/devices/usb-u2f.rst  |  93 ++++++++++++++++++++++++++
  docs/system/devices/usb.rst      |   2 +-
  docs/u2f.txt                     | 110 -------------------------------
 files changed, 95 insertions(+), 111 deletions(-)
  create mode 100644 docs/system/devices/usb-u2f.rst
  delete mode 100644 docs/u2f.txt
 diff --git a/docs/system/device-emulation.rst b/docs/system/device-emulation.rst
 index XXXXXXX..XXXXXXX 100644
 --- a/docs/system/device-emulation.rst
 +++ b/docs/system/device-emulation.rst
@@ -XXX,XX +XXX,XX @@ Emulated Devices
     devices/virtio-pmem.rst
     devices/vhost-user-rng.rst
     devices/canokey.rst
 +   devices/usb-u2f.rst
     devices/igb.rst
 diff --git a/docs/system/devices/usb-u2f.rst b/docs/system/devices/usb-u2f.rst
 new file mode 100644
 index XXXXXXX..XXXXXXX
 --- /dev/null
 +++ b/docs/system/devices/usb-u2f.rst
@@ -XXX,XX +XXX,XX @@
 +Universal Second Factor (U2F) USB Key Device
 +============================================
 +
 +U2F is an open authentication standard that enables relying parties
 +exposed to the internet to offer a strong second factor option for end
 +user authentication.
 +
 +The second factor is provided by a device implementing the U2F
 +protocol. In case of a USB U2F security key, it is a USB HID device
 +that implements the U2F protocol.
 +
 +QEMU supports both pass-through of a host U2F key device to a VM,
 +and software emulation of a U2F key.
 +
 +``u2f-passthru``
 +----------------
 +
 +The ``u2f-passthru`` device allows you to connect a real hardware
 +U2F key on your host to a guest VM. All requests made from the guest
 +are passed through to the physical security key connected to the
 +host machine and vice versa.
 +
 +In addition, the dedicated pass-through allows you to share a single
 +U2F security key with several guest VMs, which is not possible with a
 +simple host device assignment pass-through.
 +
 +You can specify the host U2F key to use with the ``hidraw``
 +option, which takes the host path to a Linux ``/dev/hidrawN`` device:
 +
 +.. parsed-literal::
 +   |qemu_system| -usb -device u2f-passthru,hidraw=/dev/hidraw0
 +
 +If you don't specify the device, the ``u2f-passthru`` device will
 +autoscan to take the first U2F device it finds on the host (this
 +requires a working libudev):
 +
 +.. parsed-literal::
 +   |qemu_system| -usb -device u2f-passthru
 +
 +``u2f-emulated``
 +----------------
 +
 +``u2f-emulated`` is a completely software emulated U2F device.
 +It uses `libu2f-emu <https://github.com/MattGorko/libu2f-emu>`__
 +for the U2F key emulation. libu2f-emu
 +provides a complete implementation of the U2F protocol device part for
 +all specified transports given by the FIDO Alliance.
 +
 +To work, an emulated U2F device must have four elements:
 +
 + * ec x509 certificate
 + * ec private key
 + * counter (four bytes value)
 + * 48 bytes of entropy (random bits)
 +
 +To use this type of device, these have to be configured, and these
 +four elements must be passed one way or another.
 +
 +Assuming that you have a working libu2f-emu installed on the host,
 +there are three possible ways to configure the ``u2f-emulated`` device:
 +
 + * ephemeral
 + * setup directory
 + * manual
 +
 +Ephemeral is the simplest way to configure; it lets the device generate
 +all the elements it needs for a single use of the lifetime of the device.
 +It is the default if you do not pass any other options to the device.
 +
 +.. parsed-literal::
 +   |qemu_system| -usb -device u2f-emulated
 +
 +You can pass the device the path of a setup directory on the host
 +using the ``dir`` option; the directory must contain these four files:
 +
 + * ``certificate.pem``: ec x509 certificate
 + * ``private-key.pem``: ec private key
 + * ``counter``: counter value
 + * ``entropy``: 48 bytes of entropy
 +
 +.. parsed-literal::
 +   |qemu_system| -usb -device u2f-emulated,dir=$dir
 +
 +You can also manually pass the device the paths to each of these files,
 +if you don't want them all to be in the same directory, using the options
 +
 + * ``cert``
 + * ``priv``
 + * ``counter``
 + * ``entropy``
 +
 +.. parsed-literal::
 +   |qemu_system| -usb -device u2f-emulated,cert=$DIR1/$FILE1,priv=$DIR2/$FILE2,counter=$DIR3/$FILE3,entropy=$DIR4/$FILE4
 diff --git a/docs/system/devices/usb.rst b/docs/system/devices/usb.rst
 index XXXXXXX..XXXXXXX 100644
 --- a/docs/system/devices/usb.rst
 +++ b/docs/system/devices/usb.rst
@@ -XXX,XX +XXX,XX @@ option or the ``device_add`` monitor command. Available devices are:
     USB audio device
  ``u2f-{emulated,passthru}``
 -   Universal Second Factor device
 +   :doc:`usb-u2f`
  ``canokey``
     An Open-source Secure Key implementing FIDO2, OpenPGP, PIV and more.
 diff --git a/docs/u2f.txt b/docs/u2f.txt
 deleted file mode 100644
 index XXXXXXX..XXXXXXX
 --- a/docs/u2f.txt
 +++ /dev/null
@@ -XXX,XX +XXX,XX @@
 -QEMU U2F Key Device Documentation.
 -
 -Contents
 -1. USB U2F key device
 -2. Building
 -3. Using u2f-emulated
 -4. Using u2f-passthru
 -5. Libu2f-emu
 -
 -1. USB U2F key device
 -
 -U2F is an open authentication standard that enables relying parties
 -exposed to the internet to offer a strong second factor option for end
 -user authentication.
 -
 -The standard brings many advantages to both parties, client and server,
 -allowing to reduce over-reliance on passwords, it increases authentication
 -security and simplifies passwords.
 -
 -The second factor is materialized by a device implementing the U2F
 -protocol. In case of a USB U2F security key, it is a USB HID device
 -that implements the U2F protocol.
 -
 -In QEMU, the USB U2F key device offers a dedicated support of U2F, allowing
 -guest USB FIDO/U2F security keys operating in two possible modes:
 -pass-through and emulated.
 -
 -The pass-through mode consists of passing all requests made from the guest
 -to the physical security key connected to the host machine and vice versa.
 -In addition, the dedicated pass-through allows to have a U2F security key
 -shared on several guests which is not possible with a simple host device
 -assignment pass-through.
 -
 -The emulated mode consists of completely emulating the behavior of an
 -U2F device through software part. Libu2f-emu is used for that.
 -
 -
 -2. Building
 -
 -To ensure the build of the u2f-emulated device variant which depends
 -on libu2f-emu: configuring and building:
 -
 -    ./configure --enable-u2f && make
 -
 -The pass-through mode is built by default on Linux. To take advantage
 -of the autoscan option it provides, make sure you have a working libudev
 -installed on the host.
 -
 -
 -3. Using u2f-emulated
 -
 -To work, an emulated U2F device must have four elements:
 - * ec x509 certificate
 - * ec private key
 - * counter (four bytes value)
 - * 48 bytes of entropy (random bits)
 -
 -To use this type of device, this one has to be configured, and these
 -four elements must be passed one way or another.
 -
 -Assuming that you have a working libu2f-emu installed on the host.
 -There are three possible ways of configurations:
 - * ephemeral
 - * setup directory
 - * manual
 -
 -Ephemeral is the simplest way to configure, it lets the device generate
 -all the elements it needs for a single use of the lifetime of the device.
 -
 -    qemu -usb -device u2f-emulated
 -
 -Setup directory allows to configure the device from a directory containing
 -four files:
 - * certificate.pem: ec x509 certificate
 - * private-key.pem: ec private key
 - * counter: counter value
 - * entropy: 48 bytes of entropy
 -
 -    qemu -usb -device u2f-emulated,dir=$dir
 -
 -Manual allows to configure the device more finely by specifying each
 -of the elements necessary for the device:
 - * cert
 - * priv
 - * counter
 - * entropy
 -
 -    qemu -usb -device u2f-emulated,cert=$DIR1/$FILE1,priv=$DIR2/$FILE2,counter=$DIR3/$FILE3,entropy=$DIR4/$FILE4
 -
 -
 -4. Using u2f-passthru
 -
 -On the host specify the u2f-passthru device with a suitable hidraw:
 -
 -    qemu -usb -device u2f-passthru,hidraw=/dev/hidraw0
 -
 -Alternately, the u2f-passthru device can autoscan to take the first
 -U2F device it finds on the host (this requires a working libudev):
 -
 -    qemu -usb -device u2f-passthru
 -
 -
 -5. Libu2f-emu
 -
 -The u2f-emulated device uses libu2f-emu for the U2F key emulation. Libu2f-emu
 -implements completely the U2F protocol device part for all specified
 -transport given by the FIDO Alliance.
 -
 -For more information about libu2f-emu see this page:
 -https://github.com/MattGorko/libu2f-emu.
 --
 .34.1

The following changes since commit ad1b4ec39caa5b3f17cbd8160283a03a3dcfe2ae:

Merge remote-tracking branch 'remotes/kraxel/tags/input-20180515-pull-request' into staging (2018-05-15 12:50:06 +0100)

are available in the Git repository at:

git://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20180515

for you to fetch changes up to ae7651804748c6b479d5ae09aeac4edb9c44f76e:

tcg: Optionally log FPU state in TCG -d cpu logging (2018-05-15 14:58:44 +0100)

----------------------------------------------------------------
target-arm queue:
 * Fix coverity nit in int_to_float code
 * Don't set Invalid for float-to-int(MAXINT)
 * Fix fp_status_f16 tininess before rounding
 * Add various missing insns from the v8.2-FP16 extension
 * Fix sqrt_f16 exception raising
 * sdcard: Correct CRC16 offset in sd_function_switch()
 * tcg: Optionally log FPU state in TCG -d cpu logging

----------------------------------------------------------------
Alex Bennée (5):
      fpu/softfloat: int_to_float ensure r fully initialised
      target/arm: Implement FCMP for fp16
      target/arm: Implement FCSEL for fp16
      target/arm: Implement FMOV (immediate) for fp16
      target/arm: Fix sqrt_f16 exception raising

Peter Maydell (3):
      fpu/softfloat: Don't set Invalid for float-to-int(MAXINT)
      target/arm: Fix fp_status_f16 tininess before rounding
      tcg: Optionally log FPU state in TCG -d cpu logging

Philippe Mathieu-Daudé (1):
      sdcard: Correct CRC16 offset in sd_function_switch()

Richard Henderson (7):
      target/arm: Implement FMOV (general) for fp16
      target/arm: Early exit after unallocated_encoding in disas_fp_int_conv
      target/arm: Implement FCVT (scalar, integer) for fp16
      target/arm: Implement FCVT (scalar, fixed-point) for fp16
      target/arm: Introduce and use read_fp_hreg
      target/arm: Implement FP data-processing (2 source) for fp16
      target/arm: Implement FP data-processing (3 source) for fp16

In float-to-integer conversion, if the floating point input
converts exactly to the largest or smallest integer that
fits in to the result type, this is not an overflow.
In this situation we were producing the correct result value,
but were incorrectly setting the Invalid flag.
For example for Arm A64, "FCVTAS w0, d0" on an input of
0x41dfffffffc00000 should produce 0x7fffffff and set no flags.

Fix the boundary case to take the right half of the if()
statements.

This fixes a regression from 2.11 introduced by the softfloat
refactoring.

Cc: qemu-stable@nongnu.org
Fixes: ab52f973a50
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180510140141.12120-1-peter.maydell@linaro.org
---
 fpu/softfloat.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index XXXXXXX..XXXXXXX 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -XXX,XX +XXX,XX @@ static int64_t round_to_int_and_pack(FloatParts in, int rmode,
             r = UINT64_MAX;
         }
         if (p.sign) {
-            if (r < -(uint64_t) min) {
+            if (r <= -(uint64_t) min) {
                 return -r;
             } else {
                 s->float_exception_flags = orig_flags | float_flag_invalid;
                 return min;
             }
         } else {
-            if (r < max) {
+            if (r <= max) {
                 return r;
             } else {
                 s->float_exception_flags = orig_flags | float_flag_invalid;
-- 
2.17.0

In commit d81ce0ef2c4f105 we added an extra float_status field
fp_status_fp16 for Arm, but forgot to initialize it correctly
by setting it to float_tininess_before_rounding. This currently
will only cause problems for the new V8_FP16 feature, since the
float-to-float conversion code doesn't use it yet. The effect
would be that we failed to set the Underflow IEEE exception flag
in all the cases where we should.

Add the missing initialization.

Fixes: d81ce0ef2c4f105
Cc: qemu-stable@nongnu.org
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20180512004311.9299-16-richard.henderson@linaro.org
---
 target/arm/cpu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_reset(CPUState *s)
                               &env->vfp.fp_status);
     set_float_detect_tininess(float_tininess_before_rounding,
                               &env->vfp.standard_fp_status);
+    set_float_detect_tininess(float_tininess_before_rounding,
+                              &env->vfp.fp_status_f16);
 #ifndef CONFIG_USER_ONLY
     if (kvm_enabled()) {
         kvm_arm_reset_vcpu(cpu);
-- 
2.17.0

From: Richard Henderson <richard.henderson@linaro.org>

Adding the fp16 moves to/from general registers.

Cc: qemu-stable@nongnu.org
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180512003217.9105-2-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate-a64.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void handle_fmov(DisasContext *s, int rd, int rn, int type, bool itof)
             tcg_gen_st_i64(tcg_rn, cpu_env, fp_reg_hi_offset(s, rd));
             clear_vec_high(s, true, rd);
             break;
+        case 3:
+            /* 16 bit */
+            tmp = tcg_temp_new_i64();
+            tcg_gen_ext16u_i64(tmp, tcg_rn);
+            write_fp_dreg(s, rd, tmp);
+            tcg_temp_free_i64(tmp);
+            break;
+        default:
+            g_assert_not_reached();
         }
     } else {
         TCGv_i64 tcg_rd = cpu_reg(s, rd);
@@ -XXX,XX +XXX,XX @@ static void handle_fmov(DisasContext *s, int rd, int rn, int type, bool itof)
             /* 64 bits from top half */
             tcg_gen_ld_i64(tcg_rd, cpu_env, fp_reg_hi_offset(s, rn));
             break;
+        case 3:
+            /* 16 bit */
+            tcg_gen_ld16u_i64(tcg_rd, cpu_env, fp_reg_offset(s, rn, MO_16));
+            break;
+        default:
+            g_assert_not_reached();
         }
     }
 }
@@ -XXX,XX +XXX,XX @@ static void disas_fp_int_conv(DisasContext *s, uint32_t insn)
         case 0xa: /* 64 bit */
         case 0xd: /* 64 bit to top half of quad */
             break;
+        case 0x6: /* 16-bit float, 32-bit int */
+        case 0xe: /* 16-bit float, 64-bit int */
+            if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+                break;
+            }
+            /* fallthru */
         default:
             /* all other sf/type/rmode combinations are invalid */
             unallocated_encoding(s);
-- 
2.17.0

From: Richard Henderson <richard.henderson@linaro.org>

Cc: qemu-stable@nongnu.org
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180512003217.9105-4-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.h        |  6 +++
 target/arm/helper.c        | 38 ++++++++++++++-
 target/arm/translate-a64.c | 96 +++++++++++++++++++++++++++++++-------
 3 files changed, 122 insertions(+), 18 deletions(-)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_3(vfp_touhd_round_to_zero, i64, f64, i32, ptr)
 DEF_HELPER_3(vfp_tould_round_to_zero, i64, f64, i32, ptr)
 DEF_HELPER_3(vfp_touhh, i32, f16, i32, ptr)
 DEF_HELPER_3(vfp_toshh, i32, f16, i32, ptr)
+DEF_HELPER_3(vfp_toulh, i32, f16, i32, ptr)
+DEF_HELPER_3(vfp_toslh, i32, f16, i32, ptr)
+DEF_HELPER_3(vfp_touqh, i64, f16, i32, ptr)
+DEF_HELPER_3(vfp_tosqh, i64, f16, i32, ptr)
 DEF_HELPER_3(vfp_toshs, i32, f32, i32, ptr)
 DEF_HELPER_3(vfp_tosls, i32, f32, i32, ptr)
 DEF_HELPER_3(vfp_tosqs, i64, f32, i32, ptr)
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_3(vfp_ultod, f64, i64, i32, ptr)
 DEF_HELPER_3(vfp_uqtod, f64, i64, i32, ptr)
 DEF_HELPER_3(vfp_sltoh, f16, i32, i32, ptr)
 DEF_HELPER_3(vfp_ultoh, f16, i32, i32, ptr)
+DEF_HELPER_3(vfp_sqtoh, f16, i64, i32, ptr)
+DEF_HELPER_3(vfp_uqtoh, f16, i64, i32, ptr)
 
 DEF_HELPER_FLAGS_2(set_rmode, TCG_CALL_NO_RWG, i32, i32, ptr)
 DEF_HELPER_FLAGS_2(set_neon_rmode, TCG_CALL_NO_RWG, i32, i32, env)
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ VFP_CONV_FIX_A64(uq, s, 32, 64, uint64)
 #undef VFP_CONV_FIX_A64
 
 /* Conversion to/from f16 can overflow to infinity before/after scaling.
- * Therefore we convert to f64 (which does not round), scale,
- * and then convert f64 to f16 (which may round).
+ * Therefore we convert to f64, scale, and then convert f64 to f16; or
+ * vice versa for conversion to integer.
+ *
+ * For 16- and 32-bit integers, the conversion to f64 never rounds.
+ * For 64-bit integers, any integer that would cause rounding will also
+ * overflow to f16 infinity, so there is no double rounding problem.
  */
 
 static float16 do_postscale_fp16(float64 f, int shift, float_status *fpst)
@@ -XXX,XX +XXX,XX @@ float16 HELPER(vfp_ultoh)(uint32_t x, uint32_t shift, void *fpst)
     return do_postscale_fp16(uint32_to_float64(x, fpst), shift, fpst);
 }
 
+float16 HELPER(vfp_sqtoh)(uint64_t x, uint32_t shift, void *fpst)
+{
+    return do_postscale_fp16(int64_to_float64(x, fpst), shift, fpst);
+}
+
+float16 HELPER(vfp_uqtoh)(uint64_t x, uint32_t shift, void *fpst)
+{
+    return do_postscale_fp16(uint64_to_float64(x, fpst), shift, fpst);
+}
+
 static float64 do_prescale_fp16(float16 f, int shift, float_status *fpst)
 {
     if (unlikely(float16_is_any_nan(f))) {
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(vfp_touhh)(float16 x, uint32_t shift, void *fpst)
     return float64_to_uint16(do_prescale_fp16(x, shift, fpst), fpst);
 }
 
+uint32_t HELPER(vfp_toslh)(float16 x, uint32_t shift, void *fpst)
+{
+    return float64_to_int32(do_prescale_fp16(x, shift, fpst), fpst);
+}
+
+uint32_t HELPER(vfp_toulh)(float16 x, uint32_t shift, void *fpst)
+{
+    return float64_to_uint32(do_prescale_fp16(x, shift, fpst), fpst);
+}
+
+uint64_t HELPER(vfp_tosqh)(float16 x, uint32_t shift, void *fpst)
+{
+    return float64_to_int64(do_prescale_fp16(x, shift, fpst), fpst);
+}
+
+uint64_t HELPER(vfp_touqh)(float16 x, uint32_t shift, void *fpst)
+{
+    return float64_to_uint64(do_prescale_fp16(x, shift, fpst), fpst);
+}
+
 /* Set the current fp rounding mode and return the old one.
  * The argument is a softfloat float_round_ value.
  */
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void handle_fpfpcvt(DisasContext *s, int rd, int rn, int opcode,
                            bool itof, int rmode, int scale, int sf, int type)
 {
     bool is_signed = !(opcode & 1);
-    bool is_double = type;
     TCGv_ptr tcg_fpstatus;
-    TCGv_i32 tcg_shift;
+    TCGv_i32 tcg_shift, tcg_single;
+    TCGv_i64 tcg_double;
 
-    tcg_fpstatus = get_fpstatus_ptr(false);
+    tcg_fpstatus = get_fpstatus_ptr(type == 3);
 
     tcg_shift = tcg_const_i32(64 - scale);
 
@@ -XXX,XX +XXX,XX @@ static void handle_fpfpcvt(DisasContext *s, int rd, int rn, int opcode,
             tcg_int = tcg_extend;
         }
 
-        if (is_double) {
-            TCGv_i64 tcg_double = tcg_temp_new_i64();
+        switch (type) {
+        case 1: /* float64 */
+            tcg_double = tcg_temp_new_i64();
             if (is_signed) {
                 gen_helper_vfp_sqtod(tcg_double, tcg_int,
                                      tcg_shift, tcg_fpstatus);
@@ -XXX,XX +XXX,XX @@ static void handle_fpfpcvt(DisasContext *s, int rd, int rn, int opcode,
             }
             write_fp_dreg(s, rd, tcg_double);
             tcg_temp_free_i64(tcg_double);
-        } else {
-            TCGv_i32 tcg_single = tcg_temp_new_i32();
+            break;
+
+        case 0: /* float32 */
+            tcg_single = tcg_temp_new_i32();
             if (is_signed) {
                 gen_helper_vfp_sqtos(tcg_single, tcg_int,
                                      tcg_shift, tcg_fpstatus);
@@ -XXX,XX +XXX,XX @@ static void handle_fpfpcvt(DisasContext *s, int rd, int rn, int opcode,
             }
             write_fp_sreg(s, rd, tcg_single);
             tcg_temp_free_i32(tcg_single);
+            break;
+
+        case 3: /* float16 */
+            tcg_single = tcg_temp_new_i32();
+            if (is_signed) {
+                gen_helper_vfp_sqtoh(tcg_single, tcg_int,
+                                     tcg_shift, tcg_fpstatus);
+            } else {
+                gen_helper_vfp_uqtoh(tcg_single, tcg_int,
+                                     tcg_shift, tcg_fpstatus);
+            }
+            write_fp_sreg(s, rd, tcg_single);
+            tcg_temp_free_i32(tcg_single);
+            break;
+
+        default:
+            g_assert_not_reached();
         }
     } else {
         TCGv_i64 tcg_int = cpu_reg(s, rd);
@@ -XXX,XX +XXX,XX @@ static void handle_fpfpcvt(DisasContext *s, int rd, int rn, int opcode,
 
         gen_helper_set_rmode(tcg_rmode, tcg_rmode, tcg_fpstatus);
 
-        if (is_double) {
-            TCGv_i64 tcg_double = read_fp_dreg(s, rn);
+        switch (type) {
+        case 1: /* float64 */
+            tcg_double = read_fp_dreg(s, rn);
             if (is_signed) {
                 if (!sf) {
                     gen_helper_vfp_tosld(tcg_int, tcg_double,
@@ -XXX,XX +XXX,XX @@ static void handle_fpfpcvt(DisasContext *s, int rd, int rn, int opcode,
                                          tcg_shift, tcg_fpstatus);
                 }
             }
+            if (!sf) {
+                tcg_gen_ext32u_i64(tcg_int, tcg_int);
+            }
             tcg_temp_free_i64(tcg_double);
-        } else {
-            TCGv_i32 tcg_single = read_fp_sreg(s, rn);
+            break;
+
+        case 0: /* float32 */
+            tcg_single = read_fp_sreg(s, rn);
             if (sf) {
                 if (is_signed) {
                     gen_helper_vfp_tosqs(tcg_int, tcg_single,
@@ -XXX,XX +XXX,XX @@ static void handle_fpfpcvt(DisasContext *s, int rd, int rn, int opcode,
                 tcg_temp_free_i32(tcg_dest);
             }
             tcg_temp_free_i32(tcg_single);
+            break;
+
+        case 3: /* float16 */
+            tcg_single = read_fp_sreg(s, rn);
+            if (sf) {
+                if (is_signed) {
+                    gen_helper_vfp_tosqh(tcg_int, tcg_single,
+                                         tcg_shift, tcg_fpstatus);
+                } else {
+                    gen_helper_vfp_touqh(tcg_int, tcg_single,
+                                         tcg_shift, tcg_fpstatus);
+                }
+            } else {
+                TCGv_i32 tcg_dest = tcg_temp_new_i32();
+                if (is_signed) {
+                    gen_helper_vfp_toslh(tcg_dest, tcg_single,
+                                         tcg_shift, tcg_fpstatus);
+                } else {
+                    gen_helper_vfp_toulh(tcg_dest, tcg_single,
+                                         tcg_shift, tcg_fpstatus);
+                }
+                tcg_gen_extu_i32_i64(tcg_int, tcg_dest);
+                tcg_temp_free_i32(tcg_dest);
+            }
+            tcg_temp_free_i32(tcg_single);
+            break;
+
+        default:
+            g_assert_not_reached();
         }
 
         gen_helper_set_rmode(tcg_rmode, tcg_rmode, tcg_fpstatus);
         tcg_temp_free_i32(tcg_rmode);
-
-        if (!sf) {
-            tcg_gen_ext32u_i64(tcg_int, tcg_int);
-        }
     }
 
     tcg_temp_free_ptr(tcg_fpstatus);
@@ -XXX,XX +XXX,XX @@ static void disas_fp_int_conv(DisasContext *s, uint32_t insn)
         /* actual FP conversions */
         bool itof = extract32(opcode, 1, 1);
 
-        if (type > 1 || (rmode != 0 && opcode > 1)) {
+        if (rmode != 0 && opcode > 1) {
+            unallocated_encoding(s);
+            return;
+        }
+        switch (type) {
+        case 0: /* float32 */
+        case 1: /* float64 */
+            break;
+        case 3: /* float16 */
+            if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+                break;
+            }
+            /* fallthru */
+        default:
             unallocated_encoding(s);
             return;
         }
-- 
2.17.0

From: Richard Henderson <richard.henderson@linaro.org>

Cc: qemu-stable@nongnu.org
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180512003217.9105-5-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate-a64.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void disas_fp_fixed_conv(DisasContext *s, uint32_t insn)
     bool sf = extract32(insn, 31, 1);
     bool itof;
 
-    if (sbit || (type > 1)
-        || (!sf && scale < 32)) {
+    if (sbit || (!sf && scale < 32)) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    switch (type) {
+    case 0: /* float32 */
+    case 1: /* float64 */
+        break;
+    case 3: /* float16 */
+        if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+            break;
+        }
+        /* fallthru */
+    default:
         unallocated_encoding(s);
         return;
     }
-- 
2.17.0

From: Richard Henderson <richard.henderson@linaro.org>

Cc: qemu-stable@nongnu.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180512003217.9105-6-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate-a64.c | 30 ++++++++++++++----------------
 1 file changed, 14 insertions(+), 16 deletions(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static TCGv_i32 read_fp_sreg(DisasContext *s, int reg)
     return v;
 }
 
+static TCGv_i32 read_fp_hreg(DisasContext *s, int reg)
+{
+    TCGv_i32 v = tcg_temp_new_i32();
+
+    tcg_gen_ld16u_i32(v, cpu_env, fp_reg_offset(s, reg, MO_16));
+    return v;
+}
+
 /* Clear the bits above an N-bit vector, for N = (is_q ? 128 : 64).
  * If SVE is not enabled, then there are only 128 bits in the vector.
  */
@@ -XXX,XX +XXX,XX @@ static void disas_fp_csel(DisasContext *s, uint32_t insn)
 static void handle_fp_1src_half(DisasContext *s, int opcode, int rd, int rn)
 {
     TCGv_ptr fpst = NULL;
-    TCGv_i32 tcg_op = tcg_temp_new_i32();
+    TCGv_i32 tcg_op = read_fp_hreg(s, rn);
     TCGv_i32 tcg_res = tcg_temp_new_i32();
 
-    read_vec_element_i32(s, tcg_op, rn, 0, MO_16);
-
     switch (opcode) {
     case 0x0: /* FMOV */
         tcg_gen_mov_i32(tcg_res, tcg_op);
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_diff(DisasContext *s, uint32_t insn)
         tcg_temp_free_i64(tcg_op2);
         tcg_temp_free_i64(tcg_res);
     } else {
-        TCGv_i32 tcg_op1 = tcg_temp_new_i32();
-        TCGv_i32 tcg_op2 = tcg_temp_new_i32();
+        TCGv_i32 tcg_op1 = read_fp_hreg(s, rn);
+        TCGv_i32 tcg_op2 = read_fp_hreg(s, rm);
         TCGv_i64 tcg_res = tcg_temp_new_i64();
 
-        read_vec_element_i32(s, tcg_op1, rn, 0, MO_16);
-        read_vec_element_i32(s, tcg_op2, rm, 0, MO_16);
-
         gen_helper_neon_mull_s16(tcg_res, tcg_op1, tcg_op2);
         gen_helper_neon_addl_saturate_s32(tcg_res, cpu_env, tcg_res, tcg_res);
 
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_same_fp16(DisasContext *s,
 
     fpst = get_fpstatus_ptr(true);
 
-    tcg_op1 = tcg_temp_new_i32();
-    tcg_op2 = tcg_temp_new_i32();
+    tcg_op1 = read_fp_hreg(s, rn);
+    tcg_op2 = read_fp_hreg(s, rm);
     tcg_res = tcg_temp_new_i32();
 
-    read_vec_element_i32(s, tcg_op1, rn, 0, MO_16);
-    read_vec_element_i32(s, tcg_op2, rm, 0, MO_16);
-
     switch (fpopcode) {
     case 0x03: /* FMULX */
         gen_helper_advsimd_mulxh(tcg_res, tcg_op1, tcg_op2, fpst);
@@ -XXX,XX +XXX,XX @@ static void disas_simd_two_reg_misc_fp16(DisasContext *s, uint32_t insn)
     }
 
     if (is_scalar) {
-        TCGv_i32 tcg_op = tcg_temp_new_i32();
+        TCGv_i32 tcg_op = read_fp_hreg(s, rn);
         TCGv_i32 tcg_res = tcg_temp_new_i32();
 
-        read_vec_element_i32(s, tcg_op, rn, 0, MO_16);
-
         switch (fpop) {
         case 0x1a: /* FCVTNS */
         case 0x1b: /* FCVTMS */
-- 
2.17.0

From: Richard Henderson <richard.henderson@linaro.org>

We missed all of the scalar fp16 binary operations.

Cc: qemu-stable@nongnu.org
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180512003217.9105-7-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate-a64.c | 65 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 65 insertions(+)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void handle_fp_2src_double(DisasContext *s, int opcode,
     tcg_temp_free_i64(tcg_res);
 }
 
+/* Floating-point data-processing (2 source) - half precision */
+static void handle_fp_2src_half(DisasContext *s, int opcode,
+                                int rd, int rn, int rm)
+{
+    TCGv_i32 tcg_op1;
+    TCGv_i32 tcg_op2;
+    TCGv_i32 tcg_res;
+    TCGv_ptr fpst;
+
+    tcg_res = tcg_temp_new_i32();
+    fpst = get_fpstatus_ptr(true);
+    tcg_op1 = read_fp_hreg(s, rn);
+    tcg_op2 = read_fp_hreg(s, rm);
+
+    switch (opcode) {
+    case 0x0: /* FMUL */
+        gen_helper_advsimd_mulh(tcg_res, tcg_op1, tcg_op2, fpst);
+        break;
+    case 0x1: /* FDIV */
+        gen_helper_advsimd_divh(tcg_res, tcg_op1, tcg_op2, fpst);
+        break;
+    case 0x2: /* FADD */
+        gen_helper_advsimd_addh(tcg_res, tcg_op1, tcg_op2, fpst);
+        break;
+    case 0x3: /* FSUB */
+        gen_helper_advsimd_subh(tcg_res, tcg_op1, tcg_op2, fpst);
+        break;
+    case 0x4: /* FMAX */
+        gen_helper_advsimd_maxh(tcg_res, tcg_op1, tcg_op2, fpst);
+        break;
+    case 0x5: /* FMIN */
+        gen_helper_advsimd_minh(tcg_res, tcg_op1, tcg_op2, fpst);
+        break;
+    case 0x6: /* FMAXNM */
+        gen_helper_advsimd_maxnumh(tcg_res, tcg_op1, tcg_op2, fpst);
+        break;
+    case 0x7: /* FMINNM */
+        gen_helper_advsimd_minnumh(tcg_res, tcg_op1, tcg_op2, fpst);
+        break;
+    case 0x8: /* FNMUL */
+        gen_helper_advsimd_mulh(tcg_res, tcg_op1, tcg_op2, fpst);
+        tcg_gen_xori_i32(tcg_res, tcg_res, 0x8000);
+        break;
+    default:
+        g_assert_not_reached();
+    }
+
+    write_fp_sreg(s, rd, tcg_res);
+
+    tcg_temp_free_ptr(fpst);
+    tcg_temp_free_i32(tcg_op1);
+    tcg_temp_free_i32(tcg_op2);
+    tcg_temp_free_i32(tcg_res);
+}
+
 /* Floating point data-processing (2 source)
  *   31  30  29 28       24 23  22  21 20  16 15    12 11 10 9    5 4    0
  * +---+---+---+-----------+------+---+------+--------+-----+------+------+
@@ -XXX,XX +XXX,XX @@ static void disas_fp_2src(DisasContext *s, uint32_t insn)
         }
         handle_fp_2src_double(s, opcode, rd, rn, rm);
         break;
+    case 3:
+        if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+            unallocated_encoding(s);
+            return;
+        }
+        if (!fp_access_check(s)) {
+            return;
+        }
+        handle_fp_2src_half(s, opcode, rd, rn, rm);
+        break;
     default:
         unallocated_encoding(s);
     }
-- 
2.17.0

From: Richard Henderson <richard.henderson@linaro.org>

We missed all of the scalar fp16 fma operations.

Cc: qemu-stable@nongnu.org
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180512003217.9105-8-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate-a64.c | 48 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 48 insertions(+)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void handle_fp_3src_double(DisasContext *s, bool o0, bool o1,
     tcg_temp_free_i64(tcg_res);
 }
 
+/* Floating-point data-processing (3 source) - half precision */
+static void handle_fp_3src_half(DisasContext *s, bool o0, bool o1,
+                                int rd, int rn, int rm, int ra)
+{
+    TCGv_i32 tcg_op1, tcg_op2, tcg_op3;
+    TCGv_i32 tcg_res = tcg_temp_new_i32();
+    TCGv_ptr fpst = get_fpstatus_ptr(true);
+
+    tcg_op1 = read_fp_hreg(s, rn);
+    tcg_op2 = read_fp_hreg(s, rm);
+    tcg_op3 = read_fp_hreg(s, ra);
+
+    /* These are fused multiply-add, and must be done as one
+     * floating point operation with no rounding between the
+     * multiplication and addition steps.
+     * NB that doing the negations here as separate steps is
+     * correct : an input NaN should come out with its sign bit
+     * flipped if it is a negated-input.
+     */
+    if (o1 == true) {
+        tcg_gen_xori_i32(tcg_op3, tcg_op3, 0x8000);
+    }
+
+    if (o0 != o1) {
+        tcg_gen_xori_i32(tcg_op1, tcg_op1, 0x8000);
+    }
+
+    gen_helper_advsimd_muladdh(tcg_res, tcg_op1, tcg_op2, tcg_op3, fpst);
+
+    write_fp_sreg(s, rd, tcg_res);
+
+    tcg_temp_free_ptr(fpst);
+    tcg_temp_free_i32(tcg_op1);
+    tcg_temp_free_i32(tcg_op2);
+    tcg_temp_free_i32(tcg_op3);
+    tcg_temp_free_i32(tcg_res);
+}
+
 /* Floating point data-processing (3 source)
  *   31  30  29 28       24 23  22  21  20  16  15  14  10 9    5 4    0
  * +---+---+---+-----------+------+----+------+----+------+------+------+
@@ -XXX,XX +XXX,XX @@ static void disas_fp_3src(DisasContext *s, uint32_t insn)
         }
         handle_fp_3src_double(s, o0, o1, rd, rn, rm, ra);
         break;
+    case 3:
+        if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+            unallocated_encoding(s);
+            return;
+        }
+        if (!fp_access_check(s)) {
+            return;
+        }
+        handle_fp_3src_half(s, o0, o1, rd, rn, rm, ra);
+        break;
     default:
         unallocated_encoding(s);
     }
-- 
2.17.0

From: Alex Bennée <alex.bennee@linaro.org>

These where missed out from the rest of the half-precision work.

Cc: qemu-stable@nongnu.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180512003217.9105-9-richard.henderson@linaro.org
[rth: Diagnose lack of FP16 before fp_access_check]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper-a64.h    |  2 +
 target/arm/helper-a64.c    | 10 +++++
 target/arm/translate-a64.c | 88 ++++++++++++++++++++++++++++++--------
 3 files changed, 83 insertions(+), 17 deletions(-)

diff --git a/target/arm/helper-a64.h b/target/arm/helper-a64.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper-a64.h
+++ b/target/arm/helper-a64.h
@@ -XXX,XX +XXX,XX @@
 DEF_HELPER_FLAGS_2(udiv64, TCG_CALL_NO_RWG_SE, i64, i64, i64)
 DEF_HELPER_FLAGS_2(sdiv64, TCG_CALL_NO_RWG_SE, s64, s64, s64)
 DEF_HELPER_FLAGS_1(rbit64, TCG_CALL_NO_RWG_SE, i64, i64)
+DEF_HELPER_3(vfp_cmph_a64, i64, f16, f16, ptr)
+DEF_HELPER_3(vfp_cmpeh_a64, i64, f16, f16, ptr)
 DEF_HELPER_3(vfp_cmps_a64, i64, f32, f32, ptr)
 DEF_HELPER_3(vfp_cmpes_a64, i64, f32, f32, ptr)
 DEF_HELPER_3(vfp_cmpd_a64, i64, f64, f64, ptr)
diff --git a/target/arm/helper-a64.c b/target/arm/helper-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper-a64.c
+++ b/target/arm/helper-a64.c
@@ -XXX,XX +XXX,XX @@ static inline uint32_t float_rel_to_flags(int res)
     return flags;
 }
 
+uint64_t HELPER(vfp_cmph_a64)(float16 x, float16 y, void *fp_status)
+{
+    return float_rel_to_flags(float16_compare_quiet(x, y, fp_status));
+}
+
+uint64_t HELPER(vfp_cmpeh_a64)(float16 x, float16 y, void *fp_status)
+{
+    return float_rel_to_flags(float16_compare(x, y, fp_status));
+}
+
 uint64_t HELPER(vfp_cmps_a64)(float32 x, float32 y, void *fp_status)
 {
     return float_rel_to_flags(float32_compare_quiet(x, y, fp_status));
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void disas_data_proc_reg(DisasContext *s, uint32_t insn)
     }
 }
 
-static void handle_fp_compare(DisasContext *s, bool is_double,
+static void handle_fp_compare(DisasContext *s, int size,
                               unsigned int rn, unsigned int rm,
                               bool cmp_with_zero, bool signal_all_nans)
 {
     TCGv_i64 tcg_flags = tcg_temp_new_i64();
-    TCGv_ptr fpst = get_fpstatus_ptr(false);
+    TCGv_ptr fpst = get_fpstatus_ptr(size == MO_16);
 
-    if (is_double) {
+    if (size == MO_64) {
         TCGv_i64 tcg_vn, tcg_vm;
 
         tcg_vn = read_fp_dreg(s, rn);
@@ -XXX,XX +XXX,XX @@ static void handle_fp_compare(DisasContext *s, bool is_double,
         tcg_temp_free_i64(tcg_vn);
         tcg_temp_free_i64(tcg_vm);
     } else {
-        TCGv_i32 tcg_vn, tcg_vm;
+        TCGv_i32 tcg_vn = tcg_temp_new_i32();
+        TCGv_i32 tcg_vm = tcg_temp_new_i32();
 
-        tcg_vn = read_fp_sreg(s, rn);
+        read_vec_element_i32(s, tcg_vn, rn, 0, size);
         if (cmp_with_zero) {
-            tcg_vm = tcg_const_i32(0);
+            tcg_gen_movi_i32(tcg_vm, 0);
         } else {
-            tcg_vm = read_fp_sreg(s, rm);
+            read_vec_element_i32(s, tcg_vm, rm, 0, size);
         }
-        if (signal_all_nans) {
-            gen_helper_vfp_cmpes_a64(tcg_flags, tcg_vn, tcg_vm, fpst);
-        } else {
-            gen_helper_vfp_cmps_a64(tcg_flags, tcg_vn, tcg_vm, fpst);
+
+        switch (size) {
+        case MO_32:
+            if (signal_all_nans) {
+                gen_helper_vfp_cmpes_a64(tcg_flags, tcg_vn, tcg_vm, fpst);
+            } else {
+                gen_helper_vfp_cmps_a64(tcg_flags, tcg_vn, tcg_vm, fpst);
+            }
+            break;
+        case MO_16:
+            if (signal_all_nans) {
+                gen_helper_vfp_cmpeh_a64(tcg_flags, tcg_vn, tcg_vm, fpst);
+            } else {
+                gen_helper_vfp_cmph_a64(tcg_flags, tcg_vn, tcg_vm, fpst);
+            }
+            break;
+        default:
+            g_assert_not_reached();
         }
+
         tcg_temp_free_i32(tcg_vn);
         tcg_temp_free_i32(tcg_vm);
     }
@@ -XXX,XX +XXX,XX @@ static void handle_fp_compare(DisasContext *s, bool is_double,
 static void disas_fp_compare(DisasContext *s, uint32_t insn)
 {
     unsigned int mos, type, rm, op, rn, opc, op2r;
+    int size;
 
     mos = extract32(insn, 29, 3);
-    type = extract32(insn, 22, 2); /* 0 = single, 1 = double */
+    type = extract32(insn, 22, 2);
     rm = extract32(insn, 16, 5);
     op = extract32(insn, 14, 2);
     rn = extract32(insn, 5, 5);
     opc = extract32(insn, 3, 2);
     op2r = extract32(insn, 0, 3);
 
-    if (mos || op || op2r || type > 1) {
+    if (mos || op || op2r) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    switch (type) {
+    case 0:
+        size = MO_32;
+        break;
+    case 1:
+        size = MO_64;
+        break;
+    case 3:
+        size = MO_16;
+        if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+            break;
+        }
+        /* fallthru */
+    default:
         unallocated_encoding(s);
         return;
     }
@@ -XXX,XX +XXX,XX @@ static void disas_fp_compare(DisasContext *s, uint32_t insn)
         return;
     }
 
-    handle_fp_compare(s, type, rn, rm, opc & 1, opc & 2);
+    handle_fp_compare(s, size, rn, rm, opc & 1, opc & 2);
 }
 
 /* Floating point conditional compare
@@ -XXX,XX +XXX,XX @@ static void disas_fp_ccomp(DisasContext *s, uint32_t insn)
     unsigned int mos, type, rm, cond, rn, op, nzcv;
     TCGv_i64 tcg_flags;
     TCGLabel *label_continue = NULL;
+    int size;
 
     mos = extract32(insn, 29, 3);
-    type = extract32(insn, 22, 2); /* 0 = single, 1 = double */
+    type = extract32(insn, 22, 2);
     rm = extract32(insn, 16, 5);
     cond = extract32(insn, 12, 4);
     rn = extract32(insn, 5, 5);
     op = extract32(insn, 4, 1);
     nzcv = extract32(insn, 0, 4);
 
-    if (mos || type > 1) {
+    if (mos) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    switch (type) {
+    case 0:
+        size = MO_32;
+        break;
+    case 1:
+        size = MO_64;
+        break;
+    case 3:
+        size = MO_16;
+        if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+            break;
+        }
+        /* fallthru */
+    default:
         unallocated_encoding(s);
         return;
     }
@@ -XXX,XX +XXX,XX @@ static void disas_fp_ccomp(DisasContext *s, uint32_t insn)
         gen_set_label(label_match);
     }
 
-    handle_fp_compare(s, type, rn, rm, false, op);
+    handle_fp_compare(s, size, rn, rm, false, op);
 
     if (cond < 0x0e) {
         gen_set_label(label_continue);
-- 
2.17.0

From: Alex Bennée <alex.bennee@linaro.org>

These were missed out from the rest of the half-precision work.

Cc: qemu-stable@nongnu.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180512003217.9105-10-richard.henderson@linaro.org
[rth: Fix erroneous check vs type]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate-a64.c | 31 +++++++++++++++++++++++++------
 1 file changed, 25 insertions(+), 6 deletions(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void disas_fp_csel(DisasContext *s, uint32_t insn)
     unsigned int mos, type, rm, cond, rn, rd;
     TCGv_i64 t_true, t_false, t_zero;
     DisasCompare64 c;
+    TCGMemOp sz;
 
     mos = extract32(insn, 29, 3);
-    type = extract32(insn, 22, 2); /* 0 = single, 1 = double */
+    type = extract32(insn, 22, 2);
     rm = extract32(insn, 16, 5);
     cond = extract32(insn, 12, 4);
     rn = extract32(insn, 5, 5);
     rd = extract32(insn, 0, 5);
 
-    if (mos || type > 1) {
+    if (mos) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    switch (type) {
+    case 0:
+        sz = MO_32;
+        break;
+    case 1:
+        sz = MO_64;
+        break;
+    case 3:
+        sz = MO_16;
+        if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+            break;
+        }
+        /* fallthru */
+    default:
         unallocated_encoding(s);
         return;
     }
@@ -XXX,XX +XXX,XX @@ static void disas_fp_csel(DisasContext *s, uint32_t insn)
         return;
     }
 
-    /* Zero extend sreg inputs to 64 bits now.  */
+    /* Zero extend sreg & hreg inputs to 64 bits now.  */
     t_true = tcg_temp_new_i64();
     t_false = tcg_temp_new_i64();
-    read_vec_element(s, t_true, rn, 0, type ? MO_64 : MO_32);
-    read_vec_element(s, t_false, rm, 0, type ? MO_64 : MO_32);
+    read_vec_element(s, t_true, rn, 0, sz);
+    read_vec_element(s, t_false, rm, 0, sz);
 
     a64_test_cc(&c, cond);
     t_zero = tcg_const_i64(0);
@@ -XXX,XX +XXX,XX @@ static void disas_fp_csel(DisasContext *s, uint32_t insn)
     tcg_temp_free_i64(t_false);
     a64_free_cc(&c);
 
-    /* Note that sregs write back zeros to the high bits,
+    /* Note that sregs & hregs write back zeros to the high bits,
        and we've already done the zero-extension.  */
     write_fp_dreg(s, rd, t_true);
     tcg_temp_free_i64(t_true);
-- 
2.17.0

From: Alex Bennée <alex.bennee@linaro.org>

All the hard work is already done by vfp_expand_imm, we just need to
make sure we pick up the correct size.

Cc: qemu-stable@nongnu.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180512003217.9105-11-richard.henderson@linaro.org
[rth: Merge unallocated_encoding check with TCGMemOp conversion.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate-a64.c | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void disas_fp_imm(DisasContext *s, uint32_t insn)
 {
     int rd = extract32(insn, 0, 5);
     int imm8 = extract32(insn, 13, 8);
-    int is_double = extract32(insn, 22, 2);
+    int type = extract32(insn, 22, 2);
     uint64_t imm;
     TCGv_i64 tcg_res;
+    TCGMemOp sz;
 
-    if (is_double > 1) {
+    switch (type) {
+    case 0:
+        sz = MO_32;
+        break;
+    case 1:
+        sz = MO_64;
+        break;
+    case 3:
+        sz = MO_16;
+        if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+            break;
+        }
+        /* fallthru */
+    default:
         unallocated_encoding(s);
         return;
     }
@@ -XXX,XX +XXX,XX @@ static void disas_fp_imm(DisasContext *s, uint32_t insn)
         return;
     }
 
-    imm = vfp_expand_imm(MO_32 + is_double, imm8);
+    imm = vfp_expand_imm(sz, imm8);
 
     tcg_res = tcg_const_i64(imm);
     write_fp_dreg(s, rd, tcg_res);
-- 
2.17.0

From: Alex Bennée <alex.bennee@linaro.org>

We are meant to explicitly pass fpst, not cpu_env.

Cc: qemu-stable@nongnu.org
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180512003217.9105-12-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate-a64.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void handle_fp_1src_half(DisasContext *s, int opcode, int rd, int rn)
         tcg_gen_xori_i32(tcg_res, tcg_op, 0x8000);
         break;
     case 0x3: /* FSQRT */
-        gen_helper_sqrt_f16(tcg_res, tcg_op, cpu_env);
+        fpst = get_fpstatus_ptr(true);
+        gen_helper_sqrt_f16(tcg_res, tcg_op, fpst);
         break;
     case 0x8: /* FRINTN */
     case 0x9: /* FRINTP */
-- 
2.17.0

From: Philippe Mathieu-Daudé <f4bug@amsat.org>

Per the Physical Layer Simplified Spec. "4.3.10.4 Switch Function Status":

The block length is predefined to 512 bits

and "4.10.2 SD Status":

The SD Status contains status bits that are related to the SD Memory Card
  proprietary features and may be used for future application-specific usage.
  The size of the SD Status is one data block of 512 bit. The content of this
  register is transmitted to the Host over the DAT bus along with a 16-bit CRC.

Thus the 16-bit CRC goes at offset 64.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180509060104.4458-3-f4bug@amsat.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/sd/sd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/sd/sd.c b/hw/sd/sd.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/sd/sd.c
+++ b/hw/sd/sd.c
@@ -XXX,XX +XXX,XX @@ static void sd_function_switch(SDState *sd, uint32_t arg)
         sd->data[14 + (i >> 1)] = new_func << ((i * 4) & 4);
     }
     memset(&sd->data[17], 0, 47);
-    stw_be_p(sd->data + 65, sd_crc16(sd->data, 64));
+    stw_be_p(sd->data + 64, sd_crc16(sd->data, 64));
 }
 
 static inline bool sd_wp_addr(SDState *sd, uint64_t addr)
-- 
2.17.0

Usually the logging of the CPU state produced by -d cpu is sufficient
to diagnose problems, but sometimes you want to see the state of
the floating point registers as well. We don't want to enable that
by default as it adds a lot of extra data to the log; instead,
allow it to be optionally enabled via -d fpu.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180510130024.31678-1-peter.maydell@linaro.org
---
 include/qemu/log.h   | 1 +
 accel/tcg/cpu-exec.c | 9 ++++++---
 util/log.c           | 2 ++
 3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/include/qemu/log.h b/include/qemu/log.h
index XXXXXXX..XXXXXXX 100644
--- a/include/qemu/log.h
+++ b/include/qemu/log.h
@@ -XXX,XX +XXX,XX @@ static inline bool qemu_log_separate(void)
 #define CPU_LOG_PAGE       (1 << 14)
 /* LOG_TRACE (1 << 15) is defined in log-for-trace.h */
 #define CPU_LOG_TB_OP_IND  (1 << 16)
+#define CPU_LOG_TB_FPU     (1 << 17)
 
 /* Lock output for a series of related logs.  Since this is not needed
  * for a single qemu_log / qemu_log_mask / qemu_log_mask_and_addr, we
diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index XXXXXXX..XXXXXXX 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -XXX,XX +XXX,XX @@ static inline tcg_target_ulong cpu_tb_exec(CPUState *cpu, TranslationBlock *itb)
     if (qemu_loglevel_mask(CPU_LOG_TB_CPU)
         && qemu_log_in_addr_range(itb->pc)) {
         qemu_log_lock();
+        int flags = 0;
+        if (qemu_loglevel_mask(CPU_LOG_TB_FPU)) {
+            flags |= CPU_DUMP_FPU;
+        }
 #if defined(TARGET_I386)
-        log_cpu_state(cpu, CPU_DUMP_CCOP);
-#else
-        log_cpu_state(cpu, 0);
+        flags |= CPU_DUMP_CCOP;
 #endif
+        log_cpu_state(cpu, flags);
         qemu_log_unlock();
     }
 #endif /* DEBUG_DISAS */
diff --git a/util/log.c b/util/log.c
index XXXXXXX..XXXXXXX 100644
--- a/util/log.c
+++ b/util/log.c
@@ -XXX,XX +XXX,XX @@ const QEMULogItem qemu_log_items[] = {
       "show trace before each executed TB (lots of logs)" },
     { CPU_LOG_TB_CPU, "cpu",
       "show CPU registers before entering a TB (lots of logs)" },
+    { CPU_LOG_TB_FPU, "fpu",
+      "include FPU registers in the 'cpu' logging" },
     { CPU_LOG_MMU, "mmu",
       "log MMU-related activities" },
     { CPU_LOG_PCALL, "pcall",
-- 
2.17.0

Hi; this mostly contains the first slice of A64 decodetree
patches, plus some other minor pieces. It also has the
enablement of MTE for KVM guests.

thanks
-- PMM

The following changes since commit d27e7c359330ba7020bdbed7ed2316cb4cf6ffc1:

qapi/parser: Drop two bad type hints for now (2023-05-17 10:18:33 -0700)

are available in the Git repository at:

https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20230518

for you to fetch changes up to 91608e2a44f36e79cb83f863b8a7bb57d2c98061:

docs: Convert u2f.txt to rST (2023-05-18 11:40:32 +0100)

----------------------------------------------------------------
target-arm queue:
 * Fix vd == vm overlap in sve_ldff1_z
 * Add support for MTE with KVM guests
 * Add RAZ/WI handling for DBGDTR[TX|RX]
 * Start of conversion of A64 decoder to decodetree
 * Saturate L2CTLR_EL1 core count field rather than overflowing
 * vexpress: Avoid trivial memory leak of 'flashalias'
 * sbsa-ref: switch default cpu core to Neoverse-N1
 * sbsa-ref: use Bochs graphics card instead of VGA
 * MAINTAINERS: Add Marcin Juszkiewicz to sbsa-ref reviewer list
 * docs: Convert u2f.txt to rST

----------------------------------------------------------------
Alex Bennée (1):
      target/arm: add RAZ/WI handling for DBGDTR[TX|RX]

Cornelia Huck (1):
      arm/kvm: add support for MTE

Marcin Juszkiewicz (3):
      sbsa-ref: switch default cpu core to Neoverse-N1
      Maintainers: add myself as reviewer for sbsa-ref
      sbsa-ref: use Bochs graphics card instead of VGA

Peter Maydell (14):
      target/arm: Create decodetree skeleton for A64
      target/arm: Pull calls to disas_sve() and disas_sme() out of legacy decoder
      target/arm: Convert Extract instructions to decodetree
      target/arm: Convert unconditional branch immediate to decodetree
      target/arm: Convert CBZ, CBNZ to decodetree
      target/arm: Convert TBZ, TBNZ to decodetree
      target/arm: Convert conditional branch insns to decodetree
      target/arm: Convert BR, BLR, RET to decodetree
      target/arm: Convert BRA[AB]Z, BLR[AB]Z, RETA[AB] to decodetree
      target/arm: Convert BRAA, BRAB, BLRAA, BLRAB to decodetree
      target/arm: Convert ERET, ERETAA, ERETAB to decodetree
      target/arm: Saturate L2CTLR_EL1 core count field rather than overflowing
      hw/arm/vexpress: Avoid trivial memory leak of 'flashalias'
      docs: Convert u2f.txt to rST

Richard Henderson (10):
      target/arm: Fix vd == vm overlap in sve_ldff1_z
      target/arm: Split out disas_a64_legacy
      target/arm: Convert PC-rel addressing to decodetree
      target/arm: Split gen_add_CC and gen_sub_CC
      target/arm: Convert Add/subtract (immediate) to decodetree
      target/arm: Convert Add/subtract (immediate with tags) to decodetree
      target/arm: Replace bitmask64 with MAKE_64BIT_MASK
      target/arm: Convert Logical (immediate) to decodetree
      target/arm: Convert Move wide (immediate) to decodetree
      target/arm: Convert Bitfield to decodetree

From: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>

The world outside moves to newer and newer cpu cores. Let move SBSA
Reference Platform to something newer as well.

Signed-off-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Reviewed-by: Leif Lindholm <quic_llindhol@quicinc.com>
Message-id: 20230506183417.1360427-1-marcin.juszkiewicz@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/sbsa-ref.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/arm/sbsa-ref.c b/hw/arm/sbsa-ref.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/sbsa-ref.c
+++ b/hw/arm/sbsa-ref.c
@@ -XXX,XX +XXX,XX @@ static void sbsa_ref_class_init(ObjectClass *oc, void *data)
 
     mc->init = sbsa_ref_init;
     mc->desc = "QEMU 'SBSA Reference' ARM Virtual Machine";
-    mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-a57");
+    mc->default_cpu_type = ARM_CPU_TYPE_NAME("neoverse-n1");
     mc->max_cpus = 512;
     mc->pci_allow_0_address = true;
     mc->minimum_page_bits = 12;
-- 
2.34.1

From: Richard Henderson <richard.henderson@linaro.org>

If vd == vm, copy vm to scratch, so that we can pre-zero
the output and still access the gather indicies.

Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1612
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230504104232.1877774-1-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/sve_helper.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/target/arm/tcg/sve_helper.c b/target/arm/tcg/sve_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/sve_helper.c
+++ b/target/arm/tcg/sve_helper.c
@@ -XXX,XX +XXX,XX @@ void sve_ldff1_z(CPUARMState *env, void *vd, uint64_t *vg, void *vm,
     intptr_t reg_off;
     SVEHostPage info;
     target_ulong addr, in_page;
+    ARMVectorReg scratch;
 
     /* Skip to the first true predicate.  */
     reg_off = find_next_active(vg, 0, reg_max, esz);
@@ -XXX,XX +XXX,XX @@ void sve_ldff1_z(CPUARMState *env, void *vd, uint64_t *vg, void *vm,
         return;
     }
 
+    /* Protect against overlap between vd and vm. */
+    if (unlikely(vd == vm)) {
+        vm = memcpy(&scratch, vm, reg_max);
+    }
+
     /*
      * Probe the first element, allowing faults.
      */
-- 
2.34.1

From: Cornelia Huck <cohuck@redhat.com>

Extend the 'mte' property for the virt machine to cover KVM as
well. For KVM, we don't allocate tag memory, but instead enable the
capability.

If MTE has been enabled, we need to disable migration, as we do not
yet have a way to migrate the tags as well. Therefore, MTE will stay
off with KVM unless requested explicitly.

Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230428095533.21747-2-cohuck@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h     |  4 +++
 target/arm/kvm_arm.h | 19 ++++++++++++
 hw/arm/virt.c        | 73 +++++++++++++++++++++++++-------------------
 target/arm/cpu.c     |  9 +++---
 target/arm/kvm.c     | 35 +++++++++++++++++++++
 target/arm/kvm64.c   |  5 +++
 6 files changed, 109 insertions(+), 36 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ struct ArchCPU {
      */
     uint32_t psci_conduit;
 
+    /* CPU has Memory Tag Extension */
+    bool has_mte;
+
     /* For v8M, initial value of the Secure VTOR */
     uint32_t init_svtor;
     /* For v8M, initial value of the Non-secure VTOR */
@@ -XXX,XX +XXX,XX @@ struct ArchCPU {
     bool prop_pauth;
     bool prop_pauth_impdef;
     bool prop_lpa2;
+    OnOffAuto prop_mte;
 
     /* DCZ blocksize, in log_2(words), ie low 4 bits of DCZID_EL0 */
     uint32_t dcz_blocksize;
diff --git a/target/arm/kvm_arm.h b/target/arm/kvm_arm.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/kvm_arm.h
+++ b/target/arm/kvm_arm.h
@@ -XXX,XX +XXX,XX @@ bool kvm_arm_pmu_supported(void);
  */
 bool kvm_arm_sve_supported(void);
 
+/**
+ * kvm_arm_mte_supported:
+ *
+ * Returns: true if KVM can enable MTE, and false otherwise.
+ */
+bool kvm_arm_mte_supported(void);
+
 /**
  * kvm_arm_get_max_vm_ipa_size:
  * @ms: Machine state handle
@@ -XXX,XX +XXX,XX @@ void kvm_arm_pvtime_init(CPUState *cs, uint64_t ipa);
 
 int kvm_arm_set_irq(int cpu, int irqtype, int irq, int level);
 
+void kvm_arm_enable_mte(Object *cpuobj, Error **errp);
+
 #else
 
 /*
@@ -XXX,XX +XXX,XX @@ static inline bool kvm_arm_steal_time_supported(void)
     return false;
 }
 
+static inline bool kvm_arm_mte_supported(void)
+{
+    return false;
+}
+
 /*
  * These functions should never actually be called without KVM support.
  */
@@ -XXX,XX +XXX,XX @@ static inline uint32_t kvm_arm_sve_get_vls(CPUState *cs)
     g_assert_not_reached();
 }
 
+static inline void kvm_arm_enable_mte(Object *cpuobj, Error **errp)
+{
+    g_assert_not_reached();
+}
+
 #endif
 
 static inline const char *gic_class_name(void)
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -XXX,XX +XXX,XX @@ static void machvirt_init(MachineState *machine)
         exit(1);
     }
 
-    if (vms->mte && (kvm_enabled() || hvf_enabled())) {
+    if (vms->mte && hvf_enabled()) {
         error_report("mach-virt: %s does not support providing "
                      "MTE to the guest CPU",
                      current_accel_name());
@@ -XXX,XX +XXX,XX @@ static void machvirt_init(MachineState *machine)
         }
 
         if (vms->mte) {
-            /* Create the memory region only once, but link to all cpus. */
-            if (!tag_sysmem) {
-                /*
-                 * The property exists only if MemTag is supported.
-                 * If it is, we must allocate the ram to back that up.
-                 */
-                if (!object_property_find(cpuobj, "tag-memory")) {
-                    error_report("MTE requested, but not supported "
-                                 "by the guest CPU");
+            if (tcg_enabled()) {
+                /* Create the memory region only once, but link to all cpus. */
+                if (!tag_sysmem) {
+                    /*
+                     * The property exists only if MemTag is supported.
+                     * If it is, we must allocate the ram to back that up.
+                     */
+                    if (!object_property_find(cpuobj, "tag-memory")) {
+                        error_report("MTE requested, but not supported "
+                                     "by the guest CPU");
+                        exit(1);
+                    }
+
+                    tag_sysmem = g_new(MemoryRegion, 1);
+                    memory_region_init(tag_sysmem, OBJECT(machine),
+                                       "tag-memory", UINT64_MAX / 32);
+
+                    if (vms->secure) {
+                        secure_tag_sysmem = g_new(MemoryRegion, 1);
+                        memory_region_init(secure_tag_sysmem, OBJECT(machine),
+                                           "secure-tag-memory",
+                                           UINT64_MAX / 32);
+
+                        /* As with ram, secure-tag takes precedence over tag. */
+                        memory_region_add_subregion_overlap(secure_tag_sysmem,
+                                                            0, tag_sysmem, -1);
+                    }
+                }
+
+                object_property_set_link(cpuobj, "tag-memory",
+                                         OBJECT(tag_sysmem), &error_abort);
+                if (vms->secure) {
+                    object_property_set_link(cpuobj, "secure-tag-memory",
+                                             OBJECT(secure_tag_sysmem),
+                                             &error_abort);
+                }
+            } else if (kvm_enabled()) {
+                if (!kvm_arm_mte_supported()) {
+                    error_report("MTE requested, but not supported by KVM");
                     exit(1);
                 }
-
-                tag_sysmem = g_new(MemoryRegion, 1);
-                memory_region_init(tag_sysmem, OBJECT(machine),
-                                   "tag-memory", UINT64_MAX / 32);
-
-                if (vms->secure) {
-                    secure_tag_sysmem = g_new(MemoryRegion, 1);
-                    memory_region_init(secure_tag_sysmem, OBJECT(machine),
-                                       "secure-tag-memory", UINT64_MAX / 32);
-
-                    /* As with ram, secure-tag takes precedence over tag.  */
-                    memory_region_add_subregion_overlap(secure_tag_sysmem, 0,
-                                                        tag_sysmem, -1);
-                }
-            }
-
-            object_property_set_link(cpuobj, "tag-memory", OBJECT(tag_sysmem),
-                                     &error_abort);
-            if (vms->secure) {
-                object_property_set_link(cpuobj, "secure-tag-memory",
-                                         OBJECT(secure_tag_sysmem),
-                                         &error_abort);
+                kvm_arm_enable_mte(cpuobj, &error_abort);
             }
         }
 
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ void arm_cpu_post_init(Object *obj)
                                      qdev_prop_allow_set_link_before_realize,
                                      OBJ_PROP_LINK_STRONG);
         }
+        cpu->has_mte = true;
     }
 #endif
 }
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
         }
         if (cpu->tag_memory) {
             error_setg(errp,
-                       "Cannot enable %s when guest CPUs has MTE enabled",
+                       "Cannot enable %s when guest CPUs has tag memory enabled",
                        current_accel_name());
             return;
         }
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
     }
 
 #ifndef CONFIG_USER_ONLY
-    if (cpu->tag_memory == NULL && cpu_isar_feature(aa64_mte, cpu)) {
+    if (!cpu->has_mte && cpu_isar_feature(aa64_mte, cpu)) {
         /*
-         * Disable the MTE feature bits if we do not have tag-memory
-         * provided by the machine.
+         * Disable the MTE feature bits if we do not have the feature
+         * setup by the machine.
          */
         cpu->isar.id_aa64pfr1 =
             FIELD_DP64(cpu->isar.id_aa64pfr1, ID_AA64PFR1, MTE, 0);
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -XXX,XX +XXX,XX @@
 #include "hw/boards.h"
 #include "hw/irq.h"
 #include "qemu/log.h"
+#include "migration/blocker.h"
 
 const KVMCapabilityInfo kvm_arch_required_capabilities[] = {
     KVM_CAP_LAST_INFO
@@ -XXX,XX +XXX,XX @@ bool kvm_arch_cpu_check_are_resettable(void)
 void kvm_arch_accel_class_init(ObjectClass *oc)
 {
 }
+
+void kvm_arm_enable_mte(Object *cpuobj, Error **errp)
+{
+    static bool tried_to_enable;
+    static bool succeeded_to_enable;
+    Error *mte_migration_blocker = NULL;
+    int ret;
+
+    if (!tried_to_enable) {
+        /*
+         * MTE on KVM is enabled on a per-VM basis (and retrying doesn't make
+         * sense), and we only want a single migration blocker as well.
+         */
+        tried_to_enable = true;
+
+        ret = kvm_vm_enable_cap(kvm_state, KVM_CAP_ARM_MTE, 0);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to enable KVM_CAP_ARM_MTE");
+            return;
+        }
+
+        /* TODO: add proper migration support with MTE enabled */
+        error_setg(&mte_migration_blocker,
+                   "Live migration disabled due to MTE enabled");
+        if (migrate_add_blocker(mte_migration_blocker, errp)) {
+            error_free(mte_migration_blocker);
+            return;
+        }
+        succeeded_to_enable = true;
+    }
+    if (succeeded_to_enable) {
+        object_property_set_bool(cpuobj, "has_mte", true, NULL);
+    }
+}
diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/kvm64.c
+++ b/target/arm/kvm64.c
@@ -XXX,XX +XXX,XX @@ bool kvm_arm_steal_time_supported(void)
     return kvm_check_extension(kvm_state, KVM_CAP_STEAL_TIME);
 }
 
+bool kvm_arm_mte_supported(void)
+{
+    return kvm_check_extension(kvm_state, KVM_CAP_ARM_MTE);
+}
+
 QEMU_BUILD_BUG_ON(KVM_ARM64_SVE_VQ_MIN != 1);
 
 uint32_t kvm_arm_sve_get_vls(CPUState *cs)
-- 
2.34.1

From: Alex Bennée <alex.bennee@linaro.org>

The commit b3aa2f2128 (target/arm: provide stubs for more external
debug registers) was added to handle HyperV's unconditional usage of
Debug Communications Channel. It turns out that Linux will similarly
break if you enable CONFIG_HVC_DCC "ARM JTAG DCC console".

Extend the registers we RAZ/WI set to avoid this.

Cc: Anders Roxell <anders.roxell@linaro.org>
Cc: Evgeny Iakovlev <eiakovlev@linux.microsoft.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230516104420.407912-1-alex.bennee@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/debug_helper.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/target/arm/debug_helper.c b/target/arm/debug_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/debug_helper.c
+++ b/target/arm/debug_helper.c
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo debug_cp_reginfo[] = {
       .access = PL0_R, .accessfn = access_tdcc,
       .type = ARM_CP_CONST, .resetvalue = 0 },
     /*
-     * OSDTRRX_EL1/OSDTRTX_EL1 are used for save and restore of DBGDTRRX_EL0.
-     * It is a component of the Debug Communications Channel, which is not implemented.
+     * These registers belong to the Debug Communications Channel,
+     * which is not implemented. However we implement RAZ/WI behaviour
+     * with trapping to prevent spurious SIGILLs if the guest OS does
+     * access them as the support cannot be probed for.
      */
     { .name = "OSDTRRX_EL1", .state = ARM_CP_STATE_BOTH, .cp = 14,
       .opc0 = 2, .opc1 = 0, .crn = 0, .crm = 0, .opc2 = 2,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo debug_cp_reginfo[] = {
       .opc0 = 2, .opc1 = 0, .crn = 0, .crm = 3, .opc2 = 2,
       .access = PL1_RW, .accessfn = access_tdcc,
       .type = ARM_CP_CONST, .resetvalue = 0 },
+    /* DBGDTRTX_EL0/DBGDTRRX_EL0 depend on direction */
+    { .name = "DBGDTR_EL0", .state = ARM_CP_STATE_BOTH, .cp = 14,
+      .opc0 = 2, .opc1 = 3, .crn = 0, .crm = 5, .opc2 = 0,
+      .access = PL0_RW, .accessfn = access_tdcc,
+      .type = ARM_CP_CONST, .resetvalue = 0 },
     /*
      * OSECCR_EL1 provides a mechanism for an operating system
      * to access the contents of EDECCR. EDECCR is not implemented though,
-- 
2.34.1

From: Richard Henderson <richard.henderson@linaro.org>

Split out all of the decode stuff from aarch64_tr_translate_insn.
Call it disas_a64_legacy to indicate it will be replaced.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20230512144106.3608981-2-peter.maydell@linaro.org
[PMM: Rebased]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/translate-a64.c | 82 ++++++++++++++++++----------------
 1 file changed, 44 insertions(+), 38 deletions(-)

diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static bool btype_destination_ok(uint32_t insn, bool bt, int btype)
     return false;
 }
 
+/* C3.1 A64 instruction index by encoding */
+static void disas_a64_legacy(DisasContext *s, uint32_t insn)
+{
+    switch (extract32(insn, 25, 4)) {
+    case 0x0:
+        if (!extract32(insn, 31, 1) || !disas_sme(s, insn)) {
+            unallocated_encoding(s);
+        }
+        break;
+    case 0x1: case 0x3: /* UNALLOCATED */
+        unallocated_encoding(s);
+        break;
+    case 0x2:
+        if (!disas_sve(s, insn)) {
+            unallocated_encoding(s);
+        }
+        break;
+    case 0x8: case 0x9: /* Data processing - immediate */
+        disas_data_proc_imm(s, insn);
+        break;
+    case 0xa: case 0xb: /* Branch, exception generation and system insns */
+        disas_b_exc_sys(s, insn);
+        break;
+    case 0x4:
+    case 0x6:
+    case 0xc:
+    case 0xe:      /* Loads and stores */
+        disas_ldst(s, insn);
+        break;
+    case 0x5:
+    case 0xd:      /* Data processing - register */
+        disas_data_proc_reg(s, insn);
+        break;
+    case 0x7:
+    case 0xf:      /* Data processing - SIMD and floating point */
+        disas_data_proc_simd_fp(s, insn);
+        break;
+    default:
+        assert(FALSE); /* all 15 cases should be handled above */
+        break;
+    }
+}
+
 static void aarch64_tr_init_disas_context(DisasContextBase *dcbase,
                                           CPUState *cpu)
 {
@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
         disas_sme_fa64(s, insn);
     }
 
-    switch (extract32(insn, 25, 4)) {
-    case 0x0:
-        if (!extract32(insn, 31, 1) || !disas_sme(s, insn)) {
-            unallocated_encoding(s);
-        }
-        break;
-    case 0x1: case 0x3: /* UNALLOCATED */
-        unallocated_encoding(s);
-        break;
-    case 0x2:
-        if (!disas_sve(s, insn)) {
-            unallocated_encoding(s);
-        }
-        break;
-    case 0x8: case 0x9: /* Data processing - immediate */
-        disas_data_proc_imm(s, insn);
-        break;
-    case 0xa: case 0xb: /* Branch, exception generation and system insns */
-        disas_b_exc_sys(s, insn);
-        break;
-    case 0x4:
-    case 0x6:
-    case 0xc:
-    case 0xe:      /* Loads and stores */
-        disas_ldst(s, insn);
-        break;
-    case 0x5:
-    case 0xd:      /* Data processing - register */
-        disas_data_proc_reg(s, insn);
-        break;
-    case 0x7:
-    case 0xf:      /* Data processing - SIMD and floating point */
-        disas_data_proc_simd_fp(s, insn);
-        break;
-    default:
-        assert(FALSE); /* all 15 cases should be handled above */
-        break;
-    }
+    disas_a64_legacy(s, insn);
 
     /*
      * After execution of most insns, btype is reset to 0.
-- 
2.34.1

The A64 translator uses a hand-written decoder for everything except
SVE or SME.  It's fairly well structured, but it's becoming obvious
that it's still more painful to add instructions to than the A32
translator, because putting a new instruction into the right place in
a hand-written decoder is much harder than adding new instruction
patterns to a decodetree file.

As the first step in conversion to decodetree, create the skeleton of
the decodetree decoder; where it does not handle instructions we will
fall back to the legacy decoder (which will be for everything at the
moment, since there are no patterns in a64.decode).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230512144106.3608981-3-peter.maydell@linaro.org
---
 target/arm/tcg/a64.decode      | 20 ++++++++++++++++++++
 target/arm/tcg/translate-a64.c | 18 +++++++++++-------
 target/arm/tcg/meson.build     |  1 +
 3 files changed, 32 insertions(+), 7 deletions(-)
 create mode 100644 target/arm/tcg/a64.decode

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@
+# AArch64 A64 allowed instruction decoding
+#
+#  Copyright (c) 2023 Linaro, Ltd
+#
+# This library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2.1 of the License, or (at your option) any later version.
+#
+# This library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with this library; if not, see <http://www.gnu.org/licenses/>.
+
+#
+# This file is processed by scripts/decodetree.py
+#
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ enum a64_shift_type {
     A64_SHIFT_TYPE_ROR = 3
 };
 
+/*
+ * Include the generated decoders.
+ */
+
+#include "decode-sme-fa64.c.inc"
+#include "decode-a64.c.inc"
+
 /* Table based decoder typedefs - used when the relevant bits for decode
  * are too awkwardly scattered across the instruction (eg SIMD).
  */
@@ -XXX,XX +XXX,XX @@ static void disas_data_proc_simd_fp(DisasContext *s, uint32_t insn)
     }
 }
 
-/*
- * Include the generated SME FA64 decoder.
- */
-
-#include "decode-sme-fa64.c.inc"
-
 static bool trans_OK(DisasContext *s, arg_OK *a)
 {
     return true;
@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
         disas_sme_fa64(s, insn);
     }
 
-    disas_a64_legacy(s, insn);
+
+    if (!disas_a64(s, insn)) {
+        disas_a64_legacy(s, insn);
+    }
 
     /*
      * After execution of most insns, btype is reset to 0.
diff --git a/target/arm/tcg/meson.build b/target/arm/tcg/meson.build
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/meson.build
+++ b/target/arm/tcg/meson.build
@@ -XXX,XX +XXX,XX @@ gen = [
   decodetree.process('a32-uncond.decode', extra_args: '--static-decode=disas_a32_uncond'),
   decodetree.process('t32.decode', extra_args: '--static-decode=disas_t32'),
   decodetree.process('t16.decode', extra_args: ['-w', '16', '--static-decode=disas_t16']),
+  decodetree.process('a64.decode', extra_args: ['--static-decode=disas_a64']),
 ]
 
 arm_ss.add(gen)
-- 
2.34.1

The SVE and SME decode is already done by decodetree.  Pull the calls
to these decoders out of the legacy decoder.  This doesn't change
behaviour because all the patterns in sve.decode and sme.decode
already require the bits that the legacy decoder is decoding to have
the correct values.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230512144106.3608981-4-peter.maydell@linaro.org
---
 target/arm/tcg/translate-a64.c | 20 ++++----------------
 1 file changed, 4 insertions(+), 16 deletions(-)

diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static bool btype_destination_ok(uint32_t insn, bool bt, int btype)
 static void disas_a64_legacy(DisasContext *s, uint32_t insn)
 {
     switch (extract32(insn, 25, 4)) {
-    case 0x0:
-        if (!extract32(insn, 31, 1) || !disas_sme(s, insn)) {
-            unallocated_encoding(s);
-        }
-        break;
-    case 0x1: case 0x3: /* UNALLOCATED */
-        unallocated_encoding(s);
-        break;
-    case 0x2:
-        if (!disas_sve(s, insn)) {
-            unallocated_encoding(s);
-        }
-        break;
     case 0x8: case 0x9: /* Data processing - immediate */
         disas_data_proc_imm(s, insn);
         break;
@@ -XXX,XX +XXX,XX @@ static void disas_a64_legacy(DisasContext *s, uint32_t insn)
         disas_data_proc_simd_fp(s, insn);
         break;
     default:
-        assert(FALSE); /* all 15 cases should be handled above */
+        unallocated_encoding(s);
         break;
     }
 }
@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
         disas_sme_fa64(s, insn);
     }
 
-
-    if (!disas_a64(s, insn)) {
+    if (!disas_a64(s, insn) &&
+        !disas_sme(s, insn) &&
+        !disas_sve(s, insn)) {
         disas_a64_legacy(s, insn);
     }
 
-- 
2.34.1

From: Richard Henderson <richard.henderson@linaro.org>

Convert the ADR and ADRP instructions.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20230512144106.3608981-5-peter.maydell@linaro.org
[PMM: Rebased]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/a64.decode      | 13 ++++++++++++
 target/arm/tcg/translate-a64.c | 38 +++++++++++++---------------------
 2 files changed, 27 insertions(+), 24 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@
 #
 # This file is processed by scripts/decodetree.py
 #
+
+&ri              rd imm
+
+
+### Data Processing - Immediate
+
+# PC-rel addressing
+
+%imm_pcrel      5:s19 29:2
+@pcrel          . .. ..... ................... rd:5     &ri imm=%imm_pcrel
+
+ADR             0 .. 10000 ................... .....    @pcrel
+ADRP            1 .. 10000 ................... .....    @pcrel
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void disas_ldst(DisasContext *s, uint32_t insn)
     }
 }
 
-/* PC-rel. addressing
- *   31  30   29 28       24 23                5 4    0
- * +----+-------+-----------+-------------------+------+
- * | op | immlo | 1 0 0 0 0 |       immhi       |  Rd  |
- * +----+-------+-----------+-------------------+------+
+/*
+ * PC-rel. addressing
  */
-static void disas_pc_rel_adr(DisasContext *s, uint32_t insn)
+
+static bool trans_ADR(DisasContext *s, arg_ri *a)
 {
-    unsigned int page, rd;
-    int64_t offset;
+    gen_pc_plus_diff(s, cpu_reg(s, a->rd), a->imm);
+    return true;
+}
 
-    page = extract32(insn, 31, 1);
-    /* SignExtend(immhi:immlo) -> offset */
-    offset = sextract64(insn, 5, 19);
-    offset = offset << 2 | extract32(insn, 29, 2);
-    rd = extract32(insn, 0, 5);
+static bool trans_ADRP(DisasContext *s, arg_ri *a)
+{
+    int64_t offset = (int64_t)a->imm << 12;
 
-    if (page) {
-        /* ADRP (page based) */
-        offset <<= 12;
-        /* The page offset is ok for CF_PCREL. */
-        offset -= s->pc_curr & 0xfff;
-    }
-
-    gen_pc_plus_diff(s, cpu_reg(s, rd), offset);
+    /* The page offset is ok for CF_PCREL. */
+    offset -= s->pc_curr & 0xfff;
+    gen_pc_plus_diff(s, cpu_reg(s, a->rd), offset);
+    return true;
 }
 
 /*
@@ -XXX,XX +XXX,XX @@ static void disas_extract(DisasContext *s, uint32_t insn)
 static void disas_data_proc_imm(DisasContext *s, uint32_t insn)
 {
     switch (extract32(insn, 23, 6)) {
-    case 0x20: case 0x21: /* PC-rel. addressing */
-        disas_pc_rel_adr(s, insn);
-        break;
     case 0x22: /* Add/subtract (immediate) */
         disas_add_sub_imm(s, insn);
         break;
-- 
2.34.1

From: Richard Henderson <richard.henderson@linaro.org>

Split out specific 32-bit and 64-bit functions.
These carry the same signature as tcg_gen_add_i64,
and so will be easier to pass as callbacks.

Retain gen_add_CC and gen_sub_CC during conversion.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20230512144106.3608981-6-peter.maydell@linaro.org
[PMM: rebased]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/translate-a64.c | 149 +++++++++++++++++++--------------
 1 file changed, 84 insertions(+), 65 deletions(-)

diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static inline void gen_logic_CC(int sf, TCGv_i64 result)
 }
 
 /* dest = T0 + T1; compute C, N, V and Z flags */
+static void gen_add64_CC(TCGv_i64 dest, TCGv_i64 t0, TCGv_i64 t1)
+{
+    TCGv_i64 result, flag, tmp;
+    result = tcg_temp_new_i64();
+    flag = tcg_temp_new_i64();
+    tmp = tcg_temp_new_i64();
+
+    tcg_gen_movi_i64(tmp, 0);
+    tcg_gen_add2_i64(result, flag, t0, tmp, t1, tmp);
+
+    tcg_gen_extrl_i64_i32(cpu_CF, flag);
+
+    gen_set_NZ64(result);
+
+    tcg_gen_xor_i64(flag, result, t0);
+    tcg_gen_xor_i64(tmp, t0, t1);
+    tcg_gen_andc_i64(flag, flag, tmp);
+    tcg_gen_extrh_i64_i32(cpu_VF, flag);
+
+    tcg_gen_mov_i64(dest, result);
+}
+
+static void gen_add32_CC(TCGv_i64 dest, TCGv_i64 t0, TCGv_i64 t1)
+{
+    TCGv_i32 t0_32 = tcg_temp_new_i32();
+    TCGv_i32 t1_32 = tcg_temp_new_i32();
+    TCGv_i32 tmp = tcg_temp_new_i32();
+
+    tcg_gen_movi_i32(tmp, 0);
+    tcg_gen_extrl_i64_i32(t0_32, t0);
+    tcg_gen_extrl_i64_i32(t1_32, t1);
+    tcg_gen_add2_i32(cpu_NF, cpu_CF, t0_32, tmp, t1_32, tmp);
+    tcg_gen_mov_i32(cpu_ZF, cpu_NF);
+    tcg_gen_xor_i32(cpu_VF, cpu_NF, t0_32);
+    tcg_gen_xor_i32(tmp, t0_32, t1_32);
+    tcg_gen_andc_i32(cpu_VF, cpu_VF, tmp);
+    tcg_gen_extu_i32_i64(dest, cpu_NF);
+}
+
 static void gen_add_CC(int sf, TCGv_i64 dest, TCGv_i64 t0, TCGv_i64 t1)
 {
     if (sf) {
-        TCGv_i64 result, flag, tmp;
-        result = tcg_temp_new_i64();
-        flag = tcg_temp_new_i64();
-        tmp = tcg_temp_new_i64();
-
-        tcg_gen_movi_i64(tmp, 0);
-        tcg_gen_add2_i64(result, flag, t0, tmp, t1, tmp);
-
-        tcg_gen_extrl_i64_i32(cpu_CF, flag);
-
-        gen_set_NZ64(result);
-
-        tcg_gen_xor_i64(flag, result, t0);
-        tcg_gen_xor_i64(tmp, t0, t1);
-        tcg_gen_andc_i64(flag, flag, tmp);
-        tcg_gen_extrh_i64_i32(cpu_VF, flag);
-
-        tcg_gen_mov_i64(dest, result);
+        gen_add64_CC(dest, t0, t1);
     } else {
-        /* 32 bit arithmetic */
-        TCGv_i32 t0_32 = tcg_temp_new_i32();
-        TCGv_i32 t1_32 = tcg_temp_new_i32();
-        TCGv_i32 tmp = tcg_temp_new_i32();
-
-        tcg_gen_movi_i32(tmp, 0);
-        tcg_gen_extrl_i64_i32(t0_32, t0);
-        tcg_gen_extrl_i64_i32(t1_32, t1);
-        tcg_gen_add2_i32(cpu_NF, cpu_CF, t0_32, tmp, t1_32, tmp);
-        tcg_gen_mov_i32(cpu_ZF, cpu_NF);
-        tcg_gen_xor_i32(cpu_VF, cpu_NF, t0_32);
-        tcg_gen_xor_i32(tmp, t0_32, t1_32);
-        tcg_gen_andc_i32(cpu_VF, cpu_VF, tmp);
-        tcg_gen_extu_i32_i64(dest, cpu_NF);
+        gen_add32_CC(dest, t0, t1);
     }
 }
 
 /* dest = T0 - T1; compute C, N, V and Z flags */
+static void gen_sub64_CC(TCGv_i64 dest, TCGv_i64 t0, TCGv_i64 t1)
+{
+    /* 64 bit arithmetic */
+    TCGv_i64 result, flag, tmp;
+
+    result = tcg_temp_new_i64();
+    flag = tcg_temp_new_i64();
+    tcg_gen_sub_i64(result, t0, t1);
+
+    gen_set_NZ64(result);
+
+    tcg_gen_setcond_i64(TCG_COND_GEU, flag, t0, t1);
+    tcg_gen_extrl_i64_i32(cpu_CF, flag);
+
+    tcg_gen_xor_i64(flag, result, t0);
+    tmp = tcg_temp_new_i64();
+    tcg_gen_xor_i64(tmp, t0, t1);
+    tcg_gen_and_i64(flag, flag, tmp);
+    tcg_gen_extrh_i64_i32(cpu_VF, flag);
+    tcg_gen_mov_i64(dest, result);
+}
+
+static void gen_sub32_CC(TCGv_i64 dest, TCGv_i64 t0, TCGv_i64 t1)
+{
+    /* 32 bit arithmetic */
+    TCGv_i32 t0_32 = tcg_temp_new_i32();
+    TCGv_i32 t1_32 = tcg_temp_new_i32();
+    TCGv_i32 tmp;
+
+    tcg_gen_extrl_i64_i32(t0_32, t0);
+    tcg_gen_extrl_i64_i32(t1_32, t1);
+    tcg_gen_sub_i32(cpu_NF, t0_32, t1_32);
+    tcg_gen_mov_i32(cpu_ZF, cpu_NF);
+    tcg_gen_setcond_i32(TCG_COND_GEU, cpu_CF, t0_32, t1_32);
+    tcg_gen_xor_i32(cpu_VF, cpu_NF, t0_32);
+    tmp = tcg_temp_new_i32();
+    tcg_gen_xor_i32(tmp, t0_32, t1_32);
+    tcg_gen_and_i32(cpu_VF, cpu_VF, tmp);
+    tcg_gen_extu_i32_i64(dest, cpu_NF);
+}
+
 static void gen_sub_CC(int sf, TCGv_i64 dest, TCGv_i64 t0, TCGv_i64 t1)
 {
     if (sf) {
-        /* 64 bit arithmetic */
-        TCGv_i64 result, flag, tmp;
-
-        result = tcg_temp_new_i64();
-        flag = tcg_temp_new_i64();
-        tcg_gen_sub_i64(result, t0, t1);
-
-        gen_set_NZ64(result);
-
-        tcg_gen_setcond_i64(TCG_COND_GEU, flag, t0, t1);
-        tcg_gen_extrl_i64_i32(cpu_CF, flag);
-
-        tcg_gen_xor_i64(flag, result, t0);
-        tmp = tcg_temp_new_i64();
-        tcg_gen_xor_i64(tmp, t0, t1);
-        tcg_gen_and_i64(flag, flag, tmp);
-        tcg_gen_extrh_i64_i32(cpu_VF, flag);
-        tcg_gen_mov_i64(dest, result);
+        gen_sub64_CC(dest, t0, t1);
     } else {
-        /* 32 bit arithmetic */
-        TCGv_i32 t0_32 = tcg_temp_new_i32();
-        TCGv_i32 t1_32 = tcg_temp_new_i32();
-        TCGv_i32 tmp;
-
-        tcg_gen_extrl_i64_i32(t0_32, t0);
-        tcg_gen_extrl_i64_i32(t1_32, t1);
-        tcg_gen_sub_i32(cpu_NF, t0_32, t1_32);
-        tcg_gen_mov_i32(cpu_ZF, cpu_NF);
-        tcg_gen_setcond_i32(TCG_COND_GEU, cpu_CF, t0_32, t1_32);
-        tcg_gen_xor_i32(cpu_VF, cpu_NF, t0_32);
-        tmp = tcg_temp_new_i32();
-        tcg_gen_xor_i32(tmp, t0_32, t1_32);
-        tcg_gen_and_i32(cpu_VF, cpu_VF, tmp);
-        tcg_gen_extu_i32_i64(dest, cpu_NF);
+        gen_sub32_CC(dest, t0, t1);
     }
 }
 
-- 
2.34.1

From: Richard Henderson <richard.henderson@linaro.org>

Convert the ADD and SUB (immediate) instructions.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20230512144106.3608981-7-peter.maydell@linaro.org
[PMM: Rebased; adjusted to use translate.h's TRANS macro]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/translate.h     |  5 +++
 target/arm/tcg/a64.decode      | 17 ++++++++
 target/arm/tcg/translate-a64.c | 73 ++++++++++------------------------
 3 files changed, 42 insertions(+), 53 deletions(-)

diff --git a/target/arm/tcg/translate.h b/target/arm/tcg/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate.h
+++ b/target/arm/tcg/translate.h
@@ -XXX,XX +XXX,XX @@ static inline int rsub_8(DisasContext *s, int x)
     return 8 - x;
 }
 
+static inline int shl_12(DisasContext *s, int x)
+{
+    return x << 12;
+}
+
 static inline int neon_3same_fp_size(DisasContext *s, int x)
 {
     /* Convert 0==fp32, 1==fp16 into a MO_* value */
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@
 #
 
 &ri              rd imm
+&rri_sf          rd rn imm sf
 
 
 ### Data Processing - Immediate
@@ -XXX,XX +XXX,XX @@
 
 ADR             0 .. 10000 ................... .....    @pcrel
 ADRP            1 .. 10000 ................... .....    @pcrel
+
+# Add/subtract (immediate)
+
+%imm12_sh12     10:12 !function=shl_12
+@addsub_imm     sf:1 .. ...... . imm:12 rn:5 rd:5
+@addsub_imm12   sf:1 .. ...... . ............ rn:5 rd:5 imm=%imm12_sh12
+
+ADD_i           . 00 100010 0 ............ ..... .....  @addsub_imm
+ADD_i           . 00 100010 1 ............ ..... .....  @addsub_imm12
+ADDS_i          . 01 100010 0 ............ ..... .....  @addsub_imm
+ADDS_i          . 01 100010 1 ............ ..... .....  @addsub_imm12
+
+SUB_i           . 10 100010 0 ............ ..... .....  @addsub_imm
+SUB_i           . 10 100010 1 ............ ..... .....  @addsub_imm12
+SUBS_i          . 11 100010 0 ............ ..... .....  @addsub_imm
+SUBS_i          . 11 100010 1 ............ ..... .....  @addsub_imm12
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void disas_ldst(DisasContext *s, uint32_t insn)
     }
 }
 
+typedef void ArithTwoOp(TCGv_i64, TCGv_i64, TCGv_i64);
+
+static bool gen_rri(DisasContext *s, arg_rri_sf *a,
+                    bool rd_sp, bool rn_sp, ArithTwoOp *fn)
+{
+    TCGv_i64 tcg_rn = rn_sp ? cpu_reg_sp(s, a->rn) : cpu_reg(s, a->rn);
+    TCGv_i64 tcg_rd = rd_sp ? cpu_reg_sp(s, a->rd) : cpu_reg(s, a->rd);
+    TCGv_i64 tcg_imm = tcg_constant_i64(a->imm);
+
+    fn(tcg_rd, tcg_rn, tcg_imm);
+    if (!a->sf) {
+        tcg_gen_ext32u_i64(tcg_rd, tcg_rd);
+    }
+    return true;
+}
+
 /*
  * PC-rel. addressing
  */
@@ -XXX,XX +XXX,XX @@ static bool trans_ADRP(DisasContext *s, arg_ri *a)
 
 /*
  * Add/subtract (immediate)
- *
- *  31 30 29 28         23 22 21         10 9   5 4   0
- * +--+--+--+-------------+--+-------------+-----+-----+
- * |sf|op| S| 1 0 0 0 1 0 |sh|    imm12    |  Rn | Rd  |
- * +--+--+--+-------------+--+-------------+-----+-----+
- *
- *    sf: 0 -> 32bit, 1 -> 64bit
- *    op: 0 -> add  , 1 -> sub
- *     S: 1 -> set flags
- *    sh: 1 -> LSL imm by 12
  */
-static void disas_add_sub_imm(DisasContext *s, uint32_t insn)
-{
-    int rd = extract32(insn, 0, 5);
-    int rn = extract32(insn, 5, 5);
-    uint64_t imm = extract32(insn, 10, 12);
-    bool shift = extract32(insn, 22, 1);
-    bool setflags = extract32(insn, 29, 1);
-    bool sub_op = extract32(insn, 30, 1);
-    bool is_64bit = extract32(insn, 31, 1);
-
-    TCGv_i64 tcg_rn = cpu_reg_sp(s, rn);
-    TCGv_i64 tcg_rd = setflags ? cpu_reg(s, rd) : cpu_reg_sp(s, rd);
-    TCGv_i64 tcg_result;
-
-    if (shift) {
-        imm <<= 12;
-    }
-
-    tcg_result = tcg_temp_new_i64();
-    if (!setflags) {
-        if (sub_op) {
-            tcg_gen_subi_i64(tcg_result, tcg_rn, imm);
-        } else {
-            tcg_gen_addi_i64(tcg_result, tcg_rn, imm);
-        }
-    } else {
-        TCGv_i64 tcg_imm = tcg_constant_i64(imm);
-        if (sub_op) {
-            gen_sub_CC(is_64bit, tcg_result, tcg_rn, tcg_imm);
-        } else {
-            gen_add_CC(is_64bit, tcg_result, tcg_rn, tcg_imm);
-        }
-    }
-
-    if (is_64bit) {
-        tcg_gen_mov_i64(tcg_rd, tcg_result);
-    } else {
-        tcg_gen_ext32u_i64(tcg_rd, tcg_result);
-    }
-}
+TRANS(ADD_i, gen_rri, a, 1, 1, tcg_gen_add_i64)
+TRANS(SUB_i, gen_rri, a, 1, 1, tcg_gen_sub_i64)
+TRANS(ADDS_i, gen_rri, a, 0, 1, a->sf ? gen_add64_CC : gen_add32_CC)
+TRANS(SUBS_i, gen_rri, a, 0, 1, a->sf ? gen_sub64_CC : gen_sub32_CC)
 
 /*
  * Add/subtract (immediate, with tags)
@@ -XXX,XX +XXX,XX @@ static void disas_extract(DisasContext *s, uint32_t insn)
 static void disas_data_proc_imm(DisasContext *s, uint32_t insn)
 {
     switch (extract32(insn, 23, 6)) {
-    case 0x22: /* Add/subtract (immediate) */
-        disas_add_sub_imm(s, insn);
-        break;
     case 0x23: /* Add/subtract (immediate, with tags) */
         disas_add_sub_imm_with_tags(s, insn);
         break;
-- 
2.34.1

From: Richard Henderson <richard.henderson@linaro.org>

Convert the ADDG and SUBG (immediate) instructions.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20230512144106.3608981-8-peter.maydell@linaro.org
[PMM: Rebased; use TRANS_FEAT()]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/a64.decode      |  8 +++++++
 target/arm/tcg/translate-a64.c | 38 ++++++++++------------------------
 2 files changed, 19 insertions(+), 27 deletions(-)

From: Richard Henderson <richard.henderson@linaro.org>

Use the bitops.h macro rather than rolling our own here.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20230512144106.3608981-9-peter.maydell@linaro.org
---
 target/arm/tcg/translate-a64.c | 11 ++---------
 1 file changed, 2 insertions(+), 9 deletions(-)

diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static uint64_t bitfield_replicate(uint64_t mask, unsigned int e)
     return mask;
 }
 
-/* Return a value with the bottom len bits set (where 0 < len <= 64) */
-static inline uint64_t bitmask64(unsigned int length)
-{
-    assert(length > 0 && length <= 64);
-    return ~0ULL >> (64 - length);
-}
-
 /* Simplified variant of pseudocode DecodeBitMasks() for the case where we
  * only require the wmask. Returns false if the imms/immr/immn are a reserved
  * value (ie should cause a guest UNDEF exception), and true if they are
@@ -XXX,XX +XXX,XX @@ bool logic_imm_decode_wmask(uint64_t *result, unsigned int immn,
     /* Create the value of one element: s+1 set bits rotated
      * by r within the element (which is e bits wide)...
      */
-    mask = bitmask64(s + 1);
+    mask = MAKE_64BIT_MASK(0, s + 1);
     if (r) {
         mask = (mask >> r) | (mask << (e - r));
-        mask &= bitmask64(e);
+        mask &= MAKE_64BIT_MASK(0, e);
     }
     /* ...then replicate the element over the whole 64 bit value */
     mask = bitfield_replicate(mask, e);
-- 
2.34.1

From: Richard Henderson <richard.henderson@linaro.org>

Convert the ADD, ORR, EOR, ANDS (immediate) instructions.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20230512144106.3608981-10-peter.maydell@linaro.org
[PMM: rebased]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/a64.decode      | 15 ++++++
 target/arm/tcg/translate-a64.c | 94 +++++++++++-----------------------
 2 files changed, 44 insertions(+), 65 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@ SUBS_i          . 11 100010 1 ............ ..... .....  @addsub_imm12
 
 ADDG_i          1 00 100011 0 ...... 00 .... ..... ..... @addsub_imm_tag
 SUBG_i          1 10 100011 0 ...... 00 .... ..... ..... @addsub_imm_tag
+
+# Logical (immediate)
+
+&rri_log        rd rn sf dbm
+@logic_imm_64   1 .. ...... dbm:13 rn:5 rd:5            &rri_log sf=1
+@logic_imm_32   0 .. ...... 0 dbm:12 rn:5 rd:5          &rri_log sf=0
+
+AND_i           . 00 100100 . ...... ...... ..... ..... @logic_imm_64
+AND_i           . 00 100100 . ...... ...... ..... ..... @logic_imm_32
+ORR_i           . 01 100100 . ...... ...... ..... ..... @logic_imm_64
+ORR_i           . 01 100100 . ...... ...... ..... ..... @logic_imm_32
+EOR_i           . 10 100100 . ...... ...... ..... ..... @logic_imm_64
+EOR_i           . 10 100100 . ...... ...... ..... ..... @logic_imm_32
+ANDS_i          . 11 100100 . ...... ...... ..... ..... @logic_imm_64
+ANDS_i          . 11 100100 . ...... ...... ..... ..... @logic_imm_32
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static uint64_t bitfield_replicate(uint64_t mask, unsigned int e)
     return mask;
 }
 
-/* Simplified variant of pseudocode DecodeBitMasks() for the case where we
+/*
+ * Logical (immediate)
+ */
+
+/*
+ * Simplified variant of pseudocode DecodeBitMasks() for the case where we
  * only require the wmask. Returns false if the imms/immr/immn are a reserved
  * value (ie should cause a guest UNDEF exception), and true if they are
  * valid, in which case the decoded bit pattern is written to result.
@@ -XXX,XX +XXX,XX @@ bool logic_imm_decode_wmask(uint64_t *result, unsigned int immn,
     return true;
 }
 
-/* Logical (immediate)
- *   31  30 29 28         23 22  21  16 15  10 9    5 4    0
- * +----+-----+-------------+---+------+------+------+------+
- * | sf | opc | 1 0 0 1 0 0 | N | immr | imms |  Rn  |  Rd  |
- * +----+-----+-------------+---+------+------+------+------+
- */
-static void disas_logic_imm(DisasContext *s, uint32_t insn)
+static bool gen_rri_log(DisasContext *s, arg_rri_log *a, bool set_cc,
+                        void (*fn)(TCGv_i64, TCGv_i64, int64_t))
 {
-    unsigned int sf, opc, is_n, immr, imms, rn, rd;
     TCGv_i64 tcg_rd, tcg_rn;
-    uint64_t wmask;
-    bool is_and = false;
+    uint64_t imm;
 
-    sf = extract32(insn, 31, 1);
-    opc = extract32(insn, 29, 2);
-    is_n = extract32(insn, 22, 1);
-    immr = extract32(insn, 16, 6);
-    imms = extract32(insn, 10, 6);
-    rn = extract32(insn, 5, 5);
-    rd = extract32(insn, 0, 5);
-
-    if (!sf && is_n) {
-        unallocated_encoding(s);
-        return;
+    /* Some immediate field values are reserved. */
+    if (!logic_imm_decode_wmask(&imm, extract32(a->dbm, 12, 1),
+                                extract32(a->dbm, 0, 6),
+                                extract32(a->dbm, 6, 6))) {
+        return false;
+    }
+    if (!a->sf) {
+        imm &= 0xffffffffull;
     }
 
-    if (opc == 0x3) { /* ANDS */
-        tcg_rd = cpu_reg(s, rd);
-    } else {
-        tcg_rd = cpu_reg_sp(s, rd);
-    }
-    tcg_rn = cpu_reg(s, rn);
+    tcg_rd = set_cc ? cpu_reg(s, a->rd) : cpu_reg_sp(s, a->rd);
+    tcg_rn = cpu_reg(s, a->rn);
 
-    if (!logic_imm_decode_wmask(&wmask, is_n, imms, immr)) {
-        /* some immediate field values are reserved */
-        unallocated_encoding(s);
-        return;
+    fn(tcg_rd, tcg_rn, imm);
+    if (set_cc) {
+        gen_logic_CC(a->sf, tcg_rd);
     }
-
-    if (!sf) {
-        wmask &= 0xffffffff;
-    }
-
-    switch (opc) {
-    case 0x3: /* ANDS */
-    case 0x0: /* AND */
-        tcg_gen_andi_i64(tcg_rd, tcg_rn, wmask);
-        is_and = true;
-        break;
-    case 0x1: /* ORR */
-        tcg_gen_ori_i64(tcg_rd, tcg_rn, wmask);
-        break;
-    case 0x2: /* EOR */
-        tcg_gen_xori_i64(tcg_rd, tcg_rn, wmask);
-        break;
-    default:
-        assert(FALSE); /* must handle all above */
-        break;
-    }
-
-    if (!sf && !is_and) {
-        /* zero extend final result; we know we can skip this for AND
-         * since the immediate had the high 32 bits clear.
-         */
+    if (!a->sf) {
         tcg_gen_ext32u_i64(tcg_rd, tcg_rd);
     }
-
-    if (opc == 3) { /* ANDS */
-        gen_logic_CC(sf, tcg_rd);
-    }
+    return true;
 }
 
+TRANS(AND_i, gen_rri_log, a, false, tcg_gen_andi_i64)
+TRANS(ORR_i, gen_rri_log, a, false, tcg_gen_ori_i64)
+TRANS(EOR_i, gen_rri_log, a, false, tcg_gen_xori_i64)
+TRANS(ANDS_i, gen_rri_log, a, true, tcg_gen_andi_i64)
+
 /*
  * Move wide (immediate)
  *
@@ -XXX,XX +XXX,XX @@ static void disas_extract(DisasContext *s, uint32_t insn)
 static void disas_data_proc_imm(DisasContext *s, uint32_t insn)
 {
     switch (extract32(insn, 23, 6)) {
-    case 0x24: /* Logical (immediate) */
-        disas_logic_imm(s, insn);
-        break;
     case 0x25: /* Move wide (immediate) */
         disas_movw_imm(s, insn);
         break;
-- 
2.34.1

From: Richard Henderson <richard.henderson@linaro.org>

Convert the MON, MOVZ, MOVK instructions.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20230512144106.3608981-11-peter.maydell@linaro.org
[PMM: Rebased]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/a64.decode      | 13 ++++++
 target/arm/tcg/translate-a64.c | 73 ++++++++++++++--------------------
 2 files changed, 42 insertions(+), 44 deletions(-)

From: Richard Henderson <richard.henderson@linaro.org>

Convert the BFM, SBFM, UBFM instructions.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20230512144106.3608981-12-peter.maydell@linaro.org
[PMM: Rebased]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/a64.decode      |  13 +++
 target/arm/tcg/translate-a64.c | 144 ++++++++++++++++++---------------
 2 files changed, 94 insertions(+), 63 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@ MOVZ            . 10 100101 .. ................ .....   @movw_64
 MOVZ            . 10 100101 .. ................ .....   @movw_32
 MOVK            . 11 100101 .. ................ .....   @movw_64
 MOVK            . 11 100101 .. ................ .....   @movw_32
+
+# Bitfield
+
+&bitfield       rd rn sf immr imms
+@bitfield_64    1 .. ...... 1 immr:6 imms:6 rn:5 rd:5      &bitfield sf=1
+@bitfield_32    0 .. ...... 0 0 immr:5 0 imms:5 rn:5 rd:5  &bitfield sf=0
+
+SBFM            . 00 100110 . ...... ...... ..... ..... @bitfield_64
+SBFM            . 00 100110 . ...... ...... ..... ..... @bitfield_32
+BFM             . 01 100110 . ...... ...... ..... ..... @bitfield_64
+BFM             . 01 100110 . ...... ...... ..... ..... @bitfield_32
+UBFM            . 10 100110 . ...... ...... ..... ..... @bitfield_64
+UBFM            . 10 100110 . ...... ...... ..... ..... @bitfield_32
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static bool trans_MOVK(DisasContext *s, arg_movw *a)
     return true;
 }
 
-/* Bitfield
- *   31  30 29 28         23 22  21  16 15  10 9    5 4    0
- * +----+-----+-------------+---+------+------+------+------+
- * | sf | opc | 1 0 0 1 1 0 | N | immr | imms |  Rn  |  Rd  |
- * +----+-----+-------------+---+------+------+------+------+
+/*
+ * Bitfield
  */
-static void disas_bitfield(DisasContext *s, uint32_t insn)
+
+static bool trans_SBFM(DisasContext *s, arg_SBFM *a)
 {
-    unsigned int sf, n, opc, ri, si, rn, rd, bitsize, pos, len;
-    TCGv_i64 tcg_rd, tcg_tmp;
+    TCGv_i64 tcg_rd = cpu_reg(s, a->rd);
+    TCGv_i64 tcg_tmp = read_cpu_reg(s, a->rn, 1);
+    unsigned int bitsize = a->sf ? 64 : 32;
+    unsigned int ri = a->immr;
+    unsigned int si = a->imms;
+    unsigned int pos, len;
 
-    sf = extract32(insn, 31, 1);
-    opc = extract32(insn, 29, 2);
-    n = extract32(insn, 22, 1);
-    ri = extract32(insn, 16, 6);
-    si = extract32(insn, 10, 6);
-    rn = extract32(insn, 5, 5);
-    rd = extract32(insn, 0, 5);
-    bitsize = sf ? 64 : 32;
-
-    if (sf != n || ri >= bitsize || si >= bitsize || opc > 2) {
-        unallocated_encoding(s);
-        return;
-    }
-
-    tcg_rd = cpu_reg(s, rd);
-
-    /* Suppress the zero-extend for !sf.  Since RI and SI are constrained
-       to be smaller than bitsize, we'll never reference data outside the
-       low 32-bits anyway.  */
-    tcg_tmp = read_cpu_reg(s, rn, 1);
-
-    /* Recognize simple(r) extractions.  */
     if (si >= ri) {
         /* Wd<s-r:0> = Wn<s:r> */
         len = (si - ri) + 1;
-        if (opc == 0) { /* SBFM: ASR, SBFX, SXTB, SXTH, SXTW */
-            tcg_gen_sextract_i64(tcg_rd, tcg_tmp, ri, len);
-            goto done;
-        } else if (opc == 2) { /* UBFM: UBFX, LSR, UXTB, UXTH */
-            tcg_gen_extract_i64(tcg_rd, tcg_tmp, ri, len);
-            return;
+        tcg_gen_sextract_i64(tcg_rd, tcg_tmp, ri, len);
+        if (!a->sf) {
+            tcg_gen_ext32u_i64(tcg_rd, tcg_rd);
         }
-        /* opc == 1, BFXIL fall through to deposit */
+    } else {
+        /* Wd<32+s-r,32-r> = Wn<s:0> */
+        len = si + 1;
+        pos = (bitsize - ri) & (bitsize - 1);
+
+        if (len < ri) {
+            /*
+             * Sign extend the destination field from len to fill the
+             * balance of the word.  Let the deposit below insert all
+             * of those sign bits.
+             */
+            tcg_gen_sextract_i64(tcg_tmp, tcg_tmp, 0, len);
+            len = ri;
+        }
+
+        /*
+         * We start with zero, and we haven't modified any bits outside
+         * bitsize, therefore no final zero-extension is unneeded for !sf.
+         */
+        tcg_gen_deposit_z_i64(tcg_rd, tcg_tmp, pos, len);
+    }
+    return true;
+}
+
+static bool trans_UBFM(DisasContext *s, arg_UBFM *a)
+{
+    TCGv_i64 tcg_rd = cpu_reg(s, a->rd);
+    TCGv_i64 tcg_tmp = read_cpu_reg(s, a->rn, 1);
+    unsigned int bitsize = a->sf ? 64 : 32;
+    unsigned int ri = a->immr;
+    unsigned int si = a->imms;
+    unsigned int pos, len;
+
+    tcg_rd = cpu_reg(s, a->rd);
+    tcg_tmp = read_cpu_reg(s, a->rn, 1);
+
+    if (si >= ri) {
+        /* Wd<s-r:0> = Wn<s:r> */
+        len = (si - ri) + 1;
+        tcg_gen_extract_i64(tcg_rd, tcg_tmp, ri, len);
+    } else {
+        /* Wd<32+s-r,32-r> = Wn<s:0> */
+        len = si + 1;
+        pos = (bitsize - ri) & (bitsize - 1);
+        tcg_gen_deposit_z_i64(tcg_rd, tcg_tmp, pos, len);
+    }
+    return true;
+}
+
+static bool trans_BFM(DisasContext *s, arg_BFM *a)
+{
+    TCGv_i64 tcg_rd = cpu_reg(s, a->rd);
+    TCGv_i64 tcg_tmp = read_cpu_reg(s, a->rn, 1);
+    unsigned int bitsize = a->sf ? 64 : 32;
+    unsigned int ri = a->immr;
+    unsigned int si = a->imms;
+    unsigned int pos, len;
+
+    tcg_rd = cpu_reg(s, a->rd);
+    tcg_tmp = read_cpu_reg(s, a->rn, 1);
+
+    if (si >= ri) {
+        /* Wd<s-r:0> = Wn<s:r> */
         tcg_gen_shri_i64(tcg_tmp, tcg_tmp, ri);
+        len = (si - ri) + 1;
         pos = 0;
     } else {
-        /* Handle the ri > si case with a deposit
-         * Wd<32+s-r,32-r> = Wn<s:0>
-         */
+        /* Wd<32+s-r,32-r> = Wn<s:0> */
         len = si + 1;
         pos = (bitsize - ri) & (bitsize - 1);
     }
 
-    if (opc == 0 && len < ri) {
-        /* SBFM: sign extend the destination field from len to fill
-           the balance of the word.  Let the deposit below insert all
-           of those sign bits.  */
-        tcg_gen_sextract_i64(tcg_tmp, tcg_tmp, 0, len);
-        len = ri;
-    }
-
-    if (opc == 1) { /* BFM, BFXIL */
-        tcg_gen_deposit_i64(tcg_rd, tcg_rd, tcg_tmp, pos, len);
-    } else {
-        /* SBFM or UBFM: We start with zero, and we haven't modified
-           any bits outside bitsize, therefore the zero-extension
-           below is unneeded.  */
-        tcg_gen_deposit_z_i64(tcg_rd, tcg_tmp, pos, len);
-        return;
-    }
-
- done:
-    if (!sf) { /* zero extend final result */
+    tcg_gen_deposit_i64(tcg_rd, tcg_rd, tcg_tmp, pos, len);
+    if (!a->sf) {
         tcg_gen_ext32u_i64(tcg_rd, tcg_rd);
     }
+    return true;
 }
 
 /* Extract
@@ -XXX,XX +XXX,XX @@ static void disas_extract(DisasContext *s, uint32_t insn)
 static void disas_data_proc_imm(DisasContext *s, uint32_t insn)
 {
     switch (extract32(insn, 23, 6)) {
-    case 0x26: /* Bitfield */
-        disas_bitfield(s, insn);
-        break;
     case 0x27: /* Extract */
         disas_extract(s, insn);
         break;
-- 
2.34.1

Convert the EXTR instruction to decodetree (this is the
only one in the 'Extract" class). This is the last of
the dp-immediate insns in the legacy decoder, so we
can now remove disas_data_proc_imm().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230512144106.3608981-13-peter.maydell@linaro.org
---
 target/arm/tcg/a64.decode      |  7 +++
 target/arm/tcg/translate-a64.c | 94 +++++++++++-----------------------
 2 files changed, 36 insertions(+), 65 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@ BFM             . 01 100110 . ...... ...... ..... ..... @bitfield_64
 BFM             . 01 100110 . ...... ...... ..... ..... @bitfield_32
 UBFM            . 10 100110 . ...... ...... ..... ..... @bitfield_64
 UBFM            . 10 100110 . ...... ...... ..... ..... @bitfield_32
+
+# Extract
+
+&extract        rd rn rm imm sf
+
+EXTR            1 00 100111 1 0 rm:5 imm:6 rn:5 rd:5     &extract sf=1
+EXTR            0 00 100111 0 0 rm:5 0 imm:5 rn:5 rd:5   &extract sf=0
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static bool trans_BFM(DisasContext *s, arg_BFM *a)
     return true;
 }
 
-/* Extract
- *   31  30  29 28         23 22   21  20  16 15    10 9    5 4    0
- * +----+------+-------------+---+----+------+--------+------+------+
- * | sf | op21 | 1 0 0 1 1 1 | N | o0 |  Rm  |  imms  |  Rn  |  Rd  |
- * +----+------+-------------+---+----+------+--------+------+------+
- */
-static void disas_extract(DisasContext *s, uint32_t insn)
+static bool trans_EXTR(DisasContext *s, arg_extract *a)
 {
-    unsigned int sf, n, rm, imm, rn, rd, bitsize, op21, op0;
+    TCGv_i64 tcg_rd, tcg_rm, tcg_rn;
 
-    sf = extract32(insn, 31, 1);
-    n = extract32(insn, 22, 1);
-    rm = extract32(insn, 16, 5);
-    imm = extract32(insn, 10, 6);
-    rn = extract32(insn, 5, 5);
-    rd = extract32(insn, 0, 5);
-    op21 = extract32(insn, 29, 2);
-    op0 = extract32(insn, 21, 1);
-    bitsize = sf ? 64 : 32;
+    tcg_rd = cpu_reg(s, a->rd);
 
-    if (sf != n || op21 || op0 || imm >= bitsize) {
-        unallocated_encoding(s);
-    } else {
-        TCGv_i64 tcg_rd, tcg_rm, tcg_rn;
-
-        tcg_rd = cpu_reg(s, rd);
-
-        if (unlikely(imm == 0)) {
-            /* tcg shl_i32/shl_i64 is undefined for 32/64 bit shifts,
-             * so an extract from bit 0 is a special case.
-             */
-            if (sf) {
-                tcg_gen_mov_i64(tcg_rd, cpu_reg(s, rm));
-            } else {
-                tcg_gen_ext32u_i64(tcg_rd, cpu_reg(s, rm));
-            }
+    if (unlikely(a->imm == 0)) {
+        /*
+         * tcg shl_i32/shl_i64 is undefined for 32/64 bit shifts,
+         * so an extract from bit 0 is a special case.
+         */
+        if (a->sf) {
+            tcg_gen_mov_i64(tcg_rd, cpu_reg(s, a->rm));
         } else {
-            tcg_rm = cpu_reg(s, rm);
-            tcg_rn = cpu_reg(s, rn);
+            tcg_gen_ext32u_i64(tcg_rd, cpu_reg(s, a->rm));
+        }
+    } else {
+        tcg_rm = cpu_reg(s, a->rm);
+        tcg_rn = cpu_reg(s, a->rn);
 
-            if (sf) {
-                /* Specialization to ROR happens in EXTRACT2.  */
-                tcg_gen_extract2_i64(tcg_rd, tcg_rm, tcg_rn, imm);
+        if (a->sf) {
+            /* Specialization to ROR happens in EXTRACT2.  */
+            tcg_gen_extract2_i64(tcg_rd, tcg_rm, tcg_rn, a->imm);
+        } else {
+            TCGv_i32 t0 = tcg_temp_new_i32();
+
+            tcg_gen_extrl_i64_i32(t0, tcg_rm);
+            if (a->rm == a->rn) {
+                tcg_gen_rotri_i32(t0, t0, a->imm);
             } else {
-                TCGv_i32 t0 = tcg_temp_new_i32();
-
-                tcg_gen_extrl_i64_i32(t0, tcg_rm);
-                if (rm == rn) {
-                    tcg_gen_rotri_i32(t0, t0, imm);
-                } else {
-                    TCGv_i32 t1 = tcg_temp_new_i32();
-                    tcg_gen_extrl_i64_i32(t1, tcg_rn);
-                    tcg_gen_extract2_i32(t0, t0, t1, imm);
-                }
-                tcg_gen_extu_i32_i64(tcg_rd, t0);
+                TCGv_i32 t1 = tcg_temp_new_i32();
+                tcg_gen_extrl_i64_i32(t1, tcg_rn);
+                tcg_gen_extract2_i32(t0, t0, t1, a->imm);
             }
+            tcg_gen_extu_i32_i64(tcg_rd, t0);
         }
     }
-}
-
-/* Data processing - immediate */
-static void disas_data_proc_imm(DisasContext *s, uint32_t insn)
-{
-    switch (extract32(insn, 23, 6)) {
-    case 0x27: /* Extract */
-        disas_extract(s, insn);
-        break;
-    default:
-        unallocated_encoding(s);
-        break;
-    }
+    return true;
 }
 
 /* Shift a TCGv src by TCGv shift_amount, put result in dst.
@@ -XXX,XX +XXX,XX @@ static bool btype_destination_ok(uint32_t insn, bool bt, int btype)
 static void disas_a64_legacy(DisasContext *s, uint32_t insn)
 {
     switch (extract32(insn, 25, 4)) {
-    case 0x8: case 0x9: /* Data processing - immediate */
-        disas_data_proc_imm(s, insn);
-        break;
     case 0xa: case 0xb: /* Branch, exception generation and system insns */
         disas_b_exc_sys(s, insn);
         break;
-- 
2.34.1

Convert the unconditional branch immediate insns B and BL to
decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230512144106.3608981-14-peter.maydell@linaro.org
---
 target/arm/tcg/a64.decode      |  9 +++++++++
 target/arm/tcg/translate-a64.c | 31 +++++++++++--------------------
 2 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@
 
 &ri              rd imm
 &rri_sf          rd rn imm sf
+&i               imm
 
 
 ### Data Processing - Immediate
@@ -XXX,XX +XXX,XX @@ UBFM            . 10 100110 . ...... ...... ..... ..... @bitfield_32
 
 EXTR            1 00 100111 1 0 rm:5 imm:6 rn:5 rd:5     &extract sf=1
 EXTR            0 00 100111 0 0 rm:5 0 imm:5 rn:5 rd:5   &extract sf=0
+
+# Branches
+
+%imm26   0:s26 !function=times_4
+@branch         . ..... .......................... &i imm=%imm26
+
+B               0 00101 .......................... @branch
+BL              1 00101 .......................... @branch
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static inline AArch64DecodeFn *lookup_disas_fn(const AArch64DecodeTable *table,
  * match up with those in the manual.
  */
 
-/* Unconditional branch (immediate)
- *   31  30       26 25                                  0
- * +----+-----------+-------------------------------------+
- * | op | 0 0 1 0 1 |                 imm26               |
- * +----+-----------+-------------------------------------+
- */
-static void disas_uncond_b_imm(DisasContext *s, uint32_t insn)
+static bool trans_B(DisasContext *s, arg_i *a)
 {
-    int64_t diff = sextract32(insn, 0, 26) * 4;
-
-    if (insn & (1U << 31)) {
-        /* BL Branch with link */
-        gen_pc_plus_diff(s, cpu_reg(s, 30), curr_insn_len(s));
-    }
-
-    /* B Branch / BL Branch with link */
     reset_btype(s);
-    gen_goto_tb(s, 0, diff);
+    gen_goto_tb(s, 0, a->imm);
+    return true;
+}
+
+static bool trans_BL(DisasContext *s, arg_i *a)
+{
+    gen_pc_plus_diff(s, cpu_reg(s, 30), curr_insn_len(s));
+    reset_btype(s);
+    gen_goto_tb(s, 0, a->imm);
+    return true;
 }
 
 /* Compare and branch (immediate)
@@ -XXX,XX +XXX,XX @@ static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
 static void disas_b_exc_sys(DisasContext *s, uint32_t insn)
 {
     switch (extract32(insn, 25, 7)) {
-    case 0x0a: case 0x0b:
-    case 0x4a: case 0x4b: /* Unconditional branch (immediate) */
-        disas_uncond_b_imm(s, insn);
-        break;
     case 0x1a: case 0x5a: /* Compare & branch (immediate) */
         disas_comp_b_imm(s, insn);
         break;
-- 
2.34.1

Convert the compare-and-branch-immediate insns CBZ and CBNZ
to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230512144106.3608981-15-peter.maydell@linaro.org
---
 target/arm/tcg/a64.decode      |  5 +++++
 target/arm/tcg/translate-a64.c | 26 ++++++--------------------
 2 files changed, 11 insertions(+), 20 deletions(-)

Convert the test-and-branch-immediate insns TBZ and TBNZ
to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230512144106.3608981-16-peter.maydell@linaro.org
---
 target/arm/tcg/a64.decode      |  6 ++++++
 target/arm/tcg/translate-a64.c | 25 +++++--------------------
 2 files changed, 11 insertions(+), 20 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@ BL              1 00101 .......................... @branch
 &cbz     rt imm sf nz
 
 CBZ             sf:1 011010 nz:1 ................... rt:5 &cbz imm=%imm19
+
+%imm14     5:s14 !function=times_4
+%imm31_19  31:1 19:5
+&tbz       rt imm nz bitpos
+
+TBZ             . 011011 nz:1 ..... .............. rt:5 &tbz  imm=%imm14 bitpos=%imm31_19
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static bool trans_CBZ(DisasContext *s, arg_cbz *a)
     return true;
 }
 
-/* Test and branch (immediate)
- *   31  30         25  24  23   19 18          5 4    0
- * +----+-------------+----+-------+-------------+------+
- * | b5 | 0 1 1 0 1 1 | op |  b40  |    imm14    |  Rt  |
- * +----+-------------+----+-------+-------------+------+
- */
-static void disas_test_b_imm(DisasContext *s, uint32_t insn)
+static bool trans_TBZ(DisasContext *s, arg_tbz *a)
 {
-    unsigned int bit_pos, op, rt;
-    int64_t diff;
     DisasLabel match;
     TCGv_i64 tcg_cmp;
 
-    bit_pos = (extract32(insn, 31, 1) << 5) | extract32(insn, 19, 5);
-    op = extract32(insn, 24, 1); /* 0: TBZ; 1: TBNZ */
-    diff = sextract32(insn, 5, 14) * 4;
-    rt = extract32(insn, 0, 5);
-
     tcg_cmp = tcg_temp_new_i64();
-    tcg_gen_andi_i64(tcg_cmp, cpu_reg(s, rt), (1ULL << bit_pos));
+    tcg_gen_andi_i64(tcg_cmp, cpu_reg(s, a->rt), 1ULL << a->bitpos);
 
     reset_btype(s);
 
     match = gen_disas_label(s);
-    tcg_gen_brcondi_i64(op ? TCG_COND_NE : TCG_COND_EQ,
+    tcg_gen_brcondi_i64(a->nz ? TCG_COND_NE : TCG_COND_EQ,
                         tcg_cmp, 0, match.label);
     gen_goto_tb(s, 0, 4);
     set_disas_label(s, match);
-    gen_goto_tb(s, 1, diff);
+    gen_goto_tb(s, 1, a->imm);
+    return true;
 }
 
 /* Conditional branch (immediate)
@@ -XXX,XX +XXX,XX @@ static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
 static void disas_b_exc_sys(DisasContext *s, uint32_t insn)
 {
     switch (extract32(insn, 25, 7)) {
-    case 0x1b: case 0x5b: /* Test & branch (immediate) */
-        disas_test_b_imm(s, insn);
-        break;
     case 0x2a: /* Conditional branch (immediate) */
         disas_cond_b_imm(s, insn);
         break;
-- 
2.34.1

Convert the immediate conditional branch insn B.cond to
decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230512144106.3608981-17-peter.maydell@linaro.org
---
 target/arm/tcg/a64.decode      |  2 ++
 target/arm/tcg/translate-a64.c | 30 ++++++------------------------
 2 files changed, 8 insertions(+), 24 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@ CBZ             sf:1 011010 nz:1 ................... rt:5 &cbz imm=%imm19
 &tbz       rt imm nz bitpos
 
 TBZ             . 011011 nz:1 ..... .............. rt:5 &tbz  imm=%imm14 bitpos=%imm31_19
+
+B_cond          0101010 0 ................... 0 cond:4 imm=%imm19
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static bool trans_TBZ(DisasContext *s, arg_tbz *a)
     return true;
 }
 
-/* Conditional branch (immediate)
- *  31           25  24  23                  5   4  3    0
- * +---------------+----+---------------------+----+------+
- * | 0 1 0 1 0 1 0 | o1 |         imm19       | o0 | cond |
- * +---------------+----+---------------------+----+------+
- */
-static void disas_cond_b_imm(DisasContext *s, uint32_t insn)
+static bool trans_B_cond(DisasContext *s, arg_B_cond *a)
 {
-    unsigned int cond;
-    int64_t diff;
-
-    if ((insn & (1 << 4)) || (insn & (1 << 24))) {
-        unallocated_encoding(s);
-        return;
-    }
-    diff = sextract32(insn, 5, 19) * 4;
-    cond = extract32(insn, 0, 4);
-
     reset_btype(s);
-    if (cond < 0x0e) {
+    if (a->cond < 0x0e) {
         /* genuinely conditional branches */
         DisasLabel match = gen_disas_label(s);
-        arm_gen_test_cc(cond, match.label);
+        arm_gen_test_cc(a->cond, match.label);
         gen_goto_tb(s, 0, 4);
         set_disas_label(s, match);
-        gen_goto_tb(s, 1, diff);
+        gen_goto_tb(s, 1, a->imm);
     } else {
         /* 0xe and 0xf are both "always" conditions */
-        gen_goto_tb(s, 0, diff);
+        gen_goto_tb(s, 0, a->imm);
     }
+    return true;
 }
 
 /* HINT instruction group, including various allocated HINTs */
@@ -XXX,XX +XXX,XX @@ static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
 static void disas_b_exc_sys(DisasContext *s, uint32_t insn)
 {
     switch (extract32(insn, 25, 7)) {
-    case 0x2a: /* Conditional branch (immediate) */
-        disas_cond_b_imm(s, insn);
-        break;
     case 0x6a: /* Exception generation / System */
         if (insn & (1 << 24)) {
             if (extract32(insn, 22, 2) == 0) {
-- 
2.34.1

Convert the simple (non-pointer-auth) BR, BLR and RET insns
to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230512144106.3608981-18-peter.maydell@linaro.org
---
 target/arm/tcg/a64.decode      |  5 ++++
 target/arm/tcg/translate-a64.c | 55 ++++++++++++++++++++++++++++++----
 2 files changed, 54 insertions(+), 6 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@
 # This file is processed by scripts/decodetree.py
 #
 
+&r               rn
 &ri              rd imm
 &rri_sf          rd rn imm sf
 &i               imm
@@ -XXX,XX +XXX,XX @@ CBZ             sf:1 011010 nz:1 ................... rt:5 &cbz imm=%imm19
 TBZ             . 011011 nz:1 ..... .............. rt:5 &tbz  imm=%imm14 bitpos=%imm31_19
 
 B_cond          0101010 0 ................... 0 cond:4 imm=%imm19
+
+BR              1101011 0000 11111 000000 rn:5 00000 &r
+BLR             1101011 0001 11111 000000 rn:5 00000 &r
+RET             1101011 0010 11111 000000 rn:5 00000 &r
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static bool trans_B_cond(DisasContext *s, arg_B_cond *a)
     return true;
 }
 
+static void set_btype_for_br(DisasContext *s, int rn)
+{
+    if (dc_isar_feature(aa64_bti, s)) {
+        /* BR to {x16,x17} or !guard -> 1, else 3.  */
+        set_btype(s, rn == 16 || rn == 17 || !s->guarded_page ? 1 : 3);
+    }
+}
+
+static void set_btype_for_blr(DisasContext *s)
+{
+    if (dc_isar_feature(aa64_bti, s)) {
+        /* BLR sets BTYPE to 2, regardless of source guarded page.  */
+        set_btype(s, 2);
+    }
+}
+
+static bool trans_BR(DisasContext *s, arg_r *a)
+{
+    gen_a64_set_pc(s, cpu_reg(s, a->rn));
+    set_btype_for_br(s, a->rn);
+    s->base.is_jmp = DISAS_JUMP;
+    return true;
+}
+
+static bool trans_BLR(DisasContext *s, arg_r *a)
+{
+    TCGv_i64 dst = cpu_reg(s, a->rn);
+    TCGv_i64 lr = cpu_reg(s, 30);
+    if (dst == lr) {
+        TCGv_i64 tmp = tcg_temp_new_i64();
+        tcg_gen_mov_i64(tmp, dst);
+        dst = tmp;
+    }
+    gen_pc_plus_diff(s, lr, curr_insn_len(s));
+    gen_a64_set_pc(s, dst);
+    set_btype_for_blr(s);
+    s->base.is_jmp = DISAS_JUMP;
+    return true;
+}
+
+static bool trans_RET(DisasContext *s, arg_r *a)
+{
+    gen_a64_set_pc(s, cpu_reg(s, a->rn));
+    s->base.is_jmp = DISAS_JUMP;
+    return true;
+}
+
 /* HINT instruction group, including various allocated HINTs */
 static void handle_hint(DisasContext *s, uint32_t insn,
                         unsigned int op1, unsigned int op2, unsigned int crm)
@@ -XXX,XX +XXX,XX @@ static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
         btype_mod = opc;
         switch (op3) {
         case 0:
-            /* BR, BLR, RET */
-            if (op4 != 0) {
-                goto do_unallocated;
-            }
-            dst = cpu_reg(s, rn);
-            break;
+            /* BR, BLR, RET : handled in decodetree */
+            goto do_unallocated;
 
         case 2:
         case 3:
-- 
2.34.1

Convert the single-register pointer-authentication variants of BR,
BLR, RET to decodetree. (BRAA/BLRAA are in a different branch of
the legacy decoder and will be dealt with in the next commit.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230512144106.3608981-19-peter.maydell@linaro.org
---
 target/arm/tcg/a64.decode      |   7 ++
 target/arm/tcg/translate-a64.c | 132 +++++++++++++++++++--------------
 2 files changed, 84 insertions(+), 55 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@ B_cond          0101010 0 ................... 0 cond:4 imm=%imm19
 BR              1101011 0000 11111 000000 rn:5 00000 &r
 BLR             1101011 0001 11111 000000 rn:5 00000 &r
 RET             1101011 0010 11111 000000 rn:5 00000 &r
+
+&braz       rn m
+BRAZ            1101011 0000 11111 00001 m:1 rn:5 11111 &braz   # BRAAZ, BRABZ
+BLRAZ           1101011 0001 11111 00001 m:1 rn:5 11111 &braz   # BLRAAZ, BLRABZ
+
+&reta       m
+RETA            1101011 0010 11111 00001 m:1 11111 11111 &reta  # RETAA, RETAB
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static bool trans_RET(DisasContext *s, arg_r *a)
     return true;
 }
 
+static TCGv_i64 auth_branch_target(DisasContext *s, TCGv_i64 dst,
+                                   TCGv_i64 modifier, bool use_key_a)
+{
+    TCGv_i64 truedst;
+    /*
+     * Return the branch target for a BRAA/RETA/etc, which is either
+     * just the destination dst, or that value with the pauth check
+     * done and the code removed from the high bits.
+     */
+    if (!s->pauth_active) {
+        return dst;
+    }
+
+    truedst = tcg_temp_new_i64();
+    if (use_key_a) {
+        gen_helper_autia(truedst, cpu_env, dst, modifier);
+    } else {
+        gen_helper_autib(truedst, cpu_env, dst, modifier);
+    }
+    return truedst;
+}
+
+static bool trans_BRAZ(DisasContext *s, arg_braz *a)
+{
+    TCGv_i64 dst;
+
+    if (!dc_isar_feature(aa64_pauth, s)) {
+        return false;
+    }
+
+    dst = auth_branch_target(s, cpu_reg(s, a->rn), tcg_constant_i64(0), !a->m);
+    gen_a64_set_pc(s, dst);
+    set_btype_for_br(s, a->rn);
+    s->base.is_jmp = DISAS_JUMP;
+    return true;
+}
+
+static bool trans_BLRAZ(DisasContext *s, arg_braz *a)
+{
+    TCGv_i64 dst, lr;
+
+    if (!dc_isar_feature(aa64_pauth, s)) {
+        return false;
+    }
+
+    dst = auth_branch_target(s, cpu_reg(s, a->rn), tcg_constant_i64(0), !a->m);
+    lr = cpu_reg(s, 30);
+    if (dst == lr) {
+        TCGv_i64 tmp = tcg_temp_new_i64();
+        tcg_gen_mov_i64(tmp, dst);
+        dst = tmp;
+    }
+    gen_pc_plus_diff(s, lr, curr_insn_len(s));
+    gen_a64_set_pc(s, dst);
+    set_btype_for_blr(s);
+    s->base.is_jmp = DISAS_JUMP;
+    return true;
+}
+
+static bool trans_RETA(DisasContext *s, arg_reta *a)
+{
+    TCGv_i64 dst;
+
+    dst = auth_branch_target(s, cpu_reg(s, 30), cpu_X[31], !a->m);
+    gen_a64_set_pc(s, dst);
+    s->base.is_jmp = DISAS_JUMP;
+    return true;
+}
+
 /* HINT instruction group, including various allocated HINTs */
 static void handle_hint(DisasContext *s, uint32_t insn,
                         unsigned int op1, unsigned int op2, unsigned int crm)
@@ -XXX,XX +XXX,XX @@ static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
     }
 
     switch (opc) {
-    case 0: /* BR */
-    case 1: /* BLR */
-    case 2: /* RET */
-        btype_mod = opc;
-        switch (op3) {
-        case 0:
-            /* BR, BLR, RET : handled in decodetree */
-            goto do_unallocated;
-
-        case 2:
-        case 3:
-            if (!dc_isar_feature(aa64_pauth, s)) {
-                goto do_unallocated;
-            }
-            if (opc == 2) {
-                /* RETAA, RETAB */
-                if (rn != 0x1f || op4 != 0x1f) {
-                    goto do_unallocated;
-                }
-                rn = 30;
-                modifier = cpu_X[31];
-            } else {
-                /* BRAAZ, BRABZ, BLRAAZ, BLRABZ */
-                if (op4 != 0x1f) {
-                    goto do_unallocated;
-                }
-                modifier = tcg_constant_i64(0);
-            }
-            if (s->pauth_active) {
-                dst = tcg_temp_new_i64();
-                if (op3 == 2) {
-                    gen_helper_autia(dst, cpu_env, cpu_reg(s, rn), modifier);
-                } else {
-                    gen_helper_autib(dst, cpu_env, cpu_reg(s, rn), modifier);
-                }
-            } else {
-                dst = cpu_reg(s, rn);
-            }
-            break;
-
-        default:
-            goto do_unallocated;
-        }
-        /* BLR also needs to load return address */
-        if (opc == 1) {
-            TCGv_i64 lr = cpu_reg(s, 30);
-            if (dst == lr) {
-                TCGv_i64 tmp = tcg_temp_new_i64();
-                tcg_gen_mov_i64(tmp, dst);
-                dst = tmp;
-            }
-            gen_pc_plus_diff(s, lr, curr_insn_len(s));
-        }
-        gen_a64_set_pc(s, dst);
-        break;
+    case 0:
+    case 1:
+    case 2:
+        /*
+         * BR, BLR, RET, RETAA, RETAB, BRAAZ, BRABZ, BLRAAZ, BLRABZ:
+         * handled in decodetree
+         */
+        goto do_unallocated;
 
     case 8: /* BRAA */
     case 9: /* BLRAA */
-- 
2.34.1

Convert the last four BR-with-pointer-auth insns to decodetree.
The remaining cases in the outer switch in disas_uncond_b_reg()
all return early rather than leaving the case statement, so we
can delete the now-unused code at the end of that function.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230512144106.3608981-20-peter.maydell@linaro.org
---
 target/arm/tcg/a64.decode      |  4 ++
 target/arm/tcg/translate-a64.c | 97 ++++++++++++++--------------------
 2 files changed, 43 insertions(+), 58 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@ BLRAZ           1101011 0001 11111 00001 m:1 rn:5 11111 &braz   # BLRAAZ, BLRABZ
 
 &reta       m
 RETA            1101011 0010 11111 00001 m:1 11111 11111 &reta  # RETAA, RETAB
+
+&bra        rn rm m
+BRA             1101011 1000 11111 00001 m:1 rn:5 rm:5 &bra # BRAA, BRAB
+BLRA            1101011 1001 11111 00001 m:1 rn:5 rm:5 &bra # BLRAA, BLRAB
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static bool trans_RETA(DisasContext *s, arg_reta *a)
     return true;
 }
 
+static bool trans_BRA(DisasContext *s, arg_bra *a)
+{
+    TCGv_i64 dst;
+
+    if (!dc_isar_feature(aa64_pauth, s)) {
+        return false;
+    }
+    dst = auth_branch_target(s, cpu_reg(s,a->rn), cpu_reg_sp(s, a->rm), !a->m);
+    gen_a64_set_pc(s, dst);
+    set_btype_for_br(s, a->rn);
+    s->base.is_jmp = DISAS_JUMP;
+    return true;
+}
+
+static bool trans_BLRA(DisasContext *s, arg_bra *a)
+{
+    TCGv_i64 dst, lr;
+
+    if (!dc_isar_feature(aa64_pauth, s)) {
+        return false;
+    }
+    dst = auth_branch_target(s, cpu_reg(s, a->rn), cpu_reg_sp(s, a->rm), !a->m);
+    lr = cpu_reg(s, 30);
+    if (dst == lr) {
+        TCGv_i64 tmp = tcg_temp_new_i64();
+        tcg_gen_mov_i64(tmp, dst);
+        dst = tmp;
+    }
+    gen_pc_plus_diff(s, lr, curr_insn_len(s));
+    gen_a64_set_pc(s, dst);
+    set_btype_for_blr(s);
+    s->base.is_jmp = DISAS_JUMP;
+    return true;
+}
+
 /* HINT instruction group, including various allocated HINTs */
 static void handle_hint(DisasContext *s, uint32_t insn,
                         unsigned int op1, unsigned int op2, unsigned int crm)
@@ -XXX,XX +XXX,XX @@ static void disas_exc(DisasContext *s, uint32_t insn)
 static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
 {
     unsigned int opc, op2, op3, rn, op4;
-    unsigned btype_mod = 2;   /* 0: BR, 1: BLR, 2: other */
     TCGv_i64 dst;
     TCGv_i64 modifier;
 
@@ -XXX,XX +XXX,XX @@ static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
     case 0:
     case 1:
     case 2:
+    case 8:
+    case 9:
         /*
-         * BR, BLR, RET, RETAA, RETAB, BRAAZ, BRABZ, BLRAAZ, BLRABZ:
-         * handled in decodetree
+         * BR, BLR, RET, RETAA, RETAB, BRAAZ, BRABZ, BLRAAZ, BLRABZ,
+         * BRAA, BLRAA: handled in decodetree
          */
         goto do_unallocated;
 
-    case 8: /* BRAA */
-    case 9: /* BLRAA */
-        if (!dc_isar_feature(aa64_pauth, s)) {
-            goto do_unallocated;
-        }
-        if ((op3 & ~1) != 2) {
-            goto do_unallocated;
-        }
-        btype_mod = opc & 1;
-        if (s->pauth_active) {
-            dst = tcg_temp_new_i64();
-            modifier = cpu_reg_sp(s, op4);
-            if (op3 == 2) {
-                gen_helper_autia(dst, cpu_env, cpu_reg(s, rn), modifier);
-            } else {
-                gen_helper_autib(dst, cpu_env, cpu_reg(s, rn), modifier);
-            }
-        } else {
-            dst = cpu_reg(s, rn);
-        }
-        /* BLRAA also needs to load return address */
-        if (opc == 9) {
-            TCGv_i64 lr = cpu_reg(s, 30);
-            if (dst == lr) {
-                TCGv_i64 tmp = tcg_temp_new_i64();
-                tcg_gen_mov_i64(tmp, dst);
-                dst = tmp;
-            }
-            gen_pc_plus_diff(s, lr, curr_insn_len(s));
-        }
-        gen_a64_set_pc(s, dst);
-        break;
-
     case 4: /* ERET */
         if (s->current_el == 0) {
             goto do_unallocated;
@@ -XXX,XX +XXX,XX @@ static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
         unallocated_encoding(s);
         return;
     }
-
-    switch (btype_mod) {
-    case 0: /* BR */
-        if (dc_isar_feature(aa64_bti, s)) {
-            /* BR to {x16,x17} or !guard -> 1, else 3.  */
-            set_btype(s, rn == 16 || rn == 17 || !s->guarded_page ? 1 : 3);
-        }
-        break;
-
-    case 1: /* BLR */
-        if (dc_isar_feature(aa64_bti, s)) {
-            /* BLR sets BTYPE to 2, regardless of source guarded page.  */
-            set_btype(s, 2);
-        }
-        break;
-
-    default: /* RET or none of the above.  */
-        /* BTYPE will be set to 0 by normal end-of-insn processing.  */
-        break;
-    }
-
-    s->base.is_jmp = DISAS_JUMP;
 }
 
 /* Branches, exception generating and system instructions */
-- 
2.34.1

Convert the exception-return insns ERET, ERETA and ERETB to
decodetree. These were the last insns left in the legacy
decoder function disas_uncond_reg_b(), which allows us to
remove it.

The old decoder explicitly decoded the DRPS instruction,
only in order to call unallocated_encoding() on it, exactly
as would have happened if it hadn't decoded it. This is
because this insn always UNDEFs unless the CPU is in
halting-debug state, which we don't emulate. So we list
the pattern in a comment in a64.decode, but don't actively
decode it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230512144106.3608981-21-peter.maydell@linaro.org
---
 target/arm/tcg/a64.decode      |   8 ++
 target/arm/tcg/translate-a64.c | 163 +++++++++++----------------------
 2 files changed, 63 insertions(+), 108 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -XXX,XX +XXX,XX @@ RETA            1101011 0010 11111 00001 m:1 11111 11111 &reta  # RETAA, RETAB
 &bra        rn rm m
 BRA             1101011 1000 11111 00001 m:1 rn:5 rm:5 &bra # BRAA, BRAB
 BLRA            1101011 1001 11111 00001 m:1 rn:5 rm:5 &bra # BLRAA, BLRAB
+
+ERET            1101011 0100 11111 000000 11111 00000
+ERETA           1101011 0100 11111 00001 m:1 11111 11111 &reta  # ERETAA, ERETAB
+
+# We don't need to decode DRPS because it always UNDEFs except when
+# the processor is in halting debug state (which we don't implement).
+# The pattern is listed here as documentation.
+# DRPS            1101011 0101 11111 000000 11111 00000
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static bool trans_BLRA(DisasContext *s, arg_bra *a)
     return true;
 }
 
+static bool trans_ERET(DisasContext *s, arg_ERET *a)
+{
+    TCGv_i64 dst;
+
+    if (s->current_el == 0) {
+        return false;
+    }
+    if (s->fgt_eret) {
+        gen_exception_insn_el(s, 0, EXCP_UDEF, 0, 2);
+        return true;
+    }
+    dst = tcg_temp_new_i64();
+    tcg_gen_ld_i64(dst, cpu_env,
+                   offsetof(CPUARMState, elr_el[s->current_el]));
+
+    if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
+        gen_io_start();
+    }
+
+    gen_helper_exception_return(cpu_env, dst);
+    /* Must exit loop to check un-masked IRQs */
+    s->base.is_jmp = DISAS_EXIT;
+    return true;
+}
+
+static bool trans_ERETA(DisasContext *s, arg_reta *a)
+{
+    TCGv_i64 dst;
+
+    if (!dc_isar_feature(aa64_pauth, s)) {
+        return false;
+    }
+    if (s->current_el == 0) {
+        return false;
+    }
+    /* The FGT trap takes precedence over an auth trap. */
+    if (s->fgt_eret) {
+        gen_exception_insn_el(s, 0, EXCP_UDEF, a->m ? 3 : 2, 2);
+        return true;
+    }
+    dst = tcg_temp_new_i64();
+    tcg_gen_ld_i64(dst, cpu_env,
+                   offsetof(CPUARMState, elr_el[s->current_el]));
+
+    dst = auth_branch_target(s, dst, cpu_X[31], !a->m);
+    if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
+        gen_io_start();
+    }
+
+    gen_helper_exception_return(cpu_env, dst);
+    /* Must exit loop to check un-masked IRQs */
+    s->base.is_jmp = DISAS_EXIT;
+    return true;
+}
+
 /* HINT instruction group, including various allocated HINTs */
 static void handle_hint(DisasContext *s, uint32_t insn,
                         unsigned int op1, unsigned int op2, unsigned int crm)
@@ -XXX,XX +XXX,XX @@ static void disas_exc(DisasContext *s, uint32_t insn)
     }
 }
 
-/* Unconditional branch (register)
- *  31           25 24   21 20   16 15   10 9    5 4     0
- * +---------------+-------+-------+-------+------+-------+
- * | 1 1 0 1 0 1 1 |  opc  |  op2  |  op3  |  Rn  |  op4  |
- * +---------------+-------+-------+-------+------+-------+
- */
-static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
-{
-    unsigned int opc, op2, op3, rn, op4;
-    TCGv_i64 dst;
-    TCGv_i64 modifier;
-
-    opc = extract32(insn, 21, 4);
-    op2 = extract32(insn, 16, 5);
-    op3 = extract32(insn, 10, 6);
-    rn = extract32(insn, 5, 5);
-    op4 = extract32(insn, 0, 5);
-
-    if (op2 != 0x1f) {
-        goto do_unallocated;
-    }
-
-    switch (opc) {
-    case 0:
-    case 1:
-    case 2:
-    case 8:
-    case 9:
-        /*
-         * BR, BLR, RET, RETAA, RETAB, BRAAZ, BRABZ, BLRAAZ, BLRABZ,
-         * BRAA, BLRAA: handled in decodetree
-         */
-        goto do_unallocated;
-
-    case 4: /* ERET */
-        if (s->current_el == 0) {
-            goto do_unallocated;
-        }
-        switch (op3) {
-        case 0: /* ERET */
-            if (op4 != 0) {
-                goto do_unallocated;
-            }
-            if (s->fgt_eret) {
-                gen_exception_insn_el(s, 0, EXCP_UDEF, syn_erettrap(op3), 2);
-                return;
-            }
-            dst = tcg_temp_new_i64();
-            tcg_gen_ld_i64(dst, cpu_env,
-                           offsetof(CPUARMState, elr_el[s->current_el]));
-            break;
-
-        case 2: /* ERETAA */
-        case 3: /* ERETAB */
-            if (!dc_isar_feature(aa64_pauth, s)) {
-                goto do_unallocated;
-            }
-            if (rn != 0x1f || op4 != 0x1f) {
-                goto do_unallocated;
-            }
-            /* The FGT trap takes precedence over an auth trap. */
-            if (s->fgt_eret) {
-                gen_exception_insn_el(s, 0, EXCP_UDEF, syn_erettrap(op3), 2);
-                return;
-            }
-            dst = tcg_temp_new_i64();
-            tcg_gen_ld_i64(dst, cpu_env,
-                           offsetof(CPUARMState, elr_el[s->current_el]));
-            if (s->pauth_active) {
-                modifier = cpu_X[31];
-                if (op3 == 2) {
-                    gen_helper_autia(dst, cpu_env, dst, modifier);
-                } else {
-                    gen_helper_autib(dst, cpu_env, dst, modifier);
-                }
-            }
-            break;
-
-        default:
-            goto do_unallocated;
-        }
-        if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
-            gen_io_start();
-        }
-
-        gen_helper_exception_return(cpu_env, dst);
-        /* Must exit loop to check un-masked IRQs */
-        s->base.is_jmp = DISAS_EXIT;
-        return;
-
-    case 5: /* DRPS */
-        if (op3 != 0 || op4 != 0 || rn != 0x1f) {
-            goto do_unallocated;
-        } else {
-            unallocated_encoding(s);
-        }
-        return;
-
-    default:
-    do_unallocated:
-        unallocated_encoding(s);
-        return;
-    }
-}
-
 /* Branches, exception generating and system instructions */
 static void disas_b_exc_sys(DisasContext *s, uint32_t insn)
 {
@@ -XXX,XX +XXX,XX @@ static void disas_b_exc_sys(DisasContext *s, uint32_t insn)
             disas_exc(s, insn);
         }
         break;
-    case 0x6b: /* Unconditional branch (register) */
-        disas_uncond_b_reg(s, insn);
-        break;
     default:
         unallocated_encoding(s);
         break;
-- 
2.34.1

The IMPDEF sysreg L2CTLR_EL1 found on the Cortex-A35, A53, A57, A72
and which we (arguably dubiously) also provide in '-cpu max' has a
2 bit field for the number of processors in the cluster. On real
hardware this must be sufficient because it can only be configured
with up to 4 CPUs in the cluster. However on QEMU if the board code
does not explicitly configure the code into clusters with the right
CPU count we default to "give the value assuming that all CPUs in
the system are in a single cluster", which might be too big to fit
in the field.

Instead of just overflowing this 2-bit field, saturate to 3 (meaning
"4 CPUs", so at least we don't overwrite other fields in the register.
It's unlikely that any guest code really cares about the value in
this field; at least, if it does it probably also wants the system
to be more closely matching real hardware, i.e. not to have more
than 4 CPUs.

This issue has been present since the L2CTLR was first added in
commit 377a44ec8f2fac5b back in 2014. It was only noticed because
Coverity complains (CID 1509227) that the shift might overflow 32 bits
and inadvertently sign extend into the top half of the 64 bit value.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230512170223.3801643-2-peter.maydell@linaro.org
---
 target/arm/cortex-regs.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/target/arm/cortex-regs.c b/target/arm/cortex-regs.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cortex-regs.c
+++ b/target/arm/cortex-regs.c
@@ -XXX,XX +XXX,XX @@ static uint64_t l2ctlr_read(CPUARMState *env, const ARMCPRegInfo *ri)
 {
     ARMCPU *cpu = env_archcpu(env);
 
-    /* Number of cores is in [25:24]; otherwise we RAZ */
-    return (cpu->core_count - 1) << 24;
+    /*
+     * Number of cores is in [25:24]; otherwise we RAZ.
+     * If the board didn't configure the CPUs into clusters,
+     * we default to "all CPUs in one cluster", which might be
+     * more than the 4 that the hardware permits and which is
+     * all you can report in this two-bit field. Saturate to
+     * 0b11 (== 4 CPUs) rather than overflowing the field.
+     */
+    return MIN(cpu->core_count - 1, 3) << 24;
 }
 
 static const ARMCPRegInfo cortex_a72_a57_a53_cp_reginfo[] = {
-- 
2.34.1

In the vexpress board code, we allocate a new MemoryRegion at the top
of vexpress_common_init() but only set it up and use it inside the
"if (map[VE_NORFLASHALIAS] != -1)" conditional, so we leak it if not.
This isn't a very interesting leak as it's a tiny amount of memory
once at startup, but it's easy to fix.

We could silence Coverity simply by moving the g_new() into the
if() block, but this use of g_new(MemoryRegion, 1) is a legacy from
when this board model was originally written; we wouldn't do that
if we wrote it today. The MemoryRegions are conceptually a part of
the board and must not go away until the whole board is done with
(at the end of the simulation), so they belong in its state struct.

This machine already has a VexpressMachineState struct that extends
MachineState, so statically put the MemoryRegions in there instead of
dynamically allocating them separately at runtime.

Spotted by Coverity (CID 1509083).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20230512170223.3801643-3-peter.maydell@linaro.org
---
 hw/arm/vexpress.c | 40 ++++++++++++++++++++--------------------
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/hw/arm/vexpress.c b/hw/arm/vexpress.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/vexpress.c
+++ b/hw/arm/vexpress.c
@@ -XXX,XX +XXX,XX @@ struct VexpressMachineClass {
 
 struct VexpressMachineState {
     MachineState parent;
+    MemoryRegion vram;
+    MemoryRegion sram;
+    MemoryRegion flashalias;
+    MemoryRegion lowram;
+    MemoryRegion a15sram;
     bool secure;
     bool virt;
 };
@@ -XXX,XX +XXX,XX @@ struct VexpressMachineState {
 #define TYPE_VEXPRESS_A15_MACHINE   MACHINE_TYPE_NAME("vexpress-a15")
 OBJECT_DECLARE_TYPE(VexpressMachineState, VexpressMachineClass, VEXPRESS_MACHINE)
 
-typedef void DBoardInitFn(const VexpressMachineState *machine,
+typedef void DBoardInitFn(VexpressMachineState *machine,
                           ram_addr_t ram_size,
                           const char *cpu_type,
                           qemu_irq *pic);
@@ -XXX,XX +XXX,XX @@ static void init_cpus(MachineState *ms, const char *cpu_type,
     }
 }
 
-static void a9_daughterboard_init(const VexpressMachineState *vms,
+static void a9_daughterboard_init(VexpressMachineState *vms,
                                   ram_addr_t ram_size,
                                   const char *cpu_type,
                                   qemu_irq *pic)
 {
     MachineState *machine = MACHINE(vms);
     MemoryRegion *sysmem = get_system_memory();
-    MemoryRegion *lowram = g_new(MemoryRegion, 1);
     ram_addr_t low_ram_size;
 
     if (ram_size > 0x40000000) {
@@ -XXX,XX +XXX,XX @@ static void a9_daughterboard_init(const VexpressMachineState *vms,
      * address space should in theory be remappable to various
      * things including ROM or RAM; we always map the RAM there.
      */
-    memory_region_init_alias(lowram, NULL, "vexpress.lowmem", machine->ram,
-                             0, low_ram_size);
-    memory_region_add_subregion(sysmem, 0x0, lowram);
+    memory_region_init_alias(&vms->lowram, NULL, "vexpress.lowmem",
+                             machine->ram, 0, low_ram_size);
+    memory_region_add_subregion(sysmem, 0x0, &vms->lowram);
     memory_region_add_subregion(sysmem, 0x60000000, machine->ram);
 
     /* 0x1e000000 A9MPCore (SCU) private memory region */
@@ -XXX,XX +XXX,XX @@ static VEDBoardInfo a9_daughterboard = {
     .init = a9_daughterboard_init,
 };
 
-static void a15_daughterboard_init(const VexpressMachineState *vms,
+static void a15_daughterboard_init(VexpressMachineState *vms,
                                    ram_addr_t ram_size,
                                    const char *cpu_type,
                                    qemu_irq *pic)
 {
     MachineState *machine = MACHINE(vms);
     MemoryRegion *sysmem = get_system_memory();
-    MemoryRegion *sram = g_new(MemoryRegion, 1);
 
     {
         /* We have to use a separate 64 bit variable here to avoid the gcc
@@ -XXX,XX +XXX,XX @@ static void a15_daughterboard_init(const VexpressMachineState *vms,
     /* 0x2b060000: SP805 watchdog: not modelled */
     /* 0x2b0a0000: PL341 dynamic memory controller: not modelled */
     /* 0x2e000000: system SRAM */
-    memory_region_init_ram(sram, NULL, "vexpress.a15sram", 0x10000,
+    memory_region_init_ram(&vms->a15sram, NULL, "vexpress.a15sram", 0x10000,
                            &error_fatal);
-    memory_region_add_subregion(sysmem, 0x2e000000, sram);
+    memory_region_add_subregion(sysmem, 0x2e000000, &vms->a15sram);
 
     /* 0x7ffb0000: DMA330 DMA controller: not modelled */
     /* 0x7ffd0000: PL354 static memory controller: not modelled */
@@ -XXX,XX +XXX,XX @@ static void vexpress_common_init(MachineState *machine)
     I2CBus *i2c;
     ram_addr_t vram_size, sram_size;
     MemoryRegion *sysmem = get_system_memory();
-    MemoryRegion *vram = g_new(MemoryRegion, 1);
-    MemoryRegion *sram = g_new(MemoryRegion, 1);
-    MemoryRegion *flashalias = g_new(MemoryRegion, 1);
-    MemoryRegion *flash0mem;
     const hwaddr *map = daughterboard->motherboard_map;
     int i;
 
@@ -XXX,XX +XXX,XX @@ static void vexpress_common_init(MachineState *machine)
 
     if (map[VE_NORFLASHALIAS] != -1) {
         /* Map flash 0 as an alias into low memory */
+        MemoryRegion *flash0mem;
         flash0mem = sysbus_mmio_get_region(SYS_BUS_DEVICE(pflash0), 0);
-        memory_region_init_alias(flashalias, NULL, "vexpress.flashalias",
+        memory_region_init_alias(&vms->flashalias, NULL, "vexpress.flashalias",
                                  flash0mem, 0, VEXPRESS_FLASH_SIZE);
-        memory_region_add_subregion(sysmem, map[VE_NORFLASHALIAS], flashalias);
+        memory_region_add_subregion(sysmem, map[VE_NORFLASHALIAS], &vms->flashalias);
     }
 
     dinfo = drive_get(IF_PFLASH, 0, 1);
     ve_pflash_cfi01_register(map[VE_NORFLASH1], "vexpress.flash1", dinfo);
 
     sram_size = 0x2000000;
-    memory_region_init_ram(sram, NULL, "vexpress.sram", sram_size,
+    memory_region_init_ram(&vms->sram, NULL, "vexpress.sram", sram_size,
                            &error_fatal);
-    memory_region_add_subregion(sysmem, map[VE_SRAM], sram);
+    memory_region_add_subregion(sysmem, map[VE_SRAM], &vms->sram);
 
     vram_size = 0x800000;
-    memory_region_init_ram(vram, NULL, "vexpress.vram", vram_size,
+    memory_region_init_ram(&vms->vram, NULL, "vexpress.vram", vram_size,
                            &error_fatal);
-    memory_region_add_subregion(sysmem, map[VE_VIDEORAM], vram);
+    memory_region_add_subregion(sysmem, map[VE_VIDEORAM], &vms->vram);
 
     /* 0x4e000000 LAN9118 Ethernet */
     if (nd_table[0].used) {
-- 
2.34.1

Convert the u2f.txt file to rST, and place it in the right place
in our manual layout. The old text didn't fit very well into our
manual style, so the new version ends up looking like a rewrite,
although some of the original text is preserved:

* the 'building' section of the old file is removed, since we
   generally assume that users have already built QEMU
 * some rather verbose text has been cut back
 * document the passthrough device first, on the assumption
   that's most likely to be of interest to users
 * cut back on the duplication of text between sections
 * format example command lines etc with rST

As it's a short document it seemed simplest to do this all
in one go rather than try to do a minimal syntactic conversion
and then clean up the wording and layout.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 20230421163734.1152076-1-peter.maydell@linaro.org
---
 docs/system/device-emulation.rst |   1 +
 docs/system/devices/usb-u2f.rst  |  93 ++++++++++++++++++++++++++
 docs/system/devices/usb.rst      |   2 +-
 docs/u2f.txt                     | 110 -------------------------------
 4 files changed, 95 insertions(+), 111 deletions(-)
 create mode 100644 docs/system/devices/usb-u2f.rst
 delete mode 100644 docs/u2f.txt

diff --git a/docs/system/device-emulation.rst b/docs/system/device-emulation.rst
index XXXXXXX..XXXXXXX 100644
--- a/docs/system/device-emulation.rst
+++ b/docs/system/device-emulation.rst
@@ -XXX,XX +XXX,XX @@ Emulated Devices
    devices/virtio-pmem.rst
    devices/vhost-user-rng.rst
    devices/canokey.rst
+   devices/usb-u2f.rst
    devices/igb.rst
diff --git a/docs/system/devices/usb-u2f.rst b/docs/system/devices/usb-u2f.rst
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/docs/system/devices/usb-u2f.rst
@@ -XXX,XX +XXX,XX @@
+Universal Second Factor (U2F) USB Key Device
+============================================
+
+U2F is an open authentication standard that enables relying parties
+exposed to the internet to offer a strong second factor option for end
+user authentication.
+
+The second factor is provided by a device implementing the U2F
+protocol. In case of a USB U2F security key, it is a USB HID device
+that implements the U2F protocol.
+
+QEMU supports both pass-through of a host U2F key device to a VM,
+and software emulation of a U2F key.
+
+``u2f-passthru``
+----------------
+
+The ``u2f-passthru`` device allows you to connect a real hardware
+U2F key on your host to a guest VM. All requests made from the guest
+are passed through to the physical security key connected to the
+host machine and vice versa.
+
+In addition, the dedicated pass-through allows you to share a single
+U2F security key with several guest VMs, which is not possible with a
+simple host device assignment pass-through.
+
+You can specify the host U2F key to use with the ``hidraw``
+option, which takes the host path to a Linux ``/dev/hidrawN`` device:
+
+.. parsed-literal::
+   |qemu_system| -usb -device u2f-passthru,hidraw=/dev/hidraw0
+
+If you don't specify the device, the ``u2f-passthru`` device will
+autoscan to take the first U2F device it finds on the host (this
+requires a working libudev):
+
+.. parsed-literal::
+   |qemu_system| -usb -device u2f-passthru
+
+``u2f-emulated``
+----------------
+
+``u2f-emulated`` is a completely software emulated U2F device.
+It uses `libu2f-emu <https://github.com/MattGorko/libu2f-emu>`__
+for the U2F key emulation. libu2f-emu
+provides a complete implementation of the U2F protocol device part for
+all specified transports given by the FIDO Alliance.
+
+To work, an emulated U2F device must have four elements:
+
+ * ec x509 certificate
+ * ec private key
+ * counter (four bytes value)
+ * 48 bytes of entropy (random bits)
+
+To use this type of device, these have to be configured, and these
+four elements must be passed one way or another.
+
+Assuming that you have a working libu2f-emu installed on the host,
+there are three possible ways to configure the ``u2f-emulated`` device:
+
+ * ephemeral
+ * setup directory
+ * manual
+
+Ephemeral is the simplest way to configure; it lets the device generate
+all the elements it needs for a single use of the lifetime of the device.
+It is the default if you do not pass any other options to the device.
+
+.. parsed-literal::
+   |qemu_system| -usb -device u2f-emulated
+
+You can pass the device the path of a setup directory on the host
+using the ``dir`` option; the directory must contain these four files:
+
+ * ``certificate.pem``: ec x509 certificate
+ * ``private-key.pem``: ec private key
+ * ``counter``: counter value
+ * ``entropy``: 48 bytes of entropy
+
+.. parsed-literal::
+   |qemu_system| -usb -device u2f-emulated,dir=$dir
+
+You can also manually pass the device the paths to each of these files,
+if you don't want them all to be in the same directory, using the options
+
+ * ``cert``
+ * ``priv``
+ * ``counter``
+ * ``entropy``
+
+.. parsed-literal::
+   |qemu_system| -usb -device u2f-emulated,cert=$DIR1/$FILE1,priv=$DIR2/$FILE2,counter=$DIR3/$FILE3,entropy=$DIR4/$FILE4
diff --git a/docs/system/devices/usb.rst b/docs/system/devices/usb.rst
index XXXXXXX..XXXXXXX 100644
--- a/docs/system/devices/usb.rst
+++ b/docs/system/devices/usb.rst
@@ -XXX,XX +XXX,XX @@ option or the ``device_add`` monitor command. Available devices are:
    USB audio device
 
 ``u2f-{emulated,passthru}``
-   Universal Second Factor device
+   :doc:`usb-u2f`
 
 ``canokey``
    An Open-source Secure Key implementing FIDO2, OpenPGP, PIV and more.
diff --git a/docs/u2f.txt b/docs/u2f.txt
deleted file mode 100644
index XXXXXXX..XXXXXXX
--- a/docs/u2f.txt
+++ /dev/null
@@ -XXX,XX +XXX,XX @@
-QEMU U2F Key Device Documentation.
-
-Contents
-1. USB U2F key device
-2. Building
-3. Using u2f-emulated
-4. Using u2f-passthru
-5. Libu2f-emu
-
-1. USB U2F key device
-
-U2F is an open authentication standard that enables relying parties
-exposed to the internet to offer a strong second factor option for end
-user authentication.
-
-The standard brings many advantages to both parties, client and server,
-allowing to reduce over-reliance on passwords, it increases authentication
-security and simplifies passwords.
-
-The second factor is materialized by a device implementing the U2F
-protocol. In case of a USB U2F security key, it is a USB HID device
-that implements the U2F protocol.
-
-In QEMU, the USB U2F key device offers a dedicated support of U2F, allowing
-guest USB FIDO/U2F security keys operating in two possible modes:
-pass-through and emulated.
-
-The pass-through mode consists of passing all requests made from the guest
-to the physical security key connected to the host machine and vice versa.
-In addition, the dedicated pass-through allows to have a U2F security key
-shared on several guests which is not possible with a simple host device
-assignment pass-through.
-
-The emulated mode consists of completely emulating the behavior of an
-U2F device through software part. Libu2f-emu is used for that.
-
-
-2. Building
-
-To ensure the build of the u2f-emulated device variant which depends
-on libu2f-emu: configuring and building:
-
-    ./configure --enable-u2f && make
-
-The pass-through mode is built by default on Linux. To take advantage
-of the autoscan option it provides, make sure you have a working libudev
-installed on the host.
-
-
-3. Using u2f-emulated
-
-To work, an emulated U2F device must have four elements:
- * ec x509 certificate
- * ec private key
- * counter (four bytes value)
- * 48 bytes of entropy (random bits)
-
-To use this type of device, this one has to be configured, and these
-four elements must be passed one way or another.
-
-Assuming that you have a working libu2f-emu installed on the host.
-There are three possible ways of configurations:
- * ephemeral
- * setup directory
- * manual
-
-Ephemeral is the simplest way to configure, it lets the device generate
-all the elements it needs for a single use of the lifetime of the device.
-
-    qemu -usb -device u2f-emulated
-
-Setup directory allows to configure the device from a directory containing
-four files:
- * certificate.pem: ec x509 certificate
- * private-key.pem: ec private key
- * counter: counter value
- * entropy: 48 bytes of entropy
-
-    qemu -usb -device u2f-emulated,dir=$dir
-
-Manual allows to configure the device more finely by specifying each
-of the elements necessary for the device:
- * cert
- * priv
- * counter
- * entropy
-
-    qemu -usb -device u2f-emulated,cert=$DIR1/$FILE1,priv=$DIR2/$FILE2,counter=$DIR3/$FILE3,entropy=$DIR4/$FILE4
-
-
-4. Using u2f-passthru
-
-On the host specify the u2f-passthru device with a suitable hidraw:
-
-    qemu -usb -device u2f-passthru,hidraw=/dev/hidraw0
-
-Alternately, the u2f-passthru device can autoscan to take the first
-U2F device it finds on the host (this requires a working libudev):
-
-    qemu -usb -device u2f-passthru
-
-
-5. Libu2f-emu
-
-The u2f-emulated device uses libu2f-emu for the U2F key emulation. Libu2f-emu
-implements completely the U2F protocol device part for all specified
-transport given by the FIDO Alliance.
-
-For more information about libu2f-emu see this page:
-https://github.com/MattGorko/libu2f-emu.
-- 
2.34.1