Series comparison

-[Qemu-devel] [PULL 00/39] target-arm queue
+[PULL 00/45] target-arm queue
-Second pull request of the week; mostly RTH's support for some
+Mostly this is patches from me and RTH cleaning up and doing
-new-in-v8.1/v8.3 instructions, and my v8M board model.
+more decodetree conversion for AArch32 Neon. The major new feature
 is Dongjiu Geng's patchset to report host memory errors to KVM guests;
 also a new aspeed board from Patrick Williams.
 thanks
 -- PMM
-The following changes since commit 427cbc7e4136a061628cb4315cc8182ea36d772f:
+The following changes since commit 035b448b84f3557206abc44d786c5d3db2638f7d:
-  Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging (2018-03-01 18:46:41 +0000)
+  Merge remote-tracking branch 'remotes/gkurz/tags/9p-next-2020-05-14' into staging (2020-05-14 10:58:30 +0100)
 are available in the Git repository at:
-  git://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20180302
+  https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20200514
-for you to fetch changes up to e66a67bf28e1b4fce2e3d72a2610dbd48d9d3078:
+for you to fetch changes up to e95485f85657be21135c17a9226e297c21e73360:
-  target/arm: Enable ARM_FEATURE_V8_FCMA (2018-03-02 11:03:45 +0000)
+  target/arm: Convert NEON VFMA, VFMS 3-reg-same insns to decodetree (2020-05-14 15:03:09 +0100)
 ----------------------------------------------------------------
 target-arm queue:
- * implement FCMA and RDM v8.1 and v8.3 instructions
+ * target/arm: Use correct GDB XML for M-profile cores
- * enable Cortex-M33 v8M core, and provide new mps2-an505 board model
+ * target/arm: Code cleanup to use gvec APIs better
-   that uses it
+ * aspeed: Add support for the sonorapass-bmc board
- * decodetree: Propagate return value from translate subroutines
+ * target/arm: Support reporting KVM host memory errors
- * xlnx-zynqmp: Implement the RTC device
+   to the guest via ACPI notifications
  * target/arm: Finish conversion of Neon 3-reg-same insns to decodetree
 ----------------------------------------------------------------
-Alistair Francis (3):
+Dongjiu Geng (10):
-      xlnx-zynqmp-rtc: Initial commit
+      acpi: nvdimm: change NVDIMM_UUID_LE to a common macro
-      xlnx-zynqmp-rtc: Add basic time support
+      hw/arm/virt: Introduce a RAS machine option
-      xlnx-zynqmp: Connect the RTC device
+      docs: APEI GHES generation and CPER record description
       ACPI: Build related register address fields via hardware error fw_cfg blob
       ACPI: Build Hardware Error Source Table
       ACPI: Record the Generic Error Status Block address
       KVM: Move hwpoison page related functions into kvm-all.c
       ACPI: Record Generic Error Status Block(GESB) table
       target-arm: kvm64: handle SIGBUS signal from kernel or KVM
       MAINTAINERS: Add ACPI/HEST/GHES entries
-Peter Maydell (19):
+Patrick Williams (1):
-      loader: Add new load_ramdisk_as()
+      aspeed: Add support for the sonorapass-bmc board
       hw/arm/boot: Honour CPU's address space for image loads
       hw/arm/armv7m: Honour CPU's address space for image loads
       target/arm: Define an IDAU interface
       armv7m: Forward idau property to CPU object
       target/arm: Define init-svtor property for the reset secure VTOR value
       armv7m: Forward init-svtor property to CPU object
       target/arm: Add Cortex-M33
       hw/misc/unimp: Move struct to header file
       include/hw/or-irq.h: Add missing include guard
       qdev: Add new qdev_init_gpio_in_named_with_opaque()
       hw/core/split-irq: Device that splits IRQ lines
       hw/misc/mps2-fpgaio: FPGA control block for MPS2 AN505
       hw/misc/tz-ppc: Model TrustZone peripheral protection controller
       hw/misc/iotkit-secctl: Arm IoT Kit security controller initial skeleton
       hw/misc/iotkit-secctl: Add handling for PPCs
       hw/misc/iotkit-secctl: Add remaining simple registers
       hw/arm/iotkit: Model Arm IOT Kit
       mps2-an505: New board model: MPS2 with AN505 Cortex-M33 FPGA image
-Richard Henderson (17):
+Peter Maydell (18):
-      decodetree: Propagate return value from translate subroutines
+      target/arm: Use correct GDB XML for M-profile cores
-      target/arm: Add ARM_FEATURE_V8_RDM
+      target/arm: Convert Neon 3-reg-same VQRDMLAH/VQRDMLSH to decodetree
-      target/arm: Refactor disas_simd_indexed decode
+      target/arm: Convert Neon 3-reg-same SHA to decodetree
-      target/arm: Refactor disas_simd_indexed size checks
+      target/arm: Convert Neon 64-bit element 3-reg-same insns
-      target/arm: Decode aa64 armv8.1 scalar three same extra
+      target/arm: Convert Neon VHADD 3-reg-same insns
-      target/arm: Decode aa64 armv8.1 three same extra
+      target/arm: Convert Neon VABA/VABD 3-reg-same to decodetree
-      target/arm: Decode aa64 armv8.1 scalar/vector x indexed element
+      target/arm: Convert Neon VRHADD, VHSUB 3-reg-same insns to decodetree
-      target/arm: Decode aa32 armv8.1 three same
+      target/arm: Convert Neon VQSHL, VRSHL, VQRSHL 3-reg-same insns to decodetree
-      target/arm: Decode aa32 armv8.1 two reg and a scalar
+      target/arm: Convert Neon VPMAX/VPMIN 3-reg-same insns to decodetree
-      target/arm: Enable ARM_FEATURE_V8_RDM
+      target/arm: Convert Neon VPADD 3-reg-same insns to decodetree
-      target/arm: Add ARM_FEATURE_V8_FCMA
+      target/arm: Convert Neon VQDMULH/VQRDMULH 3-reg-same to decodetree
-      target/arm: Decode aa64 armv8.3 fcadd
+      target/arm: Convert Neon VADD, VSUB, VABD 3-reg-same insns to decodetree
-      target/arm: Decode aa64 armv8.3 fcmla
+      target/arm: Convert Neon VPMIN/VPMAX/VPADD float 3-reg-same insns to decodetree
-      target/arm: Decode aa32 armv8.3 3-same
+      target/arm: Convert Neon fp VMUL, VMLA, VMLS 3-reg-same insns to decodetree
-      target/arm: Decode aa32 armv8.3 2-reg-index
+      target/arm: Convert Neon 3-reg-same compare insns to decodetree
-      target/arm: Decode t32 simd 3reg and 2reg_scalar extension
+      target/arm: Move 'env' argument of recps_f32 and rsqrts_f32 helpers to usual place
-      target/arm: Enable ARM_FEATURE_V8_FCMA
+      target/arm: Convert Neon fp VMAX/VMIN/VMAXNM/VMINNM/VRECPS/VRSQRTS to decodetree
       target/arm: Convert NEON VFMA, VFMS 3-reg-same insns to decodetree
- hw/arm/Makefile.objs               |   2 +
+Richard Henderson (16):
- hw/core/Makefile.objs              |   1 +
+      target/arm: Create gen_gvec_[us]sra
- hw/misc/Makefile.objs              |   4 +
+      target/arm: Create gen_gvec_{u,s}{rshr,rsra}
- hw/timer/Makefile.objs             |   1 +
+      target/arm: Create gen_gvec_{sri,sli}
- target/arm/Makefile.objs           |   2 +-
+      target/arm: Remove unnecessary range check for VSHL
- include/hw/arm/armv7m.h            |   5 +
+      target/arm: Tidy handle_vec_simd_shri
- include/hw/arm/iotkit.h            | 109 ++++++
+      target/arm: Create gen_gvec_{ceq,clt,cle,cgt,cge}0
- include/hw/arm/xlnx-zynqmp.h       |   2 +
+      target/arm: Create gen_gvec_{mla,mls}
- include/hw/core/split-irq.h        |  57 +++
+      target/arm: Swap argument order for VSHL during decode
- include/hw/irq.h                   |   4 +-
+      target/arm: Create gen_gvec_{cmtst,ushl,sshl}
- include/hw/loader.h                |  12 +-
+      target/arm: Create gen_gvec_{uqadd, sqadd, uqsub, sqsub}
- include/hw/misc/iotkit-secctl.h    | 103 ++++++
+      target/arm: Remove fp_status from helper_{recpe, rsqrte}_u32
- include/hw/misc/mps2-fpgaio.h      |  43 +++
+      target/arm: Create gen_gvec_{qrdmla,qrdmls}
- include/hw/misc/tz-ppc.h           | 101 ++++++
+      target/arm: Pass pointer to qc to qrdmla/qrdmls
- include/hw/misc/unimp.h            |  10 +
+      target/arm: Clear tail in gvec_fmul_idx_*, gvec_fmla_idx_*
- include/hw/or-irq.h                |   5 +
+      target/arm: Vectorize SABD/UABD
- include/hw/qdev-core.h             |  30 +-
+      target/arm: Vectorize SABA/UABA
  include/hw/timer/xlnx-zynqmp-rtc.h |  86 +++++
  target/arm/cpu.h                   |   8 +
  target/arm/helper.h                |  31 ++
  target/arm/idau.h                  |  61 ++++
  hw/arm/armv7m.c                    |  35 +-
  hw/arm/boot.c                      | 119 ++++---
  hw/arm/iotkit.c                    | 598 +++++++++++++++++++++++++++++++
  hw/arm/mps2-tz.c                   | 503 ++++++++++++++++++++++++++
  hw/arm/xlnx-zynqmp.c               |  14 +
  hw/core/loader.c                   |   8 +-
  hw/core/qdev.c                     |   8 +-
  hw/core/split-irq.c                |  89 +++++
  hw/misc/iotkit-secctl.c            | 704 +++++++++++++++++++++++++++++++++++++
  hw/misc/mps2-fpgaio.c              | 176 ++++++++++
  hw/misc/tz-ppc.c                   | 302 ++++++++++++++++
  hw/misc/unimp.c                    |  10 -
  hw/timer/xlnx-zynqmp-rtc.c         | 272 ++++++++++++++
  linux-user/elfload.c               |   2 +
  target/arm/cpu.c                   |  66 +++-
  target/arm/cpu64.c                 |   2 +
  target/arm/helper.c                |  28 +-
  target/arm/translate-a64.c         | 514 +++++++++++++++++++++------
  target/arm/translate.c             | 275 +++++++++++++--
  target/arm/vec_helper.c            | 429 ++++++++++++++++++++++
  default-configs/arm-softmmu.mak    |   5 +
  hw/misc/trace-events               |  24 ++
  hw/timer/trace-events              |   3 +
  scripts/decodetree.py              |   5 +-
 files changed, 4668 insertions(+), 200 deletions(-)
  create mode 100644 include/hw/arm/iotkit.h
  create mode 100644 include/hw/core/split-irq.h
  create mode 100644 include/hw/misc/iotkit-secctl.h
  create mode 100644 include/hw/misc/mps2-fpgaio.h
  create mode 100644 include/hw/misc/tz-ppc.h
  create mode 100644 include/hw/timer/xlnx-zynqmp-rtc.h
  create mode 100644 target/arm/idau.h
  create mode 100644 hw/arm/iotkit.c
  create mode 100644 hw/arm/mps2-tz.c
  create mode 100644 hw/core/split-irq.c
  create mode 100644 hw/misc/iotkit-secctl.c
  create mode 100644 hw/misc/mps2-fpgaio.c
  create mode 100644 hw/misc/tz-ppc.c
  create mode 100644 hw/timer/xlnx-zynqmp-rtc.c
  create mode 100644 target/arm/vec_helper.c
+ docs/specs/acpi_hest_ghes.rst          |  110 ++
+ docs/specs/index.rst                   |    1 +
+ configure                              |    4 +-
+ default-configs/arm-softmmu.mak        |    1 +
+ include/hw/acpi/aml-build.h            |    1 +
+ include/hw/acpi/generic_event_device.h |    2 +
+ include/hw/acpi/ghes.h                 |   74 +
+ include/hw/arm/virt.h                  |    1 +
+ include/qemu/uuid.h                    |   27 +
+ include/sysemu/kvm.h                   |    3 +-
+ include/sysemu/kvm_int.h               |   12 +
+ target/arm/cpu.h                       |    4 +
+ target/arm/helper.h                    |   78 +-
+ target/arm/internals.h                 |    5 +-
+ target/arm/translate.h                 |   84 +-
+ target/i386/cpu.h                      |    2 +
+ target/arm/neon-dp.decode              |  119 +-
+ accel/kvm/kvm-all.c                    |   36 +
+ hw/acpi/aml-build.c                    |    2 +
+ hw/acpi/generic_event_device.c         |   19 +
+ hw/acpi/ghes.c                         |  448 ++++++
+ hw/acpi/nvdimm.c                       |   10 +-
+ hw/arm/aspeed.c                        |   78 ++
+ hw/arm/virt-acpi-build.c               |   15 +
+ hw/arm/virt.c                          |   23 +
+ target/arm/cpu_tcg.c                   |    1 +
+ target/arm/gdbstub.c                   |   22 +-
+ target/arm/helper.c                    |    2 +-
+ target/arm/kvm64.c                     |   77 ++
+ target/arm/neon_helper.c               |   17 -
+ target/arm/tlb_helper.c                |    2 +-
+ target/arm/translate-a64.c             |  210 +--
+ target/arm/translate-neon.inc.c        |  682 +++++++++-
+ target/arm/translate.c                 | 2349 +++++++++++++++++---------------
+ target/arm/vec_helper.c                |  240 +++-
+ target/arm/vfp_helper.c                |    9 +-
+ target/i386/kvm.c                      |   36 -
+ MAINTAINERS                            |    9 +
+ gdb-xml/arm-m-profile.xml              |   27 +
+ hw/acpi/Kconfig                        |    4 +
+ hw/acpi/Makefile.objs                  |    1 +
+files changed, 3402 insertions(+), 1445 deletions(-)
+ create mode 100644 docs/specs/acpi_hest_ghes.rst
+ create mode 100644 include/hw/acpi/ghes.h
+ create mode 100644 hw/acpi/ghes.c
+ create mode 100644 gdb-xml/arm-m-profile.xml

-[Qemu-devel] [PULL 14/39] include/hw/or-irq.h: Add missing include guard
+[PULL 01/45] target/arm: Use correct GDB XML for M-profile cores
-The or-irq.h header file is missing the customary guard against
+GDB's remote protocol requires M-profile cores to use the feature
-multiple inclusion, which means compilation fails if it gets
+name 'org.gnu.gdb.arm.m-profile' instead of the 'org.gnu.gdb.arm.core'
-included twice. Fix the omission.
+feature used for A- and R-profile cores. We weren't doing this, which
 meant GDB treated our M-profile cores like A-profile ones. This mostly
 doesn't matter, but for instance means that it doesn't correctly
 handle backtraces where an M-profile exception frame is involved.
+Ship a copy of GDB's arm-m-profile.xml and use it on the M-profile
+cores.  The integer registers have the same offsets as the
+arm-core.xml, but register 25 is the M-profile XPSR rather than the
+A-profile CPSR, so we need to update arm_cpu_gdb_read_register() and
+arm_cpu_gdb_write_register() to handle XPSR reads and writes.
+Fixes: https://bugs.launchpad.net/qemu/+bug/1877136
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20200507134755.13997-1-peter.maydell@linaro.org
 Message-id: 20180220180325.29818-11-peter.maydell@linaro.org
 ---
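The register 25 write handling described in the commit message can be sketched in isolation as below. This is only an illustration, not the QEMU code: the sketch_* names are hypothetical, the 0x1ff mask assumes the exception number occupies the low 9 bits of XPSR, and the other side effects of the real xpsr_write() are not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the exception number field is XPSR bits [8:0]. */
#define SKETCH_XPSR_EXCP 0x1ffu

/* Update only the bits selected by mask; all other bits keep their old value. */
static uint32_t sketch_masked_write(uint32_t old, uint32_t val, uint32_t mask)
{
    return (old & ~mask) | (val & mask);
}

/*
 * Hypothetical model of a debugger write to register 25 on an
 * M-profile core: every field except the exception number is updated,
 * so the write cannot move the core into or out of handler mode.
 */
static uint32_t sketch_gdb_write_xpsr(uint32_t old_xpsr, uint32_t val)
{
    uint32_t mask = ~SKETCH_XPSR_EXCP;

    /* The debugger path must never touch the exception field. */
    assert((mask & SKETCH_XPSR_EXCP) == 0);
    return sketch_masked_write(old_xpsr, val, mask);
}
```

With an old value of 0x00000003 and a written value of 0xF100000B, the sketch keeps the old exception number 3 and takes the flag bits from the new value.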
- include/hw/or-irq.h | 5 +++++
+ configure                 |  4 ++--
-file changed, 5 insertions(+)
+ target/arm/cpu_tcg.c      |  1 +
  target/arm/gdbstub.c      | 22 ++++++++++++++++++----
  gdb-xml/arm-m-profile.xml | 27 +++++++++++++++++++++++++++
 files changed, 48 insertions(+), 6 deletions(-)
  create mode 100644 gdb-xml/arm-m-profile.xml
-diff --git a/include/hw/or-irq.h b/include/hw/or-irq.h
+diff --git a/configure b/configure
 index XXXXXXX..XXXXXXX 100755
 --- a/configure
 +++ b/configure
@@ -XXX,XX +XXX,XX @@ case "$target_name" in
      TARGET_SYSTBL_ABI=common,oabi
      bflt="yes"
      mttcg="yes"
 -    gdb_xml_files="arm-core.xml arm-vfp.xml arm-vfp3.xml arm-neon.xml"
 +    gdb_xml_files="arm-core.xml arm-vfp.xml arm-vfp3.xml arm-neon.xml arm-m-profile.xml"
    ;;
    aarch64|aarch64_be)
      TARGET_ARCH=aarch64
      TARGET_BASE_ARCH=arm
      bflt="yes"
      mttcg="yes"
 -    gdb_xml_files="aarch64-core.xml aarch64-fpu.xml arm-core.xml arm-vfp.xml arm-vfp3.xml arm-neon.xml"
 +    gdb_xml_files="aarch64-core.xml aarch64-fpu.xml arm-core.xml arm-vfp.xml arm-vfp3.xml arm-neon.xml arm-m-profile.xml"
    ;;
    cris)
    ;;
 diff --git a/target/arm/cpu_tcg.c b/target/arm/cpu_tcg.c
 index XXXXXXX..XXXXXXX 100644
---- a/include/hw/or-irq.h
+--- a/target/arm/cpu_tcg.c
-+++ b/include/hw/or-irq.h
++++ b/target/arm/cpu_tcg.c
@@ -XXX,XX +XXX,XX @@ static void arm_v7m_class_init(ObjectClass *oc, void *data)
  #endif
      cc->cpu_exec_interrupt = arm_v7m_cpu_exec_interrupt;
 +    cc->gdb_core_xml_file = "arm-m-profile.xml";
  }
  static const ARMCPUInfo arm_tcg_cpus[] = {
 diff --git a/target/arm/gdbstub.c b/target/arm/gdbstub.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/gdbstub.c
 +++ b/target/arm/gdbstub.c
@@ -XXX,XX +XXX,XX @@ int arm_cpu_gdb_read_register(CPUState *cs, GByteArray *mem_buf, int n)
          }
          return gdb_get_reg32(mem_buf, 0);
      case 25:
 -        /* CPSR */
 -        return gdb_get_reg32(mem_buf, cpsr_read(env));
 +        /* CPSR, or XPSR for M-profile */
 +        if (arm_feature(env, ARM_FEATURE_M)) {
 +            return gdb_get_reg32(mem_buf, xpsr_read(env));
 +        } else {
 +            return gdb_get_reg32(mem_buf, cpsr_read(env));
 +        }
      }
      /* Unknown register.  */
      return 0;
@@ -XXX,XX +XXX,XX @@ int arm_cpu_gdb_write_register(CPUState *cs, uint8_t *mem_buf, int n)
          }
          return 4;
      case 25:
 -        /* CPSR */
 -        cpsr_write(env, tmp, 0xffffffff, CPSRWriteByGDBStub);
 +        /* CPSR, or XPSR for M-profile */
 +        if (arm_feature(env, ARM_FEATURE_M)) {
 +            /*
 +             * Don't allow writing to XPSR.Exception as it can cause
 +             * a transition into or out of handler mode (it's not
 +             * writeable via the MSR insn so this is a reasonable
 +             * restriction). Other fields are safe to update.
 +             */
 +            xpsr_write(env, tmp, ~XPSR_EXCP);
 +        } else {
 +            cpsr_write(env, tmp, 0xffffffff, CPSRWriteByGDBStub);
 +        }
          return 4;
      }
      /* Unknown register.  */
 diff --git a/gdb-xml/arm-m-profile.xml b/gdb-xml/arm-m-profile.xml
 new file mode 100644
 index XXXXXXX..XXXXXXX
 --- /dev/null
 +++ b/gdb-xml/arm-m-profile.xml
 @@ -XXX,XX +XXX,XX @@
-  * THE SOFTWARE.
++<?xml version="1.0"?>
-  */
++<!-- Copyright (C) 2010-2020 Free Software Foundation, Inc.
 +#ifndef HW_OR_IRQ_H
 +#define HW_OR_IRQ_H
 +
- #include "hw/irq.h"
++     Copying and distribution of this file, with or without modification,
- #include "hw/sysbus.h"
++     are permitted in any medium without royalty provided the copyright
- #include "qom/object.h"
++     notice and this notice are preserved.  -->
@@ -XXX,XX +XXX,XX @@ struct OrIRQState {
      bool levels[MAX_OR_LINES];
      uint16_t num_lines;
  };
 +
-+#endif
++<!DOCTYPE feature SYSTEM "gdb-target.dtd">
 +<feature name="org.gnu.gdb.arm.m-profile">
 +  <reg name="r0" bitsize="32"/>
 +  <reg name="r1" bitsize="32"/>
 +  <reg name="r2" bitsize="32"/>
 +  <reg name="r3" bitsize="32"/>
 +  <reg name="r4" bitsize="32"/>
 +  <reg name="r5" bitsize="32"/>
 +  <reg name="r6" bitsize="32"/>
 +  <reg name="r7" bitsize="32"/>
 +  <reg name="r8" bitsize="32"/>
 +  <reg name="r9" bitsize="32"/>
 +  <reg name="r10" bitsize="32"/>
 +  <reg name="r11" bitsize="32"/>
 +  <reg name="r12" bitsize="32"/>
 +  <reg name="sp" bitsize="32" type="data_ptr"/>
 +  <reg name="lr" bitsize="32"/>
 +  <reg name="pc" bitsize="32" type="code_ptr"/>
 +  <reg name="xpsr" bitsize="32" regnum="25"/>
 +</feature>
 --
-.16.2
+.20.1

-[Qemu-devel] [PULL 27/39] target/arm: Decode aa64 armv8.1 scalar three same extra
+[PULL 02/45] target/arm: Create gen_gvec_[us]sra
 From: Richard Henderson <richard.henderson@linaro.org>
-Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
+The functions eliminate duplication of the special cases for
 this operation.  They match up with the GVecGen2iFn typedef.
 Add out-of-line helpers.  We got away with only having inline
 expanders because the neon vector size is only 16 bytes, and
 we know that the inline expansion will always succeed.
 When we reuse this for SVE, tcg-gvec-op may decide to use an
 out-of-line helper due to longer vector lengths.
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20180228193125.20577-5-richard.henderson@linaro.org
+Message-id: 20200513163245.17915-2-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
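The special cases that the new functions absorb can be illustrated on a single 8-bit lane. This is a scalar sketch under stated assumptions, not the gvec expansion itself: the sketch_* names are hypothetical, and the asserts stand in for the tcg_debug_assert() range checks in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Signed shift-right-accumulate on one 8-bit lane (hypothetical model). */
static uint8_t sketch_ssra8(uint8_t d, uint8_t a, int shift)
{
    int sa, sh;

    assert(shift > 0 && shift <= 8);
    /* A shift by the element size gives all sign bits, same as esize - 1. */
    if (shift == 8) {
        shift = 7;
    }
    sa = (a & 0x80) ? (int)a - 256 : (int)a;          /* sign-extend the lane */
    sh = sa < 0 ? ~(~sa >> shift) : sa >> shift;      /* portable arithmetic shift */
    return (uint8_t)(d + sh);                         /* wraps modulo 256 */
}

/* Unsigned shift-right-accumulate on one 8-bit lane (hypothetical model). */
static uint8_t sketch_usra8(uint8_t d, uint8_t a, int shift)
{
    assert(shift > 0 && shift <= 8);
    /* A shift by the element size gives zero, so there is nothing to add. */
    if (shift == 8) {
        return d;
    }
    return (uint8_t)(d + (a >> shift));
}
```

The signed case clamps the shift while the unsigned case skips the accumulate, which mirrors the two branches removed from handle_vec_simd_shri in this patch.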
- target/arm/Makefile.objs   |   2 +-
+ target/arm/helper.h        |  10 +++
- target/arm/helper.h        |   4 ++
+ target/arm/translate.h     |   7 +-
- target/arm/translate-a64.c |  84 ++++++++++++++++++++++++++++++++++
+ target/arm/translate-a64.c |  15 +---
- target/arm/vec_helper.c    | 109 +++++++++++++++++++++++++++++++++++++++++++++
+ target/arm/translate.c     | 161 ++++++++++++++++++++++---------------
-files changed, 198 insertions(+), 1 deletion(-)
+ target/arm/vec_helper.c    |  25 ++++++
- create mode 100644 target/arm/vec_helper.c
+files changed, 139 insertions(+), 79 deletions(-)
 diff --git a/target/arm/Makefile.objs b/target/arm/Makefile.objs
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/Makefile.objs
 +++ b/target/arm/Makefile.objs
@@ -XXX,XX +XXX,XX @@ obj-$(call land,$(CONFIG_KVM),$(call lnot,$(TARGET_AARCH64))) += kvm32.o
  obj-$(call land,$(CONFIG_KVM),$(TARGET_AARCH64)) += kvm64.o
  obj-$(call lnot,$(CONFIG_KVM)) += kvm-stub.o
  obj-y += translate.o op_helper.o helper.o cpu.o
 -obj-y += neon_helper.o iwmmxt_helper.o
 +obj-y += neon_helper.o iwmmxt_helper.o vec_helper.o
  obj-y += gdbstub.o
  obj-$(TARGET_AARCH64) += cpu64.o translate-a64.o helper-a64.o gdbstub64.o
  obj-y += crypto_helper.o
 diff --git a/target/arm/helper.h b/target/arm/helper.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.h
 +++ b/target/arm/helper.h
-@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_1(neon_rbit_u8, TCG_CALL_NO_RWG_SE, i32, i32)
+@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(gvec_pmull_q, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
- DEF_HELPER_3(neon_qdmulh_s16, i32, env, i32, i32)
+ DEF_HELPER_FLAGS_4(neon_pmull_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
- DEF_HELPER_3(neon_qrdmulh_s16, i32, env, i32, i32)
-+DEF_HELPER_4(neon_qrdmlah_s16, i32, env, i32, i32, i32)
++DEF_HELPER_FLAGS_3(gvec_ssra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
-+DEF_HELPER_4(neon_qrdmlsh_s16, i32, env, i32, i32, i32)
++DEF_HELPER_FLAGS_3(gvec_ssra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
- DEF_HELPER_3(neon_qdmulh_s32, i32, env, i32, i32)
++DEF_HELPER_FLAGS_3(gvec_ssra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
- DEF_HELPER_3(neon_qrdmulh_s32, i32, env, i32, i32)
++DEF_HELPER_FLAGS_3(gvec_ssra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
-+DEF_HELPER_4(neon_qrdmlah_s32, i32, env, s32, s32, s32)
++
-+DEF_HELPER_4(neon_qrdmlsh_s32, i32, env, s32, s32, s32)
++DEF_HELPER_FLAGS_3(gvec_usra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
++DEF_HELPER_FLAGS_3(gvec_usra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
- DEF_HELPER_1(neon_narrow_u8, i32, i64)
++DEF_HELPER_FLAGS_3(gvec_usra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
- DEF_HELPER_1(neon_narrow_u16, i32, i64)
++DEF_HELPER_FLAGS_3(gvec_usra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +
  #ifdef TARGET_AARCH64
  #include "helper-a64.h"
  #include "helper-sve.h"
 diff --git a/target/arm/translate.h b/target/arm/translate.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.h
 +++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ extern const GVecGen3 mls_op[4];
  extern const GVecGen3 cmtst_op[4];
  extern const GVecGen3 sshl_op[4];
  extern const GVecGen3 ushl_op[4];
 -extern const GVecGen2i ssra_op[4];
 -extern const GVecGen2i usra_op[4];
  extern const GVecGen2i sri_op[4];
  extern const GVecGen2i sli_op[4];
  extern const GVecGen4 uqadd_op[4];
@@ -XXX,XX +XXX,XX @@ void gen_sshl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b);
  void gen_ushl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
  void gen_sshl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
 +void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 +                   int64_t shift, uint32_t opr_sz, uint32_t max_sz);
 +void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 +                   int64_t shift, uint32_t opr_sz, uint32_t max_sz);
 +
  /*
   * Forward to the isar_feature_* tests given a DisasContext pointer.
   */
 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-a64.c
 +++ b/target/arm/translate-a64.c
-@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_same_fp16(DisasContext *s,
+@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
-     tcg_temp_free_ptr(fpst);
      switch (opcode) {
      case 0x02: /* SSRA / USRA (accumulate) */
 -        if (is_u) {
 -            /* Shift count same as element size produces zero to add.  */
 -            if (shift == 8 << size) {
 -                goto done;
 -            }
 -            gen_gvec_op2i(s, is_q, rd, rn, shift, &usra_op[size]);
 -        } else {
 -            /* Shift count same as element size produces all sign to add.  */
 -            if (shift == 8 << size) {
 -                shift -= 1;
 -            }
 -            gen_gvec_op2i(s, is_q, rd, rn, shift, &ssra_op[size]);
 -        }
 +        gen_gvec_fn2i(s, is_q, rd, rn, shift,
 +                      is_u ? gen_gvec_usra : gen_gvec_ssra, size);
          return;
      case 0x08: /* SRI */
          /* Shift count same as element size is valid but does nothing.  */
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_ssra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
      tcg_gen_add_vec(vece, d, d, a);
  }
-+/* AdvSIMD scalar three same extra
+-static const TCGOpcode vecop_list_ssra[] = {
-+ *  31 30  29 28       24 23  22  21 20  16  15 14    11  10 9  5 4  0
+-    INDEX_op_sari_vec, INDEX_op_add_vec, 0
-+ * +-----+---+-----------+------+---+------+---+--------+---+----+----+
+-};
-+ * | 0 1 | U | 1 1 1 1 0 | size | 0 |  Rm  | 1 | opcode | 1 | Rn | Rd |
++void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
-+ * +-----+---+-----------+------+---+------+---+--------+---+----+----+
++                   int64_t shift, uint32_t opr_sz, uint32_t max_sz)
 + */
 +static void disas_simd_scalar_three_reg_same_extra(DisasContext *s,
 +                                                   uint32_t insn)
 +{
-+    int rd = extract32(insn, 0, 5);
++    static const TCGOpcode vecop_list[] = {
-+    int rn = extract32(insn, 5, 5);
++        INDEX_op_sari_vec, INDEX_op_add_vec, 0
-+    int opcode = extract32(insn, 11, 4);
++    };
-+    int rm = extract32(insn, 16, 5);
++    static const GVecGen2i ops[4] = {
-+    int size = extract32(insn, 22, 2);
++        { .fni8 = gen_ssra8_i64,
-+    bool u = extract32(insn, 29, 1);
++          .fniv = gen_ssra_vec,
-+    TCGv_i32 ele1, ele2, ele3;
++          .fno = gen_helper_gvec_ssra_b,
-+    TCGv_i64 res;
++          .load_dest = true,
-+    int feature;
++          .opt_opc = vecop_list,
-+
++          .vece = MO_8 },
-+    switch (u * 16 + opcode) {
++        { .fni8 = gen_ssra16_i64,
-+    case 0x10: /* SQRDMLAH (vector) */
++          .fniv = gen_ssra_vec,
-+    case 0x11: /* SQRDMLSH (vector) */
++          .fno = gen_helper_gvec_ssra_h,
-+        if (size != 1 && size != 2) {
++          .load_dest = true,
-+            unallocated_encoding(s);
++          .opt_opc = vecop_list,
-+            return;
++          .vece = MO_16 },
-+        }
++        { .fni4 = gen_ssra32_i32,
-+        feature = ARM_FEATURE_V8_RDM;
++          .fniv = gen_ssra_vec,
-+        break;
++          .fno = gen_helper_gvec_ssra_s,
-+    default:
++          .load_dest = true,
-+        unallocated_encoding(s);
++          .opt_opc = vecop_list,
-+        return;
++          .vece = MO_32 },
 +        { .fni8 = gen_ssra64_i64,
 +          .fniv = gen_ssra_vec,
 +          .fno = gen_helper_gvec_ssra_d,
 +          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +          .opt_opc = vecop_list,
 +          .load_dest = true,
 +          .vece = MO_64 },
 +    };
 -const GVecGen2i ssra_op[4] = {
 -    { .fni8 = gen_ssra8_i64,
 -      .fniv = gen_ssra_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_ssra,
 -      .vece = MO_8 },
 -    { .fni8 = gen_ssra16_i64,
 -      .fniv = gen_ssra_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_ssra,
 -      .vece = MO_16 },
 -    { .fni4 = gen_ssra32_i32,
 -      .fniv = gen_ssra_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_ssra,
 -      .vece = MO_32 },
 -    { .fni8 = gen_ssra64_i64,
 -      .fniv = gen_ssra_vec,
 -      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -      .opt_opc = vecop_list_ssra,
 -      .load_dest = true,
 -      .vece = MO_64 },
 -};
 +    /* tszimm encoding produces immediates in the range [1..esize]. */
 +    tcg_debug_assert(shift > 0);
 +    tcg_debug_assert(shift <= (8 << vece));
 +
 +    /*
 +     * A shift equal to the element size is architecturally valid.
 +     * The signed result is all sign bits, same as shifting by esize - 1.
 +     */
 +    shift = MIN(shift, (8 << vece) - 1);
 +    tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
 +}
  static void gen_usra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
  {
@@ -XXX,XX +XXX,XX @@ static void gen_usra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
      tcg_gen_add_vec(vece, d, d, a);
  }
 -static const TCGOpcode vecop_list_usra[] = {
 -    INDEX_op_shri_vec, INDEX_op_add_vec, 0
 -};
 +void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 +                   int64_t shift, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = {
 +        INDEX_op_shri_vec, INDEX_op_add_vec, 0
 +    };
 +    static const GVecGen2i ops[4] = {
 +        { .fni8 = gen_usra8_i64,
 +          .fniv = gen_usra_vec,
 +          .fno = gen_helper_gvec_usra_b,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_8, },
 +        { .fni8 = gen_usra16_i64,
 +          .fniv = gen_usra_vec,
 +          .fno = gen_helper_gvec_usra_h,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_16, },
 +        { .fni4 = gen_usra32_i32,
 +          .fniv = gen_usra_vec,
 +          .fno = gen_helper_gvec_usra_s,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_32, },
 +        { .fni8 = gen_usra64_i64,
 +          .fniv = gen_usra_vec,
 +          .fno = gen_helper_gvec_usra_d,
 +          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_64, },
 +    };
 -const GVecGen2i usra_op[4] = {
 -    { .fni8 = gen_usra8_i64,
 -      .fniv = gen_usra_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_usra,
 -      .vece = MO_8, },
 -    { .fni8 = gen_usra16_i64,
 -      .fniv = gen_usra_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_usra,
 -      .vece = MO_16, },
 -    { .fni4 = gen_usra32_i32,
 -      .fniv = gen_usra_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_usra,
 -      .vece = MO_32, },
 -    { .fni8 = gen_usra64_i64,
 -      .fniv = gen_usra_vec,
 -      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_usra,
 -      .vece = MO_64, },
 -};
 +    /* tszimm encoding produces immediates in the range [1..esize]. */
 +    tcg_debug_assert(shift > 0);
 +    tcg_debug_assert(shift <= (8 << vece));
 +
 +    /*
 +     * A shift equal to the element size is architecturally valid.
 +     * The unsigned result is all zeros, so the accumulate is a nop.
 +     */
 +    if (shift < (8 << vece)) {
 +        tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
 +    } else {
 +        /* Nop, but we do need to clear the tail. */
 +        tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz);
 +    }
-+    if (!arm_dc_feature(s, feature)) {
-+        unallocated_encoding(s);
-+        return;
-+    }
-+    if (!fp_access_check(s)) {
-+        return;
-+    }
-+
-+    /* Do a single operation on the lowest element in the vector.
-+     * We use the standard Neon helpers and rely on 0 OP 0 == 0
-+     * with no side effects for all these operations.
-+     * OPTME: special-purpose helpers would avoid doing some
-+     * unnecessary work in the helper for the 16 bit cases.
-+     */
-+    ele1 = tcg_temp_new_i32();
-+    ele2 = tcg_temp_new_i32();
-+    ele3 = tcg_temp_new_i32();
-+
-+    read_vec_element_i32(s, ele1, rn, 0, size);
-+    read_vec_element_i32(s, ele2, rm, 0, size);
-+    read_vec_element_i32(s, ele3, rd, 0, size);
-+
-+    switch (opcode) {
-+    case 0x0: /* SQRDMLAH */
-+        if (size == 1) {
-+            gen_helper_neon_qrdmlah_s16(ele3, cpu_env, ele1, ele2, ele3);
-+        } else {
-+            gen_helper_neon_qrdmlah_s32(ele3, cpu_env, ele1, ele2, ele3);
-+        }
-+        break;
-+    case 0x1: /* SQRDMLSH */
-+        if (size == 1) {
-+            gen_helper_neon_qrdmlsh_s16(ele3, cpu_env, ele1, ele2, ele3);
-+        } else {
-+            gen_helper_neon_qrdmlsh_s32(ele3, cpu_env, ele1, ele2, ele3);
-+        }
-+        break;
-+    default:
-+        g_assert_not_reached();
-+    }
-+    tcg_temp_free_i32(ele1);
-+    tcg_temp_free_i32(ele2);
-+
-+    res = tcg_temp_new_i64();
-+    tcg_gen_extu_i32_i64(res, ele3);
-+    tcg_temp_free_i32(ele3);
-+
-+    write_fp_dreg(s, rd, res);
-+    tcg_temp_free_i64(res);
 +}
-+
- static void handle_2misc_64(DisasContext *s, int opcode, bool u,
+ static void gen_shr8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
-                             TCGv_i64 tcg_rd, TCGv_i64 tcg_rn,
+ {
-                             TCGv_i32 tcg_rmode, TCGv_ptr tcg_fpstatus)
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-@@ -XXX,XX +XXX,XX @@ static const AArch64DecodeTable data_proc_simd[] = {
+                 case 1:  /* VSRA */
-     { 0x0e000800, 0xbf208c00, disas_simd_zip_trn },
+                     /* Right shift comes here negative.  */
-     { 0x2e000000, 0xbf208400, disas_simd_ext },
+                     shift = -shift;
-     { 0x5e200400, 0xdf200400, disas_simd_scalar_three_reg_same },
+-                    /* Shifts larger than the element size are architecturally
-+    { 0x5e008400, 0xdf208400, disas_simd_scalar_three_reg_same_extra },
+-                     * valid.  Unsigned results in all zeros; signed results
-     { 0x5e200000, 0xdf200c00, disas_simd_scalar_three_reg_diff },
+-                     * in all sign bits.
-     { 0x5e200800, 0xdf3e0c00, disas_simd_scalar_two_reg_misc },
+-                     */
-     { 0x5e300800, 0xdf3e0c00, disas_simd_scalar_pairwise },
+-                    if (!u) {
 -                        tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, vec_size,
 -                                        MIN(shift, (8 << size) - 1),
 -                                        &ssra_op[size]);
 -                    } else if (shift >= 8 << size) {
 -                        /* rd += 0 */
 +                    if (u) {
 +                        gen_gvec_usra(size, rd_ofs, rm_ofs, shift,
 +                                      vec_size, vec_size);
                      } else {
 -                        tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, vec_size,
 -                                        shift, &usra_op[size]);
 +                        gen_gvec_ssra(size, rd_ofs, rm_ofs, shift,
 +                                      vec_size, vec_size);
                      }
                      return 0;
 diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
-new file mode 100644
+index XXXXXXX..XXXXXXX 100644
-index XXXXXXX..XXXXXXX
+--- a/target/arm/vec_helper.c
 --- /dev/null
 +++ b/target/arm/vec_helper.c
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_sqsub_d)(void *vd, void *vq, void *vn,
-+/*
+     clear_tail(d, oprsz, simd_maxsz(desc));
-+ * ARM AdvSIMD / SVE Vector Operations
+ }
-+ *
-+ * Copyright (c) 2018 Linaro
++
-+ *
++#define DO_SRA(NAME, TYPE)                              \
-+ * This library is free software; you can redistribute it and/or
++void HELPER(NAME)(void *vd, void *vn, uint32_t desc)    \
-+ * modify it under the terms of the GNU Lesser General Public
++{                                                       \
-+ * License as published by the Free Software Foundation; either
++    intptr_t i, oprsz = simd_oprsz(desc);               \
-+ * version 2 of the License, or (at your option) any later version.
++    int shift = simd_data(desc);                        \
-+ *
++    TYPE *d = vd, *n = vn;                              \
-+ * This library is distributed in the hope that it will be useful,
++    for (i = 0; i < oprsz / sizeof(TYPE); i++) {        \
-+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
++        d[i] += n[i] >> shift;                          \
-+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
++    }                                                   \
-+ * Lesser General Public License for more details.
++    clear_tail(d, oprsz, simd_maxsz(desc));             \
 + *
 + * You should have received a copy of the GNU Lesser General Public
 + * License along with this library; if not, see <http://www.gnu.org/licenses/>.
 + */
 +
 +#include "qemu/osdep.h"
 +#include "cpu.h"
 +#include "exec/exec-all.h"
 +#include "exec/helper-proto.h"
 +#include "tcg/tcg-gvec-desc.h"
 +
 +
 +#define SET_QC() env->vfp.xregs[ARM_VFP_FPSCR] |= CPSR_Q
 +
 +/* Signed saturating rounding doubling multiply-accumulate high half, 16-bit */
 +static uint16_t inl_qrdmlah_s16(CPUARMState *env, int16_t src1,
 +                                int16_t src2, int16_t src3)
 +{
 +    /* Simplify:
 +     * = ((a3 << 16) + ((e1 * e2) << 1) + (1 << 15)) >> 16
 +     * = ((a3 << 15) + (e1 * e2) + (1 << 14)) >> 15
 +     */
 +    int32_t ret = (int32_t)src1 * src2;
 +    ret = ((int32_t)src3 << 15) + ret + (1 << 14);
 +    ret >>= 15;
 +    if (ret != (int16_t)ret) {
 +        SET_QC();
 +        ret = (ret < 0 ? -0x8000 : 0x7fff);
 +    }
 +    return ret;
 +}
 +
-+uint32_t HELPER(neon_qrdmlah_s16)(CPUARMState *env, uint32_t src1,
++DO_SRA(gvec_ssra_b, int8_t)
-+                                  uint32_t src2, uint32_t src3)
++DO_SRA(gvec_ssra_h, int16_t)
-+{
++DO_SRA(gvec_ssra_s, int32_t)
-+    uint16_t e1 = inl_qrdmlah_s16(env, src1, src2, src3);
++DO_SRA(gvec_ssra_d, int64_t)
-+    uint16_t e2 = inl_qrdmlah_s16(env, src1 >> 16, src2 >> 16, src3 >> 16);
++
-+    return deposit32(e1, 16, 16, e2);
++DO_SRA(gvec_usra_b, uint8_t)
-+}
++DO_SRA(gvec_usra_h, uint16_t)
-+
++DO_SRA(gvec_usra_s, uint32_t)
-+/* Signed saturating rounding doubling multiply-subtract high half, 16-bit */
++DO_SRA(gvec_usra_d, uint64_t)
-+static uint16_t inl_qrdmlsh_s16(CPUARMState *env, int16_t src1,
++
-+                                int16_t src2, int16_t src3)
++#undef DO_SRA
-+{
++
-+    /* Similarly, using subtraction:
+ /*
-+     * = ((a3 << 16) - ((e1 * e2) << 1) + (1 << 15)) >> 16
+  * Convert float16 to float32, raising no exceptions and
-+     * = ((a3 << 15) - (e1 * e2) + (1 << 14)) >> 15
+  * preserving exceptional values, including SNaN.
 +     */
 +    int32_t ret = (int32_t)src1 * src2;
 +    ret = ((int32_t)src3 << 15) - ret + (1 << 14);
 +    ret >>= 15;
 +    if (ret != (int16_t)ret) {
 +        SET_QC();
 +        ret = (ret < 0 ? -0x8000 : 0x7fff);
 +    }
 +    return ret;
 +}
 +
 +uint32_t HELPER(neon_qrdmlsh_s16)(CPUARMState *env, uint32_t src1,
 +                                  uint32_t src2, uint32_t src3)
 +{
 +    uint16_t e1 = inl_qrdmlsh_s16(env, src1, src2, src3);
 +    uint16_t e2 = inl_qrdmlsh_s16(env, src1 >> 16, src2 >> 16, src3 >> 16);
 +    return deposit32(e1, 16, 16, e2);
 +}
 +
 +/* Signed saturating rounding doubling multiply-accumulate high half, 32-bit */
 +uint32_t HELPER(neon_qrdmlah_s32)(CPUARMState *env, int32_t src1,
 +                                  int32_t src2, int32_t src3)
 +{
 +    /* Simplify similarly to inl_qrdmlah_s16 above.  */
 +    int64_t ret = (int64_t)src1 * src2;
 +    ret = ((int64_t)src3 << 31) + ret + (1 << 30);
 +    ret >>= 31;
 +    if (ret != (int32_t)ret) {
 +        SET_QC();
 +        ret = (ret < 0 ? INT32_MIN : INT32_MAX);
 +    }
 +    return ret;
 +}
 +
 +/* Signed saturating rounding doubling multiply-subtract high half, 32-bit */
 +uint32_t HELPER(neon_qrdmlsh_s32)(CPUARMState *env, int32_t src1,
 +                                  int32_t src2, int32_t src3)
 +{
 +    /* Simplify similarly to inl_qrdmlsh_s16 above.  */
 +    int64_t ret = (int64_t)src1 * src2;
 +    ret = ((int64_t)src3 << 31) - ret + (1 << 30);
 +    ret >>= 31;
 +    if (ret != (int32_t)ret) {
 +        SET_QC();
 +        ret = (ret < 0 ? INT32_MIN : INT32_MAX);
 +    }
 +    return ret;
 +}
 --
-.16.2
+.20.1

-[Qemu-devel] [PULL 34/39] target/arm: Decode aa64 armv8.3 fcadd
+[PULL 03/45] target/arm: Create gen_gvec_{u,s}{rshr,rsra}
 From: Richard Henderson <richard.henderson@linaro.org>
-Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
+Create vectorized versions of handle_shri_with_rndacc
 for shift+round and shift+round+accumulate.  Add out-of-line
 helpers in preparation for longer vector lengths from SVE.
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20180228193125.20577-12-richard.henderson@linaro.org
+Message-id: 20200513163245.17915-3-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
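Note: the shift+round expansion described in the commit message relies on the identity "shift by the requested amount, then add bit (shift - 1) of the input as the rounding bit". The sketch below is a scalar 8-bit model of that identity, not code from the patch; the function names (srshr8_ref, srshr8_bit, urshr8_ref, urshr8_bit, count_mismatches) are hypothetical, and it assumes that right-shifting a negative signed value is an arithmetic shift, which QEMU also assumes of its host compilers.

```c
#include <assert.h>
#include <stdint.h>

/* Reference: add the rounding constant in a wider type, then shift.
 * Assumes >> on a negative int is an arithmetic shift. */
static int8_t srshr8_ref(int8_t x, int sh)
{
    assert(sh >= 1 && sh <= 8);          /* immediates are in [1..esize] */
    return (int8_t)(((int32_t)x + (1 << (sh - 1))) >> sh);
}

/* Expansion: plain shift, plus bit (sh - 1) as the rounding bit. */
static int8_t srshr8_bit(int8_t x, int sh)
{
    assert(sh >= 1 && sh <= 8);
    int32_t round = ((int32_t)x >> (sh - 1)) & 1;
    return (int8_t)(((int32_t)x >> sh) + round);
}

static uint8_t urshr8_ref(uint8_t x, int sh)
{
    assert(sh >= 1 && sh <= 8);
    return (uint8_t)(((uint32_t)x + (1u << (sh - 1))) >> sh);
}

static uint8_t urshr8_bit(uint8_t x, int sh)
{
    assert(sh >= 1 && sh <= 8);
    uint32_t round = ((uint32_t)x >> (sh - 1)) & 1;
    return (uint8_t)(((uint32_t)x >> sh) + round);
}

/* Compare both forms over every 8-bit input and every legal shift;
 * returns the number of disagreements. */
static int count_mismatches(void)
{
    int bad = 0;
    for (int sh = 1; sh <= 8; sh++) {
        for (int v = -128; v <= 127; v++) {
            bad += srshr8_ref((int8_t)v, sh) != srshr8_bit((int8_t)v, sh);
            bad += urshr8_ref((uint8_t)v, sh) != urshr8_bit((uint8_t)v, sh);
        }
    }
    return bad;
}
```

In this model a signed rounding shift by the full element size (sh == 8) always gives zero, while the unsigned one gives a copy of the most significant bit. This matches the special cases the patch handles in gen_gvec_srshr and gen_gvec_urshr.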
- target/arm/helper.h        |  7 ++++
+ target/arm/helper.h        |  20 ++
- target/arm/translate-a64.c | 48 ++++++++++++++++++++++-
+ target/arm/translate.h     |   9 +
- target/arm/vec_helper.c    | 97 ++++++++++++++++++++++++++++++++++++++++++++++
+ target/arm/translate-a64.c |  11 +-
-files changed, 151 insertions(+), 1 deletion(-)
+ target/arm/translate.c     | 463 +++++++++++++++++++++++++++++++++++--
  target/arm/vec_helper.c    |  50 ++++
 files changed, 527 insertions(+), 26 deletions(-)
 diff --git a/target/arm/helper.h b/target/arm/helper.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.h
 +++ b/target/arm/helper.h
-@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_qrdmlah_s32, TCG_CALL_NO_RWG,
+@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(gvec_usra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
- DEF_HELPER_FLAGS_5(gvec_qrdmlsh_s32, TCG_CALL_NO_RWG,
+ DEF_HELPER_FLAGS_3(gvec_usra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
-                    void, ptr, ptr, ptr, ptr, i32)
+ DEF_HELPER_FLAGS_3(gvec_usra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
-+DEF_HELPER_FLAGS_5(gvec_fcaddh, TCG_CALL_NO_RWG,
++DEF_HELPER_FLAGS_3(gvec_srshr_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
-+                   void, ptr, ptr, ptr, ptr, i32)
++DEF_HELPER_FLAGS_3(gvec_srshr_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
-+DEF_HELPER_FLAGS_5(gvec_fcadds, TCG_CALL_NO_RWG,
++DEF_HELPER_FLAGS_3(gvec_srshr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
-+                   void, ptr, ptr, ptr, ptr, i32)
++DEF_HELPER_FLAGS_3(gvec_srshr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
-+DEF_HELPER_FLAGS_5(gvec_fcaddd, TCG_CALL_NO_RWG,
++
-+                   void, ptr, ptr, ptr, ptr, i32)
++DEF_HELPER_FLAGS_3(gvec_urshr_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_3(gvec_urshr_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_3(gvec_urshr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_3(gvec_urshr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +
 +DEF_HELPER_FLAGS_3(gvec_srsra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_3(gvec_srsra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_3(gvec_srsra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_3(gvec_srsra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +
 +DEF_HELPER_FLAGS_3(gvec_ursra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_3(gvec_ursra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_3(gvec_ursra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_3(gvec_ursra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +
  #ifdef TARGET_AARCH64
  #include "helper-a64.h"
- #endif
+ #include "helper-sve.h"
 diff --git a/target/arm/translate.h b/target/arm/translate.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.h
 +++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
  void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
                     int64_t shift, uint32_t opr_sz, uint32_t max_sz);
 +void gen_gvec_srshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 +                    int64_t shift, uint32_t opr_sz, uint32_t max_sz);
 +void gen_gvec_urshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 +                    int64_t shift, uint32_t opr_sz, uint32_t max_sz);
 +void gen_gvec_srsra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 +                    int64_t shift, uint32_t opr_sz, uint32_t max_sz);
 +void gen_gvec_ursra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 +                    int64_t shift, uint32_t opr_sz, uint32_t max_sz);
 +
  /*
   * Forward to the isar_feature_* tests given a DisasContext pointer.
   */
 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-a64.c
 +++ b/target/arm/translate-a64.c
-@@ -XXX,XX +XXX,XX @@ static void gen_gvec_op3_env(DisasContext *s, bool is_q, int rd,
+@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
                         is_q ? 16 : 8, vec_full_reg_size(s), 0, fn);
  }
 +/* Expand a 3-operand + fpstatus pointer + simd data value operation using
 + * an out-of-line helper.
 + */
 +static void gen_gvec_op3_fpst(DisasContext *s, bool is_q, int rd, int rn,
 +                              int rm, bool is_fp16, int data,
 +                              gen_helper_gvec_3_ptr *fn)
 +{
 +    TCGv_ptr fpst = get_fpstatus_ptr(is_fp16);
 +    tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, rd),
 +                       vec_full_reg_offset(s, rn),
 +                       vec_full_reg_offset(s, rm), fpst,
 +                       is_q ? 16 : 8, vec_full_reg_size(s), data, fn);
 +    tcg_temp_free_ptr(fpst);
 +}
 +
  /* Set ZF and NF based on a 64 bit result. This is alas fiddlier
   * than the 32 bit equivalent.
   */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn)
      int size = extract32(insn, 22, 2);
      bool u = extract32(insn, 29, 1);
      bool is_q = extract32(insn, 30, 1);
 -    int feature;
 +    int feature, rot;
      switch (u * 16 + opcode) {
      case 0x10: /* SQRDMLAH (vector) */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn)
          }
          feature = ARM_FEATURE_V8_RDM;
          break;
 +    case 0xc: /* FCADD, #90 */
 +    case 0xe: /* FCADD, #270 */
 +        if (size == 0
 +            || (size == 1 && !arm_dc_feature(s, ARM_FEATURE_V8_FP16))
 +            || (size == 3 && !is_q)) {
 +            unallocated_encoding(s);
 +            return;
 +        }
 +        feature = ARM_FEATURE_V8_FCMA;
 +        break;
      default:
          unallocated_encoding(s);
          return;
-@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn)
-         }
+     case 0x04: /* SRSHR / URSHR (rounding) */
-         return;
+-        break;
++        gen_gvec_fn2i(s, is_q, rd, rn, shift,
-+    case 0xc: /* FCADD, #90 */
++                      is_u ? gen_gvec_urshr : gen_gvec_srshr, size);
-+    case 0xe: /* FCADD, #270 */
++        return;
-+        rot = extract32(opcode, 1, 1);
++
-+        switch (size) {
+     case 0x06: /* SRSRA / URSRA (accum + rounding) */
-+        case 1:
+-        accumulate = true;
-+            gen_gvec_op3_fpst(s, is_q, rd, rn, rm, size == 1, rot,
+-        break;
-+                              gen_helper_gvec_fcaddh);
++        gen_gvec_fn2i(s, is_q, rd, rn, shift,
-+            break;
++                      is_u ? gen_gvec_ursra : gen_gvec_srsra, size);
 +        case 2:
 +            gen_gvec_op3_fpst(s, is_q, rd, rn, rm, size == 1, rot,
 +                              gen_helper_gvec_fcadds);
 +            break;
 +        case 3:
 +            gen_gvec_op3_fpst(s, is_q, rd, rn, rm, size == 1, rot,
 +                              gen_helper_gvec_fcaddd);
 +            break;
 +        default:
 +            g_assert_not_reached();
 +        }
 +        return;
 +
      default:
          g_assert_not_reached();
      }
+diff --git a/target/arm/translate.c b/target/arm/translate.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/translate.c
++++ b/target/arm/translate.c
+@@ -XXX,XX +XXX,XX @@ void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+     }
+ }
++/*
++ * Shift one less than the requested amount, and the low bit is
++ * the rounding bit.  For the 8 and 16-bit operations, because we
++ * mask the low bit, we can perform a normal integer shift instead
++ * of a vector shift.
++ */
++static void gen_srshr8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
++{
++    TCGv_i64 t = tcg_temp_new_i64();
++
++    tcg_gen_shri_i64(t, a, sh - 1);
++    tcg_gen_andi_i64(t, t, dup_const(MO_8, 1));
++    tcg_gen_vec_sar8i_i64(d, a, sh);
++    tcg_gen_vec_add8_i64(d, d, t);
++    tcg_temp_free_i64(t);
++}
++
++static void gen_srshr16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
++{
++    TCGv_i64 t = tcg_temp_new_i64();
++
++    tcg_gen_shri_i64(t, a, sh - 1);
++    tcg_gen_andi_i64(t, t, dup_const(MO_16, 1));
++    tcg_gen_vec_sar16i_i64(d, a, sh);
++    tcg_gen_vec_add16_i64(d, d, t);
++    tcg_temp_free_i64(t);
++}
++
++static void gen_srshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
++{
++    TCGv_i32 t = tcg_temp_new_i32();
++
++    tcg_gen_extract_i32(t, a, sh - 1, 1);
++    tcg_gen_sari_i32(d, a, sh);
++    tcg_gen_add_i32(d, d, t);
++    tcg_temp_free_i32(t);
++}
++
++static void gen_srshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
++{
++    TCGv_i64 t = tcg_temp_new_i64();
++
++    tcg_gen_extract_i64(t, a, sh - 1, 1);
++    tcg_gen_sari_i64(d, a, sh);
++    tcg_gen_add_i64(d, d, t);
++    tcg_temp_free_i64(t);
++}
++
++static void gen_srshr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
++{
++    TCGv_vec t = tcg_temp_new_vec_matching(d);
++    TCGv_vec ones = tcg_temp_new_vec_matching(d);
++
++    tcg_gen_shri_vec(vece, t, a, sh - 1);
++    tcg_gen_dupi_vec(vece, ones, 1);
++    tcg_gen_and_vec(vece, t, t, ones);
++    tcg_gen_sari_vec(vece, d, a, sh);
++    tcg_gen_add_vec(vece, d, d, t);
++
++    tcg_temp_free_vec(t);
++    tcg_temp_free_vec(ones);
++}
++
++void gen_gvec_srshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
++                    int64_t shift, uint32_t opr_sz, uint32_t max_sz)
++{
++    static const TCGOpcode vecop_list[] = {
++        INDEX_op_shri_vec, INDEX_op_sari_vec, INDEX_op_add_vec, 0
++    };
++    static const GVecGen2i ops[4] = {
++        { .fni8 = gen_srshr8_i64,
++          .fniv = gen_srshr_vec,
++          .fno = gen_helper_gvec_srshr_b,
++          .opt_opc = vecop_list,
++          .vece = MO_8 },
++        { .fni8 = gen_srshr16_i64,
++          .fniv = gen_srshr_vec,
++          .fno = gen_helper_gvec_srshr_h,
++          .opt_opc = vecop_list,
++          .vece = MO_16 },
++        { .fni4 = gen_srshr32_i32,
++          .fniv = gen_srshr_vec,
++          .fno = gen_helper_gvec_srshr_s,
++          .opt_opc = vecop_list,
++          .vece = MO_32 },
++        { .fni8 = gen_srshr64_i64,
++          .fniv = gen_srshr_vec,
++          .fno = gen_helper_gvec_srshr_d,
++          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
++          .opt_opc = vecop_list,
++          .vece = MO_64 },
++    };
++
++    /* tszimm encoding produces immediates in the range [1..esize] */
++    tcg_debug_assert(shift > 0);
++    tcg_debug_assert(shift <= (8 << vece));
++
++    if (shift == (8 << vece)) {
++        /*
++         * Shifts larger than the element size are architecturally valid.
++         * Signed results in all sign bits.  With rounding, this produces
++         *   (-1 + 1) >> 1 == 0, or (0 + 1) >> 1 == 0.
++         * I.e. always zero.
++         */
++        tcg_gen_gvec_dup_imm(vece, rd_ofs, opr_sz, max_sz, 0);
++    } else {
++        tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
++    }
++}
++
++static void gen_srsra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
++{
++    TCGv_i64 t = tcg_temp_new_i64();
++
++    gen_srshr8_i64(t, a, sh);
++    tcg_gen_vec_add8_i64(d, d, t);
++    tcg_temp_free_i64(t);
++}
++
++static void gen_srsra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
++{
++    TCGv_i64 t = tcg_temp_new_i64();
++
++    gen_srshr16_i64(t, a, sh);
++    tcg_gen_vec_add16_i64(d, d, t);
++    tcg_temp_free_i64(t);
++}
++
++static void gen_srsra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
++{
++    TCGv_i32 t = tcg_temp_new_i32();
++
++    gen_srshr32_i32(t, a, sh);
++    tcg_gen_add_i32(d, d, t);
++    tcg_temp_free_i32(t);
++}
++
++static void gen_srsra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
++{
++    TCGv_i64 t = tcg_temp_new_i64();
++
++    gen_srshr64_i64(t, a, sh);
++    tcg_gen_add_i64(d, d, t);
++    tcg_temp_free_i64(t);
++}
++
++static void gen_srsra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
++{
++    TCGv_vec t = tcg_temp_new_vec_matching(d);
++
++    gen_srshr_vec(vece, t, a, sh);
++    tcg_gen_add_vec(vece, d, d, t);
++    tcg_temp_free_vec(t);
++}
++
++void gen_gvec_srsra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
++                    int64_t shift, uint32_t opr_sz, uint32_t max_sz)
++{
++    static const TCGOpcode vecop_list[] = {
++        INDEX_op_shri_vec, INDEX_op_sari_vec, INDEX_op_add_vec, 0
++    };
++    static const GVecGen2i ops[4] = {
++        { .fni8 = gen_srsra8_i64,
++          .fniv = gen_srsra_vec,
++          .fno = gen_helper_gvec_srsra_b,
++          .opt_opc = vecop_list,
++          .load_dest = true,
++          .vece = MO_8 },
++        { .fni8 = gen_srsra16_i64,
++          .fniv = gen_srsra_vec,
++          .fno = gen_helper_gvec_srsra_h,
++          .opt_opc = vecop_list,
++          .load_dest = true,
++          .vece = MO_16 },
++        { .fni4 = gen_srsra32_i32,
++          .fniv = gen_srsra_vec,
++          .fno = gen_helper_gvec_srsra_s,
++          .opt_opc = vecop_list,
++          .load_dest = true,
++          .vece = MO_32 },
++        { .fni8 = gen_srsra64_i64,
++          .fniv = gen_srsra_vec,
++          .fno = gen_helper_gvec_srsra_d,
++          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
++          .opt_opc = vecop_list,
++          .load_dest = true,
++          .vece = MO_64 },
++    };
++
++    /* tszimm encoding produces immediates in the range [1..esize] */
++    tcg_debug_assert(shift > 0);
++    tcg_debug_assert(shift <= (8 << vece));
++
++    /*
++     * Shifts larger than the element size are architecturally valid.
++     * Signed results in all sign bits.  With rounding, this produces
++     *   (-1 + 1) >> 1 == 0, or (0 + 1) >> 1 == 0.
++     * I.e. always zero.  With accumulation, this leaves D unchanged.
++     */
++    if (shift == (8 << vece)) {
++        /* Nop, but we do need to clear the tail. */
++        tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz);
++    } else {
++        tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
++    }
++}
++
++static void gen_urshr8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
++{
++    TCGv_i64 t = tcg_temp_new_i64();
++
++    tcg_gen_shri_i64(t, a, sh - 1);
++    tcg_gen_andi_i64(t, t, dup_const(MO_8, 1));
++    tcg_gen_vec_shr8i_i64(d, a, sh);
++    tcg_gen_vec_add8_i64(d, d, t);
++    tcg_temp_free_i64(t);
++}
++
++static void gen_urshr16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
++{
++    TCGv_i64 t = tcg_temp_new_i64();
++
++    tcg_gen_shri_i64(t, a, sh - 1);
++    tcg_gen_andi_i64(t, t, dup_const(MO_16, 1));
++    tcg_gen_vec_shr16i_i64(d, a, sh);
++    tcg_gen_vec_add16_i64(d, d, t);
++    tcg_temp_free_i64(t);
++}
++
++static void gen_urshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
++{
++    TCGv_i32 t = tcg_temp_new_i32();
++
++    tcg_gen_extract_i32(t, a, sh - 1, 1);
++    tcg_gen_shri_i32(d, a, sh);
++    tcg_gen_add_i32(d, d, t);
++    tcg_temp_free_i32(t);
++}
++
++static void gen_urshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
++{
++    TCGv_i64 t = tcg_temp_new_i64();
++
++    tcg_gen_extract_i64(t, a, sh - 1, 1);
++    tcg_gen_shri_i64(d, a, sh);
++    tcg_gen_add_i64(d, d, t);
++    tcg_temp_free_i64(t);
++}
++
++static void gen_urshr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t shift)
++{
++    TCGv_vec t = tcg_temp_new_vec_matching(d);
++    TCGv_vec ones = tcg_temp_new_vec_matching(d);
++
++    tcg_gen_shri_vec(vece, t, a, shift - 1);
++    tcg_gen_dupi_vec(vece, ones, 1);
++    tcg_gen_and_vec(vece, t, t, ones);
++    tcg_gen_shri_vec(vece, d, a, shift);
++    tcg_gen_add_vec(vece, d, d, t);
++
++    tcg_temp_free_vec(t);
++    tcg_temp_free_vec(ones);
++}
++
++void gen_gvec_urshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
++                    int64_t shift, uint32_t opr_sz, uint32_t max_sz)
++{
++    static const TCGOpcode vecop_list[] = {
++        INDEX_op_shri_vec, INDEX_op_add_vec, 0
++    };
++    static const GVecGen2i ops[4] = {
++        { .fni8 = gen_urshr8_i64,
++          .fniv = gen_urshr_vec,
++          .fno = gen_helper_gvec_urshr_b,
++          .opt_opc = vecop_list,
++          .vece = MO_8 },
++        { .fni8 = gen_urshr16_i64,
++          .fniv = gen_urshr_vec,
++          .fno = gen_helper_gvec_urshr_h,
++          .opt_opc = vecop_list,
++          .vece = MO_16 },
++        { .fni4 = gen_urshr32_i32,
++          .fniv = gen_urshr_vec,
++          .fno = gen_helper_gvec_urshr_s,
++          .opt_opc = vecop_list,
++          .vece = MO_32 },
++        { .fni8 = gen_urshr64_i64,
++          .fniv = gen_urshr_vec,
++          .fno = gen_helper_gvec_urshr_d,
++          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
++          .opt_opc = vecop_list,
++          .vece = MO_64 },
++    };
++
++    /* tszimm encoding produces immediates in the range [1..esize] */
++    tcg_debug_assert(shift > 0);
++    tcg_debug_assert(shift <= (8 << vece));
++
++    if (shift == (8 << vece)) {
++        /*
++         * Shifts larger than the element size are architecturally valid.
++         * Unsigned results in zero.  With rounding, this produces a
++         * copy of the most significant bit.
++         */
++        tcg_gen_gvec_shri(vece, rd_ofs, rm_ofs, shift - 1, opr_sz, max_sz);
++    } else {
++        tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
++    }
++}
++
++static void gen_ursra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
++{
++    TCGv_i64 t = tcg_temp_new_i64();
++
++    if (sh == 8) {
++        tcg_gen_vec_shr8i_i64(t, a, 7);
++    } else {
++        gen_urshr8_i64(t, a, sh);
++    }
++    tcg_gen_vec_add8_i64(d, d, t);
++    tcg_temp_free_i64(t);
++}
++
++static void gen_ursra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
++{
++    TCGv_i64 t = tcg_temp_new_i64();
++
++    if (sh == 16) {
++        tcg_gen_vec_shr16i_i64(t, a, 15);
++    } else {
++        gen_urshr16_i64(t, a, sh);
++    }
++    tcg_gen_vec_add16_i64(d, d, t);
++    tcg_temp_free_i64(t);
++}
++
++static void gen_ursra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
++{
++    TCGv_i32 t = tcg_temp_new_i32();
++
++    if (sh == 32) {
++        tcg_gen_shri_i32(t, a, 31);
++    } else {
++        gen_urshr32_i32(t, a, sh);
++    }
++    tcg_gen_add_i32(d, d, t);
++    tcg_temp_free_i32(t);
++}
++
++static void gen_ursra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
++{
++    TCGv_i64 t = tcg_temp_new_i64();
++
++    if (sh == 64) {
++        tcg_gen_shri_i64(t, a, 63);
++    } else {
++        gen_urshr64_i64(t, a, sh);
++    }
++    tcg_gen_add_i64(d, d, t);
++    tcg_temp_free_i64(t);
++}
++
++static void gen_ursra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
++{
++    TCGv_vec t = tcg_temp_new_vec_matching(d);
++
++    if (sh == (8 << vece)) {
++        tcg_gen_shri_vec(vece, t, a, sh - 1);
++    } else {
++        gen_urshr_vec(vece, t, a, sh);
++    }
++    tcg_gen_add_vec(vece, d, d, t);
++    tcg_temp_free_vec(t);
++}
++
++void gen_gvec_ursra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
++                    int64_t shift, uint32_t opr_sz, uint32_t max_sz)
++{
++    static const TCGOpcode vecop_list[] = {
++        INDEX_op_shri_vec, INDEX_op_add_vec, 0
++    };
++    static const GVecGen2i ops[4] = {
++        { .fni8 = gen_ursra8_i64,
++          .fniv = gen_ursra_vec,
++          .fno = gen_helper_gvec_ursra_b,
++          .opt_opc = vecop_list,
++          .load_dest = true,
++          .vece = MO_8 },
++        { .fni8 = gen_ursra16_i64,
++          .fniv = gen_ursra_vec,
++          .fno = gen_helper_gvec_ursra_h,
++          .opt_opc = vecop_list,
++          .load_dest = true,
++          .vece = MO_16 },
++        { .fni4 = gen_ursra32_i32,
++          .fniv = gen_ursra_vec,
++          .fno = gen_helper_gvec_ursra_s,
++          .opt_opc = vecop_list,
++          .load_dest = true,
++          .vece = MO_32 },
++        { .fni8 = gen_ursra64_i64,
++          .fniv = gen_ursra_vec,
++          .fno = gen_helper_gvec_ursra_d,
++          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
++          .opt_opc = vecop_list,
++          .load_dest = true,
++          .vece = MO_64 },
++    };
++
++    /* tszimm encoding produces immediates in the range [1..esize] */
++    tcg_debug_assert(shift > 0);
++    tcg_debug_assert(shift <= (8 << vece));
++
++    tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
++}
++
+ static void gen_shr8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
+ {
+     uint64_t mask = dup_const(MO_8, 0xff >> shift);
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+                     }
+                     return 0;
++                case 2: /* VRSHR */
++                    /* Right shift comes here negative.  */
++                    shift = -shift;
++                    if (u) {
++                        gen_gvec_urshr(size, rd_ofs, rm_ofs, shift,
++                                       vec_size, vec_size);
++                    } else {
++                        gen_gvec_srshr(size, rd_ofs, rm_ofs, shift,
++                                       vec_size, vec_size);
++                    }
++                    return 0;
++
++                case 3: /* VRSRA */
++                    /* Right shift comes here negative.  */
++                    shift = -shift;
++                    if (u) {
++                        gen_gvec_ursra(size, rd_ofs, rm_ofs, shift,
++                                       vec_size, vec_size);
++                    } else {
++                        gen_gvec_srsra(size, rd_ofs, rm_ofs, shift,
++                                       vec_size, vec_size);
++                    }
++                    return 0;
++
+                 case 4: /* VSRI */
+                     if (!u) {
+                         return 1;
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+                         neon_load_reg64(cpu_V0, rm + pass);
+                         tcg_gen_movi_i64(cpu_V1, imm);
+                         switch (op) {
+-                        case 2: /* VRSHR */
+-                        case 3: /* VRSRA */
+-                            if (u)
+-                                gen_helper_neon_rshl_u64(cpu_V0, cpu_V0, cpu_V1);
+-                            else
+-                                gen_helper_neon_rshl_s64(cpu_V0, cpu_V0, cpu_V1);
+-                            break;
+                         case 6: /* VQSHLU */
+                             gen_helper_neon_qshlu_s64(cpu_V0, cpu_env,
+                                                       cpu_V0, cpu_V1);
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+                         default:
+                             g_assert_not_reached();
+                         }
+-                        if (op == 3) {
+-                            /* Accumulate.  */
+-                            neon_load_reg64(cpu_V1, rd + pass);
+-                            tcg_gen_add_i64(cpu_V0, cpu_V0, cpu_V1);
+-                        }
+                         neon_store_reg64(cpu_V0, rd + pass);
+                     } else { /* size < 3 */
+                         /* Operands in T0 and T1.  */
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+                         tmp2 = tcg_temp_new_i32();
+                         tcg_gen_movi_i32(tmp2, imm);
+                         switch (op) {
+-                        case 2: /* VRSHR */
+-                        case 3: /* VRSRA */
+-                            GEN_NEON_INTEGER_OP(rshl);
+-                            break;
+                         case 6: /* VQSHLU */
+                             switch (size) {
+                             case 0:
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+                             g_assert_not_reached();
+                         }
+                         tcg_temp_free_i32(tmp2);
+-
+-                        if (op == 3) {
+-                            /* Accumulate.  */
+-                            tmp2 = neon_load_reg(rd, pass);
+-                            gen_neon_add(size, tmp, tmp2);
+-                            tcg_temp_free_i32(tmp2);
+-                        }
+                         neon_store_reg(rd, pass, tmp);
+                     }
+                 } /* for pass */
 diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/vec_helper.c
 +++ b/target/arm/vec_helper.c
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ DO_SRA(gvec_usra_d, uint64_t)
- #include "exec/exec-all.h"
- #include "exec/helper-proto.h"
+ #undef DO_SRA
- #include "tcg/tcg-gvec-desc.h"
-+#include "fpu/softfloat.h"
++#define DO_RSHR(NAME, TYPE)                             \
++void HELPER(NAME)(void *vd, void *vn, uint32_t desc)    \
++{                                                       \
-+/* Note that vector data is stored in host-endian 64-bit chunks,
++    intptr_t i, oprsz = simd_oprsz(desc);               \
-+   so addressing units smaller than that needs a host-endian fixup.  */
++    int shift = simd_data(desc);                        \
-+#ifdef HOST_WORDS_BIGENDIAN
++    TYPE *d = vd, *n = vn;                              \
-+#define H1(x)  ((x) ^ 7)
++    for (i = 0; i < oprsz / sizeof(TYPE); i++) {        \
-+#define H2(x)  ((x) ^ 3)
++        TYPE tmp = n[i] >> (shift - 1);                 \
-+#define H4(x)  ((x) ^ 1)
++        d[i] = (tmp >> 1) + (tmp & 1);                  \
-+#else
++    }                                                   \
-+#define H1(x)  (x)
++    clear_tail(d, oprsz, simd_maxsz(desc));             \
-+#define H2(x)  (x)
++}
-+#define H4(x)  (x)
++
-+#endif
++DO_RSHR(gvec_srshr_b, int8_t)
-+
++DO_RSHR(gvec_srshr_h, int16_t)
- #define SET_QC() env->vfp.xregs[ARM_VFP_FPSCR] |= CPSR_Q
++DO_RSHR(gvec_srshr_s, int32_t)
++DO_RSHR(gvec_srshr_d, int64_t)
- static void clear_tail(void *vd, uintptr_t opr_sz, uintptr_t max_sz)
++
-@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_qrdmlsh_s32)(void *vd, void *vn, void *vm,
++DO_RSHR(gvec_urshr_b, uint8_t)
-     }
++DO_RSHR(gvec_urshr_h, uint16_t)
-     clear_tail(d, opr_sz, simd_maxsz(desc));
++DO_RSHR(gvec_urshr_s, uint32_t)
- }
++DO_RSHR(gvec_urshr_d, uint64_t)
 +
-+void HELPER(gvec_fcaddh)(void *vd, void *vn, void *vm,
++#undef DO_RSHR
-+                         void *vfpst, uint32_t desc)
++
-+{
++#define DO_RSRA(NAME, TYPE)                             \
-+    uintptr_t opr_sz = simd_oprsz(desc);
++void HELPER(NAME)(void *vd, void *vn, uint32_t desc)    \
-+    float16 *d = vd;
++{                                                       \
-+    float16 *n = vn;
++    intptr_t i, oprsz = simd_oprsz(desc);               \
-+    float16 *m = vm;
++    int shift = simd_data(desc);                        \
-+    float_status *fpst = vfpst;
++    TYPE *d = vd, *n = vn;                              \
-+    uint32_t neg_real = extract32(desc, SIMD_DATA_SHIFT, 1);
++    for (i = 0; i < oprsz / sizeof(TYPE); i++) {        \
-+    uint32_t neg_imag = neg_real ^ 1;
++        TYPE tmp = n[i] >> (shift - 1);                 \
-+    uintptr_t i;
++        d[i] += (tmp >> 1) + (tmp & 1);                 \
-+
++    }                                                   \
-+    /* Shift boolean to the sign bit so we can xor to negate.  */
++    clear_tail(d, oprsz, simd_maxsz(desc));             \
-+    neg_real <<= 15;
++}
-+    neg_imag <<= 15;
++
-+
++DO_RSRA(gvec_srsra_b, int8_t)
-+    for (i = 0; i < opr_sz / 2; i += 2) {
++DO_RSRA(gvec_srsra_h, int16_t)
-+        float16 e0 = n[H2(i)];
++DO_RSRA(gvec_srsra_s, int32_t)
-+        float16 e1 = m[H2(i + 1)] ^ neg_imag;
++DO_RSRA(gvec_srsra_d, int64_t)
-+        float16 e2 = n[H2(i + 1)];
++
-+        float16 e3 = m[H2(i)] ^ neg_real;
++DO_RSRA(gvec_ursra_b, uint8_t)
-+
++DO_RSRA(gvec_ursra_h, uint16_t)
-+        d[H2(i)] = float16_add(e0, e1, fpst);
++DO_RSRA(gvec_ursra_s, uint32_t)
-+        d[H2(i + 1)] = float16_add(e2, e3, fpst);
++DO_RSRA(gvec_ursra_d, uint64_t)
-+    }
++
-+    clear_tail(d, opr_sz, simd_maxsz(desc));
++#undef DO_RSRA
-+}
++
-+
+ /*
-+void HELPER(gvec_fcadds)(void *vd, void *vn, void *vm,
+  * Convert float16 to float32, raising no exceptions and
-+                         void *vfpst, uint32_t desc)
+  * preserving exceptional values, including SNaN.
 +{
 +    uintptr_t opr_sz = simd_oprsz(desc);
 +    float32 *d = vd;
 +    float32 *n = vn;
 +    float32 *m = vm;
 +    float_status *fpst = vfpst;
 +    uint32_t neg_real = extract32(desc, SIMD_DATA_SHIFT, 1);
 +    uint32_t neg_imag = neg_real ^ 1;
 +    uintptr_t i;
 +
 +    /* Shift boolean to the sign bit so we can xor to negate.  */
 +    neg_real <<= 31;
 +    neg_imag <<= 31;
 +
 +    for (i = 0; i < opr_sz / 4; i += 2) {
 +        float32 e0 = n[H4(i)];
 +        float32 e1 = m[H4(i + 1)] ^ neg_imag;
 +        float32 e2 = n[H4(i + 1)];
 +        float32 e3 = m[H4(i)] ^ neg_real;
 +
 +        d[H4(i)] = float32_add(e0, e1, fpst);
 +        d[H4(i + 1)] = float32_add(e2, e3, fpst);
 +    }
 +    clear_tail(d, opr_sz, simd_maxsz(desc));
 +}
 +
 +void HELPER(gvec_fcaddd)(void *vd, void *vn, void *vm,
 +                         void *vfpst, uint32_t desc)
 +{
 +    uintptr_t opr_sz = simd_oprsz(desc);
 +    float64 *d = vd;
 +    float64 *n = vn;
 +    float64 *m = vm;
 +    float_status *fpst = vfpst;
 +    uint64_t neg_real = extract64(desc, SIMD_DATA_SHIFT, 1);
 +    uint64_t neg_imag = neg_real ^ 1;
 +    uintptr_t i;
 +
 +    /* Shift boolean to the sign bit so we can xor to negate.  */
 +    neg_real <<= 63;
 +    neg_imag <<= 63;
 +
 +    for (i = 0; i < opr_sz / 8; i += 2) {
 +        float64 e0 = n[i];
 +        float64 e1 = m[i + 1] ^ neg_imag;
 +        float64 e2 = n[i + 1];
 +        float64 e3 = m[i] ^ neg_real;
 +
 +        d[i] = float64_add(e0, e1, fpst);
 +        d[i + 1] = float64_add(e2, e3, fpst);
 +    }
 +    clear_tail(d, opr_sz, simd_maxsz(desc));
 +}
 --
-.16.2
+.20.1
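The DO_RSHR and DO_RSRA helpers in the patch above round by first shifting by one bit less than requested and then folding the last shifted-out bit back in, which avoids overflow in the intermediate sum. A minimal sketch for a single 8-bit lane follows. The `*_sketch` names are hypothetical and not part of QEMU, and the signed variant assumes the compiler implements `>>` on negative values as an arithmetic shift, as GCC and Clang do.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned rounding shift right of one 8-bit lane; shift is in [1..8]. */
static uint8_t urshr8_sketch(uint8_t n, int shift)
{
    /* Stop one bit short so bit 0 of tmp is the rounding bit. */
    uint8_t tmp = n >> (shift - 1);
    return (tmp >> 1) + (tmp & 1);
}

/* Signed variant; relies on arithmetic right shift of negative values. */
static int8_t srshr8_sketch(int8_t n, int shift)
{
    int8_t tmp = n >> (shift - 1);
    return (tmp >> 1) + (tmp & 1);
}

/* Rounding shift right and accumulate; the sum wraps modulo 256. */
static uint8_t ursra8_sketch(uint8_t d, uint8_t n, int shift)
{
    return d + urshr8_sketch(n, shift);
}
```

A shift equal to the element size is well defined here: for the unsigned case only the top bit of the input survives, and it becomes the rounded result.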

-[Qemu-devel] [PULL 28/39] target/arm: Decode aa64 armv8.1 three same extra
+[PULL 04/45] target/arm: Create gen_gvec_{sri,sli}
 From: Richard Henderson <richard.henderson@linaro.org>
+The functions eliminate duplication of the special cases for
+this operation.  They match up with the GVecGen2iFn typedef.
+Add out-of-line helpers.  Until now we got away with only having
+inline expanders because the Neon vector size is at most 16 bytes,
+and we know that the inline expansion will always succeed.
+When we reuse this for SVE, tcg-gvec-op may decide to use an
+out-of-line helper due to longer vector lengths.
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20180228193125.20577-6-richard.henderson@linaro.org
+Message-id: 20200513163245.17915-4-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
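As an illustration of the per-element behaviour the new out-of-line helpers provide, here is a minimal sketch of shift-right-insert and shift-left-insert on one 8-bit lane. The function names are hypothetical, and `deposit64_sketch` is a local stand-in modeled on QEMU's `deposit64()`, not the QEMU implementation itself.

```c
#include <assert.h>
#include <stdint.h>

/* Replace `length` bits of `value`, starting at bit `start`, with `field`. */
static uint64_t deposit64_sketch(uint64_t value, int start, int length,
                                 uint64_t field)
{
    uint64_t mask;

    assert(start >= 0 && length > 0 && length <= 64 - start);
    mask = (~0ULL >> (64 - length)) << start;
    return (value & ~mask) | ((field << start) & mask);
}

/* SRI on one 8-bit lane; shift is in [1..8]. */
static uint8_t sri8_sketch(uint8_t d, uint8_t n, int shift)
{
    if (shift == 8) {
        /* A shift of the full element size leaves the destination as-is. */
        return d;
    }
    /* Keep the top `shift` bits of d, insert n >> shift below them. */
    return (uint8_t)deposit64_sketch(d, 0, 8 - shift, (uint64_t)n >> shift);
}

/* SLI on one 8-bit lane; shift is in [0..7]. */
static uint8_t sli8_sketch(uint8_t d, uint8_t n, int shift)
{
    /* Keep the low `shift` bits of d, insert n above them. */
    return (uint8_t)deposit64_sketch(d, shift, 8 - shift, n);
}
```

The sketch handles a full-width SRI shift explicitly because a zero-length deposit is not meaningful; in the patch, gen_gvec_sri covers that case itself by emitting a move that only clears the tail, so the helper is not reached with it.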
- target/arm/helper.h        |  9 +++++
+ target/arm/helper.h        |  10 ++
- target/arm/translate-a64.c | 83 ++++++++++++++++++++++++++++++++++++++++++++++
+ target/arm/translate.h     |   7 +-
- target/arm/vec_helper.c    | 74 +++++++++++++++++++++++++++++++++++++++++
+ target/arm/translate-a64.c |  20 +---
-files changed, 166 insertions(+)
+ target/arm/translate.c     | 186 +++++++++++++++++++++----------------
  target/arm/vec_helper.c    |  38 ++++++++
 files changed, 160 insertions(+), 101 deletions(-)
 diff --git a/target/arm/helper.h b/target/arm/helper.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.h
 +++ b/target/arm/helper.h
-@@ -XXX,XX +XXX,XX @@ DEF_HELPER_2(dc_zva, void, env, i64)
+@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(gvec_ursra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
- DEF_HELPER_FLAGS_2(neon_pmull_64_lo, TCG_CALL_NO_RWG_SE, i64, i64, i64)
+ DEF_HELPER_FLAGS_3(gvec_ursra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
- DEF_HELPER_FLAGS_2(neon_pmull_64_hi, TCG_CALL_NO_RWG_SE, i64, i64, i64)
+ DEF_HELPER_FLAGS_3(gvec_ursra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
-+DEF_HELPER_FLAGS_5(gvec_qrdmlah_s16, TCG_CALL_NO_RWG,
++DEF_HELPER_FLAGS_3(gvec_sri_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
-+                   void, ptr, ptr, ptr, ptr, i32)
++DEF_HELPER_FLAGS_3(gvec_sri_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
-+DEF_HELPER_FLAGS_5(gvec_qrdmlsh_s16, TCG_CALL_NO_RWG,
++DEF_HELPER_FLAGS_3(gvec_sri_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
-+                   void, ptr, ptr, ptr, ptr, i32)
++DEF_HELPER_FLAGS_3(gvec_sri_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
-+DEF_HELPER_FLAGS_5(gvec_qrdmlah_s32, TCG_CALL_NO_RWG,
++
-+                   void, ptr, ptr, ptr, ptr, i32)
++DEF_HELPER_FLAGS_3(gvec_sli_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
-+DEF_HELPER_FLAGS_5(gvec_qrdmlsh_s32, TCG_CALL_NO_RWG,
++DEF_HELPER_FLAGS_3(gvec_sli_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
-+                   void, ptr, ptr, ptr, ptr, i32)
++DEF_HELPER_FLAGS_3(gvec_sli_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_3(gvec_sli_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +
  #ifdef TARGET_AARCH64
  #include "helper-a64.h"
- #endif
+ #include "helper-sve.h"
 diff --git a/target/arm/translate.h b/target/arm/translate.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.h
 +++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ extern const GVecGen3 mls_op[4];
  extern const GVecGen3 cmtst_op[4];
  extern const GVecGen3 sshl_op[4];
  extern const GVecGen3 ushl_op[4];
 -extern const GVecGen2i sri_op[4];
 -extern const GVecGen2i sli_op[4];
  extern const GVecGen4 uqadd_op[4];
  extern const GVecGen4 sqadd_op[4];
  extern const GVecGen4 uqsub_op[4];
@@ -XXX,XX +XXX,XX @@ void gen_gvec_srsra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
  void gen_gvec_ursra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
                      int64_t shift, uint32_t opr_sz, uint32_t max_sz);
 +void gen_gvec_sri(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 +                  int64_t shift, uint32_t opr_sz, uint32_t max_sz);
 +void gen_gvec_sli(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 +                  int64_t shift, uint32_t opr_sz, uint32_t max_sz);
 +
  /*
   * Forward to the isar_feature_* tests given a DisasContext pointer.
   */
 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-a64.c
 +++ b/target/arm/translate-a64.c
-@@ -XXX,XX +XXX,XX @@ static void gen_gvec_op3(DisasContext *s, bool is_q, int rd,
+@@ -XXX,XX +XXX,XX @@ static void gen_gvec_op2(DisasContext *s, bool is_q, int rd,
-                    vec_full_reg_size(s), gvec_op);
+                    is_q ? 16 : 8, vec_full_reg_size(s), gvec_op);
  }
-+/* Expand a 3-operand + env pointer operation using
+-/* Expand a 2-operand + immediate AdvSIMD vector operation using
-+ * an out-of-line helper.
+- * an op descriptor.
-+ */
+- */
-+static void gen_gvec_op3_env(DisasContext *s, bool is_q, int rd,
+-static void gen_gvec_op2i(DisasContext *s, bool is_q, int rd,
-+                             int rn, int rm, gen_helper_gvec_3_ptr *fn)
+-                          int rn, int64_t imm, const GVecGen2i *gvec_op)
-+{
+-{
-+    tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, rd),
+-    tcg_gen_gvec_2i(vec_full_reg_offset(s, rd), vec_full_reg_offset(s, rn),
-+                       vec_full_reg_offset(s, rn),
+-                    is_q ? 16 : 8, vec_full_reg_size(s), imm, gvec_op);
-+                       vec_full_reg_offset(s, rm), cpu_env,
+-}
-+                       is_q ? 16 : 8, vec_full_reg_size(s), 0, fn);
+-
-+}
+ /* Expand a 3-operand AdvSIMD vector operation using an op descriptor.  */
-+
+ static void gen_gvec_op3(DisasContext *s, bool is_q, int rd,
- /* Set ZF and NF based on a 64 bit result. This is alas fiddlier
+                          int rn, int rm, const GVecGen3 *gvec_op)
-  * than the 32 bit equivalent.
+@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
-  */
+         gen_gvec_fn2i(s, is_q, rd, rn, shift,
-@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
+                       is_u ? gen_gvec_usra : gen_gvec_ssra, size);
          return;
 +
      case 0x08: /* SRI */
 -        /* Shift count same as element size is valid but does nothing.  */
 -        if (shift == 8 << size) {
 -            goto done;
 -        }
 -        gen_gvec_op2i(s, is_q, rd, rn, shift, &sri_op[size]);
 +        gen_gvec_fn2i(s, is_q, rd, rn, shift, gen_gvec_sri, size);
          return;
      case 0x00: /* SSHR / USHR */
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
      }
      tcg_temp_free_i64(tcg_round);
 - done:
      clear_vec_high(s, is_q, rd);
  }
-+/* AdvSIMD three same extra
+@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shli(DisasContext *s, bool is_q, bool insert,
-+ *  31   30  29 28       24 23  22  21 20  16  15 14    11  10 9  5 4  0
+     }
-+ * +---+---+---+-----------+------+---+------+---+--------+---+----+----+
-+ * | 0 | Q | U | 0 1 1 1 0 | size | 0 |  Rm  | 1 | opcode | 1 | Rn | Rd |
+     if (insert) {
-+ * +---+---+---+-----------+------+---+------+---+--------+---+----+----+
+-        gen_gvec_op2i(s, is_q, rd, rn, shift, &sli_op[size]);
-+ */
++        gen_gvec_fn2i(s, is_q, rd, rn, shift, gen_gvec_sli, size);
-+static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn)
+     } else {
          gen_gvec_fn2i(s, is_q, rd, rn, shift, tcg_gen_gvec_shli, size);
      }
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_shr64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
  static void gen_shr_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
  {
 -    if (sh == 0) {
 -        tcg_gen_mov_vec(d, a);
 -    } else {
 -        TCGv_vec t = tcg_temp_new_vec_matching(d);
 -        TCGv_vec m = tcg_temp_new_vec_matching(d);
 +    TCGv_vec t = tcg_temp_new_vec_matching(d);
 +    TCGv_vec m = tcg_temp_new_vec_matching(d);
 -        tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK((8 << vece) - sh, sh));
 -        tcg_gen_shri_vec(vece, t, a, sh);
 -        tcg_gen_and_vec(vece, d, d, m);
 -        tcg_gen_or_vec(vece, d, d, t);
 +    tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK((8 << vece) - sh, sh));
 +    tcg_gen_shri_vec(vece, t, a, sh);
 +    tcg_gen_and_vec(vece, d, d, m);
 +    tcg_gen_or_vec(vece, d, d, t);
 -        tcg_temp_free_vec(t);
 -        tcg_temp_free_vec(m);
 -    }
 +    tcg_temp_free_vec(t);
 +    tcg_temp_free_vec(m);
  }
 -static const TCGOpcode vecop_list_sri[] = { INDEX_op_shri_vec, 0 };
 +void gen_gvec_sri(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 +                  int64_t shift, uint32_t opr_sz, uint32_t max_sz)
 +{
-+    int rd = extract32(insn, 0, 5);
++    static const TCGOpcode vecop_list[] = { INDEX_op_shri_vec, 0 };
-+    int rn = extract32(insn, 5, 5);
++    const GVecGen2i ops[4] = {
-+    int opcode = extract32(insn, 11, 4);
++        { .fni8 = gen_shr8_ins_i64,
-+    int rm = extract32(insn, 16, 5);
++          .fniv = gen_shr_ins_vec,
-+    int size = extract32(insn, 22, 2);
++          .fno = gen_helper_gvec_sri_b,
-+    bool u = extract32(insn, 29, 1);
++          .load_dest = true,
-+    bool is_q = extract32(insn, 30, 1);
++          .opt_opc = vecop_list,
-+    int feature;
++          .vece = MO_8 },
-+
++        { .fni8 = gen_shr16_ins_i64,
-+    switch (u * 16 + opcode) {
++          .fniv = gen_shr_ins_vec,
-+    case 0x10: /* SQRDMLAH (vector) */
++          .fno = gen_helper_gvec_sri_h,
-+    case 0x11: /* SQRDMLSH (vector) */
++          .load_dest = true,
-+        if (size != 1 && size != 2) {
++          .opt_opc = vecop_list,
-+            unallocated_encoding(s);
++          .vece = MO_16 },
-+            return;
++        { .fni4 = gen_shr32_ins_i32,
-+        }
++          .fniv = gen_shr_ins_vec,
-+        feature = ARM_FEATURE_V8_RDM;
++          .fno = gen_helper_gvec_sri_s,
-+        break;
++          .load_dest = true,
-+    default:
++          .opt_opc = vecop_list,
-+        unallocated_encoding(s);
++          .vece = MO_32 },
-+        return;
++        { .fni8 = gen_shr64_ins_i64,
-+    }
++          .fniv = gen_shr_ins_vec,
-+    if (!arm_dc_feature(s, feature)) {
++          .fno = gen_helper_gvec_sri_d,
-+        unallocated_encoding(s);
++          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-+        return;
++          .load_dest = true,
-+    }
++          .opt_opc = vecop_list,
-+    if (!fp_access_check(s)) {
++          .vece = MO_64 },
-+        return;
++    };
-+    }
-+
+-const GVecGen2i sri_op[4] = {
-+    switch (opcode) {
+-    { .fni8 = gen_shr8_ins_i64,
-+    case 0x0: /* SQRDMLAH (vector) */
+-      .fniv = gen_shr_ins_vec,
-+        switch (size) {
+-      .load_dest = true,
-+        case 1:
+-      .opt_opc = vecop_list_sri,
-+            gen_gvec_op3_env(s, is_q, rd, rn, rm, gen_helper_gvec_qrdmlah_s16);
+-      .vece = MO_8 },
-+            break;
+-    { .fni8 = gen_shr16_ins_i64,
-+        case 2:
+-      .fniv = gen_shr_ins_vec,
-+            gen_gvec_op3_env(s, is_q, rd, rn, rm, gen_helper_gvec_qrdmlah_s32);
+-      .load_dest = true,
-+            break;
+-      .opt_opc = vecop_list_sri,
-+        default:
+-      .vece = MO_16 },
-+            g_assert_not_reached();
+-    { .fni4 = gen_shr32_ins_i32,
-+        }
+-      .fniv = gen_shr_ins_vec,
-+        return;
+-      .load_dest = true,
-+
+-      .opt_opc = vecop_list_sri,
-+    case 0x1: /* SQRDMLSH (vector) */
+-      .vece = MO_32 },
-+        switch (size) {
+-    { .fni8 = gen_shr64_ins_i64,
-+        case 1:
+-      .fniv = gen_shr_ins_vec,
-+            gen_gvec_op3_env(s, is_q, rd, rn, rm, gen_helper_gvec_qrdmlsh_s16);
+-      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-+            break;
+-      .load_dest = true,
-+        case 2:
+-      .opt_opc = vecop_list_sri,
-+            gen_gvec_op3_env(s, is_q, rd, rn, rm, gen_helper_gvec_qrdmlsh_s32);
+-      .vece = MO_64 },
-+            break;
+-};
-+        default:
++    /* tszimm encoding produces immediates in the range [1..esize]. */
-+            g_assert_not_reached();
++    tcg_debug_assert(shift > 0);
-+        }
++    tcg_debug_assert(shift <= (8 << vece));
-+        return;
++
-+
++    /* Shift of esize leaves destination unchanged. */
-+    default:
++    if (shift < (8 << vece)) {
-+        g_assert_not_reached();
++        tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
 +    } else {
 +        /* Nop, but we do need to clear the tail. */
 +        tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz);
 +    }
 +}
-+
- static void handle_2misc_widening(DisasContext *s, int opcode, bool is_q,
+ static void gen_shl8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
                                    int size, int rn, int rd)
  {
-@@ -XXX,XX +XXX,XX @@ static void disas_crypto_three_reg_imm2(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static void gen_shl64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
- static const AArch64DecodeTable data_proc_simd[] = {
-     /* pattern  ,  mask     ,  fn                        */
+ static void gen_shl_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
-     { 0x0e200400, 0x9f200400, disas_simd_three_reg_same },
+ {
-+    { 0x0e008400, 0x9f208400, disas_simd_three_reg_same_extra },
+-    if (sh == 0) {
-     { 0x0e200000, 0x9f200c00, disas_simd_three_reg_diff },
+-        tcg_gen_mov_vec(d, a);
-     { 0x0e200800, 0x9f3e0c00, disas_simd_two_reg_misc },
+-    } else {
-     { 0x0e300800, 0x9f3e0c00, disas_simd_across_lanes },
+-        TCGv_vec t = tcg_temp_new_vec_matching(d);
 -        TCGv_vec m = tcg_temp_new_vec_matching(d);
 +    TCGv_vec t = tcg_temp_new_vec_matching(d);
 +    TCGv_vec m = tcg_temp_new_vec_matching(d);
 -        tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK(0, sh));
 -        tcg_gen_shli_vec(vece, t, a, sh);
 -        tcg_gen_and_vec(vece, d, d, m);
 -        tcg_gen_or_vec(vece, d, d, t);
 +    tcg_gen_shli_vec(vece, t, a, sh);
 +    tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK(0, sh));
 +    tcg_gen_and_vec(vece, d, d, m);
 +    tcg_gen_or_vec(vece, d, d, t);
 -        tcg_temp_free_vec(t);
 -        tcg_temp_free_vec(m);
 -    }
 +    tcg_temp_free_vec(t);
 +    tcg_temp_free_vec(m);
  }
 -static const TCGOpcode vecop_list_sli[] = { INDEX_op_shli_vec, 0 };
 +void gen_gvec_sli(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 +                  int64_t shift, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = { INDEX_op_shli_vec, 0 };
 +    const GVecGen2i ops[4] = {
 +        { .fni8 = gen_shl8_ins_i64,
 +          .fniv = gen_shl_ins_vec,
 +          .fno = gen_helper_gvec_sli_b,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_8 },
 +        { .fni8 = gen_shl16_ins_i64,
 +          .fniv = gen_shl_ins_vec,
 +          .fno = gen_helper_gvec_sli_h,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_16 },
 +        { .fni4 = gen_shl32_ins_i32,
 +          .fniv = gen_shl_ins_vec,
 +          .fno = gen_helper_gvec_sli_s,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_32 },
 +        { .fni8 = gen_shl64_ins_i64,
 +          .fniv = gen_shl_ins_vec,
 +          .fno = gen_helper_gvec_sli_d,
 +          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_64 },
 +    };
 -const GVecGen2i sli_op[4] = {
 -    { .fni8 = gen_shl8_ins_i64,
 -      .fniv = gen_shl_ins_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_sli,
 -      .vece = MO_8 },
 -    { .fni8 = gen_shl16_ins_i64,
 -      .fniv = gen_shl_ins_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_sli,
 -      .vece = MO_16 },
 -    { .fni4 = gen_shl32_ins_i32,
 -      .fniv = gen_shl_ins_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_sli,
 -      .vece = MO_32 },
 -    { .fni8 = gen_shl64_ins_i64,
 -      .fniv = gen_shl_ins_vec,
 -      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_sli,
 -      .vece = MO_64 },
 -};
 +    /* tszimm encoding produces immediates in the range [0..esize-1]. */
 +    tcg_debug_assert(shift >= 0);
 +    tcg_debug_assert(shift < (8 << vece));
 +
 +    if (shift == 0) {
 +        tcg_gen_gvec_mov(vece, rd_ofs, rm_ofs, opr_sz, max_sz);
 +    } else {
 +        tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
 +    }
 +}
  static void gen_mla8_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
  {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                      }
                      /* Right shift comes here negative.  */
                      shift = -shift;
 -                    /* Shift out of range leaves destination unchanged.  */
 -                    if (shift < 8 << size) {
 -                        tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, vec_size,
 -                                        shift, &sri_op[size]);
 -                    }
 +                    gen_gvec_sri(size, rd_ofs, rm_ofs, shift,
 +                                 vec_size, vec_size);
                      return 0;
                  case 5: /* VSHL, VSLI */
                      if (u) { /* VSLI */
 -                        /* Shift out of range leaves destination unchanged.  */
 -                        if (shift < 8 << size) {
 -                            tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size,
 -                                            vec_size, shift, &sli_op[size]);
 -                        }
 +                        gen_gvec_sli(size, rd_ofs, rm_ofs, shift,
 +                                     vec_size, vec_size);
                      } else { /* VSHL */
                          /* Shifts larger than the element size are
                           * architecturally valid and results in zero.
 diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/vec_helper.c
 +++ b/target/arm/vec_helper.c
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ DO_RSRA(gvec_ursra_d, uint64_t)
- #define SET_QC() env->vfp.xregs[ARM_VFP_FPSCR] |= CPSR_Q
+ #undef DO_RSRA
-+static void clear_tail(void *vd, uintptr_t opr_sz, uintptr_t max_sz)
++#define DO_SRI(NAME, TYPE)                              \
-+{
++void HELPER(NAME)(void *vd, void *vn, uint32_t desc)    \
-+    uint64_t *d = vd + opr_sz;
++{                                                       \
-+    uintptr_t i;
++    intptr_t i, oprsz = simd_oprsz(desc);               \
-+
++    int shift = simd_data(desc);                        \
-+    for (i = opr_sz; i < max_sz; i += 8) {
++    TYPE *d = vd, *n = vn;                              \
-+        *d++ = 0;
++    for (i = 0; i < oprsz / sizeof(TYPE); i++) {        \
-+    }
++        d[i] = deposit64(d[i], 0, sizeof(TYPE) * 8 - shift, n[i] >> shift); \
 +    }                                                   \
 +    clear_tail(d, oprsz, simd_maxsz(desc));             \
 +}
 +
- /* Signed saturating rounding doubling multiply-accumulate high half, 16-bit */
++DO_SRI(gvec_sri_b, uint8_t)
- static uint16_t inl_qrdmlah_s16(CPUARMState *env, int16_t src1,
++DO_SRI(gvec_sri_h, uint16_t)
-                                 int16_t src2, int16_t src3)
++DO_SRI(gvec_sri_s, uint32_t)
-@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(neon_qrdmlah_s16)(CPUARMState *env, uint32_t src1,
++DO_SRI(gvec_sri_d, uint64_t)
-     return deposit32(e1, 16, 16, e2);
++
- }
++#undef DO_SRI
++
-+void HELPER(gvec_qrdmlah_s16)(void *vd, void *vn, void *vm,
++#define DO_SLI(NAME, TYPE)                              \
-+                              void *ve, uint32_t desc)
++void HELPER(NAME)(void *vd, void *vn, uint32_t desc)    \
-+{
++{                                                       \
-+    uintptr_t opr_sz = simd_oprsz(desc);
++    intptr_t i, oprsz = simd_oprsz(desc);               \
-+    int16_t *d = vd;
++    int shift = simd_data(desc);                        \
-+    int16_t *n = vn;
++    TYPE *d = vd, *n = vn;                              \
-+    int16_t *m = vm;
++    for (i = 0; i < oprsz / sizeof(TYPE); i++) {        \
-+    CPUARMState *env = ve;
++        d[i] = deposit64(d[i], shift, sizeof(TYPE) * 8 - shift, n[i]); \
-+    uintptr_t i;
++    }                                                   \
-+
++    clear_tail(d, oprsz, simd_maxsz(desc));             \
 +    for (i = 0; i < opr_sz / 2; ++i) {
 +        d[i] = inl_qrdmlah_s16(env, n[i], m[i], d[i]);
 +    }
 +    clear_tail(d, opr_sz, simd_maxsz(desc));
 +}
 +
- /* Signed saturating rounding doubling multiply-subtract high half, 16-bit */
++DO_SLI(gvec_sli_b, uint8_t)
- static uint16_t inl_qrdmlsh_s16(CPUARMState *env, int16_t src1,
++DO_SLI(gvec_sli_h, uint16_t)
-                                 int16_t src2, int16_t src3)
++DO_SLI(gvec_sli_s, uint32_t)
-@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(neon_qrdmlsh_s16)(CPUARMState *env, uint32_t src1,
++DO_SLI(gvec_sli_d, uint64_t)
-     return deposit32(e1, 16, 16, e2);
++
- }
++#undef DO_SLI
++
-+void HELPER(gvec_qrdmlsh_s16)(void *vd, void *vn, void *vm,
+ /*
-+                              void *ve, uint32_t desc)
+  * Convert float16 to float32, raising no exceptions and
-+{
+  * preserving exceptional values, including SNaN.
 +    uintptr_t opr_sz = simd_oprsz(desc);
 +    int16_t *d = vd;
 +    int16_t *n = vn;
 +    int16_t *m = vm;
 +    CPUARMState *env = ve;
 +    uintptr_t i;
 +
 +    for (i = 0; i < opr_sz / 2; ++i) {
 +        d[i] = inl_qrdmlsh_s16(env, n[i], m[i], d[i]);
 +    }
 +    clear_tail(d, opr_sz, simd_maxsz(desc));
 +}
 +
  /* Signed saturating rounding doubling multiply-accumulate high half, 32-bit */
  uint32_t HELPER(neon_qrdmlah_s32)(CPUARMState *env, int32_t src1,
                                    int32_t src2, int32_t src3)
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(neon_qrdmlah_s32)(CPUARMState *env, int32_t src1,
      return ret;
  }
 +void HELPER(gvec_qrdmlah_s32)(void *vd, void *vn, void *vm,
 +                              void *ve, uint32_t desc)
 +{
 +    uintptr_t opr_sz = simd_oprsz(desc);
 +    int32_t *d = vd;
 +    int32_t *n = vn;
 +    int32_t *m = vm;
 +    CPUARMState *env = ve;
 +    uintptr_t i;
 +
 +    for (i = 0; i < opr_sz / 4; ++i) {
 +        d[i] = helper_neon_qrdmlah_s32(env, n[i], m[i], d[i]);
 +    }
 +    clear_tail(d, opr_sz, simd_maxsz(desc));
 +}
 +
  /* Signed saturating rounding doubling multiply-subtract high half, 32-bit */
  uint32_t HELPER(neon_qrdmlsh_s32)(CPUARMState *env, int32_t src1,
                                    int32_t src2, int32_t src3)
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(neon_qrdmlsh_s32)(CPUARMState *env, int32_t src1,
      }
      return ret;
  }
 +
 +void HELPER(gvec_qrdmlsh_s32)(void *vd, void *vn, void *vm,
 +                              void *ve, uint32_t desc)
 +{
 +    uintptr_t opr_sz = simd_oprsz(desc);
 +    int32_t *d = vd;
 +    int32_t *n = vn;
 +    int32_t *m = vm;
 +    CPUARMState *env = ve;
 +    uintptr_t i;
 +
 +    for (i = 0; i < opr_sz / 4; ++i) {
 +        d[i] = helper_neon_qrdmlsh_s32(env, n[i], m[i], d[i]);
 +    }
 +    clear_tail(d, opr_sz, simd_maxsz(desc));
 +}
 --
-.16.2
+.20.1

-[Qemu-devel] [PULL 38/39] target/arm: Decode t32 simd 3reg and 2reg_scalar extension
+[PULL 05/45] target/arm: Remove unnecessary range check for VSHL
 From: Richard Henderson <richard.henderson@linaro.org>
-Happily, the bits are in the same places compared to a32.
+In 1dc8425e551, while converting to gvec, I added an extra range check
 against the shift count.  This was unnecessary because the encoding of
 the shift count produces only values from 0 to the element size - 1.
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20180228193125.20577-16-richard.henderson@linaro.org
+Message-id: 20200513163245.17915-5-richard.henderson@linaro.org
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
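The range claim in the commit message follows from how the Neon shift-by-immediate encodings pack element size and shift count into the L and imm6 fields. The sketch below is an illustration based on the Arm architecture encoding, not code taken from QEMU; all function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return log2 of the element size in bytes for an L:imm6 pair,
 * or -1 when L == 0 and imm6 is 000xxx, which belongs to a
 * different encoding group.
 */
static int neon_shift_size_sketch(unsigned l, unsigned imm6)
{
    if (l) {
        return 3;           /* 64-bit elements */
    }
    if (imm6 & 0x20) {
        return 2;           /* 32-bit elements */
    }
    if (imm6 & 0x10) {
        return 1;           /* 16-bit elements */
    }
    if (imm6 & 0x08) {
        return 0;           /* 8-bit elements */
    }
    return -1;
}

/* Left shift count: the bits below the size marker, so [0..esize-1]. */
static int neon_left_shift_sketch(unsigned l, unsigned imm6)
{
    int size = neon_shift_size_sketch(l, imm6);

    if (size < 0) {
        return -1;
    }
    return (int)(imm6 & (unsigned)((8 << size) - 1));
}

/* Right shift count: esize minus the same bits, so [1..esize]. */
static int neon_right_shift_sketch(unsigned l, unsigned imm6)
{
    int size = neon_shift_size_sketch(l, imm6);

    if (size < 0) {
        return -1;
    }
    return (8 << size) - neon_left_shift_sketch(l, imm6);
}
```

Because the left shift count is masked to the element width, it can never reach the element size, which is why the removed check was dead code. The right shift form is what gives SRI its [1..esize] range.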
- target/arm/translate.c | 14 +++++++++++++-
+ target/arm/translate.c | 12 ++----------
-file changed, 13 insertions(+), 1 deletion(-)
+file changed, 2 insertions(+), 10 deletions(-)
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-                                default_exception_el(s));
+                         gen_gvec_sli(size, rd_ofs, rm_ofs, shift,
-             break;
+                                      vec_size, vec_size);
-         }
+                     } else { /* VSHL */
--        if (((insn >> 24) & 3) == 3) {
+-                        /* Shifts larger than the element size are
-+        if ((insn & 0xfe000a00) == 0xfc000800
+-                         * architecturally valid and results in zero.
-+            && arm_dc_feature(s, ARM_FEATURE_V8)) {
+-                         */
-+            /* The Thumb2 and ARM encodings are identical.  */
+-                        if (shift >= 8 << size) {
-+            if (disas_neon_insn_3same_ext(s, insn)) {
+-                            tcg_gen_gvec_dup_imm(size, rd_ofs,
-+                goto illegal_op;
+-                                                 vec_size, vec_size, 0);
-+            }
+-                        } else {
-+        } else if ((insn & 0xff000a00) == 0xfe000800
+-                            tcg_gen_gvec_shli(size, rd_ofs, rm_ofs, shift,
-+                   && arm_dc_feature(s, ARM_FEATURE_V8)) {
+-                                              vec_size, vec_size);
-+            /* The Thumb2 and ARM encodings are identical.  */
+-                        }
-+            if (disas_neon_insn_2reg_scalar_ext(s, insn)) {
++                        tcg_gen_gvec_shli(size, rd_ofs, rm_ofs, shift,
-+                goto illegal_op;
++                                          vec_size, vec_size);
-+            }
+                     }
-+        } else if (((insn >> 24) & 3) == 3) {
+                     return 0;
-             /* Translate into the equivalent ARM encoding.  */
+                 }
              insn = (insn & 0xe2ffffff) | ((insn & (1 << 28)) >> 4) | (1 << 28);
              if (disas_neon_data_insn(s, insn)) {
 --
-.16.2
+.20.1
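
Editorial sketch, not part of either series: the range argument in the VSHL commit message above can be checked outside QEMU. The standalone C below assumes the AArch32 Neon shift-by-immediate layout, in which the 7-bit L:imm6 field selects the element size through its highest set bit and carries the left-shift count in the bits below it. The names `struct vshl_imm`, `decode_vshl_imm` and `check_all_encodings` are hypothetical and do not exist in QEMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decode result; not a QEMU type. */
struct vshl_imm {
    int size;   /* log2 of the element size in bytes, 0..3 */
    int esize;  /* element size in bits, 8 << size */
    int shift;  /* left-shift count */
};

/*
 * Decode a 7-bit L:imm6 value (assumed layout, see lead-in).
 * Values below 8 have none of bits 3..6 set and are treated here
 * as "not a shift encoding".
 */
static bool decode_vshl_imm(unsigned l_imm6, struct vshl_imm *out)
{
    int size = 3;

    l_imm6 &= 0x7f;
    if (l_imm6 < 8) {
        return false;
    }
    /* Bit 6 -> 64-bit elements, bit 5 -> 32, bit 4 -> 16, bit 3 -> 8. */
    while (!(l_imm6 & (1u << (size + 3)))) {
        size--;
    }
    out->size = size;
    out->esize = 8 << size;
    /* The count is whatever sits below the size-selecting bit. */
    out->shift = (int)(l_imm6 & ((1u << (size + 3)) - 1));
    return true;
}

/* Walk every encoding and confirm the count never reaches esize. */
static int check_all_encodings(void)
{
    int valid = 0;

    for (unsigned v = 0; v < 128; v++) {
        struct vshl_imm d;

        if (decode_vshl_imm(v, &d)) {
            assert(d.shift >= 0 && d.shift < d.esize);
            valid++;
        }
    }
    return valid;
}
```

Under this assumed layout the count is masked to fewer bits than it would take to represent the element size, so a `shift >= 8 << size` test can never be true for a left shift by immediate.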

-[Qemu-devel] [PULL 29/39] target/arm: Decode aa64 armv8.1 scalar/vector x indexed element
+[PULL 06/45] target/arm: Tidy handle_vec_simd_shri
 From: Richard Henderson <richard.henderson@linaro.org>
-Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
+Now that we've converted all cases to gvec, there is quite a bit
 of dead code at the end of the function.  Remove it.
 Sink the call to gen_gvec_fn2i to the end, loading a function
 pointer within the switch statement.
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20180228193125.20577-7-richard.henderson@linaro.org
+Message-id: 20200513163245.17915-6-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate-a64.c | 29 +++++++++++++++++++++++++++++
+ target/arm/translate-a64.c | 56 ++++++++++----------------------------
-file changed, 29 insertions(+)
+file changed, 14 insertions(+), 42 deletions(-)
 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-a64.c
 +++ b/target/arm/translate-a64.c
-@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
-     case 0x19: /* FMULX */
+     int size = 32 - clz32(immh) - 1;
-         is_fp = true;
+     int immhb = immh << 3 | immb;
-         break;
+     int shift = 2 * (8 << size) - immhb;
-+    case 0x1d: /* SQRDMLAH */
+-    bool accumulate = false;
-+    case 0x1f: /* SQRDMLSH */
+-    int dsize = is_q ? 128 : 64;
-+        if (!arm_dc_feature(s, ARM_FEATURE_V8_RDM)) {
+-    int esize = 8 << size;
-+            unallocated_encoding(s);
+-    int elements = dsize/esize;
-+            return;
+-    MemOp memop = size | (is_u ? 0 : MO_SIGN);
-+        }
+-    TCGv_i64 tcg_rn = new_tmp_a64(s);
 -    TCGv_i64 tcg_rd = new_tmp_a64(s);
 -    TCGv_i64 tcg_round;
 -    uint64_t round_const;
 -    int i;
 +    GVecGen2iFn *gvec_fn;
      if (extract32(immh, 3, 1) && !is_q) {
          unallocated_encoding(s);
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
      switch (opcode) {
      case 0x02: /* SSRA / USRA (accumulate) */
 -        gen_gvec_fn2i(s, is_q, rd, rn, shift,
 -                      is_u ? gen_gvec_usra : gen_gvec_ssra, size);
 -        return;
 +        gvec_fn = is_u ? gen_gvec_usra : gen_gvec_ssra;
 +        break;
+     case 0x08: /* SRI */
+-        gen_gvec_fn2i(s, is_q, rd, rn, shift, gen_gvec_sri, size);
+-        return;
++        gvec_fn = gen_gvec_sri;
++        break;
+     case 0x00: /* SSHR / USHR */
+         if (is_u) {
+@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
+                 /* Shift count the same size as element size produces zero.  */
+                 tcg_gen_gvec_dup_imm(size, vec_full_reg_offset(s, rd),
+                                      is_q ? 16 : 8, vec_full_reg_size(s), 0);
+-            } else {
+-                gen_gvec_fn2i(s, is_q, rd, rn, shift, tcg_gen_gvec_shri, size);
++                return;
+             }
++            gvec_fn = tcg_gen_gvec_shri;
+         } else {
+             /* Shift count the same size as element size produces all sign.  */
+             if (shift == 8 << size) {
+                 shift -= 1;
+             }
+-            gen_gvec_fn2i(s, is_q, rd, rn, shift, tcg_gen_gvec_sari, size);
++            gvec_fn = tcg_gen_gvec_sari;
+         }
+-        return;
++        break;
+     case 0x04: /* SRSHR / URSHR (rounding) */
+-        gen_gvec_fn2i(s, is_q, rd, rn, shift,
+-                      is_u ? gen_gvec_urshr : gen_gvec_srshr, size);
+-        return;
++        gvec_fn = is_u ? gen_gvec_urshr : gen_gvec_srshr;
++        break;
+     case 0x06: /* SRSRA / URSRA (accum + rounding) */
+-        gen_gvec_fn2i(s, is_q, rd, rn, shift,
+-                      is_u ? gen_gvec_ursra : gen_gvec_srsra, size);
+-        return;
++        gvec_fn = is_u ? gen_gvec_ursra : gen_gvec_srsra;
++        break;
      default:
-         unallocated_encoding(s);
+         g_assert_not_reached();
-         return;
+     }
-@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
-                                                 tcg_op, tcg_idx);
+-    round_const = 1ULL << (shift - 1);
-                 }
+-    tcg_round = tcg_const_i64(round_const);
-                 break;
+-
-+            case 0x1d: /* SQRDMLAH */
+-    for (i = 0; i < elements; i++) {
-+                read_vec_element_i32(s, tcg_res, rd, pass,
+-        read_vec_element(s, tcg_rn, rn, i, memop);
-+                                     is_scalar ? size : MO_32);
+-        if (accumulate) {
-+                if (size == 1) {
+-            read_vec_element(s, tcg_rd, rd, i, memop);
-+                    gen_helper_neon_qrdmlah_s16(tcg_res, cpu_env,
+-        }
-+                                                tcg_op, tcg_idx, tcg_res);
+-
-+                } else {
+-        handle_shri_with_rndacc(tcg_rd, tcg_rn, tcg_round,
-+                    gen_helper_neon_qrdmlah_s32(tcg_res, cpu_env,
+-                                accumulate, is_u, size, shift);
-+                                                tcg_op, tcg_idx, tcg_res);
+-
-+                }
+-        write_vec_element(s, tcg_rd, rd, i, size);
-+                break;
+-    }
-+            case 0x1f: /* SQRDMLSH */
+-    tcg_temp_free_i64(tcg_round);
-+                read_vec_element_i32(s, tcg_res, rd, pass,
+-
-+                                     is_scalar ? size : MO_32);
+-    clear_vec_high(s, is_q, rd);
-+                if (size == 1) {
++    gen_gvec_fn2i(s, is_q, rd, rn, shift, gvec_fn, size);
-+                    gen_helper_neon_qrdmlsh_s16(tcg_res, cpu_env,
+ }
-+                                                tcg_op, tcg_idx, tcg_res);
-+                } else {
+ /* SHL/SLI - Vector shift left */
 +                    gen_helper_neon_qrdmlsh_s32(tcg_res, cpu_env,
 +                                                tcg_op, tcg_idx, tcg_res);
 +                }
 +                break;
              default:
                  g_assert_not_reached();
              }
 --
-.16.2
+.20.1
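
Editorial sketch, not part of either series: the "sink the call to the end" refactor described in the handle_vec_simd_shri commit message can be illustrated without TCG. The standalone C below works on plain byte arrays; `shift_fn`, `op_ushr`, `op_usra`, `shift_right_bytes` and the two `OPC_*` values are hypothetical stand-ins for the gvec expanders, not QEMU code. Each switch arm only selects a function pointer, the one case that needs no expander returns early, and a single call site follows the switch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-element operation: acc is the old destination value. */
typedef uint8_t (*shift_fn)(uint8_t acc, uint8_t val, int shift);

static uint8_t op_ushr(uint8_t acc, uint8_t val, int shift)
{
    (void)acc;                      /* plain shift ignores the destination */
    return (uint8_t)(val >> shift);
}

static uint8_t op_usra(uint8_t acc, uint8_t val, int shift)
{
    return (uint8_t)(acc + (val >> shift));   /* shift and accumulate */
}

/* Illustrative opcode values only. */
enum { OPC_SHR = 0x00, OPC_SRA = 0x02 };

static void shift_right_bytes(int opcode, uint8_t *d, const uint8_t *n,
                              size_t len, int shift)
{
    shift_fn fn;
    size_t i;

    assert(shift >= 1 && shift <= 8);

    switch (opcode) {
    case OPC_SRA:
        fn = op_usra;
        break;
    case OPC_SHR:
        if (shift == 8) {
            /* Shifting by the full element width yields zero. */
            memset(d, 0, len);
            return;
        }
        fn = op_ushr;
        break;
    default:
        abort();
    }

    /* Single call site shared by every case that reaches it. */
    for (i = 0; i < len; i++) {
        d[i] = fn(d[i], n[i], shift);
    }
}
```

Keeping one call site means the argument list is written once, so a later change to how the operation is invoked touches a single line instead of every case.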

-[Qemu-devel] [PULL 31/39] target/arm: Decode aa32 armv8.1 two reg and a scalar
+[PULL 07/45] target/arm: Create gen_gvec_{ceq,clt,cle,cgt,cge}0
 From: Richard Henderson <richard.henderson@linaro.org>
-Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
+Provide a functional interface for the vector expansion.
 This fits better with the existing set of helpers that
 we provide for other operations.
 Macro-ize the 5 nearly identical comparisons.
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20180228193125.20577-9-richard.henderson@linaro.org
+Message-id: 20200513163245.17915-7-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate.c | 46 ++++++++++++++++++++++++++++++++++++++++++----
+ target/arm/translate.h     |  16 ++-
-file changed, 42 insertions(+), 4 deletions(-)
+ target/arm/translate-a64.c |  22 ++--
+ target/arm/translate.c     | 254 ++++++++-----------------------------
 files changed, 74 insertions(+), 218 deletions(-)
 diff --git a/target/arm/translate.h b/target/arm/translate.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.h
 +++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ static inline void gen_swstep_exception(DisasContext *s, int isv, int ex)
  uint64_t vfp_expand_imm(int size, uint8_t imm8);
  /* Vector operations shared between ARM and AArch64.  */
 -extern const GVecGen2 ceq0_op[4];
 -extern const GVecGen2 clt0_op[4];
 -extern const GVecGen2 cgt0_op[4];
 -extern const GVecGen2 cle0_op[4];
 -extern const GVecGen2 cge0_op[4];
 +void gen_gvec_ceq0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 +                   uint32_t opr_sz, uint32_t max_sz);
 +void gen_gvec_clt0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 +                   uint32_t opr_sz, uint32_t max_sz);
 +void gen_gvec_cgt0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 +                   uint32_t opr_sz, uint32_t max_sz);
 +void gen_gvec_cle0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 +                   uint32_t opr_sz, uint32_t max_sz);
 +void gen_gvec_cge0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 +                   uint32_t opr_sz, uint32_t max_sz);
 +
  extern const GVecGen3 mla_op[4];
  extern const GVecGen3 mls_op[4];
  extern const GVecGen3 cmtst_op[4];
 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-a64.c
 +++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void gen_gvec_fn4(DisasContext *s, bool is_q, int rd, int rn, int rm,
              is_q ? 16 : 8, vec_full_reg_size(s));
  }
 -/* Expand a 2-operand AdvSIMD vector operation using an op descriptor. */
 -static void gen_gvec_op2(DisasContext *s, bool is_q, int rd,
 -                         int rn, const GVecGen2 *gvec_op)
 -{
 -    tcg_gen_gvec_2(vec_full_reg_offset(s, rd), vec_full_reg_offset(s, rn),
 -                   is_q ? 16 : 8, vec_full_reg_size(s), gvec_op);
 -}
 -
  /* Expand a 3-operand AdvSIMD vector operation using an op descriptor.  */
  static void gen_gvec_op3(DisasContext *s, bool is_q, int rd,
                           int rn, int rm, const GVecGen3 *gvec_op)
@@ -XXX,XX +XXX,XX @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn)
          }
          break;
      case 0x8: /* CMGT, CMGE */
 -        gen_gvec_op2(s, is_q, rd, rn, u ? &cge0_op[size] : &cgt0_op[size]);
 +        if (u) {
 +            gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_cge0, size);
 +        } else {
 +            gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_cgt0, size);
 +        }
          return;
      case 0x9: /* CMEQ, CMLE */
 -        gen_gvec_op2(s, is_q, rd, rn, u ? &cle0_op[size] : &ceq0_op[size]);
 +        if (u) {
 +            gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_cle0, size);
 +        } else {
 +            gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_ceq0, size);
 +        }
          return;
      case 0xa: /* CMLT */
 -        gen_gvec_op2(s, is_q, rd, rn, &clt0_op[size]);
 +        gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_clt0, size);
          return;
      case 0xb:
          if (u) { /* ABS, NEG */
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static const char *regnames[] =
+@@ -XXX,XX +XXX,XX @@ static int do_v81_helper(DisasContext *s, gen_helper_gvec_3_ptr *fn,
-     { "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
+     return 1;
-       "r8", "r9", "r10", "r11", "r12", "r13", "r14", "pc" };
+ }
-+/* Function prototypes for gen_ functions calling Neon helpers.  */
+-static void gen_ceq0_i32(TCGv_i32 d, TCGv_i32 a)
-+typedef void NeonGenThreeOpEnvFn(TCGv_i32, TCGv_env, TCGv_i32,
+-{
-+                                 TCGv_i32, TCGv_i32);
+-    tcg_gen_setcondi_i32(TCG_COND_EQ, d, a, 0);
-+
+-    tcg_gen_neg_i32(d, d);
- /* initialize TCG globals.  */
+-}
- void arm_translate_init(void)
+-
 -static void gen_ceq0_i64(TCGv_i64 d, TCGv_i64 a)
 -{
 -    tcg_gen_setcondi_i64(TCG_COND_EQ, d, a, 0);
 -    tcg_gen_neg_i64(d, d);
 -}
 -
 -static void gen_ceq0_vec(unsigned vece, TCGv_vec d, TCGv_vec a)
 -{
 -    TCGv_vec zero = tcg_const_zeros_vec_matching(d);
 -    tcg_gen_cmp_vec(TCG_COND_EQ, vece, d, a, zero);
 -    tcg_temp_free_vec(zero);
 -}
 +#define GEN_CMP0(NAME, COND)                                            \
 +    static void gen_##NAME##0_i32(TCGv_i32 d, TCGv_i32 a)               \
 +    {                                                                   \
 +        tcg_gen_setcondi_i32(COND, d, a, 0);                            \
 +        tcg_gen_neg_i32(d, d);                                          \
 +    }                                                                   \
 +    static void gen_##NAME##0_i64(TCGv_i64 d, TCGv_i64 a)               \
 +    {                                                                   \
 +        tcg_gen_setcondi_i64(COND, d, a, 0);                            \
 +        tcg_gen_neg_i64(d, d);                                          \
 +    }                                                                   \
 +    static void gen_##NAME##0_vec(unsigned vece, TCGv_vec d, TCGv_vec a) \
 +    {                                                                   \
 +        TCGv_vec zero = tcg_const_zeros_vec_matching(d);                \
 +        tcg_gen_cmp_vec(COND, vece, d, a, zero);                        \
 +        tcg_temp_free_vec(zero);                                        \
 +    }                                                                   \
 +    void gen_gvec_##NAME##0(unsigned vece, uint32_t d, uint32_t m,      \
 +                            uint32_t opr_sz, uint32_t max_sz)           \
 +    {                                                                   \
 +        const GVecGen2 op[4] = {                                        \
 +            { .fno = gen_helper_gvec_##NAME##0_b,                       \
 +              .fniv = gen_##NAME##0_vec,                                \
 +              .opt_opc = vecop_list_cmp,                                \
 +              .vece = MO_8 },                                           \
 +            { .fno = gen_helper_gvec_##NAME##0_h,                       \
 +              .fniv = gen_##NAME##0_vec,                                \
 +              .opt_opc = vecop_list_cmp,                                \
 +              .vece = MO_16 },                                          \
 +            { .fni4 = gen_##NAME##0_i32,                                \
 +              .fniv = gen_##NAME##0_vec,                                \
 +              .opt_opc = vecop_list_cmp,                                \
 +              .vece = MO_32 },                                          \
 +            { .fni8 = gen_##NAME##0_i64,                                \
 +              .fniv = gen_##NAME##0_vec,                                \
 +              .opt_opc = vecop_list_cmp,                                \
 +              .prefer_i64 = TCG_TARGET_REG_BITS == 64,                  \
 +              .vece = MO_64 },                                          \
 +        };                                                              \
 +        tcg_gen_gvec_2(d, m, opr_sz, max_sz, &op[vece]);                \
 +    }
  static const TCGOpcode vecop_list_cmp[] = {
      INDEX_op_cmp_vec, 0
  };
 -const GVecGen2 ceq0_op[4] = {
 -    { .fno = gen_helper_gvec_ceq0_b,
 -      .fniv = gen_ceq0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .vece = MO_8 },
 -    { .fno = gen_helper_gvec_ceq0_h,
 -      .fniv = gen_ceq0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .vece = MO_16 },
 -    { .fni4 = gen_ceq0_i32,
 -      .fniv = gen_ceq0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .vece = MO_32 },
 -    { .fni8 = gen_ceq0_i64,
 -      .fniv = gen_ceq0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -      .vece = MO_64 },
 -};
 +GEN_CMP0(ceq, TCG_COND_EQ)
 +GEN_CMP0(cle, TCG_COND_LE)
 +GEN_CMP0(cge, TCG_COND_GE)
 +GEN_CMP0(clt, TCG_COND_LT)
 +GEN_CMP0(cgt, TCG_COND_GT)
 -static void gen_cle0_i32(TCGv_i32 d, TCGv_i32 a)
 -{
 -    tcg_gen_setcondi_i32(TCG_COND_LE, d, a, 0);
 -    tcg_gen_neg_i32(d, d);
 -}
 -
 -static void gen_cle0_i64(TCGv_i64 d, TCGv_i64 a)
 -{
 -    tcg_gen_setcondi_i64(TCG_COND_LE, d, a, 0);
 -    tcg_gen_neg_i64(d, d);
 -}
 -
 -static void gen_cle0_vec(unsigned vece, TCGv_vec d, TCGv_vec a)
 -{
 -    TCGv_vec zero = tcg_const_zeros_vec_matching(d);
 -    tcg_gen_cmp_vec(TCG_COND_LE, vece, d, a, zero);
 -    tcg_temp_free_vec(zero);
 -}
 -
 -const GVecGen2 cle0_op[4] = {
 -    { .fno = gen_helper_gvec_cle0_b,
 -      .fniv = gen_cle0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .vece = MO_8 },
 -    { .fno = gen_helper_gvec_cle0_h,
 -      .fniv = gen_cle0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .vece = MO_16 },
 -    { .fni4 = gen_cle0_i32,
 -      .fniv = gen_cle0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .vece = MO_32 },
 -    { .fni8 = gen_cle0_i64,
 -      .fniv = gen_cle0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -      .vece = MO_64 },
 -};
 -
 -static void gen_cge0_i32(TCGv_i32 d, TCGv_i32 a)
 -{
 -    tcg_gen_setcondi_i32(TCG_COND_GE, d, a, 0);
 -    tcg_gen_neg_i32(d, d);
 -}
 -
 -static void gen_cge0_i64(TCGv_i64 d, TCGv_i64 a)
 -{
 -    tcg_gen_setcondi_i64(TCG_COND_GE, d, a, 0);
 -    tcg_gen_neg_i64(d, d);
 -}
 -
 -static void gen_cge0_vec(unsigned vece, TCGv_vec d, TCGv_vec a)
 -{
 -    TCGv_vec zero = tcg_const_zeros_vec_matching(d);
 -    tcg_gen_cmp_vec(TCG_COND_GE, vece, d, a, zero);
 -    tcg_temp_free_vec(zero);
 -}
 -
 -const GVecGen2 cge0_op[4] = {
 -    { .fno = gen_helper_gvec_cge0_b,
 -      .fniv = gen_cge0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .vece = MO_8 },
 -    { .fno = gen_helper_gvec_cge0_h,
 -      .fniv = gen_cge0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .vece = MO_16 },
 -    { .fni4 = gen_cge0_i32,
 -      .fniv = gen_cge0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .vece = MO_32 },
 -    { .fni8 = gen_cge0_i64,
 -      .fniv = gen_cge0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -      .vece = MO_64 },
 -};
 -
 -static void gen_clt0_i32(TCGv_i32 d, TCGv_i32 a)
 -{
 -    tcg_gen_setcondi_i32(TCG_COND_LT, d, a, 0);
 -    tcg_gen_neg_i32(d, d);
 -}
 -
 -static void gen_clt0_i64(TCGv_i64 d, TCGv_i64 a)
 -{
 -    tcg_gen_setcondi_i64(TCG_COND_LT, d, a, 0);
 -    tcg_gen_neg_i64(d, d);
 -}
 -
 -static void gen_clt0_vec(unsigned vece, TCGv_vec d, TCGv_vec a)
 -{
 -    TCGv_vec zero = tcg_const_zeros_vec_matching(d);
 -    tcg_gen_cmp_vec(TCG_COND_LT, vece, d, a, zero);
 -    tcg_temp_free_vec(zero);
 -}
 -
 -const GVecGen2 clt0_op[4] = {
 -    { .fno = gen_helper_gvec_clt0_b,
 -      .fniv = gen_clt0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .vece = MO_8 },
 -    { .fno = gen_helper_gvec_clt0_h,
 -      .fniv = gen_clt0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .vece = MO_16 },
 -    { .fni4 = gen_clt0_i32,
 -      .fniv = gen_clt0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .vece = MO_32 },
 -    { .fni8 = gen_clt0_i64,
 -      .fniv = gen_clt0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -      .vece = MO_64 },
 -};
 -
 -static void gen_cgt0_i32(TCGv_i32 d, TCGv_i32 a)
 -{
 -    tcg_gen_setcondi_i32(TCG_COND_GT, d, a, 0);
 -    tcg_gen_neg_i32(d, d);
 -}
 -
 -static void gen_cgt0_i64(TCGv_i64 d, TCGv_i64 a)
 -{
 -    tcg_gen_setcondi_i64(TCG_COND_GT, d, a, 0);
 -    tcg_gen_neg_i64(d, d);
 -}
 -
 -static void gen_cgt0_vec(unsigned vece, TCGv_vec d, TCGv_vec a)
 -{
 -    TCGv_vec zero = tcg_const_zeros_vec_matching(d);
 -    tcg_gen_cmp_vec(TCG_COND_GT, vece, d, a, zero);
 -    tcg_temp_free_vec(zero);
 -}
 -
 -const GVecGen2 cgt0_op[4] = {
 -    { .fno = gen_helper_gvec_cgt0_b,
 -      .fniv = gen_cgt0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .vece = MO_8 },
 -    { .fno = gen_helper_gvec_cgt0_h,
 -      .fniv = gen_cgt0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .vece = MO_16 },
 -    { .fni4 = gen_cgt0_i32,
 -      .fniv = gen_cgt0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .vece = MO_32 },
 -    { .fni8 = gen_cgt0_i64,
 -      .fniv = gen_cgt0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -      .vece = MO_64 },
 -};
 +#undef GEN_CMP0
  static void gen_ssra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
  {
 @@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-                         }
+                     break;
-                         neon_store_reg64(cpu_V0, rd + pass);
-                     }
+                 case NEON_2RM_VCEQ0:
--
+-                    tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size,
--
+-                                   vec_size, &ceq0_op[size]);
-                     break;
++                    gen_gvec_ceq0(size, rd_ofs, rm_ofs, vec_size, vec_size);
--                default: /* 14 and 15 are RESERVED */
+                     break;
--                    return 1;
+                 case NEON_2RM_VCGT0:
-+                case 14: /* VQRDMLAH scalar */
+-                    tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size,
-+                case 15: /* VQRDMLSH scalar */
+-                                   vec_size, &cgt0_op[size]);
-+                    {
++                    gen_gvec_cgt0(size, rd_ofs, rm_ofs, vec_size, vec_size);
-+                        NeonGenThreeOpEnvFn *fn;
+                     break;
-+
+                 case NEON_2RM_VCLE0:
-+                        if (!arm_dc_feature(s, ARM_FEATURE_V8_RDM)) {
+-                    tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size,
-+                            return 1;
+-                                   vec_size, &cle0_op[size]);
-+                        }
++                    gen_gvec_cle0(size, rd_ofs, rm_ofs, vec_size, vec_size);
-+                        if (u && ((rd | rn) & 1)) {
+                     break;
-+                            return 1;
+                 case NEON_2RM_VCGE0:
-+                        }
+-                    tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size,
-+                        if (op == 14) {
+-                                   vec_size, &cge0_op[size]);
-+                            if (size == 1) {
++                    gen_gvec_cge0(size, rd_ofs, rm_ofs, vec_size, vec_size);
-+                                fn = gen_helper_neon_qrdmlah_s16;
+                     break;
-+                            } else {
+                 case NEON_2RM_VCLT0:
-+                                fn = gen_helper_neon_qrdmlah_s32;
+-                    tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size,
-+                            }
+-                                   vec_size, &clt0_op[size]);
-+                        } else {
++                    gen_gvec_clt0(size, rd_ofs, rm_ofs, vec_size, vec_size);
-+                            if (size == 1) {
+                     break;
-+                                fn = gen_helper_neon_qrdmlsh_s16;
-+                            } else {
+                 default:
 +                                fn = gen_helper_neon_qrdmlsh_s32;
 +                            }
 +                        }
 +
 +                        tmp2 = neon_get_scalar(size, rm);
 +                        for (pass = 0; pass < (u ? 4 : 2); pass++) {
 +                            tmp = neon_load_reg(rn, pass);
 +                            tmp3 = neon_load_reg(rd, pass);
 +                            fn(tmp, cpu_env, tmp, tmp2, tmp3);
 +                            tcg_temp_free_i32(tmp3);
 +                            neon_store_reg(rd, pass, tmp);
 +                        }
 +                        tcg_temp_free_i32(tmp2);
 +                    }
 +                    break;
 +                default:
 +                    g_assert_not_reached();
                  }
              }
          } else { /* size == 3 */
 --
-.16.2
+.20.1
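
Editorial sketch, not part of either series: the GEN_CMP0 patch above replaces five hand-written copies with one macro instantiated per condition. The standalone C below mirrors that idea on plain integers instead of TCG values. It reuses the macro name for readability, but takes a C comparison operator where the patch takes a TCG condition, and the generated `ceq0_i32`-style functions are hypothetical. As in the patch, the comparison result (0 or 1) is negated so that a true lane becomes all ones.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Generate compare-against-zero helpers for one condition.
 * (a OP 0) evaluates to 0 or 1; negating it gives 0 or -1,
 * that is, an all-zeros or all-ones mask.
 */
#define GEN_CMP0(NAME, OP)                                  \
    static inline int32_t NAME##0_i32(int32_t a)            \
    {                                                       \
        return -(int32_t)(a OP 0);                          \
    }                                                       \
    static inline int64_t NAME##0_i64(int64_t a)            \
    {                                                       \
        return -(int64_t)(a OP 0);                          \
    }

GEN_CMP0(ceq, ==)
GEN_CMP0(cle, <=)
GEN_CMP0(cge, >=)
GEN_CMP0(clt, <)
GEN_CMP0(cgt, >)

#undef GEN_CMP0
```

Each instantiation differs only in its name and comparison, which is the property that makes the five originals safe to fold into one macro body.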

-[Qemu-devel] [PULL 39/39] target/arm: Enable ARM_FEATURE_V8_FCMA
+[PULL 08/45] target/arm: Create gen_gvec_{mla,mls}
 From: Richard Henderson <richard.henderson@linaro.org>
-Enable it for the "any" CPU used by *-linux-user.
+Provide a functional interface for the vector expansion.
+This fits better with the existing set of helpers that
 we provide for other operations.
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Message-id: 20200513163245.17915-8-richard.henderson@linaro.org
-Message-id: 20180228193125.20577-17-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/cpu.c   | 1 +
+ target/arm/translate.h          |   7 +-
- target/arm/cpu64.c | 1 +
+ target/arm/translate-a64.c      |   4 +-
-files changed, 2 insertions(+)
+ target/arm/translate-neon.inc.c |  16 +----
+ target/arm/translate.c          | 117 +++++++++++++++++---------------
-diff --git a/target/arm/cpu.c b/target/arm/cpu.c
+files changed, 71 insertions(+), 73 deletions(-)
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.c
+diff --git a/target/arm/translate.h b/target/arm/translate.h
-+++ b/target/arm/cpu.c
+index XXXXXXX..XXXXXXX 100644
-@@ -XXX,XX +XXX,XX @@ static void arm_any_initfn(Object *obj)
+--- a/target/arm/translate.h
-     set_feature(&cpu->env, ARM_FEATURE_V8_PMULL);
++++ b/target/arm/translate.h
-     set_feature(&cpu->env, ARM_FEATURE_CRC);
+@@ -XXX,XX +XXX,XX @@ void gen_gvec_cle0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
-     set_feature(&cpu->env, ARM_FEATURE_V8_RDM);
+ void gen_gvec_cge0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
-+    set_feature(&cpu->env, ARM_FEATURE_V8_FCMA);
+                    uint32_t opr_sz, uint32_t max_sz);
-     cpu->midr = 0xffffffff;
 -extern const GVecGen3 mla_op[4];
 -extern const GVecGen3 mls_op[4];
 +void gen_gvec_mla(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                  uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 +void gen_gvec_mls(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                  uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 +
  extern const GVecGen3 cmtst_op[4];
  extern const GVecGen3 sshl_op[4];
  extern const GVecGen3 ushl_op[4];
 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-a64.c
 +++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
          return;
      case 0x12: /* MLA, MLS */
          if (u) {
 -            gen_gvec_op3(s, is_q, rd, rn, rm, &mls_op[size]);
 +            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_mls, size);
          } else {
 -            gen_gvec_op3(s, is_q, rd, rn, rm, &mla_op[size]);
 +            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_mla, size);
          }
          return;
      case 0x11:
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_3SAME_NO_SZ_3(VMAX_U, tcg_gen_gvec_umax)
  DO_3SAME_NO_SZ_3(VMIN_S, tcg_gen_gvec_smin)
  DO_3SAME_NO_SZ_3(VMIN_U, tcg_gen_gvec_umin)
  DO_3SAME_NO_SZ_3(VMUL, tcg_gen_gvec_mul)
 +DO_3SAME_NO_SZ_3(VMLA, gen_gvec_mla)
 +DO_3SAME_NO_SZ_3(VMLS, gen_gvec_mls)
  #define DO_3SAME_CMP(INSN, COND)                                        \
      static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
@@ -XXX,XX +XXX,XX @@ static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
      return do_3same(s, a, gen_VMUL_p_3s);
  }
- #endif
-diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
+-#define DO_3SAME_GVEC3_NO_SZ_3(INSN, OPARRAY)                           \
-index XXXXXXX..XXXXXXX 100644
+-    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
---- a/target/arm/cpu64.c
+-                                uint32_t rn_ofs, uint32_t rm_ofs,       \
-+++ b/target/arm/cpu64.c
+-                                uint32_t oprsz, uint32_t maxsz)         \
-@@ -XXX,XX +XXX,XX @@ static void aarch64_any_initfn(Object *obj)
+-    {                                                                   \
-     set_feature(&cpu->env, ARM_FEATURE_CRC);
+-        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,                          \
-     set_feature(&cpu->env, ARM_FEATURE_V8_RDM);
+-                       oprsz, maxsz, &OPARRAY[vece]);                   \
-     set_feature(&cpu->env, ARM_FEATURE_V8_FP16);
+-    }                                                                   \
-+    set_feature(&cpu->env, ARM_FEATURE_V8_FCMA);
+-    DO_3SAME_NO_SZ_3(INSN, gen_##INSN##_3s)
-     cpu->ctr = 0x80038003; /* 32 byte I and D cacheline size, VIPT icache */
+-
-     cpu->dcz_blocksize = 7; /*  512 bytes */
+-
- }
+-DO_3SAME_GVEC3_NO_SZ_3(VMLA, mla_op)
 -DO_3SAME_GVEC3_NO_SZ_3(VMLS, mls_op)
 -
  #define DO_3SAME_GVEC3_SHIFT(INSN, OPARRAY)                             \
      static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
                                  uint32_t rn_ofs, uint32_t rm_ofs,       \
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_mls_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
  /* Note that while NEON does not support VMLA and VMLS as 64-bit ops,
   * these tables are shared with AArch64 which does support them.
   */
 +void gen_gvec_mla(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                  uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = {
 +        INDEX_op_mul_vec, INDEX_op_add_vec, 0
 +    };
 +    static const GVecGen3 ops[4] = {
 +        { .fni4 = gen_mla8_i32,
 +          .fniv = gen_mla_vec,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_8 },
 +        { .fni4 = gen_mla16_i32,
 +          .fniv = gen_mla_vec,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_16 },
 +        { .fni4 = gen_mla32_i32,
 +          .fniv = gen_mla_vec,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_32 },
 +        { .fni8 = gen_mla64_i64,
 +          .fniv = gen_mla_vec,
 +          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_64 },
 +    };
 +    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 +}
 -static const TCGOpcode vecop_list_mla[] = {
 -    INDEX_op_mul_vec, INDEX_op_add_vec, 0
 -};
 -
 -static const TCGOpcode vecop_list_mls[] = {
 -    INDEX_op_mul_vec, INDEX_op_sub_vec, 0
 -};
 -
 -const GVecGen3 mla_op[4] = {
 -    { .fni4 = gen_mla8_i32,
 -      .fniv = gen_mla_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_mla,
 -      .vece = MO_8 },
 -    { .fni4 = gen_mla16_i32,
 -      .fniv = gen_mla_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_mla,
 -      .vece = MO_16 },
 -    { .fni4 = gen_mla32_i32,
 -      .fniv = gen_mla_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_mla,
 -      .vece = MO_32 },
 -    { .fni8 = gen_mla64_i64,
 -      .fniv = gen_mla_vec,
 -      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_mla,
 -      .vece = MO_64 },
 -};
 -
 -const GVecGen3 mls_op[4] = {
 -    { .fni4 = gen_mls8_i32,
 -      .fniv = gen_mls_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_mls,
 -      .vece = MO_8 },
 -    { .fni4 = gen_mls16_i32,
 -      .fniv = gen_mls_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_mls,
 -      .vece = MO_16 },
 -    { .fni4 = gen_mls32_i32,
 -      .fniv = gen_mls_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_mls,
 -      .vece = MO_32 },
 -    { .fni8 = gen_mls64_i64,
 -      .fniv = gen_mls_vec,
 -      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_mls,
 -      .vece = MO_64 },
 -};
 +void gen_gvec_mls(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                  uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = {
 +        INDEX_op_mul_vec, INDEX_op_sub_vec, 0
 +    };
 +    static const GVecGen3 ops[4] = {
 +        { .fni4 = gen_mls8_i32,
 +          .fniv = gen_mls_vec,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_8 },
 +        { .fni4 = gen_mls16_i32,
 +          .fniv = gen_mls_vec,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_16 },
 +        { .fni4 = gen_mls32_i32,
 +          .fniv = gen_mls_vec,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_32 },
 +        { .fni8 = gen_mls64_i64,
 +          .fniv = gen_mls_vec,
 +          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_64 },
 +    };
 +    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 +}
  /* CMTST : test is "if (X & Y != 0)". */
  static void gen_cmtst_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
 --
-.16.2
+.20.1

-[Qemu-devel] [PULL 33/39] target/arm: Add ARM_FEATURE_V8_FCMA
+[PULL 09/45] target/arm: Swap argument order for VSHL during decode
 From: Richard Henderson <richard.henderson@linaro.org>
-Not enabled anywhere yet.
+Rather than perform the argument swap during code generation,
 perform it during decode.  This means it doesn't have to be
 special-cased later, and we can share code with AArch64 code
 generation.  Hopefully the decode comment addresses any confusion
 that might arise in between.
-Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20180228193125.20577-11-richard.henderson@linaro.org
+Message-id: 20200513163245.17915-9-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/cpu.h     | 1 +
+ target/arm/neon-dp.decode       | 17 +++++++++++++++--
- linux-user/elfload.c | 1 +
+ target/arm/translate-neon.inc.c |  3 +--
-files changed, 2 insertions(+)
+files changed, 16 insertions(+), 4 deletions(-)
-diff --git a/target/arm/cpu.h b/target/arm/cpu.h
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.h
+--- a/target/arm/neon-dp.decode
-+++ b/target/arm/cpu.h
++++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ enum arm_features {
+@@ -XXX,XX +XXX,XX @@ VCGT_U_3s        1111 001 1 0 . .. .... .... 0011 . . . 0 .... @3same
-     ARM_FEATURE_V8_SM4, /* implements SM4 part of v8 Crypto Extensions */
+ VCGE_S_3s        1111 001 0 0 . .. .... .... 0011 . . . 1 .... @3same
-     ARM_FEATURE_V8_RDM, /* implements v8.1 simd round multiply */
+ VCGE_U_3s        1111 001 1 0 . .. .... .... 0011 . . . 1 .... @3same
-     ARM_FEATURE_V8_FP16, /* implements v8.2 half-precision float */
-+    ARM_FEATURE_V8_FCMA, /* has complex number part of v8.3 extensions.  */
+-VSHL_S_3s        1111 001 0 0 . .. .... .... 0100 . . . 0 .... @3same
- };
+-VSHL_U_3s        1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same
++# The _rev suffix indicates that Vn and Vm are reversed. This is
- static inline int arm_feature(CPUARMState *env, int feature)
++# the case for shifts. In the Arm ARM these insns are documented
-diff --git a/linux-user/elfload.c b/linux-user/elfload.c
++# with the Vm and Vn fields in their usual places, but in the
 +# assembly the operands are listed "backwards", ie in the order
 +# Dd, Dm, Dn where other insns use Dd, Dn, Dm. For QEMU we choose
 +# to consider Vm and Vn as being in different fields in the insn,
 +# which allows us to avoid special-casing shifts in the trans_
 +# function code. We would otherwise need to manually swap the operands
 +# over to call Neon helper functions that are shared with AArch64,
 +# which does not have this odd reversed-operand situation.
 +@3same_rev       .... ... . . . size:2 .... .... .... . q:1 . . .... \
 +                 &3same vn=%vm_dp vm=%vn_dp vd=%vd_dp
 +
 +VSHL_S_3s        1111 001 0 0 . .. .... .... 0100 . . . 0 .... @3same_rev
 +VSHL_U_3s        1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same_rev
  VMAX_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 0 .... @3same
  VMAX_U_3s        1111 001 1 0 . .. .... .... 0110 . . . 0 .... @3same
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
---- a/linux-user/elfload.c
+--- a/target/arm/translate-neon.inc.c
-+++ b/linux-user/elfload.c
++++ b/target/arm/translate-neon.inc.c
-@@ -XXX,XX +XXX,XX @@ static uint32_t get_elf_hwcap(void)
+@@ -XXX,XX +XXX,XX @@ static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
-     GET_FEATURE(ARM_FEATURE_V8_FP16,
+                                 uint32_t rn_ofs, uint32_t rm_ofs,       \
-                 ARM_HWCAP_A64_FPHP | ARM_HWCAP_A64_ASIMDHP);
+                                 uint32_t oprsz, uint32_t maxsz)         \
-     GET_FEATURE(ARM_FEATURE_V8_RDM, ARM_HWCAP_A64_ASIMDRDM);
+     {                                                                   \
-+    GET_FEATURE(ARM_FEATURE_V8_FCMA, ARM_HWCAP_A64_FCMA);
+-        /* Note the operation is vshl vd,vm,vn */                       \
- #undef GET_FEATURE
+-        tcg_gen_gvec_3(rd_ofs, rm_ofs, rn_ofs,                          \
++        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,                          \
-     return hwcaps;
+                        oprsz, maxsz, &OPARRAY[vece]);                   \
      }                                                                   \
      DO_3SAME(INSN, gen_##INSN##_3s)
 --
-.16.2
+.20.1
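
Note on the VSHL change above: the sketch below illustrates the idea of
swapping operands at decode time so that later stages can treat every
3-reg-same insn uniformly. It is not QEMU code. The Args3Same struct and
the decode_3same/decode_3same_rev helpers are hypothetical stand-ins for
what decodetree generates, and the bit positions used for Vd, Vn and Vm
(D:bits[15:12], N:bits[19:16], M:bits[3:0]) are an assumption made for
this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the decodetree-generated arg_3same struct. */
typedef struct {
    int vd;
    int vn;
    int vm;
} Args3Same;

/* Build a 5-bit register number from one high bit and a 4-bit field. */
static int extract_reg(uint32_t insn, int hi_bit, int lo_pos)
{
    return (int)((((insn >> hi_bit) & 1u) << 4) | ((insn >> lo_pos) & 0xfu));
}

/* Usual layout (assumed bit positions): Vd, Vn, Vm in their normal fields. */
static Args3Same decode_3same(uint32_t insn)
{
    Args3Same a;
    a.vd = extract_reg(insn, 22, 12);
    a.vn = extract_reg(insn, 7, 16);
    a.vm = extract_reg(insn, 5, 0);
    return a;
}

/*
 * "_rev" layout: same instruction bits, but vn is read from the Vm field
 * and vm from the Vn field, mirroring "vn=%vm_dp vm=%vn_dp" in the patch.
 */
static Args3Same decode_3same_rev(uint32_t insn)
{
    Args3Same a;
    a.vd = extract_reg(insn, 22, 12);
    a.vn = extract_reg(insn, 5, 0);
    a.vm = extract_reg(insn, 7, 16);
    return a;
}
```

With the swap done here, a consumer can always pass (vd, vn, vm) in that
order to a shared expander, which is what lets the Neon VSHL path drop
its special case.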

-[Qemu-devel] [PULL 25/39] target/arm: Refactor disas_simd_indexed decode
+[PULL 10/45] target/arm: Create gen_gvec_{cmtst,ushl,sshl}
 From: Richard Henderson <richard.henderson@linaro.org>
-Include the U bit in the switches rather than testing separately.
+Provide a functional interface for the vector expansion.
+This fits better with the existing set of helpers that
 we provide for other operations.
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Message-id: 20200513163245.17915-10-richard.henderson@linaro.org
-Message-id: 20180228193125.20577-3-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate-a64.c | 129 +++++++++++++++++++++------------------------
+ target/arm/translate.h          |  10 ++-
-file changed, 61 insertions(+), 68 deletions(-)
+ target/arm/translate-a64.c      |  18 ++--
+ target/arm/translate-neon.inc.c |  23 +----
  target/arm/translate.c          | 146 +++++++++++++++++---------------
 files changed, 95 insertions(+), 102 deletions(-)
 diff --git a/target/arm/translate.h b/target/arm/translate.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.h
 +++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ void gen_gvec_mla(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
  void gen_gvec_mls(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
                    uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 -extern const GVecGen3 cmtst_op[4];
 -extern const GVecGen3 sshl_op[4];
 -extern const GVecGen3 ushl_op[4];
 +void gen_gvec_cmtst(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                    uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 +void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 +void gen_gvec_ushl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 +
  extern const GVecGen4 uqadd_op[4];
  extern const GVecGen4 sqadd_op[4];
  extern const GVecGen4 uqsub_op[4];
 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-a64.c
 +++ b/target/arm/translate-a64.c
-@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static void gen_gvec_fn4(DisasContext *s, bool is_q, int rd, int rn, int rm,
-     int index;
+             is_q ? 16 : 8, vec_full_reg_size(s));
-     TCGv_ptr fpst;
+ }
--    switch (opcode) {
+-/* Expand a 3-operand AdvSIMD vector operation using an op descriptor.  */
--    case 0x0: /* MLA */
+-static void gen_gvec_op3(DisasContext *s, bool is_q, int rd,
--    case 0x4: /* MLS */
+-                         int rn, int rm, const GVecGen3 *gvec_op)
--        if (!u || is_scalar) {
+-{
-+    switch (16 * u + opcode) {
+-    tcg_gen_gvec_3(vec_full_reg_offset(s, rd), vec_full_reg_offset(s, rn),
-+    case 0x08: /* MUL */
+-                   vec_full_reg_offset(s, rm), is_q ? 16 : 8,
-+    case 0x10: /* MLA */
+-                   vec_full_reg_size(s), gvec_op);
-+    case 0x14: /* MLS */
+-}
-+        if (is_scalar) {
+-
-             unallocated_encoding(s);
+ /* Expand a 3-operand operation using an out-of-line helper.  */
  static void gen_gvec_op3_ool(DisasContext *s, bool is_q, int rd,
                               int rn, int rm, int data, gen_helper_gvec_3 *fn)
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
                         (u ? uqsub_op : sqsub_op) + size);
          return;
      case 0x08: /* SSHL, USHL */
 -        gen_gvec_op3(s, is_q, rd, rn, rm,
 -                     u ? &ushl_op[size] : &sshl_op[size]);
 +        if (u) {
 +            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_ushl, size);
 +        } else {
 +            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sshl, size);
 +        }
          return;
      case 0x0c: /* SMAX, UMAX */
          if (u) {
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
          return;
      case 0x11:
          if (!u) { /* CMTST */
 -            gen_gvec_op3(s, is_q, rd, rn, rm, &cmtst_op[size]);
 +            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_cmtst, size);
              return;
          }
-         break;
+         /* else CMEQ */
--    case 0x2: /* SMLAL, SMLAL2, UMLAL, UMLAL2 */
+diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
--    case 0x6: /* SMLSL, SMLSL2, UMLSL, UMLSL2 */
+index XXXXXXX..XXXXXXX 100644
--    case 0xa: /* SMULL, SMULL2, UMULL, UMULL2 */
+--- a/target/arm/translate-neon.inc.c
-+    case 0x02: /* SMLAL, SMLAL2 */
++++ b/target/arm/translate-neon.inc.c
-+    case 0x12: /* UMLAL, UMLAL2 */
+@@ -XXX,XX +XXX,XX @@ DO_3SAME(VBIC, tcg_gen_gvec_andc)
-+    case 0x06: /* SMLSL, SMLSL2 */
+ DO_3SAME(VORR, tcg_gen_gvec_or)
-+    case 0x16: /* UMLSL, UMLSL2 */
+ DO_3SAME(VORN, tcg_gen_gvec_orc)
-+    case 0x0a: /* SMULL, SMULL2 */
+ DO_3SAME(VEOR, tcg_gen_gvec_xor)
-+    case 0x1a: /* UMULL, UMULL2 */
++DO_3SAME(VSHL_S, gen_gvec_sshl)
-         if (is_scalar) {
++DO_3SAME(VSHL_U, gen_gvec_ushl)
-             unallocated_encoding(s);
-             return;
+ /* These insns are all gvec_bitsel but with the inputs in various orders. */
-         }
+ #define DO_3SAME_BITSEL(INSN, O1, O2, O3)                               \
-         is_long = true;
+@@ -XXX,XX +XXX,XX @@ DO_3SAME_NO_SZ_3(VMIN_U, tcg_gen_gvec_umin)
-         break;
+ DO_3SAME_NO_SZ_3(VMUL, tcg_gen_gvec_mul)
--    case 0x3: /* SQDMLAL, SQDMLAL2 */
+ DO_3SAME_NO_SZ_3(VMLA, gen_gvec_mla)
--    case 0x7: /* SQDMLSL, SQDMLSL2 */
+ DO_3SAME_NO_SZ_3(VMLS, gen_gvec_mls)
--    case 0xb: /* SQDMULL, SQDMULL2 */
++DO_3SAME_NO_SZ_3(VTST, gen_gvec_cmtst)
-+    case 0x03: /* SQDMLAL, SQDMLAL2 */
-+    case 0x07: /* SQDMLSL, SQDMLSL2 */
+ #define DO_3SAME_CMP(INSN, COND)                                        \
-+    case 0x0b: /* SQDMULL, SQDMULL2 */
+     static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
-         is_long = true;
+@@ -XXX,XX +XXX,XX @@ DO_3SAME_CMP(VCGE_S, TCG_COND_GE)
--        /* fall through */
+ DO_3SAME_CMP(VCGE_U, TCG_COND_GEU)
--    case 0xc: /* SQDMULH */
+ DO_3SAME_CMP(VCEQ, TCG_COND_EQ)
--    case 0xd: /* SQRDMULH */
--        if (u) {
+-static void gen_VTST_3s(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
--            unallocated_encoding(s);
+-                         uint32_t rm_ofs, uint32_t oprsz, uint32_t maxsz)
--            return;
+-{
--        }
+-    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &cmtst_op[vece]);
-         break;
+-}
--    case 0x8: /* MUL */
+-DO_3SAME_NO_SZ_3(VTST, gen_VTST_3s)
--        if (u || is_scalar) {
+-
--            unallocated_encoding(s);
+ #define DO_3SAME_GVEC4(INSN, OPARRAY)                                   \
--            return;
+     static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
--        }
+                                 uint32_t rn_ofs, uint32_t rm_ofs,       \
-+    case 0x0c: /* SQDMULH */
+@@ -XXX,XX +XXX,XX @@ static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
-+    case 0x0d: /* SQRDMULH */
+     }
-         break;
+     return do_3same(s, a, gen_VMUL_p_3s);
--    case 0x1: /* FMLA */
+ }
--    case 0x5: /* FMLS */
+-
--        if (u) {
+-#define DO_3SAME_GVEC3_SHIFT(INSN, OPARRAY)                             \
--            unallocated_encoding(s);
+-    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
--            return;
+-                                uint32_t rn_ofs, uint32_t rm_ofs,       \
--        }
+-                                uint32_t oprsz, uint32_t maxsz)         \
--        /* fall through */
+-    {                                                                   \
--    case 0x9: /* FMUL, FMULX */
+-        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,                          \
-+    case 0x01: /* FMLA */
+-                       oprsz, maxsz, &OPARRAY[vece]);                   \
-+    case 0x05: /* FMLS */
+-    }                                                                   \
-+    case 0x09: /* FMUL */
+-    DO_3SAME(INSN, gen_##INSN##_3s)
-+    case 0x19: /* FMULX */
+-
-         if (size == 1) {
+-DO_3SAME_GVEC3_SHIFT(VSHL_S, sshl_op)
-             unallocated_encoding(s);
+-DO_3SAME_GVEC3_SHIFT(VSHL_U, ushl_op)
-             return;
+diff --git a/target/arm/translate.c b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/translate.c
-             read_vec_element(s, tcg_op, rn, pass, MO_64);
++++ b/target/arm/translate.c
+@@ -XXX,XX +XXX,XX @@ static void gen_cmtst_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
--            switch (opcode) {
+     tcg_gen_cmp_vec(TCG_COND_NE, vece, d, d, a);
--            case 0x5: /* FMLS */
+ }
-+            switch (16 * u + opcode) {
-+            case 0x05: /* FMLS */
+-static const TCGOpcode vecop_list_cmtst[] = { INDEX_op_cmp_vec, 0 };
-                 /* As usual for ARM, separate negation for fused multiply-add */
+-
-                 gen_helper_vfp_negd(tcg_op, tcg_op);
+-const GVecGen3 cmtst_op[4] = {
-                 /* fall through */
+-    { .fni4 = gen_helper_neon_tst_u8,
--            case 0x1: /* FMLA */
+-      .fniv = gen_cmtst_vec,
-+            case 0x01: /* FMLA */
+-      .opt_opc = vecop_list_cmtst,
-                 read_vec_element(s, tcg_res, rd, pass, MO_64);
+-      .vece = MO_8 },
-                 gen_helper_vfp_muladdd(tcg_res, tcg_op, tcg_idx, tcg_res, fpst);
+-    { .fni4 = gen_helper_neon_tst_u16,
-                 break;
+-      .fniv = gen_cmtst_vec,
--            case 0x9: /* FMUL, FMULX */
+-      .opt_opc = vecop_list_cmtst,
--                if (u) {
+-      .vece = MO_16 },
--                    gen_helper_vfp_mulxd(tcg_res, tcg_op, tcg_idx, fpst);
+-    { .fni4 = gen_cmtst_i32,
--                } else {
+-      .fniv = gen_cmtst_vec,
--                    gen_helper_vfp_muld(tcg_res, tcg_op, tcg_idx, fpst);
+-      .opt_opc = vecop_list_cmtst,
--                }
+-      .vece = MO_32 },
-+            case 0x09: /* FMUL */
+-    { .fni8 = gen_cmtst_i64,
-+                gen_helper_vfp_muld(tcg_res, tcg_op, tcg_idx, fpst);
+-      .fniv = gen_cmtst_vec,
-+                break;
+-      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-+            case 0x19: /* FMULX */
+-      .opt_opc = vecop_list_cmtst,
-+                gen_helper_vfp_mulxd(tcg_res, tcg_op, tcg_idx, fpst);
+-      .vece = MO_64 },
-                 break;
+-};
-             default:
++void gen_gvec_cmtst(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
-                 g_assert_not_reached();
++                    uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
-@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
++{
++    static const TCGOpcode vecop_list[] = { INDEX_op_cmp_vec, 0 };
-             read_vec_element_i32(s, tcg_op, rn, pass, is_scalar ? size : MO_32);
++    static const GVecGen3 ops[4] = {
++        { .fni4 = gen_helper_neon_tst_u8,
--            switch (opcode) {
++          .fniv = gen_cmtst_vec,
--            case 0x0: /* MLA */
++          .opt_opc = vecop_list,
--            case 0x4: /* MLS */
++          .vece = MO_8 },
--            case 0x8: /* MUL */
++        { .fni4 = gen_helper_neon_tst_u16,
-+            switch (16 * u + opcode) {
++          .fniv = gen_cmtst_vec,
-+            case 0x08: /* MUL */
++          .opt_opc = vecop_list,
-+            case 0x10: /* MLA */
++          .vece = MO_16 },
-+            case 0x14: /* MLS */
++        { .fni4 = gen_cmtst_i32,
-             {
++          .fniv = gen_cmtst_vec,
-                 static NeonGenTwoOpFn * const fns[2][2] = {
++          .opt_opc = vecop_list,
-                     { gen_helper_neon_add_u16, gen_helper_neon_sub_u16 },
++          .vece = MO_32 },
-@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
++        { .fni8 = gen_cmtst_i64,
-                 genfn(tcg_res, tcg_op, tcg_res);
++          .fniv = gen_cmtst_vec,
-                 break;
++          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-             }
++          .opt_opc = vecop_list,
--            case 0x5: /* FMLS */
++          .vece = MO_64 },
--            case 0x1: /* FMLA */
++    };
-+            case 0x05: /* FMLS */
++    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
-+            case 0x01: /* FMLA */
++}
-                 read_vec_element_i32(s, tcg_res, rd, pass,
-                                      is_scalar ? size : MO_32);
+ void gen_ushl_i32(TCGv_i32 dst, TCGv_i32 src, TCGv_i32 shift)
-                 switch (size) {
+ {
-@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static void gen_ushl_vec(unsigned vece, TCGv_vec dst,
-                     g_assert_not_reached();
+     tcg_temp_free_vec(rsh);
-                 }
+ }
-                 break;
--            case 0x9: /* FMUL, FMULX */
+-static const TCGOpcode ushl_list[] = {
-+            case 0x09: /* FMUL */
+-    INDEX_op_neg_vec, INDEX_op_shlv_vec,
-                 switch (size) {
+-    INDEX_op_shrv_vec, INDEX_op_cmp_vec, 0
-                 case 1:
+-};
--                    if (u) {
+-
--                        if (is_scalar) {
+-const GVecGen3 ushl_op[4] = {
--                            gen_helper_advsimd_mulxh(tcg_res, tcg_op,
+-    { .fniv = gen_ushl_vec,
--                                                     tcg_idx, fpst);
+-      .fno = gen_helper_gvec_ushl_b,
--                        } else {
+-      .opt_opc = ushl_list,
--                            gen_helper_advsimd_mulx2h(tcg_res, tcg_op,
+-      .vece = MO_8 },
--                                                      tcg_idx, fpst);
+-    { .fniv = gen_ushl_vec,
--                        }
+-      .fno = gen_helper_gvec_ushl_h,
-+                    if (is_scalar) {
+-      .opt_opc = ushl_list,
-+                        gen_helper_advsimd_mulh(tcg_res, tcg_op,
+-      .vece = MO_16 },
-+                                                tcg_idx, fpst);
+-    { .fni4 = gen_ushl_i32,
-                     } else {
+-      .fniv = gen_ushl_vec,
--                        if (is_scalar) {
+-      .opt_opc = ushl_list,
--                            gen_helper_advsimd_mulh(tcg_res, tcg_op,
+-      .vece = MO_32 },
--                                                    tcg_idx, fpst);
+-    { .fni8 = gen_ushl_i64,
--                        } else {
+-      .fniv = gen_ushl_vec,
--                            gen_helper_advsimd_mul2h(tcg_res, tcg_op,
+-      .opt_opc = ushl_list,
--                                                     tcg_idx, fpst);
+-      .vece = MO_64 },
--                        }
+-};
-+                        gen_helper_advsimd_mul2h(tcg_res, tcg_op,
++void gen_gvec_ushl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
-+                                                 tcg_idx, fpst);
++                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
-                     }
++{
-                     break;
++    static const TCGOpcode vecop_list[] = {
-                 case 2:
++        INDEX_op_neg_vec, INDEX_op_shlv_vec,
--                    if (u) {
++        INDEX_op_shrv_vec, INDEX_op_cmp_vec, 0
--                        gen_helper_vfp_mulxs(tcg_res, tcg_op, tcg_idx, fpst);
++    };
--                    } else {
++    static const GVecGen3 ops[4] = {
--                        gen_helper_vfp_muls(tcg_res, tcg_op, tcg_idx, fpst);
++        { .fniv = gen_ushl_vec,
--                    }
++          .fno = gen_helper_gvec_ushl_b,
-+                    gen_helper_vfp_muls(tcg_res, tcg_op, tcg_idx, fpst);
++          .opt_opc = vecop_list,
-                     break;
++          .vece = MO_8 },
-                 default:
++        { .fniv = gen_ushl_vec,
-                     g_assert_not_reached();
++          .fno = gen_helper_gvec_ushl_h,
-                 }
++          .opt_opc = vecop_list,
-                 break;
++          .vece = MO_16 },
--            case 0xc: /* SQDMULH */
++        { .fni4 = gen_ushl_i32,
-+            case 0x19: /* FMULX */
++          .fniv = gen_ushl_vec,
-+                switch (size) {
++          .opt_opc = vecop_list,
-+                case 1:
++          .vece = MO_32 },
-+                    if (is_scalar) {
++        { .fni8 = gen_ushl_i64,
-+                        gen_helper_advsimd_mulxh(tcg_res, tcg_op,
++          .fniv = gen_ushl_vec,
-+                                                 tcg_idx, fpst);
++          .opt_opc = vecop_list,
-+                    } else {
++          .vece = MO_64 },
-+                        gen_helper_advsimd_mulx2h(tcg_res, tcg_op,
++    };
-+                                                  tcg_idx, fpst);
++    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
-+                    }
++}
-+                    break;
-+                case 2:
+ void gen_sshl_i32(TCGv_i32 dst, TCGv_i32 src, TCGv_i32 shift)
-+                    gen_helper_vfp_mulxs(tcg_res, tcg_op, tcg_idx, fpst);
+ {
-+                    break;
+@@ -XXX,XX +XXX,XX @@ static void gen_sshl_vec(unsigned vece, TCGv_vec dst,
-+                default:
+     tcg_temp_free_vec(tmp);
-+                    g_assert_not_reached();
+ }
-+                }
-+                break;
+-static const TCGOpcode sshl_list[] = {
-+            case 0x0c: /* SQDMULH */
+-    INDEX_op_neg_vec, INDEX_op_umin_vec, INDEX_op_shlv_vec,
-                 if (size == 1) {
+-    INDEX_op_sarv_vec, INDEX_op_cmp_vec, INDEX_op_cmpsel_vec, 0
-                     gen_helper_neon_qdmulh_s16(tcg_res, cpu_env,
+-};
-                                                tcg_op, tcg_idx);
+-
-@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
+-const GVecGen3 sshl_op[4] = {
-                                                tcg_op, tcg_idx);
+-    { .fniv = gen_sshl_vec,
-                 }
+-      .fno = gen_helper_gvec_sshl_b,
-                 break;
+-      .opt_opc = sshl_list,
--            case 0xd: /* SQRDMULH */
+-      .vece = MO_8 },
-+            case 0x0d: /* SQRDMULH */
+-    { .fniv = gen_sshl_vec,
-                 if (size == 1) {
+-      .fno = gen_helper_gvec_sshl_h,
-                     gen_helper_neon_qrdmulh_s16(tcg_res, cpu_env,
+-      .opt_opc = sshl_list,
-                                                 tcg_op, tcg_idx);
+-      .vece = MO_16 },
 -    { .fni4 = gen_sshl_i32,
 -      .fniv = gen_sshl_vec,
 -      .opt_opc = sshl_list,
 -      .vece = MO_32 },
 -    { .fni8 = gen_sshl_i64,
 -      .fniv = gen_sshl_vec,
 -      .opt_opc = sshl_list,
 -      .vece = MO_64 },
 -};
 +void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = {
 +        INDEX_op_neg_vec, INDEX_op_umin_vec, INDEX_op_shlv_vec,
 +        INDEX_op_sarv_vec, INDEX_op_cmp_vec, INDEX_op_cmpsel_vec, 0
 +    };
 +    static const GVecGen3 ops[4] = {
 +        { .fniv = gen_sshl_vec,
 +          .fno = gen_helper_gvec_sshl_b,
 +          .opt_opc = vecop_list,
 +          .vece = MO_8 },
 +        { .fniv = gen_sshl_vec,
 +          .fno = gen_helper_gvec_sshl_h,
 +          .opt_opc = vecop_list,
 +          .vece = MO_16 },
 +        { .fni4 = gen_sshl_i32,
 +          .fniv = gen_sshl_vec,
 +          .opt_opc = vecop_list,
 +          .vece = MO_32 },
 +        { .fni8 = gen_sshl_i64,
 +          .fniv = gen_sshl_vec,
 +          .opt_opc = vecop_list,
 +          .vece = MO_64 },
 +    };
 +    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 +}
  static void gen_uqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
                            TCGv_vec a, TCGv_vec b)
 --
-.16.2
+.20.1
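
Note on the "functional interface" pattern used by the patch above: the
sketch below shows, in a much-simplified form, how an exported table of
per-element-size descriptors can be hidden behind a single function that
takes the element size as its first argument. It is not QEMU code.
ToyGen3, toy_expand_3 and toy_gvec_cmtst are hypothetical names invented
for this example, and each array slot holds one element rather than a
packed vector.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a GVecGen3 descriptor. */
typedef struct {
    uint64_t (*fn)(uint64_t a, uint64_t b); /* per-element operation */
    unsigned vece;                          /* log2 of element size in bytes */
} ToyGen3;

/* Mask covering one element of size (8 << vece) bits. */
static uint64_t elem_mask(unsigned vece)
{
    unsigned bits = 8u << vece;
    return bits == 64 ? UINT64_MAX : ((UINT64_C(1) << bits) - 1);
}

/* CMTST on one element: all ones if (a & b) != 0, else zero. */
static uint64_t tst_elem(uint64_t a, uint64_t b, unsigned vece)
{
    return (a & b & elem_mask(vece)) != 0 ? elem_mask(vece) : 0;
}

static uint64_t tst8(uint64_t a, uint64_t b)  { return tst_elem(a, b, 0); }
static uint64_t tst16(uint64_t a, uint64_t b) { return tst_elem(a, b, 1); }
static uint64_t tst32(uint64_t a, uint64_t b) { return tst_elem(a, b, 2); }
static uint64_t tst64(uint64_t a, uint64_t b) { return tst_elem(a, b, 3); }

/* Toy expander: apply the descriptor's operation to each element. */
static void toy_expand_3(uint64_t *d, const uint64_t *n, const uint64_t *m,
                         size_t count, const ToyGen3 *g)
{
    for (size_t i = 0; i < count; i++) {
        d[i] = g->fn(n[i], m[i]);
    }
}

/*
 * Functional interface: callers pass vece and never see the table,
 * which is now private to this function. vece must be in the range 0..3.
 */
void toy_gvec_cmtst(unsigned vece, uint64_t *d, const uint64_t *n,
                    const uint64_t *m, size_t count)
{
    static const ToyGen3 ops[4] = {
        { .fn = tst8,  .vece = 0 },
        { .fn = tst16, .vece = 1 },
        { .fn = tst32, .vece = 2 },
        { .fn = tst64, .vece = 3 },
    };
    toy_expand_3(d, n, m, count, &ops[vece]);
}
```

Because every such function has the same signature, a caller that takes
a function pointer can use it without knowing which table, if any, sits
behind it.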

-[Qemu-devel] [PULL 36/39] target/arm: Decode aa32 armv8.3 3-same
+[PULL 11/45] target/arm: Create gen_gvec_{uqadd, sqadd, uqsub, sqsub}
 From: Richard Henderson <richard.henderson@linaro.org>
+Provide a functional interface for the vector expansion.
+This fits better with the existing set of helpers that
+we provide for other operations.
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Message-id: 20200513163245.17915-11-richard.henderson@linaro.org
-Message-id: 20180228193125.20577-14-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate.c | 68 ++++++++++++++++++++++++++++++++++++++++++++++++++
+ target/arm/translate.h          |  13 +-
-file changed, 68 insertions(+)
+ target/arm/translate-a64.c      |  22 ++-
+ target/arm/translate-neon.inc.c |  19 +--
  target/arm/translate.c          | 228 +++++++++++++++++---------------
 files changed, 147 insertions(+), 135 deletions(-)
 diff --git a/target/arm/translate.h b/target/arm/translate.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.h
 +++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
  void gen_gvec_ushl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
                     uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 -extern const GVecGen4 uqadd_op[4];
 -extern const GVecGen4 sqadd_op[4];
 -extern const GVecGen4 uqsub_op[4];
 -extern const GVecGen4 sqsub_op[4];
  void gen_cmtst_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
  void gen_ushl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b);
  void gen_sshl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b);
  void gen_ushl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
  void gen_sshl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
 +void gen_gvec_uqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 +void gen_gvec_sqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 +void gen_gvec_uqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 +void gen_gvec_sqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 +
  void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
                     int64_t shift, uint32_t opr_sz, uint32_t max_sz);
  void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-a64.c
 +++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
      switch (opcode) {
      case 0x01: /* SQADD, UQADD */
 -        tcg_gen_gvec_4(vec_full_reg_offset(s, rd),
 -                       offsetof(CPUARMState, vfp.qc),
 -                       vec_full_reg_offset(s, rn),
 -                       vec_full_reg_offset(s, rm),
 -                       is_q ? 16 : 8, vec_full_reg_size(s),
 -                       (u ? uqadd_op : sqadd_op) + size);
 +        if (u) {
 +            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uqadd_qc, size);
 +        } else {
 +            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqadd_qc, size);
 +        }
          return;
      case 0x05: /* SQSUB, UQSUB */
 -        tcg_gen_gvec_4(vec_full_reg_offset(s, rd),
 -                       offsetof(CPUARMState, vfp.qc),
 -                       vec_full_reg_offset(s, rn),
 -                       vec_full_reg_offset(s, rm),
 -                       is_q ? 16 : 8, vec_full_reg_size(s),
 -                       (u ? uqsub_op : sqsub_op) + size);
 +        if (u) {
 +            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uqsub_qc, size);
 +        } else {
 +            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqsub_qc, size);
 +        }
          return;
      case 0x08: /* SSHL, USHL */
          if (u) {
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_3SAME(VORN, tcg_gen_gvec_orc)
  DO_3SAME(VEOR, tcg_gen_gvec_xor)
  DO_3SAME(VSHL_S, gen_gvec_sshl)
  DO_3SAME(VSHL_U, gen_gvec_ushl)
 +DO_3SAME(VQADD_S, gen_gvec_sqadd_qc)
 +DO_3SAME(VQADD_U, gen_gvec_uqadd_qc)
 +DO_3SAME(VQSUB_S, gen_gvec_sqsub_qc)
 +DO_3SAME(VQSUB_U, gen_gvec_uqsub_qc)
  /* These insns are all gvec_bitsel but with the inputs in various orders. */
  #define DO_3SAME_BITSEL(INSN, O1, O2, O3)                               \
@@ -XXX,XX +XXX,XX @@ DO_3SAME_CMP(VCGE_S, TCG_COND_GE)
  DO_3SAME_CMP(VCGE_U, TCG_COND_GEU)
  DO_3SAME_CMP(VCEQ, TCG_COND_EQ)
 -#define DO_3SAME_GVEC4(INSN, OPARRAY)                                   \
 -    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
 -                                uint32_t rn_ofs, uint32_t rm_ofs,       \
 -                                uint32_t oprsz, uint32_t maxsz)         \
 -    {                                                                   \
 -        tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),           \
 -                       rn_ofs, rm_ofs, oprsz, maxsz, &OPARRAY[vece]);   \
 -    }                                                                   \
 -    DO_3SAME(INSN, gen_##INSN##_3s)
 -
 -DO_3SAME_GVEC4(VQADD_S, sqadd_op)
 -DO_3SAME_GVEC4(VQADD_U, uqadd_op)
 -DO_3SAME_GVEC4(VQSUB_S, sqsub_op)
 -DO_3SAME_GVEC4(VQSUB_U, uqsub_op)
 -
  static void gen_VMUL_p_3s(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
                             uint32_t rm_ofs, uint32_t oprsz, uint32_t maxsz)
  {
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static void gen_uqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
-     return 0;
+     tcg_temp_free_vec(x);
  }
-+/* Advanced SIMD three registers of the same length extension.
+-static const TCGOpcode vecop_list_uqadd[] = {
-+ *  31           25    23  22    20   16   12  11   10   9    8        3     0
+-    INDEX_op_usadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0
-+ * +---------------+-----+---+-----+----+----+---+----+---+----+---------+----+
+-};
-+ * | 1 1 1 1 1 1 0 | op1 | D | op2 | Vn | Vd | 1 | o3 | 0 | o4 | N Q M U | Vm |
+-
-+ * +---------------+-----+---+-----+----+----+---+----+---+----+---------+----+
+-const GVecGen4 uqadd_op[4] = {
-+ */
+-    { .fniv = gen_uqadd_vec,
-+static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
+-      .fno = gen_helper_gvec_uqadd_b,
 -      .write_aofs = true,
 -      .opt_opc = vecop_list_uqadd,
 -      .vece = MO_8 },
 -    { .fniv = gen_uqadd_vec,
 -      .fno = gen_helper_gvec_uqadd_h,
 -      .write_aofs = true,
 -      .opt_opc = vecop_list_uqadd,
 -      .vece = MO_16 },
 -    { .fniv = gen_uqadd_vec,
 -      .fno = gen_helper_gvec_uqadd_s,
 -      .write_aofs = true,
 -      .opt_opc = vecop_list_uqadd,
 -      .vece = MO_32 },
 -    { .fniv = gen_uqadd_vec,
 -      .fno = gen_helper_gvec_uqadd_d,
 -      .write_aofs = true,
 -      .opt_opc = vecop_list_uqadd,
 -      .vece = MO_64 },
 -};
 +void gen_gvec_uqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
 +{
-+    gen_helper_gvec_3_ptr *fn_gvec_ptr;
++    static const TCGOpcode vecop_list[] = {
-+    int rd, rn, rm, rot, size, opr_sz;
++        INDEX_op_usadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0
-+    TCGv_ptr fpst;
++    };
-+    bool q;
++    static const GVecGen4 ops[4] = {
-+
++        { .fniv = gen_uqadd_vec,
-+    q = extract32(insn, 6, 1);
++          .fno = gen_helper_gvec_uqadd_b,
-+    VFP_DREG_D(rd, insn);
++          .write_aofs = true,
-+    VFP_DREG_N(rn, insn);
++          .opt_opc = vecop_list,
-+    VFP_DREG_M(rm, insn);
++          .vece = MO_8 },
-+    if ((rd | rn | rm) & q) {
++        { .fniv = gen_uqadd_vec,
-+        return 1;
++          .fno = gen_helper_gvec_uqadd_h,
-+    }
++          .write_aofs = true,
-+
++          .opt_opc = vecop_list,
-+    if ((insn & 0xfe200f10) == 0xfc200800) {
++          .vece = MO_16 },
-+        /* VCMLA -- 1111 110R R.1S .... .... 1000 ...0 .... */
++        { .fniv = gen_uqadd_vec,
-+        size = extract32(insn, 20, 1);
++          .fno = gen_helper_gvec_uqadd_s,
-+        rot = extract32(insn, 23, 2);
++          .write_aofs = true,
-+        if (!arm_dc_feature(s, ARM_FEATURE_V8_FCMA)
++          .opt_opc = vecop_list,
-+            || (!size && !arm_dc_feature(s, ARM_FEATURE_V8_FP16))) {
++          .vece = MO_32 },
-+            return 1;
++        { .fniv = gen_uqadd_vec,
-+        }
++          .fno = gen_helper_gvec_uqadd_d,
-+        fn_gvec_ptr = size ? gen_helper_gvec_fcmlas : gen_helper_gvec_fcmlah;
++          .write_aofs = true,
-+    } else if ((insn & 0xfea00f10) == 0xfc800800) {
++          .opt_opc = vecop_list,
-+        /* VCADD -- 1111 110R 1.0S .... .... 1000 ...0 .... */
++          .vece = MO_64 },
-+        size = extract32(insn, 20, 1);
++    };
-+        rot = extract32(insn, 24, 1);
++    tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
-+        if (!arm_dc_feature(s, ARM_FEATURE_V8_FCMA)
++                   rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 +            || (!size && !arm_dc_feature(s, ARM_FEATURE_V8_FP16))) {
 +            return 1;
 +        }
 +        fn_gvec_ptr = size ? gen_helper_gvec_fcadds : gen_helper_gvec_fcaddh;
 +    } else {
 +        return 1;
 +    }
 +
 +    if (s->fp_excp_el) {
 +        gen_exception_insn(s, 4, EXCP_UDEF,
 +                           syn_fp_access_trap(1, 0xe, false), s->fp_excp_el);
 +        return 0;
 +    }
 +    if (!s->vfp_enabled) {
 +        return 1;
 +    }
 +
 +    opr_sz = (1 + q) * 8;
 +    fpst = get_fpstatus_ptr(1);
 +    tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd),
 +                       vfp_reg_offset(1, rn),
 +                       vfp_reg_offset(1, rm), fpst,
 +                       opr_sz, opr_sz, rot, fn_gvec_ptr);
 +    tcg_temp_free_ptr(fpst);
 +    return 0;
 +}
-+
- static int disas_coproc_insn(DisasContext *s, uint32_t insn)
+ static void gen_sqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
- {
+                           TCGv_vec a, TCGv_vec b)
-     int cpnum, is64, crn, crm, opc1, opc2, isread, rt, rt2;
+@@ -XXX,XX +XXX,XX @@ static void gen_sqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
-@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
+     tcg_temp_free_vec(x);
-                     }
+ }
-                 }
-             }
+-static const TCGOpcode vecop_list_sqadd[] = {
-+        } else if ((insn & 0x0e000a00) == 0x0c000800
+-    INDEX_op_ssadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0
-+                   && arm_dc_feature(s, ARM_FEATURE_V8)) {
+-};
-+            if (disas_neon_insn_3same_ext(s, insn)) {
+-
-+                goto illegal_op;
+-const GVecGen4 sqadd_op[4] = {
-+            }
+-    { .fniv = gen_sqadd_vec,
-+            return;
+-      .fno = gen_helper_gvec_sqadd_b,
-         } else if ((insn & 0x0fe00000) == 0x0c400000) {
+-      .opt_opc = vecop_list_sqadd,
-             /* Coprocessor double register transfer.  */
+-      .write_aofs = true,
-             ARCH(5TE);
+-      .vece = MO_8 },
 -    { .fniv = gen_sqadd_vec,
 -      .fno = gen_helper_gvec_sqadd_h,
 -      .opt_opc = vecop_list_sqadd,
 -      .write_aofs = true,
 -      .vece = MO_16 },
 -    { .fniv = gen_sqadd_vec,
 -      .fno = gen_helper_gvec_sqadd_s,
 -      .opt_opc = vecop_list_sqadd,
 -      .write_aofs = true,
 -      .vece = MO_32 },
 -    { .fniv = gen_sqadd_vec,
 -      .fno = gen_helper_gvec_sqadd_d,
 -      .opt_opc = vecop_list_sqadd,
 -      .write_aofs = true,
 -      .vece = MO_64 },
 -};
 +void gen_gvec_sqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = {
 +        INDEX_op_ssadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0
 +    };
 +    static const GVecGen4 ops[4] = {
 +        { .fniv = gen_sqadd_vec,
 +          .fno = gen_helper_gvec_sqadd_b,
 +          .opt_opc = vecop_list,
 +          .write_aofs = true,
 +          .vece = MO_8 },
 +        { .fniv = gen_sqadd_vec,
 +          .fno = gen_helper_gvec_sqadd_h,
 +          .opt_opc = vecop_list,
 +          .write_aofs = true,
 +          .vece = MO_16 },
 +        { .fniv = gen_sqadd_vec,
 +          .fno = gen_helper_gvec_sqadd_s,
 +          .opt_opc = vecop_list,
 +          .write_aofs = true,
 +          .vece = MO_32 },
 +        { .fniv = gen_sqadd_vec,
 +          .fno = gen_helper_gvec_sqadd_d,
 +          .opt_opc = vecop_list,
 +          .write_aofs = true,
 +          .vece = MO_64 },
 +    };
 +    tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
 +                   rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 +}
  static void gen_uqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
                            TCGv_vec a, TCGv_vec b)
@@ -XXX,XX +XXX,XX @@ static void gen_uqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
      tcg_temp_free_vec(x);
  }
 -static const TCGOpcode vecop_list_uqsub[] = {
 -    INDEX_op_ussub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0
 -};
 -
 -const GVecGen4 uqsub_op[4] = {
 -    { .fniv = gen_uqsub_vec,
 -      .fno = gen_helper_gvec_uqsub_b,
 -      .opt_opc = vecop_list_uqsub,
 -      .write_aofs = true,
 -      .vece = MO_8 },
 -    { .fniv = gen_uqsub_vec,
 -      .fno = gen_helper_gvec_uqsub_h,
 -      .opt_opc = vecop_list_uqsub,
 -      .write_aofs = true,
 -      .vece = MO_16 },
 -    { .fniv = gen_uqsub_vec,
 -      .fno = gen_helper_gvec_uqsub_s,
 -      .opt_opc = vecop_list_uqsub,
 -      .write_aofs = true,
 -      .vece = MO_32 },
 -    { .fniv = gen_uqsub_vec,
 -      .fno = gen_helper_gvec_uqsub_d,
 -      .opt_opc = vecop_list_uqsub,
 -      .write_aofs = true,
 -      .vece = MO_64 },
 -};
 +void gen_gvec_uqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = {
 +        INDEX_op_ussub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0
 +    };
 +    static const GVecGen4 ops[4] = {
 +        { .fniv = gen_uqsub_vec,
 +          .fno = gen_helper_gvec_uqsub_b,
 +          .opt_opc = vecop_list,
 +          .write_aofs = true,
 +          .vece = MO_8 },
 +        { .fniv = gen_uqsub_vec,
 +          .fno = gen_helper_gvec_uqsub_h,
 +          .opt_opc = vecop_list,
 +          .write_aofs = true,
 +          .vece = MO_16 },
 +        { .fniv = gen_uqsub_vec,
 +          .fno = gen_helper_gvec_uqsub_s,
 +          .opt_opc = vecop_list,
 +          .write_aofs = true,
 +          .vece = MO_32 },
 +        { .fniv = gen_uqsub_vec,
 +          .fno = gen_helper_gvec_uqsub_d,
 +          .opt_opc = vecop_list,
 +          .write_aofs = true,
 +          .vece = MO_64 },
 +    };
 +    tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
 +                   rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 +}
  static void gen_sqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
                            TCGv_vec a, TCGv_vec b)
@@ -XXX,XX +XXX,XX @@ static void gen_sqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
      tcg_temp_free_vec(x);
  }
 -static const TCGOpcode vecop_list_sqsub[] = {
 -    INDEX_op_sssub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0
 -};
 -
 -const GVecGen4 sqsub_op[4] = {
 -    { .fniv = gen_sqsub_vec,
 -      .fno = gen_helper_gvec_sqsub_b,
 -      .opt_opc = vecop_list_sqsub,
 -      .write_aofs = true,
 -      .vece = MO_8 },
 -    { .fniv = gen_sqsub_vec,
 -      .fno = gen_helper_gvec_sqsub_h,
 -      .opt_opc = vecop_list_sqsub,
 -      .write_aofs = true,
 -      .vece = MO_16 },
 -    { .fniv = gen_sqsub_vec,
 -      .fno = gen_helper_gvec_sqsub_s,
 -      .opt_opc = vecop_list_sqsub,
 -      .write_aofs = true,
 -      .vece = MO_32 },
 -    { .fniv = gen_sqsub_vec,
 -      .fno = gen_helper_gvec_sqsub_d,
 -      .opt_opc = vecop_list_sqsub,
 -      .write_aofs = true,
 -      .vece = MO_64 },
 -};
 +void gen_gvec_sqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = {
 +        INDEX_op_sssub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0
 +    };
 +    static const GVecGen4 ops[4] = {
 +        { .fniv = gen_sqsub_vec,
 +          .fno = gen_helper_gvec_sqsub_b,
 +          .opt_opc = vecop_list,
 +          .write_aofs = true,
 +          .vece = MO_8 },
 +        { .fniv = gen_sqsub_vec,
 +          .fno = gen_helper_gvec_sqsub_h,
 +          .opt_opc = vecop_list,
 +          .write_aofs = true,
 +          .vece = MO_16 },
 +        { .fniv = gen_sqsub_vec,
 +          .fno = gen_helper_gvec_sqsub_s,
 +          .opt_opc = vecop_list,
 +          .write_aofs = true,
 +          .vece = MO_32 },
 +        { .fniv = gen_sqsub_vec,
 +          .fno = gen_helper_gvec_sqsub_d,
 +          .opt_opc = vecop_list,
 +          .write_aofs = true,
 +          .vece = MO_64 },
 +    };
 +    tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
 +                   rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 +}
  /* Translate a NEON data processing instruction.  Return nonzero if the
     instruction is invalid.
 --
-.16.2
+.20.1
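
Editorial aside, not part of either series: the patch above stops exporting
the per-element-size tables (uqadd_op[] and friends) and instead exposes one
function per operation, with the table kept as a function-local static and
selected by vece. The following is a minimal standalone sketch of that
pattern in plain C, assuming nothing from QEMU or TCG. Every name in it
(sat_add_fn, uqadd_u8, uqadd_u16, vec_uqadd_qc) is hypothetical, and it
models only an unsigned saturating add with a sticky saturation flag for
8-bit and 16-bit elements.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical worker type: d = saturate(a + b) over `bytes` bytes.
 * Sets *qc to 1 if any element saturated; never clears it (sticky). */
typedef void sat_add_fn(void *d, const void *a, const void *b,
                        size_t bytes, uint32_t *qc);

static void uqadd_u8(void *d, const void *a, const void *b,
                     size_t bytes, uint32_t *qc)
{
    uint8_t *pd = d;
    const uint8_t *pa = a, *pb = b;

    for (size_t i = 0; i < bytes; i++) {
        unsigned sum = pa[i] + pb[i];
        if (sum > UINT8_MAX) {
            sum = UINT8_MAX;
            *qc = 1;
        }
        pd[i] = (uint8_t)sum;
    }
}

static void uqadd_u16(void *d, const void *a, const void *b,
                      size_t bytes, uint32_t *qc)
{
    /* memcpy avoids any alignment assumption on the void pointers. */
    for (size_t i = 0; i + 2 <= bytes; i += 2) {
        uint16_t x, y;
        memcpy(&x, (const char *)a + i, 2);
        memcpy(&y, (const char *)b + i, 2);
        uint32_t sum = (uint32_t)x + y;
        if (sum > UINT16_MAX) {
            sum = UINT16_MAX;
            *qc = 1;
        }
        x = (uint16_t)sum;
        memcpy((char *)d + i, &x, 2);
    }
}

/* Single entry point with a uniform signature. The dispatch table is
 * private to the function, so callers cannot index it incorrectly.
 * vece: 0 = 8-bit elements, 1 = 16-bit elements (an assumption of this
 * sketch). Returns 0 on success, -1 for an unsupported element size. */
int vec_uqadd_qc(unsigned vece, void *d, const void *a, const void *b,
                 size_t bytes, uint32_t *qc)
{
    static sat_add_fn * const fns[2] = { uqadd_u8, uqadd_u16 };

    assert(qc != NULL);
    if (vece >= 2) {
        return -1;
    }
    fns[vece](d, a, b, bytes, qc);
    return 0;
}
```

A caller passes the element size and buffers and never touches the table,
which is the same property that lets the AArch64 and AArch32 translators
share gen_gvec_uqadd_qc() in the patch.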

-[Qemu-devel] [PULL 26/39] target/arm: Refactor disas_simd_indexed size checks
+[PULL 12/45] target/arm: Remove fp_status from helper_{recpe, rsqrte}_u32
 From: Richard Henderson <richard.henderson@linaro.org>
-The integer size check was already outside of the opcode switch;
+These operations do not touch fp_status.
 move the floating-point size check outside as well.  Unify the
 size vs index adjustment between fp and integer paths.
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Message-id: 20200513163245.17915-12-richard.henderson@linaro.org
 Message-id: 20180228193125.20577-4-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate-a64.c | 65 +++++++++++++++++++++++-----------------------
+ target/arm/helper.h        |  4 ++--
-file changed, 32 insertions(+), 33 deletions(-)
+ target/arm/translate-a64.c |  5 ++---
  target/arm/translate.c     | 12 ++----------
  target/arm/vfp_helper.c    |  5 ++---
 files changed, 8 insertions(+), 18 deletions(-)
+diff --git a/target/arm/helper.h b/target/arm/helper.h
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/helper.h
++++ b/target/arm/helper.h
+@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_2(recpe_f64, TCG_CALL_NO_RWG, f64, f64, ptr)
+ DEF_HELPER_FLAGS_2(rsqrte_f16, TCG_CALL_NO_RWG, f16, f16, ptr)
+ DEF_HELPER_FLAGS_2(rsqrte_f32, TCG_CALL_NO_RWG, f32, f32, ptr)
+ DEF_HELPER_FLAGS_2(rsqrte_f64, TCG_CALL_NO_RWG, f64, f64, ptr)
+-DEF_HELPER_2(recpe_u32, i32, i32, ptr)
+-DEF_HELPER_FLAGS_2(rsqrte_u32, TCG_CALL_NO_RWG, i32, i32, ptr)
++DEF_HELPER_FLAGS_1(recpe_u32, TCG_CALL_NO_RWG, i32, i32)
++DEF_HELPER_FLAGS_1(rsqrte_u32, TCG_CALL_NO_RWG, i32, i32)
+ DEF_HELPER_FLAGS_4(neon_tbl, TCG_CALL_NO_RWG, i32, i32, i32, ptr, i32)
+ DEF_HELPER_3(shl_cc, i32, env, i32, i32)
 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-a64.c
 +++ b/target/arm/translate-a64.c
-@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static void handle_2misc_reciprocal(DisasContext *s, int opcode,
-     case 0x05: /* FMLS */
-     case 0x09: /* FMUL */
+             switch (opcode) {
-     case 0x19: /* FMULX */
+             case 0x3c: /* URECPE */
--        if (size == 1) {
+-                gen_helper_recpe_u32(tcg_res, tcg_op, fpst);
--            unallocated_encoding(s);
++                gen_helper_recpe_u32(tcg_res, tcg_op);
--            return;
+                 break;
--        }
+             case 0x3d: /* FRECPE */
-         is_fp = true;
+                 gen_helper_recpe_f32(tcg_res, tcg_op, fpst);
-         break;
+@@ -XXX,XX +XXX,XX @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn)
      default:
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
      if (is_fp) {
          /* convert insn encoded size to TCGMemOp size */
          switch (size) {
 -        case 2: /* single precision */
 -            size = MO_32;
 -            index = h << 1 | l;
 -            rm |= (m << 4);
 -            break;
 -        case 3: /* double precision */
 -            size = MO_64;
 -            if (l || !is_q) {
 +        case 0: /* half-precision */
 +            if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
                  unallocated_encoding(s);
                  return;
              }
--            index = h;
+-            need_fpstatus = true;
 -            rm |= (m << 4);
 -            break;
 -        case 0: /* half precision */
              size = MO_16;
 -            index = h << 2 | l << 1 | m;
 -            is_fp16 = true;
 -            if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
 -                break;
 -            }
 -            /* fallthru */
 -        default: /* unallocated */
 -            unallocated_encoding(s);
 -            return;
 -        }
 -    } else {
 -        switch (size) {
 -        case 1:
 -            index = h << 2 | l << 1 | m;
              break;
--        case 2:
+         case 0x1e: /* FRINT32Z */
--            index = h << 1 | l;
+         case 0x1f: /* FRINT64Z */
--            rm |= (m << 4);
+@@ -XXX,XX +XXX,XX @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn)
-+        case MO_32: /* single precision */
+                     gen_helper_rints_exact(tcg_res, tcg_op, tcg_fpstatus);
-+        case MO_64: /* double precision */
+                     break;
-             break;
+                 case 0x7c: /* URSQRTE */
-         default:
+-                    gen_helper_rsqrte_u32(tcg_res, tcg_op, tcg_fpstatus);
-             unallocated_encoding(s);
++                    gen_helper_rsqrte_u32(tcg_res, tcg_op);
-             return;
+                     break;
-         }
+                 case 0x1e: /* FRINT32Z */
-+    } else {
+                 case 0x5e: /* FRINT32X */
-+        switch (size) {
+diff --git a/target/arm/translate.c b/target/arm/translate.c
-+        case MO_8:
+index XXXXXXX..XXXXXXX 100644
-+        case MO_64:
+--- a/target/arm/translate.c
-+            unallocated_encoding(s);
++++ b/target/arm/translate.c
-+            return;
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-+        }
+                             break;
-+    }
+                         }
-+
+                         case NEON_2RM_VRECPE:
-+    /* Given TCGMemOp size, adjust register and indexing.  */
+-                        {
-+    switch (size) {
+-                            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
-+    case MO_16:
+-                            gen_helper_recpe_u32(tmp, tmp, fpstatus);
-+        index = h << 2 | l << 1 | m;
+-                            tcg_temp_free_ptr(fpstatus);
-+        break;
++                            gen_helper_recpe_u32(tmp, tmp);
-+    case MO_32:
+                             break;
-+        index = h << 1 | l;
+-                        }
-+        rm |= m << 4;
+                         case NEON_2RM_VRSQRTE:
-+        break;
+-                        {
-+    case MO_64:
+-                            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
-+        if (l || !is_q) {
+-                            gen_helper_rsqrte_u32(tmp, tmp, fpstatus);
-+            unallocated_encoding(s);
+-                            tcg_temp_free_ptr(fpstatus);
-+            return;
++                            gen_helper_rsqrte_u32(tmp, tmp);
-+        }
+                             break;
-+        index = h;
+-                        }
-+        rm |= m << 4;
+                         case NEON_2RM_VRECPE_F:
-+        break;
+                         {
-+    default:
+                             TCGv_ptr fpstatus = get_fpstatus_ptr(1);
-+        g_assert_not_reached();
+diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c
-     }
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/vfp_helper.c
-     if (!fp_access_check(s)) {
++++ b/target/arm/vfp_helper.c
@@ -XXX,XX +XXX,XX @@ float64 HELPER(rsqrte_f64)(float64 input, void *fpstp)
      return make_float64(val);
  }
 -uint32_t HELPER(recpe_u32)(uint32_t a, void *fpstp)
 +uint32_t HELPER(recpe_u32)(uint32_t a)
  {
 -    /* float_status *s = fpstp; */
      int input, estimate;
      if ((a & 0x80000000) == 0) {
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(recpe_u32)(uint32_t a, void *fpstp)
      return deposit32(0, (32 - 9), 9, estimate);
  }
 -uint32_t HELPER(rsqrte_u32)(uint32_t a, void *fpstp)
 +uint32_t HELPER(rsqrte_u32)(uint32_t a)
  {
      int estimate;
 --
-.16.2
+.20.1

-[Qemu-devel] [PULL 24/39] target/arm: Add ARM_FEATURE_V8_RDM
+[PULL 13/45] target/arm: Create gen_gvec_{qrdmla,qrdmls}
 From: Richard Henderson <richard.henderson@linaro.org>
-Not enabled anywhere yet.
+Provide a functional interface for the vector expansion.
 This fits better with the existing set of helpers that
 we provide for other operations.
-Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20180228193125.20577-2-richard.henderson@linaro.org
+Message-id: 20200513163245.17915-13-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/cpu.h     | 1 +
+ target/arm/translate.h     |  5 ++++
- linux-user/elfload.c | 1 +
+ target/arm/translate-a64.c | 34 ++----------------------
-files changed, 2 insertions(+)
+ target/arm/translate.c     | 54 +++++++++++++++++++-------------------
 files changed, 34 insertions(+), 59 deletions(-)
-diff --git a/target/arm/cpu.h b/target/arm/cpu.h
+diff --git a/target/arm/translate.h b/target/arm/translate.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.h
+--- a/target/arm/translate.h
-+++ b/target/arm/cpu.h
++++ b/target/arm/translate.h
-@@ -XXX,XX +XXX,XX @@ enum arm_features {
+@@ -XXX,XX +XXX,XX @@ void gen_gvec_sri(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
-     ARM_FEATURE_V8_SHA3, /* implements SHA3 part of v8 Crypto Extensions */
+ void gen_gvec_sli(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
-     ARM_FEATURE_V8_SM3, /* implements SM3 part of v8 Crypto Extensions */
+                   int64_t shift, uint32_t opr_sz, uint32_t max_sz);
-     ARM_FEATURE_V8_SM4, /* implements SM4 part of v8 Crypto Extensions */
-+    ARM_FEATURE_V8_RDM, /* implements v8.1 simd round multiply */
++void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
-     ARM_FEATURE_V8_FP16, /* implements v8.2 half-precision float */
++                          uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 +void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                          uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 +
  /*
   * Forward to the isar_feature_* tests given a DisasContext pointer.
   */
 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-a64.c
 +++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void gen_gvec_op3_ool(DisasContext *s, bool is_q, int rd,
                         is_q ? 16 : 8, vec_full_reg_size(s), data, fn);
  }
 -/* Expand a 3-operand + env pointer operation using
 - * an out-of-line helper.
 - */
 -static void gen_gvec_op3_env(DisasContext *s, bool is_q, int rd,
 -                             int rn, int rm, gen_helper_gvec_3_ptr *fn)
 -{
 -    tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, rd),
 -                       vec_full_reg_offset(s, rn),
 -                       vec_full_reg_offset(s, rm), cpu_env,
 -                       is_q ? 16 : 8, vec_full_reg_size(s), 0, fn);
 -}
 -
  /* Expand a 3-operand + fpstatus pointer + simd data value operation using
   * an out-of-line helper.
   */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn)
      switch (opcode) {
      case 0x0: /* SQRDMLAH (vector) */
 -        switch (size) {
 -        case 1:
 -            gen_gvec_op3_env(s, is_q, rd, rn, rm, gen_helper_gvec_qrdmlah_s16);
 -            break;
 -        case 2:
 -            gen_gvec_op3_env(s, is_q, rd, rn, rm, gen_helper_gvec_qrdmlah_s32);
 -            break;
 -        default:
 -            g_assert_not_reached();
 -        }
 +        gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqrdmlah_qc, size);
          return;
      case 0x1: /* SQRDMLSH (vector) */
 -        switch (size) {
 -        case 1:
 -            gen_gvec_op3_env(s, is_q, rd, rn, rm, gen_helper_gvec_qrdmlsh_s16);
 -            break;
 -        case 2:
 -            gen_gvec_op3_env(s, is_q, rd, rn, rm, gen_helper_gvec_qrdmlsh_s32);
 -            break;
 -        default:
 -            g_assert_not_reached();
 -        }
 +        gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqrdmlsh_qc, size);
          return;
      case 0x2: /* SDOT / UDOT */
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static const uint8_t neon_2rm_sizes[] = {
      [NEON_2RM_VCVT_UF] = 0x4,
  };
-diff --git a/linux-user/elfload.c b/linux-user/elfload.c
+-
-index XXXXXXX..XXXXXXX 100644
+-/* Expand v8.1 simd helper.  */
---- a/linux-user/elfload.c
+-static int do_v81_helper(DisasContext *s, gen_helper_gvec_3_ptr *fn,
-+++ b/linux-user/elfload.c
+-                         int q, int rd, int rn, int rm)
-@@ -XXX,XX +XXX,XX @@ static uint32_t get_elf_hwcap(void)
++void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
-     GET_FEATURE(ARM_FEATURE_V8_SHA512, ARM_HWCAP_A64_SHA512);
++                          uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
-     GET_FEATURE(ARM_FEATURE_V8_FP16,
+ {
-                 ARM_HWCAP_A64_FPHP | ARM_HWCAP_A64_ASIMDHP);
+-    if (dc_isar_feature(aa32_rdm, s)) {
-+    GET_FEATURE(ARM_FEATURE_V8_RDM, ARM_HWCAP_A64_ASIMDRDM);
+-        int opr_sz = (1 + q) * 8;
- #undef GET_FEATURE
+-        tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd),
+-                           vfp_reg_offset(1, rn),
-     return hwcaps;
+-                           vfp_reg_offset(1, rm), cpu_env,
 -                           opr_sz, opr_sz, 0, fn);
 -        return 0;
 -    }
 -    return 1;
 +    static gen_helper_gvec_3_ptr * const fns[2] = {
 +        gen_helper_gvec_qrdmlah_s16, gen_helper_gvec_qrdmlah_s32
 +    };
 +    tcg_debug_assert(vece >= 1 && vece <= 2);
 +    tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, cpu_env,
 +                       opr_sz, max_sz, 0, fns[vece - 1]);
 +}
 +
 +void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                          uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static gen_helper_gvec_3_ptr * const fns[2] = {
 +        gen_helper_gvec_qrdmlsh_s16, gen_helper_gvec_qrdmlsh_s32
 +    };
 +    tcg_debug_assert(vece >= 1 && vece <= 2);
 +    tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, cpu_env,
 +                       opr_sz, max_sz, 0, fns[vece - 1]);
  }
  #define GEN_CMP0(NAME, COND)                                            \
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                  break;  /* VPADD */
              }
              /* VQRDMLAH */
 -            switch (size) {
 -            case 1:
 -                return do_v81_helper(s, gen_helper_gvec_qrdmlah_s16,
 -                                     q, rd, rn, rm);
 -            case 2:
 -                return do_v81_helper(s, gen_helper_gvec_qrdmlah_s32,
 -                                     q, rd, rn, rm);
 +            if (dc_isar_feature(aa32_rdm, s) && (size == 1 || size == 2)) {
 +                gen_gvec_sqrdmlah_qc(size, rd_ofs, rn_ofs, rm_ofs,
 +                                     vec_size, vec_size);
 +                return 0;
              }
              return 1;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                  break;
              }
              /* VQRDMLSH */
 -            switch (size) {
 -            case 1:
 -                return do_v81_helper(s, gen_helper_gvec_qrdmlsh_s16,
 -                                     q, rd, rn, rm);
 -            case 2:
 -                return do_v81_helper(s, gen_helper_gvec_qrdmlsh_s32,
 -                                     q, rd, rn, rm);
 +            if (dc_isar_feature(aa32_rdm, s) && (size == 1 || size == 2)) {
 +                gen_gvec_sqrdmlsh_qc(size, rd_ofs, rn_ofs, rm_ofs,
 +                                     vec_size, vec_size);
 +                return 0;
              }
              return 1;
 --
-.16.2
+.20.1

-[Qemu-devel] [PULL 37/39] target/arm: Decode aa32 armv8.3 2-reg-index
+[PULL 14/45] target/arm: Pass pointer to qc to qrdmla/qrdmls
 From: Richard Henderson <richard.henderson@linaro.org>
+Pass a pointer directly to env->vfp.qc[0], rather than env.
+This will allow SVE2, which does not modify QC, to pass a
+pointer to dummy storage.
+Change the return type of inl_qrdml.h_s16 to match the
+sense of the operation: signed.
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20180228193125.20577-15-richard.henderson@linaro.org
+Message-id: 20200513163245.17915-14-richard.henderson@linaro.org
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate.c | 61 ++++++++++++++++++++++++++++++++++++++++++++++++++
+ target/arm/translate.c  | 18 ++++++++---
-file changed, 61 insertions(+)
+ target/arm/vec_helper.c | 70 +++++++++++++++++++++++------------------
 files changed, 54 insertions(+), 34 deletions(-)
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static const uint8_t neon_2rm_sizes[] = {
-     return 0;
+     [NEON_2RM_VCVT_UF] = 0x4,
- }
+ };
-+/* Advanced SIMD two registers and a scalar extension.
++static void gen_gvec_fn3_qc(uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs,
-+ *  31             24   23  22   20   16   12  11   10   9    8        3     0
++                            uint32_t opr_sz, uint32_t max_sz,
-+ * +-----------------+----+---+----+----+----+---+----+---+----+---------+----+
++                            gen_helper_gvec_3_ptr *fn)
 + * | 1 1 1 1 1 1 1 0 | o1 | D | o2 | Vn | Vd | 1 | o3 | 0 | o4 | N Q M U | Vm |
 + * +-----------------+----+---+----+----+----+---+----+---+----+---------+----+
 + *
 + */
 +
 +static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
 +{
-+    int rd, rn, rm, rot, size, opr_sz;
++    TCGv_ptr qc_ptr = tcg_temp_new_ptr();
-+    TCGv_ptr fpst;
++
-+    bool q;
++    tcg_gen_addi_ptr(qc_ptr, cpu_env, offsetof(CPUARMState, vfp.qc));
-+
++    tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, qc_ptr,
-+    q = extract32(insn, 6, 1);
++                       opr_sz, max_sz, 0, fn);
-+    VFP_DREG_D(rd, insn);
++    tcg_temp_free_ptr(qc_ptr);
 +    VFP_DREG_N(rn, insn);
 +    VFP_DREG_M(rm, insn);
 +    if ((rd | rn) & q) {
 +        return 1;
 +    }
 +
 +    if ((insn & 0xff000f10) == 0xfe000800) {
 +        /* VCMLA (indexed) -- 1111 1110 S.RR .... .... 1000 ...0 .... */
 +        rot = extract32(insn, 20, 2);
 +        size = extract32(insn, 23, 1);
 +        if (!arm_dc_feature(s, ARM_FEATURE_V8_FCMA)
 +            || (!size && !arm_dc_feature(s, ARM_FEATURE_V8_FP16))) {
 +            return 1;
 +        }
 +    } else {
 +        return 1;
 +    }
 +
 +    if (s->fp_excp_el) {
 +        gen_exception_insn(s, 4, EXCP_UDEF,
 +                           syn_fp_access_trap(1, 0xe, false), s->fp_excp_el);
 +        return 0;
 +    }
 +    if (!s->vfp_enabled) {
 +        return 1;
 +    }
 +
 +    opr_sz = (1 + q) * 8;
 +    fpst = get_fpstatus_ptr(1);
 +    tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd),
 +                       vfp_reg_offset(1, rn),
 +                       vfp_reg_offset(1, rm), fpst,
 +                       opr_sz, opr_sz, rot,
 +                       size ? gen_helper_gvec_fcmlas_idx
 +                       : gen_helper_gvec_fcmlah_idx);
 +    tcg_temp_free_ptr(fpst);
 +    return 0;
 +}
 +
- static int disas_coproc_insn(DisasContext *s, uint32_t insn)
+ void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
- {
+                           uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
-     int cpnum, is64, crn, crm, opc1, opc2, isread, rt, rt2;
+ {
-@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
+@@ -XXX,XX +XXX,XX @@ void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
-                 goto illegal_op;
+         gen_helper_gvec_qrdmlah_s16, gen_helper_gvec_qrdmlah_s32
-             }
+     };
-             return;
+     tcg_debug_assert(vece >= 1 && vece <= 2);
-+        } else if ((insn & 0x0f000a00) == 0x0e000800
+-    tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, cpu_env,
-+                   && arm_dc_feature(s, ARM_FEATURE_V8)) {
+-                       opr_sz, max_sz, 0, fns[vece - 1]);
-+            if (disas_neon_insn_2reg_scalar_ext(s, insn)) {
++    gen_gvec_fn3_qc(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, fns[vece - 1]);
-+                goto illegal_op;
+ }
-+            }
-+            return;
+ void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
-         } else if ((insn & 0x0fe00000) == 0x0c400000) {
+@@ -XXX,XX +XXX,XX @@ void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
-             /* Coprocessor double register transfer.  */
+         gen_helper_gvec_qrdmlsh_s16, gen_helper_gvec_qrdmlsh_s32
-             ARCH(5TE);
+     };
      tcg_debug_assert(vece >= 1 && vece <= 2);
 -    tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, cpu_env,
 -                       opr_sz, max_sz, 0, fns[vece - 1]);
 +    gen_gvec_fn3_qc(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, fns[vece - 1]);
  }
  #define GEN_CMP0(NAME, COND)                                            \
 diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/vec_helper.c
 +++ b/target/arm/vec_helper.c
@@ -XXX,XX +XXX,XX @@
  #define H4(x)  (x)
  #endif
 -#define SET_QC() env->vfp.qc[0] = 1
 -
  static void clear_tail(void *vd, uintptr_t opr_sz, uintptr_t max_sz)
  {
      uint64_t *d = vd + opr_sz;
@@ -XXX,XX +XXX,XX @@ static void clear_tail(void *vd, uintptr_t opr_sz, uintptr_t max_sz)
  }
  /* Signed saturating rounding doubling multiply-accumulate high half, 16-bit */
 -static uint16_t inl_qrdmlah_s16(CPUARMState *env, int16_t src1,
 -                                int16_t src2, int16_t src3)
 +static int16_t inl_qrdmlah_s16(int16_t src1, int16_t src2,
 +                               int16_t src3, uint32_t *sat)
  {
      /* Simplify:
       * = ((a3 << 16) + ((e1 * e2) << 1) + (1 << 15)) >> 16
@@ -XXX,XX +XXX,XX @@ static uint16_t inl_qrdmlah_s16(CPUARMState *env, int16_t src1,
      ret = ((int32_t)src3 << 15) + ret + (1 << 14);
      ret >>= 15;
      if (ret != (int16_t)ret) {
 -        SET_QC();
 +        *sat = 1;
          ret = (ret < 0 ? -0x8000 : 0x7fff);
      }
      return ret;
@@ -XXX,XX +XXX,XX @@ static uint16_t inl_qrdmlah_s16(CPUARMState *env, int16_t src1,
  uint32_t HELPER(neon_qrdmlah_s16)(CPUARMState *env, uint32_t src1,
                                    uint32_t src2, uint32_t src3)
  {
 -    uint16_t e1 = inl_qrdmlah_s16(env, src1, src2, src3);
 -    uint16_t e2 = inl_qrdmlah_s16(env, src1 >> 16, src2 >> 16, src3 >> 16);
 +    uint32_t *sat = &env->vfp.qc[0];
 +    uint16_t e1 = inl_qrdmlah_s16(src1, src2, src3, sat);
 +    uint16_t e2 = inl_qrdmlah_s16(src1 >> 16, src2 >> 16, src3 >> 16, sat);
      return deposit32(e1, 16, 16, e2);
  }
  void HELPER(gvec_qrdmlah_s16)(void *vd, void *vn, void *vm,
 -                              void *ve, uint32_t desc)
 +                              void *vq, uint32_t desc)
  {
      uintptr_t opr_sz = simd_oprsz(desc);
      int16_t *d = vd;
      int16_t *n = vn;
      int16_t *m = vm;
 -    CPUARMState *env = ve;
      uintptr_t i;
      for (i = 0; i < opr_sz / 2; ++i) {
 -        d[i] = inl_qrdmlah_s16(env, n[i], m[i], d[i]);
 +        d[i] = inl_qrdmlah_s16(n[i], m[i], d[i], vq);
      }
      clear_tail(d, opr_sz, simd_maxsz(desc));
  }
  /* Signed saturating rounding doubling multiply-subtract high half, 16-bit */
 -static uint16_t inl_qrdmlsh_s16(CPUARMState *env, int16_t src1,
 -                                int16_t src2, int16_t src3)
 +static int16_t inl_qrdmlsh_s16(int16_t src1, int16_t src2,
 +                               int16_t src3, uint32_t *sat)
  {
      /* Similarly, using subtraction:
       * = ((a3 << 16) - ((e1 * e2) << 1) + (1 << 15)) >> 16
@@ -XXX,XX +XXX,XX @@ static uint16_t inl_qrdmlsh_s16(CPUARMState *env, int16_t src1,
      ret = ((int32_t)src3 << 15) - ret + (1 << 14);
      ret >>= 15;
      if (ret != (int16_t)ret) {
 -        SET_QC();
 +        *sat = 1;
          ret = (ret < 0 ? -0x8000 : 0x7fff);
      }
      return ret;
@@ -XXX,XX +XXX,XX @@ static uint16_t inl_qrdmlsh_s16(CPUARMState *env, int16_t src1,
  uint32_t HELPER(neon_qrdmlsh_s16)(CPUARMState *env, uint32_t src1,
                                    uint32_t src2, uint32_t src3)
  {
 -    uint16_t e1 = inl_qrdmlsh_s16(env, src1, src2, src3);
 -    uint16_t e2 = inl_qrdmlsh_s16(env, src1 >> 16, src2 >> 16, src3 >> 16);
 +    uint32_t *sat = &env->vfp.qc[0];
 +    uint16_t e1 = inl_qrdmlsh_s16(src1, src2, src3, sat);
 +    uint16_t e2 = inl_qrdmlsh_s16(src1 >> 16, src2 >> 16, src3 >> 16, sat);
      return deposit32(e1, 16, 16, e2);
  }
  void HELPER(gvec_qrdmlsh_s16)(void *vd, void *vn, void *vm,
 -                              void *ve, uint32_t desc)
 +                              void *vq, uint32_t desc)
  {
      uintptr_t opr_sz = simd_oprsz(desc);
      int16_t *d = vd;
      int16_t *n = vn;
      int16_t *m = vm;
 -    CPUARMState *env = ve;
      uintptr_t i;
      for (i = 0; i < opr_sz / 2; ++i) {
 -        d[i] = inl_qrdmlsh_s16(env, n[i], m[i], d[i]);
 +        d[i] = inl_qrdmlsh_s16(n[i], m[i], d[i], vq);
      }
      clear_tail(d, opr_sz, simd_maxsz(desc));
  }
  /* Signed saturating rounding doubling multiply-accumulate high half, 32-bit */
 -uint32_t HELPER(neon_qrdmlah_s32)(CPUARMState *env, int32_t src1,
 -                                  int32_t src2, int32_t src3)
 +static int32_t inl_qrdmlah_s32(int32_t src1, int32_t src2,
 +                               int32_t src3, uint32_t *sat)
  {
      /* Simplify similarly to inl_qrdmlah_s16 above.  */
      int64_t ret = (int64_t)src1 * src2;
      ret = ((int64_t)src3 << 31) + ret + (1 << 30);
      ret >>= 31;
      if (ret != (int32_t)ret) {
 -        SET_QC();
 +        *sat = 1;
          ret = (ret < 0 ? INT32_MIN : INT32_MAX);
      }
      return ret;
  }
 +uint32_t HELPER(neon_qrdmlah_s32)(CPUARMState *env, int32_t src1,
 +                                  int32_t src2, int32_t src3)
 +{
 +    uint32_t *sat = &env->vfp.qc[0];
 +    return inl_qrdmlah_s32(src1, src2, src3, sat);
 +}
 +
  void HELPER(gvec_qrdmlah_s32)(void *vd, void *vn, void *vm,
 -                              void *ve, uint32_t desc)
 +                              void *vq, uint32_t desc)
  {
      uintptr_t opr_sz = simd_oprsz(desc);
      int32_t *d = vd;
      int32_t *n = vn;
      int32_t *m = vm;
 -    CPUARMState *env = ve;
      uintptr_t i;
      for (i = 0; i < opr_sz / 4; ++i) {
 -        d[i] = helper_neon_qrdmlah_s32(env, n[i], m[i], d[i]);
 +        d[i] = inl_qrdmlah_s32(n[i], m[i], d[i], vq);
      }
      clear_tail(d, opr_sz, simd_maxsz(desc));
  }
  /* Signed saturating rounding doubling multiply-subtract high half, 32-bit */
 -uint32_t HELPER(neon_qrdmlsh_s32)(CPUARMState *env, int32_t src1,
 -                                  int32_t src2, int32_t src3)
 +static int32_t inl_qrdmlsh_s32(int32_t src1, int32_t src2,
 +                               int32_t src3, uint32_t *sat)
  {
      /* Simplify similarly to inl_qrdmlsh_s16 above.  */
      int64_t ret = (int64_t)src1 * src2;
      ret = ((int64_t)src3 << 31) - ret + (1 << 30);
      ret >>= 31;
      if (ret != (int32_t)ret) {
 -        SET_QC();
 +        *sat = 1;
          ret = (ret < 0 ? INT32_MIN : INT32_MAX);
      }
      return ret;
  }
 +uint32_t HELPER(neon_qrdmlsh_s32)(CPUARMState *env, int32_t src1,
 +                                  int32_t src2, int32_t src3)
 +{
 +    uint32_t *sat = &env->vfp.qc[0];
 +    return inl_qrdmlsh_s32(src1, src2, src3, sat);
 +}
 +
  void HELPER(gvec_qrdmlsh_s32)(void *vd, void *vn, void *vm,
 -                              void *ve, uint32_t desc)
 +                              void *vq, uint32_t desc)
  {
      uintptr_t opr_sz = simd_oprsz(desc);
      int32_t *d = vd;
      int32_t *n = vn;
      int32_t *m = vm;
 -    CPUARMState *env = ve;
      uintptr_t i;
      for (i = 0; i < opr_sz / 4; ++i) {
 -        d[i] = helper_neon_qrdmlsh_s32(env, n[i], m[i], d[i]);
 +        d[i] = inl_qrdmlsh_s32(n[i], m[i], d[i], vq);
      }
      clear_tail(d, opr_sz, simd_maxsz(desc));
  }
 --
-.16.2
+.20.1
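
Reading aid for the vec_helper.c change above: a minimal standalone sketch
of the 16-bit saturating rounding doubling multiply-accumulate, with the
saturation flag passed by pointer as in the new inl_qrdmlah_s16 signature.
The name sketch_qrdmlah_s16, the 64-bit intermediate and the explicit range
check are assumptions made for illustration; this is not the QEMU code,
which uses a 32-bit intermediate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical standalone version of the helper, not the QEMU function.
 * Computes (src3 * 2^15 + src1 * src2 + 2^14) >> 15, saturated to int16_t.
 * On saturation *sat is set to 1; it is never cleared here, so the flag
 * is sticky across calls, like the QC bit it stands for.
 */
static int16_t sketch_qrdmlah_s16(int16_t src1, int16_t src2,
                                  int16_t src3, uint32_t *sat)
{
    /* 64-bit intermediate (an assumption) so that no step can overflow. */
    int64_t ret = (int64_t)src1 * src2;

    assert(sat != NULL);
    ret += (int64_t)src3 * (1 << 15) + (1 << 14);
    /* Assumes arithmetic right shift of negative values (gcc, clang). */
    ret >>= 15;
    if (ret < INT16_MIN || ret > INT16_MAX) {
        *sat = 1;
        ret = (ret < 0 ? INT16_MIN : INT16_MAX);
    }
    return (int16_t)ret;
}
```

Because the function only needs somewhere to record saturation, the same
code can serve a caller that holds a CPU state pointer and a caller that
was handed the address of the flag directly.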

-[Qemu-devel] [PULL 32/39] target/arm: Enable ARM_FEATURE_V8_RDM
+[PULL 15/45] target/arm: Clear tail in gvec_fmul_idx_*, gvec_fmla_idx_*
 From: Richard Henderson <richard.henderson@linaro.org>
-Enable it for the "any" CPU used by *-linux-user.
+Must clear the tail for AdvSIMD when SVE is enabled.
+Fixes: ca40a6e6e39
+Cc: qemu-stable@nongnu.org
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Message-id: 20200513163245.17915-15-richard.henderson@linaro.org
 Message-id: 20180228193125.20577-10-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/cpu.c   | 1 +
+ target/arm/vec_helper.c | 2 ++
- target/arm/cpu64.c | 1 +
+file changed, 2 insertions(+)
 files changed, 2 insertions(+)
-diff --git a/target/arm/cpu.c b/target/arm/cpu.c
+diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.c
+--- a/target/arm/vec_helper.c
-+++ b/target/arm/cpu.c
++++ b/target/arm/vec_helper.c
-@@ -XXX,XX +XXX,XX @@ static void arm_any_initfn(Object *obj)
+@@ -XXX,XX +XXX,XX @@ void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
-     set_feature(&cpu->env, ARM_FEATURE_V8_SHA256);
+             d[i + j] = TYPE##_mul(n[i + j], mm, stat);                     \
-     set_feature(&cpu->env, ARM_FEATURE_V8_PMULL);
+         }                                                                  \
-     set_feature(&cpu->env, ARM_FEATURE_CRC);
+     }                                                                      \
-+    set_feature(&cpu->env, ARM_FEATURE_V8_RDM);
++    clear_tail(d, oprsz, simd_maxsz(desc));                                \
      cpu->midr = 0xffffffff;
  }
- #endif
-diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
+ DO_MUL_IDX(gvec_fmul_idx_h, float16, H2)
-index XXXXXXX..XXXXXXX 100644
+@@ -XXX,XX +XXX,XX @@ void HELPER(NAME)(void *vd, void *vn, void *vm, void *va,                  \
---- a/target/arm/cpu64.c
+                                      mm, a[i + j], 0, stat);               \
-+++ b/target/arm/cpu64.c
+         }                                                                  \
-@@ -XXX,XX +XXX,XX @@ static void aarch64_any_initfn(Object *obj)
+     }                                                                      \
-     set_feature(&cpu->env, ARM_FEATURE_V8_SM4);
++    clear_tail(d, oprsz, simd_maxsz(desc));                                \
-     set_feature(&cpu->env, ARM_FEATURE_V8_PMULL);
+ }
-     set_feature(&cpu->env, ARM_FEATURE_CRC);
-+    set_feature(&cpu->env, ARM_FEATURE_V8_RDM);
+ DO_FMLA_IDX(gvec_fmla_idx_h, float16, H2)
      set_feature(&cpu->env, ARM_FEATURE_V8_FP16);
      cpu->ctr = 0x80038003; /* 32 byte I and D cacheline size, VIPT icache */
      cpu->dcz_blocksize = 7; /*  512 bytes */
 --
-.16.2
+.20.1
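
Reading aid for the tail-clearing fix above: a minimal sketch of what
"clear the tail" means for a helper whose operation size is smaller than
the maximum vector size. The names sketch_clear_tail and sketch_mul_idx_u32
are hypothetical, and the integer multiply with a single index is a
simplification; the real helpers are floating point and are not reproduced
here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for clear_tail(): zero the bytes of the
 * destination between the operation size and the maximum size.
 */
static void sketch_clear_tail(void *vd, uintptr_t opr_sz, uintptr_t max_sz)
{
    assert(opr_sz <= max_sz);
    memset((unsigned char *)vd + opr_sz, 0, max_sz - opr_sz);
}

/*
 * Hypothetical indexed multiply on 32-bit integers: multiply every
 * element of n by m[idx], then clear the rest of the destination.
 * Sizes are in bytes, as in the helpers in the patch.
 */
static void sketch_mul_idx_u32(uint32_t *d, const uint32_t *n,
                               const uint32_t *m, unsigned idx,
                               uintptr_t opr_sz, uintptr_t max_sz)
{
    uintptr_t i;

    for (i = 0; i < opr_sz / sizeof(uint32_t); ++i) {
        d[i] = n[i] * m[idx];
    }
    /* Without this call, stale data would remain above opr_sz. */
    sketch_clear_tail(d, opr_sz, max_sz);
}
```

With opr_sz = 16 and max_sz = 32, the low four elements hold products and
the high four are zeroed, whatever the destination held before.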

-[Qemu-devel] [PULL 30/39] target/arm: Decode aa32 armv8.1 three same
+[PULL 16/45] target/arm: Vectorize SABD/UABD
 From: Richard Henderson <richard.henderson@linaro.org>
+Include 64-bit element size in preparation for SVE2.
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20180228193125.20577-8-richard.henderson@linaro.org
+Message-id: 20200513163245.17915-16-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate.c | 86 +++++++++++++++++++++++++++++++++++++++-----------
+ target/arm/helper.h        |  10 +++
-file changed, 67 insertions(+), 19 deletions(-)
+ target/arm/translate.h     |   5 ++
+ target/arm/translate-a64.c |   8 ++-
  target/arm/translate.c     | 133 ++++++++++++++++++++++++++++++++++++-
  target/arm/vec_helper.c    |  24 +++++++
 files changed, 176 insertions(+), 4 deletions(-)
 diff --git a/target/arm/helper.h b/target/arm/helper.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.h
 +++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(gvec_sli_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
  DEF_HELPER_FLAGS_3(gvec_sli_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
  DEF_HELPER_FLAGS_3(gvec_sli_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_4(gvec_sabd_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_4(gvec_sabd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_4(gvec_sabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_4(gvec_sabd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 +
 +DEF_HELPER_FLAGS_4(gvec_uabd_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_4(gvec_uabd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_4(gvec_uabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_4(gvec_uabd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 +
  #ifdef TARGET_AARCH64
  #include "helper-a64.h"
  #include "helper-sve.h"
 diff --git a/target/arm/translate.h b/target/arm/translate.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.h
 +++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
  void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
                            uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 +void gen_gvec_sabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 +void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 +
  /*
   * Forward to the isar_feature_* tests given a DisasContext pointer.
   */
 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-a64.c
 +++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
              gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_smin, size);
          }
          return;
 +    case 0xe: /* SABD, UABD */
 +        if (u) {
 +            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uabd, size);
 +        } else {
 +            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sabd, size);
 +        }
 +        return;
      case 0x10: /* ADD, SUB */
          if (u) {
              gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_sub, size);
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
                  genenvfn = fns[size][u];
                  break;
              }
 -            case 0xe: /* SABD, UABD */
              case 0xf: /* SABA, UABA */
              {
                  static NeonGenTwoOpFn * const fns[3][2] = {
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ void gen_gvec_sqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
- #include "disas/disas.h"
+                    rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
- #include "exec/exec-all.h"
+ }
- #include "tcg-op.h"
-+#include "tcg-op-gvec.h"
++static void gen_sabd_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
- #include "qemu/log.h"
++{
- #include "qemu/bitops.h"
++    TCGv_i32 t = tcg_temp_new_i32();
- #include "arm_ldst.h"
++
-@@ -XXX,XX +XXX,XX @@ static void gen_neon_narrow_op(int op, int u, int size,
++    tcg_gen_sub_i32(t, a, b);
- #define NEON_3R_VPMAX 20
++    tcg_gen_sub_i32(d, b, a);
- #define NEON_3R_VPMIN 21
++    tcg_gen_movcond_i32(TCG_COND_LT, d, a, b, d, t);
- #define NEON_3R_VQDMULH_VQRDMULH 22
++    tcg_temp_free_i32(t);
--#define NEON_3R_VPADD 23
++}
-+#define NEON_3R_VPADD_VQRDMLAH 23
++
- #define NEON_3R_SHA 24 /* SHA1C,SHA1P,SHA1M,SHA1SU0,SHA256H{2},SHA256SU1 */
++static void gen_sabd_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
--#define NEON_3R_VFM 25 /* VFMA, VFMS : float fused multiply-add */
++{
-+#define NEON_3R_VFM_VQRDMLSH 25 /* VFMA, VFMS, VQRDMLSH */
++    TCGv_i64 t = tcg_temp_new_i64();
- #define NEON_3R_FLOAT_ARITH 26 /* float VADD, VSUB, VPADD, VABD */
++
- #define NEON_3R_FLOAT_MULTIPLY 27 /* float VMLA, VMLS, VMUL */
++    tcg_gen_sub_i64(t, a, b);
- #define NEON_3R_FLOAT_CMP 28 /* float VCEQ, VCGE, VCGT */
++    tcg_gen_sub_i64(d, b, a);
-@@ -XXX,XX +XXX,XX @@ static const uint8_t neon_3r_sizes[] = {
++    tcg_gen_movcond_i64(TCG_COND_LT, d, a, b, d, t);
-     [NEON_3R_VPMAX] = 0x7,
++    tcg_temp_free_i64(t);
-     [NEON_3R_VPMIN] = 0x7,
++}
-     [NEON_3R_VQDMULH_VQRDMULH] = 0x6,
++
--    [NEON_3R_VPADD] = 0x7,
++static void gen_sabd_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
-+    [NEON_3R_VPADD_VQRDMLAH] = 0x7,
++{
-     [NEON_3R_SHA] = 0xf, /* size field encodes op type */
++    TCGv_vec t = tcg_temp_new_vec_matching(d);
--    [NEON_3R_VFM] = 0x5, /* size bit 1 encodes op */
++
-+    [NEON_3R_VFM_VQRDMLSH] = 0x7, /* For VFM, size bit 1 encodes op */
++    tcg_gen_smin_vec(vece, t, a, b);
-     [NEON_3R_FLOAT_ARITH] = 0x5, /* size bit 1 encodes op */
++    tcg_gen_smax_vec(vece, d, a, b);
-     [NEON_3R_FLOAT_MULTIPLY] = 0x5, /* size bit 1 encodes op */
++    tcg_gen_sub_vec(vece, d, d, t);
-     [NEON_3R_FLOAT_CMP] = 0x5, /* size bit 1 encodes op */
++    tcg_temp_free_vec(t);
-@@ -XXX,XX +XXX,XX @@ static const uint8_t neon_2rm_sizes[] = {
++}
-     [NEON_2RM_VCVT_UF] = 0x4,
++
- };
++void gen_gvec_sabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
++                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
-+
++{
-+/* Expand v8.1 simd helper.  */
++    static const TCGOpcode vecop_list[] = {
-+static int do_v81_helper(DisasContext *s, gen_helper_gvec_3_ptr *fn,
++        INDEX_op_sub_vec, INDEX_op_smin_vec, INDEX_op_smax_vec, 0
-+                         int q, int rd, int rn, int rm)
++    };
-+{
++    static const GVecGen3 ops[4] = {
-+    if (arm_dc_feature(s, ARM_FEATURE_V8_RDM)) {
++        { .fniv = gen_sabd_vec,
-+        int opr_sz = (1 + q) * 8;
++          .fno = gen_helper_gvec_sabd_b,
-+        tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd),
++          .opt_opc = vecop_list,
-+                           vfp_reg_offset(1, rn),
++          .vece = MO_8 },
-+                           vfp_reg_offset(1, rm), cpu_env,
++        { .fniv = gen_sabd_vec,
-+                           opr_sz, opr_sz, 0, fn);
++          .fno = gen_helper_gvec_sabd_h,
-+        return 0;
++          .opt_opc = vecop_list,
-+    }
++          .vece = MO_16 },
-+    return 1;
++        { .fni4 = gen_sabd_i32,
 +          .fniv = gen_sabd_vec,
 +          .fno = gen_helper_gvec_sabd_s,
 +          .opt_opc = vecop_list,
 +          .vece = MO_32 },
 +        { .fni8 = gen_sabd_i64,
 +          .fniv = gen_sabd_vec,
 +          .fno = gen_helper_gvec_sabd_d,
 +          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +          .opt_opc = vecop_list,
 +          .vece = MO_64 },
 +    };
 +    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 +}
 +
 +static void gen_uabd_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
 +{
 +    TCGv_i32 t = tcg_temp_new_i32();
 +
 +    tcg_gen_sub_i32(t, a, b);
 +    tcg_gen_sub_i32(d, b, a);
 +    tcg_gen_movcond_i32(TCG_COND_LTU, d, a, b, d, t);
 +    tcg_temp_free_i32(t);
 +}
 +
 +static void gen_uabd_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
 +{
 +    TCGv_i64 t = tcg_temp_new_i64();
 +
 +    tcg_gen_sub_i64(t, a, b);
 +    tcg_gen_sub_i64(d, b, a);
 +    tcg_gen_movcond_i64(TCG_COND_LTU, d, a, b, d, t);
 +    tcg_temp_free_i64(t);
 +}
 +
 +static void gen_uabd_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
 +{
 +    TCGv_vec t = tcg_temp_new_vec_matching(d);
 +
 +    tcg_gen_umin_vec(vece, t, a, b);
 +    tcg_gen_umax_vec(vece, d, a, b);
 +    tcg_gen_sub_vec(vece, d, d, t);
 +    tcg_temp_free_vec(t);
 +}
 +
 +void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = {
 +        INDEX_op_sub_vec, INDEX_op_umin_vec, INDEX_op_umax_vec, 0
 +    };
 +    static const GVecGen3 ops[4] = {
 +        { .fniv = gen_uabd_vec,
 +          .fno = gen_helper_gvec_uabd_b,
 +          .opt_opc = vecop_list,
 +          .vece = MO_8 },
 +        { .fniv = gen_uabd_vec,
 +          .fno = gen_helper_gvec_uabd_h,
 +          .opt_opc = vecop_list,
 +          .vece = MO_16 },
 +        { .fni4 = gen_uabd_i32,
 +          .fniv = gen_uabd_vec,
 +          .fno = gen_helper_gvec_uabd_s,
 +          .opt_opc = vecop_list,
 +          .vece = MO_32 },
 +        { .fni8 = gen_uabd_i64,
 +          .fniv = gen_uabd_vec,
 +          .fno = gen_helper_gvec_uabd_d,
 +          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +          .opt_opc = vecop_list,
 +          .vece = MO_64 },
 +    };
 +    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 +}
 +
  /* Translate a NEON data processing instruction.  Return nonzero if the
     instruction is invalid.
     We process data in a mixture of 32-bit and 64-bit chunks.
 @@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-         if (q && ((rd | rn | rm) & 1)) {
+             }
              return 1;
-         }
--        /*
++        case NEON_3R_VABD:
--         * The SHA-1/SHA-256 3-register instructions require special treatment
++            if (u) {
--         * here, as their size field is overloaded as an op type selector, and
++                gen_gvec_uabd(size, rd_ofs, rn_ofs, rm_ofs,
--         * they all consume their input in a single pass.
++                              vec_size, vec_size);
--         */
++            } else {
--        if (op == NEON_3R_SHA) {
++                gen_gvec_sabd(size, rd_ofs, rn_ofs, rm_ofs,
-+        switch (op) {
++                              vec_size, vec_size);
-+        case NEON_3R_SHA:
++            }
-+            /* The SHA-1/SHA-256 3-register instructions require special
++            return 0;
-+             * treatment here, as their size field is overloaded as an
++
-+             * op type selector, and they all consume their input in a
+         case NEON_3R_VADD_VSUB:
-+             * single pass.
+         case NEON_3R_LOGIC:
-+             */
+         case NEON_3R_VMAX:
              if (!q) {
                  return 1;
              }
 @@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-             tcg_temp_free_ptr(ptr2);
+         case NEON_3R_VQRSHL:
-             tcg_temp_free_ptr(ptr3);
+             GEN_NEON_INTEGER_OP_ENV(qrshl);
              return 0;
 +
 +        case NEON_3R_VPADD_VQRDMLAH:
 +            if (!u) {
 +                break;  /* VPADD */
 +            }
 +            /* VQRDMLAH */
 +            switch (size) {
 +            case 1:
 +                return do_v81_helper(s, gen_helper_gvec_qrdmlah_s16,
 +                                     q, rd, rn, rm);
 +            case 2:
 +                return do_v81_helper(s, gen_helper_gvec_qrdmlah_s32,
 +                                     q, rd, rn, rm);
 +            }
 +            return 1;
 +
 +        case NEON_3R_VFM_VQRDMLSH:
 +            if (!u) {
 +                /* VFM, VFMS */
 +                if (size == 1) {
 +                    return 1;
 +                }
 +                break;
 +            }
 +            /* VQRDMLSH */
 +            switch (size) {
 +            case 1:
 +                return do_v81_helper(s, gen_helper_gvec_qrdmlsh_s16,
 +                                     q, rd, rn, rm);
 +            case 2:
 +                return do_v81_helper(s, gen_helper_gvec_qrdmlsh_s32,
 +                                     q, rd, rn, rm);
 +            }
 +            return 1;
          }
          if (size == 3 && op != NEON_3R_LOGIC) {
              /* 64-bit element instructions. */
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                  rm = rtmp;
              }
              break;
--        case NEON_3R_VPADD:
+-        case NEON_3R_VABD:
--            if (u) {
+-            GEN_NEON_INTEGER_OP(abd);
--                return 1;
+-            break;
--            }
+         case NEON_3R_VABA:
--            /* Fall through */
+             GEN_NEON_INTEGER_OP(abd);
-+        case NEON_3R_VPADD_VQRDMLAH:
+             tcg_temp_free_i32(tmp2);
-         case NEON_3R_VPMAX:
+diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
-         case NEON_3R_VPMIN:
+index XXXXXXX..XXXXXXX 100644
-             pairwise = 1;
+--- a/target/arm/vec_helper.c
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
++++ b/target/arm/vec_helper.c
-                 return 1;
+@@ -XXX,XX +XXX,XX @@ DO_CMP0(gvec_cgt0_h, int16_t, >)
-             }
+ DO_CMP0(gvec_cge0_h, int16_t, >=)
-             break;
--        case NEON_3R_VFM:
+ #undef DO_CMP0
--            if (!arm_dc_feature(s, ARM_FEATURE_VFP4) || u) {
++
-+        case NEON_3R_VFM_VQRDMLSH:
++#define DO_ABD(NAME, TYPE)                                      \
-+            if (!arm_dc_feature(s, ARM_FEATURE_VFP4)) {
++void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc)  \
-                 return 1;
++{                                                               \
-             }
++    intptr_t i, opr_sz = simd_oprsz(desc);                      \
-             break;
++    TYPE *d = vd, *n = vn, *m = vm;                             \
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
++                                                                \
-                 }
++    for (i = 0; i < opr_sz / sizeof(TYPE); ++i) {               \
-             }
++        d[i] = n[i] < m[i] ? m[i] - n[i] : n[i] - m[i];         \
-             break;
++    }                                                           \
--        case NEON_3R_VPADD:
++    clear_tail(d, opr_sz, simd_maxsz(desc));                    \
-+        case NEON_3R_VPADD_VQRDMLAH:
++}
-             switch (size) {
++
-             case 0: gen_helper_neon_padd_u8(tmp, tmp, tmp2); break;
++DO_ABD(gvec_sabd_b, int8_t)
-             case 1: gen_helper_neon_padd_u16(tmp, tmp, tmp2); break;
++DO_ABD(gvec_sabd_h, int16_t)
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
++DO_ABD(gvec_sabd_s, int32_t)
-               }
++DO_ABD(gvec_sabd_d, int64_t)
-             }
++
-             break;
++DO_ABD(gvec_uabd_b, uint8_t)
--        case NEON_3R_VFM:
++DO_ABD(gvec_uabd_h, uint16_t)
-+        case NEON_3R_VFM_VQRDMLSH:
++DO_ABD(gvec_uabd_s, uint32_t)
-         {
++DO_ABD(gvec_uabd_d, uint64_t)
-             /* VFMA, VFMS: fused multiply-add */
++
-             TCGv_ptr fpstatus = get_fpstatus_ptr(1);
++#undef DO_ABD
 --
-.16.2
+.20.1
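
Reading aid for the SABD/UABD patch above: the out-of-line helper (DO_ABD)
selects between n - m and m - n, while the vector expansion (gen_sabd_vec)
computes max - min. A minimal sketch, assuming 8-bit signed elements, that
checks the two forms agree. All sketch_* names are hypothetical, and
returning the result as an unsigned bit pattern is a choice made here to
keep the C arithmetic well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Select form, as in DO_ABD: subtract the smaller from the larger. */
static uint8_t sketch_sabd8_select(int8_t n, int8_t m)
{
    return (uint8_t)(n < m ? m - n : n - m);
}

/* Min/max form, as in gen_sabd_vec: max(n, m) - min(n, m). */
static uint8_t sketch_sabd8_minmax(int8_t n, int8_t m)
{
    int lo = n < m ? n : m;
    int hi = n < m ? m : n;

    return (uint8_t)(hi - lo);
}

/*
 * Exhaustively compare both forms over all pairs of 8-bit inputs.
 * Returns the number of pairs checked.
 */
static int sketch_check_sabd8(void)
{
    int count = 0;

    for (int n = -128; n <= 127; n++) {
        for (int m = -128; m <= 127; m++) {
            assert(sketch_sabd8_select((int8_t)n, (int8_t)m) ==
                   sketch_sabd8_minmax((int8_t)n, (int8_t)m));
            count++;
        }
    }
    return count;
}
```

The extreme case n = -128, m = 127 gives 255, which does not fit a signed
8-bit element; the result is simply the low 8 bits of the difference.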

-[Qemu-devel] [PULL 35/39] target/arm: Decode aa64 armv8.3 fcmla
+[PULL 17/45] target/arm: Vectorize SABA/UABA
 From: Richard Henderson <richard.henderson@linaro.org>
+Include 64-bit element size in preparation for SVE2.
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20180228193125.20577-13-richard.henderson@linaro.org
+Message-id: 20200513163245.17915-17-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-[PMM: renamed e1/e2/e3/e4 to use the same naming as the version
- of the pseudocode in the Arm ARM]
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/helper.h        |  11 ++++
+ target/arm/helper.h        |  17 +++--
- target/arm/translate-a64.c |  94 +++++++++++++++++++++++++---
+ target/arm/translate.h     |   5 ++
- target/arm/vec_helper.c    | 149 +++++++++++++++++++++++++++++++++++++++++++++
+ target/arm/neon_helper.c   |  10 ---
-files changed, 246 insertions(+), 8 deletions(-)
+ target/arm/translate-a64.c |  17 ++---
  target/arm/translate.c     | 134 +++++++++++++++++++++++++++++++++++--
  target/arm/vec_helper.c    |  24 +++++++
 files changed, 174 insertions(+), 33 deletions(-)
 diff --git a/target/arm/helper.h b/target/arm/helper.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.h
 +++ b/target/arm/helper.h
-@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_fcadds, TCG_CALL_NO_RWG,
+@@ -XXX,XX +XXX,XX @@ DEF_HELPER_2(neon_pmax_s8, i32, i32, i32)
- DEF_HELPER_FLAGS_5(gvec_fcaddd, TCG_CALL_NO_RWG,
+ DEF_HELPER_2(neon_pmax_u16, i32, i32, i32)
-                    void, ptr, ptr, ptr, ptr, i32)
+ DEF_HELPER_2(neon_pmax_s16, i32, i32, i32)
-+DEF_HELPER_FLAGS_5(gvec_fcmlah, TCG_CALL_NO_RWG,
+-DEF_HELPER_2(neon_abd_u8, i32, i32, i32)
-+                   void, ptr, ptr, ptr, ptr, i32)
+-DEF_HELPER_2(neon_abd_s8, i32, i32, i32)
-+DEF_HELPER_FLAGS_5(gvec_fcmlah_idx, TCG_CALL_NO_RWG,
+-DEF_HELPER_2(neon_abd_u16, i32, i32, i32)
-+                   void, ptr, ptr, ptr, ptr, i32)
+-DEF_HELPER_2(neon_abd_s16, i32, i32, i32)
-+DEF_HELPER_FLAGS_5(gvec_fcmlas, TCG_CALL_NO_RWG,
+-DEF_HELPER_2(neon_abd_u32, i32, i32, i32)
-+                   void, ptr, ptr, ptr, ptr, i32)
+-DEF_HELPER_2(neon_abd_s32, i32, i32, i32)
-+DEF_HELPER_FLAGS_5(gvec_fcmlas_idx, TCG_CALL_NO_RWG,
+-
-+                   void, ptr, ptr, ptr, ptr, i32)
+ DEF_HELPER_2(neon_shl_u16, i32, i32, i32)
-+DEF_HELPER_FLAGS_5(gvec_fcmlad, TCG_CALL_NO_RWG,
+ DEF_HELPER_2(neon_shl_s16, i32, i32, i32)
-+                   void, ptr, ptr, ptr, ptr, i32)
+ DEF_HELPER_2(neon_rshl_u8, i32, i32, i32)
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(gvec_uabd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
  DEF_HELPER_FLAGS_4(gvec_uabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
  DEF_HELPER_FLAGS_4(gvec_uabd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_4(gvec_saba_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_4(gvec_saba_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_4(gvec_saba_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_4(gvec_saba_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 +
 +DEF_HELPER_FLAGS_4(gvec_uaba_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_4(gvec_uaba_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_4(gvec_uaba_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_4(gvec_uaba_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 +
  #ifdef TARGET_AARCH64
  #include "helper-a64.h"
- #endif
+ #include "helper-sve.h"
 diff --git a/target/arm/translate.h b/target/arm/translate.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.h
 +++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ void gen_gvec_sabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
  void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
                     uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 +void gen_gvec_saba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 +void gen_gvec_uaba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 +
  /*
   * Forward to the isar_feature_* tests given a DisasContext pointer.
   */
 diff --git a/target/arm/neon_helper.c b/target/arm/neon_helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/neon_helper.c
 +++ b/target/arm/neon_helper.c
@@ -XXX,XX +XXX,XX @@ NEON_POP(pmax_s16, neon_s16, 2)
  NEON_POP(pmax_u16, neon_u16, 2)
  #undef NEON_FN
 -#define NEON_FN(dest, src1, src2) \
 -    dest = (src1 > src2) ? (src1 - src2) : (src2 - src1)
 -NEON_VOP(abd_s8, neon_s8, 4)
 -NEON_VOP(abd_u8, neon_u8, 4)
 -NEON_VOP(abd_s16, neon_s16, 2)
 -NEON_VOP(abd_u16, neon_u16, 2)
 -NEON_VOP(abd_s32, neon_s32, 1)
 -NEON_VOP(abd_u32, neon_u32, 1)
 -#undef NEON_FN
 -
  #define NEON_FN(dest, src1, src2) do { \
      int8_t tmp; \
      tmp = (int8_t)src2; \
 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-a64.c
 +++ b/target/arm/translate-a64.c
-@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
-         }
+             gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sabd, size);
          feature = ARM_FEATURE_V8_RDM;
          break;
 +    case 0x8: /* FCMLA, #0 */
 +    case 0x9: /* FCMLA, #90 */
 +    case 0xa: /* FCMLA, #180 */
 +    case 0xb: /* FCMLA, #270 */
      case 0xc: /* FCADD, #90 */
      case 0xe: /* FCADD, #270 */
          if (size == 0
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn)
          }
          return;
++    case 0xf: /* SABA, UABA */
-+    case 0x8: /* FCMLA, #0 */
++        if (u) {
-+    case 0x9: /* FCMLA, #90 */
++            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uaba, size);
-+    case 0xa: /* FCMLA, #180 */
++        } else {
-+    case 0xb: /* FCMLA, #270 */
++            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_saba, size);
 +        rot = extract32(opcode, 0, 2);
 +        switch (size) {
 +        case 1:
 +            gen_gvec_op3_fpst(s, is_q, rd, rn, rm, true, rot,
 +                              gen_helper_gvec_fcmlah);
 +            break;
 +        case 2:
 +            gen_gvec_op3_fpst(s, is_q, rd, rn, rm, false, rot,
 +                              gen_helper_gvec_fcmlas);
 +            break;
 +        case 3:
 +            gen_gvec_op3_fpst(s, is_q, rd, rn, rm, false, rot,
 +                              gen_helper_gvec_fcmlad);
 +            break;
 +        default:
 +            g_assert_not_reached();
 +        }
 +        return;
-+
+     case 0x10: /* ADD, SUB */
-     case 0xc: /* FCADD, #90 */
+         if (u) {
-     case 0xe: /* FCADD, #270 */
+             gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_sub, size);
-         rot = extract32(opcode, 1, 1);
+@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
-@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
+                 genenvfn = fns[size][u];
-     int rn = extract32(insn, 5, 5);
+                 break;
-     int rd = extract32(insn, 0, 5);
+             }
-     bool is_long = false;
+-            case 0xf: /* SABA, UABA */
--    bool is_fp = false;
+-            {
-+    int is_fp = 0;
+-                static NeonGenTwoOpFn * const fns[3][2] = {
-     bool is_fp16 = false;
+-                    { gen_helper_neon_abd_s8, gen_helper_neon_abd_u8 },
-     int index;
+-                    { gen_helper_neon_abd_s16, gen_helper_neon_abd_u16 },
-     TCGv_ptr fpst;
+-                    { gen_helper_neon_abd_s32, gen_helper_neon_abd_u32 },
-@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
+-                };
-     case 0x05: /* FMLS */
+-                genfn = fns[size][u];
-     case 0x09: /* FMUL */
+-                break;
      case 0x19: /* FMULX */
 -        is_fp = true;
 +        is_fp = 1;
          break;
      case 0x1d: /* SQRDMLAH */
      case 0x1f: /* SQRDMLSH */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
              return;
          }
          break;
 +    case 0x11: /* FCMLA #0 */
 +    case 0x13: /* FCMLA #90 */
 +    case 0x15: /* FCMLA #180 */
 +    case 0x17: /* FCMLA #270 */
 +        if (!arm_dc_feature(s, ARM_FEATURE_V8_FCMA)) {
 +            unallocated_encoding(s);
 +            return;
 +        }
 +        is_fp = 2;
 +        break;
      default:
          unallocated_encoding(s);
          return;
      }
 -    if (is_fp) {
 +    switch (is_fp) {
 +    case 1: /* normal fp */
          /* convert insn encoded size to TCGMemOp size */
          switch (size) {
          case 0: /* half-precision */
 -            if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
 -                unallocated_encoding(s);
 -                return;
 -            }
-             size = MO_16;
+             case 0x16: /* SQDMULH, SQRDMULH */
-+            is_fp16 = true;
+             {
                  static NeonGenTwoOpEnvFn * const fns[2][2] = {
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
      tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
  }
 +static void gen_saba_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
 +{
 +    TCGv_i32 t = tcg_temp_new_i32();
 +    gen_sabd_i32(t, a, b);
 +    tcg_gen_add_i32(d, d, t);
 +    tcg_temp_free_i32(t);
 +}
 +
 +static void gen_saba_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
 +{
 +    TCGv_i64 t = tcg_temp_new_i64();
 +    gen_sabd_i64(t, a, b);
 +    tcg_gen_add_i64(d, d, t);
 +    tcg_temp_free_i64(t);
 +}
 +
 +static void gen_saba_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
 +{
 +    TCGv_vec t = tcg_temp_new_vec_matching(d);
 +    gen_sabd_vec(vece, t, a, b);
 +    tcg_gen_add_vec(vece, d, d, t);
 +    tcg_temp_free_vec(t);
 +}
 +
 +void gen_gvec_saba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = {
 +        INDEX_op_sub_vec, INDEX_op_add_vec,
 +        INDEX_op_smin_vec, INDEX_op_smax_vec, 0
 +    };
 +    static const GVecGen3 ops[4] = {
 +        { .fniv = gen_saba_vec,
 +          .fno = gen_helper_gvec_saba_b,
 +          .opt_opc = vecop_list,
 +          .load_dest = true,
 +          .vece = MO_8 },
 +        { .fniv = gen_saba_vec,
 +          .fno = gen_helper_gvec_saba_h,
 +          .opt_opc = vecop_list,
 +          .load_dest = true,
 +          .vece = MO_16 },
 +        { .fni4 = gen_saba_i32,
 +          .fniv = gen_saba_vec,
 +          .fno = gen_helper_gvec_saba_s,
 +          .opt_opc = vecop_list,
 +          .load_dest = true,
 +          .vece = MO_32 },
 +        { .fni8 = gen_saba_i64,
 +          .fniv = gen_saba_vec,
 +          .fno = gen_helper_gvec_saba_d,
 +          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +          .opt_opc = vecop_list,
 +          .load_dest = true,
 +          .vece = MO_64 },
 +    };
 +    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 +}
 +
 +static void gen_uaba_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
 +{
 +    TCGv_i32 t = tcg_temp_new_i32();
 +    gen_uabd_i32(t, a, b);
 +    tcg_gen_add_i32(d, d, t);
 +    tcg_temp_free_i32(t);
 +}
 +
 +static void gen_uaba_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
 +{
 +    TCGv_i64 t = tcg_temp_new_i64();
 +    gen_uabd_i64(t, a, b);
 +    tcg_gen_add_i64(d, d, t);
 +    tcg_temp_free_i64(t);
 +}
 +
 +static void gen_uaba_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
 +{
 +    TCGv_vec t = tcg_temp_new_vec_matching(d);
 +    gen_uabd_vec(vece, t, a, b);
 +    tcg_gen_add_vec(vece, d, d, t);
 +    tcg_temp_free_vec(t);
 +}
 +
 +void gen_gvec_uaba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = {
 +        INDEX_op_sub_vec, INDEX_op_add_vec,
 +        INDEX_op_umin_vec, INDEX_op_umax_vec, 0
 +    };
 +    static const GVecGen3 ops[4] = {
 +        { .fniv = gen_uaba_vec,
 +          .fno = gen_helper_gvec_uaba_b,
 +          .opt_opc = vecop_list,
 +          .load_dest = true,
 +          .vece = MO_8 },
 +        { .fniv = gen_uaba_vec,
 +          .fno = gen_helper_gvec_uaba_h,
 +          .opt_opc = vecop_list,
 +          .load_dest = true,
 +          .vece = MO_16 },
 +        { .fni4 = gen_uaba_i32,
 +          .fniv = gen_uaba_vec,
 +          .fno = gen_helper_gvec_uaba_s,
 +          .opt_opc = vecop_list,
 +          .load_dest = true,
 +          .vece = MO_32 },
 +        { .fni8 = gen_uaba_i64,
 +          .fniv = gen_uaba_vec,
 +          .fno = gen_helper_gvec_uaba_d,
 +          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +          .opt_opc = vecop_list,
 +          .load_dest = true,
 +          .vece = MO_64 },
 +    };
 +    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 +}
 +
  /* Translate a NEON data processing instruction.  Return nonzero if the
     instruction is invalid.
     We process data in a mixture of 32-bit and 64-bit chunks.
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
              }
              return 0;
 +        case NEON_3R_VABA:
 +            if (u) {
 +                gen_gvec_uaba(size, rd_ofs, rn_ofs, rm_ofs,
 +                              vec_size, vec_size);
 +            } else {
 +                gen_gvec_saba(size, rd_ofs, rn_ofs, rm_ofs,
 +                              vec_size, vec_size);
 +            }
 +            return 0;
 +
          case NEON_3R_VADD_VSUB:
          case NEON_3R_LOGIC:
          case NEON_3R_VMAX:
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          case NEON_3R_VQRSHL:
              GEN_NEON_INTEGER_OP_ENV(qrshl);
              break;
-         case MO_32: /* single precision */
+-        case NEON_3R_VABA:
-         case MO_64: /* double precision */
+-            GEN_NEON_INTEGER_OP(abd);
-@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
+-            tcg_temp_free_i32(tmp2);
-             unallocated_encoding(s);
+-            tmp2 = neon_load_reg(rd, pass);
-             return;
+-            gen_neon_add(size, tmp, tmp2);
-         }
+-            break;
--    } else {
+         case NEON_3R_VPMAX:
-+        break;
+             GEN_NEON_INTEGER_OP(pmax);
-+
+             break;
 +    case 2: /* complex fp */
 +        /* Each indexable element is a complex pair.  */
 +        size <<= 1;
 +        switch (size) {
 +        case MO_32:
 +            if (h && !is_q) {
 +                unallocated_encoding(s);
 +                return;
 +            }
 +            is_fp16 = true;
 +            break;
 +        case MO_64:
 +            break;
 +        default:
 +            unallocated_encoding(s);
 +            return;
 +        }
 +        break;
 +
 +    default: /* integer */
          switch (size) {
          case MO_8:
          case MO_64:
              unallocated_encoding(s);
              return;
          }
 +        break;
 +    }
 +    if (is_fp16 && !arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
 +        unallocated_encoding(s);
 +        return;
      }
      /* Given TCGMemOp size, adjust register and indexing.  */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
          fpst = NULL;
      }
 +    switch (16 * u + opcode) {
 +    case 0x11: /* FCMLA #0 */
 +    case 0x13: /* FCMLA #90 */
 +    case 0x15: /* FCMLA #180 */
 +    case 0x17: /* FCMLA #270 */
 +        tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, rd),
 +                           vec_full_reg_offset(s, rn),
 +                           vec_reg_offset(s, rm, index, size), fpst,
 +                           is_q ? 16 : 8, vec_full_reg_size(s),
 +                           extract32(insn, 13, 2), /* rot */
 +                           size == MO_64
 +                           ? gen_helper_gvec_fcmlas_idx
 +                           : gen_helper_gvec_fcmlah_idx);
 +        tcg_temp_free_ptr(fpst);
 +        return;
 +    }
 +
      if (size == 3) {
          TCGv_i64 tcg_idx = tcg_temp_new_i64();
          int pass;
 diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/vec_helper.c
 +++ b/target/arm/vec_helper.c
-@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_fcaddd)(void *vd, void *vn, void *vm,
+@@ -XXX,XX +XXX,XX @@ DO_ABD(gvec_uabd_s, uint32_t)
-     }
+ DO_ABD(gvec_uabd_d, uint64_t)
-     clear_tail(d, opr_sz, simd_maxsz(desc));
- }
+ #undef DO_ABD
 +
-+void HELPER(gvec_fcmlah)(void *vd, void *vn, void *vm,
++#define DO_ABA(NAME, TYPE)                                      \
-+                         void *vfpst, uint32_t desc)
++void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc)  \
-+{
++{                                                               \
-+    uintptr_t opr_sz = simd_oprsz(desc);
++    intptr_t i, opr_sz = simd_oprsz(desc);                      \
-+    float16 *d = vd;
++    TYPE *d = vd, *n = vn, *m = vm;                             \
-+    float16 *n = vn;
++                                                                \
-+    float16 *m = vm;
++    for (i = 0; i < opr_sz / sizeof(TYPE); ++i) {               \
-+    float_status *fpst = vfpst;
++        d[i] += n[i] < m[i] ? m[i] - n[i] : n[i] - m[i];        \
-+    intptr_t flip = extract32(desc, SIMD_DATA_SHIFT, 1);
++    }                                                           \
-+    uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1);
++    clear_tail(d, opr_sz, simd_maxsz(desc));                    \
-+    uint32_t neg_real = flip ^ neg_imag;
++}
-+    uintptr_t i;
++
-+
++DO_ABA(gvec_saba_b, int8_t)
-+    /* Shift boolean to the sign bit so we can xor to negate.  */
++DO_ABA(gvec_saba_h, int16_t)
-+    neg_real <<= 15;
++DO_ABA(gvec_saba_s, int32_t)
-+    neg_imag <<= 15;
++DO_ABA(gvec_saba_d, int64_t)
 +
-+    for (i = 0; i < opr_sz / 2; i += 2) {
++DO_ABA(gvec_uaba_b, uint8_t)
-+        float16 e2 = n[H2(i + flip)];
++DO_ABA(gvec_uaba_h, uint16_t)
-+        float16 e1 = m[H2(i + flip)] ^ neg_real;
++DO_ABA(gvec_uaba_s, uint32_t)
-+        float16 e4 = e2;
++DO_ABA(gvec_uaba_d, uint64_t)
-+        float16 e3 = m[H2(i + 1 - flip)] ^ neg_imag;
++
-+
++#undef DO_ABA
 +        d[H2(i)] = float16_muladd(e2, e1, d[H2(i)], 0, fpst);
 +        d[H2(i + 1)] = float16_muladd(e4, e3, d[H2(i + 1)], 0, fpst);
 +    }
 +    clear_tail(d, opr_sz, simd_maxsz(desc));
 +}
 +
 +void HELPER(gvec_fcmlah_idx)(void *vd, void *vn, void *vm,
 +                             void *vfpst, uint32_t desc)
 +{
 +    uintptr_t opr_sz = simd_oprsz(desc);
 +    float16 *d = vd;
 +    float16 *n = vn;
 +    float16 *m = vm;
 +    float_status *fpst = vfpst;
 +    intptr_t flip = extract32(desc, SIMD_DATA_SHIFT, 1);
 +    uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1);
 +    uint32_t neg_real = flip ^ neg_imag;
 +    uintptr_t i;
 +    float16 e1 = m[H2(flip)];
 +    float16 e3 = m[H2(1 - flip)];
 +
 +    /* Shift boolean to the sign bit so we can xor to negate.  */
 +    neg_real <<= 15;
 +    neg_imag <<= 15;
 +    e1 ^= neg_real;
 +    e3 ^= neg_imag;
 +
 +    for (i = 0; i < opr_sz / 2; i += 2) {
 +        float16 e2 = n[H2(i + flip)];
 +        float16 e4 = e2;
 +
 +        d[H2(i)] = float16_muladd(e2, e1, d[H2(i)], 0, fpst);
 +        d[H2(i + 1)] = float16_muladd(e4, e3, d[H2(i + 1)], 0, fpst);
 +    }
 +    clear_tail(d, opr_sz, simd_maxsz(desc));
 +}
 +
 +void HELPER(gvec_fcmlas)(void *vd, void *vn, void *vm,
 +                         void *vfpst, uint32_t desc)
 +{
 +    uintptr_t opr_sz = simd_oprsz(desc);
 +    float32 *d = vd;
 +    float32 *n = vn;
 +    float32 *m = vm;
 +    float_status *fpst = vfpst;
 +    intptr_t flip = extract32(desc, SIMD_DATA_SHIFT, 1);
 +    uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1);
 +    uint32_t neg_real = flip ^ neg_imag;
 +    uintptr_t i;
 +
 +    /* Shift boolean to the sign bit so we can xor to negate.  */
 +    neg_real <<= 31;
 +    neg_imag <<= 31;
 +
 +    for (i = 0; i < opr_sz / 4; i += 2) {
 +        float32 e2 = n[H4(i + flip)];
 +        float32 e1 = m[H4(i + flip)] ^ neg_real;
 +        float32 e4 = e2;
 +        float32 e3 = m[H4(i + 1 - flip)] ^ neg_imag;
 +
 +        d[H4(i)] = float32_muladd(e2, e1, d[H4(i)], 0, fpst);
 +        d[H4(i + 1)] = float32_muladd(e4, e3, d[H4(i + 1)], 0, fpst);
 +    }
 +    clear_tail(d, opr_sz, simd_maxsz(desc));
 +}
 +
 +void HELPER(gvec_fcmlas_idx)(void *vd, void *vn, void *vm,
 +                             void *vfpst, uint32_t desc)
 +{
 +    uintptr_t opr_sz = simd_oprsz(desc);
 +    float32 *d = vd;
 +    float32 *n = vn;
 +    float32 *m = vm;
 +    float_status *fpst = vfpst;
 +    intptr_t flip = extract32(desc, SIMD_DATA_SHIFT, 1);
 +    uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1);
 +    uint32_t neg_real = flip ^ neg_imag;
 +    uintptr_t i;
 +    float32 e1 = m[H4(flip)];
 +    float32 e3 = m[H4(1 - flip)];
 +
 +    /* Shift boolean to the sign bit so we can xor to negate.  */
 +    neg_real <<= 31;
 +    neg_imag <<= 31;
 +    e1 ^= neg_real;
 +    e3 ^= neg_imag;
 +
 +    for (i = 0; i < opr_sz / 4; i += 2) {
 +        float32 e2 = n[H4(i + flip)];
 +        float32 e4 = e2;
 +
 +        d[H4(i)] = float32_muladd(e2, e1, d[H4(i)], 0, fpst);
 +        d[H4(i + 1)] = float32_muladd(e4, e3, d[H4(i + 1)], 0, fpst);
 +    }
 +    clear_tail(d, opr_sz, simd_maxsz(desc));
 +}
 +
 +void HELPER(gvec_fcmlad)(void *vd, void *vn, void *vm,
 +                         void *vfpst, uint32_t desc)
 +{
 +    uintptr_t opr_sz = simd_oprsz(desc);
 +    float64 *d = vd;
 +    float64 *n = vn;
 +    float64 *m = vm;
 +    float_status *fpst = vfpst;
 +    intptr_t flip = extract32(desc, SIMD_DATA_SHIFT, 1);
 +    uint64_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1);
 +    uint64_t neg_real = flip ^ neg_imag;
 +    uintptr_t i;
 +
 +    /* Shift boolean to the sign bit so we can xor to negate.  */
 +    neg_real <<= 63;
 +    neg_imag <<= 63;
 +
 +    for (i = 0; i < opr_sz / 8; i += 2) {
 +        float64 e2 = n[i + flip];
 +        float64 e1 = m[i + flip] ^ neg_real;
 +        float64 e4 = e2;
 +        float64 e3 = m[i + 1 - flip] ^ neg_imag;
 +
 +        d[i] = float64_muladd(e2, e1, d[i], 0, fpst);
 +        d[i + 1] = float64_muladd(e4, e3, d[i + 1], 0, fpst);
 +    }
 +    clear_tail(d, opr_sz, simd_maxsz(desc));
 +}
 --
-.16.2
+.20.1
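
For readers skimming the SABA/UABA conversion above, the per-lane behaviour of the DO_ABA helper can be sketched as a standalone C model. This is only an illustration under stated assumptions: the function names uaba_u8 and saba_s8 are hypothetical, only 8-bit lanes are shown, and the QEMU-specific parts (the simd_oprsz()/simd_maxsz() descriptor decoding and the clear_tail() call) are left out.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical standalone model of the unsigned 8-bit lane case:
 * d[i] += |n[i] - m[i]|, with the accumulator wrapping modulo 256.
 * Mirrors the loop body of DO_ABA; not the QEMU helper itself.
 */
static void uaba_u8(uint8_t *d, const uint8_t *n, const uint8_t *m,
                    size_t lanes)
{
    for (size_t i = 0; i < lanes; ++i) {
        d[i] += n[i] < m[i] ? m[i] - n[i] : n[i] - m[i];
    }
}

/*
 * Hypothetical signed 8-bit variant. The difference is computed in int
 * after integer promotion; narrowing the sum back to int8_t wraps on
 * common compilers (the conversion is implementation-defined in ISO C).
 */
static void saba_s8(int8_t *d, const int8_t *n, const int8_t *m,
                    size_t lanes)
{
    for (size_t i = 0; i < lanes; ++i) {
        d[i] += n[i] < m[i] ? m[i] - n[i] : n[i] - m[i];
    }
}
```

The destination is both read and written, which is presumably why the GVecGen3 entries in gen_gvec_saba() and gen_gvec_uaba() set .load_dest = true.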

-[Qemu-devel] [PULL 23/39] mps2-an505: New board model: MPS2 with AN505 Cortex-M33 FPGA image
+[PULL 18/45] aspeed: Add support for the sonorapass-bmc board
-Define a new board model for the MPS2 with an AN505 FPGA image
+From: Patrick Williams <patrick@stwcx.xyz>
 containing a Cortex-M33. Since the FPGA images for TrustZone
 cores (AN505, and the similar AN519 for Cortex-M23) have a
 significantly different layout of devices to the non-TrustZone
 images, we use a new source file rather than shoehorning them
 into the existing mps2.c.
+Sonora Pass is a 2 socket x86 motherboard designed by Facebook
+and supported by OpenBMC.  Strapping configuration was obtained
+from hardware and i2c configuration is based on dts found at:
+https://github.com/facebook/openbmc-linux/blob/1633c87b8ba7c162095787c988979b748ba65dc8/arch/arm/boot/dts/aspeed-bmc-facebook-sonorapass.dts
+Booted a test image of http://github.com/facebook/openbmc to login
+prompt.
+Signed-off-by: Patrick Williams <patrick@stwcx.xyz>
+Reviewed-by: Amithash Prasad <amithash@fb.com>
+Reviewed-by: Cédric Le Goater <clg@kaod.org>
+[PMM: fixed block comment style nit]
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20180220180325.29818-20-peter.maydell@linaro.org
 ---
- hw/arm/Makefile.objs |   1 +
+ hw/arm/aspeed.c | 78 +++++++++++++++++++++++++++++++++++++++++++++++++
- hw/arm/mps2-tz.c     | 503 +++++++++++++++++++++++++++++++++++++++++++++++++++
+file changed, 78 insertions(+)
 files changed, 504 insertions(+)
  create mode 100644 hw/arm/mps2-tz.c
-diff --git a/hw/arm/Makefile.objs b/hw/arm/Makefile.objs
+diff --git a/hw/arm/aspeed.c b/hw/arm/aspeed.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/Makefile.objs
+--- a/hw/arm/aspeed.c
-+++ b/hw/arm/Makefile.objs
++++ b/hw/arm/aspeed.c
-@@ -XXX,XX +XXX,XX @@ obj-$(CONFIG_FSL_IMX31) += fsl-imx31.o kzm.o
+@@ -XXX,XX +XXX,XX @@ struct AspeedBoardState {
- obj-$(CONFIG_FSL_IMX6) += fsl-imx6.o sabrelite.o
+         SCU_AST2500_HW_STRAP_ACPI_ENABLE |                              \
- obj-$(CONFIG_ASPEED_SOC) += aspeed_soc.o aspeed.o
+         SCU_HW_STRAP_SPI_MODE(SCU_HW_STRAP_SPI_MASTER))
- obj-$(CONFIG_MPS2) += mps2.o
-+obj-$(CONFIG_MPS2) += mps2-tz.o
++/* Sonorapass hardware value: 0xF100D216 */
- obj-$(CONFIG_MSF2) += msf2-soc.o msf2-som.o
++#define SONORAPASS_BMC_HW_STRAP1 (                                      \
- obj-$(CONFIG_IOTKIT) += iotkit.o
++        SCU_AST2500_HW_STRAP_SPI_AUTOFETCH_ENABLE |                     \
-diff --git a/hw/arm/mps2-tz.c b/hw/arm/mps2-tz.c
++        SCU_AST2500_HW_STRAP_GPIO_STRAP_ENABLE |                        \
-new file mode 100644
++        SCU_AST2500_HW_STRAP_UART_DEBUG |                               \
-index XXXXXXX..XXXXXXX
++        SCU_AST2500_HW_STRAP_RESERVED28 |                               \
---- /dev/null
++        SCU_AST2500_HW_STRAP_DDR4_ENABLE |                              \
-+++ b/hw/arm/mps2-tz.c
++        SCU_HW_STRAP_VGA_CLASS_CODE |                                   \
-@@ -XXX,XX +XXX,XX @@
++        SCU_HW_STRAP_LPC_RESET_PIN |                                    \
-+/*
++        SCU_HW_STRAP_SPI_MODE(SCU_HW_STRAP_SPI_MASTER) |                \
-+ * ARM V2M MPS2 board emulation, trustzone aware FPGA images
++        SCU_AST2500_HW_STRAP_SET_AXI_AHB_RATIO(AXI_AHB_RATIO_2_1) |     \
-+ *
++        SCU_HW_STRAP_VGA_BIOS_ROM |                                     \
-+ * Copyright (c) 2017 Linaro Limited
++        SCU_HW_STRAP_VGA_SIZE_SET(VGA_16M_DRAM) |                       \
-+ * Written by Peter Maydell
++        SCU_AST2500_HW_STRAP_RESERVED1)
 + *
 + *  This program is free software; you can redistribute it and/or modify
 + *  it under the terms of the GNU General Public License version 2 or
 + *  (at your option) any later version.
 + */
 +
-+/* The MPS2 and MPS2+ dev boards are FPGA based (the 2+ has a bigger
+ /* Swift hardware value: 0xF11AD206 */
-+ * FPGA but is otherwise the same as the 2). Since the CPU itself
+ #define SWIFT_BMC_HW_STRAP1 (                                           \
-+ * and most of the devices are in the FPGA, the details of the board
+         AST2500_HW_STRAP1_DEFAULTS |                                    \
-+ * as seen by the guest depend significantly on the FPGA image.
+@@ -XXX,XX +XXX,XX @@ static void swift_bmc_i2c_init(AspeedBoardState *bmc)
-+ * This source file covers the following FPGA images, for TrustZone cores:
+     i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 12), "tmp105", 0x4a);
-+ *  "mps2-an505" -- Cortex-M33 as documented in ARM Application Note AN505
+ }
-+ *
-+ * Links to the TRM for the board itself and to the various Application
++static void sonorapass_bmc_i2c_init(AspeedBoardState *bmc)
-+ * Notes which document the FPGA images can be found here:
++{
-+ * https://developer.arm.com/products/system-design/development-boards/fpga-prototyping-boards/mps2
++    AspeedSoCState *soc = &bmc->soc;
 + *
 + * Board TRM:
 + * http://infocenter.arm.com/help/topic/com.arm.doc.100112_0200_06_en/versatile_express_cortex_m_prototyping_systems_v2m_mps2_and_v2m_mps2plus_technical_reference_100112_0200_06_en.pdf
 + * Application Note AN505:
 + * http://infocenter.arm.com/help/topic/com.arm.doc.dai0505b/index.html
 + *
 + * The AN505 defers to the Cortex-M33 processor ARMv8M IoT Kit FVP User Guide
 + * (ARM ECM0601256) for the details of some of the device layout:
 + *   http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ecm0601256/index.html
 + */
 +
-+#include "qemu/osdep.h"
++    /* bus 2 : */
-+#include "qapi/error.h"
++    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 2), "tmp105", 0x48);
-+#include "qemu/error-report.h"
++    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 2), "tmp105", 0x49);
-+#include "hw/arm/arm.h"
++    /* bus 2 : pca9546 @ 0x73 */
 +#include "hw/arm/armv7m.h"
 +#include "hw/or-irq.h"
 +#include "hw/boards.h"
 +#include "exec/address-spaces.h"
 +#include "sysemu/sysemu.h"
 +#include "hw/misc/unimp.h"
 +#include "hw/char/cmsdk-apb-uart.h"
 +#include "hw/timer/cmsdk-apb-timer.h"
 +#include "hw/misc/mps2-scc.h"
 +#include "hw/misc/mps2-fpgaio.h"
 +#include "hw/arm/iotkit.h"
 +#include "hw/devices.h"
 +#include "net/net.h"
 +#include "hw/core/split-irq.h"
 +
-+typedef enum MPS2TZFPGAType {
++    /* bus 3 : pca9548 @ 0x70 */
 +    FPGA_AN505,
 +} MPS2TZFPGAType;
 +
-+typedef struct {
++    /* bus 4 : */
-+    MachineClass parent;
++    uint8_t *eeprom4_54 = g_malloc0(8 * 1024);
-+    MPS2TZFPGAType fpga_type;
++    smbus_eeprom_init_one(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 4), 0x54,
-+    uint32_t scc_id;
++                          eeprom4_54);
-+} MPS2TZMachineClass;
++    /* PCA9539 @ 0x76, but PCA9552 is compatible */
 +    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 4), "pca9552", 0x76);
 +    /* PCA9539 @ 0x77, but PCA9552 is compatible */
 +    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 4), "pca9552", 0x77);
 +
-+typedef struct {
++    /* bus 6 : */
-+    MachineState parent;
++    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 6), "tmp105", 0x48);
 +    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 6), "tmp105", 0x49);
 +    /* bus 6 : pca9546 @ 0x73 */
 +
-+    IoTKit iotkit;
++    /* bus 8 : */
-+    MemoryRegion psram;
++    uint8_t *eeprom8_56 = g_malloc0(8 * 1024);
-+    MemoryRegion ssram1;
++    smbus_eeprom_init_one(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 8), 0x56,
-+    MemoryRegion ssram1_m;
++                          eeprom8_56);
-+    MemoryRegion ssram23;
++    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 8), "pca9552", 0x60);
-+    MPS2SCC scc;
++    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 8), "pca9552", 0x61);
-+    MPS2FPGAIO fpgaio;
++    /* bus 8 : adc128d818 @ 0x1d */
-+    TZPPC ppc[5];
++    /* bus 8 : adc128d818 @ 0x1f */
 +    UnimplementedDeviceState ssram_mpc[3];
 +    UnimplementedDeviceState spi[5];
 +    UnimplementedDeviceState i2c[4];
 +    UnimplementedDeviceState i2s_audio;
 +    UnimplementedDeviceState gpio[5];
 +    UnimplementedDeviceState dma[4];
 +    UnimplementedDeviceState gfx;
 +    CMSDKAPBUART uart[5];
 +    SplitIRQ sec_resp_splitter;
 +    qemu_or_irq uart_irq_orgate;
 +} MPS2TZMachineState;
 +
-+#define TYPE_MPS2TZ_MACHINE "mps2tz"
++    /*
-+#define TYPE_MPS2TZ_AN505_MACHINE MACHINE_TYPE_NAME("mps2-an505")
++     * bus 13 : pca9548 @ 0x71
-+
++     *      - channel 3:
-+#define MPS2TZ_MACHINE(obj) \
++     *          - tmp421 @ 0x4c
-+    OBJECT_CHECK(MPS2TZMachineState, obj, TYPE_MPS2TZ_MACHINE)
++     *          - tmp421 @ 0x4e
-+#define MPS2TZ_MACHINE_GET_CLASS(obj) \
++     *          - tmp421 @ 0x4f
-+    OBJECT_GET_CLASS(MPS2TZMachineClass, obj, TYPE_MPS2TZ_MACHINE)
++     */
 +#define MPS2TZ_MACHINE_CLASS(klass) \
 +    OBJECT_CLASS_CHECK(MPS2TZMachineClass, klass, TYPE_MPS2TZ_MACHINE)
 +
 +/* Main SYSCLK frequency in Hz */
 +#define SYSCLK_FRQ 20000000
 +
 +/* Initialize the auxiliary RAM region @mr and map it into
 + * the memory map at @base.
 + */
 +static void make_ram(MemoryRegion *mr, const char *name,
 +                     hwaddr base, hwaddr size)
 +{
 +    memory_region_init_ram(mr, NULL, name, size, &error_fatal);
 +    memory_region_add_subregion(get_system_memory(), base, mr);
 +}
 +
 +/* Create an alias of an entire original MemoryRegion @orig
 + * located at @base in the memory map.
 + */
 +static void make_ram_alias(MemoryRegion *mr, const char *name,
 +                           MemoryRegion *orig, hwaddr base)
 +{
 +    memory_region_init_alias(mr, NULL, name, orig, 0,
 +                             memory_region_size(orig));
 +    memory_region_add_subregion(get_system_memory(), base, mr);
 +}
 +
 +static void init_sysbus_child(Object *parent, const char *childname,
 +                              void *child, size_t childsize,
 +                              const char *childtype)
 +{
 +    object_initialize(child, childsize, childtype);
 +    object_property_add_child(parent, childname, OBJECT(child), &error_abort);
 +    qdev_set_parent_bus(DEVICE(child), sysbus_get_default());
 +
 +}
 +
-+/* Most of the devices in the AN505 FPGA image sit behind
+ static void witherspoon_bmc_i2c_init(AspeedBoardState *bmc)
-+ * Peripheral Protection Controllers. These data structures
+ {
-+ * define the layout of which devices sit behind which PPCs.
+     AspeedSoCState *soc = &bmc->soc;
-+ * The devfn for each port is a function which creates, configures
+@@ -XXX,XX +XXX,XX @@ static void aspeed_machine_romulus_class_init(ObjectClass *oc, void *data)
-+ * and initializes the device, returning the MemoryRegion which
+     mc->default_ram_size       = 512 * MiB;
-+ * needs to be plugged into the downstream end of the PPC port.
+ };
-+ */
-+typedef MemoryRegion *MakeDevFn(MPS2TZMachineState *mms, void *opaque,
++static void aspeed_machine_sonorapass_class_init(ObjectClass *oc, void *data)
 +                                const char *name, hwaddr size);
 +
 +typedef struct PPCPortInfo {
 +    const char *name;
 +    MakeDevFn *devfn;
 +    void *opaque;
 +    hwaddr addr;
 +    hwaddr size;
 +} PPCPortInfo;
 +
 +typedef struct PPCInfo {
 +    const char *name;
 +    PPCPortInfo ports[TZ_NUM_PORTS];
 +} PPCInfo;
 +
 +static MemoryRegion *make_unimp_dev(MPS2TZMachineState *mms,
 +                                       void *opaque,
 +                                       const char *name, hwaddr size)
 +{
 +    /* Initialize, configure and realize a TYPE_UNIMPLEMENTED_DEVICE,
 +     * and return a pointer to its MemoryRegion.
 +     */
 +    UnimplementedDeviceState *uds = opaque;
 +
 +    init_sysbus_child(OBJECT(mms), name, uds,
 +                      sizeof(UnimplementedDeviceState),
 +                      TYPE_UNIMPLEMENTED_DEVICE);
 +    qdev_prop_set_string(DEVICE(uds), "name", name);
 +    qdev_prop_set_uint64(DEVICE(uds), "size", size);
 +    object_property_set_bool(OBJECT(uds), true, "realized", &error_fatal);
 +    return sysbus_mmio_get_region(SYS_BUS_DEVICE(uds), 0);
 +}
 +
 +static MemoryRegion *make_uart(MPS2TZMachineState *mms, void *opaque,
 +                               const char *name, hwaddr size)
 +{
 +    CMSDKAPBUART *uart = opaque;
 +    int i = uart - &mms->uart[0];
 +    Chardev *uartchr = i < MAX_SERIAL_PORTS ? serial_hds[i] : NULL;
 +    int rxirqno = i * 2;
 +    int txirqno = i * 2 + 1;
 +    int combirqno = i + 10;
 +    SysBusDevice *s;
 +    DeviceState *iotkitdev = DEVICE(&mms->iotkit);
 +    DeviceState *orgate_dev = DEVICE(&mms->uart_irq_orgate);
 +
 +    init_sysbus_child(OBJECT(mms), name, uart,
 +                      sizeof(mms->uart[0]), TYPE_CMSDK_APB_UART);
 +    qdev_prop_set_chr(DEVICE(uart), "chardev", uartchr);
 +    qdev_prop_set_uint32(DEVICE(uart), "pclk-frq", SYSCLK_FRQ);
 +    object_property_set_bool(OBJECT(uart), true, "realized", &error_fatal);
 +    s = SYS_BUS_DEVICE(uart);
 +    sysbus_connect_irq(s, 0, qdev_get_gpio_in_named(iotkitdev,
 +                                                    "EXP_IRQ", txirqno));
 +    sysbus_connect_irq(s, 1, qdev_get_gpio_in_named(iotkitdev,
 +                                                    "EXP_IRQ", rxirqno));
 +    sysbus_connect_irq(s, 2, qdev_get_gpio_in(orgate_dev, i * 2));
 +    sysbus_connect_irq(s, 3, qdev_get_gpio_in(orgate_dev, i * 2 + 1));
 +    sysbus_connect_irq(s, 4, qdev_get_gpio_in_named(iotkitdev,
 +                                                    "EXP_IRQ", combirqno));
 +    return sysbus_mmio_get_region(SYS_BUS_DEVICE(uart), 0);
 +}
 +
 +static MemoryRegion *make_scc(MPS2TZMachineState *mms, void *opaque,
 +                              const char *name, hwaddr size)
 +{
 +    MPS2SCC *scc = opaque;
 +    DeviceState *sccdev;
 +    MPS2TZMachineClass *mmc = MPS2TZ_MACHINE_GET_CLASS(mms);
 +
 +    object_initialize(scc, sizeof(mms->scc), TYPE_MPS2_SCC);
 +    sccdev = DEVICE(scc);
 +    qdev_set_parent_bus(sccdev, sysbus_get_default());
 +    qdev_prop_set_uint32(sccdev, "scc-cfg4", 0x2);
 +    qdev_prop_set_uint32(sccdev, "scc-aid", 0x02000008);
 +    qdev_prop_set_uint32(sccdev, "scc-id", mmc->scc_id);
 +    object_property_set_bool(OBJECT(scc), true, "realized", &error_fatal);
 +    return sysbus_mmio_get_region(SYS_BUS_DEVICE(sccdev), 0);
 +}
 +
 +static MemoryRegion *make_fpgaio(MPS2TZMachineState *mms, void *opaque,
 +                                 const char *name, hwaddr size)
 +{
 +    MPS2FPGAIO *fpgaio = opaque;
 +
 +    object_initialize(fpgaio, sizeof(mms->fpgaio), TYPE_MPS2_FPGAIO);
 +    qdev_set_parent_bus(DEVICE(fpgaio), sysbus_get_default());
 +    object_property_set_bool(OBJECT(fpgaio), true, "realized", &error_fatal);
 +    return sysbus_mmio_get_region(SYS_BUS_DEVICE(fpgaio), 0);
 +}
 +
 +static void mps2tz_common_init(MachineState *machine)
 +{
 +    MPS2TZMachineState *mms = MPS2TZ_MACHINE(machine);
 +    MachineClass *mc = MACHINE_GET_CLASS(machine);
 +    MemoryRegion *system_memory = get_system_memory();
 +    DeviceState *iotkitdev;
 +    DeviceState *dev_splitter;
 +    int i;
 +
 +    if (strcmp(machine->cpu_type, mc->default_cpu_type) != 0) {
 +        error_report("This board can only be used with CPU %s",
 +                     mc->default_cpu_type);
 +        exit(1);
 +    }
 +
 +    init_sysbus_child(OBJECT(machine), "iotkit", &mms->iotkit,
 +                      sizeof(mms->iotkit), TYPE_IOTKIT);
 +    iotkitdev = DEVICE(&mms->iotkit);
 +    object_property_set_link(OBJECT(&mms->iotkit), OBJECT(system_memory),
 +                             "memory", &error_abort);
 +    qdev_prop_set_uint32(iotkitdev, "EXP_NUMIRQ", 92);
 +    qdev_prop_set_uint32(iotkitdev, "MAINCLK", SYSCLK_FRQ);
 +    object_property_set_bool(OBJECT(&mms->iotkit), true, "realized",
 +                             &error_fatal);
 +
 +    /* The sec_resp_cfg output from the IoTKit must be split into multiple
 +     * lines, one for each of the PPCs we create here.
 +     */
 +    object_initialize(&mms->sec_resp_splitter, sizeof(mms->sec_resp_splitter),
 +                      TYPE_SPLIT_IRQ);
 +    object_property_add_child(OBJECT(machine), "sec-resp-splitter",
 +                              OBJECT(&mms->sec_resp_splitter), &error_abort);
 +    object_property_set_int(OBJECT(&mms->sec_resp_splitter), 5,
 +                            "num-lines", &error_fatal);
 +    object_property_set_bool(OBJECT(&mms->sec_resp_splitter), true,
 +                             "realized", &error_fatal);
 +    dev_splitter = DEVICE(&mms->sec_resp_splitter);
 +    qdev_connect_gpio_out_named(iotkitdev, "sec_resp_cfg", 0,
 +                                qdev_get_gpio_in(dev_splitter, 0));
 +
 +    /* The IoTKit sets up much of the memory layout, including
 +     * the aliases between secure and non-secure regions in the
 +     * address space. The FPGA itself contains:
 +     *
 +     * 0x00000000..0x003fffff  SSRAM1
 +     * 0x00400000..0x007fffff  alias of SSRAM1
 +     * 0x28000000..0x283fffff  4MB SSRAM2 + SSRAM3
 +     * 0x40100000..0x4fffffff  AHB Master Expansion 1 interface devices
 +     * 0x80000000..0x80ffffff  16MB PSRAM
 +     */
 +
 +    /* The FPGA images have an odd combination of different RAMs,
 +     * because in hardware they are different implementations and
 +     * connected to different buses, giving varying performance/size
 +     * tradeoffs. For QEMU they're all just RAM, though. We arbitrarily
 +     * call the 16MB our "system memory", as it's the largest lump.
 +     */
 +    memory_region_allocate_system_memory(&mms->psram,
 +                                         NULL, "mps.ram", 0x01000000);
 +    memory_region_add_subregion(system_memory, 0x80000000, &mms->psram);
 +
 +    /* The SSRAM memories should all be behind Memory Protection Controllers,
 +     * but we don't implement that yet.
 +     */
 +    make_ram(&mms->ssram1, "mps.ssram1", 0x00000000, 0x00400000);
 +    make_ram_alias(&mms->ssram1_m, "mps.ssram1_m", &mms->ssram1, 0x00400000);
 +
 +    make_ram(&mms->ssram23, "mps.ssram23", 0x28000000, 0x00400000);
 +
 +    /* The overflow IRQs for all UARTs are ORed together.
 +     * Tx, Rx and "combined" IRQs are sent to the NVIC separately.
 +     * Create the OR gate for this.
 +     */
 +    object_initialize(&mms->uart_irq_orgate, sizeof(mms->uart_irq_orgate),
 +                      TYPE_OR_IRQ);
 +    object_property_add_child(OBJECT(mms), "uart-irq-orgate",
 +                              OBJECT(&mms->uart_irq_orgate), &error_abort);
 +    object_property_set_int(OBJECT(&mms->uart_irq_orgate), 10, "num-lines",
 +                            &error_fatal);
 +    object_property_set_bool(OBJECT(&mms->uart_irq_orgate), true,
 +                             "realized", &error_fatal);
 +    qdev_connect_gpio_out(DEVICE(&mms->uart_irq_orgate), 0,
 +                          qdev_get_gpio_in_named(iotkitdev, "EXP_IRQ", 15));
 +
 +    /* Most of the devices in the FPGA are behind Peripheral Protection
 +     * Controllers. The required order for initializing things is:
 +     *  + initialize the PPC
 +     *  + initialize, configure and realize downstream devices
 +     *  + connect downstream device MemoryRegions to the PPC
 +     *  + realize the PPC
 +     *  + map the PPC's MemoryRegions to the places in the address map
 +     *    where the downstream devices should appear
 +     *  + wire up the PPC's control lines to the IoTKit object
 +     */
 +
 +    const PPCInfo ppcs[] = { {
 +            .name = "apb_ppcexp0",
 +            .ports = {
 +                { "ssram-mpc0", make_unimp_dev, &mms->ssram_mpc[0],
 +                  0x58007000, 0x1000 },
 +                { "ssram-mpc1", make_unimp_dev, &mms->ssram_mpc[1],
 +                  0x58008000, 0x1000 },
 +                { "ssram-mpc2", make_unimp_dev, &mms->ssram_mpc[2],
 +                  0x58009000, 0x1000 },
 +            },
 +        }, {
 +            .name = "apb_ppcexp1",
 +            .ports = {
 +                { "spi0", make_unimp_dev, &mms->spi[0], 0x40205000, 0x1000 },
 +                { "spi1", make_unimp_dev, &mms->spi[1], 0x40206000, 0x1000 },
 +                { "spi2", make_unimp_dev, &mms->spi[2], 0x40209000, 0x1000 },
 +                { "spi3", make_unimp_dev, &mms->spi[3], 0x4020a000, 0x1000 },
 +                { "spi4", make_unimp_dev, &mms->spi[4], 0x4020b000, 0x1000 },
 +                { "uart0", make_uart, &mms->uart[0], 0x40200000, 0x1000 },
 +                { "uart1", make_uart, &mms->uart[1], 0x40201000, 0x1000 },
 +                { "uart2", make_uart, &mms->uart[2], 0x40202000, 0x1000 },
 +                { "uart3", make_uart, &mms->uart[3], 0x40203000, 0x1000 },
 +                { "uart4", make_uart, &mms->uart[4], 0x40204000, 0x1000 },
 +                { "i2c0", make_unimp_dev, &mms->i2c[0], 0x40207000, 0x1000 },
 +                { "i2c1", make_unimp_dev, &mms->i2c[1], 0x40208000, 0x1000 },
 +                { "i2c2", make_unimp_dev, &mms->i2c[2], 0x4020c000, 0x1000 },
 +                { "i2c3", make_unimp_dev, &mms->i2c[3], 0x4020d000, 0x1000 },
 +            },
 +        }, {
 +            .name = "apb_ppcexp2",
 +            .ports = {
 +                { "scc", make_scc, &mms->scc, 0x40300000, 0x1000 },
 +                { "i2s-audio", make_unimp_dev, &mms->i2s_audio,
 +                  0x40301000, 0x1000 },
 +                { "fpgaio", make_fpgaio, &mms->fpgaio, 0x40302000, 0x1000 },
 +            },
 +        }, {
 +            .name = "ahb_ppcexp0",
 +            .ports = {
 +                { "gfx", make_unimp_dev, &mms->gfx, 0x41000000, 0x140000 },
 +                { "gpio0", make_unimp_dev, &mms->gpio[0], 0x40100000, 0x1000 },
 +                { "gpio1", make_unimp_dev, &mms->gpio[1], 0x40101000, 0x1000 },
 +                { "gpio2", make_unimp_dev, &mms->gpio[2], 0x40102000, 0x1000 },
 +                { "gpio3", make_unimp_dev, &mms->gpio[3], 0x40103000, 0x1000 },
 +                { "gpio4", make_unimp_dev, &mms->gpio[4], 0x40104000, 0x1000 },
 +            },
 +        }, {
 +            .name = "ahb_ppcexp1",
 +            .ports = {
 +                { "dma0", make_unimp_dev, &mms->dma[0], 0x40110000, 0x1000 },
 +                { "dma1", make_unimp_dev, &mms->dma[1], 0x40111000, 0x1000 },
 +                { "dma2", make_unimp_dev, &mms->dma[2], 0x40112000, 0x1000 },
 +                { "dma3", make_unimp_dev, &mms->dma[3], 0x40113000, 0x1000 },
 +            },
 +        },
 +    };
 +
 +    for (i = 0; i < ARRAY_SIZE(ppcs); i++) {
 +        const PPCInfo *ppcinfo = &ppcs[i];
 +        TZPPC *ppc = &mms->ppc[i];
 +        DeviceState *ppcdev;
 +        int port;
 +        char *gpioname;
 +
 +        init_sysbus_child(OBJECT(machine), ppcinfo->name, ppc,
 +                          sizeof(TZPPC), TYPE_TZ_PPC);
 +        ppcdev = DEVICE(ppc);
 +
 +        for (port = 0; port < TZ_NUM_PORTS; port++) {
 +            const PPCPortInfo *pinfo = &ppcinfo->ports[port];
 +            MemoryRegion *mr;
 +            char *portname;
 +
 +            if (!pinfo->devfn) {
 +                continue;
 +            }
 +
 +            mr = pinfo->devfn(mms, pinfo->opaque, pinfo->name, pinfo->size);
 +            portname = g_strdup_printf("port[%d]", port);
 +            object_property_set_link(OBJECT(ppc), OBJECT(mr),
 +                                     portname, &error_fatal);
 +            g_free(portname);
 +        }
 +
 +        object_property_set_bool(OBJECT(ppc), true, "realized", &error_fatal);
 +
 +        for (port = 0; port < TZ_NUM_PORTS; port++) {
 +            const PPCPortInfo *pinfo = &ppcinfo->ports[port];
 +
 +            if (!pinfo->devfn) {
 +                continue;
 +            }
 +            sysbus_mmio_map(SYS_BUS_DEVICE(ppc), port, pinfo->addr);
 +
 +            gpioname = g_strdup_printf("%s_nonsec", ppcinfo->name);
 +            qdev_connect_gpio_out_named(iotkitdev, gpioname, port,
 +                                        qdev_get_gpio_in_named(ppcdev,
 +                                                               "cfg_nonsec",
 +                                                               port));
 +            g_free(gpioname);
 +            gpioname = g_strdup_printf("%s_ap", ppcinfo->name);
 +            qdev_connect_gpio_out_named(iotkitdev, gpioname, port,
 +                                        qdev_get_gpio_in_named(ppcdev,
 +                                                               "cfg_ap", port));
 +            g_free(gpioname);
 +        }
 +
 +        gpioname = g_strdup_printf("%s_irq_enable", ppcinfo->name);
 +        qdev_connect_gpio_out_named(iotkitdev, gpioname, 0,
 +                                    qdev_get_gpio_in_named(ppcdev,
 +                                                           "irq_enable", 0));
 +        g_free(gpioname);
 +        gpioname = g_strdup_printf("%s_irq_clear", ppcinfo->name);
 +        qdev_connect_gpio_out_named(iotkitdev, gpioname, 0,
 +                                    qdev_get_gpio_in_named(ppcdev,
 +                                                           "irq_clear", 0));
 +        g_free(gpioname);
 +        gpioname = g_strdup_printf("%s_irq_status", ppcinfo->name);
 +        qdev_connect_gpio_out_named(ppcdev, "irq", 0,
 +                                    qdev_get_gpio_in_named(iotkitdev,
 +                                                           gpioname, 0));
 +        g_free(gpioname);
 +
 +        qdev_connect_gpio_out(dev_splitter, i,
 +                              qdev_get_gpio_in_named(ppcdev,
 +                                                     "cfg_sec_resp", 0));
 +    }
 +
 +    /* In hardware this is a LAN9220; the LAN9118 is software compatible
 +     * except that it doesn't support the checksum-offload feature.
 +     * The ethernet controller is not behind a PPC.
 +     */
 +    lan9118_init(&nd_table[0], 0x42000000,
 +                 qdev_get_gpio_in_named(iotkitdev, "EXP_IRQ", 16));
 +
 +    create_unimplemented_device("FPGA NS PC", 0x48007000, 0x1000);
 +
 +    armv7m_load_kernel(ARM_CPU(first_cpu), machine->kernel_filename, 0x400000);
 +}
 +
 +static void mps2tz_class_init(ObjectClass *oc, void *data)
 +{
 +    MachineClass *mc = MACHINE_CLASS(oc);
++    AspeedMachineClass *amc = ASPEED_MACHINE_CLASS(oc);
 +
-+    mc->init = mps2tz_common_init;
++    mc->desc       = "OCP SonoraPass BMC (ARM1176)";
-+    mc->max_cpus = 1;
++    amc->soc_name  = "ast2500-a1";
-+}
++    amc->hw_strap1 = SONORAPASS_BMC_HW_STRAP1;
-+
++    amc->fmc_model = "mx66l1g45g";
-+static void mps2tz_an505_class_init(ObjectClass *oc, void *data)
++    amc->spi_model = "mx66l1g45g";
-+{
++    amc->num_cs    = 2;
-+    MachineClass *mc = MACHINE_CLASS(oc);
++    amc->i2c_init  = sonorapass_bmc_i2c_init;
-+    MPS2TZMachineClass *mmc = MPS2TZ_MACHINE_CLASS(oc);
++    mc->default_ram_size       = 512 * MiB;
 +
 +    mc->desc = "ARM MPS2 with AN505 FPGA image for Cortex-M33";
 +    mmc->fpga_type = FPGA_AN505;
 +    mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-m33");
 +    mmc->scc_id = 0x41040000 | (505 << 4);
 +}
 +
 +static const TypeInfo mps2tz_info = {
 +    .name = TYPE_MPS2TZ_MACHINE,
 +    .parent = TYPE_MACHINE,
 +    .abstract = true,
 +    .instance_size = sizeof(MPS2TZMachineState),
 +    .class_size = sizeof(MPS2TZMachineClass),
 +    .class_init = mps2tz_class_init,
 +};
 +
-+static const TypeInfo mps2tz_an505_info = {
+ static void aspeed_machine_swift_class_init(ObjectClass *oc, void *data)
-+    .name = TYPE_MPS2TZ_AN505_MACHINE,
+ {
-+    .parent = TYPE_MPS2TZ_MACHINE,
+     MachineClass *mc = MACHINE_CLASS(oc);
-+    .class_init = mps2tz_an505_class_init,
+@@ -XXX,XX +XXX,XX @@ static const TypeInfo aspeed_machine_types[] = {
-+};
+         .name          = MACHINE_TYPE_NAME("swift-bmc"),
-+
+         .parent        = TYPE_ASPEED_MACHINE,
-+static void mps2tz_machine_init(void)
+         .class_init    = aspeed_machine_swift_class_init,
-+{
++    }, {
-+    type_register_static(&mps2tz_info);
++        .name          = MACHINE_TYPE_NAME("sonorapass-bmc"),
-+    type_register_static(&mps2tz_an505_info);
++        .parent        = TYPE_ASPEED_MACHINE,
-+}
++        .class_init    = aspeed_machine_sonorapass_class_init,
-+
+     }, {
-+type_init(mps2tz_machine_init);
+         .name          = MACHINE_TYPE_NAME("witherspoon-bmc"),
          .parent        = TYPE_ASPEED_MACHINE,
 --
-.16.2
+.20.1
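
Reviewer note: the mps2-tz hunk in this comparison sets up its PPC ports from a table. Each entry carries a constructor callback, and slots with a NULL callback are skipped. The standalone C sketch below illustrates only that pattern and uses no QEMU API. `PortInfo`, `make_dummy`, `NUM_PORTS` and `setup_ports` are hypothetical names invented for this example, and the integer "handle" stands in for the MemoryRegion pointer returned in the real code.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_PORTS 4 /* hypothetical; plays the role of TZ_NUM_PORTS */

/* Constructor callback: "creates" a device and returns a non-zero handle. */
typedef int MakeDevFn(void *opaque, const char *name, uint64_t size);

typedef struct PortInfo {
    const char *name;
    MakeDevFn *devfn;   /* NULL means this port slot is unused */
    void *opaque;
    uint64_t addr;
    uint64_t size;
} PortInfo;

/* Hypothetical constructor: hands out increasing ids from a counter. */
static int make_dummy(void *opaque, const char *name, uint64_t size)
{
    int *counter = opaque;

    (void)name;
    (void)size;
    return ++*counter;
}

/*
 * Walk the table, call the constructor for every populated slot and
 * record its handle (0 for unused slots). Returns the number of ports
 * that were set up.
 */
static int setup_ports(const PortInfo *ports, int *handles)
{
    int n = 0;

    for (int port = 0; port < NUM_PORTS; port++) {
        const PortInfo *p = &ports[port];

        if (!p->devfn) {
            handles[port] = 0;
            continue;
        }
        handles[port] = p->devfn(p->opaque, p->name, p->size);
        n++;
    }
    return n;
}
```

Partially initialised tables work here because C zero-fills the remaining array elements, so their `devfn` is NULL and the loop skips them.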

-[Qemu-devel] [PULL 04/39] decodetree: Propagate return value from translate subroutines
+[PULL 19/45] acpi: nvdimm: change NVDIMM_UUID_LE to a common macro
-From: Richard Henderson <richard.henderson@linaro.org>
+From: Dongjiu Geng <gengdongjiu@huawei.com>
-Allow the translate subroutines to return false for invalid insns.
+The little-endian UUID is used in many places, so make
 NVDIMM_UUID_LE a common macro that converts the UUID
 to a little-endian array.
-At present we can of course invoke an invalid insn exception from within
+Reviewed-by: Xiang Zheng <zhengxiang9@huawei.com>
-the translate subroutine, but in the short term this consolidates code.
+Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
-In the long term it would allow the decodetree language to support
+Message-id: 20200512030609.19593-2-gengdongjiu@huawei.com
 overlapping patterns for ISA extensions.
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
 Message-id: 20180227232618.2908-1-richard.henderson@linaro.org
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- scripts/decodetree.py | 5 ++---
+ include/qemu/uuid.h | 27 +++++++++++++++++++++++++++
-file changed, 2 insertions(+), 3 deletions(-)
+ hw/acpi/nvdimm.c    | 10 +++-------
 files changed, 30 insertions(+), 7 deletions(-)
-diff --git a/scripts/decodetree.py b/scripts/decodetree.py
+diff --git a/include/qemu/uuid.h b/include/qemu/uuid.h
-index XXXXXXX..XXXXXXX 100755
+index XXXXXXX..XXXXXXX 100644
---- a/scripts/decodetree.py
+--- a/include/qemu/uuid.h
-+++ b/scripts/decodetree.py
++++ b/include/qemu/uuid.h
-@@ -XXX,XX +XXX,XX @@ class Pattern(General):
+@@ -XXX,XX +XXX,XX @@ typedef struct {
-         global translate_prefix
+     };
-         output('typedef ', self.base.base.struct_name(),
+ } QemuUUID;
-                ' arg_', self.name, ';\n')
--        output(translate_scope, 'void ', translate_prefix, '_', self.name,
++/**
-+        output(translate_scope, 'bool ', translate_prefix, '_', self.name,
++ * UUID_LE - converts the fields of UUID to little-endian array,
-                '(DisasContext *ctx, arg_', self.name,
++ * each parameter is a field of the UUID.
-                ' *a, ', insntype, ' insn);\n')
++ *
++ * @time_low: The low field of the timestamp
-@@ -XXX,XX +XXX,XX @@ class Pattern(General):
++ * @time_mid: The middle field of the timestamp
-             output(ind, self.base.extract_name(), '(&u.f_', arg, ', insn);\n')
++ * @time_hi_and_version: The high field of the timestamp
-         for n, f in self.fields.items():
++ *                       multiplexed with the version number
-             output(ind, 'u.f_', arg, '.', n, ' = ', f.str_extract(), ';\n')
++ * @clock_seq_hi_and_reserved: The high field of the clock
--        output(ind, translate_prefix, '_', self.name,
++ *                             sequence multiplexed with the variant
-+        output(ind, 'return ', translate_prefix, '_', self.name,
++ * @clock_seq_low: The low field of the clock sequence
-                '(ctx, &u.f_', arg, ', insn);\n')
++ * @node0: The spatially unique node0 identifier
--        output(ind, 'return true;\n')
++ * @node1: The spatially unique node1 identifier
- # end Pattern
++ * @node2: The spatially unique node2 identifier
++ * @node3: The spatially unique node3 identifier
++ * @node4: The spatially unique node4 identifier
 + * @node5: The spatially unique node5 identifier
 + */
 +#define UUID_LE(time_low, time_mid, time_hi_and_version,                    \
 +  clock_seq_hi_and_reserved, clock_seq_low, node0, node1, node2,            \
 +  node3, node4, node5)                                                      \
 +  { (time_low) & 0xff, ((time_low) >> 8) & 0xff, ((time_low) >> 16) & 0xff, \
 +    ((time_low) >> 24) & 0xff, (time_mid) & 0xff, ((time_mid) >> 8) & 0xff, \
 +    (time_hi_and_version) & 0xff, ((time_hi_and_version) >> 8) & 0xff,      \
 +    (clock_seq_hi_and_reserved), (clock_seq_low), (node0), (node1), (node2),\
 +    (node3), (node4), (node5) }
 +
  #define UUID_FMT "%02hhx%02hhx%02hhx%02hhx-" \
                   "%02hhx%02hhx-%02hhx%02hhx-" \
                   "%02hhx%02hhx-" \
 diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/acpi/nvdimm.c
 +++ b/hw/acpi/nvdimm.c
@@ -XXX,XX +XXX,XX @@
   */
  #include "qemu/osdep.h"
 +#include "qemu/uuid.h"
  #include "hw/acpi/acpi.h"
  #include "hw/acpi/aml-build.h"
  #include "hw/acpi/bios-linker-loader.h"
@@ -XXX,XX +XXX,XX @@
  #include "hw/mem/nvdimm.h"
  #include "qemu/nvdimm-utils.h"
 -#define NVDIMM_UUID_LE(a, b, c, d0, d1, d2, d3, d4, d5, d6, d7)             \
 -   { (a) & 0xff, ((a) >> 8) & 0xff, ((a) >> 16) & 0xff, ((a) >> 24) & 0xff, \
 -     (b) & 0xff, ((b) >> 8) & 0xff, (c) & 0xff, ((c) >> 8) & 0xff,          \
 -     (d0), (d1), (d2), (d3), (d4), (d5), (d6), (d7) }
 -
  /*
   * define Byte Addressable Persistent Memory (PM) Region according to
   * ACPI 6.0: 5.2.25.1 System Physical Address Range Structure.
   */
  static const uint8_t nvdimm_nfit_spa_uuid[] =
 -      NVDIMM_UUID_LE(0x66f0d379, 0xb4f3, 0x4074, 0xac, 0x43, 0x0d, 0x33,
 -                     0x18, 0xb7, 0x8c, 0xdb);
 +      UUID_LE(0x66f0d379, 0xb4f3, 0x4074, 0xac, 0x43, 0x0d, 0x33,
 +              0x18, 0xb7, 0x8c, 0xdb);
  /*
   * NVDIMM Firmware Interface Table
 --
-.16.2
+.20.1
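
Reviewer note: a small standalone check of the byte order that UUID_LE produces. The macro body is copied from the hunk above and the input is the NFIT SPA UUID from hw/acpi/nvdimm.c. The helper `bytes_to_hex` and the array name `spa_uuid` are hypothetical and exist only for this sketch.

```c
#include <stdint.h>
#include <stdio.h>

/* Same expansion as the UUID_LE macro added to include/qemu/uuid.h. */
#define UUID_LE(time_low, time_mid, time_hi_and_version,                    \
  clock_seq_hi_and_reserved, clock_seq_low, node0, node1, node2,            \
  node3, node4, node5)                                                      \
  { (time_low) & 0xff, ((time_low) >> 8) & 0xff, ((time_low) >> 16) & 0xff, \
    ((time_low) >> 24) & 0xff, (time_mid) & 0xff, ((time_mid) >> 8) & 0xff, \
    (time_hi_and_version) & 0xff, ((time_hi_and_version) >> 8) & 0xff,      \
    (clock_seq_hi_and_reserved), (clock_seq_low), (node0), (node1), (node2),\
    (node3), (node4), (node5) }

/* UUID 66f0d379-b4f3-4074-ac43-0d3318b78cdb, as used in hw/acpi/nvdimm.c. */
static const uint8_t spa_uuid[16] =
    UUID_LE(0x66f0d379, 0xb4f3, 0x4074, 0xac, 0x43, 0x0d, 0x33,
            0x18, 0xb7, 0x8c, 0xdb);

/* Hypothetical helper: write 16 bytes as 32 hex digits plus a NUL into out. */
static void bytes_to_hex(const uint8_t *bytes, char *out)
{
    for (int i = 0; i < 16; i++) {
        sprintf(out + 2 * i, "%02x", bytes[i]);
    }
}
```

Only the three timestamp fields are byte-swapped; the clock sequence and node bytes are stored in the order given, so the array starts 79 d3 f0 66 f3 b4 74 40 and continues ac 43 0d 33 18 b7 8c db.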

-[Qemu-devel] [PULL 03/39] xlnx-zynqmp: Connect the RTC device
+[PULL 20/45] hw/arm/virt: Introduce a RAS machine option
-From: Alistair Francis <alistair.francis@xilinx.com>
+From: Dongjiu Geng <gengdongjiu@huawei.com>
-Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
+The RAS Virtualization feature is not currently supported, so
 add a RAS machine option and disable it by default.
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
+Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
 Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
 Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
 Reviewed-by: Igor Mammedov <imammedo@redhat.com>
 Message-id: 20200512030609.19593-3-gengdongjiu@huawei.com
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- include/hw/arm/xlnx-zynqmp.h |  2 ++
+ include/hw/arm/virt.h |  1 +
- hw/arm/xlnx-zynqmp.c         | 14 ++++++++++++++
+ hw/arm/virt.c         | 23 +++++++++++++++++++++++
-files changed, 16 insertions(+)
+files changed, 24 insertions(+)
-diff --git a/include/hw/arm/xlnx-zynqmp.h b/include/hw/arm/xlnx-zynqmp.h
+diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
 index XXXXXXX..XXXXXXX 100644
---- a/include/hw/arm/xlnx-zynqmp.h
+--- a/include/hw/arm/virt.h
-+++ b/include/hw/arm/xlnx-zynqmp.h
++++ b/include/hw/arm/virt.h
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ typedef struct {
- #include "hw/dma/xlnx_dpdma.h"
+     bool highmem_ecam;
- #include "hw/display/xlnx_dp.h"
+     bool its;
- #include "hw/intc/xlnx-zynqmp-ipi.h"
+     bool virt;
-+#include "hw/timer/xlnx-zynqmp-rtc.h"
++    bool ras;
+     OnOffAuto acpi;
- #define TYPE_XLNX_ZYNQMP "xlnx,zynqmp"
+     VirtGICType gic_version;
- #define XLNX_ZYNQMP(obj) OBJECT_CHECK(XlnxZynqMPState, (obj), \
+     VirtIOMMUType iommu;
-@@ -XXX,XX +XXX,XX @@ typedef struct XlnxZynqMPState {
+diff --git a/hw/arm/virt.c b/hw/arm/virt.c
      XlnxDPState dp;
      XlnxDPDMAState dpdma;
      XlnxZynqMPIPI ipi;
 +    XlnxZynqMPRTC rtc;
      char *boot_cpu;
      ARMCPU *boot_cpu_ptr;
 diff --git a/hw/arm/xlnx-zynqmp.c b/hw/arm/xlnx-zynqmp.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/xlnx-zynqmp.c
+--- a/hw/arm/virt.c
-+++ b/hw/arm/xlnx-zynqmp.c
++++ b/hw/arm/virt.c
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ static void virt_set_acpi(Object *obj, Visitor *v, const char *name,
- #define IPI_ADDR            0xFF300000
+     visit_type_OnOffAuto(v, name, &vms->acpi, errp);
- #define IPI_IRQ             64
+ }
-+#define RTC_ADDR            0xffa60000
++static bool virt_get_ras(Object *obj, Error **errp)
-+#define RTC_IRQ             26
++{
 +    VirtMachineState *vms = VIRT_MACHINE(obj);
 +
- #define SDHCI_CAPABILITIES  0x280737ec6481 /* Datasheet: UG1085 (v1.7) */
++    return vms->ras;
++}
  static const uint64_t gem_addr[XLNX_ZYNQMP_NUM_GEMS] = {
@@ -XXX,XX +XXX,XX @@ static void xlnx_zynqmp_init(Object *obj)
      object_initialize(&s->ipi, sizeof(s->ipi), TYPE_XLNX_ZYNQMP_IPI);
      qdev_set_parent_bus(DEVICE(&s->ipi), sysbus_get_default());
 +
-+    object_initialize(&s->rtc, sizeof(s->rtc), TYPE_XLNX_ZYNQMP_RTC);
++static void virt_set_ras(Object *obj, bool value, Error **errp)
-+    qdev_set_parent_bus(DEVICE(&s->rtc), sysbus_get_default());
++{
- }
++    VirtMachineState *vms = VIRT_MACHINE(obj);
  static void xlnx_zynqmp_realize(DeviceState *dev, Error **errp)
@@ -XXX,XX +XXX,XX @@ static void xlnx_zynqmp_realize(DeviceState *dev, Error **errp)
      }
      sysbus_mmio_map(SYS_BUS_DEVICE(&s->ipi), 0, IPI_ADDR);
      sysbus_connect_irq(SYS_BUS_DEVICE(&s->ipi), 0, gic_spi[IPI_IRQ]);
 +
-+    object_property_set_bool(OBJECT(&s->rtc), true, "realized", &err);
++    vms->ras = value;
-+    if (err) {
++}
-+        error_propagate(errp, err);
++
-+        return;
+ static char *virt_get_gic_version(Object *obj, Error **errp)
-+    }
+ {
-+    sysbus_mmio_map(SYS_BUS_DEVICE(&s->rtc), 0, RTC_ADDR);
+     VirtMachineState *vms = VIRT_MACHINE(obj);
-+    sysbus_connect_irq(SYS_BUS_DEVICE(&s->rtc), 0, gic_spi[RTC_IRQ]);
+@@ -XXX,XX +XXX,XX @@ static void virt_instance_init(Object *obj)
- }
+                                     "Valid values are none and smmuv3",
+                                     NULL);
- static Property xlnx_zynqmp_props[] = {
 +    /* Default disallows RAS instantiation */
 +    vms->ras = false;
 +    object_property_add_bool(obj, "ras", virt_get_ras,
 +                             virt_set_ras, NULL);
 +    object_property_set_description(obj, "ras",
 +                                    "Set on/off to enable/disable reporting host memory errors "
 +                                    "to a KVM guest using ACPI and guest external abort exceptions",
 +                                    NULL);
 +
      vms->irqmap = a15irqmap;
      virt_flash_create(vms);
 --
-.16.2
+.20.1
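
Reviewer note: the patch above exposes "ras" as a boolean machine property backed by a getter/setter pair, defaulting to off. The standalone C sketch below mimics that shape without any QEMU API. `Machine`, `BoolProp`, `props` and `machine_set_prop` are hypothetical names, and the "on"/"off" parsing is a simplification assumed for the example rather than QEMU's actual visitor code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef struct Machine {
    bool ras; /* off unless the user asks for it */
} Machine;

typedef bool BoolGetter(Machine *m);
typedef void BoolSetter(Machine *m, bool value);

typedef struct BoolProp {
    const char *name;
    BoolGetter *get;
    BoolSetter *set;
} BoolProp;

static bool machine_get_ras(Machine *m)
{
    return m->ras;
}

static void machine_set_ras(Machine *m, bool value)
{
    m->ras = value;
}

/* Hypothetical property table; the real code registers with QOM instead. */
static const BoolProp props[] = {
    { "ras", machine_get_ras, machine_set_ras },
};

/*
 * Set a named boolean property from "on" or "off".
 * Returns false for an unknown property name or an unparsable value.
 */
static bool machine_set_prop(Machine *m, const char *name, const char *value)
{
    for (size_t i = 0; i < sizeof(props) / sizeof(props[0]); i++) {
        if (strcmp(props[i].name, name) != 0) {
            continue;
        }
        if (strcmp(value, "on") == 0) {
            props[i].set(m, true);
            return true;
        }
        if (strcmp(value, "off") == 0) {
            props[i].set(m, false);
            return true;
        }
        return false;
    }
    return false;
}
```

With the real patch applied, the property would presumably be toggled as a machine option, for example `-machine virt,ras=on`.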

-[Qemu-devel] [PULL 01/39] xlnx-zynqmp-rtc: Initial commit
+[PULL 21/45] docs: APEI GHES generation and CPER record description
-From: Alistair Francis <alistair.francis@xilinx.com>
+From: Dongjiu Geng <gengdongjiu@huawei.com>
-Initial commit of the ZynqMP RTC device.
+Add a detailed APEI/GHES design document.
-Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
+Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
-Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
+Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
 Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
 Reviewed-by: Igor Mammedov <imammedo@redhat.com>
 Message-id: 20200512030609.19593-4-gengdongjiu@huawei.com
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- hw/timer/Makefile.objs             |   1 +
+ docs/specs/acpi_hest_ghes.rst | 110 ++++++++++++++++++++++++++++++++++
- include/hw/timer/xlnx-zynqmp-rtc.h |  84 +++++++++++++++
+ docs/specs/index.rst          |   1 +
- hw/timer/xlnx-zynqmp-rtc.c         | 214 +++++++++++++++++++++++++++++++++++++
+files changed, 111 insertions(+)
-files changed, 299 insertions(+)
+ create mode 100644 docs/specs/acpi_hest_ghes.rst
  create mode 100644 include/hw/timer/xlnx-zynqmp-rtc.h
  create mode 100644 hw/timer/xlnx-zynqmp-rtc.c
-diff --git a/hw/timer/Makefile.objs b/hw/timer/Makefile.objs
+diff --git a/docs/specs/acpi_hest_ghes.rst b/docs/specs/acpi_hest_ghes.rst
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/timer/Makefile.objs
 +++ b/hw/timer/Makefile.objs
@@ -XXX,XX +XXX,XX @@ common-obj-$(CONFIG_IMX) += imx_epit.o
  common-obj-$(CONFIG_IMX) += imx_gpt.o
  common-obj-$(CONFIG_LM32) += lm32_timer.o
  common-obj-$(CONFIG_MILKYMIST) += milkymist-sysctl.o
 +common-obj-$(CONFIG_XLNX_ZYNQMP) += xlnx-zynqmp-rtc.o
  obj-$(CONFIG_ALTERA_TIMER) += altera_timer.o
  obj-$(CONFIG_EXYNOS4) += exynos4210_mct.o
 diff --git a/include/hw/timer/xlnx-zynqmp-rtc.h b/include/hw/timer/xlnx-zynqmp-rtc.h
 new file mode 100644
 index XXXXXXX..XXXXXXX
 --- /dev/null
-+++ b/include/hw/timer/xlnx-zynqmp-rtc.h
++++ b/docs/specs/acpi_hest_ghes.rst
 @@ -XXX,XX +XXX,XX @@
-+/*
++APEI table generation and CPER records
-+ * QEMU model of the Xilinx ZynqMP Real Time Clock (RTC).
++======================================
 + *
 + * Copyright (c) 2017 Xilinx Inc.
 + *
 + * Written-by: Alistair Francis <alistair.francis@xilinx.com>
 + *
 + * Permission is hereby granted, free of charge, to any person obtaining a copy
 + * of this software and associated documentation files (the "Software"), to deal
 + * in the Software without restriction, including without limitation the rights
 + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 + * copies of the Software, and to permit persons to whom the Software is
 + * furnished to do so, subject to the following conditions:
 + *
 + * The above copyright notice and this permission notice shall be included in
 + * all copies or substantial portions of the Software.
 + *
 + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
 + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 + * THE SOFTWARE.
 + */
 +
-+#include "hw/register.h"
++..
 +   Copyright (c) 2020 HUAWEI TECHNOLOGIES CO., LTD.
 +
-+#define TYPE_XLNX_ZYNQMP_RTC "xlnx-zynmp.rtc"
++   This work is licensed under the terms of the GNU GPL, version 2 or later.
 +   See the COPYING file in the top-level directory.
 +
-+#define XLNX_ZYNQMP_RTC(obj) \
++Design Details
-+     OBJECT_CHECK(XlnxZynqMPRTC, (obj), TYPE_XLNX_ZYNQMP_RTC)
++--------------
 +
-+REG32(SET_TIME_WRITE, 0x0)
++::
 +REG32(SET_TIME_READ, 0x4)
 +REG32(CALIB_WRITE, 0x8)
 +    FIELD(CALIB_WRITE, FRACTION_EN, 20, 1)
 +    FIELD(CALIB_WRITE, FRACTION_DATA, 16, 4)
 +    FIELD(CALIB_WRITE, MAX_TICK, 0, 16)
 +REG32(CALIB_READ, 0xc)
 +    FIELD(CALIB_READ, FRACTION_EN, 20, 1)
 +    FIELD(CALIB_READ, FRACTION_DATA, 16, 4)
 +    FIELD(CALIB_READ, MAX_TICK, 0, 16)
 +REG32(CURRENT_TIME, 0x10)
 +REG32(CURRENT_TICK, 0x14)
 +    FIELD(CURRENT_TICK, VALUE, 0, 16)
 +REG32(ALARM, 0x18)
 +REG32(RTC_INT_STATUS, 0x20)
 +    FIELD(RTC_INT_STATUS, ALARM, 1, 1)
 +    FIELD(RTC_INT_STATUS, SECONDS, 0, 1)
 +REG32(RTC_INT_MASK, 0x24)
 +    FIELD(RTC_INT_MASK, ALARM, 1, 1)
 +    FIELD(RTC_INT_MASK, SECONDS, 0, 1)
 +REG32(RTC_INT_EN, 0x28)
 +    FIELD(RTC_INT_EN, ALARM, 1, 1)
 +    FIELD(RTC_INT_EN, SECONDS, 0, 1)
 +REG32(RTC_INT_DIS, 0x2c)
 +    FIELD(RTC_INT_DIS, ALARM, 1, 1)
 +    FIELD(RTC_INT_DIS, SECONDS, 0, 1)
 +REG32(ADDR_ERROR, 0x30)
 +    FIELD(ADDR_ERROR, STATUS, 0, 1)
 +REG32(ADDR_ERROR_INT_MASK, 0x34)
 +    FIELD(ADDR_ERROR_INT_MASK, MASK, 0, 1)
 +REG32(ADDR_ERROR_INT_EN, 0x38)
 +    FIELD(ADDR_ERROR_INT_EN, MASK, 0, 1)
 +REG32(ADDR_ERROR_INT_DIS, 0x3c)
 +    FIELD(ADDR_ERROR_INT_DIS, MASK, 0, 1)
 +REG32(CONTROL, 0x40)
 +    FIELD(CONTROL, BATTERY_DISABLE, 31, 1)
 +    FIELD(CONTROL, OSC_CNTRL, 24, 4)
 +    FIELD(CONTROL, SLVERR_ENABLE, 0, 1)
 +REG32(SAFETY_CHK, 0x50)
 +
-+#define XLNX_ZYNQMP_RTC_R_MAX (R_SAFETY_CHK + 1)
++         etc/acpi/tables                           etc/hardware_errors
 +      ====================                   ===============================
 +  + +--------------------------+            +----------------------------+
 +  | | HEST                     | +--------->|    error_block_address1    |------+
 +  | +--------------------------+ |          +----------------------------+      |
 +  | | GHES1                    | | +------->|    error_block_address2    |------+-+
 +  | +--------------------------+ | |        +----------------------------+      | |
 +  | | .................        | | |        |      ..............        |      | |
 +  | | error_status_address-----+-+ |        -----------------------------+      | |
 +  | | .................        |   |   +--->|    error_block_addressN    |------+-+---+
 +  | | read_ack_register--------+-+ |   |    +----------------------------+      | |   |
 +  | | read_ack_preserve        | +-+---+--->|     read_ack_register1     |      | |   |
 +  | | read_ack_write           |   |   |    +----------------------------+      | |   |
 +  + +--------------------------+   | +-+--->|     read_ack_register2     |      | |   |
 +  | | GHES2                    |   | | |    +----------------------------+      | |   |
 +  + +--------------------------+   | | |    |       .............        |      | |   |
 +  | | .................        |   | | |    +----------------------------+      | |   |
 +  | | error_status_address-----+---+ | | +->|     read_ack_registerN     |      | |   |
 +  | | .................        |     | | |  +----------------------------+      | |   |
 +  | | read_ack_register--------+-----+ | |  |Generic Error Status Block 1|<-----+ |   |
 +  | | read_ack_preserve        |       | |  |-+------------------------+-+        |   |
 +  | | read_ack_write           |       | |  | |          CPER          | |        |   |
 +  + +--------------------------|       | |  | |          CPER          | |        |   |
 +  | | ...............          |       | |  | |          ....          | |        |   |
 +  + +--------------------------+       | |  | |          CPER          | |        |   |
 +  | | GHESN                    |       | |  |-+------------------------+-|        |   |
 +  + +--------------------------+       | |  |Generic Error Status Block 2|<-------+   |
 +  | | .................        |       | |  |-+------------------------+-+            |
 +  | | error_status_address-----+-------+ |  | |           CPER         | |            |
 +  | | .................        |         |  | |           CPER         | |            |
 +  | | read_ack_register--------+---------+  | |           ....         | |            |
 +  | | read_ack_preserve        |            | |           CPER         | |            |
 +  | | read_ack_write           |            +-+------------------------+-+            |
 +  + +--------------------------+            |         ..........         |            |
 +                                            |----------------------------+            |
 +                                            |Generic Error Status Block N |<----------+
 +                                            |-+-------------------------+-+
 +                                            | |          CPER           | |
 +                                            | |          CPER           | |
 +                                            | |          ....           | |
 +                                            | |          CPER           | |
 +                                            +-+-------------------------+-+
 +
-+typedef struct XlnxZynqMPRTC {
-+    SysBusDevice parent_obj;
-+    MemoryRegion iomem;
-+    qemu_irq irq_rtc_int;
-+    qemu_irq irq_addr_error_int;
 +
-+    uint32_t regs[XLNX_ZYNQMP_RTC_R_MAX];
++(1) QEMU generates the ACPI HEST table. This table goes in the current
-+    RegisterInfo regs_info[XLNX_ZYNQMP_RTC_R_MAX];
++    "etc/acpi/tables" fw_cfg blob. Each error source has different
-+} XlnxZynqMPRTC;
++    notification types.
 diff --git a/hw/timer/xlnx-zynqmp-rtc.c b/hw/timer/xlnx-zynqmp-rtc.c
 new file mode 100644
 index XXXXXXX..XXXXXXX
 --- /dev/null
 +++ b/hw/timer/xlnx-zynqmp-rtc.c
@@ -XXX,XX +XXX,XX @@
 +/*
 + * QEMU model of the Xilinx ZynqMP Real Time Clock (RTC).
 + *
 + * Copyright (c) 2017 Xilinx Inc.
 + *
 + * Written-by: Alistair Francis <alistair.francis@xilinx.com>
 + *
 + * Permission is hereby granted, free of charge, to any person obtaining a copy
 + * of this software and associated documentation files (the "Software"), to deal
 + * in the Software without restriction, including without limitation the rights
 + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 + * copies of the Software, and to permit persons to whom the Software is
 + * furnished to do so, subject to the following conditions:
 + *
 + * The above copyright notice and this permission notice shall be included in
 + * all copies or substantial portions of the Software.
 + *
 + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
 + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 + * THE SOFTWARE.
 + */
 +
-+#include "qemu/osdep.h"
++(2) A new fw_cfg blob called "etc/hardware_errors" is introduced. QEMU
-+#include "hw/sysbus.h"
++    also needs to populate this blob. The "etc/hardware_errors" fw_cfg blob
-+#include "hw/register.h"
++    contains an address registers table and an Error Status Data Block table.
 +#include "qemu/bitops.h"
 +#include "qemu/log.h"
 +#include "hw/timer/xlnx-zynqmp-rtc.h"
 +
-+#ifndef XLNX_ZYNQMP_RTC_ERR_DEBUG
++(3) The address registers table contains N Error Block Address entries
-+#define XLNX_ZYNQMP_RTC_ERR_DEBUG 0
++    and N Read Ack Register entries. Each entry is 8 bytes in size.
-+#endif
++    The Error Status Data Block table contains N Error Status Data Block
 +    entries. Each entry is 4096 (0x1000) bytes in size. The total size
 +    of the "etc/hardware_errors" fw_cfg blob is (N * 8 * 2 + N * 4096) bytes.
 +    N is the number of kinds of hardware error sources.
 +
-+static void rtc_int_update_irq(XlnxZynqMPRTC *s)
++(4) QEMU generates the ACPI linker/loader script for the firmware. The
-+{
++    firmware pre-allocates memory for "etc/acpi/tables", "etc/hardware_errors"
-+    bool pending = s->regs[R_RTC_INT_STATUS] & ~s->regs[R_RTC_INT_MASK];
++    and copies blob contents there.
 +    qemu_set_irq(s->irq_rtc_int, pending);
 +}
 +
-+static void addr_error_int_update_irq(XlnxZynqMPRTC *s)
++(5) QEMU generates N ADD_POINTER commands, which patch addresses in the
-+{
++    "error_status_address" fields of the HEST table with a pointer to the
-+    bool pending = s->regs[R_ADDR_ERROR] & ~s->regs[R_ADDR_ERROR_INT_MASK];
++    corresponding "address registers" in the "etc/hardware_errors" blob.
 +    qemu_set_irq(s->irq_addr_error_int, pending);
 +}
 +
-+static void rtc_int_status_postw(RegisterInfo *reg, uint64_t val64)
++(6) QEMU generates N ADD_POINTER commands, which patch addresses in the
-+{
++    "read_ack_register" fields of the HEST table with a pointer to the
-+    XlnxZynqMPRTC *s = XLNX_ZYNQMP_RTC(reg->opaque);
++    corresponding "read_ack_register" within the "etc/hardware_errors" blob.
 +    rtc_int_update_irq(s);
 +}
 +
-+static uint64_t rtc_int_en_prew(RegisterInfo *reg, uint64_t val64)
++(7) QEMU generates N ADD_POINTER commands for the firmware, which patch
-+{
++    addresses in the "error_block_address" fields with a pointer to the
-+    XlnxZynqMPRTC *s = XLNX_ZYNQMP_RTC(reg->opaque);
++    respective "Error Status Data Block" in the "etc/hardware_errors" blob.
 +
-+    s->regs[R_RTC_INT_MASK] &= (uint32_t) ~val64;
++(8) QEMU defines a third and write-only fw_cfg blob which is called
-+    rtc_int_update_irq(s);
++    "etc/hardware_errors_addr". Through that blob, the firmware can send back
-+    return 0;
++    the guest-side allocation addresses to QEMU. The "etc/hardware_errors_addr"
-+}
++    blob contains an 8-byte entry. QEMU generates a single WRITE_POINTER command
 +    for the firmware. The firmware will write back the start address of
 +    "etc/hardware_errors" blob to the fw_cfg file "etc/hardware_errors_addr".
 +
-+static uint64_t rtc_int_dis_prew(RegisterInfo *reg, uint64_t val64)
++(9) When QEMU gets a SIGBUS from the kernel, QEMU writes the CPER into the
-+{
++    corresponding "Error Status Data Block" in guest memory, and then injects a
-+    XlnxZynqMPRTC *s = XLNX_ZYNQMP_RTC(reg->opaque);
++    platform-specific interrupt (a Synchronous External Abort on the arm/virt
 +    machine) to notify the guest.
 +
-+    s->regs[R_RTC_INT_MASK] |= (uint32_t) val64;
++(10) This notification (in virtual hardware) will be handled by the guest
-+    rtc_int_update_irq(s);
++     kernel. On receiving the notification, the guest APEI driver can read the
-+    return 0;
++     CPER error and take appropriate action.
 +}
 +
-+static void addr_error_postw(RegisterInfo *reg, uint64_t val64)
++(11) kvm_arch_on_sigbus_vcpu() uses source_id as an index into "etc/hardware_errors"
-+{
++     to find the "Error Status Data Block" entry corresponding to the error source.
-+    XlnxZynqMPRTC *s = XLNX_ZYNQMP_RTC(reg->opaque);
++     Supported source_id values should therefore be assigned here and not changed
-+    addr_error_int_update_irq(s);
++     afterwards, so that the guest writes the error into the expected "Error Status
-+}
++     Data Block" even if the guest was migrated to a newer QEMU.
-+
+diff --git a/docs/specs/index.rst b/docs/specs/index.rst
-+static uint64_t addr_error_int_en_prew(RegisterInfo *reg, uint64_t val64)
+index XXXXXXX..XXXXXXX 100644
-+{
+--- a/docs/specs/index.rst
-+    XlnxZynqMPRTC *s = XLNX_ZYNQMP_RTC(reg->opaque);
++++ b/docs/specs/index.rst
-+
+@@ -XXX,XX +XXX,XX @@ Contents:
-+    s->regs[R_ADDR_ERROR_INT_MASK] &= (uint32_t) ~val64;
+    ppc-spapr-xive
-+    addr_error_int_update_irq(s);
+    acpi_hw_reduced_hotplug
-+    return 0;
+    tpm
-+}
++   acpi_hest_ghes
 +
 +static uint64_t addr_error_int_dis_prew(RegisterInfo *reg, uint64_t val64)
 +{
 +    XlnxZynqMPRTC *s = XLNX_ZYNQMP_RTC(reg->opaque);
 +
 +    s->regs[R_ADDR_ERROR_INT_MASK] |= (uint32_t) val64;
 +    addr_error_int_update_irq(s);
 +    return 0;
 +}
 +
 +static const RegisterAccessInfo rtc_regs_info[] = {
 +    {   .name = "SET_TIME_WRITE",  .addr = A_SET_TIME_WRITE,
 +    },{ .name = "SET_TIME_READ",  .addr = A_SET_TIME_READ,
 +        .ro = 0xffffffff,
 +    },{ .name = "CALIB_WRITE",  .addr = A_CALIB_WRITE,
 +    },{ .name = "CALIB_READ",  .addr = A_CALIB_READ,
 +        .ro = 0x1fffff,
 +    },{ .name = "CURRENT_TIME",  .addr = A_CURRENT_TIME,
 +        .ro = 0xffffffff,
 +    },{ .name = "CURRENT_TICK",  .addr = A_CURRENT_TICK,
 +        .ro = 0xffff,
 +    },{ .name = "ALARM",  .addr = A_ALARM,
 +    },{ .name = "RTC_INT_STATUS",  .addr = A_RTC_INT_STATUS,
 +        .w1c = 0x3,
 +        .post_write = rtc_int_status_postw,
 +    },{ .name = "RTC_INT_MASK",  .addr = A_RTC_INT_MASK,
 +        .reset = 0x3,
 +        .ro = 0x3,
 +    },{ .name = "RTC_INT_EN",  .addr = A_RTC_INT_EN,
 +        .pre_write = rtc_int_en_prew,
 +    },{ .name = "RTC_INT_DIS",  .addr = A_RTC_INT_DIS,
 +        .pre_write = rtc_int_dis_prew,
 +    },{ .name = "ADDR_ERROR",  .addr = A_ADDR_ERROR,
 +        .w1c = 0x1,
 +        .post_write = addr_error_postw,
 +    },{ .name = "ADDR_ERROR_INT_MASK",  .addr = A_ADDR_ERROR_INT_MASK,
 +        .reset = 0x1,
 +        .ro = 0x1,
 +    },{ .name = "ADDR_ERROR_INT_EN",  .addr = A_ADDR_ERROR_INT_EN,
 +        .pre_write = addr_error_int_en_prew,
 +    },{ .name = "ADDR_ERROR_INT_DIS",  .addr = A_ADDR_ERROR_INT_DIS,
 +        .pre_write = addr_error_int_dis_prew,
 +    },{ .name = "CONTROL",  .addr = A_CONTROL,
 +        .reset = 0x1000000,
 +        .rsvd = 0x70fffffe,
 +    },{ .name = "SAFETY_CHK",  .addr = A_SAFETY_CHK,
 +    }
 +};
 +
 +static void rtc_reset(DeviceState *dev)
 +{
 +    XlnxZynqMPRTC *s = XLNX_ZYNQMP_RTC(dev);
 +    unsigned int i;
 +
 +    for (i = 0; i < ARRAY_SIZE(s->regs_info); ++i) {
 +        register_reset(&s->regs_info[i]);
 +    }
 +
 +    rtc_int_update_irq(s);
 +    addr_error_int_update_irq(s);
 +}
 +
 +static const MemoryRegionOps rtc_ops = {
 +    .read = register_read_memory,
 +    .write = register_write_memory,
 +    .endianness = DEVICE_LITTLE_ENDIAN,
 +    .valid = {
 +        .min_access_size = 4,
 +        .max_access_size = 4,
 +    },
 +};
 +
 +static void rtc_init(Object *obj)
 +{
 +    XlnxZynqMPRTC *s = XLNX_ZYNQMP_RTC(obj);
 +    SysBusDevice *sbd = SYS_BUS_DEVICE(obj);
 +    RegisterInfoArray *reg_array;
 +
 +    memory_region_init(&s->iomem, obj, TYPE_XLNX_ZYNQMP_RTC,
 +                       XLNX_ZYNQMP_RTC_R_MAX * 4);
 +    reg_array =
 +        register_init_block32(DEVICE(obj), rtc_regs_info,
 +                              ARRAY_SIZE(rtc_regs_info),
 +                              s->regs_info, s->regs,
 +                              &rtc_ops,
 +                              XLNX_ZYNQMP_RTC_ERR_DEBUG,
 +                              XLNX_ZYNQMP_RTC_R_MAX * 4);
 +    memory_region_add_subregion(&s->iomem,
 +                                0x0,
 +                                &reg_array->mem);
 +    sysbus_init_mmio(sbd, &s->iomem);
 +    sysbus_init_irq(sbd, &s->irq_rtc_int);
 +    sysbus_init_irq(sbd, &s->irq_addr_error_int);
 +}
 +
 +static const VMStateDescription vmstate_rtc = {
 +    .name = TYPE_XLNX_ZYNQMP_RTC,
 +    .version_id = 1,
 +    .minimum_version_id = 1,
 +    .fields = (VMStateField[]) {
 +        VMSTATE_UINT32_ARRAY(regs, XlnxZynqMPRTC, XLNX_ZYNQMP_RTC_R_MAX),
 +        VMSTATE_END_OF_LIST(),
 +    }
 +};
 +
 +static void rtc_class_init(ObjectClass *klass, void *data)
 +{
 +    DeviceClass *dc = DEVICE_CLASS(klass);
 +
 +    dc->reset = rtc_reset;
 +    dc->vmsd = &vmstate_rtc;
 +}
 +
 +static const TypeInfo rtc_info = {
 +    .name          = TYPE_XLNX_ZYNQMP_RTC,
 +    .parent        = TYPE_SYS_BUS_DEVICE,
 +    .instance_size = sizeof(XlnxZynqMPRTC),
 +    .class_init    = rtc_class_init,
 +    .instance_init = rtc_init,
 +};
 +
 +static void rtc_register_types(void)
 +{
 +    type_register_static(&rtc_info);
 +}
 +
 +type_init(rtc_register_types)
 --
-.16.2
+.20.1

-[Qemu-devel] [PULL 22/39] hw/arm/iotkit: Model Arm IOT Kit
+[PULL 22/45] ACPI: Build related register address fields via hardware error fw_cfg blob
-Model the Arm IoT Kit documented in
+From: Dongjiu Geng <gengdongjiu@huawei.com>
-http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ecm0601256/index.html
+This patch builds error_block_address and read_ack_register fields
-The Arm IoT Kit is a subsystem which includes a CPU and some devices,
+in the hardware errors table. The error_block_address points to a Generic
-and is intended be extended by adding extra devices to form a
+Error Status Block(GESB) via bios_linker. The max size for one GESB
-complete system.  It is used in the MPS2 board's AN505 image for the
+is 1 KiB. For more detailed information, please refer to the
-Cortex-M33.
+document: docs/specs/acpi_hest_ghes.rst
 Currently only one error source is supported; this can be extended to
 support more if necessary.
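
As a rough illustration of the blob layout described above (a sketch only,
not part of the patch), the offsets inside "etc/hardware_errors" can be
computed as below. The sketch_* helper names and SKETCH_* constants are
hypothetical; the constants are assumed to mirror
ACPI_GHES_ERROR_SOURCE_COUNT and ACPI_GHES_MAX_RAW_DATA_LENGTH from
hw/acpi/ghes.c, and the ordering (error_block_address entries, then
read_ack_register entries, then the status blocks) follows
build_ghes_error_table().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror the patch: one error source, 1 KiB per status block. */
#define SKETCH_ERROR_SOURCE_COUNT  1u
#define SKETCH_MAX_RAW_DATA_LENGTH 1024u

/* Hypothetical helper: offset of the i-th error_block_address entry. */
static size_t sketch_error_block_address_offset(unsigned i)
{
    assert(i < SKETCH_ERROR_SOURCE_COUNT);
    return sizeof(uint64_t) * i;
}

/* Hypothetical helper: offset of the i-th read_ack_register entry,
 * which follows all error_block_address entries. */
static size_t sketch_read_ack_register_offset(unsigned i)
{
    assert(i < SKETCH_ERROR_SOURCE_COUNT);
    return sizeof(uint64_t) * (SKETCH_ERROR_SOURCE_COUNT + i);
}

/* Hypothetical helper: offset of the i-th Generic Error Status Block,
 * which follows both arrays of 8-byte entries. */
static size_t sketch_status_block_offset(unsigned i)
{
    assert(i < SKETCH_ERROR_SOURCE_COUNT);
    return 2 * sizeof(uint64_t) * SKETCH_ERROR_SOURCE_COUNT
           + (size_t)i * SKETCH_MAX_RAW_DATA_LENGTH;
}

/* Total blob size: two 8-byte entries plus one status block per source. */
static size_t sketch_blob_size(void)
{
    return (size_t)SKETCH_ERROR_SOURCE_COUNT
           * (2 * sizeof(uint64_t) + SKETCH_MAX_RAW_DATA_LENGTH);
}
```

With a single error source, the error_block_address entry sits at offset 0,
the read_ack_register entry at offset 8, the status block at offset 16, and
the blob is 1040 bytes in total.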
 Suggested-by: Laszlo Ersek <lersek@redhat.com>
 Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
 Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
 Reviewed-by: Igor Mammedov <imammedo@redhat.com>
 Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
 Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
 Message-id: 20200512030609.19593-5-gengdongjiu@huawei.com
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20180220180325.29818-19-peter.maydell@linaro.org
 ---
- hw/arm/Makefile.objs            |   1 +
+ default-configs/arm-softmmu.mak |  1 +
- include/hw/arm/iotkit.h         | 109 ++++++++
+ include/hw/acpi/aml-build.h     |  1 +
- hw/arm/iotkit.c                 | 598 ++++++++++++++++++++++++++++++++++++++++
+ include/hw/acpi/ghes.h          | 28 +++++++++++
- default-configs/arm-softmmu.mak |   1 +
+ hw/acpi/aml-build.c             |  2 +
-files changed, 709 insertions(+)
+ hw/acpi/ghes.c                  | 89 +++++++++++++++++++++++++++++++++
- create mode 100644 include/hw/arm/iotkit.h
+ hw/arm/virt-acpi-build.c        |  5 ++
- create mode 100644 hw/arm/iotkit.c
+ hw/acpi/Kconfig                 |  4 ++
+ hw/acpi/Makefile.objs           |  1 +
-diff --git a/hw/arm/Makefile.objs b/hw/arm/Makefile.objs
+files changed, 131 insertions(+)
-index XXXXXXX..XXXXXXX 100644
+ create mode 100644 include/hw/acpi/ghes.h
---- a/hw/arm/Makefile.objs
+ create mode 100644 hw/acpi/ghes.c
-+++ b/hw/arm/Makefile.objs
-@@ -XXX,XX +XXX,XX @@ obj-$(CONFIG_FSL_IMX6) += fsl-imx6.o sabrelite.o
+diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
- obj-$(CONFIG_ASPEED_SOC) += aspeed_soc.o aspeed.o
+index XXXXXXX..XXXXXXX 100644
- obj-$(CONFIG_MPS2) += mps2.o
+--- a/default-configs/arm-softmmu.mak
- obj-$(CONFIG_MSF2) += msf2-soc.o msf2-som.o
++++ b/default-configs/arm-softmmu.mak
-+obj-$(CONFIG_IOTKIT) += iotkit.o
+@@ -XXX,XX +XXX,XX @@ CONFIG_FSL_IMX7=y
-diff --git a/include/hw/arm/iotkit.h b/include/hw/arm/iotkit.h
+ CONFIG_FSL_IMX6UL=y
  CONFIG_SEMIHOSTING=y
  CONFIG_ALLWINNER_H3=y
 +CONFIG_ACPI_APEI=y
 diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
 index XXXXXXX..XXXXXXX 100644
 --- a/include/hw/acpi/aml-build.h
 +++ b/include/hw/acpi/aml-build.h
@@ -XXX,XX +XXX,XX @@ struct AcpiBuildTables {
      GArray *rsdp;
      GArray *tcpalog;
      GArray *vmgenid;
 +    GArray *hardware_errors;
      BIOSLinker *linker;
  } AcpiBuildTables;
 diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
 new file mode 100644
 index XXXXXXX..XXXXXXX
 --- /dev/null
-+++ b/include/hw/arm/iotkit.h
++++ b/include/hw/acpi/ghes.h
 @@ -XXX,XX +XXX,XX @@
 +/*
-+ * ARM IoT Kit
++ * Support for generating APEI tables and recording CPER for Guests
 + *
-+ * Copyright (c) 2018 Linaro Limited
++ * Copyright (c) 2020 HUAWEI TECHNOLOGIES CO., LTD.
-+ * Written by Peter Maydell
++ *
 + * Author: Dongjiu Geng <gengdongjiu@huawei.com>
 + *
 + * This program is free software; you can redistribute it and/or modify
-+ * it under the terms of the GNU General Public License version 2 or
++ * it under the terms of the GNU General Public License as published by
 + * the Free Software Foundation; either version 2 of the License, or
 + * (at your option) any later version.
++
++ * This program is distributed in the hope that it will be useful,
++ * but WITHOUT ANY WARRANTY; without even the implied warranty of
++ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
++ * GNU General Public License for more details.
++
++ * You should have received a copy of the GNU General Public License along
++ * with this program; if not, see <http://www.gnu.org/licenses/>.
 + */
 +
-+/* This is a model of the Arm IoT Kit which is documented in
++#ifndef ACPI_GHES_H
-+ * http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ecm0601256/index.html
++#define ACPI_GHES_H
-+ * It contains:
++
-+ *  a Cortex-M33
++#include "hw/acpi/bios-linker-loader.h"
-+ *  the IDAU
++
-+ *  some timers and watchdogs
++void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker);
 + *  two peripheral protection controllers
 + *  a memory protection controller
 + *  a security controller
 + *  a bus fabric which arranges that some parts of the address
 + *  space are secure and non-secure aliases of each other
 + *
 + * QEMU interface:
 + *  + QOM property "memory" is a MemoryRegion containing the devices provided
 + *    by the board model.
 + *  + QOM property "MAINCLK" is the frequency of the main system clock
 + *  + QOM property "EXP_NUMIRQ" sets the number of expansion interrupts
 + *  + Named GPIO inputs "EXP_IRQ" 0..n are the expansion interrupts, which
 + *    are wired to the NVIC lines 32 .. n+32
 + * Controlling up to 4 AHB expansion PPBs which a system using the IoTKit
 + * might provide:
 + *  + named GPIO outputs apb_ppcexp{0,1,2,3}_nonsec[0..15]
 + *  + named GPIO outputs apb_ppcexp{0,1,2,3}_ap[0..15]
 + *  + named GPIO outputs apb_ppcexp{0,1,2,3}_irq_enable
 + *  + named GPIO outputs apb_ppcexp{0,1,2,3}_irq_clear
 + *  + named GPIO inputs apb_ppcexp{0,1,2,3}_irq_status
 + * Controlling each of the 4 expansion AHB PPCs which a system using the IoTKit
 + * might provide:
 + *  + named GPIO outputs ahb_ppcexp{0,1,2,3}_nonsec[0..15]
 + *  + named GPIO outputs ahb_ppcexp{0,1,2,3}_ap[0..15]
 + *  + named GPIO outputs ahb_ppcexp{0,1,2,3}_irq_enable
 + *  + named GPIO outputs ahb_ppcexp{0,1,2,3}_irq_clear
 + *  + named GPIO inputs ahb_ppcexp{0,1,2,3}_irq_status
 + */
 +
 +#ifndef IOTKIT_H
 +#define IOTKIT_H
 +
 +#include "hw/sysbus.h"
 +#include "hw/arm/armv7m.h"
 +#include "hw/misc/iotkit-secctl.h"
 +#include "hw/misc/tz-ppc.h"
 +#include "hw/timer/cmsdk-apb-timer.h"
 +#include "hw/misc/unimp.h"
 +#include "hw/or-irq.h"
 +#include "hw/core/split-irq.h"
 +
 +#define TYPE_IOTKIT "iotkit"
 +#define IOTKIT(obj) OBJECT_CHECK(IoTKit, (obj), TYPE_IOTKIT)
 +
 +/* We have an IRQ splitter and an OR gate input for each external PPC
 + * and the 2 internal PPCs
 + */
 +#define NUM_EXTERNAL_PPCS (IOTS_NUM_AHB_EXP_PPC + IOTS_NUM_APB_EXP_PPC)
 +#define NUM_PPCS (NUM_EXTERNAL_PPCS + 2)
 +
 +typedef struct IoTKit {
 +    /*< private >*/
 +    SysBusDevice parent_obj;
 +
 +    /*< public >*/
 +    ARMv7MState armv7m;
 +    IoTKitSecCtl secctl;
 +    TZPPC apb_ppc0;
 +    TZPPC apb_ppc1;
 +    CMSDKAPBTIMER timer0;
 +    CMSDKAPBTIMER timer1;
 +    qemu_or_irq ppc_irq_orgate;
 +    SplitIRQ sec_resp_splitter;
 +    SplitIRQ ppc_irq_splitter[NUM_PPCS];
 +
 +    UnimplementedDeviceState dualtimer;
 +    UnimplementedDeviceState s32ktimer;
 +
 +    MemoryRegion container;
 +    MemoryRegion alias1;
 +    MemoryRegion alias2;
 +    MemoryRegion alias3;
 +    MemoryRegion sram0;
 +
 +    qemu_irq *exp_irqs;
 +    qemu_irq ppc0_irq;
 +    qemu_irq ppc1_irq;
 +    qemu_irq sec_resp_cfg;
 +    qemu_irq sec_resp_cfg_in;
 +    qemu_irq nsc_cfg_in;
 +
 +    qemu_irq irq_status_in[NUM_EXTERNAL_PPCS];
 +
 +    uint32_t nsccfg;
 +
 +    /* Properties */
 +    MemoryRegion *board_memory;
 +    uint32_t exp_numirq;
 +    uint32_t mainclk_frq;
 +} IoTKit;
 +
 +#endif
-diff --git a/hw/arm/iotkit.c b/hw/arm/iotkit.c
+diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/acpi/aml-build.c
 +++ b/hw/acpi/aml-build.c
@@ -XXX,XX +XXX,XX @@ void acpi_build_tables_init(AcpiBuildTables *tables)
      tables->table_data = g_array_new(false, true /* clear */, 1);
      tables->tcpalog = g_array_new(false, true /* clear */, 1);
      tables->vmgenid = g_array_new(false, true /* clear */, 1);
 +    tables->hardware_errors = g_array_new(false, true /* clear */, 1);
      tables->linker = bios_linker_loader_init();
  }
@@ -XXX,XX +XXX,XX @@ void acpi_build_tables_cleanup(AcpiBuildTables *tables, bool mfre)
      g_array_free(tables->table_data, true);
      g_array_free(tables->tcpalog, mfre);
      g_array_free(tables->vmgenid, mfre);
 +    g_array_free(tables->hardware_errors, mfre);
  }
  /*
 diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
 new file mode 100644
 index XXXXXXX..XXXXXXX
 --- /dev/null
-+++ b/hw/arm/iotkit.c
++++ b/hw/acpi/ghes.c
 @@ -XXX,XX +XXX,XX @@
 +/*
-+ * Arm IoT Kit
++ * Support for generating APEI tables and recording CPER for Guests
 + *
-+ * Copyright (c) 2018 Linaro Limited
++ * Copyright (c) 2020 HUAWEI TECHNOLOGIES CO., LTD.
-+ * Written by Peter Maydell
++ *
 + * Author: Dongjiu Geng <gengdongjiu@huawei.com>
 + *
 + * This program is free software; you can redistribute it and/or modify
-+ * it under the terms of the GNU General Public License version 2 or
++ * it under the terms of the GNU General Public License as published by
 + * the Free Software Foundation; either version 2 of the License, or
 + * (at your option) any later version.
++
++ * This program is distributed in the hope that it will be useful,
++ * but WITHOUT ANY WARRANTY; without even the implied warranty of
++ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
++ * GNU General Public License for more details.
++
++ * You should have received a copy of the GNU General Public License along
++ * with this program; if not, see <http://www.gnu.org/licenses/>.
 + */
 +
 +#include "qemu/osdep.h"
-+#include "qemu/log.h"
++#include "qemu/units.h"
-+#include "qapi/error.h"
++#include "hw/acpi/ghes.h"
-+#include "trace.h"
++#include "hw/acpi/aml-build.h"
-+#include "hw/sysbus.h"
++
-+#include "hw/registerfields.h"
++#define ACPI_GHES_ERRORS_FW_CFG_FILE        "etc/hardware_errors"
-+#include "hw/arm/iotkit.h"
++#define ACPI_GHES_DATA_ADDR_FW_CFG_FILE     "etc/hardware_errors_addr"
-+#include "hw/misc/unimp.h"
++
-+#include "hw/arm/arm.h"
++/* The max size in bytes for one error block */
-+
++#define ACPI_GHES_MAX_RAW_DATA_LENGTH   (1 * KiB)
-+/* Create an alias region of @size bytes starting at @base
++
-+ * which mirrors the memory starting at @orig.
++/* Currently only the ARMv8 SEA notification type error source is supported */
 +#define ACPI_GHES_ERROR_SOURCE_COUNT        1
 +
 +/*
 + * Build table for the hardware error fw_cfg blob.
 + * Initialize "etc/hardware_errors" and "etc/hardware_errors_addr" fw_cfg blobs.
 + * See docs/specs/acpi_hest_ghes.rst for blobs format.
 + */
-+static void make_alias(IoTKit *s, MemoryRegion *mr, const char *name,
++void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker)
 +                       hwaddr base, hwaddr size, hwaddr orig)
 +{
-+    memory_region_init_alias(mr, NULL, name, &s->container, orig, size);
++    int i, error_status_block_offset;
-+    /* The alias is even lower priority than unimplemented_device regions */
++
-+    memory_region_add_subregion_overlap(&s->container, base, mr, -1500);
++    /* Build error_block_address */
 +    for (i = 0; i < ACPI_GHES_ERROR_SOURCE_COUNT; i++) {
 +        build_append_int_noprefix(hardware_errors, 0, sizeof(uint64_t));
 +    }
 +
 +    /* Build read_ack_register */
 +    for (i = 0; i < ACPI_GHES_ERROR_SOURCE_COUNT; i++) {
 +        /*
 +         * Initialize the value of read_ack_register to 1, so GHES can be
 +         * writeable after (re)boot.
 +         * ACPI 6.2: 18.3.2.8 Generic Hardware Error Source version 2
 +         * (GHESv2 - Type 10)
 +         */
 +        build_append_int_noprefix(hardware_errors, 1, sizeof(uint64_t));
 +    }
 +
 +    /* Generic Error Status Block offset in the hardware error fw_cfg blob */
 +    error_status_block_offset = hardware_errors->len;
 +
 +    /* Reserve space for Error Status Data Block */
 +    acpi_data_push(hardware_errors,
 +        ACPI_GHES_MAX_RAW_DATA_LENGTH * ACPI_GHES_ERROR_SOURCE_COUNT);
 +
 +    /* Tell guest firmware to place hardware_errors blob into RAM */
 +    bios_linker_loader_alloc(linker, ACPI_GHES_ERRORS_FW_CFG_FILE,
 +                             hardware_errors, sizeof(uint64_t), false);
 +
 +    for (i = 0; i < ACPI_GHES_ERROR_SOURCE_COUNT; i++) {
 +        /*
 +         * Tell firmware to patch error_block_address entries to point to
 +         * corresponding "Generic Error Status Block"
 +         */
 +        bios_linker_loader_add_pointer(linker,
 +            ACPI_GHES_ERRORS_FW_CFG_FILE, sizeof(uint64_t) * i,
 +            sizeof(uint64_t), ACPI_GHES_ERRORS_FW_CFG_FILE,
 +            error_status_block_offset + i * ACPI_GHES_MAX_RAW_DATA_LENGTH);
 +    }
 +
 +    /*
 +     * tell firmware to write hardware_errors GPA into
 +     * hardware_errors_addr fw_cfg, once the former has been initialized.
 +     */
 +    bios_linker_loader_write_pointer(linker, ACPI_GHES_DATA_ADDR_FW_CFG_FILE,
 +        0, sizeof(uint64_t), ACPI_GHES_ERRORS_FW_CFG_FILE, 0);
 +}
-+
+diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
-+static void init_sysbus_child(Object *parent, const char *childname,
+index XXXXXXX..XXXXXXX 100644
-+                              void *child, size_t childsize,
+--- a/hw/arm/virt-acpi-build.c
-+                              const char *childtype)
++++ b/hw/arm/virt-acpi-build.c
-+{
+@@ -XXX,XX +XXX,XX @@
-+    object_initialize(child, childsize, childtype);
+ #include "sysemu/reset.h"
-+    object_property_add_child(parent, childname, OBJECT(child), &error_abort);
+ #include "kvm_arm.h"
-+    qdev_set_parent_bus(DEVICE(child), sysbus_get_default());
+ #include "migration/vmstate.h"
-+}
++#include "hw/acpi/ghes.h"
-+
-+static void irq_status_forwarder(void *opaque, int n, int level)
+ #define ARM_SPI_BASE 32
-+{
-+    qemu_irq destirq = opaque;
+@@ -XXX,XX +XXX,XX @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
-+
+     acpi_add_table(table_offsets, tables_blob);
-+    qemu_set_irq(destirq, level);
+     build_spcr(tables_blob, tables->linker, vms);
-+}
-+
++    if (vms->ras) {
-+static void nsccfg_handler(void *opaque, int n, int level)
++        build_ghes_error_table(tables->hardware_errors, tables->linker);
-+{
++    }
-+    IoTKit *s = IOTKIT(opaque);
++
-+
+     if (ms->numa_state->num_nodes > 0) {
-+    s->nsccfg = level;
+         acpi_add_table(table_offsets, tables_blob);
-+}
+         build_srat(tables_blob, tables->linker, vms);
-+
+diff --git a/hw/acpi/Kconfig b/hw/acpi/Kconfig
-+static void iotkit_forward_ppc(IoTKit *s, const char *ppcname, int ppcnum)
+index XXXXXXX..XXXXXXX 100644
-+{
+--- a/hw/acpi/Kconfig
-+    /* Each of the 4 AHB and 4 APB PPCs that might be present in a
++++ b/hw/acpi/Kconfig
-+     * system using the IoTKit has a collection of control lines which
+@@ -XXX,XX +XXX,XX @@ config ACPI_HMAT
-+     * are provided by the security controller and which we want to
+     bool
-+     * expose as control lines on the IoTKit device itself, so the
+     depends on ACPI
-+     * code using the IoTKit can wire them up to the PPCs.
-+     */
++config ACPI_APEI
-+    SplitIRQ *splitter = &s->ppc_irq_splitter[ppcnum];
++    bool
-+    DeviceState *iotkitdev = DEVICE(s);
++    depends on ACPI
-+    DeviceState *dev_secctl = DEVICE(&s->secctl);
++
-+    DeviceState *dev_splitter = DEVICE(splitter);
+ config ACPI_PCI
-+    char *name;
+     bool
-+
+     depends on ACPI && PCI
-+    name = g_strdup_printf("%s_nonsec", ppcname);
+diff --git a/hw/acpi/Makefile.objs b/hw/acpi/Makefile.objs
-+    qdev_pass_gpios(dev_secctl, iotkitdev, name);
+index XXXXXXX..XXXXXXX 100644
-+    g_free(name);
+--- a/hw/acpi/Makefile.objs
-+    name = g_strdup_printf("%s_ap", ppcname);
++++ b/hw/acpi/Makefile.objs
-+    qdev_pass_gpios(dev_secctl, iotkitdev, name);
+@@ -XXX,XX +XXX,XX @@ common-obj-$(CONFIG_ACPI_NVDIMM) += nvdimm.o
-+    g_free(name);
+ common-obj-$(CONFIG_ACPI_VMGENID) += vmgenid.o
-+    name = g_strdup_printf("%s_irq_enable", ppcname);
+ common-obj-$(CONFIG_ACPI_HW_REDUCED) += generic_event_device.o
-+    qdev_pass_gpios(dev_secctl, iotkitdev, name);
+ common-obj-$(CONFIG_ACPI_HMAT) += hmat.o
-+    g_free(name);
++common-obj-$(CONFIG_ACPI_APEI) += ghes.o
-+    name = g_strdup_printf("%s_irq_clear", ppcname);
+ common-obj-$(call lnot,$(CONFIG_ACPI_X86)) += acpi-stub.o
-+    qdev_pass_gpios(dev_secctl, iotkitdev, name);
+ common-obj-$(call lnot,$(CONFIG_PC)) += acpi-x86-stub.o
-+    g_free(name);
 +
 +    /* irq_status is a little more tricky, because we need to
 +     * split it so we can send it both to the security controller
 +     * and to our OR gate for the NVIC interrupt line.
 +     * Connect up the splitter's outputs, and create a GPIO input
 +     * which will pass the line state to the input splitter.
 +     */
 +    name = g_strdup_printf("%s_irq_status", ppcname);
 +    qdev_connect_gpio_out(dev_splitter, 0,
 +                          qdev_get_gpio_in_named(dev_secctl,
 +                                                 name, 0));
 +    qdev_connect_gpio_out(dev_splitter, 1,
 +                          qdev_get_gpio_in(DEVICE(&s->ppc_irq_orgate), ppcnum));
 +    s->irq_status_in[ppcnum] = qdev_get_gpio_in(dev_splitter, 0);
 +    qdev_init_gpio_in_named_with_opaque(iotkitdev, irq_status_forwarder,
 +                                        s->irq_status_in[ppcnum], name, 1);
 +    g_free(name);
 +}
 +
 +static void iotkit_forward_sec_resp_cfg(IoTKit *s)
 +{
 +    /* Forward the 3rd output from the splitter device as a
 +     * named GPIO output of the iotkit object.
 +     */
 +    DeviceState *dev = DEVICE(s);
 +    DeviceState *dev_splitter = DEVICE(&s->sec_resp_splitter);
 +
 +    qdev_init_gpio_out_named(dev, &s->sec_resp_cfg, "sec_resp_cfg", 1);
 +    s->sec_resp_cfg_in = qemu_allocate_irq(irq_status_forwarder,
 +                                           s->sec_resp_cfg, 1);
 +    qdev_connect_gpio_out(dev_splitter, 2, s->sec_resp_cfg_in);
 +}
 +
 +static void iotkit_init(Object *obj)
 +{
 +    IoTKit *s = IOTKIT(obj);
 +    int i;
 +
 +    memory_region_init(&s->container, obj, "iotkit-container", UINT64_MAX);
 +
 +    init_sysbus_child(obj, "armv7m", &s->armv7m, sizeof(s->armv7m),
 +                      TYPE_ARMV7M);
 +    qdev_prop_set_string(DEVICE(&s->armv7m), "cpu-type",
 +                         ARM_CPU_TYPE_NAME("cortex-m33"));
 +
 +    init_sysbus_child(obj, "secctl", &s->secctl, sizeof(s->secctl),
 +                      TYPE_IOTKIT_SECCTL);
 +    init_sysbus_child(obj, "apb-ppc0", &s->apb_ppc0, sizeof(s->apb_ppc0),
 +                      TYPE_TZ_PPC);
 +    init_sysbus_child(obj, "apb-ppc1", &s->apb_ppc1, sizeof(s->apb_ppc1),
 +                      TYPE_TZ_PPC);
 +    init_sysbus_child(obj, "timer0", &s->timer0, sizeof(s->timer0),
 +                      TYPE_CMSDK_APB_TIMER);
 +    init_sysbus_child(obj, "timer1", &s->timer1, sizeof(s->timer1),
 +                      TYPE_CMSDK_APB_TIMER);
 +    init_sysbus_child(obj, "dualtimer", &s->dualtimer, sizeof(s->dualtimer),
 +                      TYPE_UNIMPLEMENTED_DEVICE);
 +    object_initialize(&s->ppc_irq_orgate, sizeof(s->ppc_irq_orgate),
 +                      TYPE_OR_IRQ);
 +    object_property_add_child(obj, "ppc-irq-orgate",
 +                              OBJECT(&s->ppc_irq_orgate), &error_abort);
 +    object_initialize(&s->sec_resp_splitter, sizeof(s->sec_resp_splitter),
 +                      TYPE_SPLIT_IRQ);
 +    object_property_add_child(obj, "sec-resp-splitter",
 +                              OBJECT(&s->sec_resp_splitter), &error_abort);
 +    for (i = 0; i < ARRAY_SIZE(s->ppc_irq_splitter); i++) {
 +        char *name = g_strdup_printf("ppc-irq-splitter-%d", i);
 +        SplitIRQ *splitter = &s->ppc_irq_splitter[i];
 +
 +        object_initialize(splitter, sizeof(*splitter), TYPE_SPLIT_IRQ);
 +        object_property_add_child(obj, name, OBJECT(splitter), &error_abort);
 +        g_free(name);
 +    }
 +    init_sysbus_child(obj, "s32ktimer", &s->s32ktimer, sizeof(s->s32ktimer),
 +                      TYPE_UNIMPLEMENTED_DEVICE);
 +}
 +
 +static void iotkit_exp_irq(void *opaque, int n, int level)
 +{
 +    IoTKit *s = IOTKIT(opaque);
 +
 +    qemu_set_irq(s->exp_irqs[n], level);
 +}
 +
 +static void iotkit_realize(DeviceState *dev, Error **errp)
 +{
 +    IoTKit *s = IOTKIT(dev);
 +    int i;
 +    MemoryRegion *mr;
 +    Error *err = NULL;
 +    SysBusDevice *sbd_apb_ppc0;
 +    SysBusDevice *sbd_secctl;
 +    DeviceState *dev_apb_ppc0;
 +    DeviceState *dev_apb_ppc1;
 +    DeviceState *dev_secctl;
 +    DeviceState *dev_splitter;
 +
 +    if (!s->board_memory) {
 +        error_setg(errp, "memory property was not set");
 +        return;
 +    }
 +
 +    if (!s->mainclk_frq) {
 +        error_setg(errp, "MAINCLK property was not set");
 +        return;
 +    }
 +
 +    /* Handling of which devices should be available only to secure
 +     * code is usually done differently for M profile than for A profile.
 +     * Instead of putting some devices only into the secure address space,
 +     * devices exist in both address spaces but with hard-wired security
 +     * permissions that will cause the CPU to fault for non-secure accesses.
 +     *
 +     * The IoTKit has an IDAU (Implementation Defined Access Unit),
 +     * which specifies hard-wired security permissions for different
 +     * areas of the physical address space. For the IoTKit IDAU, the
 +     * top 4 bits of the physical address are the IDAU region ID, and
 +     * if bit 28 (ie the lowest bit of the ID) is 0 then this is an NS
 +     * region, otherwise it is an S region.
 +     *
 +     * The various devices and RAMs are generally all mapped twice,
 +     * once into a region that the IDAU defines as secure and once
 +     * into a non-secure region. They sit behind either a Memory
 +     * Protection Controller (for RAM) or a Peripheral Protection
 +     * Controller (for devices), which allow a more fine grained
 +     * configuration of whether non-secure accesses are permitted.
 +     *
 +     * (The other place that guest software can configure security
 +     * permissions is in the architected SAU (Security Attribution
 +     * Unit), which is entirely inside the CPU. The IDAU can upgrade
 +     * the security attributes for a region to more restrictive than
 +     * the SAU specifies, but cannot downgrade them.)
 +     *
 +     * 0x10000000..0x1fffffff  alias of 0x00000000..0x0fffffff
 +     * 0x20000000..0x20007fff  32KB FPGA block RAM
 +     * 0x30000000..0x3fffffff  alias of 0x20000000..0x2fffffff
 +     * 0x40000000..0x4000ffff  base peripheral region 1
 +     * 0x40010000..0x4001ffff  CPU peripherals (none for IoTKit)
 +     * 0x40020000..0x4002ffff  system control element peripherals
 +     * 0x40080000..0x400fffff  base peripheral region 2
 +     * 0x50000000..0x5fffffff  alias of 0x40000000..0x4fffffff
 +     */
 +
 +    memory_region_add_subregion_overlap(&s->container, 0, s->board_memory, -1);
 +
 +    qdev_prop_set_uint32(DEVICE(&s->armv7m), "num-irq", s->exp_numirq + 32);
 +    /* In real hardware the initial Secure VTOR is set from the INITSVTOR0
 +     * register in the IoT Kit System Control Register block, and the
 +     * initial value of that is in turn specifiable by the FPGA that
 +     * instantiates the IoT Kit. In QEMU we don't implement this wrinkle,
 +     * and simply set the CPU's init-svtor to the IoT Kit default value.
 +     */
 +    qdev_prop_set_uint32(DEVICE(&s->armv7m), "init-svtor", 0x10000000);
 +    object_property_set_link(OBJECT(&s->armv7m), OBJECT(&s->container),
 +                             "memory", &err);
 +    if (err) {
 +        error_propagate(errp, err);
 +        return;
 +    }
 +    object_property_set_link(OBJECT(&s->armv7m), OBJECT(s), "idau", &err);
 +    if (err) {
 +        error_propagate(errp, err);
 +        return;
 +    }
 +    object_property_set_bool(OBJECT(&s->armv7m), true, "realized", &err);
 +    if (err) {
 +        error_propagate(errp, err);
 +        return;
 +    }
 +
 +    /* Connect our EXP_IRQ GPIOs to the NVIC's lines 32 and up. */
 +    s->exp_irqs = g_new(qemu_irq, s->exp_numirq);
 +    for (i = 0; i < s->exp_numirq; i++) {
 +        s->exp_irqs[i] = qdev_get_gpio_in(DEVICE(&s->armv7m), i + 32);
 +    }
 +    qdev_init_gpio_in_named(dev, iotkit_exp_irq, "EXP_IRQ", s->exp_numirq);
 +
 +    /* Set up the big aliases first */
 +    make_alias(s, &s->alias1, "alias 1", 0x10000000, 0x10000000, 0x00000000);
 +    make_alias(s, &s->alias2, "alias 2", 0x30000000, 0x10000000, 0x20000000);
 +    /* The 0x50000000..0x5fffffff region is not a pure alias: it has
 +     * a few extra devices that only appear there (generally the
 +     * control interfaces for the protection controllers).
 +     * We implement this by mapping those devices over the top of this
 +     * alias MR at a higher priority.
 +     */
 +    make_alias(s, &s->alias3, "alias 3", 0x50000000, 0x10000000, 0x40000000);
 +
 +    /* This RAM should be behind a Memory Protection Controller, but we
 +     * don't implement that yet.
 +     */
 +    memory_region_init_ram(&s->sram0, NULL, "iotkit.sram0", 0x00008000, &err);
 +    if (err) {
 +        error_propagate(errp, err);
 +        return;
 +    }
 +    memory_region_add_subregion(&s->container, 0x20000000, &s->sram0);
 +
 +    /* Security controller */
 +    object_property_set_bool(OBJECT(&s->secctl), true, "realized", &err);
 +    if (err) {
 +        error_propagate(errp, err);
 +        return;
 +    }
 +    sbd_secctl = SYS_BUS_DEVICE(&s->secctl);
 +    dev_secctl = DEVICE(&s->secctl);
 +    sysbus_mmio_map(sbd_secctl, 0, 0x50080000);
 +    sysbus_mmio_map(sbd_secctl, 1, 0x40080000);
 +
 +    s->nsc_cfg_in = qemu_allocate_irq(nsccfg_handler, s, 1);
 +    qdev_connect_gpio_out_named(dev_secctl, "nsc_cfg", 0, s->nsc_cfg_in);
 +
 +    /* The sec_resp_cfg output from the security controller must be split into
 +     * multiple lines, one for each of the PPCs within the IoTKit and one
 +     * that will be an output from the IoTKit to the system.
 +     */
 +    object_property_set_int(OBJECT(&s->sec_resp_splitter), 3,
 +                            "num-lines", &err);
 +    if (err) {
 +        error_propagate(errp, err);
 +        return;
 +    }
 +    object_property_set_bool(OBJECT(&s->sec_resp_splitter), true,
 +                             "realized", &err);
 +    if (err) {
 +        error_propagate(errp, err);
 +        return;
 +    }
 +    dev_splitter = DEVICE(&s->sec_resp_splitter);
 +    qdev_connect_gpio_out_named(dev_secctl, "sec_resp_cfg", 0,
 +                                qdev_get_gpio_in(dev_splitter, 0));
 +
 +    /* Devices behind APB PPC0:
 +     *   0x40000000: timer0
 +     *   0x40001000: timer1
 +     *   0x40002000: dual timer
 +     * We must configure and realize each downstream device and connect
 +     * it to the appropriate PPC port; then we can realize the PPC and
 +     * map its upstream ends to the right place in the container.
 +     */
 +    qdev_prop_set_uint32(DEVICE(&s->timer0), "pclk-frq", s->mainclk_frq);
 +    object_property_set_bool(OBJECT(&s->timer0), true, "realized", &err);
 +    if (err) {
 +        error_propagate(errp, err);
 +        return;
 +    }
 +    sysbus_connect_irq(SYS_BUS_DEVICE(&s->timer0), 0,
 +                       qdev_get_gpio_in(DEVICE(&s->armv7m), 3));
 +    mr = sysbus_mmio_get_region(SYS_BUS_DEVICE(&s->timer0), 0);
 +    object_property_set_link(OBJECT(&s->apb_ppc0), OBJECT(mr), "port[0]", &err);
 +    if (err) {
 +        error_propagate(errp, err);
 +        return;
 +    }
 +
 +    qdev_prop_set_uint32(DEVICE(&s->timer1), "pclk-frq", s->mainclk_frq);
 +    object_property_set_bool(OBJECT(&s->timer1), true, "realized", &err);
 +    if (err) {
 +        error_propagate(errp, err);
 +        return;
 +    }
 +    sysbus_connect_irq(SYS_BUS_DEVICE(&s->timer1), 0,
 +                       qdev_get_gpio_in(DEVICE(&s->armv7m), 4));
 +    mr = sysbus_mmio_get_region(SYS_BUS_DEVICE(&s->timer1), 0);
 +    object_property_set_link(OBJECT(&s->apb_ppc0), OBJECT(mr), "port[1]", &err);
 +    if (err) {
 +        error_propagate(errp, err);
 +        return;
 +    }
 +
 +    qdev_prop_set_string(DEVICE(&s->dualtimer), "name", "Dual timer");
 +    qdev_prop_set_uint64(DEVICE(&s->dualtimer), "size", 0x1000);
 +    object_property_set_bool(OBJECT(&s->dualtimer), true, "realized", &err);
 +    if (err) {
 +        error_propagate(errp, err);
 +        return;
 +    }
 +    mr = sysbus_mmio_get_region(SYS_BUS_DEVICE(&s->dualtimer), 0);
 +    object_property_set_link(OBJECT(&s->apb_ppc0), OBJECT(mr), "port[2]", &err);
 +    if (err) {
 +        error_propagate(errp, err);
 +        return;
 +    }
 +
 +    object_property_set_bool(OBJECT(&s->apb_ppc0), true, "realized", &err);
 +    if (err) {
 +        error_propagate(errp, err);
 +        return;
 +    }
 +
 +    sbd_apb_ppc0 = SYS_BUS_DEVICE(&s->apb_ppc0);
 +    dev_apb_ppc0 = DEVICE(&s->apb_ppc0);
 +
 +    mr = sysbus_mmio_get_region(sbd_apb_ppc0, 0);
 +    memory_region_add_subregion(&s->container, 0x40000000, mr);
 +    mr = sysbus_mmio_get_region(sbd_apb_ppc0, 1);
 +    memory_region_add_subregion(&s->container, 0x40001000, mr);
 +    mr = sysbus_mmio_get_region(sbd_apb_ppc0, 2);
 +    memory_region_add_subregion(&s->container, 0x40002000, mr);
 +    for (i = 0; i < IOTS_APB_PPC0_NUM_PORTS; i++) {
 +        qdev_connect_gpio_out_named(dev_secctl, "apb_ppc0_nonsec", i,
 +                                    qdev_get_gpio_in_named(dev_apb_ppc0,
 +                                                           "cfg_nonsec", i));
 +        qdev_connect_gpio_out_named(dev_secctl, "apb_ppc0_ap", i,
 +                                    qdev_get_gpio_in_named(dev_apb_ppc0,
 +                                                           "cfg_ap", i));
 +    }
 +    qdev_connect_gpio_out_named(dev_secctl, "apb_ppc0_irq_enable", 0,
 +                                qdev_get_gpio_in_named(dev_apb_ppc0,
 +                                                       "irq_enable", 0));
 +    qdev_connect_gpio_out_named(dev_secctl, "apb_ppc0_irq_clear", 0,
 +                                qdev_get_gpio_in_named(dev_apb_ppc0,
 +                                                       "irq_clear", 0));
 +    qdev_connect_gpio_out(dev_splitter, 0,
 +                          qdev_get_gpio_in_named(dev_apb_ppc0,
 +                                                 "cfg_sec_resp", 0));
 +
 +    /* All the PPC irq lines (from the 2 internal PPCs and the 8 external
 +     * ones) are sent individually to the security controller, and also
 +     * ORed together to give a single combined PPC interrupt to the NVIC.
 +     */
 +    object_property_set_int(OBJECT(&s->ppc_irq_orgate),
 +                            NUM_PPCS, "num-lines", &err);
 +    if (err) {
 +        error_propagate(errp, err);
 +        return;
 +    }
 +    object_property_set_bool(OBJECT(&s->ppc_irq_orgate), true,
 +                             "realized", &err);
 +    if (err) {
 +        error_propagate(errp, err);
 +        return;
 +    }
 +    qdev_connect_gpio_out(DEVICE(&s->ppc_irq_orgate), 0,
 +                          qdev_get_gpio_in(DEVICE(&s->armv7m), 10));
 +
 +    /* 0x40010000 .. 0x4001ffff: private CPU region: unused in IoTKit */
 +
 +    /* 0x40020000 .. 0x4002ffff : IoTKit system control peripheral region */
 +    /* Devices behind APB PPC1:
 +     *   0x4002f000: S32K timer
 +     */
 +    qdev_prop_set_string(DEVICE(&s->s32ktimer), "name", "S32KTIMER");
 +    qdev_prop_set_uint64(DEVICE(&s->s32ktimer), "size", 0x1000);
 +    object_property_set_bool(OBJECT(&s->s32ktimer), true, "realized", &err);
 +    if (err) {
 +        error_propagate(errp, err);
 +        return;
 +    }
 +    mr = sysbus_mmio_get_region(SYS_BUS_DEVICE(&s->s32ktimer), 0);
 +    object_property_set_link(OBJECT(&s->apb_ppc1), OBJECT(mr), "port[0]", &err);
 +    if (err) {
 +        error_propagate(errp, err);
 +        return;
 +    }
 +
 +    object_property_set_bool(OBJECT(&s->apb_ppc1), true, "realized", &err);
 +    if (err) {
 +        error_propagate(errp, err);
 +        return;
 +    }
 +    mr = sysbus_mmio_get_region(SYS_BUS_DEVICE(&s->apb_ppc1), 0);
 +    memory_region_add_subregion(&s->container, 0x4002f000, mr);
 +
 +    dev_apb_ppc1 = DEVICE(&s->apb_ppc1);
 +    qdev_connect_gpio_out_named(dev_secctl, "apb_ppc1_nonsec", 0,
 +                                qdev_get_gpio_in_named(dev_apb_ppc1,
 +                                                       "cfg_nonsec", 0));
 +    qdev_connect_gpio_out_named(dev_secctl, "apb_ppc1_ap", 0,
 +                                qdev_get_gpio_in_named(dev_apb_ppc1,
 +                                                       "cfg_ap", 0));
 +    qdev_connect_gpio_out_named(dev_secctl, "apb_ppc1_irq_enable", 0,
 +                                qdev_get_gpio_in_named(dev_apb_ppc1,
 +                                                       "irq_enable", 0));
 +    qdev_connect_gpio_out_named(dev_secctl, "apb_ppc1_irq_clear", 0,
 +                                qdev_get_gpio_in_named(dev_apb_ppc1,
 +                                                       "irq_clear", 0));
 +    qdev_connect_gpio_out(dev_splitter, 1,
 +                          qdev_get_gpio_in_named(dev_apb_ppc1,
 +                                                 "cfg_sec_resp", 0));
 +
 +    /* Using create_unimplemented_device() maps the stub into the
 +     * system address space rather than into our container, but the
 +     * overall effect to the guest is the same.
 +     */
 +    create_unimplemented_device("SYSINFO", 0x40020000, 0x1000);
 +
 +    create_unimplemented_device("SYSCONTROL", 0x50021000, 0x1000);
 +    create_unimplemented_device("S32KWATCHDOG", 0x5002e000, 0x1000);
 +
 +    /* 0x40080000 .. 0x4008ffff : IoTKit second Base peripheral region */
 +
 +    create_unimplemented_device("NS watchdog", 0x40081000, 0x1000);
 +    create_unimplemented_device("S watchdog", 0x50081000, 0x1000);
 +
 +    create_unimplemented_device("SRAM0 MPC", 0x50083000, 0x1000);
 +
 +    for (i = 0; i < ARRAY_SIZE(s->ppc_irq_splitter); i++) {
 +        Object *splitter = OBJECT(&s->ppc_irq_splitter[i]);
 +
 +        object_property_set_int(splitter, 2, "num-lines", &err);
 +        if (err) {
 +            error_propagate(errp, err);
 +            return;
 +        }
 +        object_property_set_bool(splitter, true, "realized", &err);
 +        if (err) {
 +            error_propagate(errp, err);
 +            return;
 +        }
 +    }
 +
 +    for (i = 0; i < IOTS_NUM_AHB_EXP_PPC; i++) {
 +        char *ppcname = g_strdup_printf("ahb_ppcexp%d", i);
 +
 +        iotkit_forward_ppc(s, ppcname, i);
 +        g_free(ppcname);
 +    }
 +
 +    for (i = 0; i < IOTS_NUM_APB_EXP_PPC; i++) {
 +        char *ppcname = g_strdup_printf("apb_ppcexp%d", i);
 +
 +        iotkit_forward_ppc(s, ppcname, i + IOTS_NUM_AHB_EXP_PPC);
 +        g_free(ppcname);
 +    }
 +
 +    for (i = NUM_EXTERNAL_PPCS; i < NUM_PPCS; i++) {
 +        /* Wire up IRQ splitter for internal PPCs */
 +        DeviceState *devs = DEVICE(&s->ppc_irq_splitter[i]);
 +        char *gpioname = g_strdup_printf("apb_ppc%d_irq_status",
 +                                         i - NUM_EXTERNAL_PPCS);
 +        TZPPC *ppc = (i == NUM_EXTERNAL_PPCS) ? &s->apb_ppc0 : &s->apb_ppc1;
 +
 +        qdev_connect_gpio_out(devs, 0,
 +                              qdev_get_gpio_in_named(dev_secctl, gpioname, 0));
 +        qdev_connect_gpio_out(devs, 1,
 +                              qdev_get_gpio_in(DEVICE(&s->ppc_irq_orgate), i));
 +        qdev_connect_gpio_out_named(DEVICE(ppc), "irq", 0,
 +                                    qdev_get_gpio_in(devs, 0));
 +        g_free(gpioname);
 +    }
 +
 +    iotkit_forward_sec_resp_cfg(s);
 +
 +    system_clock_scale = NANOSECONDS_PER_SECOND / s->mainclk_frq;
 +}
 +
 +static void iotkit_idau_check(IDAUInterface *ii, uint32_t address,
 +                              int *iregion, bool *exempt, bool *ns, bool *nsc)
 +{
 +    /* For IoTKit systems the IDAU responses are simple logical functions
 +     * of the address bits. The NSC attribute is guest-adjustable via the
 +     * NSCCFG register in the security controller.
 +     */
 +    IoTKit *s = IOTKIT(ii);
 +    int region = extract32(address, 28, 4);
 +
 +    *ns = !(region & 1);
 +    *nsc = (region == 1 && (s->nsccfg & 1)) || (region == 3 && (s->nsccfg & 2));
 +    /* 0xe0000000..0xe00fffff and 0xf0000000..0xf00fffff are exempt */
 +    *exempt = (address & 0xeff00000) == 0xe0000000;
 +    *iregion = region;
 +}
 +
 +static const VMStateDescription iotkit_vmstate = {
 +    .name = "iotkit",
 +    .version_id = 1,
 +    .minimum_version_id = 1,
 +    .fields = (VMStateField[]) {
 +        VMSTATE_UINT32(nsccfg, IoTKit),
 +        VMSTATE_END_OF_LIST()
 +    }
 +};
 +
 +static Property iotkit_properties[] = {
 +    DEFINE_PROP_LINK("memory", IoTKit, board_memory, TYPE_MEMORY_REGION,
 +                     MemoryRegion *),
 +    DEFINE_PROP_UINT32("EXP_NUMIRQ", IoTKit, exp_numirq, 64),
 +    DEFINE_PROP_UINT32("MAINCLK", IoTKit, mainclk_frq, 0),
 +    DEFINE_PROP_END_OF_LIST()
 +};
 +
 +static void iotkit_reset(DeviceState *dev)
 +{
 +    IoTKit *s = IOTKIT(dev);
 +
 +    s->nsccfg = 0;
 +}
 +
 +static void iotkit_class_init(ObjectClass *klass, void *data)
 +{
 +    DeviceClass *dc = DEVICE_CLASS(klass);
 +    IDAUInterfaceClass *iic = IDAU_INTERFACE_CLASS(klass);
 +
 +    dc->realize = iotkit_realize;
 +    dc->vmsd = &iotkit_vmstate;
 +    dc->props = iotkit_properties;
 +    dc->reset = iotkit_reset;
 +    iic->check = iotkit_idau_check;
 +}
 +
 +static const TypeInfo iotkit_info = {
 +    .name = TYPE_IOTKIT,
 +    .parent = TYPE_SYS_BUS_DEVICE,
 +    .instance_size = sizeof(IoTKit),
 +    .instance_init = iotkit_init,
 +    .class_init = iotkit_class_init,
 +    .interfaces = (InterfaceInfo[]) {
 +        { TYPE_IDAU_INTERFACE },
 +        { }
 +    }
 +};
 +
 +static void iotkit_register_types(void)
 +{
 +    type_register_static(&iotkit_info);
 +}
 +
 +type_init(iotkit_register_types);
 diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
 index XXXXXXX..XXXXXXX 100644
 --- a/default-configs/arm-softmmu.mak
 +++ b/default-configs/arm-softmmu.mak
@@ -XXX,XX +XXX,XX @@ CONFIG_MPS2_FPGAIO=y
  CONFIG_MPS2_SCC=y
  CONFIG_TZ_PPC=y
 +CONFIG_IOTKIT=y
  CONFIG_IOTKIT_SECCTL=y
  CONFIG_VERSATILE_PCI=y
 --
-2.16.2
+2.20.1

-New patch
+[PULL 23/45] ACPI: Build Hardware Error Source Table
+From: Dongjiu Geng <gengdongjiu@huawei.com>
 This patch builds the Hardware Error Source Table (HEST) via fw_cfg blobs.
 For now it only supports ARMv8 SEA, a type of Generic Hardware Error
 Source version 2 (GHESv2) error source. Support for other types can be
 added later if needed. The CPER section is currently a memory section,
 because the kernel mainly wants userspace to handle memory errors.
 This patch follows the ACPI 6.2 specification to build the Hardware
 Error Source Table. For more detailed information, please refer to
 docs/specs/acpi_hest_ghes.rst.
 The build_ghes_hw_error_notification() helper adds a Hardware Error
 Notification structure to ACPI tables without using packed C structures,
 which avoids endianness issues because the API needs no explicit conversion.
 Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
 Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
 Reviewed-by: Igor Mammedov <imammedo@redhat.com>
 Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
 Message-id: 20200512030609.19593-6-gengdongjiu@huawei.com
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
  include/hw/acpi/ghes.h   |  39 ++++++++++++
  hw/acpi/ghes.c           | 126 +++++++++++++++++++++++++++++++++++++++
  hw/arm/virt-acpi-build.c |   2 +
  3 files changed, 167 insertions(+)
 diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
 index XXXXXXX..XXXXXXX 100644
 --- a/include/hw/acpi/ghes.h
 +++ b/include/hw/acpi/ghes.h
@@ -XXX,XX +XXX,XX @@
  #include "hw/acpi/bios-linker-loader.h"
 +/*
 + * Values for Hardware Error Notification Type field
 + */
 +enum AcpiGhesNotifyType {
 +    /* Polled */
 +    ACPI_GHES_NOTIFY_POLLED = 0,
 +    /* External Interrupt */
 +    ACPI_GHES_NOTIFY_EXTERNAL = 1,
 +    /* Local Interrupt */
 +    ACPI_GHES_NOTIFY_LOCAL = 2,
 +    /* SCI */
 +    ACPI_GHES_NOTIFY_SCI = 3,
 +    /* NMI */
 +    ACPI_GHES_NOTIFY_NMI = 4,
 +    /* CMCI, ACPI 5.0: 18.3.2.7, Table 18-290 */
 +    ACPI_GHES_NOTIFY_CMCI = 5,
 +    /* MCE, ACPI 5.0: 18.3.2.7, Table 18-290 */
 +    ACPI_GHES_NOTIFY_MCE = 6,
 +    /* GPIO-Signal, ACPI 6.0: 18.3.2.7, Table 18-332 */
 +    ACPI_GHES_NOTIFY_GPIO = 7,
 +    /* ARMv8 SEA, ACPI 6.1: 18.3.2.9, Table 18-345 */
 +    ACPI_GHES_NOTIFY_SEA = 8,
 +    /* ARMv8 SEI, ACPI 6.1: 18.3.2.9, Table 18-345 */
 +    ACPI_GHES_NOTIFY_SEI = 9,
 +    /* External Interrupt - GSIV, ACPI 6.1: 18.3.2.9, Table 18-345 */
 +    ACPI_GHES_NOTIFY_GSIV = 10,
 +    /* Software Delegated Exception, ACPI 6.2: 18.3.2.9, Table 18-383 */
 +    ACPI_GHES_NOTIFY_SDEI = 11,
 +    /* 12 and greater are reserved */
 +    ACPI_GHES_NOTIFY_RESERVED = 12
 +};
 +
 +enum {
 +    ACPI_HEST_SRC_ID_SEA = 0,
 +    /* future ids go here */
 +    ACPI_HEST_SRC_ID_RESERVED,
 +};
 +
  void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker);
 +void acpi_build_hest(GArray *table_data, BIOSLinker *linker);
  #endif
 diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/acpi/ghes.c
 +++ b/hw/acpi/ghes.c
@@ -XXX,XX +XXX,XX @@
  #include "qemu/units.h"
  #include "hw/acpi/ghes.h"
  #include "hw/acpi/aml-build.h"
 +#include "qemu/error-report.h"
  #define ACPI_GHES_ERRORS_FW_CFG_FILE        "etc/hardware_errors"
  #define ACPI_GHES_DATA_ADDR_FW_CFG_FILE     "etc/hardware_errors_addr"
@@ -XXX,XX +XXX,XX @@
  /* Now only support ARMv8 SEA notification type error source */
  #define ACPI_GHES_ERROR_SOURCE_COUNT        1
 +/* Generic Hardware Error Source version 2 */
 +#define ACPI_GHES_SOURCE_GENERIC_ERROR_V2   10
 +
 +/* Address offset in Generic Address Structure(GAS) */
 +#define GAS_ADDR_OFFSET 4
 +
 +/*
 + * Hardware Error Notification
 + * ACPI 4.0: 17.3.2.7 Hardware Error Notification
 + * Composes dummy Hardware Error Notification descriptor of specified type
 + */
 +static void build_ghes_hw_error_notification(GArray *table, const uint8_t type)
 +{
 +    /* Type */
 +    build_append_int_noprefix(table, type, 1);
 +    /*
 +     * Length:
 +     * Total length of the structure in bytes
 +     */
 +    build_append_int_noprefix(table, 28, 1);
 +    /* Configuration Write Enable */
 +    build_append_int_noprefix(table, 0, 2);
 +    /* Poll Interval */
 +    build_append_int_noprefix(table, 0, 4);
 +    /* Vector */
 +    build_append_int_noprefix(table, 0, 4);
 +    /* Switch To Polling Threshold Value */
 +    build_append_int_noprefix(table, 0, 4);
 +    /* Switch To Polling Threshold Window */
 +    build_append_int_noprefix(table, 0, 4);
 +    /* Error Threshold Value */
 +    build_append_int_noprefix(table, 0, 4);
 +    /* Error Threshold Window */
 +    build_append_int_noprefix(table, 0, 4);
 +}
 +
  /*
   * Build table for the hardware error fw_cfg blob.
   * Initialize "etc/hardware_errors" and "etc/hardware_errors_addr" fw_cfg blobs.
@@ -XXX,XX +XXX,XX @@ void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker)
      bios_linker_loader_write_pointer(linker, ACPI_GHES_DATA_ADDR_FW_CFG_FILE,
         0, sizeof(uint64_t), ACPI_GHES_ERRORS_FW_CFG_FILE, 0);
  }
 +
 +/* Build Generic Hardware Error Source version 2 (GHESv2) */
 +static void build_ghes_v2(GArray *table_data, int source_id, BIOSLinker *linker)
 +{
 +    uint64_t address_offset;
 +    /*
 +     * Type:
 +     * Generic Hardware Error Source version 2(GHESv2 - Type 10)
 +     */
 +    build_append_int_noprefix(table_data, ACPI_GHES_SOURCE_GENERIC_ERROR_V2, 2);
 +    /* Source Id */
 +    build_append_int_noprefix(table_data, source_id, 2);
 +    /* Related Source Id */
 +    build_append_int_noprefix(table_data, 0xffff, 2);
 +    /* Flags */
 +    build_append_int_noprefix(table_data, 0, 1);
 +    /* Enabled */
 +    build_append_int_noprefix(table_data, 1, 1);
 +
 +    /* Number of Records To Pre-allocate */
 +    build_append_int_noprefix(table_data, 1, 4);
 +    /* Max Sections Per Record */
 +    build_append_int_noprefix(table_data, 1, 4);
 +    /* Max Raw Data Length */
 +    build_append_int_noprefix(table_data, ACPI_GHES_MAX_RAW_DATA_LENGTH, 4);
 +
 +    address_offset = table_data->len;
 +    /* Error Status Address */
 +    build_append_gas(table_data, AML_AS_SYSTEM_MEMORY, 0x40, 0,
 +                     4 /* QWord access */, 0);
 +    bios_linker_loader_add_pointer(linker, ACPI_BUILD_TABLE_FILE,
 +        address_offset + GAS_ADDR_OFFSET, sizeof(uint64_t),
 +        ACPI_GHES_ERRORS_FW_CFG_FILE, source_id * sizeof(uint64_t));
 +
 +    switch (source_id) {
 +    case ACPI_HEST_SRC_ID_SEA:
 +        /*
 +         * Notification Structure
 +         * Now only enable ARMv8 SEA notification type
 +         */
 +        build_ghes_hw_error_notification(table_data, ACPI_GHES_NOTIFY_SEA);
 +        break;
 +    default:
 +        error_report("Unsupported error source");
 +        abort();
 +    }
 +
 +    /* Error Status Block Length */
 +    build_append_int_noprefix(table_data, ACPI_GHES_MAX_RAW_DATA_LENGTH, 4);
 +
 +    /*
 +     * Read Ack Register
 +     * ACPI 6.1: 18.3.2.8 Generic Hardware Error Source
 +     * version 2 (GHESv2 - Type 10)
 +     */
 +    address_offset = table_data->len;
 +    build_append_gas(table_data, AML_AS_SYSTEM_MEMORY, 0x40, 0,
 +                     4 /* QWord access */, 0);
 +    bios_linker_loader_add_pointer(linker, ACPI_BUILD_TABLE_FILE,
 +        address_offset + GAS_ADDR_OFFSET,
 +        sizeof(uint64_t), ACPI_GHES_ERRORS_FW_CFG_FILE,
 +        (ACPI_GHES_ERROR_SOURCE_COUNT + source_id) * sizeof(uint64_t));
 +
 +    /*
 +     * Read Ack Preserve field
 +     * We only provide the first bit in Read Ack Register to OSPM to write
 +     * while the other bits are preserved.
 +     */
 +    build_append_int_noprefix(table_data, ~0x1ULL, 8);
 +    /* Read Ack Write */
 +    build_append_int_noprefix(table_data, 0x1, 8);
 +}
 +
 +/* Build Hardware Error Source Table */
 +void acpi_build_hest(GArray *table_data, BIOSLinker *linker)
 +{
 +    uint64_t hest_start = table_data->len;
 +
 +    /* Hardware Error Source Table header */
 +    acpi_data_push(table_data, sizeof(AcpiTableHeader));
 +
 +    /* Error Source Count */
 +    build_append_int_noprefix(table_data, ACPI_GHES_ERROR_SOURCE_COUNT, 4);
 +
 +    build_ghes_v2(table_data, ACPI_HEST_SRC_ID_SEA, linker);
 +
 +    build_header(linker, table_data, (void *)(table_data->data + hest_start),
 +        "HEST", table_data->len - hest_start, 1, NULL, NULL);
 +}
 diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/arm/virt-acpi-build.c
 +++ b/hw/arm/virt-acpi-build.c
@@ -XXX,XX +XXX,XX @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
      if (vms->ras) {
          build_ghes_error_table(tables->hardware_errors, tables->linker);
 +        acpi_add_table(table_offsets, tables_blob);
 +        acpi_build_hest(tables_blob, tables->linker);
      }
      if (ms->numa_state->num_nodes > 0) {
 --
 2.20.1

-[Qemu-devel] [PULL 20/39] hw/misc/iotkit-secctl: Add handling for PPCs
+[PULL 24/45] ACPI: Record the Generic Error Status Block address
-The IoTKit Security Controller includes various registers
+From: Dongjiu Geng <gengdongjiu@huawei.com>
 that expose to software the controls for the Peripheral
 Protection Controllers in the system. Implement these.
+Record the GHEB address via a fw_cfg file. When recording
+an error to CPER, this address is used to find the
+Generic Error Data Entries and write the error.
+To avoid migration failure, make the hardware error
+table address part of the GED device instead of a
+global variable, so that the address is migrated
+to the target QEMU.
+Acked-by: Xiang Zheng <zhengxiang9@huawei.com>
+Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
+Reviewed-by: Igor Mammedov <imammedo@redhat.com>
+Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
+Message-id: 20200512030609.19593-7-gengdongjiu@huawei.com
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20180220180325.29818-17-peter.maydell@linaro.org
 ---
- include/hw/misc/iotkit-secctl.h |  64 +++++++++-
+ include/hw/acpi/generic_event_device.h |  2 ++
- hw/misc/iotkit-secctl.c         | 270 +++++++++++++++++++++++++++++++++++++---
+ include/hw/acpi/ghes.h                 |  6 ++++++
- 2 files changed, 315 insertions(+), 19 deletions(-)
+ hw/acpi/generic_event_device.c         | 19 +++++++++++++++++++
  hw/acpi/ghes.c                         | 14 ++++++++++++++
  hw/arm/virt-acpi-build.c               |  8 ++++++++
  5 files changed, 49 insertions(+)
-diff --git a/include/hw/misc/iotkit-secctl.h b/include/hw/misc/iotkit-secctl.h
+diff --git a/include/hw/acpi/generic_event_device.h b/include/hw/acpi/generic_event_device.h
 index XXXXXXX..XXXXXXX 100644
---- a/include/hw/misc/iotkit-secctl.h
+--- a/include/hw/acpi/generic_event_device.h
-+++ b/include/hw/misc/iotkit-secctl.h
++++ b/include/hw/acpi/generic_event_device.h
 @@ -XXX,XX +XXX,XX @@
-  * QEMU interface:
-  *  + sysbus MMIO region 0 is the "secure privilege control block" registers
+ #include "hw/sysbus.h"
-  *  + sysbus MMIO region 1 is the "non-secure privilege control block" registers
+ #include "hw/acpi/memory_hotplug.h"
-+ *  + named GPIO output "sec_resp_cfg" indicating whether blocked accesses
++#include "hw/acpi/ghes.h"
-+ *    should RAZ/WI or bus error
-+ * Controlling the 2 APB PPCs in the IoTKit:
+ #define ACPI_POWER_BUTTON_DEVICE "PWRB"
-+ *  + named GPIO outputs apb_ppc0_nonsec[0..2] and apb_ppc1_nonsec
-+ *  + named GPIO outputs apb_ppc0_ap[0..2] and apb_ppc1_ap
+@@ -XXX,XX +XXX,XX @@ typedef struct AcpiGedState {
-+ *  + named GPIO outputs apb_ppc{0,1}_irq_enable
+     GEDState ged_state;
-+ *  + named GPIO outputs apb_ppc{0,1}_irq_clear
+     uint32_t ged_event_bitmap;
-+ *  + named GPIO inputs apb_ppc{0,1}_irq_status
+     qemu_irq irq;
-+ * Controlling each of the 4 expansion APB PPCs which a system using the IoTKit
++    AcpiGhesState ghes_state;
-+ * might provide:
+ } AcpiGedState;
-+ *  + named GPIO outputs apb_ppcexp{0,1,2,3}_nonsec[0..15]
-+ *  + named GPIO outputs apb_ppcexp{0,1,2,3}_ap[0..15]
+ void build_ged_aml(Aml *table, const char* name, HotplugHandler *hotplug_dev,
-+ *  + named GPIO outputs apb_ppcexp{0,1,2,3}_irq_enable
+diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
-+ *  + named GPIO outputs apb_ppcexp{0,1,2,3}_irq_clear
+index XXXXXXX..XXXXXXX 100644
-+ *  + named GPIO inputs apb_ppcexp{0,1,2,3}_irq_status
+--- a/include/hw/acpi/ghes.h
-+ * Controlling each of the 4 expansion AHB PPCs which a system using the IoTKit
++++ b/include/hw/acpi/ghes.h
-+ * might provide:
+@@ -XXX,XX +XXX,XX @@ enum {
-+ *  + named GPIO outputs ahb_ppcexp{0,1,2,3}_nonsec[0..15]
+     ACPI_HEST_SRC_ID_RESERVED,
-+ *  + named GPIO outputs ahb_ppcexp{0,1,2,3}_ap[0..15]
+ };
-+ *  + named GPIO outputs ahb_ppcexp{0,1,2,3}_irq_enable
-+ *  + named GPIO outputs ahb_ppcexp{0,1,2,3}_irq_clear
++typedef struct AcpiGhesState {
-+ *  + named GPIO inputs ahb_ppcexp{0,1,2,3}_irq_status
++    uint64_t ghes_addr_le;
-  */
++} AcpiGhesState;
  #ifndef IOTKIT_SECCTL_H
@@ -XXX,XX +XXX,XX @@
  #define TYPE_IOTKIT_SECCTL "iotkit-secctl"
  #define IOTKIT_SECCTL(obj) OBJECT_CHECK(IoTKitSecCtl, (obj), TYPE_IOTKIT_SECCTL)
 -typedef struct IoTKitSecCtl {
 +#define IOTS_APB_PPC0_NUM_PORTS 3
 +#define IOTS_APB_PPC1_NUM_PORTS 1
 +#define IOTS_PPC_NUM_PORTS 16
 +#define IOTS_NUM_APB_PPC 2
 +#define IOTS_NUM_APB_EXP_PPC 4
 +#define IOTS_NUM_AHB_EXP_PPC 4
 +
-+typedef struct IoTKitSecCtl IoTKitSecCtl;
+ void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker);
-+
+ void acpi_build_hest(GArray *table_data, BIOSLinker *linker);
-+/* State and IRQ lines relating to a PPC. For the
++void acpi_ghes_add_fw_cfg(AcpiGhesState *vms, FWCfgState *s,
-+ * PPCs in the IoTKit not all the IRQ lines are used.
++                          GArray *hardware_errors);
 + */
 +typedef struct IoTKitSecCtlPPC {
 +    qemu_irq nonsec[IOTS_PPC_NUM_PORTS];
 +    qemu_irq ap[IOTS_PPC_NUM_PORTS];
 +    qemu_irq irq_enable;
 +    qemu_irq irq_clear;
 +
 +    uint32_t ns;
 +    uint32_t sp;
 +    uint32_t nsp;
 +
 +    /* Number of ports actually present */
 +    int numports;
 +    /* Offset of this PPC's interrupt bits in SECPPCINTSTAT */
 +    int irq_bit_offset;
 +    IoTKitSecCtl *parent;
 +} IoTKitSecCtlPPC;
 +
 +struct IoTKitSecCtl {
      /*< private >*/
      SysBusDevice parent_obj;
      /*< public >*/
 +    qemu_irq sec_resp_cfg;
      MemoryRegion s_regs;
      MemoryRegion ns_regs;
 -} IoTKitSecCtl;
 +
 +    uint32_t secppcintstat;
 +    uint32_t secppcinten;
 +    uint32_t secrespcfg;
 +
 +    IoTKitSecCtlPPC apb[IOTS_NUM_APB_PPC];
 +    IoTKitSecCtlPPC apbexp[IOTS_NUM_APB_EXP_PPC];
 +    IoTKitSecCtlPPC ahbexp[IOTS_NUM_APB_EXP_PPC];
 +};
  #endif
-diff --git a/hw/misc/iotkit-secctl.c b/hw/misc/iotkit-secctl.c
+diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/misc/iotkit-secctl.c
+--- a/hw/acpi/generic_event_device.c
-+++ b/hw/misc/iotkit-secctl.c
++++ b/hw/acpi/generic_event_device.c
-@@ -XXX,XX +XXX,XX @@ static const uint8_t iotkit_secctl_ns_idregs[] = {
+@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_ged_state = {
-x0d, 0xf0, 0x05, 0xb1,
+     }
  };
-+/* The register sets for the various PPCs (AHB internal, APB internal,
++static bool ghes_needed(void *opaque)
 + * AHB expansion, APB expansion) are all set up so that they are
 + * in 16-aligned blocks so offsets 0xN0, 0xN4, 0xN8, 0xNC are PPCs
 + * 0, 1, 2, 3 of that type, so we can convert a register address offset
 + * into an index into a PPC array easily.
 + */
 +static inline int offset_to_ppc_idx(uint32_t offset)
 +{
-+    return extract32(offset, 2, 2);
++    AcpiGedState *s = opaque;
 +    return s->ghes_state.ghes_addr_le;
 +}
 +
-+typedef void PerPPCFunction(IoTKitSecCtlPPC *ppc);
++static const VMStateDescription vmstate_ghes_state = {
-+
++    .name = "acpi-ged/ghes",
 +static void foreach_ppc(IoTKitSecCtl *s, PerPPCFunction *fn)
 +{
 +    int i;
 +
 +    for (i = 0; i < IOTS_NUM_APB_PPC; i++) {
 +        fn(&s->apb[i]);
 +    }
 +    for (i = 0; i < IOTS_NUM_APB_EXP_PPC; i++) {
 +        fn(&s->apbexp[i]);
 +    }
 +    for (i = 0; i < IOTS_NUM_AHB_EXP_PPC; i++) {
 +        fn(&s->ahbexp[i]);
 +    }
 +}
 +
  static MemTxResult iotkit_secctl_s_read(void *opaque, hwaddr addr,
                                          uint64_t *pdata,
                                          unsigned size, MemTxAttrs attrs)
  {
      uint64_t r;
      uint32_t offset = addr & ~0x3;
 +    IoTKitSecCtl *s = IOTKIT_SECCTL(opaque);
      switch (offset) {
      case A_AHBNSPPC0:
@@ -XXX,XX +XXX,XX @@ static MemTxResult iotkit_secctl_s_read(void *opaque, hwaddr addr,
          r = 0;
          break;
      case A_SECRESPCFG:
 -    case A_NSCCFG:
 -    case A_SECMPCINTSTATUS:
 +        r = s->secrespcfg;
 +        break;
      case A_SECPPCINTSTAT:
 +        r = s->secppcintstat;
 +        break;
      case A_SECPPCINTEN:
 -    case A_SECMSCINTSTAT:
 -    case A_SECMSCINTEN:
 -    case A_BRGINTSTAT:
 -    case A_BRGINTEN:
 +        r = s->secppcinten;
 +        break;
      case A_AHBNSPPCEXP0:
      case A_AHBNSPPCEXP1:
      case A_AHBNSPPCEXP2:
      case A_AHBNSPPCEXP3:
 +        r = s->ahbexp[offset_to_ppc_idx(offset)].ns;
 +        break;
      case A_APBNSPPC0:
      case A_APBNSPPC1:
 +        r = s->apb[offset_to_ppc_idx(offset)].ns;
 +        break;
      case A_APBNSPPCEXP0:
      case A_APBNSPPCEXP1:
      case A_APBNSPPCEXP2:
      case A_APBNSPPCEXP3:
 +        r = s->apbexp[offset_to_ppc_idx(offset)].ns;
 +        break;
      case A_AHBSPPPCEXP0:
      case A_AHBSPPPCEXP1:
      case A_AHBSPPPCEXP2:
      case A_AHBSPPPCEXP3:
 +        r = s->apbexp[offset_to_ppc_idx(offset)].sp;
 +        break;
      case A_APBSPPPC0:
      case A_APBSPPPC1:
 +        r = s->apb[offset_to_ppc_idx(offset)].sp;
 +        break;
      case A_APBSPPPCEXP0:
      case A_APBSPPPCEXP1:
      case A_APBSPPPCEXP2:
      case A_APBSPPPCEXP3:
 +        r = s->apbexp[offset_to_ppc_idx(offset)].sp;
 +        break;
 +    case A_NSCCFG:
 +    case A_SECMPCINTSTATUS:
 +    case A_SECMSCINTSTAT:
 +    case A_SECMSCINTEN:
 +    case A_BRGINTSTAT:
 +    case A_BRGINTEN:
      case A_NSMSCEXP:
          qemu_log_mask(LOG_UNIMP,
                        "IoTKit SecCtl S block read: "
@@ -XXX,XX +XXX,XX @@ static MemTxResult iotkit_secctl_s_read(void *opaque, hwaddr addr,
      return MEMTX_OK;
  }
 +static void iotkit_secctl_update_ppc_ap(IoTKitSecCtlPPC *ppc)
 +{
 +    int i;
 +
 +    for (i = 0; i < ppc->numports; i++) {
 +        bool v;
 +
 +        if (extract32(ppc->ns, i, 1)) {
 +            v = extract32(ppc->nsp, i, 1);
 +        } else {
 +            v = extract32(ppc->sp, i, 1);
 +        }
 +        qemu_set_irq(ppc->ap[i], v);
 +    }
 +}
 +
 +static void iotkit_secctl_ppc_ns_write(IoTKitSecCtlPPC *ppc, uint32_t value)
 +{
 +    int i;
 +
 +    ppc->ns = value & MAKE_64BIT_MASK(0, ppc->numports);
 +    for (i = 0; i < ppc->numports; i++) {
 +        qemu_set_irq(ppc->nonsec[i], extract32(ppc->ns, i, 1));
 +    }
 +    iotkit_secctl_update_ppc_ap(ppc);
 +}
 +
 +static void iotkit_secctl_ppc_sp_write(IoTKitSecCtlPPC *ppc, uint32_t value)
 +{
 +    ppc->sp = value & MAKE_64BIT_MASK(0, ppc->numports);
 +    iotkit_secctl_update_ppc_ap(ppc);
 +}
 +
 +static void iotkit_secctl_ppc_nsp_write(IoTKitSecCtlPPC *ppc, uint32_t value)
 +{
 +    ppc->nsp = value & MAKE_64BIT_MASK(0, ppc->numports);
 +    iotkit_secctl_update_ppc_ap(ppc);
 +}
 +
 +static void iotkit_secctl_ppc_update_irq_clear(IoTKitSecCtlPPC *ppc)
 +{
 +    uint32_t value = ppc->parent->secppcintstat;
 +
 +    qemu_set_irq(ppc->irq_clear, extract32(value, ppc->irq_bit_offset, 1));
 +}
 +
 +static void iotkit_secctl_ppc_update_irq_enable(IoTKitSecCtlPPC *ppc)
 +{
 +    uint32_t value = ppc->parent->secppcinten;
 +
 +    qemu_set_irq(ppc->irq_enable, extract32(value, ppc->irq_bit_offset, 1));
 +}
 +
  static MemTxResult iotkit_secctl_s_write(void *opaque, hwaddr addr,
                                           uint64_t value,
                                           unsigned size, MemTxAttrs attrs)
  {
 +    IoTKitSecCtl *s = IOTKIT_SECCTL(opaque);
      uint32_t offset = addr;
 +    IoTKitSecCtlPPC *ppc;
      trace_iotkit_secctl_s_write(offset, value, size);
@@ -XXX,XX +XXX,XX @@ static MemTxResult iotkit_secctl_s_write(void *opaque, hwaddr addr,
      switch (offset) {
      case A_SECRESPCFG:
 -    case A_NSCCFG:
 +        value &= 1;
 +        s->secrespcfg = value;
 +        qemu_set_irq(s->sec_resp_cfg, s->secrespcfg);
 +        break;
      case A_SECPPCINTCLR:
 +        value &= 0x00f000f3;
 +        foreach_ppc(s, iotkit_secctl_ppc_update_irq_clear);
 +        break;
      case A_SECPPCINTEN:
 -    case A_SECMSCINTCLR:
 -    case A_SECMSCINTEN:
 -    case A_BRGINTCLR:
 -    case A_BRGINTEN:
 +        s->secppcinten = value & 0x00f000f3;
 +        foreach_ppc(s, iotkit_secctl_ppc_update_irq_enable);
 +        break;
      case A_AHBNSPPCEXP0:
      case A_AHBNSPPCEXP1:
      case A_AHBNSPPCEXP2:
      case A_AHBNSPPCEXP3:
 +        ppc = &s->ahbexp[offset_to_ppc_idx(offset)];
 +        iotkit_secctl_ppc_ns_write(ppc, value);
 +        break;
      case A_APBNSPPC0:
      case A_APBNSPPC1:
 +        ppc = &s->apb[offset_to_ppc_idx(offset)];
 +        iotkit_secctl_ppc_ns_write(ppc, value);
 +        break;
      case A_APBNSPPCEXP0:
      case A_APBNSPPCEXP1:
      case A_APBNSPPCEXP2:
      case A_APBNSPPCEXP3:
 +        ppc = &s->apbexp[offset_to_ppc_idx(offset)];
 +        iotkit_secctl_ppc_ns_write(ppc, value);
 +        break;
      case A_AHBSPPPCEXP0:
      case A_AHBSPPPCEXP1:
      case A_AHBSPPPCEXP2:
      case A_AHBSPPPCEXP3:
 +        ppc = &s->ahbexp[offset_to_ppc_idx(offset)];
 +        iotkit_secctl_ppc_sp_write(ppc, value);
 +        break;
      case A_APBSPPPC0:
      case A_APBSPPPC1:
 +        ppc = &s->apb[offset_to_ppc_idx(offset)];
 +        iotkit_secctl_ppc_sp_write(ppc, value);
 +        break;
      case A_APBSPPPCEXP0:
      case A_APBSPPPCEXP1:
      case A_APBSPPPCEXP2:
      case A_APBSPPPCEXP3:
 +        ppc = &s->apbexp[offset_to_ppc_idx(offset)];
 +        iotkit_secctl_ppc_sp_write(ppc, value);
 +        break;
 +    case A_NSCCFG:
 +    case A_SECMSCINTCLR:
 +    case A_SECMSCINTEN:
 +    case A_BRGINTCLR:
 +    case A_BRGINTEN:
          qemu_log_mask(LOG_UNIMP,
                        "IoTKit SecCtl S block write: "
                        "unimplemented offset 0x%x\n", offset);
@@ -XXX,XX +XXX,XX @@ static MemTxResult iotkit_secctl_ns_read(void *opaque, hwaddr addr,
                                           uint64_t *pdata,
                                           unsigned size, MemTxAttrs attrs)
  {
 +    IoTKitSecCtl *s = IOTKIT_SECCTL(opaque);
      uint64_t r;
      uint32_t offset = addr & ~0x3;
@@ -XXX,XX +XXX,XX @@ static MemTxResult iotkit_secctl_ns_read(void *opaque, hwaddr addr,
      case A_AHBNSPPPCEXP1:
      case A_AHBNSPPPCEXP2:
      case A_AHBNSPPPCEXP3:
 +        r = s->ahbexp[offset_to_ppc_idx(offset)].nsp;
 +        break;
      case A_APBNSPPPC0:
      case A_APBNSPPPC1:
 +        r = s->apb[offset_to_ppc_idx(offset)].nsp;
 +        break;
      case A_APBNSPPPCEXP0:
      case A_APBNSPPPCEXP1:
      case A_APBNSPPPCEXP2:
      case A_APBNSPPPCEXP3:
 -        qemu_log_mask(LOG_UNIMP,
 -                      "IoTKit SecCtl NS block read: "
 -                      "unimplemented offset 0x%x\n", offset);
 +        r = s->apbexp[offset_to_ppc_idx(offset)].nsp;
          break;
      case A_PID4:
      case A_PID5:
@@ -XXX,XX +XXX,XX @@ static MemTxResult iotkit_secctl_ns_write(void *opaque, hwaddr addr,
                                            uint64_t value,
                                            unsigned size, MemTxAttrs attrs)
  {
 +    IoTKitSecCtl *s = IOTKIT_SECCTL(opaque);
      uint32_t offset = addr;
 +    IoTKitSecCtlPPC *ppc;
      trace_iotkit_secctl_ns_write(offset, value, size);
@@ -XXX,XX +XXX,XX @@ static MemTxResult iotkit_secctl_ns_write(void *opaque, hwaddr addr,
      case A_AHBNSPPPCEXP1:
      case A_AHBNSPPPCEXP2:
      case A_AHBNSPPPCEXP3:
 +        ppc = &s->ahbexp[offset_to_ppc_idx(offset)];
 +        iotkit_secctl_ppc_nsp_write(ppc, value);
 +        break;
      case A_APBNSPPPC0:
      case A_APBNSPPPC1:
 +        ppc = &s->apb[offset_to_ppc_idx(offset)];
 +        iotkit_secctl_ppc_nsp_write(ppc, value);
 +        break;
      case A_APBNSPPPCEXP0:
      case A_APBNSPPPCEXP1:
      case A_APBNSPPPCEXP2:
      case A_APBNSPPPCEXP3:
 -        qemu_log_mask(LOG_UNIMP,
 -                      "IoTKit SecCtl NS block write: "
 -                      "unimplemented offset 0x%x\n", offset);
 +        ppc = &s->apbexp[offset_to_ppc_idx(offset)];
 +        iotkit_secctl_ppc_nsp_write(ppc, value);
          break;
      case A_AHBNSPPPC0:
      case A_PID4:
@@ -XXX,XX +XXX,XX @@ static const MemoryRegionOps iotkit_secctl_ns_ops = {
      .impl.max_access_size = 4,
  };
 +static void iotkit_secctl_reset_ppc(IoTKitSecCtlPPC *ppc)
 +{
 +    ppc->ns = 0;
 +    ppc->sp = 0;
 +    ppc->nsp = 0;
 +}
 +
  static void iotkit_secctl_reset(DeviceState *dev)
  {
 +    IoTKitSecCtl *s = IOTKIT_SECCTL(dev);
 +    s->secppcintstat = 0;
 +    s->secppcinten = 0;
 +    s->secrespcfg = 0;
 +
 +    foreach_ppc(s, iotkit_secctl_reset_ppc);
 +}
 +
 +static void iotkit_secctl_ppc_irqstatus(void *opaque, int n, int level)
 +{
 +    IoTKitSecCtlPPC *ppc = opaque;
 +    IoTKitSecCtl *s = IOTKIT_SECCTL(ppc->parent);
 +    int irqbit = ppc->irq_bit_offset + n;
 +
 +    s->secppcintstat = deposit32(s->secppcintstat, irqbit, 1, level);
 +}
 +
 +static void iotkit_secctl_init_ppc(IoTKitSecCtl *s,
 +                                   IoTKitSecCtlPPC *ppc,
 +                                   const char *name,
 +                                   int numports,
 +                                   int irq_bit_offset)
 +{
 +    char *gpioname;
 +    DeviceState *dev = DEVICE(s);
 +
 +    ppc->numports = numports;
 +    ppc->irq_bit_offset = irq_bit_offset;
 +    ppc->parent = s;
 +
 +    gpioname = g_strdup_printf("%s_nonsec", name);
 +    qdev_init_gpio_out_named(dev, ppc->nonsec, gpioname, numports);
 +    g_free(gpioname);
 +    gpioname = g_strdup_printf("%s_ap", name);
 +    qdev_init_gpio_out_named(dev, ppc->ap, gpioname, numports);
 +    g_free(gpioname);
 +    gpioname = g_strdup_printf("%s_irq_enable", name);
 +    qdev_init_gpio_out_named(dev, &ppc->irq_enable, gpioname, 1);
 +    g_free(gpioname);
 +    gpioname = g_strdup_printf("%s_irq_clear", name);
 +    qdev_init_gpio_out_named(dev, &ppc->irq_clear, gpioname, 1);
 +    g_free(gpioname);
 +    gpioname = g_strdup_printf("%s_irq_status", name);
 +    qdev_init_gpio_in_named_with_opaque(dev, iotkit_secctl_ppc_irqstatus,
 +                                        ppc, gpioname, 1);
 +    g_free(gpioname);
  }
  static void iotkit_secctl_init(Object *obj)
  {
      IoTKitSecCtl *s = IOTKIT_SECCTL(obj);
      SysBusDevice *sbd = SYS_BUS_DEVICE(obj);
 +    DeviceState *dev = DEVICE(obj);
 +    int i;
 +
 +    iotkit_secctl_init_ppc(s, &s->apb[0], "apb_ppc0",
 +                           IOTS_APB_PPC0_NUM_PORTS, 0);
 +    iotkit_secctl_init_ppc(s, &s->apb[1], "apb_ppc1",
 +                           IOTS_APB_PPC1_NUM_PORTS, 1);
 +
 +    for (i = 0; i < IOTS_NUM_APB_EXP_PPC; i++) {
 +        IoTKitSecCtlPPC *ppc = &s->apbexp[i];
 +        char *ppcname = g_strdup_printf("apb_ppcexp%d", i);
 +        iotkit_secctl_init_ppc(s, ppc, ppcname, IOTS_PPC_NUM_PORTS, 4 + i);
 +        g_free(ppcname);
 +    }
 +    for (i = 0; i < IOTS_NUM_AHB_EXP_PPC; i++) {
 +        IoTKitSecCtlPPC *ppc = &s->ahbexp[i];
 +        char *ppcname = g_strdup_printf("ahb_ppcexp%d", i);
 +        iotkit_secctl_init_ppc(s, ppc, ppcname, IOTS_PPC_NUM_PORTS, 20 + i);
 +        g_free(ppcname);
 +    }
 +
 +    qdev_init_gpio_out_named(dev, &s->sec_resp_cfg, "sec_resp_cfg", 1);
      memory_region_init_io(&s->s_regs, obj, &iotkit_secctl_s_ops,
                            s, "iotkit-secctl-s-regs", 0x1000);
@@ -XXX,XX +XXX,XX @@ static void iotkit_secctl_init(Object *obj)
      sysbus_init_mmio(sbd, &s->ns_regs);
  }
 +static const VMStateDescription iotkit_secctl_ppc_vmstate = {
 +    .name = "iotkit-secctl-ppc",
 +    .version_id = 1,
 +    .minimum_version_id = 1,
-+    .fields = (VMStateField[]) {
++    .needed = ghes_needed,
-+        VMSTATE_UINT32(ns, IoTKitSecCtlPPC),
++    .fields      = (VMStateField[]) {
-+        VMSTATE_UINT32(sp, IoTKitSecCtlPPC),
++        VMSTATE_STRUCT(ghes_state, AcpiGedState, 1,
-+        VMSTATE_UINT32(nsp, IoTKitSecCtlPPC),
++                       vmstate_ghes_state, AcpiGhesState),
 +        VMSTATE_END_OF_LIST()
 +    }
 +};
 +
- static const VMStateDescription iotkit_secctl_vmstate = {
+ static const VMStateDescription vmstate_acpi_ged = {
-     .name = "iotkit-secctl",
+     .name = "acpi-ged",
      .version_id = 1,
-     .minimum_version_id = 1,
+@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_acpi_ged = {
-     .fields = (VMStateField[]) {
+     },
-+        VMSTATE_UINT32(secppcintstat, IoTKitSecCtl),
+     .subsections = (const VMStateDescription * []) {
-+        VMSTATE_UINT32(secppcinten, IoTKitSecCtl),
+         &vmstate_memhp_state,
-+        VMSTATE_UINT32(secrespcfg, IoTKitSecCtl),
++        &vmstate_ghes_state,
-+        VMSTATE_STRUCT_ARRAY(apb, IoTKitSecCtl, IOTS_NUM_APB_PPC, 1,
+         NULL
 +                             iotkit_secctl_ppc_vmstate, IoTKitSecCtlPPC),
 +        VMSTATE_STRUCT_ARRAY(apbexp, IoTKitSecCtl, IOTS_NUM_APB_EXP_PPC, 1,
 +                             iotkit_secctl_ppc_vmstate, IoTKitSecCtlPPC),
 +        VMSTATE_STRUCT_ARRAY(ahbexp, IoTKitSecCtl, IOTS_NUM_AHB_EXP_PPC, 1,
 +                             iotkit_secctl_ppc_vmstate, IoTKitSecCtlPPC),
          VMSTATE_END_OF_LIST()
      }
  };
+diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
+index XXXXXXX..XXXXXXX 100644
+--- a/hw/acpi/ghes.c
++++ b/hw/acpi/ghes.c
+@@ -XXX,XX +XXX,XX @@
+ #include "hw/acpi/ghes.h"
+ #include "hw/acpi/aml-build.h"
+ #include "qemu/error-report.h"
++#include "hw/acpi/generic_event_device.h"
++#include "hw/nvram/fw_cfg.h"
+ #define ACPI_GHES_ERRORS_FW_CFG_FILE        "etc/hardware_errors"
+ #define ACPI_GHES_DATA_ADDR_FW_CFG_FILE     "etc/hardware_errors_addr"
+@@ -XXX,XX +XXX,XX @@ void acpi_build_hest(GArray *table_data, BIOSLinker *linker)
+     build_header(linker, table_data, (void *)(table_data->data + hest_start),
+         "HEST", table_data->len - hest_start, 1, NULL, NULL);
+ }
++
++void acpi_ghes_add_fw_cfg(AcpiGhesState *ags, FWCfgState *s,
++                          GArray *hardware_error)
++{
++    /* Create a read-only fw_cfg file for GHES */
++    fw_cfg_add_file(s, ACPI_GHES_ERRORS_FW_CFG_FILE, hardware_error->data,
++                    hardware_error->len);
++
++    /* Create a read-write fw_cfg file for Address */
++    fw_cfg_add_file_callback(s, ACPI_GHES_DATA_ADDR_FW_CFG_FILE, NULL, NULL,
++        NULL, &(ags->ghes_addr_le), sizeof(ags->ghes_addr_le), false);
++}
+diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
+index XXXXXXX..XXXXXXX 100644
+--- a/hw/arm/virt-acpi-build.c
++++ b/hw/arm/virt-acpi-build.c
+@@ -XXX,XX +XXX,XX @@ void virt_acpi_setup(VirtMachineState *vms)
+ {
+     AcpiBuildTables tables;
+     AcpiBuildState *build_state;
++    AcpiGedState *acpi_ged_state;
+     if (!vms->fw_cfg) {
+         trace_virt_acpi_setup();
+@@ -XXX,XX +XXX,XX @@ void virt_acpi_setup(VirtMachineState *vms)
+     fw_cfg_add_file(vms->fw_cfg, ACPI_BUILD_TPMLOG_FILE, tables.tcpalog->data,
+                     acpi_data_len(tables.tcpalog));
++    if (vms->ras) {
++        assert(vms->acpi_dev);
++        acpi_ged_state = ACPI_GED(vms->acpi_dev);
++        acpi_ghes_add_fw_cfg(&acpi_ged_state->ghes_state,
++                             vms->fw_cfg, tables.hardware_errors);
++    }
++
+     build_state->rsdp_mr = acpi_add_rom_blob(virt_acpi_build_update,
+                                              build_state, tables.rsdp,
+                                              ACPI_BUILD_RSDP_FILE, 0);
 --
-.16.2
+.20.1

-[Qemu-devel] [PULL 02/39] xlnx-zynqmp-rtc: Add basic time support
+[PULL 25/45] KVM: Move hwpoison page related functions into kvm-all.c
-From: Alistair Francis <alistair.francis@xilinx.com>
+From: Dongjiu Geng <gengdongjiu@huawei.com>
-Allow the guest to determine the time set from the QEMU command line.
+kvm_hwpoison_page_add() and kvm_unpoison_all() will both
 be used by X86 and ARM platforms, so move them into
 "accel/kvm/kvm-all.c" to avoid duplicate code.
-This includes adding a trace event to debug the new time.
+For architectures that don't use the poison-list functionality
 the reset handler will harmlessly do nothing, so let's register
 the kvm_unpoison_all() function in the generic kvm_init() function.
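The behaviour described above (a deduplicating list of poisoned pages, drained by a reset handler that does nothing when the list is empty) can be sketched without any QEMU dependencies. This is an illustration under stated assumptions: a plain singly linked list replaces the `QLIST` macros, the `sketch_*` names are hypothetical, and the `qemu_ram_remap()` step of the real reset handler is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Plain singly linked list standing in for QEMU's QLIST macros. */
typedef struct SketchPoisonPage {
    uint64_t ram_addr;
    struct SketchPoisonPage *next;
} SketchPoisonPage;

static SketchPoisonPage *sketch_poison_list;

/* Add an address once; repeated reports of the same page are ignored. */
static void sketch_poison_add(uint64_t ram_addr)
{
    SketchPoisonPage *page;

    for (page = sketch_poison_list; page; page = page->next) {
        if (page->ram_addr == ram_addr) {
            return;
        }
    }
    page = malloc(sizeof(*page));
    assert(page != NULL);
    page->ram_addr = ram_addr;
    page->next = sketch_poison_list;
    sketch_poison_list = page;
}

/* Reset handler: drain the list. With an empty list the loop body never
 * runs, which is why registering the handler unconditionally is harmless. */
static void sketch_unpoison_all(void)
{
    while (sketch_poison_list) {
        SketchPoisonPage *page = sketch_poison_list;

        sketch_poison_list = page->next;
        free(page);
    }
}

/* Helper for inspection only; not part of the patch. */
static int sketch_poison_count(void)
{
    int n = 0;

    for (SketchPoisonPage *p = sketch_poison_list; p; p = p->next) {
        n++;
    }
    return n;
}
```

Calling `sketch_poison_add()` twice with the same address leaves one entry, and `sketch_unpoison_all()` on an empty list simply returns.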
-Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
-Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
+Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
+Acked-by: Xiang Zheng <zhengxiang9@huawei.com>
+Message-id: 20200512030609.19593-8-gengdongjiu@huawei.com
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- include/hw/timer/xlnx-zynqmp-rtc.h |  2 ++
+ include/sysemu/kvm_int.h | 12 ++++++++++++
- hw/timer/xlnx-zynqmp-rtc.c         | 58 ++++++++++++++++++++++++++++++++++++++
+ accel/kvm/kvm-all.c      | 36 ++++++++++++++++++++++++++++++++++++
- hw/timer/trace-events              |  3 ++
+ target/i386/kvm.c        | 36 ------------------------------------
-files changed, 63 insertions(+)
+files changed, 48 insertions(+), 36 deletions(-)
-diff --git a/include/hw/timer/xlnx-zynqmp-rtc.h b/include/hw/timer/xlnx-zynqmp-rtc.h
+diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h
 index XXXXXXX..XXXXXXX 100644
---- a/include/hw/timer/xlnx-zynqmp-rtc.h
+--- a/include/sysemu/kvm_int.h
-+++ b/include/hw/timer/xlnx-zynqmp-rtc.h
++++ b/include/sysemu/kvm_int.h
-@@ -XXX,XX +XXX,XX @@ typedef struct XlnxZynqMPRTC {
+@@ -XXX,XX +XXX,XX @@ void kvm_memory_listener_register(KVMState *s, KVMMemoryListener *kml,
-     qemu_irq irq_rtc_int;
+                                   AddressSpace *as, int as_id);
-     qemu_irq irq_addr_error_int;
+ void kvm_set_max_memslot_size(hwaddr max_slot_size);
 +    uint32_t tick_offset;
 +
-     uint32_t regs[XLNX_ZYNQMP_RTC_R_MAX];
++/**
-     RegisterInfo regs_info[XLNX_ZYNQMP_RTC_R_MAX];
++ * kvm_hwpoison_page_add:
- } XlnxZynqMPRTC;
++ *
-diff --git a/hw/timer/xlnx-zynqmp-rtc.c b/hw/timer/xlnx-zynqmp-rtc.c
++ * Parameters:
 + *  @ram_addr: the address in the RAM for the poisoned page
 + *
 + * Add a poisoned page to the list
 + *
 + * Return: None.
 + */
 +void kvm_hwpoison_page_add(ram_addr_t ram_addr);
  #endif
 diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/timer/xlnx-zynqmp-rtc.c
+--- a/accel/kvm/kvm-all.c
-+++ b/hw/timer/xlnx-zynqmp-rtc.c
++++ b/accel/kvm/kvm-all.c
 @@ -XXX,XX +XXX,XX @@
- #include "hw/register.h"
+ #include "qapi/visitor.h"
- #include "qemu/bitops.h"
+ #include "qapi/qapi-types-common.h"
- #include "qemu/log.h"
+ #include "qapi/qapi-visit-common.h"
-+#include "hw/ptimer.h"
++#include "sysemu/reset.h"
-+#include "qemu/cutils.h"
-+#include "sysemu/sysemu.h"
+ #include "hw/boards.h"
-+#include "trace.h"
- #include "hw/timer/xlnx-zynqmp-rtc.h"
+@@ -XXX,XX +XXX,XX @@ int kvm_vm_check_extension(KVMState *s, unsigned int extension)
+     return ret;
  #ifndef XLNX_ZYNQMP_RTC_ERR_DEBUG
@@ -XXX,XX +XXX,XX @@ static void addr_error_int_update_irq(XlnxZynqMPRTC *s)
      qemu_set_irq(s->irq_addr_error_int, pending);
  }
-+static uint32_t rtc_get_count(XlnxZynqMPRTC *s)
++typedef struct HWPoisonPage {
 +    ram_addr_t ram_addr;
 +    QLIST_ENTRY(HWPoisonPage) list;
 +} HWPoisonPage;
 +
 +static QLIST_HEAD(, HWPoisonPage) hwpoison_page_list =
 +    QLIST_HEAD_INITIALIZER(hwpoison_page_list);
 +
 +static void kvm_unpoison_all(void *param)
 +{
-+    int64_t now = qemu_clock_get_ns(rtc_clock);
++    HWPoisonPage *page, *next_page;
-+    return s->tick_offset + now / NANOSECONDS_PER_SECOND;
++
 +    QLIST_FOREACH_SAFE(page, &hwpoison_page_list, list, next_page) {
 +        QLIST_REMOVE(page, list);
 +        qemu_ram_remap(page->ram_addr, TARGET_PAGE_SIZE);
 +        g_free(page);
 +    }
 +}
 +
-+static uint64_t current_time_postr(RegisterInfo *reg, uint64_t val64)
++void kvm_hwpoison_page_add(ram_addr_t ram_addr)
 +{
-+    XlnxZynqMPRTC *s = XLNX_ZYNQMP_RTC(reg->opaque);
++    HWPoisonPage *page;
 +
-+    return rtc_get_count(s);
++    QLIST_FOREACH(page, &hwpoison_page_list, list) {
 +        if (page->ram_addr == ram_addr) {
 +            return;
 +        }
 +    }
 +    page = g_new(HWPoisonPage, 1);
 +    page->ram_addr = ram_addr;
 +    QLIST_INSERT_HEAD(&hwpoison_page_list, page, list);
 +}
 +
- static void rtc_int_status_postw(RegisterInfo *reg, uint64_t val64)
+ static uint32_t adjust_ioeventfd_endianness(uint32_t val, uint32_t size)
  {
-     XlnxZynqMPRTC *s = XLNX_ZYNQMP_RTC(reg->opaque);
+ #if defined(HOST_WORDS_BIGENDIAN) != defined(TARGET_WORDS_BIGENDIAN)
-@@ -XXX,XX +XXX,XX @@ static uint64_t addr_error_int_dis_prew(RegisterInfo *reg, uint64_t val64)
+@@ -XXX,XX +XXX,XX @@ static int kvm_init(MachineState *ms)
+         s->kernel_irqchip_split = mc->default_kernel_irqchip_split ? ON_OFF_AUTO_ON : ON_OFF_AUTO_OFF;
- static const RegisterAccessInfo rtc_regs_info[] = {
+     }
-     {   .name = "SET_TIME_WRITE",  .addr = A_SET_TIME_WRITE,
-+        .unimp = MAKE_64BIT_MASK(0, 32),
++    qemu_register_reset(kvm_unpoison_all, NULL);
      },{ .name = "SET_TIME_READ",  .addr = A_SET_TIME_READ,
          .ro = 0xffffffff,
 +        .post_read = current_time_postr,
      },{ .name = "CALIB_WRITE",  .addr = A_CALIB_WRITE,
 +        .unimp = MAKE_64BIT_MASK(0, 32),
      },{ .name = "CALIB_READ",  .addr = A_CALIB_READ,
          .ro = 0x1fffff,
      },{ .name = "CURRENT_TIME",  .addr = A_CURRENT_TIME,
          .ro = 0xffffffff,
 +        .post_read = current_time_postr,
      },{ .name = "CURRENT_TICK",  .addr = A_CURRENT_TICK,
          .ro = 0xffff,
      },{ .name = "ALARM",  .addr = A_ALARM,
@@ -XXX,XX +XXX,XX @@ static void rtc_init(Object *obj)
      XlnxZynqMPRTC *s = XLNX_ZYNQMP_RTC(obj);
      SysBusDevice *sbd = SYS_BUS_DEVICE(obj);
      RegisterInfoArray *reg_array;
 +    struct tm current_tm;
      memory_region_init(&s->iomem, obj, TYPE_XLNX_ZYNQMP_RTC,
                         XLNX_ZYNQMP_RTC_R_MAX * 4);
@@ -XXX,XX +XXX,XX @@ static void rtc_init(Object *obj)
      sysbus_init_mmio(sbd, &s->iomem);
      sysbus_init_irq(sbd, &s->irq_rtc_int);
      sysbus_init_irq(sbd, &s->irq_addr_error_int);
 +
-+    qemu_get_timedate(&current_tm, 0);
+     if (s->kernel_irqchip_allowed) {
-+    s->tick_offset = mktimegm(&current_tm) -
+         kvm_irqchip_create(s);
-+        qemu_clock_get_ns(rtc_clock) / NANOSECONDS_PER_SECOND;
+     }
-+
+diff --git a/target/i386/kvm.c b/target/i386/kvm.c
-+    trace_xlnx_zynqmp_rtc_gettime(current_tm.tm_year, current_tm.tm_mon,
+index XXXXXXX..XXXXXXX 100644
-+                                  current_tm.tm_mday, current_tm.tm_hour,
+--- a/target/i386/kvm.c
-+                                  current_tm.tm_min, current_tm.tm_sec);
++++ b/target/i386/kvm.c
-+}
+@@ -XXX,XX +XXX,XX @@
-+
+ #include "sysemu/sysemu.h"
-+static int rtc_pre_save(void *opaque)
+ #include "sysemu/hw_accel.h"
-+{
+ #include "sysemu/kvm_int.h"
-+    XlnxZynqMPRTC *s = opaque;
+-#include "sysemu/reset.h"
-+    int64_t now = qemu_clock_get_ns(rtc_clock) / NANOSECONDS_PER_SECOND;
+ #include "sysemu/runstate.h"
-+
+ #include "kvm_i386.h"
-+    /* Add the time at migration */
+ #include "hyperv.h"
-+    s->tick_offset = s->tick_offset + now;
+@@ -XXX,XX +XXX,XX @@ uint64_t kvm_arch_get_supported_msr_feature(KVMState *s, uint32_t index)
-+
+     }
 +    return 0;
 +}
 +
 +static int rtc_post_load(void *opaque, int version_id)
 +{
 +    XlnxZynqMPRTC *s = opaque;
 +    int64_t now = qemu_clock_get_ns(rtc_clock) / NANOSECONDS_PER_SECOND;
 +
 +    /* Subtract the time after migration. This combined with the pre_save
 +     * action results in us having subtracted the time that the guest was
 +     * stopped to the offset.
 +     */
 +    s->tick_offset = s->tick_offset - now;
 +
 +    return 0;
  }
- static const VMStateDescription vmstate_rtc = {
+-
-     .name = TYPE_XLNX_ZYNQMP_RTC,
+-typedef struct HWPoisonPage {
-     .version_id = 1,
+-    ram_addr_t ram_addr;
-     .minimum_version_id = 1,
+-    QLIST_ENTRY(HWPoisonPage) list;
-+    .pre_save = rtc_pre_save,
+-} HWPoisonPage;
-+    .post_load = rtc_post_load,
+-
-     .fields = (VMStateField[]) {
+-static QLIST_HEAD(, HWPoisonPage) hwpoison_page_list =
-         VMSTATE_UINT32_ARRAY(regs, XlnxZynqMPRTC, XLNX_ZYNQMP_RTC_R_MAX),
+-    QLIST_HEAD_INITIALIZER(hwpoison_page_list);
-+        VMSTATE_UINT32(tick_offset, XlnxZynqMPRTC),
+-
-         VMSTATE_END_OF_LIST(),
+-static void kvm_unpoison_all(void *param)
 -{
 -    HWPoisonPage *page, *next_page;
 -
 -    QLIST_FOREACH_SAFE(page, &hwpoison_page_list, list, next_page) {
 -        QLIST_REMOVE(page, list);
 -        qemu_ram_remap(page->ram_addr, TARGET_PAGE_SIZE);
 -        g_free(page);
 -    }
 -}
 -
 -static void kvm_hwpoison_page_add(ram_addr_t ram_addr)
 -{
 -    HWPoisonPage *page;
 -
 -    QLIST_FOREACH(page, &hwpoison_page_list, list) {
 -        if (page->ram_addr == ram_addr) {
 -            return;
 -        }
 -    }
 -    page = g_new(HWPoisonPage, 1);
 -    page->ram_addr = ram_addr;
 -    QLIST_INSERT_HEAD(&hwpoison_page_list, page, list);
 -}
 -
  static int kvm_get_mce_cap_supported(KVMState *s, uint64_t *mce_cap,
                                       int *max_banks)
  {
@@ -XXX,XX +XXX,XX @@ int kvm_arch_init(MachineState *ms, KVMState *s)
          fprintf(stderr, "e820_add_entry() table is full\n");
          return ret;
      }
- };
+-    qemu_register_reset(kvm_unpoison_all, NULL);
-diff --git a/hw/timer/trace-events b/hw/timer/trace-events
-index XXXXXXX..XXXXXXX 100644
+     shadow_mem = object_property_get_int(OBJECT(s), "kvm-shadow-mem", &error_abort);
---- a/hw/timer/trace-events
+     if (shadow_mem != -1) {
 +++ b/hw/timer/trace-events
@@ -XXX,XX +XXX,XX @@ systick_write(uint64_t addr, uint32_t value, unsigned size) "systick write addr
  cmsdk_apb_timer_read(uint64_t offset, uint64_t data, unsigned size) "CMSDK APB timer read: offset 0x%" PRIx64 " data 0x%" PRIx64 " size %u"
  cmsdk_apb_timer_write(uint64_t offset, uint64_t data, unsigned size) "CMSDK APB timer write: offset 0x%" PRIx64 " data 0x%" PRIx64 " size %u"
  cmsdk_apb_timer_reset(void) "CMSDK APB timer: reset"
 +
 +# hw/timer/xlnx-zynqmp-rtc.c
 +xlnx_zynqmp_rtc_gettime(int year, int month, int day, int hour, int min, int sec) "Get time from host: %d-%d-%d %2d:%02d:%02d"
 --
-.16.2
+.20.1

-New patch
+[PULL 26/45] ACPI: Record Generic Error Status Block(GESB) table
+From: Dongjiu Geng <gengdongjiu@huawei.com>
 kvm_arch_on_sigbus_vcpu() error injection uses source_id as an index
 into etc/hardware_errors to find the Error Status Data Block entry
 corresponding to the error source. The supported source_id values should
 therefore be assigned here and not changed afterwards, to make sure that
 the guest will write the error into the expected Error Status Data Block.
 Before QEMU writes a new error to the ACPI table, it checks whether the
 previous error has been acknowledged. If it has not, the new error is
 ignored and not recorded. Whatever the section type of the error, QEMU
 reports it as a memory section error.
 Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
 Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
 Reviewed-by: Igor Mammedov <imammedo@redhat.com>
 Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
 Message-id: 20200512030609.19593-9-gengdongjiu@huawei.com
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
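The acknowledge-before-overwrite rule described in the commit message can be sketched in isolation. This is a minimal model, not the QEMU code: the toy guest memory array, the `sketch_`/`guest_` helpers and `SKETCH_SOURCE_COUNT` are hypothetical names introduced for illustration, byte order is ignored (the patch converts with le64_to_cpu()), and a single 64-bit store stands in for the full status block. It assumes the layout the patch relies on: an array of error block addresses followed by one read-ack register per source.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for this sketch; not the QEMU definitions. */
#define SKETCH_SOURCE_COUNT 1      /* number of error sources */
#define SKETCH_GUEST_RAM    256    /* toy guest memory size in bytes */

static uint8_t guest_ram[SKETCH_GUEST_RAM];

static uint64_t guest_read64(uint64_t addr)
{
    uint64_t v;
    memcpy(&v, &guest_ram[addr], sizeof(v));
    return v;
}

static void guest_write64(uint64_t addr, uint64_t v)
{
    memcpy(&guest_ram[addr], &v, sizeof(v));
}

/*
 * Returns 0 if the error was recorded, -1 if it was dropped.
 * table_base is the start of the error block address array; the
 * read-ack registers follow that array, one per source.
 */
static int sketch_record_error(uint64_t table_base, unsigned source_id,
                               uint64_t fault_pa)
{
    uint64_t slot, ack_addr, block_addr;

    assert(source_id < SKETCH_SOURCE_COUNT);

    slot = table_base + source_id * sizeof(uint64_t);
    ack_addr = slot + SKETCH_SOURCE_COUNT * sizeof(uint64_t);
    block_addr = guest_read64(slot);

    if (guest_read64(ack_addr) == 0) {
        return -1;               /* previous error not yet acknowledged */
    }
    if (block_addr == 0) {
        return -1;               /* no status block for this source */
    }
    guest_write64(ack_addr, 0);  /* the guest writes 1 again on ack */
    guest_write64(block_addr, fault_pa); /* stand-in for the real CPER */
    return 0;
}
```

With the ack register at 1 the first call records the error and clears the register; a second call before the guest acknowledges returns -1 and leaves the recorded block untouched.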
  include/hw/acpi/ghes.h |   1 +
  hw/acpi/ghes.c         | 219 +++++++++++++++++++++++++++++++++++++++++
 files changed, 220 insertions(+)
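The three size constants in this patch (20, 72 and 80 bytes) can be cross-checked against the field widths that the builder functions append. The sketch below only restates those widths from the diff; the array and helper names are hypothetical, and it assumes QemuUUID carries 16 bytes of data.

```c
#include <assert.h>
#include <stddef.h>

/* Field widths in bytes, in the order the patch appends them. */

/* Generic Error Status Block header: five 4-byte fields. */
static const size_t sketch_gesb_fields[] = { 4, 4, 4, 4, 4 };

/*
 * Generic Error Data Entry: section type, severity, revision,
 * validation bits, flags, data length, FRU id, FRU text, timestamp.
 */
static const size_t sketch_gede_fields[] = { 16, 4, 2, 1, 1, 4, 16, 20, 8 };

/*
 * Memory error CPER: validation bits, error status, physical address,
 * skipped detail, memory error type, skipped tail.
 */
static const size_t sketch_cper_fields[] = { 8, 8, 8, 48, 1, 7 };

static size_t sketch_sum(const size_t *fields, size_t count)
{
    size_t total = 0;

    for (size_t i = 0; i < count; i++) {
        total += fields[i];
    }
    return total;
}

#define SKETCH_SUM(a) sketch_sum((a), sizeof(a) / sizeof((a)[0]))
```

The sums come to 20, 72 and 80 bytes, so one recorded memory error occupies 172 bytes. That total is what acpi_ghes_record_mem_error() compares against ACPI_GHES_MAX_RAW_DATA_LENGTH, whose value is defined outside this excerpt.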
 diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
 index XXXXXXX..XXXXXXX 100644
 --- a/include/hw/acpi/ghes.h
 +++ b/include/hw/acpi/ghes.h
@@ -XXX,XX +XXX,XX @@ void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker);
  void acpi_build_hest(GArray *table_data, BIOSLinker *linker);
  void acpi_ghes_add_fw_cfg(AcpiGhesState *vms, FWCfgState *s,
                            GArray *hardware_errors);
 +int acpi_ghes_record_errors(uint8_t notify, uint64_t error_physical_addr);
  #endif
 diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/acpi/ghes.c
 +++ b/hw/acpi/ghes.c
@@ -XXX,XX +XXX,XX @@
  #include "qemu/error-report.h"
  #include "hw/acpi/generic_event_device.h"
  #include "hw/nvram/fw_cfg.h"
 +#include "qemu/uuid.h"
  #define ACPI_GHES_ERRORS_FW_CFG_FILE        "etc/hardware_errors"
  #define ACPI_GHES_DATA_ADDR_FW_CFG_FILE     "etc/hardware_errors_addr"
@@ -XXX,XX +XXX,XX @@
  /* Address offset in Generic Address Structure(GAS) */
  #define GAS_ADDR_OFFSET 4
 +/*
 + * The total size of Generic Error Data Entry
 + * ACPI 6.1/6.2: 18.3.2.7.1 Generic Error Data,
 + * Table 18-343 Generic Error Data Entry
 + */
 +#define ACPI_GHES_DATA_LENGTH               72
 +
 +/* The memory section CPER size, UEFI 2.6: N.2.5 Memory Error Section */
 +#define ACPI_GHES_MEM_CPER_LENGTH           80
 +
 +/* Masks for block_status flags */
 +#define ACPI_GEBS_UNCORRECTABLE         1
 +
 +/*
 + * Total size for Generic Error Status Block except Generic Error Data Entries
 + * ACPI 6.2: 18.3.2.7.1 Generic Error Data,
 + * Table 18-380 Generic Error Status Block
 + */
 +#define ACPI_GHES_GESB_SIZE                 20
 +
 +/*
 + * Values for error_severity field
 + */
 +enum AcpiGenericErrorSeverity {
 +    ACPI_CPER_SEV_RECOVERABLE = 0,
 +    ACPI_CPER_SEV_FATAL = 1,
 +    ACPI_CPER_SEV_CORRECTED = 2,
 +    ACPI_CPER_SEV_NONE = 3,
 +};
 +
  /*
   * Hardware Error Notification
   * ACPI 4.0: 17.3.2.7 Hardware Error Notification
@@ -XXX,XX +XXX,XX @@ static void build_ghes_hw_error_notification(GArray *table, const uint8_t type)
      build_append_int_noprefix(table, 0, 4);
  }
 +/*
 + * Generic Error Data Entry
 + * ACPI 6.1: 18.3.2.7.1 Generic Error Data
 + */
 +static void acpi_ghes_generic_error_data(GArray *table,
 +                const uint8_t *section_type, uint32_t error_severity,
 +                uint8_t validation_bits, uint8_t flags,
 +                uint32_t error_data_length, QemuUUID fru_id,
 +                uint64_t time_stamp)
 +{
 +    const uint8_t fru_text[20] = {0};
 +
 +    /* Section Type */
 +    g_array_append_vals(table, section_type, 16);
 +
 +    /* Error Severity */
 +    build_append_int_noprefix(table, error_severity, 4);
 +    /* Revision */
 +    build_append_int_noprefix(table, 0x300, 2);
 +    /* Validation Bits */
 +    build_append_int_noprefix(table, validation_bits, 1);
 +    /* Flags */
 +    build_append_int_noprefix(table, flags, 1);
 +    /* Error Data Length */
 +    build_append_int_noprefix(table, error_data_length, 4);
 +
 +    /* FRU Id */
 +    g_array_append_vals(table, fru_id.data, ARRAY_SIZE(fru_id.data));
 +
 +    /* FRU Text */
 +    g_array_append_vals(table, fru_text, sizeof(fru_text));
 +
 +    /* Timestamp */
 +    build_append_int_noprefix(table, time_stamp, 8);
 +}
 +
 +/*
 + * Generic Error Status Block
 + * ACPI 6.1: 18.3.2.7.1 Generic Error Data
 + */
 +static void acpi_ghes_generic_error_status(GArray *table, uint32_t block_status,
 +                uint32_t raw_data_offset, uint32_t raw_data_length,
 +                uint32_t data_length, uint32_t error_severity)
 +{
 +    /* Block Status */
 +    build_append_int_noprefix(table, block_status, 4);
 +    /* Raw Data Offset */
 +    build_append_int_noprefix(table, raw_data_offset, 4);
 +    /* Raw Data Length */
 +    build_append_int_noprefix(table, raw_data_length, 4);
 +    /* Data Length */
 +    build_append_int_noprefix(table, data_length, 4);
 +    /* Error Severity */
 +    build_append_int_noprefix(table, error_severity, 4);
 +}
 +
 +/* UEFI 2.6: N.2.5 Memory Error Section */
 +static void acpi_ghes_build_append_mem_cper(GArray *table,
 +                                            uint64_t error_physical_addr)
 +{
 +    /*
 +     * Memory Error Record
 +     */
 +
 +    /* Validation Bits */
 +    build_append_int_noprefix(table,
 +                              (1ULL << 14) | /* Type Valid */
 +                              (1ULL << 1) /* Physical Address Valid */,
 +                              8);
 +    /* Error Status */
 +    build_append_int_noprefix(table, 0, 8);
 +    /* Physical Address */
 +    build_append_int_noprefix(table, error_physical_addr, 8);
 +    /* Skip all the detailed information normally found in such a record */
 +    build_append_int_noprefix(table, 0, 48);
 +    /* Memory Error Type */
 +    build_append_int_noprefix(table, 0 /* Unknown error */, 1);
 +    /* Skip all the detailed information normally found in such a record */
 +    build_append_int_noprefix(table, 0, 7);
 +}
 +
 +static int acpi_ghes_record_mem_error(uint64_t error_block_address,
 +                                      uint64_t error_physical_addr)
 +{
 +    GArray *block;
 +
 +    /* Memory Error Section Type */
 +    const uint8_t uefi_cper_mem_sec[] =
 +          UUID_LE(0xA5BC1114, 0x6F64, 0x4EDE, 0xB8, 0x63, 0x3E, 0x83, \
 +                  0xED, 0x7C, 0x83, 0xB1);
 +
 +    /* invalid fru id: ACPI 4.0: 17.3.2.6.1 Generic Error Data,
 +     * Table 17-13 Generic Error Data Entry
 +     */
 +    QemuUUID fru_id = {};
 +    uint32_t data_length;
 +
 +    block = g_array_new(false, true /* clear */, 1);
 +
 +    /* This is the length if adding a new generic error data entry */
 +    data_length = ACPI_GHES_DATA_LENGTH + ACPI_GHES_MEM_CPER_LENGTH;
 +
 +    /*
 +     * Check whether it will run out of the preallocated memory if adding a new
 +     * generic error data entry
 +     */
 +    if ((data_length + ACPI_GHES_GESB_SIZE) > ACPI_GHES_MAX_RAW_DATA_LENGTH) {
 +        error_report("Not enough memory to record new CPER!!!");
 +        g_array_free(block, true);
 +        return -1;
 +    }
 +
 +    /* Build the new generic error status block header */
 +    acpi_ghes_generic_error_status(block, ACPI_GEBS_UNCORRECTABLE,
 +        0, 0, data_length, ACPI_CPER_SEV_RECOVERABLE);
 +
 +    /* Build this new generic error data entry header */
 +    acpi_ghes_generic_error_data(block, uefi_cper_mem_sec,
 +        ACPI_CPER_SEV_RECOVERABLE, 0, 0,
 +        ACPI_GHES_MEM_CPER_LENGTH, fru_id, 0);
 +
 +    /* Build the memory section CPER for above new generic error data entry */
 +    acpi_ghes_build_append_mem_cper(block, error_physical_addr);
 +
 +    /* Write the generic error data entry into guest memory */
 +    cpu_physical_memory_write(error_block_address, block->data, block->len);
 +
 +    g_array_free(block, true);
 +
 +    return 0;
 +}
 +
  /*
   * Build table for the hardware error fw_cfg blob.
   * Initialize "etc/hardware_errors" and "etc/hardware_errors_addr" fw_cfg blobs.
@@ -XXX,XX +XXX,XX @@ void acpi_ghes_add_fw_cfg(AcpiGhesState *ags, FWCfgState *s,
      fw_cfg_add_file_callback(s, ACPI_GHES_DATA_ADDR_FW_CFG_FILE, NULL, NULL,
          NULL, &(ags->ghes_addr_le), sizeof(ags->ghes_addr_le), false);
  }
 +
 +int acpi_ghes_record_errors(uint8_t source_id, uint64_t physical_address)
 +{
 +    uint64_t error_block_addr, read_ack_register_addr, read_ack_register = 0;
 +    uint64_t start_addr;
 +    int ret = -1;
 +    AcpiGedState *acpi_ged_state;
 +    AcpiGhesState *ags;
 +
 +    assert(source_id < ACPI_HEST_SRC_ID_RESERVED);
 +
 +    acpi_ged_state = ACPI_GED(object_resolve_path_type("", TYPE_ACPI_GED,
 +                                                       NULL));
 +    g_assert(acpi_ged_state);
 +    ags = &acpi_ged_state->ghes_state;
 +
 +    start_addr = le64_to_cpu(ags->ghes_addr_le);
 +
 +    if (physical_address) {
 +
 +        if (source_id < ACPI_HEST_SRC_ID_RESERVED) {
 +            start_addr += source_id * sizeof(uint64_t);
 +        }
 +
 +        cpu_physical_memory_read(start_addr, &error_block_addr,
 +                                 sizeof(error_block_addr));
 +
 +        error_block_addr = le64_to_cpu(error_block_addr);
 +
 +        read_ack_register_addr = start_addr +
 +            ACPI_GHES_ERROR_SOURCE_COUNT * sizeof(uint64_t);
 +
 +        cpu_physical_memory_read(read_ack_register_addr,
 +                                 &read_ack_register, sizeof(read_ack_register));
 +
 +        /* zero means OSPM does not acknowledge the error */
 +        if (!read_ack_register) {
 +            error_report("OSPM does not acknowledge previous error,"
 +                " so can not record CPER for current error anymore");
 +        } else if (error_block_addr) {
 +            read_ack_register = cpu_to_le64(0);
 +            /*
 +             * Clear the Read Ack Register, OSPM will write it to 1 when
 +             * it acknowledges this error.
 +             */
 +            cpu_physical_memory_write(read_ack_register_addr,
 +                &read_ack_register, sizeof(uint64_t));
 +
 +            ret = acpi_ghes_record_mem_error(error_block_addr,
 +                                             physical_address);
 +        } else {
 +            error_report("can not find Generic Error Status Block");
 +        }
 +    }
 +
 +    return ret;
 +}
 --
 .20.1

-[Qemu-devel] [PULL 08/39] target/arm: Define an IDAU interface
+[PULL 27/45] target-arm: kvm64: handle SIGBUS signal from kernel or KVM
-In v8M, the Implementation Defined Attribution Unit (IDAU) is
+From: Dongjiu Geng <gengdongjiu@huawei.com>
-a small piece of hardware typically implemented in the SoC
-which provides board or SoC specific security attribution
+Add a SIGBUS signal handler. In this handler, it checks the SIGBUS type,
-information for each address that the CPU performs MPU/SAU
+translates the host VA delivered by host to guest PA, then fills this PA
-checks on. For QEMU, we model this with a QOM interface which
+to guest APEI GHES memory, then notifies guest according to the SIGBUS
-is implemented by the board or SoC object and connected to
+type.
-the CPU using a link property.
+When guest accesses the poisoned memory, it will generate a Synchronous
-This commit defines the new interface class, adds the link
+External Abort(SEA). Then host kernel gets an APEI notification and calls
-property to the CPU object, and makes the SAU checking
+memory_failure() to unmap the affected page in stage 2, and finally
-code call the IDAU interface if one is present.
+returns to guest.
 If the guest continues to access the PG_hwpoison page, the access traps
 to KVM as a stage 2 fault, and a synchronous SIGBUS_MCEERR_AR signal is
 delivered to QEMU. QEMU records the error address in the guest APEI GHES
 memory and notifies the guest using a Synchronous External Abort (SEA).
 In order to inject a vSEA, we introduce the kvm_inject_arm_sea() function,
 in which we can set up the type of exception and the syndrome information.
 When switching to the guest, the target vcpu will jump to the synchronous
 external abort vector table entry.
 The ESR_ELx.DFSC is set to synchronous external abort (0x10), and
 ESR_ELx.FnV is set to not valid (0x1), which tells the guest that FAR is
 not valid and holds an UNKNOWN value. These values are set in the KVM
 register structures through the KVM_SET_ONE_REG ioctl.
 Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
 Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
 Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
 Acked-by: Xiang Zheng <zhengxiang9@huawei.com>
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Igor Mammedov <imammedo@redhat.com>
 Message-id: 20200512030609.19593-10-gengdongjiu@huawei.com
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20180220180325.29818-5-peter.maydell@linaro.org
 ---
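The syndrome value that kvm_inject_arm_sea() builds can be worked out by hand. The sketch below mirrors the bit layout of syn_data_abort_no_iss() as changed by this patch. The `SK_` constants are assumptions taken from the architectural ESR_ELx layout (EC in bits [31:26], IL in bit 25, EC 0x24 for a data abort from a lower EL) and not from this excerpt; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed ESR_ELx layout; stand-ins for EC_DATAABORT and friends. */
#define SK_EC_DATAABORT 0x24u      /* data abort from a lower EL */
#define SK_EC_SHIFT     26
#define SK_IL           (1u << 25) /* 32-bit instruction length */

static uint32_t sketch_syn_data_abort_no_iss(uint32_t same_el, uint32_t fnv,
                                             uint32_t ea, uint32_t cm,
                                             uint32_t s1ptw, uint32_t wnr,
                                             uint32_t fsc)
{
    /* same_el turns EC 0x24 into 0x25 (abort taken from the same EL). */
    return (SK_EC_DATAABORT << SK_EC_SHIFT) | (same_el << SK_EC_SHIFT)
           | SK_IL
           | (fnv << 10) | (ea << 9) | (cm << 8) | (s1ptw << 7)
           | (wnr << 6) | fsc;
}

/* Same arguments as the call in kvm_inject_arm_sea(). */
static uint32_t sketch_sea_syndrome(uint32_t same_el)
{
    return sketch_syn_data_abort_no_iss(same_el, 1, 0, 0, 0, 0, 0x10);
}
```

Under these assumptions the syndrome is 0x92000410 when the abort is taken from a lower exception level and 0x96000410 when it is taken from the same one. In both cases bit 10 (FnV) is set and the low six bits (DFSC) hold 0x10.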
- target/arm/cpu.h    |  3 +++
+ include/sysemu/kvm.h    |  3 +-
- target/arm/idau.h   | 61 +++++++++++++++++++++++++++++++++++++++++++++++++++++
+ target/arm/cpu.h        |  4 +++
- target/arm/cpu.c    | 15 +++++++++++++
+ target/arm/internals.h  |  5 +--
- target/arm/helper.c | 28 +++++++++++++++++++++---
+ target/i386/cpu.h       |  2 ++
-files changed, 104 insertions(+), 3 deletions(-)
+ target/arm/helper.c     |  2 +-
- create mode 100644 target/arm/idau.h
+ target/arm/kvm64.c      | 77 +++++++++++++++++++++++++++++++++++++++++
+ target/arm/tlb_helper.c |  2 +-
 files changed, 89 insertions(+), 6 deletions(-)
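The branches of kvm_arch_on_sigbus_vcpu() can be summarised as a small decision function. This is a sketch of the control flow only: the enum and parameter names are hypothetical. `ras_ready` stands for "ACPI is enabled, the address is non-NULL and the ras property is set". `guest_page` stands for "the host address maps to guest RAM and translates to a guest physical address".

```c
#include <assert.h>
#include <stdbool.h>

enum sketch_action {
    SK_NOTHING,      /* return without any report */
    SK_POISON_ONLY,  /* remember the poisoned page, do not tell the guest */
    SK_INJECT_SEA,   /* record the CPER and inject an external abort */
    SK_ABORT,        /* recording failed: abort() */
    SK_WARN_ONLY,    /* error hit QEMU's own memory, action optional */
    SK_EXIT          /* unrecoverable: report and exit(1) */
};

static enum sketch_action sketch_on_sigbus(bool ras_ready, bool guest_page,
                                           bool action_required,
                                           bool record_ok)
{
    if (ras_ready && guest_page) {
        if (!action_required) {
            return SK_POISON_ONLY;      /* BUS_MCEERR_AO, see the TODO */
        }
        return record_ok ? SK_INJECT_SEA : SK_ABORT;
    }
    if (action_required) {
        return SK_EXIT;                 /* BUS_MCEERR_AR we cannot handle */
    }
    return ras_ready ? SK_WARN_ONLY : SK_NOTHING;
}
```

The only path that reaches the guest is a BUS_MCEERR_AR on a guest page with RAS enabled and a successful record. Every other BUS_MCEERR_AR ends the QEMU process.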
 diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
 index XXXXXXX..XXXXXXX 100644
 --- a/include/sysemu/kvm.h
 +++ b/include/sysemu/kvm.h
@@ -XXX,XX +XXX,XX @@ bool kvm_vcpu_id_is_valid(int vcpu_id);
  /* Returns VCPU ID to be used on KVM_CREATE_VCPU ioctl() */
  unsigned long kvm_arch_vcpu_id(CPUState *cpu);
 -#ifdef TARGET_I386
 -#define KVM_HAVE_MCE_INJECTION 1
 +#ifdef KVM_HAVE_MCE_INJECTION
  void kvm_arch_on_sigbus_vcpu(CPUState *cpu, int code, void *addr);
  #endif
 diff --git a/target/arm/cpu.h b/target/arm/cpu.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu.h
 +++ b/target/arm/cpu.h
-@@ -XXX,XX +XXX,XX @@ struct ARMCPU {
-     /* MemoryRegion to use for secure physical accesses */
-     MemoryRegion *secure_memory;
-+    /* For v8M, pointer to the IDAU interface provided by board/SoC */
-+    Object *idau;
-+
-     /* 'compatible' string for this CPU for Linux device trees */
-     const char *dtb_compatible;
-diff --git a/target/arm/idau.h b/target/arm/idau.h
-new file mode 100644
-index XXXXXXX..XXXXXXX
---- /dev/null
-+++ b/target/arm/idau.h
 @@ -XXX,XX +XXX,XX @@
-+/*
+ /* ARM processors have a weak memory model */
-+ * QEMU ARM CPU -- interface for the Arm v8M IDAU
+ #define TCG_GUEST_DEFAULT_MO      (0)
-+ *
-+ * Copyright (c) 2018 Linaro Ltd
++#ifdef TARGET_AARCH64
-+ *
++#define KVM_HAVE_MCE_INJECTION 1
 + * This program is free software; you can redistribute it and/or
 + * modify it under the terms of the GNU General Public License
 + * as published by the Free Software Foundation; either version 2
 + * of the License, or (at your option) any later version.
 + *
 + * This program is distributed in the hope that it will be useful,
 + * but WITHOUT ANY WARRANTY; without even the implied warranty of
 + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 + * GNU General Public License for more details.
 + *
 + * You should have received a copy of the GNU General Public License
 + * along with this program; if not, see
 + * <http://www.gnu.org/licenses/gpl-2.0.html>
 + *
 + * In the v8M architecture, the IDAU is a small piece of hardware
 + * typically implemented in the SoC which provides board or SoC
 + * specific security attribution information for each address that
 + * the CPU performs MPU/SAU checks on. For QEMU, we model this with a
 + * QOM interface which is implemented by the board or SoC object and
 + * connected to the CPU using a link property.
 + */
 +
 +#ifndef TARGET_ARM_IDAU_H
 +#define TARGET_ARM_IDAU_H
 +
 +#include "qom/object.h"
 +
 +#define TYPE_IDAU_INTERFACE "idau-interface"
 +#define IDAU_INTERFACE(obj) \
 +    INTERFACE_CHECK(IDAUInterface, (obj), TYPE_IDAU_INTERFACE)
 +#define IDAU_INTERFACE_CLASS(class) \
 +    OBJECT_CLASS_CHECK(IDAUInterfaceClass, (class), TYPE_IDAU_INTERFACE)
 +#define IDAU_INTERFACE_GET_CLASS(obj) \
 +    OBJECT_GET_CLASS(IDAUInterfaceClass, (obj), TYPE_IDAU_INTERFACE)
 +
 +typedef struct IDAUInterface {
 +    Object parent;
 +} IDAUInterface;
 +
 +#define IREGION_NOTVALID -1
 +
 +typedef struct IDAUInterfaceClass {
 +    InterfaceClass parent;
 +
 +    /* Check the specified address and return the IDAU security information
 +     * for it by filling in iregion, exempt, ns and nsc:
 +     *  iregion: IDAU region number, or IREGION_NOTVALID if not valid
 +     *  exempt: true if address is exempt from security attribution
 +     *  ns: true if the address is NonSecure
 +     *  nsc: true if the address is NonSecure-callable
 +     */
 +    void (*check)(IDAUInterface *ii, uint32_t address, int *iregion,
 +                  bool *exempt, bool *ns, bool *nsc);
 +} IDAUInterfaceClass;
 +
 +#endif
-diff --git a/target/arm/cpu.c b/target/arm/cpu.c
++
-index XXXXXXX..XXXXXXX 100644
+ #define EXCP_UDEF            1   /* undefined instruction */
---- a/target/arm/cpu.c
+ #define EXCP_SWI             2   /* software interrupt */
-+++ b/target/arm/cpu.c
+ #define EXCP_PREFETCH_ABORT  3
 diff --git a/target/arm/internals.h b/target/arm/internals.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/internals.h
 +++ b/target/arm/internals.h
@@ -XXX,XX +XXX,XX @@ static inline uint32_t syn_insn_abort(int same_el, int ea, int s1ptw, int fsc)
          | ARM_EL_IL | (ea << 9) | (s1ptw << 7) | fsc;
  }
 -static inline uint32_t syn_data_abort_no_iss(int same_el,
 +static inline uint32_t syn_data_abort_no_iss(int same_el, int fnv,
                                               int ea, int cm, int s1ptw,
                                               int wnr, int fsc)
  {
      return (EC_DATAABORT << ARM_EL_EC_SHIFT) | (same_el << ARM_EL_EC_SHIFT)
             | ARM_EL_IL
 -           | (ea << 9) | (cm << 8) | (s1ptw << 7) | (wnr << 6) | fsc;
 +           | (fnv << 10) | (ea << 9) | (cm << 8) | (s1ptw << 7)
 +           | (wnr << 6) | fsc;
  }
  static inline uint32_t syn_data_abort_with_iss(int same_el,
 diff --git a/target/i386/cpu.h b/target/i386/cpu.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/i386/cpu.h
 +++ b/target/i386/cpu.h
 @@ -XXX,XX +XXX,XX @@
-  */
+ /* The x86 has a strong memory model with some store-after-load re-ordering */
+ #define TCG_GUEST_DEFAULT_MO      (TCG_MO_ALL & ~TCG_MO_ST_LD)
- #include "qemu/osdep.h"
-+#include "target/arm/idau.h"
++#define KVM_HAVE_MCE_INJECTION 1
- #include "qemu/error-report.h"
++
- #include "qapi/error.h"
+ /* Maximum instruction code size */
- #include "cpu.h"
+ #define TARGET_MAX_INSN_SIZE 16
-@@ -XXX,XX +XXX,XX @@ static void arm_cpu_post_init(Object *obj)
          }
      }
 +    if (arm_feature(&cpu->env, ARM_FEATURE_M_SECURITY)) {
 +        object_property_add_link(obj, "idau", TYPE_IDAU_INTERFACE, &cpu->idau,
 +                                 qdev_prop_allow_set_link_before_realize,
 +                                 OBJ_PROP_LINK_UNREF_ON_RELEASE,
 +                                 &error_abort);
 +    }
 +
      qdev_property_add_static(DEVICE(obj), &arm_cpu_cfgend_property,
                               &error_abort);
  }
@@ -XXX,XX +XXX,XX @@ static const TypeInfo arm_cpu_type_info = {
      .class_init = arm_cpu_class_init,
  };
 +static const TypeInfo idau_interface_type_info = {
 +    .name = TYPE_IDAU_INTERFACE,
 +    .parent = TYPE_INTERFACE,
 +    .class_size = sizeof(IDAUInterfaceClass),
 +};
 +
  static void arm_cpu_register_types(void)
  {
      const ARMCPUInfo *info = arm_cpus;
      type_register_static(&arm_cpu_type_info);
 +    type_register_static(&idau_interface_type_info);
      while (info->name) {
          cpu_register(info);
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.c
 +++ b/target/arm/helper.c
+@@ -XXX,XX +XXX,XX @@ static uint64_t do_ats_write(CPUARMState *env, uint64_t value,
+              * Report exception with ESR indicating a fault due to a
+              * translation table walk for a cache maintenance instruction.
+              */
+-            syn = syn_data_abort_no_iss(current_el == target_el,
++            syn = syn_data_abort_no_iss(current_el == target_el, 0,
+                                         fi.ea, 1, fi.s1ptw, 1, fsc);
+             env->exception.vaddress = value;
+             env->exception.fsr = fsr;
+diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/kvm64.c
++++ b/target/arm/kvm64.c
 @@ -XXX,XX +XXX,XX @@
- #include "qemu/osdep.h"
+ #include "sysemu/kvm_int.h"
-+#include "target/arm/idau.h"
+ #include "kvm_arm.h"
  #include "trace.h"
  #include "cpu.h"
  #include "internals.h"
-@@ -XXX,XX +XXX,XX @@ static void v8m_security_lookup(CPUARMState *env, uint32_t address,
++#include "hw/acpi/acpi.h"
 +#include "hw/acpi/ghes.h"
 +#include "hw/arm/virt.h"
  static bool have_guest_debug;
@@ -XXX,XX +XXX,XX @@ int kvm_arm_cpreg_level(uint64_t regidx)
      return KVM_PUT_RUNTIME_STATE;
  }
 +/* Callers must hold the iothread mutex lock */
 +static void kvm_inject_arm_sea(CPUState *c)
 +{
 +    ARMCPU *cpu = ARM_CPU(c);
 +    CPUARMState *env = &cpu->env;
 +    CPUClass *cc = CPU_GET_CLASS(c);
 +    uint32_t esr;
 +    bool same_el;
 +
 +    c->exception_index = EXCP_DATA_ABORT;
 +    env->exception.target_el = 1;
 +
 +    /*
 +     * Set the DFSC to synchronous external abort and set FnV to not valid,
 +     * this will tell guest the FAR_ELx is UNKNOWN for this abort.
 +     */
 +    same_el = arm_current_el(env) == env->exception.target_el;
 +    esr = syn_data_abort_no_iss(same_el, 1, 0, 0, 0, 0, 0x10);
 +
 +    env->exception.syndrome = esr;
 +
 +    cc->do_interrupt(c);
 +}
 +
  #define AARCH64_CORE_REG(x)   (KVM_REG_ARM64 | KVM_REG_SIZE_U64 | \
                   KVM_REG_ARM_CORE | KVM_REG_ARM_CORE_REG(x))
@@ -XXX,XX +XXX,XX @@ int kvm_arch_get_registers(CPUState *cs)
      return ret;
  }
 +void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
 +{
 +    ram_addr_t ram_addr;
 +    hwaddr paddr;
 +    Object *obj = qdev_get_machine();
 +    VirtMachineState *vms = VIRT_MACHINE(obj);
 +    bool acpi_enabled = virt_is_acpi_enabled(vms);
 +
 +    assert(code == BUS_MCEERR_AR || code == BUS_MCEERR_AO);
 +
 +    if (acpi_enabled && addr &&
 +            object_property_get_bool(obj, "ras", NULL)) {
 +        ram_addr = qemu_ram_addr_from_host(addr);
 +        if (ram_addr != RAM_ADDR_INVALID &&
 +            kvm_physical_memory_addr_from_host(c->kvm_state, addr, &paddr)) {
 +            kvm_hwpoison_page_add(ram_addr);
 +            /*
 +             * If this is a BUS_MCEERR_AR, we know we have been called
 +             * synchronously from the vCPU thread, so we can easily
 +             * synchronize the state and inject an error.
 +             *
 +             * TODO: we currently don't tell the guest at all about
 +             * BUS_MCEERR_AO. In that case we might either be being
 +             * called synchronously from the vCPU thread, or a bit
 +             * later from the main thread, so doing the injection of
 +             * the error would be more complicated.
 +             */
 +            if (code == BUS_MCEERR_AR) {
 +                kvm_cpu_synchronize_state(c);
 +                if (!acpi_ghes_record_errors(ACPI_HEST_SRC_ID_SEA, paddr)) {
 +                    kvm_inject_arm_sea(c);
 +                } else {
 +                    error_report("failed to record the error");
 +                    abort();
 +                }
 +            }
 +            return;
 +        }
 +        if (code == BUS_MCEERR_AO) {
 +            error_report("Hardware memory error at addr %p for memory used by "
 +                "QEMU itself instead of guest system!", addr);
 +        }
 +    }
 +
 +    if (code == BUS_MCEERR_AR) {
 +        error_report("Hardware memory error!");
 +        exit(1);
 +    }
 +}
 +
  /* C6.6.29 BRK instruction */
  static const uint32_t brk_insn = 0xd4200000;
 diff --git a/target/arm/tlb_helper.c b/target/arm/tlb_helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/tlb_helper.c
 +++ b/target/arm/tlb_helper.c
@@ -XXX,XX +XXX,XX @@ static inline uint32_t merge_syn_data_abort(uint32_t template_syn,
       * ISV field.
       */
-     ARMCPU *cpu = arm_env_get_cpu(env);
+     if (!(template_syn & ARM_EL_ISV) || target_el != 2 || s1ptw) {
-     int r;
+-        syn = syn_data_abort_no_iss(same_el,
-+    bool idau_exempt = false, idau_ns = true, idau_nsc = true;
++        syn = syn_data_abort_no_iss(same_el, 0,
-+    int idau_region = IREGION_NOTVALID;
+                                     ea, 0, s1ptw, is_write, fsc);
+     } else {
--    /* TODO: implement IDAU */
+         /*
 +    if (cpu->idau) {
 +        IDAUInterfaceClass *iic = IDAU_INTERFACE_GET_CLASS(cpu->idau);
 +        IDAUInterface *ii = IDAU_INTERFACE(cpu->idau);
 +
 +        iic->check(ii, address, &idau_region, &idau_exempt, &idau_ns,
 +                   &idau_nsc);
 +    }
      if (access_type == MMU_INST_FETCH && extract32(address, 28, 4) == 0xf) {
          /* 0xf0000000..0xffffffff is always S for insn fetches */
          return;
      }
 -    if (v8m_is_sau_exempt(env, address, access_type)) {
 +    if (idau_exempt || v8m_is_sau_exempt(env, address, access_type)) {
          sattrs->ns = !regime_is_secure(env, mmu_idx);
          return;
      }
 +    if (idau_region != IREGION_NOTVALID) {
 +        sattrs->irvalid = true;
 +        sattrs->iregion = idau_region;
 +    }
 +
      switch (env->sau.ctrl & 3) {
      case 0: /* SAU.ENABLE == 0, SAU.ALLNS == 0 */
          break;
@@ -XXX,XX +XXX,XX @@ static void v8m_security_lookup(CPUARMState *env, uint32_t address,
              }
          }
 -        /* TODO when we support the IDAU then it may override the result here */
 +        /* The IDAU will override the SAU lookup results if it specifies
 +         * higher security than the SAU does.
 +         */
 +        if (!idau_ns) {
 +            if (sattrs->ns || (!idau_nsc && sattrs->nsc)) {
 +                sattrs->ns = false;
 +                sattrs->nsc = idau_nsc;
 +            }
 +        }
          break;
      }
  }
 --
-.16.2
+.20.1

-New patch
+[PULL 28/45] MAINTAINERS: Add ACPI/HEST/GHES entries
+From: Dongjiu Geng <gengdongjiu@huawei.com>
+Xiang and I are willing to review the APEI-related patches and
+volunteer as the reviewers for the HEST/GHES part.
+Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
+Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
+Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+Acked-by: Michael S. Tsirkin <mst@redhat.com>
+Message-id: 20200512030609.19593-11-gengdongjiu@huawei.com
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+---
+ MAINTAINERS | 9 +++++++++
+file changed, 9 insertions(+)
+diff --git a/MAINTAINERS b/MAINTAINERS
+index XXXXXXX..XXXXXXX 100644
+--- a/MAINTAINERS
++++ b/MAINTAINERS
+@@ -XXX,XX +XXX,XX @@ F: tests/qtest/bios-tables-test.c
+ F: tests/qtest/acpi-utils.[hc]
+ F: tests/data/acpi/
++ACPI/HEST/GHES
++R: Dongjiu Geng <gengdongjiu@huawei.com>
++R: Xiang Zheng <zhengxiang9@huawei.com>
++L: qemu-arm@nongnu.org
++S: Maintained
++F: hw/acpi/ghes.c
++F: include/hw/acpi/ghes.h
++F: docs/specs/acpi_hest_ghes.rst
++
+ ppc4xx
+ M: David Gibson <david@gibson.dropbear.id.au>
+ L: qemu-ppc@nongnu.org
+--
+.20.1

-[Qemu-devel] [PULL 06/39] hw/arm/boot: Honour CPU's address space for image loads
+[PULL 29/45] target/arm: Convert Neon 3-reg-same VQRDMLAH/VQRDMLSH to decodetree
-Instead of loading kernels, device trees, and the like to
+Convert the Neon VQRDMLAH and VQRDMLSH insns in the 3-reg-same group
-the system address space, use the CPU's address space. This
+to decodetree.  These don't use do_3same() because they want to
-is important if we're trying to load the file to memory or
+operate on VFP double registers, whose offsets are different from the
-via an alias memory region that is provided by an SoC
+neon_reg_offset() calculations do_3same does.
 object and thus not mapped into the system address space.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20180220180325.29818-3-peter.maydell@linaro.org
+Message-id: 20200512163904.10918-2-peter.maydell@linaro.org
 ---
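For readers new to the .decode syntax, the fixed-bit part of the two patterns added to neon-dp.decode reduces to a mask-and-compare on the instruction word. The sketch below shows only that part and is not how decodetree is implemented: the real generator also extracts fields through formats such as @3same and resolves overlapping patterns. All names here are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct sketch_pattern {
    uint32_t mask;   /* 1 where the pattern fixes a bit */
    uint32_t value;  /* required value of the fixed bits */
};

/*
 * Parse a 32-bit pattern written with '0', '1' and '.', most
 * significant bit first. Spaces are ignored. Returns false unless
 * exactly 32 bit positions are given.
 */
static bool sketch_parse(const char *text, struct sketch_pattern *out)
{
    uint32_t mask = 0, value = 0;
    int bits = 0;

    for (; *text; text++) {
        if (*text == ' ') {
            continue;
        }
        if ((*text != '0' && *text != '1' && *text != '.') || bits == 32) {
            return false;
        }
        mask = (mask << 1) | (uint32_t)(*text != '.');
        value = (value << 1) | (uint32_t)(*text == '1');
        bits++;
    }
    if (bits != 32) {
        return false;
    }
    out->mask = mask;
    out->value = value;
    return true;
}

static bool sketch_match(const struct sketch_pattern *p, uint32_t insn)
{
    return (insn & p->mask) == p->value;
}
```

Fed the VQRDMLAH_3s line, the parser yields mask 0xff800f10 and value 0xf3000b10; VQRDMLSH_3s differs only in bits [11:8], giving value 0xf3000c10. The size and feature checks in the DO_VQRDMLAH macro then run inside the trans_ function after a match.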
- hw/arm/boot.c | 119 +++++++++++++++++++++++++++++++++++++---------------------
+ target/arm/neon-dp.decode       |  3 +++
-file changed, 76 insertions(+), 43 deletions(-)
+ target/arm/translate-neon.inc.c | 15 +++++++++++++++
  target/arm/translate.c          | 14 ++------------
 files changed, 20 insertions(+), 12 deletions(-)
-diff --git a/hw/arm/boot.c b/hw/arm/boot.c
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/boot.c
+--- a/target/arm/neon-dp.decode
-+++ b/hw/arm/boot.c
++++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ VMLS_3s          1111 001 1 0 . .. .... .... 1001 . . . 0 .... @3same
- #define ARM64_TEXT_OFFSET_OFFSET    8
- #define ARM64_MAGIC_OFFSET          56
+ VMUL_3s          1111 001 0 0 . .. .... .... 1001 . . . 1 .... @3same
+ VMUL_p_3s        1111 001 1 0 . .. .... .... 1001 . . . 1 .... @3same
 +static AddressSpace *arm_boot_address_space(ARMCPU *cpu,
 +                                            const struct arm_boot_info *info)
 +{
 +    /* Return the address space to use for bootloader reads and writes.
 +     * We prefer the secure address space if the CPU has it and we're
 +     * going to boot the guest into it.
 +     */
 +    int asidx;
 +    CPUState *cs = CPU(cpu);
 +
-+    if (arm_feature(&cpu->env, ARM_FEATURE_EL3) && info->secure_boot) {
++VQRDMLAH_3s      1111 001 1 0 . .. .... .... 1011 ... 1 .... @3same
-+        asidx = ARMASIdx_S;
++VQRDMLSH_3s      1111 001 1 0 . .. .... .... 1100 ... 1 .... @3same
-+    } else {
+diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
-+        asidx = ARMASIdx_NS;
+index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
      }
      return do_3same(s, a, gen_VMUL_p_3s);
  }
 +
 +#define DO_VQRDMLAH(INSN, FUNC)                                         \
 +    static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a)        \
 +    {                                                                   \
 +        if (!dc_isar_feature(aa32_rdm, s)) {                            \
 +            return false;                                               \
 +        }                                                               \
 +        if (a->size != 1 && a->size != 2) {                             \
 +            return false;                                               \
 +        }                                                               \
 +        return do_3same(s, a, FUNC);                                    \
 +    }
 +
-+    return cpu_get_address_space(cs, asidx);
++DO_VQRDMLAH(VQRDMLAH, gen_gvec_sqrdmlah_qc)
-+}
++DO_VQRDMLAH(VQRDMLSH, gen_gvec_sqrdmlsh_qc)
-+
+diff --git a/target/arm/translate.c b/target/arm/translate.c
- typedef enum {
+index XXXXXXX..XXXXXXX 100644
-     FIXUP_NONE = 0,     /* do nothing */
+--- a/target/arm/translate.c
-     FIXUP_TERMINATOR,   /* end of insns */
++++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static const ARMInsnFixup smpboot[] = {
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
- };
+             if (!u) {
+                 break;  /* VPADD */
  static void write_bootloader(const char *name, hwaddr addr,
 -                             const ARMInsnFixup *insns, uint32_t *fixupcontext)
 +                             const ARMInsnFixup *insns, uint32_t *fixupcontext,
 +                             AddressSpace *as)
  {
      /* Fix up the specified bootloader fragment and write it into
       * guest memory using rom_add_blob_fixed(). fixupcontext is
@@ -XXX,XX +XXX,XX @@ static void write_bootloader(const char *name, hwaddr addr,
          code[i] = tswap32(insn);
      }
 -    rom_add_blob_fixed(name, code, len * sizeof(uint32_t), addr);
 +    rom_add_blob_fixed_as(name, code, len * sizeof(uint32_t), addr, as);
      g_free(code);
  }
@@ -XXX,XX +XXX,XX @@ static void default_write_secondary(ARMCPU *cpu,
                                      const struct arm_boot_info *info)
  {
      uint32_t fixupcontext[FIXUP_MAX];
 +    AddressSpace *as = arm_boot_address_space(cpu, info);
      fixupcontext[FIXUP_GIC_CPU_IF] = info->gic_cpu_if_addr;
      fixupcontext[FIXUP_BOOTREG] = info->smp_bootreg_addr;
@@ -XXX,XX +XXX,XX @@ static void default_write_secondary(ARMCPU *cpu,
      }
      write_bootloader("smpboot", info->smp_loader_start,
 -                     smpboot, fixupcontext);
 +                     smpboot, fixupcontext, as);
  }
  void arm_write_secure_board_setup_dummy_smc(ARMCPU *cpu,
                                              const struct arm_boot_info *info,
                                              hwaddr mvbar_addr)
  {
 +    AddressSpace *as = arm_boot_address_space(cpu, info);
      int n;
      uint32_t mvbar_blob[] = {
          /* mvbar_addr: secure monitor vectors
@@ -XXX,XX +XXX,XX @@ void arm_write_secure_board_setup_dummy_smc(ARMCPU *cpu,
      for (n = 0; n < ARRAY_SIZE(mvbar_blob); n++) {
          mvbar_blob[n] = tswap32(mvbar_blob[n]);
      }
 -    rom_add_blob_fixed("board-setup-mvbar", mvbar_blob, sizeof(mvbar_blob),
 -                       mvbar_addr);
 +    rom_add_blob_fixed_as("board-setup-mvbar", mvbar_blob, sizeof(mvbar_blob),
 +                          mvbar_addr, as);
      for (n = 0; n < ARRAY_SIZE(board_setup_blob); n++) {
          board_setup_blob[n] = tswap32(board_setup_blob[n]);
      }
 -    rom_add_blob_fixed("board-setup", board_setup_blob,
 -                       sizeof(board_setup_blob), info->board_setup_addr);
 +    rom_add_blob_fixed_as("board-setup", board_setup_blob,
 +                          sizeof(board_setup_blob), info->board_setup_addr, as);
  }
  static void default_reset_secondary(ARMCPU *cpu,
                                      const struct arm_boot_info *info)
  {
 +    AddressSpace *as = arm_boot_address_space(cpu, info);
      CPUState *cs = CPU(cpu);
 -    address_space_stl_notdirty(&address_space_memory, info->smp_bootreg_addr,
 +    address_space_stl_notdirty(as, info->smp_bootreg_addr,
                                 0, MEMTXATTRS_UNSPECIFIED, NULL);
      cpu_set_pc(cs, info->smp_loader_start);
  }
@@ -XXX,XX +XXX,XX @@ static inline bool have_dtb(const struct arm_boot_info *info)
  }
  #define WRITE_WORD(p, value) do { \
 -    address_space_stl_notdirty(&address_space_memory, p, value, \
 +    address_space_stl_notdirty(as, p, value, \
                                 MEMTXATTRS_UNSPECIFIED, NULL);  \
      p += 4;                       \
  } while (0)
 -static void set_kernel_args(const struct arm_boot_info *info)
 +static void set_kernel_args(const struct arm_boot_info *info, AddressSpace *as)
  {
      int initrd_size = info->initrd_size;
      hwaddr base = info->loader_start;
@@ -XXX,XX +XXX,XX @@ static void set_kernel_args(const struct arm_boot_info *info)
          int cmdline_size;
          cmdline_size = strlen(info->kernel_cmdline);
 -        cpu_physical_memory_write(p + 8, info->kernel_cmdline,
 -                                  cmdline_size + 1);
 +        address_space_write(as, p + 8, MEMTXATTRS_UNSPECIFIED,
 +                            (const uint8_t *)info->kernel_cmdline,
 +                            cmdline_size + 1);
          cmdline_size = (cmdline_size >> 2) + 1;
          WRITE_WORD(p, cmdline_size + 2);
          WRITE_WORD(p, 0x54410009);
@@ -XXX,XX +XXX,XX @@ static void set_kernel_args(const struct arm_boot_info *info)
          atag_board_len = (info->atag_board(info, atag_board_buf) + 3) & ~3;
          WRITE_WORD(p, (atag_board_len + 8) >> 2);
          WRITE_WORD(p, 0x414f4d50);
 -        cpu_physical_memory_write(p, atag_board_buf, atag_board_len);
 +        address_space_write(as, p, MEMTXATTRS_UNSPECIFIED,
 +                            atag_board_buf, atag_board_len);
          p += atag_board_len;
      }
      /* ATAG_END */
@@ -XXX,XX +XXX,XX @@ static void set_kernel_args(const struct arm_boot_info *info)
      WRITE_WORD(p, 0);
  }
 -static void set_kernel_args_old(const struct arm_boot_info *info)
 +static void set_kernel_args_old(const struct arm_boot_info *info,
 +                                AddressSpace *as)
  {
      hwaddr p;
      const char *s;
@@ -XXX,XX +XXX,XX @@ static void set_kernel_args_old(const struct arm_boot_info *info)
      }
      s = info->kernel_cmdline;
      if (s) {
 -        cpu_physical_memory_write(p, s, strlen(s) + 1);
 +        address_space_write(as, p, MEMTXATTRS_UNSPECIFIED,
 +                            (const uint8_t *)s, strlen(s) + 1);
      } else {
          WRITE_WORD(p, 0);
      }
@@ -XXX,XX +XXX,XX @@ static void fdt_add_psci_node(void *fdt)
   * @addr:       the address to load the image at
   * @binfo:      struct describing the boot environment
   * @addr_limit: upper limit of the available memory area at @addr
 + * @as:         address space to load image to
   *
   * Load a device tree supplied by the machine or by the user  with the
   * '-dtb' command line option, and put it at offset @addr in target
@@ -XXX,XX +XXX,XX @@ static void fdt_add_psci_node(void *fdt)
   * Note: Must not be called unless have_dtb(binfo) is true.
   */
  static int load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
 -                    hwaddr addr_limit)
 +                    hwaddr addr_limit, AddressSpace *as)
  {
      void *fdt = NULL;
      int size, rc;
@@ -XXX,XX +XXX,XX @@ static int load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
      /* Put the DTB into the memory map as a ROM image: this will ensure
       * the DTB is copied again upon reset, even if addr points into RAM.
       */
 -    rom_add_blob_fixed("dtb", fdt, size, addr);
 +    rom_add_blob_fixed_as("dtb", fdt, size, addr, as);
      g_free(fdt);
@@ -XXX,XX +XXX,XX @@ static void do_cpu_reset(void *opaque)
              }
+-            /* VQRDMLAH */
-             if (cs == first_cpu) {
+-            if (dc_isar_feature(aa32_rdm, s) && (size == 1 || size == 2)) {
-+                AddressSpace *as = arm_boot_address_space(cpu, info);
+-                gen_gvec_sqrdmlah_qc(size, rd_ofs, rn_ofs, rm_ofs,
-+
+-                                     vec_size, vec_size);
-                 cpu_set_pc(cs, info->loader_start);
+-                return 0;
+-            }
-                 if (!have_dtb(info)) {
++            /* VQRDMLAH : handled by decodetree */
-                     if (old_param) {
+             return 1;
--                        set_kernel_args_old(info);
-+                        set_kernel_args_old(info, as);
+         case NEON_3R_VFM_VQRDMLSH:
-                     } else {
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
 -                        set_kernel_args(info);
 +                        set_kernel_args(info, as);
                      }
                  }
-             } else {
+                 break;
@@ -XXX,XX +XXX,XX @@ static int do_arm_linux_init(Object *obj, void *opaque)
  static uint64_t arm_load_elf(struct arm_boot_info *info, uint64_t *pentry,
                               uint64_t *lowaddr, uint64_t *highaddr,
 -                             int elf_machine)
 +                             int elf_machine, AddressSpace *as)
  {
      bool elf_is64;
      union {
@@ -XXX,XX +XXX,XX @@ static uint64_t arm_load_elf(struct arm_boot_info *info, uint64_t *pentry,
          }
      }
 -    ret = load_elf(info->kernel_filename, NULL, NULL,
 -                   pentry, lowaddr, highaddr, big_endian, elf_machine,
 -                   1, data_swab);
 +    ret = load_elf_as(info->kernel_filename, NULL, NULL,
 +                      pentry, lowaddr, highaddr, big_endian, elf_machine,
 +                      1, data_swab, as);
      if (ret <= 0) {
          /* The header loaded but the image didn't */
          exit(1);
@@ -XXX,XX +XXX,XX @@ static uint64_t arm_load_elf(struct arm_boot_info *info, uint64_t *pentry,
  }
  static uint64_t load_aarch64_image(const char *filename, hwaddr mem_base,
 -                                   hwaddr *entry)
 +                                   hwaddr *entry, AddressSpace *as)
  {
      hwaddr kernel_load_offset = KERNEL64_LOAD_ADDR;
      uint8_t *buffer;
@@ -XXX,XX +XXX,XX @@ static uint64_t load_aarch64_image(const char *filename, hwaddr mem_base,
      }
      *entry = mem_base + kernel_load_offset;
 -    rom_add_blob_fixed(filename, buffer, size, *entry);
 +    rom_add_blob_fixed_as(filename, buffer, size, *entry, as);
      g_free(buffer);
@@ -XXX,XX +XXX,XX @@ static void arm_load_kernel_notify(Notifier *notifier, void *data)
      ARMCPU *cpu = n->cpu;
      struct arm_boot_info *info =
          container_of(n, struct arm_boot_info, load_kernel_notifier);
 +    AddressSpace *as = arm_boot_address_space(cpu, info);
      /* The board code is not supposed to set secure_board_setup unless
       * running its code in secure mode is actually possible, and KVM
@@ -XXX,XX +XXX,XX @@ static void arm_load_kernel_notify(Notifier *notifier, void *data)
               * the kernel is supposed to be loaded by the bootloader), copy the
               * DTB to the base of RAM for the bootloader to pick up.
               */
 -            if (load_dtb(info->loader_start, info, 0) < 0) {
 +            if (load_dtb(info->loader_start, info, 0, as) < 0) {
                  exit(1);
              }
-         }
+-            /* VQRDMLSH */
-@@ -XXX,XX +XXX,XX @@ static void arm_load_kernel_notify(Notifier *notifier, void *data)
+-            if (dc_isar_feature(aa32_rdm, s) && (size == 1 || size == 2)) {
+-                gen_gvec_sqrdmlsh_qc(size, rd_ofs, rn_ofs, rm_ofs,
-     /* Assume that raw images are linux kernels, and ELF images are not.  */
+-                                     vec_size, vec_size);
-     kernel_size = arm_load_elf(info, &elf_entry, &elf_low_addr,
+-                return 0;
--                               &elf_high_addr, elf_machine);
+-            }
-+                               &elf_high_addr, elf_machine, as);
++            /* VQRDMLSH : handled by decodetree */
-     if (kernel_size > 0 && have_dtb(info)) {
+             return 1;
-         /* If there is still some room left at the base of RAM, try and put
-          * the DTB there like we do for images loaded with -bios or -pflash.
+         case NEON_3R_VABD:
@@ -XXX,XX +XXX,XX @@ static void arm_load_kernel_notify(Notifier *notifier, void *data)
              if (elf_low_addr < info->loader_start) {
                  elf_low_addr = 0;
              }
 -            if (load_dtb(info->loader_start, info, elf_low_addr) < 0) {
 +            if (load_dtb(info->loader_start, info, elf_low_addr, as) < 0) {
                  exit(1);
              }
          }
      }
      entry = elf_entry;
      if (kernel_size < 0) {
 -        kernel_size = load_uimage(info->kernel_filename, &entry, NULL,
 -                                  &is_linux, NULL, NULL);
 +        kernel_size = load_uimage_as(info->kernel_filename, &entry, NULL,
 +                                     &is_linux, NULL, NULL, as);
      }
      if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64) && kernel_size < 0) {
          kernel_size = load_aarch64_image(info->kernel_filename,
 -                                         info->loader_start, &entry);
 +                                         info->loader_start, &entry, as);
          is_linux = 1;
      } else if (kernel_size < 0) {
          /* 32-bit ARM */
          entry = info->loader_start + KERNEL_LOAD_ADDR;
 -        kernel_size = load_image_targphys(info->kernel_filename, entry,
 -                                          info->ram_size - KERNEL_LOAD_ADDR);
 +        kernel_size = load_image_targphys_as(info->kernel_filename, entry,
 +                                             info->ram_size - KERNEL_LOAD_ADDR,
 +                                             as);
          is_linux = 1;
      }
      if (kernel_size < 0) {
@@ -XXX,XX +XXX,XX @@ static void arm_load_kernel_notify(Notifier *notifier, void *data)
          uint32_t fixupcontext[FIXUP_MAX];
          if (info->initrd_filename) {
 -            initrd_size = load_ramdisk(info->initrd_filename,
 -                                       info->initrd_start,
 -                                       info->ram_size -
 -                                       info->initrd_start);
 +            initrd_size = load_ramdisk_as(info->initrd_filename,
 +                                          info->initrd_start,
 +                                          info->ram_size - info->initrd_start,
 +                                          as);
              if (initrd_size < 0) {
 -                initrd_size = load_image_targphys(info->initrd_filename,
 -                                                  info->initrd_start,
 -                                                  info->ram_size -
 -                                                  info->initrd_start);
 +                initrd_size = load_image_targphys_as(info->initrd_filename,
 +                                                     info->initrd_start,
 +                                                     info->ram_size -
 +                                                     info->initrd_start,
 +                                                     as);
              }
              if (initrd_size < 0) {
                  error_report("could not load initrd '%s'",
@@ -XXX,XX +XXX,XX @@ static void arm_load_kernel_notify(Notifier *notifier, void *data)
              /* Place the DTB after the initrd in memory with alignment. */
              dtb_start = QEMU_ALIGN_UP(info->initrd_start + initrd_size, align);
 -            if (load_dtb(dtb_start, info, 0) < 0) {
 +            if (load_dtb(dtb_start, info, 0, as) < 0) {
                  exit(1);
              }
              fixupcontext[FIXUP_ARGPTR] = dtb_start;
@@ -XXX,XX +XXX,XX @@ static void arm_load_kernel_notify(Notifier *notifier, void *data)
          fixupcontext[FIXUP_ENTRYPOINT] = entry;
          write_bootloader("bootloader", info->loader_start,
 -                         primary_loader, fixupcontext);
 +                         primary_loader, fixupcontext, as);
          if (info->nb_cpus > 1) {
              info->write_secondary_boot(cpu, info);
 --
-2.16.2
+2.20.1

-New patch
+[PULL 30/45] target/arm: Convert Neon 3-reg-same SHA to decodetree
+Convert the Neon SHA instructions in the 3-reg-same group
 to decodetree.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
 Message-id: 20200512163904.10918-3-peter.maydell@linaro.org
 ---
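Reader's note (not part of the commit): a minimal standalone sketch of how
the four SHA patterns in this patch partition the encoding space. It is an
illustration only, not QEMU code. The names sha_decode, ShaKind and ShaDecode
are hypothetical, and the %vd_dp/%vn_dp/%vm_dp field layouts (D:Vd, N:Vn,
M:Vm) are assumed, since their definitions are not part of this diff.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical types for illustration; QEMU generates its own arg_* structs. */
typedef enum {
    SHA_NONE, SHA_SHA1, SHA_SHA256H, SHA_SHA256H2, SHA_SHA256SU1
} ShaKind;

typedef struct {
    ShaKind kind;
    unsigned optype;        /* bits [21:20]; only meaningful for SHA1 */
    unsigned vd, vn, vm;    /* D-register numbers, 0..31 */
} ShaDecode;

static unsigned bits(uint32_t insn, unsigned pos, unsigned len)
{
    return (insn >> pos) & ((1u << len) - 1);
}

/* Returns true if insn matches one of the four SHA patterns. */
static bool sha_decode(uint32_t insn, ShaDecode *out)
{
    /*
     * Fixed bits shared by all four patterns:
     * [31:25]=1111001, [23]=0, [11:8]=1100, [6]=1 (Q), [4]=0.
     */
    const uint32_t mask = 0xfe800f50u;
    const uint32_t want = 0xf2000c40u;
    unsigned u = bits(insn, 24, 1);
    unsigned size = bits(insn, 20, 2);

    if ((insn & mask) != want) {
        return false;
    }

    /* Assumed field layout: vd = D:Vd, vn = N:Vn, vm = M:Vm */
    out->vd = (bits(insn, 22, 1) << 4) | bits(insn, 12, 4);
    out->vn = (bits(insn, 7, 1) << 4) | bits(insn, 16, 4);
    out->vm = (bits(insn, 5, 1) << 4) | bits(insn, 0, 4);
    out->optype = size;

    if (!u) {
        out->kind = SHA_SHA1;           /* optype selects the SHA-1 variant */
        return true;
    }
    switch (size) {
    case 0: out->kind = SHA_SHA256H; return true;
    case 1: out->kind = SHA_SHA256H2; return true;
    case 2: out->kind = SHA_SHA256SU1; return true;
    default: return false;              /* no pattern for U=1, size=0b11 */
    }
}

/* Mirrors the "(a->vn | a->vm | a->vd) & 1" check: Q regs need even numbers. */
static bool sha_regs_ok(const ShaDecode *d)
{
    return ((d->vn | d->vm | d->vd) & 1) == 0;
}
```

Under these assumptions, 0xf2020c44 matches SHA1_3s with optype 0, vd=0,
vn=2, vm=4. A word with U=1 and bits [21:20]=0b11 matches none of the four
patterns, which corresponds to the "size == 3" UNDEF check in the removed
legacy code.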
  target/arm/neon-dp.decode       |  10 +++
  target/arm/translate-neon.inc.c | 139 ++++++++++++++++++++++++++++++++
  target/arm/translate.c          |  46 +----------
  3 files changed, 151 insertions(+), 44 deletions(-)
 diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/neon-dp.decode
 +++ b/target/arm/neon-dp.decode
@@ -XXX,XX +XXX,XX @@ VMUL_3s          1111 001 0 0 . .. .... .... 1001 . . . 1 .... @3same
  VMUL_p_3s        1111 001 1 0 . .. .... .... 1001 . . . 1 .... @3same
  VQRDMLAH_3s      1111 001 1 0 . .. .... .... 1011 ... 1 .... @3same
 +
 +SHA1_3s          1111 001 0 0 . optype:2 .... .... 1100 . 1 . 0 .... \
 +                 vm=%vm_dp vn=%vn_dp vd=%vd_dp
 +SHA256H_3s       1111 001 1 0 . 00 .... .... 1100 . 1 . 0 .... \
 +                 vm=%vm_dp vn=%vn_dp vd=%vd_dp
 +SHA256H2_3s      1111 001 1 0 . 01 .... .... 1100 . 1 . 0 .... \
 +                 vm=%vm_dp vn=%vn_dp vd=%vd_dp
 +SHA256SU1_3s     1111 001 1 0 . 10 .... .... 1100 . 1 . 0 .... \
 +                 vm=%vm_dp vn=%vn_dp vd=%vd_dp
 +
  VQRDMLSH_3s      1111 001 1 0 . .. .... .... 1100 ... 1 .... @3same
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
  DO_VQRDMLAH(VQRDMLAH, gen_gvec_sqrdmlah_qc)
  DO_VQRDMLAH(VQRDMLSH, gen_gvec_sqrdmlsh_qc)
 +
 +static bool trans_SHA1_3s(DisasContext *s, arg_SHA1_3s *a)
 +{
 +    TCGv_ptr ptr1, ptr2, ptr3;
 +    TCGv_i32 tmp;
 +
 +    if (!arm_dc_feature(s, ARM_FEATURE_NEON) ||
 +        !dc_isar_feature(aa32_sha1, s)) {
 +        return false;
 +    }
 +
 +    /* UNDEF accesses to D16-D31 if they don't exist. */
 +    if (!dc_isar_feature(aa32_simd_r32, s) &&
 +        ((a->vd | a->vn | a->vm) & 0x10)) {
 +        return false;
 +    }
 +
 +    if ((a->vn | a->vm | a->vd) & 1) {
 +        return false;
 +    }
 +
 +    if (!vfp_access_check(s)) {
 +        return true;
 +    }
 +
 +    ptr1 = vfp_reg_ptr(true, a->vd);
 +    ptr2 = vfp_reg_ptr(true, a->vn);
 +    ptr3 = vfp_reg_ptr(true, a->vm);
 +    tmp = tcg_const_i32(a->optype);
 +    gen_helper_crypto_sha1_3reg(ptr1, ptr2, ptr3, tmp);
 +    tcg_temp_free_i32(tmp);
 +    tcg_temp_free_ptr(ptr1);
 +    tcg_temp_free_ptr(ptr2);
 +    tcg_temp_free_ptr(ptr3);
 +
 +    return true;
 +}
 +
 +static bool trans_SHA256H_3s(DisasContext *s, arg_SHA256H_3s *a)
 +{
 +    TCGv_ptr ptr1, ptr2, ptr3;
 +
 +    if (!arm_dc_feature(s, ARM_FEATURE_NEON) ||
 +        !dc_isar_feature(aa32_sha2, s)) {
 +        return false;
 +    }
 +
 +    /* UNDEF accesses to D16-D31 if they don't exist. */
 +    if (!dc_isar_feature(aa32_simd_r32, s) &&
 +        ((a->vd | a->vn | a->vm) & 0x10)) {
 +        return false;
 +    }
 +
 +    if ((a->vn | a->vm | a->vd) & 1) {
 +        return false;
 +    }
 +
 +    if (!vfp_access_check(s)) {
 +        return true;
 +    }
 +
 +    ptr1 = vfp_reg_ptr(true, a->vd);
 +    ptr2 = vfp_reg_ptr(true, a->vn);
 +    ptr3 = vfp_reg_ptr(true, a->vm);
 +    gen_helper_crypto_sha256h(ptr1, ptr2, ptr3);
 +    tcg_temp_free_ptr(ptr1);
 +    tcg_temp_free_ptr(ptr2);
 +    tcg_temp_free_ptr(ptr3);
 +
 +    return true;
 +}
 +
 +static bool trans_SHA256H2_3s(DisasContext *s, arg_SHA256H2_3s *a)
 +{
 +    TCGv_ptr ptr1, ptr2, ptr3;
 +
 +    if (!arm_dc_feature(s, ARM_FEATURE_NEON) ||
 +        !dc_isar_feature(aa32_sha2, s)) {
 +        return false;
 +    }
 +
 +    /* UNDEF accesses to D16-D31 if they don't exist. */
 +    if (!dc_isar_feature(aa32_simd_r32, s) &&
 +        ((a->vd | a->vn | a->vm) & 0x10)) {
 +        return false;
 +    }
 +
 +    if ((a->vn | a->vm | a->vd) & 1) {
 +        return false;
 +    }
 +
 +    if (!vfp_access_check(s)) {
 +        return true;
 +    }
 +
 +    ptr1 = vfp_reg_ptr(true, a->vd);
 +    ptr2 = vfp_reg_ptr(true, a->vn);
 +    ptr3 = vfp_reg_ptr(true, a->vm);
 +    gen_helper_crypto_sha256h2(ptr1, ptr2, ptr3);
 +    tcg_temp_free_ptr(ptr1);
 +    tcg_temp_free_ptr(ptr2);
 +    tcg_temp_free_ptr(ptr3);
 +
 +    return true;
 +}
 +
 +static bool trans_SHA256SU1_3s(DisasContext *s, arg_SHA256SU1_3s *a)
 +{
 +    TCGv_ptr ptr1, ptr2, ptr3;
 +
 +    if (!arm_dc_feature(s, ARM_FEATURE_NEON) ||
 +        !dc_isar_feature(aa32_sha2, s)) {
 +        return false;
 +    }
 +
 +    /* UNDEF accesses to D16-D31 if they don't exist. */
 +    if (!dc_isar_feature(aa32_simd_r32, s) &&
 +        ((a->vd | a->vn | a->vm) & 0x10)) {
 +        return false;
 +    }
 +
 +    if ((a->vn | a->vm | a->vd) & 1) {
 +        return false;
 +    }
 +
 +    if (!vfp_access_check(s)) {
 +        return true;
 +    }
 +
 +    ptr1 = vfp_reg_ptr(true, a->vd);
 +    ptr2 = vfp_reg_ptr(true, a->vn);
 +    ptr3 = vfp_reg_ptr(true, a->vm);
 +    gen_helper_crypto_sha256su1(ptr1, ptr2, ptr3);
 +    tcg_temp_free_ptr(ptr1);
 +    tcg_temp_free_ptr(ptr2);
 +    tcg_temp_free_ptr(ptr3);
 +
 +    return true;
 +}
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
      int vec_size;
      uint32_t imm;
      TCGv_i32 tmp, tmp2, tmp3, tmp4, tmp5;
 -    TCGv_ptr ptr1, ptr2, ptr3;
 +    TCGv_ptr ptr1, ptr2;
      TCGv_i64 tmp64;
      if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
              return 1;
          }
          switch (op) {
 -        case NEON_3R_SHA:
 -            /* The SHA-1/SHA-256 3-register instructions require special
 -             * treatment here, as their size field is overloaded as an
 -             * op type selector, and they all consume their input in a
 -             * single pass.
 -             */
 -            if (!q) {
 -                return 1;
 -            }
 -            if (!u) { /* SHA-1 */
 -                if (!dc_isar_feature(aa32_sha1, s)) {
 -                    return 1;
 -                }
 -                ptr1 = vfp_reg_ptr(true, rd);
 -                ptr2 = vfp_reg_ptr(true, rn);
 -                ptr3 = vfp_reg_ptr(true, rm);
 -                tmp4 = tcg_const_i32(size);
 -                gen_helper_crypto_sha1_3reg(ptr1, ptr2, ptr3, tmp4);
 -                tcg_temp_free_i32(tmp4);
 -            } else { /* SHA-256 */
 -                if (!dc_isar_feature(aa32_sha2, s) || size == 3) {
 -                    return 1;
 -                }
 -                ptr1 = vfp_reg_ptr(true, rd);
 -                ptr2 = vfp_reg_ptr(true, rn);
 -                ptr3 = vfp_reg_ptr(true, rm);
 -                switch (size) {
 -                case 0:
 -                    gen_helper_crypto_sha256h(ptr1, ptr2, ptr3);
 -                    break;
 -                case 1:
 -                    gen_helper_crypto_sha256h2(ptr1, ptr2, ptr3);
 -                    break;
 -                case 2:
 -                    gen_helper_crypto_sha256su1(ptr1, ptr2, ptr3);
 -                    break;
 -                }
 -            }
 -            tcg_temp_free_ptr(ptr1);
 -            tcg_temp_free_ptr(ptr2);
 -            tcg_temp_free_ptr(ptr3);
 -            return 0;
 -
          case NEON_3R_VPADD_VQRDMLAH:
              if (!u) {
                  break;  /* VPADD */
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          case NEON_3R_VMUL:
          case NEON_3R_VML:
          case NEON_3R_VSHL:
 +        case NEON_3R_SHA:
              /* Already handled by decodetree */
              return 1;
          }
 --
 2.20.1

-[Qemu-devel] [PULL 10/39] target/arm: Define init-svtor property for the reset secure VTOR value
+[PULL 31/45] target/arm: Convert Neon 64-bit element 3-reg-same insns
-The Cortex-M33 allows the system to specify the reset value of the
+Convert the 64-bit element insns in the 3-reg-same group
-secure Vector Table Offset Register (VTOR) by asserting config
+to decodetree. This covers VQSHL, VRSHL and VQRSHL where
-signals. In particular, guest images for the MPS2 AN505 board rely
+size==0b11.
-on the MPS2's initial VTOR being correct for that board.
-Implement a QEMU property so board and SoC code can set the reset
-value to the correct value.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20180220180325.29818-7-peter.maydell@linaro.org
+Message-id: 20200512163904.10918-4-peter.maydell@linaro.org
 ---
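Reader's note (not part of the commit): a minimal plain-C sketch of the
macro layering this patch uses. One macro turns a per-element 64-bit function
into a whole-vector operation, and a second macro adapts functions that also
need CPU state. Everything here is a hypothetical stand-in: Env, cpu_env,
shl_u64 and qshl_u64 are simplified illustrations (shift counts are assumed
to be 0..63) and do not reproduce the semantics of QEMU's Neon helpers.

```c
#include <stdint.h>

/* Hypothetical stand-in for CPU state; only tracks a saturation flag. */
typedef struct {
    int qc;
} Env;

static Env cpu_env;

/* Simplified element op: plain left shift, count assumed to be 0..63. */
static uint64_t shl_u64(uint64_t val, uint64_t shift)
{
    return val << (shift & 63);
}

/* Simplified saturating left shift; sets env->qc when bits are lost. */
static uint64_t qshl_u64(Env *env, uint64_t val, uint64_t shift)
{
    unsigned sh = shift & 63;

    if (sh != 0 && (val >> (64 - sh)) != 0) {
        env->qc = 1;
        return UINT64_MAX;
    }
    return val << sh;
}

/* Wrap a two-operand element function into a per-vector function. */
#define DO_3SAME_64_SKETCH(INSN, FUNC)                                  \
    static void vec_##INSN(uint64_t *d, const uint64_t *n,              \
                           const uint64_t *m, unsigned nelem)           \
    {                                                                   \
        for (unsigned i = 0; i < nelem; i++) {                          \
            d[i] = FUNC(n[i], m[i]);                                    \
        }                                                               \
    }

/* Adapt an element function that needs env, then reuse the macro above. */
#define DO_3SAME_64_ENV_SKETCH(INSN, FUNC)                              \
    static uint64_t elt_##INSN(uint64_t n, uint64_t m)                  \
    {                                                                   \
        return FUNC(&cpu_env, n, m);                                    \
    }                                                                   \
    DO_3SAME_64_SKETCH(INSN, elt_##INSN)

DO_3SAME_64_SKETCH(SHL_U64, shl_u64)
DO_3SAME_64_ENV_SKETCH(QSHL_U64, qshl_u64)
```

In the sketch, nelem plays the role of the old "q ? 2 : 1" pass count: one
64-bit element for a D register, two for a Q register. The first operand is
the value and the second the shift count, which is why the patch decodes
these insns with the operand-swapping @3same_64_rev format.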
- target/arm/cpu.h |  3 +++
+ target/arm/neon-dp.decode       | 13 +++++++++++
- target/arm/cpu.c | 18 ++++++++++++++----
+ target/arm/translate-neon.inc.c | 24 +++++++++++++++++++++
- 2 files changed, 17 insertions(+), 4 deletions(-)
+ target/arm/translate.c          | 38 ++-------------------------------
  3 files changed, 39 insertions(+), 36 deletions(-)
-diff --git a/target/arm/cpu.h b/target/arm/cpu.h
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.h
+--- a/target/arm/neon-dp.decode
-+++ b/target/arm/cpu.h
++++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ struct ARMCPU {
+@@ -XXX,XX +XXX,XX @@ VCGE_U_3s        1111 001 1 0 . .. .... .... 0011 . . . 1 .... @3same
-      */
+ VSHL_S_3s        1111 001 0 0 . .. .... .... 0100 . . . 0 .... @3same_rev
-     uint32_t psci_conduit;
+ VSHL_U_3s        1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same_rev
-+    /* For v8M, initial value of the Secure VTOR */
++# Insns operating on 64-bit elements (size!=0b11 handled elsewhere)
-+    uint32_t init_svtor;
++# The _rev suffix indicates that Vn and Vm are reversed (as explained
 +# by the comment for the @3same_rev format).
 +@3same_64_rev    .... ... . . . 11 .... .... .... . q:1 . . .... \
 +                 &3same vm=%vn_dp vn=%vm_dp vd=%vd_dp size=3
 +
-     /* [QEMU_]KVM_ARM_TARGET_* constant for this CPU, or
++VQSHL_S64_3s     1111 001 0 0 . .. .... .... 0100 . . . 1 .... @3same_64_rev
-      * QEMU_KVM_ARM_TARGET_NONE if the kernel doesn't support this CPU type.
++VQSHL_U64_3s     1111 001 1 0 . .. .... .... 0100 . . . 1 .... @3same_64_rev
-      */
++VRSHL_S64_3s     1111 001 0 0 . .. .... .... 0101 . . . 0 .... @3same_64_rev
-diff --git a/target/arm/cpu.c b/target/arm/cpu.c
++VRSHL_U64_3s     1111 001 1 0 . .. .... .... 0101 . . . 0 .... @3same_64_rev
 +VQRSHL_S64_3s    1111 001 0 0 . .. .... .... 0101 . . . 1 .... @3same_64_rev
 +VQRSHL_U64_3s    1111 001 1 0 . .. .... .... 0101 . . . 1 .... @3same_64_rev
 +
  VMAX_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 0 .... @3same
  VMAX_U_3s        1111 001 1 0 . .. .... .... 0110 . . . 0 .... @3same
  VMIN_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 1 .... @3same
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.c
+--- a/target/arm/translate-neon.inc.c
-+++ b/target/arm/cpu.c
++++ b/target/arm/translate-neon.inc.c
-@@ -XXX,XX +XXX,XX @@ static void arm_cpu_reset(CPUState *s)
+@@ -XXX,XX +XXX,XX @@ static bool trans_SHA256SU1_3s(DisasContext *s, arg_SHA256SU1_3s *a)
-         uint32_t initial_msp; /* Loaded from 0x0 */
-         uint32_t initial_pc; /* Loaded from 0x4 */
+     return true;
-         uint8_t *rom;
+ }
 +        uint32_t vecbase;
          if (arm_feature(env, ARM_FEATURE_M_SECURITY)) {
              env->v7m.secure = true;
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_reset(CPUState *s)
          /* Unlike A/R profile, M profile defines the reset LR value */
          env->regs[14] = 0xffffffff;
 -        /* Load the initial SP and PC from the vector table at address 0 */
 -        rom = rom_ptr(0);
 +        env->v7m.vecbase[M_REG_S] = cpu->init_svtor & 0xffffff80;
 +
-+        /* Load the initial SP and PC from offset 0 and 4 in the vector table */
++#define DO_3SAME_64(INSN, FUNC)                                         \
-+        vecbase = env->v7m.vecbase[env->v7m.secure];
++    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
-+        rom = rom_ptr(vecbase);
++                                uint32_t rn_ofs, uint32_t rm_ofs,       \
-         if (rom) {
++                                uint32_t oprsz, uint32_t maxsz)         \
-             /* Address zero is covered by ROM which hasn't yet been
++    {                                                                   \
-              * copied into physical memory.
++        static const GVecGen3 op = { .fni8 = FUNC };                    \
-@@ -XXX,XX +XXX,XX @@ static void arm_cpu_reset(CPUState *s)
++        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &op);      \
-              * it got copied into memory. In the latter case, rom_ptr
++    }                                                                   \
-              * will return a NULL pointer and we should use ldl_phys instead.
++    DO_3SAME(INSN, gen_##INSN##_3s)
-              */
++
--            initial_msp = ldl_phys(s->as, 0);
++#define DO_3SAME_64_ENV(INSN, FUNC)                                     \
--            initial_pc = ldl_phys(s->as, 4);
++    static void gen_##INSN##_elt(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m)    \
-+            initial_msp = ldl_phys(s->as, vecbase);
++    {                                                                   \
-+            initial_pc = ldl_phys(s->as, vecbase + 4);
++        FUNC(d, cpu_env, n, m);                                         \
 +    }                                                                   \
 +    DO_3SAME_64(INSN, gen_##INSN##_elt)
 +
 +DO_3SAME_64(VRSHL_S64, gen_helper_neon_rshl_s64)
 +DO_3SAME_64(VRSHL_U64, gen_helper_neon_rshl_u64)
 +DO_3SAME_64_ENV(VQSHL_S64, gen_helper_neon_qshl_s64)
 +DO_3SAME_64_ENV(VQSHL_U64, gen_helper_neon_qshl_u64)
 +DO_3SAME_64_ENV(VQRSHL_S64, gen_helper_neon_qrshl_s64)
 +DO_3SAME_64_ENV(VQRSHL_U64, gen_helper_neon_qrshl_u64)
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          }
-         env->regs[13] = initial_msp & 0xFFFFFFFC;
+         if (size == 3) {
-@@ -XXX,XX +XXX,XX @@ static Property arm_cpu_pmsav7_dregion_property =
+-            /* 64-bit element instructions. */
-                                            pmsav7_dregion,
+-            for (pass = 0; pass < (q ? 2 : 1); pass++) {
-                                            qdev_prop_uint32, uint32_t);
+-                neon_load_reg64(cpu_V0, rn + pass);
+-                neon_load_reg64(cpu_V1, rm + pass);
-+/* M profile: initial value of the Secure VTOR */
+-                switch (op) {
-+static Property arm_cpu_initsvtor_property =
+-                case NEON_3R_VQSHL:
-+            DEFINE_PROP_UINT32("init-svtor", ARMCPU, init_svtor, 0);
+-                    if (u) {
-+
+-                        gen_helper_neon_qshl_u64(cpu_V0, cpu_env,
- static void arm_cpu_post_init(Object *obj)
+-                                                 cpu_V1, cpu_V0);
- {
+-                    } else {
-     ARMCPU *cpu = ARM_CPU(obj);
+-                        gen_helper_neon_qshl_s64(cpu_V0, cpu_env,
-@@ -XXX,XX +XXX,XX @@ static void arm_cpu_post_init(Object *obj)
+-                                                 cpu_V1, cpu_V0);
-                                  qdev_prop_allow_set_link_before_realize,
+-                    }
-                                  OBJ_PROP_LINK_UNREF_ON_RELEASE,
+-                    break;
-                                  &error_abort);
+-                case NEON_3R_VRSHL:
-+        qdev_property_add_static(DEVICE(obj), &arm_cpu_initsvtor_property,
+-                    if (u) {
-+                                 &error_abort);
+-                        gen_helper_neon_rshl_u64(cpu_V0, cpu_V1, cpu_V0);
-     }
+-                    } else {
+-                        gen_helper_neon_rshl_s64(cpu_V0, cpu_V1, cpu_V0);
-     qdev_property_add_static(DEVICE(obj), &arm_cpu_cfgend_property,
+-                    }
 -                    break;
 -                case NEON_3R_VQRSHL:
 -                    if (u) {
 -                        gen_helper_neon_qrshl_u64(cpu_V0, cpu_env,
 -                                                  cpu_V1, cpu_V0);
 -                    } else {
 -                        gen_helper_neon_qrshl_s64(cpu_V0, cpu_env,
 -                                                  cpu_V1, cpu_V0);
 -                    }
 -                    break;
 -                default:
 -                    abort();
 -                }
 -                neon_store_reg64(cpu_V0, rd + pass);
 -            }
 -            return 0;
 +            /* 64-bit element instructions: handled by decodetree */
 +            return 1;
          }
          pairwise = 0;
          switch (op) {
 --
-.16.2
+.20.1

-[Qemu-devel] [PULL 17/39] hw/misc/mps2-fpgaio: FPGA control block for MPS2 AN505
+[PULL 32/45] target/arm: Convert Neon VHADD 3-reg-same insns
-The MPS2 AN505 FPGA image includes a "FPGA control block"
+Convert the Neon VHADD insns in the 3-reg-same group to decodetree.
 which is a small set of registers handling LEDs, buttons
 and some counters.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20180220180325.29818-14-peter.maydell@linaro.org
+Message-id: 20200512163904.10918-5-peter.maydell@linaro.org
 ---
- hw/misc/Makefile.objs           |   1 +
+ target/arm/neon-dp.decode       |  2 ++
- include/hw/misc/mps2-fpgaio.h   |  43 ++++++++++
+ target/arm/translate-neon.inc.c | 24 ++++++++++++++++++++++++
- hw/misc/mps2-fpgaio.c           | 176 ++++++++++++++++++++++++++++++++++++++++
+ target/arm/translate.c          |  4 +---
- default-configs/arm-softmmu.mak |   1 +
+files changed, 27 insertions(+), 3 deletions(-)
  hw/misc/trace-events            |   6 ++
 files changed, 227 insertions(+)
  create mode 100644 include/hw/misc/mps2-fpgaio.h
  create mode 100644 hw/misc/mps2-fpgaio.c
-diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
---- a/hw/misc/Makefile.objs
+--- a/target/arm/neon-dp.decode
-+++ b/hw/misc/Makefile.objs
++++ b/target/arm/neon-dp.decode
@@ -XXX,XX +XXX,XX @@ obj-$(CONFIG_STM32F2XX_SYSCFG) += stm32f2xx_syscfg.o
  obj-$(CONFIG_MIPS_CPS) += mips_cmgcr.o
  obj-$(CONFIG_MIPS_CPS) += mips_cpc.o
  obj-$(CONFIG_MIPS_ITU) += mips_itu.o
 +obj-$(CONFIG_MPS2_FPGAIO) += mps2-fpgaio.o
  obj-$(CONFIG_MPS2_SCC) += mps2-scc.o
  obj-$(CONFIG_PVPANIC) += pvpanic.o
 diff --git a/include/hw/misc/mps2-fpgaio.h b/include/hw/misc/mps2-fpgaio.h
 new file mode 100644
 index XXXXXXX..XXXXXXX
 --- /dev/null
 +++ b/include/hw/misc/mps2-fpgaio.h
 @@ -XXX,XX +XXX,XX @@
-+/*
+ @3same           .... ... . . . size:2 .... .... .... . q:1 . . .... \
-+ * ARM MPS2 FPGAIO emulation
+                  &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp
-+ *
-+ * Copyright (c) 2018 Linaro Limited
++VHADD_S_3s       1111 001 0 0 . .. .... .... 0000 . . . 0 .... @3same
-+ * Written by Peter Maydell
++VHADD_U_3s       1111 001 1 0 . .. .... .... 0000 . . . 0 .... @3same
-+ *
+ VQADD_S_3s       1111 001 0 0 . .. .... .... 0000 . . . 1 .... @3same
-+ *  This program is free software; you can redistribute it and/or modify
+ VQADD_U_3s       1111 001 1 0 . .. .... .... 0000 . . . 1 .... @3same
-+ *  it under the terms of the GNU General Public License version 2 or
-+ *  (at your option) any later version.
+diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
-+ */
+index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_3SAME_64_ENV(VQSHL_S64, gen_helper_neon_qshl_s64)
  DO_3SAME_64_ENV(VQSHL_U64, gen_helper_neon_qshl_u64)
  DO_3SAME_64_ENV(VQRSHL_S64, gen_helper_neon_qrshl_s64)
  DO_3SAME_64_ENV(VQRSHL_U64, gen_helper_neon_qrshl_u64)
 +
-+/* This is a model of the FPGAIO register block in the AN505
++#define DO_3SAME_32(INSN, FUNC)                                         \
-+ * FPGA image for the MPS2 dev board; it is documented in the
++    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
-+ * application note:
++                                uint32_t rn_ofs, uint32_t rm_ofs,       \
-+ * http://infocenter.arm.com/help/topic/com.arm.doc.dai0505b/index.html
++                                uint32_t oprsz, uint32_t maxsz)         \
-+ *
++    {                                                                   \
-+ * QEMU interface:
++        static const GVecGen3 ops[4] = {                                \
-+ *  + sysbus MMIO region 0: the register bank
++            { .fni4 = gen_helper_neon_##FUNC##8 },                      \
-+ */
++            { .fni4 = gen_helper_neon_##FUNC##16 },                     \
-+
++            { .fni4 = gen_helper_neon_##FUNC##32 },                     \
-+#ifndef MPS2_FPGAIO_H
++            { 0 },                                                      \
-+#define MPS2_FPGAIO_H
++        };                                                              \
-+
++        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &ops[vece]); \
-+#include "hw/sysbus.h"
++    }                                                                   \
-+
++    static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a)        \
-+#define TYPE_MPS2_FPGAIO "mps2-fpgaio"
++    {                                                                   \
-+#define MPS2_FPGAIO(obj) OBJECT_CHECK(MPS2FPGAIO, (obj), TYPE_MPS2_FPGAIO)
++        if (a->size > 2) {                                              \
-+
++            return false;                                               \
-+typedef struct {
++        }                                                               \
-+    /*< private >*/
++        return do_3same(s, a, gen_##INSN##_3s);                         \
 +    SysBusDevice parent_obj;
 +
 +    /*< public >*/
 +    MemoryRegion iomem;
 +
 +    uint32_t led0;
 +    uint32_t prescale;
 +    uint32_t misc;
 +
 +    uint32_t prescale_clk;
 +} MPS2FPGAIO;
 +
 +#endif
 diff --git a/hw/misc/mps2-fpgaio.c b/hw/misc/mps2-fpgaio.c
 new file mode 100644
 index XXXXXXX..XXXXXXX
 --- /dev/null
 +++ b/hw/misc/mps2-fpgaio.c
@@ -XXX,XX +XXX,XX @@
 +/*
 + * ARM MPS2 AN505 FPGAIO emulation
 + *
 + * Copyright (c) 2018 Linaro Limited
 + * Written by Peter Maydell
 + *
 + *  This program is free software; you can redistribute it and/or modify
 + *  it under the terms of the GNU General Public License version 2 or
 + *  (at your option) any later version.
 + */
 +
 +/* This is a model of the "FPGA system control and I/O" block found
 + * in the AN505 FPGA image for the MPS2 devboard.
 + * It is documented in AN505:
 + * http://infocenter.arm.com/help/topic/com.arm.doc.dai0505b/index.html
 + */
 +
 +#include "qemu/osdep.h"
 +#include "qemu/log.h"
 +#include "qapi/error.h"
 +#include "trace.h"
 +#include "hw/sysbus.h"
 +#include "hw/registerfields.h"
 +#include "hw/misc/mps2-fpgaio.h"
 +
 +REG32(LED0, 0)
 +REG32(BUTTON, 8)
 +REG32(CLK1HZ, 0x10)
 +REG32(CLK100HZ, 0x14)
 +REG32(COUNTER, 0x18)
 +REG32(PRESCALE, 0x1c)
 +REG32(PSCNTR, 0x20)
 +REG32(MISC, 0x4c)
 +
 +static uint64_t mps2_fpgaio_read(void *opaque, hwaddr offset, unsigned size)
 +{
 +    MPS2FPGAIO *s = MPS2_FPGAIO(opaque);
 +    uint64_t r;
 +
 +    switch (offset) {
 +    case A_LED0:
 +        r = s->led0;
 +        break;
 +    case A_BUTTON:
 +        /* User-pressable board buttons. We don't model that, so just return
 +         * zeroes.
 +         */
 +        r = 0;
 +        break;
 +    case A_PRESCALE:
 +        r = s->prescale;
 +        break;
 +    case A_MISC:
 +        r = s->misc;
 +        break;
 +    case A_CLK1HZ:
 +    case A_CLK100HZ:
 +    case A_COUNTER:
 +    case A_PSCNTR:
 +        /* These are all upcounters of various frequencies. */
 +        qemu_log_mask(LOG_UNIMP, "MPS2 FPGAIO: counters unimplemented\n");
 +        r = 0;
 +        break;
 +    default:
 +        qemu_log_mask(LOG_GUEST_ERROR,
 +                      "MPS2 FPGAIO read: bad offset %x\n", (int) offset);
 +        r = 0;
 +        break;
 +    }
 +
-+    trace_mps2_fpgaio_read(offset, r, size);
++DO_3SAME_32(VHADD_S, hadd_s)
-+    return r;
++DO_3SAME_32(VHADD_U, hadd_u)
-+}
+diff --git a/target/arm/translate.c b/target/arm/translate.c
 +
 +static void mps2_fpgaio_write(void *opaque, hwaddr offset, uint64_t value,
 +                              unsigned size)
 +{
 +    MPS2FPGAIO *s = MPS2_FPGAIO(opaque);
 +
 +    trace_mps2_fpgaio_write(offset, value, size);
 +
 +    switch (offset) {
 +    case A_LED0:
 +        /* LED bits [1:0] control board LEDs. We don't currently have
 +         * a mechanism for displaying this graphically, so use a trace event.
 +         */
 +        trace_mps2_fpgaio_leds(value & 0x02 ? '*' : '.',
 +                               value & 0x01 ? '*' : '.');
 +        s->led0 = value & 0x3;
 +        break;
 +    case A_PRESCALE:
 +        s->prescale = value;
 +        break;
 +    case A_MISC:
 +        /* These are control bits for some of the other devices on the
 +         * board (SPI, CLCD, etc). We don't implement that yet, so just
 +         * make the bits read as written.
 +         */
 +        qemu_log_mask(LOG_UNIMP,
 +                      "MPS2 FPGAIO: MISC control bits unimplemented\n");
 +        s->misc = value;
 +        break;
 +    default:
 +        qemu_log_mask(LOG_GUEST_ERROR,
 +                      "MPS2 FPGAIO write: bad offset 0x%x\n", (int) offset);
 +        break;
 +    }
 +}
 +
 +static const MemoryRegionOps mps2_fpgaio_ops = {
 +    .read = mps2_fpgaio_read,
 +    .write = mps2_fpgaio_write,
 +    .endianness = DEVICE_LITTLE_ENDIAN,
 +};
 +
 +static void mps2_fpgaio_reset(DeviceState *dev)
 +{
 +    MPS2FPGAIO *s = MPS2_FPGAIO(dev);
 +
 +    trace_mps2_fpgaio_reset();
 +    s->led0 = 0;
 +    s->prescale = 0;
 +    s->misc = 0;
 +}
 +
 +static void mps2_fpgaio_init(Object *obj)
 +{
 +    SysBusDevice *sbd = SYS_BUS_DEVICE(obj);
 +    MPS2FPGAIO *s = MPS2_FPGAIO(obj);
 +
 +    memory_region_init_io(&s->iomem, obj, &mps2_fpgaio_ops, s,
 +                          "mps2-fpgaio", 0x1000);
 +    sysbus_init_mmio(sbd, &s->iomem);
 +}
 +
 +static const VMStateDescription mps2_fpgaio_vmstate = {
 +    .name = "mps2-fpgaio",
 +    .version_id = 1,
 +    .minimum_version_id = 1,
 +    .fields = (VMStateField[]) {
 +        VMSTATE_UINT32(led0, MPS2FPGAIO),
 +        VMSTATE_UINT32(prescale, MPS2FPGAIO),
 +        VMSTATE_UINT32(misc, MPS2FPGAIO),
 +        VMSTATE_END_OF_LIST()
 +    }
 +};
 +
 +static Property mps2_fpgaio_properties[] = {
 +    /* Frequency of the prescale counter */
 +    DEFINE_PROP_UINT32("prescale-clk", MPS2FPGAIO, prescale_clk, 20000000),
 +    DEFINE_PROP_END_OF_LIST(),
 +};
 +
 +static void mps2_fpgaio_class_init(ObjectClass *klass, void *data)
 +{
 +    DeviceClass *dc = DEVICE_CLASS(klass);
 +
 +    dc->vmsd = &mps2_fpgaio_vmstate;
 +    dc->reset = mps2_fpgaio_reset;
 +    dc->props = mps2_fpgaio_properties;
 +}
 +
 +static const TypeInfo mps2_fpgaio_info = {
 +    .name = TYPE_MPS2_FPGAIO,
 +    .parent = TYPE_SYS_BUS_DEVICE,
 +    .instance_size = sizeof(MPS2FPGAIO),
 +    .instance_init = mps2_fpgaio_init,
 +    .class_init = mps2_fpgaio_class_init,
 +};
 +
 +static void mps2_fpgaio_register_types(void)
 +{
 +    type_register_static(&mps2_fpgaio_info);
 +}
 +
 +type_init(mps2_fpgaio_register_types);
 diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
 index XXXXXXX..XXXXXXX 100644
---- a/default-configs/arm-softmmu.mak
+--- a/target/arm/translate.c
-+++ b/default-configs/arm-softmmu.mak
++++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ CONFIG_STM32F205_SOC=y
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
- CONFIG_CMSDK_APB_TIMER=y
+         case NEON_3R_VML:
- CONFIG_CMSDK_APB_UART=y
+         case NEON_3R_VSHL:
+         case NEON_3R_SHA:
-+CONFIG_MPS2_FPGAIO=y
++        case NEON_3R_VHADD:
- CONFIG_MPS2_SCC=y
+             /* Already handled by decodetree */
+             return 1;
- CONFIG_VERSATILE_PCI=y
+         }
-diff --git a/hw/misc/trace-events b/hw/misc/trace-events
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-index XXXXXXX..XXXXXXX 100644
+             tmp2 = neon_load_reg(rm, pass);
---- a/hw/misc/trace-events
+         }
-+++ b/hw/misc/trace-events
+         switch (op) {
-@@ -XXX,XX +XXX,XX @@ mps2_scc_leds(char led7, char led6, char led5, char led4, char led3, char led2,
+-        case NEON_3R_VHADD:
- mps2_scc_cfg_write(unsigned function, unsigned device, uint32_t value) "MPS2 SCC config write: function %d device %d data 0x%" PRIx32
+-            GEN_NEON_INTEGER_OP(hadd);
- mps2_scc_cfg_read(unsigned function, unsigned device, uint32_t value) "MPS2 SCC config read: function %d device %d data 0x%" PRIx32
+-            break;
+         case NEON_3R_VRHADD:
-+# hw/misc/mps2_fpgaio.c
+             GEN_NEON_INTEGER_OP(rhadd);
-+mps2_fpgaio_read(uint64_t offset, uint64_t data, unsigned size) "MPS2 FPGAIO read: offset 0x%" PRIx64 " data 0x%" PRIx64 " size %u"
+             break;
 +mps2_fpgaio_write(uint64_t offset, uint64_t data, unsigned size) "MPS2 FPGAIO write: offset 0x%" PRIx64 " data 0x%" PRIx64 " size %u"
 +mps2_fpgaio_reset(void) "MPS2 FPGAIO: reset"
 +mps2_fpgaio_leds(char led1, char led0) "MPS2 FPGAIO LEDs: %c%c"
 +
  # hw/misc/msf2-sysreg.c
  msf2_sysreg_write(uint64_t offset, uint32_t val, uint32_t prev) "msf2-sysreg write: addr 0x%08" HWADDR_PRIx " data 0x%" PRIx32 " prev 0x%" PRIx32
  msf2_sysreg_read(uint64_t offset, uint32_t val) "msf2-sysreg read: addr 0x%08" HWADDR_PRIx " data 0x%08" PRIx32
 --
-.16.2
+.20.1
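
As a reading aid for the VHADD conversion above: a halving add sums two elements and shifts the full-width sum right by one, so the carry out of the top bit is not lost. The sketch below is a plain C reference model for a single 8-bit lane, not QEMU's helper code; the names floor_half, hadd_u8 and hadd_s8 are hypothetical and chosen only for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: floor(x / 2) without relying on how the
 * compiler implements right shifts of negative values. */
static int floor_half(int x)
{
    int q = x / 2;          /* C division truncates toward zero */
    if (x % 2 < 0) {
        q--;                /* adjust odd negative values downward */
    }
    return q;
}

/* Reference model (not QEMU code): unsigned halving add, one 8-bit lane. */
static uint8_t hadd_u8(uint8_t a, uint8_t b)
{
    /* Widen before adding so the carry out of bit 7 survives the shift. */
    return (uint8_t)(((uint16_t)a + (uint16_t)b) >> 1);
}

/* Reference model (not QEMU code): signed halving add, one 8-bit lane. */
static int8_t hadd_s8(int8_t a, int8_t b)
{
    /* The sum is computed in int, so it cannot overflow for 8-bit inputs. */
    return (int8_t)floor_half((int)a + (int)b);
}
```

The model stops at one element size on purpose. In the patch, DO_3SAME_32 picks the 8-, 16- or 32-bit helper by indexing ops[] with vece, and the trans function returns false for a->size > 2.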

-[Qemu-devel] [PULL 13/39] hw/misc/unimp: Move struct to header file
+[PULL 33/45] target/arm: Convert Neon VABA/VABD 3-reg-same to decodetree
-Move the definition of the struct for the unimplemented-device
+Convert the Neon VABA and VABD insns in the 3-reg-same group to
-from unimp.c to unimp.h, so that users can embed the struct
+decodetree.
 in their own device structs if they prefer.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20180220180325.29818-10-peter.maydell@linaro.org
+Message-id: 20200512163904.10918-6-peter.maydell@linaro.org
 ---
- include/hw/misc/unimp.h | 10 ++++++++++
+ target/arm/neon-dp.decode       |  6 ++++++
- hw/misc/unimp.c         | 10 ----------
+ target/arm/translate-neon.inc.c |  4 ++++
-files changed, 10 insertions(+), 10 deletions(-)
+ target/arm/translate.c          | 22 ++--------------------
 files changed, 12 insertions(+), 20 deletions(-)
-diff --git a/include/hw/misc/unimp.h b/include/hw/misc/unimp.h
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
---- a/include/hw/misc/unimp.h
+--- a/target/arm/neon-dp.decode
-+++ b/include/hw/misc/unimp.h
++++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ VMAX_U_3s        1111 001 1 0 . .. .... .... 0110 . . . 0 .... @3same
+ VMIN_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 1 .... @3same
- #define TYPE_UNIMPLEMENTED_DEVICE "unimplemented-device"
+ VMIN_U_3s        1111 001 1 0 . .. .... .... 0110 . . . 1 .... @3same
-+#define UNIMPLEMENTED_DEVICE(obj) \
++VABD_S_3s        1111 001 0 0 . .. .... .... 0111 . . . 0 .... @3same
-+    OBJECT_CHECK(UnimplementedDeviceState, (obj), TYPE_UNIMPLEMENTED_DEVICE)
++VABD_U_3s        1111 001 1 0 . .. .... .... 0111 . . . 0 .... @3same
 +
-+typedef struct {
++VABA_S_3s        1111 001 0 0 . .. .... .... 0111 . . . 1 .... @3same
-+    SysBusDevice parent_obj;
++VABA_U_3s        1111 001 1 0 . .. .... .... 0111 . . . 1 .... @3same
 +    MemoryRegion iomem;
 +    char *name;
 +    uint64_t size;
 +} UnimplementedDeviceState;
 +
- /**
+ VADD_3s          1111 001 0 0 . .. .... .... 1000 . . . 0 .... @3same
-  * create_unimplemented_device: create and map a dummy device
+ VSUB_3s          1111 001 1 0 . .. .... .... 1000 . . . 0 .... @3same
-  * @name: name of the device for debug logging
-diff --git a/hw/misc/unimp.c b/hw/misc/unimp.c
+diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/misc/unimp.c
+--- a/target/arm/translate-neon.inc.c
-+++ b/hw/misc/unimp.c
++++ b/target/arm/translate-neon.inc.c
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ DO_3SAME_NO_SZ_3(VMUL, tcg_gen_gvec_mul)
- #include "qemu/log.h"
+ DO_3SAME_NO_SZ_3(VMLA, gen_gvec_mla)
- #include "qapi/error.h"
+ DO_3SAME_NO_SZ_3(VMLS, gen_gvec_mls)
+ DO_3SAME_NO_SZ_3(VTST, gen_gvec_cmtst)
--#define UNIMPLEMENTED_DEVICE(obj) \
++DO_3SAME_NO_SZ_3(VABD_S, gen_gvec_sabd)
--    OBJECT_CHECK(UnimplementedDeviceState, (obj), TYPE_UNIMPLEMENTED_DEVICE)
++DO_3SAME_NO_SZ_3(VABA_S, gen_gvec_saba)
 +DO_3SAME_NO_SZ_3(VABD_U, gen_gvec_uabd)
 +DO_3SAME_NO_SZ_3(VABA_U, gen_gvec_uaba)
  #define DO_3SAME_CMP(INSN, COND)                                        \
      static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
              /* VQRDMLSH : handled by decodetree */
              return 1;
 -        case NEON_3R_VABD:
 -            if (u) {
 -                gen_gvec_uabd(size, rd_ofs, rn_ofs, rm_ofs,
 -                              vec_size, vec_size);
 -            } else {
 -                gen_gvec_sabd(size, rd_ofs, rn_ofs, rm_ofs,
 -                              vec_size, vec_size);
 -            }
 -            return 0;
 -
--typedef struct {
+-        case NEON_3R_VABA:
--    SysBusDevice parent_obj;
+-            if (u) {
--    MemoryRegion iomem;
+-                gen_gvec_uaba(size, rd_ofs, rn_ofs, rm_ofs,
--    char *name;
+-                              vec_size, vec_size);
--    uint64_t size;
+-            } else {
--} UnimplementedDeviceState;
+-                gen_gvec_saba(size, rd_ofs, rn_ofs, rm_ofs,
 -                              vec_size, vec_size);
 -            }
 -            return 0;
 -
- static uint64_t unimp_read(void *opaque, hwaddr offset, unsigned size)
+         case NEON_3R_VADD_VSUB:
- {
+         case NEON_3R_LOGIC:
-     UnimplementedDeviceState *s = UNIMPLEMENTED_DEVICE(opaque);
+         case NEON_3R_VMAX:
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          case NEON_3R_VSHL:
          case NEON_3R_SHA:
          case NEON_3R_VHADD:
 +        case NEON_3R_VABD:
 +        case NEON_3R_VABA:
              /* Already handled by decodetree */
              return 1;
          }
 --
-.16.2
+.20.1
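
For orientation on what the VABD and VABA patterns above compute: VABD produces the absolute difference of two elements, and VABA adds that difference to the destination element, wrapping at the element width. The following is a minimal C reference model for 8-bit lanes, assuming the usual Neon semantics; it is not the gen_gvec_* expansion used by QEMU, and the names abd_u8, abd_s8 and aba_u8 are hypothetical.

```c
#include <stdint.h>

/* Reference model (not QEMU code): unsigned absolute difference. */
static uint8_t abd_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)(a > b ? a - b : b - a);
}

/* Reference model (not QEMU code): signed absolute difference.
 * The result is returned as a raw 8-bit pattern because the true
 * difference can be as large as 255, which does not fit in int8_t. */
static uint8_t abd_s8(int8_t a, int8_t b)
{
    int d = (int)a - (int)b;    /* computed in int, cannot overflow */
    return (uint8_t)(d < 0 ? -d : d);
}

/* Reference model (not QEMU code): unsigned absolute difference and
 * accumulate. The addition wraps modulo 256; it does not saturate. */
static uint8_t aba_u8(uint8_t acc, uint8_t a, uint8_t b)
{
    return (uint8_t)(acc + abd_u8(a, b));
}
```

Because the signed and unsigned forms differ only in how the operands are compared, the decode file lists them as separate _S and _U patterns distinguished by the U bit.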

-[Qemu-devel] [PULL 11/39] armv7m: Forward init-svtor property to CPU object
+[PULL 34/45] target/arm: Convert Neon VRHADD, VHSUB 3-reg-same insns to decodetree
-Create an "init-svtor" property on the armv7m container
+Convert the Neon VRHADD and VHSUB 3-reg-same insns to decodetree.
-object which we can forward to the CPU object.
+(These are all the other insns in 3-reg-same which were using
 GEN_NEON_INTEGER_OP() and which are not pairwise or
 reversed-operands.)
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20180220180325.29818-8-peter.maydell@linaro.org
+Message-id: 20200512163904.10918-7-peter.maydell@linaro.org
 ---
- include/hw/arm/armv7m.h | 2 ++
+ target/arm/neon-dp.decode       | 6 ++++++
- hw/arm/armv7m.c         | 9 +++++++++
+ target/arm/translate-neon.inc.c | 4 ++++
-files changed, 11 insertions(+)
+ target/arm/translate.c          | 8 ++------
 files changed, 12 insertions(+), 6 deletions(-)
-diff --git a/include/hw/arm/armv7m.h b/include/hw/arm/armv7m.h
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
---- a/include/hw/arm/armv7m.h
+--- a/target/arm/neon-dp.decode
-+++ b/include/hw/arm/armv7m.h
++++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ typedef struct {
+@@ -XXX,XX +XXX,XX @@ VHADD_U_3s       1111 001 1 0 . .. .... .... 0000 . . . 0 .... @3same
-  *   that CPU accesses see. (The NVIC, bitbanding and other CPU-internal
+ VQADD_S_3s       1111 001 0 0 . .. .... .... 0000 . . . 1 .... @3same
-  *   devices will be automatically layered on top of this view.)
+ VQADD_U_3s       1111 001 1 0 . .. .... .... 0000 . . . 1 .... @3same
-  * + Property "idau": IDAU interface (forwarded to CPU object)
-+ * + Property "init-svtor": secure VTOR reset value (forwarded to CPU object)
++VRHADD_S_3s      1111 001 0 0 . .. .... .... 0001 . . . 0 .... @3same
-  */
++VRHADD_U_3s      1111 001 1 0 . .. .... .... 0001 . . . 0 .... @3same
- typedef struct ARMv7MState {
++
-     /*< private >*/
+ @3same_logic     .... ... . . . .. .... .... .... . q:1 .. .... \
-@@ -XXX,XX +XXX,XX @@ typedef struct ARMv7MState {
+                  &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp size=0
-     /* MemoryRegion the board provides to us (with its devices, RAM, etc) */
-     MemoryRegion *board_memory;
+@@ -XXX,XX +XXX,XX @@ VBSL_3s          1111 001 1 0 . 01 .... .... 0001 ... 1 .... @3same_logic
-     Object *idau;
+ VBIT_3s          1111 001 1 0 . 10 .... .... 0001 ... 1 .... @3same_logic
-+    uint32_t init_svtor;
+ VBIF_3s          1111 001 1 0 . 11 .... .... 0001 ... 1 .... @3same_logic
- } ARMv7MState;
++VHSUB_S_3s       1111 001 0 0 . .. .... .... 0010 . . . 0 .... @3same
- #endif
++VHSUB_U_3s       1111 001 1 0 . .. .... .... 0010 . . . 0 .... @3same
-diff --git a/hw/arm/armv7m.c b/hw/arm/armv7m.c
++
  VQSUB_S_3s       1111 001 0 0 . .. .... .... 0010 . . . 1 .... @3same
  VQSUB_U_3s       1111 001 1 0 . .. .... .... 0010 . . . 1 .... @3same
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/armv7m.c
+--- a/target/arm/translate-neon.inc.c
-+++ b/hw/arm/armv7m.c
++++ b/target/arm/translate-neon.inc.c
-@@ -XXX,XX +XXX,XX @@ static void armv7m_realize(DeviceState *dev, Error **errp)
+@@ -XXX,XX +XXX,XX @@ DO_3SAME_64_ENV(VQRSHL_U64, gen_helper_neon_qrshl_u64)
-             return;
  DO_3SAME_32(VHADD_S, hadd_s)
  DO_3SAME_32(VHADD_U, hadd_u)
 +DO_3SAME_32(VHSUB_S, hsub_s)
 +DO_3SAME_32(VHSUB_U, hsub_u)
 +DO_3SAME_32(VRHADD_S, rhadd_s)
 +DO_3SAME_32(VRHADD_U, rhadd_u)
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          case NEON_3R_VSHL:
          case NEON_3R_SHA:
          case NEON_3R_VHADD:
 +        case NEON_3R_VRHADD:
 +        case NEON_3R_VHSUB:
          case NEON_3R_VABD:
          case NEON_3R_VABA:
              /* Already handled by decodetree */
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
              tmp2 = neon_load_reg(rm, pass);
          }
-     }
+         switch (op) {
-+    if (object_property_find(OBJECT(s->cpu), "init-svtor", NULL)) {
+-        case NEON_3R_VRHADD:
-+        object_property_set_uint(OBJECT(s->cpu), s->init_svtor,
+-            GEN_NEON_INTEGER_OP(rhadd);
-+                                 "init-svtor", &err);
+-            break;
-+        if (err != NULL) {
+-        case NEON_3R_VHSUB:
-+            error_propagate(errp, err);
+-            GEN_NEON_INTEGER_OP(hsub);
-+            return;
+-            break;
-+        }
+         case NEON_3R_VQSHL:
-+    }
+             GEN_NEON_INTEGER_OP_ENV(qshl);
-     object_property_set_bool(OBJECT(s->cpu), true, "realized", &err);
+             break;
      if (err != NULL) {
          error_propagate(errp, err);
@@ -XXX,XX +XXX,XX @@ static Property armv7m_properties[] = {
      DEFINE_PROP_LINK("memory", ARMv7MState, board_memory, TYPE_MEMORY_REGION,
                       MemoryRegion *),
      DEFINE_PROP_LINK("idau", ARMv7MState, idau, TYPE_IDAU_INTERFACE, Object *),
 +    DEFINE_PROP_UINT32("init-svtor", ARMv7MState, init_svtor, 0),
      DEFINE_PROP_END_OF_LIST(),
  };
 --
-.16.2
+.20.1
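
To make the difference between the two insns converted above concrete: VRHADD is a halving add that rounds (it adds 1 before the shift), and VHSUB halves the difference of the operands. The sketch below models the unsigned 8-bit case in plain C under the usual Neon semantics; it is not QEMU's helper code, and floor_half, rhadd_u8 and hsub_u8 are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical helper: floor(x / 2) without relying on how the
 * compiler implements right shifts of negative values. */
static int floor_half(int x)
{
    int q = x / 2;          /* C division truncates toward zero */
    if (x % 2 < 0) {
        q--;                /* adjust odd negative values downward */
    }
    return q;
}

/* Reference model (not QEMU code): unsigned rounding halving add. */
static uint8_t rhadd_u8(uint8_t a, uint8_t b)
{
    /* The +1 rounds the halved sum up; widen so nothing overflows. */
    return (uint8_t)(((uint16_t)a + (uint16_t)b + 1) >> 1);
}

/* Reference model (not QEMU code): unsigned halving subtract.
 * The difference may be negative; only its low 8 bits are kept. */
static uint8_t hsub_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)floor_half((int)a - (int)b);
}
```

Both operations take their operands in the normal order and need no cpu_env argument, which is consistent with the commit message grouping them with the plain GEN_NEON_INTEGER_OP() cases.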

-[Qemu-devel] [PULL 21/39] hw/misc/iotkit-secctl: Add remaining simple registers
+[PULL 35/45] target/arm: Convert Neon VQSHL, VRSHL, VQRSHL 3-reg-same insns to decodetree
-Add remaining easy registers to iotkit-secctl:
+Convert the VQSHL, VRSHL and VQRSHL insns in the 3-reg-same
- * NSCCFG just routes its two bits out to external GPIO lines
+group to decodetree. We have already implemented the size==0b11
- * BRGINSTAT/BRGINTCLR/BRGINTEN can be dummies, because QEMU's
+case of these insns; this commit handles the remaining sizes.
    bus fabric can never report errors
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Message-id: 20180220180325.29818-18-peter.maydell@linaro.org
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
 Message-id: 20200512163904.10918-8-peter.maydell@linaro.org
 ---
- include/hw/misc/iotkit-secctl.h |  4 ++++
+ target/arm/neon-dp.decode       | 30 ++++++++++++++++++-----
- hw/misc/iotkit-secctl.c         | 32 ++++++++++++++++++++++++++------
+ target/arm/translate-neon.inc.c | 43 +++++++++++++++++++++++++++++++++
-files changed, 30 insertions(+), 6 deletions(-)
+ target/arm/translate.c          | 22 +++--------------
 files changed, 70 insertions(+), 25 deletions(-)
-diff --git a/include/hw/misc/iotkit-secctl.h b/include/hw/misc/iotkit-secctl.h
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
---- a/include/hw/misc/iotkit-secctl.h
+--- a/target/arm/neon-dp.decode
-+++ b/include/hw/misc/iotkit-secctl.h
++++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ VSHL_U_3s        1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same_rev
-  *  + sysbus MMIO region 1 is the "non-secure privilege control block" registers
+ @3same_64_rev    .... ... . . . 11 .... .... .... . q:1 . . .... \
-  *  + named GPIO output "sec_resp_cfg" indicating whether blocked accesses
+                  &3same vm=%vn_dp vn=%vm_dp vd=%vd_dp size=3
-  *    should RAZ/WI or bus error
-+ *  + named GPIO output "nsc_cfg" whose value tracks the NSCCFG register value
+-VQSHL_S64_3s     1111 001 0 0 . .. .... .... 0100 . . . 1 .... @3same_64_rev
-  * Controlling the 2 APB PPCs in the IoTKit:
+-VQSHL_U64_3s     1111 001 1 0 . .. .... .... 0100 . . . 1 .... @3same_64_rev
-  *  + named GPIO outputs apb_ppc0_nonsec[0..2] and apb_ppc1_nonsec
+-VRSHL_S64_3s     1111 001 0 0 . .. .... .... 0101 . . . 0 .... @3same_64_rev
-  *  + named GPIO outputs apb_ppc0_ap[0..2] and apb_ppc1_ap
+-VRSHL_U64_3s     1111 001 1 0 . .. .... .... 0101 . . . 0 .... @3same_64_rev
-@@ -XXX,XX +XXX,XX @@ struct IoTKitSecCtl {
+-VQRSHL_S64_3s    1111 001 0 0 . .. .... .... 0101 . . . 1 .... @3same_64_rev
+-VQRSHL_U64_3s    1111 001 1 0 . .. .... .... 0101 . . . 1 .... @3same_64_rev
-     /*< public >*/
++{
-     qemu_irq sec_resp_cfg;
++  VQSHL_S64_3s   1111 001 0 0 . .. .... .... 0100 . . . 1 .... @3same_64_rev
-+    qemu_irq nsc_cfg_irq;
++  VQSHL_S_3s     1111 001 0 0 . .. .... .... 0100 . . . 1 .... @3same_rev
++}
-     MemoryRegion s_regs;
++{
-     MemoryRegion ns_regs;
++  VQSHL_U64_3s   1111 001 1 0 . .. .... .... 0100 . . . 1 .... @3same_64_rev
-@@ -XXX,XX +XXX,XX @@ struct IoTKitSecCtl {
++  VQSHL_U_3s     1111 001 1 0 . .. .... .... 0100 . . . 1 .... @3same_rev
-     uint32_t secppcintstat;
++}
-     uint32_t secppcinten;
++{
-     uint32_t secrespcfg;
++  VRSHL_S64_3s   1111 001 0 0 . .. .... .... 0101 . . . 0 .... @3same_64_rev
-+    uint32_t nsccfg;
++  VRSHL_S_3s     1111 001 0 0 . .. .... .... 0101 . . . 0 .... @3same_rev
-+    uint32_t brginten;
++}
++{
-     IoTKitSecCtlPPC apb[IOTS_NUM_APB_PPC];
++  VRSHL_U64_3s   1111 001 1 0 . .. .... .... 0101 . . . 0 .... @3same_64_rev
-     IoTKitSecCtlPPC apbexp[IOTS_NUM_APB_EXP_PPC];
++  VRSHL_U_3s     1111 001 1 0 . .. .... .... 0101 . . . 0 .... @3same_rev
-diff --git a/hw/misc/iotkit-secctl.c b/hw/misc/iotkit-secctl.c
++}
 +{
 +  VQRSHL_S64_3s  1111 001 0 0 . .. .... .... 0101 . . . 1 .... @3same_64_rev
 +  VQRSHL_S_3s    1111 001 0 0 . .. .... .... 0101 . . . 1 .... @3same_rev
 +}
 +{
 +  VQRSHL_U64_3s  1111 001 1 0 . .. .... .... 0101 . . . 1 .... @3same_64_rev
 +  VQRSHL_U_3s    1111 001 1 0 . .. .... .... 0101 . . . 1 .... @3same_rev
 +}
  VMAX_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 0 .... @3same
  VMAX_U_3s        1111 001 1 0 . .. .... .... 0110 . . . 0 .... @3same
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/misc/iotkit-secctl.c
+--- a/target/arm/translate-neon.inc.c
-+++ b/hw/misc/iotkit-secctl.c
++++ b/target/arm/translate-neon.inc.c
-@@ -XXX,XX +XXX,XX @@ static MemTxResult iotkit_secctl_s_read(void *opaque, hwaddr addr,
+@@ -XXX,XX +XXX,XX @@ DO_3SAME_64_ENV(VQRSHL_U64, gen_helper_neon_qrshl_u64)
-     case A_SECRESPCFG:
+         return do_3same(s, a, gen_##INSN##_3s);                         \
          r = s->secrespcfg;
          break;
 +    case A_NSCCFG:
 +        r = s->nsccfg;
 +        break;
      case A_SECPPCINTSTAT:
          r = s->secppcintstat;
          break;
      case A_SECPPCINTEN:
          r = s->secppcinten;
          break;
 +    case A_BRGINTSTAT:
 +        /* QEMU's bus fabric can never report errors as it doesn't buffer
 +         * writes, so we never report bridge interrupts.
 +         */
 +        r = 0;
 +        break;
 +    case A_BRGINTEN:
 +        r = s->brginten;
 +        break;
      case A_AHBNSPPCEXP0:
      case A_AHBNSPPCEXP1:
      case A_AHBNSPPCEXP2:
@@ -XXX,XX +XXX,XX @@ static MemTxResult iotkit_secctl_s_read(void *opaque, hwaddr addr,
      case A_APBSPPPCEXP3:
          r = s->apbexp[offset_to_ppc_idx(offset)].sp;
          break;
 -    case A_NSCCFG:
      case A_SECMPCINTSTATUS:
      case A_SECMSCINTSTAT:
      case A_SECMSCINTEN:
 -    case A_BRGINTSTAT:
 -    case A_BRGINTEN:
      case A_NSMSCEXP:
          qemu_log_mask(LOG_UNIMP,
                        "IoTKit SecCtl S block read: "
@@ -XXX,XX +XXX,XX @@ static MemTxResult iotkit_secctl_s_write(void *opaque, hwaddr addr,
      }
-     switch (offset) {
++/*
-+    case A_NSCCFG:
++ * Some helper functions need to be passed the cpu_env. In order
-+        s->nsccfg = value & 3;
++ * to use those with the gvec APIs like tcg_gen_gvec_3() we need
-+        qemu_set_irq(s->nsc_cfg_irq, s->nsccfg);
++ * to create wrapper functions whose prototype is a NeonGenTwoOpFn()
-+        break;
++ * and which call a NeonGenTwoOpEnvFn().
-     case A_SECRESPCFG:
++ */
-         value &= 1;
++#define WRAP_ENV_FN(WRAPNAME, FUNC)                                     \
-         s->secrespcfg = value;
++    static void WRAPNAME(TCGv_i32 d, TCGv_i32 n, TCGv_i32 m)            \
-@@ -XXX,XX +XXX,XX @@ static MemTxResult iotkit_secctl_s_write(void *opaque, hwaddr addr,
++    {                                                                   \
-         s->secppcinten = value & 0x00f000f3;
++        FUNC(d, cpu_env, n, m);                                         \
-         foreach_ppc(s, iotkit_secctl_ppc_update_irq_enable);
++    }
-         break;
++
-+    case A_BRGINTCLR:
++#define DO_3SAME_32_ENV(INSN, FUNC)                                     \
-+        break;
++    WRAP_ENV_FN(gen_##INSN##_tramp8, gen_helper_neon_##FUNC##8);        \
-+    case A_BRGINTEN:
++    WRAP_ENV_FN(gen_##INSN##_tramp16, gen_helper_neon_##FUNC##16);      \
-+        s->brginten = value & 0xffff0000;
++    WRAP_ENV_FN(gen_##INSN##_tramp32, gen_helper_neon_##FUNC##32);      \
-+        break;
++    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
-     case A_AHBNSPPCEXP0:
++                                uint32_t rn_ofs, uint32_t rm_ofs,       \
-     case A_AHBNSPPCEXP1:
++                                uint32_t oprsz, uint32_t maxsz)         \
-     case A_AHBNSPPCEXP2:
++    {                                                                   \
-@@ -XXX,XX +XXX,XX @@ static MemTxResult iotkit_secctl_s_write(void *opaque, hwaddr addr,
++        static const GVecGen3 ops[4] = {                                \
-         ppc = &s->apbexp[offset_to_ppc_idx(offset)];
++            { .fni4 = gen_##INSN##_tramp8 },                            \
-         iotkit_secctl_ppc_sp_write(ppc, value);
++            { .fni4 = gen_##INSN##_tramp16 },                           \
-         break;
++            { .fni4 = gen_##INSN##_tramp32 },                           \
--    case A_NSCCFG:
++            { 0 },                                                      \
-     case A_SECMSCINTCLR:
++        };                                                              \
-     case A_SECMSCINTEN:
++        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &ops[vece]); \
--    case A_BRGINTCLR:
++    }                                                                   \
--    case A_BRGINTEN:
++    static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a)        \
-         qemu_log_mask(LOG_UNIMP,
++    {                                                                   \
-                       "IoTKit SecCtl S block write: "
++        if (a->size > 2) {                                              \
-                       "unimplemented offset 0x%x\n", offset);
++            return false;                                               \
-@@ -XXX,XX +XXX,XX @@ static void iotkit_secctl_reset(DeviceState *dev)
++        }                                                               \
-     s->secppcintstat = 0;
++        return do_3same(s, a, gen_##INSN##_3s);                         \
-     s->secppcinten = 0;
++    }
-     s->secrespcfg = 0;
++
-+    s->nsccfg = 0;
+ DO_3SAME_32(VHADD_S, hadd_s)
-+    s->brginten = 0;
+ DO_3SAME_32(VHADD_U, hadd_u)
+ DO_3SAME_32(VHSUB_S, hsub_s)
-     foreach_ppc(s, iotkit_secctl_reset_ppc);
+ DO_3SAME_32(VHSUB_U, hsub_u)
- }
+ DO_3SAME_32(VRHADD_S, rhadd_s)
-@@ -XXX,XX +XXX,XX @@ static void iotkit_secctl_init(Object *obj)
+ DO_3SAME_32(VRHADD_U, rhadd_u)
-     }
++DO_3SAME_32(VRSHL_S, rshl_s)
++DO_3SAME_32(VRSHL_U, rshl_u)
-     qdev_init_gpio_out_named(dev, &s->sec_resp_cfg, "sec_resp_cfg", 1);
++
-+    qdev_init_gpio_out_named(dev, &s->nsc_cfg_irq, "nsc_cfg", 1);
++DO_3SAME_32_ENV(VQSHL_S, qshl_s)
++DO_3SAME_32_ENV(VQSHL_U, qshl_u)
-     memory_region_init_io(&s->s_regs, obj, &iotkit_secctl_s_ops,
++DO_3SAME_32_ENV(VQRSHL_S, qrshl_s)
-                           s, "iotkit-secctl-s-regs", 0x1000);
++DO_3SAME_32_ENV(VQRSHL_U, qrshl_u)
-@@ -XXX,XX +XXX,XX @@ static const VMStateDescription iotkit_secctl_vmstate = {
+diff --git a/target/arm/translate.c b/target/arm/translate.c
-         VMSTATE_UINT32(secppcintstat, IoTKitSecCtl),
+index XXXXXXX..XXXXXXX 100644
-         VMSTATE_UINT32(secppcinten, IoTKitSecCtl),
+--- a/target/arm/translate.c
-         VMSTATE_UINT32(secrespcfg, IoTKitSecCtl),
++++ b/target/arm/translate.c
-+        VMSTATE_UINT32(nsccfg, IoTKitSecCtl),
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-+        VMSTATE_UINT32(brginten, IoTKitSecCtl),
+         case NEON_3R_VHSUB:
-         VMSTATE_STRUCT_ARRAY(apb, IoTKitSecCtl, IOTS_NUM_APB_PPC, 1,
+         case NEON_3R_VABD:
-                              iotkit_secctl_ppc_vmstate, IoTKitSecCtlPPC),
+         case NEON_3R_VABA:
-         VMSTATE_STRUCT_ARRAY(apbexp, IoTKitSecCtl, IOTS_NUM_APB_EXP_PPC, 1,
++        case NEON_3R_VQSHL:
 +        case NEON_3R_VRSHL:
 +        case NEON_3R_VQRSHL:
              /* Already handled by decodetree */
              return 1;
          }
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          }
          pairwise = 0;
          switch (op) {
 -        case NEON_3R_VQSHL:
 -        case NEON_3R_VRSHL:
 -        case NEON_3R_VQRSHL:
 -            {
 -                int rtmp;
 -                /* Shift instruction operands are reversed.  */
 -                rtmp = rn;
 -                rn = rm;
 -                rm = rtmp;
 -            }
 -            break;
          case NEON_3R_VPADD_VQRDMLAH:
          case NEON_3R_VPMAX:
          case NEON_3R_VPMIN:
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
              tmp2 = neon_load_reg(rm, pass);
          }
          switch (op) {
 -        case NEON_3R_VQSHL:
 -            GEN_NEON_INTEGER_OP_ENV(qshl);
 -            break;
 -        case NEON_3R_VRSHL:
 -            GEN_NEON_INTEGER_OP(rshl);
 -            break;
 -        case NEON_3R_VQRSHL:
 -            GEN_NEON_INTEGER_OP_ENV(qrshl);
              break;
          case NEON_3R_VPMAX:
              GEN_NEON_INTEGER_OP(pmax);
 --
-.16.2
+.20.1
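
The WRAP_ENV_FN comment in the patch above describes adapting helpers that take cpu_env to a prototype that does not. The following is a minimal standalone sketch of that wrapper pattern, not the QEMU code itself: `Env`, `global_env`, `TwoOpFn`, `sat_add_env`, and `run_op` are hypothetical stand-ins for cpu_env, NeonGenTwoOpFn, and a NeonGenTwoOpEnvFn-style helper, and plain integers replace TCG values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for cpu_env: state that some helpers must update. */
typedef struct Env {
    int saturated;
} Env;

static Env global_env;

/* Three-argument prototype expected by the table (cf. NeonGenTwoOpFn). */
typedef void TwoOpFn(int32_t *d, int32_t n, int32_t m);

/* Helper that also needs the environment (cf. NeonGenTwoOpEnvFn). */
static void sat_add_env(int32_t *d, Env *env, int32_t n, int32_t m)
{
    int64_t r = (int64_t)n + (int64_t)m;

    if (r > INT32_MAX) {
        r = INT32_MAX;
        env->saturated = 1;
    } else if (r < INT32_MIN) {
        r = INT32_MIN;
        env->saturated = 1;
    }
    *d = (int32_t)r;
}

/* Generate a wrapper with the three-argument prototype that supplies env. */
#define WRAP_ENV_FN(WRAPNAME, FUNC)                        \
    static void WRAPNAME(int32_t *d, int32_t n, int32_t m) \
    {                                                      \
        FUNC(d, &global_env, n, m);                        \
    }

WRAP_ENV_FN(sat_add_tramp, sat_add_env)

/* A table that can only hold functions of the three-argument type. */
static TwoOpFn * const ops[] = { sat_add_tramp };

/* Call an entry of the table by index (hypothetical convenience function). */
static void run_op(size_t idx, int32_t *d, int32_t n, int32_t m)
{
    assert(idx < sizeof(ops) / sizeof(ops[0]));
    ops[idx](d, n, m);
}
```

The point of the macro is that the table's element type never has to change: each env-taking helper gets one generated trampoline, and callers such as `run_op` see a uniform prototype.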

-[Qemu-devel] [PULL 05/39] loader: Add new load_ramdisk_as()
+[PULL 36/45] target/arm: Convert Neon VPMAX/VPMIN 3-reg-same insns to decodetree
-Add a function load_ramdisk_as() which behaves like the existing
+Convert the Neon integer VPMAX and VPMIN 3-reg-same insns to
-load_ramdisk() but allows the caller to specify the AddressSpace
+decodetree. These are 'pairwise' operations.
 to use. This matches the pattern we have already for various
 other loader functions.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20180220180325.29818-2-peter.maydell@linaro.org
+Message-id: 20200512163904.10918-9-peter.maydell@linaro.org
 ---
- include/hw/loader.h | 12 +++++++++++-
+ target/arm/neon-dp.decode       |  9 +++++
- hw/core/loader.c    |  8 +++++++-
+ target/arm/translate-neon.inc.c | 71 +++++++++++++++++++++++++++++++++
-files changed, 18 insertions(+), 2 deletions(-)
+ target/arm/translate.c          | 17 +-------
 files changed, 82 insertions(+), 15 deletions(-)
-diff --git a/include/hw/loader.h b/include/hw/loader.h
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
---- a/include/hw/loader.h
+--- a/target/arm/neon-dp.decode
-+++ b/include/hw/loader.h
++++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ int load_uimage(const char *filename, hwaddr *ep,
+@@ -XXX,XX +XXX,XX @@
-                 void *translate_opaque);
+ @3same           .... ... . . . size:2 .... .... .... . q:1 . . .... \
+                  &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp
- /**
-- * load_ramdisk:
++@3same_q0        .... ... . . . size:2 .... .... .... . 0 . . .... \
-+ * load_ramdisk_as:
++                 &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp q=0
   * @filename: Path to the ramdisk image
   * @addr: Memory address to load the ramdisk to
   * @max_sz: Maximum allowed ramdisk size (for non-u-boot ramdisks)
 + * @as: The AddressSpace to load the ramdisk to. The value of address_space_memory
 + *      is used if nothing is supplied here.
   *
   * Load a ramdisk image with U-Boot header to the specified memory
   * address.
   *
   * Returns the size of the loaded image on success, -1 otherwise.
   */
 +int load_ramdisk_as(const char *filename, hwaddr addr, uint64_t max_sz,
 +                    AddressSpace *as);
 +
-+/**
+ VHADD_S_3s       1111 001 0 0 . .. .... .... 0000 . . . 0 .... @3same
-+ * load_ramdisk:
+ VHADD_U_3s       1111 001 1 0 . .. .... .... 0000 . . . 0 .... @3same
-+ * Same as load_ramdisk_as(), but doesn't allow the caller to specify
+ VQADD_S_3s       1111 001 0 0 . .. .... .... 0000 . . . 1 .... @3same
-+ * an AddressSpace.
+@@ -XXX,XX +XXX,XX @@ VMLS_3s          1111 001 1 0 . .. .... .... 1001 . . . 0 .... @3same
-+ */
+ VMUL_3s          1111 001 0 0 . .. .... .... 1001 . . . 1 .... @3same
- int load_ramdisk(const char *filename, hwaddr addr, uint64_t max_sz);
+ VMUL_p_3s        1111 001 1 0 . .. .... .... 1001 . . . 1 .... @3same
- ssize_t gunzip(void *dst, size_t dstlen, uint8_t *src, size_t srclen);
++VPMAX_S_3s       1111 001 0 0 . .. .... .... 1010 . . . 0 .... @3same_q0
-diff --git a/hw/core/loader.c b/hw/core/loader.c
++VPMAX_U_3s       1111 001 1 0 . .. .... .... 1010 . . . 0 .... @3same_q0
 +
 +VPMIN_S_3s       1111 001 0 0 . .. .... .... 1010 . . . 1 .... @3same_q0
 +VPMIN_U_3s       1111 001 1 0 . .. .... .... 1010 . . . 1 .... @3same_q0
 +
  VQRDMLAH_3s      1111 001 1 0 . .. .... .... 1011 ... 1 .... @3same
  SHA1_3s          1111 001 0 0 . optype:2 .... .... 1100 . 1 . 0 .... \
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/core/loader.c
+--- a/target/arm/translate-neon.inc.c
-+++ b/hw/core/loader.c
++++ b/target/arm/translate-neon.inc.c
-@@ -XXX,XX +XXX,XX @@ int load_uimage_as(const char *filename, hwaddr *ep, hwaddr *loadaddr,
+@@ -XXX,XX +XXX,XX @@ DO_3SAME_32_ENV(VQSHL_S, qshl_s)
+ DO_3SAME_32_ENV(VQSHL_U, qshl_u)
- /* Load a ramdisk.  */
+ DO_3SAME_32_ENV(VQRSHL_S, qrshl_s)
- int load_ramdisk(const char *filename, hwaddr addr, uint64_t max_sz)
+ DO_3SAME_32_ENV(VQRSHL_U, qrshl_u)
 +
 +static bool do_3same_pair(DisasContext *s, arg_3same *a, NeonGenTwoOpFn *fn)
 +{
-+    return load_ramdisk_as(filename, addr, max_sz, NULL);
++    /* Operations handled pairwise 32 bits at a time */
 +    TCGv_i32 tmp, tmp2, tmp3;
 +
 +    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
 +        return false;
 +    }
 +
 +    /* UNDEF accesses to D16-D31 if they don't exist. */
 +    if (!dc_isar_feature(aa32_simd_r32, s) &&
 +        ((a->vd | a->vn | a->vm) & 0x10)) {
 +        return false;
 +    }
 +
 +    if (a->size == 3) {
 +        return false;
 +    }
 +
 +    if (!vfp_access_check(s)) {
 +        return true;
 +    }
 +
 +    assert(a->q == 0); /* enforced by decode patterns */
 +
 +    /*
 +     * Note that we have to be careful not to clobber the source operands
 +     * in the "vm == vd" case by storing the result of the first pass too
 +     * early. Since Q is 0 there are always just two passes, so instead
 +     * of a complicated loop over each pass we just unroll.
 +     */
 +    tmp = neon_load_reg(a->vn, 0);
 +    tmp2 = neon_load_reg(a->vn, 1);
 +    fn(tmp, tmp, tmp2);
 +    tcg_temp_free_i32(tmp2);
 +
 +    tmp3 = neon_load_reg(a->vm, 0);
 +    tmp2 = neon_load_reg(a->vm, 1);
 +    fn(tmp3, tmp3, tmp2);
 +    tcg_temp_free_i32(tmp2);
 +
 +    neon_store_reg(a->vd, 0, tmp);
 +    neon_store_reg(a->vd, 1, tmp3);
 +    return true;
 +}
 +
-+int load_ramdisk_as(const char *filename, hwaddr addr, uint64_t max_sz,
++#define DO_3SAME_PAIR(INSN, func)                                       \
-+                    AddressSpace *as)
++    static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a)        \
- {
++    {                                                                   \
-     return load_uboot_image(filename, NULL, &addr, NULL, IH_TYPE_RAMDISK,
++        static NeonGenTwoOpFn * const fns[] = {                         \
--                            NULL, NULL, NULL);
++            gen_helper_neon_##func##8,                                  \
-+                            NULL, NULL, as);
++            gen_helper_neon_##func##16,                                 \
 +            gen_helper_neon_##func##32,                                 \
 +        };                                                              \
 +        if (a->size > 2) {                                              \
 +            return false;                                               \
 +        }                                                               \
 +        return do_3same_pair(s, a, fns[a->size]);                       \
 +    }
 +
 +/* 32-bit pairwise ops end up the same as the elementwise versions.  */
 +#define gen_helper_neon_pmax_s32  tcg_gen_smax_i32
 +#define gen_helper_neon_pmax_u32  tcg_gen_umax_i32
 +#define gen_helper_neon_pmin_s32  tcg_gen_smin_i32
 +#define gen_helper_neon_pmin_u32  tcg_gen_umin_i32
 +
 +DO_3SAME_PAIR(VPMAX_S, pmax_s)
 +DO_3SAME_PAIR(VPMIN_S, pmin_s)
 +DO_3SAME_PAIR(VPMAX_U, pmax_u)
 +DO_3SAME_PAIR(VPMIN_U, pmin_u)
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static inline void gen_neon_rsb(int size, TCGv_i32 t0, TCGv_i32 t1)
      }
  }
- /* Load a gzip-compressed kernel to a dynamically allocated buffer. */
+-/* 32-bit pairwise ops end up the same as the elementwise versions.  */
 -#define gen_helper_neon_pmax_s32  tcg_gen_smax_i32
 -#define gen_helper_neon_pmax_u32  tcg_gen_umax_i32
 -#define gen_helper_neon_pmin_s32  tcg_gen_smin_i32
 -#define gen_helper_neon_pmin_u32  tcg_gen_umin_i32
 -
  #define GEN_NEON_INTEGER_OP_ENV(name) do { \
      switch ((size << 1) | u) { \
      case 0: \
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          case NEON_3R_VQSHL:
          case NEON_3R_VRSHL:
          case NEON_3R_VQRSHL:
 +        case NEON_3R_VPMAX:
 +        case NEON_3R_VPMIN:
              /* Already handled by decodetree */
              return 1;
          }
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          pairwise = 0;
          switch (op) {
          case NEON_3R_VPADD_VQRDMLAH:
 -        case NEON_3R_VPMAX:
 -        case NEON_3R_VPMIN:
              pairwise = 1;
              break;
          case NEON_3R_FLOAT_ARITH:
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
              tmp2 = neon_load_reg(rm, pass);
          }
          switch (op) {
 -            break;
 -        case NEON_3R_VPMAX:
 -            GEN_NEON_INTEGER_OP(pmax);
 -            break;
 -        case NEON_3R_VPMIN:
 -            GEN_NEON_INTEGER_OP(pmin);
 -            break;
          case NEON_3R_VQDMULH_VQRDMULH: /* Multiply high.  */
              if (!u) { /* VQDMULH */
                  switch (size) {
 --
-.16.2
+.20.1
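
The comment in do_3same_pair() notes that the result of the first pass must not be stored before the second pass has read its sources, because vd may alias vm. The sketch below illustrates that ordering for the 32-bit element case only. It is an illustration under assumptions, not the QEMU implementation: `DReg`, `PairFn`, `smax32`, and `pairwise_32` are hypothetical names, and a D register is modelled as two 32-bit words.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 64-bit D register as two 32-bit passes. */
typedef struct DReg {
    int32_t w[2];
} DReg;

/* A two-operand operation on 32-bit values. */
typedef int32_t PairFn(int32_t a, int32_t b);

/* Signed maximum, standing in for the 32-bit VPMAX.S case. */
static int32_t smax32(int32_t a, int32_t b)
{
    return a > b ? a : b;
}

/*
 * Pairwise operation with Q == 0: the low result combines the two halves
 * of vn, the high result combines the two halves of vm. Both results are
 * computed before either is stored, so vd may alias vn or vm safely.
 */
static void pairwise_32(DReg *vd, const DReg *vn, const DReg *vm, PairFn *fn)
{
    int32_t lo = fn(vn->w[0], vn->w[1]);
    int32_t hi = fn(vm->w[0], vm->w[1]);

    vd->w[0] = lo;
    vd->w[1] = hi;
}
```

If the low result were written to vd immediately after the first call, a vd == vm call would compute the high result from an already-overwritten word; deferring both stores avoids that.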

-[Qemu-devel] [PULL 09/39] armv7m: Forward idau property to CPU object
+[PULL 37/45] target/arm: Convert Neon VPADD 3-reg-same insns to decodetree
-Create an "idau" property on the armv7m container object which
+Convert the Neon integer VPADD 3-reg-same insns to decodetree.  These
-we can forward to the CPU object. Annoyingly, we can't use
+are 'pairwise' operations.  (Note that VQRDMLAH, which shares the
-object_property_add_alias() because the CPU object we want to
+same primary opcode but has U=1, has already been converted.)
 forward to doesn't exist until the armv7m container is realized.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20180220180325.29818-6-peter.maydell@linaro.org
+Message-id: 20200512163904.10918-10-peter.maydell@linaro.org
 ---
- include/hw/arm/armv7m.h | 3 +++
+ target/arm/neon-dp.decode       |  2 ++
- hw/arm/armv7m.c         | 9 +++++++++
+ target/arm/translate-neon.inc.c |  2 ++
-files changed, 12 insertions(+)
+ target/arm/translate.c          | 19 +------------------
 files changed, 5 insertions(+), 18 deletions(-)
-diff --git a/include/hw/arm/armv7m.h b/include/hw/arm/armv7m.h
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
---- a/include/hw/arm/armv7m.h
+--- a/target/arm/neon-dp.decode
-+++ b/include/hw/arm/armv7m.h
++++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ VPMAX_U_3s       1111 001 1 0 . .. .... .... 1010 . . . 0 .... @3same_q0
+ VPMIN_S_3s       1111 001 0 0 . .. .... .... 1010 . . . 1 .... @3same_q0
- #include "hw/sysbus.h"
+ VPMIN_U_3s       1111 001 1 0 . .. .... .... 1010 . . . 1 .... @3same_q0
- #include "hw/intc/armv7m_nvic.h"
-+#include "target/arm/idau.h"
++VPADD_3s         1111 001 0 0 . .. .... .... 1011 . . . 1 .... @3same_q0
++
- #define TYPE_BITBAND "ARM,bitband-memory"
+ VQRDMLAH_3s      1111 001 1 0 . .. .... .... 1011 ... 1 .... @3same
- #define BITBAND(obj) OBJECT_CHECK(BitBandState, (obj), TYPE_BITBAND)
-@@ -XXX,XX +XXX,XX @@ typedef struct {
+ SHA1_3s          1111 001 0 0 . optype:2 .... .... 1100 . 1 . 0 .... \
-  * + Property "memory": MemoryRegion defining the physical address space
+diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
   *   that CPU accesses see. (The NVIC, bitbanding and other CPU-internal
   *   devices will be automatically layered on top of this view.)
 + * + Property "idau": IDAU interface (forwarded to CPU object)
   */
  typedef struct ARMv7MState {
      /*< private >*/
@@ -XXX,XX +XXX,XX @@ typedef struct ARMv7MState {
      char *cpu_type;
      /* MemoryRegion the board provides to us (with its devices, RAM, etc) */
      MemoryRegion *board_memory;
 +    Object *idau;
  } ARMv7MState;
  #endif
 diff --git a/hw/arm/armv7m.c b/hw/arm/armv7m.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/armv7m.c
+--- a/target/arm/translate-neon.inc.c
-+++ b/hw/arm/armv7m.c
++++ b/target/arm/translate-neon.inc.c
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ static bool do_3same_pair(DisasContext *s, arg_3same *a, NeonGenTwoOpFn *fn)
- #include "sysemu/qtest.h"
+ #define gen_helper_neon_pmax_u32  tcg_gen_umax_i32
- #include "qemu/error-report.h"
+ #define gen_helper_neon_pmin_s32  tcg_gen_smin_i32
- #include "exec/address-spaces.h"
+ #define gen_helper_neon_pmin_u32  tcg_gen_umin_i32
-+#include "target/arm/idau.h"
++#define gen_helper_neon_padd_u32  tcg_gen_add_i32
- /* Bitbanded IO.  Each word corresponds to a single bit.  */
+ DO_3SAME_PAIR(VPMAX_S, pmax_s)
+ DO_3SAME_PAIR(VPMIN_S, pmin_s)
-@@ -XXX,XX +XXX,XX @@ static void armv7m_realize(DeviceState *dev, Error **errp)
+ DO_3SAME_PAIR(VPMAX_U, pmax_u)
+ DO_3SAME_PAIR(VPMIN_U, pmin_u)
-     object_property_set_link(OBJECT(s->cpu), OBJECT(&s->container), "memory",
++DO_3SAME_PAIR(VPADD, padd_u)
-                              &error_abort);
+diff --git a/target/arm/translate.c b/target/arm/translate.c
-+    if (object_property_find(OBJECT(s->cpu), "idau", NULL)) {
+index XXXXXXX..XXXXXXX 100644
-+        object_property_set_link(OBJECT(s->cpu), s->idau, "idau", &err);
+--- a/target/arm/translate.c
-+        if (err != NULL) {
++++ b/target/arm/translate.c
-+            error_propagate(errp, err);
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-+            return;
+             return 1;
-+        }
+         }
-+    }
+         switch (op) {
-     object_property_set_bool(OBJECT(s->cpu), true, "realized", &err);
+-        case NEON_3R_VPADD_VQRDMLAH:
-     if (err != NULL) {
+-            if (!u) {
-         error_propagate(errp, err);
+-                break;  /* VPADD */
-@@ -XXX,XX +XXX,XX @@ static Property armv7m_properties[] = {
+-            }
-     DEFINE_PROP_STRING("cpu-type", ARMv7MState, cpu_type),
+-            /* VQRDMLAH : handled by decodetree */
-     DEFINE_PROP_LINK("memory", ARMv7MState, board_memory, TYPE_MEMORY_REGION,
+-            return 1;
-                      MemoryRegion *),
+-
-+    DEFINE_PROP_LINK("idau", ARMv7MState, idau, TYPE_IDAU_INTERFACE, Object *),
+         case NEON_3R_VFM_VQRDMLSH:
-     DEFINE_PROP_END_OF_LIST(),
+             if (!u) {
- };
+                 /* VFM, VFMS */
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          case NEON_3R_VQRSHL:
          case NEON_3R_VPMAX:
          case NEON_3R_VPMIN:
 +        case NEON_3R_VPADD_VQRDMLAH:
              /* Already handled by decodetree */
              return 1;
          }
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          }
          pairwise = 0;
          switch (op) {
 -        case NEON_3R_VPADD_VQRDMLAH:
 -            pairwise = 1;
 -            break;
          case NEON_3R_FLOAT_ARITH:
              pairwise = (u && size < 2); /* if VPADD (float) */
              break;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                  }
              }
              break;
 -        case NEON_3R_VPADD_VQRDMLAH:
 -            switch (size) {
 -            case 0: gen_helper_neon_padd_u8(tmp, tmp, tmp2); break;
 -            case 1: gen_helper_neon_padd_u16(tmp, tmp, tmp2); break;
 -            case 2: tcg_gen_add_i32(tmp, tmp, tmp2); break;
 -            default: abort();
 -            }
 -            break;
          case NEON_3R_FLOAT_ARITH: /* Floating point arithmetic. */
          {
              TCGv_ptr fpstatus = get_fpstatus_ptr(1);
 --
-.16.2
+.20.1
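
The patch above maps gen_helper_neon_padd_u32 to tcg_gen_add_i32, relying on the earlier remark that 32-bit pairwise ops end up the same as the elementwise versions. The sketch below contrasts an 8-bit pairwise add with the 32-bit case. The lane layout (two inputs of four packed bytes each, adjacent bytes summed, results of the first input in the low half) is an assumption made for illustration, and `padd_u8` and `padd_u32` are hypothetical standalone functions rather than the QEMU helpers.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pairwise add of 8-bit lanes packed in 32-bit words (assumed layout):
 * adjacent bytes of a are summed into the low two result bytes, adjacent
 * bytes of b into the high two. Each sum wraps modulo 256.
 */
static uint32_t padd_u8(uint32_t a, uint32_t b)
{
    uint32_t r = 0;

    r |= (uint32_t)(uint8_t)((a & 0xff) + ((a >> 8) & 0xff));
    r |= (uint32_t)(uint8_t)(((a >> 16) & 0xff) + (a >> 24)) << 8;
    r |= (uint32_t)(uint8_t)((b & 0xff) + ((b >> 8) & 0xff)) << 16;
    r |= (uint32_t)(uint8_t)(((b >> 16) & 0xff) + (b >> 24)) << 24;
    return r;
}

/*
 * With 32-bit elements each input word is a single lane, so "adding the
 * adjacent pair" is just an ordinary addition of the two words.
 */
static uint32_t padd_u32(uint32_t a, uint32_t b)
{
    return a + b;
}
```

For narrower elements the operation has to split and recombine lanes inside each word, which is why only the 32-bit variant can be replaced by a plain add.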

-[Qemu-devel] [PULL 16/39] hw/core/split-irq: Device that splits IRQ lines
+[PULL 38/45] target/arm: Convert Neon VQDMULH/VQRDMULH 3-reg-same to decodetree
-In some board or SoC models it is necessary to split a qemu_irq line
+Convert the Neon VQDMULH and VQRDMULH 3-reg-same insns to
-so that one input can feed multiple outputs.  We currently have
+decodetree. These are the last integer operations in the
-qemu_irq_split() for this, but that has several deficiencies:
+3-reg-same group.
  * it can only handle splitting a line into two
  * it unavoidably leaks memory, so it can't be used
    in a device that can be deleted
 Implement a qdev device that encapsulates splitting of IRQs, with a
 configurable number of outputs.  (This is in some ways the inverse of
 the TYPE_OR_IRQ device.)
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20180220180325.29818-13-peter.maydell@linaro.org
+Message-id: 20200512163904.10918-11-peter.maydell@linaro.org
 ---
- hw/core/Makefile.objs       |  1 +
+ target/arm/neon-dp.decode       |  3 +++
- include/hw/core/split-irq.h | 57 +++++++++++++++++++++++++++++
+ target/arm/translate-neon.inc.c | 24 ++++++++++++++++++++++++
- include/hw/irq.h            |  4 +-
+ target/arm/translate.c          | 24 +-----------------------
- hw/core/split-irq.c         | 89 +++++++++++++++++++++++++++++++++++++++++++++
+files changed, 28 insertions(+), 23 deletions(-)
 files changed, 150 insertions(+), 1 deletion(-)
  create mode 100644 include/hw/core/split-irq.h
  create mode 100644 hw/core/split-irq.c
-diff --git a/hw/core/Makefile.objs b/hw/core/Makefile.objs
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
---- a/hw/core/Makefile.objs
+--- a/target/arm/neon-dp.decode
-+++ b/hw/core/Makefile.objs
++++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ common-obj-$(CONFIG_FITLOADER) += loader-fit.o
+@@ -XXX,XX +XXX,XX @@ VPMAX_U_3s       1111 001 1 0 . .. .... .... 1010 . . . 0 .... @3same_q0
- common-obj-$(CONFIG_SOFTMMU) += qdev-properties-system.o
+ VPMIN_S_3s       1111 001 0 0 . .. .... .... 1010 . . . 1 .... @3same_q0
- common-obj-$(CONFIG_SOFTMMU) += register.o
+ VPMIN_U_3s       1111 001 1 0 . .. .... .... 1010 . . . 1 .... @3same_q0
- common-obj-$(CONFIG_SOFTMMU) += or-irq.o
-+common-obj-$(CONFIG_SOFTMMU) += split-irq.o
++VQDMULH_3s       1111 001 0 0 . .. .... .... 1011 . . . 0 .... @3same
- common-obj-$(CONFIG_PLATFORM_BUS) += platform-bus.o
++VQRDMULH_3s      1111 001 1 0 . .. .... .... 1011 . . . 0 .... @3same
  obj-$(CONFIG_SOFTMMU) += generic-loader.o
 diff --git a/include/hw/core/split-irq.h b/include/hw/core/split-irq.h
 new file mode 100644
 index XXXXXXX..XXXXXXX
 --- /dev/null
 +++ b/include/hw/core/split-irq.h
@@ -XXX,XX +XXX,XX @@
 +/*
 + * IRQ splitter device.
 + *
 + * Copyright (c) 2018 Linaro Limited.
 + * Written by Peter Maydell
 + *
 + * Permission is hereby granted, free of charge, to any person obtaining a copy
 + * of this software and associated documentation files (the "Software"), to deal
 + * in the Software without restriction, including without limitation the rights
 + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 + * copies of the Software, and to permit persons to whom the Software is
 + * furnished to do so, subject to the following conditions:
 + *
 + * The above copyright notice and this permission notice shall be included in
 + * all copies or substantial portions of the Software.
 + *
 + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
 + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 + * THE SOFTWARE.
 + */
 +
-+/* This is a simple device which has one GPIO input line and multiple
+ VPADD_3s         1111 001 0 0 . .. .... .... 1011 . . . 1 .... @3same_q0
-+ * GPIO output lines. Any change on the input line is forwarded to all
-+ * of the outputs.
+ VQRDMLAH_3s      1111 001 1 0 . .. .... .... 1011 ... 1 .... @3same
-+ *
+diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
-+ * QEMU interface:
+index XXXXXXX..XXXXXXX 100644
-+ *  + one unnamed GPIO input: the input line
+--- a/target/arm/translate-neon.inc.c
-+ *  + N unnamed GPIO outputs: the output lines
++++ b/target/arm/translate-neon.inc.c
-+ *  + QOM property "num-lines": sets the number of output lines
+@@ -XXX,XX +XXX,XX @@ DO_3SAME_PAIR(VPMIN_S, pmin_s)
-+ */
+ DO_3SAME_PAIR(VPMAX_U, pmax_u)
-+#ifndef HW_SPLIT_IRQ_H
+ DO_3SAME_PAIR(VPMIN_U, pmin_u)
-+#define HW_SPLIT_IRQ_H
+ DO_3SAME_PAIR(VPADD, padd_u)
 +
-+#include "hw/irq.h"
++#define DO_3SAME_VQDMULH(INSN, FUNC)                                    \
-+#include "hw/sysbus.h"
++    WRAP_ENV_FN(gen_##INSN##_tramp16, gen_helper_neon_##FUNC##_s16);    \
-+#include "qom/object.h"
++    WRAP_ENV_FN(gen_##INSN##_tramp32, gen_helper_neon_##FUNC##_s32);    \
-+
++    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
-+#define TYPE_SPLIT_IRQ "split-irq"
++                                uint32_t rn_ofs, uint32_t rm_ofs,       \
-+
++                                uint32_t oprsz, uint32_t maxsz)         \
-+#define MAX_SPLIT_LINES 16
++    {                                                                   \
-+
++        static const GVecGen3 ops[2] = {                                \
-+typedef struct SplitIRQ SplitIRQ;
++            { .fni4 = gen_##INSN##_tramp16 },                           \
-+
++            { .fni4 = gen_##INSN##_tramp32 },                           \
-+#define SPLIT_IRQ(obj) OBJECT_CHECK(SplitIRQ, (obj), TYPE_SPLIT_IRQ)
++        };                                                              \
-+
++        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &ops[vece - 1]); \
-+struct SplitIRQ {
++    }                                                                   \
-+    DeviceState parent_obj;
++    static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a)        \
-+
++    {                                                                   \
-+    qemu_irq out_irq[MAX_SPLIT_LINES];
++        if (a->size != 1 && a->size != 2) {                             \
-+    uint16_t num_lines;
++            return false;                                               \
-+};
++        }                                                               \
-+
++        return do_3same(s, a, gen_##INSN##_3s);                         \
 +#endif
 diff --git a/include/hw/irq.h b/include/hw/irq.h
 index XXXXXXX..XXXXXXX 100644
 --- a/include/hw/irq.h
 +++ b/include/hw/irq.h
@@ -XXX,XX +XXX,XX @@ void qemu_free_irq(qemu_irq irq);
  /* Returns a new IRQ with opposite polarity.  */
  qemu_irq qemu_irq_invert(qemu_irq irq);
 -/* Returns a new IRQ which feeds into both the passed IRQs */
 +/* Returns a new IRQ which feeds into both the passed IRQs.
 + * It's probably better to use the TYPE_SPLIT_IRQ device instead.
 + */
  qemu_irq qemu_irq_split(qemu_irq irq1, qemu_irq irq2);
  /* Returns a new IRQ set which connects 1:1 to another IRQ set, which
 diff --git a/hw/core/split-irq.c b/hw/core/split-irq.c
 new file mode 100644
 index XXXXXXX..XXXXXXX
 --- /dev/null
 +++ b/hw/core/split-irq.c
@@ -XXX,XX +XXX,XX @@
 +/*
 + * IRQ splitter device.
 + *
 + * Copyright (c) 2018 Linaro Limited.
 + * Written by Peter Maydell
 + *
 + * Permission is hereby granted, free of charge, to any person obtaining a copy
 + * of this software and associated documentation files (the "Software"), to deal
 + * in the Software without restriction, including without limitation the rights
 + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 + * copies of the Software, and to permit persons to whom the Software is
 + * furnished to do so, subject to the following conditions:
 + *
 + * The above copyright notice and this permission notice shall be included in
 + * all copies or substantial portions of the Software.
 + *
 + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
 + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 + * THE SOFTWARE.
 + */
 +
 +#include "qemu/osdep.h"
 +#include "hw/core/split-irq.h"
 +#include "qapi/error.h"
 +
 +static void split_irq_handler(void *opaque, int n, int level)
 +{
 +    SplitIRQ *s = SPLIT_IRQ(opaque);
 +    int i;
 +
 +    for (i = 0; i < s->num_lines; i++) {
 +        qemu_set_irq(s->out_irq[i], level);
 +    }
 +}
 +
 +static void split_irq_init(Object *obj)
 +{
 +    qdev_init_gpio_in(DEVICE(obj), split_irq_handler, 1);
 +}
 +
 +static void split_irq_realize(DeviceState *dev, Error **errp)
 +{
 +    SplitIRQ *s = SPLIT_IRQ(dev);
 +
 +    if (s->num_lines < 1 || s->num_lines > MAX_SPLIT_LINES) {
 +        error_setg(errp,
 +                   "IRQ splitter number of lines %d is not between 1 and %d",
 +                   s->num_lines, MAX_SPLIT_LINES);
 +        return;
 +    }
 +
-+    qdev_init_gpio_out(dev, s->out_irq, s->num_lines);
++DO_3SAME_VQDMULH(VQDMULH, qdmulh)
-+}
++DO_3SAME_VQDMULH(VQRDMULH, qrdmulh)
-+
+diff --git a/target/arm/translate.c b/target/arm/translate.c
-+static Property split_irq_properties[] = {
+index XXXXXXX..XXXXXXX 100644
-+    DEFINE_PROP_UINT16("num-lines", SplitIRQ, num_lines, 1),
+--- a/target/arm/translate.c
-+    DEFINE_PROP_END_OF_LIST(),
++++ b/target/arm/translate.c
-+};
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-+
+         case NEON_3R_VPMAX:
-+static void split_irq_class_init(ObjectClass *klass, void *data)
+         case NEON_3R_VPMIN:
-+{
+         case NEON_3R_VPADD_VQRDMLAH:
-+    DeviceClass *dc = DEVICE_CLASS(klass);
++        case NEON_3R_VQDMULH_VQRDMULH:
-+
+             /* Already handled by decodetree */
-+    /* No state to reset or migrate */
+             return 1;
-+    dc->props = split_irq_properties;
+         }
-+    dc->realize = split_irq_realize;
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-+
+             tmp2 = neon_load_reg(rm, pass);
-+    /* Reason: Needs to be wired up to work */
+         }
-+    dc->user_creatable = false;
+         switch (op) {
-+}
+-        case NEON_3R_VQDMULH_VQRDMULH: /* Multiply high.  */
-+
+-            if (!u) { /* VQDMULH */
-+static const TypeInfo split_irq_type_info = {
+-                switch (size) {
-+   .name = TYPE_SPLIT_IRQ,
+-                case 1:
-+   .parent = TYPE_DEVICE,
+-                    gen_helper_neon_qdmulh_s16(tmp, cpu_env, tmp, tmp2);
-+   .instance_size = sizeof(SplitIRQ),
+-                    break;
-+   .instance_init = split_irq_init,
+-                case 2:
-+   .class_init = split_irq_class_init,
+-                    gen_helper_neon_qdmulh_s32(tmp, cpu_env, tmp, tmp2);
-+};
+-                    break;
-+
+-                default: abort();
-+static void split_irq_register_types(void)
+-                }
-+{
+-            } else { /* VQRDMULH */
-+    type_register_static(&split_irq_type_info);
+-                switch (size) {
-+}
+-                case 1:
-+
+-                    gen_helper_neon_qrdmulh_s16(tmp, cpu_env, tmp, tmp2);
-+type_init(split_irq_register_types)
+-                    break;
 -                case 2:
 -                    gen_helper_neon_qrdmulh_s32(tmp, cpu_env, tmp, tmp2);
 -                    break;
 -                default: abort();
 -                }
 -            }
 -            break;
          case NEON_3R_FLOAT_ARITH: /* Floating point arithmetic. */
          {
              TCGv_ptr fpstatus = get_fpstatus_ptr(1);
 --
-.16.2
+.20.1

-[Qemu-devel] [PULL 12/39] target/arm: Add Cortex-M33
+[PULL 39/45] target/arm: Convert Neon VADD, VSUB, VABD 3-reg-same insns to decodetree
-Add a Cortex-M33 definition. The M33 is an M profile CPU
+Convert the Neon VADD, VSUB, VABD 3-reg-same insns to decodetree.
-which implements the ARM v8M architecture, including the
+We already have gvec helpers for addition and subtraction, but must
-M profile Security Extension.
+add one for fabd.
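As a rough illustration of what the new fabd helper computes per lane, here is a minimal sketch in plain C. It assumes host float arithmetic stands in for QEMU's softfloat routines and float_status handling; the names clear_sign, abd_f32 and vec_abd_f32 are hypothetical and do not appear in the patch.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for float32_abs(): clear the sign bit. */
static float clear_sign(float x)
{
    uint32_t bits;

    memcpy(&bits, &x, sizeof(bits));
    bits &= 0x7fffffffu;
    memcpy(&x, &bits, sizeof(x));
    return x;
}

/* One lane of the absolute-difference operation: |a - b|. */
static float abd_f32(float a, float b)
{
    return clear_sign(a - b);
}

/*
 * Apply the lane operation to every element, loosely modelling how a
 * whole-vector helper walks all lanes in one call.
 */
static void vec_abd_f32(float *d, const float *n, const float *m, int count)
{
    for (int i = 0; i < count; i++) {
        d[i] = abd_f32(n[i], m[i]);
    }
}
```

The sketch ignores exception flags and NaN propagation rules, which the real helper takes from the float_status argument.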
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20180220180325.29818-9-peter.maydell@linaro.org
+Message-id: 20200512163904.10918-12-peter.maydell@linaro.org
 ---
- target/arm/cpu.c | 31 +++++++++++++++++++++++++++++++
+ target/arm/helper.h             |  3 ++-
-file changed, 31 insertions(+)
+ target/arm/neon-dp.decode       |  8 ++++++++
  target/arm/neon_helper.c        |  7 -------
  target/arm/translate-neon.inc.c | 28 ++++++++++++++++++++++++++++
  target/arm/translate.c          | 10 +++-------
  target/arm/vec_helper.c         |  7 +++++++
 files changed, 48 insertions(+), 15 deletions(-)
-diff --git a/target/arm/cpu.c b/target/arm/cpu.c
+diff --git a/target/arm/helper.h b/target/arm/helper.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.c
+--- a/target/arm/helper.h
-+++ b/target/arm/cpu.c
++++ b/target/arm/helper.h
-@@ -XXX,XX +XXX,XX @@ static void cortex_m4_initfn(Object *obj)
+@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_2(neon_qneg_s16, TCG_CALL_NO_RWG, i32, env, i32)
-     cpu->id_isar5 = 0x00000000;
+ DEF_HELPER_FLAGS_2(neon_qneg_s32, TCG_CALL_NO_RWG, i32, env, i32)
  DEF_HELPER_FLAGS_2(neon_qneg_s64, TCG_CALL_NO_RWG, i64, env, i64)
 -DEF_HELPER_3(neon_abd_f32, i32, i32, i32, ptr)
  DEF_HELPER_3(neon_ceq_f32, i32, i32, i32, ptr)
  DEF_HELPER_3(neon_cge_f32, i32, i32, i32, ptr)
  DEF_HELPER_3(neon_cgt_f32, i32, i32, i32, ptr)
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_fmul_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
  DEF_HELPER_FLAGS_5(gvec_fmul_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
  DEF_HELPER_FLAGS_5(gvec_fmul_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_5(gvec_fabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 +
  DEF_HELPER_FLAGS_5(gvec_ftsmul_h, TCG_CALL_NO_RWG,
                     void, ptr, ptr, ptr, ptr, i32)
  DEF_HELPER_FLAGS_5(gvec_ftsmul_s, TCG_CALL_NO_RWG,
 diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/neon-dp.decode
 +++ b/target/arm/neon-dp.decode
@@ -XXX,XX +XXX,XX @@
  @3same_q0        .... ... . . . size:2 .... .... .... . 0 . . .... \
                   &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp q=0
 +# For FP insns the high bit of 'size' is used as part of opcode decode
 +@3same_fp        .... ... . . . . size:1 .... .... .... . q:1 . . .... \
 +                 &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp
 +
  VHADD_S_3s       1111 001 0 0 . .. .... .... 0000 . . . 0 .... @3same
  VHADD_U_3s       1111 001 1 0 . .. .... .... 0000 . . . 0 .... @3same
  VQADD_S_3s       1111 001 0 0 . .. .... .... 0000 . . . 1 .... @3same
@@ -XXX,XX +XXX,XX @@ SHA256SU1_3s     1111 001 1 0 . 10 .... .... 1100 . 1 . 0 .... \
                   vm=%vm_dp vn=%vn_dp vd=%vd_dp
  VQRDMLSH_3s      1111 001 1 0 . .. .... .... 1100 ... 1 .... @3same
 +
 +VADD_fp_3s       1111 001 0 0 . 0 . .... .... 1101 ... 0 .... @3same_fp
 +VSUB_fp_3s       1111 001 0 0 . 1 . .... .... 1101 ... 0 .... @3same_fp
 +VABD_fp_3s       1111 001 1 0 . 1 . .... .... 1101 ... 0 .... @3same_fp
 diff --git a/target/arm/neon_helper.c b/target/arm/neon_helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/neon_helper.c
 +++ b/target/arm/neon_helper.c
@@ -XXX,XX +XXX,XX @@ uint64_t HELPER(neon_qneg_s64)(CPUARMState *env, uint64_t x)
  }
-+static void cortex_m33_initfn(Object *obj)
+ /* NEON Float helpers.  */
 -uint32_t HELPER(neon_abd_f32)(uint32_t a, uint32_t b, void *fpstp)
 -{
 -    float_status *fpst = fpstp;
 -    float32 f0 = make_float32(a);
 -    float32 f1 = make_float32(b);
 -    return float32_val(float32_abs(float32_sub(f0, f1, fpst)));
 -}
  /* Floating point comparisons produce an integer result.
   * Note that EQ doesn't signal InvalidOp for QNaNs but GE and GT do.
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_3SAME_PAIR(VPADD, padd_u)
  DO_3SAME_VQDMULH(VQDMULH, qdmulh)
  DO_3SAME_VQDMULH(VQRDMULH, qrdmulh)
 +
 +/*
 + * For all the functions using this macro, size == 1 means fp16,
 + * which is an architecture extension we don't implement yet.
 + */
 +#define DO_3S_FP_GVEC(INSN,FUNC)                                        \
 +    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
 +                                uint32_t rn_ofs, uint32_t rm_ofs,       \
 +                                uint32_t oprsz, uint32_t maxsz)         \
 +    {                                                                   \
 +        TCGv_ptr fpst = get_fpstatus_ptr(1);                            \
 +        tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, fpst,                \
 +                           oprsz, maxsz, 0, FUNC);                      \
 +        tcg_temp_free_ptr(fpst);                                        \
 +    }                                                                   \
 +    static bool trans_##INSN##_fp_3s(DisasContext *s, arg_3same *a)     \
 +    {                                                                   \
 +        if (a->size != 0) {                                             \
 +            /* TODO fp16 support */                                     \
 +            return false;                                               \
 +        }                                                               \
 +        return do_3same(s, a, gen_##INSN##_3s);                         \
 +    }
 +
 +
 +DO_3S_FP_GVEC(VADD, gen_helper_gvec_fadd_s)
 +DO_3S_FP_GVEC(VSUB, gen_helper_gvec_fsub_s)
 +DO_3S_FP_GVEC(VABD, gen_helper_gvec_fabd_s)
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          switch (op) {
          case NEON_3R_FLOAT_ARITH:
              pairwise = (u && size < 2); /* if VPADD (float) */
 +            if (!pairwise) {
 +                return 1; /* handled by decodetree */
 +            }
              break;
          case NEON_3R_FLOAT_MINMAX:
              pairwise = u; /* if VPMIN/VPMAX (float) */
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          {
              TCGv_ptr fpstatus = get_fpstatus_ptr(1);
              switch ((u << 2) | size) {
 -            case 0: /* VADD */
              case 4: /* VPADD */
                  gen_helper_vfp_adds(tmp, tmp, tmp2, fpstatus);
                  break;
 -            case 2: /* VSUB */
 -                gen_helper_vfp_subs(tmp, tmp, tmp2, fpstatus);
 -                break;
 -            case 6: /* VABD */
 -                gen_helper_neon_abd_f32(tmp, tmp, tmp2, fpstatus);
 -                break;
              default:
                  abort();
              }
 diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/vec_helper.c
 +++ b/target/arm/vec_helper.c
@@ -XXX,XX +XXX,XX @@ static float64 float64_ftsmul(float64 op1, uint64_t op2, float_status *stat)
      return result;
  }
 +static float32 float32_abd(float32 op1, float32 op2, float_status *stat)
 +{
-+    ARMCPU *cpu = ARM_CPU(obj);
++    return float32_abs(float32_sub(op1, op2, stat));
 +
 +    set_feature(&cpu->env, ARM_FEATURE_V8);
 +    set_feature(&cpu->env, ARM_FEATURE_M);
 +    set_feature(&cpu->env, ARM_FEATURE_M_SECURITY);
 +    set_feature(&cpu->env, ARM_FEATURE_THUMB_DSP);
 +    cpu->midr = 0x410fd213; /* r0p3 */
 +    cpu->pmsav7_dregion = 16;
 +    cpu->sau_sregion = 8;
 +    cpu->id_pfr0 = 0x00000030;
 +    cpu->id_pfr1 = 0x00000210;
 +    cpu->id_dfr0 = 0x00200000;
 +    cpu->id_afr0 = 0x00000000;
 +    cpu->id_mmfr0 = 0x00101F40;
 +    cpu->id_mmfr1 = 0x00000000;
 +    cpu->id_mmfr2 = 0x01000000;
 +    cpu->id_mmfr3 = 0x00000000;
 +    cpu->id_isar0 = 0x01101110;
 +    cpu->id_isar1 = 0x02212000;
 +    cpu->id_isar2 = 0x20232232;
 +    cpu->id_isar3 = 0x01111131;
 +    cpu->id_isar4 = 0x01310132;
 +    cpu->id_isar5 = 0x00000000;
 +    cpu->clidr = 0x00000000;
 +    cpu->ctr = 0x8000c000;
 +}
 +
- static void arm_v7m_class_init(ObjectClass *oc, void *data)
+ #define DO_3OP(NAME, FUNC, TYPE) \
- {
+ void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
-     CPUClass *cc = CPU_CLASS(oc);
+ {                                                                          \
-@@ -XXX,XX +XXX,XX @@ static const ARMCPUInfo arm_cpus[] = {
+@@ -XXX,XX +XXX,XX @@ DO_3OP(gvec_ftsmul_h, float16_ftsmul, float16)
-                              .class_init = arm_v7m_class_init },
+ DO_3OP(gvec_ftsmul_s, float32_ftsmul, float32)
-     { .name = "cortex-m4",   .initfn = cortex_m4_initfn,
+ DO_3OP(gvec_ftsmul_d, float64_ftsmul, float64)
-                              .class_init = arm_v7m_class_init },
-+    { .name = "cortex-m33",  .initfn = cortex_m33_initfn,
++DO_3OP(gvec_fabd_s, float32_abd, float32)
-+                             .class_init = arm_v7m_class_init },
++
-     { .name = "cortex-r5",   .initfn = cortex_r5_initfn },
+ #ifdef TARGET_AARCH64
-     { .name = "cortex-a7",   .initfn = cortex_a7_initfn },
-     { .name = "cortex-a8",   .initfn = cortex_a8_initfn },
+ DO_3OP(gvec_recps_h, helper_recpsf_f16, float16)
 --
-.16.2
+.20.1

-New patch
+[PULL 40/45] target/arm: Convert Neon VPMIN/VPMAX/VPADD float 3-reg-same insns to decodetree
+Convert the Neon float VPMIN, VPMAX and VPADD 3-reg-same insns to
 decodetree. These are the only remaining 'pairwise' operations,
 so we can delete the pairwise-specific bits of the old decoder's
 for-each-element loop now.
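The pairwise data flow can be sketched in plain C as follows. This is only a model under stated assumptions: DReg, PairFn, add_f32 and pairwise_op are hypothetical names, a D register is treated as two host floats, and the fp status pointer is left out.

```c
/* Hypothetical model of a 64-bit D register holding two float lanes. */
typedef struct {
    float lane[2];
} DReg;

typedef float (*PairFn)(float a, float b);

static float add_f32(float a, float b)
{
    return a + b;
}

/*
 * Pairwise operation with Q == 0:
 *   vd[0] = fn(vn[0], vn[1])
 *   vd[1] = fn(vm[0], vm[1])
 * Both results are computed before either is stored, so the call is
 * safe even when vd aliases vn or vm.
 */
static void pairwise_op(DReg *vd, const DReg *vn, const DReg *vm, PairFn fn)
{
    float lo = fn(vn->lane[0], vn->lane[1]);
    float hi = fn(vm->lane[0], vm->lane[1]);

    vd->lane[0] = lo;
    vd->lane[1] = hi;
}
```

Storing vd[0] straight after the first computation would corrupt vm[0] when vd and vm are the same register, which is why both stores come last.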
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
 Message-id: 20200512163904.10918-13-peter.maydell@linaro.org
 ---
  target/arm/neon-dp.decode       |  5 +++
  target/arm/translate-neon.inc.c | 63 +++++++++++++++++++++++++++++++++
  target/arm/translate.c          | 63 +++++----------------------------
 files changed, 76 insertions(+), 55 deletions(-)
 diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/neon-dp.decode
 +++ b/target/arm/neon-dp.decode
@@ -XXX,XX +XXX,XX @@
  # For FP insns the high bit of 'size' is used as part of opcode decode
  @3same_fp        .... ... . . . . size:1 .... .... .... . q:1 . . .... \
                   &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp
 +@3same_fp_q0     .... ... . . . . size:1 .... .... .... . 0 . . .... \
 +                 &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp q=0
  VHADD_S_3s       1111 001 0 0 . .. .... .... 0000 . . . 0 .... @3same
  VHADD_U_3s       1111 001 1 0 . .. .... .... 0000 . . . 0 .... @3same
@@ -XXX,XX +XXX,XX @@ VQRDMLSH_3s      1111 001 1 0 . .. .... .... 1100 ... 1 .... @3same
  VADD_fp_3s       1111 001 0 0 . 0 . .... .... 1101 ... 0 .... @3same_fp
  VSUB_fp_3s       1111 001 0 0 . 1 . .... .... 1101 ... 0 .... @3same_fp
 +VPADD_fp_3s      1111 001 1 0 . 0 . .... .... 1101 ... 0 .... @3same_fp_q0
  VABD_fp_3s       1111 001 1 0 . 1 . .... .... 1101 ... 0 .... @3same_fp
 +VPMAX_fp_3s      1111 001 1 0 . 0 . .... .... 1111 ... 0 .... @3same_fp_q0
 +VPMIN_fp_3s      1111 001 1 0 . 1 . .... .... 1111 ... 0 .... @3same_fp_q0
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_3SAME_VQDMULH(VQRDMULH, qrdmulh)
  DO_3S_FP_GVEC(VADD, gen_helper_gvec_fadd_s)
  DO_3S_FP_GVEC(VSUB, gen_helper_gvec_fsub_s)
  DO_3S_FP_GVEC(VABD, gen_helper_gvec_fabd_s)
 +
 +static bool do_3same_fp_pair(DisasContext *s, arg_3same *a, VFPGen3OpSPFn *fn)
 +{
 +    /* FP operations handled pairwise 32 bits at a time */
 +    TCGv_i32 tmp, tmp2, tmp3;
 +    TCGv_ptr fpstatus;
 +
 +    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
 +        return false;
 +    }
 +
 +    /* UNDEF accesses to D16-D31 if they don't exist. */
 +    if (!dc_isar_feature(aa32_simd_r32, s) &&
 +        ((a->vd | a->vn | a->vm) & 0x10)) {
 +        return false;
 +    }
 +
 +    if (!vfp_access_check(s)) {
 +        return true;
 +    }
 +
 +    assert(a->q == 0); /* enforced by decode patterns */
 +
 +    /*
 +     * Note that we have to be careful not to clobber the source operands
 +     * in the "vm == vd" case by storing the result of the first pass too
 +     * early. Since Q is 0 there are always just two passes, so instead
 +     * of a complicated loop over each pass we just unroll.
 +     */
 +    fpstatus = get_fpstatus_ptr(1);
 +    tmp = neon_load_reg(a->vn, 0);
 +    tmp2 = neon_load_reg(a->vn, 1);
 +    fn(tmp, tmp, tmp2, fpstatus);
 +    tcg_temp_free_i32(tmp2);
 +
 +    tmp3 = neon_load_reg(a->vm, 0);
 +    tmp2 = neon_load_reg(a->vm, 1);
 +    fn(tmp3, tmp3, tmp2, fpstatus);
 +    tcg_temp_free_i32(tmp2);
 +    tcg_temp_free_ptr(fpstatus);
 +
 +    neon_store_reg(a->vd, 0, tmp);
 +    neon_store_reg(a->vd, 1, tmp3);
 +    return true;
 +}
 +
 +/*
 + * For all the functions using this macro, size == 1 means fp16,
 + * which is an architecture extension we don't implement yet.
 + */
 +#define DO_3S_FP_PAIR(INSN,FUNC)                                    \
 +    static bool trans_##INSN##_fp_3s(DisasContext *s, arg_3same *a) \
 +    {                                                               \
 +        if (a->size != 0) {                                         \
 +            /* TODO fp16 support */                                 \
 +            return false;                                           \
 +        }                                                           \
 +        return do_3same_fp_pair(s, a, FUNC);                        \
 +    }
 +
 +DO_3S_FP_PAIR(VPADD, gen_helper_vfp_adds)
 +DO_3S_FP_PAIR(VPMAX, gen_helper_vfp_maxs)
 +DO_3S_FP_PAIR(VPMIN, gen_helper_vfp_mins)
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
      int shift;
      int pass;
      int count;
 -    int pairwise;
      int u;
      int vec_size;
      uint32_t imm;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          case NEON_3R_VPMIN:
          case NEON_3R_VPADD_VQRDMLAH:
          case NEON_3R_VQDMULH_VQRDMULH:
 +        case NEON_3R_FLOAT_ARITH:
              /* Already handled by decodetree */
              return 1;
          }
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
              /* 64-bit element instructions: handled by decodetree */
              return 1;
          }
 -        pairwise = 0;
          switch (op) {
 -        case NEON_3R_FLOAT_ARITH:
 -            pairwise = (u && size < 2); /* if VPADD (float) */
 -            if (!pairwise) {
 -                return 1; /* handled by decodetree */
 -            }
 -            break;
          case NEON_3R_FLOAT_MINMAX:
 -            pairwise = u; /* if VPMIN/VPMAX (float) */
 +            if (u) {
 +                return 1; /* VPMIN/VPMAX handled by decodetree */
 +            }
              break;
          case NEON_3R_FLOAT_CMP:
              if (!u && size) {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
              break;
          }
 -        if (pairwise && q) {
 -            /* All the pairwise insns UNDEF if Q is set */
 -            return 1;
 -        }
 -
          for (pass = 0; pass < (q ? 4 : 2); pass++) {
 -        if (pairwise) {
 -            /* Pairwise.  */
 -            if (pass < 1) {
 -                tmp = neon_load_reg(rn, 0);
 -                tmp2 = neon_load_reg(rn, 1);
 -            } else {
 -                tmp = neon_load_reg(rm, 0);
 -                tmp2 = neon_load_reg(rm, 1);
 -            }
 -        } else {
 -            /* Elementwise.  */
 -            tmp = neon_load_reg(rn, pass);
 -            tmp2 = neon_load_reg(rm, pass);
 -        }
 +        /* Elementwise.  */
 +        tmp = neon_load_reg(rn, pass);
 +        tmp2 = neon_load_reg(rm, pass);
          switch (op) {
 -        case NEON_3R_FLOAT_ARITH: /* Floating point arithmetic. */
 -        {
 -            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
 -            switch ((u << 2) | size) {
 -            case 4: /* VPADD */
 -                gen_helper_vfp_adds(tmp, tmp, tmp2, fpstatus);
 -                break;
 -            default:
 -                abort();
 -            }
 -            tcg_temp_free_ptr(fpstatus);
 -            break;
 -        }
          case NEON_3R_FLOAT_MULTIPLY:
          {
              TCGv_ptr fpstatus = get_fpstatus_ptr(1);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          }
          tcg_temp_free_i32(tmp2);
 -        /* Save the result.  For elementwise operations we can put it
 -           straight into the destination register.  For pairwise operations
 -           we have to be careful to avoid clobbering the source operands.  */
 -        if (pairwise && rd == rm) {
 -            neon_store_scratch(pass, tmp);
 -        } else {
 -            neon_store_reg(rd, pass, tmp);
 -        }
 +        neon_store_reg(rd, pass, tmp);
          } /* for pass */
 -        if (pairwise && rd == rm) {
 -            for (pass = 0; pass < (q ? 4 : 2); pass++) {
 -                tmp = neon_load_scratch(pass);
 -                neon_store_reg(rd, pass, tmp);
 -            }
 -        }
          /* End of 3 register same size operations.  */
      } else if (insn & (1 << 4)) {
          if ((insn & 0x00380080) != 0) {
 --
 .20.1

-[Qemu-devel] [PULL 18/39] hw/misc/tz-ppc: Model TrustZone peripheral protection controller
+[PULL 41/45] target/arm: Convert Neon fp VMUL, VMLA, VMLS 3-reg-same insns to decodetree
-Add a model of the TrustZone peripheral protection controller (PPC),
+Convert the Neon fp VMUL, VMLA, and VMLS 3-reg-same insns to
-which is used to gate transactions to non-TZ-aware peripherals so
+decodetree.
-that secure software can configure them to not be accessible to
-non-secure software.
+We don't have a gvec helper for multiply-accumulate, so VMLA and VMLS
 need a loop function do_3same_fp().  This takes a reads_vd parameter
 which tells it to load the old value of vd before calling the
 callback function, in the same way that the do_vfp_3op_sp()
 and do_vfp_3op_dp() functions in translate-vfp.inc.c work. (The
 only uses in this patch pass reads_vd == true, but later commits
 will use reads_vd == false.)
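A minimal sketch of the reads_vd idea in plain C, assuming host floats in place of TCG temporaries; Op3Fn, mul_op, mla_op and elementwise_op are hypothetical names used only for illustration.

```c
#include <stdbool.h>

/* Callback: writes its result to *d and may use the incoming value. */
typedef void (*Op3Fn)(float *d, float n, float m);

static void mul_op(float *d, float n, float m)
{
    *d = n * m;          /* ignores the old destination value */
}

static void mla_op(float *d, float n, float m)
{
    *d = *d + n * m;     /* accumulates into the old destination value */
}

/*
 * Elementwise loop over all lanes. When reads_vd is true, the old
 * destination lane is loaded before the callback runs, as needed for
 * multiply-accumulate style operations.
 */
static void elementwise_op(float *vd, const float *vn, const float *vm,
                           int lanes, Op3Fn fn, bool reads_vd)
{
    for (int pass = 0; pass < lanes; pass++) {
        float result = reads_vd ? vd[pass] : 0.0f;

        fn(&result, vn[pass], vm[pass]);
        vd[pass] = result;
    }
}
```

Callers that do not accumulate pass reads_vd == false, so the loop skips loading the old destination value.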
 This conversion fixes in passing an underdecoding for VMUL
 (originally reported by Fredrik Strupe <fredrik@strupe.net>): bit 1
 of the 'size' field must be 0.  The old decoder didn't enforce this,
 but the decodetree pattern does.
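To make the underdecoding concrete, the fixed bits of the VMUL_fp_3s pattern line in this patch can be written as a mask and value pair. The constants below are derived by hand from that pattern and the function name is hypothetical; the real decoder is generated by decodetree and also extracts fields, which this sketch does not model.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Fixed bits of:
 *   1111 001 1 0 . 0 . .... .... 1101 ... 1 ....
 * Bit 21 is the high bit of 'size' and must be 0.
 */
#define VMUL_FP_3S_MASK  0xffa00f10u
#define VMUL_FP_3S_VALUE 0xf3000d10u

static bool is_vmul_fp_3s(uint32_t insn)
{
    return (insn & VMUL_FP_3S_MASK) == VMUL_FP_3S_VALUE;
}
```

An encoding with bit 21 set, such as 0xf3200d10, fails this match, whereas a decoder that ignores that bit would accept it as VMUL.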
 The gen_VMLA_fp_3s() function performs the addition operation
 with the operands in the opposite order to the old decoder:
 since Neon sets 'default NaN mode', float32_add operations are
 commutative so there is no behaviour difference, but putting
 them this way around matches the Arm ARM pseudocode and the
 required operation order for the subtraction in gen_VMLS_fp_3s().
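The operand order can be illustrated per lane in plain C. This is a sketch with hypothetical names (vmla_lane, vmls_lane), using host floats rather than the softfloat helpers, with the product computed first and the old destination value as the left-hand operand.

```c
/* Multiply-accumulate lane: vd + (vn * vm). */
static float vmla_lane(float vd, float vn, float vm)
{
    float prod = vn * vm;

    return vd + prod;
}

/* Multiply-subtract lane: vd - (vn * vm). The order matters here. */
static float vmls_lane(float vd, float vn, float vm)
{
    float prod = vn * vm;

    return vd - prod;
}
```

Writing both with vd on the left keeps the two functions structurally identical; only the subtraction would give a different answer if the operands were swapped.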
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20180220180325.29818-15-peter.maydell@linaro.org
+Message-id: 20200512163904.10918-14-peter.maydell@linaro.org
 ---
- hw/misc/Makefile.objs           |   2 +
+ target/arm/neon-dp.decode       |  3 ++
- include/hw/misc/tz-ppc.h        | 101 ++++++++++++++
+ target/arm/translate-neon.inc.c | 81 +++++++++++++++++++++++++++++++++
- hw/misc/tz-ppc.c                | 302 ++++++++++++++++++++++++++++++++++++++++
+ target/arm/translate.c          | 17 +------
- default-configs/arm-softmmu.mak |   2 +
+files changed, 85 insertions(+), 16 deletions(-)
  hw/misc/trace-events            |  11 ++
 files changed, 418 insertions(+)
  create mode 100644 include/hw/misc/tz-ppc.h
  create mode 100644 hw/misc/tz-ppc.c
-diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
---- a/hw/misc/Makefile.objs
+--- a/target/arm/neon-dp.decode
-+++ b/hw/misc/Makefile.objs
++++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ obj-$(CONFIG_MIPS_ITU) += mips_itu.o
+@@ -XXX,XX +XXX,XX @@ VADD_fp_3s       1111 001 0 0 . 0 . .... .... 1101 ... 0 .... @3same_fp
- obj-$(CONFIG_MPS2_FPGAIO) += mps2-fpgaio.o
+ VSUB_fp_3s       1111 001 0 0 . 1 . .... .... 1101 ... 0 .... @3same_fp
- obj-$(CONFIG_MPS2_SCC) += mps2-scc.o
+ VPADD_fp_3s      1111 001 1 0 . 0 . .... .... 1101 ... 0 .... @3same_fp_q0
+ VABD_fp_3s       1111 001 1 0 . 1 . .... .... 1101 ... 0 .... @3same_fp
-+obj-$(CONFIG_TZ_PPC) += tz-ppc.o
++VMLA_fp_3s       1111 001 0 0 . 0 . .... .... 1101 ... 1 .... @3same_fp
 +VMLS_fp_3s       1111 001 0 0 . 1 . .... .... 1101 ... 1 .... @3same_fp
 +VMUL_fp_3s       1111 001 1 0 . 0 . .... .... 1101 ... 1 .... @3same_fp
  VPMAX_fp_3s      1111 001 1 0 . 0 . .... .... 1111 ... 0 .... @3same_fp_q0
  VPMIN_fp_3s      1111 001 1 0 . 1 . .... .... 1111 ... 0 .... @3same_fp_q0
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_3SAME_PAIR(VPADD, padd_u)
  DO_3SAME_VQDMULH(VQDMULH, qdmulh)
  DO_3SAME_VQDMULH(VQRDMULH, qrdmulh)
 +static bool do_3same_fp(DisasContext *s, arg_3same *a, VFPGen3OpSPFn *fn,
 +                        bool reads_vd)
 +{
 +    /*
 +     * FP operations handled elementwise 32 bits at a time.
 +     * If reads_vd is true then the old value of Vd will be
 +     * loaded before calling the callback function. This is
 +     * used for multiply-accumulate type operations.
 +     */
 +    TCGv_i32 tmp, tmp2;
 +    int pass;
 +
- obj-$(CONFIG_PVPANIC) += pvpanic.o
++    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
  obj-$(CONFIG_HYPERV_TESTDEV) += hyperv_testdev.o
  obj-$(CONFIG_AUX) += auxbus.o
 diff --git a/include/hw/misc/tz-ppc.h b/include/hw/misc/tz-ppc.h
 new file mode 100644
 index XXXXXXX..XXXXXXX
 --- /dev/null
 +++ b/include/hw/misc/tz-ppc.h
@@ -XXX,XX +XXX,XX @@
 +/*
 + * ARM TrustZone peripheral protection controller emulation
 + *
 + * Copyright (c) 2018 Linaro Limited
 + * Written by Peter Maydell
 + *
 + * This program is free software; you can redistribute it and/or modify
 + * it under the terms of the GNU General Public License version 2 or
 + * (at your option) any later version.
 + */
 +
 +/* This is a model of the TrustZone peripheral protection controller (PPC).
 + * It is documented in the ARM CoreLink SIE-200 System IP for Embedded TRM
 + * (DDI 0571G):
 + * https://developer.arm.com/products/architecture/m-profile/docs/ddi0571/g
 + *
 + * The PPC sits in front of peripherals and allows secure software to
 + * configure it to either pass through or reject transactions.
 + * Rejected transactions may be configured to either be aborted, or to
 + * behave as RAZ/WI. An interrupt can be signalled for a rejected transaction.
 + *
 + * The PPC has no register interface -- it is configured purely by a
 + * collection of input signals from other hardware in the system. Typically
 + * they are either hardwired or exposed in an ad-hoc register interface by
 + * the SoC that uses the PPC.
 + *
 + * This QEMU model can be used to model either the AHB5 or APB4 TZ PPC,
 + * since the only difference between them is that the AHB version has a
 + * "default" port which has no security checks applied. In QEMU the default
 + * port can be emulated simply by wiring its downstream devices directly
 + * into the parent address space, since the PPC does not need to intercept
 + * transactions there.
 + *
 + * In the hardware, selection of which downstream port to use is done by
 + * the user's decode logic asserting one of the hsel[] signals. In QEMU,
 + * we provide 16 MMIO regions, one per port, and the user maps these into
 + * the desired addresses to implement the address decode.
 + *
 + * QEMU interface:
 + * + sysbus MMIO regions 0..15: MemoryRegions defining the upstream end
 + *   of each of the 16 ports of the PPC
 + * + Property "port[0..15]": MemoryRegion defining the downstream device(s)
 + *   for each of the 16 ports of the PPC
 + * + Named GPIO inputs "cfg_nonsec[0..15]": set to 1 if the port should be
 + *   accessible to NonSecure transactions
 + * + Named GPIO inputs "cfg_ap[0..15]": set to 1 if the port should be
 + *   accessible to non-privileged transactions
 + * + Named GPIO input "cfg_sec_resp": set to 1 if a rejected transaction should
 + *   result in a transaction error, or 0 for the transaction to RAZ/WI
 + * + Named GPIO input "irq_enable": set to 1 to enable interrupts
 + * + Named GPIO input "irq_clear": set to 1 to clear a pending interrupt
 + * + Named GPIO output "irq": set for a transaction-failed interrupt
 + * + Property "NONSEC_MASK": if a bit is set in this mask then accesses to
 + *   the associated port do not have the TZ security check performed. (This
 + *   corresponds to the hardware allowing this to be set as a Verilog
 + *   parameter.)
 + */
 +
 +#ifndef TZ_PPC_H
 +#define TZ_PPC_H
 +
 +#include "hw/sysbus.h"
 +
 +#define TYPE_TZ_PPC "tz-ppc"
 +#define TZ_PPC(obj) OBJECT_CHECK(TZPPC, (obj), TYPE_TZ_PPC)
 +
 +#define TZ_NUM_PORTS 16
 +
 +typedef struct TZPPC TZPPC;
 +
 +typedef struct TZPPCPort {
 +    TZPPC *ppc;
 +    MemoryRegion upstream;
 +    AddressSpace downstream_as;
 +    MemoryRegion *downstream;
 +} TZPPCPort;
 +
 +struct TZPPC {
 +    /*< private >*/
 +    SysBusDevice parent_obj;
 +
 +    /*< public >*/
 +
 +    /* State: these just track the values of our input signals */
 +    bool cfg_nonsec[TZ_NUM_PORTS];
 +    bool cfg_ap[TZ_NUM_PORTS];
 +    bool cfg_sec_resp;
 +    bool irq_enable;
 +    bool irq_clear;
 +    /* State: are we asserting irq ? */
 +    bool irq_status;
 +
 +    qemu_irq irq;
 +
 +    /* Properties */
 +    uint32_t nonsec_mask;
 +
 +    TZPPCPort port[TZ_NUM_PORTS];
 +};
 +
 +#endif
 diff --git a/hw/misc/tz-ppc.c b/hw/misc/tz-ppc.c
 new file mode 100644
 index XXXXXXX..XXXXXXX
 --- /dev/null
 +++ b/hw/misc/tz-ppc.c
@@ -XXX,XX +XXX,XX @@
 +/*
 + * ARM TrustZone peripheral protection controller emulation
 + *
 + * Copyright (c) 2018 Linaro Limited
 + * Written by Peter Maydell
 + *
 + * This program is free software; you can redistribute it and/or modify
 + * it under the terms of the GNU General Public License version 2 or
 + * (at your option) any later version.
 + */
 +
 +#include "qemu/osdep.h"
 +#include "qemu/log.h"
 +#include "qapi/error.h"
 +#include "trace.h"
 +#include "hw/sysbus.h"
 +#include "hw/registerfields.h"
 +#include "hw/misc/tz-ppc.h"
 +
 +static void tz_ppc_update_irq(TZPPC *s)
 +{
 +    bool level = s->irq_status && s->irq_enable;
 +
 +    trace_tz_ppc_update_irq(level);
 +    qemu_set_irq(s->irq, level);
 +}
 +
 +static void tz_ppc_cfg_nonsec(void *opaque, int n, int level)
 +{
 +    TZPPC *s = TZ_PPC(opaque);
 +
 +    assert(n < TZ_NUM_PORTS);
 +    trace_tz_ppc_cfg_nonsec(n, level);
 +    s->cfg_nonsec[n] = level;
 +}
 +
 +static void tz_ppc_cfg_ap(void *opaque, int n, int level)
 +{
 +    TZPPC *s = TZ_PPC(opaque);
 +
 +    assert(n < TZ_NUM_PORTS);
 +    trace_tz_ppc_cfg_ap(n, level);
 +    s->cfg_ap[n] = level;
 +}
 +
 +static void tz_ppc_cfg_sec_resp(void *opaque, int n, int level)
 +{
 +    TZPPC *s = TZ_PPC(opaque);
 +
 +    trace_tz_ppc_cfg_sec_resp(level);
 +    s->cfg_sec_resp = level;
 +}
 +
 +static void tz_ppc_irq_enable(void *opaque, int n, int level)
 +{
 +    TZPPC *s = TZ_PPC(opaque);
 +
 +    trace_tz_ppc_irq_enable(level);
 +    s->irq_enable = level;
 +    tz_ppc_update_irq(s);
 +}
 +
 +static void tz_ppc_irq_clear(void *opaque, int n, int level)
 +{
 +    TZPPC *s = TZ_PPC(opaque);
 +
 +    trace_tz_ppc_irq_clear(level);
 +
 +    s->irq_clear = level;
 +    if (level) {
 +        s->irq_status = false;
 +        tz_ppc_update_irq(s);
 +    }
 +}
 +
 +static bool tz_ppc_check(TZPPC *s, int n, MemTxAttrs attrs)
 +{
 +    /* Check whether to allow an access to port n; return true if
 +     * the check passes, and false if the transaction must be blocked.
 +     * If the latter, the caller must check cfg_sec_resp to determine
 +     * whether to abort or RAZ/WI the transaction.
 +     * The checks are:
 +     *  + nonsec_mask suppresses any check of the secure attribute
 +     *  + otherwise, block if cfg_nonsec is 1 and transaction is secure,
 +     *    or if cfg_nonsec is 0 and transaction is non-secure
 +     *  + block if transaction is usermode and cfg_ap is 0
 +     */
 +    if ((attrs.secure == s->cfg_nonsec[n] && !(s->nonsec_mask & (1 << n))) ||
 +        (attrs.user && !s->cfg_ap[n])) {
 +        /* Block the transaction. */
 +        if (!s->irq_clear) {
 +            /* Note that holding irq_clear high suppresses interrupts */
 +            s->irq_status = true;
 +            tz_ppc_update_irq(s);
 +        }
 +        return false;
 +    }
++
++    /* UNDEF accesses to D16-D31 if they don't exist. */
++    if (!dc_isar_feature(aa32_simd_r32, s) &&
++        ((a->vd | a->vn | a->vm) & 0x10)) {
++        return false;
++    }
++
++    if ((a->vn | a->vm | a->vd) & a->q) {
++        return false;
++    }
++
++    if (!vfp_access_check(s)) {
++        return true;
++    }
++
++    TCGv_ptr fpstatus = get_fpstatus_ptr(1);
++    for (pass = 0; pass < (a->q ? 4 : 2); pass++) {
++        tmp = neon_load_reg(a->vn, pass);
++        tmp2 = neon_load_reg(a->vm, pass);
++        if (reads_vd) {
++            TCGv_i32 tmp_rd = neon_load_reg(a->vd, pass);
++            fn(tmp_rd, tmp, tmp2, fpstatus);
++            neon_store_reg(a->vd, pass, tmp_rd);
++            tcg_temp_free_i32(tmp);
++        } else {
++            fn(tmp, tmp, tmp2, fpstatus);
++            neon_store_reg(a->vd, pass, tmp);
++        }
++        tcg_temp_free_i32(tmp2);
++    }
++    tcg_temp_free_ptr(fpstatus);
 +    return true;
 +}
 +
-+static MemTxResult tz_ppc_read(void *opaque, hwaddr addr, uint64_t *pdata,
+ /*
-+                               unsigned size, MemTxAttrs attrs)
+  * For all the functions using this macro, size == 1 means fp16,
-+{
+  * which is an architecture extension we don't implement yet.
-+    TZPPCPort *p = opaque;
+@@ -XXX,XX +XXX,XX @@ DO_3SAME_VQDMULH(VQRDMULH, qrdmulh)
-+    TZPPC *s = p->ppc;
+ DO_3S_FP_GVEC(VADD, gen_helper_gvec_fadd_s)
-+    int n = p - s->port;
+ DO_3S_FP_GVEC(VSUB, gen_helper_gvec_fsub_s)
-+    AddressSpace *as = &p->downstream_as;
+ DO_3S_FP_GVEC(VABD, gen_helper_gvec_fabd_s)
-+    uint64_t data;
++DO_3S_FP_GVEC(VMUL, gen_helper_gvec_fmul_s)
 +    MemTxResult res;
 +
-+    if (!tz_ppc_check(s, n, attrs)) {
++/*
-+        trace_tz_ppc_read_blocked(n, addr, attrs.secure, attrs.user);
++ * For all the functions using this macro, size == 1 means fp16,
-+        if (s->cfg_sec_resp) {
++ * which is an architecture extension we don't implement yet.
-+            return MEMTX_ERROR;
++ */
-+        } else {
++#define DO_3S_FP(INSN,FUNC,READS_VD)                                \
-+            *pdata = 0;
++    static bool trans_##INSN##_fp_3s(DisasContext *s, arg_3same *a) \
-+            return MEMTX_OK;
++    {                                                               \
-+        }
++        if (a->size != 0) {                                         \
 +            /* TODO fp16 support */                                 \
 +            return false;                                           \
 +        }                                                           \
 +        return do_3same_fp(s, a, FUNC, READS_VD);                   \
 +    }
 +
-+    switch (size) {
++static void gen_VMLA_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
-+    case 1:
++                            TCGv_ptr fpstatus)
-+        data = address_space_ldub(as, addr, attrs, &res);
++{
-+        break;
++    gen_helper_vfp_muls(vn, vn, vm, fpstatus);
-+    case 2:
++    gen_helper_vfp_adds(vd, vd, vn, fpstatus);
 +        data = address_space_lduw_le(as, addr, attrs, &res);
 +        break;
 +    case 4:
 +        data = address_space_ldl_le(as, addr, attrs, &res);
 +        break;
 +    case 8:
 +        data = address_space_ldq_le(as, addr, attrs, &res);
 +        break;
 +    default:
 +        g_assert_not_reached();
 +    }
 +    *pdata = data;
 +    return res;
 +}
 +
-+static MemTxResult tz_ppc_write(void *opaque, hwaddr addr, uint64_t val,
++static void gen_VMLS_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
-+                                unsigned size, MemTxAttrs attrs)
++                            TCGv_ptr fpstatus)
 +{
-+    TZPPCPort *p = opaque;
++    gen_helper_vfp_muls(vn, vn, vm, fpstatus);
-+    TZPPC *s = p->ppc;
++    gen_helper_vfp_subs(vd, vd, vn, fpstatus);
 +    AddressSpace *as = &p->downstream_as;
 +    int n = p - s->port;
 +    MemTxResult res;
 +
 +    if (!tz_ppc_check(s, n, attrs)) {
 +        trace_tz_ppc_write_blocked(n, addr, attrs.secure, attrs.user);
 +        if (s->cfg_sec_resp) {
 +            return MEMTX_ERROR;
 +        } else {
 +            return MEMTX_OK;
 +        }
 +    }
 +
 +    switch (size) {
 +    case 1:
 +        address_space_stb(as, addr, val, attrs, &res);
 +        break;
 +    case 2:
 +        address_space_stw_le(as, addr, val, attrs, &res);
 +        break;
 +    case 4:
 +        address_space_stl_le(as, addr, val, attrs, &res);
 +        break;
 +    case 8:
 +        address_space_stq_le(as, addr, val, attrs, &res);
 +        break;
 +    default:
 +        g_assert_not_reached();
 +    }
 +    return res;
 +}
 +
-+static const MemoryRegionOps tz_ppc_ops = {
++DO_3S_FP(VMLA, gen_VMLA_fp_3s, true)
-+    .read_with_attrs = tz_ppc_read,
++DO_3S_FP(VMLS, gen_VMLS_fp_3s, true)
-+    .write_with_attrs = tz_ppc_write,
-+    .endianness = DEVICE_LITTLE_ENDIAN,
+ static bool do_3same_fp_pair(DisasContext *s, arg_3same *a, VFPGen3OpSPFn *fn)
-+};
+ {
-+
+diff --git a/target/arm/translate.c b/target/arm/translate.c
 +static void tz_ppc_reset(DeviceState *dev)
 +{
 +    TZPPC *s = TZ_PPC(dev);
 +
 +    trace_tz_ppc_reset();
 +    s->cfg_sec_resp = false;
 +    memset(s->cfg_nonsec, 0, sizeof(s->cfg_nonsec));
 +    memset(s->cfg_ap, 0, sizeof(s->cfg_ap));
 +}
 +
 +static void tz_ppc_init(Object *obj)
 +{
 +    DeviceState *dev = DEVICE(obj);
 +    TZPPC *s = TZ_PPC(obj);
 +
 +    qdev_init_gpio_in_named(dev, tz_ppc_cfg_nonsec, "cfg_nonsec", TZ_NUM_PORTS);
 +    qdev_init_gpio_in_named(dev, tz_ppc_cfg_ap, "cfg_ap", TZ_NUM_PORTS);
 +    qdev_init_gpio_in_named(dev, tz_ppc_cfg_sec_resp, "cfg_sec_resp", 1);
 +    qdev_init_gpio_in_named(dev, tz_ppc_irq_enable, "irq_enable", 1);
 +    qdev_init_gpio_in_named(dev, tz_ppc_irq_clear, "irq_clear", 1);
 +    qdev_init_gpio_out_named(dev, &s->irq, "irq", 1);
 +}
 +
 +static void tz_ppc_realize(DeviceState *dev, Error **errp)
 +{
 +    Object *obj = OBJECT(dev);
 +    SysBusDevice *sbd = SYS_BUS_DEVICE(dev);
 +    TZPPC *s = TZ_PPC(dev);
 +    int i;
 +
 +    /* We can't create the upstream end of the port until realize,
 +     * as we don't know the size of the MR used as the downstream until then.
 +     */
 +    for (i = 0; i < TZ_NUM_PORTS; i++) {
 +        TZPPCPort *port = &s->port[i];
 +        char *name;
 +        uint64_t size;
 +
 +        if (!port->downstream) {
 +            continue;
 +        }
 +
 +        name = g_strdup_printf("tz-ppc-port[%d]", i);
 +
 +        port->ppc = s;
 +        address_space_init(&port->downstream_as, port->downstream, name);
 +
 +        size = memory_region_size(port->downstream);
 +        memory_region_init_io(&port->upstream, obj, &tz_ppc_ops,
 +                              port, name, size);
 +        sysbus_init_mmio(sbd, &port->upstream);
 +        g_free(name);
 +    }
 +}
 +
 +static const VMStateDescription tz_ppc_vmstate = {
 +    .name = "tz-ppc",
 +    .version_id = 1,
 +    .minimum_version_id = 1,
 +    .fields = (VMStateField[]) {
 +        VMSTATE_BOOL_ARRAY(cfg_nonsec, TZPPC, 16),
 +        VMSTATE_BOOL_ARRAY(cfg_ap, TZPPC, 16),
 +        VMSTATE_BOOL(cfg_sec_resp, TZPPC),
 +        VMSTATE_BOOL(irq_enable, TZPPC),
 +        VMSTATE_BOOL(irq_clear, TZPPC),
 +        VMSTATE_BOOL(irq_status, TZPPC),
 +        VMSTATE_END_OF_LIST()
 +    }
 +};
 +
 +#define DEFINE_PORT(N)                                          \
 +    DEFINE_PROP_LINK("port[" #N "]", TZPPC, port[N].downstream, \
 +                     TYPE_MEMORY_REGION, MemoryRegion *)
 +
 +static Property tz_ppc_properties[] = {
 +    DEFINE_PROP_UINT32("NONSEC_MASK", TZPPC, nonsec_mask, 0),
 +    DEFINE_PORT(0),
 +    DEFINE_PORT(1),
 +    DEFINE_PORT(2),
 +    DEFINE_PORT(3),
 +    DEFINE_PORT(4),
 +    DEFINE_PORT(5),
 +    DEFINE_PORT(6),
 +    DEFINE_PORT(7),
 +    DEFINE_PORT(8),
 +    DEFINE_PORT(9),
 +    DEFINE_PORT(10),
 +    DEFINE_PORT(11),
 +    DEFINE_PORT(12),
 +    DEFINE_PORT(13),
 +    DEFINE_PORT(14),
 +    DEFINE_PORT(15),
 +    DEFINE_PROP_END_OF_LIST(),
 +};
 +
 +static void tz_ppc_class_init(ObjectClass *klass, void *data)
 +{
 +    DeviceClass *dc = DEVICE_CLASS(klass);
 +
 +    dc->realize = tz_ppc_realize;
 +    dc->vmsd = &tz_ppc_vmstate;
 +    dc->reset = tz_ppc_reset;
 +    dc->props = tz_ppc_properties;
 +}
 +
 +static const TypeInfo tz_ppc_info = {
 +    .name = TYPE_TZ_PPC,
 +    .parent = TYPE_SYS_BUS_DEVICE,
 +    .instance_size = sizeof(TZPPC),
 +    .instance_init = tz_ppc_init,
 +    .class_init = tz_ppc_class_init,
 +};
 +
 +static void tz_ppc_register_types(void)
 +{
 +    type_register_static(&tz_ppc_info);
 +}
 +
 +type_init(tz_ppc_register_types);
 diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
 index XXXXXXX..XXXXXXX 100644
---- a/default-configs/arm-softmmu.mak
+--- a/target/arm/translate.c
-+++ b/default-configs/arm-softmmu.mak
++++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ CONFIG_CMSDK_APB_UART=y
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
- CONFIG_MPS2_FPGAIO=y
+         case NEON_3R_VPADD_VQRDMLAH:
- CONFIG_MPS2_SCC=y
+         case NEON_3R_VQDMULH_VQRDMULH:
+         case NEON_3R_FLOAT_ARITH:
-+CONFIG_TZ_PPC=y
++        case NEON_3R_FLOAT_MULTIPLY:
-+
+             /* Already handled by decodetree */
- CONFIG_VERSATILE_PCI=y
+             return 1;
- CONFIG_VERSATILE_I2C=y
+         }
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-diff --git a/hw/misc/trace-events b/hw/misc/trace-events
+         tmp = neon_load_reg(rn, pass);
-index XXXXXXX..XXXXXXX 100644
+         tmp2 = neon_load_reg(rm, pass);
---- a/hw/misc/trace-events
+         switch (op) {
-+++ b/hw/misc/trace-events
+-        case NEON_3R_FLOAT_MULTIPLY:
-@@ -XXX,XX +XXX,XX @@ mos6522_get_next_irq_time(uint16_t latch, int64_t d, int64_t delta) "latch=%d co
+-        {
- mos6522_set_sr_int(void) "set sr_int"
+-            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
- mos6522_write(uint64_t addr, uint64_t val) "reg=0x%"PRIx64 " val=0x%"PRIx64
+-            gen_helper_vfp_muls(tmp, tmp, tmp2, fpstatus);
- mos6522_read(uint64_t addr, unsigned val) "reg=0x%"PRIx64 " val=0x%x"
+-            if (!u) {
-+
+-                tcg_temp_free_i32(tmp2);
-+# hw/misc/tz-ppc.c
+-                tmp2 = neon_load_reg(rd, pass);
-+tz_ppc_reset(void) "TZ PPC: reset"
+-                if (size == 0) {
-+tz_ppc_cfg_nonsec(int n, int level) "TZ PPC: cfg_nonsec[%d] = %d"
+-                    gen_helper_vfp_adds(tmp, tmp, tmp2, fpstatus);
-+tz_ppc_cfg_ap(int n, int level) "TZ PPC: cfg_ap[%d] = %d"
+-                } else {
-+tz_ppc_cfg_sec_resp(int level) "TZ PPC: cfg_sec_resp = %d"
+-                    gen_helper_vfp_subs(tmp, tmp2, tmp, fpstatus);
-+tz_ppc_irq_enable(int level) "TZ PPC: int_enable = %d"
+-                }
-+tz_ppc_irq_clear(int level) "TZ PPC: int_clear = %d"
+-            }
-+tz_ppc_update_irq(int level) "TZ PPC: setting irq line to %d"
+-            tcg_temp_free_ptr(fpstatus);
-+tz_ppc_read_blocked(int n, hwaddr offset, bool secure, bool user) "TZ PPC: port %d offset 0x%" HWADDR_PRIx " read (secure %d user %d) blocked"
+-            break;
-+tz_ppc_write_blocked(int n, hwaddr offset, bool secure, bool user) "TZ PPC: port %d offset 0x%" HWADDR_PRIx " write (secure %d user %d) blocked"
+-        }
          case NEON_3R_FLOAT_CMP:
          {
              TCGv_ptr fpstatus = get_fpstatus_ptr(1);
 --
-.16.2
+.20.1

-[Qemu-devel] [PULL 07/39] hw/arm/armv7m: Honour CPU's address space for image loads
+[PULL 42/45] target/arm: Convert Neon 3-reg-same compare insns to decodetree
-Instead of loading guest images to the system address space, use the
+Convert the Neon fp 3-reg-same compare insns VCGE, VCGT,
-CPU's address space.  This is important if we're trying to load the
+VCEQ, VACGE and VACGT to decodetree.
 file to memory or via an alias memory region that is provided by an
 SoC object and thus not mapped into the system address space.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20180220180325.29818-4-peter.maydell@linaro.org
+Message-id: 20200512163904.10918-15-peter.maydell@linaro.org
 ---
- hw/arm/armv7m.c | 17 ++++++++++++++---
+ target/arm/neon-dp.decode       |  5 +++++
-file changed, 14 insertions(+), 3 deletions(-)
+ target/arm/translate-neon.inc.c |  6 +++++
  target/arm/translate.c          | 39 ++-------------------------------
 files changed, 13 insertions(+), 37 deletions(-)
-diff --git a/hw/arm/armv7m.c b/hw/arm/armv7m.c
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/armv7m.c
+--- a/target/arm/neon-dp.decode
-+++ b/hw/arm/armv7m.c
++++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ void armv7m_load_kernel(ARMCPU *cpu, const char *kernel_filename, int mem_size)
+@@ -XXX,XX +XXX,XX @@ VABD_fp_3s       1111 001 1 0 . 1 . .... .... 1101 ... 0 .... @3same_fp
-     uint64_t entry;
+ VMLA_fp_3s       1111 001 0 0 . 0 . .... .... 1101 ... 1 .... @3same_fp
-     uint64_t lowaddr;
+ VMLS_fp_3s       1111 001 0 0 . 1 . .... .... 1101 ... 1 .... @3same_fp
-     int big_endian;
+ VMUL_fp_3s       1111 001 1 0 . 0 . .... .... 1101 ... 1 .... @3same_fp
-+    AddressSpace *as;
++VCEQ_fp_3s       1111 001 0 0 . 0 . .... .... 1110 ... 0 .... @3same_fp
-+    int asidx;
++VCGE_fp_3s       1111 001 1 0 . 0 . .... .... 1110 ... 0 .... @3same_fp
-+    CPUState *cs = CPU(cpu);
++VACGE_fp_3s      1111 001 1 0 . 0 . .... .... 1110 ... 1 .... @3same_fp
++VCGT_fp_3s       1111 001 1 0 . 1 . .... .... 1110 ... 0 .... @3same_fp
- #ifdef TARGET_WORDS_BIGENDIAN
++VACGT_fp_3s      1111 001 1 0 . 1 . .... .... 1110 ... 1 .... @3same_fp
-     big_endian = 1;
+ VPMAX_fp_3s      1111 001 1 0 . 0 . .... .... 1111 ... 0 .... @3same_fp_q0
-@@ -XXX,XX +XXX,XX @@ void armv7m_load_kernel(ARMCPU *cpu, const char *kernel_filename, int mem_size)
+ VPMIN_fp_3s      1111 001 1 0 . 1 . .... .... 1111 ... 0 .... @3same_fp_q0
-         exit(1);
+diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_3S_FP_GVEC(VMUL, gen_helper_gvec_fmul_s)
          return do_3same_fp(s, a, FUNC, READS_VD);                   \
      }
-+    if (arm_feature(&cpu->env, ARM_FEATURE_EL3)) {
++DO_3S_FP(VCEQ, gen_helper_neon_ceq_f32, false)
-+        asidx = ARMASIdx_S;
++DO_3S_FP(VCGE, gen_helper_neon_cge_f32, false)
-+    } else {
++DO_3S_FP(VCGT, gen_helper_neon_cgt_f32, false)
-+        asidx = ARMASIdx_NS;
++DO_3S_FP(VACGE, gen_helper_neon_acge_f32, false)
-+    }
++DO_3S_FP(VACGT, gen_helper_neon_acgt_f32, false)
 +    as = cpu_get_address_space(cs, asidx);
 +
-     if (kernel_filename) {
+ static void gen_VMLA_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
--        image_size = load_elf(kernel_filename, NULL, NULL, &entry, &lowaddr,
+                             TCGv_ptr fpstatus)
--                              NULL, big_endian, EM_ARM, 1, 0);
+ {
-+        image_size = load_elf_as(kernel_filename, NULL, NULL, &entry, &lowaddr,
+diff --git a/target/arm/translate.c b/target/arm/translate.c
-+                                 NULL, big_endian, EM_ARM, 1, 0, as);
+index XXXXXXX..XXXXXXX 100644
-         if (image_size < 0) {
+--- a/target/arm/translate.c
--            image_size = load_image_targphys(kernel_filename, 0, mem_size);
++++ b/target/arm/translate.c
-+            image_size = load_image_targphys_as(kernel_filename, 0,
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-+                                                mem_size, as);
+         case NEON_3R_VQDMULH_VQRDMULH:
-             lowaddr = 0;
+         case NEON_3R_FLOAT_ARITH:
          case NEON_3R_FLOAT_MULTIPLY:
 +        case NEON_3R_FLOAT_CMP:
 +        case NEON_3R_FLOAT_ACMP:
              /* Already handled by decodetree */
              return 1;
          }
-         if (image_size < 0) {
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                  return 1; /* VPMIN/VPMAX handled by decodetree */
              }
              break;
 -        case NEON_3R_FLOAT_CMP:
 -            if (!u && size) {
 -                /* no encoding for U=0 C=1x */
 -                return 1;
 -            }
 -            break;
 -        case NEON_3R_FLOAT_ACMP:
 -            if (!u) {
 -                return 1;
 -            }
 -            break;
          case NEON_3R_FLOAT_MISC:
              /* VMAXNM/VMINNM in ARMv8 */
              if (u && !arm_dc_feature(s, ARM_FEATURE_V8)) {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          tmp = neon_load_reg(rn, pass);
          tmp2 = neon_load_reg(rm, pass);
          switch (op) {
 -        case NEON_3R_FLOAT_CMP:
 -        {
 -            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
 -            if (!u) {
 -                gen_helper_neon_ceq_f32(tmp, tmp, tmp2, fpstatus);
 -            } else {
 -                if (size == 0) {
 -                    gen_helper_neon_cge_f32(tmp, tmp, tmp2, fpstatus);
 -                } else {
 -                    gen_helper_neon_cgt_f32(tmp, tmp, tmp2, fpstatus);
 -                }
 -            }
 -            tcg_temp_free_ptr(fpstatus);
 -            break;
 -        }
 -        case NEON_3R_FLOAT_ACMP:
 -        {
 -            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
 -            if (size == 0) {
 -                gen_helper_neon_acge_f32(tmp, tmp, tmp2, fpstatus);
 -            } else {
 -                gen_helper_neon_acgt_f32(tmp, tmp, tmp2, fpstatus);
 -            }
 -            tcg_temp_free_ptr(fpstatus);
 -            break;
 -        }
          case NEON_3R_FLOAT_MINMAX:
          {
              TCGv_ptr fpstatus = get_fpstatus_ptr(1);
 --
-.16.2
+.20.1

-[Qemu-devel] [PULL 15/39] qdev: Add new qdev_init_gpio_in_named_with_opaque()
+[PULL 43/45] target/arm: Move 'env' argument of recps_f32 and rsqrts_f32 helpers to usual place
-The function qdev_init_gpio_in_named() passes the DeviceState pointer
+The usual location for the env argument in the argument list of a TCG helper
-as the opaque data pointer for the irq handler function.  Usually
+is immediately after the return-value argument. recps_f32 and rsqrts_f32
-this is what you want, but in some cases it would be helpful to use
+differ in that they put it at the end.
 some other data pointer.
-Add a new function qdev_init_gpio_in_named_with_opaque() which allows
+Move the env argument to its usual place; this will allow us to
-the caller to specify the data pointer they want.
+more easily use these helper functions with the gvec APIs.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20180220180325.29818-12-peter.maydell@linaro.org
+Message-id: 20200512163904.10918-16-peter.maydell@linaro.org
 ---
- include/hw/qdev-core.h | 30 ++++++++++++++++++++++++++++--
+ target/arm/helper.h     | 4 ++--
- hw/core/qdev.c         |  8 +++++---
+ target/arm/translate.c  | 4 ++--
-files changed, 33 insertions(+), 5 deletions(-)
+ target/arm/vfp_helper.c | 4 ++--
 files changed, 6 insertions(+), 6 deletions(-)
-diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
+diff --git a/target/arm/helper.h b/target/arm/helper.h
 index XXXXXXX..XXXXXXX 100644
---- a/include/hw/qdev-core.h
+--- a/target/arm/helper.h
-+++ b/include/hw/qdev-core.h
++++ b/target/arm/helper.h
-@@ -XXX,XX +XXX,XX @@ BusState *qdev_get_child_bus(DeviceState *dev, const char *name);
+@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(vfp_fcvt_f64_to_f16, TCG_CALL_NO_RWG, f16, f64, ptr, i32)
- /* GPIO inputs also double as IRQ sinks.  */
+ DEF_HELPER_4(vfp_muladdd, f64, f64, f64, f64, ptr)
- void qdev_init_gpio_in(DeviceState *dev, qemu_irq_handler handler, int n);
+ DEF_HELPER_4(vfp_muladds, f32, f32, f32, f32, ptr)
- void qdev_init_gpio_out(DeviceState *dev, qemu_irq *pins, int n);
--void qdev_init_gpio_in_named(DeviceState *dev, qemu_irq_handler handler,
+-DEF_HELPER_3(recps_f32, f32, f32, f32, env)
--                             const char *name, int n);
+-DEF_HELPER_3(rsqrts_f32, f32, f32, f32, env)
- void qdev_init_gpio_out_named(DeviceState *dev, qemu_irq *pins,
++DEF_HELPER_3(recps_f32, f32, env, f32, f32)
-                               const char *name, int n);
++DEF_HELPER_3(rsqrts_f32, f32, env, f32, f32)
-+/**
+ DEF_HELPER_FLAGS_2(recpe_f16, TCG_CALL_NO_RWG, f16, f16, ptr)
-+ * qdev_init_gpio_in_named_with_opaque: create an array of input GPIO lines
+ DEF_HELPER_FLAGS_2(recpe_f32, TCG_CALL_NO_RWG, f32, f32, ptr)
-+ *   for the specified device
+ DEF_HELPER_FLAGS_2(recpe_f64, TCG_CALL_NO_RWG, f64, f64, ptr)
-+ *
+diff --git a/target/arm/translate.c b/target/arm/translate.c
 + * @dev: Device to create input GPIOs for
 + * @handler: Function to call when GPIO line value is set
 + * @opaque: Opaque data pointer to pass to @handler
 + * @name: Name of the GPIO input (must be unique for this device)
 + * @n: Number of GPIO lines in this input set
 + */
 +void qdev_init_gpio_in_named_with_opaque(DeviceState *dev,
 +                                         qemu_irq_handler handler,
 +                                         void *opaque,
 +                                         const char *name, int n);
 +
 +/**
 + * qdev_init_gpio_in_named: create an array of input GPIO lines
 + *   for the specified device
 + *
 + * Like qdev_init_gpio_in_named_with_opaque(), but the opaque pointer
 + * passed to the handler is @dev (which is the most commonly desired behaviour).
 + */
 +static inline void qdev_init_gpio_in_named(DeviceState *dev,
 +                                           qemu_irq_handler handler,
 +                                           const char *name, int n)
 +{
 +    qdev_init_gpio_in_named_with_opaque(dev, handler, dev, name, n);
 +}
  void qdev_pass_gpios(DeviceState *dev, DeviceState *container,
                       const char *name);
 diff --git a/hw/core/qdev.c b/hw/core/qdev.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/core/qdev.c
+--- a/target/arm/translate.c
-+++ b/hw/core/qdev.c
++++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static NamedGPIOList *qdev_get_named_gpio_list(DeviceState *dev,
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-     return ngl;
+                 tcg_temp_free_ptr(fpstatus);
              } else {
                  if (size == 0) {
 -                    gen_helper_recps_f32(tmp, tmp, tmp2, cpu_env);
 +                    gen_helper_recps_f32(tmp, cpu_env, tmp, tmp2);
                  } else {
 -                    gen_helper_rsqrts_f32(tmp, tmp, tmp2, cpu_env);
 +                    gen_helper_rsqrts_f32(tmp, cpu_env, tmp, tmp2);
                  }
              }
              break;
 diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/vfp_helper.c
 +++ b/target/arm/vfp_helper.c
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(vfp_fcvt_f64_to_f16)(float64 a, void *fpstp, uint32_t ahp_mode)
  #define float32_three make_float32(0x40400000)
  #define float32_one_point_five make_float32(0x3fc00000)
 -float32 HELPER(recps_f32)(float32 a, float32 b, CPUARMState *env)
 +float32 HELPER(recps_f32)(CPUARMState *env, float32 a, float32 b)
  {
      float_status *s = &env->vfp.standard_fp_status;
      if ((float32_is_infinity(a) && float32_is_zero_or_denormal(b)) ||
@@ -XXX,XX +XXX,XX @@ float32 HELPER(recps_f32)(float32 a, float32 b, CPUARMState *env)
      return float32_sub(float32_two, float32_mul(a, b, s), s);
  }
--void qdev_init_gpio_in_named(DeviceState *dev, qemu_irq_handler handler,
+-float32 HELPER(rsqrts_f32)(float32 a, float32 b, CPUARMState *env)
--                             const char *name, int n)
++float32 HELPER(rsqrts_f32)(CPUARMState *env, float32 a, float32 b)
 +void qdev_init_gpio_in_named_with_opaque(DeviceState *dev,
 +                                         qemu_irq_handler handler,
 +                                         void *opaque,
 +                                         const char *name, int n)
  {
-     int i;
+     float_status *s = &env->vfp.standard_fp_status;
-     NamedGPIOList *gpio_list = qdev_get_named_gpio_list(dev, name);
+     float32 product;
      assert(gpio_list->num_out == 0 || !name);
      gpio_list->in = qemu_extend_irqs(gpio_list->in, gpio_list->num_in, handler,
 -                                     dev, n);
 +                                     opaque, n);
      if (!name) {
          name = "unnamed-gpio-in";
 --
-.16.2
+.20.1

-[Qemu-devel] [PULL 19/39] hw/misc/iotkit-secctl: Arm IoT Kit security controller initial skeleton
+[PULL 44/45] target/arm: Convert Neon fp VMAX/VMIN/VMAXNM/VMINNM/VRECPS/VRSQRTS to decodetree
-The Arm IoT Kit includes a "security controller" which is largely a
+Convert the Neon fp VMAX/VMIN/VMAXNM/VMINNM/VRECPS/VRSQRTS 3-reg-same
-collection of registers for controlling the PPCs and other bits of
+insns to decodetree. (These are all the remaining non-accumulation
-glue in the system.  This commit provides the initial skeleton of the
+instructions in this group.)
 device, implementing just the ID registers, and a couple of read-only
 read-as-zero registers.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20180220180325.29818-16-peter.maydell@linaro.org
+Message-id: 20200512163904.10918-17-peter.maydell@linaro.org
 ---
- hw/misc/Makefile.objs           |   1 +
+ target/arm/neon-dp.decode       |  6 +++
- include/hw/misc/iotkit-secctl.h |  39 ++++
+ target/arm/translate-neon.inc.c | 70 +++++++++++++++++++++++++++++++++
- hw/misc/iotkit-secctl.c         | 448 ++++++++++++++++++++++++++++++++++++++++
+ target/arm/translate.c          | 42 +-------------------
- default-configs/arm-softmmu.mak |   1 +
+files changed, 78 insertions(+), 40 deletions(-)
  hw/misc/trace-events            |   7 +
 files changed, 496 insertions(+)
  create mode 100644 include/hw/misc/iotkit-secctl.h
  create mode 100644 hw/misc/iotkit-secctl.c
-diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
---- a/hw/misc/Makefile.objs
+--- a/target/arm/neon-dp.decode
-+++ b/hw/misc/Makefile.objs
++++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ obj-$(CONFIG_MPS2_FPGAIO) += mps2-fpgaio.o
+@@ -XXX,XX +XXX,XX @@ VCGE_fp_3s       1111 001 1 0 . 0 . .... .... 1110 ... 0 .... @3same_fp
- obj-$(CONFIG_MPS2_SCC) += mps2-scc.o
+ VACGE_fp_3s      1111 001 1 0 . 0 . .... .... 1110 ... 1 .... @3same_fp
+ VCGT_fp_3s       1111 001 1 0 . 1 . .... .... 1110 ... 0 .... @3same_fp
- obj-$(CONFIG_TZ_PPC) += tz-ppc.o
+ VACGT_fp_3s      1111 001 1 0 . 1 . .... .... 1110 ... 1 .... @3same_fp
-+obj-$(CONFIG_IOTKIT_SECCTL) += iotkit-secctl.o
++VMAX_fp_3s       1111 001 0 0 . 0 . .... .... 1111 ... 0 .... @3same_fp
++VMIN_fp_3s       1111 001 0 0 . 1 . .... .... 1111 ... 0 .... @3same_fp
- obj-$(CONFIG_PVPANIC) += pvpanic.o
+ VPMAX_fp_3s      1111 001 1 0 . 0 . .... .... 1111 ... 0 .... @3same_fp_q0
- obj-$(CONFIG_HYPERV_TESTDEV) += hyperv_testdev.o
+ VPMIN_fp_3s      1111 001 1 0 . 1 . .... .... 1111 ... 0 .... @3same_fp_q0
-diff --git a/include/hw/misc/iotkit-secctl.h b/include/hw/misc/iotkit-secctl.h
++VRECPS_fp_3s     1111 001 0 0 . 0 . .... .... 1111 ... 1 .... @3same_fp
-new file mode 100644
++VRSQRTS_fp_3s    1111 001 0 0 . 1 . .... .... 1111 ... 1 .... @3same_fp
-index XXXXXXX..XXXXXXX
++VMAXNM_fp_3s     1111 001 1 0 . 0 . .... .... 1111 ... 1 .... @3same_fp
---- /dev/null
++VMINNM_fp_3s     1111 001 1 0 . 1 . .... .... 1111 ... 1 .... @3same_fp
-+++ b/include/hw/misc/iotkit-secctl.h
+diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
-@@ -XXX,XX +XXX,XX @@
+index XXXXXXX..XXXXXXX 100644
-+/*
+--- a/target/arm/translate-neon.inc.c
-+ * ARM IoT Kit security controller
++++ b/target/arm/translate-neon.inc.c
-+ *
+@@ -XXX,XX +XXX,XX @@ DO_3S_FP(VCGE, gen_helper_neon_cge_f32, false)
-+ * Copyright (c) 2018 Linaro Limited
+ DO_3S_FP(VCGT, gen_helper_neon_cgt_f32, false)
-+ * Written by Peter Maydell
+ DO_3S_FP(VACGE, gen_helper_neon_acge_f32, false)
-+ *
+ DO_3S_FP(VACGT, gen_helper_neon_acgt_f32, false)
-+ * This program is free software; you can redistribute it and/or modify
++DO_3S_FP(VMAX, gen_helper_vfp_maxs, false)
-+ * it under the terms of the GNU General Public License version 2 or
++DO_3S_FP(VMIN, gen_helper_vfp_mins, false)
-+ * (at your option) any later version.
-+ */
+ static void gen_VMLA_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
-+
+                             TCGv_ptr fpstatus)
-+/* This is a model of the security controller which is part of the
+@@ -XXX,XX +XXX,XX @@ static void gen_VMLS_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
-+ * Arm IoT Kit and documented in
+ DO_3S_FP(VMLA, gen_VMLA_fp_3s, true)
-+ * http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ecm0601256/index.html
+ DO_3S_FP(VMLS, gen_VMLS_fp_3s, true)
-+ *
-+ * QEMU interface:
++static bool trans_VMAXNM_fp_3s(DisasContext *s, arg_3same *a)
 + *  + sysbus MMIO region 0 is the "secure privilege control block" registers
 + *  + sysbus MMIO region 1 is the "non-secure privilege control block" registers
 + */
 +
 +#ifndef IOTKIT_SECCTL_H
 +#define IOTKIT_SECCTL_H
 +
 +#include "hw/sysbus.h"
 +
 +#define TYPE_IOTKIT_SECCTL "iotkit-secctl"
 +#define IOTKIT_SECCTL(obj) OBJECT_CHECK(IoTKitSecCtl, (obj), TYPE_IOTKIT_SECCTL)
 +
 +typedef struct IoTKitSecCtl {
 +    /*< private >*/
 +    SysBusDevice parent_obj;
 +
 +    /*< public >*/
 +
 +    MemoryRegion s_regs;
 +    MemoryRegion ns_regs;
 +} IoTKitSecCtl;
 +
 +#endif
 diff --git a/hw/misc/iotkit-secctl.c b/hw/misc/iotkit-secctl.c
 new file mode 100644
 index XXXXXXX..XXXXXXX
 --- /dev/null
 +++ b/hw/misc/iotkit-secctl.c
@@ -XXX,XX +XXX,XX @@
 +/*
 + * Arm IoT Kit security controller
 + *
 + * Copyright (c) 2018 Linaro Limited
 + * Written by Peter Maydell
 + *
 + * This program is free software; you can redistribute it and/or modify
 + * it under the terms of the GNU General Public License version 2 or
 + * (at your option) any later version.
 + */
 +
 +#include "qemu/osdep.h"
 +#include "qemu/log.h"
 +#include "qapi/error.h"
 +#include "trace.h"
 +#include "hw/sysbus.h"
 +#include "hw/registerfields.h"
 +#include "hw/misc/iotkit-secctl.h"
 +
 +/* Registers in the secure privilege control block */
 +REG32(SECRESPCFG, 0x10)
 +REG32(NSCCFG, 0x14)
 +REG32(SECMPCINTSTATUS, 0x1c)
 +REG32(SECPPCINTSTAT, 0x20)
 +REG32(SECPPCINTCLR, 0x24)
 +REG32(SECPPCINTEN, 0x28)
 +REG32(SECMSCINTSTAT, 0x30)
 +REG32(SECMSCINTCLR, 0x34)
 +REG32(SECMSCINTEN, 0x38)
 +REG32(BRGINTSTAT, 0x40)
 +REG32(BRGINTCLR, 0x44)
 +REG32(BRGINTEN, 0x48)
 +REG32(AHBNSPPC0, 0x50)
 +REG32(AHBNSPPCEXP0, 0x60)
 +REG32(AHBNSPPCEXP1, 0x64)
 +REG32(AHBNSPPCEXP2, 0x68)
 +REG32(AHBNSPPCEXP3, 0x6c)
 +REG32(APBNSPPC0, 0x70)
 +REG32(APBNSPPC1, 0x74)
 +REG32(APBNSPPCEXP0, 0x80)
 +REG32(APBNSPPCEXP1, 0x84)
 +REG32(APBNSPPCEXP2, 0x88)
 +REG32(APBNSPPCEXP3, 0x8c)
 +REG32(AHBSPPPC0, 0x90)
 +REG32(AHBSPPPCEXP0, 0xa0)
 +REG32(AHBSPPPCEXP1, 0xa4)
 +REG32(AHBSPPPCEXP2, 0xa8)
 +REG32(AHBSPPPCEXP3, 0xac)
 +REG32(APBSPPPC0, 0xb0)
 +REG32(APBSPPPC1, 0xb4)
 +REG32(APBSPPPCEXP0, 0xc0)
 +REG32(APBSPPPCEXP1, 0xc4)
 +REG32(APBSPPPCEXP2, 0xc8)
 +REG32(APBSPPPCEXP3, 0xcc)
 +REG32(NSMSCEXP, 0xd0)
 +REG32(PID4, 0xfd0)
 +REG32(PID5, 0xfd4)
 +REG32(PID6, 0xfd8)
 +REG32(PID7, 0xfdc)
 +REG32(PID0, 0xfe0)
 +REG32(PID1, 0xfe4)
 +REG32(PID2, 0xfe8)
 +REG32(PID3, 0xfec)
 +REG32(CID0, 0xff0)
 +REG32(CID1, 0xff4)
 +REG32(CID2, 0xff8)
 +REG32(CID3, 0xffc)
 +
 +/* Registers in the non-secure privilege control block */
 +REG32(AHBNSPPPC0, 0x90)
 +REG32(AHBNSPPPCEXP0, 0xa0)
 +REG32(AHBNSPPPCEXP1, 0xa4)
 +REG32(AHBNSPPPCEXP2, 0xa8)
 +REG32(AHBNSPPPCEXP3, 0xac)
 +REG32(APBNSPPPC0, 0xb0)
 +REG32(APBNSPPPC1, 0xb4)
 +REG32(APBNSPPPCEXP0, 0xc0)
 +REG32(APBNSPPPCEXP1, 0xc4)
 +REG32(APBNSPPPCEXP2, 0xc8)
 +REG32(APBNSPPPCEXP3, 0xcc)
 +/* PID and CID registers are also present in the NS block */
 +
 +static const uint8_t iotkit_secctl_s_idregs[] = {
 +    0x04, 0x00, 0x00, 0x00,
 +    0x52, 0xb8, 0x0b, 0x00,
 +    0x0d, 0xf0, 0x05, 0xb1,
 +};
 +
 +static const uint8_t iotkit_secctl_ns_idregs[] = {
 +    0x04, 0x00, 0x00, 0x00,
 +    0x53, 0xb8, 0x0b, 0x00,
 +    0x0d, 0xf0, 0x05, 0xb1,
 +};
 +
 +static MemTxResult iotkit_secctl_s_read(void *opaque, hwaddr addr,
 +                                        uint64_t *pdata,
 +                                        unsigned size, MemTxAttrs attrs)
 +{
-+    uint64_t r;
++    if (!arm_dc_feature(s, ARM_FEATURE_V8)) {
-+    uint32_t offset = addr & ~0x3;
++        return false;
 +
 +    switch (offset) {
 +    case A_AHBNSPPC0:
 +    case A_AHBSPPPC0:
 +        r = 0;
 +        break;
 +    case A_SECRESPCFG:
 +    case A_NSCCFG:
 +    case A_SECMPCINTSTATUS:
 +    case A_SECPPCINTSTAT:
 +    case A_SECPPCINTEN:
 +    case A_SECMSCINTSTAT:
 +    case A_SECMSCINTEN:
 +    case A_BRGINTSTAT:
 +    case A_BRGINTEN:
 +    case A_AHBNSPPCEXP0:
 +    case A_AHBNSPPCEXP1:
 +    case A_AHBNSPPCEXP2:
 +    case A_AHBNSPPCEXP3:
 +    case A_APBNSPPC0:
 +    case A_APBNSPPC1:
 +    case A_APBNSPPCEXP0:
 +    case A_APBNSPPCEXP1:
 +    case A_APBNSPPCEXP2:
 +    case A_APBNSPPCEXP3:
 +    case A_AHBSPPPCEXP0:
 +    case A_AHBSPPPCEXP1:
 +    case A_AHBSPPPCEXP2:
 +    case A_AHBSPPPCEXP3:
 +    case A_APBSPPPC0:
 +    case A_APBSPPPC1:
 +    case A_APBSPPPCEXP0:
 +    case A_APBSPPPCEXP1:
 +    case A_APBSPPPCEXP2:
 +    case A_APBSPPPCEXP3:
 +    case A_NSMSCEXP:
 +        qemu_log_mask(LOG_UNIMP,
 +                      "IoTKit SecCtl S block read: "
 +                      "unimplemented offset 0x%x\n", offset);
 +        r = 0;
 +        break;
 +    case A_PID4:
 +    case A_PID5:
 +    case A_PID6:
 +    case A_PID7:
 +    case A_PID0:
 +    case A_PID1:
 +    case A_PID2:
 +    case A_PID3:
 +    case A_CID0:
 +    case A_CID1:
 +    case A_CID2:
 +    case A_CID3:
 +        r = iotkit_secctl_s_idregs[(offset - A_PID4) / 4];
 +        break;
 +    case A_SECPPCINTCLR:
 +    case A_SECMSCINTCLR:
 +    case A_BRGINTCLR:
 +        qemu_log_mask(LOG_GUEST_ERROR,
 +                      "IotKit SecCtl S block read: write-only offset 0x%x\n",
 +                      offset);
 +        r = 0;
 +        break;
 +    default:
 +        qemu_log_mask(LOG_GUEST_ERROR,
 +                      "IotKit SecCtl S block read: bad offset 0x%x\n", offset);
 +        r = 0;
 +        break;
 +    }
 +
-+    if (size != 4) {
++    if (a->size != 0) {
-+        /* None of our registers are access-sensitive, so just pull the right
++        /* TODO fp16 support */
-+         * byte out of the word read result.
++        return false;
 +         */
 +        r = extract32(r, (addr & 3) * 8, size * 8);
 +    }
 +
-+    trace_iotkit_secctl_s_read(offset, r, size);
++    return do_3same_fp(s, a, gen_helper_vfp_maxnums, false);
 +    *pdata = r;
 +    return MEMTX_OK;
 +}
 +
-+static MemTxResult iotkit_secctl_s_write(void *opaque, hwaddr addr,
++static bool trans_VMINNM_fp_3s(DisasContext *s, arg_3same *a)
 +                                         uint64_t value,
 +                                         unsigned size, MemTxAttrs attrs)
 +{
-+    uint32_t offset = addr;
++    if (!arm_dc_feature(s, ARM_FEATURE_V8)) {
-+
++        return false;
 +    trace_iotkit_secctl_s_write(offset, value, size);
 +
 +    if (size != 4) {
 +        /* Byte and halfword writes are ignored */
 +        qemu_log_mask(LOG_GUEST_ERROR,
 +                      "IotKit SecCtl S block write: bad size, ignored\n");
 +        return MEMTX_OK;
 +    }
 +
-+    switch (offset) {
++    if (a->size != 0) {
-+    case A_SECRESPCFG:
++        /* TODO fp16 support */
-+    case A_NSCCFG:
++        return false;
 +    case A_SECPPCINTCLR:
 +    case A_SECPPCINTEN:
 +    case A_SECMSCINTCLR:
 +    case A_SECMSCINTEN:
 +    case A_BRGINTCLR:
 +    case A_BRGINTEN:
 +    case A_AHBNSPPCEXP0:
 +    case A_AHBNSPPCEXP1:
 +    case A_AHBNSPPCEXP2:
 +    case A_AHBNSPPCEXP3:
 +    case A_APBNSPPC0:
 +    case A_APBNSPPC1:
 +    case A_APBNSPPCEXP0:
 +    case A_APBNSPPCEXP1:
 +    case A_APBNSPPCEXP2:
 +    case A_APBNSPPCEXP3:
 +    case A_AHBSPPPCEXP0:
 +    case A_AHBSPPPCEXP1:
 +    case A_AHBSPPPCEXP2:
 +    case A_AHBSPPPCEXP3:
 +    case A_APBSPPPC0:
 +    case A_APBSPPPC1:
 +    case A_APBSPPPCEXP0:
 +    case A_APBSPPPCEXP1:
 +    case A_APBSPPPCEXP2:
 +    case A_APBSPPPCEXP3:
 +        qemu_log_mask(LOG_UNIMP,
 +                      "IoTKit SecCtl S block write: "
 +                      "unimplemented offset 0x%x\n", offset);
 +        break;
 +    case A_SECMPCINTSTATUS:
 +    case A_SECPPCINTSTAT:
 +    case A_SECMSCINTSTAT:
 +    case A_BRGINTSTAT:
 +    case A_AHBNSPPC0:
 +    case A_AHBSPPPC0:
 +    case A_NSMSCEXP:
 +    case A_PID4:
 +    case A_PID5:
 +    case A_PID6:
 +    case A_PID7:
 +    case A_PID0:
 +    case A_PID1:
 +    case A_PID2:
 +    case A_PID3:
 +    case A_CID0:
 +    case A_CID1:
 +    case A_CID2:
 +    case A_CID3:
 +        qemu_log_mask(LOG_GUEST_ERROR,
 +                      "IoTKit SecCtl S block write: "
 +                      "read-only offset 0x%x\n", offset);
 +        break;
 +    default:
 +        qemu_log_mask(LOG_GUEST_ERROR,
 +                      "IotKit SecCtl S block write: bad offset 0x%x\n",
 +                      offset);
 +        break;
 +    }
 +
-+    return MEMTX_OK;
++    return do_3same_fp(s, a, gen_helper_vfp_minnums, false);
 +}
 +
-+static MemTxResult iotkit_secctl_ns_read(void *opaque, hwaddr addr,
++WRAP_ENV_FN(gen_VRECPS_tramp, gen_helper_recps_f32)
-+                                         uint64_t *pdata,
++
-+                                         unsigned size, MemTxAttrs attrs)
++static void gen_VRECPS_fp_3s(unsigned vece, uint32_t rd_ofs,
 +                             uint32_t rn_ofs, uint32_t rm_ofs,
 +                             uint32_t oprsz, uint32_t maxsz)
 +{
-+    uint64_t r;
++    static const GVecGen3 ops = { .fni4 = gen_VRECPS_tramp };
-+    uint32_t offset = addr & ~0x3;
++    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &ops);
 +}
 +
-+    switch (offset) {
++static bool trans_VRECPS_fp_3s(DisasContext *s, arg_3same *a)
-+    case A_AHBNSPPPC0:
++{
-+        r = 0;
++    if (a->size != 0) {
-+        break;
++        /* TODO fp16 support */
-+    case A_AHBNSPPPCEXP0:
++        return false;
 +    case A_AHBNSPPPCEXP1:
 +    case A_AHBNSPPPCEXP2:
 +    case A_AHBNSPPPCEXP3:
 +    case A_APBNSPPPC0:
 +    case A_APBNSPPPC1:
 +    case A_APBNSPPPCEXP0:
 +    case A_APBNSPPPCEXP1:
 +    case A_APBNSPPPCEXP2:
 +    case A_APBNSPPPCEXP3:
 +        qemu_log_mask(LOG_UNIMP,
 +                      "IoTKit SecCtl NS block read: "
 +                      "unimplemented offset 0x%x\n", offset);
 +        break;
 +    case A_PID4:
 +    case A_PID5:
 +    case A_PID6:
 +    case A_PID7:
 +    case A_PID0:
 +    case A_PID1:
 +    case A_PID2:
 +    case A_PID3:
 +    case A_CID0:
 +    case A_CID1:
 +    case A_CID2:
 +    case A_CID3:
 +        r = iotkit_secctl_ns_idregs[(offset - A_PID4) / 4];
 +        break;
 +    default:
 +        qemu_log_mask(LOG_GUEST_ERROR,
 +                      "IotKit SecCtl NS block read: bad offset 0x%x\n",
 +                      offset);
 +        r = 0;
 +        break;
 +    }
 +
-+    if (size != 4) {
++    return do_3same(s, a, gen_VRECPS_fp_3s);
-+        /* None of our registers are access-sensitive, so just pull the right
++}
-+         * byte out of the word read result.
++
-+         */
++WRAP_ENV_FN(gen_VRSQRTS_tramp, gen_helper_rsqrts_f32)
-+        r = extract32(r, (addr & 3) * 8, size * 8);
++
 +static void gen_VRSQRTS_fp_3s(unsigned vece, uint32_t rd_ofs,
 +                              uint32_t rn_ofs, uint32_t rm_ofs,
 +                              uint32_t oprsz, uint32_t maxsz)
 +{
 +    static const GVecGen3 ops = { .fni4 = gen_VRSQRTS_tramp };
 +    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &ops);
 +}
 +
 +static bool trans_VRSQRTS_fp_3s(DisasContext *s, arg_3same *a)
 +{
 +    if (a->size != 0) {
 +        /* TODO fp16 support */
 +        return false;
 +    }
 +
-+    trace_iotkit_secctl_ns_read(offset, r, size);
++    return do_3same(s, a, gen_VRSQRTS_fp_3s);
 +    *pdata = r;
 +    return MEMTX_OK;
 +}
 +
-+static MemTxResult iotkit_secctl_ns_write(void *opaque, hwaddr addr,
+ static bool do_3same_fp_pair(DisasContext *s, arg_3same *a, VFPGen3OpSPFn *fn)
-+                                          uint64_t value,
+ {
-+                                          unsigned size, MemTxAttrs attrs)
+     /* FP operations handled pairwise 32 bits at a time */
-+{
+diff --git a/target/arm/translate.c b/target/arm/translate.c
 +    uint32_t offset = addr;
 +
 +    trace_iotkit_secctl_ns_write(offset, value, size);
 +
 +    if (size != 4) {
 +        /* Byte and halfword writes are ignored */
 +        qemu_log_mask(LOG_GUEST_ERROR,
 +                      "IotKit SecCtl NS block write: bad size, ignored\n");
 +        return MEMTX_OK;
 +    }
 +
 +    switch (offset) {
 +    case A_AHBNSPPPCEXP0:
 +    case A_AHBNSPPPCEXP1:
 +    case A_AHBNSPPPCEXP2:
 +    case A_AHBNSPPPCEXP3:
 +    case A_APBNSPPPC0:
 +    case A_APBNSPPPC1:
 +    case A_APBNSPPPCEXP0:
 +    case A_APBNSPPPCEXP1:
 +    case A_APBNSPPPCEXP2:
 +    case A_APBNSPPPCEXP3:
 +        qemu_log_mask(LOG_UNIMP,
 +                      "IoTKit SecCtl NS block write: "
 +                      "unimplemented offset 0x%x\n", offset);
 +        break;
 +    case A_AHBNSPPPC0:
 +    case A_PID4:
 +    case A_PID5:
 +    case A_PID6:
 +    case A_PID7:
 +    case A_PID0:
 +    case A_PID1:
 +    case A_PID2:
 +    case A_PID3:
 +    case A_CID0:
 +    case A_CID1:
 +    case A_CID2:
 +    case A_CID3:
 +        qemu_log_mask(LOG_GUEST_ERROR,
 +                      "IoTKit SecCtl NS block write: "
 +                      "read-only offset 0x%x\n", offset);
 +        break;
 +    default:
 +        qemu_log_mask(LOG_GUEST_ERROR,
 +                      "IotKit SecCtl NS block write: bad offset 0x%x\n",
 +                      offset);
 +        break;
 +    }
 +
 +    return MEMTX_OK;
 +}
 +
 +static const MemoryRegionOps iotkit_secctl_s_ops = {
 +    .read_with_attrs = iotkit_secctl_s_read,
 +    .write_with_attrs = iotkit_secctl_s_write,
 +    .endianness = DEVICE_LITTLE_ENDIAN,
 +    .valid.min_access_size = 1,
 +    .valid.max_access_size = 4,
 +    .impl.min_access_size = 1,
 +    .impl.max_access_size = 4,
 +};
 +
 +static const MemoryRegionOps iotkit_secctl_ns_ops = {
 +    .read_with_attrs = iotkit_secctl_ns_read,
 +    .write_with_attrs = iotkit_secctl_ns_write,
 +    .endianness = DEVICE_LITTLE_ENDIAN,
 +    .valid.min_access_size = 1,
 +    .valid.max_access_size = 4,
 +    .impl.min_access_size = 1,
 +    .impl.max_access_size = 4,
 +};
 +
 +static void iotkit_secctl_reset(DeviceState *dev)
 +{
 +
 +}
 +
 +static void iotkit_secctl_init(Object *obj)
 +{
 +    IoTKitSecCtl *s = IOTKIT_SECCTL(obj);
 +    SysBusDevice *sbd = SYS_BUS_DEVICE(obj);
 +
 +    memory_region_init_io(&s->s_regs, obj, &iotkit_secctl_s_ops,
 +                          s, "iotkit-secctl-s-regs", 0x1000);
 +    memory_region_init_io(&s->ns_regs, obj, &iotkit_secctl_ns_ops,
 +                          s, "iotkit-secctl-ns-regs", 0x1000);
 +    sysbus_init_mmio(sbd, &s->s_regs);
 +    sysbus_init_mmio(sbd, &s->ns_regs);
 +}
 +
 +static const VMStateDescription iotkit_secctl_vmstate = {
 +    .name = "iotkit-secctl",
 +    .version_id = 1,
 +    .minimum_version_id = 1,
 +    .fields = (VMStateField[]) {
 +        VMSTATE_END_OF_LIST()
 +    }
 +};
 +
 +static void iotkit_secctl_class_init(ObjectClass *klass, void *data)
 +{
 +    DeviceClass *dc = DEVICE_CLASS(klass);
 +
 +    dc->vmsd = &iotkit_secctl_vmstate;
 +    dc->reset = iotkit_secctl_reset;
 +}
 +
 +static const TypeInfo iotkit_secctl_info = {
 +    .name = TYPE_IOTKIT_SECCTL,
 +    .parent = TYPE_SYS_BUS_DEVICE,
 +    .instance_size = sizeof(IoTKitSecCtl),
 +    .instance_init = iotkit_secctl_init,
 +    .class_init = iotkit_secctl_class_init,
 +};
 +
 +static void iotkit_secctl_register_types(void)
 +{
 +    type_register_static(&iotkit_secctl_info);
 +}
 +
 +type_init(iotkit_secctl_register_types);
 diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
 index XXXXXXX..XXXXXXX 100644
---- a/default-configs/arm-softmmu.mak
+--- a/target/arm/translate.c
-+++ b/default-configs/arm-softmmu.mak
++++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ CONFIG_MPS2_FPGAIO=y
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
- CONFIG_MPS2_SCC=y
+         case NEON_3R_FLOAT_MULTIPLY:
+         case NEON_3R_FLOAT_CMP:
- CONFIG_TZ_PPC=y
+         case NEON_3R_FLOAT_ACMP:
-+CONFIG_IOTKIT_SECCTL=y
++        case NEON_3R_FLOAT_MINMAX:
++        case NEON_3R_FLOAT_MISC:
- CONFIG_VERSATILE_PCI=y
+             /* Already handled by decodetree */
- CONFIG_VERSATILE_I2C=y
+             return 1;
-diff --git a/hw/misc/trace-events b/hw/misc/trace-events
+         }
-index XXXXXXX..XXXXXXX 100644
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
---- a/hw/misc/trace-events
+             return 1;
-+++ b/hw/misc/trace-events
+         }
-@@ -XXX,XX +XXX,XX @@ tz_ppc_irq_clear(int level) "TZ PPC: int_clear = %d"
+         switch (op) {
- tz_ppc_update_irq(int level) "TZ PPC: setting irq line to %d"
+-        case NEON_3R_FLOAT_MINMAX:
- tz_ppc_read_blocked(int n, hwaddr offset, bool secure, bool user) "TZ PPC: port %d offset 0x%" HWADDR_PRIx " read (secure %d user %d) blocked"
+-            if (u) {
- tz_ppc_write_blocked(int n, hwaddr offset, bool secure, bool user) "TZ PPC: port %d offset 0x%" HWADDR_PRIx " write (secure %d user %d) blocked"
+-                return 1; /* VPMIN/VPMAX handled by decodetree */
-+
+-            }
-+# hw/misc/iotkit-secctl.c
+-            break;
-+iotkit_secctl_s_read(uint32_t offset, uint64_t data, unsigned size) "IoTKit SecCtl S regs read: offset 0x%x data 0x%" PRIx64 " size %u"
+-        case NEON_3R_FLOAT_MISC:
-+iotkit_secctl_s_write(uint32_t offset, uint64_t data, unsigned size) "IoTKit SecCtl S regs write: offset 0x%x data 0x%" PRIx64 " size %u"
+-            /* VMAXNM/VMINNM in ARMv8 */
-+iotkit_secctl_ns_read(uint32_t offset, uint64_t data, unsigned size) "IoTKit SecCtl NS regs read: offset 0x%x data 0x%" PRIx64 " size %u"
+-            if (u && !arm_dc_feature(s, ARM_FEATURE_V8)) {
-+iotkit_secctl_ns_write(uint32_t offset, uint64_t data, unsigned size) "IoTKit SecCtl NS regs write: offset 0x%x data 0x%" PRIx64 " size %u"
+-                return 1;
-+iotkit_secctl_reset(void) "IoTKit SecCtl: reset"
+-            }
 -            break;
          case NEON_3R_VFM_VQRDMLSH:
              if (!dc_isar_feature(aa32_simdfmac, s)) {
                  return 1;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          tmp = neon_load_reg(rn, pass);
          tmp2 = neon_load_reg(rm, pass);
          switch (op) {
 -        case NEON_3R_FLOAT_MINMAX:
 -        {
 -            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
 -            if (size == 0) {
 -                gen_helper_vfp_maxs(tmp, tmp, tmp2, fpstatus);
 -            } else {
 -                gen_helper_vfp_mins(tmp, tmp, tmp2, fpstatus);
 -            }
 -            tcg_temp_free_ptr(fpstatus);
 -            break;
 -        }
 -        case NEON_3R_FLOAT_MISC:
 -            if (u) {
 -                /* VMAXNM/VMINNM */
 -                TCGv_ptr fpstatus = get_fpstatus_ptr(1);
 -                if (size == 0) {
 -                    gen_helper_vfp_maxnums(tmp, tmp, tmp2, fpstatus);
 -                } else {
 -                    gen_helper_vfp_minnums(tmp, tmp, tmp2, fpstatus);
 -                }
 -                tcg_temp_free_ptr(fpstatus);
 -            } else {
 -                if (size == 0) {
 -                    gen_helper_recps_f32(tmp, cpu_env, tmp, tmp2);
 -                } else {
 -                    gen_helper_rsqrts_f32(tmp, cpu_env, tmp, tmp2);
 -              }
 -            }
 -            break;
          case NEON_3R_VFM_VQRDMLSH:
          {
              /* VFMA, VFMS: fused multiply-add */
 --
-.16.2
+.20.1
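
The read handlers in the older series always compute a full 32-bit register value and then narrow it for byte and halfword accesses, and they look up the PID/CID bytes by offset. A minimal standalone sketch of those two steps follows. It assumes nothing beyond what the handlers show: extract32_sketch() is a local stand-in written for this note (it is not QEMU's own extract32()), and subword_read() and idreg_index() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for a bitfield extract; the _sketch name is hypothetical. */
static uint32_t extract32_sketch(uint32_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 32 - start);
    return (value >> start) & (~0U >> (32 - length));
}

/* Narrow a full-word register value to a 1- or 2-byte access at addr. */
static uint32_t subword_read(uint32_t word, uint32_t addr, unsigned size)
{
    if (size == 4) {
        return word;
    }
    /* Byte lane is addr & 3; shift and width are expressed in bits. */
    return extract32_sketch(word, (int)((addr & 3) * 8), (int)(size * 8));
}

/* Index into the 12-entry ID table: PID4 sits at 0xfd0, CID3 at 0xffc. */
static unsigned idreg_index(uint32_t offset)
{
    return (offset - 0xfd0) / 4;
}
```

The table order follows the register map (PID4..PID7, PID0..PID3, CID0..CID3), so offset 0xfe0 (PID0) selects entry 4. The sketch does not cover a halfword access at byte lane 3, which would run past the top of the word.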

-New patch
+[PULL 45/45] target/arm: Convert NEON VFMA, VFMS 3-reg-same insns to decodetree
+Convert the Neon floating point VFMA and VFMS insn to decodetree.
 These are the last insns in the 3-reg-same group so we can
 remove all the support/loop code from the old decoder.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
 Message-id: 20200512163904.10918-18-peter.maydell@linaro.org
 ---
  target/arm/neon-dp.decode       |   3 +
  target/arm/translate-neon.inc.c |  41 ++++++++
  target/arm/translate.c          | 176 +-------------------------------
  3 files changed, 46 insertions(+), 174 deletions(-)
 diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/neon-dp.decode
 +++ b/target/arm/neon-dp.decode
@@ -XXX,XX +XXX,XX @@ SHA256H2_3s      1111 001 1 0 . 01 .... .... 1100 . 1 . 0 .... \
  SHA256SU1_3s     1111 001 1 0 . 10 .... .... 1100 . 1 . 0 .... \
                   vm=%vm_dp vn=%vn_dp vd=%vd_dp
 +VFMA_fp_3s       1111 001 0 0 . 0 . .... .... 1100 ... 1 .... @3same_fp
 +VFMS_fp_3s       1111 001 0 0 . 1 . .... .... 1100 ... 1 .... @3same_fp
 +
  VQRDMLSH_3s      1111 001 1 0 . .. .... .... 1100 ... 1 .... @3same
  VADD_fp_3s       1111 001 0 0 . 0 . .... .... 1101 ... 0 .... @3same_fp
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VRSQRTS_fp_3s(DisasContext *s, arg_3same *a)
      return do_3same(s, a, gen_VRSQRTS_fp_3s);
  }
 +static void gen_VFMA_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
 +                            TCGv_ptr fpstatus)
 +{
 +    gen_helper_vfp_muladds(vd, vn, vm, vd, fpstatus);
 +}
 +
 +static bool trans_VFMA_fp_3s(DisasContext *s, arg_3same *a)
 +{
 +    if (!dc_isar_feature(aa32_simdfmac, s)) {
 +        return false;
 +    }
 +
 +    if (a->size != 0) {
 +        /* TODO fp16 support */
 +        return false;
 +    }
 +
 +    return do_3same_fp(s, a, gen_VFMA_fp_3s, true);
 +}
 +
 +static void gen_VFMS_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
 +                            TCGv_ptr fpstatus)
 +{
 +    gen_helper_vfp_negs(vn, vn);
 +    gen_helper_vfp_muladds(vd, vn, vm, vd, fpstatus);
 +}
 +
 +static bool trans_VFMS_fp_3s(DisasContext *s, arg_3same *a)
 +{
 +    if (!dc_isar_feature(aa32_simdfmac, s)) {
 +        return false;
 +    }
 +
 +    if (a->size != 0) {
 +        /* TODO fp16 support */
 +        return false;
 +    }
 +
 +    return do_3same_fp(s, a, gen_VFMS_fp_3s, true);
 +}
 +
  static bool do_3same_fp_pair(DisasContext *s, arg_3same *a, VFPGen3OpSPFn *fn)
  {
      /* FP operations handled pairwise 32 bits at a time */
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_neon_narrow_op(int op, int u, int size,
      }
  }
 -/* Symbolic constants for op fields for Neon 3-register same-length.
 - * The values correspond to bits [11:8,4]; see the ARM ARM DDI0406B
 - * table A7-9.
 - */
 -#define NEON_3R_VHADD 0
 -#define NEON_3R_VQADD 1
 -#define NEON_3R_VRHADD 2
 -#define NEON_3R_LOGIC 3 /* VAND,VBIC,VORR,VMOV,VORN,VEOR,VBIF,VBIT,VBSL */
 -#define NEON_3R_VHSUB 4
 -#define NEON_3R_VQSUB 5
 -#define NEON_3R_VCGT 6
 -#define NEON_3R_VCGE 7
 -#define NEON_3R_VSHL 8
 -#define NEON_3R_VQSHL 9
 -#define NEON_3R_VRSHL 10
 -#define NEON_3R_VQRSHL 11
 -#define NEON_3R_VMAX 12
 -#define NEON_3R_VMIN 13
 -#define NEON_3R_VABD 14
 -#define NEON_3R_VABA 15
 -#define NEON_3R_VADD_VSUB 16
 -#define NEON_3R_VTST_VCEQ 17
 -#define NEON_3R_VML 18 /* VMLA, VMLS */
 -#define NEON_3R_VMUL 19
 -#define NEON_3R_VPMAX 20
 -#define NEON_3R_VPMIN 21
 -#define NEON_3R_VQDMULH_VQRDMULH 22
 -#define NEON_3R_VPADD_VQRDMLAH 23
 -#define NEON_3R_SHA 24 /* SHA1C,SHA1P,SHA1M,SHA1SU0,SHA256H{2},SHA256SU1 */
 -#define NEON_3R_VFM_VQRDMLSH 25 /* VFMA, VFMS, VQRDMLSH */
 -#define NEON_3R_FLOAT_ARITH 26 /* float VADD, VSUB, VPADD, VABD */
 -#define NEON_3R_FLOAT_MULTIPLY 27 /* float VMLA, VMLS, VMUL */
 -#define NEON_3R_FLOAT_CMP 28 /* float VCEQ, VCGE, VCGT */
 -#define NEON_3R_FLOAT_ACMP 29 /* float VACGE, VACGT, VACLE, VACLT */
 -#define NEON_3R_FLOAT_MINMAX 30 /* float VMIN, VMAX */
 -#define NEON_3R_FLOAT_MISC 31 /* float VRECPS, VRSQRTS, VMAXNM/MINNM */
 -
 -static const uint8_t neon_3r_sizes[] = {
 -    [NEON_3R_VHADD] = 0x7,
 -    [NEON_3R_VQADD] = 0xf,
 -    [NEON_3R_VRHADD] = 0x7,
 -    [NEON_3R_LOGIC] = 0xf, /* size field encodes op type */
 -    [NEON_3R_VHSUB] = 0x7,
 -    [NEON_3R_VQSUB] = 0xf,
 -    [NEON_3R_VCGT] = 0x7,
 -    [NEON_3R_VCGE] = 0x7,
 -    [NEON_3R_VSHL] = 0xf,
 -    [NEON_3R_VQSHL] = 0xf,
 -    [NEON_3R_VRSHL] = 0xf,
 -    [NEON_3R_VQRSHL] = 0xf,
 -    [NEON_3R_VMAX] = 0x7,
 -    [NEON_3R_VMIN] = 0x7,
 -    [NEON_3R_VABD] = 0x7,
 -    [NEON_3R_VABA] = 0x7,
 -    [NEON_3R_VADD_VSUB] = 0xf,
 -    [NEON_3R_VTST_VCEQ] = 0x7,
 -    [NEON_3R_VML] = 0x7,
 -    [NEON_3R_VMUL] = 0x7,
 -    [NEON_3R_VPMAX] = 0x7,
 -    [NEON_3R_VPMIN] = 0x7,
 -    [NEON_3R_VQDMULH_VQRDMULH] = 0x6,
 -    [NEON_3R_VPADD_VQRDMLAH] = 0x7,
 -    [NEON_3R_SHA] = 0xf, /* size field encodes op type */
 -    [NEON_3R_VFM_VQRDMLSH] = 0x7, /* For VFM, size bit 1 encodes op */
 -    [NEON_3R_FLOAT_ARITH] = 0x5, /* size bit 1 encodes op */
 -    [NEON_3R_FLOAT_MULTIPLY] = 0x5, /* size bit 1 encodes op */
 -    [NEON_3R_FLOAT_CMP] = 0x5, /* size bit 1 encodes op */
 -    [NEON_3R_FLOAT_ACMP] = 0x5, /* size bit 1 encodes op */
 -    [NEON_3R_FLOAT_MINMAX] = 0x5, /* size bit 1 encodes op */
 -    [NEON_3R_FLOAT_MISC] = 0x5, /* size bit 1 encodes op */
 -};
 -
  /* Symbolic constants for op fields for Neon 2-register miscellaneous.
   * The values correspond to bits [17:16,10:7]; see the ARM ARM DDI0406B
   * table A7-13.
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
      rm_ofs = neon_reg_offset(rm, 0);
      if ((insn & (1 << 23)) == 0) {
 -        /* Three register same length.  */
 -        op = ((insn >> 7) & 0x1e) | ((insn >> 4) & 1);
 -        /* Catch invalid op and bad size combinations: UNDEF */
 -        if ((neon_3r_sizes[op] & (1 << size)) == 0) {
 -            return 1;
 -        }
 -        /* All insns of this form UNDEF for either this condition or the
 -         * superset of cases "Q==1"; we catch the latter later.
 -         */
 -        if (q && ((rd | rn | rm) & 1)) {
 -            return 1;
 -        }
 -        switch (op) {
 -        case NEON_3R_VFM_VQRDMLSH:
 -            if (!u) {
 -                /* VFM, VFMS */
 -                if (size == 1) {
 -                    return 1;
 -                }
 -                break;
 -            }
 -            /* VQRDMLSH : handled by decodetree */
 -            return 1;
 -
 -        case NEON_3R_VADD_VSUB:
 -        case NEON_3R_LOGIC:
 -        case NEON_3R_VMAX:
 -        case NEON_3R_VMIN:
 -        case NEON_3R_VTST_VCEQ:
 -        case NEON_3R_VCGT:
 -        case NEON_3R_VCGE:
 -        case NEON_3R_VQADD:
 -        case NEON_3R_VQSUB:
 -        case NEON_3R_VMUL:
 -        case NEON_3R_VML:
 -        case NEON_3R_VSHL:
 -        case NEON_3R_SHA:
 -        case NEON_3R_VHADD:
 -        case NEON_3R_VRHADD:
 -        case NEON_3R_VHSUB:
 -        case NEON_3R_VABD:
 -        case NEON_3R_VABA:
 -        case NEON_3R_VQSHL:
 -        case NEON_3R_VRSHL:
 -        case NEON_3R_VQRSHL:
 -        case NEON_3R_VPMAX:
 -        case NEON_3R_VPMIN:
 -        case NEON_3R_VPADD_VQRDMLAH:
 -        case NEON_3R_VQDMULH_VQRDMULH:
 -        case NEON_3R_FLOAT_ARITH:
 -        case NEON_3R_FLOAT_MULTIPLY:
 -        case NEON_3R_FLOAT_CMP:
 -        case NEON_3R_FLOAT_ACMP:
 -        case NEON_3R_FLOAT_MINMAX:
 -        case NEON_3R_FLOAT_MISC:
 -            /* Already handled by decodetree */
 -            return 1;
 -        }
 -
 -        if (size == 3) {
 -            /* 64-bit element instructions: handled by decodetree */
 -            return 1;
 -        }
 -        switch (op) {
 -        case NEON_3R_VFM_VQRDMLSH:
 -            if (!dc_isar_feature(aa32_simdfmac, s)) {
 -                return 1;
 -            }
 -            break;
 -        default:
 -            break;
 -        }
 -
 -        for (pass = 0; pass < (q ? 4 : 2); pass++) {
 -
 -        /* Elementwise.  */
 -        tmp = neon_load_reg(rn, pass);
 -        tmp2 = neon_load_reg(rm, pass);
 -        switch (op) {
 -        case NEON_3R_VFM_VQRDMLSH:
 -        {
 -            /* VFMA, VFMS: fused multiply-add */
 -            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
 -            TCGv_i32 tmp3 = neon_load_reg(rd, pass);
 -            if (size) {
 -                /* VFMS */
 -                gen_helper_vfp_negs(tmp, tmp);
 -            }
 -            gen_helper_vfp_muladds(tmp, tmp, tmp2, tmp3, fpstatus);
 -            tcg_temp_free_i32(tmp3);
 -            tcg_temp_free_ptr(fpstatus);
 -            break;
 -        }
 -        default:
 -            abort();
 -        }
 -        tcg_temp_free_i32(tmp2);
 -
 -        neon_store_reg(rd, pass, tmp);
 -
 -        } /* for pass */
 -        /* End of 3 register same size operations.  */
 +        /* Three register same length: handled by decodetree */
 +        return 1;
      } else if (insn & (1 << 4)) {
          if ((insn & 0x00380080) != 0) {
              /* Two registers and shift.  */
 --
 .20.1
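
For readers comparing the removed hand-written decoder with the new decodetree patterns, here is a small standalone sketch of how the old code arrived at NEON_3R_VFM_VQRDMLSH and told VFMA from VFMS. The op expression is the one from the deleted lines. The positions of the size field (bits [21:20]) and the U bit (bit [24]) are assumptions read off the decode patterns in this patch, and classify_vfm() is a hypothetical helper, not a QEMU function. It also assumes the caller has already matched the 3-reg-same group (bit 23 clear).

```c
#include <assert.h>
#include <stdint.h>

#define NEON_3R_VFM_VQRDMLSH 25 /* value from the removed table */

/* Same expression as the removed decoder: insn bits [11:8] and bit [4]. */
static unsigned neon_3r_op(uint32_t insn)
{
    unsigned op = ((insn >> 7) & 0x1e) | ((insn >> 4) & 1);

    assert(op <= 31); /* op indexed the 32-entry neon_3r_sizes[] table */
    return op;
}

/* Assumption: size is insn bits [21:20], as the decode patterns suggest. */
static unsigned neon_3r_size(uint32_t insn)
{
    return (insn >> 20) & 3;
}

/* Assumption: U is insn bit [24]. */
static unsigned neon_3r_u(uint32_t insn)
{
    return (insn >> 24) & 1;
}

/* Returns 1 for VFMA, 2 for VFMS, 0 for anything else (fp32 forms only). */
static int classify_vfm(uint32_t insn)
{
    if (neon_3r_op(insn) != NEON_3R_VFM_VQRDMLSH || neon_3r_u(insn)) {
        return 0; /* other op, or U set (VQRDMLSH) */
    }
    switch (neon_3r_size(insn)) {
    case 0:
        return 1; /* size bit 1 clear: VFMA */
    case 2:
        return 2; /* size bit 1 set: VFMS */
    default:
        return 0; /* size bit 0 set: rejected, like the fp16 TODO */
    }
}
```

With all register fields zero, the VFMA_fp_3s pattern corresponds to 0xF2000C10 and VFMS_fp_3s to 0xF2200C10. The decodetree patterns encode the same distinction directly in bit 21, which is why the op table and size mask can be deleted.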

Second pull request of the week; mostly RTH's support for some
new-in-v8.1/v8.3 instructions, and my v8M board model.

thanks
-- PMM

The following changes since commit 427cbc7e4136a061628cb4315cc8182ea36d772f:

Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging (2018-03-01 18:46:41 +0000)

are available in the Git repository at:

git://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20180302

for you to fetch changes up to e66a67bf28e1b4fce2e3d72a2610dbd48d9d3078:

target/arm: Enable ARM_FEATURE_V8_FCMA (2018-03-02 11:03:45 +0000)

----------------------------------------------------------------
target-arm queue:
 * implement FCMA and RDM v8.1 and v8.3 instructions
 * enable Cortex-M33 v8M core, and provide new mps2-an505 board model
   that uses it
 * decodetree: Propagate return value from translate subroutines
 * xlnx-zynqmp: Implement the RTC device

----------------------------------------------------------------
Alistair Francis (3):
      xlnx-zynqmp-rtc: Initial commit
      xlnx-zynqmp-rtc: Add basic time support
      xlnx-zynqmp: Connect the RTC device

Peter Maydell (19):
      loader: Add new load_ramdisk_as()
      hw/arm/boot: Honour CPU's address space for image loads
      hw/arm/armv7m: Honour CPU's address space for image loads
      target/arm: Define an IDAU interface
      armv7m: Forward idau property to CPU object
      target/arm: Define init-svtor property for the reset secure VTOR value
      armv7m: Forward init-svtor property to CPU object
      target/arm: Add Cortex-M33
      hw/misc/unimp: Move struct to header file
      include/hw/or-irq.h: Add missing include guard
      qdev: Add new qdev_init_gpio_in_named_with_opaque()
      hw/core/split-irq: Device that splits IRQ lines
      hw/misc/mps2-fpgaio: FPGA control block for MPS2 AN505
      hw/misc/tz-ppc: Model TrustZone peripheral protection controller
      hw/misc/iotkit-secctl: Arm IoT Kit security controller initial skeleton
      hw/misc/iotkit-secctl: Add handling for PPCs
      hw/misc/iotkit-secctl: Add remaining simple registers
      hw/arm/iotkit: Model Arm IOT Kit
      mps2-an505: New board model: MPS2 with AN505 Cortex-M33 FPGA image

Richard Henderson (17):
      decodetree: Propagate return value from translate subroutines
      target/arm: Add ARM_FEATURE_V8_RDM
      target/arm: Refactor disas_simd_indexed decode
      target/arm: Refactor disas_simd_indexed size checks
      target/arm: Decode aa64 armv8.1 scalar three same extra
      target/arm: Decode aa64 armv8.1 three same extra
      target/arm: Decode aa64 armv8.1 scalar/vector x indexed element
      target/arm: Decode aa32 armv8.1 three same
      target/arm: Decode aa32 armv8.1 two reg and a scalar
      target/arm: Enable ARM_FEATURE_V8_RDM
      target/arm: Add ARM_FEATURE_V8_FCMA
      target/arm: Decode aa64 armv8.3 fcadd
      target/arm: Decode aa64 armv8.3 fcmla
      target/arm: Decode aa32 armv8.3 3-same
      target/arm: Decode aa32 armv8.3 2-reg-index
      target/arm: Decode t32 simd 3reg and 2reg_scalar extension
      target/arm: Enable ARM_FEATURE_V8_FCMA

 hw/arm/Makefile.objs               |   2 +
 hw/core/Makefile.objs              |   1 +
 hw/misc/Makefile.objs              |   4 +
 hw/timer/Makefile.objs             |   1 +
 target/arm/Makefile.objs           |   2 +-
 include/hw/arm/armv7m.h            |   5 +
 include/hw/arm/iotkit.h            | 109 ++++++
 include/hw/arm/xlnx-zynqmp.h       |   2 +
 include/hw/core/split-irq.h        |  57 +++
 include/hw/irq.h                   |   4 +-
 include/hw/loader.h                |  12 +-
 include/hw/misc/iotkit-secctl.h    | 103 ++++++
 include/hw/misc/mps2-fpgaio.h      |  43 +++
 include/hw/misc/tz-ppc.h           | 101 ++++++
 include/hw/misc/unimp.h            |  10 +
 include/hw/or-irq.h                |   5 +
 include/hw/qdev-core.h             |  30 +-
 include/hw/timer/xlnx-zynqmp-rtc.h |  86 +++++
 target/arm/cpu.h                   |   8 +
 target/arm/helper.h                |  31 ++
 target/arm/idau.h                  |  61 ++++
 hw/arm/armv7m.c                    |  35 +-
 hw/arm/boot.c                      | 119 ++++---
 hw/arm/iotkit.c                    | 598 +++++++++++++++++++++++++++++++
 hw/arm/mps2-tz.c                   | 503 ++++++++++++++++++++++++++
 hw/arm/xlnx-zynqmp.c               |  14 +
 hw/core/loader.c                   |   8 +-
 hw/core/qdev.c                     |   8 +-
 hw/core/split-irq.c                |  89 +++++
 hw/misc/iotkit-secctl.c            | 704 +++++++++++++++++++++++++++++++++++++
 hw/misc/mps2-fpgaio.c              | 176 ++++++++++
 hw/misc/tz-ppc.c                   | 302 ++++++++++++++++
 hw/misc/unimp.c                    |  10 -
 hw/timer/xlnx-zynqmp-rtc.c         | 272 ++++++++++++++
 linux-user/elfload.c               |   2 +
 target/arm/cpu.c                   |  66 +++-
 target/arm/cpu64.c                 |   2 +
 target/arm/helper.c                |  28 +-
 target/arm/translate-a64.c         | 514 +++++++++++++++++++++------
 target/arm/translate.c             | 275 +++++++++++++--
 target/arm/vec_helper.c            | 429 ++++++++++++++++++++++
 default-configs/arm-softmmu.mak    |   5 +
 hw/misc/trace-events               |  24 ++
 hw/timer/trace-events              |   3 +
 scripts/decodetree.py              |   5 +-
 45 files changed, 4668 insertions(+), 200 deletions(-)
 create mode 100644 include/hw/arm/iotkit.h
 create mode 100644 include/hw/core/split-irq.h
 create mode 100644 include/hw/misc/iotkit-secctl.h
 create mode 100644 include/hw/misc/mps2-fpgaio.h
 create mode 100644 include/hw/misc/tz-ppc.h
 create mode 100644 include/hw/timer/xlnx-zynqmp-rtc.h
 create mode 100644 target/arm/idau.h
 create mode 100644 hw/arm/iotkit.c
 create mode 100644 hw/arm/mps2-tz.c
 create mode 100644 hw/core/split-irq.c
 create mode 100644 hw/misc/iotkit-secctl.c
 create mode 100644 hw/misc/mps2-fpgaio.c
 create mode 100644 hw/misc/tz-ppc.c
 create mode 100644 hw/timer/xlnx-zynqmp-rtc.c
 create mode 100644 target/arm/vec_helper.c

From: Alistair Francis <alistair.francis@xilinx.com>

Initial commit of the ZynqMP RTC device.

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/timer/Makefile.objs             |   1 +
 include/hw/timer/xlnx-zynqmp-rtc.h |  84 +++++++++++++++
 hw/timer/xlnx-zynqmp-rtc.c         | 214 +++++++++++++++++++++++++++++++++++++
 3 files changed, 299 insertions(+)
 create mode 100644 include/hw/timer/xlnx-zynqmp-rtc.h
 create mode 100644 hw/timer/xlnx-zynqmp-rtc.c

diff --git a/hw/timer/Makefile.objs b/hw/timer/Makefile.objs
index XXXXXXX..XXXXXXX 100644
--- a/hw/timer/Makefile.objs
+++ b/hw/timer/Makefile.objs
@@ -XXX,XX +XXX,XX @@ common-obj-$(CONFIG_IMX) += imx_epit.o
 common-obj-$(CONFIG_IMX) += imx_gpt.o
 common-obj-$(CONFIG_LM32) += lm32_timer.o
 common-obj-$(CONFIG_MILKYMIST) += milkymist-sysctl.o
+common-obj-$(CONFIG_XLNX_ZYNQMP) += xlnx-zynqmp-rtc.o
 
 obj-$(CONFIG_ALTERA_TIMER) += altera_timer.o
 obj-$(CONFIG_EXYNOS4) += exynos4210_mct.o
diff --git a/include/hw/timer/xlnx-zynqmp-rtc.h b/include/hw/timer/xlnx-zynqmp-rtc.h
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/include/hw/timer/xlnx-zynqmp-rtc.h
@@ -XXX,XX +XXX,XX @@
+/*
+ * QEMU model of the Xilinx ZynqMP Real Time Clock (RTC).
+ *
+ * Copyright (c) 2017 Xilinx Inc.
+ *
+ * Written-by: Alistair Francis <alistair.francis@xilinx.com>
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "hw/register.h"
+
+#define TYPE_XLNX_ZYNQMP_RTC "xlnx-zynmp.rtc"
+
+#define XLNX_ZYNQMP_RTC(obj) \
+     OBJECT_CHECK(XlnxZynqMPRTC, (obj), TYPE_XLNX_ZYNQMP_RTC)
+
+REG32(SET_TIME_WRITE, 0x0)
+REG32(SET_TIME_READ, 0x4)
+REG32(CALIB_WRITE, 0x8)
+    FIELD(CALIB_WRITE, FRACTION_EN, 20, 1)
+    FIELD(CALIB_WRITE, FRACTION_DATA, 16, 4)
+    FIELD(CALIB_WRITE, MAX_TICK, 0, 16)
+REG32(CALIB_READ, 0xc)
+    FIELD(CALIB_READ, FRACTION_EN, 20, 1)
+    FIELD(CALIB_READ, FRACTION_DATA, 16, 4)
+    FIELD(CALIB_READ, MAX_TICK, 0, 16)
+REG32(CURRENT_TIME, 0x10)
+REG32(CURRENT_TICK, 0x14)
+    FIELD(CURRENT_TICK, VALUE, 0, 16)
+REG32(ALARM, 0x18)
+REG32(RTC_INT_STATUS, 0x20)
+    FIELD(RTC_INT_STATUS, ALARM, 1, 1)
+    FIELD(RTC_INT_STATUS, SECONDS, 0, 1)
+REG32(RTC_INT_MASK, 0x24)
+    FIELD(RTC_INT_MASK, ALARM, 1, 1)
+    FIELD(RTC_INT_MASK, SECONDS, 0, 1)
+REG32(RTC_INT_EN, 0x28)
+    FIELD(RTC_INT_EN, ALARM, 1, 1)
+    FIELD(RTC_INT_EN, SECONDS, 0, 1)
+REG32(RTC_INT_DIS, 0x2c)
+    FIELD(RTC_INT_DIS, ALARM, 1, 1)
+    FIELD(RTC_INT_DIS, SECONDS, 0, 1)
+REG32(ADDR_ERROR, 0x30)
+    FIELD(ADDR_ERROR, STATUS, 0, 1)
+REG32(ADDR_ERROR_INT_MASK, 0x34)
+    FIELD(ADDR_ERROR_INT_MASK, MASK, 0, 1)
+REG32(ADDR_ERROR_INT_EN, 0x38)
+    FIELD(ADDR_ERROR_INT_EN, MASK, 0, 1)
+REG32(ADDR_ERROR_INT_DIS, 0x3c)
+    FIELD(ADDR_ERROR_INT_DIS, MASK, 0, 1)
+REG32(CONTROL, 0x40)
+    FIELD(CONTROL, BATTERY_DISABLE, 31, 1)
+    FIELD(CONTROL, OSC_CNTRL, 24, 4)
+    FIELD(CONTROL, SLVERR_ENABLE, 0, 1)
+REG32(SAFETY_CHK, 0x50)
+
+#define XLNX_ZYNQMP_RTC_R_MAX (R_SAFETY_CHK + 1)
+
+typedef struct XlnxZynqMPRTC {
+    SysBusDevice parent_obj;
+    MemoryRegion iomem;
+    qemu_irq irq_rtc_int;
+    qemu_irq irq_addr_error_int;
+
+    uint32_t regs[XLNX_ZYNQMP_RTC_R_MAX];
+    RegisterInfo regs_info[XLNX_ZYNQMP_RTC_R_MAX];
+} XlnxZynqMPRTC;
diff --git a/hw/timer/xlnx-zynqmp-rtc.c b/hw/timer/xlnx-zynqmp-rtc.c
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/hw/timer/xlnx-zynqmp-rtc.c
@@ -XXX,XX +XXX,XX @@
+/*
+ * QEMU model of the Xilinx ZynqMP Real Time Clock (RTC).
+ *
+ * Copyright (c) 2017 Xilinx Inc.
+ *
+ * Written-by: Alistair Francis <alistair.francis@xilinx.com>
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/sysbus.h"
+#include "hw/register.h"
+#include "qemu/bitops.h"
+#include "qemu/log.h"
+#include "hw/timer/xlnx-zynqmp-rtc.h"
+
+#ifndef XLNX_ZYNQMP_RTC_ERR_DEBUG
+#define XLNX_ZYNQMP_RTC_ERR_DEBUG 0
+#endif
+
+static void rtc_int_update_irq(XlnxZynqMPRTC *s)
+{
+    bool pending = s->regs[R_RTC_INT_STATUS] & ~s->regs[R_RTC_INT_MASK];
+    qemu_set_irq(s->irq_rtc_int, pending);
+}
+
+static void addr_error_int_update_irq(XlnxZynqMPRTC *s)
+{
+    bool pending = s->regs[R_ADDR_ERROR] & ~s->regs[R_ADDR_ERROR_INT_MASK];
+    qemu_set_irq(s->irq_addr_error_int, pending);
+}
+
+static void rtc_int_status_postw(RegisterInfo *reg, uint64_t val64)
+{
+    XlnxZynqMPRTC *s = XLNX_ZYNQMP_RTC(reg->opaque);
+    rtc_int_update_irq(s);
+}
+
+static uint64_t rtc_int_en_prew(RegisterInfo *reg, uint64_t val64)
+{
+    XlnxZynqMPRTC *s = XLNX_ZYNQMP_RTC(reg->opaque);
+
+    s->regs[R_RTC_INT_MASK] &= (uint32_t) ~val64;
+    rtc_int_update_irq(s);
+    return 0;
+}
+
+static uint64_t rtc_int_dis_prew(RegisterInfo *reg, uint64_t val64)
+{
+    XlnxZynqMPRTC *s = XLNX_ZYNQMP_RTC(reg->opaque);
+
+    s->regs[R_RTC_INT_MASK] |= (uint32_t) val64;
+    rtc_int_update_irq(s);
+    return 0;
+}
+
+static void addr_error_postw(RegisterInfo *reg, uint64_t val64)
+{
+    XlnxZynqMPRTC *s = XLNX_ZYNQMP_RTC(reg->opaque);
+    addr_error_int_update_irq(s);
+}
+
+static uint64_t addr_error_int_en_prew(RegisterInfo *reg, uint64_t val64)
+{
+    XlnxZynqMPRTC *s = XLNX_ZYNQMP_RTC(reg->opaque);
+
+    s->regs[R_ADDR_ERROR_INT_MASK] &= (uint32_t) ~val64;
+    addr_error_int_update_irq(s);
+    return 0;
+}
+
+static uint64_t addr_error_int_dis_prew(RegisterInfo *reg, uint64_t val64)
+{
+    XlnxZynqMPRTC *s = XLNX_ZYNQMP_RTC(reg->opaque);
+
+    s->regs[R_ADDR_ERROR_INT_MASK] |= (uint32_t) val64;
+    addr_error_int_update_irq(s);
+    return 0;
+}
+
+static const RegisterAccessInfo rtc_regs_info[] = {
+    {   .name = "SET_TIME_WRITE",  .addr = A_SET_TIME_WRITE,
+    },{ .name = "SET_TIME_READ",  .addr = A_SET_TIME_READ,
+        .ro = 0xffffffff,
+    },{ .name = "CALIB_WRITE",  .addr = A_CALIB_WRITE,
+    },{ .name = "CALIB_READ",  .addr = A_CALIB_READ,
+        .ro = 0x1fffff,
+    },{ .name = "CURRENT_TIME",  .addr = A_CURRENT_TIME,
+        .ro = 0xffffffff,
+    },{ .name = "CURRENT_TICK",  .addr = A_CURRENT_TICK,
+        .ro = 0xffff,
+    },{ .name = "ALARM",  .addr = A_ALARM,
+    },{ .name = "RTC_INT_STATUS",  .addr = A_RTC_INT_STATUS,
+        .w1c = 0x3,
+        .post_write = rtc_int_status_postw,
+    },{ .name = "RTC_INT_MASK",  .addr = A_RTC_INT_MASK,
+        .reset = 0x3,
+        .ro = 0x3,
+    },{ .name = "RTC_INT_EN",  .addr = A_RTC_INT_EN,
+        .pre_write = rtc_int_en_prew,
+    },{ .name = "RTC_INT_DIS",  .addr = A_RTC_INT_DIS,
+        .pre_write = rtc_int_dis_prew,
+    },{ .name = "ADDR_ERROR",  .addr = A_ADDR_ERROR,
+        .w1c = 0x1,
+        .post_write = addr_error_postw,
+    },{ .name = "ADDR_ERROR_INT_MASK",  .addr = A_ADDR_ERROR_INT_MASK,
+        .reset = 0x1,
+        .ro = 0x1,
+    },{ .name = "ADDR_ERROR_INT_EN",  .addr = A_ADDR_ERROR_INT_EN,
+        .pre_write = addr_error_int_en_prew,
+    },{ .name = "ADDR_ERROR_INT_DIS",  .addr = A_ADDR_ERROR_INT_DIS,
+        .pre_write = addr_error_int_dis_prew,
+    },{ .name = "CONTROL",  .addr = A_CONTROL,
+        .reset = 0x1000000,
+        .rsvd = 0x70fffffe,
+    },{ .name = "SAFETY_CHK",  .addr = A_SAFETY_CHK,
+    }
+};
+
+static void rtc_reset(DeviceState *dev)
+{
+    XlnxZynqMPRTC *s = XLNX_ZYNQMP_RTC(dev);
+    unsigned int i;
+
+    for (i = 0; i < ARRAY_SIZE(s->regs_info); ++i) {
+        register_reset(&s->regs_info[i]);
+    }
+
+    rtc_int_update_irq(s);
+    addr_error_int_update_irq(s);
+}
+
+static const MemoryRegionOps rtc_ops = {
+    .read = register_read_memory,
+    .write = register_write_memory,
+    .endianness = DEVICE_LITTLE_ENDIAN,
+    .valid = {
+        .min_access_size = 4,
+        .max_access_size = 4,
+    },
+};
+
+static void rtc_init(Object *obj)
+{
+    XlnxZynqMPRTC *s = XLNX_ZYNQMP_RTC(obj);
+    SysBusDevice *sbd = SYS_BUS_DEVICE(obj);
+    RegisterInfoArray *reg_array;
+
+    memory_region_init(&s->iomem, obj, TYPE_XLNX_ZYNQMP_RTC,
+                       XLNX_ZYNQMP_RTC_R_MAX * 4);
+    reg_array =
+        register_init_block32(DEVICE(obj), rtc_regs_info,
+                              ARRAY_SIZE(rtc_regs_info),
+                              s->regs_info, s->regs,
+                              &rtc_ops,
+                              XLNX_ZYNQMP_RTC_ERR_DEBUG,
+                              XLNX_ZYNQMP_RTC_R_MAX * 4);
+    memory_region_add_subregion(&s->iomem,
+                                0x0,
+                                &reg_array->mem);
+    sysbus_init_mmio(sbd, &s->iomem);
+    sysbus_init_irq(sbd, &s->irq_rtc_int);
+    sysbus_init_irq(sbd, &s->irq_addr_error_int);
+}
+
+static const VMStateDescription vmstate_rtc = {
+    .name = TYPE_XLNX_ZYNQMP_RTC,
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .fields = (VMStateField[]) {
+        VMSTATE_UINT32_ARRAY(regs, XlnxZynqMPRTC, XLNX_ZYNQMP_RTC_R_MAX),
+        VMSTATE_END_OF_LIST(),
+    }
+};
+
+static void rtc_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+
+    dc->reset = rtc_reset;
+    dc->vmsd = &vmstate_rtc;
+}
+
+static const TypeInfo rtc_info = {
+    .name          = TYPE_XLNX_ZYNQMP_RTC,
+    .parent        = TYPE_SYS_BUS_DEVICE,
+    .instance_size = sizeof(XlnxZynqMPRTC),
+    .class_init    = rtc_class_init,
+    .instance_init = rtc_init,
+};
+
+static void rtc_register_types(void)
+{
+    type_register_static(&rtc_info);
+}
+
+type_init(rtc_register_types)
-- 
2.16.2

From: Alistair Francis <alistair.francis@xilinx.com>

Allow the guest to determine the time set from the QEMU command line.

This includes adding a trace event to debug the new time.

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/timer/xlnx-zynqmp-rtc.h |  2 ++
 hw/timer/xlnx-zynqmp-rtc.c         | 58 ++++++++++++++++++++++++++++++++++++++
 hw/timer/trace-events              |  3 ++
 3 files changed, 63 insertions(+)

diff --git a/include/hw/timer/xlnx-zynqmp-rtc.h b/include/hw/timer/xlnx-zynqmp-rtc.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/timer/xlnx-zynqmp-rtc.h
+++ b/include/hw/timer/xlnx-zynqmp-rtc.h
@@ -XXX,XX +XXX,XX @@ typedef struct XlnxZynqMPRTC {
     qemu_irq irq_rtc_int;
     qemu_irq irq_addr_error_int;
 
+    uint32_t tick_offset;
+
     uint32_t regs[XLNX_ZYNQMP_RTC_R_MAX];
     RegisterInfo regs_info[XLNX_ZYNQMP_RTC_R_MAX];
 } XlnxZynqMPRTC;
diff --git a/hw/timer/xlnx-zynqmp-rtc.c b/hw/timer/xlnx-zynqmp-rtc.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/timer/xlnx-zynqmp-rtc.c
+++ b/hw/timer/xlnx-zynqmp-rtc.c
@@ -XXX,XX +XXX,XX @@
 #include "hw/register.h"
 #include "qemu/bitops.h"
 #include "qemu/log.h"
+#include "hw/ptimer.h"
+#include "qemu/cutils.h"
+#include "sysemu/sysemu.h"
+#include "trace.h"
 #include "hw/timer/xlnx-zynqmp-rtc.h"
 
 #ifndef XLNX_ZYNQMP_RTC_ERR_DEBUG
@@ -XXX,XX +XXX,XX @@ static void addr_error_int_update_irq(XlnxZynqMPRTC *s)
     qemu_set_irq(s->irq_addr_error_int, pending);
 }
 
+static uint32_t rtc_get_count(XlnxZynqMPRTC *s)
+{
+    int64_t now = qemu_clock_get_ns(rtc_clock);
+    return s->tick_offset + now / NANOSECONDS_PER_SECOND;
+}
+
+static uint64_t current_time_postr(RegisterInfo *reg, uint64_t val64)
+{
+    XlnxZynqMPRTC *s = XLNX_ZYNQMP_RTC(reg->opaque);
+
+    return rtc_get_count(s);
+}
+
 static void rtc_int_status_postw(RegisterInfo *reg, uint64_t val64)
 {
     XlnxZynqMPRTC *s = XLNX_ZYNQMP_RTC(reg->opaque);
@@ -XXX,XX +XXX,XX @@ static uint64_t addr_error_int_dis_prew(RegisterInfo *reg, uint64_t val64)
 
 static const RegisterAccessInfo rtc_regs_info[] = {
     {   .name = "SET_TIME_WRITE",  .addr = A_SET_TIME_WRITE,
+        .unimp = MAKE_64BIT_MASK(0, 32),
     },{ .name = "SET_TIME_READ",  .addr = A_SET_TIME_READ,
         .ro = 0xffffffff,
+        .post_read = current_time_postr,
     },{ .name = "CALIB_WRITE",  .addr = A_CALIB_WRITE,
+        .unimp = MAKE_64BIT_MASK(0, 32),
     },{ .name = "CALIB_READ",  .addr = A_CALIB_READ,
         .ro = 0x1fffff,
     },{ .name = "CURRENT_TIME",  .addr = A_CURRENT_TIME,
         .ro = 0xffffffff,
+        .post_read = current_time_postr,
     },{ .name = "CURRENT_TICK",  .addr = A_CURRENT_TICK,
         .ro = 0xffff,
     },{ .name = "ALARM",  .addr = A_ALARM,
@@ -XXX,XX +XXX,XX @@ static void rtc_init(Object *obj)
     XlnxZynqMPRTC *s = XLNX_ZYNQMP_RTC(obj);
     SysBusDevice *sbd = SYS_BUS_DEVICE(obj);
     RegisterInfoArray *reg_array;
+    struct tm current_tm;
 
     memory_region_init(&s->iomem, obj, TYPE_XLNX_ZYNQMP_RTC,
                        XLNX_ZYNQMP_RTC_R_MAX * 4);
@@ -XXX,XX +XXX,XX @@ static void rtc_init(Object *obj)
     sysbus_init_mmio(sbd, &s->iomem);
     sysbus_init_irq(sbd, &s->irq_rtc_int);
     sysbus_init_irq(sbd, &s->irq_addr_error_int);
+
+    qemu_get_timedate(&current_tm, 0);
+    s->tick_offset = mktimegm(&current_tm) -
+        qemu_clock_get_ns(rtc_clock) / NANOSECONDS_PER_SECOND;
+
+    trace_xlnx_zynqmp_rtc_gettime(current_tm.tm_year, current_tm.tm_mon,
+                                  current_tm.tm_mday, current_tm.tm_hour,
+                                  current_tm.tm_min, current_tm.tm_sec);
+}
+
+static int rtc_pre_save(void *opaque)
+{
+    XlnxZynqMPRTC *s = opaque;
+    int64_t now = qemu_clock_get_ns(rtc_clock) / NANOSECONDS_PER_SECOND;
+
+    /* Add the time at migration */
+    s->tick_offset = s->tick_offset + now;
+
+    return 0;
+}
+
+static int rtc_post_load(void *opaque, int version_id)
+{
+    XlnxZynqMPRTC *s = opaque;
+    int64_t now = qemu_clock_get_ns(rtc_clock) / NANOSECONDS_PER_SECOND;
+
+    /* Subtract the time after migration. This combined with the pre_save
+     * action results in us having subtracted the time that the guest was
+     * stopped from the offset.
+     */
+    s->tick_offset = s->tick_offset - now;
+
+    return 0;
 }
 
 static const VMStateDescription vmstate_rtc = {
     .name = TYPE_XLNX_ZYNQMP_RTC,
     .version_id = 1,
     .minimum_version_id = 1,
+    .pre_save = rtc_pre_save,
+    .post_load = rtc_post_load,
     .fields = (VMStateField[]) {
         VMSTATE_UINT32_ARRAY(regs, XlnxZynqMPRTC, XLNX_ZYNQMP_RTC_R_MAX),
+        VMSTATE_UINT32(tick_offset, XlnxZynqMPRTC),
         VMSTATE_END_OF_LIST(),
     }
 };
diff --git a/hw/timer/trace-events b/hw/timer/trace-events
index XXXXXXX..XXXXXXX 100644
--- a/hw/timer/trace-events
+++ b/hw/timer/trace-events
@@ -XXX,XX +XXX,XX @@ systick_write(uint64_t addr, uint32_t value, unsigned size) "systick write addr
 cmsdk_apb_timer_read(uint64_t offset, uint64_t data, unsigned size) "CMSDK APB timer read: offset 0x%" PRIx64 " data 0x%" PRIx64 " size %u"
 cmsdk_apb_timer_write(uint64_t offset, uint64_t data, unsigned size) "CMSDK APB timer write: offset 0x%" PRIx64 " data 0x%" PRIx64 " size %u"
 cmsdk_apb_timer_reset(void) "CMSDK APB timer: reset"
+
+# hw/timer/xlnx-zynqmp-rtc.c
+xlnx_zynqmp_rtc_gettime(int year, int month, int day, int hour, int min, int sec) "Get time from host: %d-%d-%d %2d:%02d:%02d"
-- 
2.16.2

From: Alistair Francis <alistair.francis@xilinx.com>

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/arm/xlnx-zynqmp.h |  2 ++
 hw/arm/xlnx-zynqmp.c         | 14 ++++++++++++++
 2 files changed, 16 insertions(+)

diff --git a/include/hw/arm/xlnx-zynqmp.h b/include/hw/arm/xlnx-zynqmp.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/arm/xlnx-zynqmp.h
+++ b/include/hw/arm/xlnx-zynqmp.h
@@ -XXX,XX +XXX,XX @@
 #include "hw/dma/xlnx_dpdma.h"
 #include "hw/display/xlnx_dp.h"
 #include "hw/intc/xlnx-zynqmp-ipi.h"
+#include "hw/timer/xlnx-zynqmp-rtc.h"
 
 #define TYPE_XLNX_ZYNQMP "xlnx,zynqmp"
 #define XLNX_ZYNQMP(obj) OBJECT_CHECK(XlnxZynqMPState, (obj), \
@@ -XXX,XX +XXX,XX @@ typedef struct XlnxZynqMPState {
     XlnxDPState dp;
     XlnxDPDMAState dpdma;
     XlnxZynqMPIPI ipi;
+    XlnxZynqMPRTC rtc;
 
     char *boot_cpu;
     ARMCPU *boot_cpu_ptr;
diff --git a/hw/arm/xlnx-zynqmp.c b/hw/arm/xlnx-zynqmp.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/xlnx-zynqmp.c
+++ b/hw/arm/xlnx-zynqmp.c
@@ -XXX,XX +XXX,XX @@
 #define IPI_ADDR            0xFF300000
 #define IPI_IRQ             64
 
+#define RTC_ADDR            0xffa60000
+#define RTC_IRQ             26
+
 #define SDHCI_CAPABILITIES  0x280737ec6481 /* Datasheet: UG1085 (v1.7) */
 
 static const uint64_t gem_addr[XLNX_ZYNQMP_NUM_GEMS] = {
@@ -XXX,XX +XXX,XX @@ static void xlnx_zynqmp_init(Object *obj)
 
     object_initialize(&s->ipi, sizeof(s->ipi), TYPE_XLNX_ZYNQMP_IPI);
     qdev_set_parent_bus(DEVICE(&s->ipi), sysbus_get_default());
+
+    object_initialize(&s->rtc, sizeof(s->rtc), TYPE_XLNX_ZYNQMP_RTC);
+    qdev_set_parent_bus(DEVICE(&s->rtc), sysbus_get_default());
 }
 
 static void xlnx_zynqmp_realize(DeviceState *dev, Error **errp)
@@ -XXX,XX +XXX,XX @@ static void xlnx_zynqmp_realize(DeviceState *dev, Error **errp)
     }
     sysbus_mmio_map(SYS_BUS_DEVICE(&s->ipi), 0, IPI_ADDR);
     sysbus_connect_irq(SYS_BUS_DEVICE(&s->ipi), 0, gic_spi[IPI_IRQ]);
+
+    object_property_set_bool(OBJECT(&s->rtc), true, "realized", &err);
+    if (err) {
+        error_propagate(errp, err);
+        return;
+    }
+    sysbus_mmio_map(SYS_BUS_DEVICE(&s->rtc), 0, RTC_ADDR);
+    sysbus_connect_irq(SYS_BUS_DEVICE(&s->rtc), 0, gic_spi[RTC_IRQ]);
 }
 
 static Property xlnx_zynqmp_props[] = {
-- 
2.16.2

From: Richard Henderson <richard.henderson@linaro.org>

Allow the translate subroutines to return false for invalid insns.

At present we can of course invoke an invalid insn exception from within
the translate subroutine, but in the short term this consolidates code.
In the long term it would allow the decodetree language to support
overlapping patterns for ISA extensions.
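
As a rough sketch of what the generated code looks like after this
change, the decoder can hand the translate subroutine's result straight
back to its caller. Everything below is hypothetical and hand-written
for illustration (the encoding, the arg_add fields, and the emitted
counter are assumptions, not decodetree output):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, minimal stand-ins for the real translator types. */
typedef struct {
    int emitted;            /* number of insns successfully translated */
} DisasContext;

typedef struct {
    int rd;
    int rn;
} arg_add;

/* Translate subroutine: returns false to reject an invalid encoding. */
static bool trans_add(DisasContext *ctx, arg_add *a, uint32_t insn)
{
    (void)insn;
    if (a->rd == 15) {
        return false;       /* treated as an invalid insn in this sketch */
    }
    ctx->emitted++;
    return true;
}

/* Decoder: propagates the subroutine's result instead of returning true. */
static bool decode(DisasContext *ctx, uint32_t insn)
{
    if ((insn & 0xff000000u) == 0x01000000u) {
        arg_add a = { .rd = (insn >> 4) & 0xf, .rn = insn & 0xf };
        return trans_add(ctx, &a, insn);
    }
    return false;           /* no pattern matched */
}
```

The caller can then raise the invalid-insn exception in one place
whenever decode() returns false, whether no pattern matched or the
subroutine rejected the encoding.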

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180227232618.2908-1-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 scripts/decodetree.py | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/scripts/decodetree.py b/scripts/decodetree.py
index XXXXXXX..XXXXXXX 100755
--- a/scripts/decodetree.py
+++ b/scripts/decodetree.py
@@ -XXX,XX +XXX,XX @@ class Pattern(General):
         global translate_prefix
         output('typedef ', self.base.base.struct_name(),
                ' arg_', self.name, ';\n')
-        output(translate_scope, 'void ', translate_prefix, '_', self.name,
+        output(translate_scope, 'bool ', translate_prefix, '_', self.name,
                '(DisasContext *ctx, arg_', self.name,
                ' *a, ', insntype, ' insn);\n')
 
@@ -XXX,XX +XXX,XX @@ class Pattern(General):
             output(ind, self.base.extract_name(), '(&u.f_', arg, ', insn);\n')
         for n, f in self.fields.items():
             output(ind, 'u.f_', arg, '.', n, ' = ', f.str_extract(), ';\n')
-        output(ind, translate_prefix, '_', self.name,
+        output(ind, 'return ', translate_prefix, '_', self.name,
                '(ctx, &u.f_', arg, ', insn);\n')
-        output(ind, 'return true;\n')
 # end Pattern
 
 
-- 
2.16.2

Add a function load_ramdisk_as() which behaves like the existing
load_ramdisk() but allows the caller to specify the AddressSpace
to use. This matches the pattern we have already for various
other loader functions.
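
The wrapper pattern is roughly the following. This is a self-contained
sketch, not the loader code itself: the AddressSpace struct and the
sketch_* names are hypothetical, and the function returns the name of
the targeted address space purely so the behaviour can be observed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for QEMU's AddressSpace; illustration only. */
typedef struct AddressSpace {
    const char *name;
} AddressSpace;

static AddressSpace sketch_system_as = { "system" };

/* "_as" variant: a NULL address space selects the system default. */
static const char *sketch_load_as(const char *filename, AddressSpace *as)
{
    assert(filename != NULL);
    if (as == NULL) {
        as = &sketch_system_as;
    }
    return as->name;        /* report which address space was targeted */
}

/* Legacy entry point: keeps its signature and forwards with NULL. */
static const char *sketch_load(const char *filename)
{
    return sketch_load_as(filename, NULL);
}
```

Existing callers keep working unchanged, while new callers can pass an
explicit address space.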

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180220180325.29818-2-peter.maydell@linaro.org
---
 include/hw/loader.h | 12 +++++++++++-
 hw/core/loader.c    |  8 +++++++-
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/include/hw/loader.h b/include/hw/loader.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/loader.h
+++ b/include/hw/loader.h
@@ -XXX,XX +XXX,XX @@ int load_uimage(const char *filename, hwaddr *ep,
                 void *translate_opaque);
 
 /**
- * load_ramdisk:
+ * load_ramdisk_as:
  * @filename: Path to the ramdisk image
  * @addr: Memory address to load the ramdisk to
  * @max_sz: Maximum allowed ramdisk size (for non-u-boot ramdisks)
+ * @as: The AddressSpace to load the ramdisk to. The value of address_space_memory
+ *      is used if nothing is supplied here.
  *
  * Load a ramdisk image with U-Boot header to the specified memory
  * address.
  *
  * Returns the size of the loaded image on success, -1 otherwise.
  */
+int load_ramdisk_as(const char *filename, hwaddr addr, uint64_t max_sz,
+                    AddressSpace *as);
+
+/**
+ * load_ramdisk:
+ * Same as load_ramdisk_as(), but doesn't allow the caller to specify
+ * an AddressSpace.
+ */
 int load_ramdisk(const char *filename, hwaddr addr, uint64_t max_sz);
 
 ssize_t gunzip(void *dst, size_t dstlen, uint8_t *src, size_t srclen);
diff --git a/hw/core/loader.c b/hw/core/loader.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/core/loader.c
+++ b/hw/core/loader.c
@@ -XXX,XX +XXX,XX @@ int load_uimage_as(const char *filename, hwaddr *ep, hwaddr *loadaddr,
 
 /* Load a ramdisk.  */
 int load_ramdisk(const char *filename, hwaddr addr, uint64_t max_sz)
+{
+    return load_ramdisk_as(filename, addr, max_sz, NULL);
+}
+
+int load_ramdisk_as(const char *filename, hwaddr addr, uint64_t max_sz,
+                    AddressSpace *as)
 {
     return load_uboot_image(filename, NULL, &addr, NULL, IH_TYPE_RAMDISK,
-                            NULL, NULL, NULL);
+                            NULL, NULL, as);
 }
 
 /* Load a gzip-compressed kernel to a dynamically allocated buffer. */
-- 
2.16.2

Instead of loading kernels, device trees, and the like to
the system address space, use the CPU's address space. This
is important if we're trying to load the file to memory or
via an alias memory region that is provided by an SoC
object and thus not mapped into the system address space.
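
The choice of address space reduces to a small decision, sketched here
with hypothetical names (SKETCH_AS_NS, SKETCH_AS_S and
sketch_boot_as_index are assumptions for illustration; the patch itself
uses the CPU's ARMASIdx_* indexes):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical index values, for illustration only. */
enum {
    SKETCH_AS_NS = 0,       /* non-secure address space */
    SKETCH_AS_S  = 1        /* secure address space */
};

/*
 * Prefer the secure address space only if the CPU implements EL3
 * and the guest is going to be booted in the secure state.
 */
static int sketch_boot_as_index(bool cpu_has_el3, bool secure_boot)
{
    return (cpu_has_el3 && secure_boot) ? SKETCH_AS_S : SKETCH_AS_NS;
}
```

Every loader call and bootloader write then goes through the address
space picked this way, so the CPU sees the loaded data where it expects.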

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180220180325.29818-3-peter.maydell@linaro.org
---
 hw/arm/boot.c | 119 +++++++++++++++++++++++++++++++++++++---------------------
 1 file changed, 76 insertions(+), 43 deletions(-)

diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -XXX,XX +XXX,XX @@
 #define ARM64_TEXT_OFFSET_OFFSET    8
 #define ARM64_MAGIC_OFFSET          56
 
+static AddressSpace *arm_boot_address_space(ARMCPU *cpu,
+                                            const struct arm_boot_info *info)
+{
+    /* Return the address space to use for bootloader reads and writes.
+     * We prefer the secure address space if the CPU has it and we're
+     * going to boot the guest into it.
+     */
+    int asidx;
+    CPUState *cs = CPU(cpu);
+
+    if (arm_feature(&cpu->env, ARM_FEATURE_EL3) && info->secure_boot) {
+        asidx = ARMASIdx_S;
+    } else {
+        asidx = ARMASIdx_NS;
+    }
+
+    return cpu_get_address_space(cs, asidx);
+}
+
 typedef enum {
     FIXUP_NONE = 0,     /* do nothing */
     FIXUP_TERMINATOR,   /* end of insns */
@@ -XXX,XX +XXX,XX @@ static const ARMInsnFixup smpboot[] = {
 };
 
 static void write_bootloader(const char *name, hwaddr addr,
-                             const ARMInsnFixup *insns, uint32_t *fixupcontext)
+                             const ARMInsnFixup *insns, uint32_t *fixupcontext,
+                             AddressSpace *as)
 {
     /* Fix up the specified bootloader fragment and write it into
      * guest memory using rom_add_blob_fixed(). fixupcontext is
@@ -XXX,XX +XXX,XX @@ static void write_bootloader(const char *name, hwaddr addr,
         code[i] = tswap32(insn);
     }
 
-    rom_add_blob_fixed(name, code, len * sizeof(uint32_t), addr);
+    rom_add_blob_fixed_as(name, code, len * sizeof(uint32_t), addr, as);
 
     g_free(code);
 }
@@ -XXX,XX +XXX,XX @@ static void default_write_secondary(ARMCPU *cpu,
                                     const struct arm_boot_info *info)
 {
     uint32_t fixupcontext[FIXUP_MAX];
+    AddressSpace *as = arm_boot_address_space(cpu, info);
 
     fixupcontext[FIXUP_GIC_CPU_IF] = info->gic_cpu_if_addr;
     fixupcontext[FIXUP_BOOTREG] = info->smp_bootreg_addr;
@@ -XXX,XX +XXX,XX @@ static void default_write_secondary(ARMCPU *cpu,
     }
 
     write_bootloader("smpboot", info->smp_loader_start,
-                     smpboot, fixupcontext);
+                     smpboot, fixupcontext, as);
 }
 
 void arm_write_secure_board_setup_dummy_smc(ARMCPU *cpu,
                                             const struct arm_boot_info *info,
                                             hwaddr mvbar_addr)
 {
+    AddressSpace *as = arm_boot_address_space(cpu, info);
     int n;
     uint32_t mvbar_blob[] = {
         /* mvbar_addr: secure monitor vectors
@@ -XXX,XX +XXX,XX @@ void arm_write_secure_board_setup_dummy_smc(ARMCPU *cpu,
     for (n = 0; n < ARRAY_SIZE(mvbar_blob); n++) {
         mvbar_blob[n] = tswap32(mvbar_blob[n]);
     }
-    rom_add_blob_fixed("board-setup-mvbar", mvbar_blob, sizeof(mvbar_blob),
-                       mvbar_addr);
+    rom_add_blob_fixed_as("board-setup-mvbar", mvbar_blob, sizeof(mvbar_blob),
+                          mvbar_addr, as);
 
     for (n = 0; n < ARRAY_SIZE(board_setup_blob); n++) {
         board_setup_blob[n] = tswap32(board_setup_blob[n]);
     }
-    rom_add_blob_fixed("board-setup", board_setup_blob,
-                       sizeof(board_setup_blob), info->board_setup_addr);
+    rom_add_blob_fixed_as("board-setup", board_setup_blob,
+                          sizeof(board_setup_blob), info->board_setup_addr, as);
 }
 
 static void default_reset_secondary(ARMCPU *cpu,
                                     const struct arm_boot_info *info)
 {
+    AddressSpace *as = arm_boot_address_space(cpu, info);
     CPUState *cs = CPU(cpu);
 
-    address_space_stl_notdirty(&address_space_memory, info->smp_bootreg_addr,
+    address_space_stl_notdirty(as, info->smp_bootreg_addr,
                                0, MEMTXATTRS_UNSPECIFIED, NULL);
     cpu_set_pc(cs, info->smp_loader_start);
 }
@@ -XXX,XX +XXX,XX @@ static inline bool have_dtb(const struct arm_boot_info *info)
 }
 
 #define WRITE_WORD(p, value) do { \
-    address_space_stl_notdirty(&address_space_memory, p, value, \
+    address_space_stl_notdirty(as, p, value, \
                                MEMTXATTRS_UNSPECIFIED, NULL);  \
     p += 4;                       \
 } while (0)
 
-static void set_kernel_args(const struct arm_boot_info *info)
+static void set_kernel_args(const struct arm_boot_info *info, AddressSpace *as)
 {
     int initrd_size = info->initrd_size;
     hwaddr base = info->loader_start;
@@ -XXX,XX +XXX,XX @@ static void set_kernel_args(const struct arm_boot_info *info)
         int cmdline_size;
 
         cmdline_size = strlen(info->kernel_cmdline);
-        cpu_physical_memory_write(p + 8, info->kernel_cmdline,
-                                  cmdline_size + 1);
+        address_space_write(as, p + 8, MEMTXATTRS_UNSPECIFIED,
+                            (const uint8_t *)info->kernel_cmdline,
+                            cmdline_size + 1);
         cmdline_size = (cmdline_size >> 2) + 1;
         WRITE_WORD(p, cmdline_size + 2);
         WRITE_WORD(p, 0x54410009);
@@ -XXX,XX +XXX,XX @@ static void set_kernel_args(const struct arm_boot_info *info)
         atag_board_len = (info->atag_board(info, atag_board_buf) + 3) & ~3;
         WRITE_WORD(p, (atag_board_len + 8) >> 2);
         WRITE_WORD(p, 0x414f4d50);
-        cpu_physical_memory_write(p, atag_board_buf, atag_board_len);
+        address_space_write(as, p, MEMTXATTRS_UNSPECIFIED,
+                            atag_board_buf, atag_board_len);
         p += atag_board_len;
     }
     /* ATAG_END */
@@ -XXX,XX +XXX,XX @@ static void set_kernel_args(const struct arm_boot_info *info)
     WRITE_WORD(p, 0);
 }
 
-static void set_kernel_args_old(const struct arm_boot_info *info)
+static void set_kernel_args_old(const struct arm_boot_info *info,
+                                AddressSpace *as)
 {
     hwaddr p;
     const char *s;
@@ -XXX,XX +XXX,XX @@ static void set_kernel_args_old(const struct arm_boot_info *info)
     }
     s = info->kernel_cmdline;
     if (s) {
-        cpu_physical_memory_write(p, s, strlen(s) + 1);
+        address_space_write(as, p, MEMTXATTRS_UNSPECIFIED,
+                            (const uint8_t *)s, strlen(s) + 1);
     } else {
         WRITE_WORD(p, 0);
     }
@@ -XXX,XX +XXX,XX @@ static void fdt_add_psci_node(void *fdt)
  * @addr:       the address to load the image at
  * @binfo:      struct describing the boot environment
  * @addr_limit: upper limit of the available memory area at @addr
+ * @as:         address space to load image to
  *
  * Load a device tree supplied by the machine or by the user  with the
  * '-dtb' command line option, and put it at offset @addr in target
@@ -XXX,XX +XXX,XX @@ static void fdt_add_psci_node(void *fdt)
  * Note: Must not be called unless have_dtb(binfo) is true.
  */
 static int load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
-                    hwaddr addr_limit)
+                    hwaddr addr_limit, AddressSpace *as)
 {
     void *fdt = NULL;
     int size, rc;
@@ -XXX,XX +XXX,XX @@ static int load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
     /* Put the DTB into the memory map as a ROM image: this will ensure
      * the DTB is copied again upon reset, even if addr points into RAM.
      */
-    rom_add_blob_fixed("dtb", fdt, size, addr);
+    rom_add_blob_fixed_as("dtb", fdt, size, addr, as);
 
     g_free(fdt);
 
@@ -XXX,XX +XXX,XX @@ static void do_cpu_reset(void *opaque)
             }
 
             if (cs == first_cpu) {
+                AddressSpace *as = arm_boot_address_space(cpu, info);
+
                 cpu_set_pc(cs, info->loader_start);
 
                 if (!have_dtb(info)) {
                     if (old_param) {
-                        set_kernel_args_old(info);
+                        set_kernel_args_old(info, as);
                     } else {
-                        set_kernel_args(info);
+                        set_kernel_args(info, as);
                     }
                 }
             } else {
@@ -XXX,XX +XXX,XX @@ static int do_arm_linux_init(Object *obj, void *opaque)
 
 static uint64_t arm_load_elf(struct arm_boot_info *info, uint64_t *pentry,
                              uint64_t *lowaddr, uint64_t *highaddr,
-                             int elf_machine)
+                             int elf_machine, AddressSpace *as)
 {
     bool elf_is64;
     union {
@@ -XXX,XX +XXX,XX @@ static uint64_t arm_load_elf(struct arm_boot_info *info, uint64_t *pentry,
         }
     }
 
-    ret = load_elf(info->kernel_filename, NULL, NULL,
-                   pentry, lowaddr, highaddr, big_endian, elf_machine,
-                   1, data_swab);
+    ret = load_elf_as(info->kernel_filename, NULL, NULL,
+                      pentry, lowaddr, highaddr, big_endian, elf_machine,
+                      1, data_swab, as);
     if (ret <= 0) {
         /* The header loaded but the image didn't */
         exit(1);
@@ -XXX,XX +XXX,XX @@ static uint64_t arm_load_elf(struct arm_boot_info *info, uint64_t *pentry,
 }
 
 static uint64_t load_aarch64_image(const char *filename, hwaddr mem_base,
-                                   hwaddr *entry)
+                                   hwaddr *entry, AddressSpace *as)
 {
     hwaddr kernel_load_offset = KERNEL64_LOAD_ADDR;
     uint8_t *buffer;
@@ -XXX,XX +XXX,XX @@ static uint64_t load_aarch64_image(const char *filename, hwaddr mem_base,
     }
 
     *entry = mem_base + kernel_load_offset;
-    rom_add_blob_fixed(filename, buffer, size, *entry);
+    rom_add_blob_fixed_as(filename, buffer, size, *entry, as);
 
     g_free(buffer);
 
@@ -XXX,XX +XXX,XX @@ static void arm_load_kernel_notify(Notifier *notifier, void *data)
     ARMCPU *cpu = n->cpu;
     struct arm_boot_info *info =
         container_of(n, struct arm_boot_info, load_kernel_notifier);
+    AddressSpace *as = arm_boot_address_space(cpu, info);
 
     /* The board code is not supposed to set secure_board_setup unless
      * running its code in secure mode is actually possible, and KVM
@@ -XXX,XX +XXX,XX @@ static void arm_load_kernel_notify(Notifier *notifier, void *data)
              * the kernel is supposed to be loaded by the bootloader), copy the
              * DTB to the base of RAM for the bootloader to pick up.
              */
-            if (load_dtb(info->loader_start, info, 0) < 0) {
+            if (load_dtb(info->loader_start, info, 0, as) < 0) {
                 exit(1);
             }
         }
@@ -XXX,XX +XXX,XX @@ static void arm_load_kernel_notify(Notifier *notifier, void *data)
 
     /* Assume that raw images are linux kernels, and ELF images are not.  */
     kernel_size = arm_load_elf(info, &elf_entry, &elf_low_addr,
-                               &elf_high_addr, elf_machine);
+                               &elf_high_addr, elf_machine, as);
     if (kernel_size > 0 && have_dtb(info)) {
         /* If there is still some room left at the base of RAM, try and put
          * the DTB there like we do for images loaded with -bios or -pflash.
@@ -XXX,XX +XXX,XX @@ static void arm_load_kernel_notify(Notifier *notifier, void *data)
             if (elf_low_addr < info->loader_start) {
                 elf_low_addr = 0;
             }
-            if (load_dtb(info->loader_start, info, elf_low_addr) < 0) {
+            if (load_dtb(info->loader_start, info, elf_low_addr, as) < 0) {
                 exit(1);
             }
         }
     }
     entry = elf_entry;
     if (kernel_size < 0) {
-        kernel_size = load_uimage(info->kernel_filename, &entry, NULL,
-                                  &is_linux, NULL, NULL);
+        kernel_size = load_uimage_as(info->kernel_filename, &entry, NULL,
+                                     &is_linux, NULL, NULL, as);
     }
     if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64) && kernel_size < 0) {
         kernel_size = load_aarch64_image(info->kernel_filename,
-                                         info->loader_start, &entry);
+                                         info->loader_start, &entry, as);
         is_linux = 1;
     } else if (kernel_size < 0) {
         /* 32-bit ARM */
         entry = info->loader_start + KERNEL_LOAD_ADDR;
-        kernel_size = load_image_targphys(info->kernel_filename, entry,
-                                          info->ram_size - KERNEL_LOAD_ADDR);
+        kernel_size = load_image_targphys_as(info->kernel_filename, entry,
+                                             info->ram_size - KERNEL_LOAD_ADDR,
+                                             as);
         is_linux = 1;
     }
     if (kernel_size < 0) {
@@ -XXX,XX +XXX,XX @@ static void arm_load_kernel_notify(Notifier *notifier, void *data)
         uint32_t fixupcontext[FIXUP_MAX];
 
         if (info->initrd_filename) {
-            initrd_size = load_ramdisk(info->initrd_filename,
-                                       info->initrd_start,
-                                       info->ram_size -
-                                       info->initrd_start);
+            initrd_size = load_ramdisk_as(info->initrd_filename,
+                                          info->initrd_start,
+                                          info->ram_size - info->initrd_start,
+                                          as);
             if (initrd_size < 0) {
-                initrd_size = load_image_targphys(info->initrd_filename,
-                                                  info->initrd_start,
-                                                  info->ram_size -
-                                                  info->initrd_start);
+                initrd_size = load_image_targphys_as(info->initrd_filename,
+                                                     info->initrd_start,
+                                                     info->ram_size -
+                                                     info->initrd_start,
+                                                     as);
             }
             if (initrd_size < 0) {
                 error_report("could not load initrd '%s'",
@@ -XXX,XX +XXX,XX @@ static void arm_load_kernel_notify(Notifier *notifier, void *data)
 
             /* Place the DTB after the initrd in memory with alignment. */
             dtb_start = QEMU_ALIGN_UP(info->initrd_start + initrd_size, align);
-            if (load_dtb(dtb_start, info, 0) < 0) {
+            if (load_dtb(dtb_start, info, 0, as) < 0) {
                 exit(1);
             }
             fixupcontext[FIXUP_ARGPTR] = dtb_start;
@@ -XXX,XX +XXX,XX @@ static void arm_load_kernel_notify(Notifier *notifier, void *data)
         fixupcontext[FIXUP_ENTRYPOINT] = entry;
 
         write_bootloader("bootloader", info->loader_start,
-                         primary_loader, fixupcontext);
+                         primary_loader, fixupcontext, as);
 
         if (info->nb_cpus > 1) {
             info->write_secondary_boot(cpu, info);
-- 
2.16.2

Instead of loading guest images to the system address space, use the
CPU's address space.  This is important if we're trying to load the
file to memory or via an alias memory region that is provided by an
SoC object and thus not mapped into the system address space.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180220180325.29818-4-peter.maydell@linaro.org
---
 hw/arm/armv7m.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)
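
Note for reviewers: the address space selection added to
armv7m_load_kernel() can be sketched standalone as below. This is only
an illustration; Cpu, AddrSpace, has_security and the AS_IDX_* names
are hypothetical stand-ins, not QEMU types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the per-CPU address space indexes. */
enum { AS_IDX_NS = 0, AS_IDX_S = 1, AS_IDX_COUNT };

typedef struct {
    const char *name;
} AddrSpace;

typedef struct {
    bool has_security;              /* models the ARM_FEATURE_EL3 check */
    AddrSpace *as[AS_IDX_COUNT];    /* per-CPU address spaces */
} Cpu;

/* Pick the address space that guest images should be loaded into. */
static AddrSpace *boot_address_space(const Cpu *cpu)
{
    int asidx = cpu->has_security ? AS_IDX_S : AS_IDX_NS;

    assert(cpu->as[asidx] != NULL);
    return cpu->as[asidx];
}
```

The loader calls then take the returned pointer in place of the
implicit system address space.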

diff --git a/hw/arm/armv7m.c b/hw/arm/armv7m.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/armv7m.c
+++ b/hw/arm/armv7m.c
@@ -XXX,XX +XXX,XX @@ void armv7m_load_kernel(ARMCPU *cpu, const char *kernel_filename, int mem_size)
     uint64_t entry;
     uint64_t lowaddr;
     int big_endian;
+    AddressSpace *as;
+    int asidx;
+    CPUState *cs = CPU(cpu);
 
 #ifdef TARGET_WORDS_BIGENDIAN
     big_endian = 1;
@@ -XXX,XX +XXX,XX @@ void armv7m_load_kernel(ARMCPU *cpu, const char *kernel_filename, int mem_size)
         exit(1);
     }
 
+    if (arm_feature(&cpu->env, ARM_FEATURE_EL3)) {
+        asidx = ARMASIdx_S;
+    } else {
+        asidx = ARMASIdx_NS;
+    }
+    as = cpu_get_address_space(cs, asidx);
+
     if (kernel_filename) {
-        image_size = load_elf(kernel_filename, NULL, NULL, &entry, &lowaddr,
-                              NULL, big_endian, EM_ARM, 1, 0);
+        image_size = load_elf_as(kernel_filename, NULL, NULL, &entry, &lowaddr,
+                                 NULL, big_endian, EM_ARM, 1, 0, as);
         if (image_size < 0) {
-            image_size = load_image_targphys(kernel_filename, 0, mem_size);
+            image_size = load_image_targphys_as(kernel_filename, 0,
+                                                mem_size, as);
             lowaddr = 0;
         }
         if (image_size < 0) {
-- 
2.16.2

In v8M, the Implementation Defined Attribution Unit (IDAU) is
a small piece of hardware typically implemented in the SoC
which provides board or SoC specific security attribution
information for each address that the CPU performs MPU/SAU
checks on. For QEMU, we model this with a QOM interface which
is implemented by the board or SoC object and connected to
the CPU using a link property.

This commit defines the new interface class, adds the link
property to the CPU object, and makes the SAU checking
code call the IDAU interface if one is present.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180220180325.29818-5-peter.maydell@linaro.org
---
 target/arm/cpu.h    |  3 +++
 target/arm/idau.h   | 61 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 target/arm/cpu.c    | 15 +++++++++++++
 target/arm/helper.c | 28 +++++++++++++++++++++---
 4 files changed, 104 insertions(+), 3 deletions(-)
 create mode 100644 target/arm/idau.h
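
Note for reviewers: the rule for combining the IDAU result with the
SAU result can be checked in isolation with a sketch like the one
below. SecAttrs and idau_override() are hypothetical names used only
for illustration; the condition mirrors the one added to
v8m_security_lookup().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the attributes the SAU lookup fills in. */
typedef struct {
    bool ns;    /* address is NonSecure */
    bool nsc;   /* address is NonSecure-callable */
} SecAttrs;

/*
 * Apply the IDAU result on top of the SAU result. The IDAU only wins
 * when it asks for higher security than the SAU did.
 */
static void idau_override(SecAttrs *sattrs, bool idau_ns, bool idau_nsc)
{
    assert(sattrs != NULL);

    if (!idau_ns) {
        if (sattrs->ns || (!idau_nsc && sattrs->nsc)) {
            sattrs->ns = false;
            sattrs->nsc = idau_nsc;
        }
    }
}
```

With no IDAU connected the defaults (idau_ns and idau_nsc both true)
leave the SAU result untouched.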

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ struct ARMCPU {
     /* MemoryRegion to use for secure physical accesses */
     MemoryRegion *secure_memory;
 
+    /* For v8M, pointer to the IDAU interface provided by board/SoC */
+    Object *idau;
+
     /* 'compatible' string for this CPU for Linux device trees */
     const char *dtb_compatible;
 
diff --git a/target/arm/idau.h b/target/arm/idau.h
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/target/arm/idau.h
@@ -XXX,XX +XXX,XX @@
+/*
+ * QEMU ARM CPU -- interface for the Arm v8M IDAU
+ *
+ * Copyright (c) 2018 Linaro Ltd
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see
+ * <http://www.gnu.org/licenses/gpl-2.0.html>
+ *
+ * In the v8M architecture, the IDAU is a small piece of hardware
+ * typically implemented in the SoC which provides board or SoC
+ * specific security attribution information for each address that
+ * the CPU performs MPU/SAU checks on. For QEMU, we model this with a
+ * QOM interface which is implemented by the board or SoC object and
+ * connected to the CPU using a link property.
+ */
+
+#ifndef TARGET_ARM_IDAU_H
+#define TARGET_ARM_IDAU_H
+
+#include "qom/object.h"
+
+#define TYPE_IDAU_INTERFACE "idau-interface"
+#define IDAU_INTERFACE(obj) \
+    INTERFACE_CHECK(IDAUInterface, (obj), TYPE_IDAU_INTERFACE)
+#define IDAU_INTERFACE_CLASS(class) \
+    OBJECT_CLASS_CHECK(IDAUInterfaceClass, (class), TYPE_IDAU_INTERFACE)
+#define IDAU_INTERFACE_GET_CLASS(obj) \
+    OBJECT_GET_CLASS(IDAUInterfaceClass, (obj), TYPE_IDAU_INTERFACE)
+
+typedef struct IDAUInterface {
+    Object parent;
+} IDAUInterface;
+
+#define IREGION_NOTVALID -1
+
+typedef struct IDAUInterfaceClass {
+    InterfaceClass parent;
+
+    /* Check the specified address and return the IDAU security information
+     * for it by filling in iregion, exempt, ns and nsc:
+     *  iregion: IDAU region number, or IREGION_NOTVALID if not valid
+     *  exempt: true if address is exempt from security attribution
+     *  ns: true if the address is NonSecure
+     *  nsc: true if the address is NonSecure-callable
+     */
+    void (*check)(IDAUInterface *ii, uint32_t address, int *iregion,
+                  bool *exempt, bool *ns, bool *nsc);
+} IDAUInterfaceClass;
+
+#endif
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@
  */
 
 #include "qemu/osdep.h"
+#include "target/arm/idau.h"
 #include "qemu/error-report.h"
 #include "qapi/error.h"
 #include "cpu.h"
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_post_init(Object *obj)
         }
     }
 
+    if (arm_feature(&cpu->env, ARM_FEATURE_M_SECURITY)) {
+        object_property_add_link(obj, "idau", TYPE_IDAU_INTERFACE, &cpu->idau,
+                                 qdev_prop_allow_set_link_before_realize,
+                                 OBJ_PROP_LINK_UNREF_ON_RELEASE,
+                                 &error_abort);
+    }
+
     qdev_property_add_static(DEVICE(obj), &arm_cpu_cfgend_property,
                              &error_abort);
 }
@@ -XXX,XX +XXX,XX @@ static const TypeInfo arm_cpu_type_info = {
     .class_init = arm_cpu_class_init,
 };
 
+static const TypeInfo idau_interface_type_info = {
+    .name = TYPE_IDAU_INTERFACE,
+    .parent = TYPE_INTERFACE,
+    .class_size = sizeof(IDAUInterfaceClass),
+};
+
 static void arm_cpu_register_types(void)
 {
     const ARMCPUInfo *info = arm_cpus;
 
     type_register_static(&arm_cpu_type_info);
+    type_register_static(&idau_interface_type_info);
 
     while (info->name) {
         cpu_register(info);
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@
 #include "qemu/osdep.h"
+#include "target/arm/idau.h"
 #include "trace.h"
 #include "cpu.h"
 #include "internals.h"
@@ -XXX,XX +XXX,XX @@ static void v8m_security_lookup(CPUARMState *env, uint32_t address,
      */
     ARMCPU *cpu = arm_env_get_cpu(env);
     int r;
+    bool idau_exempt = false, idau_ns = true, idau_nsc = true;
+    int idau_region = IREGION_NOTVALID;
 
-    /* TODO: implement IDAU */
+    if (cpu->idau) {
+        IDAUInterfaceClass *iic = IDAU_INTERFACE_GET_CLASS(cpu->idau);
+        IDAUInterface *ii = IDAU_INTERFACE(cpu->idau);
+
+        iic->check(ii, address, &idau_region, &idau_exempt, &idau_ns,
+                   &idau_nsc);
+    }
 
     if (access_type == MMU_INST_FETCH && extract32(address, 28, 4) == 0xf) {
         /* 0xf0000000..0xffffffff is always S for insn fetches */
         return;
     }
 
-    if (v8m_is_sau_exempt(env, address, access_type)) {
+    if (idau_exempt || v8m_is_sau_exempt(env, address, access_type)) {
         sattrs->ns = !regime_is_secure(env, mmu_idx);
         return;
     }
 
+    if (idau_region != IREGION_NOTVALID) {
+        sattrs->irvalid = true;
+        sattrs->iregion = idau_region;
+    }
+
     switch (env->sau.ctrl & 3) {
     case 0: /* SAU.ENABLE == 0, SAU.ALLNS == 0 */
         break;
@@ -XXX,XX +XXX,XX @@ static void v8m_security_lookup(CPUARMState *env, uint32_t address,
             }
         }
 
-        /* TODO when we support the IDAU then it may override the result here */
+        /* The IDAU will override the SAU lookup results if it specifies
+         * higher security than the SAU does.
+         */
+        if (!idau_ns) {
+            if (sattrs->ns || (!idau_nsc && sattrs->nsc)) {
+                sattrs->ns = false;
+                sattrs->nsc = idau_nsc;
+            }
+        }
         break;
     }
 }
-- 
2.16.2

Create an "idau" property on the armv7m container object which
we can forward to the CPU object. Annoyingly, we can't use
object_property_add_alias() because the CPU object we want to
forward to doesn't exist until the armv7m container is realized.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180220180325.29818-6-peter.maydell@linaro.org
---
 include/hw/arm/armv7m.h | 3 +++
 hw/arm/armv7m.c         | 9 +++++++++
 2 files changed, 12 insertions(+)

diff --git a/include/hw/arm/armv7m.h b/include/hw/arm/armv7m.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/arm/armv7m.h
+++ b/include/hw/arm/armv7m.h
@@ -XXX,XX +XXX,XX @@
 
 #include "hw/sysbus.h"
 #include "hw/intc/armv7m_nvic.h"
+#include "target/arm/idau.h"
 
 #define TYPE_BITBAND "ARM,bitband-memory"
 #define BITBAND(obj) OBJECT_CHECK(BitBandState, (obj), TYPE_BITBAND)
@@ -XXX,XX +XXX,XX @@ typedef struct {
  * + Property "memory": MemoryRegion defining the physical address space
  *   that CPU accesses see. (The NVIC, bitbanding and other CPU-internal
  *   devices will be automatically layered on top of this view.)
+ * + Property "idau": IDAU interface (forwarded to CPU object)
  */
 typedef struct ARMv7MState {
     /*< private >*/
@@ -XXX,XX +XXX,XX @@ typedef struct ARMv7MState {
     char *cpu_type;
     /* MemoryRegion the board provides to us (with its devices, RAM, etc) */
     MemoryRegion *board_memory;
+    Object *idau;
 } ARMv7MState;
 
 #endif
diff --git a/hw/arm/armv7m.c b/hw/arm/armv7m.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/armv7m.c
+++ b/hw/arm/armv7m.c
@@ -XXX,XX +XXX,XX @@
 #include "sysemu/qtest.h"
 #include "qemu/error-report.h"
 #include "exec/address-spaces.h"
+#include "target/arm/idau.h"
 
 /* Bitbanded IO.  Each word corresponds to a single bit.  */
 
@@ -XXX,XX +XXX,XX @@ static void armv7m_realize(DeviceState *dev, Error **errp)
 
     object_property_set_link(OBJECT(s->cpu), OBJECT(&s->container), "memory",
                              &error_abort);
+    if (object_property_find(OBJECT(s->cpu), "idau", NULL)) {
+        object_property_set_link(OBJECT(s->cpu), s->idau, "idau", &err);
+        if (err != NULL) {
+            error_propagate(errp, err);
+            return;
+        }
+    }
     object_property_set_bool(OBJECT(s->cpu), true, "realized", &err);
     if (err != NULL) {
         error_propagate(errp, err);
@@ -XXX,XX +XXX,XX @@ static Property armv7m_properties[] = {
     DEFINE_PROP_STRING("cpu-type", ARMv7MState, cpu_type),
     DEFINE_PROP_LINK("memory", ARMv7MState, board_memory, TYPE_MEMORY_REGION,
                      MemoryRegion *),
+    DEFINE_PROP_LINK("idau", ARMv7MState, idau, TYPE_IDAU_INTERFACE, Object *),
     DEFINE_PROP_END_OF_LIST(),
 };
 
-- 
2.16.2

The Cortex-M33 allows the system to specify the reset value of the
secure Vector Table Offset Register (VTOR) by asserting config
signals. In particular, guest images for the MPS2 AN505 board rely
on the MPS2's initial VTOR being correct for that board.
Implement a QEMU property so board and SoC code can set the reset
value to the correct value.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180220180325.29818-7-peter.maydell@linaro.org
---
 target/arm/cpu.h |  3 +++
 target/arm/cpu.c | 18 ++++++++++++++----
 2 files changed, 17 insertions(+), 4 deletions(-)
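
Note for reviewers: a standalone sketch of the reset-time lookup, for
illustration only. The flat little-endian memory image, ResetState,
load_le32() and m_profile_reset() are assumptions made for the
example; the two masks are the ones used in arm_cpu_reset().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 32-bit word from a flat memory image. */
static uint32_t load_le32(const uint8_t *mem, size_t addr)
{
    return (uint32_t)mem[addr] |
           ((uint32_t)mem[addr + 1] << 8) |
           ((uint32_t)mem[addr + 2] << 16) |
           ((uint32_t)mem[addr + 3] << 24);
}

typedef struct {
    uint32_t vecbase;   /* Secure VTOR after reset */
    uint32_t sp;        /* initial main stack pointer */
    uint32_t pc_word;   /* raw reset vector word, before any PC fixup */
} ResetState;

/* Derive the reset state from the configured initial Secure VTOR. */
static ResetState m_profile_reset(const uint8_t *mem, size_t mem_size,
                                  uint32_t init_svtor)
{
    ResetState rs;

    /* VTOR bits [6:0] are not writable, so drop them. */
    rs.vecbase = init_svtor & 0xffffff80u;
    assert((size_t)rs.vecbase + 8 <= mem_size);

    /* Initial SP and PC live at offsets 0 and 4 in the vector table. */
    rs.sp = load_le32(mem, rs.vecbase) & 0xfffffffcu;
    rs.pc_word = load_le32(mem, rs.vecbase + 4);
    return rs;
}
```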

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ struct ARMCPU {
      */
     uint32_t psci_conduit;
 
+    /* For v8M, initial value of the Secure VTOR */
+    uint32_t init_svtor;
+
     /* [QEMU_]KVM_ARM_TARGET_* constant for this CPU, or
      * QEMU_KVM_ARM_TARGET_NONE if the kernel doesn't support this CPU type.
      */
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_reset(CPUState *s)
         uint32_t initial_msp; /* Loaded from 0x0 */
         uint32_t initial_pc; /* Loaded from 0x4 */
         uint8_t *rom;
+        uint32_t vecbase;
 
         if (arm_feature(env, ARM_FEATURE_M_SECURITY)) {
             env->v7m.secure = true;
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_reset(CPUState *s)
         /* Unlike A/R profile, M profile defines the reset LR value */
         env->regs[14] = 0xffffffff;
 
-        /* Load the initial SP and PC from the vector table at address 0 */
-        rom = rom_ptr(0);
+        env->v7m.vecbase[M_REG_S] = cpu->init_svtor & 0xffffff80;
+
+        /* Load the initial SP and PC from offset 0 and 4 in the vector table */
+        vecbase = env->v7m.vecbase[env->v7m.secure];
+        rom = rom_ptr(vecbase);
         if (rom) {
             /* Address zero is covered by ROM which hasn't yet been
              * copied into physical memory.
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_reset(CPUState *s)
              * it got copied into memory. In the latter case, rom_ptr
              * will return a NULL pointer and we should use ldl_phys instead.
              */
-            initial_msp = ldl_phys(s->as, 0);
-            initial_pc = ldl_phys(s->as, 4);
+            initial_msp = ldl_phys(s->as, vecbase);
+            initial_pc = ldl_phys(s->as, vecbase + 4);
         }
 
         env->regs[13] = initial_msp & 0xFFFFFFFC;
@@ -XXX,XX +XXX,XX @@ static Property arm_cpu_pmsav7_dregion_property =
                                            pmsav7_dregion,
                                            qdev_prop_uint32, uint32_t);
 
+/* M profile: initial value of the Secure VTOR */
+static Property arm_cpu_initsvtor_property =
+            DEFINE_PROP_UINT32("init-svtor", ARMCPU, init_svtor, 0);
+
 static void arm_cpu_post_init(Object *obj)
 {
     ARMCPU *cpu = ARM_CPU(obj);
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_post_init(Object *obj)
                                  qdev_prop_allow_set_link_before_realize,
                                  OBJ_PROP_LINK_UNREF_ON_RELEASE,
                                  &error_abort);
+        qdev_property_add_static(DEVICE(obj), &arm_cpu_initsvtor_property,
+                                 &error_abort);
     }
 
     qdev_property_add_static(DEVICE(obj), &arm_cpu_cfgend_property,
-- 
2.16.2

Create an "init-svtor" property on the armv7m container
object which we can forward to the CPU object.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180220180325.29818-8-peter.maydell@linaro.org
---
 include/hw/arm/armv7m.h | 2 ++
 hw/arm/armv7m.c         | 9 +++++++++
 2 files changed, 11 insertions(+)

diff --git a/include/hw/arm/armv7m.h b/include/hw/arm/armv7m.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/arm/armv7m.h
+++ b/include/hw/arm/armv7m.h
@@ -XXX,XX +XXX,XX @@ typedef struct {
  *   that CPU accesses see. (The NVIC, bitbanding and other CPU-internal
  *   devices will be automatically layered on top of this view.)
  * + Property "idau": IDAU interface (forwarded to CPU object)
+ * + Property "init-svtor": secure VTOR reset value (forwarded to CPU object)
  */
 typedef struct ARMv7MState {
     /*< private >*/
@@ -XXX,XX +XXX,XX @@ typedef struct ARMv7MState {
     /* MemoryRegion the board provides to us (with its devices, RAM, etc) */
     MemoryRegion *board_memory;
     Object *idau;
+    uint32_t init_svtor;
 } ARMv7MState;
 
 #endif
diff --git a/hw/arm/armv7m.c b/hw/arm/armv7m.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/armv7m.c
+++ b/hw/arm/armv7m.c
@@ -XXX,XX +XXX,XX @@ static void armv7m_realize(DeviceState *dev, Error **errp)
             return;
         }
     }
+    if (object_property_find(OBJECT(s->cpu), "init-svtor", NULL)) {
+        object_property_set_uint(OBJECT(s->cpu), s->init_svtor,
+                                 "init-svtor", &err);
+        if (err != NULL) {
+            error_propagate(errp, err);
+            return;
+        }
+    }
     object_property_set_bool(OBJECT(s->cpu), true, "realized", &err);
     if (err != NULL) {
         error_propagate(errp, err);
@@ -XXX,XX +XXX,XX @@ static Property armv7m_properties[] = {
     DEFINE_PROP_LINK("memory", ARMv7MState, board_memory, TYPE_MEMORY_REGION,
                      MemoryRegion *),
     DEFINE_PROP_LINK("idau", ARMv7MState, idau, TYPE_IDAU_INTERFACE, Object *),
+    DEFINE_PROP_UINT32("init-svtor", ARMv7MState, init_svtor, 0),
     DEFINE_PROP_END_OF_LIST(),
 };
 
-- 
2.16.2

Add a Cortex-M33 definition. The M33 is an M profile CPU
which implements the ARM v8M architecture, including the
M profile Security Extension.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180220180325.29818-9-peter.maydell@linaro.org
---
 target/arm/cpu.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void cortex_m4_initfn(Object *obj)
     cpu->id_isar5 = 0x00000000;
 }
 
+static void cortex_m33_initfn(Object *obj)
+{
+    ARMCPU *cpu = ARM_CPU(obj);
+
+    set_feature(&cpu->env, ARM_FEATURE_V8);
+    set_feature(&cpu->env, ARM_FEATURE_M);
+    set_feature(&cpu->env, ARM_FEATURE_M_SECURITY);
+    set_feature(&cpu->env, ARM_FEATURE_THUMB_DSP);
+    cpu->midr = 0x410fd213; /* r0p3 */
+    cpu->pmsav7_dregion = 16;
+    cpu->sau_sregion = 8;
+    cpu->id_pfr0 = 0x00000030;
+    cpu->id_pfr1 = 0x00000210;
+    cpu->id_dfr0 = 0x00200000;
+    cpu->id_afr0 = 0x00000000;
+    cpu->id_mmfr0 = 0x00101F40;
+    cpu->id_mmfr1 = 0x00000000;
+    cpu->id_mmfr2 = 0x01000000;
+    cpu->id_mmfr3 = 0x00000000;
+    cpu->id_isar0 = 0x01101110;
+    cpu->id_isar1 = 0x02212000;
+    cpu->id_isar2 = 0x20232232;
+    cpu->id_isar3 = 0x01111131;
+    cpu->id_isar4 = 0x01310132;
+    cpu->id_isar5 = 0x00000000;
+    cpu->clidr = 0x00000000;
+    cpu->ctr = 0x8000c000;
+}
+
 static void arm_v7m_class_init(ObjectClass *oc, void *data)
 {
     CPUClass *cc = CPU_CLASS(oc);
@@ -XXX,XX +XXX,XX @@ static const ARMCPUInfo arm_cpus[] = {
                              .class_init = arm_v7m_class_init },
     { .name = "cortex-m4",   .initfn = cortex_m4_initfn,
                              .class_init = arm_v7m_class_init },
+    { .name = "cortex-m33",  .initfn = cortex_m33_initfn,
+                             .class_init = arm_v7m_class_init },
     { .name = "cortex-r5",   .initfn = cortex_r5_initfn },
     { .name = "cortex-a7",   .initfn = cortex_a7_initfn },
     { .name = "cortex-a8",   .initfn = cortex_a8_initfn },
-- 
2.16.2

Move the definition of the struct for the unimplemented-device
from unimp.c to unimp.h, so that users can embed the struct
in their own device structs if they prefer.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180220180325.29818-10-peter.maydell@linaro.org
---
 include/hw/misc/unimp.h | 10 ++++++++++
 hw/misc/unimp.c         | 10 ----------
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/include/hw/misc/unimp.h b/include/hw/misc/unimp.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/misc/unimp.h
+++ b/include/hw/misc/unimp.h
@@ -XXX,XX +XXX,XX @@
 
 #define TYPE_UNIMPLEMENTED_DEVICE "unimplemented-device"
 
+#define UNIMPLEMENTED_DEVICE(obj) \
+    OBJECT_CHECK(UnimplementedDeviceState, (obj), TYPE_UNIMPLEMENTED_DEVICE)
+
+typedef struct {
+    SysBusDevice parent_obj;
+    MemoryRegion iomem;
+    char *name;
+    uint64_t size;
+} UnimplementedDeviceState;
+
 /**
  * create_unimplemented_device: create and map a dummy device
  * @name: name of the device for debug logging
diff --git a/hw/misc/unimp.c b/hw/misc/unimp.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/misc/unimp.c
+++ b/hw/misc/unimp.c
@@ -XXX,XX +XXX,XX @@
 #include "qemu/log.h"
 #include "qapi/error.h"
 
-#define UNIMPLEMENTED_DEVICE(obj) \
-    OBJECT_CHECK(UnimplementedDeviceState, (obj), TYPE_UNIMPLEMENTED_DEVICE)
-
-typedef struct {
-    SysBusDevice parent_obj;
-    MemoryRegion iomem;
-    char *name;
-    uint64_t size;
-} UnimplementedDeviceState;
-
 static uint64_t unimp_read(void *opaque, hwaddr offset, unsigned size)
 {
     UnimplementedDeviceState *s = UNIMPLEMENTED_DEVICE(opaque);
-- 
2.16.2

The function qdev_init_gpio_in_named() passes the DeviceState pointer
as the opaque data pointer for the irq handler function.  Usually
this is what you want, but in some cases it would be helpful to use
some other data pointer.

Add a new function qdev_init_gpio_in_named_with_opaque() which allows
the caller to specify the data pointer they want.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180220180325.29818-12-peter.maydell@linaro.org
---
 include/hw/qdev-core.h | 30 ++++++++++++++++++++++++++++--
 hw/core/qdev.c         |  8 +++++---
 2 files changed, 33 insertions(+), 5 deletions(-)
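
Note for reviewers: the shape of the change (a general function taking
an explicit opaque pointer, plus an inline wrapper that passes the
device itself) can be sketched standalone as below. Device, Line,
init_gpio_in_with_opaque() and the other names are hypothetical and
only model the pattern; they are not the qdev API.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LINES 8     /* arbitrary limit for the sketch */

typedef void (*irq_handler)(void *opaque, int n, int level);

typedef struct {
    irq_handler handler;
    void *opaque;
    int n;
} Line;

typedef struct Device {
    Line lines[MAX_LINES];
    int num_lines;
} Device;

/* General form: the caller chooses what the handler receives. */
static void init_gpio_in_with_opaque(Device *dev, irq_handler handler,
                                     void *opaque, int n)
{
    assert(dev->num_lines + n <= MAX_LINES);
    for (int i = 0; i < n; i++) {
        Line *l = &dev->lines[dev->num_lines + i];

        l->handler = handler;
        l->opaque = opaque;
        l->n = dev->num_lines + i;
    }
    dev->num_lines += n;
}

/* Common case: the handler receives the device itself. */
static inline void init_gpio_in(Device *dev, irq_handler handler, int n)
{
    init_gpio_in_with_opaque(dev, handler, dev, n);
}

/* Drive one input line, invoking its handler with the stored opaque. */
static void set_line(Device *dev, int idx, int level)
{
    Line *l = &dev->lines[idx];

    l->handler(l->opaque, l->n, level);
}

/* Example handler whose opaque pointer is an int array, not a Device. */
static void record_level(void *opaque, int n, int level)
{
    int *levels = opaque;

    levels[n] = level;
}
```

Keeping the old name as a wrapper means existing callers do not need
to change.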

diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -XXX,XX +XXX,XX @@ BusState *qdev_get_child_bus(DeviceState *dev, const char *name);
 /* GPIO inputs also double as IRQ sinks.  */
 void qdev_init_gpio_in(DeviceState *dev, qemu_irq_handler handler, int n);
 void qdev_init_gpio_out(DeviceState *dev, qemu_irq *pins, int n);
-void qdev_init_gpio_in_named(DeviceState *dev, qemu_irq_handler handler,
-                             const char *name, int n);
 void qdev_init_gpio_out_named(DeviceState *dev, qemu_irq *pins,
                               const char *name, int n);
+/**
+ * qdev_init_gpio_in_named_with_opaque: create an array of input GPIO lines
+ *   for the specified device
+ *
+ * @dev: Device to create input GPIOs for
+ * @handler: Function to call when GPIO line value is set
+ * @opaque: Opaque data pointer to pass to @handler
+ * @name: Name of the GPIO input (must be unique for this device)
+ * @n: Number of GPIO lines in this input set
+ */
+void qdev_init_gpio_in_named_with_opaque(DeviceState *dev,
+                                         qemu_irq_handler handler,
+                                         void *opaque,
+                                         const char *name, int n);
+
+/**
+ * qdev_init_gpio_in_named: create an array of input GPIO lines
+ *   for the specified device
+ *
+ * Like qdev_init_gpio_in_named_with_opaque(), but the opaque pointer
+ * passed to the handler is @dev (which is the most commonly desired behaviour).
+ */
+static inline void qdev_init_gpio_in_named(DeviceState *dev,
+                                           qemu_irq_handler handler,
+                                           const char *name, int n)
+{
+    qdev_init_gpio_in_named_with_opaque(dev, handler, dev, name, n);
+}
 
 void qdev_pass_gpios(DeviceState *dev, DeviceState *container,
                      const char *name);
diff --git a/hw/core/qdev.c b/hw/core/qdev.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/core/qdev.c
+++ b/hw/core/qdev.c
@@ -XXX,XX +XXX,XX @@ static NamedGPIOList *qdev_get_named_gpio_list(DeviceState *dev,
     return ngl;
 }
 
-void qdev_init_gpio_in_named(DeviceState *dev, qemu_irq_handler handler,
-                             const char *name, int n)
+void qdev_init_gpio_in_named_with_opaque(DeviceState *dev,
+                                         qemu_irq_handler handler,
+                                         void *opaque,
+                                         const char *name, int n)
 {
     int i;
     NamedGPIOList *gpio_list = qdev_get_named_gpio_list(dev, name);
 
     assert(gpio_list->num_out == 0 || !name);
     gpio_list->in = qemu_extend_irqs(gpio_list->in, gpio_list->num_in, handler,
-                                     dev, n);
+                                     opaque, n);
 
     if (!name) {
         name = "unnamed-gpio-in";
-- 
2.16.2

In some board or SoC models it is necessary to split a qemu_irq line
so that one input can feed multiple outputs.  We currently have
qemu_irq_split() for this, but that has several deficiencies:
 * it can only handle splitting a line into two
 * it unavoidably leaks memory, so it can't be used
   in a device that can be deleted

Implement a qdev device that encapsulates splitting of IRQs, with a
configurable number of outputs.  (This is in some ways the inverse of
the TYPE_OR_IRQ device.)
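
As a rough illustration of the fan-out behaviour described above, the
following standalone sketch models a splitter outside of QEMU.  It does
not use the qdev or qemu_irq APIs: every sketch_* name and the
SKETCH_MAX_LINES constant are hypothetical stand-ins, and the accepted
range of 1 to 16 lines inclusive is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for qemu_irq; not the real QEMU API. */
#define SKETCH_MAX_LINES 16

typedef void (*sketch_irq_fn)(void *opaque, int level);

typedef struct SketchIRQ {
    sketch_irq_fn fn;
    void *opaque;
} SketchIRQ;

typedef struct SketchSplitter {
    SketchIRQ out[SKETCH_MAX_LINES];
    unsigned num_lines;
} SketchSplitter;

/* Returns 0 on success, -1 if num_lines is outside 1..SKETCH_MAX_LINES. */
static int sketch_splitter_init(SketchSplitter *s, unsigned num_lines)
{
    if (num_lines < 1 || num_lines > SKETCH_MAX_LINES) {
        return -1;
    }
    s->num_lines = num_lines;
    for (unsigned i = 0; i < SKETCH_MAX_LINES; i++) {
        s->out[i].fn = NULL;
        s->out[i].opaque = NULL;
    }
    return 0;
}

/* Wire output line n to a sink callback. */
static void sketch_splitter_connect(SketchSplitter *s, unsigned n,
                                    sketch_irq_fn fn, void *opaque)
{
    assert(n < s->num_lines);
    s->out[n].fn = fn;
    s->out[n].opaque = opaque;
}

/* Input handler: forward the new level to every connected output. */
static void sketch_splitter_set(SketchSplitter *s, int level)
{
    for (unsigned i = 0; i < s->num_lines; i++) {
        if (s->out[i].fn) {
            s->out[i].fn(s->out[i].opaque, level);
        }
    }
}

/* Example sink: latch the most recent level into an int. */
static void sketch_latch(void *opaque, int level)
{
    *(int *)opaque = level;
}
```

Unlike a two-way helper, the number of outputs here is chosen at
init time and nothing is allocated per connection, which is the
property the device is meant to provide.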

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180220180325.29818-13-peter.maydell@linaro.org
---
 hw/core/Makefile.objs       |  1 +
 include/hw/core/split-irq.h | 57 +++++++++++++++++++++++++++++
 include/hw/irq.h            |  4 +-
 hw/core/split-irq.c         | 89 +++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 150 insertions(+), 1 deletion(-)
 create mode 100644 include/hw/core/split-irq.h
 create mode 100644 hw/core/split-irq.c

diff --git a/hw/core/Makefile.objs b/hw/core/Makefile.objs
index XXXXXXX..XXXXXXX 100644
--- a/hw/core/Makefile.objs
+++ b/hw/core/Makefile.objs
@@ -XXX,XX +XXX,XX @@ common-obj-$(CONFIG_FITLOADER) += loader-fit.o
 common-obj-$(CONFIG_SOFTMMU) += qdev-properties-system.o
 common-obj-$(CONFIG_SOFTMMU) += register.o
 common-obj-$(CONFIG_SOFTMMU) += or-irq.o
+common-obj-$(CONFIG_SOFTMMU) += split-irq.o
 common-obj-$(CONFIG_PLATFORM_BUS) += platform-bus.o
 
 obj-$(CONFIG_SOFTMMU) += generic-loader.o
diff --git a/include/hw/core/split-irq.h b/include/hw/core/split-irq.h
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/include/hw/core/split-irq.h
@@ -XXX,XX +XXX,XX @@
+/*
+ * IRQ splitter device.
+ *
+ * Copyright (c) 2018 Linaro Limited.
+ * Written by Peter Maydell
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+/* This is a simple device which has one GPIO input line and multiple
+ * GPIO output lines. Any change on the input line is forwarded to all
+ * of the outputs.
+ *
+ * QEMU interface:
+ *  + one unnamed GPIO input: the input line
+ *  + N unnamed GPIO outputs: the output lines
+ *  + QOM property "num-lines": sets the number of output lines
+ */
+#ifndef HW_SPLIT_IRQ_H
+#define HW_SPLIT_IRQ_H
+
+#include "hw/irq.h"
+#include "hw/sysbus.h"
+#include "qom/object.h"
+
+#define TYPE_SPLIT_IRQ "split-irq"
+
+#define MAX_SPLIT_LINES 16
+
+typedef struct SplitIRQ SplitIRQ;
+
+#define SPLIT_IRQ(obj) OBJECT_CHECK(SplitIRQ, (obj), TYPE_SPLIT_IRQ)
+
+struct SplitIRQ {
+    DeviceState parent_obj;
+
+    qemu_irq out_irq[MAX_SPLIT_LINES];
+    uint16_t num_lines;
+};
+
+#endif
diff --git a/include/hw/irq.h b/include/hw/irq.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/irq.h
+++ b/include/hw/irq.h
@@ -XXX,XX +XXX,XX @@ void qemu_free_irq(qemu_irq irq);
 /* Returns a new IRQ with opposite polarity.  */
 qemu_irq qemu_irq_invert(qemu_irq irq);
 
-/* Returns a new IRQ which feeds into both the passed IRQs */
+/* Returns a new IRQ which feeds into both the passed IRQs.
+ * It's probably better to use the TYPE_SPLIT_IRQ device instead.
+ */
 qemu_irq qemu_irq_split(qemu_irq irq1, qemu_irq irq2);
 
 /* Returns a new IRQ set which connects 1:1 to another IRQ set, which
diff --git a/hw/core/split-irq.c b/hw/core/split-irq.c
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/hw/core/split-irq.c
@@ -XXX,XX +XXX,XX @@
+/*
+ * IRQ splitter device.
+ *
+ * Copyright (c) 2018 Linaro Limited.
+ * Written by Peter Maydell
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/core/split-irq.h"
+#include "qapi/error.h"
+
+static void split_irq_handler(void *opaque, int n, int level)
+{
+    SplitIRQ *s = SPLIT_IRQ(opaque);
+    int i;
+
+    for (i = 0; i < s->num_lines; i++) {
+        qemu_set_irq(s->out_irq[i], level);
+    }
+}
+
+static void split_irq_init(Object *obj)
+{
+    qdev_init_gpio_in(DEVICE(obj), split_irq_handler, 1);
+}
+
+static void split_irq_realize(DeviceState *dev, Error **errp)
+{
+    SplitIRQ *s = SPLIT_IRQ(dev);
+
+    if (s->num_lines < 1 || s->num_lines > MAX_SPLIT_LINES) {
+        error_setg(errp,
+                   "IRQ splitter number of lines %d is not between 1 and %d",
+                   s->num_lines, MAX_SPLIT_LINES);
+        return;
+    }
+
+    qdev_init_gpio_out(dev, s->out_irq, s->num_lines);
+}
+
+static Property split_irq_properties[] = {
+    DEFINE_PROP_UINT16("num-lines", SplitIRQ, num_lines, 1),
+    DEFINE_PROP_END_OF_LIST(),
+};
+
+static void split_irq_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+
+    /* No state to reset or migrate */
+    dc->props = split_irq_properties;
+    dc->realize = split_irq_realize;
+
+    /* Reason: Needs to be wired up to work */
+    dc->user_creatable = false;
+}
+
+static const TypeInfo split_irq_type_info = {
+   .name = TYPE_SPLIT_IRQ,
+   .parent = TYPE_DEVICE,
+   .instance_size = sizeof(SplitIRQ),
+   .instance_init = split_irq_init,
+   .class_init = split_irq_class_init,
+};
+
+static void split_irq_register_types(void)
+{
+    type_register_static(&split_irq_type_info);
+}
+
+type_init(split_irq_register_types)
-- 
2.16.2

The MPS2 AN505 FPGA image includes a "FPGA control block"
which is a small set of registers handling LEDs, buttons
and some counters.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180220180325.29818-14-peter.maydell@linaro.org
---
 hw/misc/Makefile.objs           |   1 +
 include/hw/misc/mps2-fpgaio.h   |  43 ++++++++++
 hw/misc/mps2-fpgaio.c           | 176 ++++++++++++++++++++++++++++++++++++++++
 default-configs/arm-softmmu.mak |   1 +
 hw/misc/trace-events            |   6 ++
 5 files changed, 227 insertions(+)
 create mode 100644 include/hw/misc/mps2-fpgaio.h
 create mode 100644 hw/misc/mps2-fpgaio.c

Add a model of the TrustZone peripheral protection controller (PPC),
which is used to gate transactions to non-TZ-aware peripherals so
that secure software can configure them to not be accessible to
non-secure software.
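
The gating decision can be summarised by the standalone sketch below.
It mirrors the rules listed in the comment in tz_ppc_check() (the
security check can be suppressed per port by a mask, and unprivileged
accesses need cfg_ap set), but it is not the device code: SketchPPC and
sketch_ppc_allows() are hypothetical names, and interrupt signalling and
the abort versus RAZ/WI choice are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NUM_PORTS 16

/* Hypothetical reduced model of the PPC configuration inputs. */
typedef struct SketchPPC {
    bool cfg_nonsec[SKETCH_NUM_PORTS]; /* 1: port is for non-secure use */
    bool cfg_ap[SKETCH_NUM_PORTS];     /* 1: unprivileged access allowed */
    uint32_t nonsec_mask;              /* bit n set: skip security check */
} SketchPPC;

/* Return true if an access to port n should be passed through. */
static bool sketch_ppc_allows(const SketchPPC *s, int n,
                              bool secure, bool user)
{
    assert(n >= 0 && n < SKETCH_NUM_PORTS);

    bool skip_sec_check = (s->nonsec_mask >> n) & 1;

    /* Secure access to a non-secure port, or the reverse, is blocked. */
    if (!skip_sec_check && secure == s->cfg_nonsec[n]) {
        return false;
    }
    /* Unprivileged access is blocked unless cfg_ap permits it. */
    if (user && !s->cfg_ap[n]) {
        return false;
    }
    return true;
}
```

With all inputs at zero a port is therefore reachable only by secure
privileged accesses in this sketch.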

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180220180325.29818-15-peter.maydell@linaro.org
---
 hw/misc/Makefile.objs           |   2 +
 include/hw/misc/tz-ppc.h        | 101 ++++++++++++++
 hw/misc/tz-ppc.c                | 302 ++++++++++++++++++++++++++++++++++++++++
 default-configs/arm-softmmu.mak |   2 +
 hw/misc/trace-events            |  11 ++
 5 files changed, 418 insertions(+)
 create mode 100644 include/hw/misc/tz-ppc.h
 create mode 100644 hw/misc/tz-ppc.c

diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs
index XXXXXXX..XXXXXXX 100644
--- a/hw/misc/Makefile.objs
+++ b/hw/misc/Makefile.objs
@@ -XXX,XX +XXX,XX @@ obj-$(CONFIG_MIPS_ITU) += mips_itu.o
 obj-$(CONFIG_MPS2_FPGAIO) += mps2-fpgaio.o
 obj-$(CONFIG_MPS2_SCC) += mps2-scc.o
 
+obj-$(CONFIG_TZ_PPC) += tz-ppc.o
+
 obj-$(CONFIG_PVPANIC) += pvpanic.o
 obj-$(CONFIG_HYPERV_TESTDEV) += hyperv_testdev.o
 obj-$(CONFIG_AUX) += auxbus.o
diff --git a/include/hw/misc/tz-ppc.h b/include/hw/misc/tz-ppc.h
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/include/hw/misc/tz-ppc.h
@@ -XXX,XX +XXX,XX @@
+/*
+ * ARM TrustZone peripheral protection controller emulation
+ *
+ * Copyright (c) 2018 Linaro Limited
+ * Written by Peter Maydell
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 or
+ * (at your option) any later version.
+ */
+
+/* This is a model of the TrustZone peripheral protection controller (PPC).
+ * It is documented in the ARM CoreLink SIE-200 System IP for Embedded TRM
+ * (DDI 0571G):
+ * https://developer.arm.com/products/architecture/m-profile/docs/ddi0571/g
+ *
+ * The PPC sits in front of peripherals and allows secure software to
+ * configure it to either pass through or reject transactions.
+ * Rejected transactions may be configured to either be aborted, or to
+ * behave as RAZ/WI. An interrupt can be signalled for a rejected transaction.
+ *
+ * The PPC has no register interface -- it is configured purely by a
+ * collection of input signals from other hardware in the system. Typically
+ * they are either hardwired or exposed in an ad-hoc register interface by
+ * the SoC that uses the PPC.
+ *
+ * This QEMU model can be used to model either the AHB5 or APB4 TZ PPC,
+ * since the only difference between them is that the AHB version has a
+ * "default" port which has no security checks applied. In QEMU the default
+ * port can be emulated simply by wiring its downstream devices directly
+ * into the parent address space, since the PPC does not need to intercept
+ * transactions there.
+ *
+ * In the hardware, selection of which downstream port to use is done by
+ * the user's decode logic asserting one of the hsel[] signals. In QEMU,
+ * we provide 16 MMIO regions, one per port, and the user maps these into
+ * the desired addresses to implement the address decode.
+ *
+ * QEMU interface:
+ * + sysbus MMIO regions 0..15: MemoryRegions defining the upstream end
+ *   of each of the 16 ports of the PPC
+ * + Property "port[0..15]": MemoryRegion defining the downstream device(s)
+ *   for each of the 16 ports of the PPC
+ * + Named GPIO inputs "cfg_nonsec[0..15]": set to 1 if the port should be
+ *   accessible to NonSecure transactions
+ * + Named GPIO inputs "cfg_ap[0..15]": set to 1 if the port should be
+ *   accessible to non-privileged transactions
+ * + Named GPIO input "cfg_sec_resp": set to 1 if a rejected transaction should
+ *   result in a transaction error, or 0 for the transaction to RAZ/WI
+ * + Named GPIO input "irq_enable": set to 1 to enable interrupts
+ * + Named GPIO input "irq_clear": set to 1 to clear a pending interrupt
+ * + Named GPIO output "irq": set for a transaction-failed interrupt
+ * + Property "NONSEC_MASK": if a bit is set in this mask then accesses to
+ *   the associated port do not have the TZ security check performed. (This
+ *   corresponds to the hardware allowing this to be set as a Verilog
+ *   parameter.)
+ */
+
+#ifndef TZ_PPC_H
+#define TZ_PPC_H
+
+#include "hw/sysbus.h"
+
+#define TYPE_TZ_PPC "tz-ppc"
+#define TZ_PPC(obj) OBJECT_CHECK(TZPPC, (obj), TYPE_TZ_PPC)
+
+#define TZ_NUM_PORTS 16
+
+typedef struct TZPPC TZPPC;
+
+typedef struct TZPPCPort {
+    TZPPC *ppc;
+    MemoryRegion upstream;
+    AddressSpace downstream_as;
+    MemoryRegion *downstream;
+} TZPPCPort;
+
+struct TZPPC {
+    /*< private >*/
+    SysBusDevice parent_obj;
+
+    /*< public >*/
+
+    /* State: these just track the values of our input signals */
+    bool cfg_nonsec[TZ_NUM_PORTS];
+    bool cfg_ap[TZ_NUM_PORTS];
+    bool cfg_sec_resp;
+    bool irq_enable;
+    bool irq_clear;
+    /* State: are we asserting irq ? */
+    bool irq_status;
+
+    qemu_irq irq;
+
+    /* Properties */
+    uint32_t nonsec_mask;
+
+    TZPPCPort port[TZ_NUM_PORTS];
+};
+
+#endif
diff --git a/hw/misc/tz-ppc.c b/hw/misc/tz-ppc.c
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/hw/misc/tz-ppc.c
@@ -XXX,XX +XXX,XX @@
+/*
+ * ARM TrustZone peripheral protection controller emulation
+ *
+ * Copyright (c) 2018 Linaro Limited
+ * Written by Peter Maydell
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 or
+ * (at your option) any later version.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "qapi/error.h"
+#include "trace.h"
+#include "hw/sysbus.h"
+#include "hw/registerfields.h"
+#include "hw/misc/tz-ppc.h"
+
+static void tz_ppc_update_irq(TZPPC *s)
+{
+    bool level = s->irq_status && s->irq_enable;
+
+    trace_tz_ppc_update_irq(level);
+    qemu_set_irq(s->irq, level);
+}
+
+static void tz_ppc_cfg_nonsec(void *opaque, int n, int level)
+{
+    TZPPC *s = TZ_PPC(opaque);
+
+    assert(n < TZ_NUM_PORTS);
+    trace_tz_ppc_cfg_nonsec(n, level);
+    s->cfg_nonsec[n] = level;
+}
+
+static void tz_ppc_cfg_ap(void *opaque, int n, int level)
+{
+    TZPPC *s = TZ_PPC(opaque);
+
+    assert(n < TZ_NUM_PORTS);
+    trace_tz_ppc_cfg_ap(n, level);
+    s->cfg_ap[n] = level;
+}
+
+static void tz_ppc_cfg_sec_resp(void *opaque, int n, int level)
+{
+    TZPPC *s = TZ_PPC(opaque);
+
+    trace_tz_ppc_cfg_sec_resp(level);
+    s->cfg_sec_resp = level;
+}
+
+static void tz_ppc_irq_enable(void *opaque, int n, int level)
+{
+    TZPPC *s = TZ_PPC(opaque);
+
+    trace_tz_ppc_irq_enable(level);
+    s->irq_enable = level;
+    tz_ppc_update_irq(s);
+}
+
+static void tz_ppc_irq_clear(void *opaque, int n, int level)
+{
+    TZPPC *s = TZ_PPC(opaque);
+
+    trace_tz_ppc_irq_clear(level);
+
+    s->irq_clear = level;
+    if (level) {
+        s->irq_status = false;
+        tz_ppc_update_irq(s);
+    }
+}
+
+static bool tz_ppc_check(TZPPC *s, int n, MemTxAttrs attrs)
+{
+    /* Check whether to allow an access to port n; return true if
+     * the check passes, and false if the transaction must be blocked.
+     * If the latter, the caller must check cfg_sec_resp to determine
+     * whether to abort or RAZ/WI the transaction.
+     * The checks are:
+     *  + nonsec_mask suppresses any check of the secure attribute
+     *  + otherwise, block if cfg_nonsec is 1 and transaction is secure,
+     *    or if cfg_nonsec is 0 and transaction is non-secure
+     *  + block if transaction is usermode and cfg_ap is 0
+     */
+    if ((attrs.secure == s->cfg_nonsec[n] && !(s->nonsec_mask & (1 << n))) ||
+        (attrs.user && !s->cfg_ap[n])) {
+        /* Block the transaction. */
+        if (!s->irq_clear) {
+            /* Note that holding irq_clear high suppresses interrupts */
+            s->irq_status = true;
+            tz_ppc_update_irq(s);
+        }
+        return false;
+    }
+    return true;
+}
+
+static MemTxResult tz_ppc_read(void *opaque, hwaddr addr, uint64_t *pdata,
+                               unsigned size, MemTxAttrs attrs)
+{
+    TZPPCPort *p = opaque;
+    TZPPC *s = p->ppc;
+    int n = p - s->port;
+    AddressSpace *as = &p->downstream_as;
+    uint64_t data;
+    MemTxResult res;
+
+    if (!tz_ppc_check(s, n, attrs)) {
+        trace_tz_ppc_read_blocked(n, addr, attrs.secure, attrs.user);
+        if (s->cfg_sec_resp) {
+            return MEMTX_ERROR;
+        } else {
+            *pdata = 0;
+            return MEMTX_OK;
+        }
+    }
+
+    switch (size) {
+    case 1:
+        data = address_space_ldub(as, addr, attrs, &res);
+        break;
+    case 2:
+        data = address_space_lduw_le(as, addr, attrs, &res);
+        break;
+    case 4:
+        data = address_space_ldl_le(as, addr, attrs, &res);
+        break;
+    case 8:
+        data = address_space_ldq_le(as, addr, attrs, &res);
+        break;
+    default:
+        g_assert_not_reached();
+    }
+    *pdata = data;
+    return res;
+}
+
+static MemTxResult tz_ppc_write(void *opaque, hwaddr addr, uint64_t val,
+                                unsigned size, MemTxAttrs attrs)
+{
+    TZPPCPort *p = opaque;
+    TZPPC *s = p->ppc;
+    AddressSpace *as = &p->downstream_as;
+    int n = p - s->port;
+    MemTxResult res;
+
+    if (!tz_ppc_check(s, n, attrs)) {
+        trace_tz_ppc_write_blocked(n, addr, attrs.secure, attrs.user);
+        if (s->cfg_sec_resp) {
+            return MEMTX_ERROR;
+        } else {
+            return MEMTX_OK;
+        }
+    }
+
+    switch (size) {
+    case 1:
+        address_space_stb(as, addr, val, attrs, &res);
+        break;
+    case 2:
+        address_space_stw_le(as, addr, val, attrs, &res);
+        break;
+    case 4:
+        address_space_stl_le(as, addr, val, attrs, &res);
+        break;
+    case 8:
+        address_space_stq_le(as, addr, val, attrs, &res);
+        break;
+    default:
+        g_assert_not_reached();
+    }
+    return res;
+}
+
+static const MemoryRegionOps tz_ppc_ops = {
+    .read_with_attrs = tz_ppc_read,
+    .write_with_attrs = tz_ppc_write,
+    .endianness = DEVICE_LITTLE_ENDIAN,
+};
+
+static void tz_ppc_reset(DeviceState *dev)
+{
+    TZPPC *s = TZ_PPC(dev);
+
+    trace_tz_ppc_reset();
+    s->cfg_sec_resp = false;
+    memset(s->cfg_nonsec, 0, sizeof(s->cfg_nonsec));
+    memset(s->cfg_ap, 0, sizeof(s->cfg_ap));
+}
+
+static void tz_ppc_init(Object *obj)
+{
+    DeviceState *dev = DEVICE(obj);
+    TZPPC *s = TZ_PPC(obj);
+
+    qdev_init_gpio_in_named(dev, tz_ppc_cfg_nonsec, "cfg_nonsec", TZ_NUM_PORTS);
+    qdev_init_gpio_in_named(dev, tz_ppc_cfg_ap, "cfg_ap", TZ_NUM_PORTS);
+    qdev_init_gpio_in_named(dev, tz_ppc_cfg_sec_resp, "cfg_sec_resp", 1);
+    qdev_init_gpio_in_named(dev, tz_ppc_irq_enable, "irq_enable", 1);
+    qdev_init_gpio_in_named(dev, tz_ppc_irq_clear, "irq_clear", 1);
+    qdev_init_gpio_out_named(dev, &s->irq, "irq", 1);
+}
+
+static void tz_ppc_realize(DeviceState *dev, Error **errp)
+{
+    Object *obj = OBJECT(dev);
+    SysBusDevice *sbd = SYS_BUS_DEVICE(dev);
+    TZPPC *s = TZ_PPC(dev);
+    int i;
+
+    /* We can't create the upstream end of the port until realize,
+     * as we don't know the size of the MR used as the downstream until then.
+     */
+    for (i = 0; i < TZ_NUM_PORTS; i++) {
+        TZPPCPort *port = &s->port[i];
+        char *name;
+        uint64_t size;
+
+        if (!port->downstream) {
+            continue;
+        }
+
+        name = g_strdup_printf("tz-ppc-port[%d]", i);
+
+        port->ppc = s;
+        address_space_init(&port->downstream_as, port->downstream, name);
+
+        size = memory_region_size(port->downstream);
+        memory_region_init_io(&port->upstream, obj, &tz_ppc_ops,
+                              port, name, size);
+        sysbus_init_mmio(sbd, &port->upstream);
+        g_free(name);
+    }
+}
+
+static const VMStateDescription tz_ppc_vmstate = {
+    .name = "tz-ppc",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .fields = (VMStateField[]) {
+        VMSTATE_BOOL_ARRAY(cfg_nonsec, TZPPC, TZ_NUM_PORTS),
+        VMSTATE_BOOL_ARRAY(cfg_ap, TZPPC, TZ_NUM_PORTS),
+        VMSTATE_BOOL(cfg_sec_resp, TZPPC),
+        VMSTATE_BOOL(irq_enable, TZPPC),
+        VMSTATE_BOOL(irq_clear, TZPPC),
+        VMSTATE_BOOL(irq_status, TZPPC),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
+#define DEFINE_PORT(N)                                          \
+    DEFINE_PROP_LINK("port[" #N "]", TZPPC, port[N].downstream, \
+                     TYPE_MEMORY_REGION, MemoryRegion *)
+
+static Property tz_ppc_properties[] = {
+    DEFINE_PROP_UINT32("NONSEC_MASK", TZPPC, nonsec_mask, 0),
+    DEFINE_PORT(0),
+    DEFINE_PORT(1),
+    DEFINE_PORT(2),
+    DEFINE_PORT(3),
+    DEFINE_PORT(4),
+    DEFINE_PORT(5),
+    DEFINE_PORT(6),
+    DEFINE_PORT(7),
+    DEFINE_PORT(8),
+    DEFINE_PORT(9),
+    DEFINE_PORT(10),
+    DEFINE_PORT(11),
+    DEFINE_PORT(12),
+    DEFINE_PORT(13),
+    DEFINE_PORT(14),
+    DEFINE_PORT(15),
+    DEFINE_PROP_END_OF_LIST(),
+};
+
+static void tz_ppc_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+
+    dc->realize = tz_ppc_realize;
+    dc->vmsd = &tz_ppc_vmstate;
+    dc->reset = tz_ppc_reset;
+    dc->props = tz_ppc_properties;
+}
+
+static const TypeInfo tz_ppc_info = {
+    .name = TYPE_TZ_PPC,
+    .parent = TYPE_SYS_BUS_DEVICE,
+    .instance_size = sizeof(TZPPC),
+    .instance_init = tz_ppc_init,
+    .class_init = tz_ppc_class_init,
+};
+
+static void tz_ppc_register_types(void)
+{
+    type_register_static(&tz_ppc_info);
+}
+
+type_init(tz_ppc_register_types);
diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
index XXXXXXX..XXXXXXX 100644
--- a/default-configs/arm-softmmu.mak
+++ b/default-configs/arm-softmmu.mak
@@ -XXX,XX +XXX,XX @@ CONFIG_CMSDK_APB_UART=y
 CONFIG_MPS2_FPGAIO=y
 CONFIG_MPS2_SCC=y
 
+CONFIG_TZ_PPC=y
+
 CONFIG_VERSATILE_PCI=y
 CONFIG_VERSATILE_I2C=y
 
diff --git a/hw/misc/trace-events b/hw/misc/trace-events
index XXXXXXX..XXXXXXX 100644
--- a/hw/misc/trace-events
+++ b/hw/misc/trace-events
@@ -XXX,XX +XXX,XX @@ mos6522_get_next_irq_time(uint16_t latch, int64_t d, int64_t delta) "latch=%d co
 mos6522_set_sr_int(void) "set sr_int"
 mos6522_write(uint64_t addr, uint64_t val) "reg=0x%"PRIx64 " val=0x%"PRIx64
 mos6522_read(uint64_t addr, unsigned val) "reg=0x%"PRIx64 " val=0x%x"
+
+# hw/misc/tz-ppc.c
+tz_ppc_reset(void) "TZ PPC: reset"
+tz_ppc_cfg_nonsec(int n, int level) "TZ PPC: cfg_nonsec[%d] = %d"
+tz_ppc_cfg_ap(int n, int level) "TZ PPC: cfg_ap[%d] = %d"
+tz_ppc_cfg_sec_resp(int level) "TZ PPC: cfg_sec_resp = %d"
+tz_ppc_irq_enable(int level) "TZ PPC: int_enable = %d"
+tz_ppc_irq_clear(int level) "TZ PPC: int_clear = %d"
+tz_ppc_update_irq(int level) "TZ PPC: setting irq line to %d"
+tz_ppc_read_blocked(int n, hwaddr offset, bool secure, bool user) "TZ PPC: port %d offset 0x%" HWADDR_PRIx " read (secure %d user %d) blocked"
+tz_ppc_write_blocked(int n, hwaddr offset, bool secure, bool user) "TZ PPC: port %d offset 0x%" HWADDR_PRIx " write (secure %d user %d) blocked"
-- 
2.16.2

The Arm IoT Kit includes a "security controller" which is largely a
collection of registers for controlling the PPCs and other bits of
glue in the system.  This commit provides the initial skeleton of the
device, implementing just the ID registers, and a couple of read-only
read-as-zero registers.
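
For reference, the ID register lookup amounts to indexing a 12-entry
table by word offset from PID4 and, for sub-word reads, pulling the
requested bytes out of the word.  The standalone sketch below shows
that arithmetic using the secure block's ID values; sketch_id_read()
and the SKETCH_* constants are hypothetical names, and the sketch only
covers the ID range rather than the whole register map.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PID4 0xfd0u
#define SKETCH_CID3 0xffcu

/* One byte-wide value per 32-bit ID register, in address order. */
static const uint8_t sketch_idregs[12] = {
    0x04, 0x00, 0x00, 0x00,   /* PID4..PID7 */
    0x52, 0xb8, 0x0b, 0x00,   /* PID0..PID3 */
    0x0d, 0xf0, 0x05, 0xb1,   /* CID0..CID3 */
};

/* Read 'size' bytes (1, 2 or 4) at 'addr' within the ID register range. */
static uint32_t sketch_id_read(uint32_t addr, unsigned size)
{
    uint32_t offset = addr & ~0x3u;      /* containing word */
    unsigned shift = (addr & 3) * 8;     /* bit position within the word */
    uint32_t word;

    assert(offset >= SKETCH_PID4 && offset <= SKETCH_CID3);
    assert(size == 1 || size == 2 || size == 4);
    assert(shift + size * 8 <= 32);

    word = sketch_idregs[(offset - SKETCH_PID4) / 4];
    if (size == 4) {
        return word;
    }
    /* Sub-word read: extract the addressed bytes from the word. */
    return (word >> shift) & ((1u << (size * 8)) - 1);
}
```

Because only the low byte of each ID word is non-zero, a byte read at
any offset other than the word base returns zero in this sketch.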

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180220180325.29818-16-peter.maydell@linaro.org
---
 hw/misc/Makefile.objs           |   1 +
 include/hw/misc/iotkit-secctl.h |  39 ++++
 hw/misc/iotkit-secctl.c         | 448 ++++++++++++++++++++++++++++++++++++++++
 default-configs/arm-softmmu.mak |   1 +
 hw/misc/trace-events            |   7 +
 5 files changed, 496 insertions(+)
 create mode 100644 include/hw/misc/iotkit-secctl.h
 create mode 100644 hw/misc/iotkit-secctl.c

diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs
index XXXXXXX..XXXXXXX 100644
--- a/hw/misc/Makefile.objs
+++ b/hw/misc/Makefile.objs
@@ -XXX,XX +XXX,XX @@ obj-$(CONFIG_MPS2_FPGAIO) += mps2-fpgaio.o
 obj-$(CONFIG_MPS2_SCC) += mps2-scc.o
 
 obj-$(CONFIG_TZ_PPC) += tz-ppc.o
+obj-$(CONFIG_IOTKIT_SECCTL) += iotkit-secctl.o
 
 obj-$(CONFIG_PVPANIC) += pvpanic.o
 obj-$(CONFIG_HYPERV_TESTDEV) += hyperv_testdev.o
diff --git a/include/hw/misc/iotkit-secctl.h b/include/hw/misc/iotkit-secctl.h
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/include/hw/misc/iotkit-secctl.h
@@ -XXX,XX +XXX,XX @@
+/*
+ * ARM IoT Kit security controller
+ *
+ * Copyright (c) 2018 Linaro Limited
+ * Written by Peter Maydell
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 or
+ * (at your option) any later version.
+ */
+
+/* This is a model of the security controller which is part of the
+ * Arm IoT Kit and documented in
+ * http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ecm0601256/index.html
+ *
+ * QEMU interface:
+ *  + sysbus MMIO region 0 is the "secure privilege control block" registers
+ *  + sysbus MMIO region 1 is the "non-secure privilege control block" registers
+ */
+
+#ifndef IOTKIT_SECCTL_H
+#define IOTKIT_SECCTL_H
+
+#include "hw/sysbus.h"
+
+#define TYPE_IOTKIT_SECCTL "iotkit-secctl"
+#define IOTKIT_SECCTL(obj) OBJECT_CHECK(IoTKitSecCtl, (obj), TYPE_IOTKIT_SECCTL)
+
+typedef struct IoTKitSecCtl {
+    /*< private >*/
+    SysBusDevice parent_obj;
+
+    /*< public >*/
+
+    MemoryRegion s_regs;
+    MemoryRegion ns_regs;
+} IoTKitSecCtl;
+
+#endif
diff --git a/hw/misc/iotkit-secctl.c b/hw/misc/iotkit-secctl.c
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/hw/misc/iotkit-secctl.c
@@ -XXX,XX +XXX,XX @@
+/*
+ * Arm IoT Kit security controller
+ *
+ * Copyright (c) 2018 Linaro Limited
+ * Written by Peter Maydell
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 or
+ * (at your option) any later version.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "qapi/error.h"
+#include "trace.h"
+#include "hw/sysbus.h"
+#include "hw/registerfields.h"
+#include "hw/misc/iotkit-secctl.h"
+
+/* Registers in the secure privilege control block */
+REG32(SECRESPCFG, 0x10)
+REG32(NSCCFG, 0x14)
+REG32(SECMPCINTSTATUS, 0x1c)
+REG32(SECPPCINTSTAT, 0x20)
+REG32(SECPPCINTCLR, 0x24)
+REG32(SECPPCINTEN, 0x28)
+REG32(SECMSCINTSTAT, 0x30)
+REG32(SECMSCINTCLR, 0x34)
+REG32(SECMSCINTEN, 0x38)
+REG32(BRGINTSTAT, 0x40)
+REG32(BRGINTCLR, 0x44)
+REG32(BRGINTEN, 0x48)
+REG32(AHBNSPPC0, 0x50)
+REG32(AHBNSPPCEXP0, 0x60)
+REG32(AHBNSPPCEXP1, 0x64)
+REG32(AHBNSPPCEXP2, 0x68)
+REG32(AHBNSPPCEXP3, 0x6c)
+REG32(APBNSPPC0, 0x70)
+REG32(APBNSPPC1, 0x74)
+REG32(APBNSPPCEXP0, 0x80)
+REG32(APBNSPPCEXP1, 0x84)
+REG32(APBNSPPCEXP2, 0x88)
+REG32(APBNSPPCEXP3, 0x8c)
+REG32(AHBSPPPC0, 0x90)
+REG32(AHBSPPPCEXP0, 0xa0)
+REG32(AHBSPPPCEXP1, 0xa4)
+REG32(AHBSPPPCEXP2, 0xa8)
+REG32(AHBSPPPCEXP3, 0xac)
+REG32(APBSPPPC0, 0xb0)
+REG32(APBSPPPC1, 0xb4)
+REG32(APBSPPPCEXP0, 0xc0)
+REG32(APBSPPPCEXP1, 0xc4)
+REG32(APBSPPPCEXP2, 0xc8)
+REG32(APBSPPPCEXP3, 0xcc)
+REG32(NSMSCEXP, 0xd0)
+REG32(PID4, 0xfd0)
+REG32(PID5, 0xfd4)
+REG32(PID6, 0xfd8)
+REG32(PID7, 0xfdc)
+REG32(PID0, 0xfe0)
+REG32(PID1, 0xfe4)
+REG32(PID2, 0xfe8)
+REG32(PID3, 0xfec)
+REG32(CID0, 0xff0)
+REG32(CID1, 0xff4)
+REG32(CID2, 0xff8)
+REG32(CID3, 0xffc)
+
+/* Registers in the non-secure privilege control block */
+REG32(AHBNSPPPC0, 0x90)
+REG32(AHBNSPPPCEXP0, 0xa0)
+REG32(AHBNSPPPCEXP1, 0xa4)
+REG32(AHBNSPPPCEXP2, 0xa8)
+REG32(AHBNSPPPCEXP3, 0xac)
+REG32(APBNSPPPC0, 0xb0)
+REG32(APBNSPPPC1, 0xb4)
+REG32(APBNSPPPCEXP0, 0xc0)
+REG32(APBNSPPPCEXP1, 0xc4)
+REG32(APBNSPPPCEXP2, 0xc8)
+REG32(APBNSPPPCEXP3, 0xcc)
+/* PID and CID registers are also present in the NS block */
+
+static const uint8_t iotkit_secctl_s_idregs[] = {
+    0x04, 0x00, 0x00, 0x00,
+    0x52, 0xb8, 0x0b, 0x00,
+    0x0d, 0xf0, 0x05, 0xb1,
+};
+
+static const uint8_t iotkit_secctl_ns_idregs[] = {
+    0x04, 0x00, 0x00, 0x00,
+    0x53, 0xb8, 0x0b, 0x00,
+    0x0d, 0xf0, 0x05, 0xb1,
+};
+
+static MemTxResult iotkit_secctl_s_read(void *opaque, hwaddr addr,
+                                        uint64_t *pdata,
+                                        unsigned size, MemTxAttrs attrs)
+{
+    uint64_t r;
+    uint32_t offset = addr & ~0x3;
+
+    switch (offset) {
+    case A_AHBNSPPC0:
+    case A_AHBSPPPC0:
+        r = 0;
+        break;
+    case A_SECRESPCFG:
+    case A_NSCCFG:
+    case A_SECMPCINTSTATUS:
+    case A_SECPPCINTSTAT:
+    case A_SECPPCINTEN:
+    case A_SECMSCINTSTAT:
+    case A_SECMSCINTEN:
+    case A_BRGINTSTAT:
+    case A_BRGINTEN:
+    case A_AHBNSPPCEXP0:
+    case A_AHBNSPPCEXP1:
+    case A_AHBNSPPCEXP2:
+    case A_AHBNSPPCEXP3:
+    case A_APBNSPPC0:
+    case A_APBNSPPC1:
+    case A_APBNSPPCEXP0:
+    case A_APBNSPPCEXP1:
+    case A_APBNSPPCEXP2:
+    case A_APBNSPPCEXP3:
+    case A_AHBSPPPCEXP0:
+    case A_AHBSPPPCEXP1:
+    case A_AHBSPPPCEXP2:
+    case A_AHBSPPPCEXP3:
+    case A_APBSPPPC0:
+    case A_APBSPPPC1:
+    case A_APBSPPPCEXP0:
+    case A_APBSPPPCEXP1:
+    case A_APBSPPPCEXP2:
+    case A_APBSPPPCEXP3:
+    case A_NSMSCEXP:
+        qemu_log_mask(LOG_UNIMP,
+                      "IoTKit SecCtl S block read: "
+                      "unimplemented offset 0x%x\n", offset);
+        r = 0;
+        break;
+    case A_PID4:
+    case A_PID5:
+    case A_PID6:
+    case A_PID7:
+    case A_PID0:
+    case A_PID1:
+    case A_PID2:
+    case A_PID3:
+    case A_CID0:
+    case A_CID1:
+    case A_CID2:
+    case A_CID3:
+        r = iotkit_secctl_s_idregs[(offset - A_PID4) / 4];
+        break;
+    case A_SECPPCINTCLR:
+    case A_SECMSCINTCLR:
+    case A_BRGINTCLR:
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "IoTKit SecCtl S block read: write-only offset 0x%x\n",
+                      offset);
+        r = 0;
+        break;
+    default:
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "IoTKit SecCtl S block read: bad offset 0x%x\n", offset);
+        r = 0;
+        break;
+    }
+
+    if (size != 4) {
+        /* None of our registers are access-sensitive, so just pull the right
+         * byte out of the word read result.
+         */
+        r = extract32(r, (addr & 3) * 8, size * 8);
+    }
+
+    trace_iotkit_secctl_s_read(offset, r, size);
+    *pdata = r;
+    return MEMTX_OK;
+}
+
+static MemTxResult iotkit_secctl_s_write(void *opaque, hwaddr addr,
+                                         uint64_t value,
+                                         unsigned size, MemTxAttrs attrs)
+{
+    uint32_t offset = addr;
+
+    trace_iotkit_secctl_s_write(offset, value, size);
+
+    if (size != 4) {
+        /* Byte and halfword writes are ignored */
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "IoTKit SecCtl S block write: bad size, ignored\n");
+        return MEMTX_OK;
+    }
+
+    switch (offset) {
+    case A_SECRESPCFG:
+    case A_NSCCFG:
+    case A_SECPPCINTCLR:
+    case A_SECPPCINTEN:
+    case A_SECMSCINTCLR:
+    case A_SECMSCINTEN:
+    case A_BRGINTCLR:
+    case A_BRGINTEN:
+    case A_AHBNSPPCEXP0:
+    case A_AHBNSPPCEXP1:
+    case A_AHBNSPPCEXP2:
+    case A_AHBNSPPCEXP3:
+    case A_APBNSPPC0:
+    case A_APBNSPPC1:
+    case A_APBNSPPCEXP0:
+    case A_APBNSPPCEXP1:
+    case A_APBNSPPCEXP2:
+    case A_APBNSPPCEXP3:
+    case A_AHBSPPPCEXP0:
+    case A_AHBSPPPCEXP1:
+    case A_AHBSPPPCEXP2:
+    case A_AHBSPPPCEXP3:
+    case A_APBSPPPC0:
+    case A_APBSPPPC1:
+    case A_APBSPPPCEXP0:
+    case A_APBSPPPCEXP1:
+    case A_APBSPPPCEXP2:
+    case A_APBSPPPCEXP3:
+        qemu_log_mask(LOG_UNIMP,
+                      "IoTKit SecCtl S block write: "
+                      "unimplemented offset 0x%x\n", offset);
+        break;
+    case A_SECMPCINTSTATUS:
+    case A_SECPPCINTSTAT:
+    case A_SECMSCINTSTAT:
+    case A_BRGINTSTAT:
+    case A_AHBNSPPC0:
+    case A_AHBSPPPC0:
+    case A_NSMSCEXP:
+    case A_PID4:
+    case A_PID5:
+    case A_PID6:
+    case A_PID7:
+    case A_PID0:
+    case A_PID1:
+    case A_PID2:
+    case A_PID3:
+    case A_CID0:
+    case A_CID1:
+    case A_CID2:
+    case A_CID3:
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "IoTKit SecCtl S block write: "
+                      "read-only offset 0x%x\n", offset);
+        break;
+    default:
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "IotKit SecCtl S block write: bad offset 0x%x\n",
+                      offset);
+        break;
+    }
+
+    return MEMTX_OK;
+}
+
+static MemTxResult iotkit_secctl_ns_read(void *opaque, hwaddr addr,
+                                         uint64_t *pdata,
+                                         unsigned size, MemTxAttrs attrs)
+{
+    uint64_t r;
+    uint32_t offset = addr & ~0x3;
+
+    switch (offset) {
+    case A_AHBNSPPPC0:
+        r = 0;
+        break;
+    case A_AHBNSPPPCEXP0:
+    case A_AHBNSPPPCEXP1:
+    case A_AHBNSPPPCEXP2:
+    case A_AHBNSPPPCEXP3:
+    case A_APBNSPPPC0:
+    case A_APBNSPPPC1:
+    case A_APBNSPPPCEXP0:
+    case A_APBNSPPPCEXP1:
+    case A_APBNSPPPCEXP2:
+    case A_APBNSPPPCEXP3:
+        qemu_log_mask(LOG_UNIMP,
+                      "IoTKit SecCtl NS block read: "
+                      "unimplemented offset 0x%x\n", offset);
+        break;
+    case A_PID4:
+    case A_PID5:
+    case A_PID6:
+    case A_PID7:
+    case A_PID0:
+    case A_PID1:
+    case A_PID2:
+    case A_PID3:
+    case A_CID0:
+    case A_CID1:
+    case A_CID2:
+    case A_CID3:
+        r = iotkit_secctl_ns_idregs[(offset - A_PID4) / 4];
+        break;
+    default:
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "IotKit SecCtl NS block read: bad offset 0x%x\n",
+                      offset);
+        r = 0;
+        break;
+    }
+
+    if (size != 4) {
+        /* None of our registers are access-sensitive, so just pull the right
+         * byte out of the word read result.
+         */
+        r = extract32(r, (addr & 3) * 8, size * 8);
+    }
+
+    trace_iotkit_secctl_ns_read(offset, r, size);
+    *pdata = r;
+    return MEMTX_OK;
+}
+
+static MemTxResult iotkit_secctl_ns_write(void *opaque, hwaddr addr,
+                                          uint64_t value,
+                                          unsigned size, MemTxAttrs attrs)
+{
+    uint32_t offset = addr;
+
+    trace_iotkit_secctl_ns_write(offset, value, size);
+
+    if (size != 4) {
+        /* Byte and halfword writes are ignored */
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "IotKit SecCtl NS block write: bad size, ignored\n");
+        return MEMTX_OK;
+    }
+
+    switch (offset) {
+    case A_AHBNSPPPCEXP0:
+    case A_AHBNSPPPCEXP1:
+    case A_AHBNSPPPCEXP2:
+    case A_AHBNSPPPCEXP3:
+    case A_APBNSPPPC0:
+    case A_APBNSPPPC1:
+    case A_APBNSPPPCEXP0:
+    case A_APBNSPPPCEXP1:
+    case A_APBNSPPPCEXP2:
+    case A_APBNSPPPCEXP3:
+        qemu_log_mask(LOG_UNIMP,
+                      "IoTKit SecCtl NS block write: "
+                      "unimplemented offset 0x%x\n", offset);
+        break;
+    case A_AHBNSPPPC0:
+    case A_PID4:
+    case A_PID5:
+    case A_PID6:
+    case A_PID7:
+    case A_PID0:
+    case A_PID1:
+    case A_PID2:
+    case A_PID3:
+    case A_CID0:
+    case A_CID1:
+    case A_CID2:
+    case A_CID3:
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "IoTKit SecCtl NS block write: "
+                      "read-only offset 0x%x\n", offset);
+        break;
+    default:
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "IotKit SecCtl NS block write: bad offset 0x%x\n",
+                      offset);
+        break;
+    }
+
+    return MEMTX_OK;
+}
+
+static const MemoryRegionOps iotkit_secctl_s_ops = {
+    .read_with_attrs = iotkit_secctl_s_read,
+    .write_with_attrs = iotkit_secctl_s_write,
+    .endianness = DEVICE_LITTLE_ENDIAN,
+    .valid.min_access_size = 1,
+    .valid.max_access_size = 4,
+    .impl.min_access_size = 1,
+    .impl.max_access_size = 4,
+};
+
+static const MemoryRegionOps iotkit_secctl_ns_ops = {
+    .read_with_attrs = iotkit_secctl_ns_read,
+    .write_with_attrs = iotkit_secctl_ns_write,
+    .endianness = DEVICE_LITTLE_ENDIAN,
+    .valid.min_access_size = 1,
+    .valid.max_access_size = 4,
+    .impl.min_access_size = 1,
+    .impl.max_access_size = 4,
+};
+
+static void iotkit_secctl_reset(DeviceState *dev)
+{
+
+}
+
+static void iotkit_secctl_init(Object *obj)
+{
+    IoTKitSecCtl *s = IOTKIT_SECCTL(obj);
+    SysBusDevice *sbd = SYS_BUS_DEVICE(obj);
+
+    memory_region_init_io(&s->s_regs, obj, &iotkit_secctl_s_ops,
+                          s, "iotkit-secctl-s-regs", 0x1000);
+    memory_region_init_io(&s->ns_regs, obj, &iotkit_secctl_ns_ops,
+                          s, "iotkit-secctl-ns-regs", 0x1000);
+    sysbus_init_mmio(sbd, &s->s_regs);
+    sysbus_init_mmio(sbd, &s->ns_regs);
+}
+
+static const VMStateDescription iotkit_secctl_vmstate = {
+    .name = "iotkit-secctl",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .fields = (VMStateField[]) {
+        VMSTATE_END_OF_LIST()
+    }
+};
+
+static void iotkit_secctl_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+
+    dc->vmsd = &iotkit_secctl_vmstate;
+    dc->reset = iotkit_secctl_reset;
+}
+
+static const TypeInfo iotkit_secctl_info = {
+    .name = TYPE_IOTKIT_SECCTL,
+    .parent = TYPE_SYS_BUS_DEVICE,
+    .instance_size = sizeof(IoTKitSecCtl),
+    .instance_init = iotkit_secctl_init,
+    .class_init = iotkit_secctl_class_init,
+};
+
+static void iotkit_secctl_register_types(void)
+{
+    type_register_static(&iotkit_secctl_info);
+}
+
+type_init(iotkit_secctl_register_types);
diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
index XXXXXXX..XXXXXXX 100644
--- a/default-configs/arm-softmmu.mak
+++ b/default-configs/arm-softmmu.mak
@@ -XXX,XX +XXX,XX @@ CONFIG_MPS2_FPGAIO=y
 CONFIG_MPS2_SCC=y
 
 CONFIG_TZ_PPC=y
+CONFIG_IOTKIT_SECCTL=y
 
 CONFIG_VERSATILE_PCI=y
 CONFIG_VERSATILE_I2C=y
diff --git a/hw/misc/trace-events b/hw/misc/trace-events
index XXXXXXX..XXXXXXX 100644
--- a/hw/misc/trace-events
+++ b/hw/misc/trace-events
@@ -XXX,XX +XXX,XX @@ tz_ppc_irq_clear(int level) "TZ PPC: int_clear = %d"
 tz_ppc_update_irq(int level) "TZ PPC: setting irq line to %d"
 tz_ppc_read_blocked(int n, hwaddr offset, bool secure, bool user) "TZ PPC: port %d offset 0x%" HWADDR_PRIx " read (secure %d user %d) blocked"
 tz_ppc_write_blocked(int n, hwaddr offset, bool secure, bool user) "TZ PPC: port %d offset 0x%" HWADDR_PRIx " write (secure %d user %d) blocked"
+
+# hw/misc/iotkit-secctl.c
+iotkit_secctl_s_read(uint32_t offset, uint64_t data, unsigned size) "IoTKit SecCtl S regs read: offset 0x%x data 0x%" PRIx64 " size %u"
+iotkit_secctl_s_write(uint32_t offset, uint64_t data, unsigned size) "IoTKit SecCtl S regs write: offset 0x%x data 0x%" PRIx64 " size %u"
+iotkit_secctl_ns_read(uint32_t offset, uint64_t data, unsigned size) "IoTKit SecCtl NS regs read: offset 0x%x data 0x%" PRIx64 " size %u"
+iotkit_secctl_ns_write(uint32_t offset, uint64_t data, unsigned size) "IoTKit SecCtl NS regs write: offset 0x%x data 0x%" PRIx64 " size %u"
+iotkit_secctl_reset(void) "IoTKit SecCtl: reset"
-- 
2.16.2

The IoTKit Security Controller includes various registers
that expose to software the controls for the Peripheral
Protection Controllers in the system. Implement these.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180220180325.29818-17-peter.maydell@linaro.org
---
 include/hw/misc/iotkit-secctl.h |  64 +++++++++-
 hw/misc/iotkit-secctl.c         | 270 +++++++++++++++++++++++++++++++++++++---
 2 files changed, 315 insertions(+), 19 deletions(-)
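
Reviewer note (not part of the commit): the two pieces of logic this patch
relies on are the offset-to-index mapping and the choice of which privilege
register drives each "ap" line. A minimal standalone sketch follows, assuming
nothing from QEMU: bits32() is a stand-in for extract32(), and the function
names and the offsets used to exercise them are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for QEMU's extract32(): return 'length' bits of 'value'
 * starting at bit 'start'.
 */
static uint32_t bits32(uint32_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 32 - start);
    return (value >> start) & (~0U >> (32 - length));
}

/* Registers for one PPC type sit at 0xN0, 0xN4, 0xN8, 0xNC, so
 * bits [3:2] of the register offset select PPC 0..3 of that type.
 */
static int ppc_index_from_offset(uint32_t offset)
{
    return (int)bits32(offset, 2, 2);
}

/* Level of the "ap" line for one port: if the port is marked
 * non-secure in 'ns', the non-secure privilege bit applies,
 * otherwise the secure privilege bit applies.
 */
static bool ppc_port_ap(uint32_t ns, uint32_t sp, uint32_t nsp, int port)
{
    if (bits32(ns, port, 1)) {
        return bits32(nsp, port, 1);
    }
    return bits32(sp, port, 1);
}
```

For example, an offset ending in 0xC maps to index 3, and a port whose ns bit
is set takes its ap level from nsp regardless of what sp holds.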

diff --git a/include/hw/misc/iotkit-secctl.h b/include/hw/misc/iotkit-secctl.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/misc/iotkit-secctl.h
+++ b/include/hw/misc/iotkit-secctl.h
@@ -XXX,XX +XXX,XX @@
  * QEMU interface:
  *  + sysbus MMIO region 0 is the "secure privilege control block" registers
  *  + sysbus MMIO region 1 is the "non-secure privilege control block" registers
+ *  + named GPIO output "sec_resp_cfg" indicating whether blocked accesses
+ *    should RAZ/WI or bus error
+ * Controlling the 2 APB PPCs in the IoTKit:
+ *  + named GPIO outputs apb_ppc0_nonsec[0..2] and apb_ppc1_nonsec
+ *  + named GPIO outputs apb_ppc0_ap[0..2] and apb_ppc1_ap
+ *  + named GPIO outputs apb_ppc{0,1}_irq_enable
+ *  + named GPIO outputs apb_ppc{0,1}_irq_clear
+ *  + named GPIO inputs apb_ppc{0,1}_irq_status
+ * Controlling each of the 4 expansion APB PPCs which a system using the IoTKit
+ * might provide:
+ *  + named GPIO outputs apb_ppcexp{0,1,2,3}_nonsec[0..15]
+ *  + named GPIO outputs apb_ppcexp{0,1,2,3}_ap[0..15]
+ *  + named GPIO outputs apb_ppcexp{0,1,2,3}_irq_enable
+ *  + named GPIO outputs apb_ppcexp{0,1,2,3}_irq_clear
+ *  + named GPIO inputs apb_ppcexp{0,1,2,3}_irq_status
+ * Controlling each of the 4 expansion AHB PPCs which a system using the IoTKit
+ * might provide:
+ *  + named GPIO outputs ahb_ppcexp{0,1,2,3}_nonsec[0..15]
+ *  + named GPIO outputs ahb_ppcexp{0,1,2,3}_ap[0..15]
+ *  + named GPIO outputs ahb_ppcexp{0,1,2,3}_irq_enable
+ *  + named GPIO outputs ahb_ppcexp{0,1,2,3}_irq_clear
+ *  + named GPIO inputs ahb_ppcexp{0,1,2,3}_irq_status
  */
 
 #ifndef IOTKIT_SECCTL_H
@@ -XXX,XX +XXX,XX @@
 #define TYPE_IOTKIT_SECCTL "iotkit-secctl"
 #define IOTKIT_SECCTL(obj) OBJECT_CHECK(IoTKitSecCtl, (obj), TYPE_IOTKIT_SECCTL)
 
-typedef struct IoTKitSecCtl {
+#define IOTS_APB_PPC0_NUM_PORTS 3
+#define IOTS_APB_PPC1_NUM_PORTS 1
+#define IOTS_PPC_NUM_PORTS 16
+#define IOTS_NUM_APB_PPC 2
+#define IOTS_NUM_APB_EXP_PPC 4
+#define IOTS_NUM_AHB_EXP_PPC 4
+
+typedef struct IoTKitSecCtl IoTKitSecCtl;
+
+/* State and IRQ lines relating to a PPC. For the
+ * PPCs in the IoTKit not all the IRQ lines are used.
+ */
+typedef struct IoTKitSecCtlPPC {
+    qemu_irq nonsec[IOTS_PPC_NUM_PORTS];
+    qemu_irq ap[IOTS_PPC_NUM_PORTS];
+    qemu_irq irq_enable;
+    qemu_irq irq_clear;
+
+    uint32_t ns;
+    uint32_t sp;
+    uint32_t nsp;
+
+    /* Number of ports actually present */
+    int numports;
+    /* Offset of this PPC's interrupt bits in SECPPCINTSTAT */
+    int irq_bit_offset;
+    IoTKitSecCtl *parent;
+} IoTKitSecCtlPPC;
+
+struct IoTKitSecCtl {
     /*< private >*/
     SysBusDevice parent_obj;
 
     /*< public >*/
+    qemu_irq sec_resp_cfg;
 
     MemoryRegion s_regs;
     MemoryRegion ns_regs;
-} IoTKitSecCtl;
+
+    uint32_t secppcintstat;
+    uint32_t secppcinten;
+    uint32_t secrespcfg;
+
+    IoTKitSecCtlPPC apb[IOTS_NUM_APB_PPC];
+    IoTKitSecCtlPPC apbexp[IOTS_NUM_APB_EXP_PPC];
+    IoTKitSecCtlPPC ahbexp[IOTS_NUM_AHB_EXP_PPC];
+};
 
 #endif
diff --git a/hw/misc/iotkit-secctl.c b/hw/misc/iotkit-secctl.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/misc/iotkit-secctl.c
+++ b/hw/misc/iotkit-secctl.c
@@ -XXX,XX +XXX,XX @@ static const uint8_t iotkit_secctl_ns_idregs[] = {
     0x0d, 0xf0, 0x05, 0xb1,
 };
 
+/* The register sets for the various PPCs (AHB internal, APB internal,
+ * AHB expansion, APB expansion) are all set up so that they are
+ * in 16-aligned blocks so offsets 0xN0, 0xN4, 0xN8, 0xNC are PPCs
+ * 0, 1, 2, 3 of that type, so we can convert a register address offset
+ * into an index into a PPC array easily.
+ */
+static inline int offset_to_ppc_idx(uint32_t offset)
+{
+    return extract32(offset, 2, 2);
+}
+
+typedef void PerPPCFunction(IoTKitSecCtlPPC *ppc);
+
+static void foreach_ppc(IoTKitSecCtl *s, PerPPCFunction *fn)
+{
+    int i;
+
+    for (i = 0; i < IOTS_NUM_APB_PPC; i++) {
+        fn(&s->apb[i]);
+    }
+    for (i = 0; i < IOTS_NUM_APB_EXP_PPC; i++) {
+        fn(&s->apbexp[i]);
+    }
+    for (i = 0; i < IOTS_NUM_AHB_EXP_PPC; i++) {
+        fn(&s->ahbexp[i]);
+    }
+}
+
 static MemTxResult iotkit_secctl_s_read(void *opaque, hwaddr addr,
                                         uint64_t *pdata,
                                         unsigned size, MemTxAttrs attrs)
 {
     uint64_t r;
     uint32_t offset = addr & ~0x3;
+    IoTKitSecCtl *s = IOTKIT_SECCTL(opaque);
 
     switch (offset) {
     case A_AHBNSPPC0:
@@ -XXX,XX +XXX,XX @@ static MemTxResult iotkit_secctl_s_read(void *opaque, hwaddr addr,
         r = 0;
         break;
     case A_SECRESPCFG:
-    case A_NSCCFG:
-    case A_SECMPCINTSTATUS:
+        r = s->secrespcfg;
+        break;
     case A_SECPPCINTSTAT:
+        r = s->secppcintstat;
+        break;
     case A_SECPPCINTEN:
-    case A_SECMSCINTSTAT:
-    case A_SECMSCINTEN:
-    case A_BRGINTSTAT:
-    case A_BRGINTEN:
+        r = s->secppcinten;
+        break;
     case A_AHBNSPPCEXP0:
     case A_AHBNSPPCEXP1:
     case A_AHBNSPPCEXP2:
     case A_AHBNSPPCEXP3:
+        r = s->ahbexp[offset_to_ppc_idx(offset)].ns;
+        break;
     case A_APBNSPPC0:
     case A_APBNSPPC1:
+        r = s->apb[offset_to_ppc_idx(offset)].ns;
+        break;
     case A_APBNSPPCEXP0:
     case A_APBNSPPCEXP1:
     case A_APBNSPPCEXP2:
     case A_APBNSPPCEXP3:
+        r = s->apbexp[offset_to_ppc_idx(offset)].ns;
+        break;
     case A_AHBSPPPCEXP0:
     case A_AHBSPPPCEXP1:
     case A_AHBSPPPCEXP2:
     case A_AHBSPPPCEXP3:
+        r = s->ahbexp[offset_to_ppc_idx(offset)].sp;
+        break;
     case A_APBSPPPC0:
     case A_APBSPPPC1:
+        r = s->apb[offset_to_ppc_idx(offset)].sp;
+        break;
     case A_APBSPPPCEXP0:
     case A_APBSPPPCEXP1:
     case A_APBSPPPCEXP2:
     case A_APBSPPPCEXP3:
+        r = s->apbexp[offset_to_ppc_idx(offset)].sp;
+        break;
+    case A_NSCCFG:
+    case A_SECMPCINTSTATUS:
+    case A_SECMSCINTSTAT:
+    case A_SECMSCINTEN:
+    case A_BRGINTSTAT:
+    case A_BRGINTEN:
     case A_NSMSCEXP:
         qemu_log_mask(LOG_UNIMP,
                       "IoTKit SecCtl S block read: "
@@ -XXX,XX +XXX,XX @@ static MemTxResult iotkit_secctl_s_read(void *opaque, hwaddr addr,
     return MEMTX_OK;
 }
 
+static void iotkit_secctl_update_ppc_ap(IoTKitSecCtlPPC *ppc)
+{
+    int i;
+
+    for (i = 0; i < ppc->numports; i++) {
+        bool v;
+
+        if (extract32(ppc->ns, i, 1)) {
+            v = extract32(ppc->nsp, i, 1);
+        } else {
+            v = extract32(ppc->sp, i, 1);
+        }
+        qemu_set_irq(ppc->ap[i], v);
+    }
+}
+
+static void iotkit_secctl_ppc_ns_write(IoTKitSecCtlPPC *ppc, uint32_t value)
+{
+    int i;
+
+    ppc->ns = value & MAKE_64BIT_MASK(0, ppc->numports);
+    for (i = 0; i < ppc->numports; i++) {
+        qemu_set_irq(ppc->nonsec[i], extract32(ppc->ns, i, 1));
+    }
+    iotkit_secctl_update_ppc_ap(ppc);
+}
+
+static void iotkit_secctl_ppc_sp_write(IoTKitSecCtlPPC *ppc, uint32_t value)
+{
+    ppc->sp = value & MAKE_64BIT_MASK(0, ppc->numports);
+    iotkit_secctl_update_ppc_ap(ppc);
+}
+
+static void iotkit_secctl_ppc_nsp_write(IoTKitSecCtlPPC *ppc, uint32_t value)
+{
+    ppc->nsp = value & MAKE_64BIT_MASK(0, ppc->numports);
+    iotkit_secctl_update_ppc_ap(ppc);
+}
+
+static void iotkit_secctl_ppc_update_irq_clear(IoTKitSecCtlPPC *ppc)
+{
+    uint32_t value = ppc->parent->secppcintstat;
+
+    qemu_set_irq(ppc->irq_clear, extract32(value, ppc->irq_bit_offset, 1));
+}
+
+static void iotkit_secctl_ppc_update_irq_enable(IoTKitSecCtlPPC *ppc)
+{
+    uint32_t value = ppc->parent->secppcinten;
+
+    qemu_set_irq(ppc->irq_enable, extract32(value, ppc->irq_bit_offset, 1));
+}
+
 static MemTxResult iotkit_secctl_s_write(void *opaque, hwaddr addr,
                                          uint64_t value,
                                          unsigned size, MemTxAttrs attrs)
 {
+    IoTKitSecCtl *s = IOTKIT_SECCTL(opaque);
     uint32_t offset = addr;
+    IoTKitSecCtlPPC *ppc;
 
     trace_iotkit_secctl_s_write(offset, value, size);
 
@@ -XXX,XX +XXX,XX @@ static MemTxResult iotkit_secctl_s_write(void *opaque, hwaddr addr,
 
     switch (offset) {
     case A_SECRESPCFG:
-    case A_NSCCFG:
+        value &= 1;
+        s->secrespcfg = value;
+        qemu_set_irq(s->sec_resp_cfg, s->secrespcfg);
+        break;
     case A_SECPPCINTCLR:
+        value &= 0x00f000f3;
+        foreach_ppc(s, iotkit_secctl_ppc_update_irq_clear);
+        break;
     case A_SECPPCINTEN:
-    case A_SECMSCINTCLR:
-    case A_SECMSCINTEN:
-    case A_BRGINTCLR:
-    case A_BRGINTEN:
+        s->secppcinten = value & 0x00f000f3;
+        foreach_ppc(s, iotkit_secctl_ppc_update_irq_enable);
+        break;
     case A_AHBNSPPCEXP0:
     case A_AHBNSPPCEXP1:
     case A_AHBNSPPCEXP2:
     case A_AHBNSPPCEXP3:
+        ppc = &s->ahbexp[offset_to_ppc_idx(offset)];
+        iotkit_secctl_ppc_ns_write(ppc, value);
+        break;
     case A_APBNSPPC0:
     case A_APBNSPPC1:
+        ppc = &s->apb[offset_to_ppc_idx(offset)];
+        iotkit_secctl_ppc_ns_write(ppc, value);
+        break;
     case A_APBNSPPCEXP0:
     case A_APBNSPPCEXP1:
     case A_APBNSPPCEXP2:
     case A_APBNSPPCEXP3:
+        ppc = &s->apbexp[offset_to_ppc_idx(offset)];
+        iotkit_secctl_ppc_ns_write(ppc, value);
+        break;
     case A_AHBSPPPCEXP0:
     case A_AHBSPPPCEXP1:
     case A_AHBSPPPCEXP2:
     case A_AHBSPPPCEXP3:
+        ppc = &s->ahbexp[offset_to_ppc_idx(offset)];
+        iotkit_secctl_ppc_sp_write(ppc, value);
+        break;
     case A_APBSPPPC0:
     case A_APBSPPPC1:
+        ppc = &s->apb[offset_to_ppc_idx(offset)];
+        iotkit_secctl_ppc_sp_write(ppc, value);
+        break;
     case A_APBSPPPCEXP0:
     case A_APBSPPPCEXP1:
     case A_APBSPPPCEXP2:
     case A_APBSPPPCEXP3:
+        ppc = &s->apbexp[offset_to_ppc_idx(offset)];
+        iotkit_secctl_ppc_sp_write(ppc, value);
+        break;
+    case A_NSCCFG:
+    case A_SECMSCINTCLR:
+    case A_SECMSCINTEN:
+    case A_BRGINTCLR:
+    case A_BRGINTEN:
         qemu_log_mask(LOG_UNIMP,
                       "IoTKit SecCtl S block write: "
                       "unimplemented offset 0x%x\n", offset);
@@ -XXX,XX +XXX,XX @@ static MemTxResult iotkit_secctl_ns_read(void *opaque, hwaddr addr,
                                          uint64_t *pdata,
                                          unsigned size, MemTxAttrs attrs)
 {
+    IoTKitSecCtl *s = IOTKIT_SECCTL(opaque);
     uint64_t r;
     uint32_t offset = addr & ~0x3;
 
@@ -XXX,XX +XXX,XX @@ static MemTxResult iotkit_secctl_ns_read(void *opaque, hwaddr addr,
     case A_AHBNSPPPCEXP1:
     case A_AHBNSPPPCEXP2:
     case A_AHBNSPPPCEXP3:
+        r = s->ahbexp[offset_to_ppc_idx(offset)].nsp;
+        break;
     case A_APBNSPPPC0:
     case A_APBNSPPPC1:
+        r = s->apb[offset_to_ppc_idx(offset)].nsp;
+        break;
     case A_APBNSPPPCEXP0:
     case A_APBNSPPPCEXP1:
     case A_APBNSPPPCEXP2:
     case A_APBNSPPPCEXP3:
-        qemu_log_mask(LOG_UNIMP,
-                      "IoTKit SecCtl NS block read: "
-                      "unimplemented offset 0x%x\n", offset);
+        r = s->apbexp[offset_to_ppc_idx(offset)].nsp;
         break;
     case A_PID4:
     case A_PID5:
@@ -XXX,XX +XXX,XX @@ static MemTxResult iotkit_secctl_ns_write(void *opaque, hwaddr addr,
                                           uint64_t value,
                                           unsigned size, MemTxAttrs attrs)
 {
+    IoTKitSecCtl *s = IOTKIT_SECCTL(opaque);
     uint32_t offset = addr;
+    IoTKitSecCtlPPC *ppc;
 
     trace_iotkit_secctl_ns_write(offset, value, size);
 
@@ -XXX,XX +XXX,XX @@ static MemTxResult iotkit_secctl_ns_write(void *opaque, hwaddr addr,
     case A_AHBNSPPPCEXP1:
     case A_AHBNSPPPCEXP2:
     case A_AHBNSPPPCEXP3:
+        ppc = &s->ahbexp[offset_to_ppc_idx(offset)];
+        iotkit_secctl_ppc_nsp_write(ppc, value);
+        break;
     case A_APBNSPPPC0:
     case A_APBNSPPPC1:
+        ppc = &s->apb[offset_to_ppc_idx(offset)];
+        iotkit_secctl_ppc_nsp_write(ppc, value);
+        break;
     case A_APBNSPPPCEXP0:
     case A_APBNSPPPCEXP1:
     case A_APBNSPPPCEXP2:
     case A_APBNSPPPCEXP3:
-        qemu_log_mask(LOG_UNIMP,
-                      "IoTKit SecCtl NS block write: "
-                      "unimplemented offset 0x%x\n", offset);
+        ppc = &s->apbexp[offset_to_ppc_idx(offset)];
+        iotkit_secctl_ppc_nsp_write(ppc, value);
         break;
     case A_AHBNSPPPC0:
     case A_PID4:
@@ -XXX,XX +XXX,XX @@ static const MemoryRegionOps iotkit_secctl_ns_ops = {
     .impl.max_access_size = 4,
 };
 
+static void iotkit_secctl_reset_ppc(IoTKitSecCtlPPC *ppc)
+{
+    ppc->ns = 0;
+    ppc->sp = 0;
+    ppc->nsp = 0;
+}
+
 static void iotkit_secctl_reset(DeviceState *dev)
 {
+    IoTKitSecCtl *s = IOTKIT_SECCTL(dev);
 
+    s->secppcintstat = 0;
+    s->secppcinten = 0;
+    s->secrespcfg = 0;
+
+    foreach_ppc(s, iotkit_secctl_reset_ppc);
+}
+
+static void iotkit_secctl_ppc_irqstatus(void *opaque, int n, int level)
+{
+    IoTKitSecCtlPPC *ppc = opaque;
+    IoTKitSecCtl *s = IOTKIT_SECCTL(ppc->parent);
+    int irqbit = ppc->irq_bit_offset + n;
+
+    s->secppcintstat = deposit32(s->secppcintstat, irqbit, 1, level);
+}
+
+static void iotkit_secctl_init_ppc(IoTKitSecCtl *s,
+                                   IoTKitSecCtlPPC *ppc,
+                                   const char *name,
+                                   int numports,
+                                   int irq_bit_offset)
+{
+    char *gpioname;
+    DeviceState *dev = DEVICE(s);
+
+    ppc->numports = numports;
+    ppc->irq_bit_offset = irq_bit_offset;
+    ppc->parent = s;
+
+    gpioname = g_strdup_printf("%s_nonsec", name);
+    qdev_init_gpio_out_named(dev, ppc->nonsec, gpioname, numports);
+    g_free(gpioname);
+    gpioname = g_strdup_printf("%s_ap", name);
+    qdev_init_gpio_out_named(dev, ppc->ap, gpioname, numports);
+    g_free(gpioname);
+    gpioname = g_strdup_printf("%s_irq_enable", name);
+    qdev_init_gpio_out_named(dev, &ppc->irq_enable, gpioname, 1);
+    g_free(gpioname);
+    gpioname = g_strdup_printf("%s_irq_clear", name);
+    qdev_init_gpio_out_named(dev, &ppc->irq_clear, gpioname, 1);
+    g_free(gpioname);
+    gpioname = g_strdup_printf("%s_irq_status", name);
+    qdev_init_gpio_in_named_with_opaque(dev, iotkit_secctl_ppc_irqstatus,
+                                        ppc, gpioname, 1);
+    g_free(gpioname);
 }
 
 static void iotkit_secctl_init(Object *obj)
 {
     IoTKitSecCtl *s = IOTKIT_SECCTL(obj);
     SysBusDevice *sbd = SYS_BUS_DEVICE(obj);
+    DeviceState *dev = DEVICE(obj);
+    int i;
+
+    iotkit_secctl_init_ppc(s, &s->apb[0], "apb_ppc0",
+                           IOTS_APB_PPC0_NUM_PORTS, 0);
+    iotkit_secctl_init_ppc(s, &s->apb[1], "apb_ppc1",
+                           IOTS_APB_PPC1_NUM_PORTS, 1);
+
+    for (i = 0; i < IOTS_NUM_APB_EXP_PPC; i++) {
+        IoTKitSecCtlPPC *ppc = &s->apbexp[i];
+        char *ppcname = g_strdup_printf("apb_ppcexp%d", i);
+        iotkit_secctl_init_ppc(s, ppc, ppcname, IOTS_PPC_NUM_PORTS, 4 + i);
+        g_free(ppcname);
+    }
+    for (i = 0; i < IOTS_NUM_AHB_EXP_PPC; i++) {
+        IoTKitSecCtlPPC *ppc = &s->ahbexp[i];
+        char *ppcname = g_strdup_printf("ahb_ppcexp%d", i);
+        iotkit_secctl_init_ppc(s, ppc, ppcname, IOTS_PPC_NUM_PORTS, 20 + i);
+        g_free(ppcname);
+    }
+
+    qdev_init_gpio_out_named(dev, &s->sec_resp_cfg, "sec_resp_cfg", 1);
 
     memory_region_init_io(&s->s_regs, obj, &iotkit_secctl_s_ops,
                           s, "iotkit-secctl-s-regs", 0x1000);
@@ -XXX,XX +XXX,XX @@ static void iotkit_secctl_init(Object *obj)
     sysbus_init_mmio(sbd, &s->ns_regs);
 }
 
+static const VMStateDescription iotkit_secctl_ppc_vmstate = {
+    .name = "iotkit-secctl-ppc",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .fields = (VMStateField[]) {
+        VMSTATE_UINT32(ns, IoTKitSecCtlPPC),
+        VMSTATE_UINT32(sp, IoTKitSecCtlPPC),
+        VMSTATE_UINT32(nsp, IoTKitSecCtlPPC),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
 static const VMStateDescription iotkit_secctl_vmstate = {
     .name = "iotkit-secctl",
     .version_id = 1,
     .minimum_version_id = 1,
     .fields = (VMStateField[]) {
+        VMSTATE_UINT32(secppcintstat, IoTKitSecCtl),
+        VMSTATE_UINT32(secppcinten, IoTKitSecCtl),
+        VMSTATE_UINT32(secrespcfg, IoTKitSecCtl),
+        VMSTATE_STRUCT_ARRAY(apb, IoTKitSecCtl, IOTS_NUM_APB_PPC, 1,
+                             iotkit_secctl_ppc_vmstate, IoTKitSecCtlPPC),
+        VMSTATE_STRUCT_ARRAY(apbexp, IoTKitSecCtl, IOTS_NUM_APB_EXP_PPC, 1,
+                             iotkit_secctl_ppc_vmstate, IoTKitSecCtlPPC),
+        VMSTATE_STRUCT_ARRAY(ahbexp, IoTKitSecCtl, IOTS_NUM_AHB_EXP_PPC, 1,
+                             iotkit_secctl_ppc_vmstate, IoTKitSecCtlPPC),
         VMSTATE_END_OF_LIST()
     }
 };
-- 
2.16.2

Add remaining easy registers to iotkit-secctl:
 * NSCCFG just routes its two bits out to external GPIO lines
 * BRGINTSTAT/BRGINTCLR/BRGINTEN can be dummies, because QEMU's
   bus fabric can never report errors

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20180220180325.29818-18-peter.maydell@linaro.org
---
 include/hw/misc/iotkit-secctl.h |  4 ++++
 hw/misc/iotkit-secctl.c         | 32 ++++++++++++++++++++++++++------
 2 files changed, 30 insertions(+), 6 deletions(-)
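
Reviewer note (not part of the commit): a rough sketch of what "routes its
two bits out" and "can be dummies" amount to, written as standalone C rather
than against the QEMU APIs. The mask value follows from the "two bits"
wording above; the function names are hypothetical and the real patch may
structure this differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed from the commit message: NSCCFG has two significant bits. */
#define SKETCH_NSCCFG_MASK 0x3u

/* Store the masked value and return the level that would be driven
 * on the external line(s); all other bits are ignored on write.
 */
static uint32_t sketch_nsccfg_write(uint32_t *nsccfg, uint32_t value)
{
    assert(nsccfg != NULL);
    *nsccfg = value & SKETCH_NSCCFG_MASK;
    return *nsccfg;
}

/* A bus fabric that never reports errors has no bridge interrupt
 * state to show, so a dummy status/enable register reads as zero.
 */
static uint32_t sketch_brg_read(void)
{
    return 0;
}
```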

Model the Arm IoT Kit documented in
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ecm0601256/index.html

The Arm IoT Kit is a subsystem which includes a CPU and some devices,
and is intended to be extended by adding extra devices to form a
complete system.  It is used in the MPS2 board's AN505 image for the
Cortex-M33.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180220180325.29818-19-peter.maydell@linaro.org
---
 hw/arm/Makefile.objs            |   1 +
 include/hw/arm/iotkit.h         | 109 ++++++++
 hw/arm/iotkit.c                 | 598 ++++++++++++++++++++++++++++++++++++++++
 default-configs/arm-softmmu.mak |   1 +
 4 files changed, 709 insertions(+)
 create mode 100644 include/hw/arm/iotkit.h
 create mode 100644 hw/arm/iotkit.c
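
Reviewer note (not part of the commit): each PPC's irq_status line is split
so that it reaches both the security controller's status register and an OR
gate feeding a single NVIC line. The standalone model below illustrates that
fan-out only; the struct, the function name and the status bit positions
passed in are hypothetical, and the count of 10 mirrors NUM_PPCS (4 AHB
expansion + 4 APB expansion + 2 internal).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NUM_PPCS 10

typedef struct FanoutSketch {
    uint32_t secppcintstat;            /* one status bit per PPC */
    bool or_inputs[SKETCH_NUM_PPCS];   /* inputs of the OR gate */
    bool nvic_line;                    /* OR gate output */
} FanoutSketch;

/* Model one irq_status change on PPC 'ppcnum', whose status bit
 * lives at 'status_bit' in secppcintstat.
 */
static void sketch_irq_status(FanoutSketch *f, int ppcnum,
                              int status_bit, bool level)
{
    int i;

    assert(f != NULL);
    assert(ppcnum >= 0 && ppcnum < SKETCH_NUM_PPCS);
    assert(status_bit >= 0 && status_bit < 32);

    /* Splitter output 0: update the security controller status bit */
    if (level) {
        f->secppcintstat |= UINT32_C(1) << status_bit;
    } else {
        f->secppcintstat &= ~(UINT32_C(1) << status_bit);
    }

    /* Splitter output 1: drive this PPC's OR gate input */
    f->or_inputs[ppcnum] = level;

    /* The combined line is high while any input is high */
    f->nvic_line = false;
    for (i = 0; i < SKETCH_NUM_PPCS; i++) {
        f->nvic_line = f->nvic_line || f->or_inputs[i];
    }
}
```

The combined line stays asserted until every PPC has dropped its status line,
which is why software needs the per-PPC status bits to find the source.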

diff --git a/hw/arm/Makefile.objs b/hw/arm/Makefile.objs
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/Makefile.objs
+++ b/hw/arm/Makefile.objs
@@ -XXX,XX +XXX,XX @@ obj-$(CONFIG_FSL_IMX6) += fsl-imx6.o sabrelite.o
 obj-$(CONFIG_ASPEED_SOC) += aspeed_soc.o aspeed.o
 obj-$(CONFIG_MPS2) += mps2.o
 obj-$(CONFIG_MSF2) += msf2-soc.o msf2-som.o
+obj-$(CONFIG_IOTKIT) += iotkit.o
diff --git a/include/hw/arm/iotkit.h b/include/hw/arm/iotkit.h
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/include/hw/arm/iotkit.h
@@ -XXX,XX +XXX,XX @@
+/*
+ * ARM IoT Kit
+ *
+ * Copyright (c) 2018 Linaro Limited
+ * Written by Peter Maydell
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 or
+ * (at your option) any later version.
+ */
+
+/* This is a model of the Arm IoT Kit which is documented in
+ * http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ecm0601256/index.html
+ * It contains:
+ *  a Cortex-M33
+ *  the IDAU
+ *  some timers and watchdogs
+ *  two peripheral protection controllers
+ *  a memory protection controller
+ *  a security controller
+ *  a bus fabric which arranges that some parts of the address
+ *  space are secure and non-secure aliases of each other
+ *
+ * QEMU interface:
+ *  + QOM property "memory" is a MemoryRegion containing the devices provided
+ *    by the board model.
+ *  + QOM property "MAINCLK" is the frequency of the main system clock
+ *  + QOM property "EXP_NUMIRQ" sets the number of expansion interrupts
+ *  + Named GPIO inputs "EXP_IRQ" 0..n are the expansion interrupts, which
+ *    are wired to the NVIC lines 32 .. n+32
+ * Controlling up to 4 APB expansion PPCs which a system using the IoTKit
+ * might provide:
+ *  + named GPIO outputs apb_ppcexp{0,1,2,3}_nonsec[0..15]
+ *  + named GPIO outputs apb_ppcexp{0,1,2,3}_ap[0..15]
+ *  + named GPIO outputs apb_ppcexp{0,1,2,3}_irq_enable
+ *  + named GPIO outputs apb_ppcexp{0,1,2,3}_irq_clear
+ *  + named GPIO inputs apb_ppcexp{0,1,2,3}_irq_status
+ * Controlling each of the 4 expansion AHB PPCs which a system using the IoTKit
+ * might provide:
+ *  + named GPIO outputs ahb_ppcexp{0,1,2,3}_nonsec[0..15]
+ *  + named GPIO outputs ahb_ppcexp{0,1,2,3}_ap[0..15]
+ *  + named GPIO outputs ahb_ppcexp{0,1,2,3}_irq_enable
+ *  + named GPIO outputs ahb_ppcexp{0,1,2,3}_irq_clear
+ *  + named GPIO inputs ahb_ppcexp{0,1,2,3}_irq_status
+ */
+
+#ifndef IOTKIT_H
+#define IOTKIT_H
+
+#include "hw/sysbus.h"
+#include "hw/arm/armv7m.h"
+#include "hw/misc/iotkit-secctl.h"
+#include "hw/misc/tz-ppc.h"
+#include "hw/timer/cmsdk-apb-timer.h"
+#include "hw/misc/unimp.h"
+#include "hw/or-irq.h"
+#include "hw/core/split-irq.h"
+
+#define TYPE_IOTKIT "iotkit"
+#define IOTKIT(obj) OBJECT_CHECK(IoTKit, (obj), TYPE_IOTKIT)
+
+/* We have an IRQ splitter and an OR gate input for each external PPC
+ * and the 2 internal PPCs
+ */
+#define NUM_EXTERNAL_PPCS (IOTS_NUM_AHB_EXP_PPC + IOTS_NUM_APB_EXP_PPC)
+#define NUM_PPCS (NUM_EXTERNAL_PPCS + 2)
+
+typedef struct IoTKit {
+    /*< private >*/
+    SysBusDevice parent_obj;
+
+    /*< public >*/
+    ARMv7MState armv7m;
+    IoTKitSecCtl secctl;
+    TZPPC apb_ppc0;
+    TZPPC apb_ppc1;
+    CMSDKAPBTIMER timer0;
+    CMSDKAPBTIMER timer1;
+    qemu_or_irq ppc_irq_orgate;
+    SplitIRQ sec_resp_splitter;
+    SplitIRQ ppc_irq_splitter[NUM_PPCS];
+
+    UnimplementedDeviceState dualtimer;
+    UnimplementedDeviceState s32ktimer;
+
+    MemoryRegion container;
+    MemoryRegion alias1;
+    MemoryRegion alias2;
+    MemoryRegion alias3;
+    MemoryRegion sram0;
+
+    qemu_irq *exp_irqs;
+    qemu_irq ppc0_irq;
+    qemu_irq ppc1_irq;
+    qemu_irq sec_resp_cfg;
+    qemu_irq sec_resp_cfg_in;
+    qemu_irq nsc_cfg_in;
+
+    qemu_irq irq_status_in[NUM_EXTERNAL_PPCS];
+
+    uint32_t nsccfg;
+
+    /* Properties */
+    MemoryRegion *board_memory;
+    uint32_t exp_numirq;
+    uint32_t mainclk_frq;
+} IoTKit;
+
+#endif
diff --git a/hw/arm/iotkit.c b/hw/arm/iotkit.c
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/hw/arm/iotkit.c
@@ -XXX,XX +XXX,XX @@
+/*
+ * Arm IoT Kit
+ *
+ * Copyright (c) 2018 Linaro Limited
+ * Written by Peter Maydell
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 or
+ * (at your option) any later version.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "qapi/error.h"
+#include "trace.h"
+#include "hw/sysbus.h"
+#include "hw/registerfields.h"
+#include "hw/arm/iotkit.h"
+#include "hw/misc/unimp.h"
+#include "hw/arm/arm.h"
+
+/* Create an alias region of @size bytes starting at @base
+ * which mirrors the memory starting at @orig.
+ */
+static void make_alias(IoTKit *s, MemoryRegion *mr, const char *name,
+                       hwaddr base, hwaddr size, hwaddr orig)
+{
+    memory_region_init_alias(mr, NULL, name, &s->container, orig, size);
+    /* The alias is even lower priority than unimplemented_device regions */
+    memory_region_add_subregion_overlap(&s->container, base, mr, -1500);
+}
+
+static void init_sysbus_child(Object *parent, const char *childname,
+                              void *child, size_t childsize,
+                              const char *childtype)
+{
+    object_initialize(child, childsize, childtype);
+    object_property_add_child(parent, childname, OBJECT(child), &error_abort);
+    qdev_set_parent_bus(DEVICE(child), sysbus_get_default());
+}
+
+static void irq_status_forwarder(void *opaque, int n, int level)
+{
+    qemu_irq destirq = opaque;
+
+    qemu_set_irq(destirq, level);
+}
+
+static void nsccfg_handler(void *opaque, int n, int level)
+{
+    IoTKit *s = IOTKIT(opaque);
+
+    s->nsccfg = level;
+}
+
+static void iotkit_forward_ppc(IoTKit *s, const char *ppcname, int ppcnum)
+{
+    /* Each of the 4 AHB and 4 APB PPCs that might be present in a
+     * system using the IoTKit has a collection of control lines which
+     * are provided by the security controller and which we want to
+     * expose as control lines on the IoTKit device itself, so the
+     * code using the IoTKit can wire them up to the PPCs.
+     */
+    SplitIRQ *splitter = &s->ppc_irq_splitter[ppcnum];
+    DeviceState *iotkitdev = DEVICE(s);
+    DeviceState *dev_secctl = DEVICE(&s->secctl);
+    DeviceState *dev_splitter = DEVICE(splitter);
+    char *name;
+
+    name = g_strdup_printf("%s_nonsec", ppcname);
+    qdev_pass_gpios(dev_secctl, iotkitdev, name);
+    g_free(name);
+    name = g_strdup_printf("%s_ap", ppcname);
+    qdev_pass_gpios(dev_secctl, iotkitdev, name);
+    g_free(name);
+    name = g_strdup_printf("%s_irq_enable", ppcname);
+    qdev_pass_gpios(dev_secctl, iotkitdev, name);
+    g_free(name);
+    name = g_strdup_printf("%s_irq_clear", ppcname);
+    qdev_pass_gpios(dev_secctl, iotkitdev, name);
+    g_free(name);
+
+    /* irq_status is a little more tricky, because we need to
+     * split it so we can send it both to the security controller
+     * and to our OR gate for the NVIC interrupt line.
+     * Connect up the splitter's outputs, and create a GPIO input
+     * which will pass the line state to the splitter's input.
+     */
+    name = g_strdup_printf("%s_irq_status", ppcname);
+    qdev_connect_gpio_out(dev_splitter, 0,
+                          qdev_get_gpio_in_named(dev_secctl,
+                                                 name, 0));
+    qdev_connect_gpio_out(dev_splitter, 1,
+                          qdev_get_gpio_in(DEVICE(&s->ppc_irq_orgate), ppcnum));
+    s->irq_status_in[ppcnum] = qdev_get_gpio_in(dev_splitter, 0);
+    qdev_init_gpio_in_named_with_opaque(iotkitdev, irq_status_forwarder,
+                                        s->irq_status_in[ppcnum], name, 1);
+    g_free(name);
+}
+
+static void iotkit_forward_sec_resp_cfg(IoTKit *s)
+{
+    /* Forward the 3rd output from the splitter device as a
+     * named GPIO output of the iotkit object.
+     */
+    DeviceState *dev = DEVICE(s);
+    DeviceState *dev_splitter = DEVICE(&s->sec_resp_splitter);
+
+    qdev_init_gpio_out_named(dev, &s->sec_resp_cfg, "sec_resp_cfg", 1);
+    s->sec_resp_cfg_in = qemu_allocate_irq(irq_status_forwarder,
+                                           s->sec_resp_cfg, 1);
+    qdev_connect_gpio_out(dev_splitter, 2, s->sec_resp_cfg_in);
+}
+
+static void iotkit_init(Object *obj)
+{
+    IoTKit *s = IOTKIT(obj);
+    int i;
+
+    memory_region_init(&s->container, obj, "iotkit-container", UINT64_MAX);
+
+    init_sysbus_child(obj, "armv7m", &s->armv7m, sizeof(s->armv7m),
+                      TYPE_ARMV7M);
+    qdev_prop_set_string(DEVICE(&s->armv7m), "cpu-type",
+                         ARM_CPU_TYPE_NAME("cortex-m33"));
+
+    init_sysbus_child(obj, "secctl", &s->secctl, sizeof(s->secctl),
+                      TYPE_IOTKIT_SECCTL);
+    init_sysbus_child(obj, "apb-ppc0", &s->apb_ppc0, sizeof(s->apb_ppc0),
+                      TYPE_TZ_PPC);
+    init_sysbus_child(obj, "apb-ppc1", &s->apb_ppc1, sizeof(s->apb_ppc1),
+                      TYPE_TZ_PPC);
+    init_sysbus_child(obj, "timer0", &s->timer0, sizeof(s->timer0),
+                      TYPE_CMSDK_APB_TIMER);
+    init_sysbus_child(obj, "timer1", &s->timer1, sizeof(s->timer1),
+                      TYPE_CMSDK_APB_TIMER);
+    init_sysbus_child(obj, "dualtimer", &s->dualtimer, sizeof(s->dualtimer),
+                      TYPE_UNIMPLEMENTED_DEVICE);
+    object_initialize(&s->ppc_irq_orgate, sizeof(s->ppc_irq_orgate),
+                      TYPE_OR_IRQ);
+    object_property_add_child(obj, "ppc-irq-orgate",
+                              OBJECT(&s->ppc_irq_orgate), &error_abort);
+    object_initialize(&s->sec_resp_splitter, sizeof(s->sec_resp_splitter),
+                      TYPE_SPLIT_IRQ);
+    object_property_add_child(obj, "sec-resp-splitter",
+                              OBJECT(&s->sec_resp_splitter), &error_abort);
+    for (i = 0; i < ARRAY_SIZE(s->ppc_irq_splitter); i++) {
+        char *name = g_strdup_printf("ppc-irq-splitter-%d", i);
+        SplitIRQ *splitter = &s->ppc_irq_splitter[i];
+
+        object_initialize(splitter, sizeof(*splitter), TYPE_SPLIT_IRQ);
+        object_property_add_child(obj, name, OBJECT(splitter), &error_abort);
+        g_free(name);
+    }
+    init_sysbus_child(obj, "s32ktimer", &s->s32ktimer, sizeof(s->s32ktimer),
+                      TYPE_UNIMPLEMENTED_DEVICE);
+}
+
+static void iotkit_exp_irq(void *opaque, int n, int level)
+{
+    IoTKit *s = IOTKIT(opaque);
+
+    qemu_set_irq(s->exp_irqs[n], level);
+}
+
+static void iotkit_realize(DeviceState *dev, Error **errp)
+{
+    IoTKit *s = IOTKIT(dev);
+    int i;
+    MemoryRegion *mr;
+    Error *err = NULL;
+    SysBusDevice *sbd_apb_ppc0;
+    SysBusDevice *sbd_secctl;
+    DeviceState *dev_apb_ppc0;
+    DeviceState *dev_apb_ppc1;
+    DeviceState *dev_secctl;
+    DeviceState *dev_splitter;
+
+    if (!s->board_memory) {
+        error_setg(errp, "memory property was not set");
+        return;
+    }
+
+    if (!s->mainclk_frq) {
+        error_setg(errp, "MAINCLK property was not set");
+        return;
+    }
+
+    /* Handling of which devices should be available only to secure
+     * code is usually done differently for M profile than for A profile.
+     * Instead of putting some devices only into the secure address space,
+     * devices exist in both address spaces but with hard-wired security
+     * permissions that will cause the CPU to fault for non-secure accesses.
+     *
+     * The IoTKit has an IDAU (Implementation Defined Attribution Unit),
+     * which specifies hard-wired security permissions for different
+     * areas of the physical address space. For the IoTKit IDAU, the
+     * top 4 bits of the physical address are the IDAU region ID, and
+     * if bit 28 (ie the lowest bit of the ID) is 0 then this is an NS
+     * region, otherwise it is an S region.
+     *
+     * The various devices and RAMs are generally all mapped twice,
+     * once into a region that the IDAU defines as secure and once
+     * into a non-secure region. They sit behind either a Memory
+     * Protection Controller (for RAM) or a Peripheral Protection
+     * Controller (for devices), which allow a more fine grained
+     * configuration of whether non-secure accesses are permitted.
+     *
+     * (The other place that guest software can configure security
+     * permissions is in the architected SAU (Security Attribution
+     * Unit), which is entirely inside the CPU. The IDAU can upgrade
+     * the security attributes for a region to more restrictive than
+     * the SAU specifies, but cannot downgrade them.)
+     *
+     * 0x10000000..0x1fffffff  alias of 0x00000000..0x0fffffff
+     * 0x20000000..0x20007fff  32KB FPGA block RAM
+     * 0x30000000..0x3fffffff  alias of 0x20000000..0x2fffffff
+     * 0x40000000..0x4000ffff  base peripheral region 1
+     * 0x40010000..0x4001ffff  CPU peripherals (none for IoTKit)
+     * 0x40020000..0x4002ffff  system control element peripherals
+     * 0x40080000..0x400fffff  base peripheral region 2
+     * 0x50000000..0x5fffffff  alias of 0x40000000..0x4fffffff
+     */
+
+    memory_region_add_subregion_overlap(&s->container, 0, s->board_memory, -1);
+
+    qdev_prop_set_uint32(DEVICE(&s->armv7m), "num-irq", s->exp_numirq + 32);
+    /* In real hardware the initial Secure VTOR is set from the INITSVTOR0
+     * register in the IoT Kit System Control Register block, and the
+     * initial value of that is in turn specifiable by the FPGA that
+     * instantiates the IoT Kit. In QEMU we don't implement this wrinkle,
+     * and simply set the CPU's init-svtor to the IoT Kit default value.
+     */
+    qdev_prop_set_uint32(DEVICE(&s->armv7m), "init-svtor", 0x10000000);
+    object_property_set_link(OBJECT(&s->armv7m), OBJECT(&s->container),
+                             "memory", &err);
+    if (err) {
+        error_propagate(errp, err);
+        return;
+    }
+    object_property_set_link(OBJECT(&s->armv7m), OBJECT(s), "idau", &err);
+    if (err) {
+        error_propagate(errp, err);
+        return;
+    }
+    object_property_set_bool(OBJECT(&s->armv7m), true, "realized", &err);
+    if (err) {
+        error_propagate(errp, err);
+        return;
+    }
+
+    /* Connect our EXP_IRQ GPIOs to the NVIC's lines 32 and up. */
+    s->exp_irqs = g_new(qemu_irq, s->exp_numirq);
+    for (i = 0; i < s->exp_numirq; i++) {
+        s->exp_irqs[i] = qdev_get_gpio_in(DEVICE(&s->armv7m), i + 32);
+    }
+    qdev_init_gpio_in_named(dev, iotkit_exp_irq, "EXP_IRQ", s->exp_numirq);
+
+    /* Set up the big aliases first */
+    make_alias(s, &s->alias1, "alias 1", 0x10000000, 0x10000000, 0x00000000);
+    make_alias(s, &s->alias2, "alias 2", 0x30000000, 0x10000000, 0x20000000);
+    /* The 0x50000000..0x5fffffff region is not a pure alias: it has
+     * a few extra devices that only appear there (generally the
+     * control interfaces for the protection controllers).
+     * We implement this by mapping those devices over the top of this
+     * alias MR at a higher priority.
+     */
+    make_alias(s, &s->alias3, "alias 3", 0x50000000, 0x10000000, 0x40000000);
+
+    /* This RAM should be behind a Memory Protection Controller, but we
+     * don't implement that yet.
+     */
+    memory_region_init_ram(&s->sram0, NULL, "iotkit.sram0", 0x00008000, &err);
+    if (err) {
+        error_propagate(errp, err);
+        return;
+    }
+    memory_region_add_subregion(&s->container, 0x20000000, &s->sram0);
+
+    /* Security controller */
+    object_property_set_bool(OBJECT(&s->secctl), true, "realized", &err);
+    if (err) {
+        error_propagate(errp, err);
+        return;
+    }
+    sbd_secctl = SYS_BUS_DEVICE(&s->secctl);
+    dev_secctl = DEVICE(&s->secctl);
+    sysbus_mmio_map(sbd_secctl, 0, 0x50080000);
+    sysbus_mmio_map(sbd_secctl, 1, 0x40080000);
+
+    s->nsc_cfg_in = qemu_allocate_irq(nsccfg_handler, s, 1);
+    qdev_connect_gpio_out_named(dev_secctl, "nsc_cfg", 0, s->nsc_cfg_in);
+
+    /* The sec_resp_cfg output from the security controller must be split into
+     * multiple lines, one for each of the PPCs within the IoTKit and one
+     * that will be an output from the IoTKit to the system.
+     */
+    object_property_set_int(OBJECT(&s->sec_resp_splitter), 3,
+                            "num-lines", &err);
+    if (err) {
+        error_propagate(errp, err);
+        return;
+    }
+    object_property_set_bool(OBJECT(&s->sec_resp_splitter), true,
+                             "realized", &err);
+    if (err) {
+        error_propagate(errp, err);
+        return;
+    }
+    dev_splitter = DEVICE(&s->sec_resp_splitter);
+    qdev_connect_gpio_out_named(dev_secctl, "sec_resp_cfg", 0,
+                                qdev_get_gpio_in(dev_splitter, 0));
+
+    /* Devices behind APB PPC0:
+     *   0x40000000: timer0
+     *   0x40001000: timer1
+     *   0x40002000: dual timer
+     * We must configure and realize each downstream device and connect
+     * it to the appropriate PPC port; then we can realize the PPC and
+     * map its upstream ends to the right place in the container.
+     */
+    qdev_prop_set_uint32(DEVICE(&s->timer0), "pclk-frq", s->mainclk_frq);
+    object_property_set_bool(OBJECT(&s->timer0), true, "realized", &err);
+    if (err) {
+        error_propagate(errp, err);
+        return;
+    }
+    sysbus_connect_irq(SYS_BUS_DEVICE(&s->timer0), 0,
+                       qdev_get_gpio_in(DEVICE(&s->armv7m), 3));
+    mr = sysbus_mmio_get_region(SYS_BUS_DEVICE(&s->timer0), 0);
+    object_property_set_link(OBJECT(&s->apb_ppc0), OBJECT(mr), "port[0]", &err);
+    if (err) {
+        error_propagate(errp, err);
+        return;
+    }
+
+    qdev_prop_set_uint32(DEVICE(&s->timer1), "pclk-frq", s->mainclk_frq);
+    object_property_set_bool(OBJECT(&s->timer1), true, "realized", &err);
+    if (err) {
+        error_propagate(errp, err);
+        return;
+    }
+    sysbus_connect_irq(SYS_BUS_DEVICE(&s->timer1), 0,
+                       qdev_get_gpio_in(DEVICE(&s->armv7m), 4));
+    mr = sysbus_mmio_get_region(SYS_BUS_DEVICE(&s->timer1), 0);
+    object_property_set_link(OBJECT(&s->apb_ppc0), OBJECT(mr), "port[1]", &err);
+    if (err) {
+        error_propagate(errp, err);
+        return;
+    }
+
+    qdev_prop_set_string(DEVICE(&s->dualtimer), "name", "Dual timer");
+    qdev_prop_set_uint64(DEVICE(&s->dualtimer), "size", 0x1000);
+    object_property_set_bool(OBJECT(&s->dualtimer), true, "realized", &err);
+    if (err) {
+        error_propagate(errp, err);
+        return;
+    }
+    mr = sysbus_mmio_get_region(SYS_BUS_DEVICE(&s->dualtimer), 0);
+    object_property_set_link(OBJECT(&s->apb_ppc0), OBJECT(mr), "port[2]", &err);
+    if (err) {
+        error_propagate(errp, err);
+        return;
+    }
+
+    object_property_set_bool(OBJECT(&s->apb_ppc0), true, "realized", &err);
+    if (err) {
+        error_propagate(errp, err);
+        return;
+    }
+
+    sbd_apb_ppc0 = SYS_BUS_DEVICE(&s->apb_ppc0);
+    dev_apb_ppc0 = DEVICE(&s->apb_ppc0);
+
+    mr = sysbus_mmio_get_region(sbd_apb_ppc0, 0);
+    memory_region_add_subregion(&s->container, 0x40000000, mr);
+    mr = sysbus_mmio_get_region(sbd_apb_ppc0, 1);
+    memory_region_add_subregion(&s->container, 0x40001000, mr);
+    mr = sysbus_mmio_get_region(sbd_apb_ppc0, 2);
+    memory_region_add_subregion(&s->container, 0x40002000, mr);
+    for (i = 0; i < IOTS_APB_PPC0_NUM_PORTS; i++) {
+        qdev_connect_gpio_out_named(dev_secctl, "apb_ppc0_nonsec", i,
+                                    qdev_get_gpio_in_named(dev_apb_ppc0,
+                                                           "cfg_nonsec", i));
+        qdev_connect_gpio_out_named(dev_secctl, "apb_ppc0_ap", i,
+                                    qdev_get_gpio_in_named(dev_apb_ppc0,
+                                                           "cfg_ap", i));
+    }
+    qdev_connect_gpio_out_named(dev_secctl, "apb_ppc0_irq_enable", 0,
+                                qdev_get_gpio_in_named(dev_apb_ppc0,
+                                                       "irq_enable", 0));
+    qdev_connect_gpio_out_named(dev_secctl, "apb_ppc0_irq_clear", 0,
+                                qdev_get_gpio_in_named(dev_apb_ppc0,
+                                                       "irq_clear", 0));
+    qdev_connect_gpio_out(dev_splitter, 0,
+                          qdev_get_gpio_in_named(dev_apb_ppc0,
+                                                 "cfg_sec_resp", 0));
+
+    /* All the PPC irq lines (from the 2 internal PPCs and the 8 external
+     * ones) are sent individually to the security controller, and also
+     * ORed together to give a single combined PPC interrupt to the NVIC.
+     */
+    object_property_set_int(OBJECT(&s->ppc_irq_orgate),
+                            NUM_PPCS, "num-lines", &err);
+    if (err) {
+        error_propagate(errp, err);
+        return;
+    }
+    object_property_set_bool(OBJECT(&s->ppc_irq_orgate), true,
+                             "realized", &err);
+    if (err) {
+        error_propagate(errp, err);
+        return;
+    }
+    qdev_connect_gpio_out(DEVICE(&s->ppc_irq_orgate), 0,
+                          qdev_get_gpio_in(DEVICE(&s->armv7m), 10));
+
+    /* 0x40010000 .. 0x4001ffff: private CPU region: unused in IoTKit */
+
+    /* 0x40020000 .. 0x4002ffff : IoTKit system control peripheral region */
+    /* Devices behind APB PPC1:
+     *   0x4002f000: S32K timer
+     */
+    qdev_prop_set_string(DEVICE(&s->s32ktimer), "name", "S32KTIMER");
+    qdev_prop_set_uint64(DEVICE(&s->s32ktimer), "size", 0x1000);
+    object_property_set_bool(OBJECT(&s->s32ktimer), true, "realized", &err);
+    if (err) {
+        error_propagate(errp, err);
+        return;
+    }
+    mr = sysbus_mmio_get_region(SYS_BUS_DEVICE(&s->s32ktimer), 0);
+    object_property_set_link(OBJECT(&s->apb_ppc1), OBJECT(mr), "port[0]", &err);
+    if (err) {
+        error_propagate(errp, err);
+        return;
+    }
+
+    object_property_set_bool(OBJECT(&s->apb_ppc1), true, "realized", &err);
+    if (err) {
+        error_propagate(errp, err);
+        return;
+    }
+    mr = sysbus_mmio_get_region(SYS_BUS_DEVICE(&s->apb_ppc1), 0);
+    memory_region_add_subregion(&s->container, 0x4002f000, mr);
+
+    dev_apb_ppc1 = DEVICE(&s->apb_ppc1);
+    qdev_connect_gpio_out_named(dev_secctl, "apb_ppc1_nonsec", 0,
+                                qdev_get_gpio_in_named(dev_apb_ppc1,
+                                                       "cfg_nonsec", 0));
+    qdev_connect_gpio_out_named(dev_secctl, "apb_ppc1_ap", 0,
+                                qdev_get_gpio_in_named(dev_apb_ppc1,
+                                                       "cfg_ap", 0));
+    qdev_connect_gpio_out_named(dev_secctl, "apb_ppc1_irq_enable", 0,
+                                qdev_get_gpio_in_named(dev_apb_ppc1,
+                                                       "irq_enable", 0));
+    qdev_connect_gpio_out_named(dev_secctl, "apb_ppc1_irq_clear", 0,
+                                qdev_get_gpio_in_named(dev_apb_ppc1,
+                                                       "irq_clear", 0));
+    qdev_connect_gpio_out(dev_splitter, 1,
+                          qdev_get_gpio_in_named(dev_apb_ppc1,
+                                                 "cfg_sec_resp", 0));
+
+    /* Using create_unimplemented_device() maps the stub into the
+     * system address space rather than into our container, but the
+     * overall effect on the guest is the same.
+     */
+    create_unimplemented_device("SYSINFO", 0x40020000, 0x1000);
+
+    create_unimplemented_device("SYSCONTROL", 0x50021000, 0x1000);
+    create_unimplemented_device("S32KWATCHDOG", 0x5002e000, 0x1000);
+
+    /* 0x40080000 .. 0x4008ffff : IoTKit second Base peripheral region */
+
+    create_unimplemented_device("NS watchdog", 0x40081000, 0x1000);
+    create_unimplemented_device("S watchdog", 0x50081000, 0x1000);
+
+    create_unimplemented_device("SRAM0 MPC", 0x50083000, 0x1000);
+
+    for (i = 0; i < ARRAY_SIZE(s->ppc_irq_splitter); i++) {
+        Object *splitter = OBJECT(&s->ppc_irq_splitter[i]);
+
+        object_property_set_int(splitter, 2, "num-lines", &err);
+        if (err) {
+            error_propagate(errp, err);
+            return;
+        }
+        object_property_set_bool(splitter, true, "realized", &err);
+        if (err) {
+            error_propagate(errp, err);
+            return;
+        }
+    }
+
+    for (i = 0; i < IOTS_NUM_AHB_EXP_PPC; i++) {
+        char *ppcname = g_strdup_printf("ahb_ppcexp%d", i);
+
+        iotkit_forward_ppc(s, ppcname, i);
+        g_free(ppcname);
+    }
+
+    for (i = 0; i < IOTS_NUM_APB_EXP_PPC; i++) {
+        char *ppcname = g_strdup_printf("apb_ppcexp%d", i);
+
+        iotkit_forward_ppc(s, ppcname, i + IOTS_NUM_AHB_EXP_PPC);
+        g_free(ppcname);
+    }
+
+    for (i = NUM_EXTERNAL_PPCS; i < NUM_PPCS; i++) {
+        /* Wire up IRQ splitter for internal PPCs */
+        DeviceState *devs = DEVICE(&s->ppc_irq_splitter[i]);
+        char *gpioname = g_strdup_printf("apb_ppc%d_irq_status",
+                                         i - NUM_EXTERNAL_PPCS);
+        TZPPC *ppc = (i == NUM_EXTERNAL_PPCS) ? &s->apb_ppc0 : &s->apb_ppc1;
+
+        qdev_connect_gpio_out(devs, 0,
+                              qdev_get_gpio_in_named(dev_secctl, gpioname, 0));
+        qdev_connect_gpio_out(devs, 1,
+                              qdev_get_gpio_in(DEVICE(&s->ppc_irq_orgate), i));
+        qdev_connect_gpio_out_named(DEVICE(ppc), "irq", 0,
+                                    qdev_get_gpio_in(devs, 0));
+        g_free(gpioname);
+    }
+
+    iotkit_forward_sec_resp_cfg(s);
+
+    system_clock_scale = NANOSECONDS_PER_SECOND / s->mainclk_frq;
+}
+
+static void iotkit_idau_check(IDAUInterface *ii, uint32_t address,
+                              int *iregion, bool *exempt, bool *ns, bool *nsc)
+{
+    /* For IoTKit systems the IDAU responses are simple logical functions
+     * of the address bits. The NSC attribute is guest-adjustable via the
+     * NSCCFG register in the security controller.
+     */
+    IoTKit *s = IOTKIT(ii);
+    int region = extract32(address, 28, 4);
+
+    *ns = !(region & 1);
+    *nsc = (region == 1 && (s->nsccfg & 1)) || (region == 3 && (s->nsccfg & 2));
+    /* 0xe0000000..0xe00fffff and 0xf0000000..0xf00fffff are exempt */
+    *exempt = (address & 0xeff00000) == 0xe0000000;
+    *iregion = region;
+}
+
+static const VMStateDescription iotkit_vmstate = {
+    .name = "iotkit",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .fields = (VMStateField[]) {
+        VMSTATE_UINT32(nsccfg, IoTKit),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
+static Property iotkit_properties[] = {
+    DEFINE_PROP_LINK("memory", IoTKit, board_memory, TYPE_MEMORY_REGION,
+                     MemoryRegion *),
+    DEFINE_PROP_UINT32("EXP_NUMIRQ", IoTKit, exp_numirq, 64),
+    DEFINE_PROP_UINT32("MAINCLK", IoTKit, mainclk_frq, 0),
+    DEFINE_PROP_END_OF_LIST()
+};
+
+static void iotkit_reset(DeviceState *dev)
+{
+    IoTKit *s = IOTKIT(dev);
+
+    s->nsccfg = 0;
+}
+
+static void iotkit_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+    IDAUInterfaceClass *iic = IDAU_INTERFACE_CLASS(klass);
+
+    dc->realize = iotkit_realize;
+    dc->vmsd = &iotkit_vmstate;
+    dc->props = iotkit_properties;
+    dc->reset = iotkit_reset;
+    iic->check = iotkit_idau_check;
+}
+
+static const TypeInfo iotkit_info = {
+    .name = TYPE_IOTKIT,
+    .parent = TYPE_SYS_BUS_DEVICE,
+    .instance_size = sizeof(IoTKit),
+    .instance_init = iotkit_init,
+    .class_init = iotkit_class_init,
+    .interfaces = (InterfaceInfo[]) {
+        { TYPE_IDAU_INTERFACE },
+        { }
+    }
+};
+
+static void iotkit_register_types(void)
+{
+    type_register_static(&iotkit_info);
+}
+
+type_init(iotkit_register_types);
diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
index XXXXXXX..XXXXXXX 100644
--- a/default-configs/arm-softmmu.mak
+++ b/default-configs/arm-softmmu.mak
@@ -XXX,XX +XXX,XX @@ CONFIG_MPS2_FPGAIO=y
 CONFIG_MPS2_SCC=y
 
 CONFIG_TZ_PPC=y
+CONFIG_IOTKIT=y
 CONFIG_IOTKIT_SECCTL=y
 
 CONFIG_VERSATILE_PCI=y
-- 
2.16.2
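
Reviewer sketch (not part of the patch): the IDAU decision made by
iotkit_idau_check() depends only on the address bits and the NSCCFG value,
so it can be tried out in isolation. The standalone C below is a minimal
sketch under that assumption. The names IdauResult, idau_check_sketch and
secure_alias_of are hypothetical and exist only for illustration; the patch
itself returns the four results through out-parameters of the IDAU interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result struct; the patch uses four out-parameters instead. */
typedef struct {
    int iregion;   /* IDAU region ID, taken from address bits [31:28] */
    bool exempt;   /* address is exempt from security attribution */
    bool ns;       /* region is Non-secure */
    bool nsc;      /* region is Non-secure Callable */
} IdauResult;

/* Same logical functions of the address bits as iotkit_idau_check(). */
static IdauResult idau_check_sketch(uint32_t address, uint32_t nsccfg)
{
    IdauResult r;
    int region = (int)((address >> 28) & 0xf);

    r.iregion = region;
    /* Bit 28 clear (even region ID) means Non-secure, set means Secure. */
    r.ns = !(region & 1);
    /* NSCCFG bit 0 controls region 1, bit 1 controls region 3. */
    r.nsc = (region == 1 && (nsccfg & 1)) || (region == 3 && (nsccfg & 2));
    /* 0xe0000000..0xe00fffff and 0xf0000000..0xf00fffff are exempt. */
    r.exempt = (address & 0xeff00000u) == 0xe0000000u;
    return r;
}

/*
 * Hypothetical helper: Secure alias of an address in one of the three
 * Non-secure regions (0x0, 0x2, 0x4) that the memory map comment lists
 * as aliased at 0x1, 0x3 and 0x5.
 */
static uint32_t secure_alias_of(uint32_t ns_addr)
{
    uint32_t region = ns_addr >> 28;

    assert(region == 0x0 || region == 0x2 || region == 0x4);
    return ns_addr | 0x10000000u;
}
```

For example, idau_check_sketch(0x10000000, 1) reports a Secure region with
NSC set, and secure_alias_of(0x40000000) gives 0x50000000, the start of the
Secure alias of the peripheral region.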

Define a new board model for the MPS2 with an AN505 FPGA image
containing a Cortex-M33. Since the FPGA images for TrustZone
cores (AN505, and the similar AN519 for Cortex-M23) have a
significantly different layout of devices to the non-TrustZone
images, we use a new source file rather than shoehorning them
into the existing mps2.c.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180220180325.29818-20-peter.maydell@linaro.org
---
 hw/arm/Makefile.objs |   1 +
 hw/arm/mps2-tz.c     | 503 +++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 504 insertions(+)
 create mode 100644 hw/arm/mps2-tz.c
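
Reviewer sketch (not part of the patch): make_uart() in this file derives
each UART's interrupt lines from its index in the uart[] array, and
iotkit_realize() routes EXP_IRQ n to NVIC line n + 32. The standalone C
below is a minimal sketch of that arithmetic only. UartIrqs, uart_exp_irqs
and nvic_line_for_exp_irq are hypothetical names used for illustration;
the patch computes the same values inline and wires them up with
sysbus_connect_irq().

```c
#include <assert.h>

#define NUM_UARTS 5  /* matches uart[5] in MPS2TZMachineState */

/* Hypothetical container for the per-UART EXP_IRQ numbers. */
typedef struct {
    int rx;        /* EXP_IRQ line used for the receive interrupt */
    int tx;        /* EXP_IRQ line used for the transmit interrupt */
    int combined;  /* EXP_IRQ line used for the combined interrupt */
} UartIrqs;

/* Same index arithmetic as make_uart(): rx = 2i, tx = 2i + 1, comb = i + 10. */
static UartIrqs uart_exp_irqs(int i)
{
    UartIrqs irqs;

    assert(i >= 0 && i < NUM_UARTS);
    irqs.rx = i * 2;
    irqs.tx = i * 2 + 1;
    irqs.combined = i + 10;
    return irqs;
}

/* iotkit_realize() connects EXP_IRQ n to NVIC input line n + 32. */
static int nvic_line_for_exp_irq(int exp_irq)
{
    return exp_irq + 32;
}
```

With these assumptions, UART 2 uses EXP_IRQ 4 for receive, 5 for transmit
and 12 for the combined interrupt, which corresponds to NVIC line 44.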

diff --git a/hw/arm/Makefile.objs b/hw/arm/Makefile.objs
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/Makefile.objs
+++ b/hw/arm/Makefile.objs
@@ -XXX,XX +XXX,XX @@ obj-$(CONFIG_FSL_IMX31) += fsl-imx31.o kzm.o
 obj-$(CONFIG_FSL_IMX6) += fsl-imx6.o sabrelite.o
 obj-$(CONFIG_ASPEED_SOC) += aspeed_soc.o aspeed.o
 obj-$(CONFIG_MPS2) += mps2.o
+obj-$(CONFIG_MPS2) += mps2-tz.o
 obj-$(CONFIG_MSF2) += msf2-soc.o msf2-som.o
 obj-$(CONFIG_IOTKIT) += iotkit.o
diff --git a/hw/arm/mps2-tz.c b/hw/arm/mps2-tz.c
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/hw/arm/mps2-tz.c
@@ -XXX,XX +XXX,XX @@
+/*
+ * ARM V2M MPS2 board emulation, TrustZone-aware FPGA images
+ *
+ * Copyright (c) 2017 Linaro Limited
+ * Written by Peter Maydell
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License version 2 or
+ *  (at your option) any later version.
+ */
+
+/* The MPS2 and MPS2+ dev boards are FPGA based (the 2+ has a bigger
+ * FPGA but is otherwise the same as the 2). Since the CPU itself
+ * and most of the devices are in the FPGA, the details of the board
+ * as seen by the guest depend significantly on the FPGA image.
+ * This source file covers the following FPGA images, for TrustZone cores:
+ *  "mps2-an505" -- Cortex-M33 as documented in ARM Application Note AN505
+ *
+ * Links to the TRM for the board itself and to the various Application
+ * Notes which document the FPGA images can be found here:
+ * https://developer.arm.com/products/system-design/development-boards/fpga-prototyping-boards/mps2
+ *
+ * Board TRM:
+ * http://infocenter.arm.com/help/topic/com.arm.doc.100112_0200_06_en/versatile_express_cortex_m_prototyping_systems_v2m_mps2_and_v2m_mps2plus_technical_reference_100112_0200_06_en.pdf
+ * Application Note AN505:
+ * http://infocenter.arm.com/help/topic/com.arm.doc.dai0505b/index.html
+ *
+ * The AN505 defers to the Cortex-M33 processor ARMv8M IoT Kit FVP User Guide
+ * (ARM ECM0601256) for the details of some of the device layout:
+ *   http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ecm0601256/index.html
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qemu/error-report.h"
+#include "hw/arm/arm.h"
+#include "hw/arm/armv7m.h"
+#include "hw/or-irq.h"
+#include "hw/boards.h"
+#include "exec/address-spaces.h"
+#include "sysemu/sysemu.h"
+#include "hw/misc/unimp.h"
+#include "hw/char/cmsdk-apb-uart.h"
+#include "hw/timer/cmsdk-apb-timer.h"
+#include "hw/misc/mps2-scc.h"
+#include "hw/misc/mps2-fpgaio.h"
+#include "hw/arm/iotkit.h"
+#include "hw/devices.h"
+#include "net/net.h"
+#include "hw/core/split-irq.h"
+
+typedef enum MPS2TZFPGAType {
+    FPGA_AN505,
+} MPS2TZFPGAType;
+
+typedef struct {
+    MachineClass parent;
+    MPS2TZFPGAType fpga_type;
+    uint32_t scc_id;
+} MPS2TZMachineClass;
+
+typedef struct {
+    MachineState parent;
+
+    IoTKit iotkit;
+    MemoryRegion psram;
+    MemoryRegion ssram1;
+    MemoryRegion ssram1_m;
+    MemoryRegion ssram23;
+    MPS2SCC scc;
+    MPS2FPGAIO fpgaio;
+    TZPPC ppc[5];
+    UnimplementedDeviceState ssram_mpc[3];
+    UnimplementedDeviceState spi[5];
+    UnimplementedDeviceState i2c[4];
+    UnimplementedDeviceState i2s_audio;
+    UnimplementedDeviceState gpio[5];
+    UnimplementedDeviceState dma[4];
+    UnimplementedDeviceState gfx;
+    CMSDKAPBUART uart[5];
+    SplitIRQ sec_resp_splitter;
+    qemu_or_irq uart_irq_orgate;
+} MPS2TZMachineState;
+
+#define TYPE_MPS2TZ_MACHINE "mps2tz"
+#define TYPE_MPS2TZ_AN505_MACHINE MACHINE_TYPE_NAME("mps2-an505")
+
+#define MPS2TZ_MACHINE(obj) \
+    OBJECT_CHECK(MPS2TZMachineState, obj, TYPE_MPS2TZ_MACHINE)
+#define MPS2TZ_MACHINE_GET_CLASS(obj) \
+    OBJECT_GET_CLASS(MPS2TZMachineClass, obj, TYPE_MPS2TZ_MACHINE)
+#define MPS2TZ_MACHINE_CLASS(klass) \
+    OBJECT_CLASS_CHECK(MPS2TZMachineClass, klass, TYPE_MPS2TZ_MACHINE)
+
+/* Main SYSCLK frequency in Hz */
+#define SYSCLK_FRQ 20000000
+
+/* Initialize the auxiliary RAM region @mr and map it into
+ * the memory map at @base.
+ */
+static void make_ram(MemoryRegion *mr, const char *name,
+                     hwaddr base, hwaddr size)
+{
+    memory_region_init_ram(mr, NULL, name, size, &error_fatal);
+    memory_region_add_subregion(get_system_memory(), base, mr);
+}
+
+/* Create an alias of an entire original MemoryRegion @orig
+ * located at @base in the memory map.
+ */
+static void make_ram_alias(MemoryRegion *mr, const char *name,
+                           MemoryRegion *orig, hwaddr base)
+{
+    memory_region_init_alias(mr, NULL, name, orig, 0,
+                             memory_region_size(orig));
+    memory_region_add_subregion(get_system_memory(), base, mr);
+}
+
+static void init_sysbus_child(Object *parent, const char *childname,
+                              void *child, size_t childsize,
+                              const char *childtype)
+{
+    object_initialize(child, childsize, childtype);
+    object_property_add_child(parent, childname, OBJECT(child), &error_abort);
+    qdev_set_parent_bus(DEVICE(child), sysbus_get_default());
+
+}
+
+/* Most of the devices in the AN505 FPGA image sit behind
+ * Peripheral Protection Controllers. These data structures
+ * define the layout of which devices sit behind which PPCs.
+ * The devfn for each port is a function which creates, configures
+ * and initializes the device, returning the MemoryRegion which
+ * needs to be plugged into the downstream end of the PPC port.
+ */
+typedef MemoryRegion *MakeDevFn(MPS2TZMachineState *mms, void *opaque,
+                                const char *name, hwaddr size);
+
+typedef struct PPCPortInfo {
+    const char *name;
+    MakeDevFn *devfn;
+    void *opaque;
+    hwaddr addr;
+    hwaddr size;
+} PPCPortInfo;
+
+typedef struct PPCInfo {
+    const char *name;
+    PPCPortInfo ports[TZ_NUM_PORTS];
+} PPCInfo;
+
+static MemoryRegion *make_unimp_dev(MPS2TZMachineState *mms,
+                                    void *opaque,
+                                    const char *name, hwaddr size)
+{
+    /* Initialize, configure and realize a TYPE_UNIMPLEMENTED_DEVICE,
+     * and return a pointer to its MemoryRegion.
+     */
+    UnimplementedDeviceState *uds = opaque;
+
+    init_sysbus_child(OBJECT(mms), name, uds,
+                      sizeof(UnimplementedDeviceState),
+                      TYPE_UNIMPLEMENTED_DEVICE);
+    qdev_prop_set_string(DEVICE(uds), "name", name);
+    qdev_prop_set_uint64(DEVICE(uds), "size", size);
+    object_property_set_bool(OBJECT(uds), true, "realized", &error_fatal);
+    return sysbus_mmio_get_region(SYS_BUS_DEVICE(uds), 0);
+}
+
+static MemoryRegion *make_uart(MPS2TZMachineState *mms, void *opaque,
+                               const char *name, hwaddr size)
+{
+    CMSDKAPBUART *uart = opaque;
+    int i = uart - &mms->uart[0];
+    Chardev *uartchr = i < MAX_SERIAL_PORTS ? serial_hds[i] : NULL;
+    int rxirqno = i * 2;
+    int txirqno = i * 2 + 1;
+    int combirqno = i + 10;
+    SysBusDevice *s;
+    DeviceState *iotkitdev = DEVICE(&mms->iotkit);
+    DeviceState *orgate_dev = DEVICE(&mms->uart_irq_orgate);
+
+    init_sysbus_child(OBJECT(mms), name, uart,
+                      sizeof(mms->uart[0]), TYPE_CMSDK_APB_UART);
+    qdev_prop_set_chr(DEVICE(uart), "chardev", uartchr);
+    qdev_prop_set_uint32(DEVICE(uart), "pclk-frq", SYSCLK_FRQ);
+    object_property_set_bool(OBJECT(uart), true, "realized", &error_fatal);
+    s = SYS_BUS_DEVICE(uart);
+    sysbus_connect_irq(s, 0, qdev_get_gpio_in_named(iotkitdev,
+                                                    "EXP_IRQ", txirqno));
+    sysbus_connect_irq(s, 1, qdev_get_gpio_in_named(iotkitdev,
+                                                    "EXP_IRQ", rxirqno));
+    sysbus_connect_irq(s, 2, qdev_get_gpio_in(orgate_dev, i * 2));
+    sysbus_connect_irq(s, 3, qdev_get_gpio_in(orgate_dev, i * 2 + 1));
+    sysbus_connect_irq(s, 4, qdev_get_gpio_in_named(iotkitdev,
+                                                    "EXP_IRQ", combirqno));
+    return sysbus_mmio_get_region(SYS_BUS_DEVICE(uart), 0);
+}
+
+static MemoryRegion *make_scc(MPS2TZMachineState *mms, void *opaque,
+                              const char *name, hwaddr size)
+{
+    MPS2SCC *scc = opaque;
+    DeviceState *sccdev;
+    MPS2TZMachineClass *mmc = MPS2TZ_MACHINE_GET_CLASS(mms);
+
+    object_initialize(scc, sizeof(mms->scc), TYPE_MPS2_SCC);
+    sccdev = DEVICE(scc);
+    qdev_set_parent_bus(sccdev, sysbus_get_default());
+    qdev_prop_set_uint32(sccdev, "scc-cfg4", 0x2);
+    qdev_prop_set_uint32(sccdev, "scc-aid", 0x02000008);
+    qdev_prop_set_uint32(sccdev, "scc-id", mmc->scc_id);
+    object_property_set_bool(OBJECT(scc), true, "realized", &error_fatal);
+    return sysbus_mmio_get_region(SYS_BUS_DEVICE(sccdev), 0);
+}
+
+static MemoryRegion *make_fpgaio(MPS2TZMachineState *mms, void *opaque,
+                                 const char *name, hwaddr size)
+{
+    MPS2FPGAIO *fpgaio = opaque;
+
+    object_initialize(fpgaio, sizeof(mms->fpgaio), TYPE_MPS2_FPGAIO);
+    qdev_set_parent_bus(DEVICE(fpgaio), sysbus_get_default());
+    object_property_set_bool(OBJECT(fpgaio), true, "realized", &error_fatal);
+    return sysbus_mmio_get_region(SYS_BUS_DEVICE(fpgaio), 0);
+}
+
+static void mps2tz_common_init(MachineState *machine)
+{
+    MPS2TZMachineState *mms = MPS2TZ_MACHINE(machine);
+    MachineClass *mc = MACHINE_GET_CLASS(machine);
+    MemoryRegion *system_memory = get_system_memory();
+    DeviceState *iotkitdev;
+    DeviceState *dev_splitter;
+    int i;
+
+    if (strcmp(machine->cpu_type, mc->default_cpu_type) != 0) {
+        error_report("This board can only be used with CPU %s",
+                     mc->default_cpu_type);
+        exit(1);
+    }
+
+    init_sysbus_child(OBJECT(machine), "iotkit", &mms->iotkit,
+                      sizeof(mms->iotkit), TYPE_IOTKIT);
+    iotkitdev = DEVICE(&mms->iotkit);
+    object_property_set_link(OBJECT(&mms->iotkit), OBJECT(system_memory),
+                             "memory", &error_abort);
+    qdev_prop_set_uint32(iotkitdev, "EXP_NUMIRQ", 92);
+    qdev_prop_set_uint32(iotkitdev, "MAINCLK", SYSCLK_FRQ);
+    object_property_set_bool(OBJECT(&mms->iotkit), true, "realized",
+                             &error_fatal);
+
+    /* The sec_resp_cfg output from the IoTKit must be split into multiple
+     * lines, one for each of the PPCs we create here.
+     */
+    object_initialize(&mms->sec_resp_splitter, sizeof(mms->sec_resp_splitter),
+                      TYPE_SPLIT_IRQ);
+    object_property_add_child(OBJECT(machine), "sec-resp-splitter",
+                              OBJECT(&mms->sec_resp_splitter), &error_abort);
+    object_property_set_int(OBJECT(&mms->sec_resp_splitter), 5,
+                            "num-lines", &error_fatal);
+    object_property_set_bool(OBJECT(&mms->sec_resp_splitter), true,
+                             "realized", &error_fatal);
+    dev_splitter = DEVICE(&mms->sec_resp_splitter);
+    qdev_connect_gpio_out_named(iotkitdev, "sec_resp_cfg", 0,
+                                qdev_get_gpio_in(dev_splitter, 0));
+
+    /* The IoTKit sets up much of the memory layout, including
+     * the aliases between secure and non-secure regions in the
+     * address space. The FPGA itself contains:
+     *
+     * 0x00000000..0x003fffff  SSRAM1
+     * 0x00400000..0x007fffff  alias of SSRAM1
+     * 0x28000000..0x283fffff  4MB SSRAM2 + SSRAM3
+     * 0x40100000..0x4fffffff  AHB Master Expansion 1 interface devices
+     * 0x80000000..0x80ffffff  16MB PSRAM
+     */
+
+    /* The FPGA images have an odd combination of different RAMs,
+     * because in hardware they are different implementations and
+     * connected to different buses, giving varying performance/size
+     * tradeoffs. For QEMU they're all just RAM, though. We arbitrarily
+     * call the 16MB our "system memory", as it's the largest lump.
+     */
+    memory_region_allocate_system_memory(&mms->psram,
+                                         NULL, "mps.ram", 0x01000000);
+    memory_region_add_subregion(system_memory, 0x80000000, &mms->psram);
+
+    /* The SSRAM memories should all be behind Memory Protection Controllers,
+     * but we don't implement that yet.
+     */
+    make_ram(&mms->ssram1, "mps.ssram1", 0x00000000, 0x00400000);
+    make_ram_alias(&mms->ssram1_m, "mps.ssram1_m", &mms->ssram1, 0x00400000);
+
+    make_ram(&mms->ssram23, "mps.ssram23", 0x28000000, 0x00400000);
+
+    /* The overflow IRQs for all UARTs are ORed together.
+     * Tx, Rx and "combined" IRQs are sent to the NVIC separately.
+     * Create the OR gate for this.
+     */
+    object_initialize(&mms->uart_irq_orgate, sizeof(mms->uart_irq_orgate),
+                      TYPE_OR_IRQ);
+    object_property_add_child(OBJECT(mms), "uart-irq-orgate",
+                              OBJECT(&mms->uart_irq_orgate), &error_abort);
+    object_property_set_int(OBJECT(&mms->uart_irq_orgate), 10, "num-lines",
+                            &error_fatal);
+    object_property_set_bool(OBJECT(&mms->uart_irq_orgate), true,
+                             "realized", &error_fatal);
+    qdev_connect_gpio_out(DEVICE(&mms->uart_irq_orgate), 0,
+                          qdev_get_gpio_in_named(iotkitdev, "EXP_IRQ", 15));
+
+    /* Most of the devices in the FPGA are behind Peripheral Protection
+     * Controllers. The required order for initializing things is:
+     *  + initialize the PPC
+     *  + initialize, configure and realize downstream devices
+     *  + connect downstream device MemoryRegions to the PPC
+     *  + realize the PPC
+     *  + map the PPC's MemoryRegions to the places in the address map
+     *    where the downstream devices should appear
+     *  + wire up the PPC's control lines to the IoTKit object
+     */
+
+    const PPCInfo ppcs[] = { {
+            .name = "apb_ppcexp0",
+            .ports = {
+                { "ssram-mpc0", make_unimp_dev, &mms->ssram_mpc[0],
+                  0x58007000, 0x1000 },
+                { "ssram-mpc1", make_unimp_dev, &mms->ssram_mpc[1],
+                  0x58008000, 0x1000 },
+                { "ssram-mpc2", make_unimp_dev, &mms->ssram_mpc[2],
+                  0x58009000, 0x1000 },
+            },
+        }, {
+            .name = "apb_ppcexp1",
+            .ports = {
+                { "spi0", make_unimp_dev, &mms->spi[0], 0x40205000, 0x1000 },
+                { "spi1", make_unimp_dev, &mms->spi[1], 0x40206000, 0x1000 },
+                { "spi2", make_unimp_dev, &mms->spi[2], 0x40209000, 0x1000 },
+                { "spi3", make_unimp_dev, &mms->spi[3], 0x4020a000, 0x1000 },
+                { "spi4", make_unimp_dev, &mms->spi[4], 0x4020b000, 0x1000 },
+                { "uart0", make_uart, &mms->uart[0], 0x40200000, 0x1000 },
+                { "uart1", make_uart, &mms->uart[1], 0x40201000, 0x1000 },
+                { "uart2", make_uart, &mms->uart[2], 0x40202000, 0x1000 },
+                { "uart3", make_uart, &mms->uart[3], 0x40203000, 0x1000 },
+                { "uart4", make_uart, &mms->uart[4], 0x40204000, 0x1000 },
+                { "i2c0", make_unimp_dev, &mms->i2c[0], 0x40207000, 0x1000 },
+                { "i2c1", make_unimp_dev, &mms->i2c[1], 0x40208000, 0x1000 },
+                { "i2c2", make_unimp_dev, &mms->i2c[2], 0x4020c000, 0x1000 },
+                { "i2c3", make_unimp_dev, &mms->i2c[3], 0x4020d000, 0x1000 },
+            },
+        }, {
+            .name = "apb_ppcexp2",
+            .ports = {
+                { "scc", make_scc, &mms->scc, 0x40300000, 0x1000 },
+                { "i2s-audio", make_unimp_dev, &mms->i2s_audio,
+                  0x40301000, 0x1000 },
+                { "fpgaio", make_fpgaio, &mms->fpgaio, 0x40302000, 0x1000 },
+            },
+        }, {
+            .name = "ahb_ppcexp0",
+            .ports = {
+                { "gfx", make_unimp_dev, &mms->gfx, 0x41000000, 0x140000 },
+                { "gpio0", make_unimp_dev, &mms->gpio[0], 0x40100000, 0x1000 },
+                { "gpio1", make_unimp_dev, &mms->gpio[1], 0x40101000, 0x1000 },
+                { "gpio2", make_unimp_dev, &mms->gpio[2], 0x40102000, 0x1000 },
+                { "gpio3", make_unimp_dev, &mms->gpio[3], 0x40103000, 0x1000 },
+                { "gpio4", make_unimp_dev, &mms->gpio[4], 0x40104000, 0x1000 },
+            },
+        }, {
+            .name = "ahb_ppcexp1",
+            .ports = {
+                { "dma0", make_unimp_dev, &mms->dma[0], 0x40110000, 0x1000 },
+                { "dma1", make_unimp_dev, &mms->dma[1], 0x40111000, 0x1000 },
+                { "dma2", make_unimp_dev, &mms->dma[2], 0x40112000, 0x1000 },
+                { "dma3", make_unimp_dev, &mms->dma[3], 0x40113000, 0x1000 },
+            },
+        },
+    };
+
+    for (i = 0; i < ARRAY_SIZE(ppcs); i++) {
+        const PPCInfo *ppcinfo = &ppcs[i];
+        TZPPC *ppc = &mms->ppc[i];
+        DeviceState *ppcdev;
+        int port;
+        char *gpioname;
+
+        init_sysbus_child(OBJECT(machine), ppcinfo->name, ppc,
+                          sizeof(TZPPC), TYPE_TZ_PPC);
+        ppcdev = DEVICE(ppc);
+
+        for (port = 0; port < TZ_NUM_PORTS; port++) {
+            const PPCPortInfo *pinfo = &ppcinfo->ports[port];
+            MemoryRegion *mr;
+            char *portname;
+
+            if (!pinfo->devfn) {
+                continue;
+            }
+
+            mr = pinfo->devfn(mms, pinfo->opaque, pinfo->name, pinfo->size);
+            portname = g_strdup_printf("port[%d]", port);
+            object_property_set_link(OBJECT(ppc), OBJECT(mr),
+                                     portname, &error_fatal);
+            g_free(portname);
+        }
+
+        object_property_set_bool(OBJECT(ppc), true, "realized", &error_fatal);
+
+        for (port = 0; port < TZ_NUM_PORTS; port++) {
+            const PPCPortInfo *pinfo = &ppcinfo->ports[port];
+
+            if (!pinfo->devfn) {
+                continue;
+            }
+            sysbus_mmio_map(SYS_BUS_DEVICE(ppc), port, pinfo->addr);
+
+            gpioname = g_strdup_printf("%s_nonsec", ppcinfo->name);
+            qdev_connect_gpio_out_named(iotkitdev, gpioname, port,
+                                        qdev_get_gpio_in_named(ppcdev,
+                                                               "cfg_nonsec",
+                                                               port));
+            g_free(gpioname);
+            gpioname = g_strdup_printf("%s_ap", ppcinfo->name);
+            qdev_connect_gpio_out_named(iotkitdev, gpioname, port,
+                                        qdev_get_gpio_in_named(ppcdev,
+                                                               "cfg_ap", port));
+            g_free(gpioname);
+        }
+
+        gpioname = g_strdup_printf("%s_irq_enable", ppcinfo->name);
+        qdev_connect_gpio_out_named(iotkitdev, gpioname, 0,
+                                    qdev_get_gpio_in_named(ppcdev,
+                                                           "irq_enable", 0));
+        g_free(gpioname);
+        gpioname = g_strdup_printf("%s_irq_clear", ppcinfo->name);
+        qdev_connect_gpio_out_named(iotkitdev, gpioname, 0,
+                                    qdev_get_gpio_in_named(ppcdev,
+                                                           "irq_clear", 0));
+        g_free(gpioname);
+        gpioname = g_strdup_printf("%s_irq_status", ppcinfo->name);
+        qdev_connect_gpio_out_named(ppcdev, "irq", 0,
+                                    qdev_get_gpio_in_named(iotkitdev,
+                                                           gpioname, 0));
+        g_free(gpioname);
+
+        qdev_connect_gpio_out(dev_splitter, i,
+                              qdev_get_gpio_in_named(ppcdev,
+                                                     "cfg_sec_resp", 0));
+    }
+
+    /* In hardware this is a LAN9220; the LAN9118 is software compatible
+     * except that it doesn't support the checksum-offload feature.
+     * The ethernet controller is not behind a PPC.
+     */
+    lan9118_init(&nd_table[0], 0x42000000,
+                 qdev_get_gpio_in_named(iotkitdev, "EXP_IRQ", 16));
+
+    create_unimplemented_device("FPGA NS PC", 0x48007000, 0x1000);
+
+    armv7m_load_kernel(ARM_CPU(first_cpu), machine->kernel_filename, 0x400000);
+}
+
+static void mps2tz_class_init(ObjectClass *oc, void *data)
+{
+    MachineClass *mc = MACHINE_CLASS(oc);
+
+    mc->init = mps2tz_common_init;
+    mc->max_cpus = 1;
+}
+
+static void mps2tz_an505_class_init(ObjectClass *oc, void *data)
+{
+    MachineClass *mc = MACHINE_CLASS(oc);
+    MPS2TZMachineClass *mmc = MPS2TZ_MACHINE_CLASS(oc);
+
+    mc->desc = "ARM MPS2 with AN505 FPGA image for Cortex-M33";
+    mmc->fpga_type = FPGA_AN505;
+    mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-m33");
+    mmc->scc_id = 0x41040000 | (505 << 4);
+}
+
+static const TypeInfo mps2tz_info = {
+    .name = TYPE_MPS2TZ_MACHINE,
+    .parent = TYPE_MACHINE,
+    .abstract = true,
+    .instance_size = sizeof(MPS2TZMachineState),
+    .class_size = sizeof(MPS2TZMachineClass),
+    .class_init = mps2tz_class_init,
+};
+
+static const TypeInfo mps2tz_an505_info = {
+    .name = TYPE_MPS2TZ_AN505_MACHINE,
+    .parent = TYPE_MPS2TZ_MACHINE,
+    .class_init = mps2tz_an505_class_init,
+};
+
+static void mps2tz_machine_init(void)
+{
+    type_register_static(&mps2tz_info);
+    type_register_static(&mps2tz_an505_info);
+}
+
+type_init(mps2tz_machine_init);
-- 
2.16.2
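
Note on the ppcs[] table in mps2tz_common_init() above: each .ports
array names only the ports a PPC actually uses, and the two loops over
TZ_NUM_PORTS skip every slot whose devfn is NULL. This works because C
zero-initializes array elements that a designated initializer leaves
out. A minimal standalone sketch of that pattern follows. The Sketch*
names, the sketch_* helpers and the value 16 are illustrative
assumptions, not QEMU definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NUM_PORTS 16   /* assumed stand-in for TZ_NUM_PORTS */

/* Hypothetical, trimmed-down counterpart of PPCPortInfo. */
typedef struct {
    const char *name;         /* NULL marks an unused slot, like a NULL devfn */
    uint32_t addr;
    uint32_t size;
} SketchPort;

/* Hypothetical, trimmed-down counterpart of PPCInfo. */
typedef struct {
    const char *name;
    SketchPort ports[SKETCH_NUM_PORTS];
} SketchPPC;

/* Count populated ports; slots omitted from the initializer are all-zero. */
static int sketch_count_ports(const SketchPPC *ppc)
{
    int n = 0;

    assert(ppc != NULL);
    for (int port = 0; port < SKETCH_NUM_PORTS; port++) {
        if (!ppc->ports[port].name) {
            continue;
        }
        n++;
    }
    return n;
}

/* Return the port index whose window contains addr, or -1 if none does. */
static int sketch_port_for_addr(const SketchPPC *ppc, uint32_t addr)
{
    for (int port = 0; port < SKETCH_NUM_PORTS; port++) {
        const SketchPort *p = &ppc->ports[port];

        /* Unsigned subtraction wraps, so addr below p->addr also fails. */
        if (p->name && addr - p->addr < p->size) {
            return port;
        }
    }
    return -1;
}
```

With a table holding the three apb_ppcexp2 entries (scc, i2s-audio,
fpgaio), sketch_count_ports() sees three populated slots and the
remaining thirteen are skipped without any explicit terminator entry.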

From: Richard Henderson <richard.henderson@linaro.org>

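Add the ARM_FEATURE_V8_RDM feature bit and report it to linux-user
guests as the ASIMDRDM hwcap.  The feature is not enabled on any
CPU yet.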

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180228193125.20577-2-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h     | 1 +
 linux-user/elfload.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ enum arm_features {
     ARM_FEATURE_V8_SHA3, /* implements SHA3 part of v8 Crypto Extensions */
     ARM_FEATURE_V8_SM3, /* implements SM3 part of v8 Crypto Extensions */
     ARM_FEATURE_V8_SM4, /* implements SM4 part of v8 Crypto Extensions */
+    ARM_FEATURE_V8_RDM, /* implements v8.1 simd round multiply */
     ARM_FEATURE_V8_FP16, /* implements v8.2 half-precision float */
 };
 
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index XXXXXXX..XXXXXXX 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -XXX,XX +XXX,XX @@ static uint32_t get_elf_hwcap(void)
     GET_FEATURE(ARM_FEATURE_V8_SHA512, ARM_HWCAP_A64_SHA512);
     GET_FEATURE(ARM_FEATURE_V8_FP16,
                 ARM_HWCAP_A64_FPHP | ARM_HWCAP_A64_ASIMDHP);
+    GET_FEATURE(ARM_FEATURE_V8_RDM, ARM_HWCAP_A64_ASIMDRDM);
 #undef GET_FEATURE
 
     return hwcaps;
-- 
2.16.2

From: Richard Henderson <richard.henderson@linaro.org>

Include the U bit in the switches rather than testing separately.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20180228193125.20577-3-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate-a64.c | 129 +++++++++++++++++++++------------------------
 1 file changed, 61 insertions(+), 68 deletions(-)
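
For readers skimming the diff, the idea is to fold U into bit 4 of the
switch key, so that each case label names exactly one instruction and
the per-case "if (u)" tests disappear. The following is a minimal
standalone sketch of that idea, not the QEMU code itself. The
SketchOp enum and sketch_classify_indexed() are hypothetical names,
and only a few of the cases from the patch are shown.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result type for this sketch; not a QEMU type. */
typedef enum {
    SKETCH_INVALID,
    SKETCH_MUL,
    SKETCH_MLA,
    SKETCH_MLS,
    SKETCH_FMUL,
    SKETCH_FMULX,
} SketchOp;

/* Classify a (U, opcode) pair using one combined key, 16 * u + opcode. */
static SketchOp sketch_classify_indexed(bool u, unsigned opcode)
{
    assert(opcode < 16);          /* opcode is a 4-bit field */

    switch (16 * u + opcode) {
    case 0x08: /* MUL: U == 0 */
        return SKETCH_MUL;
    case 0x10: /* MLA: U == 1 */
        return SKETCH_MLA;
    case 0x14: /* MLS: U == 1 */
        return SKETCH_MLS;
    case 0x09: /* FMUL: U == 0 */
        return SKETCH_FMUL;
    case 0x19: /* FMULX: U == 1 */
        return SKETCH_FMULX;
    default:   /* every other combination is rejected in this sketch */
        return SKETCH_INVALID;
    }
}
```

In this sketch opcode 0x9 gives FMUL with U clear and FMULX with U
set, while opcode 0x8 with U set falls through to the default case,
which mirrors how the reworked switch rejects encodings that the old
code caught with separate "if (u)" checks.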

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
     int index;
     TCGv_ptr fpst;
 
-    switch (opcode) {
-    case 0x0: /* MLA */
-    case 0x4: /* MLS */
-        if (!u || is_scalar) {
+    switch (16 * u + opcode) {
+    case 0x08: /* MUL */
+    case 0x10: /* MLA */
+    case 0x14: /* MLS */
+        if (is_scalar) {
             unallocated_encoding(s);
             return;
         }
         break;
-    case 0x2: /* SMLAL, SMLAL2, UMLAL, UMLAL2 */
-    case 0x6: /* SMLSL, SMLSL2, UMLSL, UMLSL2 */
-    case 0xa: /* SMULL, SMULL2, UMULL, UMULL2 */
+    case 0x02: /* SMLAL, SMLAL2 */
+    case 0x12: /* UMLAL, UMLAL2 */
+    case 0x06: /* SMLSL, SMLSL2 */
+    case 0x16: /* UMLSL, UMLSL2 */
+    case 0x0a: /* SMULL, SMULL2 */
+    case 0x1a: /* UMULL, UMULL2 */
         if (is_scalar) {
             unallocated_encoding(s);
             return;
         }
         is_long = true;
         break;
-    case 0x3: /* SQDMLAL, SQDMLAL2 */
-    case 0x7: /* SQDMLSL, SQDMLSL2 */
-    case 0xb: /* SQDMULL, SQDMULL2 */
+    case 0x03: /* SQDMLAL, SQDMLAL2 */
+    case 0x07: /* SQDMLSL, SQDMLSL2 */
+    case 0x0b: /* SQDMULL, SQDMULL2 */
         is_long = true;
-        /* fall through */
-    case 0xc: /* SQDMULH */
-    case 0xd: /* SQRDMULH */
-        if (u) {
-            unallocated_encoding(s);
-            return;
-        }
         break;
-    case 0x8: /* MUL */
-        if (u || is_scalar) {
-            unallocated_encoding(s);
-            return;
-        }
+    case 0x0c: /* SQDMULH */
+    case 0x0d: /* SQRDMULH */
         break;
-    case 0x1: /* FMLA */
-    case 0x5: /* FMLS */
-        if (u) {
-            unallocated_encoding(s);
-            return;
-        }
-        /* fall through */
-    case 0x9: /* FMUL, FMULX */
+    case 0x01: /* FMLA */
+    case 0x05: /* FMLS */
+    case 0x09: /* FMUL */
+    case 0x19: /* FMULX */
         if (size == 1) {
             unallocated_encoding(s);
             return;
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
 
             read_vec_element(s, tcg_op, rn, pass, MO_64);
 
-            switch (opcode) {
-            case 0x5: /* FMLS */
+            switch (16 * u + opcode) {
+            case 0x05: /* FMLS */
                 /* As usual for ARM, separate negation for fused multiply-add */
                 gen_helper_vfp_negd(tcg_op, tcg_op);
                 /* fall through */
-            case 0x1: /* FMLA */
+            case 0x01: /* FMLA */
                 read_vec_element(s, tcg_res, rd, pass, MO_64);
                 gen_helper_vfp_muladdd(tcg_res, tcg_op, tcg_idx, tcg_res, fpst);
                 break;
-            case 0x9: /* FMUL, FMULX */
-                if (u) {
-                    gen_helper_vfp_mulxd(tcg_res, tcg_op, tcg_idx, fpst);
-                } else {
-                    gen_helper_vfp_muld(tcg_res, tcg_op, tcg_idx, fpst);
-                }
+            case 0x09: /* FMUL */
+                gen_helper_vfp_muld(tcg_res, tcg_op, tcg_idx, fpst);
+                break;
+            case 0x19: /* FMULX */
+                gen_helper_vfp_mulxd(tcg_res, tcg_op, tcg_idx, fpst);
                 break;
             default:
                 g_assert_not_reached();
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
 
             read_vec_element_i32(s, tcg_op, rn, pass, is_scalar ? size : MO_32);
 
-            switch (opcode) {
-            case 0x0: /* MLA */
-            case 0x4: /* MLS */
-            case 0x8: /* MUL */
+            switch (16 * u + opcode) {
+            case 0x08: /* MUL */
+            case 0x10: /* MLA */
+            case 0x14: /* MLS */
             {
                 static NeonGenTwoOpFn * const fns[2][2] = {
                     { gen_helper_neon_add_u16, gen_helper_neon_sub_u16 },
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
                 genfn(tcg_res, tcg_op, tcg_res);
                 break;
             }
-            case 0x5: /* FMLS */
-            case 0x1: /* FMLA */
+            case 0x05: /* FMLS */
+            case 0x01: /* FMLA */
                 read_vec_element_i32(s, tcg_res, rd, pass,
                                      is_scalar ? size : MO_32);
                 switch (size) {
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
                     g_assert_not_reached();
                 }
                 break;
-            case 0x9: /* FMUL, FMULX */
+            case 0x09: /* FMUL */
                 switch (size) {
                 case 1:
-                    if (u) {
-                        if (is_scalar) {
-                            gen_helper_advsimd_mulxh(tcg_res, tcg_op,
-                                                     tcg_idx, fpst);
-                        } else {
-                            gen_helper_advsimd_mulx2h(tcg_res, tcg_op,
-                                                      tcg_idx, fpst);
-                        }
+                    if (is_scalar) {
+                        gen_helper_advsimd_mulh(tcg_res, tcg_op,
+                                                tcg_idx, fpst);
                     } else {
-                        if (is_scalar) {
-                            gen_helper_advsimd_mulh(tcg_res, tcg_op,
-                                                    tcg_idx, fpst);
-                        } else {
-                            gen_helper_advsimd_mul2h(tcg_res, tcg_op,
-                                                     tcg_idx, fpst);
-                        }
+                        gen_helper_advsimd_mul2h(tcg_res, tcg_op,
+                                                 tcg_idx, fpst);
                     }
                     break;
                 case 2:
-                    if (u) {
-                        gen_helper_vfp_mulxs(tcg_res, tcg_op, tcg_idx, fpst);
-                    } else {
-                        gen_helper_vfp_muls(tcg_res, tcg_op, tcg_idx, fpst);
-                    }
+                    gen_helper_vfp_muls(tcg_res, tcg_op, tcg_idx, fpst);
                     break;
                 default:
                     g_assert_not_reached();
                 }
                 break;
-            case 0xc: /* SQDMULH */
+            case 0x19: /* FMULX */
+                switch (size) {
+                case 1:
+                    if (is_scalar) {
+                        gen_helper_advsimd_mulxh(tcg_res, tcg_op,
+                                                 tcg_idx, fpst);
+                    } else {
+                        gen_helper_advsimd_mulx2h(tcg_res, tcg_op,
+                                                  tcg_idx, fpst);
+                    }
+                    break;
+                case 2:
+                    gen_helper_vfp_mulxs(tcg_res, tcg_op, tcg_idx, fpst);
+                    break;
+                default:
+                    g_assert_not_reached();
+                }
+                break;
+            case 0x0c: /* SQDMULH */
                 if (size == 1) {
                     gen_helper_neon_qdmulh_s16(tcg_res, cpu_env,
                                                tcg_op, tcg_idx);
@@ -XXX,XX +XXX,XX @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
                                                tcg_op, tcg_idx);
                 }
                 break;
-            case 0xd: /* SQRDMULH */
+            case 0x0d: /* SQRDMULH */
                 if (size == 1) {
                     gen_helper_neon_qrdmulh_s16(tcg_res, cpu_env,
                                                 tcg_op, tcg_idx);
-- 
2.16.2

From: Richard Henderson <richard.henderson@linaro.org>

The integer size check was already outside of the opcode switch;
move the floating-point size check outside as well.  Unify the
size vs index adjustment between fp and integer paths.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20180228193125.20577-4-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate-a64.c | 65 +++++++++++++++++++++++-----------------------
 1 file changed, 32 insertions(+), 33 deletions(-)

From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180228193125.20577-5-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/Makefile.objs   |   2 +-
 target/arm/helper.h        |   4 ++
 target/arm/translate-a64.c |  84 ++++++++++++++++++++++++++++++++++
 target/arm/vec_helper.c    | 109 +++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 198 insertions(+), 1 deletion(-)
 create mode 100644 target/arm/vec_helper.c
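
The 16-bit helpers in vec_helper.c compute
((a3 << 15) + (e1 * e2) + (1 << 14)) >> 15 in 32 bits and saturate
the result to the int16_t range. The following standalone sketch
shows that arithmetic for one lane. It is an illustration under
stated assumptions, not the helper from this patch: the name
sketch_qrdmlah_s16 and the bool out-parameter (standing in for the
QC flag that SET_QC() sets) are hypothetical, and it assumes >> on a
negative int32_t is an arithmetic shift.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One-lane signed saturating rounding doubling multiply-accumulate,
 * high half, for 16-bit elements.  Sets *saturated on overflow and
 * leaves it unchanged otherwise, like a sticky QC flag.
 */
static int16_t sketch_qrdmlah_s16(int16_t e1, int16_t e2, int16_t a3,
                                  bool *saturated)
{
    int32_t ret;

    assert(saturated != NULL);

    /* The worst-case sum stays within int32_t, so no overflow here. */
    ret = (int32_t)e1 * e2;
    /* a3 * 2^15 rather than a3 << 15: avoids left-shifting a negative. */
    ret = (int32_t)a3 * 32768 + ret + (1 << 14);
    ret >>= 15;

    if (ret < INT16_MIN || ret > INT16_MAX) {
        *saturated = true;
        ret = ret < 0 ? INT16_MIN : INT16_MAX;
    }
    return (int16_t)ret;
}
```

Reading the operands as Q15 fractions, 0x4000 * 0x4000 (0.5 * 0.5)
with a zero accumulator gives 0x2000 (0.25), and INT16_MIN * INT16_MIN
(-1.0 * -1.0) overflows the representable range, so the sketch
saturates to INT16_MAX and reports saturation.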

diff --git a/target/arm/Makefile.objs b/target/arm/Makefile.objs
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/Makefile.objs
+++ b/target/arm/Makefile.objs
@@ -XXX,XX +XXX,XX @@ obj-$(call land,$(CONFIG_KVM),$(call lnot,$(TARGET_AARCH64))) += kvm32.o
 obj-$(call land,$(CONFIG_KVM),$(TARGET_AARCH64)) += kvm64.o
 obj-$(call lnot,$(CONFIG_KVM)) += kvm-stub.o
 obj-y += translate.o op_helper.o helper.o cpu.o
-obj-y += neon_helper.o iwmmxt_helper.o
+obj-y += neon_helper.o iwmmxt_helper.o vec_helper.o
 obj-y += gdbstub.o
 obj-$(TARGET_AARCH64) += cpu64.o translate-a64.o helper-a64.o gdbstub64.o
 obj-y += crypto_helper.o
diff --git a/target/arm/helper.h b/target/arm/helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_1(neon_rbit_u8, TCG_CALL_NO_RWG_SE, i32, i32)
 
 DEF_HELPER_3(neon_qdmulh_s16, i32, env, i32, i32)
 DEF_HELPER_3(neon_qrdmulh_s16, i32, env, i32, i32)
+DEF_HELPER_4(neon_qrdmlah_s16, i32, env, i32, i32, i32)
+DEF_HELPER_4(neon_qrdmlsh_s16, i32, env, i32, i32, i32)
 DEF_HELPER_3(neon_qdmulh_s32, i32, env, i32, i32)
 DEF_HELPER_3(neon_qrdmulh_s32, i32, env, i32, i32)
+DEF_HELPER_4(neon_qrdmlah_s32, i32, env, s32, s32, s32)
+DEF_HELPER_4(neon_qrdmlsh_s32, i32, env, s32, s32, s32)
 
 DEF_HELPER_1(neon_narrow_u8, i32, i64)
 DEF_HELPER_1(neon_narrow_u16, i32, i64)
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void disas_simd_scalar_three_reg_same_fp16(DisasContext *s,
     tcg_temp_free_ptr(fpst);
 }
 
+/* AdvSIMD scalar three same extra
+ *  31 30  29 28       24 23  22  21 20  16  15 14    11  10 9  5 4  0
+ * +-----+---+-----------+------+---+------+---+--------+---+----+----+
+ * | 0 1 | U | 1 1 1 1 0 | size | 0 |  Rm  | 1 | opcode | 1 | Rn | Rd |
+ * +-----+---+-----------+------+---+------+---+--------+---+----+----+
+ */
+static void disas_simd_scalar_three_reg_same_extra(DisasContext *s,
+                                                   uint32_t insn)
+{
+    int rd = extract32(insn, 0, 5);
+    int rn = extract32(insn, 5, 5);
+    int opcode = extract32(insn, 11, 4);
+    int rm = extract32(insn, 16, 5);
+    int size = extract32(insn, 22, 2);
+    bool u = extract32(insn, 29, 1);
+    TCGv_i32 ele1, ele2, ele3;
+    TCGv_i64 res;
+    int feature;
+
+    switch (u * 16 + opcode) {
+    case 0x10: /* SQRDMLAH (vector) */
+    case 0x11: /* SQRDMLSH (vector) */
+        if (size != 1 && size != 2) {
+            unallocated_encoding(s);
+            return;
+        }
+        feature = ARM_FEATURE_V8_RDM;
+        break;
+    default:
+        unallocated_encoding(s);
+        return;
+    }
+    if (!arm_dc_feature(s, feature)) {
+        unallocated_encoding(s);
+        return;
+    }
+    if (!fp_access_check(s)) {
+        return;
+    }
+
+    /* Do a single operation on the lowest element in the vector.
+     * We use the standard Neon helpers and rely on 0 OP 0 == 0
+     * with no side effects for all these operations.
+     * OPTME: special-purpose helpers would avoid doing some
+     * unnecessary work in the helper for the 16 bit cases.
+     */
+    ele1 = tcg_temp_new_i32();
+    ele2 = tcg_temp_new_i32();
+    ele3 = tcg_temp_new_i32();
+
+    read_vec_element_i32(s, ele1, rn, 0, size);
+    read_vec_element_i32(s, ele2, rm, 0, size);
+    read_vec_element_i32(s, ele3, rd, 0, size);
+
+    switch (opcode) {
+    case 0x0: /* SQRDMLAH */
+        if (size == 1) {
+            gen_helper_neon_qrdmlah_s16(ele3, cpu_env, ele1, ele2, ele3);
+        } else {
+            gen_helper_neon_qrdmlah_s32(ele3, cpu_env, ele1, ele2, ele3);
+        }
+        break;
+    case 0x1: /* SQRDMLSH */
+        if (size == 1) {
+            gen_helper_neon_qrdmlsh_s16(ele3, cpu_env, ele1, ele2, ele3);
+        } else {
+            gen_helper_neon_qrdmlsh_s32(ele3, cpu_env, ele1, ele2, ele3);
+        }
+        break;
+    default:
+        g_assert_not_reached();
+    }
+    tcg_temp_free_i32(ele1);
+    tcg_temp_free_i32(ele2);
+
+    res = tcg_temp_new_i64();
+    tcg_gen_extu_i32_i64(res, ele3);
+    tcg_temp_free_i32(ele3);
+
+    write_fp_dreg(s, rd, res);
+    tcg_temp_free_i64(res);
+}
+
 static void handle_2misc_64(DisasContext *s, int opcode, bool u,
                             TCGv_i64 tcg_rd, TCGv_i64 tcg_rn,
                             TCGv_i32 tcg_rmode, TCGv_ptr tcg_fpstatus)
@@ -XXX,XX +XXX,XX @@ static const AArch64DecodeTable data_proc_simd[] = {
     { 0x0e000800, 0xbf208c00, disas_simd_zip_trn },
     { 0x2e000000, 0xbf208400, disas_simd_ext },
     { 0x5e200400, 0xdf200400, disas_simd_scalar_three_reg_same },
+    { 0x5e008400, 0xdf208400, disas_simd_scalar_three_reg_same_extra },
     { 0x5e200000, 0xdf200c00, disas_simd_scalar_three_reg_diff },
     { 0x5e200800, 0xdf3e0c00, disas_simd_scalar_two_reg_misc },
     { 0x5e300800, 0xdf3e0c00, disas_simd_scalar_pairwise },
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/target/arm/vec_helper.c
@@ -XXX,XX +XXX,XX @@
+/*
+ * ARM AdvSIMD / SVE Vector Operations
+ *
+ * Copyright (c) 2018 Linaro
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "cpu.h"
+#include "exec/exec-all.h"
+#include "exec/helper-proto.h"
+#include "tcg/tcg-gvec-desc.h"
+
+
+#define SET_QC() env->vfp.xregs[ARM_VFP_FPSCR] |= CPSR_Q
+
+/* Signed saturating rounding doubling multiply-accumulate high half, 16-bit */
+static uint16_t inl_qrdmlah_s16(CPUARMState *env, int16_t src1,
+                                int16_t src2, int16_t src3)
+{
+    /* Simplify:
+     * = ((a3 << 16) + ((e1 * e2) << 1) + (1 << 15)) >> 16
+     * = ((a3 << 15) + (e1 * e2) + (1 << 14)) >> 15
+     */
+    int32_t ret = (int32_t)src1 * src2;
+    ret = ((int32_t)src3 << 15) + ret + (1 << 14);
+    ret >>= 15;
+    if (ret != (int16_t)ret) {
+        SET_QC();
+        ret = (ret < 0 ? -0x8000 : 0x7fff);
+    }
+    return ret;
+}
+
+uint32_t HELPER(neon_qrdmlah_s16)(CPUARMState *env, uint32_t src1,
+                                  uint32_t src2, uint32_t src3)
+{
+    uint16_t e1 = inl_qrdmlah_s16(env, src1, src2, src3);
+    uint16_t e2 = inl_qrdmlah_s16(env, src1 >> 16, src2 >> 16, src3 >> 16);
+    return deposit32(e1, 16, 16, e2);
+}
+
+/* Signed saturating rounding doubling multiply-subtract high half, 16-bit */
+static uint16_t inl_qrdmlsh_s16(CPUARMState *env, int16_t src1,
+                                int16_t src2, int16_t src3)
+{
+    /* Similarly, using subtraction:
+     * = ((a3 << 16) - ((e1 * e2) << 1) + (1 << 15)) >> 16
+     * = ((a3 << 15) - (e1 * e2) + (1 << 14)) >> 15
+     */
+    int32_t ret = (int32_t)src1 * src2;
+    ret = ((int32_t)src3 << 15) - ret + (1 << 14);
+    ret >>= 15;
+    if (ret != (int16_t)ret) {
+        SET_QC();
+        ret = (ret < 0 ? -0x8000 : 0x7fff);
+    }
+    return ret;
+}
+
+uint32_t HELPER(neon_qrdmlsh_s16)(CPUARMState *env, uint32_t src1,
+                                  uint32_t src2, uint32_t src3)
+{
+    uint16_t e1 = inl_qrdmlsh_s16(env, src1, src2, src3);
+    uint16_t e2 = inl_qrdmlsh_s16(env, src1 >> 16, src2 >> 16, src3 >> 16);
+    return deposit32(e1, 16, 16, e2);
+}
+
+/* Signed saturating rounding doubling multiply-accumulate high half, 32-bit */
+uint32_t HELPER(neon_qrdmlah_s32)(CPUARMState *env, int32_t src1,
+                                  int32_t src2, int32_t src3)
+{
+    /* Simplify similarly to inl_qrdmlah_s16 above.  */
+    int64_t ret = (int64_t)src1 * src2;
+    ret = ((int64_t)src3 << 31) + ret + (1 << 30);
+    ret >>= 31;
+    if (ret != (int32_t)ret) {
+        SET_QC();
+        ret = (ret < 0 ? INT32_MIN : INT32_MAX);
+    }
+    return ret;
+}
+
+/* Signed saturating rounding doubling multiply-subtract high half, 32-bit */
+uint32_t HELPER(neon_qrdmlsh_s32)(CPUARMState *env, int32_t src1,
+                                  int32_t src2, int32_t src3)
+{
+    /* Simplify similarly to inl_qrdmlsh_s16 above.  */
+    int64_t ret = (int64_t)src1 * src2;
+    ret = ((int64_t)src3 << 31) - ret + (1 << 30);
+    ret >>= 31;
+    if (ret != (int32_t)ret) {
+        SET_QC();
+        ret = (ret < 0 ? INT32_MIN : INT32_MAX);
+    }
+    return ret;
+}
-- 
2.16.2
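
For readers who want to check the rounding and saturation arithmetic in
the helpers above without building QEMU, here is a minimal standalone
sketch of the 16-bit accumulate case. The name sketch_qrdmlah_s16 and the
bool *qc out-parameter are illustrative assumptions and are not part of
the patch; the patch sets the QC bit in CPUARMState through SET_QC()
instead. The sketch also assumes that >> on a negative int32_t is an
arithmetic shift, as the original code does.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical standalone variant of inl_qrdmlah_s16: saturation is
 * reported through *qc rather than through the FPSCR QC bit. */
static int16_t sketch_qrdmlah_s16(int16_t a, int16_t b, int16_t acc, bool *qc)
{
    /* ((acc << 15) + a * b + (1 << 14)), written with a multiply so that
     * a negative accumulator is never left-shifted.  The sum always fits
     * in int32_t for 16-bit inputs. */
    int32_t ret = (int32_t)acc * 32768 + (int32_t)a * b + (1 << 14);

    /* Assumption: arithmetic right shift, i.e. rounding toward -infinity. */
    ret >>= 15;

    /* Saturate to the int16_t range and record that saturation happened. */
    if (ret > INT16_MAX) {
        *qc = true;
        return INT16_MAX;
    }
    if (ret < INT16_MIN) {
        *qc = true;
        return INT16_MIN;
    }
    return (int16_t)ret;
}
```

With Q15 inputs 0x4000 * 0x4000 (0.5 * 0.5) and a zero accumulator this
yields 0x2000 (0.25) without saturating, whereas INT16_MIN * INT16_MIN
saturates to INT16_MAX and sets the flag.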

From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180228193125.20577-6-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.h        |  9 +++++
 target/arm/translate-a64.c | 83 ++++++++++++++++++++++++++++++++++++++++++++++
 target/arm/vec_helper.c    | 74 +++++++++++++++++++++++++++++++++++++++++
 3 files changed, 166 insertions(+)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_2(dc_zva, void, env, i64)
 DEF_HELPER_FLAGS_2(neon_pmull_64_lo, TCG_CALL_NO_RWG_SE, i64, i64, i64)
 DEF_HELPER_FLAGS_2(neon_pmull_64_hi, TCG_CALL_NO_RWG_SE, i64, i64, i64)
 
+DEF_HELPER_FLAGS_5(gvec_qrdmlah_s16, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_qrdmlsh_s16, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_qrdmlah_s32, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_qrdmlsh_s32, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
 #ifdef TARGET_AARCH64
 #include "helper-a64.h"
 #endif
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void gen_gvec_op3(DisasContext *s, bool is_q, int rd,
                    vec_full_reg_size(s), gvec_op);
 }
 
+/* Expand a 3-operand + env pointer operation using
+ * an out-of-line helper.
+ */
+static void gen_gvec_op3_env(DisasContext *s, bool is_q, int rd,
+                             int rn, int rm, gen_helper_gvec_3_ptr *fn)
+{
+    tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, rd),
+                       vec_full_reg_offset(s, rn),
+                       vec_full_reg_offset(s, rm), cpu_env,
+                       is_q ? 16 : 8, vec_full_reg_size(s), 0, fn);
+}
+
 /* Set ZF and NF based on a 64 bit result. This is alas fiddlier
  * than the 32 bit equivalent.
  */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
     clear_vec_high(s, is_q, rd);
 }
 
+/* AdvSIMD three same extra
+ *  31   30  29 28       24 23  22  21 20  16  15 14    11  10 9  5 4  0
+ * +---+---+---+-----------+------+---+------+---+--------+---+----+----+
+ * | 0 | Q | U | 0 1 1 1 0 | size | 0 |  Rm  | 1 | opcode | 1 | Rn | Rd |
+ * +---+---+---+-----------+------+---+------+---+--------+---+----+----+
+ */
+static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn)
+{
+    int rd = extract32(insn, 0, 5);
+    int rn = extract32(insn, 5, 5);
+    int opcode = extract32(insn, 11, 4);
+    int rm = extract32(insn, 16, 5);
+    int size = extract32(insn, 22, 2);
+    bool u = extract32(insn, 29, 1);
+    bool is_q = extract32(insn, 30, 1);
+    int feature;
+
+    switch (u * 16 + opcode) {
+    case 0x10: /* SQRDMLAH (vector) */
+    case 0x11: /* SQRDMLSH (vector) */
+        if (size != 1 && size != 2) {
+            unallocated_encoding(s);
+            return;
+        }
+        feature = ARM_FEATURE_V8_RDM;
+        break;
+    default:
+        unallocated_encoding(s);
+        return;
+    }
+    if (!arm_dc_feature(s, feature)) {
+        unallocated_encoding(s);
+        return;
+    }
+    if (!fp_access_check(s)) {
+        return;
+    }
+
+    switch (opcode) {
+    case 0x0: /* SQRDMLAH (vector) */
+        switch (size) {
+        case 1:
+            gen_gvec_op3_env(s, is_q, rd, rn, rm, gen_helper_gvec_qrdmlah_s16);
+            break;
+        case 2:
+            gen_gvec_op3_env(s, is_q, rd, rn, rm, gen_helper_gvec_qrdmlah_s32);
+            break;
+        default:
+            g_assert_not_reached();
+        }
+        return;
+
+    case 0x1: /* SQRDMLSH (vector) */
+        switch (size) {
+        case 1:
+            gen_gvec_op3_env(s, is_q, rd, rn, rm, gen_helper_gvec_qrdmlsh_s16);
+            break;
+        case 2:
+            gen_gvec_op3_env(s, is_q, rd, rn, rm, gen_helper_gvec_qrdmlsh_s32);
+            break;
+        default:
+            g_assert_not_reached();
+        }
+        return;
+
+    default:
+        g_assert_not_reached();
+    }
+}
+
 static void handle_2misc_widening(DisasContext *s, int opcode, bool is_q,
                                   int size, int rn, int rd)
 {
@@ -XXX,XX +XXX,XX @@ static void disas_crypto_three_reg_imm2(DisasContext *s, uint32_t insn)
 static const AArch64DecodeTable data_proc_simd[] = {
     /* pattern  ,  mask     ,  fn                        */
     { 0x0e200400, 0x9f200400, disas_simd_three_reg_same },
+    { 0x0e008400, 0x9f208400, disas_simd_three_reg_same_extra },
     { 0x0e200000, 0x9f200c00, disas_simd_three_reg_diff },
     { 0x0e200800, 0x9f3e0c00, disas_simd_two_reg_misc },
     { 0x0e300800, 0x9f3e0c00, disas_simd_across_lanes },
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -XXX,XX +XXX,XX @@
 
 #define SET_QC() env->vfp.xregs[ARM_VFP_FPSCR] |= CPSR_Q
 
+static void clear_tail(void *vd, uintptr_t opr_sz, uintptr_t max_sz)
+{
+    uint64_t *d = vd + opr_sz;
+    uintptr_t i;
+
+    for (i = opr_sz; i < max_sz; i += 8) {
+        *d++ = 0;
+    }
+}
+
 /* Signed saturating rounding doubling multiply-accumulate high half, 16-bit */
 static uint16_t inl_qrdmlah_s16(CPUARMState *env, int16_t src1,
                                 int16_t src2, int16_t src3)
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(neon_qrdmlah_s16)(CPUARMState *env, uint32_t src1,
     return deposit32(e1, 16, 16, e2);
 }
 
+void HELPER(gvec_qrdmlah_s16)(void *vd, void *vn, void *vm,
+                              void *ve, uint32_t desc)
+{
+    uintptr_t opr_sz = simd_oprsz(desc);
+    int16_t *d = vd;
+    int16_t *n = vn;
+    int16_t *m = vm;
+    CPUARMState *env = ve;
+    uintptr_t i;
+
+    for (i = 0; i < opr_sz / 2; ++i) {
+        d[i] = inl_qrdmlah_s16(env, n[i], m[i], d[i]);
+    }
+    clear_tail(d, opr_sz, simd_maxsz(desc));
+}
+
 /* Signed saturating rounding doubling multiply-subtract high half, 16-bit */
 static uint16_t inl_qrdmlsh_s16(CPUARMState *env, int16_t src1,
                                 int16_t src2, int16_t src3)
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(neon_qrdmlsh_s16)(CPUARMState *env, uint32_t src1,
     return deposit32(e1, 16, 16, e2);
 }
 
+void HELPER(gvec_qrdmlsh_s16)(void *vd, void *vn, void *vm,
+                              void *ve, uint32_t desc)
+{
+    uintptr_t opr_sz = simd_oprsz(desc);
+    int16_t *d = vd;
+    int16_t *n = vn;
+    int16_t *m = vm;
+    CPUARMState *env = ve;
+    uintptr_t i;
+
+    for (i = 0; i < opr_sz / 2; ++i) {
+        d[i] = inl_qrdmlsh_s16(env, n[i], m[i], d[i]);
+    }
+    clear_tail(d, opr_sz, simd_maxsz(desc));
+}
+
 /* Signed saturating rounding doubling multiply-accumulate high half, 32-bit */
 uint32_t HELPER(neon_qrdmlah_s32)(CPUARMState *env, int32_t src1,
                                   int32_t src2, int32_t src3)
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(neon_qrdmlah_s32)(CPUARMState *env, int32_t src1,
     return ret;
 }
 
+void HELPER(gvec_qrdmlah_s32)(void *vd, void *vn, void *vm,
+                              void *ve, uint32_t desc)
+{
+    uintptr_t opr_sz = simd_oprsz(desc);
+    int32_t *d = vd;
+    int32_t *n = vn;
+    int32_t *m = vm;
+    CPUARMState *env = ve;
+    uintptr_t i;
+
+    for (i = 0; i < opr_sz / 4; ++i) {
+        d[i] = helper_neon_qrdmlah_s32(env, n[i], m[i], d[i]);
+    }
+    clear_tail(d, opr_sz, simd_maxsz(desc));
+}
+
 /* Signed saturating rounding doubling multiply-subtract high half, 32-bit */
 uint32_t HELPER(neon_qrdmlsh_s32)(CPUARMState *env, int32_t src1,
                                   int32_t src2, int32_t src3)
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(neon_qrdmlsh_s32)(CPUARMState *env, int32_t src1,
     }
     return ret;
 }
+
+void HELPER(gvec_qrdmlsh_s32)(void *vd, void *vn, void *vm,
+                              void *ve, uint32_t desc)
+{
+    uintptr_t opr_sz = simd_oprsz(desc);
+    int32_t *d = vd;
+    int32_t *n = vn;
+    int32_t *m = vm;
+    CPUARMState *env = ve;
+    uintptr_t i;
+
+    for (i = 0; i < opr_sz / 4; ++i) {
+        d[i] = helper_neon_qrdmlsh_s32(env, n[i], m[i], d[i]);
+    }
+    clear_tail(d, opr_sz, simd_maxsz(desc));
+}
-- 
2.16.2
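
The gvec helpers in this patch all share one shape: walk opr_sz bytes
lane by lane, using the destination as the accumulator input, then zero
the bytes between opr_sz and max_sz. The following is a rough standalone
sketch of that shape only. It takes the two sizes as plain arguments
instead of decoding them from desc with simd_oprsz() and simd_maxsz(),
and sketch_lane_op is a placeholder lane operation, not one of the
patch's helpers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Placeholder lane operation (an assumption for illustration):
 * acc + n * m, clamped to the int16_t range. */
static int16_t sketch_lane_op(int16_t n, int16_t m, int16_t acc)
{
    int32_t r = (int32_t)acc + (int32_t)n * m;

    if (r > INT16_MAX) {
        return INT16_MAX;
    }
    if (r < INT16_MIN) {
        return INT16_MIN;
    }
    return (int16_t)r;
}

/* Same loop structure as the gvec_*_s16 helpers: opr_sz and max_sz are
 * byte counts, with opr_sz <= max_sz. */
static void sketch_gvec_s16(void *vd, const void *vn, const void *vm,
                            size_t opr_sz, size_t max_sz)
{
    int16_t *d = vd;
    const int16_t *n = vn;
    const int16_t *m = vm;
    size_t i;

    /* Each 16-bit lane reads d[i] as the accumulator and overwrites it. */
    for (i = 0; i < opr_sz / 2; ++i) {
        d[i] = sketch_lane_op(n[i], m[i], d[i]);
    }
    /* Equivalent in effect to clear_tail(): zero the unused high part. */
    memset((char *)vd + opr_sz, 0, max_sz - opr_sz);
}
```

Calling it with opr_sz = 8 and max_sz = 16 models a 64-bit operation on a
128-bit register: the low four lanes are updated and the high four are
cleared.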

From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180228193125.20577-7-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate-a64.c | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180228193125.20577-8-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate.c | 86 +++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 67 insertions(+), 19 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@
 #include "disas/disas.h"
 #include "exec/exec-all.h"
 #include "tcg-op.h"
+#include "tcg-op-gvec.h"
 #include "qemu/log.h"
 #include "qemu/bitops.h"
 #include "arm_ldst.h"
@@ -XXX,XX +XXX,XX @@ static void gen_neon_narrow_op(int op, int u, int size,
 #define NEON_3R_VPMAX 20
 #define NEON_3R_VPMIN 21
 #define NEON_3R_VQDMULH_VQRDMULH 22
-#define NEON_3R_VPADD 23
+#define NEON_3R_VPADD_VQRDMLAH 23
 #define NEON_3R_SHA 24 /* SHA1C,SHA1P,SHA1M,SHA1SU0,SHA256H{2},SHA256SU1 */
-#define NEON_3R_VFM 25 /* VFMA, VFMS : float fused multiply-add */
+#define NEON_3R_VFM_VQRDMLSH 25 /* VFMA, VFMS, VQRDMLSH */
 #define NEON_3R_FLOAT_ARITH 26 /* float VADD, VSUB, VPADD, VABD */
 #define NEON_3R_FLOAT_MULTIPLY 27 /* float VMLA, VMLS, VMUL */
 #define NEON_3R_FLOAT_CMP 28 /* float VCEQ, VCGE, VCGT */
@@ -XXX,XX +XXX,XX @@ static const uint8_t neon_3r_sizes[] = {
     [NEON_3R_VPMAX] = 0x7,
     [NEON_3R_VPMIN] = 0x7,
     [NEON_3R_VQDMULH_VQRDMULH] = 0x6,
-    [NEON_3R_VPADD] = 0x7,
+    [NEON_3R_VPADD_VQRDMLAH] = 0x7,
     [NEON_3R_SHA] = 0xf, /* size field encodes op type */
-    [NEON_3R_VFM] = 0x5, /* size bit 1 encodes op */
+    [NEON_3R_VFM_VQRDMLSH] = 0x7, /* For VFM, size bit 1 encodes op */
     [NEON_3R_FLOAT_ARITH] = 0x5, /* size bit 1 encodes op */
     [NEON_3R_FLOAT_MULTIPLY] = 0x5, /* size bit 1 encodes op */
     [NEON_3R_FLOAT_CMP] = 0x5, /* size bit 1 encodes op */
@@ -XXX,XX +XXX,XX @@ static const uint8_t neon_2rm_sizes[] = {
     [NEON_2RM_VCVT_UF] = 0x4,
 };
 
+
+/* Expand v8.1 simd helper.  */
+static int do_v81_helper(DisasContext *s, gen_helper_gvec_3_ptr *fn,
+                         int q, int rd, int rn, int rm)
+{
+    if (arm_dc_feature(s, ARM_FEATURE_V8_RDM)) {
+        int opr_sz = (1 + q) * 8;
+        tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd),
+                           vfp_reg_offset(1, rn),
+                           vfp_reg_offset(1, rm), cpu_env,
+                           opr_sz, opr_sz, 0, fn);
+        return 0;
+    }
+    return 1;
+}
+
 /* Translate a NEON data processing instruction.  Return nonzero if the
    instruction is invalid.
    We process data in a mixture of 32-bit and 64-bit chunks.
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
         if (q && ((rd | rn | rm) & 1)) {
             return 1;
         }
-        /*
-         * The SHA-1/SHA-256 3-register instructions require special treatment
-         * here, as their size field is overloaded as an op type selector, and
-         * they all consume their input in a single pass.
-         */
-        if (op == NEON_3R_SHA) {
+        switch (op) {
+        case NEON_3R_SHA:
+            /* The SHA-1/SHA-256 3-register instructions require special
+             * treatment here, as their size field is overloaded as an
+             * op type selector, and they all consume their input in a
+             * single pass.
+             */
             if (!q) {
                 return 1;
             }
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
             tcg_temp_free_ptr(ptr2);
             tcg_temp_free_ptr(ptr3);
             return 0;
+
+        case NEON_3R_VPADD_VQRDMLAH:
+            if (!u) {
+                break;  /* VPADD */
+            }
+            /* VQRDMLAH */
+            switch (size) {
+            case 1:
+                return do_v81_helper(s, gen_helper_gvec_qrdmlah_s16,
+                                     q, rd, rn, rm);
+            case 2:
+                return do_v81_helper(s, gen_helper_gvec_qrdmlah_s32,
+                                     q, rd, rn, rm);
+            }
+            return 1;
+
+        case NEON_3R_VFM_VQRDMLSH:
+            if (!u) {
+                /* VFMA, VFMS */
+                if (size == 1) {
+                    return 1;
+                }
+                break;
+            }
+            /* VQRDMLSH */
+            switch (size) {
+            case 1:
+                return do_v81_helper(s, gen_helper_gvec_qrdmlsh_s16,
+                                     q, rd, rn, rm);
+            case 2:
+                return do_v81_helper(s, gen_helper_gvec_qrdmlsh_s32,
+                                     q, rd, rn, rm);
+            }
+            return 1;
         }
         if (size == 3 && op != NEON_3R_LOGIC) {
             /* 64-bit element instructions. */
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                 rm = rtmp;
             }
             break;
-        case NEON_3R_VPADD:
-            if (u) {
-                return 1;
-            }
-            /* Fall through */
+        case NEON_3R_VPADD_VQRDMLAH:
         case NEON_3R_VPMAX:
         case NEON_3R_VPMIN:
             pairwise = 1;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                 return 1;
             }
             break;
-        case NEON_3R_VFM:
-            if (!arm_dc_feature(s, ARM_FEATURE_VFP4) || u) {
+        case NEON_3R_VFM_VQRDMLSH:
+            if (!arm_dc_feature(s, ARM_FEATURE_VFP4)) {
                 return 1;
             }
             break;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                 }
             }
             break;
-        case NEON_3R_VPADD:
+        case NEON_3R_VPADD_VQRDMLAH:
             switch (size) {
             case 0: gen_helper_neon_padd_u8(tmp, tmp, tmp2); break;
             case 1: gen_helper_neon_padd_u16(tmp, tmp, tmp2); break;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
               }
             }
             break;
-        case NEON_3R_VFM:
+        case NEON_3R_VFM_VQRDMLSH:
         {
             /* VFMA, VFMS: fused multiply-add */
             TCGv_ptr fpstatus = get_fpstatus_ptr(1);
-- 
2.16.2

From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180228193125.20577-9-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate.c | 46 ++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 42 insertions(+), 4 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static const char *regnames[] =
     { "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
       "r8", "r9", "r10", "r11", "r12", "r13", "r14", "pc" };
 
+/* Function prototypes for gen_ functions calling Neon helpers.  */
+typedef void NeonGenThreeOpEnvFn(TCGv_i32, TCGv_env, TCGv_i32,
+                                 TCGv_i32, TCGv_i32);
+
 /* initialize TCG globals.  */
 void arm_translate_init(void)
 {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                         }
                         neon_store_reg64(cpu_V0, rd + pass);
                     }
-
-
                     break;
-                default: /* 14 and 15 are RESERVED */
-                    return 1;
+                case 14: /* VQRDMLAH scalar */
+                case 15: /* VQRDMLSH scalar */
+                    {
+                        NeonGenThreeOpEnvFn *fn;
+
+                        if (!arm_dc_feature(s, ARM_FEATURE_V8_RDM)) {
+                            return 1;
+                        }
+                        if (u && ((rd | rn) & 1)) {
+                            return 1;
+                        }
+                        if (op == 14) {
+                            if (size == 1) {
+                                fn = gen_helper_neon_qrdmlah_s16;
+                            } else {
+                                fn = gen_helper_neon_qrdmlah_s32;
+                            }
+                        } else {
+                            if (size == 1) {
+                                fn = gen_helper_neon_qrdmlsh_s16;
+                            } else {
+                                fn = gen_helper_neon_qrdmlsh_s32;
+                            }
+                        }
+
+                        tmp2 = neon_get_scalar(size, rm);
+                        for (pass = 0; pass < (u ? 4 : 2); pass++) {
+                            tmp = neon_load_reg(rn, pass);
+                            tmp3 = neon_load_reg(rd, pass);
+                            fn(tmp, cpu_env, tmp, tmp2, tmp3);
+                            tcg_temp_free_i32(tmp3);
+                            neon_store_reg(rd, pass, tmp);
+                        }
+                        tcg_temp_free_i32(tmp2);
+                    }
+                    break;
+                default:
+                    g_assert_not_reached();
                 }
             }
         } else { /* size == 3 */
-- 
2.16.2

From: Richard Henderson <richard.henderson@linaro.org>

Enable it for the "any" CPU used by *-linux-user.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20180228193125.20577-10-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.c   | 1 +
 target/arm/cpu64.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void arm_any_initfn(Object *obj)
     set_feature(&cpu->env, ARM_FEATURE_V8_SHA256);
     set_feature(&cpu->env, ARM_FEATURE_V8_PMULL);
     set_feature(&cpu->env, ARM_FEATURE_CRC);
+    set_feature(&cpu->env, ARM_FEATURE_V8_RDM);
     cpu->midr = 0xffffffff;
 }
 #endif
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_any_initfn(Object *obj)
     set_feature(&cpu->env, ARM_FEATURE_V8_SM4);
     set_feature(&cpu->env, ARM_FEATURE_V8_PMULL);
     set_feature(&cpu->env, ARM_FEATURE_CRC);
+    set_feature(&cpu->env, ARM_FEATURE_V8_RDM);
     set_feature(&cpu->env, ARM_FEATURE_V8_FP16);
     cpu->ctr = 0x80038003; /* 32 byte I and D cacheline size, VIPT icache */
     cpu->dcz_blocksize = 7; /*  512 bytes */
-- 
2.16.2

From: Richard Henderson <richard.henderson@linaro.org>

Not enabled anywhere yet.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180228193125.20577-11-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h     | 1 +
 linux-user/elfload.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ enum arm_features {
     ARM_FEATURE_V8_SM4, /* implements SM4 part of v8 Crypto Extensions */
     ARM_FEATURE_V8_RDM, /* implements v8.1 simd round multiply */
     ARM_FEATURE_V8_FP16, /* implements v8.2 half-precision float */
+    ARM_FEATURE_V8_FCMA, /* has complex number part of v8.3 extensions.  */
 };
 
 static inline int arm_feature(CPUARMState *env, int feature)
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index XXXXXXX..XXXXXXX 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -XXX,XX +XXX,XX @@ static uint32_t get_elf_hwcap(void)
     GET_FEATURE(ARM_FEATURE_V8_FP16,
                 ARM_HWCAP_A64_FPHP | ARM_HWCAP_A64_ASIMDHP);
     GET_FEATURE(ARM_FEATURE_V8_RDM, ARM_HWCAP_A64_ASIMDRDM);
+    GET_FEATURE(ARM_FEATURE_V8_FCMA, ARM_HWCAP_A64_FCMA);
 #undef GET_FEATURE
 
     return hwcaps;
-- 
2.16.2

From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180228193125.20577-12-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.h        |  7 ++++
 target/arm/translate-a64.c | 48 ++++++++++++++++++++++-
 target/arm/vec_helper.c    | 97 ++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 151 insertions(+), 1 deletion(-)

From: Richard Henderson <richard.henderson@linaro.org>

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180228193125.20577-13-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: renamed e1/e2/e3/e4 to use the same naming as the version
 of the pseudocode in the Arm ARM]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.h        |  11 ++++
 target/arm/translate-a64.c |  94 +++++++++++++++++++++++++---
 target/arm/vec_helper.c    | 149 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 246 insertions(+), 8 deletions(-)

From: Richard Henderson <richard.henderson@linaro.org>

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20180228193125.20577-14-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate.c | 68 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 68 insertions(+)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
     return 0;
 }
 
+/* Advanced SIMD three registers of the same length extension.
+ *  31           25    23  22    20   16   12  11   10   9    8        3     0
+ * +---------------+-----+---+-----+----+----+---+----+---+----+---------+----+
+ * | 1 1 1 1 1 1 0 | op1 | D | op2 | Vn | Vd | 1 | o3 | 0 | o4 | N Q M U | Vm |
+ * +---------------+-----+---+-----+----+----+---+----+---+----+---------+----+
+ */
+static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
+{
+    gen_helper_gvec_3_ptr *fn_gvec_ptr;
+    int rd, rn, rm, rot, size, opr_sz;
+    TCGv_ptr fpst;
+    bool q;
+
+    q = extract32(insn, 6, 1);
+    VFP_DREG_D(rd, insn);
+    VFP_DREG_N(rn, insn);
+    VFP_DREG_M(rm, insn);
+    if ((rd | rn | rm) & q) {
+        return 1;
+    }
+
+    if ((insn & 0xfe200f10) == 0xfc200800) {
+        /* VCMLA -- 1111 110R R.1S .... .... 1000 ...0 .... */
+        size = extract32(insn, 20, 1);
+        rot = extract32(insn, 23, 2);
+        if (!arm_dc_feature(s, ARM_FEATURE_V8_FCMA)
+            || (!size && !arm_dc_feature(s, ARM_FEATURE_V8_FP16))) {
+            return 1;
+        }
+        fn_gvec_ptr = size ? gen_helper_gvec_fcmlas : gen_helper_gvec_fcmlah;
+    } else if ((insn & 0xfea00f10) == 0xfc800800) {
+        /* VCADD -- 1111 110R 1.0S .... .... 1000 ...0 .... */
+        size = extract32(insn, 20, 1);
+        rot = extract32(insn, 24, 1);
+        if (!arm_dc_feature(s, ARM_FEATURE_V8_FCMA)
+            || (!size && !arm_dc_feature(s, ARM_FEATURE_V8_FP16))) {
+            return 1;
+        }
+        fn_gvec_ptr = size ? gen_helper_gvec_fcadds : gen_helper_gvec_fcaddh;
+    } else {
+        return 1;
+    }
+
+    if (s->fp_excp_el) {
+        gen_exception_insn(s, 4, EXCP_UDEF,
+                           syn_fp_access_trap(1, 0xe, false), s->fp_excp_el);
+        return 0;
+    }
+    if (!s->vfp_enabled) {
+        return 1;
+    }
+
+    opr_sz = (1 + q) * 8;
+    fpst = get_fpstatus_ptr(1);
+    tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd),
+                       vfp_reg_offset(1, rn),
+                       vfp_reg_offset(1, rm), fpst,
+                       opr_sz, opr_sz, rot, fn_gvec_ptr);
+    tcg_temp_free_ptr(fpst);
+    return 0;
+}
+
 static int disas_coproc_insn(DisasContext *s, uint32_t insn)
 {
     int cpnum, is64, crn, crm, opc1, opc2, isread, rt, rt2;
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
                     }
                 }
             }
+        } else if ((insn & 0x0e000a00) == 0x0c000800
+                   && arm_dc_feature(s, ARM_FEATURE_V8)) {
+            if (disas_neon_insn_3same_ext(s, insn)) {
+                goto illegal_op;
+            }
+            return;
         } else if ((insn & 0x0fe00000) == 0x0c400000) {
             /* Coprocessor double register transfer.  */
             ARCH(5TE);
-- 
2.16.2
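
The mask/value pairs used by disas_neon_insn_3same_ext() can be checked
in isolation. The sketch below reproduces only the pattern matching and
field extraction from the patch; the enum, the struct and the sketch_
names are hypothetical, sketch_extract32 is a simplified stand-in for
QEMU's extract32(), and the feature checks, register decoding and
FP-access checks are deliberately left out.

```c
#include <stdint.h>

enum sketch_op { SKETCH_NONE, SKETCH_VCMLA, SKETCH_VCADD };

struct sketch_decode {
    enum sketch_op op;
    int size;   /* 0: half precision, 1: single precision */
    int rot;    /* raw rotation field from the encoding */
};

/* Simplified stand-in for extract32(); valid for 1 <= length <= 32. */
static uint32_t sketch_extract32(uint32_t value, int start, int length)
{
    return (value >> start) & (~0u >> (32 - length));
}

static struct sketch_decode sketch_decode_3same_ext(uint32_t insn)
{
    struct sketch_decode r = { SKETCH_NONE, 0, 0 };

    if ((insn & 0xfe200f10) == 0xfc200800) {
        /* VCMLA: bit 21 set, rotation in bits [24:23], size in bit 20. */
        r.op = SKETCH_VCMLA;
        r.size = sketch_extract32(insn, 20, 1);
        r.rot = sketch_extract32(insn, 23, 2);
    } else if ((insn & 0xfea00f10) == 0xfc800800) {
        /* VCADD: bit 23 set, bit 21 clear, rotation in bit 24. */
        r.op = SKETCH_VCADD;
        r.size = sketch_extract32(insn, 20, 1);
        r.rot = sketch_extract32(insn, 24, 1);
    }
    return r;
}
```

Because VCMLA requires bit 21 to be set and VCADD requires it to be
clear, the two patterns cannot both match the same word, so the order of
the two tests does not affect the result.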

From: Richard Henderson <richard.henderson@linaro.org>

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180228193125.20577-15-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate.c | 61 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 61 insertions(+)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
     return 0;
 }
 
+/* Advanced SIMD two registers and a scalar extension.
+ *  31             24   23  22   20   16   12  11   10   9    8        3     0
+ * +-----------------+----+---+----+----+----+---+----+---+----+---------+----+
+ * | 1 1 1 1 1 1 1 0 | o1 | D | o2 | Vn | Vd | 1 | o3 | 0 | o4 | N Q M U | Vm |
+ * +-----------------+----+---+----+----+----+---+----+---+----+---------+----+
+ *
+ */
+
+static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
+{
+    int rd, rn, rm, rot, size, opr_sz;
+    TCGv_ptr fpst;
+    bool q;
+
+    q = extract32(insn, 6, 1);
+    VFP_DREG_D(rd, insn);
+    VFP_DREG_N(rn, insn);
+    VFP_DREG_M(rm, insn);
+    if ((rd | rn) & q) {
+        return 1;
+    }
+
+    if ((insn & 0xff000f10) == 0xfe000800) {
+        /* VCMLA (indexed) -- 1111 1110 S.RR .... .... 1000 ...0 .... */
+        rot = extract32(insn, 20, 2);
+        size = extract32(insn, 23, 1);
+        if (!arm_dc_feature(s, ARM_FEATURE_V8_FCMA)
+            || (!size && !arm_dc_feature(s, ARM_FEATURE_V8_FP16))) {
+            return 1;
+        }
+    } else {
+        return 1;
+    }
+
+    if (s->fp_excp_el) {
+        gen_exception_insn(s, 4, EXCP_UDEF,
+                           syn_fp_access_trap(1, 0xe, false), s->fp_excp_el);
+        return 0;
+    }
+    if (!s->vfp_enabled) {
+        return 1;
+    }
+
+    opr_sz = (1 + q) * 8;
+    fpst = get_fpstatus_ptr(1);
+    tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd),
+                       vfp_reg_offset(1, rn),
+                       vfp_reg_offset(1, rm), fpst,
+                       opr_sz, opr_sz, rot,
+                       size ? gen_helper_gvec_fcmlas_idx
+                       : gen_helper_gvec_fcmlah_idx);
+    tcg_temp_free_ptr(fpst);
+    return 0;
+}
+
 static int disas_coproc_insn(DisasContext *s, uint32_t insn)
 {
     int cpnum, is64, crn, crm, opc1, opc2, isread, rt, rt2;
@@ -XXX,XX +XXX,XX @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
                 goto illegal_op;
             }
             return;
+        } else if ((insn & 0x0f000a00) == 0x0e000800
+                   && arm_dc_feature(s, ARM_FEATURE_V8)) {
+            if (disas_neon_insn_2reg_scalar_ext(s, insn)) {
+                goto illegal_op;
+            }
+            return;
         } else if ((insn & 0x0fe00000) == 0x0c400000) {
             /* Coprocessor double register transfer.  */
             ARCH(5TE);
-- 
2.16.2

From: Richard Henderson <richard.henderson@linaro.org>

Happily, the bits are in the same places compared to a32.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180228193125.20577-16-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
                                default_exception_el(s));
             break;
         }
-        if (((insn >> 24) & 3) == 3) {
+        if ((insn & 0xfe000a00) == 0xfc000800
+            && arm_dc_feature(s, ARM_FEATURE_V8)) {
+            /* The Thumb2 and ARM encodings are identical.  */
+            if (disas_neon_insn_3same_ext(s, insn)) {
+                goto illegal_op;
+            }
+        } else if ((insn & 0xff000a00) == 0xfe000800
+                   && arm_dc_feature(s, ARM_FEATURE_V8)) {
+            /* The Thumb2 and ARM encodings are identical.  */
+            if (disas_neon_insn_2reg_scalar_ext(s, insn)) {
+                goto illegal_op;
+            }
+        } else if (((insn >> 24) & 3) == 3) {
             /* Translate into the equivalent ARM encoding.  */
             insn = (insn & 0xe2ffffff) | ((insn & (1 << 28)) >> 4) | (1 << 28);
             if (disas_neon_data_insn(s, insn)) {
-- 
2.16.2
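
The remark that the bits are in the same places can be illustrated by
comparing the two dispatch tests directly. This sketch assumes that the
ARM-side check in disas_arm_insn() is only reached for the unconditional
encoding space (bits [31:28] == 0xf); that surrounding condition is not
visible in the hunks above. The function names are hypothetical, and the
arm_dc_feature() checks are omitted.

```c
#include <stdbool.h>
#include <stdint.h>

/* ARM side: mask from disas_arm_insn(), combined with the assumed
 * requirement that the condition field is 0xf. */
static bool sketch_arm_is_3same_ext(uint32_t insn)
{
    return (insn >> 28) == 0xf && (insn & 0x0e000a00) == 0x0c000800;
}

/* Thumb2 side: mask from disas_thumb2_insn(), which tests the top
 * nibble as part of the same comparison. */
static bool sketch_thumb2_is_3same_ext(uint32_t insn)
{
    return (insn & 0xfe000a00) == 0xfc000800;
}
```

Under that assumption both predicates accept exactly the same 32-bit
words, which is why the Thumb2 path can pass insn to
disas_neon_insn_3same_ext() without the re-encoding step used for
disas_neon_data_insn().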

From: Richard Henderson <richard.henderson@linaro.org>

Enable it for the "any" CPU used by *-linux-user.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20180228193125.20577-17-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.c   | 1 +
 target/arm/cpu64.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void arm_any_initfn(Object *obj)
     set_feature(&cpu->env, ARM_FEATURE_V8_PMULL);
     set_feature(&cpu->env, ARM_FEATURE_CRC);
     set_feature(&cpu->env, ARM_FEATURE_V8_RDM);
+    set_feature(&cpu->env, ARM_FEATURE_V8_FCMA);
     cpu->midr = 0xffffffff;
 }
 #endif
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_any_initfn(Object *obj)
     set_feature(&cpu->env, ARM_FEATURE_CRC);
     set_feature(&cpu->env, ARM_FEATURE_V8_RDM);
     set_feature(&cpu->env, ARM_FEATURE_V8_FP16);
+    set_feature(&cpu->env, ARM_FEATURE_V8_FCMA);
     cpu->ctr = 0x80038003; /* 32 byte I and D cacheline size, VIPT icache */
     cpu->dcz_blocksize = 7; /*  512 bytes */
 }
-- 
2.16.2

Mostly this is patches from me and RTH cleaning up and doing
more decodetree conversion for AArch32 Neon. The major new feature
is Dongjiu Geng's patchset to report host memory errors to KVM guests;
also a new aspeed board from Patrick Williams.

thanks
-- PMM

The following changes since commit 035b448b84f3557206abc44d786c5d3db2638f7d:

Merge remote-tracking branch 'remotes/gkurz/tags/9p-next-2020-05-14' into staging (2020-05-14 10:58:30 +0100)

are available in the Git repository at:

https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20200514

for you to fetch changes up to e95485f85657be21135c17a9226e297c21e73360:

target/arm: Convert NEON VFMA, VFMS 3-reg-same insns to decodetree (2020-05-14 15:03:09 +0100)

----------------------------------------------------------------
target-arm queue:
 * target/arm: Use correct GDB XML for M-profile cores
 * target/arm: Code cleanup to use gvec APIs better
 * aspeed: Add support for the sonorapass-bmc board
 * target/arm: Support reporting KVM host memory errors
   to the guest via ACPI notifications
 * target/arm: Finish conversion of Neon 3-reg-same insns to decodetree

----------------------------------------------------------------
Dongjiu Geng (10):
      acpi: nvdimm: change NVDIMM_UUID_LE to a common macro
      hw/arm/virt: Introduce a RAS machine option
      docs: APEI GHES generation and CPER record description
      ACPI: Build related register address fields via hardware error fw_cfg blob
      ACPI: Build Hardware Error Source Table
      ACPI: Record the Generic Error Status Block address
      KVM: Move hwpoison page related functions into kvm-all.c
      ACPI: Record Generic Error Status Block(GESB) table
      target-arm: kvm64: handle SIGBUS signal from kernel or KVM
      MAINTAINERS: Add ACPI/HEST/GHES entries

Patrick Williams (1):
      aspeed: Add support for the sonorapass-bmc board

Peter Maydell (18):
      target/arm: Use correct GDB XML for M-profile cores
      target/arm: Convert Neon 3-reg-same VQRDMLAH/VQRDMLSH to decodetree
      target/arm: Convert Neon 3-reg-same SHA to decodetree
      target/arm: Convert Neon 64-bit element 3-reg-same insns
      target/arm: Convert Neon VHADD 3-reg-same insns
      target/arm: Convert Neon VABA/VABD 3-reg-same to decodetree
      target/arm: Convert Neon VRHADD, VHSUB 3-reg-same insns to decodetree
      target/arm: Convert Neon VQSHL, VRSHL, VQRSHL 3-reg-same insns to decodetree
      target/arm: Convert Neon VPMAX/VPMIN 3-reg-same insns to decodetree
      target/arm: Convert Neon VPADD 3-reg-same insns to decodetree
      target/arm: Convert Neon VQDMULH/VQRDMULH 3-reg-same to decodetree
      target/arm: Convert Neon VADD, VSUB, VABD 3-reg-same insns to decodetree
      target/arm: Convert Neon VPMIN/VPMAX/VPADD float 3-reg-same insns to decodetree
      target/arm: Convert Neon fp VMUL, VMLA, VMLS 3-reg-same insns to decodetree
      target/arm: Convert Neon 3-reg-same compare insns to decodetree
      target/arm: Move 'env' argument of recps_f32 and rsqrts_f32 helpers to usual place
      target/arm: Convert Neon fp VMAX/VMIN/VMAXNM/VMINNM/VRECPS/VRSQRTS to decodetree
      target/arm: Convert NEON VFMA, VFMS 3-reg-same insns to decodetree

Richard Henderson (16):
      target/arm: Create gen_gvec_[us]sra
      target/arm: Create gen_gvec_{u,s}{rshr,rsra}
      target/arm: Create gen_gvec_{sri,sli}
      target/arm: Remove unnecessary range check for VSHL
      target/arm: Tidy handle_vec_simd_shri
      target/arm: Create gen_gvec_{ceq,clt,cle,cgt,cge}0
      target/arm: Create gen_gvec_{mla,mls}
      target/arm: Swap argument order for VSHL during decode
      target/arm: Create gen_gvec_{cmtst,ushl,sshl}
      target/arm: Create gen_gvec_{uqadd, sqadd, uqsub, sqsub}
      target/arm: Remove fp_status from helper_{recpe, rsqrte}_u32
      target/arm: Create gen_gvec_{qrdmla,qrdmls}
      target/arm: Pass pointer to qc to qrdmla/qrdmls
      target/arm: Clear tail in gvec_fmul_idx_*, gvec_fmla_idx_*
      target/arm: Vectorize SABD/UABD
      target/arm: Vectorize SABA/UABA

GDB's remote protocol requires M-profile cores to use the feature
name 'org.gnu.gdb.arm.m-profile' instead of the 'org.gnu.gdb.arm.core'
feature used for A- and R-profile cores. We weren't doing this, which
meant GDB treated our M-profile cores like A-profile ones. This mostly
doesn't matter, but for instance means that it doesn't correctly
handle backtraces where an M-profile exception frame is involved.

Ship a copy of GDB's arm-m-profile.xml and use it on the M-profile
cores.  The integer registers have the same offsets as the
arm-core.xml, but register 25 is the M-profile XPSR rather than the
A-profile CPSR, so we need to update arm_cpu_gdb_read_register() and
arm_cpu_gdb_write_register() to handle XPSR reads and writes.
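
A simplified, standalone model of the register-25 behaviour described
above might look like the following. The struct, the function names and
the MODEL_XPSR_EXCP constant are assumptions made for illustration (the
exception number is taken to occupy the low 9 bits of XPSR); the real
code goes through xpsr_read()/xpsr_write() and cpsr_read()/cpsr_write(),
which have side effects this sketch ignores.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the exception-number field is the low 9 bits of XPSR. */
#define MODEL_XPSR_EXCP 0x1ffu

/* Hypothetical, heavily reduced CPU state for illustration only. */
struct model_cpu {
    bool m_profile;
    uint32_t cpsr;
    uint32_t xpsr;
};

/* Register 25 reads back XPSR on M-profile and CPSR otherwise. */
static uint32_t model_read_reg25(const struct model_cpu *cpu)
{
    return cpu->m_profile ? cpu->xpsr : cpu->cpsr;
}

/*
 * Register 25 writes update every XPSR field except the exception
 * number on M-profile; on other profiles the whole CPSR is replaced.
 */
static void model_write_reg25(struct model_cpu *cpu, uint32_t val)
{
    if (cpu->m_profile) {
        uint32_t mask = ~MODEL_XPSR_EXCP;
        cpu->xpsr = (cpu->xpsr & ~mask) | (val & mask);
    } else {
        cpu->cpsr = val;
    }
}
```

Masking the write rather than rejecting it means a debugger can still
change the flag bits while the current exception number is preserved.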

Fixes: https://bugs.launchpad.net/qemu/+bug/1877136
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20200507134755.13997-1-peter.maydell@linaro.org
---
 configure                 |  4 ++--
 target/arm/cpu_tcg.c      |  1 +
 target/arm/gdbstub.c      | 22 ++++++++++++++++++----
 gdb-xml/arm-m-profile.xml | 27 +++++++++++++++++++++++++++
 4 files changed, 48 insertions(+), 6 deletions(-)
 create mode 100644 gdb-xml/arm-m-profile.xml

diff --git a/configure b/configure
index XXXXXXX..XXXXXXX 100755
--- a/configure
+++ b/configure
@@ -XXX,XX +XXX,XX @@ case "$target_name" in
     TARGET_SYSTBL_ABI=common,oabi
     bflt="yes"
     mttcg="yes"
-    gdb_xml_files="arm-core.xml arm-vfp.xml arm-vfp3.xml arm-neon.xml"
+    gdb_xml_files="arm-core.xml arm-vfp.xml arm-vfp3.xml arm-neon.xml arm-m-profile.xml"
   ;;
   aarch64|aarch64_be)
     TARGET_ARCH=aarch64
     TARGET_BASE_ARCH=arm
     bflt="yes"
     mttcg="yes"
-    gdb_xml_files="aarch64-core.xml aarch64-fpu.xml arm-core.xml arm-vfp.xml arm-vfp3.xml arm-neon.xml"
+    gdb_xml_files="aarch64-core.xml aarch64-fpu.xml arm-core.xml arm-vfp.xml arm-vfp3.xml arm-neon.xml arm-m-profile.xml"
   ;;
   cris)
   ;;
diff --git a/target/arm/cpu_tcg.c b/target/arm/cpu_tcg.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu_tcg.c
+++ b/target/arm/cpu_tcg.c
@@ -XXX,XX +XXX,XX @@ static void arm_v7m_class_init(ObjectClass *oc, void *data)
 #endif
 
     cc->cpu_exec_interrupt = arm_v7m_cpu_exec_interrupt;
+    cc->gdb_core_xml_file = "arm-m-profile.xml";
 }
 
 static const ARMCPUInfo arm_tcg_cpus[] = {
diff --git a/target/arm/gdbstub.c b/target/arm/gdbstub.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/gdbstub.c
+++ b/target/arm/gdbstub.c
@@ -XXX,XX +XXX,XX @@ int arm_cpu_gdb_read_register(CPUState *cs, GByteArray *mem_buf, int n)
         }
         return gdb_get_reg32(mem_buf, 0);
     case 25:
-        /* CPSR */
-        return gdb_get_reg32(mem_buf, cpsr_read(env));
+        /* CPSR, or XPSR for M-profile */
+        if (arm_feature(env, ARM_FEATURE_M)) {
+            return gdb_get_reg32(mem_buf, xpsr_read(env));
+        } else {
+            return gdb_get_reg32(mem_buf, cpsr_read(env));
+        }
     }
     /* Unknown register.  */
     return 0;
@@ -XXX,XX +XXX,XX @@ int arm_cpu_gdb_write_register(CPUState *cs, uint8_t *mem_buf, int n)
         }
         return 4;
     case 25:
-        /* CPSR */
-        cpsr_write(env, tmp, 0xffffffff, CPSRWriteByGDBStub);
+        /* CPSR, or XPSR for M-profile */
+        if (arm_feature(env, ARM_FEATURE_M)) {
+            /*
+             * Don't allow writing to XPSR.Exception as it can cause
+             * a transition into or out of handler mode (it's not
+             * writeable via the MSR insn so this is a reasonable
+             * restriction). Other fields are safe to update.
+             */
+            xpsr_write(env, tmp, ~XPSR_EXCP);
+        } else {
+            cpsr_write(env, tmp, 0xffffffff, CPSRWriteByGDBStub);
+        }
         return 4;
     }
     /* Unknown register.  */
diff --git a/gdb-xml/arm-m-profile.xml b/gdb-xml/arm-m-profile.xml
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/gdb-xml/arm-m-profile.xml
@@ -XXX,XX +XXX,XX @@
+<?xml version="1.0"?>
+
+
+<!DOCTYPE feature SYSTEM "gdb-target.dtd">
+<feature name="org.gnu.gdb.arm.m-profile">
+  <reg name="r0" bitsize="32"/>
+  <reg name="r1" bitsize="32"/>
+  <reg name="r2" bitsize="32"/>
+  <reg name="r3" bitsize="32"/>
+  <reg name="r4" bitsize="32"/>
+  <reg name="r5" bitsize="32"/>
+  <reg name="r6" bitsize="32"/>
+  <reg name="r7" bitsize="32"/>
+  <reg name="r8" bitsize="32"/>
+  <reg name="r9" bitsize="32"/>
+  <reg name="r10" bitsize="32"/>
+  <reg name="r11" bitsize="32"/>
+  <reg name="r12" bitsize="32"/>
+  <reg name="sp" bitsize="32" type="data_ptr"/>
+  <reg name="lr" bitsize="32"/>
+  <reg name="pc" bitsize="32" type="code_ptr"/>
+  <reg name="xpsr" bitsize="32" regnum="25"/>
+</feature>
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

The functions eliminate duplication of the special cases for
this operation.  They match up with the GVecGen2iFn typedef.

Add out-of-line helpers.  We got away with only having inline
expanders because the neon vector size is only 16 bytes, and
we know that the inline expansion will always succeed.
When we reuse this for SVE, tcg-gvec-op may decide to use an
out-of-line helper due to longer vector lengths.
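
To make the role of an out-of-line helper concrete, here is a minimal
standalone sketch of an unsigned shift-right-and-accumulate over byte
elements. The function name and the explicit oprsz/maxsz parameters are
assumptions for illustration; the real helpers are generated by a macro
in vec_helper.c and receive the sizes and shift packed into a descriptor.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical model of an out-of-line USRA helper for 8-bit elements:
 * d[i] += n[i] >> shift for the first oprsz bytes, then zero the tail
 * up to maxsz.  Unlike an inline expansion sized for a 16-byte Neon
 * register, a loop like this works for any operation size.
 */
static void model_usra_b(uint8_t *d, const uint8_t *n, unsigned shift,
                         size_t oprsz, size_t maxsz)
{
    for (size_t i = 0; i < oprsz; i++) {
        d[i] = (uint8_t)(d[i] + (n[i] >> shift));
    }
    /* Bytes between oprsz and maxsz must read as zero afterwards. */
    memset(d + oprsz, 0, maxsz - oprsz);
}
```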

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-2-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.h        |  10 +++
 target/arm/translate.h     |   7 +-
 target/arm/translate-a64.c |  15 +---
 target/arm/translate.c     | 161 ++++++++++++++++++++++---------------
 target/arm/vec_helper.c    |  25 ++++++
 5 files changed, 139 insertions(+), 79 deletions(-)

From: Richard Henderson <richard.henderson@linaro.org>

Create vectorized versions of handle_shri_with_rndacc
for shift+round and shift+round+accumulate.  Add out-of-line
helpers in preparation for longer vector lengths from SVE.
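
The rounding arithmetic can be sketched on single 32-bit scalars as
below. The function names are hypothetical; the expression follows the
out-of-line helpers added in this patch (shift by one less than the
requested amount, then add the low bit back in). The signed variant
assumes that right-shifting a negative value is an arithmetic shift,
which ISO C leaves implementation-defined.

```c
#include <stdint.h>

/* Signed rounding shift right, valid for 1 <= shift <= 32. */
static int32_t model_srshr32(int32_t n, int shift)
{
    int32_t tmp = n >> (shift - 1);   /* low bit is now the rounding bit */
    return (tmp >> 1) + (tmp & 1);
}

/* Unsigned rounding shift right, valid for 1 <= shift <= 32. */
static uint32_t model_urshr32(uint32_t n, int shift)
{
    uint32_t tmp = n >> (shift - 1);
    return (tmp >> 1) + (tmp & 1);
}

/* Unsigned rounding shift right and accumulate into d. */
static uint32_t model_ursra32(uint32_t d, uint32_t n, int shift)
{
    return d + model_urshr32(n, shift);
}
```

Splitting the shift in two avoids computing n + (1 << (shift - 1)),
which could overflow the element width, and it also keeps a shift equal
to the element size well defined: the signed result is then always 0 and
the unsigned result is a copy of the most significant bit.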

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-3-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.h        |  20 ++
 target/arm/translate.h     |   9 +
 target/arm/translate-a64.c |  11 +-
 target/arm/translate.c     | 463 +++++++++++++++++++++++++++++++++++--
 target/arm/vec_helper.c    |  50 ++++
 5 files changed, 527 insertions(+), 26 deletions(-)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(gvec_usra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(gvec_usra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(gvec_usra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_3(gvec_srshr_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_srshr_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_srshr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_srshr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_3(gvec_urshr_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_urshr_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_urshr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_urshr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_3(gvec_srsra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_srsra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_srsra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_srsra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_3(gvec_ursra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_ursra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_ursra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_ursra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+
 #ifdef TARGET_AARCH64
 #include "helper-a64.h"
 #include "helper-sve.h"
diff --git a/target/arm/translate.h b/target/arm/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
                    int64_t shift, uint32_t opr_sz, uint32_t max_sz);
 
+void gen_gvec_srshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                    int64_t shift, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_urshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                    int64_t shift, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_srsra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                    int64_t shift, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_ursra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                    int64_t shift, uint32_t opr_sz, uint32_t max_sz);
+
 /*
  * Forward to the isar_feature_* tests given a DisasContext pointer.
  */
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
         return;
 
     case 0x04: /* SRSHR / URSHR (rounding) */
-        break;
+        gen_gvec_fn2i(s, is_q, rd, rn, shift,
+                      is_u ? gen_gvec_urshr : gen_gvec_srshr, size);
+        return;
+
     case 0x06: /* SRSRA / URSRA (accum + rounding) */
-        accumulate = true;
-        break;
+        gen_gvec_fn2i(s, is_q, rd, rn, shift,
+                      is_u ? gen_gvec_ursra : gen_gvec_srsra, size);
+        return;
+
     default:
         g_assert_not_reached();
     }
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
     }
 }
 
+/*
+ * Shift one less than the requested amount, and the low bit is
+ * the rounding bit.  For the 8 and 16-bit operations, because we
+ * mask the low bit, we can perform a normal integer shift instead
+ * of a vector shift.
+ */
+static void gen_srshr8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_shri_i64(t, a, sh - 1);
+    tcg_gen_andi_i64(t, t, dup_const(MO_8, 1));
+    tcg_gen_vec_sar8i_i64(d, a, sh);
+    tcg_gen_vec_add8_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_srshr16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_shri_i64(t, a, sh - 1);
+    tcg_gen_andi_i64(t, t, dup_const(MO_16, 1));
+    tcg_gen_vec_sar16i_i64(d, a, sh);
+    tcg_gen_vec_add16_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_srshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
+{
+    TCGv_i32 t = tcg_temp_new_i32();
+
+    tcg_gen_extract_i32(t, a, sh - 1, 1);
+    tcg_gen_sari_i32(d, a, sh);
+    tcg_gen_add_i32(d, d, t);
+    tcg_temp_free_i32(t);
+}
+
+static void gen_srshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_extract_i64(t, a, sh - 1, 1);
+    tcg_gen_sari_i64(d, a, sh);
+    tcg_gen_add_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_srshr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
+{
+    TCGv_vec t = tcg_temp_new_vec_matching(d);
+    TCGv_vec ones = tcg_temp_new_vec_matching(d);
+
+    tcg_gen_shri_vec(vece, t, a, sh - 1);
+    tcg_gen_dupi_vec(vece, ones, 1);
+    tcg_gen_and_vec(vece, t, t, ones);
+    tcg_gen_sari_vec(vece, d, a, sh);
+    tcg_gen_add_vec(vece, d, d, t);
+
+    tcg_temp_free_vec(t);
+    tcg_temp_free_vec(ones);
+}
+
+void gen_gvec_srshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                    int64_t shift, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_shri_vec, INDEX_op_sari_vec, INDEX_op_add_vec, 0
+    };
+    static const GVecGen2i ops[4] = {
+        { .fni8 = gen_srshr8_i64,
+          .fniv = gen_srshr_vec,
+          .fno = gen_helper_gvec_srshr_b,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fni8 = gen_srshr16_i64,
+          .fniv = gen_srshr_vec,
+          .fno = gen_helper_gvec_srshr_h,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fni4 = gen_srshr32_i32,
+          .fniv = gen_srshr_vec,
+          .fno = gen_helper_gvec_srshr_s,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+        { .fni8 = gen_srshr64_i64,
+          .fniv = gen_srshr_vec,
+          .fno = gen_helper_gvec_srshr_d,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .opt_opc = vecop_list,
+          .vece = MO_64 },
+    };
+
+    /* tszimm encoding produces immediates in the range [1..esize] */
+    tcg_debug_assert(shift > 0);
+    tcg_debug_assert(shift <= (8 << vece));
+
+    if (shift == (8 << vece)) {
+        /*
+         * Shifts equal to the element size are architecturally valid.
+         * Signed results in all sign bits.  With rounding, this produces
+         *   (-1 + 1) >> 1 == 0, or (0 + 1) >> 1 == 0.
+         * I.e. always zero.
+         */
+        tcg_gen_gvec_dup_imm(vece, rd_ofs, opr_sz, max_sz, 0);
+    } else {
+        tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
+    }
+}
+
+static void gen_srsra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    gen_srshr8_i64(t, a, sh);
+    tcg_gen_vec_add8_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_srsra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    gen_srshr16_i64(t, a, sh);
+    tcg_gen_vec_add16_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_srsra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
+{
+    TCGv_i32 t = tcg_temp_new_i32();
+
+    gen_srshr32_i32(t, a, sh);
+    tcg_gen_add_i32(d, d, t);
+    tcg_temp_free_i32(t);
+}
+
+static void gen_srsra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    gen_srshr64_i64(t, a, sh);
+    tcg_gen_add_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_srsra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
+{
+    TCGv_vec t = tcg_temp_new_vec_matching(d);
+
+    gen_srshr_vec(vece, t, a, sh);
+    tcg_gen_add_vec(vece, d, d, t);
+    tcg_temp_free_vec(t);
+}
+
+void gen_gvec_srsra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                    int64_t shift, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_shri_vec, INDEX_op_sari_vec, INDEX_op_add_vec, 0
+    };
+    static const GVecGen2i ops[4] = {
+        { .fni8 = gen_srsra8_i64,
+          .fniv = gen_srsra_vec,
+          .fno = gen_helper_gvec_srsra_b,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_8 },
+        { .fni8 = gen_srsra16_i64,
+          .fniv = gen_srsra_vec,
+          .fno = gen_helper_gvec_srsra_h,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_16 },
+        { .fni4 = gen_srsra32_i32,
+          .fniv = gen_srsra_vec,
+          .fno = gen_helper_gvec_srsra_s,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_32 },
+        { .fni8 = gen_srsra64_i64,
+          .fniv = gen_srsra_vec,
+          .fno = gen_helper_gvec_srsra_d,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_64 },
+    };
+
+    /* tszimm encoding produces immediates in the range [1..esize] */
+    tcg_debug_assert(shift > 0);
+    tcg_debug_assert(shift <= (8 << vece));
+
+    /*
+     * Shifts equal to the element size are architecturally valid.
+     * Signed results in all sign bits.  With rounding, this produces
+     *   (-1 + 1) >> 1 == 0, or (0 + 1) >> 1 == 0.
+     * I.e. always zero.  With accumulation, this leaves D unchanged.
+     */
+    if (shift == (8 << vece)) {
+        /* Nop, but we do need to clear the tail. */
+        tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz);
+    } else {
+        tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
+    }
+}
+
+static void gen_urshr8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_shri_i64(t, a, sh - 1);
+    tcg_gen_andi_i64(t, t, dup_const(MO_8, 1));
+    tcg_gen_vec_shr8i_i64(d, a, sh);
+    tcg_gen_vec_add8_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_urshr16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_shri_i64(t, a, sh - 1);
+    tcg_gen_andi_i64(t, t, dup_const(MO_16, 1));
+    tcg_gen_vec_shr16i_i64(d, a, sh);
+    tcg_gen_vec_add16_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_urshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
+{
+    TCGv_i32 t = tcg_temp_new_i32();
+
+    tcg_gen_extract_i32(t, a, sh - 1, 1);
+    tcg_gen_shri_i32(d, a, sh);
+    tcg_gen_add_i32(d, d, t);
+    tcg_temp_free_i32(t);
+}
+
+static void gen_urshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_extract_i64(t, a, sh - 1, 1);
+    tcg_gen_shri_i64(d, a, sh);
+    tcg_gen_add_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_urshr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t shift)
+{
+    TCGv_vec t = tcg_temp_new_vec_matching(d);
+    TCGv_vec ones = tcg_temp_new_vec_matching(d);
+
+    tcg_gen_shri_vec(vece, t, a, shift - 1);
+    tcg_gen_dupi_vec(vece, ones, 1);
+    tcg_gen_and_vec(vece, t, t, ones);
+    tcg_gen_shri_vec(vece, d, a, shift);
+    tcg_gen_add_vec(vece, d, d, t);
+
+    tcg_temp_free_vec(t);
+    tcg_temp_free_vec(ones);
+}
+
+void gen_gvec_urshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                    int64_t shift, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_shri_vec, INDEX_op_add_vec, 0
+    };
+    static const GVecGen2i ops[4] = {
+        { .fni8 = gen_urshr8_i64,
+          .fniv = gen_urshr_vec,
+          .fno = gen_helper_gvec_urshr_b,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fni8 = gen_urshr16_i64,
+          .fniv = gen_urshr_vec,
+          .fno = gen_helper_gvec_urshr_h,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fni4 = gen_urshr32_i32,
+          .fniv = gen_urshr_vec,
+          .fno = gen_helper_gvec_urshr_s,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+        { .fni8 = gen_urshr64_i64,
+          .fniv = gen_urshr_vec,
+          .fno = gen_helper_gvec_urshr_d,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .opt_opc = vecop_list,
+          .vece = MO_64 },
+    };
+
+    /* tszimm encoding produces immediates in the range [1..esize] */
+    tcg_debug_assert(shift > 0);
+    tcg_debug_assert(shift <= (8 << vece));
+
+    if (shift == (8 << vece)) {
+        /*
+         * Shifts equal to the element size are architecturally valid.
+         * Unsigned results in zero.  With rounding, this produces a
+         * copy of the most significant bit.
+         */
+        tcg_gen_gvec_shri(vece, rd_ofs, rm_ofs, shift - 1, opr_sz, max_sz);
+    } else {
+        tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
+    }
+}
+
+static void gen_ursra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    if (sh == 8) {
+        tcg_gen_vec_shr8i_i64(t, a, 7);
+    } else {
+        gen_urshr8_i64(t, a, sh);
+    }
+    tcg_gen_vec_add8_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_ursra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    if (sh == 16) {
+        tcg_gen_vec_shr16i_i64(t, a, 15);
+    } else {
+        gen_urshr16_i64(t, a, sh);
+    }
+    tcg_gen_vec_add16_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_ursra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
+{
+    TCGv_i32 t = tcg_temp_new_i32();
+
+    if (sh == 32) {
+        tcg_gen_shri_i32(t, a, 31);
+    } else {
+        gen_urshr32_i32(t, a, sh);
+    }
+    tcg_gen_add_i32(d, d, t);
+    tcg_temp_free_i32(t);
+}
+
+static void gen_ursra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    if (sh == 64) {
+        tcg_gen_shri_i64(t, a, 63);
+    } else {
+        gen_urshr64_i64(t, a, sh);
+    }
+    tcg_gen_add_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_ursra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
+{
+    TCGv_vec t = tcg_temp_new_vec_matching(d);
+
+    if (sh == (8 << vece)) {
+        tcg_gen_shri_vec(vece, t, a, sh - 1);
+    } else {
+        gen_urshr_vec(vece, t, a, sh);
+    }
+    tcg_gen_add_vec(vece, d, d, t);
+    tcg_temp_free_vec(t);
+}
+
+void gen_gvec_ursra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                    int64_t shift, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_shri_vec, INDEX_op_add_vec, 0
+    };
+    static const GVecGen2i ops[4] = {
+        { .fni8 = gen_ursra8_i64,
+          .fniv = gen_ursra_vec,
+          .fno = gen_helper_gvec_ursra_b,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_8 },
+        { .fni8 = gen_ursra16_i64,
+          .fniv = gen_ursra_vec,
+          .fno = gen_helper_gvec_ursra_h,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_16 },
+        { .fni4 = gen_ursra32_i32,
+          .fniv = gen_ursra_vec,
+          .fno = gen_helper_gvec_ursra_s,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_32 },
+        { .fni8 = gen_ursra64_i64,
+          .fniv = gen_ursra_vec,
+          .fno = gen_helper_gvec_ursra_d,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_64 },
+    };
+
+    /* tszimm encoding produces immediates in the range [1..esize] */
+    tcg_debug_assert(shift > 0);
+    tcg_debug_assert(shift <= (8 << vece));
+
+    tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
+}
+
 static void gen_shr8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
 {
     uint64_t mask = dup_const(MO_8, 0xff >> shift);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                     }
                     return 0;
 
+                case 2: /* VRSHR */
+                    /* Right shift comes here negative.  */
+                    shift = -shift;
+                    if (u) {
+                        gen_gvec_urshr(size, rd_ofs, rm_ofs, shift,
+                                       vec_size, vec_size);
+                    } else {
+                        gen_gvec_srshr(size, rd_ofs, rm_ofs, shift,
+                                       vec_size, vec_size);
+                    }
+                    return 0;
+
+                case 3: /* VRSRA */
+                    /* Right shift comes here negative.  */
+                    shift = -shift;
+                    if (u) {
+                        gen_gvec_ursra(size, rd_ofs, rm_ofs, shift,
+                                       vec_size, vec_size);
+                    } else {
+                        gen_gvec_srsra(size, rd_ofs, rm_ofs, shift,
+                                       vec_size, vec_size);
+                    }
+                    return 0;
+
                 case 4: /* VSRI */
                     if (!u) {
                         return 1;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                         neon_load_reg64(cpu_V0, rm + pass);
                         tcg_gen_movi_i64(cpu_V1, imm);
                         switch (op) {
-                        case 2: /* VRSHR */
-                        case 3: /* VRSRA */
-                            if (u)
-                                gen_helper_neon_rshl_u64(cpu_V0, cpu_V0, cpu_V1);
-                            else
-                                gen_helper_neon_rshl_s64(cpu_V0, cpu_V0, cpu_V1);
-                            break;
                         case 6: /* VQSHLU */
                             gen_helper_neon_qshlu_s64(cpu_V0, cpu_env,
                                                       cpu_V0, cpu_V1);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                         default:
                             g_assert_not_reached();
                         }
-                        if (op == 3) {
-                            /* Accumulate.  */
-                            neon_load_reg64(cpu_V1, rd + pass);
-                            tcg_gen_add_i64(cpu_V0, cpu_V0, cpu_V1);
-                        }
                         neon_store_reg64(cpu_V0, rd + pass);
                     } else { /* size < 3 */
                         /* Operands in T0 and T1.  */
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                         tmp2 = tcg_temp_new_i32();
                         tcg_gen_movi_i32(tmp2, imm);
                         switch (op) {
-                        case 2: /* VRSHR */
-                        case 3: /* VRSRA */
-                            GEN_NEON_INTEGER_OP(rshl);
-                            break;
                         case 6: /* VQSHLU */
                             switch (size) {
                             case 0:
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                             g_assert_not_reached();
                         }
                         tcg_temp_free_i32(tmp2);
-
-                        if (op == 3) {
-                            /* Accumulate.  */
-                            tmp2 = neon_load_reg(rd, pass);
-                            gen_neon_add(size, tmp, tmp2);
-                            tcg_temp_free_i32(tmp2);
-                        }
                         neon_store_reg(rd, pass, tmp);
                     }
                 } /* for pass */
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -XXX,XX +XXX,XX @@ DO_SRA(gvec_usra_d, uint64_t)
 
 #undef DO_SRA
 
+#define DO_RSHR(NAME, TYPE)                             \
+void HELPER(NAME)(void *vd, void *vn, uint32_t desc)    \
+{                                                       \
+    intptr_t i, oprsz = simd_oprsz(desc);               \
+    int shift = simd_data(desc);                        \
+    TYPE *d = vd, *n = vn;                              \
+    for (i = 0; i < oprsz / sizeof(TYPE); i++) {        \
+        TYPE tmp = n[i] >> (shift - 1);                 \
+        d[i] = (tmp >> 1) + (tmp & 1);                  \
+    }                                                   \
+    clear_tail(d, oprsz, simd_maxsz(desc));             \
+}
+
+DO_RSHR(gvec_srshr_b, int8_t)
+DO_RSHR(gvec_srshr_h, int16_t)
+DO_RSHR(gvec_srshr_s, int32_t)
+DO_RSHR(gvec_srshr_d, int64_t)
+
+DO_RSHR(gvec_urshr_b, uint8_t)
+DO_RSHR(gvec_urshr_h, uint16_t)
+DO_RSHR(gvec_urshr_s, uint32_t)
+DO_RSHR(gvec_urshr_d, uint64_t)
+
+#undef DO_RSHR
+
+#define DO_RSRA(NAME, TYPE)                             \
+void HELPER(NAME)(void *vd, void *vn, uint32_t desc)    \
+{                                                       \
+    intptr_t i, oprsz = simd_oprsz(desc);               \
+    int shift = simd_data(desc);                        \
+    TYPE *d = vd, *n = vn;                              \
+    for (i = 0; i < oprsz / sizeof(TYPE); i++) {        \
+        TYPE tmp = n[i] >> (shift - 1);                 \
+        d[i] += (tmp >> 1) + (tmp & 1);                 \
+    }                                                   \
+    clear_tail(d, oprsz, simd_maxsz(desc));             \
+}
+
+DO_RSRA(gvec_srsra_b, int8_t)
+DO_RSRA(gvec_srsra_h, int16_t)
+DO_RSRA(gvec_srsra_s, int32_t)
+DO_RSRA(gvec_srsra_d, int64_t)
+
+DO_RSRA(gvec_ursra_b, uint8_t)
+DO_RSRA(gvec_ursra_h, uint16_t)
+DO_RSRA(gvec_ursra_s, uint32_t)
+DO_RSRA(gvec_ursra_d, uint64_t)
+
+#undef DO_RSRA
+
 /*
  * Convert float16 to float32, raising no exceptions and
  * preserving exceptional values, including SNaN.
-- 
2.20.1
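
For readers following the DO_RSHR and DO_RSRA helpers above, here is a
minimal standalone sketch of the same two-step rounding arithmetic on a
single 8-bit lane. The function names (urshr8, srshr8, ursra8) are
illustrative only and are not part of the patch. The signed variant
assumes that >> on a negative value is an arithmetic shift, which the C
standard leaves implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Rounding shift right of one unsigned 8-bit lane.
 * Shift by (shift - 1) first, then fold the last bit shifted out
 * back in as the rounding increment.  Valid for 1 <= shift <= 8.
 */
static uint8_t urshr8(uint8_t n, int shift)
{
    assert(shift >= 1 && shift <= 8);
    uint8_t tmp = n >> (shift - 1);
    return (tmp >> 1) + (tmp & 1);
}

/* Signed variant; assumes >> of a negative value is arithmetic. */
static int8_t srshr8(int8_t n, int shift)
{
    assert(shift >= 1 && shift <= 8);
    int8_t tmp = n >> (shift - 1);
    return (tmp >> 1) + (tmp & 1);
}

/* Accumulating form: add the rounded result to the destination lane. */
static uint8_t ursra8(uint8_t d, uint8_t n, int shift)
{
    return d + urshr8(n, shift);
}
```

Written this way, the rounding constant is never added to the input at
full element width, so the intermediate value cannot overflow the
element type even when shift equals the element size.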

From: Richard Henderson <richard.henderson@linaro.org>

The functions eliminate duplication of the special cases for
this operation.  They match up with the GVecGen2iFn typedef.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-4-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.h        |  10 ++
 target/arm/translate.h     |   7 +-
 target/arm/translate-a64.c |  20 +---
 target/arm/translate.c     | 186 +++++++++++++++++++++----------------
 target/arm/vec_helper.c    |  38 ++++++++
 5 files changed, 160 insertions(+), 101 deletions(-)

From: Richard Henderson <richard.henderson@linaro.org>

In 1dc8425e551, while converting to gvec, I added an extra range check
against the shift count.  This was unnecessary because the encoding of
the shift count can only produce values from 0 to the element size - 1.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-5-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
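A small sketch of why the check can never fire, assuming the usual
Neon shift-by-immediate layout in which the highest set bit of the
7-bit L:imm6 field selects the element size and the bits below it hold
the left-shift count. The function decode_shl_imm is hypothetical and
is not the decoder in translate.c.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical decoder for a left shift by immediate.
 * 'l_imm6' is the 7-bit L:imm6 field.  Stores log2(element bytes)
 * through 'size' and returns the shift count.
 */
static int decode_shl_imm(uint32_t l_imm6, int *size)
{
    int sz = 3;

    /* Bits [6:3] all clear is not a shift encoding in this layout. */
    assert(l_imm6 < 128 && (l_imm6 >> 3) != 0);

    /* The highest set bit among bits [6:3] selects the element size. */
    while (!(l_imm6 & (1u << (sz + 3)))) {
        sz--;
    }
    *size = sz;

    /* The remaining low (3 + sz) bits are the shift count. */
    return (int)(l_imm6 & ((1u << (sz + 3)) - 1));
}
```

Because the count is masked to 3 + size bits, it is at most
(8 << size) - 1 by construction, so a "shift >= 8 << size" test is dead
code under these assumptions.
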
 target/arm/translate.c | 12 ++----------
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                         gen_gvec_sli(size, rd_ofs, rm_ofs, shift,
                                      vec_size, vec_size);
                     } else { /* VSHL */
-                        /* Shifts larger than the element size are
-                         * architecturally valid and results in zero.
-                         */
-                        if (shift >= 8 << size) {
-                            tcg_gen_gvec_dup_imm(size, rd_ofs,
-                                                 vec_size, vec_size, 0);
-                        } else {
-                            tcg_gen_gvec_shli(size, rd_ofs, rm_ofs, shift,
-                                              vec_size, vec_size);
-                        }
+                        tcg_gen_gvec_shli(size, rd_ofs, rm_ofs, shift,
+                                          vec_size, vec_size);
                     }
                     return 0;
                 }
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

Now that we've converted all cases to gvec, there is quite a bit
of dead code at the end of the function.  Remove it.

Sink the call to gen_gvec_fn2i to the end, loading a function
pointer within the switch statement.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-6-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
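The "pick a function pointer in the switch, call once at the end"
shape can be sketched in isolation as below. This is a scalar toy with
hypothetical names (ShiftFn, do_ushr, do_usra, do_urshr, shift_right),
not the gvec expanders themselves. The opcode values are borrowed from
the switch in the diff purely for familiarity.

```c
#include <assert.h>
#include <stdint.h>

/* Common signature, playing the role GVecGen2iFn plays in the patch. */
typedef uint32_t ShiftFn(uint32_t d, uint32_t n, int shift);

static uint32_t do_ushr(uint32_t d, uint32_t n, int shift)
{
    (void)d;
    /* Widen first so that shift == 32 is well defined. */
    return (uint32_t)((uint64_t)n >> shift);
}

static uint32_t do_usra(uint32_t d, uint32_t n, int shift)
{
    return d + (uint32_t)((uint64_t)n >> shift);
}

static uint32_t do_urshr(uint32_t d, uint32_t n, int shift)
{
    (void)d;
    return (uint32_t)(((uint64_t)n + (1ull << (shift - 1))) >> shift);
}

static uint32_t shift_right(int opcode, uint32_t d, uint32_t n, int shift)
{
    ShiftFn *fn;

    assert(shift >= 1 && shift <= 32);

    switch (opcode) {
    case 0x00: /* plain shift */
        fn = do_ushr;
        break;
    case 0x02: /* shift and accumulate */
        fn = do_usra;
        break;
    case 0x04: /* rounding shift */
        fn = do_urshr;
        break;
    default:
        assert(0);
        return 0;
    }

    /* Single call site shared by every case. */
    return fn(d, n, shift);
}
```

With one call site, the operand plumbing is written once and each case
reduces to choosing the expander.
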
 target/arm/translate-a64.c | 56 ++++++++++----------------------------
 1 file changed, 14 insertions(+), 42 deletions(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
     int size = 32 - clz32(immh) - 1;
     int immhb = immh << 3 | immb;
     int shift = 2 * (8 << size) - immhb;
-    bool accumulate = false;
-    int dsize = is_q ? 128 : 64;
-    int esize = 8 << size;
-    int elements = dsize/esize;
-    MemOp memop = size | (is_u ? 0 : MO_SIGN);
-    TCGv_i64 tcg_rn = new_tmp_a64(s);
-    TCGv_i64 tcg_rd = new_tmp_a64(s);
-    TCGv_i64 tcg_round;
-    uint64_t round_const;
-    int i;
+    GVecGen2iFn *gvec_fn;
 
     if (extract32(immh, 3, 1) && !is_q) {
         unallocated_encoding(s);
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
 
     switch (opcode) {
     case 0x02: /* SSRA / USRA (accumulate) */
-        gen_gvec_fn2i(s, is_q, rd, rn, shift,
-                      is_u ? gen_gvec_usra : gen_gvec_ssra, size);
-        return;
+        gvec_fn = is_u ? gen_gvec_usra : gen_gvec_ssra;
+        break;
 
     case 0x08: /* SRI */
-        gen_gvec_fn2i(s, is_q, rd, rn, shift, gen_gvec_sri, size);
-        return;
+        gvec_fn = gen_gvec_sri;
+        break;
 
     case 0x00: /* SSHR / USHR */
         if (is_u) {
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
                 /* Shift count the same size as element size produces zero.  */
                 tcg_gen_gvec_dup_imm(size, vec_full_reg_offset(s, rd),
                                      is_q ? 16 : 8, vec_full_reg_size(s), 0);
-            } else {
-                gen_gvec_fn2i(s, is_q, rd, rn, shift, tcg_gen_gvec_shri, size);
+                return;
             }
+            gvec_fn = tcg_gen_gvec_shri;
         } else {
             /* Shift count the same size as element size produces all sign.  */
             if (shift == 8 << size) {
                 shift -= 1;
             }
-            gen_gvec_fn2i(s, is_q, rd, rn, shift, tcg_gen_gvec_sari, size);
+            gvec_fn = tcg_gen_gvec_sari;
         }
-        return;
+        break;
 
     case 0x04: /* SRSHR / URSHR (rounding) */
-        gen_gvec_fn2i(s, is_q, rd, rn, shift,
-                      is_u ? gen_gvec_urshr : gen_gvec_srshr, size);
-        return;
+        gvec_fn = is_u ? gen_gvec_urshr : gen_gvec_srshr;
+        break;
 
     case 0x06: /* SRSRA / URSRA (accum + rounding) */
-        gen_gvec_fn2i(s, is_q, rd, rn, shift,
-                      is_u ? gen_gvec_ursra : gen_gvec_srsra, size);
-        return;
+        gvec_fn = is_u ? gen_gvec_ursra : gen_gvec_srsra;
+        break;
 
     default:
         g_assert_not_reached();
     }
 
-    round_const = 1ULL << (shift - 1);
-    tcg_round = tcg_const_i64(round_const);
-
-    for (i = 0; i < elements; i++) {
-        read_vec_element(s, tcg_rn, rn, i, memop);
-        if (accumulate) {
-            read_vec_element(s, tcg_rd, rd, i, memop);
-        }
-
-        handle_shri_with_rndacc(tcg_rd, tcg_rn, tcg_round,
-                                accumulate, is_u, size, shift);
-
-        write_vec_element(s, tcg_rd, rd, i, size);
-    }
-    tcg_temp_free_i64(tcg_round);
-
-    clear_vec_high(s, is_q, rd);
+    gen_gvec_fn2i(s, is_q, rd, rn, shift, gvec_fn, size);
 }
 
 /* SHL/SLI - Vector shift left */
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

Provide a functional interface for the vector expansion.
This fits better with the existing set of helpers that
we provide for other operations.

Macro-ize the 5 nearly identical comparisons.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-7-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
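As a scalar illustration of what each generated comparison computes,
the sketch below uses the same macro technique on plain C integers. The
setcond-then-negate pair in the _i32/_i64 expanders corresponds to
turning a 0/1 comparison result into an all-zeros/all-ones mask. The
NAME##0_u32 functions are hypothetical and exist only for this example.

```c
#include <stdint.h>

/*
 * Generate one compare-against-zero function per condition.
 * The comparison yields 0 or 1; negating it as an unsigned value
 * turns 1 into an all-ones mask and leaves 0 unchanged.
 */
#define GEN_CMP0(NAME, OP)                          \
    static uint32_t NAME##0_u32(int32_t a)          \
    {                                               \
        return -(uint32_t)(a OP 0);                 \
    }

GEN_CMP0(ceq, ==)
GEN_CMP0(cle, <=)
GEN_CMP0(cge, >=)
GEN_CMP0(clt, <)
GEN_CMP0(cgt, >)

#undef GEN_CMP0
```

For example, clt0_u32(-7) gives 0xffffffff and cgt0_u32(0) gives 0.
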
 target/arm/translate.h     |  16 ++-
 target/arm/translate-a64.c |  22 ++--
 target/arm/translate.c     | 254 ++++++++-----------------------------
 3 files changed, 74 insertions(+), 218 deletions(-)

diff --git a/target/arm/translate.h b/target/arm/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ static inline void gen_swstep_exception(DisasContext *s, int isv, int ex)
 uint64_t vfp_expand_imm(int size, uint8_t imm8);
 
 /* Vector operations shared between ARM and AArch64.  */
-extern const GVecGen2 ceq0_op[4];
-extern const GVecGen2 clt0_op[4];
-extern const GVecGen2 cgt0_op[4];
-extern const GVecGen2 cle0_op[4];
-extern const GVecGen2 cge0_op[4];
+void gen_gvec_ceq0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                   uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_clt0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                   uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_cgt0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                   uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_cle0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                   uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_cge0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                   uint32_t opr_sz, uint32_t max_sz);
+
 extern const GVecGen3 mla_op[4];
 extern const GVecGen3 mls_op[4];
 extern const GVecGen3 cmtst_op[4];
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void gen_gvec_fn4(DisasContext *s, bool is_q, int rd, int rn, int rm,
             is_q ? 16 : 8, vec_full_reg_size(s));
 }
 
-/* Expand a 2-operand AdvSIMD vector operation using an op descriptor. */
-static void gen_gvec_op2(DisasContext *s, bool is_q, int rd,
-                         int rn, const GVecGen2 *gvec_op)
-{
-    tcg_gen_gvec_2(vec_full_reg_offset(s, rd), vec_full_reg_offset(s, rn),
-                   is_q ? 16 : 8, vec_full_reg_size(s), gvec_op);
-}
-
 /* Expand a 3-operand AdvSIMD vector operation using an op descriptor.  */
 static void gen_gvec_op3(DisasContext *s, bool is_q, int rd,
                          int rn, int rm, const GVecGen3 *gvec_op)
@@ -XXX,XX +XXX,XX @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn)
         }
         break;
     case 0x8: /* CMGT, CMGE */
-        gen_gvec_op2(s, is_q, rd, rn, u ? &cge0_op[size] : &cgt0_op[size]);
+        if (u) {
+            gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_cge0, size);
+        } else {
+            gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_cgt0, size);
+        }
         return;
     case 0x9: /* CMEQ, CMLE */
-        gen_gvec_op2(s, is_q, rd, rn, u ? &cle0_op[size] : &ceq0_op[size]);
+        if (u) {
+            gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_cle0, size);
+        } else {
+            gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_ceq0, size);
+        }
         return;
     case 0xa: /* CMLT */
-        gen_gvec_op2(s, is_q, rd, rn, &clt0_op[size]);
+        gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_clt0, size);
         return;
     case 0xb:
         if (u) { /* ABS, NEG */
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int do_v81_helper(DisasContext *s, gen_helper_gvec_3_ptr *fn,
     return 1;
 }
 
-static void gen_ceq0_i32(TCGv_i32 d, TCGv_i32 a)
-{
-    tcg_gen_setcondi_i32(TCG_COND_EQ, d, a, 0);
-    tcg_gen_neg_i32(d, d);
-}
-
-static void gen_ceq0_i64(TCGv_i64 d, TCGv_i64 a)
-{
-    tcg_gen_setcondi_i64(TCG_COND_EQ, d, a, 0);
-    tcg_gen_neg_i64(d, d);
-}
-
-static void gen_ceq0_vec(unsigned vece, TCGv_vec d, TCGv_vec a)
-{
-    TCGv_vec zero = tcg_const_zeros_vec_matching(d);
-    tcg_gen_cmp_vec(TCG_COND_EQ, vece, d, a, zero);
-    tcg_temp_free_vec(zero);
-}
+#define GEN_CMP0(NAME, COND)                                            \
+    static void gen_##NAME##0_i32(TCGv_i32 d, TCGv_i32 a)               \
+    {                                                                   \
+        tcg_gen_setcondi_i32(COND, d, a, 0);                            \
+        tcg_gen_neg_i32(d, d);                                          \
+    }                                                                   \
+    static void gen_##NAME##0_i64(TCGv_i64 d, TCGv_i64 a)               \
+    {                                                                   \
+        tcg_gen_setcondi_i64(COND, d, a, 0);                            \
+        tcg_gen_neg_i64(d, d);                                          \
+    }                                                                   \
+    static void gen_##NAME##0_vec(unsigned vece, TCGv_vec d, TCGv_vec a) \
+    {                                                                   \
+        TCGv_vec zero = tcg_const_zeros_vec_matching(d);                \
+        tcg_gen_cmp_vec(COND, vece, d, a, zero);                        \
+        tcg_temp_free_vec(zero);                                        \
+    }                                                                   \
+    void gen_gvec_##NAME##0(unsigned vece, uint32_t d, uint32_t m,      \
+                            uint32_t opr_sz, uint32_t max_sz)           \
+    {                                                                   \
+        const GVecGen2 op[4] = {                                        \
+            { .fno = gen_helper_gvec_##NAME##0_b,                       \
+              .fniv = gen_##NAME##0_vec,                                \
+              .opt_opc = vecop_list_cmp,                                \
+              .vece = MO_8 },                                           \
+            { .fno = gen_helper_gvec_##NAME##0_h,                       \
+              .fniv = gen_##NAME##0_vec,                                \
+              .opt_opc = vecop_list_cmp,                                \
+              .vece = MO_16 },                                          \
+            { .fni4 = gen_##NAME##0_i32,                                \
+              .fniv = gen_##NAME##0_vec,                                \
+              .opt_opc = vecop_list_cmp,                                \
+              .vece = MO_32 },                                          \
+            { .fni8 = gen_##NAME##0_i64,                                \
+              .fniv = gen_##NAME##0_vec,                                \
+              .opt_opc = vecop_list_cmp,                                \
+              .prefer_i64 = TCG_TARGET_REG_BITS == 64,                  \
+              .vece = MO_64 },                                          \
+        };                                                              \
+        tcg_gen_gvec_2(d, m, opr_sz, max_sz, &op[vece]);                \
+    }
 
 static const TCGOpcode vecop_list_cmp[] = {
     INDEX_op_cmp_vec, 0
 };
 
-const GVecGen2 ceq0_op[4] = {
-    { .fno = gen_helper_gvec_ceq0_b,
-      .fniv = gen_ceq0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_8 },
-    { .fno = gen_helper_gvec_ceq0_h,
-      .fniv = gen_ceq0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_16 },
-    { .fni4 = gen_ceq0_i32,
-      .fniv = gen_ceq0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_32 },
-    { .fni8 = gen_ceq0_i64,
-      .fniv = gen_ceq0_vec,
-      .opt_opc = vecop_list_cmp,
-      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-      .vece = MO_64 },
-};
+GEN_CMP0(ceq, TCG_COND_EQ)
+GEN_CMP0(cle, TCG_COND_LE)
+GEN_CMP0(cge, TCG_COND_GE)
+GEN_CMP0(clt, TCG_COND_LT)
+GEN_CMP0(cgt, TCG_COND_GT)
 
-static void gen_cle0_i32(TCGv_i32 d, TCGv_i32 a)
-{
-    tcg_gen_setcondi_i32(TCG_COND_LE, d, a, 0);
-    tcg_gen_neg_i32(d, d);
-}
-
-static void gen_cle0_i64(TCGv_i64 d, TCGv_i64 a)
-{
-    tcg_gen_setcondi_i64(TCG_COND_LE, d, a, 0);
-    tcg_gen_neg_i64(d, d);
-}
-
-static void gen_cle0_vec(unsigned vece, TCGv_vec d, TCGv_vec a)
-{
-    TCGv_vec zero = tcg_const_zeros_vec_matching(d);
-    tcg_gen_cmp_vec(TCG_COND_LE, vece, d, a, zero);
-    tcg_temp_free_vec(zero);
-}
-
-const GVecGen2 cle0_op[4] = {
-    { .fno = gen_helper_gvec_cle0_b,
-      .fniv = gen_cle0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_8 },
-    { .fno = gen_helper_gvec_cle0_h,
-      .fniv = gen_cle0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_16 },
-    { .fni4 = gen_cle0_i32,
-      .fniv = gen_cle0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_32 },
-    { .fni8 = gen_cle0_i64,
-      .fniv = gen_cle0_vec,
-      .opt_opc = vecop_list_cmp,
-      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-      .vece = MO_64 },
-};
-
-static void gen_cge0_i32(TCGv_i32 d, TCGv_i32 a)
-{
-    tcg_gen_setcondi_i32(TCG_COND_GE, d, a, 0);
-    tcg_gen_neg_i32(d, d);
-}
-
-static void gen_cge0_i64(TCGv_i64 d, TCGv_i64 a)
-{
-    tcg_gen_setcondi_i64(TCG_COND_GE, d, a, 0);
-    tcg_gen_neg_i64(d, d);
-}
-
-static void gen_cge0_vec(unsigned vece, TCGv_vec d, TCGv_vec a)
-{
-    TCGv_vec zero = tcg_const_zeros_vec_matching(d);
-    tcg_gen_cmp_vec(TCG_COND_GE, vece, d, a, zero);
-    tcg_temp_free_vec(zero);
-}
-
-const GVecGen2 cge0_op[4] = {
-    { .fno = gen_helper_gvec_cge0_b,
-      .fniv = gen_cge0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_8 },
-    { .fno = gen_helper_gvec_cge0_h,
-      .fniv = gen_cge0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_16 },
-    { .fni4 = gen_cge0_i32,
-      .fniv = gen_cge0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_32 },
-    { .fni8 = gen_cge0_i64,
-      .fniv = gen_cge0_vec,
-      .opt_opc = vecop_list_cmp,
-      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-      .vece = MO_64 },
-};
-
-static void gen_clt0_i32(TCGv_i32 d, TCGv_i32 a)
-{
-    tcg_gen_setcondi_i32(TCG_COND_LT, d, a, 0);
-    tcg_gen_neg_i32(d, d);
-}
-
-static void gen_clt0_i64(TCGv_i64 d, TCGv_i64 a)
-{
-    tcg_gen_setcondi_i64(TCG_COND_LT, d, a, 0);
-    tcg_gen_neg_i64(d, d);
-}
-
-static void gen_clt0_vec(unsigned vece, TCGv_vec d, TCGv_vec a)
-{
-    TCGv_vec zero = tcg_const_zeros_vec_matching(d);
-    tcg_gen_cmp_vec(TCG_COND_LT, vece, d, a, zero);
-    tcg_temp_free_vec(zero);
-}
-
-const GVecGen2 clt0_op[4] = {
-    { .fno = gen_helper_gvec_clt0_b,
-      .fniv = gen_clt0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_8 },
-    { .fno = gen_helper_gvec_clt0_h,
-      .fniv = gen_clt0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_16 },
-    { .fni4 = gen_clt0_i32,
-      .fniv = gen_clt0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_32 },
-    { .fni8 = gen_clt0_i64,
-      .fniv = gen_clt0_vec,
-      .opt_opc = vecop_list_cmp,
-      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-      .vece = MO_64 },
-};
-
-static void gen_cgt0_i32(TCGv_i32 d, TCGv_i32 a)
-{
-    tcg_gen_setcondi_i32(TCG_COND_GT, d, a, 0);
-    tcg_gen_neg_i32(d, d);
-}
-
-static void gen_cgt0_i64(TCGv_i64 d, TCGv_i64 a)
-{
-    tcg_gen_setcondi_i64(TCG_COND_GT, d, a, 0);
-    tcg_gen_neg_i64(d, d);
-}
-
-static void gen_cgt0_vec(unsigned vece, TCGv_vec d, TCGv_vec a)
-{
-    TCGv_vec zero = tcg_const_zeros_vec_matching(d);
-    tcg_gen_cmp_vec(TCG_COND_GT, vece, d, a, zero);
-    tcg_temp_free_vec(zero);
-}
-
-const GVecGen2 cgt0_op[4] = {
-    { .fno = gen_helper_gvec_cgt0_b,
-      .fniv = gen_cgt0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_8 },
-    { .fno = gen_helper_gvec_cgt0_h,
-      .fniv = gen_cgt0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_16 },
-    { .fni4 = gen_cgt0_i32,
-      .fniv = gen_cgt0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_32 },
-    { .fni8 = gen_cgt0_i64,
-      .fniv = gen_cgt0_vec,
-      .opt_opc = vecop_list_cmp,
-      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-      .vece = MO_64 },
-};
+#undef GEN_CMP0
 
 static void gen_ssra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
 {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                     break;
 
                 case NEON_2RM_VCEQ0:
-                    tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size,
-                                   vec_size, &ceq0_op[size]);
+                    gen_gvec_ceq0(size, rd_ofs, rm_ofs, vec_size, vec_size);
                     break;
                 case NEON_2RM_VCGT0:
-                    tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size,
-                                   vec_size, &cgt0_op[size]);
+                    gen_gvec_cgt0(size, rd_ofs, rm_ofs, vec_size, vec_size);
                     break;
                 case NEON_2RM_VCLE0:
-                    tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size,
-                                   vec_size, &cle0_op[size]);
+                    gen_gvec_cle0(size, rd_ofs, rm_ofs, vec_size, vec_size);
                     break;
                 case NEON_2RM_VCGE0:
-                    tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size,
-                                   vec_size, &cge0_op[size]);
+                    gen_gvec_cge0(size, rd_ofs, rm_ofs, vec_size, vec_size);
                     break;
                 case NEON_2RM_VCLT0:
-                    tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size,
-                                   vec_size, &clt0_op[size]);
+                    gen_gvec_clt0(size, rd_ofs, rm_ofs, vec_size, vec_size);
                     break;
 
                 default:
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

Provide a functional interface for the vector expansion.
This fits better with the existing set of helpers that
we provide for other operations.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-8-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate.h          |   7 +-
 target/arm/translate-a64.c      |   4 +-
 target/arm/translate-neon.inc.c |  16 +----
 target/arm/translate.c          | 117 +++++++++++++++++---------------
 4 files changed, 71 insertions(+), 73 deletions(-)

From: Richard Henderson <richard.henderson@linaro.org>

Rather than perform the argument swap during code generation,
perform it during decode.  This means it doesn't have to be
special cased later, and we can share code with aarch64 code
generation.  Hopefully the comment in the decode file addresses any
confusion that the reversed field names might cause.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-9-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
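To make the effect of @3same_rev concrete, here is a rough standalone
sketch of the two field assignments. It assumes the standard A32 Neon
register field positions (D:Vd at bits 22 and 15:12, N:Vn at bits 7 and
19:16, M:Vm at bits 5 and 3:0). Arg3Same, decode_3same and
decode_3same_rev are hypothetical stand-ins for what decodetree
generates.

```c
#include <stdint.h>

typedef struct {
    int vd, vn, vm;
} Arg3Same;

/* Raw field extraction: one high bit plus a 4-bit register number. */
static int field_vd(uint32_t insn)
{
    return ((insn >> 18) & 0x10) | ((insn >> 12) & 0xf);
}

static int field_vn(uint32_t insn)
{
    return ((insn >> 3) & 0x10) | ((insn >> 16) & 0xf);
}

static int field_vm(uint32_t insn)
{
    return ((insn >> 1) & 0x10) | (insn & 0xf);
}

/* Usual order: each argument comes from the field of the same name. */
static Arg3Same decode_3same(uint32_t insn)
{
    Arg3Same a = { field_vd(insn), field_vn(insn), field_vm(insn) };
    return a;
}

/* Reversed order, as for the shifts: vn and vm trade fields. */
static Arg3Same decode_3same_rev(uint32_t insn)
{
    Arg3Same a = { field_vd(insn), field_vm(insn), field_vn(insn) };
    return a;
}
```

Under those assumptions, a word with Vd=0, Vn=2 and Vm=1 in the raw
fields (0xf2020401 in this sketch) decodes through decode_3same_rev to
vn=1, vm=2, so the value to be shifted arrives as the first source
operand without any swap in the trans_ function.
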
 target/arm/neon-dp.decode       | 17 +++++++++++++++--
 target/arm/translate-neon.inc.c |  3 +--
 2 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -XXX,XX +XXX,XX @@ VCGT_U_3s        1111 001 1 0 . .. .... .... 0011 . . . 0 .... @3same
 VCGE_S_3s        1111 001 0 0 . .. .... .... 0011 . . . 1 .... @3same
 VCGE_U_3s        1111 001 1 0 . .. .... .... 0011 . . . 1 .... @3same
 
-VSHL_S_3s        1111 001 0 0 . .. .... .... 0100 . . . 0 .... @3same
-VSHL_U_3s        1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same
+# The _rev suffix indicates that Vn and Vm are reversed. This is
+# the case for shifts. In the Arm ARM these insns are documented
+# with the Vm and Vn fields in their usual places, but in the
+# assembly the operands are listed "backwards", ie in the order
+# Dd, Dm, Dn where other insns use Dd, Dn, Dm. For QEMU we choose
+# to consider Vm and Vn as being in different fields in the insn,
+# which allows us to avoid special-casing shifts in the trans_
+# function code. We would otherwise need to manually swap the operands
+# over to call Neon helper functions that are shared with AArch64,
+# which does not have this odd reversed-operand situation.
+@3same_rev       .... ... . . . size:2 .... .... .... . q:1 . . .... \
+                 &3same vn=%vm_dp vm=%vn_dp vd=%vd_dp
+
+VSHL_S_3s        1111 001 0 0 . .. .... .... 0100 . . . 0 .... @3same_rev
+VSHL_U_3s        1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same_rev
 
 VMAX_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 0 .... @3same
 VMAX_U_3s        1111 001 1 0 . .. .... .... 0110 . . . 0 .... @3same
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
                                 uint32_t rn_ofs, uint32_t rm_ofs,       \
                                 uint32_t oprsz, uint32_t maxsz)         \
     {                                                                   \
-        /* Note the operation is vshl vd,vm,vn */                       \
-        tcg_gen_gvec_3(rd_ofs, rm_ofs, rn_ofs,                          \
+        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,                          \
                        oprsz, maxsz, &OPARRAY[vece]);                   \
     }                                                                   \
     DO_3SAME(INSN, gen_##INSN##_3s)
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

Provide a functional interface for the vector expansion.
This fits better with the existing set of helpers that
we provide for other operations.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-10-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
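The shape of the change, reduced to a toy: the per-element-size table
becomes a private static inside one function, and callers pass the
element size instead of indexing an exported array. TstOp and
cmtst_lane below are hypothetical and operate on a single lane held in
a uint64_t. They are not the GVecGen3 expansion itself.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint64_t lane_mask;   /* all-ones value for one element */
    unsigned vece;        /* log2 of the element size in bytes */
} TstOp;

/* Test bits: all-ones if (a & b) is nonzero within the lane, else 0. */
uint64_t cmtst_lane(unsigned vece, uint64_t a, uint64_t b)
{
    /* The table is an implementation detail, no longer exported. */
    static const TstOp ops[4] = {
        { .lane_mask = 0xffull,               .vece = 0 },
        { .lane_mask = 0xffffull,             .vece = 1 },
        { .lane_mask = 0xffffffffull,         .vece = 2 },
        { .lane_mask = 0xffffffffffffffffull, .vece = 3 },
    };
    const TstOp *op;

    assert(vece < 4);
    op = &ops[vece];
    return (a & b & op->lane_mask) ? op->lane_mask : 0;
}
```

Callers then only need the function prototype, which is what lets the
AArch32 and AArch64 front ends share it through translate.h.
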
 target/arm/translate.h          |  10 ++-
 target/arm/translate-a64.c      |  18 ++--
 target/arm/translate-neon.inc.c |  23 +----
 target/arm/translate.c          | 146 +++++++++++++++++---------------
 4 files changed, 95 insertions(+), 102 deletions(-)

diff --git a/target/arm/translate.h b/target/arm/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ void gen_gvec_mla(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 void gen_gvec_mls(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 
-extern const GVecGen3 cmtst_op[4];
-extern const GVecGen3 sshl_op[4];
-extern const GVecGen3 ushl_op[4];
+void gen_gvec_cmtst(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                    uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_ushl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+
 extern const GVecGen4 uqadd_op[4];
 extern const GVecGen4 sqadd_op[4];
 extern const GVecGen4 uqsub_op[4];
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void gen_gvec_fn4(DisasContext *s, bool is_q, int rd, int rn, int rm,
             is_q ? 16 : 8, vec_full_reg_size(s));
 }
 
-/* Expand a 3-operand AdvSIMD vector operation using an op descriptor.  */
-static void gen_gvec_op3(DisasContext *s, bool is_q, int rd,
-                         int rn, int rm, const GVecGen3 *gvec_op)
-{
-    tcg_gen_gvec_3(vec_full_reg_offset(s, rd), vec_full_reg_offset(s, rn),
-                   vec_full_reg_offset(s, rm), is_q ? 16 : 8,
-                   vec_full_reg_size(s), gvec_op);
-}
-
 /* Expand a 3-operand operation using an out-of-line helper.  */
 static void gen_gvec_op3_ool(DisasContext *s, bool is_q, int rd,
                              int rn, int rm, int data, gen_helper_gvec_3 *fn)
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
                        (u ? uqsub_op : sqsub_op) + size);
         return;
     case 0x08: /* SSHL, USHL */
-        gen_gvec_op3(s, is_q, rd, rn, rm,
-                     u ? &ushl_op[size] : &sshl_op[size]);
+        if (u) {
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_ushl, size);
+        } else {
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sshl, size);
+        }
         return;
     case 0x0c: /* SMAX, UMAX */
         if (u) {
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
         return;
     case 0x11:
         if (!u) { /* CMTST */
-            gen_gvec_op3(s, is_q, rd, rn, rm, &cmtst_op[size]);
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_cmtst, size);
             return;
         }
         /* else CMEQ */
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_3SAME(VBIC, tcg_gen_gvec_andc)
 DO_3SAME(VORR, tcg_gen_gvec_or)
 DO_3SAME(VORN, tcg_gen_gvec_orc)
 DO_3SAME(VEOR, tcg_gen_gvec_xor)
+DO_3SAME(VSHL_S, gen_gvec_sshl)
+DO_3SAME(VSHL_U, gen_gvec_ushl)
 
 /* These insns are all gvec_bitsel but with the inputs in various orders. */
 #define DO_3SAME_BITSEL(INSN, O1, O2, O3)                               \
@@ -XXX,XX +XXX,XX @@ DO_3SAME_NO_SZ_3(VMIN_U, tcg_gen_gvec_umin)
 DO_3SAME_NO_SZ_3(VMUL, tcg_gen_gvec_mul)
 DO_3SAME_NO_SZ_3(VMLA, gen_gvec_mla)
 DO_3SAME_NO_SZ_3(VMLS, gen_gvec_mls)
+DO_3SAME_NO_SZ_3(VTST, gen_gvec_cmtst)
 
 #define DO_3SAME_CMP(INSN, COND)                                        \
     static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
@@ -XXX,XX +XXX,XX @@ DO_3SAME_CMP(VCGE_S, TCG_COND_GE)
 DO_3SAME_CMP(VCGE_U, TCG_COND_GEU)
 DO_3SAME_CMP(VCEQ, TCG_COND_EQ)
 
-static void gen_VTST_3s(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
-                         uint32_t rm_ofs, uint32_t oprsz, uint32_t maxsz)
-{
-    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &cmtst_op[vece]);
-}
-DO_3SAME_NO_SZ_3(VTST, gen_VTST_3s)
-
 #define DO_3SAME_GVEC4(INSN, OPARRAY)                                   \
     static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
                                 uint32_t rn_ofs, uint32_t rm_ofs,       \
@@ -XXX,XX +XXX,XX @@ static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
     }
     return do_3same(s, a, gen_VMUL_p_3s);
 }
-
-#define DO_3SAME_GVEC3_SHIFT(INSN, OPARRAY)                             \
-    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
-                                uint32_t rn_ofs, uint32_t rm_ofs,       \
-                                uint32_t oprsz, uint32_t maxsz)         \
-    {                                                                   \
-        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,                          \
-                       oprsz, maxsz, &OPARRAY[vece]);                   \
-    }                                                                   \
-    DO_3SAME(INSN, gen_##INSN##_3s)
-
-DO_3SAME_GVEC3_SHIFT(VSHL_S, sshl_op)
-DO_3SAME_GVEC3_SHIFT(VSHL_U, ushl_op)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_cmtst_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
     tcg_gen_cmp_vec(TCG_COND_NE, vece, d, d, a);
 }
 
-static const TCGOpcode vecop_list_cmtst[] = { INDEX_op_cmp_vec, 0 };
-
-const GVecGen3 cmtst_op[4] = {
-    { .fni4 = gen_helper_neon_tst_u8,
-      .fniv = gen_cmtst_vec,
-      .opt_opc = vecop_list_cmtst,
-      .vece = MO_8 },
-    { .fni4 = gen_helper_neon_tst_u16,
-      .fniv = gen_cmtst_vec,
-      .opt_opc = vecop_list_cmtst,
-      .vece = MO_16 },
-    { .fni4 = gen_cmtst_i32,
-      .fniv = gen_cmtst_vec,
-      .opt_opc = vecop_list_cmtst,
-      .vece = MO_32 },
-    { .fni8 = gen_cmtst_i64,
-      .fniv = gen_cmtst_vec,
-      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-      .opt_opc = vecop_list_cmtst,
-      .vece = MO_64 },
-};
+void gen_gvec_cmtst(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                    uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = { INDEX_op_cmp_vec, 0 };
+    static const GVecGen3 ops[4] = {
+        { .fni4 = gen_helper_neon_tst_u8,
+          .fniv = gen_cmtst_vec,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fni4 = gen_helper_neon_tst_u16,
+          .fniv = gen_cmtst_vec,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fni4 = gen_cmtst_i32,
+          .fniv = gen_cmtst_vec,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+        { .fni8 = gen_cmtst_i64,
+          .fniv = gen_cmtst_vec,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .opt_opc = vecop_list,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
 
 void gen_ushl_i32(TCGv_i32 dst, TCGv_i32 src, TCGv_i32 shift)
 {
@@ -XXX,XX +XXX,XX @@ static void gen_ushl_vec(unsigned vece, TCGv_vec dst,
     tcg_temp_free_vec(rsh);
 }
 
-static const TCGOpcode ushl_list[] = {
-    INDEX_op_neg_vec, INDEX_op_shlv_vec,
-    INDEX_op_shrv_vec, INDEX_op_cmp_vec, 0
-};
-
-const GVecGen3 ushl_op[4] = {
-    { .fniv = gen_ushl_vec,
-      .fno = gen_helper_gvec_ushl_b,
-      .opt_opc = ushl_list,
-      .vece = MO_8 },
-    { .fniv = gen_ushl_vec,
-      .fno = gen_helper_gvec_ushl_h,
-      .opt_opc = ushl_list,
-      .vece = MO_16 },
-    { .fni4 = gen_ushl_i32,
-      .fniv = gen_ushl_vec,
-      .opt_opc = ushl_list,
-      .vece = MO_32 },
-    { .fni8 = gen_ushl_i64,
-      .fniv = gen_ushl_vec,
-      .opt_opc = ushl_list,
-      .vece = MO_64 },
-};
+void gen_gvec_ushl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_neg_vec, INDEX_op_shlv_vec,
+        INDEX_op_shrv_vec, INDEX_op_cmp_vec, 0
+    };
+    static const GVecGen3 ops[4] = {
+        { .fniv = gen_ushl_vec,
+          .fno = gen_helper_gvec_ushl_b,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fniv = gen_ushl_vec,
+          .fno = gen_helper_gvec_ushl_h,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fni4 = gen_ushl_i32,
+          .fniv = gen_ushl_vec,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+        { .fni8 = gen_ushl_i64,
+          .fniv = gen_ushl_vec,
+          .opt_opc = vecop_list,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
 
 void gen_sshl_i32(TCGv_i32 dst, TCGv_i32 src, TCGv_i32 shift)
 {
@@ -XXX,XX +XXX,XX @@ static void gen_sshl_vec(unsigned vece, TCGv_vec dst,
     tcg_temp_free_vec(tmp);
 }
 
-static const TCGOpcode sshl_list[] = {
-    INDEX_op_neg_vec, INDEX_op_umin_vec, INDEX_op_shlv_vec,
-    INDEX_op_sarv_vec, INDEX_op_cmp_vec, INDEX_op_cmpsel_vec, 0
-};
-
-const GVecGen3 sshl_op[4] = {
-    { .fniv = gen_sshl_vec,
-      .fno = gen_helper_gvec_sshl_b,
-      .opt_opc = sshl_list,
-      .vece = MO_8 },
-    { .fniv = gen_sshl_vec,
-      .fno = gen_helper_gvec_sshl_h,
-      .opt_opc = sshl_list,
-      .vece = MO_16 },
-    { .fni4 = gen_sshl_i32,
-      .fniv = gen_sshl_vec,
-      .opt_opc = sshl_list,
-      .vece = MO_32 },
-    { .fni8 = gen_sshl_i64,
-      .fniv = gen_sshl_vec,
-      .opt_opc = sshl_list,
-      .vece = MO_64 },
-};
+void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_neg_vec, INDEX_op_umin_vec, INDEX_op_shlv_vec,
+        INDEX_op_sarv_vec, INDEX_op_cmp_vec, INDEX_op_cmpsel_vec, 0
+    };
+    static const GVecGen3 ops[4] = {
+        { .fniv = gen_sshl_vec,
+          .fno = gen_helper_gvec_sshl_b,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fniv = gen_sshl_vec,
+          .fno = gen_helper_gvec_sshl_h,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fni4 = gen_sshl_i32,
+          .fniv = gen_sshl_vec,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+        { .fni8 = gen_sshl_i64,
+          .fniv = gen_sshl_vec,
+          .opt_opc = vecop_list,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
 
 static void gen_uqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
                           TCGv_vec a, TCGv_vec b)
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

Provide a functional interface for the vector expansion.
This fits better with the existing set of helpers that
we provide for other operations.
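
As a rough illustration of the shape of this change, here is a minimal
standalone sketch. It does not use TCG: ToyState, toy_expand_fn,
toy_uqadd8, toy_uqadd16 and toy_gvec_uqadd_qc are hypothetical names
invented for this example, and only the 8-bit and 16-bit element sizes
are modelled. The point is that the per-size table and the location of
the saturation flag become private to one function, so callers pass
only an element size and register offsets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy CPU state: a sticky saturation flag plus a flat register file. */
typedef struct {
    uint32_t qc;
    uint8_t regs[64];
} ToyState;

/* Hypothetical expander type, standing in for one GVecGen4 table entry. */
typedef void toy_expand_fn(uint8_t *d, uint32_t *qc, const uint8_t *a,
                           const uint8_t *b, size_t bytes);

/* Unsigned saturating add on 8-bit elements; sets *qc on saturation. */
static void toy_uqadd8(uint8_t *d, uint32_t *qc, const uint8_t *a,
                       const uint8_t *b, size_t bytes)
{
    for (size_t i = 0; i < bytes; i++) {
        unsigned sum = (unsigned)a[i] + b[i];
        if (sum > UINT8_MAX) {
            sum = UINT8_MAX;
            *qc = 1;
        }
        d[i] = (uint8_t)sum;
    }
}

/* Same operation on 16-bit elements, stored in host byte order. */
static void toy_uqadd16(uint8_t *d, uint32_t *qc, const uint8_t *a,
                        const uint8_t *b, size_t bytes)
{
    for (size_t i = 0; i + 2 <= bytes; i += 2) {
        uint16_t x, y, r;
        memcpy(&x, a + i, sizeof(x));
        memcpy(&y, b + i, sizeof(y));
        uint32_t sum = (uint32_t)x + y;
        if (sum > UINT16_MAX) {
            sum = UINT16_MAX;
            *qc = 1;
        }
        r = (uint16_t)sum;
        memcpy(d + i, &r, sizeof(r));
    }
}

/*
 * Functional interface: the table is a function-local static, and the
 * flag location is supplied here rather than by every caller.
 */
void toy_gvec_uqadd_qc(ToyState *st, unsigned vece, uint32_t rd_ofs,
                       uint32_t rn_ofs, uint32_t rm_ofs, uint32_t opr_sz)
{
    static toy_expand_fn * const ops[2] = { toy_uqadd8, toy_uqadd16 };

    assert(vece < 2);
    ops[vece](st->regs + rd_ofs, &st->qc,
              st->regs + rn_ofs, st->regs + rm_ofs, opr_sz);
}
```

With this shape a caller can hand the function itself to a generic
wrapper, which is what lets the Neon decoder use plain
DO_3SAME(VQADD_U, gen_gvec_uqadd_qc) instead of a dedicated macro.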

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-11-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate.h          |  13 +-
 target/arm/translate-a64.c      |  22 ++-
 target/arm/translate-neon.inc.c |  19 +--
 target/arm/translate.c          | 228 +++++++++++++++++---------------
 4 files changed, 147 insertions(+), 135 deletions(-)

diff --git a/target/arm/translate.h b/target/arm/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 void gen_gvec_ushl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
                    uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 
-extern const GVecGen4 uqadd_op[4];
-extern const GVecGen4 sqadd_op[4];
-extern const GVecGen4 uqsub_op[4];
-extern const GVecGen4 sqsub_op[4];
 void gen_cmtst_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
 void gen_ushl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b);
 void gen_sshl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b);
 void gen_ushl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
 void gen_sshl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
 
+void gen_gvec_uqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_sqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_uqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_sqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+
 void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
                    int64_t shift, uint32_t opr_sz, uint32_t max_sz);
 void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
 
     switch (opcode) {
     case 0x01: /* SQADD, UQADD */
-        tcg_gen_gvec_4(vec_full_reg_offset(s, rd),
-                       offsetof(CPUARMState, vfp.qc),
-                       vec_full_reg_offset(s, rn),
-                       vec_full_reg_offset(s, rm),
-                       is_q ? 16 : 8, vec_full_reg_size(s),
-                       (u ? uqadd_op : sqadd_op) + size);
+        if (u) {
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uqadd_qc, size);
+        } else {
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqadd_qc, size);
+        }
         return;
     case 0x05: /* SQSUB, UQSUB */
-        tcg_gen_gvec_4(vec_full_reg_offset(s, rd),
-                       offsetof(CPUARMState, vfp.qc),
-                       vec_full_reg_offset(s, rn),
-                       vec_full_reg_offset(s, rm),
-                       is_q ? 16 : 8, vec_full_reg_size(s),
-                       (u ? uqsub_op : sqsub_op) + size);
+        if (u) {
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uqsub_qc, size);
+        } else {
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqsub_qc, size);
+        }
         return;
     case 0x08: /* SSHL, USHL */
         if (u) {
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_3SAME(VORN, tcg_gen_gvec_orc)
 DO_3SAME(VEOR, tcg_gen_gvec_xor)
 DO_3SAME(VSHL_S, gen_gvec_sshl)
 DO_3SAME(VSHL_U, gen_gvec_ushl)
+DO_3SAME(VQADD_S, gen_gvec_sqadd_qc)
+DO_3SAME(VQADD_U, gen_gvec_uqadd_qc)
+DO_3SAME(VQSUB_S, gen_gvec_sqsub_qc)
+DO_3SAME(VQSUB_U, gen_gvec_uqsub_qc)
 
 /* These insns are all gvec_bitsel but with the inputs in various orders. */
 #define DO_3SAME_BITSEL(INSN, O1, O2, O3)                               \
@@ -XXX,XX +XXX,XX @@ DO_3SAME_CMP(VCGE_S, TCG_COND_GE)
 DO_3SAME_CMP(VCGE_U, TCG_COND_GEU)
 DO_3SAME_CMP(VCEQ, TCG_COND_EQ)
 
-#define DO_3SAME_GVEC4(INSN, OPARRAY)                                   \
-    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
-                                uint32_t rn_ofs, uint32_t rm_ofs,       \
-                                uint32_t oprsz, uint32_t maxsz)         \
-    {                                                                   \
-        tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),           \
-                       rn_ofs, rm_ofs, oprsz, maxsz, &OPARRAY[vece]);   \
-    }                                                                   \
-    DO_3SAME(INSN, gen_##INSN##_3s)
-
-DO_3SAME_GVEC4(VQADD_S, sqadd_op)
-DO_3SAME_GVEC4(VQADD_U, uqadd_op)
-DO_3SAME_GVEC4(VQSUB_S, sqsub_op)
-DO_3SAME_GVEC4(VQSUB_U, uqsub_op)
-
 static void gen_VMUL_p_3s(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
                            uint32_t rm_ofs, uint32_t oprsz, uint32_t maxsz)
 {
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_uqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
     tcg_temp_free_vec(x);
 }
 
-static const TCGOpcode vecop_list_uqadd[] = {
-    INDEX_op_usadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0
-};
-
-const GVecGen4 uqadd_op[4] = {
-    { .fniv = gen_uqadd_vec,
-      .fno = gen_helper_gvec_uqadd_b,
-      .write_aofs = true,
-      .opt_opc = vecop_list_uqadd,
-      .vece = MO_8 },
-    { .fniv = gen_uqadd_vec,
-      .fno = gen_helper_gvec_uqadd_h,
-      .write_aofs = true,
-      .opt_opc = vecop_list_uqadd,
-      .vece = MO_16 },
-    { .fniv = gen_uqadd_vec,
-      .fno = gen_helper_gvec_uqadd_s,
-      .write_aofs = true,
-      .opt_opc = vecop_list_uqadd,
-      .vece = MO_32 },
-    { .fniv = gen_uqadd_vec,
-      .fno = gen_helper_gvec_uqadd_d,
-      .write_aofs = true,
-      .opt_opc = vecop_list_uqadd,
-      .vece = MO_64 },
-};
+void gen_gvec_uqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_usadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0
+    };
+    static const GVecGen4 ops[4] = {
+        { .fniv = gen_uqadd_vec,
+          .fno = gen_helper_gvec_uqadd_b,
+          .write_aofs = true,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fniv = gen_uqadd_vec,
+          .fno = gen_helper_gvec_uqadd_h,
+          .write_aofs = true,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fniv = gen_uqadd_vec,
+          .fno = gen_helper_gvec_uqadd_s,
+          .write_aofs = true,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+        { .fniv = gen_uqadd_vec,
+          .fno = gen_helper_gvec_uqadd_d,
+          .write_aofs = true,
+          .opt_opc = vecop_list,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
+                   rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
 
 static void gen_sqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
                           TCGv_vec a, TCGv_vec b)
@@ -XXX,XX +XXX,XX @@ static void gen_sqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
     tcg_temp_free_vec(x);
 }
 
-static const TCGOpcode vecop_list_sqadd[] = {
-    INDEX_op_ssadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0
-};
-
-const GVecGen4 sqadd_op[4] = {
-    { .fniv = gen_sqadd_vec,
-      .fno = gen_helper_gvec_sqadd_b,
-      .opt_opc = vecop_list_sqadd,
-      .write_aofs = true,
-      .vece = MO_8 },
-    { .fniv = gen_sqadd_vec,
-      .fno = gen_helper_gvec_sqadd_h,
-      .opt_opc = vecop_list_sqadd,
-      .write_aofs = true,
-      .vece = MO_16 },
-    { .fniv = gen_sqadd_vec,
-      .fno = gen_helper_gvec_sqadd_s,
-      .opt_opc = vecop_list_sqadd,
-      .write_aofs = true,
-      .vece = MO_32 },
-    { .fniv = gen_sqadd_vec,
-      .fno = gen_helper_gvec_sqadd_d,
-      .opt_opc = vecop_list_sqadd,
-      .write_aofs = true,
-      .vece = MO_64 },
-};
+void gen_gvec_sqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_ssadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0
+    };
+    static const GVecGen4 ops[4] = {
+        { .fniv = gen_sqadd_vec,
+          .fno = gen_helper_gvec_sqadd_b,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_8 },
+        { .fniv = gen_sqadd_vec,
+          .fno = gen_helper_gvec_sqadd_h,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_16 },
+        { .fniv = gen_sqadd_vec,
+          .fno = gen_helper_gvec_sqadd_s,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_32 },
+        { .fniv = gen_sqadd_vec,
+          .fno = gen_helper_gvec_sqadd_d,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
+                   rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
 
 static void gen_uqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
                           TCGv_vec a, TCGv_vec b)
@@ -XXX,XX +XXX,XX @@ static void gen_uqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
     tcg_temp_free_vec(x);
 }
 
-static const TCGOpcode vecop_list_uqsub[] = {
-    INDEX_op_ussub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0
-};
-
-const GVecGen4 uqsub_op[4] = {
-    { .fniv = gen_uqsub_vec,
-      .fno = gen_helper_gvec_uqsub_b,
-      .opt_opc = vecop_list_uqsub,
-      .write_aofs = true,
-      .vece = MO_8 },
-    { .fniv = gen_uqsub_vec,
-      .fno = gen_helper_gvec_uqsub_h,
-      .opt_opc = vecop_list_uqsub,
-      .write_aofs = true,
-      .vece = MO_16 },
-    { .fniv = gen_uqsub_vec,
-      .fno = gen_helper_gvec_uqsub_s,
-      .opt_opc = vecop_list_uqsub,
-      .write_aofs = true,
-      .vece = MO_32 },
-    { .fniv = gen_uqsub_vec,
-      .fno = gen_helper_gvec_uqsub_d,
-      .opt_opc = vecop_list_uqsub,
-      .write_aofs = true,
-      .vece = MO_64 },
-};
+void gen_gvec_uqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_ussub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0
+    };
+    static const GVecGen4 ops[4] = {
+        { .fniv = gen_uqsub_vec,
+          .fno = gen_helper_gvec_uqsub_b,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_8 },
+        { .fniv = gen_uqsub_vec,
+          .fno = gen_helper_gvec_uqsub_h,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_16 },
+        { .fniv = gen_uqsub_vec,
+          .fno = gen_helper_gvec_uqsub_s,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_32 },
+        { .fniv = gen_uqsub_vec,
+          .fno = gen_helper_gvec_uqsub_d,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
+                   rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
 
 static void gen_sqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
                           TCGv_vec a, TCGv_vec b)
@@ -XXX,XX +XXX,XX @@ static void gen_sqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
     tcg_temp_free_vec(x);
 }
 
-static const TCGOpcode vecop_list_sqsub[] = {
-    INDEX_op_sssub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0
-};
-
-const GVecGen4 sqsub_op[4] = {
-    { .fniv = gen_sqsub_vec,
-      .fno = gen_helper_gvec_sqsub_b,
-      .opt_opc = vecop_list_sqsub,
-      .write_aofs = true,
-      .vece = MO_8 },
-    { .fniv = gen_sqsub_vec,
-      .fno = gen_helper_gvec_sqsub_h,
-      .opt_opc = vecop_list_sqsub,
-      .write_aofs = true,
-      .vece = MO_16 },
-    { .fniv = gen_sqsub_vec,
-      .fno = gen_helper_gvec_sqsub_s,
-      .opt_opc = vecop_list_sqsub,
-      .write_aofs = true,
-      .vece = MO_32 },
-    { .fniv = gen_sqsub_vec,
-      .fno = gen_helper_gvec_sqsub_d,
-      .opt_opc = vecop_list_sqsub,
-      .write_aofs = true,
-      .vece = MO_64 },
-};
+void gen_gvec_sqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_sssub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0
+    };
+    static const GVecGen4 ops[4] = {
+        { .fniv = gen_sqsub_vec,
+          .fno = gen_helper_gvec_sqsub_b,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_8 },
+        { .fniv = gen_sqsub_vec,
+          .fno = gen_helper_gvec_sqsub_h,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_16 },
+        { .fniv = gen_sqsub_vec,
+          .fno = gen_helper_gvec_sqsub_s,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_32 },
+        { .fniv = gen_sqsub_vec,
+          .fno = gen_helper_gvec_sqsub_d,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
+                   rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
 
 /* Translate a NEON data processing instruction.  Return nonzero if the
    instruction is invalid.
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

These operations do not touch fp_status.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-12-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.h        |  4 ++--
 target/arm/translate-a64.c |  5 ++---
 target/arm/translate.c     | 12 ++----------
 target/arm/vfp_helper.c    |  5 ++---
 4 files changed, 8 insertions(+), 18 deletions(-)

From: Richard Henderson <richard.henderson@linaro.org>

Provide a functional interface for the vector expansion.
This fits better with the existing set of helpers that
we provide for other operations.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-13-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate.h     |  5 ++++
 target/arm/translate-a64.c | 34 ++----------------------
 target/arm/translate.c     | 54 +++++++++++++++++++-------------------
 3 files changed, 34 insertions(+), 59 deletions(-)

From: Richard Henderson <richard.henderson@linaro.org>

Pass a pointer directly to env->vfp.qc[0], rather than env.
This will allow SVE2, which does not modify QC, to pass a
pointer to dummy storage.

Change the return type of inl_qrdmlah_s16 and inl_qrdmlsh_s16
from uint16_t to int16_t, to match the sense of the operation:
signed.
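
The effect of passing the flag by pointer can be sketched outside of
QEMU as follows. This is a simplified model, not the helper code
itself: ToyEnv, toy_qrdmlah_s16 and toy_vec_qrdmlah_s16 are
hypothetical names, the left shift of src3 is written as a
multiplication to stay within portable C, and the saturation test uses
an explicit range check.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the part of the CPU state that holds the QC flag. */
typedef struct {
    uint32_t qc[4];
} ToyEnv;

/*
 * Signed saturating rounding doubling multiply-accumulate high half,
 * 16-bit. The function no longer needs the CPU state; it records
 * saturation through whatever storage the caller supplies.
 */
static int16_t toy_qrdmlah_s16(int16_t src1, int16_t src2,
                               int16_t src3, uint32_t *sat)
{
    int32_t ret = (int32_t)src1 * src2;

    ret = (int32_t)src3 * 32768 + ret + (1 << 14);
    ret >>= 15;   /* assumes an arithmetic right shift for negative values */
    if (ret > INT16_MAX || ret < INT16_MIN) {
        *sat = 1;
        ret = ret < 0 ? INT16_MIN : INT16_MAX;
    }
    return (int16_t)ret;
}

/* Vector form: 'sat' may point at the real flag or at dummy storage. */
void toy_vec_qrdmlah_s16(int16_t *d, const int16_t *n, const int16_t *m,
                         size_t count, uint32_t *sat)
{
    for (size_t i = 0; i < count; i++) {
        d[i] = toy_qrdmlah_s16(n[i], m[i], d[i], sat);
    }
}
```

A caller that wants the architectural flag updated passes
&env.qc[0]; a caller that must leave it alone, as described above for
SVE2, can pass the address of a local uint32_t and discard it.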

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-14-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate.c  | 18 ++++++++---
 target/arm/vec_helper.c | 70 +++++++++++++++++++++++------------------
 2 files changed, 54 insertions(+), 34 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static const uint8_t neon_2rm_sizes[] = {
     [NEON_2RM_VCVT_UF] = 0x4,
 };
 
+static void gen_gvec_fn3_qc(uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs,
+                            uint32_t opr_sz, uint32_t max_sz,
+                            gen_helper_gvec_3_ptr *fn)
+{
+    TCGv_ptr qc_ptr = tcg_temp_new_ptr();
+
+    tcg_gen_addi_ptr(qc_ptr, cpu_env, offsetof(CPUARMState, vfp.qc));
+    tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, qc_ptr,
+                       opr_sz, max_sz, 0, fn);
+    tcg_temp_free_ptr(qc_ptr);
+}
+
 void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
                           uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
 {
@@ -XXX,XX +XXX,XX @@ void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
         gen_helper_gvec_qrdmlah_s16, gen_helper_gvec_qrdmlah_s32
     };
     tcg_debug_assert(vece >= 1 && vece <= 2);
-    tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, cpu_env,
-                       opr_sz, max_sz, 0, fns[vece - 1]);
+    gen_gvec_fn3_qc(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, fns[vece - 1]);
 }
 
 void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
@@ -XXX,XX +XXX,XX @@ void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
         gen_helper_gvec_qrdmlsh_s16, gen_helper_gvec_qrdmlsh_s32
     };
     tcg_debug_assert(vece >= 1 && vece <= 2);
-    tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, cpu_env,
-                       opr_sz, max_sz, 0, fns[vece - 1]);
+    gen_gvec_fn3_qc(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, fns[vece - 1]);
 }
 
 #define GEN_CMP0(NAME, COND)                                            \
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -XXX,XX +XXX,XX @@
 #define H4(x)  (x)
 #endif
 
-#define SET_QC() env->vfp.qc[0] = 1
-
 static void clear_tail(void *vd, uintptr_t opr_sz, uintptr_t max_sz)
 {
     uint64_t *d = vd + opr_sz;
@@ -XXX,XX +XXX,XX @@ static void clear_tail(void *vd, uintptr_t opr_sz, uintptr_t max_sz)
 }
 
 /* Signed saturating rounding doubling multiply-accumulate high half, 16-bit */
-static uint16_t inl_qrdmlah_s16(CPUARMState *env, int16_t src1,
-                                int16_t src2, int16_t src3)
+static int16_t inl_qrdmlah_s16(int16_t src1, int16_t src2,
+                               int16_t src3, uint32_t *sat)
 {
     /* Simplify:
      * = ((a3 << 16) + ((e1 * e2) << 1) + (1 << 15)) >> 16
@@ -XXX,XX +XXX,XX @@ static uint16_t inl_qrdmlah_s16(CPUARMState *env, int16_t src1,
     ret = ((int32_t)src3 << 15) + ret + (1 << 14);
     ret >>= 15;
     if (ret != (int16_t)ret) {
-        SET_QC();
+        *sat = 1;
         ret = (ret < 0 ? -0x8000 : 0x7fff);
     }
     return ret;
@@ -XXX,XX +XXX,XX @@ static uint16_t inl_qrdmlah_s16(CPUARMState *env, int16_t src1,
 uint32_t HELPER(neon_qrdmlah_s16)(CPUARMState *env, uint32_t src1,
                                   uint32_t src2, uint32_t src3)
 {
-    uint16_t e1 = inl_qrdmlah_s16(env, src1, src2, src3);
-    uint16_t e2 = inl_qrdmlah_s16(env, src1 >> 16, src2 >> 16, src3 >> 16);
+    uint32_t *sat = &env->vfp.qc[0];
+    uint16_t e1 = inl_qrdmlah_s16(src1, src2, src3, sat);
+    uint16_t e2 = inl_qrdmlah_s16(src1 >> 16, src2 >> 16, src3 >> 16, sat);
     return deposit32(e1, 16, 16, e2);
 }
 
 void HELPER(gvec_qrdmlah_s16)(void *vd, void *vn, void *vm,
-                              void *ve, uint32_t desc)
+                              void *vq, uint32_t desc)
 {
     uintptr_t opr_sz = simd_oprsz(desc);
     int16_t *d = vd;
     int16_t *n = vn;
     int16_t *m = vm;
-    CPUARMState *env = ve;
     uintptr_t i;
 
     for (i = 0; i < opr_sz / 2; ++i) {
-        d[i] = inl_qrdmlah_s16(env, n[i], m[i], d[i]);
+        d[i] = inl_qrdmlah_s16(n[i], m[i], d[i], vq);
     }
     clear_tail(d, opr_sz, simd_maxsz(desc));
 }
 
 /* Signed saturating rounding doubling multiply-subtract high half, 16-bit */
-static uint16_t inl_qrdmlsh_s16(CPUARMState *env, int16_t src1,
-                                int16_t src2, int16_t src3)
+static int16_t inl_qrdmlsh_s16(int16_t src1, int16_t src2,
+                               int16_t src3, uint32_t *sat)
 {
     /* Similarly, using subtraction:
      * = ((a3 << 16) - ((e1 * e2) << 1) + (1 << 15)) >> 16
@@ -XXX,XX +XXX,XX @@ static uint16_t inl_qrdmlsh_s16(CPUARMState *env, int16_t src1,
     ret = ((int32_t)src3 << 15) - ret + (1 << 14);
     ret >>= 15;
     if (ret != (int16_t)ret) {
-        SET_QC();
+        *sat = 1;
         ret = (ret < 0 ? -0x8000 : 0x7fff);
     }
     return ret;
@@ -XXX,XX +XXX,XX @@ static uint16_t inl_qrdmlsh_s16(CPUARMState *env, int16_t src1,
 uint32_t HELPER(neon_qrdmlsh_s16)(CPUARMState *env, uint32_t src1,
                                   uint32_t src2, uint32_t src3)
 {
-    uint16_t e1 = inl_qrdmlsh_s16(env, src1, src2, src3);
-    uint16_t e2 = inl_qrdmlsh_s16(env, src1 >> 16, src2 >> 16, src3 >> 16);
+    uint32_t *sat = &env->vfp.qc[0];
+    uint16_t e1 = inl_qrdmlsh_s16(src1, src2, src3, sat);
+    uint16_t e2 = inl_qrdmlsh_s16(src1 >> 16, src2 >> 16, src3 >> 16, sat);
     return deposit32(e1, 16, 16, e2);
 }
 
 void HELPER(gvec_qrdmlsh_s16)(void *vd, void *vn, void *vm,
-                              void *ve, uint32_t desc)
+                              void *vq, uint32_t desc)
 {
     uintptr_t opr_sz = simd_oprsz(desc);
     int16_t *d = vd;
     int16_t *n = vn;
     int16_t *m = vm;
-    CPUARMState *env = ve;
     uintptr_t i;
 
     for (i = 0; i < opr_sz / 2; ++i) {
-        d[i] = inl_qrdmlsh_s16(env, n[i], m[i], d[i]);
+        d[i] = inl_qrdmlsh_s16(n[i], m[i], d[i], vq);
     }
     clear_tail(d, opr_sz, simd_maxsz(desc));
 }
 
 /* Signed saturating rounding doubling multiply-accumulate high half, 32-bit */
-uint32_t HELPER(neon_qrdmlah_s32)(CPUARMState *env, int32_t src1,
-                                  int32_t src2, int32_t src3)
+static int32_t inl_qrdmlah_s32(int32_t src1, int32_t src2,
+                               int32_t src3, uint32_t *sat)
 {
     /* Simplify similarly to int_qrdmlah_s16 above.  */
     int64_t ret = (int64_t)src1 * src2;
     ret = ((int64_t)src3 << 31) + ret + (1 << 30);
     ret >>= 31;
     if (ret != (int32_t)ret) {
-        SET_QC();
+        *sat = 1;
         ret = (ret < 0 ? INT32_MIN : INT32_MAX);
     }
     return ret;
 }
 
+uint32_t HELPER(neon_qrdmlah_s32)(CPUARMState *env, int32_t src1,
+                                  int32_t src2, int32_t src3)
+{
+    uint32_t *sat = &env->vfp.qc[0];
+    return inl_qrdmlah_s32(src1, src2, src3, sat);
+}
+
 void HELPER(gvec_qrdmlah_s32)(void *vd, void *vn, void *vm,
-                              void *ve, uint32_t desc)
+                              void *vq, uint32_t desc)
 {
     uintptr_t opr_sz = simd_oprsz(desc);
     int32_t *d = vd;
     int32_t *n = vn;
     int32_t *m = vm;
-    CPUARMState *env = ve;
     uintptr_t i;
 
     for (i = 0; i < opr_sz / 4; ++i) {
-        d[i] = helper_neon_qrdmlah_s32(env, n[i], m[i], d[i]);
+        d[i] = inl_qrdmlah_s32(n[i], m[i], d[i], vq);
     }
     clear_tail(d, opr_sz, simd_maxsz(desc));
 }
 
 /* Signed saturating rounding doubling multiply-subtract high half, 32-bit */
-uint32_t HELPER(neon_qrdmlsh_s32)(CPUARMState *env, int32_t src1,
-                                  int32_t src2, int32_t src3)
+static int32_t inl_qrdmlsh_s32(int32_t src1, int32_t src2,
+                               int32_t src3, uint32_t *sat)
 {
     /* Simplify similarly to int_qrdmlsh_s16 above.  */
     int64_t ret = (int64_t)src1 * src2;
     ret = ((int64_t)src3 << 31) - ret + (1 << 30);
     ret >>= 31;
     if (ret != (int32_t)ret) {
-        SET_QC();
+        *sat = 1;
         ret = (ret < 0 ? INT32_MIN : INT32_MAX);
     }
     return ret;
 }
 
+uint32_t HELPER(neon_qrdmlsh_s32)(CPUARMState *env, int32_t src1,
+                                  int32_t src2, int32_t src3)
+{
+    uint32_t *sat = &env->vfp.qc[0];
+    return inl_qrdmlsh_s32(src1, src2, src3, sat);
+}
+
 void HELPER(gvec_qrdmlsh_s32)(void *vd, void *vn, void *vm,
-                              void *ve, uint32_t desc)
+                              void *vq, uint32_t desc)
 {
     uintptr_t opr_sz = simd_oprsz(desc);
     int32_t *d = vd;
     int32_t *n = vn;
     int32_t *m = vm;
-    CPUARMState *env = ve;
     uintptr_t i;
 
     for (i = 0; i < opr_sz / 4; ++i) {
-        d[i] = helper_neon_qrdmlsh_s32(env, n[i], m[i], d[i]);
+        d[i] = inl_qrdmlsh_s32(n[i], m[i], d[i], vq);
     }
     clear_tail(d, opr_sz, simd_maxsz(desc));
 }
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

The indexed multiply helpers (DO_MUL_IDX and DO_FMLA_IDX) must clear
the tail of the vector register for AdvSIMD when SVE is enabled.
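
To make the requirement concrete, here is a small standalone sketch of
what clearing the tail means. It is an assumption-laden model rather
than the helper in vec_helper.c: toy_clear_tail and toy_fmul_idx are
hypothetical names, plain float stands in for the softfloat types, and
the indexed operand is simplified to a single element for the whole
vector.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Zero the bytes of the destination between opr_sz and max_sz. */
static void toy_clear_tail(void *vd, size_t opr_sz, size_t max_sz)
{
    if (max_sz > opr_sz) {
        memset((uint8_t *)vd + opr_sz, 0, max_sz - opr_sz);
    }
}

/*
 * Multiply the first opr_sz bytes of n by one selected element of m.
 * opr_sz is the width the instruction operates on (16 for a 128-bit
 * AdvSIMD operation); max_sz is the full register width, which is
 * larger when longer vectors share the same register file.
 */
void toy_fmul_idx(float *d, const float *n, const float *m, size_t idx,
                  size_t opr_sz, size_t max_sz)
{
    float mm = m[idx];

    for (size_t i = 0; i < opr_sz / sizeof(float); i++) {
        d[i] = n[i] * mm;
    }
    /* Without this call, stale data would survive above opr_sz. */
    toy_clear_tail(d, opr_sz, max_sz);
}
```

With opr_sz = 16 and max_sz = 32, the four low elements hold products
and the four high elements read back as zero, regardless of what the
destination contained before.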

Fixes: ca40a6e6e39
Cc: qemu-stable@nongnu.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-15-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/vec_helper.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
             d[i + j] = TYPE##_mul(n[i + j], mm, stat);                     \
         }                                                                  \
     }                                                                      \
+    clear_tail(d, oprsz, simd_maxsz(desc));                                \
 }
 
 DO_MUL_IDX(gvec_fmul_idx_h, float16, H2)
@@ -XXX,XX +XXX,XX @@ void HELPER(NAME)(void *vd, void *vn, void *vm, void *va,                  \
                                      mm, a[i + j], 0, stat);               \
         }                                                                  \
     }                                                                      \
+    clear_tail(d, oprsz, simd_maxsz(desc));                                \
 }
 
 DO_FMLA_IDX(gvec_fmla_idx_h, float16, H2)
-- 
2.20.1
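
The fix is easier to follow with a small model of what tail clearing does. The sketch below is an illustration under stated assumptions, not QEMU's clear_tail: clear_tail_sketch is a hypothetical name, both sizes are in bytes, opr_sz is the part the operation wrote, and max_sz is the full register size.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical model: zero the bytes of a vector register that lie
 * beyond the operation size, up to the full register size.
 */
static void clear_tail_sketch(void *vd, size_t opr_sz, size_t max_sz)
{
    if (max_sz > opr_sz) {
        memset((uint8_t *)vd + opr_sz, 0, max_sz - opr_sz);
    }
}
```

For a 16-byte AdvSIMD operation on a 32-byte SVE register, this zeroes bytes 16..31 and leaves the 16 result bytes untouched.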

From: Richard Henderson <richard.henderson@linaro.org>

Include 64-bit element size in preparation for SVE2.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-16-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.h        |  10 +++
 target/arm/translate.h     |   5 ++
 target/arm/translate-a64.c |   8 ++-
 target/arm/translate.c     | 133 ++++++++++++++++++++++++++++++++++++-
 target/arm/vec_helper.c    |  24 +++++++
 5 files changed, 176 insertions(+), 4 deletions(-)

From: Richard Henderson <richard.henderson@linaro.org>

Include 64-bit element size in preparation for SVE2.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-17-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.h        |  17 +++--
 target/arm/translate.h     |   5 ++
 target/arm/neon_helper.c   |  10 ---
 target/arm/translate-a64.c |  17 ++---
 target/arm/translate.c     | 134 +++++++++++++++++++++++++++++++++++--
 target/arm/vec_helper.c    |  24 +++++++
 6 files changed, 174 insertions(+), 33 deletions(-)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_2(neon_pmax_s8, i32, i32, i32)
 DEF_HELPER_2(neon_pmax_u16, i32, i32, i32)
 DEF_HELPER_2(neon_pmax_s16, i32, i32, i32)
 
-DEF_HELPER_2(neon_abd_u8, i32, i32, i32)
-DEF_HELPER_2(neon_abd_s8, i32, i32, i32)
-DEF_HELPER_2(neon_abd_u16, i32, i32, i32)
-DEF_HELPER_2(neon_abd_s16, i32, i32, i32)
-DEF_HELPER_2(neon_abd_u32, i32, i32, i32)
-DEF_HELPER_2(neon_abd_s32, i32, i32, i32)
-
 DEF_HELPER_2(neon_shl_u16, i32, i32, i32)
 DEF_HELPER_2(neon_shl_s16, i32, i32, i32)
 DEF_HELPER_2(neon_rshl_u8, i32, i32, i32)
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(gvec_uabd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(gvec_uabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(gvec_uabd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_4(gvec_saba_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_saba_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_saba_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_saba_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(gvec_uaba_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_uaba_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_uaba_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_uaba_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
 #ifdef TARGET_AARCH64
 #include "helper-a64.h"
 #include "helper-sve.h"
diff --git a/target/arm/translate.h b/target/arm/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ void gen_gvec_sabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
                    uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 
+void gen_gvec_saba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_uaba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+
 /*
  * Forward to the isar_feature_* tests given a DisasContext pointer.
  */
diff --git a/target/arm/neon_helper.c b/target/arm/neon_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/neon_helper.c
+++ b/target/arm/neon_helper.c
@@ -XXX,XX +XXX,XX @@ NEON_POP(pmax_s16, neon_s16, 2)
 NEON_POP(pmax_u16, neon_u16, 2)
 #undef NEON_FN
 
-#define NEON_FN(dest, src1, src2) \
-    dest = (src1 > src2) ? (src1 - src2) : (src2 - src1)
-NEON_VOP(abd_s8, neon_s8, 4)
-NEON_VOP(abd_u8, neon_u8, 4)
-NEON_VOP(abd_s16, neon_s16, 2)
-NEON_VOP(abd_u16, neon_u16, 2)
-NEON_VOP(abd_s32, neon_s32, 1)
-NEON_VOP(abd_u32, neon_u32, 1)
-#undef NEON_FN
-
 #define NEON_FN(dest, src1, src2) do { \
     int8_t tmp; \
     tmp = (int8_t)src2; \
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
             gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sabd, size);
         }
         return;
+    case 0xf: /* SABA, UABA */
+        if (u) {
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uaba, size);
+        } else {
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_saba, size);
+        }
+        return;
     case 0x10: /* ADD, SUB */
         if (u) {
             gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_sub, size);
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
                 genenvfn = fns[size][u];
                 break;
             }
-            case 0xf: /* SABA, UABA */
-            {
-                static NeonGenTwoOpFn * const fns[3][2] = {
-                    { gen_helper_neon_abd_s8, gen_helper_neon_abd_u8 },
-                    { gen_helper_neon_abd_s16, gen_helper_neon_abd_u16 },
-                    { gen_helper_neon_abd_s32, gen_helper_neon_abd_u32 },
-                };
-                genfn = fns[size][u];
-                break;
-            }
             case 0x16: /* SQDMULH, SQRDMULH */
             {
                 static NeonGenTwoOpEnvFn * const fns[2][2] = {
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
     tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 }
 
+static void gen_saba_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
+{
+    TCGv_i32 t = tcg_temp_new_i32();
+    gen_sabd_i32(t, a, b);
+    tcg_gen_add_i32(d, d, t);
+    tcg_temp_free_i32(t);
+}
+
+static void gen_saba_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+    gen_sabd_i64(t, a, b);
+    tcg_gen_add_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_saba_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
+{
+    TCGv_vec t = tcg_temp_new_vec_matching(d);
+    gen_sabd_vec(vece, t, a, b);
+    tcg_gen_add_vec(vece, d, d, t);
+    tcg_temp_free_vec(t);
+}
+
+void gen_gvec_saba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_sub_vec, INDEX_op_add_vec,
+        INDEX_op_smin_vec, INDEX_op_smax_vec, 0
+    };
+    static const GVecGen3 ops[4] = {
+        { .fniv = gen_saba_vec,
+          .fno = gen_helper_gvec_saba_b,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_8 },
+        { .fniv = gen_saba_vec,
+          .fno = gen_helper_gvec_saba_h,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_16 },
+        { .fni4 = gen_saba_i32,
+          .fniv = gen_saba_vec,
+          .fno = gen_helper_gvec_saba_s,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_32 },
+        { .fni8 = gen_saba_i64,
+          .fniv = gen_saba_vec,
+          .fno = gen_helper_gvec_saba_d,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
+
+static void gen_uaba_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
+{
+    TCGv_i32 t = tcg_temp_new_i32();
+    gen_uabd_i32(t, a, b);
+    tcg_gen_add_i32(d, d, t);
+    tcg_temp_free_i32(t);
+}
+
+static void gen_uaba_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+    gen_uabd_i64(t, a, b);
+    tcg_gen_add_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_uaba_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
+{
+    TCGv_vec t = tcg_temp_new_vec_matching(d);
+    gen_uabd_vec(vece, t, a, b);
+    tcg_gen_add_vec(vece, d, d, t);
+    tcg_temp_free_vec(t);
+}
+
+void gen_gvec_uaba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_sub_vec, INDEX_op_add_vec,
+        INDEX_op_umin_vec, INDEX_op_umax_vec, 0
+    };
+    static const GVecGen3 ops[4] = {
+        { .fniv = gen_uaba_vec,
+          .fno = gen_helper_gvec_uaba_b,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_8 },
+        { .fniv = gen_uaba_vec,
+          .fno = gen_helper_gvec_uaba_h,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_16 },
+        { .fni4 = gen_uaba_i32,
+          .fniv = gen_uaba_vec,
+          .fno = gen_helper_gvec_uaba_s,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_32 },
+        { .fni8 = gen_uaba_i64,
+          .fniv = gen_uaba_vec,
+          .fno = gen_helper_gvec_uaba_d,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
+
 /* Translate a NEON data processing instruction.  Return nonzero if the
    instruction is invalid.
    We process data in a mixture of 32-bit and 64-bit chunks.
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
             }
             return 0;
 
+        case NEON_3R_VABA:
+            if (u) {
+                gen_gvec_uaba(size, rd_ofs, rn_ofs, rm_ofs,
+                              vec_size, vec_size);
+            } else {
+                gen_gvec_saba(size, rd_ofs, rn_ofs, rm_ofs,
+                              vec_size, vec_size);
+            }
+            return 0;
+
         case NEON_3R_VADD_VSUB:
         case NEON_3R_LOGIC:
         case NEON_3R_VMAX:
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
         case NEON_3R_VQRSHL:
             GEN_NEON_INTEGER_OP_ENV(qrshl);
             break;
-        case NEON_3R_VABA:
-            GEN_NEON_INTEGER_OP(abd);
-            tcg_temp_free_i32(tmp2);
-            tmp2 = neon_load_reg(rd, pass);
-            gen_neon_add(size, tmp, tmp2);
-            break;
         case NEON_3R_VPMAX:
             GEN_NEON_INTEGER_OP(pmax);
             break;
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -XXX,XX +XXX,XX @@ DO_ABD(gvec_uabd_s, uint32_t)
 DO_ABD(gvec_uabd_d, uint64_t)
 
 #undef DO_ABD
+
+#define DO_ABA(NAME, TYPE)                                      \
+void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc)  \
+{                                                               \
+    intptr_t i, opr_sz = simd_oprsz(desc);                      \
+    TYPE *d = vd, *n = vn, *m = vm;                             \
+                                                                \
+    for (i = 0; i < opr_sz / sizeof(TYPE); ++i) {               \
+        d[i] += n[i] < m[i] ? m[i] - n[i] : n[i] - m[i];        \
+    }                                                           \
+    clear_tail(d, opr_sz, simd_maxsz(desc));                    \
+}
+
+DO_ABA(gvec_saba_b, int8_t)
+DO_ABA(gvec_saba_h, int16_t)
+DO_ABA(gvec_saba_s, int32_t)
+DO_ABA(gvec_saba_d, int64_t)
+
+DO_ABA(gvec_uaba_b, uint8_t)
+DO_ABA(gvec_uaba_h, uint16_t)
+DO_ABA(gvec_uaba_s, uint32_t)
+DO_ABA(gvec_uaba_d, uint64_t)
+
+#undef DO_ABA
-- 
2.20.1
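
The DO_ABA helper above computes, per element, d[i] += |n[i] - m[i]|. A scalar model for unsigned byte lanes is sketched below; uaba_u8_sketch is a hypothetical name and the function takes an element count rather than a gvec descriptor, so it is a simplification rather than the helper itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of UABA on byte elements. */
static void uaba_u8_sketch(uint8_t *d, const uint8_t *n, const uint8_t *m,
                           size_t count)
{
    for (size_t i = 0; i < count; i++) {
        uint8_t diff = n[i] < m[i] ? m[i] - n[i] : n[i] - m[i];
        /* The accumulate wraps modulo 256, as an 8-bit lane would. */
        d[i] = (uint8_t)(d[i] + diff);
    }
}
```

Because the destination is also an input, the GVecGen3 tables in the patch set .load_dest = true.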

From: Patrick Williams <patrick@stwcx.xyz>

Sonora Pass is a 2-socket x86 motherboard designed by Facebook
and supported by OpenBMC.  The strapping configuration was obtained
from hardware, and the i2c configuration is based on the dts found at:

https://github.com/facebook/openbmc-linux/blob/1633c87b8ba7c162095787c988979b748ba65dc8/arch/arm/boot/dts/aspeed-bmc-facebook-sonorapass.dts

Booted a test image of http://github.com/facebook/openbmc to the login
prompt.

Signed-off-by: Patrick Williams <patrick@stwcx.xyz>
Reviewed-by: Amithash Prasad <amithash@fb.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
[PMM: fixed block comment style nit]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/aspeed.c | 78 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 78 insertions(+)

diff --git a/hw/arm/aspeed.c b/hw/arm/aspeed.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/aspeed.c
+++ b/hw/arm/aspeed.c
@@ -XXX,XX +XXX,XX @@ struct AspeedBoardState {
         SCU_AST2500_HW_STRAP_ACPI_ENABLE |                              \
         SCU_HW_STRAP_SPI_MODE(SCU_HW_STRAP_SPI_MASTER))
 
+/* Sonorapass hardware value: 0xF100D216 */
+#define SONORAPASS_BMC_HW_STRAP1 (                                      \
+        SCU_AST2500_HW_STRAP_SPI_AUTOFETCH_ENABLE |                     \
+        SCU_AST2500_HW_STRAP_GPIO_STRAP_ENABLE |                        \
+        SCU_AST2500_HW_STRAP_UART_DEBUG |                               \
+        SCU_AST2500_HW_STRAP_RESERVED28 |                               \
+        SCU_AST2500_HW_STRAP_DDR4_ENABLE |                              \
+        SCU_HW_STRAP_VGA_CLASS_CODE |                                   \
+        SCU_HW_STRAP_LPC_RESET_PIN |                                    \
+        SCU_HW_STRAP_SPI_MODE(SCU_HW_STRAP_SPI_MASTER) |                \
+        SCU_AST2500_HW_STRAP_SET_AXI_AHB_RATIO(AXI_AHB_RATIO_2_1) |     \
+        SCU_HW_STRAP_VGA_BIOS_ROM |                                     \
+        SCU_HW_STRAP_VGA_SIZE_SET(VGA_16M_DRAM) |                       \
+        SCU_AST2500_HW_STRAP_RESERVED1)
+
 /* Swift hardware value: 0xF11AD206 */
 #define SWIFT_BMC_HW_STRAP1 (                                           \
         AST2500_HW_STRAP1_DEFAULTS |                                    \
@@ -XXX,XX +XXX,XX @@ static void swift_bmc_i2c_init(AspeedBoardState *bmc)
     i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 12), "tmp105", 0x4a);
 }
 
+static void sonorapass_bmc_i2c_init(AspeedBoardState *bmc)
+{
+    AspeedSoCState *soc = &bmc->soc;
+
+    /* bus 2 : */
+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 2), "tmp105", 0x48);
+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 2), "tmp105", 0x49);
+    /* bus 2 : pca9546 @ 0x73 */
+
+    /* bus 3 : pca9548 @ 0x70 */
+
+    /* bus 4 : */
+    uint8_t *eeprom4_54 = g_malloc0(8 * 1024);
+    smbus_eeprom_init_one(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 4), 0x54,
+                          eeprom4_54);
+    /* PCA9539 @ 0x76, but PCA9552 is compatible */
+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 4), "pca9552", 0x76);
+    /* PCA9539 @ 0x77, but PCA9552 is compatible */
+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 4), "pca9552", 0x77);
+
+    /* bus 6 : */
+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 6), "tmp105", 0x48);
+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 6), "tmp105", 0x49);
+    /* bus 6 : pca9546 @ 0x73 */
+
+    /* bus 8 : */
+    uint8_t *eeprom8_56 = g_malloc0(8 * 1024);
+    smbus_eeprom_init_one(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 8), 0x56,
+                          eeprom8_56);
+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 8), "pca9552", 0x60);
+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 8), "pca9552", 0x61);
+    /* bus 8 : adc128d818 @ 0x1d */
+    /* bus 8 : adc128d818 @ 0x1f */
+
+    /*
+     * bus 13 : pca9548 @ 0x71
+     *      - channel 3:
+     *          - tmp421 @ 0x4c
+     *          - tmp421 @ 0x4e
+     *          - tmp421 @ 0x4f
+     */
+
+}
+
 static void witherspoon_bmc_i2c_init(AspeedBoardState *bmc)
 {
     AspeedSoCState *soc = &bmc->soc;
@@ -XXX,XX +XXX,XX @@ static void aspeed_machine_romulus_class_init(ObjectClass *oc, void *data)
     mc->default_ram_size       = 512 * MiB;
 };
 
+static void aspeed_machine_sonorapass_class_init(ObjectClass *oc, void *data)
+{
+    MachineClass *mc = MACHINE_CLASS(oc);
+    AspeedMachineClass *amc = ASPEED_MACHINE_CLASS(oc);
+
+    mc->desc       = "OCP SonoraPass BMC (ARM1176)";
+    amc->soc_name  = "ast2500-a1";
+    amc->hw_strap1 = SONORAPASS_BMC_HW_STRAP1;
+    amc->fmc_model = "mx66l1g45g";
+    amc->spi_model = "mx66l1g45g";
+    amc->num_cs    = 2;
+    amc->i2c_init  = sonorapass_bmc_i2c_init;
+    mc->default_ram_size       = 512 * MiB;
+};
+
 static void aspeed_machine_swift_class_init(ObjectClass *oc, void *data)
 {
     MachineClass *mc = MACHINE_CLASS(oc);
@@ -XXX,XX +XXX,XX @@ static const TypeInfo aspeed_machine_types[] = {
         .name          = MACHINE_TYPE_NAME("swift-bmc"),
         .parent        = TYPE_ASPEED_MACHINE,
         .class_init    = aspeed_machine_swift_class_init,
+    }, {
+        .name          = MACHINE_TYPE_NAME("sonorapass-bmc"),
+        .parent        = TYPE_ASPEED_MACHINE,
+        .class_init    = aspeed_machine_sonorapass_class_init,
     }, {
         .name          = MACHINE_TYPE_NAME("witherspoon-bmc"),
         .parent        = TYPE_ASPEED_MACHINE,
-- 
2.20.1

From: Dongjiu Geng <gengdongjiu@huawei.com>

Little-endian UUIDs are used in many places, so turn
NVDIMM_UUID_LE into a common macro, UUID_LE, that converts a UUID
to a little-endian byte array.

Reviewed-by: Xiang Zheng <zhengxiang9@huawei.com>
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
Message-id: 20200512030609.19593-2-gengdongjiu@huawei.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/qemu/uuid.h | 27 +++++++++++++++++++++++++++
 hw/acpi/nvdimm.c    | 10 +++-------
 2 files changed, 30 insertions(+), 7 deletions(-)

diff --git a/include/qemu/uuid.h b/include/qemu/uuid.h
index XXXXXXX..XXXXXXX 100644
--- a/include/qemu/uuid.h
+++ b/include/qemu/uuid.h
@@ -XXX,XX +XXX,XX @@ typedef struct {
     };
 } QemuUUID;
 
+/**
+ * UUID_LE - converts the fields of a UUID to a little-endian array;
+ * each parameter is one field of the UUID.
+ *
+ * @time_low: The low field of the timestamp
+ * @time_mid: The middle field of the timestamp
+ * @time_hi_and_version: The high field of the timestamp
+ *                       multiplexed with the version number
+ * @clock_seq_hi_and_reserved: The high field of the clock
+ *                             sequence multiplexed with the variant
+ * @clock_seq_low: The low field of the clock sequence
+ * @node0: The spatially unique node0 identifier
+ * @node1: The spatially unique node1 identifier
+ * @node2: The spatially unique node2 identifier
+ * @node3: The spatially unique node3 identifier
+ * @node4: The spatially unique node4 identifier
+ * @node5: The spatially unique node5 identifier
+ */
+#define UUID_LE(time_low, time_mid, time_hi_and_version,                    \
+  clock_seq_hi_and_reserved, clock_seq_low, node0, node1, node2,            \
+  node3, node4, node5)                                                      \
+  { (time_low) & 0xff, ((time_low) >> 8) & 0xff, ((time_low) >> 16) & 0xff, \
+    ((time_low) >> 24) & 0xff, (time_mid) & 0xff, ((time_mid) >> 8) & 0xff, \
+    (time_hi_and_version) & 0xff, ((time_hi_and_version) >> 8) & 0xff,      \
+    (clock_seq_hi_and_reserved), (clock_seq_low), (node0), (node1), (node2),\
+    (node3), (node4), (node5) }
+
 #define UUID_FMT "%02hhx%02hhx%02hhx%02hhx-" \
                  "%02hhx%02hhx-%02hhx%02hhx-" \
                  "%02hhx%02hhx-" \
diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -XXX,XX +XXX,XX @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/uuid.h"
 #include "hw/acpi/acpi.h"
 #include "hw/acpi/aml-build.h"
 #include "hw/acpi/bios-linker-loader.h"
@@ -XXX,XX +XXX,XX @@
 #include "hw/mem/nvdimm.h"
 #include "qemu/nvdimm-utils.h"
 
-#define NVDIMM_UUID_LE(a, b, c, d0, d1, d2, d3, d4, d5, d6, d7)             \
-   { (a) & 0xff, ((a) >> 8) & 0xff, ((a) >> 16) & 0xff, ((a) >> 24) & 0xff, \
-     (b) & 0xff, ((b) >> 8) & 0xff, (c) & 0xff, ((c) >> 8) & 0xff,          \
-     (d0), (d1), (d2), (d3), (d4), (d5), (d6), (d7) }
-
 /*
  * define Byte Addressable Persistent Memory (PM) Region according to
  * ACPI 6.0: 5.2.25.1 System Physical Address Range Structure.
  */
 static const uint8_t nvdimm_nfit_spa_uuid[] =
-      NVDIMM_UUID_LE(0x66f0d379, 0xb4f3, 0x4074, 0xac, 0x43, 0x0d, 0x33,
-                     0x18, 0xb7, 0x8c, 0xdb);
+      UUID_LE(0x66f0d379, 0xb4f3, 0x4074, 0xac, 0x43, 0x0d, 0x33,
+              0x18, 0xb7, 0x8c, 0xdb);
 
 /*
  * NVDIMM Firmware Interface Table
-- 
2.20.1
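
To make the byte order concrete, the sketch below repeats the UUID_LE macro from the patch so that it compiles on its own and applies it to the NFIT SPA UUID used in hw/acpi/nvdimm.c. The array name spa_uuid_sketch is hypothetical. The first three fields are stored least-significant byte first; the remaining eight bytes are stored in the order given.

```c
#include <stdint.h>

/* Copy of UUID_LE from the patch, repeated so the sketch stands alone. */
#define UUID_LE(time_low, time_mid, time_hi_and_version,                    \
  clock_seq_hi_and_reserved, clock_seq_low, node0, node1, node2,            \
  node3, node4, node5)                                                      \
  { (time_low) & 0xff, ((time_low) >> 8) & 0xff, ((time_low) >> 16) & 0xff, \
    ((time_low) >> 24) & 0xff, (time_mid) & 0xff, ((time_mid) >> 8) & 0xff, \
    (time_hi_and_version) & 0xff, ((time_hi_and_version) >> 8) & 0xff,      \
    (clock_seq_hi_and_reserved), (clock_seq_low), (node0), (node1), (node2),\
    (node3), (node4), (node5) }

/* 66f0d379-b4f3-4074-ac43-0d3318b78cdb laid out as 16 bytes. */
static const uint8_t spa_uuid_sketch[16] =
    UUID_LE(0x66f0d379, 0xb4f3, 0x4074, 0xac, 0x43, 0x0d, 0x33,
            0x18, 0xb7, 0x8c, 0xdb);
```

The resulting bytes are 79 d3 f0 66, f3 b4, 74 40, then ac 43 0d 33 18 b7 8c db.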

From: Dongjiu Geng <gengdongjiu@huawei.com>

The RAS virtualization feature is not supported yet, so
add a "ras" machine option and leave it disabled by default.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-id: 20200512030609.19593-3-gengdongjiu@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/arm/virt.h |  1 +
 hw/arm/virt.c         | 23 +++++++++++++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -XXX,XX +XXX,XX @@ typedef struct {
     bool highmem_ecam;
     bool its;
     bool virt;
+    bool ras;
     OnOffAuto acpi;
     VirtGICType gic_version;
     VirtIOMMUType iommu;
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -XXX,XX +XXX,XX @@ static void virt_set_acpi(Object *obj, Visitor *v, const char *name,
     visit_type_OnOffAuto(v, name, &vms->acpi, errp);
 }
 
+static bool virt_get_ras(Object *obj, Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+
+    return vms->ras;
+}
+
+static void virt_set_ras(Object *obj, bool value, Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+
+    vms->ras = value;
+}
+
 static char *virt_get_gic_version(Object *obj, Error **errp)
 {
     VirtMachineState *vms = VIRT_MACHINE(obj);
@@ -XXX,XX +XXX,XX @@ static void virt_instance_init(Object *obj)
                                     "Valid values are none and smmuv3",
                                     NULL);
 
+    /* Default disallows RAS instantiation */
+    vms->ras = false;
+    object_property_add_bool(obj, "ras", virt_get_ras,
+                             virt_set_ras, NULL);
+    object_property_set_description(obj, "ras",
+                                    "Set on/off to enable/disable reporting host memory errors "
+                                    "to a KVM guest using ACPI and guest external abort exceptions",
+                                    NULL);
+
     vms->irqmap = a15irqmap;
 
     virt_flash_create(vms);
-- 
2.20.1

From: Dongjiu Geng <gengdongjiu@huawei.com>

Add APEI/GHES detailed design document

Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-id: 20200512030609.19593-4-gengdongjiu@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 docs/specs/acpi_hest_ghes.rst | 110 ++++++++++++++++++++++++++++++++++
 docs/specs/index.rst          |   1 +
 2 files changed, 111 insertions(+)
 create mode 100644 docs/specs/acpi_hest_ghes.rst

diff --git a/docs/specs/acpi_hest_ghes.rst b/docs/specs/acpi_hest_ghes.rst
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/docs/specs/acpi_hest_ghes.rst
@@ -XXX,XX +XXX,XX @@
+APEI table generation and CPER records
+======================================
+
+..
+   Copyright (c) 2020 HUAWEI TECHNOLOGIES CO., LTD.
+
+   This work is licensed under the terms of the GNU GPL, version 2 or later.
+   See the COPYING file in the top-level directory.
+
+Design Details
+--------------
+
+::
+
+         etc/acpi/tables                           etc/hardware_errors
+      ====================                   ===============================
+  + +--------------------------+            +----------------------------+
+  | | HEST                     | +--------->|    error_block_address1    |------+
+  | +--------------------------+ |          +----------------------------+      |
+  | | GHES1                    | | +------->|    error_block_address2    |------+-+
+  | +--------------------------+ | |        +----------------------------+      | |
+  | | .................        | | |        |      ..............        |      | |
+  | | error_status_address-----+-+ |        -----------------------------+      | |
+  | | .................        |   |   +--->|    error_block_addressN    |------+-+---+
+  | | read_ack_register--------+-+ |   |    +----------------------------+      | |   |
+  | | read_ack_preserve        | +-+---+--->|     read_ack_register1     |      | |   |
+  | | read_ack_write           |   |   |    +----------------------------+      | |   |
+  + +--------------------------+   | +-+--->|     read_ack_register2     |      | |   |
+  | | GHES2                    |   | | |    +----------------------------+      | |   |
+  + +--------------------------+   | | |    |       .............        |      | |   |
+  | | .................        |   | | |    +----------------------------+      | |   |
+  | | error_status_address-----+---+ | | +->|     read_ack_registerN     |      | |   |
+  | | .................        |     | | |  +----------------------------+      | |   |
+  | | read_ack_register--------+-----+ | |  |Generic Error Status Block 1|<-----+ |   |
+  | | read_ack_preserve        |       | |  |-+------------------------+-+        |   |
+  | | read_ack_write           |       | |  | |          CPER          | |        |   |
+  + +--------------------------|       | |  | |          CPER          | |        |   |
+  | | ...............          |       | |  | |          ....          | |        |   |
+  + +--------------------------+       | |  | |          CPER          | |        |   |
+  | | GHESN                    |       | |  |-+------------------------+-|        |   |
+  + +--------------------------+       | |  |Generic Error Status Block 2|<-------+   |
+  | | .................        |       | |  |-+------------------------+-+            |
+  | | error_status_address-----+-------+ |  | |           CPER         | |            |
+  | | .................        |         |  | |           CPER         | |            |
+  | | read_ack_register--------+---------+  | |           ....         | |            |
+  | | read_ack_preserve        |            | |           CPER         | |            |
+  | | read_ack_write           |            +-+------------------------+-+            |
+  + +--------------------------+            |         ..........         |            |
+                                            |----------------------------+            |
+                                            |Generic Error Status Block N |<----------+
+                                            |-+-------------------------+-+
+                                            | |          CPER           | |
+                                            | |          CPER           | |
+                                            | |          ....           | |
+                                            | |          CPER           | |
+                                            +-+-------------------------+-+
+
+
+(1) QEMU generates the ACPI HEST table. This table goes in the current
+    "etc/acpi/tables" fw_cfg blob. Each error source has different
+    notification types.
+
+(2) A new fw_cfg blob called "etc/hardware_errors" is introduced. QEMU
+    also needs to populate this blob. The "etc/hardware_errors" fw_cfg blob
+    contains an address registers table and an Error Status Data Block table.
+
+(3) The address registers table contains N Error Block Address entries
+    and N Read Ack Register entries. Each entry is 8 bytes in size.
+    The Error Status Data Block table contains N Error Status Data Block
+    entries. Each entry is 4096 (0x1000) bytes in size. The total size
+    of the "etc/hardware_errors" fw_cfg blob is (N * 8 * 2 + N * 4096) bytes.
+    N is the number of kinds of hardware error sources.
+
+(4) QEMU generates the ACPI linker/loader script for the firmware. The
+    firmware pre-allocates memory for "etc/acpi/tables", "etc/hardware_errors"
+    and copies blob contents there.
+
+(5) QEMU generates N ADD_POINTER commands, which patch addresses in the
+    "error_status_address" fields of the HEST table with a pointer to the
+    corresponding "address registers" in the "etc/hardware_errors" blob.
+
+(6) QEMU generates N ADD_POINTER commands, which patch addresses in the
+    "read_ack_register" fields of the HEST table with a pointer to the
+    corresponding "read_ack_register" within the "etc/hardware_errors" blob.
+
+(7) QEMU generates N ADD_POINTER commands for the firmware, which patch
+    addresses in the "error_block_address" fields with a pointer to the
+    respective "Error Status Data Block" in the "etc/hardware_errors" blob.
+
+(8) QEMU defines a third, write-only fw_cfg blob called
+    "etc/hardware_errors_addr". Through that blob, the firmware can send back
+    the guest-side allocation addresses to QEMU. The "etc/hardware_errors_addr"
+    blob contains one 8-byte entry. QEMU generates a single WRITE_POINTER command
+    for the firmware. The firmware will write back the start address of the
+    "etc/hardware_errors" blob to the fw_cfg file "etc/hardware_errors_addr".
+
+(9) When QEMU gets a SIGBUS from the kernel, QEMU writes a CPER record into the
+    corresponding "Error Status Data Block" in guest memory, and then injects a
+    platform-specific interrupt (for the arm/virt machine, a Synchronous External
+    Abort) to notify the guest.
+
+(10) This notification (in virtual hardware) is handled by the guest
+     kernel. On receiving the notification, the guest APEI driver can read the
+     CPER error and take appropriate action.
+
+(11) kvm_arch_on_sigbus_vcpu() uses source_id as an index into "etc/hardware_errors"
+     to find the "Error Status Data Block" entry corresponding to the error source.
+     Supported source_id values should therefore be assigned here and not changed
+     afterwards, to make sure that the guest will write the error into the expected
+     "Error Status Data Block" even if the guest was migrated to a newer QEMU.
diff --git a/docs/specs/index.rst b/docs/specs/index.rst
index XXXXXXX..XXXXXXX 100644
--- a/docs/specs/index.rst
+++ b/docs/specs/index.rst
@@ -XXX,XX +XXX,XX @@ Contents:
    ppc-spapr-xive
    acpi_hw_reduced_hotplug
    tpm
+   acpi_hest_ghes
-- 
2.20.1

From: Dongjiu Geng <gengdongjiu@huawei.com>

This patch builds the error_block_address and read_ack_register fields
in the hardware errors table. Each error_block_address points to a
Generic Error Status Block (GESB) via bios_linker. The maximum size of
one GESB is 1 KiB. For more detailed information, please refer to
docs/specs/acpi_hest_ghes.rst.

Currently only one error source is supported; more can be added if
necessary.
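
As an illustration only (not part of this patch), the layout that
build_ghes_error_table() gives "etc/hardware_errors" can be sketched as
plain offset arithmetic. The helper names below are hypothetical; the
two constants mirror ACPI_GHES_ERROR_SOURCE_COUNT and
ACPI_GHES_MAX_RAW_DATA_LENGTH from hw/acpi/ghes.c:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors of the constants defined in hw/acpi/ghes.c */
#define SOURCE_COUNT   1
#define MAX_BLOCK_SIZE 1024 /* 1 KiB per Error Status Data Block */

/* Hypothetical helper: offset of the error_block_address entry */
size_t error_block_address_offset(unsigned source_id)
{
    assert(source_id < SOURCE_COUNT);
    return source_id * sizeof(uint64_t);
}

/* Hypothetical helper: read_ack_register entries follow all addresses */
size_t read_ack_register_offset(unsigned source_id)
{
    assert(source_id < SOURCE_COUNT);
    return (SOURCE_COUNT + source_id) * sizeof(uint64_t);
}

/* Hypothetical helper: status blocks follow both 8-byte arrays */
size_t error_status_block_offset(unsigned source_id)
{
    assert(source_id < SOURCE_COUNT);
    return 2 * SOURCE_COUNT * sizeof(uint64_t)
           + (size_t)source_id * MAX_BLOCK_SIZE;
}

/* Total size of the blob handed to the firmware */
size_t hardware_errors_blob_size(void)
{
    return SOURCE_COUNT * (2 * sizeof(uint64_t) + MAX_BLOCK_SIZE);
}
```

With a single error source this gives an 8-byte address entry at
offset 0, the read_ack_register at offset 8, and the 1 KiB status block
at offset 16.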

Suggested-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 20200512030609.19593-5-gengdongjiu@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 default-configs/arm-softmmu.mak |  1 +
 include/hw/acpi/aml-build.h     |  1 +
 include/hw/acpi/ghes.h          | 28 +++++++++++
 hw/acpi/aml-build.c             |  2 +
 hw/acpi/ghes.c                  | 89 +++++++++++++++++++++++++++++++++
 hw/arm/virt-acpi-build.c        |  5 ++
 hw/acpi/Kconfig                 |  4 ++
 hw/acpi/Makefile.objs           |  1 +
 8 files changed, 131 insertions(+)
 create mode 100644 include/hw/acpi/ghes.h
 create mode 100644 hw/acpi/ghes.c

diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
index XXXXXXX..XXXXXXX 100644
--- a/default-configs/arm-softmmu.mak
+++ b/default-configs/arm-softmmu.mak
@@ -XXX,XX +XXX,XX @@ CONFIG_FSL_IMX7=y
 CONFIG_FSL_IMX6UL=y
 CONFIG_SEMIHOSTING=y
 CONFIG_ALLWINNER_H3=y
+CONFIG_ACPI_APEI=y
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -XXX,XX +XXX,XX @@ struct AcpiBuildTables {
     GArray *rsdp;
     GArray *tcpalog;
     GArray *vmgenid;
+    GArray *hardware_errors;
     BIOSLinker *linker;
 } AcpiBuildTables;
 
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/include/hw/acpi/ghes.h
@@ -XXX,XX +XXX,XX @@
+/*
+ * Support for generating APEI tables and recording CPER for Guests
+ *
+ * Copyright (c) 2020 HUAWEI TECHNOLOGIES CO., LTD.
+ *
+ * Author: Dongjiu Geng <gengdongjiu@huawei.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef ACPI_GHES_H
+#define ACPI_GHES_H
+
+#include "hw/acpi/bios-linker-loader.h"
+
+void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker);
+#endif
diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -XXX,XX +XXX,XX @@ void acpi_build_tables_init(AcpiBuildTables *tables)
     tables->table_data = g_array_new(false, true /* clear */, 1);
     tables->tcpalog = g_array_new(false, true /* clear */, 1);
     tables->vmgenid = g_array_new(false, true /* clear */, 1);
+    tables->hardware_errors = g_array_new(false, true /* clear */, 1);
     tables->linker = bios_linker_loader_init();
 }
 
@@ -XXX,XX +XXX,XX @@ void acpi_build_tables_cleanup(AcpiBuildTables *tables, bool mfre)
     g_array_free(tables->table_data, true);
     g_array_free(tables->tcpalog, mfre);
     g_array_free(tables->vmgenid, mfre);
+    g_array_free(tables->hardware_errors, mfre);
 }
 
 /*
diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/hw/acpi/ghes.c
@@ -XXX,XX +XXX,XX @@
+/*
+ * Support for generating APEI tables and recording CPER for Guests
+ *
+ * Copyright (c) 2020 HUAWEI TECHNOLOGIES CO., LTD.
+ *
+ * Author: Dongjiu Geng <gengdongjiu@huawei.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/units.h"
+#include "hw/acpi/ghes.h"
+#include "hw/acpi/aml-build.h"
+
+#define ACPI_GHES_ERRORS_FW_CFG_FILE        "etc/hardware_errors"
+#define ACPI_GHES_DATA_ADDR_FW_CFG_FILE     "etc/hardware_errors_addr"
+
+/* The max size in bytes for one error block */
+#define ACPI_GHES_MAX_RAW_DATA_LENGTH   (1 * KiB)
+
+/* Now only support ARMv8 SEA notification type error source */
+#define ACPI_GHES_ERROR_SOURCE_COUNT        1
+
+/*
+ * Build table for the hardware error fw_cfg blob.
+ * Initialize "etc/hardware_errors" and "etc/hardware_errors_addr" fw_cfg blobs.
+ * See docs/specs/acpi_hest_ghes.rst for blobs format.
+ */
+void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker)
+{
+    int i, error_status_block_offset;
+
+    /* Build error_block_address */
+    for (i = 0; i < ACPI_GHES_ERROR_SOURCE_COUNT; i++) {
+        build_append_int_noprefix(hardware_errors, 0, sizeof(uint64_t));
+    }
+
+    /* Build read_ack_register */
+    for (i = 0; i < ACPI_GHES_ERROR_SOURCE_COUNT; i++) {
+        /*
+         * Initialize the value of read_ack_register to 1, so GHES can be
+         * writeable after (re)boot.
+         * ACPI 6.2: 18.3.2.8 Generic Hardware Error Source version 2
+         * (GHESv2 - Type 10)
+         */
+        build_append_int_noprefix(hardware_errors, 1, sizeof(uint64_t));
+    }
+
+    /* Generic Error Status Block offset in the hardware error fw_cfg blob */
+    error_status_block_offset = hardware_errors->len;
+
+    /* Reserve space for Error Status Data Block */
+    acpi_data_push(hardware_errors,
+        ACPI_GHES_MAX_RAW_DATA_LENGTH * ACPI_GHES_ERROR_SOURCE_COUNT);
+
+    /* Tell guest firmware to place hardware_errors blob into RAM */
+    bios_linker_loader_alloc(linker, ACPI_GHES_ERRORS_FW_CFG_FILE,
+                             hardware_errors, sizeof(uint64_t), false);
+
+    for (i = 0; i < ACPI_GHES_ERROR_SOURCE_COUNT; i++) {
+        /*
+         * Tell firmware to patch error_block_address entries to point to
+         * corresponding "Generic Error Status Block"
+         */
+        bios_linker_loader_add_pointer(linker,
+            ACPI_GHES_ERRORS_FW_CFG_FILE, sizeof(uint64_t) * i,
+            sizeof(uint64_t), ACPI_GHES_ERRORS_FW_CFG_FILE,
+            error_status_block_offset + i * ACPI_GHES_MAX_RAW_DATA_LENGTH);
+    }
+
+    /*
+     * tell firmware to write hardware_errors GPA into
+     * hardware_errors_addr fw_cfg, once the former has been initialized.
+     */
+    bios_linker_loader_write_pointer(linker, ACPI_GHES_DATA_ADDR_FW_CFG_FILE,
+        0, sizeof(uint64_t), ACPI_GHES_ERRORS_FW_CFG_FILE, 0);
+}
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -XXX,XX +XXX,XX @@
 #include "sysemu/reset.h"
 #include "kvm_arm.h"
 #include "migration/vmstate.h"
+#include "hw/acpi/ghes.h"
 
 #define ARM_SPI_BASE 32
 
@@ -XXX,XX +XXX,XX @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
     acpi_add_table(table_offsets, tables_blob);
     build_spcr(tables_blob, tables->linker, vms);
 
+    if (vms->ras) {
+        build_ghes_error_table(tables->hardware_errors, tables->linker);
+    }
+
     if (ms->numa_state->num_nodes > 0) {
         acpi_add_table(table_offsets, tables_blob);
         build_srat(tables_blob, tables->linker, vms);
diff --git a/hw/acpi/Kconfig b/hw/acpi/Kconfig
index XXXXXXX..XXXXXXX 100644
--- a/hw/acpi/Kconfig
+++ b/hw/acpi/Kconfig
@@ -XXX,XX +XXX,XX @@ config ACPI_HMAT
     bool
     depends on ACPI
 
+config ACPI_APEI
+    bool
+    depends on ACPI
+
 config ACPI_PCI
     bool
     depends on ACPI && PCI
diff --git a/hw/acpi/Makefile.objs b/hw/acpi/Makefile.objs
index XXXXXXX..XXXXXXX 100644
--- a/hw/acpi/Makefile.objs
+++ b/hw/acpi/Makefile.objs
@@ -XXX,XX +XXX,XX @@ common-obj-$(CONFIG_ACPI_NVDIMM) += nvdimm.o
 common-obj-$(CONFIG_ACPI_VMGENID) += vmgenid.o
 common-obj-$(CONFIG_ACPI_HW_REDUCED) += generic_event_device.o
 common-obj-$(CONFIG_ACPI_HMAT) += hmat.o
+common-obj-$(CONFIG_ACPI_APEI) += ghes.o
 common-obj-$(call lnot,$(CONFIG_ACPI_X86)) += acpi-stub.o
 common-obj-$(call lnot,$(CONFIG_PC)) += acpi-x86-stub.o
 
-- 
2.20.1

From: Dongjiu Geng <gengdongjiu@huawei.com>

This patch builds the Hardware Error Source Table (HEST) via fw_cfg
blobs. For now it only supports ARMv8 SEA, a type of Generic Hardware
Error Source version 2 (GHESv2) error source; other types can be added
later if needed. The CPER section is currently a memory section,
because the kernel mainly wants userspace to handle memory errors.

This patch follows ACPI 6.2 when building the Hardware Error Source
Table. For more detailed information, please refer to
docs/specs/acpi_hest_ghes.rst.

The build_ghes_hw_error_notification() helper adds a Hardware Error
Notification structure to the ACPI tables without using packed C
structures, and avoids endianness issues because the API needs no
explicit conversion.
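
To illustrate why field-by-field appends avoid both packing and
endianness problems, here is a standalone sketch (not QEMU code) that
serializes the same 28-byte notification structure into a byte buffer.
append_le() and build_hw_error_notification() are hypothetical
stand-ins for build_append_int_noprefix() and the helper added by this
patch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Append the low `size` bytes of `value` in little-endian order and
 * return the new write position. Works the same on any host.
 */
size_t append_le(uint8_t *buf, size_t pos, uint64_t value, size_t size)
{
    assert(size <= sizeof(value));
    for (size_t i = 0; i < size; i++) {
        buf[pos + i] = (uint8_t)(value >> (8 * i));
    }
    return pos + size;
}

/* Fill `buf` (at least 28 bytes) and return the number of bytes written */
size_t build_hw_error_notification(uint8_t *buf, uint8_t type)
{
    size_t pos = 0;

    pos = append_le(buf, pos, type, 1); /* Type */
    pos = append_le(buf, pos, 28, 1);   /* Length */
    pos = append_le(buf, pos, 0, 2);    /* Configuration Write Enable */
    pos = append_le(buf, pos, 0, 4);    /* Poll Interval */
    pos = append_le(buf, pos, 0, 4);    /* Vector */
    pos = append_le(buf, pos, 0, 4);    /* Switch To Polling Threshold Value */
    pos = append_le(buf, pos, 0, 4);    /* Switch To Polling Threshold Window */
    pos = append_le(buf, pos, 0, 4);    /* Error Threshold Value */
    pos = append_le(buf, pos, 0, 4);    /* Error Threshold Window */
    return pos;
}
```

The field widths add up to 1 + 1 + 2 + 6 * 4 = 28 bytes, which matches
the Length value written into the structure.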

Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 20200512030609.19593-6-gengdongjiu@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/acpi/ghes.h   |  39 ++++++++++++
 hw/acpi/ghes.c           | 126 +++++++++++++++++++++++++++++++++++++++
 hw/arm/virt-acpi-build.c |   2 +
 3 files changed, 167 insertions(+)

diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/acpi/ghes.h
+++ b/include/hw/acpi/ghes.h
@@ -XXX,XX +XXX,XX @@
 
 #include "hw/acpi/bios-linker-loader.h"
 
+/*
+ * Values for Hardware Error Notification Type field
+ */
+enum AcpiGhesNotifyType {
+    /* Polled */
+    ACPI_GHES_NOTIFY_POLLED = 0,
+    /* External Interrupt */
+    ACPI_GHES_NOTIFY_EXTERNAL = 1,
+    /* Local Interrupt */
+    ACPI_GHES_NOTIFY_LOCAL = 2,
+    /* SCI */
+    ACPI_GHES_NOTIFY_SCI = 3,
+    /* NMI */
+    ACPI_GHES_NOTIFY_NMI = 4,
+    /* CMCI, ACPI 5.0: 18.3.2.7, Table 18-290 */
+    ACPI_GHES_NOTIFY_CMCI = 5,
+    /* MCE, ACPI 5.0: 18.3.2.7, Table 18-290 */
+    ACPI_GHES_NOTIFY_MCE = 6,
+    /* GPIO-Signal, ACPI 6.0: 18.3.2.7, Table 18-332 */
+    ACPI_GHES_NOTIFY_GPIO = 7,
+    /* ARMv8 SEA, ACPI 6.1: 18.3.2.9, Table 18-345 */
+    ACPI_GHES_NOTIFY_SEA = 8,
+    /* ARMv8 SEI, ACPI 6.1: 18.3.2.9, Table 18-345 */
+    ACPI_GHES_NOTIFY_SEI = 9,
+    /* External Interrupt - GSIV, ACPI 6.1: 18.3.2.9, Table 18-345 */
+    ACPI_GHES_NOTIFY_GSIV = 10,
+    /* Software Delegated Exception, ACPI 6.2: 18.3.2.9, Table 18-383 */
+    ACPI_GHES_NOTIFY_SDEI = 11,
+    /* 12 and greater are reserved */
+    ACPI_GHES_NOTIFY_RESERVED = 12
+};
+
+enum {
+    ACPI_HEST_SRC_ID_SEA = 0,
+    /* future ids go here */
+    ACPI_HEST_SRC_ID_RESERVED,
+};
+
 void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker);
+void acpi_build_hest(GArray *table_data, BIOSLinker *linker);
 #endif
diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/acpi/ghes.c
+++ b/hw/acpi/ghes.c
@@ -XXX,XX +XXX,XX @@
 #include "qemu/units.h"
 #include "hw/acpi/ghes.h"
 #include "hw/acpi/aml-build.h"
+#include "qemu/error-report.h"
 
 #define ACPI_GHES_ERRORS_FW_CFG_FILE        "etc/hardware_errors"
 #define ACPI_GHES_DATA_ADDR_FW_CFG_FILE     "etc/hardware_errors_addr"
@@ -XXX,XX +XXX,XX @@
 /* Now only support ARMv8 SEA notification type error source */
 #define ACPI_GHES_ERROR_SOURCE_COUNT        1
 
+/* Generic Hardware Error Source version 2 */
+#define ACPI_GHES_SOURCE_GENERIC_ERROR_V2   10
+
+/* Address offset in Generic Address Structure(GAS) */
+#define GAS_ADDR_OFFSET 4
+
+/*
+ * Hardware Error Notification
+ * ACPI 4.0: 17.3.2.7 Hardware Error Notification
+ * Composes dummy Hardware Error Notification descriptor of specified type
+ */
+static void build_ghes_hw_error_notification(GArray *table, const uint8_t type)
+{
+    /* Type */
+    build_append_int_noprefix(table, type, 1);
+    /*
+     * Length:
+     * Total length of the structure in bytes
+     */
+    build_append_int_noprefix(table, 28, 1);
+    /* Configuration Write Enable */
+    build_append_int_noprefix(table, 0, 2);
+    /* Poll Interval */
+    build_append_int_noprefix(table, 0, 4);
+    /* Vector */
+    build_append_int_noprefix(table, 0, 4);
+    /* Switch To Polling Threshold Value */
+    build_append_int_noprefix(table, 0, 4);
+    /* Switch To Polling Threshold Window */
+    build_append_int_noprefix(table, 0, 4);
+    /* Error Threshold Value */
+    build_append_int_noprefix(table, 0, 4);
+    /* Error Threshold Window */
+    build_append_int_noprefix(table, 0, 4);
+}
+
 /*
  * Build table for the hardware error fw_cfg blob.
  * Initialize "etc/hardware_errors" and "etc/hardware_errors_addr" fw_cfg blobs.
@@ -XXX,XX +XXX,XX @@ void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker)
     bios_linker_loader_write_pointer(linker, ACPI_GHES_DATA_ADDR_FW_CFG_FILE,
         0, sizeof(uint64_t), ACPI_GHES_ERRORS_FW_CFG_FILE, 0);
 }
+
+/* Build Generic Hardware Error Source version 2 (GHESv2) */
+static void build_ghes_v2(GArray *table_data, int source_id, BIOSLinker *linker)
+{
+    uint64_t address_offset;
+    /*
+     * Type:
+     * Generic Hardware Error Source version 2(GHESv2 - Type 10)
+     */
+    build_append_int_noprefix(table_data, ACPI_GHES_SOURCE_GENERIC_ERROR_V2, 2);
+    /* Source Id */
+    build_append_int_noprefix(table_data, source_id, 2);
+    /* Related Source Id */
+    build_append_int_noprefix(table_data, 0xffff, 2);
+    /* Flags */
+    build_append_int_noprefix(table_data, 0, 1);
+    /* Enabled */
+    build_append_int_noprefix(table_data, 1, 1);
+
+    /* Number of Records To Pre-allocate */
+    build_append_int_noprefix(table_data, 1, 4);
+    /* Max Sections Per Record */
+    build_append_int_noprefix(table_data, 1, 4);
+    /* Max Raw Data Length */
+    build_append_int_noprefix(table_data, ACPI_GHES_MAX_RAW_DATA_LENGTH, 4);
+
+    address_offset = table_data->len;
+    /* Error Status Address */
+    build_append_gas(table_data, AML_AS_SYSTEM_MEMORY, 0x40, 0,
+                     4 /* QWord access */, 0);
+    bios_linker_loader_add_pointer(linker, ACPI_BUILD_TABLE_FILE,
+        address_offset + GAS_ADDR_OFFSET, sizeof(uint64_t),
+        ACPI_GHES_ERRORS_FW_CFG_FILE, source_id * sizeof(uint64_t));
+
+    switch (source_id) {
+    case ACPI_HEST_SRC_ID_SEA:
+        /*
+         * Notification Structure
+         * Now only enable ARMv8 SEA notification type
+         */
+        build_ghes_hw_error_notification(table_data, ACPI_GHES_NOTIFY_SEA);
+        break;
+    default:
+        error_report("Not support this error source");
+        abort();
+    }
+
+    /* Error Status Block Length */
+    build_append_int_noprefix(table_data, ACPI_GHES_MAX_RAW_DATA_LENGTH, 4);
+
+    /*
+     * Read Ack Register
+     * ACPI 6.1: 18.3.2.8 Generic Hardware Error Source
+     * version 2 (GHESv2 - Type 10)
+     */
+    address_offset = table_data->len;
+    build_append_gas(table_data, AML_AS_SYSTEM_MEMORY, 0x40, 0,
+                     4 /* QWord access */, 0);
+    bios_linker_loader_add_pointer(linker, ACPI_BUILD_TABLE_FILE,
+        address_offset + GAS_ADDR_OFFSET,
+        sizeof(uint64_t), ACPI_GHES_ERRORS_FW_CFG_FILE,
+        (ACPI_GHES_ERROR_SOURCE_COUNT + source_id) * sizeof(uint64_t));
+
+    /*
+     * Read Ack Preserve field
+     * We only provide the first bit in Read Ack Register to OSPM to write
+     * while the other bits are preserved.
+     */
+    build_append_int_noprefix(table_data, ~0x1ULL, 8);
+    /* Read Ack Write */
+    build_append_int_noprefix(table_data, 0x1, 8);
+}
+
+/* Build Hardware Error Source Table */
+void acpi_build_hest(GArray *table_data, BIOSLinker *linker)
+{
+    uint64_t hest_start = table_data->len;
+
+    /* Hardware Error Source Table header*/
+    acpi_data_push(table_data, sizeof(AcpiTableHeader));
+
+    /* Error Source Count */
+    build_append_int_noprefix(table_data, ACPI_GHES_ERROR_SOURCE_COUNT, 4);
+
+    build_ghes_v2(table_data, ACPI_HEST_SRC_ID_SEA, linker);
+
+    build_header(linker, table_data, (void *)(table_data->data + hest_start),
+        "HEST", table_data->len - hest_start, 1, NULL, NULL);
+}
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -XXX,XX +XXX,XX @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
 
     if (vms->ras) {
         build_ghes_error_table(tables->hardware_errors, tables->linker);
+        acpi_add_table(table_offsets, tables_blob);
+        acpi_build_hest(tables_blob, tables->linker);
     }
 
     if (ms->numa_state->num_nodes > 0) {
-- 
2.20.1

From: Dongjiu Geng <gengdongjiu@huawei.com>

Record the GHEB address via a fw_cfg file. When recording an error
to CPER, QEMU uses this address to find the Generic Error Data
Entries and write the error.

To avoid migration failures, make the hardware error table
address part of the GED device instead of a global variable,
so that the address is migrated to the target QEMU.
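
The field name ghes_addr_le suggests that the value written back by
the firmware is kept in little-endian byte order and has to be
converted before use. As a hedged, standalone sketch of that
conversion (load_le64() is a hypothetical helper; inside QEMU the
existing le64_to_cpu() would normally be used instead):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode 8 little-endian bytes into a host-order 64-bit value */
uint64_t load_le64(const uint8_t bytes[8])
{
    uint64_t value = 0;

    assert(bytes != NULL);
    /* Start from the most significant byte, which is stored last */
    for (int i = 7; i >= 0; i--) {
        value = (value << 8) | bytes[i];
    }
    return value;
}
```

Decoding byte by byte gives the same result on little-endian and
big-endian hosts, which matters because the address travels with the
migration stream.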

Acked-by: Xiang Zheng <zhengxiang9@huawei.com>
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 20200512030609.19593-7-gengdongjiu@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/acpi/generic_event_device.h |  2 ++
 include/hw/acpi/ghes.h                 |  6 ++++++
 hw/acpi/generic_event_device.c         | 19 +++++++++++++++++++
 hw/acpi/ghes.c                         | 14 ++++++++++++++
 hw/arm/virt-acpi-build.c               |  8 ++++++++
 5 files changed, 49 insertions(+)

diff --git a/include/hw/acpi/generic_event_device.h b/include/hw/acpi/generic_event_device.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/acpi/generic_event_device.h
+++ b/include/hw/acpi/generic_event_device.h
@@ -XXX,XX +XXX,XX @@
 
 #include "hw/sysbus.h"
 #include "hw/acpi/memory_hotplug.h"
+#include "hw/acpi/ghes.h"
 
 #define ACPI_POWER_BUTTON_DEVICE "PWRB"
 
@@ -XXX,XX +XXX,XX @@ typedef struct AcpiGedState {
     GEDState ged_state;
     uint32_t ged_event_bitmap;
     qemu_irq irq;
+    AcpiGhesState ghes_state;
 } AcpiGedState;
 
 void build_ged_aml(Aml *table, const char* name, HotplugHandler *hotplug_dev,
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/acpi/ghes.h
+++ b/include/hw/acpi/ghes.h
@@ -XXX,XX +XXX,XX @@ enum {
     ACPI_HEST_SRC_ID_RESERVED,
 };
 
+typedef struct AcpiGhesState {
+    uint64_t ghes_addr_le;
+} AcpiGhesState;
+
 void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker);
 void acpi_build_hest(GArray *table_data, BIOSLinker *linker);
+void acpi_ghes_add_fw_cfg(AcpiGhesState *vms, FWCfgState *s,
+                          GArray *hardware_errors);
 #endif
diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/acpi/generic_event_device.c
+++ b/hw/acpi/generic_event_device.c
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_ged_state = {
     }
 };
 
+static bool ghes_needed(void *opaque)
+{
+    AcpiGedState *s = opaque;
+    return s->ghes_state.ghes_addr_le;
+}
+
+static const VMStateDescription vmstate_ghes_state = {
+    .name = "acpi-ged/ghes",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .needed = ghes_needed,
+    .fields      = (VMStateField[]) {
+        VMSTATE_STRUCT(ghes_state, AcpiGedState, 1,
+                       vmstate_ghes_state, AcpiGhesState),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
 static const VMStateDescription vmstate_acpi_ged = {
     .name = "acpi-ged",
     .version_id = 1,
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_acpi_ged = {
     },
     .subsections = (const VMStateDescription * []) {
         &vmstate_memhp_state,
+        &vmstate_ghes_state,
         NULL
     }
 };
diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/acpi/ghes.c
+++ b/hw/acpi/ghes.c
@@ -XXX,XX +XXX,XX @@
 #include "hw/acpi/ghes.h"
 #include "hw/acpi/aml-build.h"
 #include "qemu/error-report.h"
+#include "hw/acpi/generic_event_device.h"
+#include "hw/nvram/fw_cfg.h"
 
 #define ACPI_GHES_ERRORS_FW_CFG_FILE        "etc/hardware_errors"
 #define ACPI_GHES_DATA_ADDR_FW_CFG_FILE     "etc/hardware_errors_addr"
@@ -XXX,XX +XXX,XX @@ void acpi_build_hest(GArray *table_data, BIOSLinker *linker)
     build_header(linker, table_data, (void *)(table_data->data + hest_start),
         "HEST", table_data->len - hest_start, 1, NULL, NULL);
 }
+
+void acpi_ghes_add_fw_cfg(AcpiGhesState *ags, FWCfgState *s,
+                          GArray *hardware_error)
+{
+    /* Create a read-only fw_cfg file for GHES */
+    fw_cfg_add_file(s, ACPI_GHES_ERRORS_FW_CFG_FILE, hardware_error->data,
+                    hardware_error->len);
+
+    /* Create a read-write fw_cfg file for Address */
+    fw_cfg_add_file_callback(s, ACPI_GHES_DATA_ADDR_FW_CFG_FILE, NULL, NULL,
+        NULL, &(ags->ghes_addr_le), sizeof(ags->ghes_addr_le), false);
+}
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -XXX,XX +XXX,XX @@ void virt_acpi_setup(VirtMachineState *vms)
 {
     AcpiBuildTables tables;
     AcpiBuildState *build_state;
+    AcpiGedState *acpi_ged_state;
 
     if (!vms->fw_cfg) {
         trace_virt_acpi_setup();
@@ -XXX,XX +XXX,XX @@ void virt_acpi_setup(VirtMachineState *vms)
     fw_cfg_add_file(vms->fw_cfg, ACPI_BUILD_TPMLOG_FILE, tables.tcpalog->data,
                     acpi_data_len(tables.tcpalog));
 
+    if (vms->ras) {
+        assert(vms->acpi_dev);
+        acpi_ged_state = ACPI_GED(vms->acpi_dev);
+        acpi_ghes_add_fw_cfg(&acpi_ged_state->ghes_state,
+                             vms->fw_cfg, tables.hardware_errors);
+    }
+
     build_state->rsdp_mr = acpi_add_rom_blob(virt_acpi_build_update,
                                              build_state, tables.rsdp,
                                              ACPI_BUILD_RSDP_FILE, 0);
-- 
2.20.1

From: Dongjiu Geng <gengdongjiu@huawei.com>

kvm_hwpoison_page_add() and kvm_unpoison_all() will both
be used by X86 and ARM platforms, so move them into
"accel/kvm/kvm-all.c" to avoid duplicate code.

For architectures that don't use the poison-list functionality
the reset handler will harmlessly do nothing, so let's register
the kvm_unpoison_all() function in the generic kvm_init() function.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
Acked-by: Xiang Zheng <zhengxiang9@huawei.com>
Message-id: 20200512030609.19593-8-gengdongjiu@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/sysemu/kvm_int.h | 12 ++++++++++++
 accel/kvm/kvm-all.c      | 36 ++++++++++++++++++++++++++++++++++++
 target/i386/kvm.c        | 36 ------------------------------------
 3 files changed, 48 insertions(+), 36 deletions(-)

diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h
index XXXXXXX..XXXXXXX 100644
--- a/include/sysemu/kvm_int.h
+++ b/include/sysemu/kvm_int.h
@@ -XXX,XX +XXX,XX @@ void kvm_memory_listener_register(KVMState *s, KVMMemoryListener *kml,
                                   AddressSpace *as, int as_id);
 
 void kvm_set_max_memslot_size(hwaddr max_slot_size);
+
+/**
+ * kvm_hwpoison_page_add:
+ *
+ * Parameters:
+ *  @ram_addr: the address in the RAM for the poisoned page
+ *
+ * Add a poisoned page to the list
+ *
+ * Return: None.
+ */
+void kvm_hwpoison_page_add(ram_addr_t ram_addr);
 #endif
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index XXXXXXX..XXXXXXX 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -XXX,XX +XXX,XX @@
 #include "qapi/visitor.h"
 #include "qapi/qapi-types-common.h"
 #include "qapi/qapi-visit-common.h"
+#include "sysemu/reset.h"
 
 #include "hw/boards.h"
 
@@ -XXX,XX +XXX,XX @@ int kvm_vm_check_extension(KVMState *s, unsigned int extension)
     return ret;
 }
 
+typedef struct HWPoisonPage {
+    ram_addr_t ram_addr;
+    QLIST_ENTRY(HWPoisonPage) list;
+} HWPoisonPage;
+
+static QLIST_HEAD(, HWPoisonPage) hwpoison_page_list =
+    QLIST_HEAD_INITIALIZER(hwpoison_page_list);
+
+static void kvm_unpoison_all(void *param)
+{
+    HWPoisonPage *page, *next_page;
+
+    QLIST_FOREACH_SAFE(page, &hwpoison_page_list, list, next_page) {
+        QLIST_REMOVE(page, list);
+        qemu_ram_remap(page->ram_addr, TARGET_PAGE_SIZE);
+        g_free(page);
+    }
+}
+
+void kvm_hwpoison_page_add(ram_addr_t ram_addr)
+{
+    HWPoisonPage *page;
+
+    QLIST_FOREACH(page, &hwpoison_page_list, list) {
+        if (page->ram_addr == ram_addr) {
+            return;
+        }
+    }
+    page = g_new(HWPoisonPage, 1);
+    page->ram_addr = ram_addr;
+    QLIST_INSERT_HEAD(&hwpoison_page_list, page, list);
+}
+
 static uint32_t adjust_ioeventfd_endianness(uint32_t val, uint32_t size)
 {
 #if defined(HOST_WORDS_BIGENDIAN) != defined(TARGET_WORDS_BIGENDIAN)
@@ -XXX,XX +XXX,XX @@ static int kvm_init(MachineState *ms)
         s->kernel_irqchip_split = mc->default_kernel_irqchip_split ? ON_OFF_AUTO_ON : ON_OFF_AUTO_OFF;
     }
 
+    qemu_register_reset(kvm_unpoison_all, NULL);
+
     if (s->kernel_irqchip_allowed) {
         kvm_irqchip_create(s);
     }
diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index XXXXXXX..XXXXXXX 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -XXX,XX +XXX,XX @@
 #include "sysemu/sysemu.h"
 #include "sysemu/hw_accel.h"
 #include "sysemu/kvm_int.h"
-#include "sysemu/reset.h"
 #include "sysemu/runstate.h"
 #include "kvm_i386.h"
 #include "hyperv.h"
@@ -XXX,XX +XXX,XX @@ uint64_t kvm_arch_get_supported_msr_feature(KVMState *s, uint32_t index)
     }
 }
 
-
-typedef struct HWPoisonPage {
-    ram_addr_t ram_addr;
-    QLIST_ENTRY(HWPoisonPage) list;
-} HWPoisonPage;
-
-static QLIST_HEAD(, HWPoisonPage) hwpoison_page_list =
-    QLIST_HEAD_INITIALIZER(hwpoison_page_list);
-
-static void kvm_unpoison_all(void *param)
-{
-    HWPoisonPage *page, *next_page;
-
-    QLIST_FOREACH_SAFE(page, &hwpoison_page_list, list, next_page) {
-        QLIST_REMOVE(page, list);
-        qemu_ram_remap(page->ram_addr, TARGET_PAGE_SIZE);
-        g_free(page);
-    }
-}
-
-static void kvm_hwpoison_page_add(ram_addr_t ram_addr)
-{
-    HWPoisonPage *page;
-
-    QLIST_FOREACH(page, &hwpoison_page_list, list) {
-        if (page->ram_addr == ram_addr) {
-            return;
-        }
-    }
-    page = g_new(HWPoisonPage, 1);
-    page->ram_addr = ram_addr;
-    QLIST_INSERT_HEAD(&hwpoison_page_list, page, list);
-}
-
 static int kvm_get_mce_cap_supported(KVMState *s, uint64_t *mce_cap,
                                      int *max_banks)
 {
@@ -XXX,XX +XXX,XX @@ int kvm_arch_init(MachineState *ms, KVMState *s)
         fprintf(stderr, "e820_add_entry() table is full\n");
         return ret;
     }
-    qemu_register_reset(kvm_unpoison_all, NULL);
 
     shadow_mem = object_property_get_int(OBJECT(s), "kvm-shadow-mem", &error_abort);
     if (shadow_mem != -1) {
-- 
2.20.1

From: Dongjiu Geng <gengdongjiu@huawei.com>

kvm_arch_on_sigbus_vcpu() error injection uses source_id as an
index into etc/hardware_errors to find the Error Status Data
Block entry for the error source. Supported source_id values
must therefore be assigned here and never changed afterwards,
so that errors are written to the expected Error Status Data
Block.

Before QEMU writes a new error to the ACPI table, it checks whether
the previous error has been acknowledged. If it has not, the new
error is ignored and not recorded. For the error section type,
QEMU simulates a memory section error.
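
The acknowledge check described above can be modelled as follows. This
is a simplified sketch, not the code in this patch: struct
error_source, record_error() and guest_acknowledge() are hypothetical
names, and clearing the register when an error is recorded is an
assumption. The guest side applies the Read Ack Preserve (~0x1) and
Read Ack Write (0x1) values that build_ghes_v2() puts in the HEST:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 1024 /* one Error Status Data Block */

/* Hypothetical model of one error source in "etc/hardware_errors" */
struct error_source {
    uint64_t read_ack_register; /* bit 0 set: previous error acknowledged */
    uint8_t status_block[BLOCK_SIZE];
};

/* Returns 0 if the error was recorded, -1 if it was ignored */
int record_error(struct error_source *src, const void *cper, size_t len)
{
    assert(src != NULL && cper != NULL);
    if (len > BLOCK_SIZE) {
        return -1;
    }
    if (!(src->read_ack_register & 1)) {
        /* Previous error not acknowledged yet: drop the new one */
        return -1;
    }
    /* Assumption: clear the register until the guest acknowledges */
    src->read_ack_register = 0;
    memcpy(src->status_block, cper, len);
    return 0;
}

/* Guest side: keep the preserved bits, then set the write value */
void guest_acknowledge(struct error_source *src)
{
    src->read_ack_register =
        (src->read_ack_register & ~UINT64_C(1)) | UINT64_C(1);
}
```

The register starts at 1, as in build_ghes_error_table(), so the first
error after (re)boot can always be recorded; a second error is dropped
until the guest has acknowledged the first.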

Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 20200512030609.19593-9-gengdongjiu@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/acpi/ghes.h |   1 +
 hw/acpi/ghes.c         | 219 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 220 insertions(+)

diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/acpi/ghes.h
+++ b/include/hw/acpi/ghes.h
@@ -XXX,XX +XXX,XX @@ void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker);
 void acpi_build_hest(GArray *table_data, BIOSLinker *linker);
 void acpi_ghes_add_fw_cfg(AcpiGhesState *vms, FWCfgState *s,
                           GArray *hardware_errors);
+int acpi_ghes_record_errors(uint8_t notify, uint64_t error_physical_addr);
 #endif
diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/acpi/ghes.c
+++ b/hw/acpi/ghes.c
@@ -XXX,XX +XXX,XX @@
 #include "qemu/error-report.h"
 #include "hw/acpi/generic_event_device.h"
 #include "hw/nvram/fw_cfg.h"
+#include "qemu/uuid.h"
 
 #define ACPI_GHES_ERRORS_FW_CFG_FILE        "etc/hardware_errors"
 #define ACPI_GHES_DATA_ADDR_FW_CFG_FILE     "etc/hardware_errors_addr"
@@ -XXX,XX +XXX,XX @@
 /* Address offset in Generic Address Structure(GAS) */
 #define GAS_ADDR_OFFSET 4
 
+/*
+ * The total size of Generic Error Data Entry
+ * ACPI 6.1/6.2: 18.3.2.7.1 Generic Error Data,
+ * Table 18-343 Generic Error Data Entry
+ */
+#define ACPI_GHES_DATA_LENGTH               72
+
+/* The memory section CPER size, UEFI 2.6: N.2.5 Memory Error Section */
+#define ACPI_GHES_MEM_CPER_LENGTH           80
+
+/* Masks for block_status flags */
+#define ACPI_GEBS_UNCORRECTABLE         1
+
+/*
+ * Total size for Generic Error Status Block except Generic Error Data Entries
+ * ACPI 6.2: 18.3.2.7.1 Generic Error Data,
+ * Table 18-380 Generic Error Status Block
+ */
+#define ACPI_GHES_GESB_SIZE                 20
+
+/*
+ * Values for error_severity field
+ */
+enum AcpiGenericErrorSeverity {
+    ACPI_CPER_SEV_RECOVERABLE = 0,
+    ACPI_CPER_SEV_FATAL = 1,
+    ACPI_CPER_SEV_CORRECTED = 2,
+    ACPI_CPER_SEV_NONE = 3,
+};
+
 /*
  * Hardware Error Notification
  * ACPI 4.0: 17.3.2.7 Hardware Error Notification
@@ -XXX,XX +XXX,XX @@ static void build_ghes_hw_error_notification(GArray *table, const uint8_t type)
     build_append_int_noprefix(table, 0, 4);
 }
 
+/*
+ * Generic Error Data Entry
+ * ACPI 6.1: 18.3.2.7.1 Generic Error Data
+ */
+static void acpi_ghes_generic_error_data(GArray *table,
+                const uint8_t *section_type, uint32_t error_severity,
+                uint8_t validation_bits, uint8_t flags,
+                uint32_t error_data_length, QemuUUID fru_id,
+                uint64_t time_stamp)
+{
+    const uint8_t fru_text[20] = {0};
+
+    /* Section Type */
+    g_array_append_vals(table, section_type, 16);
+
+    /* Error Severity */
+    build_append_int_noprefix(table, error_severity, 4);
+    /* Revision */
+    build_append_int_noprefix(table, 0x300, 2);
+    /* Validation Bits */
+    build_append_int_noprefix(table, validation_bits, 1);
+    /* Flags */
+    build_append_int_noprefix(table, flags, 1);
+    /* Error Data Length */
+    build_append_int_noprefix(table, error_data_length, 4);
+
+    /* FRU Id */
+    g_array_append_vals(table, fru_id.data, ARRAY_SIZE(fru_id.data));
+
+    /* FRU Text */
+    g_array_append_vals(table, fru_text, sizeof(fru_text));
+
+    /* Timestamp */
+    build_append_int_noprefix(table, time_stamp, 8);
+}
+
+/*
+ * Generic Error Status Block
+ * ACPI 6.1: 18.3.2.7.1 Generic Error Data
+ */
+static void acpi_ghes_generic_error_status(GArray *table, uint32_t block_status,
+                uint32_t raw_data_offset, uint32_t raw_data_length,
+                uint32_t data_length, uint32_t error_severity)
+{
+    /* Block Status */
+    build_append_int_noprefix(table, block_status, 4);
+    /* Raw Data Offset */
+    build_append_int_noprefix(table, raw_data_offset, 4);
+    /* Raw Data Length */
+    build_append_int_noprefix(table, raw_data_length, 4);
+    /* Data Length */
+    build_append_int_noprefix(table, data_length, 4);
+    /* Error Severity */
+    build_append_int_noprefix(table, error_severity, 4);
+}
+
+/* UEFI 2.6: N.2.5 Memory Error Section */
+static void acpi_ghes_build_append_mem_cper(GArray *table,
+                                            uint64_t error_physical_addr)
+{
+    /*
+     * Memory Error Record
+     */
+
+    /* Validation Bits */
+    build_append_int_noprefix(table,
+                              (1ULL << 14) | /* Type Valid */
+                              (1ULL << 1) /* Physical Address Valid */,
+                              8);
+    /* Error Status */
+    build_append_int_noprefix(table, 0, 8);
+    /* Physical Address */
+    build_append_int_noprefix(table, error_physical_addr, 8);
+    /* Skip all the detailed information normally found in such a record */
+    build_append_int_noprefix(table, 0, 48);
+    /* Memory Error Type */
+    build_append_int_noprefix(table, 0 /* Unknown error */, 1);
+    /* Skip all the detailed information normally found in such a record */
+    build_append_int_noprefix(table, 0, 7);
+}
+
+static int acpi_ghes_record_mem_error(uint64_t error_block_address,
+                                      uint64_t error_physical_addr)
+{
+    GArray *block;
+
+    /* Memory Error Section Type */
+    const uint8_t uefi_cper_mem_sec[] =
+          UUID_LE(0xA5BC1114, 0x6F64, 0x4EDE, 0xB8, 0x63, 0x3E, 0x83, \
+                  0xED, 0x7C, 0x83, 0xB1);
+
+    /* invalid fru id: ACPI 4.0: 17.3.2.6.1 Generic Error Data,
+     * Table 17-13 Generic Error Data Entry
+     */
+    QemuUUID fru_id = {};
+    uint32_t data_length;
+
+    block = g_array_new(false, true /* clear */, 1);
+
+    /* This is the length if adding a new generic error data entry */
+    data_length = ACPI_GHES_DATA_LENGTH + ACPI_GHES_MEM_CPER_LENGTH;
+
+    /*
+     * Check whether it will run out of the preallocated memory if adding a new
+     * generic error data entry
+     */
+    if ((data_length + ACPI_GHES_GESB_SIZE) > ACPI_GHES_MAX_RAW_DATA_LENGTH) {
+        error_report("Not enough memory to record new CPER!!!");
+        g_array_free(block, true);
+        return -1;
+    }
+
+    /* Build the new generic error status block header */
+    acpi_ghes_generic_error_status(block, ACPI_GEBS_UNCORRECTABLE,
+        0, 0, data_length, ACPI_CPER_SEV_RECOVERABLE);
+
+    /* Build this new generic error data entry header */
+    acpi_ghes_generic_error_data(block, uefi_cper_mem_sec,
+        ACPI_CPER_SEV_RECOVERABLE, 0, 0,
+        ACPI_GHES_MEM_CPER_LENGTH, fru_id, 0);
+
+    /* Build the memory section CPER for above new generic error data entry */
+    acpi_ghes_build_append_mem_cper(block, error_physical_addr);
+
+    /* Write the generic error data entry into guest memory */
+    cpu_physical_memory_write(error_block_address, block->data, block->len);
+
+    g_array_free(block, true);
+
+    return 0;
+}
+
 /*
  * Build table for the hardware error fw_cfg blob.
  * Initialize "etc/hardware_errors" and "etc/hardware_errors_addr" fw_cfg blobs.
@@ -XXX,XX +XXX,XX @@ void acpi_ghes_add_fw_cfg(AcpiGhesState *ags, FWCfgState *s,
     fw_cfg_add_file_callback(s, ACPI_GHES_DATA_ADDR_FW_CFG_FILE, NULL, NULL,
         NULL, &(ags->ghes_addr_le), sizeof(ags->ghes_addr_le), false);
 }
+
+int acpi_ghes_record_errors(uint8_t source_id, uint64_t physical_address)
+{
+    uint64_t error_block_addr, read_ack_register_addr, read_ack_register = 0;
+    uint64_t start_addr;
+    int ret = -1;
+    AcpiGedState *acpi_ged_state;
+    AcpiGhesState *ags;
+
+    assert(source_id < ACPI_HEST_SRC_ID_RESERVED);
+
+    acpi_ged_state = ACPI_GED(object_resolve_path_type("", TYPE_ACPI_GED,
+                                                       NULL));
+    g_assert(acpi_ged_state);
+    ags = &acpi_ged_state->ghes_state;
+
+    start_addr = le64_to_cpu(ags->ghes_addr_le);
+
+    if (physical_address) {
+
+        if (source_id < ACPI_HEST_SRC_ID_RESERVED) {
+            start_addr += source_id * sizeof(uint64_t);
+        }
+
+        cpu_physical_memory_read(start_addr, &error_block_addr,
+                                 sizeof(error_block_addr));
+
+        error_block_addr = le64_to_cpu(error_block_addr);
+
+        read_ack_register_addr = start_addr +
+            ACPI_GHES_ERROR_SOURCE_COUNT * sizeof(uint64_t);
+
+        cpu_physical_memory_read(read_ack_register_addr,
+                                 &read_ack_register, sizeof(read_ack_register));
+
+        /* zero means OSPM does not acknowledge the error */
+        if (!read_ack_register) {
+            error_report("OSPM does not acknowledge previous error,"
+                " so can not record CPER for current error anymore");
+        } else if (error_block_addr) {
+            read_ack_register = cpu_to_le64(0);
+            /*
+             * Clear the Read Ack Register, OSPM will write it to 1 when
+             * it acknowledges this error.
+             */
+            cpu_physical_memory_write(read_ack_register_addr,
+                &read_ack_register, sizeof(uint64_t));
+
+            ret = acpi_ghes_record_mem_error(error_block_addr,
+                                             physical_address);
+        } else
+            error_report("can not find Generic Error Status Block");
+    }
+
+    return ret;
+}
-- 
2.20.1

From: Dongjiu Geng <gengdongjiu@huawei.com>

Add a SIGBUS signal handler. The handler checks the SIGBUS type,
translates the host VA delivered by the host into a guest PA, records
that PA in the guest's APEI GHES memory, and then notifies the guest
according to the SIGBUS type.

When the guest accesses poisoned memory, a Synchronous External
Abort (SEA) is generated. The host kernel then gets an APEI notification
and calls memory_failure() to unmap the affected page in stage 2, and
finally returns to the guest.

If the guest continues to access the PG_hwpoison page, the access traps
to KVM as a stage 2 fault, and a synchronous SIGBUS with code
BUS_MCEERR_AR is delivered to QEMU. QEMU records the error address in
the guest's APEI GHES memory and notifies the guest using a
Synchronous External Abort (SEA).
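
The decision flow of the new handler can be summarised with a small
sketch. This is not the handler itself: the enum names and the
classify_sigbus() function are hypothetical, "ras_enabled" stands in for
the combined "ACPI enabled, address non-NULL, ras property set" check,
and "addr_is_guest_ram" stands in for a successful host-VA to guest-PA
translation.

```c
#include <stdbool.h>

/* Hypothetical labels for the possible outcomes; illustration only. */
enum sigbus_action {
    ACTION_INJECT_SEA,   /* poison page, record CPER, inject a vSEA */
    ACTION_POISON_ONLY,  /* poison page, guest is not told (AO case) */
    ACTION_REPORT_ONLY,  /* AO error in memory used by QEMU itself */
    ACTION_IGNORE,       /* AO error with RAS reporting unavailable */
    ACTION_EXIT          /* AR error that cannot be forwarded */
};

/* Stand-ins for the BUS_MCEERR_AR / BUS_MCEERR_AO si_code values. */
enum sigbus_code { CODE_MCEERR_AR, CODE_MCEERR_AO };

static enum sigbus_action classify_sigbus(enum sigbus_code code,
                                          bool ras_enabled,
                                          bool addr_is_guest_ram)
{
    if (ras_enabled) {
        if (addr_is_guest_ram) {
            /* Only the synchronous (AR) case is injected for now. */
            return code == CODE_MCEERR_AR ? ACTION_INJECT_SEA
                                          : ACTION_POISON_ONLY;
        }
        if (code == CODE_MCEERR_AO) {
            return ACTION_REPORT_ONLY;
        }
    }
    /* An action-required error we cannot hand to the guest is fatal. */
    return code == CODE_MCEERR_AR ? ACTION_EXIT : ACTION_IGNORE;
}
```

In the real handler the ACTION_INJECT_SEA path additionally aborts if
recording the CPER fails; the sketch leaves that out.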

To inject a vSEA, we introduce the kvm_inject_arm_sea() function,
which sets up the type of exception and the syndrome information.
When switching to the guest, the target vCPU will jump to the
synchronous external abort vector table entry.

ESR_ELx.DFSC is set to synchronous external abort (0x10), and
ESR_ELx.FnV is set to not valid (0x1), which tells the guest that FAR is
not valid and holds an UNKNOWN value. These values are written to the
KVM register structures through the KVM_SET_ONE_REG ioctl.
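
As a worked example, the syndrome value built for the injected abort can
be computed standalone. This is a sketch only: sea_syndrome() is a
hypothetical name, and the three constants are assumed to match the
definitions in QEMU's target/arm headers (EC_DATAABORT, ARM_EL_EC_SHIFT,
ARM_EL_IL). The bit layout follows syn_data_abort_no_iss() as modified
by this patch, with ea, cm, s1ptw and wnr all zero.

```c
#include <stdint.h>

/* Assumed to mirror the values used by QEMU's target/arm code. */
#define EC_DATAABORT     0x24u
#define ARM_EL_EC_SHIFT  26
#define ARM_EL_IL        (1u << 25)

/* Hypothetical helper: ESR for a synchronous external data abort. */
static uint32_t sea_syndrome(int same_el)
{
    const uint32_t fnv = 1;      /* FnV: FAR_ELx is not valid */
    const uint32_t dfsc = 0x10;  /* DFSC: synchronous external abort */

    return (EC_DATAABORT << ARM_EL_EC_SHIFT)
           | ((uint32_t)same_el << ARM_EL_EC_SHIFT) /* EC becomes 0x25 */
           | ARM_EL_IL
           | (fnv << 10)
           | dfsc;
}
```

With these assumptions, an abort taken from a lower EL yields 0x92000410
and one taken from the same EL yields 0x96000410.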

Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Xiang Zheng <zhengxiang9@huawei.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-id: 20200512030609.19593-10-gengdongjiu@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/sysemu/kvm.h    |  3 +-
 target/arm/cpu.h        |  4 +++
 target/arm/internals.h  |  5 +--
 target/i386/cpu.h       |  2 ++
 target/arm/helper.c     |  2 +-
 target/arm/kvm64.c      | 77 +++++++++++++++++++++++++++++++++++++++++
 target/arm/tlb_helper.c |  2 +-
 7 files changed, 89 insertions(+), 6 deletions(-)

diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index XXXXXXX..XXXXXXX 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -XXX,XX +XXX,XX @@ bool kvm_vcpu_id_is_valid(int vcpu_id);
 /* Returns VCPU ID to be used on KVM_CREATE_VCPU ioctl() */
 unsigned long kvm_arch_vcpu_id(CPUState *cpu);
 
-#ifdef TARGET_I386
-#define KVM_HAVE_MCE_INJECTION 1
+#ifdef KVM_HAVE_MCE_INJECTION
 void kvm_arch_on_sigbus_vcpu(CPUState *cpu, int code, void *addr);
 #endif
 
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@
 /* ARM processors have a weak memory model */
 #define TCG_GUEST_DEFAULT_MO      (0)
 
+#ifdef TARGET_AARCH64
+#define KVM_HAVE_MCE_INJECTION 1
+#endif
+
 #define EXCP_UDEF            1   /* undefined instruction */
 #define EXCP_SWI             2   /* software interrupt */
 #define EXCP_PREFETCH_ABORT  3
diff --git a/target/arm/internals.h b/target/arm/internals.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -XXX,XX +XXX,XX @@ static inline uint32_t syn_insn_abort(int same_el, int ea, int s1ptw, int fsc)
         | ARM_EL_IL | (ea << 9) | (s1ptw << 7) | fsc;
 }
 
-static inline uint32_t syn_data_abort_no_iss(int same_el,
+static inline uint32_t syn_data_abort_no_iss(int same_el, int fnv,
                                              int ea, int cm, int s1ptw,
                                              int wnr, int fsc)
 {
     return (EC_DATAABORT << ARM_EL_EC_SHIFT) | (same_el << ARM_EL_EC_SHIFT)
            | ARM_EL_IL
-           | (ea << 9) | (cm << 8) | (s1ptw << 7) | (wnr << 6) | fsc;
+           | (fnv << 10) | (ea << 9) | (cm << 8) | (s1ptw << 7)
+           | (wnr << 6) | fsc;
 }
 
 static inline uint32_t syn_data_abort_with_iss(int same_el,
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -XXX,XX +XXX,XX @@
 /* The x86 has a strong memory model with some store-after-load re-ordering */
 #define TCG_GUEST_DEFAULT_MO      (TCG_MO_ALL & ~TCG_MO_ST_LD)
 
+#define KVM_HAVE_MCE_INJECTION 1
+
 /* Maximum instruction code size */
 #define TARGET_MAX_INSN_SIZE 16
 
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static uint64_t do_ats_write(CPUARMState *env, uint64_t value,
              * Report exception with ESR indicating a fault due to a
              * translation table walk for a cache maintenance instruction.
              */
-            syn = syn_data_abort_no_iss(current_el == target_el,
+            syn = syn_data_abort_no_iss(current_el == target_el, 0,
                                         fi.ea, 1, fi.s1ptw, 1, fsc);
             env->exception.vaddress = value;
             env->exception.fsr = fsr;
diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/kvm64.c
+++ b/target/arm/kvm64.c
@@ -XXX,XX +XXX,XX @@
 #include "sysemu/kvm_int.h"
 #include "kvm_arm.h"
 #include "internals.h"
+#include "hw/acpi/acpi.h"
+#include "hw/acpi/ghes.h"
+#include "hw/arm/virt.h"
 
 static bool have_guest_debug;
 
@@ -XXX,XX +XXX,XX @@ int kvm_arm_cpreg_level(uint64_t regidx)
     return KVM_PUT_RUNTIME_STATE;
 }
 
+/* Callers must hold the iothread mutex lock */
+static void kvm_inject_arm_sea(CPUState *c)
+{
+    ARMCPU *cpu = ARM_CPU(c);
+    CPUARMState *env = &cpu->env;
+    CPUClass *cc = CPU_GET_CLASS(c);
+    uint32_t esr;
+    bool same_el;
+
+    c->exception_index = EXCP_DATA_ABORT;
+    env->exception.target_el = 1;
+
+    /*
+     * Set the DFSC to synchronous external abort and set FnV to not valid,
+     * this will tell guest the FAR_ELx is UNKNOWN for this abort.
+     */
+    same_el = arm_current_el(env) == env->exception.target_el;
+    esr = syn_data_abort_no_iss(same_el, 1, 0, 0, 0, 0, 0x10);
+
+    env->exception.syndrome = esr;
+
+    cc->do_interrupt(c);
+}
+
 #define AARCH64_CORE_REG(x)   (KVM_REG_ARM64 | KVM_REG_SIZE_U64 | \
                  KVM_REG_ARM_CORE | KVM_REG_ARM_CORE_REG(x))
 
@@ -XXX,XX +XXX,XX @@ int kvm_arch_get_registers(CPUState *cs)
     return ret;
 }
 
+void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
+{
+    ram_addr_t ram_addr;
+    hwaddr paddr;
+    Object *obj = qdev_get_machine();
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+    bool acpi_enabled = virt_is_acpi_enabled(vms);
+
+    assert(code == BUS_MCEERR_AR || code == BUS_MCEERR_AO);
+
+    if (acpi_enabled && addr &&
+            object_property_get_bool(obj, "ras", NULL)) {
+        ram_addr = qemu_ram_addr_from_host(addr);
+        if (ram_addr != RAM_ADDR_INVALID &&
+            kvm_physical_memory_addr_from_host(c->kvm_state, addr, &paddr)) {
+            kvm_hwpoison_page_add(ram_addr);
+            /*
+             * If this is a BUS_MCEERR_AR, we know we have been called
+             * synchronously from the vCPU thread, so we can easily
+             * synchronize the state and inject an error.
+             *
+             * TODO: we currently don't tell the guest at all about
+             * BUS_MCEERR_AO. In that case we might either be being
+             * called synchronously from the vCPU thread, or a bit
+             * later from the main thread, so doing the injection of
+             * the error would be more complicated.
+             */
+            if (code == BUS_MCEERR_AR) {
+                kvm_cpu_synchronize_state(c);
+                if (!acpi_ghes_record_errors(ACPI_HEST_SRC_ID_SEA, paddr)) {
+                    kvm_inject_arm_sea(c);
+                } else {
+                    error_report("failed to record the error");
+                    abort();
+                }
+            }
+            return;
+        }
+        if (code == BUS_MCEERR_AO) {
+            error_report("Hardware memory error at addr %p for memory used by "
+                "QEMU itself instead of guest system!", addr);
+        }
+    }
+
+    if (code == BUS_MCEERR_AR) {
+        error_report("Hardware memory error!");
+        exit(1);
+    }
+}
+
 /* C6.6.29 BRK instruction */
 static const uint32_t brk_insn = 0xd4200000;
 
diff --git a/target/arm/tlb_helper.c b/target/arm/tlb_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tlb_helper.c
+++ b/target/arm/tlb_helper.c
@@ -XXX,XX +XXX,XX @@ static inline uint32_t merge_syn_data_abort(uint32_t template_syn,
      * ISV field.
      */
     if (!(template_syn & ARM_EL_ISV) || target_el != 2 || s1ptw) {
-        syn = syn_data_abort_no_iss(same_el,
+        syn = syn_data_abort_no_iss(same_el, 0,
                                     ea, 0, s1ptw, is_write, fsc);
     } else {
         /*
-- 
2.20.1

From: Dongjiu Geng <gengdongjiu@huawei.com>

Xiang and I are willing to review the APEI-related patches and
volunteer as reviewers for the HEST/GHES part.

Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 20200512030609.19593-11-gengdongjiu@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 MAINTAINERS | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index XXXXXXX..XXXXXXX 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -XXX,XX +XXX,XX @@ F: tests/qtest/bios-tables-test.c
 F: tests/qtest/acpi-utils.[hc]
 F: tests/data/acpi/
 
+ACPI/HEST/GHES
+R: Dongjiu Geng <gengdongjiu@huawei.com>
+R: Xiang Zheng <zhengxiang9@huawei.com>
+L: qemu-arm@nongnu.org
+S: Maintained
+F: hw/acpi/ghes.c
+F: include/hw/acpi/ghes.h
+F: docs/specs/acpi_hest_ghes.rst
+
 ppc4xx
 M: David Gibson <david@gibson.dropbear.id.au>
 L: qemu-ppc@nongnu.org
-- 
2.20.1

Convert the Neon VQRDMLAH and VQRDMLSH insns in the 3-reg-same group
to decodetree.  These don't use do_3same() because they want to
operate on VFP double registers, whose offsets are different from the
neon_reg_offset() calculations do_3same does.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200512163904.10918-2-peter.maydell@linaro.org
---
 target/arm/neon-dp.decode       |  3 +++
 target/arm/translate-neon.inc.c | 15 +++++++++++++++
 target/arm/translate.c          | 14 ++------------
 3 files changed, 20 insertions(+), 12 deletions(-)

Convert the Neon SHA instructions in the 3-reg-same group
to decodetree.
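
Each of the new trans functions applies the same two register checks
before generating code. A minimal sketch of just those checks, pulled
out into a hypothetical helper (sha_3reg_regs_undef() and the
have_d16_d31 flag are illustrative names, not part of the patch):

```c
#include <stdbool.h>

/*
 * Hypothetical helper: returns true if the register numbers alone
 * make the encoding UNDEF. vd/vn/vm are D-register indexes (0..31).
 */
static bool sha_3reg_regs_undef(unsigned vd, unsigned vn, unsigned vm,
                                bool have_d16_d31)
{
    unsigned all = vd | vn | vm;

    /* D16-D31 accesses UNDEF when those registers do not exist. */
    if (!have_d16_d31 && (all & 0x10)) {
        return true;
    }
    /* The insns work on Q registers, so each D index must be even. */
    if (all & 1) {
        return true;
    }
    return false;
}
```

The feature checks (Neon plus SHA1 or SHA2) and the VFP access check
are deliberately left out of this sketch.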

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200512163904.10918-3-peter.maydell@linaro.org
---
 target/arm/neon-dp.decode       |  10 +++
 target/arm/translate-neon.inc.c | 139 ++++++++++++++++++++++++++++++++
 target/arm/translate.c          |  46 +----------
 3 files changed, 151 insertions(+), 44 deletions(-)

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -XXX,XX +XXX,XX @@ VMUL_3s          1111 001 0 0 . .. .... .... 1001 . . . 1 .... @3same
 VMUL_p_3s        1111 001 1 0 . .. .... .... 1001 . . . 1 .... @3same
 
 VQRDMLAH_3s      1111 001 1 0 . .. .... .... 1011 ... 1 .... @3same
+
+SHA1_3s          1111 001 0 0 . optype:2 .... .... 1100 . 1 . 0 .... \
+                 vm=%vm_dp vn=%vn_dp vd=%vd_dp
+SHA256H_3s       1111 001 1 0 . 00 .... .... 1100 . 1 . 0 .... \
+                 vm=%vm_dp vn=%vn_dp vd=%vd_dp
+SHA256H2_3s      1111 001 1 0 . 01 .... .... 1100 . 1 . 0 .... \
+                 vm=%vm_dp vn=%vn_dp vd=%vd_dp
+SHA256SU1_3s     1111 001 1 0 . 10 .... .... 1100 . 1 . 0 .... \
+                 vm=%vm_dp vn=%vn_dp vd=%vd_dp
+
 VQRDMLSH_3s      1111 001 1 0 . .. .... .... 1100 ... 1 .... @3same
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
 
 DO_VQRDMLAH(VQRDMLAH, gen_gvec_sqrdmlah_qc)
 DO_VQRDMLAH(VQRDMLSH, gen_gvec_sqrdmlsh_qc)
+
+static bool trans_SHA1_3s(DisasContext *s, arg_SHA1_3s *a)
+{
+    TCGv_ptr ptr1, ptr2, ptr3;
+    TCGv_i32 tmp;
+
+    if (!arm_dc_feature(s, ARM_FEATURE_NEON) ||
+        !dc_isar_feature(aa32_sha1, s)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vd | a->vn | a->vm) & 0x10)) {
+        return false;
+    }
+
+    if ((a->vn | a->vm | a->vd) & 1) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    ptr1 = vfp_reg_ptr(true, a->vd);
+    ptr2 = vfp_reg_ptr(true, a->vn);
+    ptr3 = vfp_reg_ptr(true, a->vm);
+    tmp = tcg_const_i32(a->optype);
+    gen_helper_crypto_sha1_3reg(ptr1, ptr2, ptr3, tmp);
+    tcg_temp_free_i32(tmp);
+    tcg_temp_free_ptr(ptr1);
+    tcg_temp_free_ptr(ptr2);
+    tcg_temp_free_ptr(ptr3);
+
+    return true;
+}
+
+static bool trans_SHA256H_3s(DisasContext *s, arg_SHA256H_3s *a)
+{
+    TCGv_ptr ptr1, ptr2, ptr3;
+
+    if (!arm_dc_feature(s, ARM_FEATURE_NEON) ||
+        !dc_isar_feature(aa32_sha2, s)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vd | a->vn | a->vm) & 0x10)) {
+        return false;
+    }
+
+    if ((a->vn | a->vm | a->vd) & 1) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    ptr1 = vfp_reg_ptr(true, a->vd);
+    ptr2 = vfp_reg_ptr(true, a->vn);
+    ptr3 = vfp_reg_ptr(true, a->vm);
+    gen_helper_crypto_sha256h(ptr1, ptr2, ptr3);
+    tcg_temp_free_ptr(ptr1);
+    tcg_temp_free_ptr(ptr2);
+    tcg_temp_free_ptr(ptr3);
+
+    return true;
+}
+
+static bool trans_SHA256H2_3s(DisasContext *s, arg_SHA256H2_3s *a)
+{
+    TCGv_ptr ptr1, ptr2, ptr3;
+
+    if (!arm_dc_feature(s, ARM_FEATURE_NEON) ||
+        !dc_isar_feature(aa32_sha2, s)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vd | a->vn | a->vm) & 0x10)) {
+        return false;
+    }
+
+    if ((a->vn | a->vm | a->vd) & 1) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    ptr1 = vfp_reg_ptr(true, a->vd);
+    ptr2 = vfp_reg_ptr(true, a->vn);
+    ptr3 = vfp_reg_ptr(true, a->vm);
+    gen_helper_crypto_sha256h2(ptr1, ptr2, ptr3);
+    tcg_temp_free_ptr(ptr1);
+    tcg_temp_free_ptr(ptr2);
+    tcg_temp_free_ptr(ptr3);
+
+    return true;
+}
+
+static bool trans_SHA256SU1_3s(DisasContext *s, arg_SHA256SU1_3s *a)
+{
+    TCGv_ptr ptr1, ptr2, ptr3;
+
+    if (!arm_dc_feature(s, ARM_FEATURE_NEON) ||
+        !dc_isar_feature(aa32_sha2, s)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vd | a->vn | a->vm) & 0x10)) {
+        return false;
+    }
+
+    if ((a->vn | a->vm | a->vd) & 1) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    ptr1 = vfp_reg_ptr(true, a->vd);
+    ptr2 = vfp_reg_ptr(true, a->vn);
+    ptr3 = vfp_reg_ptr(true, a->vm);
+    gen_helper_crypto_sha256su1(ptr1, ptr2, ptr3);
+    tcg_temp_free_ptr(ptr1);
+    tcg_temp_free_ptr(ptr2);
+    tcg_temp_free_ptr(ptr3);
+
+    return true;
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
     int vec_size;
     uint32_t imm;
     TCGv_i32 tmp, tmp2, tmp3, tmp4, tmp5;
-    TCGv_ptr ptr1, ptr2, ptr3;
+    TCGv_ptr ptr1, ptr2;
     TCGv_i64 tmp64;
 
     if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
             return 1;
         }
         switch (op) {
-        case NEON_3R_SHA:
-            /* The SHA-1/SHA-256 3-register instructions require special
-             * treatment here, as their size field is overloaded as an
-             * op type selector, and they all consume their input in a
-             * single pass.
-             */
-            if (!q) {
-                return 1;
-            }
-            if (!u) { /* SHA-1 */
-                if (!dc_isar_feature(aa32_sha1, s)) {
-                    return 1;
-                }
-                ptr1 = vfp_reg_ptr(true, rd);
-                ptr2 = vfp_reg_ptr(true, rn);
-                ptr3 = vfp_reg_ptr(true, rm);
-                tmp4 = tcg_const_i32(size);
-                gen_helper_crypto_sha1_3reg(ptr1, ptr2, ptr3, tmp4);
-                tcg_temp_free_i32(tmp4);
-            } else { /* SHA-256 */
-                if (!dc_isar_feature(aa32_sha2, s) || size == 3) {
-                    return 1;
-                }
-                ptr1 = vfp_reg_ptr(true, rd);
-                ptr2 = vfp_reg_ptr(true, rn);
-                ptr3 = vfp_reg_ptr(true, rm);
-                switch (size) {
-                case 0:
-                    gen_helper_crypto_sha256h(ptr1, ptr2, ptr3);
-                    break;
-                case 1:
-                    gen_helper_crypto_sha256h2(ptr1, ptr2, ptr3);
-                    break;
-                case 2:
-                    gen_helper_crypto_sha256su1(ptr1, ptr2, ptr3);
-                    break;
-                }
-            }
-            tcg_temp_free_ptr(ptr1);
-            tcg_temp_free_ptr(ptr2);
-            tcg_temp_free_ptr(ptr3);
-            return 0;
-
         case NEON_3R_VPADD_VQRDMLAH:
             if (!u) {
                 break;  /* VPADD */
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
         case NEON_3R_VMUL:
         case NEON_3R_VML:
         case NEON_3R_VSHL:
+        case NEON_3R_SHA:
             /* Already handled by decodetree */
             return 1;
         }
-- 
2.20.1

Convert the 64-bit element insns in the 3-reg-same group
to decodetree. This covers VQSHL, VRSHL and VQRSHL where
size==0b11.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200512163904.10918-4-peter.maydell@linaro.org
---
 target/arm/neon-dp.decode       | 13 +++++++++++
 target/arm/translate-neon.inc.c | 24 +++++++++++++++++++++
 target/arm/translate.c          | 38 ++-------------------------------
 3 files changed, 39 insertions(+), 36 deletions(-)

Convert the Neon VHADD insns in the 3-reg-same group to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200512163904.10918-5-peter.maydell@linaro.org
---
 target/arm/neon-dp.decode       |  2 ++
 target/arm/translate-neon.inc.c | 24 ++++++++++++++++++++++++
 target/arm/translate.c          |  4 +---
 3 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -XXX,XX +XXX,XX @@
 @3same           .... ... . . . size:2 .... .... .... . q:1 . . .... \
                  &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp
 
+VHADD_S_3s       1111 001 0 0 . .. .... .... 0000 . . . 0 .... @3same
+VHADD_U_3s       1111 001 1 0 . .. .... .... 0000 . . . 0 .... @3same
 VQADD_S_3s       1111 001 0 0 . .. .... .... 0000 . . . 1 .... @3same
 VQADD_U_3s       1111 001 1 0 . .. .... .... 0000 . . . 1 .... @3same
 
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_3SAME_64_ENV(VQSHL_S64, gen_helper_neon_qshl_s64)
 DO_3SAME_64_ENV(VQSHL_U64, gen_helper_neon_qshl_u64)
 DO_3SAME_64_ENV(VQRSHL_S64, gen_helper_neon_qrshl_s64)
 DO_3SAME_64_ENV(VQRSHL_U64, gen_helper_neon_qrshl_u64)
+
+#define DO_3SAME_32(INSN, FUNC)                                         \
+    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
+                                uint32_t rn_ofs, uint32_t rm_ofs,       \
+                                uint32_t oprsz, uint32_t maxsz)         \
+    {                                                                   \
+        static const GVecGen3 ops[4] = {                                \
+            { .fni4 = gen_helper_neon_##FUNC##8 },                      \
+            { .fni4 = gen_helper_neon_##FUNC##16 },                     \
+            { .fni4 = gen_helper_neon_##FUNC##32 },                     \
+            { 0 },                                                      \
+        };                                                              \
+        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &ops[vece]); \
+    }                                                                   \
+    static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a)        \
+    {                                                                   \
+        if (a->size > 2) {                                              \
+            return false;                                               \
+        }                                                               \
+        return do_3same(s, a, gen_##INSN##_3s);                         \
+    }
+
+DO_3SAME_32(VHADD_S, hadd_s)
+DO_3SAME_32(VHADD_U, hadd_u)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
         case NEON_3R_VML:
         case NEON_3R_VSHL:
         case NEON_3R_SHA:
+        case NEON_3R_VHADD:
             /* Already handled by decodetree */
             return 1;
         }
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
             tmp2 = neon_load_reg(rm, pass);
         }
         switch (op) {
-        case NEON_3R_VHADD:
-            GEN_NEON_INTEGER_OP(hadd);
-            break;
         case NEON_3R_VRHADD:
             GEN_NEON_INTEGER_OP(rhadd);
             break;
-- 
2.20.1

Convert the Neon VABA and VABD insns in the 3-reg-same group to
decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200512163904.10918-6-peter.maydell@linaro.org
---
 target/arm/neon-dp.decode       |  6 ++++++
 target/arm/translate-neon.inc.c |  4 ++++
 target/arm/translate.c          | 22 ++--------------------
 3 files changed, 12 insertions(+), 20 deletions(-)

Convert the Neon VRHADD and VHSUB 3-reg-same insns to decodetree.
(These are all the other insns in 3-reg-same which were using
GEN_NEON_INTEGER_OP() and which are not pairwise or
reversed-operands.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200512163904.10918-7-peter.maydell@linaro.org
---
 target/arm/neon-dp.decode       | 6 ++++++
 target/arm/translate-neon.inc.c | 4 ++++
 target/arm/translate.c          | 8 ++------
 3 files changed, 12 insertions(+), 6 deletions(-)

Convert the VQSHL, VRSHL and VQRSHL insns in the 3-reg-same
group to decodetree. We have already implemented the size==0b11
case of these insns; this commit handles the remaining sizes.
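
The overlapping-pattern groups added to neon-dp.decode rely on the
first pattern in a group taking priority. The effect for one group can
be sketched as below; the enum and function names are hypothetical, and
the position of the size field (bits [21:20]) is read off the @3same
format line rather than taken from generated decoder output.

```c
#include <stdint.h>

/* Hypothetical labels for the two patterns in one { } group. */
enum vqshl_pattern {
    PAT_VQSHL_64,     /* e.g. VQSHL_S64_3s, fixed size == 0b11 */
    PAT_VQSHL_SMALL   /* e.g. VQSHL_S_3s, size 0b00..0b10 */
};

/* size:2 sits in bits [21:20] of the 3-reg-same encoding. */
static unsigned neon_3same_size(uint32_t insn)
{
    return (insn >> 20) & 3;
}

static enum vqshl_pattern pick_vqshl_pattern(uint32_t insn)
{
    /* The 64-bit pattern is listed first, so it is tried first. */
    if (neon_3same_size(insn) == 3) {
        return PAT_VQSHL_64;
    }
    /* Everything else falls through to the general pattern. */
    return PAT_VQSHL_SMALL;
}
```

Because the 64-bit pattern always claims size == 0b11, the general
trans function never sees that size in this arrangement.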

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200512163904.10918-8-peter.maydell@linaro.org
---
 target/arm/neon-dp.decode       | 30 ++++++++++++++++++-----
 target/arm/translate-neon.inc.c | 43 +++++++++++++++++++++++++++++++++
 target/arm/translate.c          | 22 +++--------------
 3 files changed, 70 insertions(+), 25 deletions(-)

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -XXX,XX +XXX,XX @@ VSHL_U_3s        1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same_rev
 @3same_64_rev    .... ... . . . 11 .... .... .... . q:1 . . .... \
                  &3same vm=%vn_dp vn=%vm_dp vd=%vd_dp size=3
 
-VQSHL_S64_3s     1111 001 0 0 . .. .... .... 0100 . . . 1 .... @3same_64_rev
-VQSHL_U64_3s     1111 001 1 0 . .. .... .... 0100 . . . 1 .... @3same_64_rev
-VRSHL_S64_3s     1111 001 0 0 . .. .... .... 0101 . . . 0 .... @3same_64_rev
-VRSHL_U64_3s     1111 001 1 0 . .. .... .... 0101 . . . 0 .... @3same_64_rev
-VQRSHL_S64_3s    1111 001 0 0 . .. .... .... 0101 . . . 1 .... @3same_64_rev
-VQRSHL_U64_3s    1111 001 1 0 . .. .... .... 0101 . . . 1 .... @3same_64_rev
+{
+  VQSHL_S64_3s   1111 001 0 0 . .. .... .... 0100 . . . 1 .... @3same_64_rev
+  VQSHL_S_3s     1111 001 0 0 . .. .... .... 0100 . . . 1 .... @3same_rev
+}
+{
+  VQSHL_U64_3s   1111 001 1 0 . .. .... .... 0100 . . . 1 .... @3same_64_rev
+  VQSHL_U_3s     1111 001 1 0 . .. .... .... 0100 . . . 1 .... @3same_rev
+}
+{
+  VRSHL_S64_3s   1111 001 0 0 . .. .... .... 0101 . . . 0 .... @3same_64_rev
+  VRSHL_S_3s     1111 001 0 0 . .. .... .... 0101 . . . 0 .... @3same_rev
+}
+{
+  VRSHL_U64_3s   1111 001 1 0 . .. .... .... 0101 . . . 0 .... @3same_64_rev
+  VRSHL_U_3s     1111 001 1 0 . .. .... .... 0101 . . . 0 .... @3same_rev
+}
+{
+  VQRSHL_S64_3s  1111 001 0 0 . .. .... .... 0101 . . . 1 .... @3same_64_rev
+  VQRSHL_S_3s    1111 001 0 0 . .. .... .... 0101 . . . 1 .... @3same_rev
+}
+{
+  VQRSHL_U64_3s  1111 001 1 0 . .. .... .... 0101 . . . 1 .... @3same_64_rev
+  VQRSHL_U_3s    1111 001 1 0 . .. .... .... 0101 . . . 1 .... @3same_rev
+}
 
 VMAX_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 0 .... @3same
 VMAX_U_3s        1111 001 1 0 . .. .... .... 0110 . . . 0 .... @3same
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_3SAME_64_ENV(VQRSHL_U64, gen_helper_neon_qrshl_u64)
         return do_3same(s, a, gen_##INSN##_3s);                         \
     }
 
+/*
+ * Some helper functions need to be passed the cpu_env. In order
+ * to use those with the gvec APIs like tcg_gen_gvec_3() we need
+ * to create wrapper functions whose prototype is a NeonGenTwoOpFn()
+ * and which call a NeonGenTwoOpEnvFn().
+ */
+#define WRAP_ENV_FN(WRAPNAME, FUNC)                                     \
+    static void WRAPNAME(TCGv_i32 d, TCGv_i32 n, TCGv_i32 m)            \
+    {                                                                   \
+        FUNC(d, cpu_env, n, m);                                         \
+    }
+
+#define DO_3SAME_32_ENV(INSN, FUNC)                                     \
+    WRAP_ENV_FN(gen_##INSN##_tramp8, gen_helper_neon_##FUNC##8);        \
+    WRAP_ENV_FN(gen_##INSN##_tramp16, gen_helper_neon_##FUNC##16);      \
+    WRAP_ENV_FN(gen_##INSN##_tramp32, gen_helper_neon_##FUNC##32);      \
+    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
+                                uint32_t rn_ofs, uint32_t rm_ofs,       \
+                                uint32_t oprsz, uint32_t maxsz)         \
+    {                                                                   \
+        static const GVecGen3 ops[4] = {                                \
+            { .fni4 = gen_##INSN##_tramp8 },                            \
+            { .fni4 = gen_##INSN##_tramp16 },                           \
+            { .fni4 = gen_##INSN##_tramp32 },                           \
+            { 0 },                                                      \
+        };                                                              \
+        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &ops[vece]); \
+    }                                                                   \
+    static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a)        \
+    {                                                                   \
+        if (a->size > 2) {                                              \
+            return false;                                               \
+        }                                                               \
+        return do_3same(s, a, gen_##INSN##_3s);                         \
+    }
+
 DO_3SAME_32(VHADD_S, hadd_s)
 DO_3SAME_32(VHADD_U, hadd_u)
 DO_3SAME_32(VHSUB_S, hsub_s)
 DO_3SAME_32(VHSUB_U, hsub_u)
 DO_3SAME_32(VRHADD_S, rhadd_s)
 DO_3SAME_32(VRHADD_U, rhadd_u)
+DO_3SAME_32(VRSHL_S, rshl_s)
+DO_3SAME_32(VRSHL_U, rshl_u)
+
+DO_3SAME_32_ENV(VQSHL_S, qshl_s)
+DO_3SAME_32_ENV(VQSHL_U, qshl_u)
+DO_3SAME_32_ENV(VQRSHL_S, qrshl_s)
+DO_3SAME_32_ENV(VQRSHL_U, qrshl_u)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
         case NEON_3R_VHSUB:
         case NEON_3R_VABD:
         case NEON_3R_VABA:
+        case NEON_3R_VQSHL:
+        case NEON_3R_VRSHL:
+        case NEON_3R_VQRSHL:
             /* Already handled by decodetree */
             return 1;
         }
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
         }
         pairwise = 0;
         switch (op) {
-        case NEON_3R_VQSHL:
-        case NEON_3R_VRSHL:
-        case NEON_3R_VQRSHL:
-            {
-                int rtmp;
-                /* Shift instruction operands are reversed.  */
-                rtmp = rn;
-                rn = rm;
-                rm = rtmp;
-            }
-            break;
         case NEON_3R_VPADD_VQRDMLAH:
         case NEON_3R_VPMAX:
         case NEON_3R_VPMIN:
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
             tmp2 = neon_load_reg(rm, pass);
         }
         switch (op) {
-        case NEON_3R_VQSHL:
-            GEN_NEON_INTEGER_OP_ENV(qshl);
-            break;
-        case NEON_3R_VRSHL:
-            GEN_NEON_INTEGER_OP(rshl);
-            break;
-        case NEON_3R_VQRSHL:
-            GEN_NEON_INTEGER_OP_ENV(qrshl);
             break;
         case NEON_3R_VPMAX:
             GEN_NEON_INTEGER_OP(pmax);
-- 
2.20.1

Convert the Neon integer VPMAX and VPMIN 3-reg-same insns to
decodetree. These are 'pairwise' operations.
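
As an illustration of what 'pairwise' means here and why the destination must not be written too early, the following is a rough standalone model for 32-bit elements with Q == 0. DReg, smax32 and vpmax_s32 are hypothetical names for this sketch only; it is not the TCG code in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* A 64-bit D register viewed as two 32-bit lanes. */
typedef struct { int32_t lane[2]; } DReg;

static int32_t smax32(int32_t a, int32_t b)
{
    return a > b ? a : b;
}

/*
 * Pairwise signed max on 32-bit elements: lane 0 of the result combines
 * the two lanes of vn, lane 1 combines the two lanes of vm. Both results
 * are computed before either is stored, so vd may alias vn or vm.
 */
static void vpmax_s32(DReg *vd, const DReg *vn, const DReg *vm)
{
    assert(vd && vn && vm);
    int32_t lo = smax32(vn->lane[0], vn->lane[1]);
    int32_t hi = smax32(vm->lane[0], vm->lane[1]);
    vd->lane[0] = lo;
    vd->lane[1] = hi;
}
```

If lane 0 were stored immediately after the first max, a call with vd == vm would read a clobbered lane when computing the second result; computing both first avoids that.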

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200512163904.10918-9-peter.maydell@linaro.org
---
 target/arm/neon-dp.decode       |  9 +++++
 target/arm/translate-neon.inc.c | 71 +++++++++++++++++++++++++++++++++
 target/arm/translate.c          | 17 +-------
 3 files changed, 82 insertions(+), 15 deletions(-)

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -XXX,XX +XXX,XX @@
 @3same           .... ... . . . size:2 .... .... .... . q:1 . . .... \
                  &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp
 
+@3same_q0        .... ... . . . size:2 .... .... .... . 0 . . .... \
+                 &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp q=0
+
 VHADD_S_3s       1111 001 0 0 . .. .... .... 0000 . . . 0 .... @3same
 VHADD_U_3s       1111 001 1 0 . .. .... .... 0000 . . . 0 .... @3same
 VQADD_S_3s       1111 001 0 0 . .. .... .... 0000 . . . 1 .... @3same
@@ -XXX,XX +XXX,XX @@ VMLS_3s          1111 001 1 0 . .. .... .... 1001 . . . 0 .... @3same
 VMUL_3s          1111 001 0 0 . .. .... .... 1001 . . . 1 .... @3same
 VMUL_p_3s        1111 001 1 0 . .. .... .... 1001 . . . 1 .... @3same
 
+VPMAX_S_3s       1111 001 0 0 . .. .... .... 1010 . . . 0 .... @3same_q0
+VPMAX_U_3s       1111 001 1 0 . .. .... .... 1010 . . . 0 .... @3same_q0
+
+VPMIN_S_3s       1111 001 0 0 . .. .... .... 1010 . . . 1 .... @3same_q0
+VPMIN_U_3s       1111 001 1 0 . .. .... .... 1010 . . . 1 .... @3same_q0
+
 VQRDMLAH_3s      1111 001 1 0 . .. .... .... 1011 ... 1 .... @3same
 
 SHA1_3s          1111 001 0 0 . optype:2 .... .... 1100 . 1 . 0 .... \
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_3SAME_32_ENV(VQSHL_S, qshl_s)
 DO_3SAME_32_ENV(VQSHL_U, qshl_u)
 DO_3SAME_32_ENV(VQRSHL_S, qrshl_s)
 DO_3SAME_32_ENV(VQRSHL_U, qrshl_u)
+
+static bool do_3same_pair(DisasContext *s, arg_3same *a, NeonGenTwoOpFn *fn)
+{
+    /* Operations handled pairwise 32 bits at a time */
+    TCGv_i32 tmp, tmp2, tmp3;
+
+    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vd | a->vn | a->vm) & 0x10)) {
+        return false;
+    }
+
+    if (a->size == 3) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    assert(a->q == 0); /* enforced by decode patterns */
+
+    /*
+     * Note that we have to be careful not to clobber the source operands
+     * in the "vm == vd" case by storing the result of the first pass too
+     * early. Since Q is 0 there are always just two passes, so instead
+     * of a complicated loop over each pass we just unroll.
+     */
+    tmp = neon_load_reg(a->vn, 0);
+    tmp2 = neon_load_reg(a->vn, 1);
+    fn(tmp, tmp, tmp2);
+    tcg_temp_free_i32(tmp2);
+
+    tmp3 = neon_load_reg(a->vm, 0);
+    tmp2 = neon_load_reg(a->vm, 1);
+    fn(tmp3, tmp3, tmp2);
+    tcg_temp_free_i32(tmp2);
+
+    neon_store_reg(a->vd, 0, tmp);
+    neon_store_reg(a->vd, 1, tmp3);
+    return true;
+}
+
+#define DO_3SAME_PAIR(INSN, func)                                       \
+    static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a)        \
+    {                                                                   \
+        static NeonGenTwoOpFn * const fns[] = {                         \
+            gen_helper_neon_##func##8,                                  \
+            gen_helper_neon_##func##16,                                 \
+            gen_helper_neon_##func##32,                                 \
+        };                                                              \
+        if (a->size > 2) {                                              \
+            return false;                                               \
+        }                                                               \
+        return do_3same_pair(s, a, fns[a->size]);                       \
+    }
+
+/* 32-bit pairwise ops end up the same as the elementwise versions.  */
+#define gen_helper_neon_pmax_s32  tcg_gen_smax_i32
+#define gen_helper_neon_pmax_u32  tcg_gen_umax_i32
+#define gen_helper_neon_pmin_s32  tcg_gen_smin_i32
+#define gen_helper_neon_pmin_u32  tcg_gen_umin_i32
+
+DO_3SAME_PAIR(VPMAX_S, pmax_s)
+DO_3SAME_PAIR(VPMIN_S, pmin_s)
+DO_3SAME_PAIR(VPMAX_U, pmax_u)
+DO_3SAME_PAIR(VPMIN_U, pmin_u)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static inline void gen_neon_rsb(int size, TCGv_i32 t0, TCGv_i32 t1)
     }
 }
 
-/* 32-bit pairwise ops end up the same as the elementwise versions.  */
-#define gen_helper_neon_pmax_s32  tcg_gen_smax_i32
-#define gen_helper_neon_pmax_u32  tcg_gen_umax_i32
-#define gen_helper_neon_pmin_s32  tcg_gen_smin_i32
-#define gen_helper_neon_pmin_u32  tcg_gen_umin_i32
-
 #define GEN_NEON_INTEGER_OP_ENV(name) do { \
     switch ((size << 1) | u) { \
     case 0: \
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
         case NEON_3R_VQSHL:
         case NEON_3R_VRSHL:
         case NEON_3R_VQRSHL:
+        case NEON_3R_VPMAX:
+        case NEON_3R_VPMIN:
             /* Already handled by decodetree */
             return 1;
         }
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
         pairwise = 0;
         switch (op) {
         case NEON_3R_VPADD_VQRDMLAH:
-        case NEON_3R_VPMAX:
-        case NEON_3R_VPMIN:
             pairwise = 1;
             break;
         case NEON_3R_FLOAT_ARITH:
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
             tmp2 = neon_load_reg(rm, pass);
         }
         switch (op) {
-            break;
-        case NEON_3R_VPMAX:
-            GEN_NEON_INTEGER_OP(pmax);
-            break;
-        case NEON_3R_VPMIN:
-            GEN_NEON_INTEGER_OP(pmin);
-            break;
         case NEON_3R_VQDMULH_VQRDMULH: /* Multiply high.  */
             if (!u) { /* VQDMULH */
                 switch (size) {
-- 
2.20.1

Convert the Neon integer VPADD 3-reg-same insns to decodetree.  These
are 'pairwise' operations.  (Note that VQRDMLAH, which shares the
same primary opcode but has U=1, has already been converted.)
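
The patch maps the 32-bit case onto a plain add, on the grounds that a 32-bit pairwise op is the same as the elementwise one. A small sketch of why, assuming the 8-bit helper treats each 32-bit argument as four packed lanes and sums adjacent lanes (padd_u8, padd_u32 and lane8 below are illustrative functions, not the QEMU helpers):

```c
#include <assert.h>
#include <stdint.h>

/* Extract 8-bit lane i (0..3) from a 32-bit word. */
static uint32_t lane8(uint32_t w, unsigned i)
{
    assert(i < 4);
    return (w >> (8 * i)) & 0xff;
}

/*
 * Pairwise add on packed 8-bit lanes: the low half of the result holds
 * the adjacent-lane sums from a, the high half those from b. Each sum
 * wraps modulo 256.
 */
static uint32_t padd_u8(uint32_t a, uint32_t b)
{
    uint32_t r0 = (lane8(a, 0) + lane8(a, 1)) & 0xff;
    uint32_t r1 = (lane8(a, 2) + lane8(a, 3)) & 0xff;
    uint32_t r2 = (lane8(b, 0) + lane8(b, 1)) & 0xff;
    uint32_t r3 = (lane8(b, 2) + lane8(b, 3)) & 0xff;
    return r0 | (r1 << 8) | (r2 << 16) | (r3 << 24);
}

/*
 * With 32-bit elements each word is a single lane, so "add the pair"
 * degenerates to an ordinary 32-bit add of the two inputs.
 */
static uint32_t padd_u32(uint32_t a, uint32_t b)
{
    return a + b;
}
```

The caller supplies the pair (lane 0, lane 1) of one source register as (a, b), which is why no dedicated 32-bit helper is needed.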

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200512163904.10918-10-peter.maydell@linaro.org
---
 target/arm/neon-dp.decode       |  2 ++
 target/arm/translate-neon.inc.c |  2 ++
 target/arm/translate.c          | 19 +------------------
 3 files changed, 5 insertions(+), 18 deletions(-)

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -XXX,XX +XXX,XX @@ VPMAX_U_3s       1111 001 1 0 . .. .... .... 1010 . . . 0 .... @3same_q0
 VPMIN_S_3s       1111 001 0 0 . .. .... .... 1010 . . . 1 .... @3same_q0
 VPMIN_U_3s       1111 001 1 0 . .. .... .... 1010 . . . 1 .... @3same_q0
 
+VPADD_3s         1111 001 0 0 . .. .... .... 1011 . . . 1 .... @3same_q0
+
 VQRDMLAH_3s      1111 001 1 0 . .. .... .... 1011 ... 1 .... @3same
 
 SHA1_3s          1111 001 0 0 . optype:2 .... .... 1100 . 1 . 0 .... \
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool do_3same_pair(DisasContext *s, arg_3same *a, NeonGenTwoOpFn *fn)
 #define gen_helper_neon_pmax_u32  tcg_gen_umax_i32
 #define gen_helper_neon_pmin_s32  tcg_gen_smin_i32
 #define gen_helper_neon_pmin_u32  tcg_gen_umin_i32
+#define gen_helper_neon_padd_u32  tcg_gen_add_i32
 
 DO_3SAME_PAIR(VPMAX_S, pmax_s)
 DO_3SAME_PAIR(VPMIN_S, pmin_s)
 DO_3SAME_PAIR(VPMAX_U, pmax_u)
 DO_3SAME_PAIR(VPMIN_U, pmin_u)
+DO_3SAME_PAIR(VPADD, padd_u)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
             return 1;
         }
         switch (op) {
-        case NEON_3R_VPADD_VQRDMLAH:
-            if (!u) {
-                break;  /* VPADD */
-            }
-            /* VQRDMLAH : handled by decodetree */
-            return 1;
-
         case NEON_3R_VFM_VQRDMLSH:
             if (!u) {
                 /* VFM, VFMS */
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
         case NEON_3R_VQRSHL:
         case NEON_3R_VPMAX:
         case NEON_3R_VPMIN:
+        case NEON_3R_VPADD_VQRDMLAH:
             /* Already handled by decodetree */
             return 1;
         }
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
         }
         pairwise = 0;
         switch (op) {
-        case NEON_3R_VPADD_VQRDMLAH:
-            pairwise = 1;
-            break;
         case NEON_3R_FLOAT_ARITH:
             pairwise = (u && size < 2); /* if VPADD (float) */
             break;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                 }
             }
             break;
-        case NEON_3R_VPADD_VQRDMLAH:
-            switch (size) {
-            case 0: gen_helper_neon_padd_u8(tmp, tmp, tmp2); break;
-            case 1: gen_helper_neon_padd_u16(tmp, tmp, tmp2); break;
-            case 2: tcg_gen_add_i32(tmp, tmp, tmp2); break;
-            default: abort();
-            }
-            break;
         case NEON_3R_FLOAT_ARITH: /* Floating point arithmetic. */
         {
             TCGv_ptr fpstatus = get_fpstatus_ptr(1);
-- 
2.20.1

Convert the Neon VQDMULH and VQRDMULH 3-reg-same insns to
decodetree. These are the last integer operations in the
3-reg-same group.
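
For reference, a sketch of the per-element arithmetic for the 16-bit case, assuming the usual architectural definition (double the product, optionally add a rounding constant, keep the high half, saturate). The function qdmulh_s16 and its qc flag are hypothetical; the real helpers take cpu_env so that they can record saturation, which is why the patch uses WRAP_ENV_FN for them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Saturating doubling multiply returning the high half, for one
 * 16-bit element. round selects the rounding (VQRDMULH-like) variant.
 * *qc is set, never cleared, when the result saturates.
 * Assumes >> on a negative value is an arithmetic shift, as on all
 * mainstream compilers.
 */
static int16_t qdmulh_s16(int16_t a, int16_t b, bool round, bool *qc)
{
    assert(qc);
    int64_t p = 2 * (int64_t)a * (int64_t)b;
    if (round) {
        p += 1 << 15;
    }
    p >>= 16;
    if (p > INT16_MAX) {
        /* Only reachable when both inputs are INT16_MIN. */
        p = INT16_MAX;
        *qc = true;
    }
    return (int16_t)p;
}
```

There is no 8-bit variant in this sketch, matching the size check in the patch, which accepts only size 1 and size 2.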

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200512163904.10918-11-peter.maydell@linaro.org
---
 target/arm/neon-dp.decode       |  3 +++
 target/arm/translate-neon.inc.c | 24 ++++++++++++++++++++++++
 target/arm/translate.c          | 24 +-----------------------
 3 files changed, 28 insertions(+), 23 deletions(-)

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -XXX,XX +XXX,XX @@ VPMAX_U_3s       1111 001 1 0 . .. .... .... 1010 . . . 0 .... @3same_q0
 VPMIN_S_3s       1111 001 0 0 . .. .... .... 1010 . . . 1 .... @3same_q0
 VPMIN_U_3s       1111 001 1 0 . .. .... .... 1010 . . . 1 .... @3same_q0
 
+VQDMULH_3s       1111 001 0 0 . .. .... .... 1011 . . . 0 .... @3same
+VQRDMULH_3s      1111 001 1 0 . .. .... .... 1011 . . . 0 .... @3same
+
 VPADD_3s         1111 001 0 0 . .. .... .... 1011 . . . 1 .... @3same_q0
 
 VQRDMLAH_3s      1111 001 1 0 . .. .... .... 1011 ... 1 .... @3same
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_3SAME_PAIR(VPMIN_S, pmin_s)
 DO_3SAME_PAIR(VPMAX_U, pmax_u)
 DO_3SAME_PAIR(VPMIN_U, pmin_u)
 DO_3SAME_PAIR(VPADD, padd_u)
+
+#define DO_3SAME_VQDMULH(INSN, FUNC)                                    \
+    WRAP_ENV_FN(gen_##INSN##_tramp16, gen_helper_neon_##FUNC##_s16);    \
+    WRAP_ENV_FN(gen_##INSN##_tramp32, gen_helper_neon_##FUNC##_s32);    \
+    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
+                                uint32_t rn_ofs, uint32_t rm_ofs,       \
+                                uint32_t oprsz, uint32_t maxsz)         \
+    {                                                                   \
+        static const GVecGen3 ops[2] = {                                \
+            { .fni4 = gen_##INSN##_tramp16 },                           \
+            { .fni4 = gen_##INSN##_tramp32 },                           \
+        };                                                              \
+        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &ops[vece - 1]); \
+    }                                                                   \
+    static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a)        \
+    {                                                                   \
+        if (a->size != 1 && a->size != 2) {                             \
+            return false;                                               \
+        }                                                               \
+        return do_3same(s, a, gen_##INSN##_3s);                         \
+    }
+
+DO_3SAME_VQDMULH(VQDMULH, qdmulh)
+DO_3SAME_VQDMULH(VQRDMULH, qrdmulh)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
         case NEON_3R_VPMAX:
         case NEON_3R_VPMIN:
         case NEON_3R_VPADD_VQRDMLAH:
+        case NEON_3R_VQDMULH_VQRDMULH:
             /* Already handled by decodetree */
             return 1;
         }
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
             tmp2 = neon_load_reg(rm, pass);
         }
         switch (op) {
-        case NEON_3R_VQDMULH_VQRDMULH: /* Multiply high.  */
-            if (!u) { /* VQDMULH */
-                switch (size) {
-                case 1:
-                    gen_helper_neon_qdmulh_s16(tmp, cpu_env, tmp, tmp2);
-                    break;
-                case 2:
-                    gen_helper_neon_qdmulh_s32(tmp, cpu_env, tmp, tmp2);
-                    break;
-                default: abort();
-                }
-            } else { /* VQRDMULH */
-                switch (size) {
-                case 1:
-                    gen_helper_neon_qrdmulh_s16(tmp, cpu_env, tmp, tmp2);
-                    break;
-                case 2:
-                    gen_helper_neon_qrdmulh_s32(tmp, cpu_env, tmp, tmp2);
-                    break;
-                default: abort();
-                }
-            }
-            break;
         case NEON_3R_FLOAT_ARITH: /* Floating point arithmetic. */
         {
             TCGv_ptr fpstatus = get_fpstatus_ptr(1);
-- 
2.20.1

Convert the Neon VADD, VSUB, VABD 3-reg-same insns to decodetree.
We already have gvec helpers for addition and subtraction, but must
add one for fabd.
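
A rough model of what a vector 'fabd' helper computes, using host floats instead of softfloat and hypothetical names (abs_f32, fabd_vec); the real helper also takes a float_status pointer, which this sketch omits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Absolute value by clearing the sign bit of the IEEE encoding. */
static float abs_f32(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof(bits));
    bits &= 0x7fffffffu;
    memcpy(&x, &bits, sizeof(x));
    return x;
}

/* Elementwise absolute difference over count single-precision lanes. */
static void fabd_vec(float *d, const float *n, const float *m, size_t count)
{
    assert(d && n && m);
    for (size_t i = 0; i < count; i++) {
        d[i] = abs_f32(n[i] - m[i]);
    }
}
```

Because each lane is read before it is written, d may alias n or m in this model.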

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200512163904.10918-12-peter.maydell@linaro.org
---
 target/arm/helper.h             |  3 ++-
 target/arm/neon-dp.decode       |  8 ++++++++
 target/arm/neon_helper.c        |  7 -------
 target/arm/translate-neon.inc.c | 28 ++++++++++++++++++++++++++++
 target/arm/translate.c          | 10 +++-------
 target/arm/vec_helper.c         |  7 +++++++
 6 files changed, 48 insertions(+), 15 deletions(-)

Convert the Neon float VPMIN, VPMAX and VPADD 3-reg-same insns to
decodetree. These are the only remaining 'pairwise' operations,
so we can delete the pairwise-specific bits of the old decoder's
for-each-element loop now.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200512163904.10918-13-peter.maydell@linaro.org
---
 target/arm/neon-dp.decode       |  5 +++
 target/arm/translate-neon.inc.c | 63 +++++++++++++++++++++++++++++++++
 target/arm/translate.c          | 63 +++++----------------------------
 3 files changed, 76 insertions(+), 55 deletions(-)

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -XXX,XX +XXX,XX @@
 # For FP insns the high bit of 'size' is used as part of opcode decode
 @3same_fp        .... ... . . . . size:1 .... .... .... . q:1 . . .... \
                  &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp
+@3same_fp_q0     .... ... . . . . size:1 .... .... .... . 0 . . .... \
+                 &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp q=0
 
 VHADD_S_3s       1111 001 0 0 . .. .... .... 0000 . . . 0 .... @3same
 VHADD_U_3s       1111 001 1 0 . .. .... .... 0000 . . . 0 .... @3same
@@ -XXX,XX +XXX,XX @@ VQRDMLSH_3s      1111 001 1 0 . .. .... .... 1100 ... 1 .... @3same
 
 VADD_fp_3s       1111 001 0 0 . 0 . .... .... 1101 ... 0 .... @3same_fp
 VSUB_fp_3s       1111 001 0 0 . 1 . .... .... 1101 ... 0 .... @3same_fp
+VPADD_fp_3s      1111 001 1 0 . 0 . .... .... 1101 ... 0 .... @3same_fp_q0
 VABD_fp_3s       1111 001 1 0 . 1 . .... .... 1101 ... 0 .... @3same_fp
+VPMAX_fp_3s      1111 001 1 0 . 0 . .... .... 1111 ... 0 .... @3same_fp_q0
+VPMIN_fp_3s      1111 001 1 0 . 1 . .... .... 1111 ... 0 .... @3same_fp_q0
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_3SAME_VQDMULH(VQRDMULH, qrdmulh)
 DO_3S_FP_GVEC(VADD, gen_helper_gvec_fadd_s)
 DO_3S_FP_GVEC(VSUB, gen_helper_gvec_fsub_s)
 DO_3S_FP_GVEC(VABD, gen_helper_gvec_fabd_s)
+
+static bool do_3same_fp_pair(DisasContext *s, arg_3same *a, VFPGen3OpSPFn *fn)
+{
+    /* FP operations handled pairwise 32 bits at a time */
+    TCGv_i32 tmp, tmp2, tmp3;
+    TCGv_ptr fpstatus;
+
+    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vd | a->vn | a->vm) & 0x10)) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    assert(a->q == 0); /* enforced by decode patterns */
+
+    /*
+     * Note that we have to be careful not to clobber the source operands
+     * in the "vm == vd" case by storing the result of the first pass too
+     * early. Since Q is 0 there are always just two passes, so instead
+     * of a complicated loop over each pass we just unroll.
+     */
+    fpstatus = get_fpstatus_ptr(1);
+    tmp = neon_load_reg(a->vn, 0);
+    tmp2 = neon_load_reg(a->vn, 1);
+    fn(tmp, tmp, tmp2, fpstatus);
+    tcg_temp_free_i32(tmp2);
+
+    tmp3 = neon_load_reg(a->vm, 0);
+    tmp2 = neon_load_reg(a->vm, 1);
+    fn(tmp3, tmp3, tmp2, fpstatus);
+    tcg_temp_free_i32(tmp2);
+    tcg_temp_free_ptr(fpstatus);
+
+    neon_store_reg(a->vd, 0, tmp);
+    neon_store_reg(a->vd, 1, tmp3);
+    return true;
+}
+
+/*
+ * For all the functions using this macro, size == 1 means fp16,
+ * which is an architecture extension we don't implement yet.
+ */
+#define DO_3S_FP_PAIR(INSN,FUNC)                                    \
+    static bool trans_##INSN##_fp_3s(DisasContext *s, arg_3same *a) \
+    {                                                               \
+        if (a->size != 0) {                                         \
+            /* TODO fp16 support */                                 \
+            return false;                                           \
+        }                                                           \
+        return do_3same_fp_pair(s, a, FUNC);                        \
+    }
+
+DO_3S_FP_PAIR(VPADD, gen_helper_vfp_adds)
+DO_3S_FP_PAIR(VPMAX, gen_helper_vfp_maxs)
+DO_3S_FP_PAIR(VPMIN, gen_helper_vfp_mins)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
     int shift;
     int pass;
     int count;
-    int pairwise;
     int u;
     int vec_size;
     uint32_t imm;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
         case NEON_3R_VPMIN:
         case NEON_3R_VPADD_VQRDMLAH:
         case NEON_3R_VQDMULH_VQRDMULH:
+        case NEON_3R_FLOAT_ARITH:
             /* Already handled by decodetree */
             return 1;
         }
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
             /* 64-bit element instructions: handled by decodetree */
             return 1;
         }
-        pairwise = 0;
         switch (op) {
-        case NEON_3R_FLOAT_ARITH:
-            pairwise = (u && size < 2); /* if VPADD (float) */
-            if (!pairwise) {
-                return 1; /* handled by decodetree */
-            }
-            break;
         case NEON_3R_FLOAT_MINMAX:
-            pairwise = u; /* if VPMIN/VPMAX (float) */
+            if (u) {
+                return 1; /* VPMIN/VPMAX handled by decodetree */
+            }
             break;
         case NEON_3R_FLOAT_CMP:
             if (!u && size) {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
             break;
         }
 
-        if (pairwise && q) {
-            /* All the pairwise insns UNDEF if Q is set */
-            return 1;
-        }
-
         for (pass = 0; pass < (q ? 4 : 2); pass++) {
 
-        if (pairwise) {
-            /* Pairwise.  */
-            if (pass < 1) {
-                tmp = neon_load_reg(rn, 0);
-                tmp2 = neon_load_reg(rn, 1);
-            } else {
-                tmp = neon_load_reg(rm, 0);
-                tmp2 = neon_load_reg(rm, 1);
-            }
-        } else {
-            /* Elementwise.  */
-            tmp = neon_load_reg(rn, pass);
-            tmp2 = neon_load_reg(rm, pass);
-        }
+        /* Elementwise.  */
+        tmp = neon_load_reg(rn, pass);
+        tmp2 = neon_load_reg(rm, pass);
         switch (op) {
-        case NEON_3R_FLOAT_ARITH: /* Floating point arithmetic. */
-        {
-            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
-            switch ((u << 2) | size) {
-            case 4: /* VPADD */
-                gen_helper_vfp_adds(tmp, tmp, tmp2, fpstatus);
-                break;
-            default:
-                abort();
-            }
-            tcg_temp_free_ptr(fpstatus);
-            break;
-        }
         case NEON_3R_FLOAT_MULTIPLY:
         {
             TCGv_ptr fpstatus = get_fpstatus_ptr(1);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
         }
         tcg_temp_free_i32(tmp2);
 
-        /* Save the result.  For elementwise operations we can put it
-           straight into the destination register.  For pairwise operations
-           we have to be careful to avoid clobbering the source operands.  */
-        if (pairwise && rd == rm) {
-            neon_store_scratch(pass, tmp);
-        } else {
-            neon_store_reg(rd, pass, tmp);
-        }
+        neon_store_reg(rd, pass, tmp);
 
         } /* for pass */
-        if (pairwise && rd == rm) {
-            for (pass = 0; pass < (q ? 4 : 2); pass++) {
-                tmp = neon_load_scratch(pass);
-                neon_store_reg(rd, pass, tmp);
-            }
-        }
         /* End of 3 register same size operations.  */
     } else if (insn & (1 << 4)) {
         if ((insn & 0x00380080) != 0) {
-- 
2.20.1

Convert the Neon fp VMUL, VMLA, and VMLS 3-reg-same insns to
decodetree.

We don't have a gvec helper for multiply-accumulate, so VMLA and VMLS
need a loop function do_3same_fp().  This takes a reads_vd parameter
which tells it to load the old value of vd before calling the
callback function, in the same way that the do_vfp_3op_sp() and
do_vfp_3op_dp() functions in translate-vfp.inc.c work. (The only
uses in this patch pass reads_vd == true, but later commits will
use reads_vd == false.)
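
The role of reads_vd can be sketched outside TCG as follows. This is only a model with host floats: do_3same_fp_model, FpOpFn and the op_* callbacks are hypothetical names, and the real function works on TCG temporaries and an fpstatus pointer.

```c
#include <assert.h>
#include <stdbool.h>

/* Callback shape: *vd is both the accumulator input and the result. */
typedef void FpOpFn(float *vd, float vn, float vm);

/*
 * Per-pass loop: when reads_vd is set, the old destination value is
 * loaded and handed to the callback; otherwise the callback starts
 * from a fresh temporary and is expected to overwrite it.
 */
static void do_3same_fp_model(float *vd, const float *vn, const float *vm,
                              int passes, FpOpFn *fn, bool reads_vd)
{
    assert(fn);
    for (int pass = 0; pass < passes; pass++) {
        float acc = reads_vd ? vd[pass] : 0.0f;
        fn(&acc, vn[pass], vm[pass]);
        vd[pass] = acc;
    }
}

/* Overwrites vd: does not need the old value. */
static void op_mul(float *vd, float vn, float vm) { *vd = vn * vm; }
/* Accumulating forms: need the old value of vd. */
static void op_mla(float *vd, float vn, float vm) { *vd = *vd + vn * vm; }
static void op_mls(float *vd, float vn, float vm) { *vd = *vd - vn * vm; }
```

In this model op_mla and op_mls would be passed with reads_vd == true, while an overwriting operation such as op_mul can skip the load.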

This conversion fixes in passing an underdecoding for VMUL
(originally reported by Fredrik Strupe <fredrik@strupe.net>): bit 1
of the 'size' field must be 0.  The old decoder didn't enforce this,
but the decodetree pattern does.

The gen_VMLA_fp_reg() function performs the addition operation
with the operands in the opposite order to the old decoder.
Since Neon sets 'default NaN mode', float32_add operations are
commutative, so there is no behaviour difference; but putting
them this way around matches the Arm ARM pseudocode and the
required operation order for the subtraction in gen_VMLS_fp_reg().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200512163904.10918-14-peter.maydell@linaro.org
---
 target/arm/neon-dp.decode       |  3 ++
 target/arm/translate-neon.inc.c | 81 +++++++++++++++++++++++++++++++++
 target/arm/translate.c          | 17 +------
 3 files changed, 85 insertions(+), 16 deletions(-)

Convert the Neon floating-point 3-reg-same compare insns VCGE, VCGT,
VCEQ, VACGE and VACGT to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200512163904.10918-15-peter.maydell@linaro.org
---
 target/arm/neon-dp.decode       |  5 +++++
 target/arm/translate-neon.inc.c |  6 +++++
 target/arm/translate.c          | 39 ++-------------------------------
 3 files changed, 13 insertions(+), 37 deletions(-)

The usual location for the env argument in the argument list of a TCG helper
is immediately after the return-value argument. recps_f32 and rsqrts_f32
differ in that they put it at the end.

Move the env argument to its usual place; this will allow us to
more easily use these helper functions with the gvec APIs.
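
To show why a uniform argument order helps, here is a standalone sketch in which both step functions share one env-first prototype and can therefore sit behind a single function-pointer type. Env, EnvBinOp and apply_step are hypothetical; the arithmetic (2 - a*b and (3 - a*b) / 2, with the infinity-times-zero special case) is an assumption based on the architectural description and uses host floats, not softfloat.

```c
#include <assert.h>
#include <float.h>

/* Hypothetical stand-in for CPUARMState. */
typedef struct { int unused; } Env;

/* One prototype for every env-taking binary op: env comes first. */
typedef float EnvBinOp(Env *env, float a, float b);

static int is_inf(float x)
{
    return x > FLT_MAX || x < -FLT_MAX;
}

static int is_zero_or_denormal(float x)
{
    return x > -FLT_MIN && x < FLT_MIN;
}

static int inf_times_zero(float a, float b)
{
    return (is_inf(a) && is_zero_or_denormal(b)) ||
           (is_inf(b) && is_zero_or_denormal(a));
}

/* Reciprocal step: 2 - a*b, with infinity * zero defined as 2. */
static float recps_f32(Env *env, float a, float b)
{
    (void)env;
    return inf_times_zero(a, b) ? 2.0f : 2.0f - a * b;
}

/* Reciprocal square root step: (3 - a*b) / 2, special case gives 1.5. */
static float rsqrts_f32(Env *env, float a, float b)
{
    (void)env;
    return inf_times_zero(a, b) ? 1.5f : (3.0f - a * b) / 2.0f;
}

/* Both helpers fit the same table because their prototypes now agree. */
static EnvBinOp * const step_fns[2] = { recps_f32, rsqrts_f32 };

static float apply_step(Env *env, int size, float a, float b)
{
    assert(size == 0 || size == 1);
    return step_fns[size](env, a, b);
}
```

With the env argument last, as before this patch, these two functions would need their own callback type or a dedicated wrapper.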

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200512163904.10918-16-peter.maydell@linaro.org
---
 target/arm/helper.h     | 4 ++--
 target/arm/translate.c  | 4 ++--
 target/arm/vfp_helper.c | 4 ++--
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(vfp_fcvt_f64_to_f16, TCG_CALL_NO_RWG, f16, f64, ptr, i32)
 DEF_HELPER_4(vfp_muladdd, f64, f64, f64, f64, ptr)
 DEF_HELPER_4(vfp_muladds, f32, f32, f32, f32, ptr)
 
-DEF_HELPER_3(recps_f32, f32, f32, f32, env)
-DEF_HELPER_3(rsqrts_f32, f32, f32, f32, env)
+DEF_HELPER_3(recps_f32, f32, env, f32, f32)
+DEF_HELPER_3(rsqrts_f32, f32, env, f32, f32)
 DEF_HELPER_FLAGS_2(recpe_f16, TCG_CALL_NO_RWG, f16, f16, ptr)
 DEF_HELPER_FLAGS_2(recpe_f32, TCG_CALL_NO_RWG, f32, f32, ptr)
 DEF_HELPER_FLAGS_2(recpe_f64, TCG_CALL_NO_RWG, f64, f64, ptr)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                 tcg_temp_free_ptr(fpstatus);
             } else {
                 if (size == 0) {
-                    gen_helper_recps_f32(tmp, tmp, tmp2, cpu_env);
+                    gen_helper_recps_f32(tmp, cpu_env, tmp, tmp2);
                 } else {
-                    gen_helper_rsqrts_f32(tmp, tmp, tmp2, cpu_env);
+                    gen_helper_rsqrts_f32(tmp, cpu_env, tmp, tmp2);
               }
             }
             break;
diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vfp_helper.c
+++ b/target/arm/vfp_helper.c
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(vfp_fcvt_f64_to_f16)(float64 a, void *fpstp, uint32_t ahp_mode)
 #define float32_three make_float32(0x40400000)
 #define float32_one_point_five make_float32(0x3fc00000)
 
-float32 HELPER(recps_f32)(float32 a, float32 b, CPUARMState *env)
+float32 HELPER(recps_f32)(CPUARMState *env, float32 a, float32 b)
 {
     float_status *s = &env->vfp.standard_fp_status;
     if ((float32_is_infinity(a) && float32_is_zero_or_denormal(b)) ||
@@ -XXX,XX +XXX,XX @@ float32 HELPER(recps_f32)(float32 a, float32 b, CPUARMState *env)
     return float32_sub(float32_two, float32_mul(a, b, s), s);
 }
 
-float32 HELPER(rsqrts_f32)(float32 a, float32 b, CPUARMState *env)
+float32 HELPER(rsqrts_f32)(CPUARMState *env, float32 a, float32 b)
 {
     float_status *s = &env->vfp.standard_fp_status;
     float32 product;
-- 
2.20.1

Convert the Neon fp VMAX/VMIN/VMAXNM/VMINNM/VRECPS/VRSQRTS 3-reg-same
insns to decodetree. (These are all the remaining non-accumulation
instructions in this group.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200512163904.10918-17-peter.maydell@linaro.org
---
 target/arm/neon-dp.decode       |  6 +++
 target/arm/translate-neon.inc.c | 70 +++++++++++++++++++++++++++++++++
 target/arm/translate.c          | 42 +-------------------
 3 files changed, 78 insertions(+), 40 deletions(-)

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -XXX,XX +XXX,XX @@ VCGE_fp_3s       1111 001 1 0 . 0 . .... .... 1110 ... 0 .... @3same_fp
 VACGE_fp_3s      1111 001 1 0 . 0 . .... .... 1110 ... 1 .... @3same_fp
 VCGT_fp_3s       1111 001 1 0 . 1 . .... .... 1110 ... 0 .... @3same_fp
 VACGT_fp_3s      1111 001 1 0 . 1 . .... .... 1110 ... 1 .... @3same_fp
+VMAX_fp_3s       1111 001 0 0 . 0 . .... .... 1111 ... 0 .... @3same_fp
+VMIN_fp_3s       1111 001 0 0 . 1 . .... .... 1111 ... 0 .... @3same_fp
 VPMAX_fp_3s      1111 001 1 0 . 0 . .... .... 1111 ... 0 .... @3same_fp_q0
 VPMIN_fp_3s      1111 001 1 0 . 1 . .... .... 1111 ... 0 .... @3same_fp_q0
+VRECPS_fp_3s     1111 001 0 0 . 0 . .... .... 1111 ... 1 .... @3same_fp
+VRSQRTS_fp_3s    1111 001 0 0 . 1 . .... .... 1111 ... 1 .... @3same_fp
+VMAXNM_fp_3s     1111 001 1 0 . 0 . .... .... 1111 ... 1 .... @3same_fp
+VMINNM_fp_3s     1111 001 1 0 . 1 . .... .... 1111 ... 1 .... @3same_fp
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_3S_FP(VCGE, gen_helper_neon_cge_f32, false)
 DO_3S_FP(VCGT, gen_helper_neon_cgt_f32, false)
 DO_3S_FP(VACGE, gen_helper_neon_acge_f32, false)
 DO_3S_FP(VACGT, gen_helper_neon_acgt_f32, false)
+DO_3S_FP(VMAX, gen_helper_vfp_maxs, false)
+DO_3S_FP(VMIN, gen_helper_vfp_mins, false)
 
 static void gen_VMLA_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
                             TCGv_ptr fpstatus)
@@ -XXX,XX +XXX,XX @@ static void gen_VMLS_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
 DO_3S_FP(VMLA, gen_VMLA_fp_3s, true)
 DO_3S_FP(VMLS, gen_VMLS_fp_3s, true)
 
+static bool trans_VMAXNM_fp_3s(DisasContext *s, arg_3same *a)
+{
+    if (!arm_dc_feature(s, ARM_FEATURE_V8)) {
+        return false;
+    }
+
+    if (a->size != 0) {
+        /* TODO fp16 support */
+        return false;
+    }
+
+    return do_3same_fp(s, a, gen_helper_vfp_maxnums, false);
+}
+
+static bool trans_VMINNM_fp_3s(DisasContext *s, arg_3same *a)
+{
+    if (!arm_dc_feature(s, ARM_FEATURE_V8)) {
+        return false;
+    }
+
+    if (a->size != 0) {
+        /* TODO fp16 support */
+        return false;
+    }
+
+    return do_3same_fp(s, a, gen_helper_vfp_minnums, false);
+}
+
+WRAP_ENV_FN(gen_VRECPS_tramp, gen_helper_recps_f32)
+
+static void gen_VRECPS_fp_3s(unsigned vece, uint32_t rd_ofs,
+                             uint32_t rn_ofs, uint32_t rm_ofs,
+                             uint32_t oprsz, uint32_t maxsz)
+{
+    static const GVecGen3 ops = { .fni4 = gen_VRECPS_tramp };
+    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &ops);
+}
+
+static bool trans_VRECPS_fp_3s(DisasContext *s, arg_3same *a)
+{
+    if (a->size != 0) {
+        /* TODO fp16 support */
+        return false;
+    }
+
+    return do_3same(s, a, gen_VRECPS_fp_3s);
+}
+
+WRAP_ENV_FN(gen_VRSQRTS_tramp, gen_helper_rsqrts_f32)
+
+static void gen_VRSQRTS_fp_3s(unsigned vece, uint32_t rd_ofs,
+                              uint32_t rn_ofs, uint32_t rm_ofs,
+                              uint32_t oprsz, uint32_t maxsz)
+{
+    static const GVecGen3 ops = { .fni4 = gen_VRSQRTS_tramp };
+    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &ops);
+}
+
+static bool trans_VRSQRTS_fp_3s(DisasContext *s, arg_3same *a)
+{
+    if (a->size != 0) {
+        /* TODO fp16 support */
+        return false;
+    }
+
+    return do_3same(s, a, gen_VRSQRTS_fp_3s);
+}
+
 static bool do_3same_fp_pair(DisasContext *s, arg_3same *a, VFPGen3OpSPFn *fn)
 {
     /* FP operations handled pairwise 32 bits at a time */
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
         case NEON_3R_FLOAT_MULTIPLY:
         case NEON_3R_FLOAT_CMP:
         case NEON_3R_FLOAT_ACMP:
+        case NEON_3R_FLOAT_MINMAX:
+        case NEON_3R_FLOAT_MISC:
             /* Already handled by decodetree */
             return 1;
         }
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
             return 1;
         }
         switch (op) {
-        case NEON_3R_FLOAT_MINMAX:
-            if (u) {
-                return 1; /* VPMIN/VPMAX handled by decodetree */
-            }
-            break;
-        case NEON_3R_FLOAT_MISC:
-            /* VMAXNM/VMINNM in ARMv8 */
-            if (u && !arm_dc_feature(s, ARM_FEATURE_V8)) {
-                return 1;
-            }
-            break;
         case NEON_3R_VFM_VQRDMLSH:
             if (!dc_isar_feature(aa32_simdfmac, s)) {
                 return 1;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
         tmp = neon_load_reg(rn, pass);
         tmp2 = neon_load_reg(rm, pass);
         switch (op) {
-        case NEON_3R_FLOAT_MINMAX:
-        {
-            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
-            if (size == 0) {
-                gen_helper_vfp_maxs(tmp, tmp, tmp2, fpstatus);
-            } else {
-                gen_helper_vfp_mins(tmp, tmp, tmp2, fpstatus);
-            }
-            tcg_temp_free_ptr(fpstatus);
-            break;
-        }
-        case NEON_3R_FLOAT_MISC:
-            if (u) {
-                /* VMAXNM/VMINNM */
-                TCGv_ptr fpstatus = get_fpstatus_ptr(1);
-                if (size == 0) {
-                    gen_helper_vfp_maxnums(tmp, tmp, tmp2, fpstatus);
-                } else {
-                    gen_helper_vfp_minnums(tmp, tmp, tmp2, fpstatus);
-                }
-                tcg_temp_free_ptr(fpstatus);
-            } else {
-                if (size == 0) {
-                    gen_helper_recps_f32(tmp, cpu_env, tmp, tmp2);
-                } else {
-                    gen_helper_rsqrts_f32(tmp, cpu_env, tmp, tmp2);
-              }
-            }
-            break;
         case NEON_3R_VFM_VQRDMLSH:
         {
             /* VFMA, VFMS: fused multiply-add */
-- 
2.20.1

Convert the Neon floating point VFMA and VFMS insns to decodetree.
These are the last insns in the 3-reg-same group so we can
remove all the support/loop code from the old decoder.
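
For reference, the table-driven check being removed can be sketched in
standalone C as below. The op computation and the "one bit per permitted
size" mask test are taken from the removed code; the extraction of the
size field from insn bits [21:20] is an assumption (it is not part of
this hunk), and all sketch_* names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value taken from the removed NEON_3R_* list */
#define SKETCH_3R_VFM_VQRDMLSH 25

/* op is built from insn bits [11:8] and bit 4 */
static unsigned sketch_3r_op(uint32_t insn)
{
    return ((insn >> 7) & 0x1e) | ((insn >> 4) & 1);
}

/* Assumption: the size field lives in insn bits [21:20] */
static unsigned sketch_3r_size(uint32_t insn)
{
    return (insn >> 20) & 3;
}

/* Bit n of size_mask set means size value n is permitted */
static bool sketch_size_ok(uint8_t size_mask, unsigned size)
{
    return (size_mask & (1u << size)) != 0;
}
```

As an example, 0xF2010C12 is an encoding constructed to match the
VFMA_fp_3s pattern (all "don't care" register bits chosen arbitrarily):
sketch_3r_op() maps it to 25 and sketch_3r_size() to 0. A mask of 0x7
accepts sizes 0 to 2 and rejects 3, while 0x5 accepts only sizes 0 and 2.
With decodetree these constraints are expressed in the patterns and
trans_* functions instead.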

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200512163904.10918-18-peter.maydell@linaro.org
---
 target/arm/neon-dp.decode       |   3 +
 target/arm/translate-neon.inc.c |  41 ++++++++
 target/arm/translate.c          | 176 +-------------------------------
 3 files changed, 46 insertions(+), 174 deletions(-)

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -XXX,XX +XXX,XX @@ SHA256H2_3s      1111 001 1 0 . 01 .... .... 1100 . 1 . 0 .... \
 SHA256SU1_3s     1111 001 1 0 . 10 .... .... 1100 . 1 . 0 .... \
                  vm=%vm_dp vn=%vn_dp vd=%vd_dp
 
+VFMA_fp_3s       1111 001 0 0 . 0 . .... .... 1100 ... 1 .... @3same_fp
+VFMS_fp_3s       1111 001 0 0 . 1 . .... .... 1100 ... 1 .... @3same_fp
+
 VQRDMLSH_3s      1111 001 1 0 . .. .... .... 1100 ... 1 .... @3same
 
 VADD_fp_3s       1111 001 0 0 . 0 . .... .... 1101 ... 0 .... @3same_fp
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VRSQRTS_fp_3s(DisasContext *s, arg_3same *a)
     return do_3same(s, a, gen_VRSQRTS_fp_3s);
 }
 
+static void gen_VFMA_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
+                            TCGv_ptr fpstatus)
+{
+    gen_helper_vfp_muladds(vd, vn, vm, vd, fpstatus);
+}
+
+static bool trans_VFMA_fp_3s(DisasContext *s, arg_3same *a)
+{
+    if (!dc_isar_feature(aa32_simdfmac, s)) {
+        return false;
+    }
+
+    if (a->size != 0) {
+        /* TODO fp16 support */
+        return false;
+    }
+
+    return do_3same_fp(s, a, gen_VFMA_fp_3s, true);
+}
+
+static void gen_VFMS_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
+                            TCGv_ptr fpstatus)
+{
+    gen_helper_vfp_negs(vn, vn);
+    gen_helper_vfp_muladds(vd, vn, vm, vd, fpstatus);
+}
+
+static bool trans_VFMS_fp_3s(DisasContext *s, arg_3same *a)
+{
+    if (!dc_isar_feature(aa32_simdfmac, s)) {
+        return false;
+    }
+
+    if (a->size != 0) {
+        /* TODO fp16 support */
+        return false;
+    }
+
+    return do_3same_fp(s, a, gen_VFMS_fp_3s, true);
+}
+
 static bool do_3same_fp_pair(DisasContext *s, arg_3same *a, VFPGen3OpSPFn *fn)
 {
     /* FP operations handled pairwise 32 bits at a time */
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_neon_narrow_op(int op, int u, int size,
     }
 }
 
-/* Symbolic constants for op fields for Neon 3-register same-length.
- * The values correspond to bits [11:8,4]; see the ARM ARM DDI0406B
- * table A7-9.
- */
-#define NEON_3R_VHADD 0
-#define NEON_3R_VQADD 1
-#define NEON_3R_VRHADD 2
-#define NEON_3R_LOGIC 3 /* VAND,VBIC,VORR,VMOV,VORN,VEOR,VBIF,VBIT,VBSL */
-#define NEON_3R_VHSUB 4
-#define NEON_3R_VQSUB 5
-#define NEON_3R_VCGT 6
-#define NEON_3R_VCGE 7
-#define NEON_3R_VSHL 8
-#define NEON_3R_VQSHL 9
-#define NEON_3R_VRSHL 10
-#define NEON_3R_VQRSHL 11
-#define NEON_3R_VMAX 12
-#define NEON_3R_VMIN 13
-#define NEON_3R_VABD 14
-#define NEON_3R_VABA 15
-#define NEON_3R_VADD_VSUB 16
-#define NEON_3R_VTST_VCEQ 17
-#define NEON_3R_VML 18 /* VMLA, VMLS */
-#define NEON_3R_VMUL 19
-#define NEON_3R_VPMAX 20
-#define NEON_3R_VPMIN 21
-#define NEON_3R_VQDMULH_VQRDMULH 22
-#define NEON_3R_VPADD_VQRDMLAH 23
-#define NEON_3R_SHA 24 /* SHA1C,SHA1P,SHA1M,SHA1SU0,SHA256H{2},SHA256SU1 */
-#define NEON_3R_VFM_VQRDMLSH 25 /* VFMA, VFMS, VQRDMLSH */
-#define NEON_3R_FLOAT_ARITH 26 /* float VADD, VSUB, VPADD, VABD */
-#define NEON_3R_FLOAT_MULTIPLY 27 /* float VMLA, VMLS, VMUL */
-#define NEON_3R_FLOAT_CMP 28 /* float VCEQ, VCGE, VCGT */
-#define NEON_3R_FLOAT_ACMP 29 /* float VACGE, VACGT, VACLE, VACLT */
-#define NEON_3R_FLOAT_MINMAX 30 /* float VMIN, VMAX */
-#define NEON_3R_FLOAT_MISC 31 /* float VRECPS, VRSQRTS, VMAXNM/MINNM */
-
-static const uint8_t neon_3r_sizes[] = {
-    [NEON_3R_VHADD] = 0x7,
-    [NEON_3R_VQADD] = 0xf,
-    [NEON_3R_VRHADD] = 0x7,
-    [NEON_3R_LOGIC] = 0xf, /* size field encodes op type */
-    [NEON_3R_VHSUB] = 0x7,
-    [NEON_3R_VQSUB] = 0xf,
-    [NEON_3R_VCGT] = 0x7,
-    [NEON_3R_VCGE] = 0x7,
-    [NEON_3R_VSHL] = 0xf,
-    [NEON_3R_VQSHL] = 0xf,
-    [NEON_3R_VRSHL] = 0xf,
-    [NEON_3R_VQRSHL] = 0xf,
-    [NEON_3R_VMAX] = 0x7,
-    [NEON_3R_VMIN] = 0x7,
-    [NEON_3R_VABD] = 0x7,
-    [NEON_3R_VABA] = 0x7,
-    [NEON_3R_VADD_VSUB] = 0xf,
-    [NEON_3R_VTST_VCEQ] = 0x7,
-    [NEON_3R_VML] = 0x7,
-    [NEON_3R_VMUL] = 0x7,
-    [NEON_3R_VPMAX] = 0x7,
-    [NEON_3R_VPMIN] = 0x7,
-    [NEON_3R_VQDMULH_VQRDMULH] = 0x6,
-    [NEON_3R_VPADD_VQRDMLAH] = 0x7,
-    [NEON_3R_SHA] = 0xf, /* size field encodes op type */
-    [NEON_3R_VFM_VQRDMLSH] = 0x7, /* For VFM, size bit 1 encodes op */
-    [NEON_3R_FLOAT_ARITH] = 0x5, /* size bit 1 encodes op */
-    [NEON_3R_FLOAT_MULTIPLY] = 0x5, /* size bit 1 encodes op */
-    [NEON_3R_FLOAT_CMP] = 0x5, /* size bit 1 encodes op */
-    [NEON_3R_FLOAT_ACMP] = 0x5, /* size bit 1 encodes op */
-    [NEON_3R_FLOAT_MINMAX] = 0x5, /* size bit 1 encodes op */
-    [NEON_3R_FLOAT_MISC] = 0x5, /* size bit 1 encodes op */
-};
-
 /* Symbolic constants for op fields for Neon 2-register miscellaneous.
  * The values correspond to bits [17:16,10:7]; see the ARM ARM DDI0406B
  * table A7-13.
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
     rm_ofs = neon_reg_offset(rm, 0);
 
     if ((insn & (1 << 23)) == 0) {
-        /* Three register same length.  */
-        op = ((insn >> 7) & 0x1e) | ((insn >> 4) & 1);
-        /* Catch invalid op and bad size combinations: UNDEF */
-        if ((neon_3r_sizes[op] & (1 << size)) == 0) {
-            return 1;
-        }
-        /* All insns of this form UNDEF for either this condition or the
-         * superset of cases "Q==1"; we catch the latter later.
-         */
-        if (q && ((rd | rn | rm) & 1)) {
-            return 1;
-        }
-        switch (op) {
-        case NEON_3R_VFM_VQRDMLSH:
-            if (!u) {
-                /* VFM, VFMS */
-                if (size == 1) {
-                    return 1;
-                }
-                break;
-            }
-            /* VQRDMLSH : handled by decodetree */
-            return 1;
-
-        case NEON_3R_VADD_VSUB:
-        case NEON_3R_LOGIC:
-        case NEON_3R_VMAX:
-        case NEON_3R_VMIN:
-        case NEON_3R_VTST_VCEQ:
-        case NEON_3R_VCGT:
-        case NEON_3R_VCGE:
-        case NEON_3R_VQADD:
-        case NEON_3R_VQSUB:
-        case NEON_3R_VMUL:
-        case NEON_3R_VML:
-        case NEON_3R_VSHL:
-        case NEON_3R_SHA:
-        case NEON_3R_VHADD:
-        case NEON_3R_VRHADD:
-        case NEON_3R_VHSUB:
-        case NEON_3R_VABD:
-        case NEON_3R_VABA:
-        case NEON_3R_VQSHL:
-        case NEON_3R_VRSHL:
-        case NEON_3R_VQRSHL:
-        case NEON_3R_VPMAX:
-        case NEON_3R_VPMIN:
-        case NEON_3R_VPADD_VQRDMLAH:
-        case NEON_3R_VQDMULH_VQRDMULH:
-        case NEON_3R_FLOAT_ARITH:
-        case NEON_3R_FLOAT_MULTIPLY:
-        case NEON_3R_FLOAT_CMP:
-        case NEON_3R_FLOAT_ACMP:
-        case NEON_3R_FLOAT_MINMAX:
-        case NEON_3R_FLOAT_MISC:
-            /* Already handled by decodetree */
-            return 1;
-        }
-
-        if (size == 3) {
-            /* 64-bit element instructions: handled by decodetree */
-            return 1;
-        }
-        switch (op) {
-        case NEON_3R_VFM_VQRDMLSH:
-            if (!dc_isar_feature(aa32_simdfmac, s)) {
-                return 1;
-            }
-            break;
-        default:
-            break;
-        }
-
-        for (pass = 0; pass < (q ? 4 : 2); pass++) {
-
-        /* Elementwise.  */
-        tmp = neon_load_reg(rn, pass);
-        tmp2 = neon_load_reg(rm, pass);
-        switch (op) {
-        case NEON_3R_VFM_VQRDMLSH:
-        {
-            /* VFMA, VFMS: fused multiply-add */
-            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
-            TCGv_i32 tmp3 = neon_load_reg(rd, pass);
-            if (size) {
-                /* VFMS */
-                gen_helper_vfp_negs(tmp, tmp);
-            }
-            gen_helper_vfp_muladds(tmp, tmp, tmp2, tmp3, fpstatus);
-            tcg_temp_free_i32(tmp3);
-            tcg_temp_free_ptr(fpstatus);
-            break;
-        }
-        default:
-            abort();
-        }
-        tcg_temp_free_i32(tmp2);
-
-        neon_store_reg(rd, pass, tmp);
-
-        } /* for pass */
-        /* End of 3 register same size operations.  */
+        /* Three register same length: handled by decodetree */
+        return 1;
     } else if (insn & (1 << 4)) {
         if ((insn & 0x00380080) != 0) {
             /* Two registers and shift.  */
-- 
2.20.1