Series comparison

-[Qemu-devel] [PULL 00/42] target-arm queue
+[PULL 00/45] target-arm queue
-First pullreq for arm of the 4.1 series, since I'm back from
+Mostly this is patches from me and RTH cleaning up and doing
-holiday now. This is mostly my M-profile FPU series and Philippe's
+more decodetree conversion for AArch32 Neon. The major new feature
-devices.h cleanup. I have a pile of other patchsets to work through
+is Dongjiu Geng's patchset to report host memory errors to KVM guests;
-in my to-review folder, but 42 patches is definitely quite
+also a new aspeed board from Patrick Williams.
 big enough to send now...
 thanks
 -- PMM
-The following changes since commit 413a99a92c13ec408dcf2adaa87918dc81e890c8:
+The following changes since commit 035b448b84f3557206abc44d786c5d3db2638f7d:
-  Add Nios II semihosting support. (2019-04-29 16:09:51 +0100)
+  Merge remote-tracking branch 'remotes/gkurz/tags/9p-next-2020-05-14' into staging (2020-05-14 10:58:30 +0100)
 are available in the Git repository at:
-  https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20190429
+  https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20200514
-for you to fetch changes up to 437cc27ddfded3bbab6afd5ac1761e0e195edba7:
+for you to fetch changes up to e95485f85657be21135c17a9226e297c21e73360:
-  hw/devices: Move SMSC 91C111 declaration into a new header (2019-04-29 17:57:21 +0100)
+  target/arm: Convert NEON VFMA, VFMS 3-reg-same insns to decodetree (2020-05-14 15:03:09 +0100)
 ----------------------------------------------------------------
 target-arm queue:
- * remove "bag of random stuff" hw/devices.h header
+ * target/arm: Use correct GDB XML for M-profile cores
- * implement FPU for Cortex-M and enable it for Cortex-M4 and -M33
+ * target/arm: Code cleanup to use gvec APIs better
- * hw/dma: Compile the bcm2835_dma device as common object
+ * aspeed: Add support for the sonorapass-bmc board
- * configure: Remove --source-path option
+ * target/arm: Support reporting KVM host memory errors
- * hw/ssi/xilinx_spips: Avoid variable length array
+   to the guest via ACPI notifications
- * hw/arm/smmuv3: Remove SMMUNotifierNode
+ * target/arm: Finish conversion of Neon 3-reg-same insns to decodetree
 ----------------------------------------------------------------
-Eric Auger (1):
+Dongjiu Geng (10):
-      hw/arm/smmuv3: Remove SMMUNotifierNode
+      acpi: nvdimm: change NVDIMM_UUID_LE to a common macro
       hw/arm/virt: Introduce a RAS machine option
       docs: APEI GHES generation and CPER record description
       ACPI: Build related register address fields via hardware error fw_cfg blob
       ACPI: Build Hardware Error Source Table
       ACPI: Record the Generic Error Status Block address
       KVM: Move hwpoison page related functions into kvm-all.c
       ACPI: Record Generic Error Status Block(GESB) table
       target-arm: kvm64: handle SIGBUS signal from kernel or KVM
       MAINTAINERS: Add ACPI/HEST/GHES entries
-Peter Maydell (28):
+Patrick Williams (1):
-      hw/ssi/xilinx_spips: Avoid variable length array
+      aspeed: Add support for the sonorapass-bmc board
       configure: Remove --source-path option
       target/arm: Make sure M-profile FPSCR RES0 bits are not settable
       hw/intc/armv7m_nvic: Allow reading of M-profile MVFR* registers
       target/arm: Implement dummy versions of M-profile FP-related registers
       target/arm: Disable most VFP sysregs for M-profile
       target/arm: Honour M-profile FP enable bits
       target/arm: Decode FP instructions for M profile
       target/arm: Clear CONTROL_S.SFPA in SG insn if FPU present
       target/arm: Handle SFPA and FPCA bits in reads and writes of CONTROL
       target/arm/helper: don't return early for STKOF faults during stacking
       target/arm: Handle floating point registers in exception entry
       target/arm: Implement v7m_update_fpccr()
       target/arm: Clear CONTROL.SFPA in BXNS and BLXNS
       target/arm: Clean excReturn bits when tail chaining
       target/arm: Allow for floating point in callee stack integrity check
       target/arm: Handle floating point registers in exception return
       target/arm: Move NS TBFLAG from bit 19 to bit 6
       target/arm: Overlap VECSTRIDE and XSCALE_CPAR TB flags
       target/arm: Set FPCCR.S when executing M-profile floating point insns
       target/arm: Activate M-profile floating point context when FPCCR.ASPEN is set
       target/arm: New helper function arm_v7m_mmu_idx_all()
       target/arm: New function armv7m_nvic_set_pending_lazyfp()
       target/arm: Add lazy-FP-stacking support to v7m_stack_write()
       target/arm: Implement M-profile lazy FP state preservation
       target/arm: Implement VLSTM for v7M CPUs with an FPU
       target/arm: Implement VLLDM for v7M CPUs with an FPU
       target/arm: Enable FPU for Cortex-M4 and Cortex-M33
-Philippe Mathieu-Daudé (13):
+Peter Maydell (18):
-      hw/dma: Compile the bcm2835_dma device as common object
+      target/arm: Use correct GDB XML for M-profile cores
-      hw/arm/aspeed: Use TYPE_TMP105/TYPE_PCA9552 instead of hardcoded string
+      target/arm: Convert Neon 3-reg-same VQRDMLAH/VQRDMLSH to decodetree
-      hw/arm/nseries: Use TYPE_TMP105 instead of hardcoded string
+      target/arm: Convert Neon 3-reg-same SHA to decodetree
-      hw/display/tc6393xb: Remove unused functions
+      target/arm: Convert Neon 64-bit element 3-reg-same insns
-      hw/devices: Move TC6393XB declarations into a new header
+      target/arm: Convert Neon VHADD 3-reg-same insns
-      hw/devices: Move Blizzard declarations into a new header
+      target/arm: Convert Neon VABA/VABD 3-reg-same to decodetree
-      hw/devices: Move CBus declarations into a new header
+      target/arm: Convert Neon VRHADD, VHSUB 3-reg-same insns to decodetree
-      hw/devices: Move Gamepad declarations into a new header
+      target/arm: Convert Neon VQSHL, VRSHL, VQRSHL 3-reg-same insns to decodetree
-      hw/devices: Move TI touchscreen declarations into a new header
+      target/arm: Convert Neon VPMAX/VPMIN 3-reg-same insns to decodetree
-      hw/devices: Move LAN9118 declarations into a new header
+      target/arm: Convert Neon VPADD 3-reg-same insns to decodetree
-      hw/net/ne2000-isa: Add guards to the header
+      target/arm: Convert Neon VQDMULH/VQRDMULH 3-reg-same to decodetree
-      hw/net/lan9118: Export TYPE_LAN9118 and use it instead of hardcoded string
+      target/arm: Convert Neon VADD, VSUB, VABD 3-reg-same insns to decodetree
-      hw/devices: Move SMSC 91C111 declaration into a new header
+      target/arm: Convert Neon VPMIN/VPMAX/VPADD float 3-reg-same insns to decodetree
       target/arm: Convert Neon fp VMUL, VMLA, VMLS 3-reg-same insns to decodetree
       target/arm: Convert Neon 3-reg-same compare insns to decodetree
       target/arm: Move 'env' argument of recps_f32 and rsqrts_f32 helpers to usual place
       target/arm: Convert Neon fp VMAX/VMIN/VMAXNM/VMINNM/VRECPS/VRSQRTS to decodetree
       target/arm: Convert NEON VFMA, VFMS 3-reg-same insns to decodetree
- configure                     |  10 +-
+Richard Henderson (16):
- hw/dma/Makefile.objs          |   2 +-
+      target/arm: Create gen_gvec_[us]sra
- include/hw/arm/omap.h         |   6 +-
+      target/arm: Create gen_gvec_{u,s}{rshr,rsra}
- include/hw/arm/smmu-common.h  |   8 +-
+      target/arm: Create gen_gvec_{sri,sli}
- include/hw/devices.h          |  62 ---
+      target/arm: Remove unnecessary range check for VSHL
- include/hw/display/blizzard.h |  22 ++
+      target/arm: Tidy handle_vec_simd_shri
- include/hw/display/tc6393xb.h |  24 ++
+      target/arm: Create gen_gvec_{ceq,clt,cle,cgt,cge}0
- include/hw/input/gamepad.h    |  19 +
+      target/arm: Create gen_gvec_{mla,mls}
- include/hw/input/tsc2xxx.h    |  36 ++
+      target/arm: Swap argument order for VSHL during decode
- include/hw/misc/cbus.h        |  32 ++
+      target/arm: Create gen_gvec_{cmtst,ushl,sshl}
- include/hw/net/lan9118.h      |  21 +
+      target/arm: Create gen_gvec_{uqadd, sqadd, uqsub, sqsub}
- include/hw/net/ne2000-isa.h   |   6 +
+      target/arm: Remove fp_status from helper_{recpe, rsqrte}_u32
- include/hw/net/smc91c111.h    |  19 +
+      target/arm: Create gen_gvec_{qrdmla,qrdmls}
- include/qemu/typedefs.h       |   1 -
+      target/arm: Pass pointer to qc to qrdmla/qrdmls
- target/arm/cpu.h              |  95 ++++-
+      target/arm: Clear tail in gvec_fmul_idx_*, gvec_fmla_idx_*
- target/arm/helper.h           |   5 +
+      target/arm: Vectorize SABD/UABD
- target/arm/translate.h        |   3 +
+      target/arm: Vectorize SABA/UABA
  hw/arm/aspeed.c               |  13 +-
  hw/arm/exynos4_boards.c       |   3 +-
  hw/arm/gumstix.c              |   2 +-
  hw/arm/integratorcp.c         |   2 +-
  hw/arm/kzm.c                  |   2 +-
  hw/arm/mainstone.c            |   2 +-
  hw/arm/mps2-tz.c              |   3 +-
  hw/arm/mps2.c                 |   2 +-
  hw/arm/nseries.c              |   7 +-
  hw/arm/palm.c                 |   2 +-
  hw/arm/realview.c             |   3 +-
  hw/arm/smmu-common.c          |   6 +-
  hw/arm/smmuv3.c               |  28 +-
  hw/arm/stellaris.c            |   2 +-
  hw/arm/tosa.c                 |   2 +-
  hw/arm/versatilepb.c          |   2 +-
  hw/arm/vexpress.c             |   2 +-
  hw/display/blizzard.c         |   2 +-
  hw/display/tc6393xb.c         |  18 +-
  hw/input/stellaris_input.c    |   2 +-
  hw/input/tsc2005.c            |   2 +-
  hw/input/tsc210x.c            |   4 +-
  hw/intc/armv7m_nvic.c         | 261 +++++++++++++
  hw/misc/cbus.c                |   2 +-
  hw/net/lan9118.c              |   3 +-
  hw/net/smc91c111.c            |   2 +-
  hw/ssi/xilinx_spips.c         |   6 +-
  target/arm/cpu.c              |  20 +
  target/arm/helper.c           | 873 +++++++++++++++++++++++++++++++++++++++---
  target/arm/machine.c          |  16 +
  target/arm/translate.c        | 150 +++++++-
  target/arm/vfp_helper.c       |   8 +
  MAINTAINERS                   |   7 +
 files changed, 1595 insertions(+), 235 deletions(-)
  delete mode 100644 include/hw/devices.h
  create mode 100644 include/hw/display/blizzard.h
  create mode 100644 include/hw/display/tc6393xb.h
  create mode 100644 include/hw/input/gamepad.h
  create mode 100644 include/hw/input/tsc2xxx.h
  create mode 100644 include/hw/misc/cbus.h
  create mode 100644 include/hw/net/lan9118.h
  create mode 100644 include/hw/net/smc91c111.h
+ docs/specs/acpi_hest_ghes.rst          |  110 ++
+ docs/specs/index.rst                   |    1 +
+ configure                              |    4 +-
+ default-configs/arm-softmmu.mak        |    1 +
+ include/hw/acpi/aml-build.h            |    1 +
+ include/hw/acpi/generic_event_device.h |    2 +
+ include/hw/acpi/ghes.h                 |   74 +
+ include/hw/arm/virt.h                  |    1 +
+ include/qemu/uuid.h                    |   27 +
+ include/sysemu/kvm.h                   |    3 +-
+ include/sysemu/kvm_int.h               |   12 +
+ target/arm/cpu.h                       |    4 +
+ target/arm/helper.h                    |   78 +-
+ target/arm/internals.h                 |    5 +-
+ target/arm/translate.h                 |   84 +-
+ target/i386/cpu.h                      |    2 +
+ target/arm/neon-dp.decode              |  119 +-
+ accel/kvm/kvm-all.c                    |   36 +
+ hw/acpi/aml-build.c                    |    2 +
+ hw/acpi/generic_event_device.c         |   19 +
+ hw/acpi/ghes.c                         |  448 ++++++
+ hw/acpi/nvdimm.c                       |   10 +-
+ hw/arm/aspeed.c                        |   78 ++
+ hw/arm/virt-acpi-build.c               |   15 +
+ hw/arm/virt.c                          |   23 +
+ target/arm/cpu_tcg.c                   |    1 +
+ target/arm/gdbstub.c                   |   22 +-
+ target/arm/helper.c                    |    2 +-
+ target/arm/kvm64.c                     |   77 ++
+ target/arm/neon_helper.c               |   17 -
+ target/arm/tlb_helper.c                |    2 +-
+ target/arm/translate-a64.c             |  210 +--
+ target/arm/translate-neon.inc.c        |  682 +++++++++-
+ target/arm/translate.c                 | 2349 +++++++++++++++++---------------
+ target/arm/vec_helper.c                |  240 +++-
+ target/arm/vfp_helper.c                |    9 +-
+ target/i386/kvm.c                      |   36 -
+ MAINTAINERS                            |    9 +
+ gdb-xml/arm-m-profile.xml              |   27 +
+ hw/acpi/Kconfig                        |    4 +
+ hw/acpi/Makefile.objs                  |    1 +
+files changed, 3402 insertions(+), 1445 deletions(-)
+ create mode 100644 docs/specs/acpi_hest_ghes.rst
+ create mode 100644 include/hw/acpi/ghes.h
+ create mode 100644 hw/acpi/ghes.c
+ create mode 100644 gdb-xml/arm-m-profile.xml

-[Qemu-devel] [PULL 03/42] configure: Remove --source-path option
+[PULL 01/45] target/arm: Use correct GDB XML for M-profile cores
-Normally configure identifies the source path by looking
+GDB's remote protocol requires M-profile cores to use the feature
-at the location where the configure script itself exists.
+name 'org.gnu.gdb.arm.m-profile' instead of the 'org.gnu.gdb.arm.core'
-We also provide a --source-path option which lets the user
+feature used for A- and R-profile cores. We weren't doing this, which
-manually override this.
+meant GDB treated our M-profile cores like A-profile ones. This mostly
 doesn't matter, but for instance means that it doesn't correctly
 handle backtraces where an M-profile exception frame is involved.
-There isn't really an obvious use case for the --source-path
+Ship a copy of GDB's arm-m-profile.xml and use it on the M-profile
-option, and in commit 927128222b0a91f56c13a in 2017 we
+cores.  The integer registers have the same offsets as the
-accidentally added some logic that looks at $source_path
+arm-core.xml, but register 25 is the M-profile XPSR rather than the
-before the command line option that overrides it has been
+A-profile CPSR, so we need to update arm_cpu_gdb_read_register() and
-processed.
+arm_cpu_gdb_write_register() to handle XPSR reads and writes.
-The fact that nobody complained suggests that there isn't
+Fixes: https://bugs.launchpad.net/qemu/+bug/1877136
 any use of this option and we aren't testing it either;
 remove it. This allows us to move the "make $source_path
 absolute" logic up so that there is no window in the script
 where $source_path is set but not yet absolute.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
+Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
-Message-id: 20190318134019.23729-1-peter.maydell@linaro.org
+Message-id: 20200507134755.13997-1-peter.maydell@linaro.org
 ---
- configure | 10 ++--------
+ configure                 |  4 ++--
-file changed, 2 insertions(+), 8 deletions(-)
+ target/arm/cpu_tcg.c      |  1 +
  target/arm/gdbstub.c      | 22 ++++++++++++++++++----
  gdb-xml/arm-m-profile.xml | 27 +++++++++++++++++++++++++++
 files changed, 48 insertions(+), 6 deletions(-)
  create mode 100644 gdb-xml/arm-m-profile.xml
 diff --git a/configure b/configure
 index XXXXXXX..XXXXXXX 100755
 --- a/configure
 +++ b/configure
-@@ -XXX,XX +XXX,XX @@ ld_has() {
+@@ -XXX,XX +XXX,XX @@ case "$target_name" in
+     TARGET_SYSTBL_ABI=common,oabi
- # default parameters
+     bflt="yes"
- source_path=$(dirname "$0")
+     mttcg="yes"
-+# make source path absolute
+-    gdb_xml_files="arm-core.xml arm-vfp.xml arm-vfp3.xml arm-neon.xml"
-+source_path=$(cd "$source_path"; pwd)
++    gdb_xml_files="arm-core.xml arm-vfp.xml arm-vfp3.xml arm-neon.xml arm-m-profile.xml"
  cpu=""
  iasl="iasl"
  interp_prefix="/usr/gnemul/qemu-%M"
@@ -XXX,XX +XXX,XX @@ for opt do
    ;;
-   --cxx=*) CXX="$optarg"
+   aarch64|aarch64_be)
      TARGET_ARCH=aarch64
      TARGET_BASE_ARCH=arm
      bflt="yes"
      mttcg="yes"
 -    gdb_xml_files="aarch64-core.xml aarch64-fpu.xml arm-core.xml arm-vfp.xml arm-vfp3.xml arm-neon.xml"
 +    gdb_xml_files="aarch64-core.xml aarch64-fpu.xml arm-core.xml arm-vfp.xml arm-vfp3.xml arm-neon.xml arm-m-profile.xml"
    ;;
--  --source-path=*) source_path="$optarg"
+   cris)
 -  ;;
    --cpu=*) cpu="$optarg"
    ;;
-   --extra-cflags=*) QEMU_CFLAGS="$QEMU_CFLAGS $optarg"
+diff --git a/target/arm/cpu_tcg.c b/target/arm/cpu_tcg.c
-@@ -XXX,XX +XXX,XX @@ if test "$debug_info" = "yes"; then
+index XXXXXXX..XXXXXXX 100644
-     LDFLAGS="-g $LDFLAGS"
+--- a/target/arm/cpu_tcg.c
- fi
++++ b/target/arm/cpu_tcg.c
+@@ -XXX,XX +XXX,XX @@ static void arm_v7m_class_init(ObjectClass *oc, void *data)
--# make source path absolute
+ #endif
--source_path=$(cd "$source_path"; pwd)
--
+     cc->cpu_exec_interrupt = arm_v7m_cpu_exec_interrupt;
- # running configure in the source tree?
++    cc->gdb_core_xml_file = "arm-m-profile.xml";
- # we know that's the case if configure is there.
+ }
- if test -f "./configure"; then
-@@ -XXX,XX +XXX,XX @@ for opt do
+ static const ARMCPUInfo arm_tcg_cpus[] = {
-   ;;
+diff --git a/target/arm/gdbstub.c b/target/arm/gdbstub.c
-   --interp-prefix=*) interp_prefix="$optarg"
+index XXXXXXX..XXXXXXX 100644
-   ;;
+--- a/target/arm/gdbstub.c
--  --source-path=*)
++++ b/target/arm/gdbstub.c
--  ;;
+@@ -XXX,XX +XXX,XX @@ int arm_cpu_gdb_read_register(CPUState *cs, GByteArray *mem_buf, int n)
-   --cross-prefix=*)
+         }
-   ;;
+         return gdb_get_reg32(mem_buf, 0);
-   --cc=*)
+     case 25:
-@@ -XXX,XX +XXX,XX @@ $(echo Available targets: $default_target_list | \
+-        /* CPSR */
-   --target-list-exclude=LIST exclude a set of targets from the default target-list
+-        return gdb_get_reg32(mem_buf, cpsr_read(env));
++        /* CPSR, or XPSR for M-profile */
- Advanced options (experts only):
++        if (arm_feature(env, ARM_FEATURE_M)) {
--  --source-path=PATH       path of source code [$source_path]
++            return gdb_get_reg32(mem_buf, xpsr_read(env));
-   --cross-prefix=PREFIX    use PREFIX for compile tools [$cross_prefix]
++        } else {
-   --cc=CC                  use C compiler CC [$cc]
++            return gdb_get_reg32(mem_buf, cpsr_read(env));
-   --iasl=IASL              use ACPI compiler IASL [$iasl]
++        }
      }
      /* Unknown register.  */
      return 0;
@@ -XXX,XX +XXX,XX @@ int arm_cpu_gdb_write_register(CPUState *cs, uint8_t *mem_buf, int n)
          }
          return 4;
      case 25:
 -        /* CPSR */
 -        cpsr_write(env, tmp, 0xffffffff, CPSRWriteByGDBStub);
 +        /* CPSR, or XPSR for M-profile */
 +        if (arm_feature(env, ARM_FEATURE_M)) {
 +            /*
 +             * Don't allow writing to XPSR.Exception as it can cause
 +             * a transition into or out of handler mode (it's not
 +             * writeable via the MSR insn so this is a reasonable
 +             * restriction). Other fields are safe to update.
 +             */
 +            xpsr_write(env, tmp, ~XPSR_EXCP);
 +        } else {
 +            cpsr_write(env, tmp, 0xffffffff, CPSRWriteByGDBStub);
 +        }
          return 4;
      }
      /* Unknown register.  */
 diff --git a/gdb-xml/arm-m-profile.xml b/gdb-xml/arm-m-profile.xml
 new file mode 100644
 index XXXXXXX..XXXXXXX
 --- /dev/null
 +++ b/gdb-xml/arm-m-profile.xml
@@ -XXX,XX +XXX,XX @@
 +<?xml version="1.0"?>
 +<!-- Copyright (C) 2010-2020 Free Software Foundation, Inc.
 +
 +     Copying and distribution of this file, with or without modification,
 +     are permitted in any medium without royalty provided the copyright
 +     notice and this notice are preserved.  -->
 +
 +<!DOCTYPE feature SYSTEM "gdb-target.dtd">
 +<feature name="org.gnu.gdb.arm.m-profile">
 +  <reg name="r0" bitsize="32"/>
 +  <reg name="r1" bitsize="32"/>
 +  <reg name="r2" bitsize="32"/>
 +  <reg name="r3" bitsize="32"/>
 +  <reg name="r4" bitsize="32"/>
 +  <reg name="r5" bitsize="32"/>
 +  <reg name="r6" bitsize="32"/>
 +  <reg name="r7" bitsize="32"/>
 +  <reg name="r8" bitsize="32"/>
 +  <reg name="r9" bitsize="32"/>
 +  <reg name="r10" bitsize="32"/>
 +  <reg name="r11" bitsize="32"/>
 +  <reg name="r12" bitsize="32"/>
 +  <reg name="sp" bitsize="32" type="data_ptr"/>
 +  <reg name="lr" bitsize="32"/>
 +  <reg name="pc" bitsize="32" type="code_ptr"/>
 +  <reg name="xpsr" bitsize="32" regnum="25"/>
 +</feature>
 --
 .20.1
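
For readers unfamiliar with the masked-write idiom used in the register 25
write path above, the following is a minimal sketch of the idea only. It is
not QEMU's xpsr_write(), which scatters the value across several pieces of
CPU state. The names masked_write(), sketch_gdb_write_xpsr() and
SKETCH_XPSR_EXCP are hypothetical, and the value 0x1ff assumes that the
exception number occupies bits [8:0] of XPSR.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for QEMU's XPSR_EXCP mask. Assumption: the
 * exception number lives in bits [8:0] of XPSR.
 */
#define SKETCH_XPSR_EXCP 0x1ffu

/* Replace only the bits of 'old' selected by 'mask' with those of 'val'. */
static uint32_t masked_write(uint32_t old, uint32_t val, uint32_t mask)
{
    return (old & ~mask) | (val & mask);
}

/*
 * Model of a debugger write to register 25 on an M-profile core:
 * every field may change except the exception number, so the core
 * cannot be moved into or out of handler mode by the debugger.
 */
static uint32_t sketch_gdb_write_xpsr(uint32_t old_xpsr, uint32_t val)
{
    return masked_write(old_xpsr, val, ~SKETCH_XPSR_EXCP);
}
```

With this model, a write whose low nine bits differ from the current value
leaves those bits untouched while the flag bits in the top of the word are
updated.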

-[Qemu-devel] [PULL 23/42] target/arm: New helper function arm_v7m_mmu_idx_all()
+[PULL 02/45] target/arm: Create gen_gvec_[us]sra
-Add a new helper function which returns the MMU index to use
+From: Richard Henderson <richard.henderson@linaro.org>
-for v7M, where the caller specifies all of the security
-state, privilege level and whether the execution priority
+The functions eliminate duplication of the special cases for
-is negative, and reimplement the existing
+this operation.  They match up with the GVecGen2iFn typedef.
-arm_v7m_mmu_idx_for_secstate_and_priv() in terms of it.
+Add out-of-line helpers.  We got away with only having inline
-We are going to need this for the lazy-FP-stacking code.
+expanders because the neon vector size is only 16 bytes, and
+we know that the inline expansion will always succeed.
 When we reuse this for SVE, tcg-gvec-op may decide to use an
 out-of-line helper due to longer vector lengths.
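
The special cases that gen_gvec_ssra() and gen_gvec_usra() fold in can be
illustrated per element. This is a scalar sketch for 8-bit elements only,
not the TCG expansion or the out-of-line helper; sketch_ssra8() and
sketch_usra8() are hypothetical names. It assumes, as QEMU itself does,
that right-shifting a negative signed value is an arithmetic shift.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Signed shift-right-accumulate on one 8-bit element.
 * 'shift' is in [1..8]. A shift by the full element size yields all
 * sign bits, which is the same result as shifting by 7, so clamp.
 */
static int8_t sketch_ssra8(int8_t d, int8_t n, int shift)
{
    if (shift > 7) {
        shift = 7;
    }
    /* Wrap the sum modulo 2^8, as the vector operation would. */
    return (int8_t)(uint8_t)(d + (n >> shift));
}

/*
 * Unsigned shift-right-accumulate on one 8-bit element.
 * A shift by the full element size yields zero, so the accumulate
 * leaves the destination unchanged.
 */
static uint8_t sketch_usra8(uint8_t d, uint8_t n, int shift)
{
    if (shift >= 8) {
        return d;
    }
    return (uint8_t)(d + (n >> shift));
}
```

The vector versions differ in one respect that a scalar model cannot show:
even when the unsigned accumulate is a no-op on the active elements, the
bytes between opr_sz and max_sz still have to be cleared, which is why the
patch emits a move of the destination onto itself in that case.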
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
 Message-id: 20200513163245.17915-2-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-21-peter.maydell@linaro.org
 ---
- target/arm/cpu.h    |  7 +++++++
+ target/arm/helper.h        |  10 +++
- target/arm/helper.c | 14 +++++++++++---
+ target/arm/translate.h     |   7 +-
-files changed, 18 insertions(+), 3 deletions(-)
+ target/arm/translate-a64.c |  15 +---
+ target/arm/translate.c     | 161 ++++++++++++++++++++++---------------
-diff --git a/target/arm/cpu.h b/target/arm/cpu.h
+ target/arm/vec_helper.c    |  25 ++++++
-index XXXXXXX..XXXXXXX 100644
+files changed, 139 insertions(+), 79 deletions(-)
---- a/target/arm/cpu.h
-+++ b/target/arm/cpu.h
+diff --git a/target/arm/helper.h b/target/arm/helper.h
-@@ -XXX,XX +XXX,XX @@ static inline int arm_mmu_idx_to_el(ARMMMUIdx mmu_idx)
+index XXXXXXX..XXXXXXX 100644
-     }
+--- a/target/arm/helper.h
 +++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(gvec_pmull_q, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
  DEF_HELPER_FLAGS_4(neon_pmull_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_3(gvec_ssra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_3(gvec_ssra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_3(gvec_ssra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_3(gvec_ssra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +
 +DEF_HELPER_FLAGS_3(gvec_usra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_3(gvec_usra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_3(gvec_usra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_3(gvec_usra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +
  #ifdef TARGET_AARCH64
  #include "helper-a64.h"
  #include "helper-sve.h"
 diff --git a/target/arm/translate.h b/target/arm/translate.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.h
 +++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ extern const GVecGen3 mls_op[4];
  extern const GVecGen3 cmtst_op[4];
  extern const GVecGen3 sshl_op[4];
  extern const GVecGen3 ushl_op[4];
 -extern const GVecGen2i ssra_op[4];
 -extern const GVecGen2i usra_op[4];
  extern const GVecGen2i sri_op[4];
  extern const GVecGen2i sli_op[4];
  extern const GVecGen4 uqadd_op[4];
@@ -XXX,XX +XXX,XX @@ void gen_sshl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b);
  void gen_ushl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
  void gen_sshl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
 +void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 +                   int64_t shift, uint32_t opr_sz, uint32_t max_sz);
 +void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 +                   int64_t shift, uint32_t opr_sz, uint32_t max_sz);
 +
  /*
   * Forward to the isar_feature_* tests given a DisasContext pointer.
   */
 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-a64.c
 +++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
      switch (opcode) {
      case 0x02: /* SSRA / USRA (accumulate) */
 -        if (is_u) {
 -            /* Shift count same as element size produces zero to add.  */
 -            if (shift == 8 << size) {
 -                goto done;
 -            }
 -            gen_gvec_op2i(s, is_q, rd, rn, shift, &usra_op[size]);
 -        } else {
 -            /* Shift count same as element size produces all sign to add.  */
 -            if (shift == 8 << size) {
 -                shift -= 1;
 -            }
 -            gen_gvec_op2i(s, is_q, rd, rn, shift, &ssra_op[size]);
 -        }
 +        gen_gvec_fn2i(s, is_q, rd, rn, shift,
 +                      is_u ? gen_gvec_usra : gen_gvec_ssra, size);
          return;
      case 0x08: /* SRI */
          /* Shift count same as element size is valid but does nothing.  */
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_ssra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
      tcg_gen_add_vec(vece, d, d, a);
  }
-+/*
+-static const TCGOpcode vecop_list_ssra[] = {
-+ * Return the MMU index for a v7M CPU with all relevant information
+-    INDEX_op_sari_vec, INDEX_op_add_vec, 0
-+ * manually specified.
+-};
-+ */
++void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
-+ARMMMUIdx arm_v7m_mmu_idx_all(CPUARMState *env,
++                   int64_t shift, uint32_t opr_sz, uint32_t max_sz)
-+                              bool secstate, bool priv, bool negpri);
++{
-+
++    static const TCGOpcode vecop_list[] = {
- /* Return the MMU index for a v7M CPU in the specified security and
++        INDEX_op_sari_vec, INDEX_op_add_vec, 0
-  * privilege state.
++    };
-  */
++    static const GVecGen2i ops[4] = {
-diff --git a/target/arm/helper.c b/target/arm/helper.c
++        { .fni8 = gen_ssra8_i64,
-index XXXXXXX..XXXXXXX 100644
++          .fniv = gen_ssra_vec,
---- a/target/arm/helper.c
++          .fno = gen_helper_gvec_ssra_b,
-+++ b/target/arm/helper.c
++          .load_dest = true,
-@@ -XXX,XX +XXX,XX @@ int fp_exception_el(CPUARMState *env, int cur_el)
++          .opt_opc = vecop_list,
-     return 0;
++          .vece = MO_8 },
 +        { .fni8 = gen_ssra16_i64,
 +          .fniv = gen_ssra_vec,
 +          .fno = gen_helper_gvec_ssra_h,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_16 },
 +        { .fni4 = gen_ssra32_i32,
 +          .fniv = gen_ssra_vec,
 +          .fno = gen_helper_gvec_ssra_s,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_32 },
 +        { .fni8 = gen_ssra64_i64,
 +          .fniv = gen_ssra_vec,
 +          .fno = gen_helper_gvec_ssra_d,
 +          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +          .opt_opc = vecop_list,
 +          .load_dest = true,
 +          .vece = MO_64 },
 +    };
 -const GVecGen2i ssra_op[4] = {
 -    { .fni8 = gen_ssra8_i64,
 -      .fniv = gen_ssra_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_ssra,
 -      .vece = MO_8 },
 -    { .fni8 = gen_ssra16_i64,
 -      .fniv = gen_ssra_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_ssra,
 -      .vece = MO_16 },
 -    { .fni4 = gen_ssra32_i32,
 -      .fniv = gen_ssra_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_ssra,
 -      .vece = MO_32 },
 -    { .fni8 = gen_ssra64_i64,
 -      .fniv = gen_ssra_vec,
 -      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -      .opt_opc = vecop_list_ssra,
 -      .load_dest = true,
 -      .vece = MO_64 },
 -};
 +    /* tszimm encoding produces immediates in the range [1..esize]. */
 +    tcg_debug_assert(shift > 0);
 +    tcg_debug_assert(shift <= (8 << vece));
 +
 +    /*
 +     * Shifts larger than the element size are architecturally valid.
 +     * Signed results in all sign bits.
 +     */
 +    shift = MIN(shift, (8 << vece) - 1);
 +    tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
 +}
  static void gen_usra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
  {
@@ -XXX,XX +XXX,XX @@ static void gen_usra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
      tcg_gen_add_vec(vece, d, d, a);
  }
--ARMMMUIdx arm_v7m_mmu_idx_for_secstate_and_priv(CPUARMState *env,
+-static const TCGOpcode vecop_list_usra[] = {
--                                                bool secstate, bool priv)
+-    INDEX_op_shri_vec, INDEX_op_add_vec, 0
-+ARMMMUIdx arm_v7m_mmu_idx_all(CPUARMState *env,
+-};
-+                              bool secstate, bool priv, bool negpri)
++void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 +                   int64_t shift, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = {
 +        INDEX_op_shri_vec, INDEX_op_add_vec, 0
 +    };
 +    static const GVecGen2i ops[4] = {
 +        { .fni8 = gen_usra8_i64,
 +          .fniv = gen_usra_vec,
 +          .fno = gen_helper_gvec_usra_b,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_8, },
 +        { .fni8 = gen_usra16_i64,
 +          .fniv = gen_usra_vec,
 +          .fno = gen_helper_gvec_usra_h,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_16, },
 +        { .fni4 = gen_usra32_i32,
 +          .fniv = gen_usra_vec,
 +          .fno = gen_helper_gvec_usra_s,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_32, },
 +        { .fni8 = gen_usra64_i64,
 +          .fniv = gen_usra_vec,
 +          .fno = gen_helper_gvec_usra_d,
 +          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_64, },
 +    };
 -const GVecGen2i usra_op[4] = {
 -    { .fni8 = gen_usra8_i64,
 -      .fniv = gen_usra_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_usra,
 -      .vece = MO_8, },
 -    { .fni8 = gen_usra16_i64,
 -      .fniv = gen_usra_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_usra,
 -      .vece = MO_16, },
 -    { .fni4 = gen_usra32_i32,
 -      .fniv = gen_usra_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_usra,
 -      .vece = MO_32, },
 -    { .fni8 = gen_usra64_i64,
 -      .fniv = gen_usra_vec,
 -      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_usra,
 -      .vece = MO_64, },
 -};
 +    /* tszimm encoding produces immediates in the range [1..esize]. */
 +    tcg_debug_assert(shift > 0);
 +    tcg_debug_assert(shift <= (8 << vece));
 +
 +    /*
 +     * Shifts larger than the element size are architecturally valid.
 +     * Unsigned results in all zeros as input to accumulate: nop.
 +     */
 +    if (shift < (8 << vece)) {
 +        tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
 +    } else {
 +        /* Nop, but we do need to clear the tail. */
 +        tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz);
 +    }
 +}
  static void gen_shr8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
  {
-     ARMMMUIdx mmu_idx = ARM_MMU_IDX_M;
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+                 case 1:  /* VSRA */
-@@ -XXX,XX +XXX,XX @@ ARMMMUIdx arm_v7m_mmu_idx_for_secstate_and_priv(CPUARMState *env,
+                     /* Right shift comes here negative.  */
-         mmu_idx |= ARM_MMU_IDX_M_PRIV;
+                     shift = -shift;
-     }
+-                    /* Shifts larger than the element size are architecturally
+-                     * valid.  Unsigned results in all zeros; signed results
--    if (armv7m_nvic_neg_prio_requested(env->nvic, secstate)) {
+-                     * in all sign bits.
-+    if (negpri) {
+-                     */
-         mmu_idx |= ARM_MMU_IDX_M_NEGPRI;
+-                    if (!u) {
-     }
+-                        tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, vec_size,
+-                                        MIN(shift, (8 << size) - 1),
-@@ -XXX,XX +XXX,XX @@ ARMMMUIdx arm_v7m_mmu_idx_for_secstate_and_priv(CPUARMState *env,
+-                                        &ssra_op[size]);
-     return mmu_idx;
+-                    } else if (shift >= 8 << size) {
 -                        /* rd += 0 */
 +                    if (u) {
 +                        gen_gvec_usra(size, rd_ofs, rm_ofs, shift,
 +                                      vec_size, vec_size);
                      } else {
 -                        tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, vec_size,
 -                                        shift, &usra_op[size]);
 +                        gen_gvec_ssra(size, rd_ofs, rm_ofs, shift,
 +                                      vec_size, vec_size);
                      }
                      return 0;
 diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/vec_helper.c
 +++ b/target/arm/vec_helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_sqsub_d)(void *vd, void *vq, void *vn,
      clear_tail(d, oprsz, simd_maxsz(desc));
  }
-+ARMMMUIdx arm_v7m_mmu_idx_for_secstate_and_priv(CPUARMState *env,
++
-+                                                bool secstate, bool priv)
++#define DO_SRA(NAME, TYPE)                              \
-+{
++void HELPER(NAME)(void *vd, void *vn, uint32_t desc)    \
-+    bool negpri = armv7m_nvic_neg_prio_requested(env->nvic, secstate);
++{                                                       \
-+
++    intptr_t i, oprsz = simd_oprsz(desc);               \
-+    return arm_v7m_mmu_idx_all(env, secstate, priv, negpri);
++    int shift = simd_data(desc);                        \
 +    TYPE *d = vd, *n = vn;                              \
 +    for (i = 0; i < oprsz / sizeof(TYPE); i++) {        \
 +        d[i] += n[i] >> shift;                          \
 +    }                                                   \
 +    clear_tail(d, oprsz, simd_maxsz(desc));             \
 +}
 +
- /* Return the MMU index for a v7M CPU in the specified security state */
++DO_SRA(gvec_ssra_b, int8_t)
- ARMMMUIdx arm_v7m_mmu_idx_for_secstate(CPUARMState *env, bool secstate)
++DO_SRA(gvec_ssra_h, int16_t)
- {
++DO_SRA(gvec_ssra_s, int32_t)
 +DO_SRA(gvec_ssra_d, int64_t)
 +
 +DO_SRA(gvec_usra_b, uint8_t)
 +DO_SRA(gvec_usra_h, uint16_t)
 +DO_SRA(gvec_usra_s, uint32_t)
 +DO_SRA(gvec_usra_d, uint64_t)
 +
 +#undef DO_SRA
 +
  /*
   * Convert float16 to float32, raising no exceptions and
   * preserving exceptional values, including SNaN.
 --
 .20.1
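
The boundary cases handled by gen_gvec_ssra and gen_gvec_usra can be checked against a scalar model of a single 8-bit lane. The sketch below is illustrative only: the names ssra8_model and usra8_model are hypothetical and do not appear in the patch, and it assumes that ">>" on a negative int is an arithmetic shift and that narrowing to int8_t wraps, as on GCC and Clang.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Scalar model of one 8-bit lane of SSRA/USRA (shift right and accumulate).
 * Hypothetical helper names, not part of the patch.
 * Assumption: ">>" on a negative int is arithmetic and narrowing to int8_t
 * wraps (true for GCC and Clang; ISO C leaves both implementation-defined).
 */

/* Signed: a shift by the full element size acts like a shift by esize - 1. */
static int8_t ssra8_model(int8_t d, int8_t n, int shift)
{
    assert(shift >= 1 && shift <= 8);       /* immediates lie in [1..esize] */
    int clamped = shift < 8 ? shift : 7;    /* all sign bits either way */
    return (int8_t)(uint8_t)(d + (n >> clamped));
}

/* Unsigned: a shift by the full element size adds zero, so D is unchanged. */
static uint8_t usra8_model(uint8_t d, uint8_t n, int shift)
{
    assert(shift >= 1 && shift <= 8);
    if (shift == 8) {
        return d;                           /* accumulate of zero: nop */
    }
    return (uint8_t)(d + (n >> shift));     /* wraps modulo 256 */
}
```

The model covers lane arithmetic only. In the gvec version the unsigned nop case still issues a move of rd onto itself so that the tail up to max_sz is cleared.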

-[Qemu-devel] [PULL 38/42] hw/devices: Move TI touchscreen declarations into a new header
+[PULL 03/45] target/arm: Create gen_gvec_{u,s}{rshr,rsra}
-From: Philippe Mathieu-Daudé <philmd@redhat.com>
+From: Richard Henderson <richard.henderson@linaro.org>
-Since uWireSlave is only used in this new header, there is no
+Create vectorized versions of handle_shri_with_rndacc
-need to expose it via "qemu/typedefs.h".
+for shift+round and shift+round+accumulate.  Add out-of-line
 helpers in preparation for longer vector lengths from SVE.
-Reviewed-by: Markus Armbruster <armbru@redhat.com>
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190412165416.7977-9-philmd@redhat.com
+Message-id: 20200513163245.17915-3-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- include/hw/arm/omap.h      |  6 +-----
+ target/arm/helper.h        |  20 ++
- include/hw/devices.h       | 15 ---------------
+ target/arm/translate.h     |   9 +
- include/hw/input/tsc2xxx.h | 36 ++++++++++++++++++++++++++++++++++++
+ target/arm/translate-a64.c |  11 +-
- include/qemu/typedefs.h    |  1 -
+ target/arm/translate.c     | 463 +++++++++++++++++++++++++++++++++++--
- hw/arm/nseries.c           |  2 +-
+ target/arm/vec_helper.c    |  50 ++++
- hw/arm/palm.c              |  2 +-
+files changed, 527 insertions(+), 26 deletions(-)
  hw/input/tsc2005.c         |  2 +-
  hw/input/tsc210x.c         |  4 ++--
  MAINTAINERS                |  2 ++
 files changed, 44 insertions(+), 26 deletions(-)
  create mode 100644 include/hw/input/tsc2xxx.h
-diff --git a/include/hw/arm/omap.h b/include/hw/arm/omap.h
+diff --git a/target/arm/helper.h b/target/arm/helper.h
 index XXXXXXX..XXXXXXX 100644
---- a/include/hw/arm/omap.h
+--- a/target/arm/helper.h
-+++ b/include/hw/arm/omap.h
++++ b/target/arm/helper.h
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(gvec_usra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
- #include "exec/memory.h"
+ DEF_HELPER_FLAGS_3(gvec_usra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
- # define hw_omap_h        "omap.h"
+ DEF_HELPER_FLAGS_3(gvec_usra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
- #include "hw/irq.h"
-+#include "hw/input/tsc2xxx.h"
++DEF_HELPER_FLAGS_3(gvec_srshr_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
- #include "target/arm/cpu-qom.h"
++DEF_HELPER_FLAGS_3(gvec_srshr_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
- #include "qemu/log.h"
++DEF_HELPER_FLAGS_3(gvec_srshr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
++DEF_HELPER_FLAGS_3(gvec_srshr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
-@@ -XXX,XX +XXX,XX @@ qemu_irq *omap_mpuio_in_get(struct omap_mpuio_s *s);
++
- void omap_mpuio_out_set(struct omap_mpuio_s *s, int line, qemu_irq handler);
++DEF_HELPER_FLAGS_3(gvec_urshr_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
- void omap_mpuio_key(struct omap_mpuio_s *s, int row, int col, int down);
++DEF_HELPER_FLAGS_3(gvec_urshr_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
++DEF_HELPER_FLAGS_3(gvec_urshr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
--struct uWireSlave {
++DEF_HELPER_FLAGS_3(gvec_urshr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
--    uint16_t (*receive)(void *opaque);
++
--    void (*send)(void *opaque, uint16_t data);
++DEF_HELPER_FLAGS_3(gvec_srsra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
--    void *opaque;
++DEF_HELPER_FLAGS_3(gvec_srsra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
--};
++DEF_HELPER_FLAGS_3(gvec_srsra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
- struct omap_uwire_s;
++DEF_HELPER_FLAGS_3(gvec_srsra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
- void omap_uwire_attach(struct omap_uwire_s *s,
++
-                 uWireSlave *slave, int chipselect);
++DEF_HELPER_FLAGS_3(gvec_ursra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
-diff --git a/include/hw/devices.h b/include/hw/devices.h
++DEF_HELPER_FLAGS_3(gvec_ursra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_3(gvec_ursra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_3(gvec_ursra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +
  #ifdef TARGET_AARCH64
  #include "helper-a64.h"
  #include "helper-sve.h"
 diff --git a/target/arm/translate.h b/target/arm/translate.h
 index XXXXXXX..XXXXXXX 100644
---- a/include/hw/devices.h
+--- a/target/arm/translate.h
-+++ b/include/hw/devices.h
++++ b/target/arm/translate.h
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
- /* Devices that have nowhere better to go.  */
+ void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                    int64_t shift, uint32_t opr_sz, uint32_t max_sz);
- #include "hw/hw.h"
--#include "ui/console.h"
++void gen_gvec_srshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
++                    int64_t shift, uint32_t opr_sz, uint32_t max_sz);
- /* smc91c111.c */
++void gen_gvec_urshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
- void smc91c111_init(NICInfo *, uint32_t, qemu_irq);
++                    int64_t shift, uint32_t opr_sz, uint32_t max_sz);
-@@ -XXX,XX +XXX,XX @@ void smc91c111_init(NICInfo *, uint32_t, qemu_irq);
++void gen_gvec_srsra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
- /* lan9118.c */
++                    int64_t shift, uint32_t opr_sz, uint32_t max_sz);
- void lan9118_init(NICInfo *, uint32_t, qemu_irq);
++void gen_gvec_ursra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
++                    int64_t shift, uint32_t opr_sz, uint32_t max_sz);
--/* tsc210x.c */
++
--uWireSlave *tsc2102_init(qemu_irq pint);
+ /*
--uWireSlave *tsc2301_init(qemu_irq penirq, qemu_irq kbirq, qemu_irq dav);
+  * Forward to the isar_feature_* tests given a DisasContext pointer.
--I2SCodec *tsc210x_codec(uWireSlave *chip);
+  */
--uint32_t tsc210x_txrx(void *opaque, uint32_t value, int len);
+diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
--void tsc210x_set_transform(uWireSlave *chip,
+index XXXXXXX..XXXXXXX 100644
--                MouseTransformInfo *info);
+--- a/target/arm/translate-a64.c
--void tsc210x_key_event(uWireSlave *chip, int key, int down);
++++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
          return;
      case 0x04: /* SRSHR / URSHR (rounding) */
 -        break;
 +        gen_gvec_fn2i(s, is_q, rd, rn, shift,
 +                      is_u ? gen_gvec_urshr : gen_gvec_srshr, size);
 +        return;
 +
      case 0x06: /* SRSRA / URSRA (accum + rounding) */
 -        accumulate = true;
 -        break;
 +        gen_gvec_fn2i(s, is_q, rd, rn, shift,
 +                      is_u ? gen_gvec_ursra : gen_gvec_srsra, size);
 +        return;
 +
      default:
          g_assert_not_reached();
      }
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
      }
  }
 +/*
 + * Shift one less than the requested amount, and the low bit is
 + * the rounding bit.  For the 8 and 16-bit operations, because we
 + * mask the low bit, we can perform a normal integer shift instead
 + * of a vector shift.
 + */
 +static void gen_srshr8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
 +{
 +    TCGv_i64 t = tcg_temp_new_i64();
 +
 +    tcg_gen_shri_i64(t, a, sh - 1);
 +    tcg_gen_andi_i64(t, t, dup_const(MO_8, 1));
 +    tcg_gen_vec_sar8i_i64(d, a, sh);
 +    tcg_gen_vec_add8_i64(d, d, t);
 +    tcg_temp_free_i64(t);
 +}
 +
 +static void gen_srshr16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
 +{
 +    TCGv_i64 t = tcg_temp_new_i64();
 +
 +    tcg_gen_shri_i64(t, a, sh - 1);
 +    tcg_gen_andi_i64(t, t, dup_const(MO_16, 1));
 +    tcg_gen_vec_sar16i_i64(d, a, sh);
 +    tcg_gen_vec_add16_i64(d, d, t);
 +    tcg_temp_free_i64(t);
 +}
 +
 +static void gen_srshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
 +{
 +    TCGv_i32 t = tcg_temp_new_i32();
 +
 +    tcg_gen_extract_i32(t, a, sh - 1, 1);
 +    tcg_gen_sari_i32(d, a, sh);
 +    tcg_gen_add_i32(d, d, t);
 +    tcg_temp_free_i32(t);
 +}
 +
 +static void gen_srshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
 +{
 +    TCGv_i64 t = tcg_temp_new_i64();
 +
 +    tcg_gen_extract_i64(t, a, sh - 1, 1);
 +    tcg_gen_sari_i64(d, a, sh);
 +    tcg_gen_add_i64(d, d, t);
 +    tcg_temp_free_i64(t);
 +}
 +
 +static void gen_srshr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
 +{
 +    TCGv_vec t = tcg_temp_new_vec_matching(d);
 +    TCGv_vec ones = tcg_temp_new_vec_matching(d);
 +
 +    tcg_gen_shri_vec(vece, t, a, sh - 1);
 +    tcg_gen_dupi_vec(vece, ones, 1);
 +    tcg_gen_and_vec(vece, t, t, ones);
 +    tcg_gen_sari_vec(vece, d, a, sh);
 +    tcg_gen_add_vec(vece, d, d, t);
 +
 +    tcg_temp_free_vec(t);
 +    tcg_temp_free_vec(ones);
 +}
 +
 +void gen_gvec_srshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 +                    int64_t shift, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = {
 +        INDEX_op_shri_vec, INDEX_op_sari_vec, INDEX_op_add_vec, 0
 +    };
 +    static const GVecGen2i ops[4] = {
 +        { .fni8 = gen_srshr8_i64,
 +          .fniv = gen_srshr_vec,
 +          .fno = gen_helper_gvec_srshr_b,
 +          .opt_opc = vecop_list,
 +          .vece = MO_8 },
 +        { .fni8 = gen_srshr16_i64,
 +          .fniv = gen_srshr_vec,
 +          .fno = gen_helper_gvec_srshr_h,
 +          .opt_opc = vecop_list,
 +          .vece = MO_16 },
 +        { .fni4 = gen_srshr32_i32,
 +          .fniv = gen_srshr_vec,
 +          .fno = gen_helper_gvec_srshr_s,
 +          .opt_opc = vecop_list,
 +          .vece = MO_32 },
 +        { .fni8 = gen_srshr64_i64,
 +          .fniv = gen_srshr_vec,
 +          .fno = gen_helper_gvec_srshr_d,
 +          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +          .opt_opc = vecop_list,
 +          .vece = MO_64 },
 +    };
 +
 +    /* tszimm encoding produces immediates in the range [1..esize] */
 +    tcg_debug_assert(shift > 0);
 +    tcg_debug_assert(shift <= (8 << vece));
 +
 +    if (shift == (8 << vece)) {
 +        /*
 +         * Shifts larger than the element size are architecturally valid.
 +         * Signed results in all sign bits.  With rounding, this produces
 +         *   (-1 + 1) >> 1 == 0, or (0 + 1) >> 1 == 0.
 +         * I.e. always zero.
 +         */
 +        tcg_gen_gvec_dup_imm(vece, rd_ofs, opr_sz, max_sz, 0);
 +    } else {
 +        tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
 +    }
 +}
 +
 +static void gen_srsra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
 +{
 +    TCGv_i64 t = tcg_temp_new_i64();
 +
 +    gen_srshr8_i64(t, a, sh);
 +    tcg_gen_vec_add8_i64(d, d, t);
 +    tcg_temp_free_i64(t);
 +}
 +
 +static void gen_srsra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
 +{
 +    TCGv_i64 t = tcg_temp_new_i64();
 +
 +    gen_srshr16_i64(t, a, sh);
 +    tcg_gen_vec_add16_i64(d, d, t);
 +    tcg_temp_free_i64(t);
 +}
 +
 +static void gen_srsra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
 +{
 +    TCGv_i32 t = tcg_temp_new_i32();
 +
 +    gen_srshr32_i32(t, a, sh);
 +    tcg_gen_add_i32(d, d, t);
 +    tcg_temp_free_i32(t);
 +}
 +
 +static void gen_srsra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
 +{
 +    TCGv_i64 t = tcg_temp_new_i64();
 +
 +    gen_srshr64_i64(t, a, sh);
 +    tcg_gen_add_i64(d, d, t);
 +    tcg_temp_free_i64(t);
 +}
 +
 +static void gen_srsra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
 +{
 +    TCGv_vec t = tcg_temp_new_vec_matching(d);
 +
 +    gen_srshr_vec(vece, t, a, sh);
 +    tcg_gen_add_vec(vece, d, d, t);
 +    tcg_temp_free_vec(t);
 +}
 +
 +void gen_gvec_srsra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 +                    int64_t shift, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = {
 +        INDEX_op_shri_vec, INDEX_op_sari_vec, INDEX_op_add_vec, 0
 +    };
 +    static const GVecGen2i ops[4] = {
 +        { .fni8 = gen_srsra8_i64,
 +          .fniv = gen_srsra_vec,
 +          .fno = gen_helper_gvec_srsra_b,
 +          .opt_opc = vecop_list,
 +          .load_dest = true,
 +          .vece = MO_8 },
 +        { .fni8 = gen_srsra16_i64,
 +          .fniv = gen_srsra_vec,
 +          .fno = gen_helper_gvec_srsra_h,
 +          .opt_opc = vecop_list,
 +          .load_dest = true,
 +          .vece = MO_16 },
 +        { .fni4 = gen_srsra32_i32,
 +          .fniv = gen_srsra_vec,
 +          .fno = gen_helper_gvec_srsra_s,
 +          .opt_opc = vecop_list,
 +          .load_dest = true,
 +          .vece = MO_32 },
 +        { .fni8 = gen_srsra64_i64,
 +          .fniv = gen_srsra_vec,
 +          .fno = gen_helper_gvec_srsra_d,
 +          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +          .opt_opc = vecop_list,
 +          .load_dest = true,
 +          .vece = MO_64 },
 +    };
 +
 +    /* tszimm encoding produces immediates in the range [1..esize] */
 +    tcg_debug_assert(shift > 0);
 +    tcg_debug_assert(shift <= (8 << vece));
 +
 +    /*
 +     * Shifts larger than the element size are architecturally valid.
 +     * Signed results in all sign bits.  With rounding, this produces
 +     *   (-1 + 1) >> 1 == 0, or (0 + 1) >> 1 == 0.
 +     * I.e. always zero.  With accumulation, this leaves D unchanged.
 +     */
 +    if (shift == (8 << vece)) {
 +        /* Nop, but we do need to clear the tail. */
 +        tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz);
 +    } else {
 +        tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
 +    }
 +}
 +
 +static void gen_urshr8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
 +{
 +    TCGv_i64 t = tcg_temp_new_i64();
 +
 +    tcg_gen_shri_i64(t, a, sh - 1);
 +    tcg_gen_andi_i64(t, t, dup_const(MO_8, 1));
 +    tcg_gen_vec_shr8i_i64(d, a, sh);
 +    tcg_gen_vec_add8_i64(d, d, t);
 +    tcg_temp_free_i64(t);
 +}
 +
 +static void gen_urshr16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
 +{
 +    TCGv_i64 t = tcg_temp_new_i64();
 +
 +    tcg_gen_shri_i64(t, a, sh - 1);
 +    tcg_gen_andi_i64(t, t, dup_const(MO_16, 1));
 +    tcg_gen_vec_shr16i_i64(d, a, sh);
 +    tcg_gen_vec_add16_i64(d, d, t);
 +    tcg_temp_free_i64(t);
 +}
 +
 +static void gen_urshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
 +{
 +    TCGv_i32 t = tcg_temp_new_i32();
 +
 +    tcg_gen_extract_i32(t, a, sh - 1, 1);
 +    tcg_gen_shri_i32(d, a, sh);
 +    tcg_gen_add_i32(d, d, t);
 +    tcg_temp_free_i32(t);
 +}
 +
 +static void gen_urshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
 +{
 +    TCGv_i64 t = tcg_temp_new_i64();
 +
 +    tcg_gen_extract_i64(t, a, sh - 1, 1);
 +    tcg_gen_shri_i64(d, a, sh);
 +    tcg_gen_add_i64(d, d, t);
 +    tcg_temp_free_i64(t);
 +}
 +
 +static void gen_urshr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t shift)
 +{
 +    TCGv_vec t = tcg_temp_new_vec_matching(d);
 +    TCGv_vec ones = tcg_temp_new_vec_matching(d);
 +
 +    tcg_gen_shri_vec(vece, t, a, shift - 1);
 +    tcg_gen_dupi_vec(vece, ones, 1);
 +    tcg_gen_and_vec(vece, t, t, ones);
 +    tcg_gen_shri_vec(vece, d, a, shift);
 +    tcg_gen_add_vec(vece, d, d, t);
 +
 +    tcg_temp_free_vec(t);
 +    tcg_temp_free_vec(ones);
 +}
 +
 +void gen_gvec_urshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 +                    int64_t shift, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = {
 +        INDEX_op_shri_vec, INDEX_op_add_vec, 0
 +    };
 +    static const GVecGen2i ops[4] = {
 +        { .fni8 = gen_urshr8_i64,
 +          .fniv = gen_urshr_vec,
 +          .fno = gen_helper_gvec_urshr_b,
 +          .opt_opc = vecop_list,
 +          .vece = MO_8 },
 +        { .fni8 = gen_urshr16_i64,
 +          .fniv = gen_urshr_vec,
 +          .fno = gen_helper_gvec_urshr_h,
 +          .opt_opc = vecop_list,
 +          .vece = MO_16 },
 +        { .fni4 = gen_urshr32_i32,
 +          .fniv = gen_urshr_vec,
 +          .fno = gen_helper_gvec_urshr_s,
 +          .opt_opc = vecop_list,
 +          .vece = MO_32 },
 +        { .fni8 = gen_urshr64_i64,
 +          .fniv = gen_urshr_vec,
 +          .fno = gen_helper_gvec_urshr_d,
 +          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +          .opt_opc = vecop_list,
 +          .vece = MO_64 },
 +    };
 +
 +    /* tszimm encoding produces immediates in the range [1..esize] */
 +    tcg_debug_assert(shift > 0);
 +    tcg_debug_assert(shift <= (8 << vece));
 +
 +    if (shift == (8 << vece)) {
 +        /*
 +         * Shifts larger than the element size are architecturally valid.
 +         * Unsigned results in zero.  With rounding, this produces a
 +         * copy of the most significant bit.
 +         */
 +        tcg_gen_gvec_shri(vece, rd_ofs, rm_ofs, shift - 1, opr_sz, max_sz);
 +    } else {
 +        tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
 +    }
 +}
 +
 +static void gen_ursra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
 +{
 +    TCGv_i64 t = tcg_temp_new_i64();
 +
 +    if (sh == 8) {
 +        tcg_gen_vec_shr8i_i64(t, a, 7);
 +    } else {
 +        gen_urshr8_i64(t, a, sh);
 +    }
 +    tcg_gen_vec_add8_i64(d, d, t);
 +    tcg_temp_free_i64(t);
 +}
 +
 +static void gen_ursra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
 +{
 +    TCGv_i64 t = tcg_temp_new_i64();
 +
 +    if (sh == 16) {
 +        tcg_gen_vec_shr16i_i64(t, a, 15);
 +    } else {
 +        gen_urshr16_i64(t, a, sh);
 +    }
 +    tcg_gen_vec_add16_i64(d, d, t);
 +    tcg_temp_free_i64(t);
 +}
 +
 +static void gen_ursra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
 +{
 +    TCGv_i32 t = tcg_temp_new_i32();
 +
 +    if (sh == 32) {
 +        tcg_gen_shri_i32(t, a, 31);
 +    } else {
 +        gen_urshr32_i32(t, a, sh);
 +    }
 +    tcg_gen_add_i32(d, d, t);
 +    tcg_temp_free_i32(t);
 +}
 +
 +static void gen_ursra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
 +{
 +    TCGv_i64 t = tcg_temp_new_i64();
 +
 +    if (sh == 64) {
 +        tcg_gen_shri_i64(t, a, 63);
 +    } else {
 +        gen_urshr64_i64(t, a, sh);
 +    }
 +    tcg_gen_add_i64(d, d, t);
 +    tcg_temp_free_i64(t);
 +}
 +
 +static void gen_ursra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
 +{
 +    TCGv_vec t = tcg_temp_new_vec_matching(d);
 +
 +    if (sh == (8 << vece)) {
 +        tcg_gen_shri_vec(vece, t, a, sh - 1);
 +    } else {
 +        gen_urshr_vec(vece, t, a, sh);
 +    }
 +    tcg_gen_add_vec(vece, d, d, t);
 +    tcg_temp_free_vec(t);
 +}
 +
 +void gen_gvec_ursra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 +                    int64_t shift, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = {
 +        INDEX_op_shri_vec, INDEX_op_add_vec, 0
 +    };
 +    static const GVecGen2i ops[4] = {
 +        { .fni8 = gen_ursra8_i64,
 +          .fniv = gen_ursra_vec,
 +          .fno = gen_helper_gvec_ursra_b,
 +          .opt_opc = vecop_list,
 +          .load_dest = true,
 +          .vece = MO_8 },
 +        { .fni8 = gen_ursra16_i64,
 +          .fniv = gen_ursra_vec,
 +          .fno = gen_helper_gvec_ursra_h,
 +          .opt_opc = vecop_list,
 +          .load_dest = true,
 +          .vece = MO_16 },
 +        { .fni4 = gen_ursra32_i32,
 +          .fniv = gen_ursra_vec,
 +          .fno = gen_helper_gvec_ursra_s,
 +          .opt_opc = vecop_list,
 +          .load_dest = true,
 +          .vece = MO_32 },
 +        { .fni8 = gen_ursra64_i64,
 +          .fniv = gen_ursra_vec,
 +          .fno = gen_helper_gvec_ursra_d,
 +          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +          .opt_opc = vecop_list,
 +          .load_dest = true,
 +          .vece = MO_64 },
 +    };
 +
 +    /* tszimm encoding produces immediates in the range [1..esize] */
 +    tcg_debug_assert(shift > 0);
 +    tcg_debug_assert(shift <= (8 << vece));
 +
 +    tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
 +}
 +
  static void gen_shr8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
  {
      uint64_t mask = dup_const(MO_8, 0xff >> shift);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                      }
                      return 0;
 +                case 2: /* VRSHR */
 +                    /* Right shift comes here negative.  */
 +                    shift = -shift;
 +                    if (u) {
 +                        gen_gvec_urshr(size, rd_ofs, rm_ofs, shift,
 +                                       vec_size, vec_size);
 +                    } else {
 +                        gen_gvec_srshr(size, rd_ofs, rm_ofs, shift,
 +                                       vec_size, vec_size);
 +                    }
 +                    return 0;
 +
 +                case 3: /* VRSRA */
 +                    /* Right shift comes here negative.  */
 +                    shift = -shift;
 +                    if (u) {
 +                        gen_gvec_ursra(size, rd_ofs, rm_ofs, shift,
 +                                       vec_size, vec_size);
 +                    } else {
 +                        gen_gvec_srsra(size, rd_ofs, rm_ofs, shift,
 +                                       vec_size, vec_size);
 +                    }
 +                    return 0;
 +
                  case 4: /* VSRI */
                      if (!u) {
                          return 1;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                          neon_load_reg64(cpu_V0, rm + pass);
                          tcg_gen_movi_i64(cpu_V1, imm);
                          switch (op) {
 -                        case 2: /* VRSHR */
 -                        case 3: /* VRSRA */
 -                            if (u)
 -                                gen_helper_neon_rshl_u64(cpu_V0, cpu_V0, cpu_V1);
 -                            else
 -                                gen_helper_neon_rshl_s64(cpu_V0, cpu_V0, cpu_V1);
 -                            break;
                          case 6: /* VQSHLU */
                              gen_helper_neon_qshlu_s64(cpu_V0, cpu_env,
                                                        cpu_V0, cpu_V1);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                          default:
                              g_assert_not_reached();
                          }
 -                        if (op == 3) {
 -                            /* Accumulate.  */
 -                            neon_load_reg64(cpu_V1, rd + pass);
 -                            tcg_gen_add_i64(cpu_V0, cpu_V0, cpu_V1);
 -                        }
                          neon_store_reg64(cpu_V0, rd + pass);
                      } else { /* size < 3 */
                          /* Operands in T0 and T1.  */
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                          tmp2 = tcg_temp_new_i32();
                          tcg_gen_movi_i32(tmp2, imm);
                          switch (op) {
 -                        case 2: /* VRSHR */
 -                        case 3: /* VRSRA */
 -                            GEN_NEON_INTEGER_OP(rshl);
 -                            break;
                          case 6: /* VQSHLU */
                              switch (size) {
                              case 0:
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                              g_assert_not_reached();
                          }
                          tcg_temp_free_i32(tmp2);
 -
--/* tsc2005.c */
+-                        if (op == 3) {
--void *tsc2005_init(qemu_irq pintdav);
+-                            /* Accumulate.  */
--uint32_t tsc2005_txrx(void *opaque, uint32_t value, int len);
+-                            tmp2 = neon_load_reg(rd, pass);
--void tsc2005_set_transform(void *opaque, MouseTransformInfo *info);
+-                            gen_neon_add(size, tmp, tmp2);
--
+-                            tcg_temp_free_i32(tmp2);
- #endif
+-                        }
-diff --git a/include/hw/input/tsc2xxx.h b/include/hw/input/tsc2xxx.h
+                         neon_store_reg(rd, pass, tmp);
-new file mode 100644
+                     }
-index XXXXXXX..XXXXXXX
+                 } /* for pass */
---- /dev/null
+diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
 +++ b/include/hw/input/tsc2xxx.h
@@ -XXX,XX +XXX,XX @@
 +/*
 + * TI touchscreen controller
 + *
 + * Copyright (c) 2006 Andrzej Zaborowski
 + * Copyright (C) 2008 Nokia Corporation
 + *
 + * This work is licensed under the terms of the GNU GPL, version 2 or later.
 + * See the COPYING file in the top-level directory.
 + */
 +
 +#ifndef HW_INPUT_TSC2XXX_H
 +#define HW_INPUT_TSC2XXX_H
 +
 +#include "hw/irq.h"
 +#include "ui/console.h"
 +
 +typedef struct uWireSlave {
 +    uint16_t (*receive)(void *opaque);
 +    void (*send)(void *opaque, uint16_t data);
 +    void *opaque;
 +} uWireSlave;
 +
 +/* tsc210x.c */
 +uWireSlave *tsc2102_init(qemu_irq pint);
 +uWireSlave *tsc2301_init(qemu_irq penirq, qemu_irq kbirq, qemu_irq dav);
 +I2SCodec *tsc210x_codec(uWireSlave *chip);
 +uint32_t tsc210x_txrx(void *opaque, uint32_t value, int len);
 +void tsc210x_set_transform(uWireSlave *chip, MouseTransformInfo *info);
 +void tsc210x_key_event(uWireSlave *chip, int key, int down);
 +
 +/* tsc2005.c */
 +void *tsc2005_init(qemu_irq pintdav);
 +uint32_t tsc2005_txrx(void *opaque, uint32_t value, int len);
 +void tsc2005_set_transform(void *opaque, MouseTransformInfo *info);
 +
 +#endif
 diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h
 index XXXXXXX..XXXXXXX 100644
---- a/include/qemu/typedefs.h
+--- a/target/arm/vec_helper.c
-+++ b/include/qemu/typedefs.h
++++ b/target/arm/vec_helper.c
-@@ -XXX,XX +XXX,XX @@ typedef struct RAMBlock RAMBlock;
+@@ -XXX,XX +XXX,XX @@ DO_SRA(gvec_usra_d, uint64_t)
- typedef struct Range Range;
- typedef struct SHPCDevice SHPCDevice;
+ #undef DO_SRA
- typedef struct SSIBus SSIBus;
--typedef struct uWireSlave uWireSlave;
++#define DO_RSHR(NAME, TYPE)                             \
- typedef struct VirtIODevice VirtIODevice;
++void HELPER(NAME)(void *vd, void *vn, uint32_t desc)    \
- typedef struct Visitor Visitor;
++{                                                       \
- typedef void SaveStateHandler(QEMUFile *f, void *opaque);
++    intptr_t i, oprsz = simd_oprsz(desc);               \
-diff --git a/hw/arm/nseries.c b/hw/arm/nseries.c
++    int shift = simd_data(desc);                        \
-index XXXXXXX..XXXXXXX 100644
++    TYPE *d = vd, *n = vn;                              \
---- a/hw/arm/nseries.c
++    for (i = 0; i < oprsz / sizeof(TYPE); i++) {        \
-+++ b/hw/arm/nseries.c
++        TYPE tmp = n[i] >> (shift - 1);                 \
-@@ -XXX,XX +XXX,XX @@
++        d[i] = (tmp >> 1) + (tmp & 1);                  \
- #include "ui/console.h"
++    }                                                   \
- #include "hw/boards.h"
++    clear_tail(d, oprsz, simd_maxsz(desc));             \
- #include "hw/i2c/i2c.h"
++}
--#include "hw/devices.h"
++
- #include "hw/display/blizzard.h"
++DO_RSHR(gvec_srshr_b, int8_t)
-+#include "hw/input/tsc2xxx.h"
++DO_RSHR(gvec_srshr_h, int16_t)
- #include "hw/misc/cbus.h"
++DO_RSHR(gvec_srshr_s, int32_t)
- #include "hw/misc/tmp105.h"
++DO_RSHR(gvec_srshr_d, int64_t)
- #include "hw/block/flash.h"
++
-diff --git a/hw/arm/palm.c b/hw/arm/palm.c
++DO_RSHR(gvec_urshr_b, uint8_t)
-index XXXXXXX..XXXXXXX 100644
++DO_RSHR(gvec_urshr_h, uint16_t)
---- a/hw/arm/palm.c
++DO_RSHR(gvec_urshr_s, uint32_t)
-+++ b/hw/arm/palm.c
++DO_RSHR(gvec_urshr_d, uint64_t)
-@@ -XXX,XX +XXX,XX @@
++
- #include "hw/arm/omap.h"
++#undef DO_RSHR
- #include "hw/boards.h"
++
- #include "hw/arm/arm.h"
++#define DO_RSRA(NAME, TYPE)                             \
--#include "hw/devices.h"
++void HELPER(NAME)(void *vd, void *vn, uint32_t desc)    \
-+#include "hw/input/tsc2xxx.h"
++{                                                       \
- #include "hw/loader.h"
++    intptr_t i, oprsz = simd_oprsz(desc);               \
- #include "exec/address-spaces.h"
++    int shift = simd_data(desc);                        \
- #include "cpu.h"
++    TYPE *d = vd, *n = vn;                              \
-diff --git a/hw/input/tsc2005.c b/hw/input/tsc2005.c
++    for (i = 0; i < oprsz / sizeof(TYPE); i++) {        \
-index XXXXXXX..XXXXXXX 100644
++        TYPE tmp = n[i] >> (shift - 1);                 \
---- a/hw/input/tsc2005.c
++        d[i] += (tmp >> 1) + (tmp & 1);                 \
-+++ b/hw/input/tsc2005.c
++    }                                                   \
-@@ -XXX,XX +XXX,XX @@
++    clear_tail(d, oprsz, simd_maxsz(desc));             \
- #include "hw/hw.h"
++}
- #include "qemu/timer.h"
++
- #include "ui/console.h"
++DO_RSRA(gvec_srsra_b, int8_t)
--#include "hw/devices.h"
++DO_RSRA(gvec_srsra_h, int16_t)
-+#include "hw/input/tsc2xxx.h"
++DO_RSRA(gvec_srsra_s, int32_t)
- #include "trace.h"
++DO_RSRA(gvec_srsra_d, int64_t)
++
- #define TSC_CUT_RESOLUTION(value, p)    ((value) >> (16 - (p ? 12 : 10)))
++DO_RSRA(gvec_ursra_b, uint8_t)
-diff --git a/hw/input/tsc210x.c b/hw/input/tsc210x.c
++DO_RSRA(gvec_ursra_h, uint16_t)
-index XXXXXXX..XXXXXXX 100644
++DO_RSRA(gvec_ursra_s, uint32_t)
---- a/hw/input/tsc210x.c
++DO_RSRA(gvec_ursra_d, uint64_t)
-+++ b/hw/input/tsc210x.c
++
-@@ -XXX,XX +XXX,XX @@
++#undef DO_RSRA
- #include "audio/audio.h"
++
- #include "qemu/timer.h"
+ /*
- #include "ui/console.h"
+  * Convert float16 to float32, raising no exceptions and
--#include "hw/arm/omap.h"    /* For I2SCodec and uWireSlave */
+  * preserving exceptional values, including SNaN.
 -#include "hw/devices.h"
 +#include "hw/arm/omap.h"            /* For I2SCodec */
 +#include "hw/input/tsc2xxx.h"
  #define TSC_DATA_REGISTERS_PAGE        0x0
  #define TSC_CONTROL_REGISTERS_PAGE    0x1
 diff --git a/MAINTAINERS b/MAINTAINERS
 index XXXXXXX..XXXXXXX 100644
 --- a/MAINTAINERS
 +++ b/MAINTAINERS
@@ -XXX,XX +XXX,XX @@ F: hw/input/tsc2005.c
  F: hw/misc/cbus.c
  F: hw/timer/twl92230.c
  F: include/hw/display/blizzard.h
 +F: include/hw/input/tsc2xxx.h
  F: include/hw/misc/cbus.h
  Palm
@@ -XXX,XX +XXX,XX @@ L: qemu-arm@nongnu.org
  S: Odd Fixes
  F: hw/arm/palm.c
  F: hw/input/tsc210x.c
 +F: include/hw/input/tsc2xxx.h
  Raspberry Pi
  M: Peter Maydell <peter.maydell@linaro.org>
 --
 .20.1

-[Qemu-devel] [PULL 26/42] target/arm: Implement M-profile lazy FP state preservation
+[PULL 04/45] target/arm: Create gen_gvec_{sri,sli}
-The M-profile architecture floating point system supports
+From: Richard Henderson <richard.henderson@linaro.org>
-lazy FP state preservation, where FP registers are not
-pushed to the stack when an exception occurs but are instead
+The functions eliminate duplication of the special cases for
-only saved if and when the first FP instruction in the exception
+this operation.  They match up with the GVecGen2iFn typedef.
-handler is executed. Implement this in QEMU, corresponding
-to the check of LSPACT in the pseudocode ExecuteFPCheck().
+Add out-of-line helpers.  We got away with only having inline
+expanders because the Neon vector size is only 16 bytes, and
 we know that the inline expansion will always succeed.
 When we reuse this for SVE, tcg-gvec-op may decide to use an
 out-of-line helper due to longer vector lengths.
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
 Message-id: 20200513163245.17915-4-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-24-peter.maydell@linaro.org
 ---
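For readers unfamiliar with the insert shifts, the following is a minimal scalar sketch of what SRI and SLI do to a single 8-bit element. It is not QEMU code: sri8 and sli8 are hypothetical names, and the model only mirrors the deposit-style helpers and the shift ranges asserted in gen_gvec_sri and gen_gvec_sli in this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of SRI on one 8-bit element (not QEMU code). */
static uint8_t sri8(uint8_t d, uint8_t n, int shift)
{
    assert(shift >= 1 && shift <= 8);   /* right shifts encode [1..esize] */
    if (shift == 8) {
        return d;                       /* shift of esize leaves d unchanged */
    }
    uint8_t mask = (uint8_t)(0xff >> shift);    /* low (8 - shift) bits */
    /* Keep the top 'shift' bits of d, insert n >> shift below them. */
    return (uint8_t)((d & ~mask) | ((n >> shift) & mask));
}

/* Hypothetical scalar model of SLI on one 8-bit element (not QEMU code). */
static uint8_t sli8(uint8_t d, uint8_t n, int shift)
{
    assert(shift >= 0 && shift <= 7);   /* left shifts encode [0..esize-1] */
    uint8_t mask = (uint8_t)(0xff << shift);    /* high (8 - shift) bits */
    /* Keep the low 'shift' bits of d, insert n << shift above them. */
    return (uint8_t)((d & ~mask) | ((n << shift) & mask));
}
```

With shift == 0, sli8 returns n unchanged, which is why the expander can fall back to a plain vector move in that case.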
- target/arm/cpu.h       |   3 ++
+ target/arm/helper.h        |  10 ++
- target/arm/helper.h    |   2 +
+ target/arm/translate.h     |   7 +-
- target/arm/translate.h |   1 +
+ target/arm/translate-a64.c |  20 +---
- target/arm/helper.c    | 112 +++++++++++++++++++++++++++++++++++++++++
+ target/arm/translate.c     | 186 +++++++++++++++++++++----------------
- target/arm/translate.c |  22 ++++++++
+ target/arm/vec_helper.c    |  38 ++++++++
-files changed, 140 insertions(+)
+files changed, 160 insertions(+), 101 deletions(-)
 diff --git a/target/arm/cpu.h b/target/arm/cpu.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu.h
 +++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@
  #define EXCP_NOCP           17   /* v7M NOCP UsageFault */
  #define EXCP_INVSTATE       18   /* v7M INVSTATE UsageFault */
  #define EXCP_STKOF          19   /* v8M STKOF UsageFault */
 +#define EXCP_LAZYFP         20   /* v7M fault during lazy FP stacking */
  /* NB: add new EXCP_ defines to the array in arm_log_exception() too */
  #define ARMV7M_EXCP_RESET   1
@@ -XXX,XX +XXX,XX @@ FIELD(TBFLAG_A32, NS, 6, 1)
  FIELD(TBFLAG_A32, VFPEN, 7, 1)
  FIELD(TBFLAG_A32, CONDEXEC, 8, 8)
  FIELD(TBFLAG_A32, SCTLR_B, 16, 1)
 +/* For M profile only, set if FPCCR.LSPACT is set */
 +FIELD(TBFLAG_A32, LSPACT, 18, 1)
  /* For M profile only, set if we must create a new FP context */
  FIELD(TBFLAG_A32, NEW_FP_CTXT_NEEDED, 19, 1)
  /* For M profile only, set if FPCCR.S does not match current security state */
 diff --git a/target/arm/helper.h b/target/arm/helper.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.h
 +++ b/target/arm/helper.h
-@@ -XXX,XX +XXX,XX @@ DEF_HELPER_2(v7m_blxns, void, env, i32)
+@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(gvec_ursra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+ DEF_HELPER_FLAGS_3(gvec_ursra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
- DEF_HELPER_3(v7m_tt, i32, env, i32, i32)
+ DEF_HELPER_FLAGS_3(gvec_ursra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
-+DEF_HELPER_1(v7m_preserve_fp_state, void, env)
++DEF_HELPER_FLAGS_3(gvec_sri_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
-+
++DEF_HELPER_FLAGS_3(gvec_sri_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
- DEF_HELPER_2(v8m_stackcheck, void, env, i32)
++DEF_HELPER_FLAGS_3(gvec_sri_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
++DEF_HELPER_FLAGS_3(gvec_sri_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
- DEF_HELPER_4(access_check_cp_reg, void, env, ptr, i32, i32)
++
 +DEF_HELPER_FLAGS_3(gvec_sli_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_3(gvec_sli_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_3(gvec_sli_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_3(gvec_sli_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +
  #ifdef TARGET_AARCH64
  #include "helper-a64.h"
  #include "helper-sve.h"
 diff --git a/target/arm/translate.h b/target/arm/translate.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.h
 +++ b/target/arm/translate.h
-@@ -XXX,XX +XXX,XX @@ typedef struct DisasContext {
+@@ -XXX,XX +XXX,XX @@ extern const GVecGen3 mls_op[4];
-     bool v8m_stackcheck; /* true if we need to perform v8M stack limit checks */
+ extern const GVecGen3 cmtst_op[4];
-     bool v8m_fpccr_s_wrong; /* true if v8M FPCCR.S != v8m_secure */
+ extern const GVecGen3 sshl_op[4];
-     bool v7m_new_fp_ctxt_needed; /* ASPEN set but no active FP context */
+ extern const GVecGen3 ushl_op[4];
-+    bool v7m_lspact; /* FPCCR.LSPACT set */
+-extern const GVecGen2i sri_op[4];
-     /* Immediate value in AArch32 SVC insn; must be set if is_jmp == DISAS_SWI
+-extern const GVecGen2i sli_op[4];
-      * so that top level loop can generate correct syndrome information.
+ extern const GVecGen4 uqadd_op[4];
-      */
+ extern const GVecGen4 sqadd_op[4];
-diff --git a/target/arm/helper.c b/target/arm/helper.c
+ extern const GVecGen4 uqsub_op[4];
@@ -XXX,XX +XXX,XX @@ void gen_gvec_srsra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
  void gen_gvec_ursra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
                      int64_t shift, uint32_t opr_sz, uint32_t max_sz);
 +void gen_gvec_sri(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 +                  int64_t shift, uint32_t opr_sz, uint32_t max_sz);
 +void gen_gvec_sli(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 +                  int64_t shift, uint32_t opr_sz, uint32_t max_sz);
 +
  /*
   * Forward to the isar_feature_* tests given a DisasContext pointer.
   */
 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
+--- a/target/arm/translate-a64.c
-+++ b/target/arm/helper.c
++++ b/target/arm/translate-a64.c
-@@ -XXX,XX +XXX,XX @@ void HELPER(v7m_blxns)(CPUARMState *env, uint32_t dest)
+@@ -XXX,XX +XXX,XX @@ static void gen_gvec_op2(DisasContext *s, bool is_q, int rd,
-     g_assert_not_reached();
+                    is_q ? 16 : 8, vec_full_reg_size(s), gvec_op);
  }
-+void HELPER(v7m_preserve_fp_state)(CPUARMState *env)
+-/* Expand a 2-operand + immediate AdvSIMD vector operation using
-+{
+- * an op descriptor.
-+    /* translate.c should never generate calls here in user-only mode */
+- */
-+    g_assert_not_reached();
+-static void gen_gvec_op2i(DisasContext *s, bool is_q, int rd,
-+}
+-                          int rn, int64_t imm, const GVecGen2i *gvec_op)
-+
+-{
- uint32_t HELPER(v7m_tt)(CPUARMState *env, uint32_t addr, uint32_t op)
+-    tcg_gen_gvec_2i(vec_full_reg_offset(s, rd), vec_full_reg_offset(s, rn),
- {
+-                    is_q ? 16 : 8, vec_full_reg_size(s), imm, gvec_op);
-     /* The TT instructions can be used by unprivileged code, but in
+-}
-@@ -XXX,XX +XXX,XX @@ pend_fault:
+-
-     return false;
+ /* Expand a 3-operand AdvSIMD vector operation using an op descriptor.  */
  static void gen_gvec_op3(DisasContext *s, bool is_q, int rd,
                           int rn, int rm, const GVecGen3 *gvec_op)
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
          gen_gvec_fn2i(s, is_q, rd, rn, shift,
                        is_u ? gen_gvec_usra : gen_gvec_ssra, size);
          return;
 +
      case 0x08: /* SRI */
 -        /* Shift count same as element size is valid but does nothing.  */
 -        if (shift == 8 << size) {
 -            goto done;
 -        }
 -        gen_gvec_op2i(s, is_q, rd, rn, shift, &sri_op[size]);
 +        gen_gvec_fn2i(s, is_q, rd, rn, shift, gen_gvec_sri, size);
          return;
      case 0x00: /* SSHR / USHR */
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
      }
      tcg_temp_free_i64(tcg_round);
 - done:
      clear_vec_high(s, is_q, rd);
  }
-+void HELPER(v7m_preserve_fp_state)(CPUARMState *env)
+@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shli(DisasContext *s, bool is_q, bool insert,
 +{
 +    /*
 +     * Preserve FP state (because LSPACT was set and we are about
 +     * to execute an FP instruction). This corresponds to the
 +     * PreserveFPState() pseudocode.
 +     * We may throw an exception if the stacking fails.
 +     */
 +    ARMCPU *cpu = arm_env_get_cpu(env);
 +    bool is_secure = env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_S_MASK;
 +    bool negpri = !(env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_HFRDY_MASK);
 +    bool is_priv = !(env->v7m.fpccr[is_secure] & R_V7M_FPCCR_USER_MASK);
 +    bool splimviol = env->v7m.fpccr[is_secure] & R_V7M_FPCCR_SPLIMVIOL_MASK;
 +    uint32_t fpcar = env->v7m.fpcar[is_secure];
 +    bool stacked_ok = true;
 +    bool ts = is_secure && (env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_TS_MASK);
 +    bool take_exception;
 +
 +    /* Take the iothread lock as we are going to touch the NVIC */
 +    qemu_mutex_lock_iothread();
 +
 +    /* Check the background context had access to the FPU */
 +    if (!v7m_cpacr_pass(env, is_secure, is_priv)) {
 +        armv7m_nvic_set_pending_lazyfp(env->nvic, ARMV7M_EXCP_USAGE, is_secure);
 +        env->v7m.cfsr[is_secure] |= R_V7M_CFSR_NOCP_MASK;
 +        stacked_ok = false;
 +    } else if (!is_secure && !extract32(env->v7m.nsacr, 10, 1)) {
 +        armv7m_nvic_set_pending_lazyfp(env->nvic, ARMV7M_EXCP_USAGE, M_REG_S);
 +        env->v7m.cfsr[M_REG_S] |= R_V7M_CFSR_NOCP_MASK;
 +        stacked_ok = false;
 +    }
 +
 +    if (!splimviol && stacked_ok) {
 +        /* We only stack if the stack limit wasn't violated */
 +        int i;
 +        ARMMMUIdx mmu_idx;
 +
 +        mmu_idx = arm_v7m_mmu_idx_all(env, is_secure, is_priv, negpri);
 +        for (i = 0; i < (ts ? 32 : 16); i += 2) {
 +            uint64_t dn = *aa32_vfp_dreg(env, i / 2);
 +            uint32_t faddr = fpcar + 4 * i;
 +            uint32_t slo = extract64(dn, 0, 32);
 +            uint32_t shi = extract64(dn, 32, 32);
 +
 +            if (i >= 16) {
 +                faddr += 8; /* skip the slot for the FPSCR */
 +            }
 +            stacked_ok = stacked_ok &&
 +                v7m_stack_write(cpu, faddr, slo, mmu_idx, STACK_LAZYFP) &&
 +                v7m_stack_write(cpu, faddr + 4, shi, mmu_idx, STACK_LAZYFP);
 +        }
 +
 +        stacked_ok = stacked_ok &&
 +            v7m_stack_write(cpu, fpcar + 0x40,
 +                            vfp_get_fpscr(env), mmu_idx, STACK_LAZYFP);
 +    }
 +
 +    /*
 +     * We definitely pended an exception, but it's possible that it
 +     * might not be able to be taken now. If its priority permits us
 +     * to take it now, then we must not update the LSPACT or FP regs,
 +     * but instead jump out to take the exception immediately.
 +     * If it's just pending and won't be taken until the current
 +     * handler exits, then we do update LSPACT and the FP regs.
 +     */
 +    take_exception = !stacked_ok &&
 +        armv7m_nvic_can_take_pending_exception(env->nvic);
 +
 +    qemu_mutex_unlock_iothread();
 +
 +    if (take_exception) {
 +        raise_exception_ra(env, EXCP_LAZYFP, 0, 1, GETPC());
 +    }
 +
 +    env->v7m.fpccr[is_secure] &= ~R_V7M_FPCCR_LSPACT_MASK;
 +
 +    if (ts) {
 +        /* Clear s0 to s31 and the FPSCR */
 +        int i;
 +
 +        for (i = 0; i < 32; i += 2) {
 +            *aa32_vfp_dreg(env, i / 2) = 0;
 +        }
 +        vfp_set_fpscr(env, 0);
 +    }
 +    /*
 +     * Otherwise s0 to s15 and FPSCR are UNKNOWN; we choose to leave them
 +     * unchanged.
 +     */
 +}
 +
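The stacking loop in v7m_preserve_fp_state() implies a fixed frame layout relative to FPCAR. A small sketch of that address arithmetic, assuming nothing beyond what the loop above does; lazyfp_sreg_offset and LAZYFP_FPSCR_OFFSET are hypothetical names, not QEMU identifiers:

```c
#include <stdint.h>

/* Offset of the FPSCR slot from FPCAR, as used by the helper above. */
#define LAZYFP_FPSCR_OFFSET 0x40u

/*
 * Hypothetical helper: byte offset from FPCAR at which single-precision
 * register Sn (0..31) is stacked. S0..S15 are contiguous; S16..S31 sit
 * 8 bytes further on, because the loop skips the slot holding the FPSCR.
 */
static uint32_t lazyfp_sreg_offset(unsigned sreg)
{
    uint32_t off = 4u * sreg;

    if (sreg >= 16) {
        off += 8;
    }
    return off;
}
```

Under this model S15 lands at 0x3c, the FPSCR at 0x40, and S16 at 0x48.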
  /* Write to v7M CONTROL.SPSEL bit for the specified security bank.
   * This may change the current stack pointer between Main and Process
   * stack pointers if it is done for the CONTROL register for the current
@@ -XXX,XX +XXX,XX @@ static void arm_log_exception(int idx)
              [EXCP_NOCP] = "v7M NOCP UsageFault",
              [EXCP_INVSTATE] = "v7M INVSTATE UsageFault",
              [EXCP_STKOF] = "v8M STKOF UsageFault",
 +            [EXCP_LAZYFP] = "v7M exception during lazy FP stacking",
          };
          if (idx >= 0 && idx < ARRAY_SIZE(excnames)) {
@@ -XXX,XX +XXX,XX @@ void arm_v7m_cpu_do_interrupt(CPUState *cs)
              return;
          }
          break;
 +    case EXCP_LAZYFP:
 +        /*
 +         * We already pended the specific exception in the NVIC in the
 +         * v7m_preserve_fp_state() helper function.
 +         */
 +        break;
      default:
          cpu_abort(cs, "Unhandled exception 0x%x\n", cs->exception_index);
          return; /* Never happens.  Keep compiler happy.  */
@@ -XXX,XX +XXX,XX @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
          flags = FIELD_DP32(flags, TBFLAG_A32, NEW_FP_CTXT_NEEDED, 1);
      }
-+    if (arm_feature(env, ARM_FEATURE_M)) {
+     if (insert) {
-+        bool is_secure = env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_S_MASK;
+-        gen_gvec_op2i(s, is_q, rd, rn, shift, &sli_op[size]);
-+
++        gen_gvec_fn2i(s, is_q, rd, rn, shift, gen_gvec_sli, size);
-+        if (env->v7m.fpccr[is_secure] & R_V7M_FPCCR_LSPACT_MASK) {
+     } else {
-+            flags = FIELD_DP32(flags, TBFLAG_A32, LSPACT, 1);
+         gen_gvec_fn2i(s, is_q, rd, rn, shift, tcg_gen_gvec_shli, size);
-+        }
+     }
 +    }
 +
      *pflags = flags;
      *cs_base = 0;
  }
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static void gen_shr64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
-     if (arm_dc_feature(s, ARM_FEATURE_M)) {
-         /* Handle M-profile lazy FP state mechanics */
+ static void gen_shr_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
+ {
-+        /* Trigger lazy-state preservation if necessary */
+-    if (sh == 0) {
-+        if (s->v7m_lspact) {
+-        tcg_gen_mov_vec(d, a);
-+            /*
+-    } else {
-+             * Lazy state saving affects external memory and also the NVIC,
+-        TCGv_vec t = tcg_temp_new_vec_matching(d);
-+             * so we must mark it as an IO operation for icount.
+-        TCGv_vec m = tcg_temp_new_vec_matching(d);
-+             */
++    TCGv_vec t = tcg_temp_new_vec_matching(d);
-+            if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
++    TCGv_vec m = tcg_temp_new_vec_matching(d);
-+                gen_io_start();
-+            }
+-        tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK((8 << vece) - sh, sh));
-+            gen_helper_v7m_preserve_fp_state(cpu_env);
+-        tcg_gen_shri_vec(vece, t, a, sh);
-+            if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
+-        tcg_gen_and_vec(vece, d, d, m);
-+                gen_io_end();
+-        tcg_gen_or_vec(vece, d, d, t);
-+            }
++    tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK((8 << vece) - sh, sh));
-+            /*
++    tcg_gen_shri_vec(vece, t, a, sh);
-+             * If the preserve_fp_state helper doesn't throw an exception
++    tcg_gen_and_vec(vece, d, d, m);
-+             * then it will clear LSPACT; we don't need to repeat this for
++    tcg_gen_or_vec(vece, d, d, t);
-+             * any further FP insns in this TB.
-+             */
+-        tcg_temp_free_vec(t);
-+            s->v7m_lspact = false;
+-        tcg_temp_free_vec(m);
-+        }
+-    }
-+
++    tcg_temp_free_vec(t);
-         /* Update ownership of FP context: set FPCCR.S to match current state */
++    tcg_temp_free_vec(m);
-         if (s->v8m_fpccr_s_wrong) {
+ }
-             TCGv_i32 tmp;
-@@ -XXX,XX +XXX,XX @@ static void arm_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
+-static const TCGOpcode vecop_list_sri[] = { INDEX_op_shri_vec, 0 };
-     dc->v8m_fpccr_s_wrong = FIELD_EX32(tb_flags, TBFLAG_A32, FPCCR_S_WRONG);
++void gen_gvec_sri(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
-     dc->v7m_new_fp_ctxt_needed =
++                  int64_t shift, uint32_t opr_sz, uint32_t max_sz)
-         FIELD_EX32(tb_flags, TBFLAG_A32, NEW_FP_CTXT_NEEDED);
++{
-+    dc->v7m_lspact = FIELD_EX32(tb_flags, TBFLAG_A32, LSPACT);
++    static const TCGOpcode vecop_list[] = { INDEX_op_shri_vec, 0 };
-     dc->cp_regs = cpu->cp_regs;
++    const GVecGen2i ops[4] = {
-     dc->features = env->features;
++        { .fni8 = gen_shr8_ins_i64,
++          .fniv = gen_shr_ins_vec,
 +          .fno = gen_helper_gvec_sri_b,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_8 },
 +        { .fni8 = gen_shr16_ins_i64,
 +          .fniv = gen_shr_ins_vec,
 +          .fno = gen_helper_gvec_sri_h,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_16 },
 +        { .fni4 = gen_shr32_ins_i32,
 +          .fniv = gen_shr_ins_vec,
 +          .fno = gen_helper_gvec_sri_s,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_32 },
 +        { .fni8 = gen_shr64_ins_i64,
 +          .fniv = gen_shr_ins_vec,
 +          .fno = gen_helper_gvec_sri_d,
 +          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_64 },
 +    };
 -const GVecGen2i sri_op[4] = {
 -    { .fni8 = gen_shr8_ins_i64,
 -      .fniv = gen_shr_ins_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_sri,
 -      .vece = MO_8 },
 -    { .fni8 = gen_shr16_ins_i64,
 -      .fniv = gen_shr_ins_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_sri,
 -      .vece = MO_16 },
 -    { .fni4 = gen_shr32_ins_i32,
 -      .fniv = gen_shr_ins_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_sri,
 -      .vece = MO_32 },
 -    { .fni8 = gen_shr64_ins_i64,
 -      .fniv = gen_shr_ins_vec,
 -      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_sri,
 -      .vece = MO_64 },
 -};
 +    /* tszimm encoding produces immediates in the range [1..esize]. */
 +    tcg_debug_assert(shift > 0);
 +    tcg_debug_assert(shift <= (8 << vece));
 +
 +    /* Shift of esize leaves destination unchanged. */
 +    if (shift < (8 << vece)) {
 +        tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
 +    } else {
 +        /* Nop, but we do need to clear the tail. */
 +        tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz);
 +    }
 +}
  static void gen_shl8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
  {
@@ -XXX,XX +XXX,XX @@ static void gen_shl64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
  static void gen_shl_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
  {
 -    if (sh == 0) {
 -        tcg_gen_mov_vec(d, a);
 -    } else {
 -        TCGv_vec t = tcg_temp_new_vec_matching(d);
 -        TCGv_vec m = tcg_temp_new_vec_matching(d);
 +    TCGv_vec t = tcg_temp_new_vec_matching(d);
 +    TCGv_vec m = tcg_temp_new_vec_matching(d);
 -        tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK(0, sh));
 -        tcg_gen_shli_vec(vece, t, a, sh);
 -        tcg_gen_and_vec(vece, d, d, m);
 -        tcg_gen_or_vec(vece, d, d, t);
 +    tcg_gen_shli_vec(vece, t, a, sh);
 +    tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK(0, sh));
 +    tcg_gen_and_vec(vece, d, d, m);
 +    tcg_gen_or_vec(vece, d, d, t);
 -        tcg_temp_free_vec(t);
 -        tcg_temp_free_vec(m);
 -    }
 +    tcg_temp_free_vec(t);
 +    tcg_temp_free_vec(m);
  }
 -static const TCGOpcode vecop_list_sli[] = { INDEX_op_shli_vec, 0 };
 +void gen_gvec_sli(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 +                  int64_t shift, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = { INDEX_op_shli_vec, 0 };
 +    const GVecGen2i ops[4] = {
 +        { .fni8 = gen_shl8_ins_i64,
 +          .fniv = gen_shl_ins_vec,
 +          .fno = gen_helper_gvec_sli_b,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_8 },
 +        { .fni8 = gen_shl16_ins_i64,
 +          .fniv = gen_shl_ins_vec,
 +          .fno = gen_helper_gvec_sli_h,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_16 },
 +        { .fni4 = gen_shl32_ins_i32,
 +          .fniv = gen_shl_ins_vec,
 +          .fno = gen_helper_gvec_sli_s,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_32 },
 +        { .fni8 = gen_shl64_ins_i64,
 +          .fniv = gen_shl_ins_vec,
 +          .fno = gen_helper_gvec_sli_d,
 +          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_64 },
 +    };
 -const GVecGen2i sli_op[4] = {
 -    { .fni8 = gen_shl8_ins_i64,
 -      .fniv = gen_shl_ins_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_sli,
 -      .vece = MO_8 },
 -    { .fni8 = gen_shl16_ins_i64,
 -      .fniv = gen_shl_ins_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_sli,
 -      .vece = MO_16 },
 -    { .fni4 = gen_shl32_ins_i32,
 -      .fniv = gen_shl_ins_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_sli,
 -      .vece = MO_32 },
 -    { .fni8 = gen_shl64_ins_i64,
 -      .fniv = gen_shl_ins_vec,
 -      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_sli,
 -      .vece = MO_64 },
 -};
 +    /* tszimm encoding produces immediates in the range [0..esize-1]. */
 +    tcg_debug_assert(shift >= 0);
 +    tcg_debug_assert(shift < (8 << vece));
 +
 +    if (shift == 0) {
 +        tcg_gen_gvec_mov(vece, rd_ofs, rm_ofs, opr_sz, max_sz);
 +    } else {
 +        tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
 +    }
 +}
  static void gen_mla8_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
  {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                      }
                      /* Right shift comes here negative.  */
                      shift = -shift;
 -                    /* Shift out of range leaves destination unchanged.  */
 -                    if (shift < 8 << size) {
 -                        tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, vec_size,
 -                                        shift, &sri_op[size]);
 -                    }
 +                    gen_gvec_sri(size, rd_ofs, rm_ofs, shift,
 +                                 vec_size, vec_size);
                      return 0;
                  case 5: /* VSHL, VSLI */
                      if (u) { /* VSLI */
 -                        /* Shift out of range leaves destination unchanged.  */
 -                        if (shift < 8 << size) {
 -                            tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size,
 -                                            vec_size, shift, &sli_op[size]);
 -                        }
 +                        gen_gvec_sli(size, rd_ofs, rm_ofs, shift,
 +                                     vec_size, vec_size);
                      } else { /* VSHL */
                          /* Shifts larger than the element size are
                           * architecturally valid and results in zero.
 diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/vec_helper.c
 +++ b/target/arm/vec_helper.c
@@ -XXX,XX +XXX,XX @@ DO_RSRA(gvec_ursra_d, uint64_t)
  #undef DO_RSRA
 +#define DO_SRI(NAME, TYPE)                              \
 +void HELPER(NAME)(void *vd, void *vn, uint32_t desc)    \
 +{                                                       \
 +    intptr_t i, oprsz = simd_oprsz(desc);               \
 +    int shift = simd_data(desc);                        \
 +    TYPE *d = vd, *n = vn;                              \
 +    for (i = 0; i < oprsz / sizeof(TYPE); i++) {        \
 +        d[i] = deposit64(d[i], 0, sizeof(TYPE) * 8 - shift, n[i] >> shift); \
 +    }                                                   \
 +    clear_tail(d, oprsz, simd_maxsz(desc));             \
 +}
 +
 +DO_SRI(gvec_sri_b, uint8_t)
 +DO_SRI(gvec_sri_h, uint16_t)
 +DO_SRI(gvec_sri_s, uint32_t)
 +DO_SRI(gvec_sri_d, uint64_t)
 +
 +#undef DO_SRI
 +
 +#define DO_SLI(NAME, TYPE)                              \
 +void HELPER(NAME)(void *vd, void *vn, uint32_t desc)    \
 +{                                                       \
 +    intptr_t i, oprsz = simd_oprsz(desc);               \
 +    int shift = simd_data(desc);                        \
 +    TYPE *d = vd, *n = vn;                              \
 +    for (i = 0; i < oprsz / sizeof(TYPE); i++) {        \
 +        d[i] = deposit64(d[i], shift, sizeof(TYPE) * 8 - shift, n[i]); \
 +    }                                                   \
 +    clear_tail(d, oprsz, simd_maxsz(desc));             \
 +}
 +
 +DO_SLI(gvec_sli_b, uint8_t)
 +DO_SLI(gvec_sli_h, uint16_t)
 +DO_SLI(gvec_sli_s, uint32_t)
 +DO_SLI(gvec_sli_d, uint64_t)
 +
 +#undef DO_SLI
 +
  /*
   * Convert float16 to float32, raising no exceptions and
   * preserving exceptional values, including SNaN.
 --
 .20.1

-[Qemu-devel] [PULL 37/42] hw/devices: Move Gamepad declarations into a new header
+[PULL 05/45] target/arm: Remove unnecessary range check for VSHL
-From: Philippe Mathieu-Daudé <philmd@redhat.com>
+From: Richard Henderson <richard.henderson@linaro.org>
-Reviewed-by: Markus Armbruster <armbru@redhat.com>
+In 1dc8425e551, while converting to gvec, I added an extra range check
-Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+against the shift count.  This was unnecessary because the encoding of
-Message-id: 20190412165416.7977-8-philmd@redhat.com
+the shift count only produces values from 0 to the element size - 1.
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
 Message-id: 20200513163245.17915-5-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
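The range claim can be checked by brute force. The sketch below uses the A64 immh:immb form, since that decode appears in handle_vec_simd_shri later in this series; the AArch32 imm6 encoding touched by this patch is not modelled. The right-shift formula is taken from that function, the left-shift formula (immh:immb minus esize) is an assumption about its counterpart, and all function names are hypothetical.

```c
#include <stdint.h>

/* log2 of the element size in bytes: index of the highest set bit of immh. */
static int tsz_size(unsigned immh)
{
    int size = -1;

    while (immh) {
        immh >>= 1;
        size++;
    }
    return size;
}

/* Right-shift amount, following the computation in handle_vec_simd_shri. */
static int shr_amount(unsigned immh, unsigned immb)
{
    int esize = 8 << tsz_size(immh);

    return 2 * esize - (int)(immh << 3 | immb);
}

/* Left-shift amount (assumed counterpart: immh:immb minus esize). */
static int shl_amount(unsigned immh, unsigned immb)
{
    int esize = 8 << tsz_size(immh);

    return (int)(immh << 3 | immb) - esize;
}

/* Returns 1 if every encoding with immh != 0 stays in the expected range. */
static int shift_ranges_hold(void)
{
    for (unsigned immh = 1; immh < 16; immh++) {
        for (unsigned immb = 0; immb < 8; immb++) {
            int esize = 8 << tsz_size(immh);
            int r = shr_amount(immh, immb);
            int l = shl_amount(immh, immb);

            if (r < 1 || r > esize || l < 0 || l >= esize) {
                return 0;
            }
        }
    }
    return 1;
}
```

Right shifts therefore fall in [1..esize] and left shifts in [0..esize-1], so a left shift can never reach the element size.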
- include/hw/devices.h       |  3 ---
+ target/arm/translate.c | 12 ++----------
- include/hw/input/gamepad.h | 19 +++++++++++++++++++
+file changed, 2 insertions(+), 10 deletions(-)
  hw/arm/stellaris.c         |  2 +-
  hw/input/stellaris_input.c |  2 +-
  MAINTAINERS                |  1 +
 files changed, 22 insertions(+), 5 deletions(-)
  create mode 100644 include/hw/input/gamepad.h
-diff --git a/include/hw/devices.h b/include/hw/devices.h
+diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
---- a/include/hw/devices.h
+--- a/target/arm/translate.c
-+++ b/include/hw/devices.h
++++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ void *tsc2005_init(qemu_irq pintdav);
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
- uint32_t tsc2005_txrx(void *opaque, uint32_t value, int len);
+                         gen_gvec_sli(size, rd_ofs, rm_ofs, shift,
- void tsc2005_set_transform(void *opaque, MouseTransformInfo *info);
+                                      vec_size, vec_size);
+                     } else { /* VSHL */
--/* stellaris_input.c */
+-                        /* Shifts larger than the element size are
--void stellaris_gamepad_init(int n, qemu_irq *irq, const int *keycode);
+-                         * architecturally valid and results in zero.
--
+-                         */
- #endif
+-                        if (shift >= 8 << size) {
-diff --git a/include/hw/input/gamepad.h b/include/hw/input/gamepad.h
+-                            tcg_gen_gvec_dup_imm(size, rd_ofs,
-new file mode 100644
+-                                                 vec_size, vec_size, 0);
-index XXXXXXX..XXXXXXX
+-                        } else {
---- /dev/null
+-                            tcg_gen_gvec_shli(size, rd_ofs, rm_ofs, shift,
-+++ b/include/hw/input/gamepad.h
+-                                              vec_size, vec_size);
-@@ -XXX,XX +XXX,XX @@
+-                        }
-+/*
++                        tcg_gen_gvec_shli(size, rd_ofs, rm_ofs, shift,
-+ * Gamepad style buttons connected to IRQ/GPIO lines
++                                          vec_size, vec_size);
-+ *
+                     }
-+ * Copyright (c) 2007 CodeSourcery.
+                     return 0;
-+ * Written by Paul Brook
+                 }
 + *
 + * This work is licensed under the terms of the GNU GPL, version 2 or later.
 + * See the COPYING file in the top-level directory.
 + */
 +
 +#ifndef HW_INPUT_GAMEPAD_H
 +#define HW_INPUT_GAMEPAD_H
 +
 +#include "hw/irq.h"
 +
 +/* stellaris_input.c */
 +void stellaris_gamepad_init(int n, qemu_irq *irq, const int *keycode);
 +
 +#endif
 diff --git a/hw/arm/stellaris.c b/hw/arm/stellaris.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/arm/stellaris.c
 +++ b/hw/arm/stellaris.c
@@ -XXX,XX +XXX,XX @@
  #include "hw/sysbus.h"
  #include "hw/ssi/ssi.h"
  #include "hw/arm/arm.h"
 -#include "hw/devices.h"
  #include "qemu/timer.h"
  #include "hw/i2c/i2c.h"
  #include "net/net.h"
@@ -XXX,XX +XXX,XX @@
  #include "sysemu/sysemu.h"
  #include "hw/arm/armv7m.h"
  #include "hw/char/pl011.h"
 +#include "hw/input/gamepad.h"
  #include "hw/watchdog/cmsdk-apb-watchdog.h"
  #include "hw/misc/unimp.h"
  #include "cpu.h"
 diff --git a/hw/input/stellaris_input.c b/hw/input/stellaris_input.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/input/stellaris_input.c
 +++ b/hw/input/stellaris_input.c
@@ -XXX,XX +XXX,XX @@
   */
  #include "qemu/osdep.h"
  #include "hw/hw.h"
 -#include "hw/devices.h"
 +#include "hw/input/gamepad.h"
  #include "ui/console.h"
  typedef struct {
 diff --git a/MAINTAINERS b/MAINTAINERS
 index XXXXXXX..XXXXXXX 100644
 --- a/MAINTAINERS
 +++ b/MAINTAINERS
@@ -XXX,XX +XXX,XX @@ M: Peter Maydell <peter.maydell@linaro.org>
  L: qemu-arm@nongnu.org
  S: Maintained
  F: hw/*/stellaris*
 +F: include/hw/input/gamepad.h
  Versatile Express
  M: Peter Maydell <peter.maydell@linaro.org>
 --
 .20.1

-New patch
+[PULL 06/45] target/arm: Tidy handle_vec_simd_shri
+From: Richard Henderson <richard.henderson@linaro.org>
+Now that we've converted all cases to gvec, there is quite a bit
+of dead code at the end of the function.  Remove it.
+Sink the call to gen_gvec_fn2i to the end, loading a function
+pointer within the switch statement.
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20200513163245.17915-6-richard.henderson@linaro.org
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+---
+ target/arm/translate-a64.c | 56 ++++++++++----------------------------
+file changed, 14 insertions(+), 42 deletions(-)
+diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/translate-a64.c
++++ b/target/arm/translate-a64.c
+@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
+     int size = 32 - clz32(immh) - 1;
+     int immhb = immh << 3 | immb;
+     int shift = 2 * (8 << size) - immhb;
+-    bool accumulate = false;
+-    int dsize = is_q ? 128 : 64;
+-    int esize = 8 << size;
+-    int elements = dsize/esize;
+-    MemOp memop = size | (is_u ? 0 : MO_SIGN);
+-    TCGv_i64 tcg_rn = new_tmp_a64(s);
+-    TCGv_i64 tcg_rd = new_tmp_a64(s);
+-    TCGv_i64 tcg_round;
+-    uint64_t round_const;
+-    int i;
++    GVecGen2iFn *gvec_fn;
+     if (extract32(immh, 3, 1) && !is_q) {
+         unallocated_encoding(s);
+@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
+     switch (opcode) {
+     case 0x02: /* SSRA / USRA (accumulate) */
+-        gen_gvec_fn2i(s, is_q, rd, rn, shift,
+-                      is_u ? gen_gvec_usra : gen_gvec_ssra, size);
+-        return;
++        gvec_fn = is_u ? gen_gvec_usra : gen_gvec_ssra;
++        break;
+     case 0x08: /* SRI */
+-        gen_gvec_fn2i(s, is_q, rd, rn, shift, gen_gvec_sri, size);
+-        return;
++        gvec_fn = gen_gvec_sri;
++        break;
+     case 0x00: /* SSHR / USHR */
+         if (is_u) {
+@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
+                 /* Shift count the same size as element size produces zero.  */
+                 tcg_gen_gvec_dup_imm(size, vec_full_reg_offset(s, rd),
+                                      is_q ? 16 : 8, vec_full_reg_size(s), 0);
+-            } else {
+-                gen_gvec_fn2i(s, is_q, rd, rn, shift, tcg_gen_gvec_shri, size);
++                return;
+             }
++            gvec_fn = tcg_gen_gvec_shri;
+         } else {
+             /* Shift count the same size as element size produces all sign.  */
+             if (shift == 8 << size) {
+                 shift -= 1;
+             }
+-            gen_gvec_fn2i(s, is_q, rd, rn, shift, tcg_gen_gvec_sari, size);
++            gvec_fn = tcg_gen_gvec_sari;
+         }
+-        return;
++        break;
+     case 0x04: /* SRSHR / URSHR (rounding) */
+-        gen_gvec_fn2i(s, is_q, rd, rn, shift,
+-                      is_u ? gen_gvec_urshr : gen_gvec_srshr, size);
+-        return;
++        gvec_fn = is_u ? gen_gvec_urshr : gen_gvec_srshr;
++        break;
+     case 0x06: /* SRSRA / URSRA (accum + rounding) */
+-        gen_gvec_fn2i(s, is_q, rd, rn, shift,
+-                      is_u ? gen_gvec_ursra : gen_gvec_srsra, size);
+-        return;
++        gvec_fn = is_u ? gen_gvec_ursra : gen_gvec_srsra;
++        break;
+     default:
+         g_assert_not_reached();
+     }
+-    round_const = 1ULL << (shift - 1);
+-    tcg_round = tcg_const_i64(round_const);
+-
+-    for (i = 0; i < elements; i++) {
+-        read_vec_element(s, tcg_rn, rn, i, memop);
+-        if (accumulate) {
+-            read_vec_element(s, tcg_rd, rd, i, memop);
+-        }
+-
+-        handle_shri_with_rndacc(tcg_rd, tcg_rn, tcg_round,
+-                                accumulate, is_u, size, shift);
+-
+-        write_vec_element(s, tcg_rd, rd, i, size);
+-    }
+-    tcg_temp_free_i64(tcg_round);
+-
+-    clear_vec_high(s, is_q, rd);
++    gen_gvec_fn2i(s, is_q, rd, rn, shift, gvec_fn, size);
+ }
+ /* SHL/SLI - Vector shift left */
+--
+.20.1
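
The restructuring described in the commit message above (choose a function pointer inside the switch, then make one call at the end) can be sketched outside QEMU as follows. This is a minimal illustration only: ToyShiftFn, shr32, toy_shr, toy_sra, toy_rshr and toy_shift_right are hypothetical names that operate on a single 32-bit value, not the real GVecGen2iFn expanders, and the opcode numbers are reused purely for flavour.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a GVecGen2iFn-style expander. */
typedef uint32_t ToyShiftFn(uint32_t acc, uint32_t val, int shift);

/* Logical shift right that is also defined for shift == 32. */
static uint32_t shr32(uint32_t val, int shift)
{
    return shift >= 32 ? 0 : val >> shift;
}

static uint32_t toy_shr(uint32_t acc, uint32_t val, int shift)
{
    (void)acc;
    return shr32(val, shift);
}

static uint32_t toy_sra(uint32_t acc, uint32_t val, int shift)
{
    /* Shift, then accumulate into the previous destination value. */
    return acc + shr32(val, shift);
}

static uint32_t toy_rshr(uint32_t acc, uint32_t val, int shift)
{
    (void)acc;
    /* Add the last bit shifted out so the result rounds to nearest. */
    return shr32(val, shift) + ((val >> (shift - 1)) & 1);
}

uint32_t toy_shift_right(int opcode, uint32_t acc, uint32_t val, int shift)
{
    ToyShiftFn *fn;

    assert(shift >= 1 && shift <= 32);

    switch (opcode) {
    case 0x00: /* plain shift */
        if (shift == 32) {
            /* Result is known to be zero: return without dispatching. */
            return 0;
        }
        fn = toy_shr;
        break;
    case 0x02: /* shift and accumulate */
        fn = toy_sra;
        break;
    case 0x04: /* rounding shift */
        fn = toy_rshr;
        break;
    default:
        assert(0);
        return 0;
    }

    /* Single call site shared by every case that reached a break. */
    return fn(acc, val, shift);
}
```

With this shape, each case only records which operation it wants; cases that need no call at all (the zero-result one here) return early, and everything else funnels through the one call at the bottom.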

-[Qemu-devel] [PULL 41/42] hw/net/lan9118: Export TYPE_LAN9118 and use it instead of hardcoded string
+[PULL 07/45] target/arm: Create gen_gvec_{ceq,clt,cle,cgt,cge}0
-From: Philippe Mathieu-Daudé <philmd@redhat.com>
+From: Richard Henderson <richard.henderson@linaro.org>
-Reviewed-by: Markus Armbruster <armbru@redhat.com>
+Provide a functional interface for the vector expansion.
-Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+This fits better with the existing set of helpers that
-Message-id: 20190412165416.7977-12-philmd@redhat.com
+we provide for other operations.
 Macro-ize the 5 nearly identical comparisons.
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
 Message-id: 20200513163245.17915-7-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- include/hw/net/lan9118.h | 2 ++
+ target/arm/translate.h     |  16 ++-
- hw/arm/exynos4_boards.c  | 3 ++-
+ target/arm/translate-a64.c |  22 ++--
- hw/arm/mps2-tz.c         | 3 ++-
+ target/arm/translate.c     | 254 ++++++++-----------------------------
- hw/net/lan9118.c         | 1 -
+files changed, 74 insertions(+), 218 deletions(-)
-files changed, 6 insertions(+), 3 deletions(-)
+diff --git a/target/arm/translate.h b/target/arm/translate.h
 diff --git a/include/hw/net/lan9118.h b/include/hw/net/lan9118.h
 index XXXXXXX..XXXXXXX 100644
---- a/include/hw/net/lan9118.h
+--- a/target/arm/translate.h
-+++ b/include/hw/net/lan9118.h
++++ b/target/arm/translate.h
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ static inline void gen_swstep_exception(DisasContext *s, int isv, int ex)
- #include "hw/irq.h"
+ uint64_t vfp_expand_imm(int size, uint8_t imm8);
- #include "net/net.h"
+ /* Vector operations shared between ARM and AArch64.  */
-+#define TYPE_LAN9118 "lan9118"
+-extern const GVecGen2 ceq0_op[4];
 -extern const GVecGen2 clt0_op[4];
 -extern const GVecGen2 cgt0_op[4];
 -extern const GVecGen2 cle0_op[4];
 -extern const GVecGen2 cge0_op[4];
 +void gen_gvec_ceq0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 +                   uint32_t opr_sz, uint32_t max_sz);
 +void gen_gvec_clt0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 +                   uint32_t opr_sz, uint32_t max_sz);
 +void gen_gvec_cgt0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 +                   uint32_t opr_sz, uint32_t max_sz);
 +void gen_gvec_cle0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 +                   uint32_t opr_sz, uint32_t max_sz);
 +void gen_gvec_cge0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 +                   uint32_t opr_sz, uint32_t max_sz);
 +
- void lan9118_init(NICInfo *, uint32_t, qemu_irq);
+ extern const GVecGen3 mla_op[4];
+ extern const GVecGen3 mls_op[4];
- #endif
+ extern const GVecGen3 cmtst_op[4];
-diff --git a/hw/arm/exynos4_boards.c b/hw/arm/exynos4_boards.c
+diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/exynos4_boards.c
+--- a/target/arm/translate-a64.c
-+++ b/hw/arm/exynos4_boards.c
++++ b/target/arm/translate-a64.c
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ static void gen_gvec_fn4(DisasContext *s, bool is_q, int rd, int rn, int rm,
- #include "hw/arm/arm.h"
+             is_q ? 16 : 8, vec_full_reg_size(s));
- #include "exec/address-spaces.h"
+ }
- #include "hw/arm/exynos4210.h"
-+#include "hw/net/lan9118.h"
+-/* Expand a 2-operand AdvSIMD vector operation using an op descriptor. */
- #include "hw/boards.h"
+-static void gen_gvec_op2(DisasContext *s, bool is_q, int rd,
+-                         int rn, const GVecGen2 *gvec_op)
- #undef DEBUG
+-{
-@@ -XXX,XX +XXX,XX @@ static void lan9215_init(uint32_t base, qemu_irq irq)
+-    tcg_gen_gvec_2(vec_full_reg_offset(s, rd), vec_full_reg_offset(s, rn),
-     /* This should be a 9215 but the 9118 is close enough */
+-                   is_q ? 16 : 8, vec_full_reg_size(s), gvec_op);
-     if (nd_table[0].used) {
+-}
-         qemu_check_nic_model(&nd_table[0], "lan9118");
+-
--        dev = qdev_create(NULL, "lan9118");
+ /* Expand a 3-operand AdvSIMD vector operation using an op descriptor.  */
-+        dev = qdev_create(NULL, TYPE_LAN9118);
+ static void gen_gvec_op3(DisasContext *s, bool is_q, int rd,
-         qdev_set_nic_properties(dev, &nd_table[0]);
+                          int rn, int rm, const GVecGen3 *gvec_op)
-         qdev_prop_set_uint32(dev, "mode_16bit", 1);
+@@ -XXX,XX +XXX,XX @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn)
-         qdev_init_nofail(dev);
+         }
-diff --git a/hw/arm/mps2-tz.c b/hw/arm/mps2-tz.c
+         break;
      case 0x8: /* CMGT, CMGE */
 -        gen_gvec_op2(s, is_q, rd, rn, u ? &cge0_op[size] : &cgt0_op[size]);
 +        if (u) {
 +            gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_cge0, size);
 +        } else {
 +            gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_cgt0, size);
 +        }
          return;
      case 0x9: /* CMEQ, CMLE */
 -        gen_gvec_op2(s, is_q, rd, rn, u ? &cle0_op[size] : &ceq0_op[size]);
 +        if (u) {
 +            gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_cle0, size);
 +        } else {
 +            gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_ceq0, size);
 +        }
          return;
      case 0xa: /* CMLT */
 -        gen_gvec_op2(s, is_q, rd, rn, &clt0_op[size]);
 +        gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_clt0, size);
          return;
      case 0xb:
          if (u) { /* ABS, NEG */
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/mps2-tz.c
+--- a/target/arm/translate.c
-+++ b/hw/arm/mps2-tz.c
++++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ static int do_v81_helper(DisasContext *s, gen_helper_gvec_3_ptr *fn,
- #include "hw/arm/armsse.h"
+     return 1;
- #include "hw/dma/pl080.h"
+ }
- #include "hw/ssi/pl022.h"
-+#include "hw/net/lan9118.h"
+-static void gen_ceq0_i32(TCGv_i32 d, TCGv_i32 a)
- #include "net/net.h"
+-{
- #include "hw/core/split-irq.h"
+-    tcg_gen_setcondi_i32(TCG_COND_EQ, d, a, 0);
+-    tcg_gen_neg_i32(d, d);
-@@ -XXX,XX +XXX,XX @@ static MemoryRegion *make_eth_dev(MPS2TZMachineState *mms, void *opaque,
+-}
-      * except that it doesn't support the checksum-offload feature.
+-
-      */
+-static void gen_ceq0_i64(TCGv_i64 d, TCGv_i64 a)
-     qemu_check_nic_model(nd, "lan9118");
+-{
--    mms->lan9118 = qdev_create(NULL, "lan9118");
+-    tcg_gen_setcondi_i64(TCG_COND_EQ, d, a, 0);
-+    mms->lan9118 = qdev_create(NULL, TYPE_LAN9118);
+-    tcg_gen_neg_i64(d, d);
-     qdev_set_nic_properties(mms->lan9118, nd);
+-}
-     qdev_init_nofail(mms->lan9118);
+-
+-static void gen_ceq0_vec(unsigned vece, TCGv_vec d, TCGv_vec a)
-diff --git a/hw/net/lan9118.c b/hw/net/lan9118.c
+-{
-index XXXXXXX..XXXXXXX 100644
+-    TCGv_vec zero = tcg_const_zeros_vec_matching(d);
---- a/hw/net/lan9118.c
+-    tcg_gen_cmp_vec(TCG_COND_EQ, vece, d, a, zero);
-+++ b/hw/net/lan9118.c
+-    tcg_temp_free_vec(zero);
-@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_lan9118_packet = {
+-}
-     }
++#define GEN_CMP0(NAME, COND)                                            \
 +    static void gen_##NAME##0_i32(TCGv_i32 d, TCGv_i32 a)               \
 +    {                                                                   \
 +        tcg_gen_setcondi_i32(COND, d, a, 0);                            \
 +        tcg_gen_neg_i32(d, d);                                          \
 +    }                                                                   \
 +    static void gen_##NAME##0_i64(TCGv_i64 d, TCGv_i64 a)               \
 +    {                                                                   \
 +        tcg_gen_setcondi_i64(COND, d, a, 0);                            \
 +        tcg_gen_neg_i64(d, d);                                          \
 +    }                                                                   \
 +    static void gen_##NAME##0_vec(unsigned vece, TCGv_vec d, TCGv_vec a) \
 +    {                                                                   \
 +        TCGv_vec zero = tcg_const_zeros_vec_matching(d);                \
 +        tcg_gen_cmp_vec(COND, vece, d, a, zero);                        \
 +        tcg_temp_free_vec(zero);                                        \
 +    }                                                                   \
 +    void gen_gvec_##NAME##0(unsigned vece, uint32_t d, uint32_t m,      \
 +                            uint32_t opr_sz, uint32_t max_sz)           \
 +    {                                                                   \
 +        const GVecGen2 op[4] = {                                        \
 +            { .fno = gen_helper_gvec_##NAME##0_b,                       \
 +              .fniv = gen_##NAME##0_vec,                                \
 +              .opt_opc = vecop_list_cmp,                                \
 +              .vece = MO_8 },                                           \
 +            { .fno = gen_helper_gvec_##NAME##0_h,                       \
 +              .fniv = gen_##NAME##0_vec,                                \
 +              .opt_opc = vecop_list_cmp,                                \
 +              .vece = MO_16 },                                          \
 +            { .fni4 = gen_##NAME##0_i32,                                \
 +              .fniv = gen_##NAME##0_vec,                                \
 +              .opt_opc = vecop_list_cmp,                                \
 +              .vece = MO_32 },                                          \
 +            { .fni8 = gen_##NAME##0_i64,                                \
 +              .fniv = gen_##NAME##0_vec,                                \
 +              .opt_opc = vecop_list_cmp,                                \
 +              .prefer_i64 = TCG_TARGET_REG_BITS == 64,                  \
 +              .vece = MO_64 },                                          \
 +        };                                                              \
 +        tcg_gen_gvec_2(d, m, opr_sz, max_sz, &op[vece]);                \
 +    }
  static const TCGOpcode vecop_list_cmp[] = {
      INDEX_op_cmp_vec, 0
  };
--#define TYPE_LAN9118 "lan9118"
+-const GVecGen2 ceq0_op[4] = {
- #define LAN9118(obj) OBJECT_CHECK(lan9118_state, (obj), TYPE_LAN9118)
+-    { .fno = gen_helper_gvec_ceq0_b,
+-      .fniv = gen_ceq0_vec,
- typedef struct {
+-      .opt_opc = vecop_list_cmp,
 -      .vece = MO_8 },
 -    { .fno = gen_helper_gvec_ceq0_h,
 -      .fniv = gen_ceq0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .vece = MO_16 },
 -    { .fni4 = gen_ceq0_i32,
 -      .fniv = gen_ceq0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .vece = MO_32 },
 -    { .fni8 = gen_ceq0_i64,
 -      .fniv = gen_ceq0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -      .vece = MO_64 },
 -};
 +GEN_CMP0(ceq, TCG_COND_EQ)
 +GEN_CMP0(cle, TCG_COND_LE)
 +GEN_CMP0(cge, TCG_COND_GE)
 +GEN_CMP0(clt, TCG_COND_LT)
 +GEN_CMP0(cgt, TCG_COND_GT)
 -static void gen_cle0_i32(TCGv_i32 d, TCGv_i32 a)
 -{
 -    tcg_gen_setcondi_i32(TCG_COND_LE, d, a, 0);
 -    tcg_gen_neg_i32(d, d);
 -}
 -
 -static void gen_cle0_i64(TCGv_i64 d, TCGv_i64 a)
 -{
 -    tcg_gen_setcondi_i64(TCG_COND_LE, d, a, 0);
 -    tcg_gen_neg_i64(d, d);
 -}
 -
 -static void gen_cle0_vec(unsigned vece, TCGv_vec d, TCGv_vec a)
 -{
 -    TCGv_vec zero = tcg_const_zeros_vec_matching(d);
 -    tcg_gen_cmp_vec(TCG_COND_LE, vece, d, a, zero);
 -    tcg_temp_free_vec(zero);
 -}
 -
 -const GVecGen2 cle0_op[4] = {
 -    { .fno = gen_helper_gvec_cle0_b,
 -      .fniv = gen_cle0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .vece = MO_8 },
 -    { .fno = gen_helper_gvec_cle0_h,
 -      .fniv = gen_cle0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .vece = MO_16 },
 -    { .fni4 = gen_cle0_i32,
 -      .fniv = gen_cle0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .vece = MO_32 },
 -    { .fni8 = gen_cle0_i64,
 -      .fniv = gen_cle0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -      .vece = MO_64 },
 -};
 -
 -static void gen_cge0_i32(TCGv_i32 d, TCGv_i32 a)
 -{
 -    tcg_gen_setcondi_i32(TCG_COND_GE, d, a, 0);
 -    tcg_gen_neg_i32(d, d);
 -}
 -
 -static void gen_cge0_i64(TCGv_i64 d, TCGv_i64 a)
 -{
 -    tcg_gen_setcondi_i64(TCG_COND_GE, d, a, 0);
 -    tcg_gen_neg_i64(d, d);
 -}
 -
 -static void gen_cge0_vec(unsigned vece, TCGv_vec d, TCGv_vec a)
 -{
 -    TCGv_vec zero = tcg_const_zeros_vec_matching(d);
 -    tcg_gen_cmp_vec(TCG_COND_GE, vece, d, a, zero);
 -    tcg_temp_free_vec(zero);
 -}
 -
 -const GVecGen2 cge0_op[4] = {
 -    { .fno = gen_helper_gvec_cge0_b,
 -      .fniv = gen_cge0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .vece = MO_8 },
 -    { .fno = gen_helper_gvec_cge0_h,
 -      .fniv = gen_cge0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .vece = MO_16 },
 -    { .fni4 = gen_cge0_i32,
 -      .fniv = gen_cge0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .vece = MO_32 },
 -    { .fni8 = gen_cge0_i64,
 -      .fniv = gen_cge0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -      .vece = MO_64 },
 -};
 -
 -static void gen_clt0_i32(TCGv_i32 d, TCGv_i32 a)
 -{
 -    tcg_gen_setcondi_i32(TCG_COND_LT, d, a, 0);
 -    tcg_gen_neg_i32(d, d);
 -}
 -
 -static void gen_clt0_i64(TCGv_i64 d, TCGv_i64 a)
 -{
 -    tcg_gen_setcondi_i64(TCG_COND_LT, d, a, 0);
 -    tcg_gen_neg_i64(d, d);
 -}
 -
 -static void gen_clt0_vec(unsigned vece, TCGv_vec d, TCGv_vec a)
 -{
 -    TCGv_vec zero = tcg_const_zeros_vec_matching(d);
 -    tcg_gen_cmp_vec(TCG_COND_LT, vece, d, a, zero);
 -    tcg_temp_free_vec(zero);
 -}
 -
 -const GVecGen2 clt0_op[4] = {
 -    { .fno = gen_helper_gvec_clt0_b,
 -      .fniv = gen_clt0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .vece = MO_8 },
 -    { .fno = gen_helper_gvec_clt0_h,
 -      .fniv = gen_clt0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .vece = MO_16 },
 -    { .fni4 = gen_clt0_i32,
 -      .fniv = gen_clt0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .vece = MO_32 },
 -    { .fni8 = gen_clt0_i64,
 -      .fniv = gen_clt0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -      .vece = MO_64 },
 -};
 -
 -static void gen_cgt0_i32(TCGv_i32 d, TCGv_i32 a)
 -{
 -    tcg_gen_setcondi_i32(TCG_COND_GT, d, a, 0);
 -    tcg_gen_neg_i32(d, d);
 -}
 -
 -static void gen_cgt0_i64(TCGv_i64 d, TCGv_i64 a)
 -{
 -    tcg_gen_setcondi_i64(TCG_COND_GT, d, a, 0);
 -    tcg_gen_neg_i64(d, d);
 -}
 -
 -static void gen_cgt0_vec(unsigned vece, TCGv_vec d, TCGv_vec a)
 -{
 -    TCGv_vec zero = tcg_const_zeros_vec_matching(d);
 -    tcg_gen_cmp_vec(TCG_COND_GT, vece, d, a, zero);
 -    tcg_temp_free_vec(zero);
 -}
 -
 -const GVecGen2 cgt0_op[4] = {
 -    { .fno = gen_helper_gvec_cgt0_b,
 -      .fniv = gen_cgt0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .vece = MO_8 },
 -    { .fno = gen_helper_gvec_cgt0_h,
 -      .fniv = gen_cgt0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .vece = MO_16 },
 -    { .fni4 = gen_cgt0_i32,
 -      .fniv = gen_cgt0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .vece = MO_32 },
 -    { .fni8 = gen_cgt0_i64,
 -      .fniv = gen_cgt0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -      .vece = MO_64 },
 -};
 +#undef GEN_CMP0
  static void gen_ssra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
  {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                      break;
                  case NEON_2RM_VCEQ0:
 -                    tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size,
 -                                   vec_size, &ceq0_op[size]);
 +                    gen_gvec_ceq0(size, rd_ofs, rm_ofs, vec_size, vec_size);
                      break;
                  case NEON_2RM_VCGT0:
 -                    tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size,
 -                                   vec_size, &cgt0_op[size]);
 +                    gen_gvec_cgt0(size, rd_ofs, rm_ofs, vec_size, vec_size);
                      break;
                  case NEON_2RM_VCLE0:
 -                    tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size,
 -                                   vec_size, &cle0_op[size]);
 +                    gen_gvec_cle0(size, rd_ofs, rm_ofs, vec_size, vec_size);
                      break;
                  case NEON_2RM_VCGE0:
 -                    tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size,
 -                                   vec_size, &cge0_op[size]);
 +                    gen_gvec_cge0(size, rd_ofs, rm_ofs, vec_size, vec_size);
                      break;
                  case NEON_2RM_VCLT0:
 -                    tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size,
 -                                   vec_size, &clt0_op[size]);
 +                    gen_gvec_clt0(size, rd_ofs, rm_ofs, vec_size, vec_size);
                      break;
                  default:
 --
 .20.1

-New patch
+[PULL 08/45] target/arm: Create gen_gvec_{mla,mls}
+From: Richard Henderson <richard.henderson@linaro.org>
 Provide a functional interface for the vector expansion.
 This fits better with the existing set of helpers that
 we provide for other operations.
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
 Message-id: 20200513163245.17915-8-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
  target/arm/translate.h          |   7 +-
  target/arm/translate-a64.c      |   4 +-
  target/arm/translate-neon.inc.c |  16 +----
  target/arm/translate.c          | 117 +++++++++++++++++---------------
 files changed, 71 insertions(+), 73 deletions(-)
 diff --git a/target/arm/translate.h b/target/arm/translate.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.h
 +++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ void gen_gvec_cle0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
  void gen_gvec_cge0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
                     uint32_t opr_sz, uint32_t max_sz);
 -extern const GVecGen3 mla_op[4];
 -extern const GVecGen3 mls_op[4];
 +void gen_gvec_mla(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                  uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 +void gen_gvec_mls(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                  uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 +
  extern const GVecGen3 cmtst_op[4];
  extern const GVecGen3 sshl_op[4];
  extern const GVecGen3 ushl_op[4];
 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-a64.c
 +++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
          return;
      case 0x12: /* MLA, MLS */
          if (u) {
 -            gen_gvec_op3(s, is_q, rd, rn, rm, &mls_op[size]);
 +            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_mls, size);
          } else {
 -            gen_gvec_op3(s, is_q, rd, rn, rm, &mla_op[size]);
 +            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_mla, size);
          }
          return;
      case 0x11:
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_3SAME_NO_SZ_3(VMAX_U, tcg_gen_gvec_umax)
  DO_3SAME_NO_SZ_3(VMIN_S, tcg_gen_gvec_smin)
  DO_3SAME_NO_SZ_3(VMIN_U, tcg_gen_gvec_umin)
  DO_3SAME_NO_SZ_3(VMUL, tcg_gen_gvec_mul)
 +DO_3SAME_NO_SZ_3(VMLA, gen_gvec_mla)
 +DO_3SAME_NO_SZ_3(VMLS, gen_gvec_mls)
  #define DO_3SAME_CMP(INSN, COND)                                        \
      static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
@@ -XXX,XX +XXX,XX @@ static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
      return do_3same(s, a, gen_VMUL_p_3s);
  }
 -#define DO_3SAME_GVEC3_NO_SZ_3(INSN, OPARRAY)                           \
 -    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
 -                                uint32_t rn_ofs, uint32_t rm_ofs,       \
 -                                uint32_t oprsz, uint32_t maxsz)         \
 -    {                                                                   \
 -        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,                          \
 -                       oprsz, maxsz, &OPARRAY[vece]);                   \
 -    }                                                                   \
 -    DO_3SAME_NO_SZ_3(INSN, gen_##INSN##_3s)
 -
 -
 -DO_3SAME_GVEC3_NO_SZ_3(VMLA, mla_op)
 -DO_3SAME_GVEC3_NO_SZ_3(VMLS, mls_op)
 -
  #define DO_3SAME_GVEC3_SHIFT(INSN, OPARRAY)                             \
      static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
                                  uint32_t rn_ofs, uint32_t rm_ofs,       \
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_mls_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
  /* Note that while NEON does not support VMLA and VMLS as 64-bit ops,
   * these tables are shared with AArch64 which does support them.
   */
 +void gen_gvec_mla(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                  uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = {
 +        INDEX_op_mul_vec, INDEX_op_add_vec, 0
 +    };
 +    static const GVecGen3 ops[4] = {
 +        { .fni4 = gen_mla8_i32,
 +          .fniv = gen_mla_vec,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_8 },
 +        { .fni4 = gen_mla16_i32,
 +          .fniv = gen_mla_vec,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_16 },
 +        { .fni4 = gen_mla32_i32,
 +          .fniv = gen_mla_vec,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_32 },
 +        { .fni8 = gen_mla64_i64,
 +          .fniv = gen_mla_vec,
 +          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_64 },
 +    };
 +    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 +}
 -static const TCGOpcode vecop_list_mla[] = {
 -    INDEX_op_mul_vec, INDEX_op_add_vec, 0
 -};
 -
 -static const TCGOpcode vecop_list_mls[] = {
 -    INDEX_op_mul_vec, INDEX_op_sub_vec, 0
 -};
 -
 -const GVecGen3 mla_op[4] = {
 -    { .fni4 = gen_mla8_i32,
 -      .fniv = gen_mla_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_mla,
 -      .vece = MO_8 },
 -    { .fni4 = gen_mla16_i32,
 -      .fniv = gen_mla_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_mla,
 -      .vece = MO_16 },
 -    { .fni4 = gen_mla32_i32,
 -      .fniv = gen_mla_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_mla,
 -      .vece = MO_32 },
 -    { .fni8 = gen_mla64_i64,
 -      .fniv = gen_mla_vec,
 -      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_mla,
 -      .vece = MO_64 },
 -};
 -
 -const GVecGen3 mls_op[4] = {
 -    { .fni4 = gen_mls8_i32,
 -      .fniv = gen_mls_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_mls,
 -      .vece = MO_8 },
 -    { .fni4 = gen_mls16_i32,
 -      .fniv = gen_mls_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_mls,
 -      .vece = MO_16 },
 -    { .fni4 = gen_mls32_i32,
 -      .fniv = gen_mls_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_mls,
 -      .vece = MO_32 },
 -    { .fni8 = gen_mls64_i64,
 -      .fniv = gen_mls_vec,
 -      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_mls,
 -      .vece = MO_64 },
 -};
 +void gen_gvec_mls(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                  uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = {
 +        INDEX_op_mul_vec, INDEX_op_sub_vec, 0
 +    };
 +    static const GVecGen3 ops[4] = {
 +        { .fni4 = gen_mls8_i32,
 +          .fniv = gen_mls_vec,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_8 },
 +        { .fni4 = gen_mls16_i32,
 +          .fniv = gen_mls_vec,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_16 },
 +        { .fni4 = gen_mls32_i32,
 +          .fniv = gen_mls_vec,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_32 },
 +        { .fni8 = gen_mls64_i64,
 +          .fniv = gen_mls_vec,
 +          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_64 },
 +    };
 +    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 +}
  /* CMTST : test is "if (X & Y != 0)". */
  static void gen_cmtst_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
 --
 .20.1
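
The "functional interface" idea in the patch above (hide the per-element-size descriptor table inside a function instead of exporting the table) can be sketched in isolation like this. Everything here is hypothetical: ToyGen3, mla_lanes and toy_mla are illustrative names, and the code does lane-wise multiply-accumulate on one 64-bit value rather than emitting TCG ops.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-element-size descriptor, loosely modelled on GVecGen3. */
typedef struct ToyGen3 {
    unsigned bits;      /* lane width in bits */
} ToyGen3;

/* d + n * m computed independently in each lane of a 64-bit value. */
static uint64_t mla_lanes(uint64_t d, uint64_t n, uint64_t m, unsigned bits)
{
    uint64_t mask = bits == 64 ? ~0ull : (1ull << bits) - 1;
    uint64_t result = 0;

    for (unsigned i = 0; i < 64; i += bits) {
        uint64_t dl = (d >> i) & mask;
        uint64_t nl = (n >> i) & mask;
        uint64_t ml = (m >> i) & mask;

        /* Wrap within the lane so no carry leaks into its neighbour. */
        result |= ((dl + nl * ml) & mask) << i;
    }
    return result;
}

/*
 * Public entry point: callers pass the element size index (0..3) and
 * never see the table, which is now private to this function.
 */
uint64_t toy_mla(unsigned vece, uint64_t d, uint64_t n, uint64_t m)
{
    static const ToyGen3 ops[4] = {
        { .bits = 8 },
        { .bits = 16 },
        { .bits = 32 },
        { .bits = 64 },
    };

    assert(vece < 4);
    return mla_lanes(d, n, m, ops[vece].bits);
}
```

One consequence of this layout is that the header only has to declare a function prototype; the table's type and contents can change without touching any caller.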

-New patch
+[PULL 09/45] target/arm: Swap argument order for VSHL during decode
+From: Richard Henderson <richard.henderson@linaro.org>
+Rather than perform the argument swap during code generation,
+perform it during decode.  This means it doesn't have to be
+special cased later, and we can share code with aarch64 code
+generation.  Hopefully the decode comment addresses any confusion
+that might arise in between.
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20200513163245.17915-9-richard.henderson@linaro.org
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+---
+ target/arm/neon-dp.decode       | 17 +++++++++++++++--
+ target/arm/translate-neon.inc.c |  3 +--
+files changed, 16 insertions(+), 4 deletions(-)
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/neon-dp.decode
++++ b/target/arm/neon-dp.decode
+@@ -XXX,XX +XXX,XX @@ VCGT_U_3s        1111 001 1 0 . .. .... .... 0011 . . . 0 .... @3same
+ VCGE_S_3s        1111 001 0 0 . .. .... .... 0011 . . . 1 .... @3same
+ VCGE_U_3s        1111 001 1 0 . .. .... .... 0011 . . . 1 .... @3same
+-VSHL_S_3s        1111 001 0 0 . .. .... .... 0100 . . . 0 .... @3same
+-VSHL_U_3s        1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same
++# The _rev suffix indicates that Vn and Vm are reversed. This is
++# the case for shifts. In the Arm ARM these insns are documented
++# with the Vm and Vn fields in their usual places, but in the
++# assembly the operands are listed "backwards", ie in the order
++# Dd, Dm, Dn where other insns use Dd, Dn, Dm. For QEMU we choose
++# to consider Vm and Vn as being in different fields in the insn,
++# which allows us to avoid special-casing shifts in the trans_
++# function code. We would otherwise need to manually swap the operands
++# over to call Neon helper functions that are shared with AArch64,
++# which does not have this odd reversed-operand situation.
++@3same_rev       .... ... . . . size:2 .... .... .... . q:1 . . .... \
++                 &3same vn=%vm_dp vm=%vn_dp vd=%vd_dp
++
++VSHL_S_3s        1111 001 0 0 . .. .... .... 0100 . . . 0 .... @3same_rev
++VSHL_U_3s        1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same_rev
+ VMAX_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 0 .... @3same
+ VMAX_U_3s        1111 001 1 0 . .. .... .... 0110 . . . 0 .... @3same
+diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/translate-neon.inc.c
++++ b/target/arm/translate-neon.inc.c
+@@ -XXX,XX +XXX,XX @@ static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
+                                 uint32_t rn_ofs, uint32_t rm_ofs,       \
+                                 uint32_t oprsz, uint32_t maxsz)         \
+     {                                                                   \
+-        /* Note the operation is vshl vd,vm,vn */                       \
+-        tcg_gen_gvec_3(rd_ofs, rm_ofs, rn_ofs,                          \
++        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,                          \
+                        oprsz, maxsz, &OPARRAY[vece]);                   \
+     }                                                                   \
+     DO_3SAME(INSN, gen_##INSN##_3s)
+--
+.20.1
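The `@3same_rev` format in the patch above swaps the Vn and Vm fields at decode time, so the shift translation code no longer has to swap its operands by hand. A minimal sketch of that idea in plain C follows. The struct `Args3Same` and the functions `bits` and `decode_3same` are hypothetical stand-ins for what decodetree generates, not QEMU code, and the bit positions assume the usual A32 Neon "three registers of the same length" layout (D:Vd in bits 22 and 15:12, N:Vn in bits 7 and 19:16, M:Vm in bits 5 and 3:0).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the decodetree-generated arg_3same struct. */
typedef struct {
    int vd, vn, vm, size, q;
} Args3Same;

/* Extract 'len' bits of 'insn' starting at bit 'pos'. */
static uint32_t bits(uint32_t insn, int pos, int len)
{
    return (insn >> pos) & ((1u << len) - 1);
}

/*
 * Decode the register fields of a 3-reg-same insn. With rev == true the
 * Vn and Vm fields are exchanged, mirroring what @3same_rev does for
 * VSHL, so the caller can always pass (vd, vn, vm) straight through.
 */
static Args3Same decode_3same(uint32_t insn, bool rev)
{
    int n = (int)((bits(insn, 7, 1) << 4) | bits(insn, 16, 4));
    int m = (int)((bits(insn, 5, 1) << 4) | bits(insn, 0, 4));
    Args3Same a = {
        .vd = (int)((bits(insn, 22, 1) << 4) | bits(insn, 12, 4)),
        .vn = rev ? m : n,
        .vm = rev ? n : m,
        .size = (int)bits(insn, 20, 2),
        .q = (int)bits(insn, 6, 1),
    };
    return a;
}
```

Doing the swap in the decoder rather than in each code generator keeps the generators identical to those used for instructions with the normal operand order.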

-[Qemu-devel] [PULL 40/42] hw/net/ne2000-isa: Add guards to the header
+[PULL 10/45] target/arm: Create gen_gvec_{cmtst,ushl,sshl}
-From: Philippe Mathieu-Daudé <philmd@redhat.com>
+From: Richard Henderson <richard.henderson@linaro.org>
-Reviewed-by: Thomas Huth <thuth@redhat.com>
+Provide a functional interface for the vector expansion.
-Reviewed-by: Markus Armbruster <armbru@redhat.com>
+This fits better with the existing set of helpers that
-Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+we provide for other operations.
-Message-id: 20190412165416.7977-11-philmd@redhat.com
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
 Message-id: 20200513163245.17915-10-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- include/hw/net/ne2000-isa.h | 6 ++++++
+ target/arm/translate.h          |  10 ++-
-file changed, 6 insertions(+)
+ target/arm/translate-a64.c      |  18 ++--
+ target/arm/translate-neon.inc.c |  23 +----
-diff --git a/include/hw/net/ne2000-isa.h b/include/hw/net/ne2000-isa.h
+ target/arm/translate.c          | 146 +++++++++++++++++---------------
 files changed, 95 insertions(+), 102 deletions(-)
 diff --git a/target/arm/translate.h b/target/arm/translate.h
 index XXXXXXX..XXXXXXX 100644
---- a/include/hw/net/ne2000-isa.h
+--- a/target/arm/translate.h
-+++ b/include/hw/net/ne2000-isa.h
++++ b/target/arm/translate.h
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ void gen_gvec_mla(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
-  * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ void gen_gvec_mls(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
-  * See the COPYING file in the top-level directory.
+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
-  */
 -extern const GVecGen3 cmtst_op[4];
 -extern const GVecGen3 sshl_op[4];
 -extern const GVecGen3 ushl_op[4];
 +void gen_gvec_cmtst(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                    uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 +void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 +void gen_gvec_ushl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 +
-+#ifndef HW_NET_NE2K_ISA_H
+ extern const GVecGen4 uqadd_op[4];
-+#define HW_NET_NE2K_ISA_H
+ extern const GVecGen4 sqadd_op[4];
-+
+ extern const GVecGen4 uqsub_op[4];
- #include "hw/hw.h"
+diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
- #include "hw/qdev.h"
+index XXXXXXX..XXXXXXX 100644
- #include "hw/isa/isa.h"
+--- a/target/arm/translate-a64.c
-@@ -XXX,XX +XXX,XX @@ static inline ISADevice *isa_ne2000_init(ISABus *bus, int base, int irq,
++++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void gen_gvec_fn4(DisasContext *s, bool is_q, int rd, int rn, int rm,
              is_q ? 16 : 8, vec_full_reg_size(s));
  }
 -/* Expand a 3-operand AdvSIMD vector operation using an op descriptor.  */
 -static void gen_gvec_op3(DisasContext *s, bool is_q, int rd,
 -                         int rn, int rm, const GVecGen3 *gvec_op)
 -{
 -    tcg_gen_gvec_3(vec_full_reg_offset(s, rd), vec_full_reg_offset(s, rn),
 -                   vec_full_reg_offset(s, rm), is_q ? 16 : 8,
 -                   vec_full_reg_size(s), gvec_op);
 -}
 -
  /* Expand a 3-operand operation using an out-of-line helper.  */
  static void gen_gvec_op3_ool(DisasContext *s, bool is_q, int rd,
                               int rn, int rm, int data, gen_helper_gvec_3 *fn)
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
                         (u ? uqsub_op : sqsub_op) + size);
          return;
      case 0x08: /* SSHL, USHL */
 -        gen_gvec_op3(s, is_q, rd, rn, rm,
 -                     u ? &ushl_op[size] : &sshl_op[size]);
 +        if (u) {
 +            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_ushl, size);
 +        } else {
 +            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sshl, size);
 +        }
          return;
      case 0x0c: /* SMAX, UMAX */
          if (u) {
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
          return;
      case 0x11:
          if (!u) { /* CMTST */
 -            gen_gvec_op3(s, is_q, rd, rn, rm, &cmtst_op[size]);
 +            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_cmtst, size);
              return;
          }
          /* else CMEQ */
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_3SAME(VBIC, tcg_gen_gvec_andc)
  DO_3SAME(VORR, tcg_gen_gvec_or)
  DO_3SAME(VORN, tcg_gen_gvec_orc)
  DO_3SAME(VEOR, tcg_gen_gvec_xor)
 +DO_3SAME(VSHL_S, gen_gvec_sshl)
 +DO_3SAME(VSHL_U, gen_gvec_ushl)
  /* These insns are all gvec_bitsel but with the inputs in various orders. */
  #define DO_3SAME_BITSEL(INSN, O1, O2, O3)                               \
@@ -XXX,XX +XXX,XX @@ DO_3SAME_NO_SZ_3(VMIN_U, tcg_gen_gvec_umin)
  DO_3SAME_NO_SZ_3(VMUL, tcg_gen_gvec_mul)
  DO_3SAME_NO_SZ_3(VMLA, gen_gvec_mla)
  DO_3SAME_NO_SZ_3(VMLS, gen_gvec_mls)
 +DO_3SAME_NO_SZ_3(VTST, gen_gvec_cmtst)
  #define DO_3SAME_CMP(INSN, COND)                                        \
      static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
@@ -XXX,XX +XXX,XX @@ DO_3SAME_CMP(VCGE_S, TCG_COND_GE)
  DO_3SAME_CMP(VCGE_U, TCG_COND_GEU)
  DO_3SAME_CMP(VCEQ, TCG_COND_EQ)
 -static void gen_VTST_3s(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 -                         uint32_t rm_ofs, uint32_t oprsz, uint32_t maxsz)
 -{
 -    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &cmtst_op[vece]);
 -}
 -DO_3SAME_NO_SZ_3(VTST, gen_VTST_3s)
 -
  #define DO_3SAME_GVEC4(INSN, OPARRAY)                                   \
      static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
                                  uint32_t rn_ofs, uint32_t rm_ofs,       \
@@ -XXX,XX +XXX,XX @@ static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
      }
-     return d;
+     return do_3same(s, a, gen_VMUL_p_3s);
  }
-+
+-
-+#endif
+-#define DO_3SAME_GVEC3_SHIFT(INSN, OPARRAY)                             \
 -    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
 -                                uint32_t rn_ofs, uint32_t rm_ofs,       \
 -                                uint32_t oprsz, uint32_t maxsz)         \
 -    {                                                                   \
 -        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,                          \
 -                       oprsz, maxsz, &OPARRAY[vece]);                   \
 -    }                                                                   \
 -    DO_3SAME(INSN, gen_##INSN##_3s)
 -
 -DO_3SAME_GVEC3_SHIFT(VSHL_S, sshl_op)
 -DO_3SAME_GVEC3_SHIFT(VSHL_U, ushl_op)
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_cmtst_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
      tcg_gen_cmp_vec(TCG_COND_NE, vece, d, d, a);
  }
 -static const TCGOpcode vecop_list_cmtst[] = { INDEX_op_cmp_vec, 0 };
 -
 -const GVecGen3 cmtst_op[4] = {
 -    { .fni4 = gen_helper_neon_tst_u8,
 -      .fniv = gen_cmtst_vec,
 -      .opt_opc = vecop_list_cmtst,
 -      .vece = MO_8 },
 -    { .fni4 = gen_helper_neon_tst_u16,
 -      .fniv = gen_cmtst_vec,
 -      .opt_opc = vecop_list_cmtst,
 -      .vece = MO_16 },
 -    { .fni4 = gen_cmtst_i32,
 -      .fniv = gen_cmtst_vec,
 -      .opt_opc = vecop_list_cmtst,
 -      .vece = MO_32 },
 -    { .fni8 = gen_cmtst_i64,
 -      .fniv = gen_cmtst_vec,
 -      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -      .opt_opc = vecop_list_cmtst,
 -      .vece = MO_64 },
 -};
 +void gen_gvec_cmtst(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                    uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = { INDEX_op_cmp_vec, 0 };
 +    static const GVecGen3 ops[4] = {
 +        { .fni4 = gen_helper_neon_tst_u8,
 +          .fniv = gen_cmtst_vec,
 +          .opt_opc = vecop_list,
 +          .vece = MO_8 },
 +        { .fni4 = gen_helper_neon_tst_u16,
 +          .fniv = gen_cmtst_vec,
 +          .opt_opc = vecop_list,
 +          .vece = MO_16 },
 +        { .fni4 = gen_cmtst_i32,
 +          .fniv = gen_cmtst_vec,
 +          .opt_opc = vecop_list,
 +          .vece = MO_32 },
 +        { .fni8 = gen_cmtst_i64,
 +          .fniv = gen_cmtst_vec,
 +          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +          .opt_opc = vecop_list,
 +          .vece = MO_64 },
 +    };
 +    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 +}
  void gen_ushl_i32(TCGv_i32 dst, TCGv_i32 src, TCGv_i32 shift)
  {
@@ -XXX,XX +XXX,XX @@ static void gen_ushl_vec(unsigned vece, TCGv_vec dst,
      tcg_temp_free_vec(rsh);
  }
 -static const TCGOpcode ushl_list[] = {
 -    INDEX_op_neg_vec, INDEX_op_shlv_vec,
 -    INDEX_op_shrv_vec, INDEX_op_cmp_vec, 0
 -};
 -
 -const GVecGen3 ushl_op[4] = {
 -    { .fniv = gen_ushl_vec,
 -      .fno = gen_helper_gvec_ushl_b,
 -      .opt_opc = ushl_list,
 -      .vece = MO_8 },
 -    { .fniv = gen_ushl_vec,
 -      .fno = gen_helper_gvec_ushl_h,
 -      .opt_opc = ushl_list,
 -      .vece = MO_16 },
 -    { .fni4 = gen_ushl_i32,
 -      .fniv = gen_ushl_vec,
 -      .opt_opc = ushl_list,
 -      .vece = MO_32 },
 -    { .fni8 = gen_ushl_i64,
 -      .fniv = gen_ushl_vec,
 -      .opt_opc = ushl_list,
 -      .vece = MO_64 },
 -};
 +void gen_gvec_ushl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = {
 +        INDEX_op_neg_vec, INDEX_op_shlv_vec,
 +        INDEX_op_shrv_vec, INDEX_op_cmp_vec, 0
 +    };
 +    static const GVecGen3 ops[4] = {
 +        { .fniv = gen_ushl_vec,
 +          .fno = gen_helper_gvec_ushl_b,
 +          .opt_opc = vecop_list,
 +          .vece = MO_8 },
 +        { .fniv = gen_ushl_vec,
 +          .fno = gen_helper_gvec_ushl_h,
 +          .opt_opc = vecop_list,
 +          .vece = MO_16 },
 +        { .fni4 = gen_ushl_i32,
 +          .fniv = gen_ushl_vec,
 +          .opt_opc = vecop_list,
 +          .vece = MO_32 },
 +        { .fni8 = gen_ushl_i64,
 +          .fniv = gen_ushl_vec,
 +          .opt_opc = vecop_list,
 +          .vece = MO_64 },
 +    };
 +    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 +}
  void gen_sshl_i32(TCGv_i32 dst, TCGv_i32 src, TCGv_i32 shift)
  {
@@ -XXX,XX +XXX,XX @@ static void gen_sshl_vec(unsigned vece, TCGv_vec dst,
      tcg_temp_free_vec(tmp);
  }
 -static const TCGOpcode sshl_list[] = {
 -    INDEX_op_neg_vec, INDEX_op_umin_vec, INDEX_op_shlv_vec,
 -    INDEX_op_sarv_vec, INDEX_op_cmp_vec, INDEX_op_cmpsel_vec, 0
 -};
 -
 -const GVecGen3 sshl_op[4] = {
 -    { .fniv = gen_sshl_vec,
 -      .fno = gen_helper_gvec_sshl_b,
 -      .opt_opc = sshl_list,
 -      .vece = MO_8 },
 -    { .fniv = gen_sshl_vec,
 -      .fno = gen_helper_gvec_sshl_h,
 -      .opt_opc = sshl_list,
 -      .vece = MO_16 },
 -    { .fni4 = gen_sshl_i32,
 -      .fniv = gen_sshl_vec,
 -      .opt_opc = sshl_list,
 -      .vece = MO_32 },
 -    { .fni8 = gen_sshl_i64,
 -      .fniv = gen_sshl_vec,
 -      .opt_opc = sshl_list,
 -      .vece = MO_64 },
 -};
 +void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = {
 +        INDEX_op_neg_vec, INDEX_op_umin_vec, INDEX_op_shlv_vec,
 +        INDEX_op_sarv_vec, INDEX_op_cmp_vec, INDEX_op_cmpsel_vec, 0
 +    };
 +    static const GVecGen3 ops[4] = {
 +        { .fniv = gen_sshl_vec,
 +          .fno = gen_helper_gvec_sshl_b,
 +          .opt_opc = vecop_list,
 +          .vece = MO_8 },
 +        { .fniv = gen_sshl_vec,
 +          .fno = gen_helper_gvec_sshl_h,
 +          .opt_opc = vecop_list,
 +          .vece = MO_16 },
 +        { .fni4 = gen_sshl_i32,
 +          .fniv = gen_sshl_vec,
 +          .opt_opc = vecop_list,
 +          .vece = MO_32 },
 +        { .fni8 = gen_sshl_i64,
 +          .fniv = gen_sshl_vec,
 +          .opt_opc = vecop_list,
 +          .vece = MO_64 },
 +    };
 +    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 +}
  static void gen_uqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
                            TCGv_vec a, TCGv_vec b)
 --
 .20.1
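The refactoring in the patch above replaces exported per-element-size descriptor arrays (`cmtst_op[4]` and friends) with functions that keep the array private and index it by `vece`. The pattern can be sketched outside of TCG as below. Everything here is a hypothetical miniature (`MiniGen3`, `mini_gvec_cmtst`, the `cmtst_u*` workers): it operates directly on host memory instead of emitting TCG ops, and only models the CMTST semantics of "all ones if `(n & m) != 0`, else zero".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical miniature of a per-element-size operation descriptor. */
typedef struct {
    void (*fn)(void *d, const void *n, const void *m, size_t bytes);
} MiniGen3;

/* Define one CMTST worker per element type; memcpy avoids alignment traps. */
#define DEFINE_CMTST(NAME, TYPE)                                          \
    static void NAME(void *d, const void *n, const void *m, size_t bytes) \
    {                                                                     \
        for (size_t i = 0; i < bytes; i += sizeof(TYPE)) {                \
            TYPE a, b, r;                                                 \
            memcpy(&a, (const char *)n + i, sizeof(a));                   \
            memcpy(&b, (const char *)m + i, sizeof(b));                   \
            r = (a & b) ? (TYPE)-1 : 0;                                   \
            memcpy((char *)d + i, &r, sizeof(r));                         \
        }                                                                 \
    }

DEFINE_CMTST(cmtst_u8, uint8_t)
DEFINE_CMTST(cmtst_u16, uint16_t)
DEFINE_CMTST(cmtst_u32, uint32_t)
DEFINE_CMTST(cmtst_u64, uint64_t)

/*
 * Functional interface: callers pass vece (log2 of the element size in
 * bytes) and never see the table, which is now a private implementation
 * detail of this function.
 */
void mini_gvec_cmtst(unsigned vece, void *d, const void *n, const void *m,
                     size_t bytes)
{
    static const MiniGen3 ops[4] = {
        { .fn = cmtst_u8 },
        { .fn = cmtst_u16 },
        { .fn = cmtst_u32 },
        { .fn = cmtst_u64 },
    };

    assert(vece < 4);
    ops[vece].fn(d, n, m, bytes);
}
```

Because every such function shares one signature, it can be handed to a generic wrapper in the same way the patch passes `gen_gvec_cmtst` to `gen_gvec_fn3` and `DO_3SAME_NO_SZ_3`.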

-[Qemu-devel] [PULL 36/42] hw/devices: Move CBus declarations into a new header
+[PULL 11/45] target/arm: Create gen_gvec_{uqadd, sqadd, uqsub, sqsub}
-From: Philippe Mathieu-Daudé <philmd@redhat.com>
+From: Richard Henderson <richard.henderson@linaro.org>
-Reviewed-by: Thomas Huth <thuth@redhat.com>
+Provide a functional interface for the vector expansion.
-Reviewed-by: Markus Armbruster <armbru@redhat.com>
+This fits better with the existing set of helpers that
-Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+we provide for other operations.
-Message-id: 20190412165416.7977-7-philmd@redhat.com
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
 Message-id: 20200513163245.17915-11-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- include/hw/devices.h   | 14 --------------
+ target/arm/translate.h          |  13 +-
- include/hw/misc/cbus.h | 32 ++++++++++++++++++++++++++++++++
+ target/arm/translate-a64.c      |  22 ++-
- hw/arm/nseries.c       |  1 +
+ target/arm/translate-neon.inc.c |  19 +--
- hw/misc/cbus.c         |  2 +-
+ target/arm/translate.c          | 228 +++++++++++++++++---------------
- MAINTAINERS            |  1 +
+files changed, 147 insertions(+), 135 deletions(-)
-files changed, 35 insertions(+), 15 deletions(-)
- create mode 100644 include/hw/misc/cbus.h
+diff --git a/target/arm/translate.h b/target/arm/translate.h
 diff --git a/include/hw/devices.h b/include/hw/devices.h
 index XXXXXXX..XXXXXXX 100644
---- a/include/hw/devices.h
+--- a/target/arm/translate.h
-+++ b/include/hw/devices.h
++++ b/target/arm/translate.h
-@@ -XXX,XX +XXX,XX @@ void tsc2005_set_transform(void *opaque, MouseTransformInfo *info);
+@@ -XXX,XX +XXX,XX @@ void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
- /* stellaris_input.c */
+ void gen_gvec_ushl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
- void stellaris_gamepad_init(int n, qemu_irq *irq, const int *keycode);
+                    uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
--/* cbus.c */
+-extern const GVecGen4 uqadd_op[4];
--typedef struct {
+-extern const GVecGen4 sqadd_op[4];
--    qemu_irq clk;
+-extern const GVecGen4 uqsub_op[4];
--    qemu_irq dat;
+-extern const GVecGen4 sqsub_op[4];
--    qemu_irq sel;
+ void gen_cmtst_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
--} CBus;
+ void gen_ushl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b);
--CBus *cbus_init(qemu_irq dat_out);
+ void gen_sshl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b);
--void cbus_attach(CBus *bus, void *slave_opaque);
+ void gen_ushl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
--
+ void gen_sshl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
--void *retu_init(qemu_irq irq, int vilma);
--void *tahvo_init(qemu_irq irq, int betty);
++void gen_gvec_uqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
--
++                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
--void retu_key_event(void *retu, int state);
++void gen_gvec_sqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
--
++                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
- #endif
++void gen_gvec_uqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
-diff --git a/include/hw/misc/cbus.h b/include/hw/misc/cbus.h
++                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
-new file mode 100644
++void gen_gvec_sqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
-index XXXXXXX..XXXXXXX
++                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 --- /dev/null
 +++ b/include/hw/misc/cbus.h
@@ -XXX,XX +XXX,XX @@
 +/*
 + * CBUS three-pin bus and the Retu / Betty / Tahvo / Vilma / Avilma /
 + * Hinku / Vinku / Ahne / Pihi chips used in various Nokia platforms.
 + * Based on reverse-engineering of a linux driver.
 + *
 + * Copyright (C) 2008 Nokia Corporation
 + * Written by Andrzej Zaborowski
 + *
 + * This work is licensed under the terms of the GNU GPL, version 2 or later.
 + * See the COPYING file in the top-level directory.
 + */
 +
-+#ifndef HW_MISC_CBUS_H
+ void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
-+#define HW_MISC_CBUS_H
+                    int64_t shift, uint32_t opr_sz, uint32_t max_sz);
-+
+ void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
-+#include "hw/irq.h"
+diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 +
 +typedef struct {
 +    qemu_irq clk;
 +    qemu_irq dat;
 +    qemu_irq sel;
 +} CBus;
 +
 +CBus *cbus_init(qemu_irq dat_out);
 +void cbus_attach(CBus *bus, void *slave_opaque);
 +
 +void *retu_init(qemu_irq irq, int vilma);
 +void *tahvo_init(qemu_irq irq, int betty);
 +
 +void retu_key_event(void *retu, int state);
 +
 +#endif
 diff --git a/hw/arm/nseries.c b/hw/arm/nseries.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/nseries.c
+--- a/target/arm/translate-a64.c
-+++ b/hw/arm/nseries.c
++++ b/target/arm/translate-a64.c
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
- #include "hw/i2c/i2c.h"
- #include "hw/devices.h"
+     switch (opcode) {
- #include "hw/display/blizzard.h"
+     case 0x01: /* SQADD, UQADD */
-+#include "hw/misc/cbus.h"
+-        tcg_gen_gvec_4(vec_full_reg_offset(s, rd),
- #include "hw/misc/tmp105.h"
+-                       offsetof(CPUARMState, vfp.qc),
- #include "hw/block/flash.h"
+-                       vec_full_reg_offset(s, rn),
- #include "hw/hw.h"
+-                       vec_full_reg_offset(s, rm),
-diff --git a/hw/misc/cbus.c b/hw/misc/cbus.c
+-                       is_q ? 16 : 8, vec_full_reg_size(s),
 -                       (u ? uqadd_op : sqadd_op) + size);
 +        if (u) {
 +            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uqadd_qc, size);
 +        } else {
 +            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqadd_qc, size);
 +        }
          return;
      case 0x05: /* SQSUB, UQSUB */
 -        tcg_gen_gvec_4(vec_full_reg_offset(s, rd),
 -                       offsetof(CPUARMState, vfp.qc),
 -                       vec_full_reg_offset(s, rn),
 -                       vec_full_reg_offset(s, rm),
 -                       is_q ? 16 : 8, vec_full_reg_size(s),
 -                       (u ? uqsub_op : sqsub_op) + size);
 +        if (u) {
 +            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uqsub_qc, size);
 +        } else {
 +            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqsub_qc, size);
 +        }
          return;
      case 0x08: /* SSHL, USHL */
          if (u) {
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/misc/cbus.c
+--- a/target/arm/translate-neon.inc.c
-+++ b/hw/misc/cbus.c
++++ b/target/arm/translate-neon.inc.c
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ DO_3SAME(VORN, tcg_gen_gvec_orc)
- #include "qemu/osdep.h"
+ DO_3SAME(VEOR, tcg_gen_gvec_xor)
- #include "hw/hw.h"
+ DO_3SAME(VSHL_S, gen_gvec_sshl)
- #include "hw/irq.h"
+ DO_3SAME(VSHL_U, gen_gvec_ushl)
--#include "hw/devices.h"
++DO_3SAME(VQADD_S, gen_gvec_sqadd_qc)
-+#include "hw/misc/cbus.h"
++DO_3SAME(VQADD_U, gen_gvec_uqadd_qc)
- #include "sysemu/sysemu.h"
++DO_3SAME(VQSUB_S, gen_gvec_sqsub_qc)
++DO_3SAME(VQSUB_U, gen_gvec_uqsub_qc)
- //#define DEBUG
-diff --git a/MAINTAINERS b/MAINTAINERS
+ /* These insns are all gvec_bitsel but with the inputs in various orders. */
  #define DO_3SAME_BITSEL(INSN, O1, O2, O3)                               \
@@ -XXX,XX +XXX,XX @@ DO_3SAME_CMP(VCGE_S, TCG_COND_GE)
  DO_3SAME_CMP(VCGE_U, TCG_COND_GEU)
  DO_3SAME_CMP(VCEQ, TCG_COND_EQ)
 -#define DO_3SAME_GVEC4(INSN, OPARRAY)                                   \
 -    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
 -                                uint32_t rn_ofs, uint32_t rm_ofs,       \
 -                                uint32_t oprsz, uint32_t maxsz)         \
 -    {                                                                   \
 -        tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),           \
 -                       rn_ofs, rm_ofs, oprsz, maxsz, &OPARRAY[vece]);   \
 -    }                                                                   \
 -    DO_3SAME(INSN, gen_##INSN##_3s)
 -
 -DO_3SAME_GVEC4(VQADD_S, sqadd_op)
 -DO_3SAME_GVEC4(VQADD_U, uqadd_op)
 -DO_3SAME_GVEC4(VQSUB_S, sqsub_op)
 -DO_3SAME_GVEC4(VQSUB_U, uqsub_op)
 -
  static void gen_VMUL_p_3s(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
                             uint32_t rm_ofs, uint32_t oprsz, uint32_t maxsz)
  {
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
---- a/MAINTAINERS
+--- a/target/arm/translate.c
-+++ b/MAINTAINERS
++++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ F: hw/input/tsc2005.c
+@@ -XXX,XX +XXX,XX @@ static void gen_uqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
- F: hw/misc/cbus.c
+     tcg_temp_free_vec(x);
- F: hw/timer/twl92230.c
+ }
- F: include/hw/display/blizzard.h
-+F: include/hw/misc/cbus.h
+-static const TCGOpcode vecop_list_uqadd[] = {
+-    INDEX_op_usadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0
- Palm
+-};
- M: Andrzej Zaborowski <balrogg@gmail.com>
+-
 -const GVecGen4 uqadd_op[4] = {
 -    { .fniv = gen_uqadd_vec,
 -      .fno = gen_helper_gvec_uqadd_b,
 -      .write_aofs = true,
 -      .opt_opc = vecop_list_uqadd,
 -      .vece = MO_8 },
 -    { .fniv = gen_uqadd_vec,
 -      .fno = gen_helper_gvec_uqadd_h,
 -      .write_aofs = true,
 -      .opt_opc = vecop_list_uqadd,
 -      .vece = MO_16 },
 -    { .fniv = gen_uqadd_vec,
 -      .fno = gen_helper_gvec_uqadd_s,
 -      .write_aofs = true,
 -      .opt_opc = vecop_list_uqadd,
 -      .vece = MO_32 },
 -    { .fniv = gen_uqadd_vec,
 -      .fno = gen_helper_gvec_uqadd_d,
 -      .write_aofs = true,
 -      .opt_opc = vecop_list_uqadd,
 -      .vece = MO_64 },
 -};
 +void gen_gvec_uqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = {
 +        INDEX_op_usadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0
 +    };
 +    static const GVecGen4 ops[4] = {
 +        { .fniv = gen_uqadd_vec,
 +          .fno = gen_helper_gvec_uqadd_b,
 +          .write_aofs = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_8 },
 +        { .fniv = gen_uqadd_vec,
 +          .fno = gen_helper_gvec_uqadd_h,
 +          .write_aofs = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_16 },
 +        { .fniv = gen_uqadd_vec,
 +          .fno = gen_helper_gvec_uqadd_s,
 +          .write_aofs = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_32 },
 +        { .fniv = gen_uqadd_vec,
 +          .fno = gen_helper_gvec_uqadd_d,
 +          .write_aofs = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_64 },
 +    };
 +    tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
 +                   rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 +}
  static void gen_sqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
                            TCGv_vec a, TCGv_vec b)
@@ -XXX,XX +XXX,XX @@ static void gen_sqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
      tcg_temp_free_vec(x);
  }
 -static const TCGOpcode vecop_list_sqadd[] = {
 -    INDEX_op_ssadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0
 -};
 -
 -const GVecGen4 sqadd_op[4] = {
 -    { .fniv = gen_sqadd_vec,
 -      .fno = gen_helper_gvec_sqadd_b,
 -      .opt_opc = vecop_list_sqadd,
 -      .write_aofs = true,
 -      .vece = MO_8 },
 -    { .fniv = gen_sqadd_vec,
 -      .fno = gen_helper_gvec_sqadd_h,
 -      .opt_opc = vecop_list_sqadd,
 -      .write_aofs = true,
 -      .vece = MO_16 },
 -    { .fniv = gen_sqadd_vec,
 -      .fno = gen_helper_gvec_sqadd_s,
 -      .opt_opc = vecop_list_sqadd,
 -      .write_aofs = true,
 -      .vece = MO_32 },
 -    { .fniv = gen_sqadd_vec,
 -      .fno = gen_helper_gvec_sqadd_d,
 -      .opt_opc = vecop_list_sqadd,
 -      .write_aofs = true,
 -      .vece = MO_64 },
 -};
 +void gen_gvec_sqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = {
 +        INDEX_op_ssadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0
 +    };
 +    static const GVecGen4 ops[4] = {
 +        { .fniv = gen_sqadd_vec,
 +          .fno = gen_helper_gvec_sqadd_b,
 +          .opt_opc = vecop_list,
 +          .write_aofs = true,
 +          .vece = MO_8 },
 +        { .fniv = gen_sqadd_vec,
 +          .fno = gen_helper_gvec_sqadd_h,
 +          .opt_opc = vecop_list,
 +          .write_aofs = true,
 +          .vece = MO_16 },
 +        { .fniv = gen_sqadd_vec,
 +          .fno = gen_helper_gvec_sqadd_s,
 +          .opt_opc = vecop_list,
 +          .write_aofs = true,
 +          .vece = MO_32 },
 +        { .fniv = gen_sqadd_vec,
 +          .fno = gen_helper_gvec_sqadd_d,
 +          .opt_opc = vecop_list,
 +          .write_aofs = true,
 +          .vece = MO_64 },
 +    };
 +    tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
 +                   rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 +}
  static void gen_uqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
                            TCGv_vec a, TCGv_vec b)
@@ -XXX,XX +XXX,XX @@ static void gen_uqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
      tcg_temp_free_vec(x);
  }
 -static const TCGOpcode vecop_list_uqsub[] = {
 -    INDEX_op_ussub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0
 -};
 -
 -const GVecGen4 uqsub_op[4] = {
 -    { .fniv = gen_uqsub_vec,
 -      .fno = gen_helper_gvec_uqsub_b,
 -      .opt_opc = vecop_list_uqsub,
 -      .write_aofs = true,
 -      .vece = MO_8 },
 -    { .fniv = gen_uqsub_vec,
 -      .fno = gen_helper_gvec_uqsub_h,
 -      .opt_opc = vecop_list_uqsub,
 -      .write_aofs = true,
 -      .vece = MO_16 },
 -    { .fniv = gen_uqsub_vec,
 -      .fno = gen_helper_gvec_uqsub_s,
 -      .opt_opc = vecop_list_uqsub,
 -      .write_aofs = true,
 -      .vece = MO_32 },
 -    { .fniv = gen_uqsub_vec,
 -      .fno = gen_helper_gvec_uqsub_d,
 -      .opt_opc = vecop_list_uqsub,
 -      .write_aofs = true,
 -      .vece = MO_64 },
 -};
 +void gen_gvec_uqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = {
 +        INDEX_op_ussub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0
 +    };
 +    static const GVecGen4 ops[4] = {
 +        { .fniv = gen_uqsub_vec,
 +          .fno = gen_helper_gvec_uqsub_b,
 +          .opt_opc = vecop_list,
 +          .write_aofs = true,
 +          .vece = MO_8 },
 +        { .fniv = gen_uqsub_vec,
 +          .fno = gen_helper_gvec_uqsub_h,
 +          .opt_opc = vecop_list,
 +          .write_aofs = true,
 +          .vece = MO_16 },
 +        { .fniv = gen_uqsub_vec,
 +          .fno = gen_helper_gvec_uqsub_s,
 +          .opt_opc = vecop_list,
 +          .write_aofs = true,
 +          .vece = MO_32 },
 +        { .fniv = gen_uqsub_vec,
 +          .fno = gen_helper_gvec_uqsub_d,
 +          .opt_opc = vecop_list,
 +          .write_aofs = true,
 +          .vece = MO_64 },
 +    };
 +    tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
 +                   rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 +}
  static void gen_sqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
                            TCGv_vec a, TCGv_vec b)
@@ -XXX,XX +XXX,XX @@ static void gen_sqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
      tcg_temp_free_vec(x);
  }
 -static const TCGOpcode vecop_list_sqsub[] = {
 -    INDEX_op_sssub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0
 -};
 -
 -const GVecGen4 sqsub_op[4] = {
 -    { .fniv = gen_sqsub_vec,
 -      .fno = gen_helper_gvec_sqsub_b,
 -      .opt_opc = vecop_list_sqsub,
 -      .write_aofs = true,
 -      .vece = MO_8 },
 -    { .fniv = gen_sqsub_vec,
 -      .fno = gen_helper_gvec_sqsub_h,
 -      .opt_opc = vecop_list_sqsub,
 -      .write_aofs = true,
 -      .vece = MO_16 },
 -    { .fniv = gen_sqsub_vec,
 -      .fno = gen_helper_gvec_sqsub_s,
 -      .opt_opc = vecop_list_sqsub,
 -      .write_aofs = true,
 -      .vece = MO_32 },
 -    { .fniv = gen_sqsub_vec,
 -      .fno = gen_helper_gvec_sqsub_d,
 -      .opt_opc = vecop_list_sqsub,
 -      .write_aofs = true,
 -      .vece = MO_64 },
 -};
 +void gen_gvec_sqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = {
 +        INDEX_op_sssub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0
 +    };
 +    static const GVecGen4 ops[4] = {
 +        { .fniv = gen_sqsub_vec,
 +          .fno = gen_helper_gvec_sqsub_b,
 +          .opt_opc = vecop_list,
 +          .write_aofs = true,
 +          .vece = MO_8 },
 +        { .fniv = gen_sqsub_vec,
 +          .fno = gen_helper_gvec_sqsub_h,
 +          .opt_opc = vecop_list,
 +          .write_aofs = true,
 +          .vece = MO_16 },
 +        { .fniv = gen_sqsub_vec,
 +          .fno = gen_helper_gvec_sqsub_s,
 +          .opt_opc = vecop_list,
 +          .write_aofs = true,
 +          .vece = MO_32 },
 +        { .fniv = gen_sqsub_vec,
 +          .fno = gen_helper_gvec_sqsub_d,
 +          .opt_opc = vecop_list,
 +          .write_aofs = true,
 +          .vece = MO_64 },
 +    };
 +    tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
 +                   rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 +}
  /* Translate a NEON data processing instruction.  Return nonzero if the
     instruction is invalid.
 --
 .20.1
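The `_qc` suffix on the new functions in the patch above reflects that each one passes the offset of `vfp.qc` as an extra operand, so a saturating result can be recorded in the cumulative saturation flag. A scalar model of the byte-sized unsigned case is sketched below. `MiniState` and `mini_uqadd_b_qc` are hypothetical names, and a single `bool` stands in for the real QC storage, whose layout is not shown in this patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical miniature CPU state: only the sticky saturation flag. */
typedef struct {
    bool qc;
} MiniState;

/*
 * Unsigned saturating add on a vector of bytes. Any lane that would
 * overflow is clamped to UINT8_MAX and sets env->qc. The flag is
 * sticky: this function sets it but never clears it.
 */
static void mini_uqadd_b_qc(MiniState *env, uint8_t *d, const uint8_t *n,
                            const uint8_t *m, size_t bytes)
{
    for (size_t i = 0; i < bytes; i++) {
        unsigned sum = (unsigned)n[i] + (unsigned)m[i];

        if (sum > UINT8_MAX) {
            sum = UINT8_MAX;
            env->qc = true;
        }
        d[i] = (uint8_t)sum;
    }
}
```

Folding the flag location into the function means callers such as `DO_3SAME(VQADD_U, ...)` only supply the three register offsets and the sizes, which is what lets the patch delete the `DO_3SAME_GVEC4` wrapper macro.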

-[Qemu-devel] [PULL 33/42] hw/display/tc6393xb: Remove unused functions
+[PULL 12/45] target/arm: Remove fp_status from helper_{recpe, rsqrte}_u32
-From: Philippe Mathieu-Daudé <philmd@redhat.com>
+From: Richard Henderson <richard.henderson@linaro.org>
-No code used the tc6393xb_gpio_in_get() and tc6393xb_gpio_out_set()
+These operations do not touch fp_status.
 functions since their introduction in commit 88d2c950b002. Time to
 remove them.
-Suggested-by: Markus Armbruster <armbru@redhat.com>
-Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
-Message-id: 20190412165416.7977-4-philmd@redhat.com
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20200513163245.17915-12-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- include/hw/devices.h  |  3 ---
+ target/arm/helper.h        |  4 ++--
- hw/display/tc6393xb.c | 16 ----------------
+ target/arm/translate-a64.c |  5 ++---
-files changed, 19 deletions(-)
+ target/arm/translate.c     | 12 ++----------
  target/arm/vfp_helper.c    |  5 ++---
 files changed, 8 insertions(+), 18 deletions(-)
-diff --git a/include/hw/devices.h b/include/hw/devices.h
+diff --git a/target/arm/helper.h b/target/arm/helper.h
 index XXXXXXX..XXXXXXX 100644
---- a/include/hw/devices.h
+--- a/target/arm/helper.h
-+++ b/include/hw/devices.h
++++ b/target/arm/helper.h
-@@ -XXX,XX +XXX,XX @@ void retu_key_event(void *retu, int state);
+@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_2(recpe_f64, TCG_CALL_NO_RWG, f64, f64, ptr)
- typedef struct TC6393xbState TC6393xbState;
+ DEF_HELPER_FLAGS_2(rsqrte_f16, TCG_CALL_NO_RWG, f16, f16, ptr)
- TC6393xbState *tc6393xb_init(struct MemoryRegion *sysmem,
+ DEF_HELPER_FLAGS_2(rsqrte_f32, TCG_CALL_NO_RWG, f32, f32, ptr)
-                              uint32_t base, qemu_irq irq);
+ DEF_HELPER_FLAGS_2(rsqrte_f64, TCG_CALL_NO_RWG, f64, f64, ptr)
--void tc6393xb_gpio_out_set(TC6393xbState *s, int line,
+-DEF_HELPER_2(recpe_u32, i32, i32, ptr)
--                    qemu_irq handler);
+-DEF_HELPER_FLAGS_2(rsqrte_u32, TCG_CALL_NO_RWG, i32, i32, ptr)
--qemu_irq *tc6393xb_gpio_in_get(TC6393xbState *s);
++DEF_HELPER_FLAGS_1(recpe_u32, TCG_CALL_NO_RWG, i32, i32)
- qemu_irq tc6393xb_l3v_get(TC6393xbState *s);
++DEF_HELPER_FLAGS_1(rsqrte_u32, TCG_CALL_NO_RWG, i32, i32)
+ DEF_HELPER_FLAGS_4(neon_tbl, TCG_CALL_NO_RWG, i32, i32, i32, ptr, i32)
- #endif
-diff --git a/hw/display/tc6393xb.c b/hw/display/tc6393xb.c
+ DEF_HELPER_3(shl_cc, i32, env, i32, i32)
 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/display/tc6393xb.c
+--- a/target/arm/translate-a64.c
-+++ b/hw/display/tc6393xb.c
++++ b/target/arm/translate-a64.c
-@@ -XXX,XX +XXX,XX @@ struct TC6393xbState {
+@@ -XXX,XX +XXX,XX @@ static void handle_2misc_reciprocal(DisasContext *s, int opcode,
-              blanked : 1;
- };
+             switch (opcode) {
+             case 0x3c: /* URECPE */
--qemu_irq *tc6393xb_gpio_in_get(TC6393xbState *s)
+-                gen_helper_recpe_u32(tcg_res, tcg_op, fpst);
--{
++                gen_helper_recpe_u32(tcg_res, tcg_op);
--    return s->gpio_in;
+                 break;
--}
+             case 0x3d: /* FRECPE */
--
+                 gen_helper_recpe_f32(tcg_res, tcg_op, fpst);
- static void tc6393xb_gpio_set(void *opaque, int line, int level)
+@@ -XXX,XX +XXX,XX @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn)
                  unallocated_encoding(s);
                  return;
              }
 -            need_fpstatus = true;
              break;
          case 0x1e: /* FRINT32Z */
          case 0x1f: /* FRINT64Z */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn)
                      gen_helper_rints_exact(tcg_res, tcg_op, tcg_fpstatus);
                      break;
                  case 0x7c: /* URSQRTE */
 -                    gen_helper_rsqrte_u32(tcg_res, tcg_op, tcg_fpstatus);
 +                    gen_helper_rsqrte_u32(tcg_res, tcg_op);
                      break;
                  case 0x1e: /* FRINT32Z */
                  case 0x5e: /* FRINT32X */
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                              break;
                          }
                          case NEON_2RM_VRECPE:
 -                        {
 -                            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
 -                            gen_helper_recpe_u32(tmp, tmp, fpstatus);
 -                            tcg_temp_free_ptr(fpstatus);
 +                            gen_helper_recpe_u32(tmp, tmp);
                              break;
 -                        }
                          case NEON_2RM_VRSQRTE:
 -                        {
 -                            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
 -                            gen_helper_rsqrte_u32(tmp, tmp, fpstatus);
 -                            tcg_temp_free_ptr(fpstatus);
 +                            gen_helper_rsqrte_u32(tmp, tmp);
                              break;
 -                        }
                          case NEON_2RM_VRECPE_F:
                          {
                              TCGv_ptr fpstatus = get_fpstatus_ptr(1);
 diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/vfp_helper.c
 +++ b/target/arm/vfp_helper.c
@@ -XXX,XX +XXX,XX @@ float64 HELPER(rsqrte_f64)(float64 input, void *fpstp)
      return make_float64(val);
  }
 -uint32_t HELPER(recpe_u32)(uint32_t a, void *fpstp)
 +uint32_t HELPER(recpe_u32)(uint32_t a)
  {
- //    TC6393xbState *s = opaque;
+-    /* float_status *s = fpstp; */
-@@ -XXX,XX +XXX,XX @@ static void tc6393xb_gpio_set(void *opaque, int line, int level)
+     int input, estimate;
-     // FIXME: how does the chip reflect the GPIO input level change?
      if ((a & 0x80000000) == 0) {
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(recpe_u32)(uint32_t a, void *fpstp)
      return deposit32(0, (32 - 9), 9, estimate);
  }
--void tc6393xb_gpio_out_set(TC6393xbState *s, int line,
+-uint32_t HELPER(rsqrte_u32)(uint32_t a, void *fpstp)
--                    qemu_irq handler)
++uint32_t HELPER(rsqrte_u32)(uint32_t a)
 -{
 -    if (line >= TC6393XB_GPIOS) {
 -        fprintf(stderr, "TC6393xb: no GPIO pin %d\n", line);
 -        return;
 -    }
 -
 -    s->handler[line] = handler;
 -}
 -
  static void tc6393xb_gpio_handler_update(TC6393xbState *s)
  {
-     uint32_t level, diff;
+     int estimate;
 --
 .20.1

-[Qemu-devel] [PULL 22/42] target/arm: Activate M-profile floating point context when FPCCR.ASPEN is set
+[PULL 13/45] target/arm: Create gen_gvec_{qrdmla,qrdmls}
-The M-profile FPCCR.ASPEN bit indicates that automatic floating-point
+From: Richard Henderson <richard.henderson@linaro.org>
 context preservation is enabled. Before executing any floating-point
 instruction, if FPCCR.ASPEN is set and the CONTROL FPCA/SFPA bits
 indicate that there is no active floating point context then we
 must create a new context (by initializing FPSCR and setting
 FPCA/SFPA to indicate that the context is now active). In the
 pseudocode this is handled by ExecuteFPCheck().
-Implement this with a new TB flag which tracks whether we
+Provide a functional interface for the vector expansion.
-need to create a new FP context.
+This fits better with the existing set of helpers that
 we provide for other operations.
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20200513163245.17915-13-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-20-peter.maydell@linaro.org
 ---
- target/arm/cpu.h       |  2 ++
+ target/arm/translate.h     |  5 ++++
- target/arm/translate.h |  1 +
+ target/arm/translate-a64.c | 34 ++----------------------
- target/arm/helper.c    | 13 +++++++++++++
+ target/arm/translate.c     | 54 +++++++++++++++++++-------------------
- target/arm/translate.c | 29 +++++++++++++++++++++++++++++
+files changed, 34 insertions(+), 59 deletions(-)
 files changed, 45 insertions(+)
-diff --git a/target/arm/cpu.h b/target/arm/cpu.h
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.h
-+++ b/target/arm/cpu.h
-@@ -XXX,XX +XXX,XX @@ FIELD(TBFLAG_A32, NS, 6, 1)
- FIELD(TBFLAG_A32, VFPEN, 7, 1)
- FIELD(TBFLAG_A32, CONDEXEC, 8, 8)
- FIELD(TBFLAG_A32, SCTLR_B, 16, 1)
-+/* For M profile only, set if we must create a new FP context */
-+FIELD(TBFLAG_A32, NEW_FP_CTXT_NEEDED, 19, 1)
- /* For M profile only, set if FPCCR.S does not match current security state */
- FIELD(TBFLAG_A32, FPCCR_S_WRONG, 20, 1)
- /* For M profile only, Handler (ie not Thread) mode */
 diff --git a/target/arm/translate.h b/target/arm/translate.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.h
 +++ b/target/arm/translate.h
-@@ -XXX,XX +XXX,XX @@ typedef struct DisasContext {
+@@ -XXX,XX +XXX,XX @@ void gen_gvec_sri(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
-     bool v8m_secure; /* true if v8M and we're in Secure mode */
+ void gen_gvec_sli(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
-     bool v8m_stackcheck; /* true if we need to perform v8M stack limit checks */
+                   int64_t shift, uint32_t opr_sz, uint32_t max_sz);
-     bool v8m_fpccr_s_wrong; /* true if v8M FPCCR.S != v8m_secure */
-+    bool v7m_new_fp_ctxt_needed; /* ASPEN set but no active FP context */
++void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
-     /* Immediate value in AArch32 SVC insn; must be set if is_jmp == DISAS_SWI
++                          uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
-      * so that top level loop can generate correct syndrome information.
++void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
-      */
++                          uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
-diff --git a/target/arm/helper.c b/target/arm/helper.c
++
  /*
   * Forward to the isar_feature_* tests given a DisasContext pointer.
   */
 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
+--- a/target/arm/translate-a64.c
-+++ b/target/arm/helper.c
++++ b/target/arm/translate-a64.c
-@@ -XXX,XX +XXX,XX @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
+@@ -XXX,XX +XXX,XX @@ static void gen_gvec_op3_ool(DisasContext *s, bool is_q, int rd,
-         flags = FIELD_DP32(flags, TBFLAG_A32, FPCCR_S_WRONG, 1);
+                        is_q ? 16 : 8, vec_full_reg_size(s), data, fn);
      }
 +    if (arm_feature(env, ARM_FEATURE_M) &&
 +        (env->v7m.fpccr[env->v7m.secure] & R_V7M_FPCCR_ASPEN_MASK) &&
 +        (!(env->v7m.control[M_REG_S] & R_V7M_CONTROL_FPCA_MASK) ||
 +         (env->v7m.secure &&
 +          !(env->v7m.control[M_REG_S] & R_V7M_CONTROL_SFPA_MASK)))) {
 +        /*
 +         * ASPEN is set, but FPCA/SFPA indicate that there is no active
 +         * FP context; we must create a new FP context before executing
 +         * any FP insn.
 +         */
 +        flags = FIELD_DP32(flags, TBFLAG_A32, NEW_FP_CTXT_NEEDED, 1);
 +    }
 +
      *pflags = flags;
      *cs_base = 0;
  }
+-/* Expand a 3-operand + env pointer operation using
+- * an out-of-line helper.
+- */
+-static void gen_gvec_op3_env(DisasContext *s, bool is_q, int rd,
+-                             int rn, int rm, gen_helper_gvec_3_ptr *fn)
+-{
+-    tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, rd),
+-                       vec_full_reg_offset(s, rn),
+-                       vec_full_reg_offset(s, rm), cpu_env,
+-                       is_q ? 16 : 8, vec_full_reg_size(s), 0, fn);
+-}
+-
+ /* Expand a 3-operand + fpstatus pointer + simd data value operation using
+  * an out-of-line helper.
+  */
+@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn)
+     switch (opcode) {
+     case 0x0: /* SQRDMLAH (vector) */
+-        switch (size) {
+-        case 1:
+-            gen_gvec_op3_env(s, is_q, rd, rn, rm, gen_helper_gvec_qrdmlah_s16);
+-            break;
+-        case 2:
+-            gen_gvec_op3_env(s, is_q, rd, rn, rm, gen_helper_gvec_qrdmlah_s32);
+-            break;
+-        default:
+-            g_assert_not_reached();
+-        }
++        gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqrdmlah_qc, size);
+         return;
+     case 0x1: /* SQRDMLSH (vector) */
+-        switch (size) {
+-        case 1:
+-            gen_gvec_op3_env(s, is_q, rd, rn, rm, gen_helper_gvec_qrdmlsh_s16);
+-            break;
+-        case 2:
+-            gen_gvec_op3_env(s, is_q, rd, rn, rm, gen_helper_gvec_qrdmlsh_s32);
+-            break;
+-        default:
+-            g_assert_not_reached();
+-        }
++        gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqrdmlsh_qc, size);
+         return;
+     case 0x2: /* SDOT / UDOT */
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static const uint8_t neon_2rm_sizes[] = {
-             /* Don't need to do this for any further FP insns in this TB */
+     [NEON_2RM_VCVT_UF] = 0x4,
-             s->v8m_fpccr_s_wrong = false;
+ };
-         }
 -
 -/* Expand v8.1 simd helper.  */
 -static int do_v81_helper(DisasContext *s, gen_helper_gvec_3_ptr *fn,
 -                         int q, int rd, int rn, int rm)
 +void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                          uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
  {
 -    if (dc_isar_feature(aa32_rdm, s)) {
 -        int opr_sz = (1 + q) * 8;
 -        tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd),
 -                           vfp_reg_offset(1, rn),
 -                           vfp_reg_offset(1, rm), cpu_env,
 -                           opr_sz, opr_sz, 0, fn);
 -        return 0;
 -    }
 -    return 1;
 +    static gen_helper_gvec_3_ptr * const fns[2] = {
 +        gen_helper_gvec_qrdmlah_s16, gen_helper_gvec_qrdmlah_s32
 +    };
 +    tcg_debug_assert(vece >= 1 && vece <= 2);
 +    tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, cpu_env,
 +                       opr_sz, max_sz, 0, fns[vece - 1]);
 +}
 +
-+        if (s->v7m_new_fp_ctxt_needed) {
++void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
-+            /*
++                          uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
-+             * Create new FP context by updating CONTROL.FPCA, CONTROL.SFPA
++{
-+             * and the FPSCR.
++    static gen_helper_gvec_3_ptr * const fns[2] = {
-+             */
++        gen_helper_gvec_qrdmlsh_s16, gen_helper_gvec_qrdmlsh_s32
-+            TCGv_i32 control, fpscr;
++    };
-+            uint32_t bits = R_V7M_CONTROL_FPCA_MASK;
++    tcg_debug_assert(vece >= 1 && vece <= 2);
-+
++    tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, cpu_env,
-+            fpscr = load_cpu_field(v7m.fpdscr[s->v8m_secure]);
++                       opr_sz, max_sz, 0, fns[vece - 1]);
-+            gen_helper_vfp_set_fpscr(cpu_env, fpscr);
+ }
-+            tcg_temp_free_i32(fpscr);
-+            /*
+ #define GEN_CMP0(NAME, COND)                                            \
-+             * We don't need to arrange to end the TB, because the only
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-+             * parts of FPSCR which we cache in the TB flags are the VECLEN
+                 break;  /* VPADD */
-+             * and VECSTRIDE, and those don't exist for M-profile.
+             }
-+             */
+             /* VQRDMLAH */
-+
+-            switch (size) {
-+            if (s->v8m_secure) {
+-            case 1:
-+                bits |= R_V7M_CONTROL_SFPA_MASK;
+-                return do_v81_helper(s, gen_helper_gvec_qrdmlah_s16,
-+            }
+-                                     q, rd, rn, rm);
-+            control = load_cpu_field(v7m.control[M_REG_S]);
+-            case 2:
-+            tcg_gen_ori_i32(control, control, bits);
+-                return do_v81_helper(s, gen_helper_gvec_qrdmlah_s32,
-+            store_cpu_field(control, v7m.control[M_REG_S]);
+-                                     q, rd, rn, rm);
-+            /* Don't need to do this for any further FP insns in this TB */
++            if (dc_isar_feature(aa32_rdm, s) && (size == 1 || size == 2)) {
-+            s->v7m_new_fp_ctxt_needed = false;
++                gen_gvec_sqrdmlah_qc(size, rd_ofs, rn_ofs, rm_ofs,
-+        }
++                                     vec_size, vec_size);
-     }
++                return 0;
+             }
-     if (extract32(insn, 28, 4) == 0xf) {
+             return 1;
-@@ -XXX,XX +XXX,XX @@ static void arm_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
-         regime_is_secure(env, dc->mmu_idx);
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-     dc->v8m_stackcheck = FIELD_EX32(tb_flags, TBFLAG_A32, STACKCHECK);
+                 break;
-     dc->v8m_fpccr_s_wrong = FIELD_EX32(tb_flags, TBFLAG_A32, FPCCR_S_WRONG);
+             }
-+    dc->v7m_new_fp_ctxt_needed =
+             /* VQRDMLSH */
-+        FIELD_EX32(tb_flags, TBFLAG_A32, NEW_FP_CTXT_NEEDED);
+-            switch (size) {
-     dc->cp_regs = cpu->cp_regs;
+-            case 1:
-     dc->features = env->features;
+-                return do_v81_helper(s, gen_helper_gvec_qrdmlsh_s16,
 -                                     q, rd, rn, rm);
 -            case 2:
 -                return do_v81_helper(s, gen_helper_gvec_qrdmlsh_s32,
 -                                     q, rd, rn, rm);
 +            if (dc_isar_feature(aa32_rdm, s) && (size == 1 || size == 2)) {
 +                gen_gvec_sqrdmlsh_qc(size, rd_ofs, rn_ofs, rm_ofs,
 +                                     vec_size, vec_size);
 +                return 0;
              }
              return 1;
 --
 .20.1
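
Note on the pattern in patch 13: the new gen_gvec_sqrdmlah_qc/gen_gvec_sqrdmlsh_qc entry points take the element size (vece) as an argument and pick the out-of-line helper from a small table, so callers no longer need their own switch on size. The following is only a standalone sketch of that table-dispatch shape, not QEMU code. All names prefixed with sketch_ are hypothetical, and a plain lane-wise add stands in for the real helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper signature standing in for gen_helper_gvec_3_ptr. */
typedef void sketch_helper_fn(void *vd, const void *vn, const void *vm,
                              size_t opr_sz);

/* Lane-wise add on 16-bit elements; opr_sz is the operation size in bytes. */
static void sketch_add_u16(void *vd, const void *vn, const void *vm,
                           size_t opr_sz)
{
    uint16_t *d = vd;
    const uint16_t *n = vn, *m = vm;
    for (size_t i = 0; i < opr_sz / 2; ++i) {
        d[i] = (uint16_t)(n[i] + m[i]);
    }
}

/* Lane-wise add on 32-bit elements. */
static void sketch_add_u32(void *vd, const void *vn, const void *vm,
                           size_t opr_sz)
{
    uint32_t *d = vd;
    const uint32_t *n = vn, *m = vm;
    for (size_t i = 0; i < opr_sz / 4; ++i) {
        d[i] = n[i] + m[i];
    }
}

/*
 * Functional interface: vece is log2 of the element size in bytes
 * (1 = 16-bit, 2 = 32-bit). Only those two sizes are supported, so
 * the table is indexed with vece - 1, as in the patch.
 */
static void sketch_gvec_add(unsigned vece, void *vd, const void *vn,
                            const void *vm, size_t opr_sz)
{
    static sketch_helper_fn * const fns[2] = {
        sketch_add_u16, sketch_add_u32
    };
    assert(vece >= 1 && vece <= 2);
    fns[vece - 1](vd, vn, vm, opr_sz);
}
```

A caller passes the element size straight through, for example sketch_gvec_add(1, d, n, m, 8) for four 16-bit lanes. The size check then lives in one place instead of in every decoder.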

-[Qemu-devel] [PULL 12/42] target/arm/helper: don't return early for STKOF faults during stacking
+[PULL 14/45] target/arm: Pass pointer to qc to qrdmla/qrdmls
-Currently the code in v7m_push_stack() which detects a violation
+From: Richard Henderson <richard.henderson@linaro.org>
-of the v8M stack limit simply returns early if it does so. This
-is OK for the current integer-only code, but won't work for the
+Pass a pointer directly to env->vfp.qc[0], rather than env.
-floating point handling we're about to add. We need to continue
+This will allow SVE2, which does not modify QC, to pass a
-executing the rest of the function so that we check for other
+pointer to dummy storage.
-exceptions like not having permission to use the FPU and so
-that we correctly set the FPCCR state if we are doing lazy
+Change the return type of inl_qrdml.h_s16 to match the
-stacking. Refactor to avoid the early return.
+sense of the operation: signed.
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
 Message-id: 20200513163245.17915-14-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-10-peter.maydell@linaro.org
 ---
- target/arm/helper.c | 23 ++++++++++++++++++-----
+ target/arm/translate.c  | 18 ++++++++---
-file changed, 18 insertions(+), 5 deletions(-)
+ target/arm/vec_helper.c | 70 +++++++++++++++++++++++------------------
+files changed, 54 insertions(+), 34 deletions(-)
-diff --git a/target/arm/helper.c b/target/arm/helper.c
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
+--- a/target/arm/translate.c
-+++ b/target/arm/helper.c
++++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static bool v7m_push_stack(ARMCPU *cpu)
+@@ -XXX,XX +XXX,XX @@ static const uint8_t neon_2rm_sizes[] = {
-      * should ignore further stack faults trying to process
+     [NEON_2RM_VCVT_UF] = 0x4,
-      * that derived exception.)
+ };
-      */
--    bool stacked_ok;
++static void gen_gvec_fn3_qc(uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs,
-+    bool stacked_ok = true, limitviol = false;
++                            uint32_t opr_sz, uint32_t max_sz,
-     CPUARMState *env = &cpu->env;
++                            gen_helper_gvec_3_ptr *fn)
-     uint32_t xpsr = xpsr_read(env);
++{
-     uint32_t frameptr = env->regs[13];
++    TCGv_ptr qc_ptr = tcg_temp_new_ptr();
-@@ -XXX,XX +XXX,XX @@ static bool v7m_push_stack(ARMCPU *cpu)
++
-             armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE,
++    tcg_gen_addi_ptr(qc_ptr, cpu_env, offsetof(CPUARMState, vfp.qc));
-                                     env->v7m.secure);
++    tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, qc_ptr,
-             env->regs[13] = limit;
++                       opr_sz, max_sz, 0, fn);
--            return true;
++    tcg_temp_free_ptr(qc_ptr);
-+            /*
++}
-+             * We won't try to perform any further memory accesses but
++
-+             * we must continue through the following code to check for
+ void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
-+             * permission faults during FPU state preservation, and we
+                           uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
-+             * must update FPCCR if lazy stacking is enabled.
+ {
-+             */
+@@ -XXX,XX +XXX,XX @@ void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
-+            limitviol = true;
+         gen_helper_gvec_qrdmlah_s16, gen_helper_gvec_qrdmlah_s32
-+            stacked_ok = false;
+     };
-         }
+     tcg_debug_assert(vece >= 1 && vece <= 2);
-     }
+-    tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, cpu_env,
+-                       opr_sz, max_sz, 0, fns[vece - 1]);
-@@ -XXX,XX +XXX,XX @@ static bool v7m_push_stack(ARMCPU *cpu)
++    gen_gvec_fn3_qc(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, fns[vece - 1]);
-      * (which may be taken in preference to the one we started with
+ }
-      * if it has higher priority).
-      */
+ void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
--    stacked_ok =
+@@ -XXX,XX +XXX,XX @@ void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
-+    stacked_ok = stacked_ok &&
+         gen_helper_gvec_qrdmlsh_s16, gen_helper_gvec_qrdmlsh_s32
-         v7m_stack_write(cpu, frameptr, env->regs[0], mmu_idx, false) &&
+     };
-         v7m_stack_write(cpu, frameptr + 4, env->regs[1], mmu_idx, false) &&
+     tcg_debug_assert(vece >= 1 && vece <= 2);
-         v7m_stack_write(cpu, frameptr + 8, env->regs[2], mmu_idx, false) &&
+-    tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, cpu_env,
-@@ -XXX,XX +XXX,XX @@ static bool v7m_push_stack(ARMCPU *cpu)
+-                       opr_sz, max_sz, 0, fns[vece - 1]);
-         v7m_stack_write(cpu, frameptr + 24, env->regs[15], mmu_idx, false) &&
++    gen_gvec_fn3_qc(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, fns[vece - 1]);
-         v7m_stack_write(cpu, frameptr + 28, xpsr, mmu_idx, false);
+ }
--    /* Update SP regardless of whether any of the stack accesses failed. */
+ #define GEN_CMP0(NAME, COND)                                            \
--    env->regs[13] = frameptr;
+diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
-+    /*
+index XXXXXXX..XXXXXXX 100644
-+     * If we broke a stack limit then SP was already updated earlier;
+--- a/target/arm/vec_helper.c
-+     * otherwise we update SP regardless of whether any of the stack
++++ b/target/arm/vec_helper.c
-+     * accesses failed or we took some other kind of fault.
+@@ -XXX,XX +XXX,XX @@
-+     */
+ #define H4(x)  (x)
-+    if (!limitviol) {
+ #endif
-+        env->regs[13] = frameptr;
-+    }
+-#define SET_QC() env->vfp.qc[0] = 1
+-
-     return !stacked_ok;
+ static void clear_tail(void *vd, uintptr_t opr_sz, uintptr_t max_sz)
  {
      uint64_t *d = vd + opr_sz;
@@ -XXX,XX +XXX,XX @@ static void clear_tail(void *vd, uintptr_t opr_sz, uintptr_t max_sz)
  }
  /* Signed saturating rounding doubling multiply-accumulate high half, 16-bit */
 -static uint16_t inl_qrdmlah_s16(CPUARMState *env, int16_t src1,
 -                                int16_t src2, int16_t src3)
 +static int16_t inl_qrdmlah_s16(int16_t src1, int16_t src2,
 +                               int16_t src3, uint32_t *sat)
  {
      /* Simplify:
       * = ((a3 << 16) + ((e1 * e2) << 1) + (1 << 15)) >> 16
@@ -XXX,XX +XXX,XX @@ static uint16_t inl_qrdmlah_s16(CPUARMState *env, int16_t src1,
      ret = ((int32_t)src3 << 15) + ret + (1 << 14);
      ret >>= 15;
      if (ret != (int16_t)ret) {
 -        SET_QC();
 +        *sat = 1;
          ret = (ret < 0 ? -0x8000 : 0x7fff);
      }
      return ret;
@@ -XXX,XX +XXX,XX @@ static uint16_t inl_qrdmlah_s16(CPUARMState *env, int16_t src1,
  uint32_t HELPER(neon_qrdmlah_s16)(CPUARMState *env, uint32_t src1,
                                    uint32_t src2, uint32_t src3)
  {
 -    uint16_t e1 = inl_qrdmlah_s16(env, src1, src2, src3);
 -    uint16_t e2 = inl_qrdmlah_s16(env, src1 >> 16, src2 >> 16, src3 >> 16);
 +    uint32_t *sat = &env->vfp.qc[0];
 +    uint16_t e1 = inl_qrdmlah_s16(src1, src2, src3, sat);
 +    uint16_t e2 = inl_qrdmlah_s16(src1 >> 16, src2 >> 16, src3 >> 16, sat);
      return deposit32(e1, 16, 16, e2);
  }
  void HELPER(gvec_qrdmlah_s16)(void *vd, void *vn, void *vm,
 -                              void *ve, uint32_t desc)
 +                              void *vq, uint32_t desc)
  {
      uintptr_t opr_sz = simd_oprsz(desc);
      int16_t *d = vd;
      int16_t *n = vn;
      int16_t *m = vm;
 -    CPUARMState *env = ve;
      uintptr_t i;
      for (i = 0; i < opr_sz / 2; ++i) {
 -        d[i] = inl_qrdmlah_s16(env, n[i], m[i], d[i]);
 +        d[i] = inl_qrdmlah_s16(n[i], m[i], d[i], vq);
      }
      clear_tail(d, opr_sz, simd_maxsz(desc));
  }
  /* Signed saturating rounding doubling multiply-subtract high half, 16-bit */
 -static uint16_t inl_qrdmlsh_s16(CPUARMState *env, int16_t src1,
 -                                int16_t src2, int16_t src3)
 +static int16_t inl_qrdmlsh_s16(int16_t src1, int16_t src2,
 +                               int16_t src3, uint32_t *sat)
  {
      /* Similarly, using subtraction:
       * = ((a3 << 16) - ((e1 * e2) << 1) + (1 << 15)) >> 16
@@ -XXX,XX +XXX,XX @@ static uint16_t inl_qrdmlsh_s16(CPUARMState *env, int16_t src1,
      ret = ((int32_t)src3 << 15) - ret + (1 << 14);
      ret >>= 15;
      if (ret != (int16_t)ret) {
 -        SET_QC();
 +        *sat = 1;
          ret = (ret < 0 ? -0x8000 : 0x7fff);
      }
      return ret;
@@ -XXX,XX +XXX,XX @@ static uint16_t inl_qrdmlsh_s16(CPUARMState *env, int16_t src1,
  uint32_t HELPER(neon_qrdmlsh_s16)(CPUARMState *env, uint32_t src1,
                                    uint32_t src2, uint32_t src3)
  {
 -    uint16_t e1 = inl_qrdmlsh_s16(env, src1, src2, src3);
 -    uint16_t e2 = inl_qrdmlsh_s16(env, src1 >> 16, src2 >> 16, src3 >> 16);
 +    uint32_t *sat = &env->vfp.qc[0];
 +    uint16_t e1 = inl_qrdmlsh_s16(src1, src2, src3, sat);
 +    uint16_t e2 = inl_qrdmlsh_s16(src1 >> 16, src2 >> 16, src3 >> 16, sat);
      return deposit32(e1, 16, 16, e2);
  }
  void HELPER(gvec_qrdmlsh_s16)(void *vd, void *vn, void *vm,
 -                              void *ve, uint32_t desc)
 +                              void *vq, uint32_t desc)
  {
      uintptr_t opr_sz = simd_oprsz(desc);
      int16_t *d = vd;
      int16_t *n = vn;
      int16_t *m = vm;
 -    CPUARMState *env = ve;
      uintptr_t i;
      for (i = 0; i < opr_sz / 2; ++i) {
 -        d[i] = inl_qrdmlsh_s16(env, n[i], m[i], d[i]);
 +        d[i] = inl_qrdmlsh_s16(n[i], m[i], d[i], vq);
      }
      clear_tail(d, opr_sz, simd_maxsz(desc));
  }
  /* Signed saturating rounding doubling multiply-accumulate high half, 32-bit */
 -uint32_t HELPER(neon_qrdmlah_s32)(CPUARMState *env, int32_t src1,
 -                                  int32_t src2, int32_t src3)
 +static int32_t inl_qrdmlah_s32(int32_t src1, int32_t src2,
 +                               int32_t src3, uint32_t *sat)
  {
      /* Simplify similarly to int_qrdmlah_s16 above.  */
      int64_t ret = (int64_t)src1 * src2;
      ret = ((int64_t)src3 << 31) + ret + (1 << 30);
      ret >>= 31;
      if (ret != (int32_t)ret) {
 -        SET_QC();
 +        *sat = 1;
          ret = (ret < 0 ? INT32_MIN : INT32_MAX);
      }
      return ret;
  }
 +uint32_t HELPER(neon_qrdmlah_s32)(CPUARMState *env, int32_t src1,
 +                                  int32_t src2, int32_t src3)
 +{
 +    uint32_t *sat = &env->vfp.qc[0];
 +    return inl_qrdmlah_s32(src1, src2, src3, sat);
 +}
 +
  void HELPER(gvec_qrdmlah_s32)(void *vd, void *vn, void *vm,
 -                              void *ve, uint32_t desc)
 +                              void *vq, uint32_t desc)
  {
      uintptr_t opr_sz = simd_oprsz(desc);
      int32_t *d = vd;
      int32_t *n = vn;
      int32_t *m = vm;
 -    CPUARMState *env = ve;
      uintptr_t i;
      for (i = 0; i < opr_sz / 4; ++i) {
 -        d[i] = helper_neon_qrdmlah_s32(env, n[i], m[i], d[i]);
 +        d[i] = inl_qrdmlah_s32(n[i], m[i], d[i], vq);
      }
      clear_tail(d, opr_sz, simd_maxsz(desc));
  }
  /* Signed saturating rounding doubling multiply-subtract high half, 32-bit */
 -uint32_t HELPER(neon_qrdmlsh_s32)(CPUARMState *env, int32_t src1,
 -                                  int32_t src2, int32_t src3)
 +static int32_t inl_qrdmlsh_s32(int32_t src1, int32_t src2,
 +                               int32_t src3, uint32_t *sat)
  {
      /* Simplify similarly to int_qrdmlsh_s16 above.  */
      int64_t ret = (int64_t)src1 * src2;
      ret = ((int64_t)src3 << 31) - ret + (1 << 30);
      ret >>= 31;
      if (ret != (int32_t)ret) {
 -        SET_QC();
 +        *sat = 1;
          ret = (ret < 0 ? INT32_MIN : INT32_MAX);
      }
      return ret;
  }
 +uint32_t HELPER(neon_qrdmlsh_s32)(CPUARMState *env, int32_t src1,
 +                                  int32_t src2, int32_t src3)
 +{
 +    uint32_t *sat = &env->vfp.qc[0];
 +    return inl_qrdmlsh_s32(src1, src2, src3, sat);
 +}
 +
  void HELPER(gvec_qrdmlsh_s32)(void *vd, void *vn, void *vm,
 -                              void *ve, uint32_t desc)
 +                              void *vq, uint32_t desc)
  {
      uintptr_t opr_sz = simd_oprsz(desc);
      int32_t *d = vd;
      int32_t *n = vn;
      int32_t *m = vm;
 -    CPUARMState *env = ve;
      uintptr_t i;
      for (i = 0; i < opr_sz / 4; ++i) {
 -        d[i] = helper_neon_qrdmlsh_s32(env, n[i], m[i], d[i]);
 +        d[i] = inl_qrdmlsh_s32(n[i], m[i], d[i], vq);
      }
      clear_tail(d, opr_sz, simd_maxsz(desc));
  }
 --
 .20.1
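
Note on the pattern in patch 14: the inner arithmetic routines now receive a pointer to the saturation flag instead of the whole CPU state, so a caller that must not touch QC can pass dummy storage. The sketch below is a standalone illustration of that idea for the 16-bit multiply-accumulate case. The sketch_ names are hypothetical. The initial product line is an assumption (it falls outside the hunks shown above, and is modelled on the 32-bit variant). The sketch uses a multiply rather than a left shift, and an explicit range check rather than a narrowing cast, so it stays within portable C.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Signed saturating rounding doubling multiply-accumulate, high half,
 * 16-bit. Sets *sat to 1 on saturation and never clears it (the flag
 * is sticky, like QC).
 */
static int16_t sketch_qrdmlah_s16(int16_t src1, int16_t src2,
                                  int16_t src3, uint32_t *sat)
{
    int32_t ret = (int32_t)src1 * src2;          /* assumed first step */
    ret = (int32_t)src3 * (1 << 15) + ret + (1 << 14);
    ret >>= 15;   /* assumes arithmetic right shift of negative values */
    if (ret < INT16_MIN || ret > INT16_MAX) {
        *sat = 1;
        ret = (ret < 0 ? INT16_MIN : INT16_MAX);
    }
    return (int16_t)ret;
}

/* Vector form: every lane reports saturation through the same pointer. */
static void sketch_gvec_qrdmlah_s16(int16_t *d, const int16_t *n,
                                    const int16_t *m, size_t count,
                                    uint32_t *sat)
{
    for (size_t i = 0; i < count; ++i) {
        d[i] = sketch_qrdmlah_s16(n[i], m[i], d[i], sat);
    }
}
```

Passing the address of the real flag gives the AdvSIMD behaviour. Passing the address of a throwaway uint32_t gives the "does not modify QC" behaviour that the commit message mentions for SVE2, with no change to the arithmetic.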

-[Qemu-devel] [PULL 32/42] hw/arm/nseries: Use TYPE_TMP105 instead of hardcoded string
+[PULL 15/45] target/arm: Clear tail in gvec_fmul_idx_*, gvec_fmla_idx_*
-From: Philippe Mathieu-Daudé <philmd@redhat.com>
+From: Richard Henderson <richard.henderson@linaro.org>
-Suggested-by: Markus Armbruster <armbru@redhat.com>
+Must clear the tail for AdvSIMD when SVE is enabled.
-Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
-Message-id: 20190412165416.7977-3-philmd@redhat.com
+Fixes: ca40a6e6e39
 Cc: qemu-stable@nongnu.org
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20200513163245.17915-15-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- hw/arm/nseries.c | 3 ++-
+ target/arm/vec_helper.c | 2 ++
-file changed, 2 insertions(+), 1 deletion(-)
+file changed, 2 insertions(+)
-diff --git a/hw/arm/nseries.c b/hw/arm/nseries.c
+diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/nseries.c
+--- a/target/arm/vec_helper.c
-+++ b/hw/arm/nseries.c
++++ b/target/arm/vec_helper.c
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
- #include "hw/boards.h"
+             d[i + j] = TYPE##_mul(n[i + j], mm, stat);                     \
- #include "hw/i2c/i2c.h"
+         }                                                                  \
- #include "hw/devices.h"
+     }                                                                      \
-+#include "hw/misc/tmp105.h"
++    clear_tail(d, oprsz, simd_maxsz(desc));                                \
- #include "hw/block/flash.h"
- #include "hw/hw.h"
- #include "hw/bt.h"
-@@ -XXX,XX +XXX,XX @@ static void n8x0_i2c_setup(struct n800_s *s)
-     qemu_register_powerdown_notifier(&n8x0_system_powerdown_notifier);
-     /* Attach a TMP105 PM chip (A0 wired to ground) */
--    dev = i2c_create_slave(i2c, "tmp105", N8X0_TMP105_ADDR);
-+    dev = i2c_create_slave(i2c, TYPE_TMP105, N8X0_TMP105_ADDR);
-     qdev_connect_gpio_out(dev, 0, tmp_irq);
- }
+ DO_MUL_IDX(gvec_fmul_idx_h, float16, H2)
+@@ -XXX,XX +XXX,XX @@ void HELPER(NAME)(void *vd, void *vn, void *vm, void *va,                  \
+                                      mm, a[i + j], 0, stat);               \
+         }                                                                  \
+     }                                                                      \
++    clear_tail(d, oprsz, simd_maxsz(desc));                                \
+ }
+ DO_FMLA_IDX(gvec_fmla_idx_h, float16, H2)
 --
 .20.1
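
Note on patch 15/45: the commit message says the tail must be cleared for AdvSIMD when SVE is enabled. A minimal sketch of that idea follows, assuming a simplified integer model rather than the float16/float32 helpers in vec_helper.c. The names sketch_clear_tail and sketch_mul_idx are hypothetical and are not QEMU functions; the real helpers also select the multiplier per 128-bit segment, which this sketch leaves out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for clear_tail(): zero the bytes in
 * [opr_sz, max_sz) so that the part of the wider register not
 * written by the operation does not keep stale data.
 */
static void sketch_clear_tail(void *vd, size_t opr_sz, size_t max_sz)
{
    assert(opr_sz <= max_sz);
    memset((uint8_t *)vd + opr_sz, 0, max_sz - opr_sz);
}

/*
 * Hypothetical indexed multiply over 32-bit integer lanes.
 * Only opr_sz bytes are computed; the rest of the max_sz-byte
 * destination is cleared afterwards.
 */
static void sketch_mul_idx(uint32_t *d, const uint32_t *n, uint32_t mm,
                           size_t opr_sz, size_t max_sz)
{
    for (size_t i = 0; i < opr_sz / sizeof(uint32_t); i++) {
        d[i] = n[i] * mm;
    }
    sketch_clear_tail(d, opr_sz, max_sz);
}
```

With a 16-byte operation on a 32-byte destination, the upper 16 bytes read back as zero after the call, whatever they held before.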

-[Qemu-devel] [PULL 21/42] target/arm: Set FPCCR.S when executing M-profile floating point insns
+[PULL 16/45] target/arm: Vectorize SABD/UABD
-The M-profile FPCCR.S bit indicates the security status of
+From: Richard Henderson <richard.henderson@linaro.org>
-the floating point context. In the pseudocode ExecuteFPCheck()
-function it is unconditionally set to match the current
+Include 64-bit element size in preparation for SVE2.
-security state whenever a floating point instruction is
-executed.
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Implement this by adding a new TB flag which tracks whether
+Message-id: 20200513163245.17915-16-richard.henderson@linaro.org
-FPCCR.S is different from the current security state, so
-that we only need to emit the code to update it in the
-less-common case when it is not already set correctly.
-Note that we will add the handling for the other work done
-by ExecuteFPCheck() in later commits.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-19-peter.maydell@linaro.org
 ---
- target/arm/cpu.h       |  2 ++
+ target/arm/helper.h        |  10 +++
- target/arm/translate.h |  1 +
+ target/arm/translate.h     |   5 ++
- target/arm/helper.c    |  5 +++++
+ target/arm/translate-a64.c |   8 ++-
- target/arm/translate.c | 20 ++++++++++++++++++++
+ target/arm/translate.c     | 133 ++++++++++++++++++++++++++++++++++++-
-files changed, 28 insertions(+)
+ target/arm/vec_helper.c    |  24 +++++++
+files changed, 176 insertions(+), 4 deletions(-)
-diff --git a/target/arm/cpu.h b/target/arm/cpu.h
-index XXXXXXX..XXXXXXX 100644
+diff --git a/target/arm/helper.h b/target/arm/helper.h
---- a/target/arm/cpu.h
+index XXXXXXX..XXXXXXX 100644
-+++ b/target/arm/cpu.h
+--- a/target/arm/helper.h
-@@ -XXX,XX +XXX,XX @@ FIELD(TBFLAG_A32, NS, 6, 1)
++++ b/target/arm/helper.h
- FIELD(TBFLAG_A32, VFPEN, 7, 1)
+@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(gvec_sli_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
- FIELD(TBFLAG_A32, CONDEXEC, 8, 8)
+ DEF_HELPER_FLAGS_3(gvec_sli_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
- FIELD(TBFLAG_A32, SCTLR_B, 16, 1)
+ DEF_HELPER_FLAGS_3(gvec_sli_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
-+/* For M profile only, set if FPCCR.S does not match current security state */
-+FIELD(TBFLAG_A32, FPCCR_S_WRONG, 20, 1)
++DEF_HELPER_FLAGS_4(gvec_sabd_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
- /* For M profile only, Handler (ie not Thread) mode */
++DEF_HELPER_FLAGS_4(gvec_sabd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
- FIELD(TBFLAG_A32, HANDLER, 21, 1)
++DEF_HELPER_FLAGS_4(gvec_sabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
- /* For M profile only, whether we should generate stack-limit checks */
++DEF_HELPER_FLAGS_4(gvec_sabd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 +
 +DEF_HELPER_FLAGS_4(gvec_uabd_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_4(gvec_uabd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_4(gvec_uabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_4(gvec_uabd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 +
  #ifdef TARGET_AARCH64
  #include "helper-a64.h"
  #include "helper-sve.h"
 diff --git a/target/arm/translate.h b/target/arm/translate.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.h
 +++ b/target/arm/translate.h
-@@ -XXX,XX +XXX,XX @@ typedef struct DisasContext {
+@@ -XXX,XX +XXX,XX @@ void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
-     bool v7m_handler_mode;
+ void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
-     bool v8m_secure; /* true if v8M and we're in Secure mode */
+                           uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
-     bool v8m_stackcheck; /* true if we need to perform v8M stack limit checks */
-+    bool v8m_fpccr_s_wrong; /* true if v8M FPCCR.S != v8m_secure */
++void gen_gvec_sabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
-     /* Immediate value in AArch32 SVC insn; must be set if is_jmp == DISAS_SWI
++                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
-      * so that top level loop can generate correct syndrome information.
++void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
-      */
++                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
-diff --git a/target/arm/helper.c b/target/arm/helper.c
++
-index XXXXXXX..XXXXXXX 100644
+ /*
---- a/target/arm/helper.c
+  * Forward to the isar_feature_* tests given a DisasContext pointer.
-+++ b/target/arm/helper.c
+  */
-@@ -XXX,XX +XXX,XX @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
+diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
-         flags = FIELD_DP32(flags, TBFLAG_A32, STACKCHECK, 1);
+index XXXXXXX..XXXXXXX 100644
-     }
+--- a/target/arm/translate-a64.c
++++ b/target/arm/translate-a64.c
-+    if (arm_feature(env, ARM_FEATURE_M_SECURITY) &&
+@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
-+        FIELD_EX32(env->v7m.fpccr[M_REG_S], V7M_FPCCR, S) != env->v7m.secure) {
+             gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_smin, size);
-+        flags = FIELD_DP32(flags, TBFLAG_A32, FPCCR_S_WRONG, 1);
+         }
-+    }
+         return;
-+
++    case 0xe: /* SABD, UABD */
-     *pflags = flags;
++        if (u) {
-     *cs_base = 0;
++            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uabd, size);
- }
++        } else {
 +            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sabd, size);
 +        }
 +        return;
      case 0x10: /* ADD, SUB */
          if (u) {
              gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_sub, size);
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
                  genenvfn = fns[size][u];
                  break;
              }
 -            case 0xe: /* SABD, UABD */
              case 0xf: /* SABA, UABA */
              {
                  static NeonGenTwoOpFn * const fns[3][2] = {
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ void gen_gvec_sqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
-         }
+                    rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
-     }
+ }
-+    if (arm_dc_feature(s, ARM_FEATURE_M)) {
++static void gen_sabd_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
-+        /* Handle M-profile lazy FP state mechanics */
++{
-+
++    TCGv_i32 t = tcg_temp_new_i32();
-+        /* Update ownership of FP context: set FPCCR.S to match current state */
++
-+        if (s->v8m_fpccr_s_wrong) {
++    tcg_gen_sub_i32(t, a, b);
-+            TCGv_i32 tmp;
++    tcg_gen_sub_i32(d, b, a);
-+
++    tcg_gen_movcond_i32(TCG_COND_LT, d, a, b, d, t);
-+            tmp = load_cpu_field(v7m.fpccr[M_REG_S]);
++    tcg_temp_free_i32(t);
-+            if (s->v8m_secure) {
++}
-+                tcg_gen_ori_i32(tmp, tmp, R_V7M_FPCCR_S_MASK);
++
 +static void gen_sabd_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
 +{
 +    TCGv_i64 t = tcg_temp_new_i64();
 +
 +    tcg_gen_sub_i64(t, a, b);
 +    tcg_gen_sub_i64(d, b, a);
 +    tcg_gen_movcond_i64(TCG_COND_LT, d, a, b, d, t);
 +    tcg_temp_free_i64(t);
 +}
 +
 +static void gen_sabd_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
 +{
 +    TCGv_vec t = tcg_temp_new_vec_matching(d);
 +
 +    tcg_gen_smin_vec(vece, t, a, b);
 +    tcg_gen_smax_vec(vece, d, a, b);
 +    tcg_gen_sub_vec(vece, d, d, t);
 +    tcg_temp_free_vec(t);
 +}
 +
 +void gen_gvec_sabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = {
 +        INDEX_op_sub_vec, INDEX_op_smin_vec, INDEX_op_smax_vec, 0
 +    };
 +    static const GVecGen3 ops[4] = {
 +        { .fniv = gen_sabd_vec,
 +          .fno = gen_helper_gvec_sabd_b,
 +          .opt_opc = vecop_list,
 +          .vece = MO_8 },
 +        { .fniv = gen_sabd_vec,
 +          .fno = gen_helper_gvec_sabd_h,
 +          .opt_opc = vecop_list,
 +          .vece = MO_16 },
 +        { .fni4 = gen_sabd_i32,
 +          .fniv = gen_sabd_vec,
 +          .fno = gen_helper_gvec_sabd_s,
 +          .opt_opc = vecop_list,
 +          .vece = MO_32 },
 +        { .fni8 = gen_sabd_i64,
 +          .fniv = gen_sabd_vec,
 +          .fno = gen_helper_gvec_sabd_d,
 +          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +          .opt_opc = vecop_list,
 +          .vece = MO_64 },
 +    };
 +    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 +}
 +
 +static void gen_uabd_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
 +{
 +    TCGv_i32 t = tcg_temp_new_i32();
 +
 +    tcg_gen_sub_i32(t, a, b);
 +    tcg_gen_sub_i32(d, b, a);
 +    tcg_gen_movcond_i32(TCG_COND_LTU, d, a, b, d, t);
 +    tcg_temp_free_i32(t);
 +}
 +
 +static void gen_uabd_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
 +{
 +    TCGv_i64 t = tcg_temp_new_i64();
 +
 +    tcg_gen_sub_i64(t, a, b);
 +    tcg_gen_sub_i64(d, b, a);
 +    tcg_gen_movcond_i64(TCG_COND_LTU, d, a, b, d, t);
 +    tcg_temp_free_i64(t);
 +}
 +
 +static void gen_uabd_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
 +{
 +    TCGv_vec t = tcg_temp_new_vec_matching(d);
 +
 +    tcg_gen_umin_vec(vece, t, a, b);
 +    tcg_gen_umax_vec(vece, d, a, b);
 +    tcg_gen_sub_vec(vece, d, d, t);
 +    tcg_temp_free_vec(t);
 +}
 +
 +void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = {
 +        INDEX_op_sub_vec, INDEX_op_umin_vec, INDEX_op_umax_vec, 0
 +    };
 +    static const GVecGen3 ops[4] = {
 +        { .fniv = gen_uabd_vec,
 +          .fno = gen_helper_gvec_uabd_b,
 +          .opt_opc = vecop_list,
 +          .vece = MO_8 },
 +        { .fniv = gen_uabd_vec,
 +          .fno = gen_helper_gvec_uabd_h,
 +          .opt_opc = vecop_list,
 +          .vece = MO_16 },
 +        { .fni4 = gen_uabd_i32,
 +          .fniv = gen_uabd_vec,
 +          .fno = gen_helper_gvec_uabd_s,
 +          .opt_opc = vecop_list,
 +          .vece = MO_32 },
 +        { .fni8 = gen_uabd_i64,
 +          .fniv = gen_uabd_vec,
 +          .fno = gen_helper_gvec_uabd_d,
 +          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +          .opt_opc = vecop_list,
 +          .vece = MO_64 },
 +    };
 +    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 +}
 +
  /* Translate a NEON data processing instruction.  Return nonzero if the
     instruction is invalid.
     We process data in a mixture of 32-bit and 64-bit chunks.
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
              }
              return 1;
 +        case NEON_3R_VABD:
 +            if (u) {
 +                gen_gvec_uabd(size, rd_ofs, rn_ofs, rm_ofs,
 +                              vec_size, vec_size);
 +            } else {
-+                tcg_gen_andi_i32(tmp, tmp, ~R_V7M_FPCCR_S_MASK);
++                gen_gvec_sabd(size, rd_ofs, rn_ofs, rm_ofs,
 +                              vec_size, vec_size);
 +            }
-+            store_cpu_field(tmp, v7m.fpccr[M_REG_S]);
++            return 0;
-+            /* Don't need to do this for any further FP insns in this TB */
++
-+            s->v8m_fpccr_s_wrong = false;
+         case NEON_3R_VADD_VSUB:
-+        }
+         case NEON_3R_LOGIC:
-+    }
+         case NEON_3R_VMAX:
-+
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-     if (extract32(insn, 28, 4) == 0xf) {
+         case NEON_3R_VQRSHL:
-         /*
+             GEN_NEON_INTEGER_OP_ENV(qrshl);
-          * Encodings with T=1 (Thumb) or unconditional (ARM):
+             break;
-@@ -XXX,XX +XXX,XX @@ static void arm_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
+-        case NEON_3R_VABD:
-     dc->v8m_secure = arm_feature(env, ARM_FEATURE_M_SECURITY) &&
+-            GEN_NEON_INTEGER_OP(abd);
-         regime_is_secure(env, dc->mmu_idx);
+-            break;
-     dc->v8m_stackcheck = FIELD_EX32(tb_flags, TBFLAG_A32, STACKCHECK);
+         case NEON_3R_VABA:
-+    dc->v8m_fpccr_s_wrong = FIELD_EX32(tb_flags, TBFLAG_A32, FPCCR_S_WRONG);
+             GEN_NEON_INTEGER_OP(abd);
-     dc->cp_regs = cpu->cp_regs;
+             tcg_temp_free_i32(tmp2);
-     dc->features = env->features;
+diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
+index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/vec_helper.c
 +++ b/target/arm/vec_helper.c
@@ -XXX,XX +XXX,XX @@ DO_CMP0(gvec_cgt0_h, int16_t, >)
  DO_CMP0(gvec_cge0_h, int16_t, >=)
  #undef DO_CMP0
 +
 +#define DO_ABD(NAME, TYPE)                                      \
 +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc)  \
 +{                                                               \
 +    intptr_t i, opr_sz = simd_oprsz(desc);                      \
 +    TYPE *d = vd, *n = vn, *m = vm;                             \
 +                                                                \
 +    for (i = 0; i < opr_sz / sizeof(TYPE); ++i) {               \
 +        d[i] = n[i] < m[i] ? m[i] - n[i] : n[i] - m[i];         \
 +    }                                                           \
 +    clear_tail(d, opr_sz, simd_maxsz(desc));                    \
 +}
 +
 +DO_ABD(gvec_sabd_b, int8_t)
 +DO_ABD(gvec_sabd_h, int16_t)
 +DO_ABD(gvec_sabd_s, int32_t)
 +DO_ABD(gvec_sabd_d, int64_t)
 +
 +DO_ABD(gvec_uabd_b, uint8_t)
 +DO_ABD(gvec_uabd_h, uint16_t)
 +DO_ABD(gvec_uabd_s, uint32_t)
 +DO_ABD(gvec_uabd_d, uint64_t)
 +
 +#undef DO_ABD
 --
 .20.1
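
Note on patch 16/45: the patch computes the absolute difference in two ways. The scalar paths and the DO_ABD helper select between the two subtractions, while the vector path computes max minus min. The sketch below checks on 8-bit lanes that both forms give the same bit pattern. All function names here are hypothetical, and the signed results are returned as uint8_t so that the wrapped value (for example 127 - (-128)) is well defined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Select form, as in the DO_ABD macro: n < m ? m - n : n - m. */
static uint8_t abd_select_u8(uint8_t n, uint8_t m)
{
    return (uint8_t)(n < m ? m - n : n - m);
}

/* max - min form, as in gen_uabd_vec. */
static uint8_t abd_minmax_u8(uint8_t n, uint8_t m)
{
    uint8_t lo = n < m ? n : m;
    uint8_t hi = n < m ? m : n;
    return (uint8_t)(hi - lo);
}

/* Signed select form; the result is the wrapped 8-bit pattern. */
static uint8_t sabd_select_s8(int8_t n, int8_t m)
{
    return (uint8_t)(n < m ? m - n : n - m);
}

/* Signed max - min form, as in gen_sabd_vec. */
static uint8_t sabd_minmax_s8(int8_t n, int8_t m)
{
    int8_t lo = n < m ? n : m;
    int8_t hi = n < m ? m : n;
    return (uint8_t)(hi - lo);
}

/* Count the 8-bit input pairs on which the two forms disagree. */
static unsigned count_abd_mismatches(void)
{
    unsigned bad = 0;

    for (int n = 0; n < 256; n++) {
        for (int m = 0; m < 256; m++) {
            int8_t sn = (int8_t)(n - 128);
            int8_t sm = (int8_t)(m - 128);

            bad += abd_select_u8((uint8_t)n, (uint8_t)m) !=
                   abd_minmax_u8((uint8_t)n, (uint8_t)m);
            bad += sabd_select_s8(sn, sm) != sabd_minmax_s8(sn, sm);
        }
    }
    return bad;
}

/* Assert that no 8-bit input pair distinguishes the two forms. */
static void check_abd_forms_agree(void)
{
    assert(count_abd_mismatches() == 0);
}
```

The exhaustive loop covers 8-bit lanes only; the same argument applies to wider lanes, but this sketch does not test them.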

-[Qemu-devel] [PULL 02/42] hw/ssi/xilinx_spips: Avoid variable length array
+[PULL 17/45] target/arm: Vectorize SABA/UABA
-In the stripe8() function we use a variable length array; however
+From: Richard Henderson <richard.henderson@linaro.org>
-we know that the maximum length required is MAX_NUM_BUSSES. Use
-a fixed-length array and an assert instead.
+Include 64-bit element size in preparation for SVE2.
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
 Message-id: 20200513163245.17915-17-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
-Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com>
-Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
-Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
-Message-id: 20190328152635.2794-1-peter.maydell@linaro.org
 ---
- hw/ssi/xilinx_spips.c | 6 ++++--
+ target/arm/helper.h        |  17 +++--
-file changed, 4 insertions(+), 2 deletions(-)
+ target/arm/translate.h     |   5 ++
+ target/arm/neon_helper.c   |  10 ---
-diff --git a/hw/ssi/xilinx_spips.c b/hw/ssi/xilinx_spips.c
+ target/arm/translate-a64.c |  17 ++---
-index XXXXXXX..XXXXXXX 100644
+ target/arm/translate.c     | 134 +++++++++++++++++++++++++++++++++++--
---- a/hw/ssi/xilinx_spips.c
+ target/arm/vec_helper.c    |  24 +++++++
-+++ b/hw/ssi/xilinx_spips.c
+files changed, 174 insertions(+), 33 deletions(-)
-@@ -XXX,XX +XXX,XX @@ static void xlnx_zynqmp_qspips_reset(DeviceState *d)
+diff --git a/target/arm/helper.h b/target/arm/helper.h
- static inline void stripe8(uint8_t *x, int num, bool dir)
+index XXXXXXX..XXXXXXX 100644
- {
+--- a/target/arm/helper.h
--    uint8_t r[num];
++++ b/target/arm/helper.h
--    memset(r, 0, sizeof(uint8_t) * num);
+@@ -XXX,XX +XXX,XX @@ DEF_HELPER_2(neon_pmax_s8, i32, i32, i32)
-+    uint8_t r[MAX_NUM_BUSSES];
+ DEF_HELPER_2(neon_pmax_u16, i32, i32, i32)
-     int idx[2] = {0, 0};
+ DEF_HELPER_2(neon_pmax_s16, i32, i32, i32)
-     int bit[2] = {0, 7};
-     int d = dir;
+-DEF_HELPER_2(neon_abd_u8, i32, i32, i32)
+-DEF_HELPER_2(neon_abd_s8, i32, i32, i32)
-+    assert(num <= MAX_NUM_BUSSES);
+-DEF_HELPER_2(neon_abd_u16, i32, i32, i32)
-+    memset(r, 0, sizeof(uint8_t) * num);
+-DEF_HELPER_2(neon_abd_s16, i32, i32, i32)
-+
+-DEF_HELPER_2(neon_abd_u32, i32, i32, i32)
-     for (idx[0] = 0; idx[0] < num; ++idx[0]) {
+-DEF_HELPER_2(neon_abd_s32, i32, i32, i32)
-         for (bit[0] = 7; bit[0] >= 0; bit[0]--) {
+-
-             r[idx[!d]] |= x[idx[d]] & 1 << bit[d] ? 1 << bit[!d] : 0;
+ DEF_HELPER_2(neon_shl_u16, i32, i32, i32)
  DEF_HELPER_2(neon_shl_s16, i32, i32, i32)
  DEF_HELPER_2(neon_rshl_u8, i32, i32, i32)
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(gvec_uabd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
  DEF_HELPER_FLAGS_4(gvec_uabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
  DEF_HELPER_FLAGS_4(gvec_uabd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_4(gvec_saba_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_4(gvec_saba_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_4(gvec_saba_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_4(gvec_saba_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 +
 +DEF_HELPER_FLAGS_4(gvec_uaba_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_4(gvec_uaba_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_4(gvec_uaba_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_4(gvec_uaba_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 +
  #ifdef TARGET_AARCH64
  #include "helper-a64.h"
  #include "helper-sve.h"
 diff --git a/target/arm/translate.h b/target/arm/translate.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.h
 +++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ void gen_gvec_sabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
  void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
                     uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 +void gen_gvec_saba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 +void gen_gvec_uaba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 +
  /*
   * Forward to the isar_feature_* tests given a DisasContext pointer.
   */
 diff --git a/target/arm/neon_helper.c b/target/arm/neon_helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/neon_helper.c
 +++ b/target/arm/neon_helper.c
@@ -XXX,XX +XXX,XX @@ NEON_POP(pmax_s16, neon_s16, 2)
  NEON_POP(pmax_u16, neon_u16, 2)
  #undef NEON_FN
 -#define NEON_FN(dest, src1, src2) \
 -    dest = (src1 > src2) ? (src1 - src2) : (src2 - src1)
 -NEON_VOP(abd_s8, neon_s8, 4)
 -NEON_VOP(abd_u8, neon_u8, 4)
 -NEON_VOP(abd_s16, neon_s16, 2)
 -NEON_VOP(abd_u16, neon_u16, 2)
 -NEON_VOP(abd_s32, neon_s32, 1)
 -NEON_VOP(abd_u32, neon_u32, 1)
 -#undef NEON_FN
 -
  #define NEON_FN(dest, src1, src2) do { \
      int8_t tmp; \
      tmp = (int8_t)src2; \
 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-a64.c
 +++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
              gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sabd, size);
          }
          return;
 +    case 0xf: /* SABA, UABA */
 +        if (u) {
 +            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uaba, size);
 +        } else {
 +            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_saba, size);
 +        }
 +        return;
      case 0x10: /* ADD, SUB */
          if (u) {
              gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_sub, size);
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
                  genenvfn = fns[size][u];
                  break;
              }
 -            case 0xf: /* SABA, UABA */
 -            {
 -                static NeonGenTwoOpFn * const fns[3][2] = {
 -                    { gen_helper_neon_abd_s8, gen_helper_neon_abd_u8 },
 -                    { gen_helper_neon_abd_s16, gen_helper_neon_abd_u16 },
 -                    { gen_helper_neon_abd_s32, gen_helper_neon_abd_u32 },
 -                };
 -                genfn = fns[size][u];
 -                break;
 -            }
              case 0x16: /* SQDMULH, SQRDMULH */
              {
                  static NeonGenTwoOpEnvFn * const fns[2][2] = {
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
      tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
  }
 +static void gen_saba_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
 +{
 +    TCGv_i32 t = tcg_temp_new_i32();
 +    gen_sabd_i32(t, a, b);
 +    tcg_gen_add_i32(d, d, t);
 +    tcg_temp_free_i32(t);
 +}
 +
 +static void gen_saba_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
 +{
 +    TCGv_i64 t = tcg_temp_new_i64();
 +    gen_sabd_i64(t, a, b);
 +    tcg_gen_add_i64(d, d, t);
 +    tcg_temp_free_i64(t);
 +}
 +
 +static void gen_saba_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
 +{
 +    TCGv_vec t = tcg_temp_new_vec_matching(d);
 +    gen_sabd_vec(vece, t, a, b);
 +    tcg_gen_add_vec(vece, d, d, t);
 +    tcg_temp_free_vec(t);
 +}
 +
 +void gen_gvec_saba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = {
 +        INDEX_op_sub_vec, INDEX_op_add_vec,
 +        INDEX_op_smin_vec, INDEX_op_smax_vec, 0
 +    };
 +    static const GVecGen3 ops[4] = {
 +        { .fniv = gen_saba_vec,
 +          .fno = gen_helper_gvec_saba_b,
 +          .opt_opc = vecop_list,
 +          .load_dest = true,
 +          .vece = MO_8 },
 +        { .fniv = gen_saba_vec,
 +          .fno = gen_helper_gvec_saba_h,
 +          .opt_opc = vecop_list,
 +          .load_dest = true,
 +          .vece = MO_16 },
 +        { .fni4 = gen_saba_i32,
 +          .fniv = gen_saba_vec,
 +          .fno = gen_helper_gvec_saba_s,
 +          .opt_opc = vecop_list,
 +          .load_dest = true,
 +          .vece = MO_32 },
 +        { .fni8 = gen_saba_i64,
 +          .fniv = gen_saba_vec,
 +          .fno = gen_helper_gvec_saba_d,
 +          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +          .opt_opc = vecop_list,
 +          .load_dest = true,
 +          .vece = MO_64 },
 +    };
 +    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 +}
 +
 +static void gen_uaba_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
 +{
 +    TCGv_i32 t = tcg_temp_new_i32();
 +    gen_uabd_i32(t, a, b);
 +    tcg_gen_add_i32(d, d, t);
 +    tcg_temp_free_i32(t);
 +}
 +
 +static void gen_uaba_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
 +{
 +    TCGv_i64 t = tcg_temp_new_i64();
 +    gen_uabd_i64(t, a, b);
 +    tcg_gen_add_i64(d, d, t);
 +    tcg_temp_free_i64(t);
 +}
 +
 +static void gen_uaba_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
 +{
 +    TCGv_vec t = tcg_temp_new_vec_matching(d);
 +    gen_uabd_vec(vece, t, a, b);
 +    tcg_gen_add_vec(vece, d, d, t);
 +    tcg_temp_free_vec(t);
 +}
 +
 +void gen_gvec_uaba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = {
 +        INDEX_op_sub_vec, INDEX_op_add_vec,
 +        INDEX_op_umin_vec, INDEX_op_umax_vec, 0
 +    };
 +    static const GVecGen3 ops[4] = {
 +        { .fniv = gen_uaba_vec,
 +          .fno = gen_helper_gvec_uaba_b,
 +          .opt_opc = vecop_list,
 +          .load_dest = true,
 +          .vece = MO_8 },
 +        { .fniv = gen_uaba_vec,
 +          .fno = gen_helper_gvec_uaba_h,
 +          .opt_opc = vecop_list,
 +          .load_dest = true,
 +          .vece = MO_16 },
 +        { .fni4 = gen_uaba_i32,
 +          .fniv = gen_uaba_vec,
 +          .fno = gen_helper_gvec_uaba_s,
 +          .opt_opc = vecop_list,
 +          .load_dest = true,
 +          .vece = MO_32 },
 +        { .fni8 = gen_uaba_i64,
 +          .fniv = gen_uaba_vec,
 +          .fno = gen_helper_gvec_uaba_d,
 +          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +          .opt_opc = vecop_list,
 +          .load_dest = true,
 +          .vece = MO_64 },
 +    };
 +    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 +}
 +
  /* Translate a NEON data processing instruction.  Return nonzero if the
     instruction is invalid.
     We process data in a mixture of 32-bit and 64-bit chunks.
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
              }
              return 0;
 +        case NEON_3R_VABA:
 +            if (u) {
 +                gen_gvec_uaba(size, rd_ofs, rn_ofs, rm_ofs,
 +                              vec_size, vec_size);
 +            } else {
 +                gen_gvec_saba(size, rd_ofs, rn_ofs, rm_ofs,
 +                              vec_size, vec_size);
 +            }
 +            return 0;
 +
          case NEON_3R_VADD_VSUB:
          case NEON_3R_LOGIC:
          case NEON_3R_VMAX:
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          case NEON_3R_VQRSHL:
              GEN_NEON_INTEGER_OP_ENV(qrshl);
              break;
 -        case NEON_3R_VABA:
 -            GEN_NEON_INTEGER_OP(abd);
 -            tcg_temp_free_i32(tmp2);
 -            tmp2 = neon_load_reg(rd, pass);
 -            gen_neon_add(size, tmp, tmp2);
 -            break;
          case NEON_3R_VPMAX:
              GEN_NEON_INTEGER_OP(pmax);
              break;
 diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/vec_helper.c
 +++ b/target/arm/vec_helper.c
@@ -XXX,XX +XXX,XX @@ DO_ABD(gvec_uabd_s, uint32_t)
  DO_ABD(gvec_uabd_d, uint64_t)
  #undef DO_ABD
 +
 +#define DO_ABA(NAME, TYPE)                                      \
 +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc)  \
 +{                                                               \
 +    intptr_t i, opr_sz = simd_oprsz(desc);                      \
 +    TYPE *d = vd, *n = vn, *m = vm;                             \
 +                                                                \
 +    for (i = 0; i < opr_sz / sizeof(TYPE); ++i) {               \
 +        d[i] += n[i] < m[i] ? m[i] - n[i] : n[i] - m[i];        \
 +    }                                                           \
 +    clear_tail(d, opr_sz, simd_maxsz(desc));                    \
 +}
 +
 +DO_ABA(gvec_saba_b, int8_t)
 +DO_ABA(gvec_saba_h, int16_t)
 +DO_ABA(gvec_saba_s, int32_t)
 +DO_ABA(gvec_saba_d, int64_t)
 +
 +DO_ABA(gvec_uaba_b, uint8_t)
 +DO_ABA(gvec_uaba_h, uint16_t)
 +DO_ABA(gvec_uaba_s, uint32_t)
 +DO_ABA(gvec_uaba_d, uint64_t)
 +
 +#undef DO_ABA
 --
 .20.1
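
Note on patch 17/45: SABA/UABA differ from SABD/UABD in that the destination is also an input, which is why the GVecGen3 entries set .load_dest = true. A minimal scalar model of the unsigned 16-bit case follows, assuming the same semantics as the DO_ABA macro; sketch_uaba_h is a hypothetical name, not a QEMU helper.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical scalar model of UABA over 16-bit lanes:
 * d[i] += |n[i] - m[i]|, wrapping modulo 2^16, followed by
 * clearing the bytes in [opr_sz, max_sz) of the destination.
 */
static void sketch_uaba_h(uint16_t *d, const uint16_t *n, const uint16_t *m,
                          size_t opr_sz, size_t max_sz)
{
    assert(opr_sz <= max_sz);
    for (size_t i = 0; i < opr_sz / sizeof(uint16_t); i++) {
        uint16_t abd = (uint16_t)(n[i] < m[i] ? m[i] - n[i] : n[i] - m[i]);

        /* The old value of d[i] is read before it is overwritten. */
        d[i] = (uint16_t)(d[i] + abd);
    }
    memset((uint8_t *)d + opr_sz, 0, max_sz - opr_sz);
}
```

Because the accumulator is read before it is written, a generator that did not load the destination first would add the difference to an undefined value.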

-[Qemu-devel] [PULL 31/42] hw/arm/aspeed: Use TYPE_TMP105/TYPE_PCA9552 instead of hardcoded string
+[PULL 18/45] aspeed: Add support for the sonorapass-bmc board
-From: Philippe Mathieu-Daudé <philmd@redhat.com>
+From: Patrick Williams <patrick@stwcx.xyz>
-Reviewed-by: Thomas Huth <thuth@redhat.com>
+Sonora Pass is a 2 socket x86 motherboard designed by Facebook
 and supported by OpenBMC.  Strapping configuration was obtained
 from hardware and i2c configuration is based on dts found at:
 https://github.com/facebook/openbmc-linux/blob/1633c87b8ba7c162095787c988979b748ba65dc8/arch/arm/boot/dts/aspeed-bmc-facebook-sonorapass.dts
 Booted a test image of http://github.com/facebook/openbmc to login
 prompt.
 Signed-off-by: Patrick Williams <patrick@stwcx.xyz>
 Reviewed-by: Amithash Prasad <amithash@fb.com>
 Reviewed-by: Cédric Le Goater <clg@kaod.org>
-Reviewed-by: Markus Armbruster <armbru@redhat.com>
+[PMM: fixed block comment style nit]
-Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
-Message-id: 20190412165416.7977-2-philmd@redhat.com
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- hw/arm/aspeed.c | 13 +++++++++----
+ hw/arm/aspeed.c | 78 +++++++++++++++++++++++++++++++++++++++++++++++++
-file changed, 9 insertions(+), 4 deletions(-)
+file changed, 78 insertions(+)
 diff --git a/hw/arm/aspeed.c b/hw/arm/aspeed.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/arm/aspeed.c
 +++ b/hw/arm/aspeed.c
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ struct AspeedBoardState {
- #include "hw/arm/aspeed_soc.h"
+         SCU_AST2500_HW_STRAP_ACPI_ENABLE |                              \
- #include "hw/boards.h"
+         SCU_HW_STRAP_SPI_MODE(SCU_HW_STRAP_SPI_MASTER))
- #include "hw/i2c/smbus_eeprom.h"
-+#include "hw/misc/pca9552.h"
++/* Sonorapass hardware value: 0xF100D216 */
-+#include "hw/misc/tmp105.h"
++#define SONORAPASS_BMC_HW_STRAP1 (                                      \
- #include "qemu/log.h"
++        SCU_AST2500_HW_STRAP_SPI_AUTOFETCH_ENABLE |                     \
- #include "sysemu/block-backend.h"
++        SCU_AST2500_HW_STRAP_GPIO_STRAP_ENABLE |                        \
- #include "hw/loader.h"
++        SCU_AST2500_HW_STRAP_UART_DEBUG |                               \
-@@ -XXX,XX +XXX,XX @@ static void ast2500_evb_i2c_init(AspeedBoardState *bmc)
++        SCU_AST2500_HW_STRAP_RESERVED28 |                               \
-                           eeprom_buf);
++        SCU_AST2500_HW_STRAP_DDR4_ENABLE |                              \
++        SCU_HW_STRAP_VGA_CLASS_CODE |                                   \
-     /* The AST2500 EVB expects a LM75 but a TMP105 is compatible */
++        SCU_HW_STRAP_LPC_RESET_PIN |                                    \
--    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 7), "tmp105", 0x4d);
++        SCU_HW_STRAP_SPI_MODE(SCU_HW_STRAP_SPI_MASTER) |                \
-+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 7),
++        SCU_AST2500_HW_STRAP_SET_AXI_AHB_RATIO(AXI_AHB_RATIO_2_1) |     \
-+                     TYPE_TMP105, 0x4d);
++        SCU_HW_STRAP_VGA_BIOS_ROM |                                     \
++        SCU_HW_STRAP_VGA_SIZE_SET(VGA_16M_DRAM) |                       \
-     /* The AST2500 EVB does not have an RTC. Let's pretend that one is
++        SCU_AST2500_HW_STRAP_RESERVED1)
-      * plugged on the I2C bus header */
++
-@@ -XXX,XX +XXX,XX @@ static void witherspoon_bmc_i2c_init(AspeedBoardState *bmc)
+ /* Swift hardware value: 0xF11AD206 */
  #define SWIFT_BMC_HW_STRAP1 (                                           \
          AST2500_HW_STRAP1_DEFAULTS |                                    \
@@ -XXX,XX +XXX,XX @@ static void swift_bmc_i2c_init(AspeedBoardState *bmc)
      i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 12), "tmp105", 0x4a);
  }
 +static void sonorapass_bmc_i2c_init(AspeedBoardState *bmc)
 +{
 +    AspeedSoCState *soc = &bmc->soc;
 +
 +    /* bus 2 : */
 +    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 2), "tmp105", 0x48);
 +    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 2), "tmp105", 0x49);
 +    /* bus 2 : pca9546 @ 0x73 */
 +
 +    /* bus 3 : pca9548 @ 0x70 */
 +
 +    /* bus 4 : */
 +    uint8_t *eeprom4_54 = g_malloc0(8 * 1024);
 +    smbus_eeprom_init_one(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 4), 0x54,
 +                          eeprom4_54);
 +    /* PCA9539 @ 0x76, but PCA9552 is compatible */
 +    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 4), "pca9552", 0x76);
 +    /* PCA9539 @ 0x77, but PCA9552 is compatible */
 +    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 4), "pca9552", 0x77);
 +
 +    /* bus 6 : */
 +    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 6), "tmp105", 0x48);
 +    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 6), "tmp105", 0x49);
 +    /* bus 6 : pca9546 @ 0x73 */
 +
 +    /* bus 8 : */
 +    uint8_t *eeprom8_56 = g_malloc0(8 * 1024);
 +    smbus_eeprom_init_one(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 8), 0x56,
 +                          eeprom8_56);
 +    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 8), "pca9552", 0x60);
 +    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 8), "pca9552", 0x61);
 +    /* bus 8 : adc128d818 @ 0x1d */
 +    /* bus 8 : adc128d818 @ 0x1f */
 +
 +    /*
 +     * bus 13 : pca9548 @ 0x71
 +     *      - channel 3:
 +     *          - tmp421 @ 0x4c
 +     *          - tmp421 @ 0x4e
 +     *          - tmp421 @ 0x4f
 +     */
 +
 +}
 +
  static void witherspoon_bmc_i2c_init(AspeedBoardState *bmc)
  {
      AspeedSoCState *soc = &bmc->soc;
-     uint8_t *eeprom_buf = g_malloc0(8 * 1024);
+@@ -XXX,XX +XXX,XX @@ static void aspeed_machine_romulus_class_init(ObjectClass *oc, void *data)
+     mc->default_ram_size       = 512 * MiB;
--    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 3), "pca9552", 0x60);
+ };
-+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 3), TYPE_PCA9552,
-+                     0x60);
++static void aspeed_machine_sonorapass_class_init(ObjectClass *oc, void *data)
++{
-     i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 4), "tmp423", 0x4c);
++    MachineClass *mc = MACHINE_CLASS(oc);
-     i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 5), "tmp423", 0x4c);
++    AspeedMachineClass *amc = ASPEED_MACHINE_CLASS(oc);
++
-     /* The Witherspoon expects a TMP275 but a TMP105 is compatible */
++    mc->desc       = "OCP SonoraPass BMC (ARM1176)";
--    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 9), "tmp105", 0x4a);
++    amc->soc_name  = "ast2500-a1";
-+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 9), TYPE_TMP105,
++    amc->hw_strap1 = SONORAPASS_BMC_HW_STRAP1;
-+                     0x4a);
++    amc->fmc_model = "mx66l1g45g";
++    amc->spi_model = "mx66l1g45g";
-     /* The witherspoon board expects Epson RX8900 I2C RTC but a ds1338 is
++    amc->num_cs    = 2;
-      * good enough */
++    amc->i2c_init  = sonorapass_bmc_i2c_init;
-@@ -XXX,XX +XXX,XX @@ static void witherspoon_bmc_i2c_init(AspeedBoardState *bmc)
++    mc->default_ram_size       = 512 * MiB;
++};
-     smbus_eeprom_init_one(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 11), 0x51,
++
-                           eeprom_buf);
+ static void aspeed_machine_swift_class_init(ObjectClass *oc, void *data)
--    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 11), "pca9552",
+ {
-+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 11), TYPE_PCA9552,
+     MachineClass *mc = MACHINE_CLASS(oc);
-x60);
+@@ -XXX,XX +XXX,XX @@ static const TypeInfo aspeed_machine_types[] = {
- }
+         .name          = MACHINE_TYPE_NAME("swift-bmc"),
+         .parent        = TYPE_ASPEED_MACHINE,
          .class_init    = aspeed_machine_swift_class_init,
 +    }, {
 +        .name          = MACHINE_TYPE_NAME("sonorapass-bmc"),
 +        .parent        = TYPE_ASPEED_MACHINE,
 +        .class_init    = aspeed_machine_sonorapass_class_init,
      }, {
          .name          = MACHINE_TYPE_NAME("witherspoon-bmc"),
          .parent        = TYPE_ASPEED_MACHINE,
 --
 .20.1

-[Qemu-devel] [PULL 30/42] hw/dma: Compile the bcm2835_dma device as common object
+[PULL 19/45] acpi: nvdimm: change NVDIMM_UUID_LE to a common macro
-From: Philippe Mathieu-Daudé <philmd@redhat.com>
+From: Dongjiu Geng <gengdongjiu@huawei.com>
-This device is used by both ARM (BCM2836, for raspi2) and AArch64
+The little-endian UUID is used in many places, so make
-(BCM2837, for raspi3) targets, and is not CPU-specific.
+NVDIMM_UUID_LE a common macro that converts the UUID
-Move it to common object, so we build it once for all targets.
+to a little-endian array.
-Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+Reviewed-by: Xiang Zheng <zhengxiang9@huawei.com>
-Message-id: 20190427133028.12874-1-philmd@redhat.com
+Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
 Message-id: 20200512030609.19593-2-gengdongjiu@huawei.com
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- hw/dma/Makefile.objs | 2 +-
+ include/qemu/uuid.h | 27 +++++++++++++++++++++++++++
-file changed, 1 insertion(+), 1 deletion(-)
+ hw/acpi/nvdimm.c    | 10 +++-------
 files changed, 30 insertions(+), 7 deletions(-)
-diff --git a/hw/dma/Makefile.objs b/hw/dma/Makefile.objs
+diff --git a/include/qemu/uuid.h b/include/qemu/uuid.h
 index XXXXXXX..XXXXXXX 100644
---- a/hw/dma/Makefile.objs
+--- a/include/qemu/uuid.h
-+++ b/hw/dma/Makefile.objs
++++ b/include/qemu/uuid.h
-@@ -XXX,XX +XXX,XX @@ common-obj-$(CONFIG_XLNX_ZYNQMP_ARM) += xlnx-zdma.o
+@@ -XXX,XX +XXX,XX @@ typedef struct {
+     };
- obj-$(CONFIG_OMAP) += omap_dma.o soc_dma.o
+ } QemuUUID;
- obj-$(CONFIG_PXA2XX) += pxa2xx_dma.o
--obj-$(CONFIG_RASPI) += bcm2835_dma.o
++/**
-+common-obj-$(CONFIG_RASPI) += bcm2835_dma.o
++ * UUID_LE - converts the fields of a UUID to a little-endian array;
 + * each parameter is one field of the UUID.
 + *
 + * @time_low: The low field of the timestamp
 + * @time_mid: The middle field of the timestamp
 + * @time_hi_and_version: The high field of the timestamp
 + *                       multiplexed with the version number
 + * @clock_seq_hi_and_reserved: The high field of the clock
 + *                             sequence multiplexed with the variant
 + * @clock_seq_low: The low field of the clock sequence
 + * @node0: The spatially unique node0 identifier
 + * @node1: The spatially unique node1 identifier
 + * @node2: The spatially unique node2 identifier
 + * @node3: The spatially unique node3 identifier
 + * @node4: The spatially unique node4 identifier
 + * @node5: The spatially unique node5 identifier
 + */
 +#define UUID_LE(time_low, time_mid, time_hi_and_version,                    \
 +  clock_seq_hi_and_reserved, clock_seq_low, node0, node1, node2,            \
 +  node3, node4, node5)                                                      \
 +  { (time_low) & 0xff, ((time_low) >> 8) & 0xff, ((time_low) >> 16) & 0xff, \
 +    ((time_low) >> 24) & 0xff, (time_mid) & 0xff, ((time_mid) >> 8) & 0xff, \
 +    (time_hi_and_version) & 0xff, ((time_hi_and_version) >> 8) & 0xff,      \
 +    (clock_seq_hi_and_reserved), (clock_seq_low), (node0), (node1), (node2),\
 +    (node3), (node4), (node5) }
 +
  #define UUID_FMT "%02hhx%02hhx%02hhx%02hhx-" \
                   "%02hhx%02hhx-%02hhx%02hhx-" \
                   "%02hhx%02hhx-" \
 diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/acpi/nvdimm.c
 +++ b/hw/acpi/nvdimm.c
@@ -XXX,XX +XXX,XX @@
   */
  #include "qemu/osdep.h"
 +#include "qemu/uuid.h"
  #include "hw/acpi/acpi.h"
  #include "hw/acpi/aml-build.h"
  #include "hw/acpi/bios-linker-loader.h"
@@ -XXX,XX +XXX,XX @@
  #include "hw/mem/nvdimm.h"
  #include "qemu/nvdimm-utils.h"
 -#define NVDIMM_UUID_LE(a, b, c, d0, d1, d2, d3, d4, d5, d6, d7)             \
 -   { (a) & 0xff, ((a) >> 8) & 0xff, ((a) >> 16) & 0xff, ((a) >> 24) & 0xff, \
 -     (b) & 0xff, ((b) >> 8) & 0xff, (c) & 0xff, ((c) >> 8) & 0xff,          \
 -     (d0), (d1), (d2), (d3), (d4), (d5), (d6), (d7) }
 -
  /*
   * define Byte Addressable Persistent Memory (PM) Region according to
   * ACPI 6.0: 5.2.25.1 System Physical Address Range Structure.
   */
  static const uint8_t nvdimm_nfit_spa_uuid[] =
 -      NVDIMM_UUID_LE(0x66f0d379, 0xb4f3, 0x4074, 0xac, 0x43, 0x0d, 0x33,
 -                     0x18, 0xb7, 0x8c, 0xdb);
 +      UUID_LE(0x66f0d379, 0xb4f3, 0x4074, 0xac, 0x43, 0x0d, 0x33,
 +              0x18, 0xb7, 0x8c, 0xdb);
  /*
   * NVDIMM Firmware Interface Table
 --
 .20.1
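
For reference, a small standalone sketch of how the new UUID_LE macro lays out its bytes. The macro body is copied from the uuid.h hunk above and the UUID is the one used for nvdimm_nfit_spa_uuid; the array name spa_uuid and the helper uuid_le_byte() are illustrative only and are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Macro body as added to include/qemu/uuid.h by this patch. */
#define UUID_LE(time_low, time_mid, time_hi_and_version,                    \
  clock_seq_hi_and_reserved, clock_seq_low, node0, node1, node2,            \
  node3, node4, node5)                                                      \
  { (time_low) & 0xff, ((time_low) >> 8) & 0xff, ((time_low) >> 16) & 0xff, \
    ((time_low) >> 24) & 0xff, (time_mid) & 0xff, ((time_mid) >> 8) & 0xff, \
    (time_hi_and_version) & 0xff, ((time_hi_and_version) >> 8) & 0xff,      \
    (clock_seq_hi_and_reserved), (clock_seq_low), (node0), (node1), (node2),\
    (node3), (node4), (node5) }

/* Same field values as nvdimm_nfit_spa_uuid in hw/acpi/nvdimm.c. */
static const uint8_t spa_uuid[] =
    UUID_LE(0x66f0d379, 0xb4f3, 0x4074, 0xac, 0x43, 0x0d, 0x33,
            0x18, 0xb7, 0x8c, 0xdb);

/* Hypothetical helper (not in QEMU): return byte i of the encoded UUID. */
uint8_t uuid_le_byte(size_t i)
{
    assert(i < sizeof(spa_uuid));
    return spa_uuid[i];
}
```

The three timestamp fields are emitted least-significant byte first (79 d3 f0 66, f3 b4, 74 40), while the clock sequence and node bytes are stored in the order given.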

-[Qemu-devel] [PULL 17/42] target/arm: Allow for floating point in callee stack integrity check
+[PULL 20/45] hw/arm/virt: Introduce a RAS machine option
-The magic value pushed onto the callee stack as an integrity
+From: Dongjiu Geng <gengdongjiu@huawei.com>
 check is different if floating point is present.
+The RAS virtualization feature is not supported yet, so
+add a RAS machine option and disable it by default.
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
+Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
+Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
+Reviewed-by: Igor Mammedov <imammedo@redhat.com>
+Message-id: 20200512030609.19593-3-gengdongjiu@huawei.com
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-15-peter.maydell@linaro.org
 ---
- target/arm/helper.c | 22 +++++++++++++++++++---
+ include/hw/arm/virt.h |  1 +
-file changed, 19 insertions(+), 3 deletions(-)
+ hw/arm/virt.c         | 23 +++++++++++++++++++++++
 files changed, 24 insertions(+)
-diff --git a/target/arm/helper.c b/target/arm/helper.c
+diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
+--- a/include/hw/arm/virt.h
-+++ b/target/arm/helper.c
++++ b/include/hw/arm/virt.h
-@@ -XXX,XX +XXX,XX @@ load_fail:
+@@ -XXX,XX +XXX,XX @@ typedef struct {
-     return false;
+     bool highmem_ecam;
      bool its;
      bool virt;
 +    bool ras;
      OnOffAuto acpi;
      VirtGICType gic_version;
      VirtIOMMUType iommu;
 diff --git a/hw/arm/virt.c b/hw/arm/virt.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/arm/virt.c
 +++ b/hw/arm/virt.c
@@ -XXX,XX +XXX,XX @@ static void virt_set_acpi(Object *obj, Visitor *v, const char *name,
      visit_type_OnOffAuto(v, name, &vms->acpi, errp);
  }
-+static uint32_t v7m_integrity_sig(CPUARMState *env, uint32_t lr)
++static bool virt_get_ras(Object *obj, Error **errp)
 +{
-+    /*
++    VirtMachineState *vms = VIRT_MACHINE(obj);
 +     * Return the integrity signature value for the callee-saves
 +     * stack frame section. @lr is the exception return payload/LR value
 +     * whose FType bit forms bit 0 of the signature if FP is present.
 +     */
 +    uint32_t sig = 0xfefa125a;
 +
-+    if (!arm_feature(env, ARM_FEATURE_VFP) || (lr & R_V7M_EXCRET_FTYPE_MASK)) {
++    return vms->ras;
 +        sig |= 1;
 +    }
 +    return sig;
 +}
 +
- static bool v7m_push_callee_stack(ARMCPU *cpu, uint32_t lr, bool dotailchain,
++static void virt_set_ras(Object *obj, bool value, Error **errp)
-                                   bool ignore_faults)
++{
 +    VirtMachineState *vms = VIRT_MACHINE(obj);
 +
 +    vms->ras = value;
 +}
 +
  static char *virt_get_gic_version(Object *obj, Error **errp)
  {
-@@ -XXX,XX +XXX,XX @@ static bool v7m_push_callee_stack(ARMCPU *cpu, uint32_t lr, bool dotailchain,
+     VirtMachineState *vms = VIRT_MACHINE(obj);
-     bool stacked_ok;
+@@ -XXX,XX +XXX,XX @@ static void virt_instance_init(Object *obj)
-     uint32_t limit;
+                                     "Valid values are none and smmuv3",
-     bool want_psp;
+                                     NULL);
-+    uint32_t sig;
++    /* Default disallows RAS instantiation */
-     if (dotailchain) {
++    vms->ras = false;
-         bool mode = lr & R_V7M_EXCRET_MODE_MASK;
++    object_property_add_bool(obj, "ras", virt_get_ras,
-@@ -XXX,XX +XXX,XX @@ static bool v7m_push_callee_stack(ARMCPU *cpu, uint32_t lr, bool dotailchain,
++                             virt_set_ras, NULL);
-     /* Write as much of the stack frame as we can. A write failure may
++    object_property_set_description(obj, "ras",
-      * cause us to pend a derived exception.
++                                    "Set on/off to enable/disable reporting host memory errors "
-      */
++                                    "to a KVM guest using ACPI and guest external abort exceptions",
-+    sig = v7m_integrity_sig(env, lr);
++                                    NULL);
-     stacked_ok =
++
--        v7m_stack_write(cpu, frameptr, 0xfefa125b, mmu_idx, ignore_faults) &&
+     vms->irqmap = a15irqmap;
-+        v7m_stack_write(cpu, frameptr, sig, mmu_idx, ignore_faults) &&
-         v7m_stack_write(cpu, frameptr + 0x8, env->regs[4], mmu_idx,
+     virt_flash_create(vms);
                          ignore_faults) &&
          v7m_stack_write(cpu, frameptr + 0xc, env->regs[5], mmu_idx,
@@ -XXX,XX +XXX,XX @@ static void do_v7m_exception_exit(ARMCPU *cpu)
          if (return_to_secure &&
              ((excret & R_V7M_EXCRET_ES_MASK) == 0 ||
               (excret & R_V7M_EXCRET_DCRS_MASK) == 0)) {
 -            uint32_t expected_sig = 0xfefa125b;
              uint32_t actual_sig;
              pop_ok = v7m_stack_read(cpu, &actual_sig, frameptr, mmu_idx);
 -            if (pop_ok && expected_sig != actual_sig) {
 +            if (pop_ok && v7m_integrity_sig(env, excret) != actual_sig) {
                  /* Take a SecureFault on the current stack */
                  env->v7m.sfsr |= R_V7M_SFSR_INVIS_MASK;
                  armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_SECURE, false);
 --
 .20.1
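
The integrity-signature rule quoted in the hunks above (from the older series' v7m_integrity_sig()) can be sketched in isolation as follows. This is a simplified stand-in, not the QEMU function: the have_fp flag replaces the arm_feature(env, ARM_FEATURE_VFP) check, and EXCRET_FTYPE_MASK assumes FType is bit 4 of the EXC_RETURN value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: FType is bit 4 of EXC_RETURN (R_V7M_EXCRET_FTYPE_MASK in QEMU). */
#define EXCRET_FTYPE_MASK (1u << 4)

/* Base signature with bit 0 clear, as in the quoted helper. */
#define INTEGRITY_SIG_BASE 0xfefa125aU

/*
 * Illustrative stand-in for v7m_integrity_sig(): have_fp replaces the
 * ARM_FEATURE_VFP check. Bit 0 is set when there is no FPU or when the
 * FType bit of the exception return value is set.
 */
uint32_t integrity_sig(bool have_fp, uint32_t lr)
{
    uint32_t sig = INTEGRITY_SIG_BASE;

    if (!have_fp || (lr & EXCRET_FTYPE_MASK)) {
        sig |= 1;
    }
    /* Only bit 0 may differ from the base value. */
    assert((sig >> 1) == (INTEGRITY_SIG_BASE >> 1));
    return sig;
}
```

Without an FPU the result is always 0xfefa125b, which matches the constant that the old code wrote unconditionally.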

-[Qemu-devel] [PULL 35/42] hw/devices: Move Blizzard declarations into a new header
+[PULL 21/45] docs: APEI GHES generation and CPER record description
-From: Philippe Mathieu-Daudé <philmd@redhat.com>
+From: Dongjiu Geng <gengdongjiu@huawei.com>
-Add an entries the Blizzard device in MAINTAINERS.
+Add APEI/GHES detailed design document
-Reviewed-by: Thomas Huth <thuth@redhat.com>
+Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
-Reviewed-by: Markus Armbruster <armbru@redhat.com>
+Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
-Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
-Message-id: 20190412165416.7977-6-philmd@redhat.com
+Reviewed-by: Igor Mammedov <imammedo@redhat.com>
 Message-id: 20200512030609.19593-4-gengdongjiu@huawei.com
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- include/hw/devices.h          |  7 -------
+ docs/specs/acpi_hest_ghes.rst | 110 ++++++++++++++++++++++++++++++++++
- include/hw/display/blizzard.h | 22 ++++++++++++++++++++++
+ docs/specs/index.rst          |   1 +
- hw/arm/nseries.c              |  1 +
+files changed, 111 insertions(+)
- hw/display/blizzard.c         |  2 +-
+ create mode 100644 docs/specs/acpi_hest_ghes.rst
  MAINTAINERS                   |  2 ++
 files changed, 26 insertions(+), 8 deletions(-)
  create mode 100644 include/hw/display/blizzard.h
-diff --git a/include/hw/devices.h b/include/hw/devices.h
+diff --git a/docs/specs/acpi_hest_ghes.rst b/docs/specs/acpi_hest_ghes.rst
 index XXXXXXX..XXXXXXX 100644
 --- a/include/hw/devices.h
 +++ b/include/hw/devices.h
@@ -XXX,XX +XXX,XX @@ void tsc2005_set_transform(void *opaque, MouseTransformInfo *info);
  /* stellaris_input.c */
  void stellaris_gamepad_init(int n, qemu_irq *irq, const int *keycode);
 -/* blizzard.c */
 -void *s1d13745_init(qemu_irq gpio_int);
 -void s1d13745_write(void *opaque, int dc, uint16_t value);
 -void s1d13745_write_block(void *opaque, int dc,
 -                void *buf, size_t len, int pitch);
 -uint16_t s1d13745_read(void *opaque, int dc);
 -
  /* cbus.c */
  typedef struct {
      qemu_irq clk;
 diff --git a/include/hw/display/blizzard.h b/include/hw/display/blizzard.h
 new file mode 100644
 index XXXXXXX..XXXXXXX
 --- /dev/null
-+++ b/include/hw/display/blizzard.h
++++ b/docs/specs/acpi_hest_ghes.rst
 @@ -XXX,XX +XXX,XX @@
-+/*
++APEI tables generating and CPER record
-+ * Epson S1D13744/S1D13745 (Blizzard/Hailstorm/Tornado) LCD/TV controller.
++======================================
 + *
 + * Copyright (C) 2008 Nokia Corporation
 + * Written by Andrzej Zaborowski
 + *
 + * This work is licensed under the terms of the GNU GPL, version 2 or later.
 + * See the COPYING file in the top-level directory.
 + */
 +
-+#ifndef HW_DISPLAY_BLIZZARD_H
++..
-+#define HW_DISPLAY_BLIZZARD_H
++   Copyright (c) 2020 HUAWEI TECHNOLOGIES CO., LTD.
 +
-+#include "hw/irq.h"
++   This work is licensed under the terms of the GNU GPL, version 2 or later.
 +   See the COPYING file in the top-level directory.
 +
-+void *s1d13745_init(qemu_irq gpio_int);
++Design Details
-+void s1d13745_write(void *opaque, int dc, uint16_t value);
++--------------
 +void s1d13745_write_block(void *opaque, int dc,
 +                          void *buf, size_t len, int pitch);
 +uint16_t s1d13745_read(void *opaque, int dc);
 +
-+#endif
++::
-diff --git a/hw/arm/nseries.c b/hw/arm/nseries.c
++
 +         etc/acpi/tables                           etc/hardware_errors
 +      ====================                   ===============================
 +  + +--------------------------+            +----------------------------+
 +  | | HEST                     | +--------->|    error_block_address1    |------+
 +  | +--------------------------+ |          +----------------------------+      |
 +  | | GHES1                    | | +------->|    error_block_address2    |------+-+
 +  | +--------------------------+ | |        +----------------------------+      | |
 +  | | .................        | | |        |      ..............        |      | |
 +  | | error_status_address-----+-+ |        +----------------------------+      | |
 +  | | .................        |   |   +--->|    error_block_addressN    |------+-+---+
 +  | | read_ack_register--------+-+ |   |    +----------------------------+      | |   |
 +  | | read_ack_preserve        | +-+---+--->|     read_ack_register1     |      | |   |
 +  | | read_ack_write           |   |   |    +----------------------------+      | |   |
 +  + +--------------------------+   | +-+--->|     read_ack_register2     |      | |   |
 +  | | GHES2                    |   | | |    +----------------------------+      | |   |
 +  + +--------------------------+   | | |    |       .............        |      | |   |
 +  | | .................        |   | | |    +----------------------------+      | |   |
 +  | | error_status_address-----+---+ | | +->|     read_ack_registerN     |      | |   |
 +  | | .................        |     | | |  +----------------------------+      | |   |
 +  | | read_ack_register--------+-----+ | |  |Generic Error Status Block 1|<-----+ |   |
 +  | | read_ack_preserve        |       | |  |-+------------------------+-+        |   |
 +  | | read_ack_write           |       | |  | |          CPER          | |        |   |
 +  + +--------------------------|       | |  | |          CPER          | |        |   |
 +  | | ...............          |       | |  | |          ....          | |        |   |
 +  + +--------------------------+       | |  | |          CPER          | |        |   |
 +  | | GHESN                    |       | |  |-+------------------------+-|        |   |
 +  + +--------------------------+       | |  |Generic Error Status Block 2|<-------+   |
 +  | | .................        |       | |  |-+------------------------+-+            |
 +  | | error_status_address-----+-------+ |  | |           CPER         | |            |
 +  | | .................        |         |  | |           CPER         | |            |
 +  | | read_ack_register--------+---------+  | |           ....         | |            |
 +  | | read_ack_preserve        |            | |           CPER         | |            |
 +  | | read_ack_write           |            +-+------------------------+-+            |
 +  + +--------------------------+            |         ..........         |            |
 +                                            |----------------------------+            |
 +                                            |Generic Error Status Block N |<----------+
 +                                            |-+-------------------------+-+
 +                                            | |          CPER           | |
 +                                            | |          CPER           | |
 +                                            | |          ....           | |
 +                                            | |          CPER           | |
 +                                            +-+-------------------------+-+
 +
 +
 +(1) QEMU generates the ACPI HEST table. This table goes in the current
 +    "etc/acpi/tables" fw_cfg blob. Each error source has different
 +    notification types.
 +
 +(2) A new fw_cfg blob called "etc/hardware_errors" is introduced. QEMU
 +    also needs to populate this blob. The "etc/hardware_errors" fw_cfg blob
 +    contains an address registers table and an Error Status Data Block table.
 +
 +(3) The address registers table contains N Error Block Address entries
 +    and N Read Ack Register entries. Each entry is 8 bytes in size.
 +    The Error Status Data Block table contains N Error Status Data Block
 +    entries. Each entry is 4096 (0x1000) bytes in size. The total size
 +    of the "etc/hardware_errors" fw_cfg blob is (N * 8 * 2 + N * 4096) bytes.
 +    N is the number of kinds of hardware error source.
 +
 +(4) QEMU generates the ACPI linker/loader script for the firmware. The
 +    firmware pre-allocates memory for "etc/acpi/tables", "etc/hardware_errors"
 +    and copies blob contents there.
 +
 +(5) QEMU generates N ADD_POINTER commands, which patch addresses in the
 +    "error_status_address" fields of the HEST table with a pointer to the
 +    corresponding "address registers" in the "etc/hardware_errors" blob.
 +
 +(6) QEMU generates N ADD_POINTER commands, which patch addresses in the
 +    "read_ack_register" fields of the HEST table with a pointer to the
 +    corresponding "read_ack_register" within the "etc/hardware_errors" blob.
 +
 +(7) QEMU generates N ADD_POINTER commands for the firmware, which patch
 +    addresses in the "error_block_address" fields with a pointer to the
 +    respective "Error Status Data Block" in the "etc/hardware_errors" blob.
 +
 +(8) QEMU defines a third and write-only fw_cfg blob which is called
 +    "etc/hardware_errors_addr". Through that blob, the firmware can send back
 +    the guest-side allocation addresses to QEMU. The "etc/hardware_errors_addr"
 +    blob contains an 8-byte entry. QEMU generates a single WRITE_POINTER command
 +    for the firmware. The firmware will write back the start address of
 +    "etc/hardware_errors" blob to the fw_cfg file "etc/hardware_errors_addr".
 +
 +(9) When QEMU gets a SIGBUS from the kernel, QEMU writes a CPER record into the
 +    corresponding "Error Status Data Block" in guest memory, and then injects a
 +    platform-specific interrupt (for the arm/virt machine, a Synchronous External
 +    Abort) to notify the guest.
 +
 +(10) This notification (in virtual hardware) is handled by the guest kernel. On
 +     receiving it, the guest APEI driver can read the CPER error record and take
 +     appropriate action.
 +
 +(11) kvm_arch_on_sigbus_vcpu() uses source_id as an index into "etc/hardware_errors"
 +     to find the "Error Status Data Block" entry for the error source. Supported
 +     source_id values should therefore be assigned here and not changed afterwards,
 +     so that the error is written into the expected "Error Status Data Block" even
 +     if the guest was migrated to a newer QEMU.
 diff --git a/docs/specs/index.rst b/docs/specs/index.rst
 index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/nseries.c
+--- a/docs/specs/index.rst
-+++ b/hw/arm/nseries.c
++++ b/docs/specs/index.rst
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ Contents:
- #include "hw/boards.h"
+    ppc-spapr-xive
- #include "hw/i2c/i2c.h"
+    acpi_hw_reduced_hotplug
- #include "hw/devices.h"
+    tpm
-+#include "hw/display/blizzard.h"
++   acpi_hest_ghes
  #include "hw/misc/tmp105.h"
  #include "hw/block/flash.h"
  #include "hw/hw.h"
 diff --git a/hw/display/blizzard.c b/hw/display/blizzard.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/display/blizzard.c
 +++ b/hw/display/blizzard.c
@@ -XXX,XX +XXX,XX @@
  #include "qemu/osdep.h"
  #include "qemu-common.h"
  #include "ui/console.h"
 -#include "hw/devices.h"
 +#include "hw/display/blizzard.h"
  #include "ui/pixel_ops.h"
  typedef void (*blizzard_fn_t)(uint8_t *, const uint8_t *, unsigned int);
 diff --git a/MAINTAINERS b/MAINTAINERS
 index XXXXXXX..XXXXXXX 100644
 --- a/MAINTAINERS
 +++ b/MAINTAINERS
@@ -XXX,XX +XXX,XX @@ M: Peter Maydell <peter.maydell@linaro.org>
  L: qemu-arm@nongnu.org
  S: Odd Fixes
  F: hw/arm/nseries.c
 +F: hw/display/blizzard.c
  F: hw/input/lm832x.c
  F: hw/input/tsc2005.c
  F: hw/misc/cbus.c
  F: hw/timer/twl92230.c
 +F: include/hw/display/blizzard.h
  Palm
  M: Andrzej Zaborowski <balrogg@gmail.com>
 --
 .20.1
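
The blob layout described in step (3) of docs/specs/acpi_hest_ghes.rst can be made concrete with a little offset arithmetic. The sketch below assumes the ordering shown in the diagram (all error_block_address entries, then all read_ack_register entries, then the status blocks); the function names are hypothetical and do not exist in QEMU, and the block size is left as a parameter rather than fixed.

```c
#include <assert.h>
#include <stddef.h>

/* Each address-table entry is 8 bytes, per step (3) of the document. */
#define GHES_ADDR_ENTRY_SIZE 8

/* Offset of error_block_address entry i, for n error sources. */
size_t ghes_error_block_addr_offset(size_t n, size_t i)
{
    assert(i < n);
    return i * GHES_ADDR_ENTRY_SIZE;
}

/* Offset of read_ack_register entry i; these follow the n address entries. */
size_t ghes_read_ack_offset(size_t n, size_t i)
{
    assert(i < n);
    return n * GHES_ADDR_ENTRY_SIZE + i * GHES_ADDR_ENTRY_SIZE;
}

/* Offset of Error Status Data Block i; blocks follow both address tables. */
size_t ghes_status_block_offset(size_t n, size_t i, size_t block_size)
{
    assert(i < n);
    return 2 * n * GHES_ADDR_ENTRY_SIZE + i * block_size;
}

/* Total blob size: N * 8 * 2 + N * block_size. */
size_t ghes_blob_size(size_t n, size_t block_size)
{
    return n * GHES_ADDR_ENTRY_SIZE * 2 + n * block_size;
}
```

With a single error source and the 4096-byte block size given in the document, the blob is 4112 bytes long and the first status block starts at offset 16.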

-[Qemu-devel] [PULL 42/42] hw/devices: Move SMSC 91C111 declaration into a new header
+[PULL 22/45] ACPI: Build related register address fields via hardware error fw_cfg blob
-From: Philippe Mathieu-Daudé <philmd@redhat.com>
+From: Dongjiu Geng <gengdongjiu@huawei.com>
-This commit finally deletes "hw/devices.h".
+This patch builds error_block_address and read_ack_register fields
+in the hardware errors table; the error_block_address points to the Generic
-Reviewed-by: Markus Armbruster <armbru@redhat.com>
+Error Status Block (GESB) via bios_linker. The max size for one GESB
-Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+is 1 KiB. For more detailed information, please refer to
-Message-id: 20190412165416.7977-13-philmd@redhat.com
+document: docs/specs/acpi_hest_ghes.rst
 Currently only one error source is supported; this can be extended to
 support more if necessary.
 Suggested-by: Laszlo Ersek <lersek@redhat.com>
 Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
 Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
 Reviewed-by: Igor Mammedov <imammedo@redhat.com>
 Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
 Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
 Message-id: 20200512030609.19593-5-gengdongjiu@huawei.com
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- include/hw/devices.h       | 11 -----------
+ default-configs/arm-softmmu.mak |  1 +
- include/hw/net/smc91c111.h | 19 +++++++++++++++++++
+ include/hw/acpi/aml-build.h     |  1 +
- hw/arm/gumstix.c           |  2 +-
+ include/hw/acpi/ghes.h          | 28 +++++++++++
- hw/arm/integratorcp.c      |  2 +-
+ hw/acpi/aml-build.c             |  2 +
- hw/arm/mainstone.c         |  2 +-
+ hw/acpi/ghes.c                  | 89 +++++++++++++++++++++++++++++++++
- hw/arm/realview.c          |  2 +-
+ hw/arm/virt-acpi-build.c        |  5 ++
- hw/arm/versatilepb.c       |  2 +-
+ hw/acpi/Kconfig                 |  4 ++
- hw/net/smc91c111.c         |  2 +-
+ hw/acpi/Makefile.objs           |  1 +
-files changed, 25 insertions(+), 17 deletions(-)
+files changed, 131 insertions(+)
- delete mode 100644 include/hw/devices.h
+ create mode 100644 include/hw/acpi/ghes.h
- create mode 100644 include/hw/net/smc91c111.h
+ create mode 100644 hw/acpi/ghes.c
-diff --git a/include/hw/devices.h b/include/hw/devices.h
+diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
-deleted file mode 100644
+index XXXXXXX..XXXXXXX 100644
-index XXXXXXX..XXXXXXX
+--- a/default-configs/arm-softmmu.mak
---- a/include/hw/devices.h
++++ b/default-configs/arm-softmmu.mak
-+++ /dev/null
+@@ -XXX,XX +XXX,XX @@ CONFIG_FSL_IMX7=y
-@@ -XXX,XX +XXX,XX @@
+ CONFIG_FSL_IMX6UL=y
--#ifndef QEMU_DEVICES_H
+ CONFIG_SEMIHOSTING=y
--#define QEMU_DEVICES_H
+ CONFIG_ALLWINNER_H3=y
--
++CONFIG_ACPI_APEI=y
--/* Devices that have nowhere better to go.  */
+diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
--
+index XXXXXXX..XXXXXXX 100644
--#include "hw/hw.h"
+--- a/include/hw/acpi/aml-build.h
--
++++ b/include/hw/acpi/aml-build.h
--/* smc91c111.c */
+@@ -XXX,XX +XXX,XX @@ struct AcpiBuildTables {
--void smc91c111_init(NICInfo *, uint32_t, qemu_irq);
+     GArray *rsdp;
--
+     GArray *tcpalog;
--#endif
+     GArray *vmgenid;
-diff --git a/include/hw/net/smc91c111.h b/include/hw/net/smc91c111.h
++    GArray *hardware_errors;
      BIOSLinker *linker;
  } AcpiBuildTables;
 diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
 new file mode 100644
 index XXXXXXX..XXXXXXX
 --- /dev/null
-+++ b/include/hw/net/smc91c111.h
++++ b/include/hw/acpi/ghes.h
 @@ -XXX,XX +XXX,XX @@
 +/*
-+ * SMSC 91C111 Ethernet interface emulation
++ * Support for generating APEI tables and recording CPER for Guests
 + *
-+ * Copyright (c) 2005 CodeSourcery, LLC.
++ * Copyright (c) 2020 HUAWEI TECHNOLOGIES CO., LTD.
-+ * Written by Paul Brook
++ *
-+ *
++ * Author: Dongjiu Geng <gengdongjiu@huawei.com>
-+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
++ *
-+ * See the COPYING file in the top-level directory.
++ * This program is free software; you can redistribute it and/or modify
 + * it under the terms of the GNU General Public License as published by
 + * the Free Software Foundation; either version 2 of the License, or
 + * (at your option) any later version.
 +
 + * This program is distributed in the hope that it will be useful,
 + * but WITHOUT ANY WARRANTY; without even the implied warranty of
 + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 + * GNU General Public License for more details.
 +
 + * You should have received a copy of the GNU General Public License along
 + * with this program; if not, see <http://www.gnu.org/licenses/>.
 + */
 +
-+#ifndef HW_NET_SMC91C111_H
++#ifndef ACPI_GHES_H
-+#define HW_NET_SMC91C111_H
++#define ACPI_GHES_H
 +
-+#include "hw/irq.h"
++#include "hw/acpi/bios-linker-loader.h"
-+#include "net/net.h"
++
-+
++void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker);
 +void smc91c111_init(NICInfo *, uint32_t, qemu_irq);
 +
 +#endif
-diff --git a/hw/arm/gumstix.c b/hw/arm/gumstix.c
+diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/gumstix.c
+--- a/hw/acpi/aml-build.c
-+++ b/hw/arm/gumstix.c
++++ b/hw/acpi/aml-build.c
@@ -XXX,XX +XXX,XX @@ void acpi_build_tables_init(AcpiBuildTables *tables)
      tables->table_data = g_array_new(false, true /* clear */, 1);
      tables->tcpalog = g_array_new(false, true /* clear */, 1);
      tables->vmgenid = g_array_new(false, true /* clear */, 1);
 +    tables->hardware_errors = g_array_new(false, true /* clear */, 1);
      tables->linker = bios_linker_loader_init();
  }
@@ -XXX,XX +XXX,XX @@ void acpi_build_tables_cleanup(AcpiBuildTables *tables, bool mfre)
      g_array_free(tables->table_data, true);
      g_array_free(tables->tcpalog, mfre);
      g_array_free(tables->vmgenid, mfre);
 +    g_array_free(tables->hardware_errors, mfre);
  }
  /*
 diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
 new file mode 100644
 index XXXXXXX..XXXXXXX
 --- /dev/null
 +++ b/hw/acpi/ghes.c
 @@ -XXX,XX +XXX,XX @@
- #include "hw/arm/pxa.h"
++/*
- #include "net/net.h"
++ * Support for generating APEI tables and recording CPER for Guests
- #include "hw/block/flash.h"
++ *
--#include "hw/devices.h"
++ * Copyright (c) 2020 HUAWEI TECHNOLOGIES CO., LTD.
-+#include "hw/net/smc91c111.h"
++ *
- #include "hw/boards.h"
++ * Author: Dongjiu Geng <gengdongjiu@huawei.com>
- #include "exec/address-spaces.h"
++ *
- #include "sysemu/qtest.h"
++ * This program is free software; you can redistribute it and/or modify
-diff --git a/hw/arm/integratorcp.c b/hw/arm/integratorcp.c
++ * it under the terms of the GNU General Public License as published by
-index XXXXXXX..XXXXXXX 100644
++ * the Free Software Foundation; either version 2 of the License, or
---- a/hw/arm/integratorcp.c
++ * (at your option) any later version.
-+++ b/hw/arm/integratorcp.c
++
 + * This program is distributed in the hope that it will be useful,
 + * but WITHOUT ANY WARRANTY; without even the implied warranty of
 + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 + * GNU General Public License for more details.
 +
 + * You should have received a copy of the GNU General Public License along
 + * with this program; if not, see <http://www.gnu.org/licenses/>.
 + */
 +
 +#include "qemu/osdep.h"
 +#include "qemu/units.h"
 +#include "hw/acpi/ghes.h"
 +#include "hw/acpi/aml-build.h"
 +
 +#define ACPI_GHES_ERRORS_FW_CFG_FILE        "etc/hardware_errors"
 +#define ACPI_GHES_DATA_ADDR_FW_CFG_FILE     "etc/hardware_errors_addr"
 +
 +/* The max size in bytes for one error block */
 +#define ACPI_GHES_MAX_RAW_DATA_LENGTH   (1 * KiB)
 +
 +/* Now only support ARMv8 SEA notification type error source */
 +#define ACPI_GHES_ERROR_SOURCE_COUNT        1
 +
 +/*
 + * Build table for the hardware error fw_cfg blob.
 + * Initialize "etc/hardware_errors" and "etc/hardware_errors_addr" fw_cfg blobs.
 + * See docs/specs/acpi_hest_ghes.rst for blobs format.
 + */
 +void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker)
 +{
 +    int i, error_status_block_offset;
 +
 +    /* Build error_block_address */
 +    for (i = 0; i < ACPI_GHES_ERROR_SOURCE_COUNT; i++) {
 +        build_append_int_noprefix(hardware_errors, 0, sizeof(uint64_t));
 +    }
 +
 +    /* Build read_ack_register */
 +    for (i = 0; i < ACPI_GHES_ERROR_SOURCE_COUNT; i++) {
 +        /*
 +         * Initialize the value of read_ack_register to 1, so GHES can be
 +         * writeable after (re)boot.
 +         * ACPI 6.2: 18.3.2.8 Generic Hardware Error Source version 2
 +         * (GHESv2 - Type 10)
 +         */
 +        build_append_int_noprefix(hardware_errors, 1, sizeof(uint64_t));
 +    }
 +
 +    /* Generic Error Status Block offset in the hardware error fw_cfg blob */
 +    error_status_block_offset = hardware_errors->len;
 +
 +    /* Reserve space for Error Status Data Block */
 +    acpi_data_push(hardware_errors,
 +        ACPI_GHES_MAX_RAW_DATA_LENGTH * ACPI_GHES_ERROR_SOURCE_COUNT);
 +
 +    /* Tell guest firmware to place hardware_errors blob into RAM */
 +    bios_linker_loader_alloc(linker, ACPI_GHES_ERRORS_FW_CFG_FILE,
 +                             hardware_errors, sizeof(uint64_t), false);
 +
 +    for (i = 0; i < ACPI_GHES_ERROR_SOURCE_COUNT; i++) {
 +        /*
 +         * Tell firmware to patch error_block_address entries to point to
 +         * corresponding "Generic Error Status Block"
 +         */
 +        bios_linker_loader_add_pointer(linker,
 +            ACPI_GHES_ERRORS_FW_CFG_FILE, sizeof(uint64_t) * i,
 +            sizeof(uint64_t), ACPI_GHES_ERRORS_FW_CFG_FILE,
 +            error_status_block_offset + i * ACPI_GHES_MAX_RAW_DATA_LENGTH);
 +    }
 +
 +    /*
 +     * tell firmware to write hardware_errors GPA into
 +     * hardware_errors_addr fw_cfg, once the former has been initialized.
 +     */
 +    bios_linker_loader_write_pointer(linker, ACPI_GHES_DATA_ADDR_FW_CFG_FILE,
 +        0, sizeof(uint64_t), ACPI_GHES_ERRORS_FW_CFG_FILE, 0);
 +}
 diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/arm/virt-acpi-build.c
 +++ b/hw/arm/virt-acpi-build.c
 @@ -XXX,XX +XXX,XX @@
- #include "qemu-common.h"
+ #include "sysemu/reset.h"
- #include "cpu.h"
+ #include "kvm_arm.h"
- #include "hw/sysbus.h"
+ #include "migration/vmstate.h"
--#include "hw/devices.h"
++#include "hw/acpi/ghes.h"
- #include "hw/boards.h"
- #include "hw/arm/arm.h"
+ #define ARM_SPI_BASE 32
- #include "hw/misc/arm_integrator_debug.h"
-+#include "hw/net/smc91c111.h"
+@@ -XXX,XX +XXX,XX @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
- #include "net/net.h"
+     acpi_add_table(table_offsets, tables_blob);
- #include "exec/address-spaces.h"
+     build_spcr(tables_blob, tables->linker, vms);
- #include "sysemu/sysemu.h"
-diff --git a/hw/arm/mainstone.c b/hw/arm/mainstone.c
++    if (vms->ras) {
-index XXXXXXX..XXXXXXX 100644
++        build_ghes_error_table(tables->hardware_errors, tables->linker);
---- a/hw/arm/mainstone.c
++    }
-+++ b/hw/arm/mainstone.c
++
-@@ -XXX,XX +XXX,XX @@
+     if (ms->numa_state->num_nodes > 0) {
- #include "hw/arm/pxa.h"
+         acpi_add_table(table_offsets, tables_blob);
- #include "hw/arm/arm.h"
+         build_srat(tables_blob, tables->linker, vms);
- #include "net/net.h"
+diff --git a/hw/acpi/Kconfig b/hw/acpi/Kconfig
--#include "hw/devices.h"
+index XXXXXXX..XXXXXXX 100644
-+#include "hw/net/smc91c111.h"
+--- a/hw/acpi/Kconfig
- #include "hw/boards.h"
++++ b/hw/acpi/Kconfig
- #include "hw/block/flash.h"
+@@ -XXX,XX +XXX,XX @@ config ACPI_HMAT
- #include "hw/sysbus.h"
+     bool
-diff --git a/hw/arm/realview.c b/hw/arm/realview.c
+     depends on ACPI
-index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/realview.c
++config ACPI_APEI
-+++ b/hw/arm/realview.c
++    bool
-@@ -XXX,XX +XXX,XX @@
++    depends on ACPI
- #include "hw/sysbus.h"
++
- #include "hw/arm/arm.h"
+ config ACPI_PCI
- #include "hw/arm/primecell.h"
+     bool
--#include "hw/devices.h"
+     depends on ACPI && PCI
- #include "hw/net/lan9118.h"
+diff --git a/hw/acpi/Makefile.objs b/hw/acpi/Makefile.objs
-+#include "hw/net/smc91c111.h"
+index XXXXXXX..XXXXXXX 100644
- #include "hw/pci/pci.h"
+--- a/hw/acpi/Makefile.objs
- #include "net/net.h"
++++ b/hw/acpi/Makefile.objs
- #include "sysemu/sysemu.h"
+@@ -XXX,XX +XXX,XX @@ common-obj-$(CONFIG_ACPI_NVDIMM) += nvdimm.o
-diff --git a/hw/arm/versatilepb.c b/hw/arm/versatilepb.c
+ common-obj-$(CONFIG_ACPI_VMGENID) += vmgenid.o
-index XXXXXXX..XXXXXXX 100644
+ common-obj-$(CONFIG_ACPI_HW_REDUCED) += generic_event_device.o
---- a/hw/arm/versatilepb.c
+ common-obj-$(CONFIG_ACPI_HMAT) += hmat.o
-+++ b/hw/arm/versatilepb.c
++common-obj-$(CONFIG_ACPI_APEI) += ghes.o
-@@ -XXX,XX +XXX,XX @@
+ common-obj-$(call lnot,$(CONFIG_ACPI_X86)) += acpi-stub.o
- #include "cpu.h"
+ common-obj-$(call lnot,$(CONFIG_PC)) += acpi-x86-stub.o
- #include "hw/sysbus.h"
  #include "hw/arm/arm.h"
 -#include "hw/devices.h"
 +#include "hw/net/smc91c111.h"
  #include "net/net.h"
  #include "sysemu/sysemu.h"
  #include "hw/pci/pci.h"
 diff --git a/hw/net/smc91c111.c b/hw/net/smc91c111.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/net/smc91c111.c
 +++ b/hw/net/smc91c111.c
@@ -XXX,XX +XXX,XX @@
  #include "qemu/osdep.h"
  #include "hw/sysbus.h"
  #include "net/net.h"
 -#include "hw/devices.h"
 +#include "hw/net/smc91c111.h"
  #include "qemu/log.h"
  /* For crc32 */
  #include <zlib.h>
 --
 .20.1

-[Qemu-devel] [PULL 11/42] target/arm: Handle SFPA and FPCA bits in reads and writes of CONTROL
+[PULL 23/45] ACPI: Build Hardware Error Source Table
-The M-profile CONTROL register has two bits -- SFPA and FPCA --
+From: Dongjiu Geng <gengdongjiu@huawei.com>
-which relate to floating-point support, and should be RES0 otherwise.
-Handle them correctly in the MSR/MRS register access code.
+This patch builds the Hardware Error Source Table (HEST) via fw_cfg blobs.
-Neither is banked between security states, so they are stored
+For now it supports only ARMv8 SEA, a type of Generic Hardware Error
-in v7m.control[M_REG_S] regardless of current security state.
+Source version 2 (GHESv2) error source. The supported types can be
+extended later if needed. The CPER section is currently a memory
 section because the kernel mainly wants userspace to handle
 the memory errors.
 This patch follows the ACPI 6.2 specification to build the Hardware Error
 Source Table. For more detailed information, see
 docs/specs/acpi_hest_ghes.rst.
 The build_ghes_hw_error_notification() helper adds a Hardware
 Error Notification to ACPI tables without using packed C structures,
 which avoids endianness issues because the API needs no explicit conversion.
 Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
 Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
 Reviewed-by: Igor Mammedov <imammedo@redhat.com>
 Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
 Message-id: 20200512030609.19593-6-gengdongjiu@huawei.com
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-9-peter.maydell@linaro.org
 ---
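The commit message above says the notification descriptor is emitted field by field rather than through a packed C structure. A minimal sketch of that idea follows, assuming a fixed-size byte buffer in place of QEMU's GArray. ByteBuf, append_le() and build_notification() are hypothetical names for illustration and are not QEMU APIs; only the field order and sizes are taken from build_ghes_hw_error_notification() in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the GArray table buffer used by QEMU. */
typedef struct {
    uint8_t data[64];
    size_t len;
} ByteBuf;

/* Append the low `size` bytes of `value`, least significant byte first. */
static void append_le(ByteBuf *buf, uint64_t value, size_t size)
{
    for (size_t i = 0; i < size; i++) {
        assert(buf->len < sizeof(buf->data));
        buf->data[buf->len++] = (uint8_t)(value >> (8 * i));
    }
}

/* Same field order and widths as build_ghes_hw_error_notification(). */
static void build_notification(ByteBuf *buf, uint8_t type)
{
    append_le(buf, type, 1); /* Type */
    append_le(buf, 28, 1);   /* Length of the whole structure in bytes */
    append_le(buf, 0, 2);    /* Configuration Write Enable */
    append_le(buf, 0, 4);    /* Poll Interval */
    append_le(buf, 0, 4);    /* Vector */
    append_le(buf, 0, 4);    /* Switch To Polling Threshold Value */
    append_le(buf, 0, 4);    /* Switch To Polling Threshold Window */
    append_le(buf, 0, 4);    /* Error Threshold Value */
    append_le(buf, 0, 4);    /* Error Threshold Window */
}
```

Because every field is written byte by byte in little-endian order, the result does not depend on host endianness or struct padding, and the field widths add up to the 28 bytes stored in the Length field (1 + 1 + 2 + 6 * 4).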
- target/arm/helper.c | 57 ++++++++++++++++++++++++++++++++++++++-------
+ include/hw/acpi/ghes.h   |  39 ++++++++++++
- 1 file changed, 49 insertions(+), 8 deletions(-)
+ hw/acpi/ghes.c           | 126 +++++++++++++++++++++++++++++++++++++++
+ hw/arm/virt-acpi-build.c |   2 +
-diff --git a/target/arm/helper.c b/target/arm/helper.c
+ 3 files changed, 167 insertions(+)
 diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
+--- a/include/hw/acpi/ghes.h
-+++ b/target/arm/helper.c
++++ b/include/hw/acpi/ghes.h
-@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(v7m_mrs)(CPUARMState *env, uint32_t reg)
+@@ -XXX,XX +XXX,XX @@
-         return xpsr_read(env) & mask;
-         break;
+ #include "hw/acpi/bios-linker-loader.h"
-     case 20: /* CONTROL */
--        return env->v7m.control[env->v7m.secure];
++/*
-+    {
++ * Values for Hardware Error Notification Type field
-+        uint32_t value = env->v7m.control[env->v7m.secure];
++ */
-+        if (!env->v7m.secure) {
++enum AcpiGhesNotifyType {
-+            /* SFPA is RAZ/WI from NS; FPCA is stored in the M_REG_S bank */
++    /* Polled */
-+            value |= env->v7m.control[M_REG_S] & R_V7M_CONTROL_FPCA_MASK;
++    ACPI_GHES_NOTIFY_POLLED = 0,
-+        }
++    /* External Interrupt */
-+        return value;
++    ACPI_GHES_NOTIFY_EXTERNAL = 1,
 +    /* Local Interrupt */
 +    ACPI_GHES_NOTIFY_LOCAL = 2,
 +    /* SCI */
 +    ACPI_GHES_NOTIFY_SCI = 3,
 +    /* NMI */
 +    ACPI_GHES_NOTIFY_NMI = 4,
 +    /* CMCI, ACPI 5.0: 18.3.2.7, Table 18-290 */
 +    ACPI_GHES_NOTIFY_CMCI = 5,
 +    /* MCE, ACPI 5.0: 18.3.2.7, Table 18-290 */
 +    ACPI_GHES_NOTIFY_MCE = 6,
 +    /* GPIO-Signal, ACPI 6.0: 18.3.2.7, Table 18-332 */
 +    ACPI_GHES_NOTIFY_GPIO = 7,
 +    /* ARMv8 SEA, ACPI 6.1: 18.3.2.9, Table 18-345 */
 +    ACPI_GHES_NOTIFY_SEA = 8,
 +    /* ARMv8 SEI, ACPI 6.1: 18.3.2.9, Table 18-345 */
 +    ACPI_GHES_NOTIFY_SEI = 9,
 +    /* External Interrupt - GSIV, ACPI 6.1: 18.3.2.9, Table 18-345 */
 +    ACPI_GHES_NOTIFY_GSIV = 10,
 +    /* Software Delegated Exception, ACPI 6.2: 18.3.2.9, Table 18-383 */
 +    ACPI_GHES_NOTIFY_SDEI = 11,
 +    /* 12 and greater are reserved */
 +    ACPI_GHES_NOTIFY_RESERVED = 12
 +};
 +
 +enum {
 +    ACPI_HEST_SRC_ID_SEA = 0,
 +    /* future ids go here */
 +    ACPI_HEST_SRC_ID_RESERVED,
 +};
 +
  void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker);
 +void acpi_build_hest(GArray *table_data, BIOSLinker *linker);
  #endif
 diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/acpi/ghes.c
 +++ b/hw/acpi/ghes.c
@@ -XXX,XX +XXX,XX @@
  #include "qemu/units.h"
  #include "hw/acpi/ghes.h"
  #include "hw/acpi/aml-build.h"
 +#include "qemu/error-report.h"
  #define ACPI_GHES_ERRORS_FW_CFG_FILE        "etc/hardware_errors"
  #define ACPI_GHES_DATA_ADDR_FW_CFG_FILE     "etc/hardware_errors_addr"
@@ -XXX,XX +XXX,XX @@
  /* Now only support ARMv8 SEA notification type error source */
  #define ACPI_GHES_ERROR_SOURCE_COUNT        1
 +/* Generic Hardware Error Source version 2 */
 +#define ACPI_GHES_SOURCE_GENERIC_ERROR_V2   10
 +
 +/* Address offset in Generic Address Structure(GAS) */
 +#define GAS_ADDR_OFFSET 4
 +
 +/*
 + * Hardware Error Notification
 + * ACPI 4.0: 17.3.2.7 Hardware Error Notification
 + * Composes dummy Hardware Error Notification descriptor of specified type
 + */
 +static void build_ghes_hw_error_notification(GArray *table, const uint8_t type)
 +{
 +    /* Type */
 +    build_append_int_noprefix(table, type, 1);
 +    /*
 +     * Length:
 +     * Total length of the structure in bytes
 +     */
 +    build_append_int_noprefix(table, 28, 1);
 +    /* Configuration Write Enable */
 +    build_append_int_noprefix(table, 0, 2);
 +    /* Poll Interval */
 +    build_append_int_noprefix(table, 0, 4);
 +    /* Vector */
 +    build_append_int_noprefix(table, 0, 4);
 +    /* Switch To Polling Threshold Value */
 +    build_append_int_noprefix(table, 0, 4);
 +    /* Switch To Polling Threshold Window */
 +    build_append_int_noprefix(table, 0, 4);
 +    /* Error Threshold Value */
 +    build_append_int_noprefix(table, 0, 4);
 +    /* Error Threshold Window */
 +    build_append_int_noprefix(table, 0, 4);
 +}
 +
  /*
   * Build table for the hardware error fw_cfg blob.
   * Initialize "etc/hardware_errors" and "etc/hardware_errors_addr" fw_cfg blobs.
@@ -XXX,XX +XXX,XX @@ void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker)
      bios_linker_loader_write_pointer(linker, ACPI_GHES_DATA_ADDR_FW_CFG_FILE,
         0, sizeof(uint64_t), ACPI_GHES_ERRORS_FW_CFG_FILE, 0);
  }
 +
 +/* Build Generic Hardware Error Source version 2 (GHESv2) */
 +static void build_ghes_v2(GArray *table_data, int source_id, BIOSLinker *linker)
 +{
 +    uint64_t address_offset;
 +    /*
 +     * Type:
 +     * Generic Hardware Error Source version 2(GHESv2 - Type 10)
 +     */
 +    build_append_int_noprefix(table_data, ACPI_GHES_SOURCE_GENERIC_ERROR_V2, 2);
 +    /* Source Id */
 +    build_append_int_noprefix(table_data, source_id, 2);
 +    /* Related Source Id */
 +    build_append_int_noprefix(table_data, 0xffff, 2);
 +    /* Flags */
 +    build_append_int_noprefix(table_data, 0, 1);
 +    /* Enabled */
 +    build_append_int_noprefix(table_data, 1, 1);
 +
 +    /* Number of Records To Pre-allocate */
 +    build_append_int_noprefix(table_data, 1, 4);
 +    /* Max Sections Per Record */
 +    build_append_int_noprefix(table_data, 1, 4);
 +    /* Max Raw Data Length */
 +    build_append_int_noprefix(table_data, ACPI_GHES_MAX_RAW_DATA_LENGTH, 4);
 +
 +    address_offset = table_data->len;
 +    /* Error Status Address */
 +    build_append_gas(table_data, AML_AS_SYSTEM_MEMORY, 0x40, 0,
 +                     4 /* QWord access */, 0);
 +    bios_linker_loader_add_pointer(linker, ACPI_BUILD_TABLE_FILE,
 +        address_offset + GAS_ADDR_OFFSET, sizeof(uint64_t),
 +        ACPI_GHES_ERRORS_FW_CFG_FILE, source_id * sizeof(uint64_t));
 +
 +    switch (source_id) {
 +    case ACPI_HEST_SRC_ID_SEA:
 +        /*
 +         * Notification Structure
 +         * Now only enable ARMv8 SEA notification type
 +         */
 +        build_ghes_hw_error_notification(table_data, ACPI_GHES_NOTIFY_SEA);
 +        break;
 +    default:
 +        error_report("Not support this error source");
 +        abort();
 +    }
-     case 0x94: /* CONTROL_NS */
++
-         /* We have to handle this here because unprivileged Secure code
++    /* Error Status Block Length */
-          * can read the NS CONTROL register.
++    build_append_int_noprefix(table_data, ACPI_GHES_MAX_RAW_DATA_LENGTH, 4);
-@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(v7m_mrs)(CPUARMState *env, uint32_t reg)
++
-         if (!env->v7m.secure) {
++    /*
-             return 0;
++     * Read Ack Register
-         }
++     * ACPI 6.1: 18.3.2.8 Generic Hardware Error Source
--        return env->v7m.control[M_REG_NS];
++     * version 2 (GHESv2 - Type 10)
-+        return env->v7m.control[M_REG_NS] |
++     */
-+            (env->v7m.control[M_REG_S] & R_V7M_CONTROL_FPCA_MASK);
++    address_offset = table_data->len;
 +    build_append_gas(table_data, AML_AS_SYSTEM_MEMORY, 0x40, 0,
 +                     4 /* QWord access */, 0);
 +    bios_linker_loader_add_pointer(linker, ACPI_BUILD_TABLE_FILE,
 +        address_offset + GAS_ADDR_OFFSET,
 +        sizeof(uint64_t), ACPI_GHES_ERRORS_FW_CFG_FILE,
 +        (ACPI_GHES_ERROR_SOURCE_COUNT + source_id) * sizeof(uint64_t));
 +
 +    /*
 +     * Read Ack Preserve field
 +     * We only provide the first bit in Read Ack Register to OSPM to write
 +     * while the other bits are preserved.
 +     */
 +    build_append_int_noprefix(table_data, ~0x1ULL, 8);
 +    /* Read Ack Write */
 +    build_append_int_noprefix(table_data, 0x1, 8);
 +}
 +
 +/* Build Hardware Error Source Table */
 +void acpi_build_hest(GArray *table_data, BIOSLinker *linker)
 +{
 +    uint64_t hest_start = table_data->len;
 +
 +    /* Hardware Error Source Table header */
 +    acpi_data_push(table_data, sizeof(AcpiTableHeader));
 +
 +    /* Error Source Count */
 +    build_append_int_noprefix(table_data, ACPI_GHES_ERROR_SOURCE_COUNT, 4);
 +
 +    build_ghes_v2(table_data, ACPI_HEST_SRC_ID_SEA, linker);
 +
 +    build_header(linker, table_data, (void *)(table_data->data + hest_start),
 +        "HEST", table_data->len - hest_start, 1, NULL, NULL);
 +}
 diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/arm/virt-acpi-build.c
 +++ b/hw/arm/virt-acpi-build.c
@@ -XXX,XX +XXX,XX @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
      if (vms->ras) {
          build_ghes_error_table(tables->hardware_errors, tables->linker);
 +        acpi_add_table(table_offsets, tables_blob);
 +        acpi_build_hest(tables_blob, tables->linker);
      }
-     if (el == 0) {
+     if (ms->numa_state->num_nodes > 0) {
@@ -XXX,XX +XXX,XX @@ void HELPER(v7m_msr)(CPUARMState *env, uint32_t maskreg, uint32_t val)
       */
      uint32_t mask = extract32(maskreg, 8, 4);
      uint32_t reg = extract32(maskreg, 0, 8);
 +    int cur_el = arm_current_el(env);
 -    if (arm_current_el(env) == 0 && reg > 7) {
 -        /* only xPSR sub-fields may be written by unprivileged */
 +    if (cur_el == 0 && reg > 7 && reg != 20) {
 +        /*
 +         * only xPSR sub-fields and CONTROL.SFPA may be written by
 +         * unprivileged code
 +         */
          return;
      }
@@ -XXX,XX +XXX,XX @@ void HELPER(v7m_msr)(CPUARMState *env, uint32_t maskreg, uint32_t val)
                  env->v7m.control[M_REG_NS] &= ~R_V7M_CONTROL_NPRIV_MASK;
                  env->v7m.control[M_REG_NS] |= val & R_V7M_CONTROL_NPRIV_MASK;
              }
 +            /*
 +             * SFPA is RAZ/WI from NS. FPCA is RO if NSACR.CP10 == 0,
 +             * RES0 if the FPU is not present, and is stored in the S bank
 +             */
 +            if (arm_feature(env, ARM_FEATURE_VFP) &&
 +                extract32(env->v7m.nsacr, 10, 1)) {
 +                env->v7m.control[M_REG_S] &= ~R_V7M_CONTROL_FPCA_MASK;
 +                env->v7m.control[M_REG_S] |= val & R_V7M_CONTROL_FPCA_MASK;
 +            }
              return;
          case 0x98: /* SP_NS */
          {
@@ -XXX,XX +XXX,XX @@ void HELPER(v7m_msr)(CPUARMState *env, uint32_t maskreg, uint32_t val)
          env->v7m.faultmask[env->v7m.secure] = val & 1;
          break;
      case 20: /* CONTROL */
 -        /* Writing to the SPSEL bit only has an effect if we are in
 +        /*
 +         * Writing to the SPSEL bit only has an effect if we are in
           * thread mode; other bits can be updated by any privileged code.
           * write_v7m_control_spsel() deals with updating the SPSEL bit in
           * env->v7m.control, so we only need update the others.
           * For v7M, we must just ignore explicit writes to SPSEL in handler
           * mode; for v8M the write is permitted but will have no effect.
 +         * All these bits are writes-ignored from non-privileged code,
 +         * except for SFPA.
           */
 -        if (arm_feature(env, ARM_FEATURE_V8) ||
 -            !arm_v7m_is_handler_mode(env)) {
 +        if (cur_el > 0 && (arm_feature(env, ARM_FEATURE_V8) ||
 +                           !arm_v7m_is_handler_mode(env))) {
              write_v7m_control_spsel(env, (val & R_V7M_CONTROL_SPSEL_MASK) != 0);
          }
 -        if (arm_feature(env, ARM_FEATURE_M_MAIN)) {
 +        if (cur_el > 0 && arm_feature(env, ARM_FEATURE_M_MAIN)) {
              env->v7m.control[env->v7m.secure] &= ~R_V7M_CONTROL_NPRIV_MASK;
              env->v7m.control[env->v7m.secure] |= val & R_V7M_CONTROL_NPRIV_MASK;
          }
 +        if (arm_feature(env, ARM_FEATURE_VFP)) {
 +            /*
 +             * SFPA is RAZ/WI from NS or if no FPU.
 +             * FPCA is RO if NSACR.CP10 == 0, RES0 if the FPU is not present.
 +             * Both are stored in the S bank.
 +             */
 +            if (env->v7m.secure) {
 +                env->v7m.control[M_REG_S] &= ~R_V7M_CONTROL_SFPA_MASK;
 +                env->v7m.control[M_REG_S] |= val & R_V7M_CONTROL_SFPA_MASK;
 +            }
 +            if (cur_el > 0 &&
 +                (env->v7m.secure || !arm_feature(env, ARM_FEATURE_M_SECURITY) ||
 +                 extract32(env->v7m.nsacr, 10, 1))) {
 +                env->v7m.control[M_REG_S] &= ~R_V7M_CONTROL_FPCA_MASK;
 +                env->v7m.control[M_REG_S] |= val & R_V7M_CONTROL_FPCA_MASK;
 +            }
 +        }
          break;
      default:
      bad_reg:
 --
 .20.1

-[Qemu-devel] [PULL 06/42] target/arm: Implement dummy versions of M-profile FP-related registers
+[PULL 24/45] ACPI: Record the Generic Error Status Block address
-The M-profile floating point support has three associated config
+From: Dongjiu Geng <gengdongjiu@huawei.com>
 registers: FPCAR, FPCCR and FPDSCR. It also makes the registers
 CPACR and NSACR have behaviour other than reads-as-zero.
 Add support for all of these as simple reads-as-written registers.
 We will hook up actual functionality later.
-The main complexity here is handling the FPCCR register, which
+Record the GHEB address via a fw_cfg file. When recording
-has a mix of banked and unbanked bits.
+an error to CPER, QEMU will use this address to find the
 Generic Error Data Entries and write the error.
-Note that we don't share storage with the A-profile
+To avoid migration failure, make the hardware
-cpu->cp15.nsacr and cpu->cp15.cpacr_el1, though the behaviour
+error table address part of the GED device instead
-is quite similar, for two reasons:
+of a global variable, so that this address is migrated
- * the M profile CPACR is banked between security states
+to the target QEMU.
  * it preserves the invariant that M profile uses no state
    inside the cp15 substruct
+Acked-by: Xiang Zheng <zhengxiang9@huawei.com>
+Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
+Reviewed-by: Igor Mammedov <imammedo@redhat.com>
+Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
+Message-id: 20200512030609.19593-7-gengdongjiu@huawei.com
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-4-peter.maydell@linaro.org
 ---
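The recorded address is the guest-physical base of the "etc/hardware_errors" blob, so locating a field reduces to adding an offset to that base. The sketch below derives those offsets from the layout that build_ghes_error_table() emits earlier in this series: all error_block_address entries, then all read_ack_register entries, then one 1 KiB status block per source. The function and macro names are hypothetical, and byte-order conversion of ghes_addr_le is deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

#define GHES_SOURCE_COUNT 1u    /* mirrors ACPI_GHES_ERROR_SOURCE_COUNT */
#define GHES_MAX_RAW_LEN  1024u /* mirrors ACPI_GHES_MAX_RAW_DATA_LENGTH */

/* Offset of error_block_address[source_id] within the blob. */
static uint64_t ghes_error_block_addr_offset(uint32_t source_id)
{
    assert(source_id < GHES_SOURCE_COUNT);
    return (uint64_t)source_id * sizeof(uint64_t);
}

/* Offset of read_ack_register[source_id]; follows all address entries. */
static uint64_t ghes_read_ack_offset(uint32_t source_id)
{
    assert(source_id < GHES_SOURCE_COUNT);
    return (uint64_t)(GHES_SOURCE_COUNT + source_id) * sizeof(uint64_t);
}

/* Offset of the error status block; follows both 64-bit arrays. */
static uint64_t ghes_status_block_offset(uint32_t source_id)
{
    assert(source_id < GHES_SOURCE_COUNT);
    return 2u * GHES_SOURCE_COUNT * sizeof(uint64_t)
           + (uint64_t)source_id * GHES_MAX_RAW_LEN;
}

/* Guest-physical address of a field, given the recorded blob base. */
static uint64_t ghes_field_gpa(uint64_t blob_base_gpa, uint64_t offset)
{
    return blob_base_gpa + offset;
}
```

With a single SEA source, the address entry sits at offset 0, the read-ack register at offset 8, and the status block at offset 16. These match the offsets passed to bios_linker_loader_add_pointer() in build_ghes_v2().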
- target/arm/cpu.h      |  34 ++++++++++++
+ include/hw/acpi/generic_event_device.h |  2 ++
- hw/intc/armv7m_nvic.c | 125 ++++++++++++++++++++++++++++++++++++++++++
+ include/hw/acpi/ghes.h                 |  6 ++++++
- target/arm/cpu.c      |   5 ++
+ hw/acpi/generic_event_device.c         | 19 +++++++++++++++++++
- target/arm/machine.c  |  16 ++++++
+ hw/acpi/ghes.c                         | 14 ++++++++++++++
- 4 files changed, 180 insertions(+)
+ hw/arm/virt-acpi-build.c               |  8 ++++++++
  5 files changed, 49 insertions(+)
-diff --git a/target/arm/cpu.h b/target/arm/cpu.h
+diff --git a/include/hw/acpi/generic_event_device.h b/include/hw/acpi/generic_event_device.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.h
+--- a/include/hw/acpi/generic_event_device.h
-+++ b/target/arm/cpu.h
++++ b/include/hw/acpi/generic_event_device.h
-@@ -XXX,XX +XXX,XX @@ typedef struct CPUARMState {
+@@ -XXX,XX +XXX,XX @@
-         uint32_t scr[M_REG_NUM_BANKS];
-         uint32_t msplim[M_REG_NUM_BANKS];
+ #include "hw/sysbus.h"
-         uint32_t psplim[M_REG_NUM_BANKS];
+ #include "hw/acpi/memory_hotplug.h"
-+        uint32_t fpcar[M_REG_NUM_BANKS];
++#include "hw/acpi/ghes.h"
-+        uint32_t fpccr[M_REG_NUM_BANKS];
-+        uint32_t fpdscr[M_REG_NUM_BANKS];
+ #define ACPI_POWER_BUTTON_DEVICE "PWRB"
-+        uint32_t cpacr[M_REG_NUM_BANKS];
-+        uint32_t nsacr;
+@@ -XXX,XX +XXX,XX @@ typedef struct AcpiGedState {
-     } v7m;
+     GEDState ged_state;
+     uint32_t ged_event_bitmap;
-     /* Information associated with an exception about to be taken:
+     qemu_irq irq;
-@@ -XXX,XX +XXX,XX @@ FIELD(V7M_CSSELR, LEVEL, 1, 3)
++    AcpiGhesState ghes_state;
-  */
+ } AcpiGedState;
- FIELD(V7M_CSSELR, INDEX, 0, 4)
+ void build_ged_aml(Aml *table, const char* name, HotplugHandler *hotplug_dev,
-+/* v7M FPCCR bits */
+diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
-+FIELD(V7M_FPCCR, LSPACT, 0, 1)
+index XXXXXXX..XXXXXXX 100644
-+FIELD(V7M_FPCCR, USER, 1, 1)
+--- a/include/hw/acpi/ghes.h
-+FIELD(V7M_FPCCR, S, 2, 1)
++++ b/include/hw/acpi/ghes.h
-+FIELD(V7M_FPCCR, THREAD, 3, 1)
+@@ -XXX,XX +XXX,XX @@ enum {
-+FIELD(V7M_FPCCR, HFRDY, 4, 1)
+     ACPI_HEST_SRC_ID_RESERVED,
-+FIELD(V7M_FPCCR, MMRDY, 5, 1)
+ };
-+FIELD(V7M_FPCCR, BFRDY, 6, 1)
-+FIELD(V7M_FPCCR, SFRDY, 7, 1)
++typedef struct AcpiGhesState {
-+FIELD(V7M_FPCCR, MONRDY, 8, 1)
++    uint64_t ghes_addr_le;
-+FIELD(V7M_FPCCR, SPLIMVIOL, 9, 1)
++} AcpiGhesState;
 +FIELD(V7M_FPCCR, UFRDY, 10, 1)
 +FIELD(V7M_FPCCR, RES0, 11, 15)
 +FIELD(V7M_FPCCR, TS, 26, 1)
 +FIELD(V7M_FPCCR, CLRONRETS, 27, 1)
 +FIELD(V7M_FPCCR, CLRONRET, 28, 1)
 +FIELD(V7M_FPCCR, LSPENS, 29, 1)
 +FIELD(V7M_FPCCR, LSPEN, 30, 1)
 +FIELD(V7M_FPCCR, ASPEN, 31, 1)
 +/* These bits are banked. Others are non-banked and live in the M_REG_S bank */
 +#define R_V7M_FPCCR_BANKED_MASK                 \
 +    (R_V7M_FPCCR_LSPACT_MASK |                  \
 +     R_V7M_FPCCR_USER_MASK |                    \
 +     R_V7M_FPCCR_THREAD_MASK |                  \
 +     R_V7M_FPCCR_MMRDY_MASK |                   \
 +     R_V7M_FPCCR_SPLIMVIOL_MASK |               \
 +     R_V7M_FPCCR_UFRDY_MASK |                   \
 +     R_V7M_FPCCR_ASPEN_MASK)
 +
- /*
+ void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker);
-  * System register ID fields.
+ void acpi_build_hest(GArray *table_data, BIOSLinker *linker);
-  */
++void acpi_ghes_add_fw_cfg(AcpiGhesState *vms, FWCfgState *s,
-diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
++                          GArray *hardware_errors);
  #endif
 diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/intc/armv7m_nvic.c
+--- a/hw/acpi/generic_event_device.c
-+++ b/hw/intc/armv7m_nvic.c
++++ b/hw/acpi/generic_event_device.c
-@@ -XXX,XX +XXX,XX @@ static uint32_t nvic_readl(NVICState *s, uint32_t offset, MemTxAttrs attrs)
+@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_ged_state = {
      }
      case 0xd84: /* CSSELR */
          return cpu->env.v7m.csselr[attrs.secure];
 +    case 0xd88: /* CPACR */
 +        if (!arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
 +            return 0;
 +        }
 +        return cpu->env.v7m.cpacr[attrs.secure];
 +    case 0xd8c: /* NSACR */
 +        if (!attrs.secure || !arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
 +            return 0;
 +        }
 +        return cpu->env.v7m.nsacr;
      /* TODO: Implement debug registers.  */
      case 0xd90: /* MPU_TYPE */
          /* Unified MPU; if the MPU is not present this value is zero */
@@ -XXX,XX +XXX,XX @@ static uint32_t nvic_readl(NVICState *s, uint32_t offset, MemTxAttrs attrs)
              return 0;
          }
          return cpu->env.v7m.sfar;
 +    case 0xf34: /* FPCCR */
 +        if (!arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
 +            return 0;
 +        }
 +        if (attrs.secure) {
 +            return cpu->env.v7m.fpccr[M_REG_S];
 +        } else {
 +            /*
 +             * NS can read LSPEN, CLRONRET and MONRDY. It can read
 +             * BFRDY and HFRDY if AIRCR.BFHFNMINS != 0;
 +             * other non-banked bits RAZ.
 +             * TODO: MONRDY should RAZ/WI if DEMCR.SDME is set.
 +             */
 +            uint32_t value = cpu->env.v7m.fpccr[M_REG_S];
 +            uint32_t mask = R_V7M_FPCCR_LSPEN_MASK |
 +                R_V7M_FPCCR_CLRONRET_MASK |
 +                R_V7M_FPCCR_MONRDY_MASK;
 +
 +            if (s->cpu->env.v7m.aircr & R_V7M_AIRCR_BFHFNMINS_MASK) {
 +                mask |= R_V7M_FPCCR_BFRDY_MASK | R_V7M_FPCCR_HFRDY_MASK;
 +            }
 +
 +            value &= mask;
 +
 +            value |= cpu->env.v7m.fpccr[M_REG_NS];
 +            return value;
 +        }
 +    case 0xf38: /* FPCAR */
 +        if (!arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
 +            return 0;
 +        }
 +        return cpu->env.v7m.fpcar[attrs.secure];
 +    case 0xf3c: /* FPDSCR */
 +        if (!arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
 +            return 0;
 +        }
 +        return cpu->env.v7m.fpdscr[attrs.secure];
      case 0xf40: /* MVFR0 */
          return cpu->isar.mvfr0;
      case 0xf44: /* MVFR1 */
@@ -XXX,XX +XXX,XX @@ static void nvic_writel(NVICState *s, uint32_t offset, uint32_t value,
              cpu->env.v7m.csselr[attrs.secure] = value & R_V7M_CSSELR_INDEX_MASK;
          }
          break;
 +    case 0xd88: /* CPACR */
 +        if (arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
 +            /* We implement only the Floating Point extension's CP10/CP11 */
 +            cpu->env.v7m.cpacr[attrs.secure] = value & (0xf << 20);
 +        }
 +        break;
 +    case 0xd8c: /* NSACR */
 +        if (attrs.secure && arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
 +            /* We implement only the Floating Point extension's CP10/CP11 */
 +            cpu->env.v7m.nsacr = value & (3 << 10);
 +        }
 +        break;
      case 0xd90: /* MPU_TYPE */
          return; /* RO */
      case 0xd94: /* MPU_CTRL */
@@ -XXX,XX +XXX,XX @@ static void nvic_writel(NVICState *s, uint32_t offset, uint32_t value,
          }
          break;
      }
 +    case 0xf34: /* FPCCR */
 +        if (arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
 +            /* Not all bits here are banked. */
 +            uint32_t fpccr_s;
 +
 +            if (!arm_feature(&cpu->env, ARM_FEATURE_V8)) {
 +                /* Don't allow setting of bits not present in v7M */
 +                value &= (R_V7M_FPCCR_LSPACT_MASK |
 +                          R_V7M_FPCCR_USER_MASK |
 +                          R_V7M_FPCCR_THREAD_MASK |
 +                          R_V7M_FPCCR_HFRDY_MASK |
 +                          R_V7M_FPCCR_MMRDY_MASK |
 +                          R_V7M_FPCCR_BFRDY_MASK |
 +                          R_V7M_FPCCR_MONRDY_MASK |
 +                          R_V7M_FPCCR_LSPEN_MASK |
 +                          R_V7M_FPCCR_ASPEN_MASK);
 +            }
 +            value &= ~R_V7M_FPCCR_RES0_MASK;
 +
 +            if (!attrs.secure) {
 +                /* Some non-banked bits are configurably writable by NS */
 +                fpccr_s = cpu->env.v7m.fpccr[M_REG_S];
 +                if (!(fpccr_s & R_V7M_FPCCR_LSPENS_MASK)) {
 +                    uint32_t lspen = FIELD_EX32(value, V7M_FPCCR, LSPEN);
 +                    fpccr_s = FIELD_DP32(fpccr_s, V7M_FPCCR, LSPEN, lspen);
 +                }
 +                if (!(fpccr_s & R_V7M_FPCCR_CLRONRETS_MASK)) {
 +                    uint32_t cor = FIELD_EX32(value, V7M_FPCCR, CLRONRET);
 +                    fpccr_s = FIELD_DP32(fpccr_s, V7M_FPCCR, CLRONRET, cor);
 +                }
 +                if ((s->cpu->env.v7m.aircr & R_V7M_AIRCR_BFHFNMINS_MASK)) {
 +                    uint32_t hfrdy = FIELD_EX32(value, V7M_FPCCR, HFRDY);
 +                    uint32_t bfrdy = FIELD_EX32(value, V7M_FPCCR, BFRDY);
 +                    fpccr_s = FIELD_DP32(fpccr_s, V7M_FPCCR, HFRDY, hfrdy);
 +                    fpccr_s = FIELD_DP32(fpccr_s, V7M_FPCCR, BFRDY, bfrdy);
 +                }
 +                /* TODO MONRDY should RAZ/WI if DEMCR.SDME is set */
 +                {
 +                    uint32_t monrdy = FIELD_EX32(value, V7M_FPCCR, MONRDY);
 +                    fpccr_s = FIELD_DP32(fpccr_s, V7M_FPCCR, MONRDY, monrdy);
 +                }
 +
 +                /*
 +                 * All other non-banked bits are RAZ/WI from NS; write
 +                 * just the banked bits to fpccr[M_REG_NS].
 +                 */
 +                value &= R_V7M_FPCCR_BANKED_MASK;
 +                cpu->env.v7m.fpccr[M_REG_NS] = value;
 +            } else {
 +                fpccr_s = value;
 +            }
 +            cpu->env.v7m.fpccr[M_REG_S] = fpccr_s;
 +        }
 +        break;
 +    case 0xf38: /* FPCAR */
 +        if (arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
 +            value &= ~7;
 +            cpu->env.v7m.fpcar[attrs.secure] = value;
 +        }
 +        break;
 +    case 0xf3c: /* FPDSCR */
 +        if (arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
 +            value &= 0x07c00000;
 +            cpu->env.v7m.fpdscr[attrs.secure] = value;
 +        }
 +        break;
      case 0xf50: /* ICIALLU */
      case 0xf58: /* ICIMVAU */
      case 0xf5c: /* DCIMVAC */
 diff --git a/target/arm/cpu.c b/target/arm/cpu.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu.c
 +++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_reset(CPUState *s)
              env->v7m.ccr[M_REG_S] |= R_V7M_CCR_UNALIGN_TRP_MASK;
          }
 +        if (arm_feature(env, ARM_FEATURE_VFP)) {
 +            env->v7m.fpccr[M_REG_NS] = R_V7M_FPCCR_ASPEN_MASK;
 +            env->v7m.fpccr[M_REG_S] = R_V7M_FPCCR_ASPEN_MASK |
 +                R_V7M_FPCCR_LSPEN_MASK | R_V7M_FPCCR_S_MASK;
 +        }
          /* Unlike A/R profile, M profile defines the reset LR value */
          env->regs[14] = 0xffffffff;
 diff --git a/target/arm/machine.c b/target/arm/machine.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/machine.c
 +++ b/target/arm/machine.c
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_m_v8m = {
      }
  };
-+static const VMStateDescription vmstate_m_fp = {
++static bool ghes_needed(void *opaque)
-+    .name = "cpu/m/fp",
++{
 +    AcpiGedState *s = opaque;
 +    return s->ghes_state.ghes_addr_le;
 +}
 +
 +static const VMStateDescription vmstate_ghes_state = {
 +    .name = "acpi-ged/ghes",
 +    .version_id = 1,
 +    .minimum_version_id = 1,
-+    .needed = vfp_needed,
++    .needed = ghes_needed,
-+    .fields = (VMStateField[]) {
++    .fields      = (VMStateField[]) {
-+        VMSTATE_UINT32_ARRAY(env.v7m.fpcar, ARMCPU, M_REG_NUM_BANKS),
++        VMSTATE_STRUCT(ghes_state, AcpiGedState, 1,
-+        VMSTATE_UINT32_ARRAY(env.v7m.fpccr, ARMCPU, M_REG_NUM_BANKS),
++                       vmstate_ghes_state, AcpiGhesState),
 +        VMSTATE_UINT32_ARRAY(env.v7m.fpdscr, ARMCPU, M_REG_NUM_BANKS),
 +        VMSTATE_UINT32_ARRAY(env.v7m.cpacr, ARMCPU, M_REG_NUM_BANKS),
 +        VMSTATE_UINT32(env.v7m.nsacr, ARMCPU),
 +        VMSTATE_END_OF_LIST()
 +    }
 +};
 +
- static const VMStateDescription vmstate_m = {
+ static const VMStateDescription vmstate_acpi_ged = {
-     .name = "cpu/m",
+     .name = "acpi-ged",
-     .version_id = 4,
+     .version_id = 1,
-@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_m = {
+@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_acpi_ged = {
-         &vmstate_m_scr,
+     },
-         &vmstate_m_other_sp,
+     .subsections = (const VMStateDescription * []) {
-         &vmstate_m_v8m,
+         &vmstate_memhp_state,
-+        &vmstate_m_fp,
++        &vmstate_ghes_state,
          NULL
      }
  };
+diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
+index XXXXXXX..XXXXXXX 100644
+--- a/hw/acpi/ghes.c
++++ b/hw/acpi/ghes.c
+@@ -XXX,XX +XXX,XX @@
+ #include "hw/acpi/ghes.h"
+ #include "hw/acpi/aml-build.h"
+ #include "qemu/error-report.h"
++#include "hw/acpi/generic_event_device.h"
++#include "hw/nvram/fw_cfg.h"
+ #define ACPI_GHES_ERRORS_FW_CFG_FILE        "etc/hardware_errors"
+ #define ACPI_GHES_DATA_ADDR_FW_CFG_FILE     "etc/hardware_errors_addr"
+@@ -XXX,XX +XXX,XX @@ void acpi_build_hest(GArray *table_data, BIOSLinker *linker)
+     build_header(linker, table_data, (void *)(table_data->data + hest_start),
+         "HEST", table_data->len - hest_start, 1, NULL, NULL);
+ }
++
++void acpi_ghes_add_fw_cfg(AcpiGhesState *ags, FWCfgState *s,
++                          GArray *hardware_error)
++{
++    /* Create a read-only fw_cfg file for GHES */
++    fw_cfg_add_file(s, ACPI_GHES_ERRORS_FW_CFG_FILE, hardware_error->data,
++                    hardware_error->len);
++
++    /* Create a read-write fw_cfg file for Address */
++    fw_cfg_add_file_callback(s, ACPI_GHES_DATA_ADDR_FW_CFG_FILE, NULL, NULL,
++        NULL, &(ags->ghes_addr_le), sizeof(ags->ghes_addr_le), false);
++}
+diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
+index XXXXXXX..XXXXXXX 100644
+--- a/hw/arm/virt-acpi-build.c
++++ b/hw/arm/virt-acpi-build.c
+@@ -XXX,XX +XXX,XX @@ void virt_acpi_setup(VirtMachineState *vms)
+ {
+     AcpiBuildTables tables;
+     AcpiBuildState *build_state;
++    AcpiGedState *acpi_ged_state;
+     if (!vms->fw_cfg) {
+         trace_virt_acpi_setup();
+@@ -XXX,XX +XXX,XX @@ void virt_acpi_setup(VirtMachineState *vms)
+     fw_cfg_add_file(vms->fw_cfg, ACPI_BUILD_TPMLOG_FILE, tables.tcpalog->data,
+                     acpi_data_len(tables.tcpalog));
++    if (vms->ras) {
++        assert(vms->acpi_dev);
++        acpi_ged_state = ACPI_GED(vms->acpi_dev);
++        acpi_ghes_add_fw_cfg(&acpi_ged_state->ghes_state,
++                             vms->fw_cfg, tables.hardware_errors);
++    }
++
+     build_state->rsdp_mr = acpi_add_rom_blob(virt_acpi_build_update,
+                                              build_state, tables.rsdp,
+                                              ACPI_BUILD_RSDP_FILE, 0);
 --
 .20.1

-[Qemu-devel] [PULL 01/42] hw/arm/smmuv3: Remove SMMUNotifierNode
+[PULL 25/45] KVM: Move hwpoison page related functions into kvm-all.c
-From: Eric Auger <eric.auger@redhat.com>
+From: Dongjiu Geng <gengdongjiu@huawei.com>
-The SMMUNotifierNode struct is not necessary and brings extra
+kvm_hwpoison_page_add() and kvm_unpoison_all() will both
-complexity so let's remove it. We now directly track the SMMUDevices
+be used by X86 and ARM platforms, so moving them into
-which have registered IOMMU MR notifiers.
+"accel/kvm/kvm-all.c" to avoid duplicate code.
-This is inspired from the same transformation on intel-iommu
+For architectures that don't use the poison-list functionality
-done in commit b4a4ba0d68f50f218ee3957b6638dbee32a5eeef
+the reset handler will harmlessly do nothing, so let's register
-("intel-iommu: remove IntelIOMMUNotifierNode")
+the kvm_unpoison_all() function in the generic kvm_init() function.
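
A minimal sketch of the poison-list behaviour described above, assuming a plain singly linked list in place of QEMU's QLIST macros. The names PoisonPage, poison_page_add() and poison_unpoison_all() are hypothetical stand-ins for illustration, not the functions moved by this patch:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for HWPoisonPage, using a plain singly linked list. */
typedef struct PoisonPage {
    uint64_t ram_addr;
    struct PoisonPage *next;
} PoisonPage;

static PoisonPage *poison_list;

/* Record a poisoned page once; a repeated report of the same address is ignored. */
static void poison_page_add(uint64_t ram_addr)
{
    PoisonPage *page;

    for (page = poison_list; page; page = page->next) {
        if (page->ram_addr == ram_addr) {
            return;
        }
    }
    page = malloc(sizeof(*page));
    assert(page != NULL);
    page->ram_addr = ram_addr;
    page->next = poison_list;
    poison_list = page;
}

/* Reset-time drain: release every entry and return how many were released. */
static unsigned poison_unpoison_all(void)
{
    unsigned count = 0;

    while (poison_list) {
        PoisonPage *page = poison_list;

        poison_list = page->next;
        /* The real handler remaps the page with qemu_ram_remap() at this point. */
        free(page);
        count++;
    }
    return count;
}
```

Draining an empty list is a no-op, which is why registering the reset handler unconditionally is harmless on architectures that never add a page.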
-Signed-off-by: Eric Auger <eric.auger@redhat.com>
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Peter Xu <peterx@redhat.com>
+Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
-Message-id: 20190409160219.19026-1-eric.auger@redhat.com
+Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
 Acked-by: Xiang Zheng <zhengxiang9@huawei.com>
 Message-id: 20200512030609.19593-8-gengdongjiu@huawei.com
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- include/hw/arm/smmu-common.h |  8 ++------
+ include/sysemu/kvm_int.h | 12 ++++++++++++
- hw/arm/smmu-common.c         |  6 +++---
+ accel/kvm/kvm-all.c      | 36 ++++++++++++++++++++++++++++++++++++
- hw/arm/smmuv3.c              | 28 +++++++---------------------
+ target/i386/kvm.c        | 36 ------------------------------------
-files changed, 12 insertions(+), 30 deletions(-)
+files changed, 48 insertions(+), 36 deletions(-)
-diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h
+diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h
 index XXXXXXX..XXXXXXX 100644
---- a/include/hw/arm/smmu-common.h
+--- a/include/sysemu/kvm_int.h
-+++ b/include/hw/arm/smmu-common.h
++++ b/include/sysemu/kvm_int.h
-@@ -XXX,XX +XXX,XX @@ typedef struct SMMUDevice {
+@@ -XXX,XX +XXX,XX @@ void kvm_memory_listener_register(KVMState *s, KVMMemoryListener *kml,
-     AddressSpace       as;
+                                   AddressSpace *as, int as_id);
-     uint32_t           cfg_cache_hits;
-     uint32_t           cfg_cache_misses;
+ void kvm_set_max_memslot_size(hwaddr max_slot_size);
-+    QLIST_ENTRY(SMMUDevice) next;
++
- } SMMUDevice;
++/**
++ * kvm_hwpoison_page_add:
--typedef struct SMMUNotifierNode {
++ *
--    SMMUDevice *sdev;
++ * Parameters:
--    QLIST_ENTRY(SMMUNotifierNode) next;
++ *  @ram_addr: the address in the RAM for the poisoned page
--} SMMUNotifierNode;
++ *
--
++ * Add a poisoned page to the list
- typedef struct SMMUPciBus {
++ *
-     PCIBus       *bus;
++ * Return: None.
-     SMMUDevice   *pbdev[0]; /* Parent array is sparse, so dynamically alloc */
++ */
-@@ -XXX,XX +XXX,XX @@ typedef struct SMMUState {
++void kvm_hwpoison_page_add(ram_addr_t ram_addr);
-     GHashTable *iotlb;
+ #endif
-     SMMUPciBus *smmu_pcibus_by_bus_num[SMMU_PCI_BUS_MAX];
+diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
      PCIBus *pci_bus;
 -    QLIST_HEAD(, SMMUNotifierNode) notifiers_list;
 +    QLIST_HEAD(, SMMUDevice) devices_with_notifiers;
      uint8_t bus_num;
      PCIBus *primary_bus;
  } SMMUState;
 diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/smmu-common.c
+--- a/accel/kvm/kvm-all.c
-+++ b/hw/arm/smmu-common.c
++++ b/accel/kvm/kvm-all.c
-@@ -XXX,XX +XXX,XX @@ inline void smmu_inv_notifiers_mr(IOMMUMemoryRegion *mr)
+@@ -XXX,XX +XXX,XX @@
- /* Unmap all notifiers of all mr's */
+ #include "qapi/visitor.h"
- void smmu_inv_notifiers_all(SMMUState *s)
+ #include "qapi/qapi-types-common.h"
  #include "qapi/qapi-visit-common.h"
 +#include "sysemu/reset.h"
  #include "hw/boards.h"
@@ -XXX,XX +XXX,XX @@ int kvm_vm_check_extension(KVMState *s, unsigned int extension)
      return ret;
  }
 +typedef struct HWPoisonPage {
 +    ram_addr_t ram_addr;
 +    QLIST_ENTRY(HWPoisonPage) list;
 +} HWPoisonPage;
 +
 +static QLIST_HEAD(, HWPoisonPage) hwpoison_page_list =
 +    QLIST_HEAD_INITIALIZER(hwpoison_page_list);
 +
 +static void kvm_unpoison_all(void *param)
 +{
 +    HWPoisonPage *page, *next_page;
 +
 +    QLIST_FOREACH_SAFE(page, &hwpoison_page_list, list, next_page) {
 +        QLIST_REMOVE(page, list);
 +        qemu_ram_remap(page->ram_addr, TARGET_PAGE_SIZE);
 +        g_free(page);
 +    }
 +}
 +
 +void kvm_hwpoison_page_add(ram_addr_t ram_addr)
 +{
 +    HWPoisonPage *page;
 +
 +    QLIST_FOREACH(page, &hwpoison_page_list, list) {
 +        if (page->ram_addr == ram_addr) {
 +            return;
 +        }
 +    }
 +    page = g_new(HWPoisonPage, 1);
 +    page->ram_addr = ram_addr;
 +    QLIST_INSERT_HEAD(&hwpoison_page_list, page, list);
 +}
 +
  static uint32_t adjust_ioeventfd_endianness(uint32_t val, uint32_t size)
  {
--    SMMUNotifierNode *node;
+ #if defined(HOST_WORDS_BIGENDIAN) != defined(TARGET_WORDS_BIGENDIAN)
-+    SMMUDevice *sdev;
+@@ -XXX,XX +XXX,XX @@ static int kvm_init(MachineState *ms)
+         s->kernel_irqchip_split = mc->default_kernel_irqchip_split ? ON_OFF_AUTO_ON : ON_OFF_AUTO_OFF;
--    QLIST_FOREACH(node, &s->notifiers_list, next) {
+     }
--        smmu_inv_notifiers_mr(&node->sdev->iommu);
-+    QLIST_FOREACH(sdev, &s->devices_with_notifiers, next) {
++    qemu_register_reset(kvm_unpoison_all, NULL);
-+        smmu_inv_notifiers_mr(&sdev->iommu);
++
      if (s->kernel_irqchip_allowed) {
          kvm_irqchip_create(s);
      }
 diff --git a/target/i386/kvm.c b/target/i386/kvm.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/i386/kvm.c
 +++ b/target/i386/kvm.c
@@ -XXX,XX +XXX,XX @@
  #include "sysemu/sysemu.h"
  #include "sysemu/hw_accel.h"
  #include "sysemu/kvm_int.h"
 -#include "sysemu/reset.h"
  #include "sysemu/runstate.h"
  #include "kvm_i386.h"
  #include "hyperv.h"
@@ -XXX,XX +XXX,XX @@ uint64_t kvm_arch_get_supported_msr_feature(KVMState *s, uint32_t index)
      }
  }
-diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
+-
-index XXXXXXX..XXXXXXX 100644
+-typedef struct HWPoisonPage {
---- a/hw/arm/smmuv3.c
+-    ram_addr_t ram_addr;
-+++ b/hw/arm/smmuv3.c
+-    QLIST_ENTRY(HWPoisonPage) list;
-@@ -XXX,XX +XXX,XX @@ static void smmuv3_notify_iova(IOMMUMemoryRegion *mr,
+-} HWPoisonPage;
- /* invalidate an asid/iova tuple in all mr's */
+-
- static void smmuv3_inv_notifiers_iova(SMMUState *s, int asid, dma_addr_t iova)
+-static QLIST_HEAD(, HWPoisonPage) hwpoison_page_list =
- {
+-    QLIST_HEAD_INITIALIZER(hwpoison_page_list);
--    SMMUNotifierNode *node;
+-
-+    SMMUDevice *sdev;
+-static void kvm_unpoison_all(void *param)
+-{
--    QLIST_FOREACH(node, &s->notifiers_list, next) {
+-    HWPoisonPage *page, *next_page;
--        IOMMUMemoryRegion *mr = &node->sdev->iommu;
+-
-+    QLIST_FOREACH(sdev, &s->devices_with_notifiers, next) {
+-    QLIST_FOREACH_SAFE(page, &hwpoison_page_list, list, next_page) {
-+        IOMMUMemoryRegion *mr = &sdev->iommu;
+-        QLIST_REMOVE(page, list);
-         IOMMUNotifier *n;
+-        qemu_ram_remap(page->ram_addr, TARGET_PAGE_SIZE);
+-        g_free(page);
          trace_smmuv3_inv_notifiers_iova(mr->parent_obj.name, asid, iova);
@@ -XXX,XX +XXX,XX @@ static void smmuv3_notify_flag_changed(IOMMUMemoryRegion *iommu,
      SMMUDevice *sdev = container_of(iommu, SMMUDevice, iommu);
      SMMUv3State *s3 = sdev->smmu;
      SMMUState *s = &(s3->smmu_state);
 -    SMMUNotifierNode *node = NULL;
 -    SMMUNotifierNode *next_node = NULL;
      if (new & IOMMU_NOTIFIER_MAP) {
          int bus_num = pci_bus_num(sdev->bus);
@@ -XXX,XX +XXX,XX @@ static void smmuv3_notify_flag_changed(IOMMUMemoryRegion *iommu,
      if (old == IOMMU_NOTIFIER_NONE) {
          trace_smmuv3_notify_flag_add(iommu->parent_obj.name);
 -        node = g_malloc0(sizeof(*node));
 -        node->sdev = sdev;
 -        QLIST_INSERT_HEAD(&s->notifiers_list, node, next);
 -        return;
 -    }
+-}
 -
--    /* update notifier node with new flags */
+-static void kvm_hwpoison_page_add(ram_addr_t ram_addr)
--    QLIST_FOREACH_SAFE(node, &s->notifiers_list, next, next_node) {
+-{
--        if (node->sdev == sdev) {
+-    HWPoisonPage *page;
--            if (new == IOMMU_NOTIFIER_NONE) {
+-
--                trace_smmuv3_notify_flag_del(iommu->parent_obj.name);
+-    QLIST_FOREACH(page, &hwpoison_page_list, list) {
--                QLIST_REMOVE(node, next);
+-        if (page->ram_addr == ram_addr) {
 -                g_free(node);
 -            }
 -            return;
 -        }
-+        QLIST_INSERT_HEAD(&s->devices_with_notifiers, sdev, next);
+-    }
-+    } else if (new == IOMMU_NOTIFIER_NONE) {
+-    page = g_new(HWPoisonPage, 1);
-+        trace_smmuv3_notify_flag_del(iommu->parent_obj.name);
+-    page->ram_addr = ram_addr;
-+        QLIST_REMOVE(sdev, next);
+-    QLIST_INSERT_HEAD(&hwpoison_page_list, page, list);
 -}
 -
  static int kvm_get_mce_cap_supported(KVMState *s, uint64_t *mce_cap,
                                       int *max_banks)
  {
@@ -XXX,XX +XXX,XX @@ int kvm_arch_init(MachineState *ms, KVMState *s)
          fprintf(stderr, "e820_add_entry() table is full\n");
          return ret;
      }
- }
+-    qemu_register_reset(kvm_unpoison_all, NULL);
      shadow_mem = object_property_get_int(OBJECT(s), "kvm-shadow-mem", &error_abort);
      if (shadow_mem != -1) {
 --
 .20.1

-[Qemu-devel] [PULL 39/42] hw/devices: Move LAN9118 declarations into a new header
+[PULL 26/45] ACPI: Record Generic Error Status Block(GESB) table
-From: Philippe Mathieu-Daudé <philmd@redhat.com>
+From: Dongjiu Geng <gengdongjiu@huawei.com>
-Reviewed-by: Markus Armbruster <armbru@redhat.com>
+kvm_arch_on_sigbus_vcpu() error injection uses source_id as
-Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+index in etc/hardware_errors to find out Error Status Data
-Message-id: 20190412165416.7977-10-philmd@redhat.com
+Block entry corresponding to error source. So supported source_id
 values should be assigned here and not be changed afterwards to
 make sure that guest will write error into expected Error Status
 Data Block.
 Before QEMU writes a new error to the ACPI table, it checks whether the
 previous error has been acknowledged. If it has not, the new error is
 ignored and not recorded. For the error section type, QEMU always uses
 the memory error section.
 Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
 Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
 Reviewed-by: Igor Mammedov <imammedo@redhat.com>
 Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
 Message-id: 20200512030609.19593-9-gengdongjiu@huawei.com
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- include/hw/devices.h     |  3 ---
+ include/hw/acpi/ghes.h |   1 +
- include/hw/net/lan9118.h | 19 +++++++++++++++++++
+ hw/acpi/ghes.c         | 219 +++++++++++++++++++++++++++++++++++++++++
- hw/arm/kzm.c             |  2 +-
+files changed, 220 insertions(+)
- hw/arm/mps2.c            |  2 +-
- hw/arm/realview.c        |  1 +
+diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
  hw/arm/vexpress.c        |  2 +-
  hw/net/lan9118.c         |  2 +-
 files changed, 24 insertions(+), 7 deletions(-)
  create mode 100644 include/hw/net/lan9118.h
 diff --git a/include/hw/devices.h b/include/hw/devices.h
 index XXXXXXX..XXXXXXX 100644
---- a/include/hw/devices.h
+--- a/include/hw/acpi/ghes.h
-+++ b/include/hw/devices.h
++++ b/include/hw/acpi/ghes.h
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker);
- /* smc91c111.c */
+ void acpi_build_hest(GArray *table_data, BIOSLinker *linker);
- void smc91c111_init(NICInfo *, uint32_t, qemu_irq);
+ void acpi_ghes_add_fw_cfg(AcpiGhesState *vms, FWCfgState *s,
+                           GArray *hardware_errors);
--/* lan9118.c */
++int acpi_ghes_record_errors(uint8_t notify, uint64_t error_physical_addr);
 -void lan9118_init(NICInfo *, uint32_t, qemu_irq);
 -
  #endif
-diff --git a/include/hw/net/lan9118.h b/include/hw/net/lan9118.h
+diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
 new file mode 100644
 index XXXXXXX..XXXXXXX
 --- /dev/null
 +++ b/include/hw/net/lan9118.h
@@ -XXX,XX +XXX,XX @@
 +/*
 + * SMSC LAN9118 Ethernet interface emulation
 + *
 + * Copyright (c) 2009 CodeSourcery, LLC.
 + * Written by Paul Brook
 + *
 + * This work is licensed under the terms of the GNU GPL, version 2 or later.
 + * See the COPYING file in the top-level directory.
 + */
 +
 +#ifndef HW_NET_LAN9118_H
 +#define HW_NET_LAN9118_H
 +
 +#include "hw/irq.h"
 +#include "net/net.h"
 +
 +void lan9118_init(NICInfo *, uint32_t, qemu_irq);
 +
 +#endif
 diff --git a/hw/arm/kzm.c b/hw/arm/kzm.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/kzm.c
+--- a/hw/acpi/ghes.c
-+++ b/hw/arm/kzm.c
++++ b/hw/acpi/ghes.c
 @@ -XXX,XX +XXX,XX @@
  #include "qemu/error-report.h"
- #include "exec/address-spaces.h"
+ #include "hw/acpi/generic_event_device.h"
- #include "net/net.h"
+ #include "hw/nvram/fw_cfg.h"
--#include "hw/devices.h"
++#include "qemu/uuid.h"
-+#include "hw/net/lan9118.h"
- #include "hw/char/serial.h"
+ #define ACPI_GHES_ERRORS_FW_CFG_FILE        "etc/hardware_errors"
- #include "sysemu/qtest.h"
+ #define ACPI_GHES_DATA_ADDR_FW_CFG_FILE     "etc/hardware_errors_addr"
 diff --git a/hw/arm/mps2.c b/hw/arm/mps2.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/arm/mps2.c
 +++ b/hw/arm/mps2.c
 @@ -XXX,XX +XXX,XX @@
- #include "hw/timer/cmsdk-apb-timer.h"
+ /* Address offset in Generic Address Structure(GAS) */
- #include "hw/timer/cmsdk-apb-dualtimer.h"
+ #define GAS_ADDR_OFFSET 4
- #include "hw/misc/mps2-scc.h"
--#include "hw/devices.h"
++/*
-+#include "hw/net/lan9118.h"
++ * The total size of Generic Error Data Entry
- #include "net/net.h"
++ * ACPI 6.1/6.2: 18.3.2.7.1 Generic Error Data,
++ * Table 18-343 Generic Error Data Entry
- typedef enum MPS2FPGAType {
++ */
-diff --git a/hw/arm/realview.c b/hw/arm/realview.c
++#define ACPI_GHES_DATA_LENGTH               72
-index XXXXXXX..XXXXXXX 100644
++
---- a/hw/arm/realview.c
++/* The memory section CPER size, UEFI 2.6: N.2.5 Memory Error Section */
-+++ b/hw/arm/realview.c
++#define ACPI_GHES_MEM_CPER_LENGTH           80
-@@ -XXX,XX +XXX,XX @@
++
- #include "hw/arm/arm.h"
++/* Masks for block_status flags */
- #include "hw/arm/primecell.h"
++#define ACPI_GEBS_UNCORRECTABLE         1
- #include "hw/devices.h"
++
-+#include "hw/net/lan9118.h"
++/*
- #include "hw/pci/pci.h"
++ * Total size for Generic Error Status Block except Generic Error Data Entries
- #include "net/net.h"
++ * ACPI 6.2: 18.3.2.7.1 Generic Error Data,
- #include "sysemu/sysemu.h"
++ * Table 18-380 Generic Error Status Block
-diff --git a/hw/arm/vexpress.c b/hw/arm/vexpress.c
++ */
-index XXXXXXX..XXXXXXX 100644
++#define ACPI_GHES_GESB_SIZE                 20
---- a/hw/arm/vexpress.c
++
-+++ b/hw/arm/vexpress.c
++/*
-@@ -XXX,XX +XXX,XX @@
++ * Values for error_severity field
- #include "hw/sysbus.h"
++ */
- #include "hw/arm/arm.h"
++enum AcpiGenericErrorSeverity {
- #include "hw/arm/primecell.h"
++    ACPI_CPER_SEV_RECOVERABLE = 0,
--#include "hw/devices.h"
++    ACPI_CPER_SEV_FATAL = 1,
-+#include "hw/net/lan9118.h"
++    ACPI_CPER_SEV_CORRECTED = 2,
- #include "hw/i2c/i2c.h"
++    ACPI_CPER_SEV_NONE = 3,
- #include "net/net.h"
++};
- #include "sysemu/sysemu.h"
++
-diff --git a/hw/net/lan9118.c b/hw/net/lan9118.c
+ /*
-index XXXXXXX..XXXXXXX 100644
+  * Hardware Error Notification
---- a/hw/net/lan9118.c
+  * ACPI 4.0: 17.3.2.7 Hardware Error Notification
-+++ b/hw/net/lan9118.c
+@@ -XXX,XX +XXX,XX @@ static void build_ghes_hw_error_notification(GArray *table, const uint8_t type)
-@@ -XXX,XX +XXX,XX @@
+     build_append_int_noprefix(table, 0, 4);
- #include "hw/sysbus.h"
+ }
- #include "net/net.h"
- #include "net/eth.h"
++/*
--#include "hw/devices.h"
++ * Generic Error Data Entry
-+#include "hw/net/lan9118.h"
++ * ACPI 6.1: 18.3.2.7.1 Generic Error Data
- #include "sysemu/sysemu.h"
++ */
- #include "hw/ptimer.h"
++static void acpi_ghes_generic_error_data(GArray *table,
- #include "qemu/log.h"
++                const uint8_t *section_type, uint32_t error_severity,
 +                uint8_t validation_bits, uint8_t flags,
 +                uint32_t error_data_length, QemuUUID fru_id,
 +                uint64_t time_stamp)
 +{
 +    const uint8_t fru_text[20] = {0};
 +
 +    /* Section Type */
 +    g_array_append_vals(table, section_type, 16);
 +
 +    /* Error Severity */
 +    build_append_int_noprefix(table, error_severity, 4);
 +    /* Revision */
 +    build_append_int_noprefix(table, 0x300, 2);
 +    /* Validation Bits */
 +    build_append_int_noprefix(table, validation_bits, 1);
 +    /* Flags */
 +    build_append_int_noprefix(table, flags, 1);
 +    /* Error Data Length */
 +    build_append_int_noprefix(table, error_data_length, 4);
 +
 +    /* FRU Id */
 +    g_array_append_vals(table, fru_id.data, ARRAY_SIZE(fru_id.data));
 +
 +    /* FRU Text */
 +    g_array_append_vals(table, fru_text, sizeof(fru_text));
 +
 +    /* Timestamp */
 +    build_append_int_noprefix(table, time_stamp, 8);
 +}
 +
 +/*
 + * Generic Error Status Block
 + * ACPI 6.1: 18.3.2.7.1 Generic Error Data
 + */
 +static void acpi_ghes_generic_error_status(GArray *table, uint32_t block_status,
 +                uint32_t raw_data_offset, uint32_t raw_data_length,
 +                uint32_t data_length, uint32_t error_severity)
 +{
 +    /* Block Status */
 +    build_append_int_noprefix(table, block_status, 4);
 +    /* Raw Data Offset */
 +    build_append_int_noprefix(table, raw_data_offset, 4);
 +    /* Raw Data Length */
 +    build_append_int_noprefix(table, raw_data_length, 4);
 +    /* Data Length */
 +    build_append_int_noprefix(table, data_length, 4);
 +    /* Error Severity */
 +    build_append_int_noprefix(table, error_severity, 4);
 +}
 +
 +/* UEFI 2.6: N.2.5 Memory Error Section */
 +static void acpi_ghes_build_append_mem_cper(GArray *table,
 +                                            uint64_t error_physical_addr)
 +{
 +    /*
 +     * Memory Error Record
 +     */
 +
 +    /* Validation Bits */
 +    build_append_int_noprefix(table,
 +                              (1ULL << 14) | /* Type Valid */
 +                              (1ULL << 1) /* Physical Address Valid */,
 +                              8);
 +    /* Error Status */
 +    build_append_int_noprefix(table, 0, 8);
 +    /* Physical Address */
 +    build_append_int_noprefix(table, error_physical_addr, 8);
 +    /* Skip all the detailed information normally found in such a record */
 +    build_append_int_noprefix(table, 0, 48);
 +    /* Memory Error Type */
 +    build_append_int_noprefix(table, 0 /* Unknown error */, 1);
 +    /* Skip all the detailed information normally found in such a record */
 +    build_append_int_noprefix(table, 0, 7);
 +}
 +
 +static int acpi_ghes_record_mem_error(uint64_t error_block_address,
 +                                      uint64_t error_physical_addr)
 +{
 +    GArray *block;
 +
 +    /* Memory Error Section Type */
 +    const uint8_t uefi_cper_mem_sec[] =
 +          UUID_LE(0xA5BC1114, 0x6F64, 0x4EDE, 0xB8, 0x63, 0x3E, 0x83, \
 +                  0xED, 0x7C, 0x83, 0xB1);
 +
 +    /* invalid fru id: ACPI 4.0: 17.3.2.6.1 Generic Error Data,
 +     * Table 17-13 Generic Error Data Entry
 +     */
 +    QemuUUID fru_id = {};
 +    uint32_t data_length;
 +
 +    block = g_array_new(false, true /* clear */, 1);
 +
 +    /* This is the length if adding a new generic error data entry */
 +    data_length = ACPI_GHES_DATA_LENGTH + ACPI_GHES_MEM_CPER_LENGTH;
 +
 +    /*
 +     * Check whether it will run out of the preallocated memory if adding a new
 +     * generic error data entry
 +     */
 +    if ((data_length + ACPI_GHES_GESB_SIZE) > ACPI_GHES_MAX_RAW_DATA_LENGTH) {
 +        error_report("Not enough memory to record new CPER!!!");
 +        g_array_free(block, true);
 +        return -1;
 +    }
 +
 +    /* Build the new generic error status block header */
 +    acpi_ghes_generic_error_status(block, ACPI_GEBS_UNCORRECTABLE,
 +        0, 0, data_length, ACPI_CPER_SEV_RECOVERABLE);
 +
 +    /* Build this new generic error data entry header */
 +    acpi_ghes_generic_error_data(block, uefi_cper_mem_sec,
 +        ACPI_CPER_SEV_RECOVERABLE, 0, 0,
 +        ACPI_GHES_MEM_CPER_LENGTH, fru_id, 0);
 +
 +    /* Build the memory section CPER for above new generic error data entry */
 +    acpi_ghes_build_append_mem_cper(block, error_physical_addr);
 +
 +    /* Write the generic error data entry into guest memory */
 +    cpu_physical_memory_write(error_block_address, block->data, block->len);
 +
 +    g_array_free(block, true);
 +
 +    return 0;
 +}
 +
  /*
   * Build table for the hardware error fw_cfg blob.
   * Initialize "etc/hardware_errors" and "etc/hardware_errors_addr" fw_cfg blobs.
@@ -XXX,XX +XXX,XX @@ void acpi_ghes_add_fw_cfg(AcpiGhesState *ags, FWCfgState *s,
      fw_cfg_add_file_callback(s, ACPI_GHES_DATA_ADDR_FW_CFG_FILE, NULL, NULL,
          NULL, &(ags->ghes_addr_le), sizeof(ags->ghes_addr_le), false);
  }
 +
 +int acpi_ghes_record_errors(uint8_t source_id, uint64_t physical_address)
 +{
 +    uint64_t error_block_addr, read_ack_register_addr, read_ack_register = 0;
 +    uint64_t start_addr;
 +    int ret = -1;
 +    AcpiGedState *acpi_ged_state;
 +    AcpiGhesState *ags;
 +
 +    assert(source_id < ACPI_HEST_SRC_ID_RESERVED);
 +
 +    acpi_ged_state = ACPI_GED(object_resolve_path_type("", TYPE_ACPI_GED,
 +                                                       NULL));
 +    g_assert(acpi_ged_state);
 +    ags = &acpi_ged_state->ghes_state;
 +
 +    start_addr = le64_to_cpu(ags->ghes_addr_le);
 +
 +    if (physical_address) {
 +
 +        if (source_id < ACPI_HEST_SRC_ID_RESERVED) {
 +            start_addr += source_id * sizeof(uint64_t);
 +        }
 +
 +        cpu_physical_memory_read(start_addr, &error_block_addr,
 +                                 sizeof(error_block_addr));
 +
 +        error_block_addr = le64_to_cpu(error_block_addr);
 +
 +        read_ack_register_addr = start_addr +
 +            ACPI_GHES_ERROR_SOURCE_COUNT * sizeof(uint64_t);
 +
 +        cpu_physical_memory_read(read_ack_register_addr,
 +                                 &read_ack_register, sizeof(read_ack_register));
 +
 +        /* zero means OSPM does not acknowledge the error */
 +        if (!read_ack_register) {
 +            error_report("OSPM does not acknowledge previous error,"
 +                " so can not record CPER for current error anymore");
 +        } else if (error_block_addr) {
 +            read_ack_register = cpu_to_le64(0);
 +            /*
 +             * Clear the Read Ack Register, OSPM will write it to 1 when
 +             * it acknowledges this error.
 +             */
 +            cpu_physical_memory_write(read_ack_register_addr,
 +                &read_ack_register, sizeof(uint64_t));
 +
 +            ret = acpi_ghes_record_mem_error(error_block_addr,
 +                                             physical_address);
 +        } else {
 +            error_report("can not find Generic Error Status Block");
 +        }
 +    }
 +
 +    return ret;
 +}
 --
 .20.1

-[Qemu-devel] [PULL 14/42] target/arm: Implement v7m_update_fpccr()
+[PULL 27/45] target-arm: kvm64: handle SIGBUS signal from kernel or KVM
-Implement the code which updates the FPCCR register on an
+From: Dongjiu Geng <gengdongjiu@huawei.com>
-exception entry where we are going to use lazy FP stacking.
-We have to defer to the NVIC to determine whether the
+Add a SIGBUS signal handler. In this handler, it checks the SIGBUS type,
-various exceptions are currently ready or not.
+translates the host VA delivered by host to guest PA, then fills this PA
+to guest APEI GHES memory, then notifies guest according to the SIGBUS
 type.
 When the guest accesses the poisoned memory, it generates a Synchronous
 External Abort (SEA). The host kernel then gets an APEI notification and
 calls memory_failure() to unmap the affected page in stage 2, and finally
 returns to the guest.
 If the guest continues to access the PG_hwpoison page, it traps to KVM as
 a stage 2 fault, and a SIGBUS_MCEERR_AR synchronous signal is delivered to
 QEMU. QEMU records this error address into guest APEI GHES memory and
 notifies the guest using a Synchronous External Abort (SEA).
 In order to inject a vSEA, we introduce the kvm_inject_arm_sea() function,
 in which we can set up the type of exception and the syndrome information.
 When switching to the guest, the target vcpu will jump to the synchronous
 external abort vector table entry.
 The ESR_ELx.DFSC is set to synchronous external abort (0x10), and the
 ESR_ELx.FnV is set to not valid (0x1), which tells the guest that FAR is
 not valid and holds an UNKNOWN value. These values are set in the KVM
 register structures through the KVM_SET_ONE_REG ioctl.
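
As a rough illustration of the syndrome value described above, the sketch below packs the exception class, IL, FnV and DFSC fields into a 32-bit ESR value. The SKETCH_* macros and sketch_sea_syndrome() are hypothetical names chosen for this example; the bit positions follow the ESR_ELx data abort encoding, and the patch itself builds the value with QEMU's own syndrome helpers:

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_EC_SHIFT      26
#define SKETCH_EC_DATAABORT  0x24u       /* data abort from a lower EL; +1 for same EL */
#define SKETCH_IL            (1u << 25)  /* 32-bit instruction length */
#define SKETCH_FNV           (1u << 10)  /* FAR not valid */
#define SKETCH_DFSC_SEA      0x10u       /* synchronous external abort */

/* Build the data abort syndrome for an injected SEA with FnV set. */
static uint32_t sketch_sea_syndrome(int same_el)
{
    uint32_t ec;

    assert(same_el == 0 || same_el == 1);
    ec = SKETCH_EC_DATAABORT + (uint32_t)same_el;
    return (ec << SKETCH_EC_SHIFT) | SKETCH_IL | SKETCH_FNV | SKETCH_DFSC_SEA;
}
```

With these assumptions an abort taken from a lower exception level yields 0x92000410, and one taken from the same level yields 0x96000410.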
 Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
 Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
 Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
 Acked-by: Xiang Zheng <zhengxiang9@huawei.com>
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Igor Mammedov <imammedo@redhat.com>
 Message-id: 20200512030609.19593-10-gengdongjiu@huawei.com
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Message-id: 20190416125744.27770-12-peter.maydell@linaro.org
 ---
- target/arm/cpu.h      | 14 +++++++++
+ include/sysemu/kvm.h    |  3 +-
- hw/intc/armv7m_nvic.c | 34 ++++++++++++++++++++++
+ target/arm/cpu.h        |  4 +++
- target/arm/helper.c   | 67 ++++++++++++++++++++++++++++++++++++++++++-
+ target/arm/internals.h  |  5 +--
-files changed, 114 insertions(+), 1 deletion(-)
+ target/i386/cpu.h       |  2 ++
+ target/arm/helper.c     |  2 +-
  target/arm/kvm64.c      | 77 +++++++++++++++++++++++++++++++++++++++++
  target/arm/tlb_helper.c |  2 +-
 files changed, 89 insertions(+), 6 deletions(-)
 diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
 index XXXXXXX..XXXXXXX 100644
 --- a/include/sysemu/kvm.h
 +++ b/include/sysemu/kvm.h
@@ -XXX,XX +XXX,XX @@ bool kvm_vcpu_id_is_valid(int vcpu_id);
  /* Returns VCPU ID to be used on KVM_CREATE_VCPU ioctl() */
  unsigned long kvm_arch_vcpu_id(CPUState *cpu);
 -#ifdef TARGET_I386
 -#define KVM_HAVE_MCE_INJECTION 1
 +#ifdef KVM_HAVE_MCE_INJECTION
  void kvm_arch_on_sigbus_vcpu(CPUState *cpu, int code, void *addr);
  #endif
 diff --git a/target/arm/cpu.h b/target/arm/cpu.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu.h
 +++ b/target/arm/cpu.h
-@@ -XXX,XX +XXX,XX @@ void armv7m_nvic_acknowledge_irq(void *opaque);
+@@ -XXX,XX +XXX,XX @@
-  * (Ignoring -1, this is the same as the RETTOBASE value before completion.)
+ /* ARM processors have a weak memory model */
-  */
+ #define TCG_GUEST_DEFAULT_MO      (0)
- int armv7m_nvic_complete_irq(void *opaque, int irq, bool secure);
-+/**
++#ifdef TARGET_AARCH64
-+ * armv7m_nvic_get_ready_status(void *opaque, int irq, bool secure)
++#define KVM_HAVE_MCE_INJECTION 1
-+ * @opaque: the NVIC
++#endif
-+ * @irq: the exception number to mark pending
++
-+ * @secure: false for non-banked exceptions or for the nonsecure
+ #define EXCP_UDEF            1   /* undefined instruction */
-+ * version of a banked exception, true for the secure version of a banked
+ #define EXCP_SWI             2   /* software interrupt */
-+ * exception.
+ #define EXCP_PREFETCH_ABORT  3
-+ *
+diff --git a/target/arm/internals.h b/target/arm/internals.h
-+ * Return whether an exception is "ready", i.e. whether the exception is
+index XXXXXXX..XXXXXXX 100644
-+ * enabled and is configured at a priority which would allow it to
+--- a/target/arm/internals.h
-+ * interrupt the current execution priority. This controls whether the
++++ b/target/arm/internals.h
-+ * RDY bit for it in the FPCCR is set.
+@@ -XXX,XX +XXX,XX @@ static inline uint32_t syn_insn_abort(int same_el, int ea, int s1ptw, int fsc)
-+ */
+         | ARM_EL_IL | (ea << 9) | (s1ptw << 7) | fsc;
-+bool armv7m_nvic_get_ready_status(void *opaque, int irq, bool secure);
+ }
- /**
-  * armv7m_nvic_raw_execution_priority: return the raw execution priority
+-static inline uint32_t syn_data_abort_no_iss(int same_el,
-  * @opaque: the NVIC
++static inline uint32_t syn_data_abort_no_iss(int same_el, int fnv,
-diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
+                                              int ea, int cm, int s1ptw,
-index XXXXXXX..XXXXXXX 100644
+                                              int wnr, int fsc)
 --- a/hw/intc/armv7m_nvic.c
 +++ b/hw/intc/armv7m_nvic.c
@@ -XXX,XX +XXX,XX @@ int armv7m_nvic_complete_irq(void *opaque, int irq, bool secure)
      return ret;
  }
 +bool armv7m_nvic_get_ready_status(void *opaque, int irq, bool secure)
 +{
 +    /*
 +     * Return whether an exception is "ready", i.e. it is enabled and is
 +     * configured at a priority which would allow it to interrupt the
 +     * current execution priority.
 +     *
 +     * irq and secure have the same semantics as for armv7m_nvic_set_pending():
 +     * for non-banked exceptions secure is always false; for banked exceptions
 +     * it indicates which of the exceptions is required.
 +     */
 +    NVICState *s = (NVICState *)opaque;
 +    bool banked = exc_is_banked(irq);
 +    VecInfo *vec;
 +    int running = nvic_exec_prio(s);
 +
 +    assert(irq > ARMV7M_EXCP_RESET && irq < s->num_irq);
 +    assert(!secure || banked);
 +
 +    /*
 +     * HardFault is an odd special case: we always check against -1,
 +     * even if we're secure and HardFault has priority -3; we never
 +     * need to check for enabled state.
 +     */
 +    if (irq == ARMV7M_EXCP_HARD) {
 +        return running > -1;
 +    }
 +
 +    vec = (banked && secure) ? &s->sec_vectors[irq] : &s->vectors[irq];
 +
 +    return vec->enabled &&
 +        exc_group_prio(s, vec->prio, secure) < running;
 +}
 +
  /* callback when external interrupt line is changed */
  static void set_irq_level(void *opaque, int n, int level)
  {
+     return (EC_DATAABORT << ARM_EL_EC_SHIFT) | (same_el << ARM_EL_EC_SHIFT)
+            | ARM_EL_IL
+-           | (ea << 9) | (cm << 8) | (s1ptw << 7) | (wnr << 6) | fsc;
++           | (fnv << 10) | (ea << 9) | (cm << 8) | (s1ptw << 7)
++           | (wnr << 6) | fsc;
+ }
+ static inline uint32_t syn_data_abort_with_iss(int same_el,
+diff --git a/target/i386/cpu.h b/target/i386/cpu.h
+index XXXXXXX..XXXXXXX 100644
+--- a/target/i386/cpu.h
++++ b/target/i386/cpu.h
+@@ -XXX,XX +XXX,XX @@
+ /* The x86 has a strong memory model with some store-after-load re-ordering */
+ #define TCG_GUEST_DEFAULT_MO      (TCG_MO_ALL & ~TCG_MO_ST_LD)
++#define KVM_HAVE_MCE_INJECTION 1
++
+ /* Maximum instruction code size */
+ #define TARGET_MAX_INSN_SIZE 16
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.c
 +++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ static void v7m_exception_taken(ARMCPU *cpu, uint32_t lr, bool dotailchain,
+@@ -XXX,XX +XXX,XX @@ static uint64_t do_ats_write(CPUARMState *env, uint64_t value,
-     env->thumb = addr & 1;
+              * Report exception with ESR indicating a fault due to a
- }
+              * translation table walk for a cache maintenance instruction.
+              */
-+static void v7m_update_fpccr(CPUARMState *env, uint32_t frameptr,
+-            syn = syn_data_abort_no_iss(current_el == target_el,
-+                             bool apply_splim)
++            syn = syn_data_abort_no_iss(current_el == target_el, 0,
                                          fi.ea, 1, fi.s1ptw, 1, fsc);
              env->exception.vaddress = value;
              env->exception.fsr = fsr;
 diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/kvm64.c
 +++ b/target/arm/kvm64.c
@@ -XXX,XX +XXX,XX @@
  #include "sysemu/kvm_int.h"
  #include "kvm_arm.h"
  #include "internals.h"
 +#include "hw/acpi/acpi.h"
 +#include "hw/acpi/ghes.h"
 +#include "hw/arm/virt.h"
  static bool have_guest_debug;
@@ -XXX,XX +XXX,XX @@ int kvm_arm_cpreg_level(uint64_t regidx)
      return KVM_PUT_RUNTIME_STATE;
  }
 +/* Callers must hold the iothread mutex lock */
 +static void kvm_inject_arm_sea(CPUState *c)
 +{
++    ARMCPU *cpu = ARM_CPU(c);
++    CPUARMState *env = &cpu->env;
++    CPUClass *cc = CPU_GET_CLASS(c);
++    uint32_t esr;
++    bool same_el;
++
++    c->exception_index = EXCP_DATA_ABORT;
++    env->exception.target_el = 1;
++
 +    /*
-+     * Like the pseudocode UpdateFPCCR: save state in FPCAR and FPCCR
++     * Set the DFSC to synchronous external abort and set FnV to not valid,
-+     * that we will need later in order to do lazy FP reg stacking.
++     * this will tell guest the FAR_ELx is UNKNOWN for this abort.
 +     */
-+    bool is_secure = env->v7m.secure;
++    same_el = arm_current_el(env) == env->exception.target_el;
-+    void *nvic = env->nvic;
++    esr = syn_data_abort_no_iss(same_el, 1, 0, 0, 0, 0, 0x10);
-+    /*
++
-+     * Some bits are unbanked and live always in fpccr[M_REG_S]; some bits
++    env->exception.syndrome = esr;
-+     * are banked and we want to update the bit in the bank for the
++
-+     * current security state; and in one case we want to specifically
++    cc->do_interrupt(c);
-+     * update the NS banked version of a bit even if we are secure.
++}
-+     */
++
-+    uint32_t *fpccr_s = &env->v7m.fpccr[M_REG_S];
+ #define AARCH64_CORE_REG(x)   (KVM_REG_ARM64 | KVM_REG_SIZE_U64 | \
-+    uint32_t *fpccr_ns = &env->v7m.fpccr[M_REG_NS];
+                  KVM_REG_ARM_CORE | KVM_REG_ARM_CORE_REG(x))
-+    uint32_t *fpccr = &env->v7m.fpccr[is_secure];
-+    bool hfrdy, bfrdy, mmrdy, ns_ufrdy, s_ufrdy, sfrdy, monrdy;
+@@ -XXX,XX +XXX,XX @@ int kvm_arch_get_registers(CPUState *cs)
-+
+     return ret;
-+    env->v7m.fpcar[is_secure] = frameptr & ~0x7;
+ }
-+
-+    if (apply_splim && arm_feature(env, ARM_FEATURE_V8)) {
++void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
-+        bool splimviol;
++{
-+        uint32_t splim = v7m_sp_limit(env);
++    ram_addr_t ram_addr;
-+        bool ign = armv7m_nvic_neg_prio_requested(nvic, is_secure) &&
++    hwaddr paddr;
-+            (env->v7m.ccr[is_secure] & R_V7M_CCR_STKOFHFNMIGN_MASK);
++    Object *obj = qdev_get_machine();
-+
++    VirtMachineState *vms = VIRT_MACHINE(obj);
-+        splimviol = !ign && frameptr < splim;
++    bool acpi_enabled = virt_is_acpi_enabled(vms);
-+        *fpccr = FIELD_DP32(*fpccr, V7M_FPCCR, SPLIMVIOL, splimviol);
++
 +    assert(code == BUS_MCEERR_AR || code == BUS_MCEERR_AO);
 +
 +    if (acpi_enabled && addr &&
 +            object_property_get_bool(obj, "ras", NULL)) {
 +        ram_addr = qemu_ram_addr_from_host(addr);
 +        if (ram_addr != RAM_ADDR_INVALID &&
 +            kvm_physical_memory_addr_from_host(c->kvm_state, addr, &paddr)) {
 +            kvm_hwpoison_page_add(ram_addr);
 +            /*
 +             * If this is a BUS_MCEERR_AR, we know we have been called
 +             * synchronously from the vCPU thread, so we can easily
 +             * synchronize the state and inject an error.
 +             *
 +             * TODO: we currently don't tell the guest at all about
 +             * BUS_MCEERR_AO. In that case we might either be being
 +             * called synchronously from the vCPU thread, or a bit
 +             * later from the main thread, so doing the injection of
 +             * the error would be more complicated.
 +             */
 +            if (code == BUS_MCEERR_AR) {
 +                kvm_cpu_synchronize_state(c);
 +                if (!acpi_ghes_record_errors(ACPI_HEST_SRC_ID_SEA, paddr)) {
 +                    kvm_inject_arm_sea(c);
 +                } else {
 +                    error_report("failed to record the error");
 +                    abort();
 +                }
 +            }
 +            return;
 +        }
 +        if (code == BUS_MCEERR_AO) {
 +            error_report("Hardware memory error at addr %p for memory used by "
 +                "QEMU itself instead of guest system!", addr);
 +        }
 +    }
 +
-+    *fpccr = FIELD_DP32(*fpccr, V7M_FPCCR, LSPACT, 1);
++    if (code == BUS_MCEERR_AR) {
-+
++        error_report("Hardware memory error!");
-+    *fpccr_s = FIELD_DP32(*fpccr_s, V7M_FPCCR, S, is_secure);
++        exit(1);
 +
 +    *fpccr = FIELD_DP32(*fpccr, V7M_FPCCR, USER, arm_current_el(env) == 0);
 +
 +    *fpccr = FIELD_DP32(*fpccr, V7M_FPCCR, THREAD,
 +                        !arm_v7m_is_handler_mode(env));
 +
 +    hfrdy = armv7m_nvic_get_ready_status(nvic, ARMV7M_EXCP_HARD, false);
 +    *fpccr_s = FIELD_DP32(*fpccr_s, V7M_FPCCR, HFRDY, hfrdy);
 +
 +    bfrdy = armv7m_nvic_get_ready_status(nvic, ARMV7M_EXCP_BUS, false);
 +    *fpccr_s = FIELD_DP32(*fpccr_s, V7M_FPCCR, BFRDY, bfrdy);
 +
 +    mmrdy = armv7m_nvic_get_ready_status(nvic, ARMV7M_EXCP_MEM, is_secure);
 +    *fpccr = FIELD_DP32(*fpccr, V7M_FPCCR, MMRDY, mmrdy);
 +
 +    ns_ufrdy = armv7m_nvic_get_ready_status(nvic, ARMV7M_EXCP_USAGE, false);
 +    *fpccr_ns = FIELD_DP32(*fpccr_ns, V7M_FPCCR, UFRDY, ns_ufrdy);
 +
 +    monrdy = armv7m_nvic_get_ready_status(nvic, ARMV7M_EXCP_DEBUG, false);
 +    *fpccr_s = FIELD_DP32(*fpccr_s, V7M_FPCCR, MONRDY, monrdy);
 +
 +    if (arm_feature(env, ARM_FEATURE_M_SECURITY)) {
 +        s_ufrdy = armv7m_nvic_get_ready_status(nvic, ARMV7M_EXCP_USAGE, true);
 +        *fpccr_s = FIELD_DP32(*fpccr_s, V7M_FPCCR, UFRDY, s_ufrdy);
 +
 +        sfrdy = armv7m_nvic_get_ready_status(nvic, ARMV7M_EXCP_SECURE, false);
 +        *fpccr_s = FIELD_DP32(*fpccr_s, V7M_FPCCR, SFRDY, sfrdy);
 +    }
 +}
 +
- static bool v7m_push_stack(ARMCPU *cpu)
+ /* C6.6.29 BRK instruction */
- {
+ static const uint32_t brk_insn = 0xd4200000;
-     /* Do the "set up stack frame" part of exception entry,
-@@ -XXX,XX +XXX,XX @@ static bool v7m_push_stack(ARMCPU *cpu)
+diff --git a/target/arm/tlb_helper.c b/target/arm/tlb_helper.c
-                 }
+index XXXXXXX..XXXXXXX 100644
-             } else {
+--- a/target/arm/tlb_helper.c
-                 /* Lazy stacking enabled, save necessary info to stack later */
++++ b/target/arm/tlb_helper.c
--                /* TODO : equivalent of UpdateFPCCR() pseudocode */
+@@ -XXX,XX +XXX,XX @@ static inline uint32_t merge_syn_data_abort(uint32_t template_syn,
-+                v7m_update_fpccr(env, frameptr + 0x20, true);
+      * ISV field.
-             }
+      */
-         }
+     if (!(template_syn & ARM_EL_ISV) || target_el != 2 || s1ptw) {
-     }
+-        syn = syn_data_abort_no_iss(same_el,
 +        syn = syn_data_abort_no_iss(same_el, 0,
                                      ea, 0, s1ptw, is_write, fsc);
      } else {
          /*
 --
 .20.1

-[Qemu-devel] [PULL 34/42] hw/devices: Move TC6393XB declarations into a new header
+[PULL 28/45] MAINTAINERS: Add ACPI/HEST/GHES entries
-From: Philippe Mathieu-Daudé <philmd@redhat.com>
+From: Dongjiu Geng <gengdongjiu@huawei.com>
-Reviewed-by: Markus Armbruster <armbru@redhat.com>
+Xiang and I are willing to review the APEI-related patches and
-Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+volunteer as the reviewers for the HEST/GHES part.
-Message-id: 20190412165416.7977-5-philmd@redhat.com
 Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
 Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
 Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
 Acked-by: Michael S. Tsirkin <mst@redhat.com>
 Message-id: 20200512030609.19593-11-gengdongjiu@huawei.com
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- include/hw/devices.h          |  6 ------
+ MAINTAINERS | 9 +++++++++
- include/hw/display/tc6393xb.h | 24 ++++++++++++++++++++++++
+file changed, 9 insertions(+)
  hw/arm/tosa.c                 |  2 +-
  hw/display/tc6393xb.c         |  2 +-
  MAINTAINERS                   |  1 +
 files changed, 27 insertions(+), 8 deletions(-)
  create mode 100644 include/hw/display/tc6393xb.h
-diff --git a/include/hw/devices.h b/include/hw/devices.h
-index XXXXXXX..XXXXXXX 100644
---- a/include/hw/devices.h
-+++ b/include/hw/devices.h
-@@ -XXX,XX +XXX,XX @@ void *tahvo_init(qemu_irq irq, int betty);
- void retu_key_event(void *retu, int state);
--/* tc6393xb.c */
--typedef struct TC6393xbState TC6393xbState;
--TC6393xbState *tc6393xb_init(struct MemoryRegion *sysmem,
--                             uint32_t base, qemu_irq irq);
--qemu_irq tc6393xb_l3v_get(TC6393xbState *s);
--
- #endif
-diff --git a/include/hw/display/tc6393xb.h b/include/hw/display/tc6393xb.h
-new file mode 100644
-index XXXXXXX..XXXXXXX
---- /dev/null
-+++ b/include/hw/display/tc6393xb.h
-@@ -XXX,XX +XXX,XX @@
-+/*
-+ * Toshiba TC6393XB I/O Controller.
-+ * Found in Sharp Zaurus SL-6000 (tosa) or some
-+ * Toshiba e-Series PDAs.
-+ *
-+ * Copyright (c) 2007 Hervé Poussineau
-+ *
-+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
-+ * See the COPYING file in the top-level directory.
-+ */
-+
-+#ifndef HW_DISPLAY_TC6393XB_H
-+#define HW_DISPLAY_TC6393XB_H
-+
-+#include "exec/memory.h"
-+#include "hw/irq.h"
-+
-+typedef struct TC6393xbState TC6393xbState;
-+
-+TC6393xbState *tc6393xb_init(struct MemoryRegion *sysmem,
-+                             uint32_t base, qemu_irq irq);
-+qemu_irq tc6393xb_l3v_get(TC6393xbState *s);
-+
-+#endif
-diff --git a/hw/arm/tosa.c b/hw/arm/tosa.c
-index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/tosa.c
-+++ b/hw/arm/tosa.c
-@@ -XXX,XX +XXX,XX @@
- #include "hw/hw.h"
- #include "hw/arm/pxa.h"
- #include "hw/arm/arm.h"
--#include "hw/devices.h"
- #include "hw/arm/sharpsl.h"
- #include "hw/pcmcia.h"
- #include "hw/boards.h"
-+#include "hw/display/tc6393xb.h"
- #include "hw/i2c/i2c.h"
- #include "hw/ssi/ssi.h"
- #include "hw/sysbus.h"
-diff --git a/hw/display/tc6393xb.c b/hw/display/tc6393xb.c
-index XXXXXXX..XXXXXXX 100644
---- a/hw/display/tc6393xb.c
-+++ b/hw/display/tc6393xb.c
-@@ -XXX,XX +XXX,XX @@
- #include "qapi/error.h"
- #include "qemu/host-utils.h"
- #include "hw/hw.h"
--#include "hw/devices.h"
-+#include "hw/display/tc6393xb.h"
- #include "hw/block/flash.h"
- #include "ui/console.h"
- #include "ui/pixel_ops.h"
 diff --git a/MAINTAINERS b/MAINTAINERS
 index XXXXXXX..XXXXXXX 100644
 --- a/MAINTAINERS
 +++ b/MAINTAINERS
-@@ -XXX,XX +XXX,XX @@ F: hw/misc/mst_fpga.c
+@@ -XXX,XX +XXX,XX @@ F: tests/qtest/bios-tables-test.c
- F: hw/misc/max111x.c
+ F: tests/qtest/acpi-utils.[hc]
- F: include/hw/arm/pxa.h
+ F: tests/data/acpi/
- F: include/hw/arm/sharpsl.h
-+F: include/hw/display/tc6393xb.h
++ACPI/HEST/GHES
++R: Dongjiu Geng <gengdongjiu@huawei.com>
- SABRELITE / i.MX6
++R: Xiang Zheng <zhengxiang9@huawei.com>
- M: Peter Maydell <peter.maydell@linaro.org>
++L: qemu-arm@nongnu.org
 +S: Maintained
 +F: hw/acpi/ghes.c
 +F: include/hw/acpi/ghes.h
 +F: docs/specs/acpi_hest_ghes.rst
 +
  ppc4xx
  M: David Gibson <david@gibson.dropbear.id.au>
  L: qemu-ppc@nongnu.org
 --
 .20.1

-[Qemu-devel] [PULL 09/42] target/arm: Decode FP instructions for M profile
+[PULL 29/45] target/arm: Convert Neon 3-reg-same VQRDMLAH/VQRDMLSH to decodetree
-Correct the decode of the M-profile "coprocessor and
+Convert the Neon VQRDMLAH and VQRDMLSH insns in the 3-reg-same group
-floating-point instructions" space:
+to decodetree.  These don't use do_3same() because they want to
- * op0 == 0b11 is always unallocated
+operate on VFP double registers, whose offsets are different from the
- * if the CPU has an FPU then all insns with op1 == 0b101
+neon_reg_offset() calculations do_3same does.
    are floating point and go to disas_vfp_insn()
 For the moment we leave VLLDM and VLSTM as NOPs; in
 a later commit we will fill in the proper implementation
 for the case where an FPU is present.
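
For readers unfamiliar with decodetree, a pattern line such as the
VQRDMLAH_3s entry added to neon-dp.decode is essentially a fixed-bits mask
plus an expected value. The sketch below shows that idea only; it is not
how the decodetree script is implemented, it ignores format references
(@3same) and named fields, and the function names are hypothetical.

```c
#include <stdint.h>

/*
 * Convert a pattern made of '0', '1', '.' and spaces (most significant
 * bit first) into a mask of fixed bits and their required value.
 * Returns the number of bits parsed, or -1 on a malformed pattern.
 */
static int pattern_to_mask(const char *pat, uint32_t *mask, uint32_t *value)
{
    uint32_t m = 0, v = 0;
    int nbits = 0;

    for (; *pat; pat++) {
        char c = *pat;
        if (c == ' ') {
            continue;               /* spacing is only for readability */
        }
        if ((c != '0' && c != '1' && c != '.') || nbits == 32) {
            return -1;
        }
        m = (m << 1) | (c != '.');  /* '.' marks a don't-care bit */
        v = (v << 1) | (c == '1');
        nbits++;
    }
    *mask = m;
    *value = v;
    return nbits;
}

/* An instruction matches when all of its fixed bits have the required value. */
static int insn_matches(uint32_t insn, uint32_t mask, uint32_t value)
{
    return (insn & mask) == value;
}
```

Feeding it the bit portion of the VQRDMLAH_3s line,
"1111 001 1 0 . .. .... .... 1011 ... 1 ....", gives a 32-bit pattern with
mask 0xff800f10 and value 0xf3000b10; the VQRDMLSH_3s line differs only in
bits [11:8].
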
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-7-peter.maydell@linaro.org
+Message-id: 20200512163904.10918-2-peter.maydell@linaro.org
 ---
- target/arm/translate.c | 26 ++++++++++++++++++++++----
+ target/arm/neon-dp.decode       |  3 +++
-file changed, 22 insertions(+), 4 deletions(-)
+ target/arm/translate-neon.inc.c | 15 +++++++++++++++
  target/arm/translate.c          | 14 ++------------
 files changed, 20 insertions(+), 12 deletions(-)
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/neon-dp.decode
++++ b/target/arm/neon-dp.decode
+@@ -XXX,XX +XXX,XX @@ VMLS_3s          1111 001 1 0 . .. .... .... 1001 . . . 0 .... @3same
+ VMUL_3s          1111 001 0 0 . .. .... .... 1001 . . . 1 .... @3same
+ VMUL_p_3s        1111 001 1 0 . .. .... .... 1001 . . . 1 .... @3same
++
++VQRDMLAH_3s      1111 001 1 0 . .. .... .... 1011 ... 1 .... @3same
++VQRDMLSH_3s      1111 001 1 0 . .. .... .... 1100 ... 1 .... @3same
+diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/translate-neon.inc.c
++++ b/target/arm/translate-neon.inc.c
+@@ -XXX,XX +XXX,XX @@ static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
+     }
+     return do_3same(s, a, gen_VMUL_p_3s);
+ }
++
++#define DO_VQRDMLAH(INSN, FUNC)                                         \
++    static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a)        \
++    {                                                                   \
++        if (!dc_isar_feature(aa32_rdm, s)) {                            \
++            return false;                                               \
++        }                                                               \
++        if (a->size != 1 && a->size != 2) {                             \
++            return false;                                               \
++        }                                                               \
++        return do_3same(s, a, FUNC);                                    \
++    }
++
++DO_VQRDMLAH(VQRDMLAH, gen_gvec_sqrdmlah_qc)
++DO_VQRDMLAH(VQRDMLSH, gen_gvec_sqrdmlsh_qc)
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-     case 6: case 7: case 14: case 15:
+             if (!u) {
-         /* Coprocessor.  */
+                 break;  /* VPADD */
-         if (arm_dc_feature(s, ARM_FEATURE_M)) {
+             }
--            /* We don't currently implement M profile FP support,
+-            /* VQRDMLAH */
--             * so this entire space should give a NOCP fault, with
+-            if (dc_isar_feature(aa32_rdm, s) && (size == 1 || size == 2)) {
--             * the exception of the v8M VLLDM and VLSTM insns, which
+-                gen_gvec_sqrdmlah_qc(size, rd_ofs, rn_ofs, rm_ofs,
--             * must be NOPs in Secure state and UNDEF in Nonsecure state.
+-                                     vec_size, vec_size);
-+            /* 0b111x_11xx_xxxx_xxxx_xxxx_xxxx_xxxx_xxxx */
+-                return 0;
-+            if (extract32(insn, 24, 2) == 3) {
+-            }
-+                goto illegal_op; /* op0 = 0b11 : unallocated */
++            /* VQRDMLAH : handled by decodetree */
-+            }
+             return 1;
-+
-+            /*
+         case NEON_3R_VFM_VQRDMLSH:
-+             * Decode VLLDM and VLSTM first: these are nonstandard because:
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-+             *  * if there is no FPU then these insns must NOP in
+                 }
 +             *    Secure state and UNDEF in Nonsecure state
 +             *  * if there is an FPU then these insns do not have
 +             *    the usual behaviour that disas_vfp_insn() provides of
 +             *    being controlled by CPACR/NSACR enable bits or the
 +             *    lazy-stacking logic.
               */
              if (arm_dc_feature(s, ARM_FEATURE_V8) &&
                  (insn & 0xffa00f00) == 0xec200a00) {
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
                  /* Just NOP since FP support is not implemented */
                  break;
              }
-+            if (arm_dc_feature(s, ARM_FEATURE_VFP) &&
+-            /* VQRDMLSH */
-+                ((insn >> 8) & 0xe) == 10) {
+-            if (dc_isar_feature(aa32_rdm, s) && (size == 1 || size == 2)) {
-+                /* FP, and the CPU supports it */
+-                gen_gvec_sqrdmlsh_qc(size, rd_ofs, rn_ofs, rm_ofs,
-+                if (disas_vfp_insn(s, insn)) {
+-                                     vec_size, vec_size);
-+                    goto illegal_op;
+-                return 0;
-+                }
+-            }
-+                break;
++            /* VQRDMLSH : handled by decodetree */
-+            }
+             return 1;
-+
-             /* All other insns: NOCP */
+         case NEON_3R_VABD:
              gen_exception_insn(s, 4, EXCP_NOCP, syn_uncategorized(),
                                 default_exception_el(s));
 --
 .20.1

-[Qemu-devel] [PULL 07/42] target/arm: Disable most VFP sysregs for M-profile
+[PULL 30/45] target/arm: Convert Neon 3-reg-same SHA to decodetree
-The only "system register" that M-profile floating point exposes
+Convert the Neon SHA instructions in the 3-reg-same group
-via the VMRS/VMRS instructions is FPSCR, and it does not have
+to decodetree.
 the odd special case for rd==15. Add a check to ensure we only
 expose FPSCR.
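
Each of the new trans_SHA* functions in this patch repeats the same two
operand checks before touching any register. A minimal sketch of that
check in isolation follows; the helper name and its has_32_dregs parameter
are hypothetical (QEMU queries dc_isar_feature(aa32_simd_r32, s) instead),
and the reading of the low-bit test as "operands are Q registers, so their
D-register numbers must be even" is an interpretation of the code.

```c
#include <stdbool.h>

/*
 * vd, vn, vm are D-register numbers (0..31) decoded from the instruction.
 * Returns false where the trans function would return false (UNDEF).
 */
static bool sha_operands_valid(unsigned vd, unsigned vn, unsigned vm,
                               bool has_32_dregs)
{
    unsigned all = vd | vn | vm;

    /* UNDEF accesses to D16-D31 if they don't exist. */
    if (!has_32_dregs && (all & 0x10)) {
        return false;
    }
    /* Any odd register number is rejected. */
    if (all & 1) {
        return false;
    }
    return true;
}
```

ORing the three register numbers together lets a single test cover all
operands, which is the same trick the patch uses.
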
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-5-peter.maydell@linaro.org
+Message-id: 20200512163904.10918-3-peter.maydell@linaro.org
 ---
- target/arm/translate.c | 19 +++++++++++++++++--
+ target/arm/neon-dp.decode       |  10 +++
-file changed, 17 insertions(+), 2 deletions(-)
+ target/arm/translate-neon.inc.c | 139 ++++++++++++++++++++++++++++++++
+ target/arm/translate.c          |  46 +----------
 files changed, 151 insertions(+), 44 deletions(-)
 diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/neon-dp.decode
 +++ b/target/arm/neon-dp.decode
@@ -XXX,XX +XXX,XX @@ VMUL_3s          1111 001 0 0 . .. .... .... 1001 . . . 1 .... @3same
  VMUL_p_3s        1111 001 1 0 . .. .... .... 1001 . . . 1 .... @3same
  VQRDMLAH_3s      1111 001 1 0 . .. .... .... 1011 ... 1 .... @3same
 +
 +SHA1_3s          1111 001 0 0 . optype:2 .... .... 1100 . 1 . 0 .... \
 +                 vm=%vm_dp vn=%vn_dp vd=%vd_dp
 +SHA256H_3s       1111 001 1 0 . 00 .... .... 1100 . 1 . 0 .... \
 +                 vm=%vm_dp vn=%vn_dp vd=%vd_dp
 +SHA256H2_3s      1111 001 1 0 . 01 .... .... 1100 . 1 . 0 .... \
 +                 vm=%vm_dp vn=%vn_dp vd=%vd_dp
 +SHA256SU1_3s     1111 001 1 0 . 10 .... .... 1100 . 1 . 0 .... \
 +                 vm=%vm_dp vn=%vn_dp vd=%vd_dp
 +
  VQRDMLSH_3s      1111 001 1 0 . .. .... .... 1100 ... 1 .... @3same
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
  DO_VQRDMLAH(VQRDMLAH, gen_gvec_sqrdmlah_qc)
  DO_VQRDMLAH(VQRDMLSH, gen_gvec_sqrdmlsh_qc)
 +
 +static bool trans_SHA1_3s(DisasContext *s, arg_SHA1_3s *a)
 +{
 +    TCGv_ptr ptr1, ptr2, ptr3;
 +    TCGv_i32 tmp;
 +
 +    if (!arm_dc_feature(s, ARM_FEATURE_NEON) ||
 +        !dc_isar_feature(aa32_sha1, s)) {
 +        return false;
 +    }
 +
 +    /* UNDEF accesses to D16-D31 if they don't exist. */
 +    if (!dc_isar_feature(aa32_simd_r32, s) &&
 +        ((a->vd | a->vn | a->vm) & 0x10)) {
 +        return false;
 +    }
 +
 +    if ((a->vn | a->vm | a->vd) & 1) {
 +        return false;
 +    }
 +
 +    if (!vfp_access_check(s)) {
 +        return true;
 +    }
 +
 +    ptr1 = vfp_reg_ptr(true, a->vd);
 +    ptr2 = vfp_reg_ptr(true, a->vn);
 +    ptr3 = vfp_reg_ptr(true, a->vm);
 +    tmp = tcg_const_i32(a->optype);
 +    gen_helper_crypto_sha1_3reg(ptr1, ptr2, ptr3, tmp);
 +    tcg_temp_free_i32(tmp);
 +    tcg_temp_free_ptr(ptr1);
 +    tcg_temp_free_ptr(ptr2);
 +    tcg_temp_free_ptr(ptr3);
 +
 +    return true;
 +}
 +
 +static bool trans_SHA256H_3s(DisasContext *s, arg_SHA256H_3s *a)
 +{
 +    TCGv_ptr ptr1, ptr2, ptr3;
 +
 +    if (!arm_dc_feature(s, ARM_FEATURE_NEON) ||
 +        !dc_isar_feature(aa32_sha2, s)) {
 +        return false;
 +    }
 +
 +    /* UNDEF accesses to D16-D31 if they don't exist. */
 +    if (!dc_isar_feature(aa32_simd_r32, s) &&
 +        ((a->vd | a->vn | a->vm) & 0x10)) {
 +        return false;
 +    }
 +
 +    if ((a->vn | a->vm | a->vd) & 1) {
 +        return false;
 +    }
 +
 +    if (!vfp_access_check(s)) {
 +        return true;
 +    }
 +
 +    ptr1 = vfp_reg_ptr(true, a->vd);
 +    ptr2 = vfp_reg_ptr(true, a->vn);
 +    ptr3 = vfp_reg_ptr(true, a->vm);
 +    gen_helper_crypto_sha256h(ptr1, ptr2, ptr3);
 +    tcg_temp_free_ptr(ptr1);
 +    tcg_temp_free_ptr(ptr2);
 +    tcg_temp_free_ptr(ptr3);
 +
 +    return true;
 +}
 +
 +static bool trans_SHA256H2_3s(DisasContext *s, arg_SHA256H2_3s *a)
 +{
 +    TCGv_ptr ptr1, ptr2, ptr3;
 +
 +    if (!arm_dc_feature(s, ARM_FEATURE_NEON) ||
 +        !dc_isar_feature(aa32_sha2, s)) {
 +        return false;
 +    }
 +
 +    /* UNDEF accesses to D16-D31 if they don't exist. */
 +    if (!dc_isar_feature(aa32_simd_r32, s) &&
 +        ((a->vd | a->vn | a->vm) & 0x10)) {
 +        return false;
 +    }
 +
 +    if ((a->vn | a->vm | a->vd) & 1) {
 +        return false;
 +    }
 +
 +    if (!vfp_access_check(s)) {
 +        return true;
 +    }
 +
 +    ptr1 = vfp_reg_ptr(true, a->vd);
 +    ptr2 = vfp_reg_ptr(true, a->vn);
 +    ptr3 = vfp_reg_ptr(true, a->vm);
 +    gen_helper_crypto_sha256h2(ptr1, ptr2, ptr3);
 +    tcg_temp_free_ptr(ptr1);
 +    tcg_temp_free_ptr(ptr2);
 +    tcg_temp_free_ptr(ptr3);
 +
 +    return true;
 +}
 +
 +static bool trans_SHA256SU1_3s(DisasContext *s, arg_SHA256SU1_3s *a)
 +{
 +    TCGv_ptr ptr1, ptr2, ptr3;
 +
 +    if (!arm_dc_feature(s, ARM_FEATURE_NEON) ||
 +        !dc_isar_feature(aa32_sha2, s)) {
 +        return false;
 +    }
 +
 +    /* UNDEF accesses to D16-D31 if they don't exist. */
 +    if (!dc_isar_feature(aa32_simd_r32, s) &&
 +        ((a->vd | a->vn | a->vm) & 0x10)) {
 +        return false;
 +    }
 +
 +    if ((a->vn | a->vm | a->vd) & 1) {
 +        return false;
 +    }
 +
 +    if (!vfp_access_check(s)) {
 +        return true;
 +    }
 +
 +    ptr1 = vfp_reg_ptr(true, a->vd);
 +    ptr2 = vfp_reg_ptr(true, a->vn);
 +    ptr3 = vfp_reg_ptr(true, a->vm);
 +    gen_helper_crypto_sha256su1(ptr1, ptr2, ptr3);
 +    tcg_temp_free_ptr(ptr1);
 +    tcg_temp_free_ptr(ptr2);
 +    tcg_temp_free_ptr(ptr3);
 +
 +    return true;
 +}
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-                     }
+     int vec_size;
-                 }
+     uint32_t imm;
-             } else { /* !dp */
+     TCGv_i32 tmp, tmp2, tmp3, tmp4, tmp5;
-+                bool is_sysreg;
+-    TCGv_ptr ptr1, ptr2, ptr3;
-+
++    TCGv_ptr ptr1, ptr2;
-                 if ((insn & 0x6f) != 0x00)
+     TCGv_i64 tmp64;
-                     return 1;
-                 rn = VFP_SREG_N(insn);
+     if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
-+
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-+                is_sysreg = extract32(insn, 21, 1);
+             return 1;
-+
+         }
-+                if (arm_dc_feature(s, ARM_FEATURE_M)) {
+         switch (op) {
-+                    /*
+-        case NEON_3R_SHA:
-+                     * The only M-profile VFP vmrs/vmsr sysreg is FPSCR.
+-            /* The SHA-1/SHA-256 3-register instructions require special
-+                     * Writes to R15 are UNPREDICTABLE; we choose to undef.
+-             * treatment here, as their size field is overloaded as an
-+                     */
+-             * op type selector, and they all consume their input in a
-+                    if (is_sysreg && (rd == 15 || (rn >> 1) != ARM_VFP_FPSCR)) {
+-             * single pass.
-+                        return 1;
+-             */
-+                    }
+-            if (!q) {
-+                }
+-                return 1;
-+
+-            }
-                 if (insn & ARM_CP_RW_BIT) {
+-            if (!u) { /* SHA-1 */
-                     /* vfp->arm */
+-                if (!dc_isar_feature(aa32_sha1, s)) {
--                    if (insn & (1 << 21)) {
+-                    return 1;
-+                    if (is_sysreg) {
+-                }
-                         /* system register */
+-                ptr1 = vfp_reg_ptr(true, rd);
-                         rn >>= 1;
+-                ptr2 = vfp_reg_ptr(true, rn);
+-                ptr3 = vfp_reg_ptr(true, rm);
-@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
+-                tmp4 = tcg_const_i32(size);
-                     }
+-                gen_helper_crypto_sha1_3reg(ptr1, ptr2, ptr3, tmp4);
-                 } else {
+-                tcg_temp_free_i32(tmp4);
-                     /* arm->vfp */
+-            } else { /* SHA-256 */
--                    if (insn & (1 << 21)) {
+-                if (!dc_isar_feature(aa32_sha2, s) || size == 3) {
-+                    if (is_sysreg) {
+-                    return 1;
-                         rn >>= 1;
+-                }
-                         /* system register */
+-                ptr1 = vfp_reg_ptr(true, rd);
-                         switch (rn) {
+-                ptr2 = vfp_reg_ptr(true, rn);
 -                ptr3 = vfp_reg_ptr(true, rm);
 -                switch (size) {
 -                case 0:
 -                    gen_helper_crypto_sha256h(ptr1, ptr2, ptr3);
 -                    break;
 -                case 1:
 -                    gen_helper_crypto_sha256h2(ptr1, ptr2, ptr3);
 -                    break;
 -                case 2:
 -                    gen_helper_crypto_sha256su1(ptr1, ptr2, ptr3);
 -                    break;
 -                }
 -            }
 -            tcg_temp_free_ptr(ptr1);
 -            tcg_temp_free_ptr(ptr2);
 -            tcg_temp_free_ptr(ptr3);
 -            return 0;
 -
          case NEON_3R_VPADD_VQRDMLAH:
              if (!u) {
                  break;  /* VPADD */
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          case NEON_3R_VMUL:
          case NEON_3R_VML:
          case NEON_3R_VSHL:
 +        case NEON_3R_SHA:
              /* Already handled by decodetree */
              return 1;
          }
 --
 .20.1

-[Qemu-devel] [PULL 29/42] target/arm: Enable FPU for Cortex-M4 and Cortex-M33
+[PULL 31/45] target/arm: Convert Neon 64-bit element 3-reg-same insns
-Enable the FPU by default for the Cortex-M4 and Cortex-M33.
+Convert the 64-bit element insns in the 3-reg-same group
 to decodetree. This covers VQSHL, VRSHL and VQRSHL where
 size==0b11.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-27-peter.maydell@linaro.org
+Message-id: 20200512163904.10918-4-peter.maydell@linaro.org
 ---
- target/arm/cpu.c | 8 ++++++++
+ target/arm/neon-dp.decode       | 13 +++++++++++
-file changed, 8 insertions(+)
+ target/arm/translate-neon.inc.c | 24 +++++++++++++++++++++
  target/arm/translate.c          | 38 ++-------------------------------
 files changed, 39 insertions(+), 36 deletions(-)
-diff --git a/target/arm/cpu.c b/target/arm/cpu.c
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.c
+--- a/target/arm/neon-dp.decode
-+++ b/target/arm/cpu.c
++++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ static void cortex_m4_initfn(Object *obj)
+@@ -XXX,XX +XXX,XX @@ VCGE_U_3s        1111 001 1 0 . .. .... .... 0011 . . . 1 .... @3same
-     set_feature(&cpu->env, ARM_FEATURE_M);
+ VSHL_S_3s        1111 001 0 0 . .. .... .... 0100 . . . 0 .... @3same_rev
-     set_feature(&cpu->env, ARM_FEATURE_M_MAIN);
+ VSHL_U_3s        1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same_rev
-     set_feature(&cpu->env, ARM_FEATURE_THUMB_DSP);
-+    set_feature(&cpu->env, ARM_FEATURE_VFP4);
++# Insns operating on 64-bit elements (size!=0b11 handled elsewhere)
-     cpu->midr = 0x410fc240; /* r0p0 */
++# The _rev suffix indicates that Vn and Vm are reversed (as explained
-     cpu->pmsav7_dregion = 8;
++# by the comment for the @3same_rev format).
-+    cpu->isar.mvfr0 = 0x10110021;
++@3same_64_rev    .... ... . . . 11 .... .... .... . q:1 . . .... \
-+    cpu->isar.mvfr1 = 0x11000011;
++                 &3same vm=%vn_dp vn=%vm_dp vd=%vd_dp size=3
-+    cpu->isar.mvfr2 = 0x00000000;
++
-     cpu->id_pfr0 = 0x00000030;
++VQSHL_S64_3s     1111 001 0 0 . .. .... .... 0100 . . . 1 .... @3same_64_rev
-     cpu->id_pfr1 = 0x00000200;
++VQSHL_U64_3s     1111 001 1 0 . .. .... .... 0100 . . . 1 .... @3same_64_rev
-     cpu->id_dfr0 = 0x00100000;
++VRSHL_S64_3s     1111 001 0 0 . .. .... .... 0101 . . . 0 .... @3same_64_rev
-@@ -XXX,XX +XXX,XX @@ static void cortex_m33_initfn(Object *obj)
++VRSHL_U64_3s     1111 001 1 0 . .. .... .... 0101 . . . 0 .... @3same_64_rev
-     set_feature(&cpu->env, ARM_FEATURE_M_MAIN);
++VQRSHL_S64_3s    1111 001 0 0 . .. .... .... 0101 . . . 1 .... @3same_64_rev
-     set_feature(&cpu->env, ARM_FEATURE_M_SECURITY);
++VQRSHL_U64_3s    1111 001 1 0 . .. .... .... 0101 . . . 1 .... @3same_64_rev
-     set_feature(&cpu->env, ARM_FEATURE_THUMB_DSP);
++
-+    set_feature(&cpu->env, ARM_FEATURE_VFP4);
+ VMAX_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 0 .... @3same
-     cpu->midr = 0x410fd213; /* r0p3 */
+ VMAX_U_3s        1111 001 1 0 . .. .... .... 0110 . . . 0 .... @3same
-     cpu->pmsav7_dregion = 16;
+ VMIN_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 1 .... @3same
-     cpu->sau_sregion = 8;
+diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
-+    cpu->isar.mvfr0 = 0x10110021;
+index XXXXXXX..XXXXXXX 100644
-+    cpu->isar.mvfr1 = 0x11000011;
+--- a/target/arm/translate-neon.inc.c
-+    cpu->isar.mvfr2 = 0x00000040;
++++ b/target/arm/translate-neon.inc.c
-     cpu->id_pfr0 = 0x00000030;
+@@ -XXX,XX +XXX,XX @@ static bool trans_SHA256SU1_3s(DisasContext *s, arg_SHA256SU1_3s *a)
-     cpu->id_pfr1 = 0x00000210;
-     cpu->id_dfr0 = 0x00200000;
+     return true;
  }
 +
 +#define DO_3SAME_64(INSN, FUNC)                                         \
 +    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
 +                                uint32_t rn_ofs, uint32_t rm_ofs,       \
 +                                uint32_t oprsz, uint32_t maxsz)         \
 +    {                                                                   \
 +        static const GVecGen3 op = { .fni8 = FUNC };                    \
 +        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &op);      \
 +    }                                                                   \
 +    DO_3SAME(INSN, gen_##INSN##_3s)
 +
 +#define DO_3SAME_64_ENV(INSN, FUNC)                                     \
 +    static void gen_##INSN##_elt(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m)    \
 +    {                                                                   \
 +        FUNC(d, cpu_env, n, m);                                         \
 +    }                                                                   \
 +    DO_3SAME_64(INSN, gen_##INSN##_elt)
 +
 +DO_3SAME_64(VRSHL_S64, gen_helper_neon_rshl_s64)
 +DO_3SAME_64(VRSHL_U64, gen_helper_neon_rshl_u64)
 +DO_3SAME_64_ENV(VQSHL_S64, gen_helper_neon_qshl_s64)
 +DO_3SAME_64_ENV(VQSHL_U64, gen_helper_neon_qshl_u64)
 +DO_3SAME_64_ENV(VQRSHL_S64, gen_helper_neon_qrshl_s64)
 +DO_3SAME_64_ENV(VQRSHL_U64, gen_helper_neon_qrshl_u64)
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          }
          if (size == 3) {
 -            /* 64-bit element instructions. */
 -            for (pass = 0; pass < (q ? 2 : 1); pass++) {
 -                neon_load_reg64(cpu_V0, rn + pass);
 -                neon_load_reg64(cpu_V1, rm + pass);
 -                switch (op) {
 -                case NEON_3R_VQSHL:
 -                    if (u) {
 -                        gen_helper_neon_qshl_u64(cpu_V0, cpu_env,
 -                                                 cpu_V1, cpu_V0);
 -                    } else {
 -                        gen_helper_neon_qshl_s64(cpu_V0, cpu_env,
 -                                                 cpu_V1, cpu_V0);
 -                    }
 -                    break;
 -                case NEON_3R_VRSHL:
 -                    if (u) {
 -                        gen_helper_neon_rshl_u64(cpu_V0, cpu_V1, cpu_V0);
 -                    } else {
 -                        gen_helper_neon_rshl_s64(cpu_V0, cpu_V1, cpu_V0);
 -                    }
 -                    break;
 -                case NEON_3R_VQRSHL:
 -                    if (u) {
 -                        gen_helper_neon_qrshl_u64(cpu_V0, cpu_env,
 -                                                  cpu_V1, cpu_V0);
 -                    } else {
 -                        gen_helper_neon_qrshl_s64(cpu_V0, cpu_env,
 -                                                  cpu_V1, cpu_V0);
 -                    }
 -                    break;
 -                default:
 -                    abort();
 -                }
 -                neon_store_reg64(cpu_V0, rd + pass);
 -            }
 -            return 0;
 +            /* 64-bit element instructions: handled by decodetree */
 +            return 1;
          }
          pairwise = 0;
          switch (op) {
 --
 .20.1
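
Reviewer note on the operand reversal in the patch above: the @3same_64_rev format swaps the Vn and Vm fields at decode time, so a generic three-operand expander ends up passing the value to be shifted (architecturally Vm) as its first input and the shift count (Vn) as its second. The following is only a minimal model of that idea. Every name in it (Args3Same, decode_3same, decode_3same_rev, do_3same_64, shl_u64) is hypothetical and is not QEMU code, and the shift is simplified to unsigned left shifts with counts 0..63.

```c
#include <stdint.h>

/* Hypothetical stand-in for a decoded three-register argument set. */
typedef struct {
    int vd, vn, vm;
} Args3Same;

/* Plain format: the instruction fields map straight through. */
static Args3Same decode_3same(int vd_field, int vn_field, int vm_field)
{
    Args3Same a = { .vd = vd_field, .vn = vn_field, .vm = vm_field };
    return a;
}

/* "_rev" format: the Vn and Vm fields are swapped at decode time. */
static Args3Same decode_3same_rev(int vd_field, int vn_field, int vm_field)
{
    Args3Same a = { .vd = vd_field, .vn = vm_field, .vm = vn_field };
    return a;
}

/* Per-element operation: first input, second input. */
typedef uint64_t Op64(uint64_t n, uint64_t m);

/* Generic expander: always computes regs[vd] = fn(regs[vn], regs[vm]). */
static void do_3same_64(uint64_t *regs, Args3Same a, Op64 *fn)
{
    regs[a.vd] = fn(regs[a.vn], regs[a.vm]);
}

/*
 * Simplified shift: value in the first operand, count in the low byte
 * of the second. Negative (right-shift) counts are NOT modelled here.
 */
static uint64_t shl_u64(uint64_t value, uint64_t shift)
{
    unsigned count = shift & 0xff;
    return count < 64 ? value << count : 0;
}
```

With register 1 (the Vn field) holding a count of 3 and register 2 (the Vm field) holding the value 5, decoding through decode_3same_rev makes the expander compute 5 << 3, whereas the plain decode would compute 3 << 5. That is why the shift insns need the reversed format while the other 3-reg-same insns do not.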

-[Qemu-devel] [PULL 18/42] target/arm: Handle floating point registers in exception return
+[PULL 32/45] target/arm: Convert Neon VHADD 3-reg-same insns
-Handle floating point registers in exception return.
+Convert the Neon VHADD insns in the 3-reg-same group to decodetree.
 This corresponds to pseudocode functions ValidateExceptionReturn(),
 ExceptionReturn(), PopStack() and ConsumeExcStackFrame().
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-16-peter.maydell@linaro.org
+Message-id: 20200512163904.10918-5-peter.maydell@linaro.org
 ---
- target/arm/helper.c | 142 +++++++++++++++++++++++++++++++++++++++++++-
+ target/arm/neon-dp.decode       |  2 ++
-file changed, 141 insertions(+), 1 deletion(-)
+ target/arm/translate-neon.inc.c | 24 ++++++++++++++++++++++++
  target/arm/translate.c          |  4 +---
 files changed, 27 insertions(+), 3 deletions(-)
-diff --git a/target/arm/helper.c b/target/arm/helper.c
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
+--- a/target/arm/neon-dp.decode
-+++ b/target/arm/helper.c
++++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ static void do_v7m_exception_exit(ARMCPU *cpu)
+@@ -XXX,XX +XXX,XX @@
-     bool rettobase = false;
+ @3same           .... ... . . . size:2 .... .... .... . q:1 . . .... \
-     bool exc_secure = false;
+                  &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp
-     bool return_to_secure;
-+    bool ftype;
++VHADD_S_3s       1111 001 0 0 . .. .... .... 0000 . . . 0 .... @3same
-+    bool restore_s16_s31;
++VHADD_U_3s       1111 001 1 0 . .. .... .... 0000 . . . 0 .... @3same
+ VQADD_S_3s       1111 001 0 0 . .. .... .... 0000 . . . 1 .... @3same
-     /* If we're not in Handler mode then jumps to magic exception-exit
+ VQADD_U_3s       1111 001 1 0 . .. .... .... 0000 . . . 1 .... @3same
-      * addresses don't have magic behaviour. However for the v8M
-@@ -XXX,XX +XXX,XX @@ static void do_v7m_exception_exit(ARMCPU *cpu)
+diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
-                       excret);
+index XXXXXXX..XXXXXXX 100644
-     }
+--- a/target/arm/translate-neon.inc.c
++++ b/target/arm/translate-neon.inc.c
-+    ftype = excret & R_V7M_EXCRET_FTYPE_MASK;
+@@ -XXX,XX +XXX,XX @@ DO_3SAME_64_ENV(VQSHL_S64, gen_helper_neon_qshl_s64)
  DO_3SAME_64_ENV(VQSHL_U64, gen_helper_neon_qshl_u64)
  DO_3SAME_64_ENV(VQRSHL_S64, gen_helper_neon_qrshl_s64)
  DO_3SAME_64_ENV(VQRSHL_U64, gen_helper_neon_qrshl_u64)
 +
-+    if (!arm_feature(env, ARM_FEATURE_VFP) && !ftype) {
++#define DO_3SAME_32(INSN, FUNC)                                         \
-+        qemu_log_mask(LOG_GUEST_ERROR, "M profile: zero FTYPE in exception "
++    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
-+                      "exit PC value 0x%" PRIx32 " is UNPREDICTABLE "
++                                uint32_t rn_ofs, uint32_t rm_ofs,       \
-+                      "if FPU not present\n",
++                                uint32_t oprsz, uint32_t maxsz)         \
-+                      excret);
++    {                                                                   \
-+        ftype = true;
++        static const GVecGen3 ops[4] = {                                \
 +            { .fni4 = gen_helper_neon_##FUNC##8 },                      \
 +            { .fni4 = gen_helper_neon_##FUNC##16 },                     \
 +            { .fni4 = gen_helper_neon_##FUNC##32 },                     \
 +            { 0 },                                                      \
 +        };                                                              \
 +        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &ops[vece]); \
 +    }                                                                   \
 +    static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a)        \
 +    {                                                                   \
 +        if (a->size > 2) {                                              \
 +            return false;                                               \
 +        }                                                               \
 +        return do_3same(s, a, gen_##INSN##_3s);                         \
 +    }
 +
-     if (arm_feature(env, ARM_FEATURE_M_SECURITY)) {
++DO_3SAME_32(VHADD_S, hadd_s)
-         /* EXC_RETURN.ES validation check (R_SMFL). We must do this before
++DO_3SAME_32(VHADD_U, hadd_u)
-          * we pick which FAULTMASK to clear.
+diff --git a/target/arm/translate.c b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static void do_v7m_exception_exit(ARMCPU *cpu)
+index XXXXXXX..XXXXXXX 100644
-      */
+--- a/target/arm/translate.c
-     write_v7m_control_spsel_for_secstate(env, return_to_sp_process, exc_secure);
++++ b/target/arm/translate.c
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-+    /*
+         case NEON_3R_VML:
-+     * Clear scratch FP values left in caller saved registers; this
+         case NEON_3R_VSHL:
-+     * must happen before any kind of tail chaining.
+         case NEON_3R_SHA:
-+     */
++        case NEON_3R_VHADD:
-+    if ((env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_CLRONRET_MASK) &&
+             /* Already handled by decodetree */
-+        (env->v7m.control[M_REG_S] & R_V7M_CONTROL_FPCA_MASK)) {
+             return 1;
 +        if (env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_LSPACT_MASK) {
 +            env->v7m.sfsr |= R_V7M_SFSR_LSERR_MASK;
 +            armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_SECURE, false);
 +            qemu_log_mask(CPU_LOG_INT, "...taking SecureFault on existing "
 +                          "stackframe: error during lazy state deactivation\n");
 +            v7m_exception_taken(cpu, excret, true, false);
 +            return;
 +        } else {
 +            /* Clear s0..s15 and FPSCR */
 +            int i;
 +
 +            for (i = 0; i < 16; i += 2) {
 +                *aa32_vfp_dreg(env, i / 2) = 0;
 +            }
 +            vfp_set_fpscr(env, 0);
 +        }
 +    }
 +
      if (sfault) {
          env->v7m.sfsr |= R_V7M_SFSR_INVER_MASK;
          armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_SECURE, false);
@@ -XXX,XX +XXX,XX @@ static void do_v7m_exception_exit(ARMCPU *cpu)
              }
          }
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-+        if (!ftype) {
+             tmp2 = neon_load_reg(rm, pass);
-+            /* FP present and we need to handle it */
+         }
-+            if (!return_to_secure &&
+         switch (op) {
-+                (env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_LSPACT_MASK)) {
+-        case NEON_3R_VHADD:
-+                armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_SECURE, false);
+-            GEN_NEON_INTEGER_OP(hadd);
-+                env->v7m.sfsr |= R_V7M_SFSR_LSERR_MASK;
+-            break;
-+                qemu_log_mask(CPU_LOG_INT,
+         case NEON_3R_VRHADD:
-+                              "...taking SecureFault on existing stackframe: "
+             GEN_NEON_INTEGER_OP(rhadd);
-+                              "Secure LSPACT set but exception return is "
+             break;
 +                              "not to secure state\n");
 +                v7m_exception_taken(cpu, excret, true, false);
 +                return;
 +            }
 +
 +            restore_s16_s31 = return_to_secure &&
 +                (env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_TS_MASK);
 +
 +            if (env->v7m.fpccr[return_to_secure] & R_V7M_FPCCR_LSPACT_MASK) {
 +                /* State in FPU is still valid, just clear LSPACT */
 +                env->v7m.fpccr[return_to_secure] &= ~R_V7M_FPCCR_LSPACT_MASK;
 +            } else {
 +                int i;
 +                uint32_t fpscr;
 +                bool cpacr_pass, nsacr_pass;
 +
 +                cpacr_pass = v7m_cpacr_pass(env, return_to_secure,
 +                                            return_to_priv);
 +                nsacr_pass = return_to_secure ||
 +                    extract32(env->v7m.nsacr, 10, 1);
 +
 +                if (!cpacr_pass) {
 +                    armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE,
 +                                            return_to_secure);
 +                    env->v7m.cfsr[return_to_secure] |= R_V7M_CFSR_NOCP_MASK;
 +                    qemu_log_mask(CPU_LOG_INT,
 +                                  "...taking UsageFault on existing "
 +                                  "stackframe: CPACR.CP10 prevents unstacking "
 +                                  "FP regs\n");
 +                    v7m_exception_taken(cpu, excret, true, false);
 +                    return;
 +                } else if (!nsacr_pass) {
 +                    armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE, true);
 +                    env->v7m.cfsr[M_REG_S] |= R_V7M_CFSR_INVPC_MASK;
 +                    qemu_log_mask(CPU_LOG_INT,
 +                                  "...taking Secure UsageFault on existing "
 +                                  "stackframe: NSACR.CP10 prevents unstacking "
 +                                  "FP regs\n");
 +                    v7m_exception_taken(cpu, excret, true, false);
 +                    return;
 +                }
 +
 +                for (i = 0; i < (restore_s16_s31 ? 32 : 16); i += 2) {
 +                    uint32_t slo, shi;
 +                    uint64_t dn;
 +                    uint32_t faddr = frameptr + 0x20 + 4 * i;
 +
 +                    if (i >= 16) {
 +                        faddr += 8; /* Skip the slot for the FPSCR */
 +                    }
 +
 +                    pop_ok = pop_ok &&
 +                        v7m_stack_read(cpu, &slo, faddr, mmu_idx) &&
 +                        v7m_stack_read(cpu, &shi, faddr + 4, mmu_idx);
 +
 +                    if (!pop_ok) {
 +                        break;
 +                    }
 +
 +                    dn = (uint64_t)shi << 32 | slo;
 +                    *aa32_vfp_dreg(env, i / 2) = dn;
 +                }
 +                pop_ok = pop_ok &&
 +                    v7m_stack_read(cpu, &fpscr, frameptr + 0x60, mmu_idx);
 +                if (pop_ok) {
 +                    vfp_set_fpscr(env, fpscr);
 +                }
 +                if (!pop_ok) {
 +                    /*
 +                     * These regs are 0 if security extension present;
 +                     * otherwise merely UNKNOWN. We zero always.
 +                     */
 +                    for (i = 0; i < (restore_s16_s31 ? 32 : 16); i += 2) {
 +                        *aa32_vfp_dreg(env, i / 2) = 0;
 +                    }
 +                    vfp_set_fpscr(env, 0);
 +                }
 +            }
 +        }
 +        env->v7m.control[M_REG_S] = FIELD_DP32(env->v7m.control[M_REG_S],
 +                                               V7M_CONTROL, FPCA, !ftype);
 +
          /* Commit to consuming the stack frame */
          frameptr += 0x20;
 +        if (!ftype) {
 +            frameptr += 0x48;
 +            if (restore_s16_s31) {
 +                frameptr += 0x40;
 +            }
 +        }
          /* Undo stack alignment (the SPREALIGN bit indicates that the original
           * pre-exception SP was not 8-aligned and we added a padding word to
           * align it, so we undo this by ORing in the bit that increases it
@@ -XXX,XX +XXX,XX @@ static void do_v7m_exception_exit(ARMCPU *cpu)
          *frame_sp_p = frameptr;
      }
      /* This xpsr_write() will invalidate frame_sp_p as it may switch stack */
 -    xpsr_write(env, xpsr, ~XPSR_SPREALIGN);
 +    xpsr_write(env, xpsr, ~(XPSR_SPREALIGN | XPSR_SFPA));
 +
 +    if (env->v7m.secure) {
 +        bool sfpa = xpsr & XPSR_SFPA;
 +
 +        env->v7m.control[M_REG_S] = FIELD_DP32(env->v7m.control[M_REG_S],
 +                                               V7M_CONTROL, SFPA, sfpa);
 +    }
      /* The restored xPSR exception field will be zero if we're
       * resuming in Thread mode. If that doesn't match what the
 --
 .20.1
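
Reviewer note on the VHADD conversion above: a halving add computes (a + b) >> 1 per element without losing the carry out of the element width, and truncates rather than rounds. The sketch below shows one way to express that in plain C. The function names are hypothetical (they are not the QEMU helpers), and the signed variants assume the compiler implements right shift of a negative value as an arithmetic shift, which C leaves implementation-defined.

```c
#include <stdint.h>

/* Signed 8-bit halving add: widen to int so the sum cannot overflow. */
static int8_t hadd_s8(int8_t a, int8_t b)
{
    int sum = (int)a + (int)b;
    /* Assumption: >> on a negative int is an arithmetic shift. */
    return (int8_t)(sum >> 1);
}

/* Unsigned 8-bit halving add: the 9-bit sum fits easily in unsigned. */
static uint8_t hadd_u8(uint8_t a, uint8_t b)
{
    unsigned sum = (unsigned)a + (unsigned)b;
    return (uint8_t)(sum >> 1);
}

/*
 * 32-bit variant without a wider type: halve each operand first,
 * then add back the unit that is lost when both low bits are set.
 */
static int32_t hadd_s32(int32_t a, int32_t b)
{
    return (a >> 1) + (b >> 1) + (a & b & 1);
}
```

The 8-bit and 16-bit element sizes can simply widen to int, while the 32-bit size needs the split form (or a 64-bit intermediate) to avoid overflow; this is one reason an expander table like ops[4] above has a separate entry per element size.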

-[Qemu-devel] [PULL 19/42] target/arm: Move NS TBFLAG from bit 19 to bit 6
+[PULL 33/45] target/arm: Convert Neon VABA/VABD 3-reg-same to decodetree
-Move the NS TBFLAG down from bit 19 to bit 6, which has not
+Convert the Neon VABA and VABD insns in the 3-reg-same group to
-been used since commit c1e3781090b9d36c60 in 2015, when we
+decodetree.
 started passing the entire MMU index in the TB flags rather
 than just a 'privilege level' bit.
 This rearrangement is not strictly necessary, but means that
 we can put M-profile-only bits next to each other rather
 than scattered across the flag word.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-17-peter.maydell@linaro.org
+Message-id: 20200512163904.10918-6-peter.maydell@linaro.org
 ---
- target/arm/cpu.h | 11 ++++++-----
+ target/arm/neon-dp.decode       |  6 ++++++
-file changed, 6 insertions(+), 5 deletions(-)
+ target/arm/translate-neon.inc.c |  4 ++++
  target/arm/translate.c          | 22 ++--------------------
 files changed, 12 insertions(+), 20 deletions(-)
-diff --git a/target/arm/cpu.h b/target/arm/cpu.h
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.h
+--- a/target/arm/neon-dp.decode
-+++ b/target/arm/cpu.h
++++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ FIELD(TBFLAG_ANY, BE_DATA, 23, 1)
+@@ -XXX,XX +XXX,XX @@ VMAX_U_3s        1111 001 1 0 . .. .... .... 0110 . . . 0 .... @3same
- FIELD(TBFLAG_A32, THUMB, 0, 1)
+ VMIN_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 1 .... @3same
- FIELD(TBFLAG_A32, VECLEN, 1, 3)
+ VMIN_U_3s        1111 001 1 0 . .. .... .... 0110 . . . 1 .... @3same
- FIELD(TBFLAG_A32, VECSTRIDE, 4, 2)
-+/*
++VABD_S_3s        1111 001 0 0 . .. .... .... 0111 . . . 0 .... @3same
-+ * Indicates whether cp register reads and writes by guest code should access
++VABD_U_3s        1111 001 1 0 . .. .... .... 0111 . . . 0 .... @3same
-+ * the secure or nonsecure bank of banked registers; note that this is not
++
-+ * the same thing as the current security state of the processor!
++VABA_S_3s        1111 001 0 0 . .. .... .... 0111 . . . 1 .... @3same
-+ */
++VABA_U_3s        1111 001 1 0 . .. .... .... 0111 . . . 1 .... @3same
-+FIELD(TBFLAG_A32, NS, 6, 1)
++
- FIELD(TBFLAG_A32, VFPEN, 7, 1)
+ VADD_3s          1111 001 0 0 . .. .... .... 1000 . . . 0 .... @3same
- FIELD(TBFLAG_A32, CONDEXEC, 8, 8)
+ VSUB_3s          1111 001 1 0 . .. .... .... 1000 . . . 0 .... @3same
- FIELD(TBFLAG_A32, SCTLR_B, 16, 1)
-@@ -XXX,XX +XXX,XX @@ FIELD(TBFLAG_A32, SCTLR_B, 16, 1)
+diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
-  * checks on the other bits at runtime
+index XXXXXXX..XXXXXXX 100644
-  */
+--- a/target/arm/translate-neon.inc.c
- FIELD(TBFLAG_A32, XSCALE_CPAR, 17, 2)
++++ b/target/arm/translate-neon.inc.c
--/* Indicates whether cp register reads and writes by guest code should access
+@@ -XXX,XX +XXX,XX @@ DO_3SAME_NO_SZ_3(VMUL, tcg_gen_gvec_mul)
-- * the secure or nonsecure bank of banked registers; note that this is not
+ DO_3SAME_NO_SZ_3(VMLA, gen_gvec_mla)
-- * the same thing as the current security state of the processor!
+ DO_3SAME_NO_SZ_3(VMLS, gen_gvec_mls)
-- */
+ DO_3SAME_NO_SZ_3(VTST, gen_gvec_cmtst)
--FIELD(TBFLAG_A32, NS, 19, 1)
++DO_3SAME_NO_SZ_3(VABD_S, gen_gvec_sabd)
- /* For M profile only, Handler (ie not Thread) mode */
++DO_3SAME_NO_SZ_3(VABA_S, gen_gvec_saba)
- FIELD(TBFLAG_A32, HANDLER, 21, 1)
++DO_3SAME_NO_SZ_3(VABD_U, gen_gvec_uabd)
- /* For M profile only, whether we should generate stack-limit checks */
++DO_3SAME_NO_SZ_3(VABA_U, gen_gvec_uaba)
  #define DO_3SAME_CMP(INSN, COND)                                        \
      static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
              /* VQRDMLSH : handled by decodetree */
              return 1;
 -        case NEON_3R_VABD:
 -            if (u) {
 -                gen_gvec_uabd(size, rd_ofs, rn_ofs, rm_ofs,
 -                              vec_size, vec_size);
 -            } else {
 -                gen_gvec_sabd(size, rd_ofs, rn_ofs, rm_ofs,
 -                              vec_size, vec_size);
 -            }
 -            return 0;
 -
 -        case NEON_3R_VABA:
 -            if (u) {
 -                gen_gvec_uaba(size, rd_ofs, rn_ofs, rm_ofs,
 -                              vec_size, vec_size);
 -            } else {
 -                gen_gvec_saba(size, rd_ofs, rn_ofs, rm_ofs,
 -                              vec_size, vec_size);
 -            }
 -            return 0;
 -
          case NEON_3R_VADD_VSUB:
          case NEON_3R_LOGIC:
          case NEON_3R_VMAX:
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          case NEON_3R_VSHL:
          case NEON_3R_SHA:
          case NEON_3R_VHADD:
 +        case NEON_3R_VABD:
 +        case NEON_3R_VABA:
              /* Already handled by decodetree */
              return 1;
          }
 --
 .20.1
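
Reviewer note on the VABD/VABA conversion above: VABD produces the absolute difference of each pair of elements, and VABA adds that absolute difference into the destination element, wrapping modulo the element width. A minimal per-element sketch for 8-bit lanes follows. The function names are hypothetical and this is not how the gvec expanders are implemented; it only models the arithmetic.

```c
#include <stdint.h>

/* Unsigned absolute difference of two 8-bit elements. */
static uint8_t uabd8(uint8_t a, uint8_t b)
{
    return (uint8_t)(a > b ? a - b : b - a);
}

/*
 * Signed absolute difference. The true result can be as large as 255,
 * so it is computed in int and returned as the raw 8-bit pattern.
 */
static uint8_t sabd8(int8_t a, int8_t b)
{
    int d = (int)a - (int)b;
    return (uint8_t)(d < 0 ? -d : d);
}

/* Absolute difference and accumulate: wraps modulo 256 like the element. */
static uint8_t uaba8(uint8_t acc, uint8_t a, uint8_t b)
{
    return (uint8_t)(acc + uabd8(a, b));
}
```

The accumulate form reads the destination as a third input, which is why VABA is grouped with VMLA and VMLS style expanders rather than with pure two-input operations.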

-[Qemu-devel] [PULL 15/42] target/arm: Clear CONTROL.SFPA in BXNS and BLXNS
+[PULL 34/45] target/arm: Convert Neon VRHADD, VHSUB 3-reg-same insns to decodetree
-For v8M floating point support, transitions from Secure
+Convert the Neon VRHADD and VHSUB 3-reg-same insns to decodetree.
-to Non-secure state via BXNS and BLXNS must clear the
+(These are all the other insns in 3-reg-same which were using
-CONTROL.SFPA bit. (This corresponds to the pseudocode
+GEN_NEON_INTEGER_OP() and which are not pairwise or
-BranchToNS() function.)
+reversed-operands.)
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-13-peter.maydell@linaro.org
+Message-id: 20200512163904.10918-7-peter.maydell@linaro.org
 ---
- target/arm/helper.c | 4 ++++
+ target/arm/neon-dp.decode       | 6 ++++++
-file changed, 4 insertions(+)
+ target/arm/translate-neon.inc.c | 4 ++++
  target/arm/translate.c          | 8 ++------
 files changed, 12 insertions(+), 6 deletions(-)
-diff --git a/target/arm/helper.c b/target/arm/helper.c
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
+--- a/target/arm/neon-dp.decode
-+++ b/target/arm/helper.c
++++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ void HELPER(v7m_bxns)(CPUARMState *env, uint32_t dest)
+@@ -XXX,XX +XXX,XX @@ VHADD_U_3s       1111 001 1 0 . .. .... .... 0000 . . . 0 .... @3same
-     /* translate.c should have made BXNS UNDEF unless we're secure */
+ VQADD_S_3s       1111 001 0 0 . .. .... .... 0000 . . . 1 .... @3same
-     assert(env->v7m.secure);
+ VQADD_U_3s       1111 001 1 0 . .. .... .... 0000 . . . 1 .... @3same
-+    if (!(dest & 1)) {
++VRHADD_S_3s      1111 001 0 0 . .. .... .... 0001 . . . 0 .... @3same
-+        env->v7m.control[M_REG_S] &= ~R_V7M_CONTROL_SFPA_MASK;
++VRHADD_U_3s      1111 001 1 0 . .. .... .... 0001 . . . 0 .... @3same
-+    }
++
-     switch_v7m_security_state(env, dest & 1);
+ @3same_logic     .... ... . . . .. .... .... .... . q:1 .. .... \
-     env->thumb = 1;
+                  &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp size=0
-     env->regs[15] = dest & ~1;
-@@ -XXX,XX +XXX,XX @@ void HELPER(v7m_blxns)(CPUARMState *env, uint32_t dest)
+@@ -XXX,XX +XXX,XX @@ VBSL_3s          1111 001 1 0 . 01 .... .... 0001 ... 1 .... @3same_logic
-          */
+ VBIT_3s          1111 001 1 0 . 10 .... .... 0001 ... 1 .... @3same_logic
-         write_v7m_exception(env, 1);
+ VBIF_3s          1111 001 1 0 . 11 .... .... 0001 ... 1 .... @3same_logic
-     }
-+    env->v7m.control[M_REG_S] &= ~R_V7M_CONTROL_SFPA_MASK;
++VHSUB_S_3s       1111 001 0 0 . .. .... .... 0010 . . . 0 .... @3same
-     switch_v7m_security_state(env, 0);
++VHSUB_U_3s       1111 001 1 0 . .. .... .... 0010 . . . 0 .... @3same
-     env->thumb = 1;
++
-     env->regs[15] = dest;
+ VQSUB_S_3s       1111 001 0 0 . .. .... .... 0010 . . . 1 .... @3same
  VQSUB_U_3s       1111 001 1 0 . .. .... .... 0010 . . . 1 .... @3same
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_3SAME_64_ENV(VQRSHL_U64, gen_helper_neon_qrshl_u64)
  DO_3SAME_32(VHADD_S, hadd_s)
  DO_3SAME_32(VHADD_U, hadd_u)
 +DO_3SAME_32(VHSUB_S, hsub_s)
 +DO_3SAME_32(VHSUB_U, hsub_u)
 +DO_3SAME_32(VRHADD_S, rhadd_s)
 +DO_3SAME_32(VRHADD_U, rhadd_u)
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          case NEON_3R_VSHL:
          case NEON_3R_SHA:
          case NEON_3R_VHADD:
 +        case NEON_3R_VRHADD:
 +        case NEON_3R_VHSUB:
          case NEON_3R_VABD:
          case NEON_3R_VABA:
              /* Already handled by decodetree */
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
              tmp2 = neon_load_reg(rm, pass);
          }
          switch (op) {
 -        case NEON_3R_VRHADD:
 -            GEN_NEON_INTEGER_OP(rhadd);
 -            break;
 -        case NEON_3R_VHSUB:
 -            GEN_NEON_INTEGER_OP(hsub);
 -            break;
          case NEON_3R_VQSHL:
              GEN_NEON_INTEGER_OP_ENV(qshl);
              break;
 --
 .20.1
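
Reviewer note on the VRHADD/VHSUB conversion above: VRHADD is the rounding counterpart of VHADD, computing (a + b + 1) >> 1, and VHSUB computes (a - b) >> 1, both per element and without overflow in the intermediate result. The sketch below models the 8-bit cases only. The function names are hypothetical, and the code assumes right shift of a negative int is an arithmetic shift (implementation-defined in C).

```c
#include <stdint.h>

/* Unsigned rounding halving add: the +1 rounds halves upward. */
static uint8_t rhadd_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)(((unsigned)a + (unsigned)b + 1u) >> 1);
}

/* Signed rounding halving add, computed in int to avoid overflow. */
static int8_t rhadd_s8(int8_t a, int8_t b)
{
    /* Assumption: >> on a negative int is an arithmetic shift. */
    return (int8_t)(((int)a + (int)b + 1) >> 1);
}

/* Signed halving subtract: the difference is halved, rounding down. */
static int8_t hsub_s8(int8_t a, int8_t b)
{
    return (int8_t)(((int)a - (int)b) >> 1);
}

/* Unsigned halving subtract: a negative difference wraps to 8 bits. */
static uint8_t hsub_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)(((int)a - (int)b) >> 1);
}
```

Because all three operations share the "widen, combine, shift right by one" shape, they fit the same per-size expander macro as VHADD, which matches the four one-line DO_3SAME_32 additions in the patch.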

-[Qemu-devel] [PULL 13/42] target/arm: Handle floating point registers in exception entry
+[PULL 35/45] target/arm: Convert Neon VQSHL, VRSHL, VQRSHL 3-reg-same insns to decodetree
-Handle floating point registers in exception entry.
+Convert the VQSHL, VRSHL and VQRSHL insns in the 3-reg-same
-This corresponds to the FP-specific parts of the pseudocode
+group to decodetree. We have already implemented the size==0b11
-functions ActivateException() and PushStack().
+case of these insns; this commit handles the remaining sizes.
 We defer the code corresponding to UpdateFPCCR() to a later patch.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-11-peter.maydell@linaro.org
+Message-id: 20200512163904.10918-8-peter.maydell@linaro.org
 ---
- target/arm/helper.c | 98 +++++++++++++++++++++++++++++++++++++++++++--
+ target/arm/neon-dp.decode       | 30 ++++++++++++++++++-----
-file changed, 95 insertions(+), 3 deletions(-)
+ target/arm/translate-neon.inc.c | 43 +++++++++++++++++++++++++++++++++
  target/arm/translate.c          | 22 +++--------------
 files changed, 70 insertions(+), 25 deletions(-)
-diff --git a/target/arm/helper.c b/target/arm/helper.c
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
+--- a/target/arm/neon-dp.decode
-+++ b/target/arm/helper.c
++++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ static void v7m_exception_taken(ARMCPU *cpu, uint32_t lr, bool dotailchain,
+@@ -XXX,XX +XXX,XX @@ VSHL_U_3s        1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same_rev
-     switch_v7m_security_state(env, targets_secure);
+ @3same_64_rev    .... ... . . . 11 .... .... .... . q:1 . . .... \
-     write_v7m_control_spsel(env, 0);
+                  &3same vm=%vn_dp vn=%vm_dp vd=%vd_dp size=3
-     arm_clear_exclusive(env);
-+    /* Clear SFPA and FPCA (has no effect if no FPU) */
+-VQSHL_S64_3s     1111 001 0 0 . .. .... .... 0100 . . . 1 .... @3same_64_rev
-+    env->v7m.control[M_REG_S] &=
+-VQSHL_U64_3s     1111 001 1 0 . .. .... .... 0100 . . . 1 .... @3same_64_rev
-+        ~(R_V7M_CONTROL_FPCA_MASK | R_V7M_CONTROL_SFPA_MASK);
+-VRSHL_S64_3s     1111 001 0 0 . .. .... .... 0101 . . . 0 .... @3same_64_rev
-     /* Clear IT bits */
+-VRSHL_U64_3s     1111 001 1 0 . .. .... .... 0101 . . . 0 .... @3same_64_rev
-     env->condexec_bits = 0;
+-VQRSHL_S64_3s    1111 001 0 0 . .. .... .... 0101 . . . 1 .... @3same_64_rev
-     env->regs[14] = lr;
+-VQRSHL_U64_3s    1111 001 1 0 . .. .... .... 0101 . . . 1 .... @3same_64_rev
-@@ -XXX,XX +XXX,XX @@ static bool v7m_push_stack(ARMCPU *cpu)
++{
-     uint32_t xpsr = xpsr_read(env);
++  VQSHL_S64_3s   1111 001 0 0 . .. .... .... 0100 . . . 1 .... @3same_64_rev
-     uint32_t frameptr = env->regs[13];
++  VQSHL_S_3s     1111 001 0 0 . .. .... .... 0100 . . . 1 .... @3same_rev
-     ARMMMUIdx mmu_idx = arm_mmu_idx(env);
++}
-+    uint32_t framesize;
++{
-+    bool nsacr_cp10 = extract32(env->v7m.nsacr, 10, 1);
++  VQSHL_U64_3s   1111 001 1 0 . .. .... .... 0100 . . . 1 .... @3same_64_rev
-+
++  VQSHL_U_3s     1111 001 1 0 . .. .... .... 0100 . . . 1 .... @3same_rev
-+    if ((env->v7m.control[M_REG_S] & R_V7M_CONTROL_FPCA_MASK) &&
++}
-+        (env->v7m.secure || nsacr_cp10)) {
++{
-+        if (env->v7m.secure &&
++  VRSHL_S64_3s   1111 001 0 0 . .. .... .... 0101 . . . 0 .... @3same_64_rev
-+            env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_TS_MASK) {
++  VRSHL_S_3s     1111 001 0 0 . .. .... .... 0101 . . . 0 .... @3same_rev
-+            framesize = 0xa8;
++}
-+        } else {
++{
-+            framesize = 0x68;
++  VRSHL_U64_3s   1111 001 1 0 . .. .... .... 0101 . . . 0 .... @3same_64_rev
-+        }
++  VRSHL_U_3s     1111 001 1 0 . .. .... .... 0101 . . . 0 .... @3same_rev
-+    } else {
++}
-+        framesize = 0x20;
++{
-+    }
++  VQRSHL_S64_3s  1111 001 0 0 . .. .... .... 0101 . . . 1 .... @3same_64_rev
++  VQRSHL_S_3s    1111 001 0 0 . .. .... .... 0101 . . . 1 .... @3same_rev
-     /* Align stack pointer if the guest wants that */
++}
-     if ((frameptr & 4) &&
++{
-@@ -XXX,XX +XXX,XX @@ static bool v7m_push_stack(ARMCPU *cpu)
++  VQRSHL_U64_3s  1111 001 1 0 . .. .... .... 0101 . . . 1 .... @3same_64_rev
-         xpsr |= XPSR_SPREALIGN;
++  VQRSHL_U_3s    1111 001 1 0 . .. .... .... 0101 . . . 1 .... @3same_rev
 +}
  VMAX_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 0 .... @3same
  VMAX_U_3s        1111 001 1 0 . .. .... .... 0110 . . . 0 .... @3same
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_3SAME_64_ENV(VQRSHL_U64, gen_helper_neon_qrshl_u64)
          return do_3same(s, a, gen_##INSN##_3s);                         \
      }
--    frameptr -= 0x20;
++/*
-+    xpsr &= ~XPSR_SFPA;
++ * Some helper functions need to be passed the cpu_env. In order
-+    if (env->v7m.secure &&
++ * to use those with the gvec APIs like tcg_gen_gvec_3() we need
-+        (env->v7m.control[M_REG_S] & R_V7M_CONTROL_SFPA_MASK)) {
++ * to create wrapper functions whose prototype is a NeonGenTwoOpFn()
-+        xpsr |= XPSR_SFPA;
++ * and which call a NeonGenTwoOpEnvFn().
 + */
 +#define WRAP_ENV_FN(WRAPNAME, FUNC)                                     \
 +    static void WRAPNAME(TCGv_i32 d, TCGv_i32 n, TCGv_i32 m)            \
 +    {                                                                   \
 +        FUNC(d, cpu_env, n, m);                                         \
 +    }
 +
-+    frameptr -= framesize;
++#define DO_3SAME_32_ENV(INSN, FUNC)                                     \
++    WRAP_ENV_FN(gen_##INSN##_tramp8, gen_helper_neon_##FUNC##8);        \
-     if (arm_feature(env, ARM_FEATURE_V8)) {
++    WRAP_ENV_FN(gen_##INSN##_tramp16, gen_helper_neon_##FUNC##16);      \
-         uint32_t limit = v7m_sp_limit(env);
++    WRAP_ENV_FN(gen_##INSN##_tramp32, gen_helper_neon_##FUNC##32);      \
-@@ -XXX,XX +XXX,XX @@ static bool v7m_push_stack(ARMCPU *cpu)
++    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
-         v7m_stack_write(cpu, frameptr + 24, env->regs[15], mmu_idx, false) &&
++                                uint32_t rn_ofs, uint32_t rm_ofs,       \
-         v7m_stack_write(cpu, frameptr + 28, xpsr, mmu_idx, false);
++                                uint32_t oprsz, uint32_t maxsz)         \
++    {                                                                   \
-+    if (env->v7m.control[M_REG_S] & R_V7M_CONTROL_FPCA_MASK) {
++        static const GVecGen3 ops[4] = {                                \
-+        /* FPU is active, try to save its registers */
++            { .fni4 = gen_##INSN##_tramp8 },                            \
-+        bool fpccr_s = env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_S_MASK;
++            { .fni4 = gen_##INSN##_tramp16 },                           \
-+        bool lspact = env->v7m.fpccr[fpccr_s] & R_V7M_FPCCR_LSPACT_MASK;
++            { .fni4 = gen_##INSN##_tramp32 },                           \
-+
++            { 0 },                                                      \
-+        if (lspact && arm_feature(env, ARM_FEATURE_M_SECURITY)) {
++        };                                                              \
-+            qemu_log_mask(CPU_LOG_INT,
++        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &ops[vece]); \
-+                          "...SecureFault because LSPACT and FPCA both set\n");
++    }                                                                   \
-+            env->v7m.sfsr |= R_V7M_SFSR_LSERR_MASK;
++    static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a)        \
-+            armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_SECURE, false);
++    {                                                                   \
-+        } else if (!env->v7m.secure && !nsacr_cp10) {
++        if (a->size > 2) {                                              \
-+            qemu_log_mask(CPU_LOG_INT,
++            return false;                                               \
-+                          "...Secure UsageFault with CFSR.NOCP because "
++        }                                                               \
-+                          "NSACR.CP10 prevents stacking FP regs\n");
++        return do_3same(s, a, gen_##INSN##_3s);                         \
 +            armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE, M_REG_S);
 +            env->v7m.cfsr[M_REG_S] |= R_V7M_CFSR_NOCP_MASK;
 +        } else {
 +            if (!(env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_LSPEN_MASK)) {
 +                /* Lazy stacking disabled, save registers now */
 +                int i;
 +                bool cpacr_pass = v7m_cpacr_pass(env, env->v7m.secure,
 +                                                 arm_current_el(env) != 0);
 +
 +                if (stacked_ok && !cpacr_pass) {
 +                    /*
 +                     * Take UsageFault if CPACR forbids access. The pseudocode
 +                     * here does a full CheckCPEnabled() but we know the NSACR
 +                     * check can never fail as we have already handled that.
 +                     */
 +                    qemu_log_mask(CPU_LOG_INT,
 +                                  "...UsageFault with CFSR.NOCP because "
 +                                  "CPACR.CP10 prevents stacking FP regs\n");
 +                    armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE,
 +                                            env->v7m.secure);
 +                    env->v7m.cfsr[env->v7m.secure] |= R_V7M_CFSR_NOCP_MASK;
 +                    stacked_ok = false;
 +                }
 +
 +                for (i = 0; i < ((framesize == 0xa8) ? 32 : 16); i += 2) {
 +                    uint64_t dn = *aa32_vfp_dreg(env, i / 2);
 +                    uint32_t faddr = frameptr + 0x20 + 4 * i;
 +                    uint32_t slo = extract64(dn, 0, 32);
 +                    uint32_t shi = extract64(dn, 32, 32);
 +
 +                    if (i >= 16) {
 +                        faddr += 8; /* skip the slot for the FPSCR */
 +                    }
 +                    stacked_ok = stacked_ok &&
 +                        v7m_stack_write(cpu, faddr, slo, mmu_idx, false) &&
 +                        v7m_stack_write(cpu, faddr + 4, shi, mmu_idx, false);
 +                }
 +                stacked_ok = stacked_ok &&
 +                    v7m_stack_write(cpu, frameptr + 0x60,
 +                                    vfp_get_fpscr(env), mmu_idx, false);
 +                if (cpacr_pass) {
 +                    for (i = 0; i < ((framesize == 0xa8) ? 32 : 16); i += 2) {
 +                        *aa32_vfp_dreg(env, i / 2) = 0;
 +                    }
 +                    vfp_set_fpscr(env, 0);
 +                }
 +            } else {
 +                /* Lazy stacking enabled, save necessary info to stack later */
 +                /* TODO : equivalent of UpdateFPCCR() pseudocode */
 +            }
 +        }
 +    }
 +
-     /*
+ DO_3SAME_32(VHADD_S, hadd_s)
-      * If we broke a stack limit then SP was already updated earlier;
+ DO_3SAME_32(VHADD_U, hadd_u)
-      * otherwise we update SP regardless of whether any of the stack
+ DO_3SAME_32(VHSUB_S, hsub_s)
-@@ -XXX,XX +XXX,XX @@ void arm_v7m_cpu_do_interrupt(CPUState *cs)
+ DO_3SAME_32(VHSUB_U, hsub_u)
+ DO_3SAME_32(VRHADD_S, rhadd_s)
-     if (arm_feature(env, ARM_FEATURE_V8)) {
+ DO_3SAME_32(VRHADD_U, rhadd_u)
-         lr = R_V7M_EXCRET_RES1_MASK |
++DO_3SAME_32(VRSHL_S, rshl_s)
--            R_V7M_EXCRET_DCRS_MASK |
++DO_3SAME_32(VRSHL_U, rshl_u)
--            R_V7M_EXCRET_FTYPE_MASK;
++
-+            R_V7M_EXCRET_DCRS_MASK;
++DO_3SAME_32_ENV(VQSHL_S, qshl_s)
-         /* The S bit indicates whether we should return to Secure
++DO_3SAME_32_ENV(VQSHL_U, qshl_u)
-          * or NonSecure (ie our current state).
++DO_3SAME_32_ENV(VQRSHL_S, qrshl_s)
-          * The ES bit indicates whether we're taking this exception
++DO_3SAME_32_ENV(VQRSHL_U, qrshl_u)
-@@ -XXX,XX +XXX,XX @@ void arm_v7m_cpu_do_interrupt(CPUState *cs)
+diff --git a/target/arm/translate.c b/target/arm/translate.c
-         if (env->v7m.secure) {
+index XXXXXXX..XXXXXXX 100644
-             lr |= R_V7M_EXCRET_S_MASK;
+--- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          case NEON_3R_VHSUB:
          case NEON_3R_VABD:
          case NEON_3R_VABA:
 +        case NEON_3R_VQSHL:
 +        case NEON_3R_VRSHL:
 +        case NEON_3R_VQRSHL:
              /* Already handled by decodetree */
              return 1;
          }
-+        if (!(env->v7m.control[M_REG_S] & R_V7M_CONTROL_FPCA_MASK)) {
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-+            lr |= R_V7M_EXCRET_FTYPE_MASK;
+         }
-+        }
+         pairwise = 0;
-     } else {
+         switch (op) {
-         lr = R_V7M_EXCRET_RES1_MASK |
+-        case NEON_3R_VQSHL:
-             R_V7M_EXCRET_S_MASK |
+-        case NEON_3R_VRSHL:
 -        case NEON_3R_VQRSHL:
 -            {
 -                int rtmp;
 -                /* Shift instruction operands are reversed.  */
 -                rtmp = rn;
 -                rn = rm;
 -                rm = rtmp;
 -            }
 -            break;
          case NEON_3R_VPADD_VQRDMLAH:
          case NEON_3R_VPMAX:
          case NEON_3R_VPMIN:
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
              tmp2 = neon_load_reg(rm, pass);
          }
          switch (op) {
 -        case NEON_3R_VQSHL:
 -            GEN_NEON_INTEGER_OP_ENV(qshl);
 -            break;
 -        case NEON_3R_VRSHL:
 -            GEN_NEON_INTEGER_OP(rshl);
 -            break;
 -        case NEON_3R_VQRSHL:
 -            GEN_NEON_INTEGER_OP_ENV(qrshl);
              break;
          case NEON_3R_VPMAX:
              GEN_NEON_INTEGER_OP(pmax);
 --
 .20.1

-[Qemu-devel] [PULL 08/42] target/arm: Honour M-profile FP enable bits
+[PULL 36/45] target/arm: Convert Neon VPMAX/VPMIN 3-reg-same insns to decodetree
-Like AArch64, M-profile floating point has no FPEXC enable
+Convert the Neon integer VPMAX and VPMIN 3-reg-same insns to
-bit to gate floating point; so always set the VFPEN TB flag.
+decodetree. These are 'pairwise' operations.
 M-profile also has CPACR and NSACR similar to A-profile;
 they behave slightly differently:
  * the CPACR is banked between Secure and Non-Secure
  * if the NSACR forces a trap then this is taken to
    the Secure state, not the Non-Secure state
 Honour the CPACR and NSACR settings. The NSACR handling
 requires us to borrow the exception.target_el field
 (usually meaningless for M profile) to distinguish the
 NOCP UsageFault taken to Secure state from the more
 usual fault taken to the current security state.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-6-peter.maydell@linaro.org
+Message-id: 20200512163904.10918-9-peter.maydell@linaro.org
 ---
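Note (illustration only, not part of the patch): do_3same_pair() loads
both halves of Vn and Vm and computes both results before storing anything
to Vd, because Vd may be the same register as Vm. The standalone sketch
below shows that hazard for the 32-bit element case. DReg, Op32, umax32,
pairwise_u32 and pairwise_u32_naive are hypothetical names, and a plain
struct stands in for a Neon D register.

```c
#include <assert.h>
#include <stdint.h>

/* A 64-bit D register modelled as two 32-bit elements. */
typedef struct { uint32_t e[2]; } DReg;

typedef uint32_t Op32(uint32_t, uint32_t);

static uint32_t umax32(uint32_t a, uint32_t b)
{
    return a > b ? a : b;
}

/*
 * Pairwise operation: d[0] = fn(n[0], n[1]); d[1] = fn(m[0], m[1]).
 * Both results are computed before either is stored, so d may alias
 * n or m safely.
 */
static void pairwise_u32(DReg *d, const DReg *n, const DReg *m, Op32 *fn)
{
    uint32_t lo = fn(n->e[0], n->e[1]);
    uint32_t hi = fn(m->e[0], m->e[1]);

    d->e[0] = lo;
    d->e[1] = hi;
}

/* Naive variant that stores the first result too early. */
static void pairwise_u32_naive(DReg *d, const DReg *n, const DReg *m,
                               Op32 *fn)
{
    d->e[0] = fn(n->e[0], n->e[1]);     /* clobbers m->e[0] if d == m */
    d->e[1] = fn(m->e[0], m->e[1]);
}
```

With n = {3, 9} and m = {7, 2}, writing the result back into m gives
{9, 7} with pairwise_u32() but {9, 9} with the naive variant.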
- target/arm/helper.c    | 55 +++++++++++++++++++++++++++++++++++++++---
+ target/arm/neon-dp.decode       |  9 +++++
- target/arm/translate.c | 10 ++++++--
+ target/arm/translate-neon.inc.c | 71 +++++++++++++++++++++++++++++++++
-files changed, 60 insertions(+), 5 deletions(-)
+ target/arm/translate.c          | 17 +-------
 3 files changed, 82 insertions(+), 15 deletions(-)
-diff --git a/target/arm/helper.c b/target/arm/helper.c
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
+--- a/target/arm/neon-dp.decode
-+++ b/target/arm/helper.c
++++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ uint32_t arm_phys_excp_target_el(CPUState *cs, uint32_t excp_idx,
+@@ -XXX,XX +XXX,XX @@
-     return target_el;
+ @3same           .... ... . . . size:2 .... .... .... . q:1 . . .... \
- }
+                  &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp
-+/*
++@3same_q0        .... ... . . . size:2 .... .... .... . 0 . . .... \
-+ * Return true if the v7M CPACR permits access to the FPU for the specified
++                 &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp q=0
-+ * security state and privilege level.
++
-+ */
+ VHADD_S_3s       1111 001 0 0 . .. .... .... 0000 . . . 0 .... @3same
-+static bool v7m_cpacr_pass(CPUARMState *env, bool is_secure, bool is_priv)
+ VHADD_U_3s       1111 001 1 0 . .. .... .... 0000 . . . 0 .... @3same
  VQADD_S_3s       1111 001 0 0 . .. .... .... 0000 . . . 1 .... @3same
@@ -XXX,XX +XXX,XX @@ VMLS_3s          1111 001 1 0 . .. .... .... 1001 . . . 0 .... @3same
  VMUL_3s          1111 001 0 0 . .. .... .... 1001 . . . 1 .... @3same
  VMUL_p_3s        1111 001 1 0 . .. .... .... 1001 . . . 1 .... @3same
 +VPMAX_S_3s       1111 001 0 0 . .. .... .... 1010 . . . 0 .... @3same_q0
 +VPMAX_U_3s       1111 001 1 0 . .. .... .... 1010 . . . 0 .... @3same_q0
 +
 +VPMIN_S_3s       1111 001 0 0 . .. .... .... 1010 . . . 1 .... @3same_q0
 +VPMIN_U_3s       1111 001 1 0 . .. .... .... 1010 . . . 1 .... @3same_q0
 +
  VQRDMLAH_3s      1111 001 1 0 . .. .... .... 1011 ... 1 .... @3same
  SHA1_3s          1111 001 0 0 . optype:2 .... .... 1100 . 1 . 0 .... \
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_3SAME_32_ENV(VQSHL_S, qshl_s)
  DO_3SAME_32_ENV(VQSHL_U, qshl_u)
  DO_3SAME_32_ENV(VQRSHL_S, qrshl_s)
  DO_3SAME_32_ENV(VQRSHL_U, qrshl_u)
 +
 +static bool do_3same_pair(DisasContext *s, arg_3same *a, NeonGenTwoOpFn *fn)
 +{
-+    switch (extract32(env->v7m.cpacr[is_secure], 20, 2)) {
++    /* Operations handled pairwise 32 bits at a time */
-+    case 0:
++    TCGv_i32 tmp, tmp2, tmp3;
-+    case 2: /* UNPREDICTABLE: we treat like 0 */
++
 +    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
 +        return false;
-+    case 1:
++    }
-+        return is_priv;
++
-+    case 3:
++    /* UNDEF accesses to D16-D31 if they don't exist. */
 +    if (!dc_isar_feature(aa32_simd_r32, s) &&
 +        ((a->vd | a->vn | a->vm) & 0x10)) {
 +        return false;
 +    }
 +
 +    if (a->size == 3) {
 +        return false;
 +    }
 +
 +    if (!vfp_access_check(s)) {
 +        return true;
-+    default:
-+        g_assert_not_reached();
 +    }
++
++    assert(a->q == 0); /* enforced by decode patterns */
++
++    /*
++     * Note that we have to be careful not to clobber the source operands
++     * in the "vm == vd" case by storing the result of the first pass too
++     * early. Since Q is 0 there are always just two passes, so instead
++     * of a complicated loop over each pass we just unroll.
++     */
++    tmp = neon_load_reg(a->vn, 0);
++    tmp2 = neon_load_reg(a->vn, 1);
++    fn(tmp, tmp, tmp2);
++    tcg_temp_free_i32(tmp2);
++
++    tmp3 = neon_load_reg(a->vm, 0);
++    tmp2 = neon_load_reg(a->vm, 1);
++    fn(tmp3, tmp3, tmp2);
++    tcg_temp_free_i32(tmp2);
++
++    neon_store_reg(a->vd, 0, tmp);
++    neon_store_reg(a->vd, 1, tmp3);
++    return true;
 +}
 +
- static bool v7m_stack_write(ARMCPU *cpu, uint32_t addr, uint32_t value,
++#define DO_3SAME_PAIR(INSN, func)                                       \
-                             ARMMMUIdx mmu_idx, bool ignfault)
++    static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a)        \
- {
++    {                                                                   \
-@@ -XXX,XX +XXX,XX @@ void arm_v7m_cpu_do_interrupt(CPUState *cs)
++        static NeonGenTwoOpFn * const fns[] = {                         \
-         env->v7m.cfsr[env->v7m.secure] |= R_V7M_CFSR_UNDEFINSTR_MASK;
++            gen_helper_neon_##func##8,                                  \
-         break;
++            gen_helper_neon_##func##16,                                 \
-     case EXCP_NOCP:
++            gen_helper_neon_##func##32,                                 \
--        armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE, env->v7m.secure);
++        };                                                              \
--        env->v7m.cfsr[env->v7m.secure] |= R_V7M_CFSR_NOCP_MASK;
++        if (a->size > 2) {                                              \
-+    {
++            return false;                                               \
-+        /*
++        }                                                               \
-+         * NOCP might be directed to something other than the current
++        return do_3same_pair(s, a, fns[a->size]);                       \
 +         * security state if this fault is because of NSACR; we indicate
 +         * the target security state using exception.target_el.
 +         */
 +        int target_secstate;
 +
 +        if (env->exception.target_el == 3) {
 +            target_secstate = M_REG_S;
 +        } else {
 +            target_secstate = env->v7m.secure;
 +        }
 +        armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE, target_secstate);
 +        env->v7m.cfsr[target_secstate] |= R_V7M_CFSR_NOCP_MASK;
          break;
 +    }
      case EXCP_INVSTATE:
          armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE, env->v7m.secure);
          env->v7m.cfsr[env->v7m.secure] |= R_V7M_CFSR_INVSTATE_MASK;
@@ -XXX,XX +XXX,XX @@ int fp_exception_el(CPUARMState *env, int cur_el)
          return 0;
      }
 +    if (arm_feature(env, ARM_FEATURE_M)) {
 +        /* CPACR can cause a NOCP UsageFault taken to current security state */
 +        if (!v7m_cpacr_pass(env, env->v7m.secure, cur_el != 0)) {
 +            return 1;
 +        }
 +
 +        if (arm_feature(env, ARM_FEATURE_M_SECURITY) && !env->v7m.secure) {
 +            if (!extract32(env->v7m.nsacr, 10, 1)) {
 +                /* FP insns cause a NOCP UsageFault taken to Secure */
 +                return 3;
 +            }
 +        }
 +
 +        return 0;
 +    }
 +
-     /* The CPACR controls traps to EL1, or PL1 if we're 32 bit:
++/* 32-bit pairwise ops end up the same as the elementwise versions.  */
-      * 0, 2 : trap EL0 and EL1/PL1 accesses
++#define gen_helper_neon_pmax_s32  tcg_gen_smax_i32
-      * 1    : trap only EL0 accesses
++#define gen_helper_neon_pmax_u32  tcg_gen_umax_i32
-@@ -XXX,XX +XXX,XX @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
++#define gen_helper_neon_pmin_s32  tcg_gen_smin_i32
-         flags = FIELD_DP32(flags, TBFLAG_A32, SCTLR_B, arm_sctlr_b(env));
++#define gen_helper_neon_pmin_u32  tcg_gen_umin_i32
-         flags = FIELD_DP32(flags, TBFLAG_A32, NS, !access_secure_reg(env));
++
-         if (env->vfp.xregs[ARM_VFP_FPEXC] & (1 << 30)
++DO_3SAME_PAIR(VPMAX_S, pmax_s)
--            || arm_el_is_aa64(env, 1)) {
++DO_3SAME_PAIR(VPMIN_S, pmin_s)
-+            || arm_el_is_aa64(env, 1) || arm_feature(env, ARM_FEATURE_M)) {
++DO_3SAME_PAIR(VPMAX_U, pmax_u)
-             flags = FIELD_DP32(flags, TBFLAG_A32, VFPEN, 1);
++DO_3SAME_PAIR(VPMIN_U, pmin_u)
          }
          flags = FIELD_DP32(flags, TBFLAG_A32, XSCALE_CPAR, env->cp15.c15_cpar);
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static inline void gen_neon_rsb(int size, TCGv_i32 t0, TCGv_i32 t1)
       * for attempts to execute invalid vfp/neon encodings with FP disabled.
       */
      if (s->fp_excp_el) {
 -        gen_exception_insn(s, 4, EXCP_UDEF,
 -                           syn_fp_access_trap(1, 0xe, false), s->fp_excp_el);
 +        if (arm_dc_feature(s, ARM_FEATURE_M)) {
 +            gen_exception_insn(s, 4, EXCP_NOCP, syn_uncategorized(),
 +                               s->fp_excp_el);
 +        } else {
 +            gen_exception_insn(s, 4, EXCP_UDEF,
 +                               syn_fp_access_trap(1, 0xe, false),
 +                               s->fp_excp_el);
 +        }
          return 0;
      }
+ }
 -/* 32-bit pairwise ops end up the same as the elementwise versions.  */
 -#define gen_helper_neon_pmax_s32  tcg_gen_smax_i32
 -#define gen_helper_neon_pmax_u32  tcg_gen_umax_i32
 -#define gen_helper_neon_pmin_s32  tcg_gen_smin_i32
 -#define gen_helper_neon_pmin_u32  tcg_gen_umin_i32
 -
  #define GEN_NEON_INTEGER_OP_ENV(name) do { \
      switch ((size << 1) | u) { \
      case 0: \
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          case NEON_3R_VQSHL:
          case NEON_3R_VRSHL:
          case NEON_3R_VQRSHL:
 +        case NEON_3R_VPMAX:
 +        case NEON_3R_VPMIN:
              /* Already handled by decodetree */
              return 1;
          }
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          pairwise = 0;
          switch (op) {
          case NEON_3R_VPADD_VQRDMLAH:
 -        case NEON_3R_VPMAX:
 -        case NEON_3R_VPMIN:
              pairwise = 1;
              break;
          case NEON_3R_FLOAT_ARITH:
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
              tmp2 = neon_load_reg(rm, pass);
          }
          switch (op) {
 -            break;
 -        case NEON_3R_VPMAX:
 -            GEN_NEON_INTEGER_OP(pmax);
 -            break;
 -        case NEON_3R_VPMIN:
 -            GEN_NEON_INTEGER_OP(pmin);
 -            break;
          case NEON_3R_VQDMULH_VQRDMULH: /* Multiply high.  */
              if (!u) { /* VQDMULH */
                  switch (size) {
 --
 .20.1

-[Qemu-devel] [PULL 10/42] target/arm: Clear CONTROL_S.SFPA in SG insn if FPU present
+[PULL 37/45] target/arm: Convert Neon VPADD 3-reg-same insns to decodetree
-If the floating point extension is present, then the SG instruction
+Convert the Neon integer VPADD 3-reg-same insns to decodetree.  These
-must clear the CONTROL_S.SFPA bit. Implement this.
+are 'pairwise' operations.  (Note that VQRDMLAH, which shares the
+same primary opcode but has U=1, has already been converted.)
 (On a no-FPU system the bit will always be zero, so we don't need
 to make the clearing of the bit conditional on ARM_FEATURE_VFP.)
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-8-peter.maydell@linaro.org
+Message-id: 20200512163904.10918-10-peter.maydell@linaro.org
 ---
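Note (illustration only, not part of the patch): this patch maps the
32-bit case of VPADD to tcg_gen_add_i32, because when each 32-bit half of
a D register holds exactly one element, adding the adjacent pair is just
an ordinary add of the two halves. For 8-bit elements each half holds four
packed bytes, so a dedicated helper is needed. The sketch below models
both cases in plain C. padd_u8_sketch and padd_u32_sketch are hypothetical
names written from the architectural description of VPADD; they are not
QEMU's helper sources.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pairwise add of 8-bit elements. a and b are the low and high 32-bit
 * halves of one source D register, each holding four packed bytes.
 * The result packs the four sums of adjacent bytes, modulo 256.
 */
static uint32_t padd_u8_sketch(uint32_t a, uint32_t b)
{
    uint32_t r0 = (a + (a >> 8)) & 0xff;            /* a[0] + a[1] */
    uint32_t r1 = ((a >> 16) + (a >> 24)) & 0xff;   /* a[2] + a[3] */
    uint32_t r2 = (b + (b >> 8)) & 0xff;            /* b[0] + b[1] */
    uint32_t r3 = ((b >> 16) + (b >> 24)) & 0xff;   /* b[2] + b[3] */

    return r0 | (r1 << 8) | (r2 << 16) | (r3 << 24);
}

/*
 * Pairwise add of 32-bit elements. Each half is a single element, so
 * the adjacent pair is simply (a, b) and the result is a plain add.
 */
static uint32_t padd_u32_sketch(uint32_t a, uint32_t b)
{
    return a + b;
}
```

For example, padd_u8_sketch(0x04030201, 0x08070605) yields 0x0f0b0703:
the byte pairs (1,2), (3,4), (5,6) and (7,8) sum to 3, 7, 11 and 15.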
- target/arm/helper.c | 1 +
+ target/arm/neon-dp.decode       |  2 ++
-file changed, 1 insertion(+)
+ target/arm/translate-neon.inc.c |  2 ++
  target/arm/translate.c          | 19 +------------------
 3 files changed, 5 insertions(+), 18 deletions(-)
-diff --git a/target/arm/helper.c b/target/arm/helper.c
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
+--- a/target/arm/neon-dp.decode
-+++ b/target/arm/helper.c
++++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ static bool v7m_handle_execute_nsc(ARMCPU *cpu)
+@@ -XXX,XX +XXX,XX @@ VPMAX_U_3s       1111 001 1 0 . .. .... .... 1010 . . . 0 .... @3same_q0
-     qemu_log_mask(CPU_LOG_INT, "...really an SG instruction at 0x%08" PRIx32
+ VPMIN_S_3s       1111 001 0 0 . .. .... .... 1010 . . . 1 .... @3same_q0
-                   ", executing it\n", env->regs[15]);
+ VPMIN_U_3s       1111 001 1 0 . .. .... .... 1010 . . . 1 .... @3same_q0
-     env->regs[14] &= ~1;
-+    env->v7m.control[M_REG_S] &= ~R_V7M_CONTROL_SFPA_MASK;
++VPADD_3s         1111 001 0 0 . .. .... .... 1011 . . . 1 .... @3same_q0
-     switch_v7m_security_state(env, true);
++
-     xpsr_write(env, 0, XPSR_IT);
+ VQRDMLAH_3s      1111 001 1 0 . .. .... .... 1011 ... 1 .... @3same
-     env->regs[15] += 4;
  SHA1_3s          1111 001 0 0 . optype:2 .... .... 1100 . 1 . 0 .... \
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool do_3same_pair(DisasContext *s, arg_3same *a, NeonGenTwoOpFn *fn)
  #define gen_helper_neon_pmax_u32  tcg_gen_umax_i32
  #define gen_helper_neon_pmin_s32  tcg_gen_smin_i32
  #define gen_helper_neon_pmin_u32  tcg_gen_umin_i32
 +#define gen_helper_neon_padd_u32  tcg_gen_add_i32
  DO_3SAME_PAIR(VPMAX_S, pmax_s)
  DO_3SAME_PAIR(VPMIN_S, pmin_s)
  DO_3SAME_PAIR(VPMAX_U, pmax_u)
  DO_3SAME_PAIR(VPMIN_U, pmin_u)
 +DO_3SAME_PAIR(VPADD, padd_u)
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
              return 1;
          }
          switch (op) {
 -        case NEON_3R_VPADD_VQRDMLAH:
 -            if (!u) {
 -                break;  /* VPADD */
 -            }
 -            /* VQRDMLAH : handled by decodetree */
 -            return 1;
 -
          case NEON_3R_VFM_VQRDMLSH:
              if (!u) {
                  /* VFM, VFMS */
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          case NEON_3R_VQRSHL:
          case NEON_3R_VPMAX:
          case NEON_3R_VPMIN:
 +        case NEON_3R_VPADD_VQRDMLAH:
              /* Already handled by decodetree */
              return 1;
          }
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          }
          pairwise = 0;
          switch (op) {
 -        case NEON_3R_VPADD_VQRDMLAH:
 -            pairwise = 1;
 -            break;
          case NEON_3R_FLOAT_ARITH:
              pairwise = (u && size < 2); /* if VPADD (float) */
              break;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                  }
              }
              break;
 -        case NEON_3R_VPADD_VQRDMLAH:
 -            switch (size) {
 -            case 0: gen_helper_neon_padd_u8(tmp, tmp, tmp2); break;
 -            case 1: gen_helper_neon_padd_u16(tmp, tmp, tmp2); break;
 -            case 2: tcg_gen_add_i32(tmp, tmp, tmp2); break;
 -            default: abort();
 -            }
 -            break;
          case NEON_3R_FLOAT_ARITH: /* Floating point arithmetic. */
          {
              TCGv_ptr fpstatus = get_fpstatus_ptr(1);
 --
 .20.1

-[Qemu-devel] [PULL 16/42] target/arm: Clean excReturn bits when tail chaining
+[PULL 38/45] target/arm: Convert Neon VQDMULH/VQRDMULH 3-reg-same to decodetree
-The TailChain() pseudocode specifies that a tail chaining
+Convert the Neon VQDMULH and VQRDMULH 3-reg-same insns to
-exception should sanitize the excReturn all-ones bits and
+decodetree. These are the last integer operations in the
-(if there is no FPU) the excReturn FType bits; we weren't
+3-reg-same group.
 doing this.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-14-peter.maydell@linaro.org
+Message-id: 20200512163904.10918-11-peter.maydell@linaro.org
 ---
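Note (illustration only, not part of the patch): VQDMULH and VQRDMULH
exist only for 16-bit and 32-bit elements, so the patch keeps a two-entry
table indexed by vece - 1 and has the trans function reject any other
size. The standalone sketch below shows that dispatch shape for the
non-rounding VQDMULH case. The arithmetic follows the architectural
description (double the product, keep the high half, saturate), with the
QC flag update left out. All names here (floor_shr, qdmulh16, qdmulh32,
trans_qdmulh) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Divide by 2^bits rounding towards minus infinity, without relying on
 * implementation-defined right shifts of negative values. */
static int64_t floor_shr(int64_t v, unsigned bits)
{
    int64_t div = (int64_t)1 << bits;
    int64_t q = v / div;

    return (v % div < 0) ? q - 1 : q;
}

/*
 * High half of the doubled product. (2 * a * b) >> esize is computed as
 * (a * b) >> (esize - 1) so the intermediate always fits in 64 bits.
 */
static int32_t qdmulh16(int32_t a, int32_t b)
{
    int64_t r = floor_shr((int64_t)a * b, 15);

    return r > INT16_MAX ? INT16_MAX : (int32_t)r;
}

static int32_t qdmulh32(int32_t a, int32_t b)
{
    int64_t r = floor_shr((int64_t)a * b, 31);

    return r > INT32_MAX ? INT32_MAX : (int32_t)r;
}

typedef int32_t ElemFn(int32_t, int32_t);

/* size is the 2-bit element-size field: 1 is 16-bit, 2 is 32-bit. */
static bool trans_qdmulh(unsigned size, int32_t a, int32_t b, int32_t *out)
{
    static ElemFn * const ops[2] = { qdmulh16, qdmulh32 };

    if (size != 1 && size != 2) {
        return false;           /* no 8-bit or 64-bit form */
    }
    *out = ops[size - 1](a, b);
    return true;
}
```

The only input pair that saturates in this sketch is the most negative
value multiplied by itself, for example INT16_MIN * INT16_MIN in the
16-bit case.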
- target/arm/helper.c | 8 ++++++++
+ target/arm/neon-dp.decode       |  3 +++
-file changed, 8 insertions(+)
+ target/arm/translate-neon.inc.c | 24 ++++++++++++++++++++++++
  target/arm/translate.c          | 24 +-----------------------
 3 files changed, 28 insertions(+), 23 deletions(-)
-diff --git a/target/arm/helper.c b/target/arm/helper.c
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
+--- a/target/arm/neon-dp.decode
-+++ b/target/arm/helper.c
++++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ static void v7m_exception_taken(ARMCPU *cpu, uint32_t lr, bool dotailchain,
+@@ -XXX,XX +XXX,XX @@ VPMAX_U_3s       1111 001 1 0 . .. .... .... 1010 . . . 0 .... @3same_q0
-     qemu_log_mask(CPU_LOG_INT, "...taking pending %s exception %d\n",
+ VPMIN_S_3s       1111 001 0 0 . .. .... .... 1010 . . . 1 .... @3same_q0
-                   targets_secure ? "secure" : "nonsecure", exc);
+ VPMIN_U_3s       1111 001 1 0 . .. .... .... 1010 . . . 1 .... @3same_q0
-+    if (dotailchain) {
++VQDMULH_3s       1111 001 0 0 . .. .... .... 1011 . . . 0 .... @3same
-+        /* Sanitize LR FType and PREFIX bits */
++VQRDMULH_3s      1111 001 1 0 . .. .... .... 1011 . . . 0 .... @3same
-+        if (!arm_feature(env, ARM_FEATURE_VFP)) {
++
-+            lr |= R_V7M_EXCRET_FTYPE_MASK;
+ VPADD_3s         1111 001 0 0 . .. .... .... 1011 . . . 1 .... @3same_q0
-+        }
-+        lr = deposit32(lr, 24, 8, 0xff);
+ VQRDMLAH_3s      1111 001 1 0 . .. .... .... 1011 ... 1 .... @3same
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_3SAME_PAIR(VPMIN_S, pmin_s)
  DO_3SAME_PAIR(VPMAX_U, pmax_u)
  DO_3SAME_PAIR(VPMIN_U, pmin_u)
  DO_3SAME_PAIR(VPADD, padd_u)
 +
 +#define DO_3SAME_VQDMULH(INSN, FUNC)                                    \
 +    WRAP_ENV_FN(gen_##INSN##_tramp16, gen_helper_neon_##FUNC##_s16);    \
 +    WRAP_ENV_FN(gen_##INSN##_tramp32, gen_helper_neon_##FUNC##_s32);    \
 +    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
 +                                uint32_t rn_ofs, uint32_t rm_ofs,       \
 +                                uint32_t oprsz, uint32_t maxsz)         \
 +    {                                                                   \
 +        static const GVecGen3 ops[2] = {                                \
 +            { .fni4 = gen_##INSN##_tramp16 },                           \
 +            { .fni4 = gen_##INSN##_tramp32 },                           \
 +        };                                                              \
 +        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &ops[vece - 1]); \
 +    }                                                                   \
 +    static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a)        \
 +    {                                                                   \
 +        if (a->size != 1 && a->size != 2) {                             \
 +            return false;                                               \
 +        }                                                               \
 +        return do_3same(s, a, gen_##INSN##_3s);                         \
 +    }
 +
-     if (arm_feature(env, ARM_FEATURE_V8)) {
++DO_3SAME_VQDMULH(VQDMULH, qdmulh)
-         if (arm_feature(env, ARM_FEATURE_M_SECURITY) &&
++DO_3SAME_VQDMULH(VQRDMULH, qrdmulh)
-             (lr & R_V7M_EXCRET_S_MASK)) {
+diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          case NEON_3R_VPMAX:
          case NEON_3R_VPMIN:
          case NEON_3R_VPADD_VQRDMLAH:
 +        case NEON_3R_VQDMULH_VQRDMULH:
              /* Already handled by decodetree */
              return 1;
          }
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
              tmp2 = neon_load_reg(rm, pass);
          }
          switch (op) {
 -        case NEON_3R_VQDMULH_VQRDMULH: /* Multiply high.  */
 -            if (!u) { /* VQDMULH */
 -                switch (size) {
 -                case 1:
 -                    gen_helper_neon_qdmulh_s16(tmp, cpu_env, tmp, tmp2);
 -                    break;
 -                case 2:
 -                    gen_helper_neon_qdmulh_s32(tmp, cpu_env, tmp, tmp2);
 -                    break;
 -                default: abort();
 -                }
 -            } else { /* VQRDMULH */
 -                switch (size) {
 -                case 1:
 -                    gen_helper_neon_qrdmulh_s16(tmp, cpu_env, tmp, tmp2);
 -                    break;
 -                case 2:
 -                    gen_helper_neon_qrdmulh_s32(tmp, cpu_env, tmp, tmp2);
 -                    break;
 -                default: abort();
 -                }
 -            }
 -            break;
          case NEON_3R_FLOAT_ARITH: /* Floating point arithmetic. */
          {
              TCGv_ptr fpstatus = get_fpstatus_ptr(1);
 --
 .20.1
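
For readers unfamiliar with the insns converted above, here is a minimal
reference sketch of what VQDMULH and VQRDMULH compute for one signed 16-bit
lane. It is not QEMU's helper: the name qdmulh_s16_model() and the 'sat'
out-parameter (standing in for the QC flag) are assumptions made purely for
illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical reference model (not QEMU code): saturating doubling
 * multiply returning the high half, for one signed 16-bit lane.
 * round == false models VQDMULH, round == true models VQRDMULH.
 * *sat is set when the result saturates (a stand-in for the QC flag).
 * Assumes >> on a negative int64_t is an arithmetic shift, which is
 * implementation-defined in C but holds on mainstream compilers.
 */
static int16_t qdmulh_s16_model(int16_t a, int16_t b, bool round, bool *sat)
{
    int64_t prod = 2 * (int64_t)a * (int64_t)b;

    if (round) {
        prod += 1 << 15;        /* round to nearest before taking the high half */
    }

    int64_t res = prod >> 16;   /* keep the high 16 bits of the 32-bit product */

    if (res > INT16_MAX) {      /* only reachable for a == b == INT16_MIN */
        res = INT16_MAX;
        *sat = true;
    }
    return (int16_t)res;
}
```

The only 8-bit and 64-bit element sizes have no such insn, which is why the
trans function above rejects every size other than 1 and 2.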

-[Qemu-devel] [PULL 28/42] target/arm: Implement VLLDM for v7M CPUs with an FPU
+[PULL 39/45] target/arm: Convert Neon VADD, VSUB, VABD 3-reg-same insns to decodetree
-Implement the VLLDM instruction for v7M for the FPU present case.
+Convert the Neon VADD, VSUB, VABD 3-reg-same insns to decodetree.
 We already have gvec helpers for addition and subtraction, but must
 add one for fabd.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-26-peter.maydell@linaro.org
+Message-id: 20200512163904.10918-12-peter.maydell@linaro.org
 ---
- target/arm/helper.h    |  1 +
+ target/arm/helper.h             |  3 ++-
- target/arm/helper.c    | 54 ++++++++++++++++++++++++++++++++++++++++++
+ target/arm/neon-dp.decode       |  8 ++++++++
- target/arm/translate.c |  2 +-
+ target/arm/neon_helper.c        |  7 -------
-files changed, 56 insertions(+), 1 deletion(-)
+ target/arm/translate-neon.inc.c | 28 ++++++++++++++++++++++++++++
  target/arm/translate.c          | 10 +++-------
  target/arm/vec_helper.c         |  7 +++++++
 files changed, 48 insertions(+), 15 deletions(-)
 diff --git a/target/arm/helper.h b/target/arm/helper.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.h
 +++ b/target/arm/helper.h
-@@ -XXX,XX +XXX,XX @@ DEF_HELPER_3(v7m_tt, i32, env, i32, i32)
+@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_2(neon_qneg_s16, TCG_CALL_NO_RWG, i32, env, i32)
- DEF_HELPER_1(v7m_preserve_fp_state, void, env)
+ DEF_HELPER_FLAGS_2(neon_qneg_s32, TCG_CALL_NO_RWG, i32, env, i32)
+ DEF_HELPER_FLAGS_2(neon_qneg_s64, TCG_CALL_NO_RWG, i64, env, i64)
- DEF_HELPER_2(v7m_vlstm, void, env, i32)
-+DEF_HELPER_2(v7m_vlldm, void, env, i32)
+-DEF_HELPER_3(neon_abd_f32, i32, i32, i32, ptr)
+ DEF_HELPER_3(neon_ceq_f32, i32, i32, i32, ptr)
- DEF_HELPER_2(v8m_stackcheck, void, env, i32)
+ DEF_HELPER_3(neon_cge_f32, i32, i32, i32, ptr)
+ DEF_HELPER_3(neon_cgt_f32, i32, i32, i32, ptr)
-diff --git a/target/arm/helper.c b/target/arm/helper.c
+@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_fmul_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
  DEF_HELPER_FLAGS_5(gvec_fmul_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
  DEF_HELPER_FLAGS_5(gvec_fmul_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_5(gvec_fabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 +
  DEF_HELPER_FLAGS_5(gvec_ftsmul_h, TCG_CALL_NO_RWG,
                     void, ptr, ptr, ptr, ptr, i32)
  DEF_HELPER_FLAGS_5(gvec_ftsmul_s, TCG_CALL_NO_RWG,
 diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
+--- a/target/arm/neon-dp.decode
-+++ b/target/arm/helper.c
++++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ void HELPER(v7m_vlstm)(CPUARMState *env, uint32_t fptr)
+@@ -XXX,XX +XXX,XX @@
-     g_assert_not_reached();
+ @3same_q0        .... ... . . . size:2 .... .... .... . 0 . . .... \
                   &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp q=0
 +# For FP insns the high bit of 'size' is used as part of opcode decode
 +@3same_fp        .... ... . . . . size:1 .... .... .... . q:1 . . .... \
 +                 &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp
 +
  VHADD_S_3s       1111 001 0 0 . .. .... .... 0000 . . . 0 .... @3same
  VHADD_U_3s       1111 001 1 0 . .. .... .... 0000 . . . 0 .... @3same
  VQADD_S_3s       1111 001 0 0 . .. .... .... 0000 . . . 1 .... @3same
@@ -XXX,XX +XXX,XX @@ SHA256SU1_3s     1111 001 1 0 . 10 .... .... 1100 . 1 . 0 .... \
                   vm=%vm_dp vn=%vn_dp vd=%vd_dp
  VQRDMLSH_3s      1111 001 1 0 . .. .... .... 1100 ... 1 .... @3same
 +
 +VADD_fp_3s       1111 001 0 0 . 0 . .... .... 1101 ... 0 .... @3same_fp
 +VSUB_fp_3s       1111 001 0 0 . 1 . .... .... 1101 ... 0 .... @3same_fp
 +VABD_fp_3s       1111 001 1 0 . 1 . .... .... 1101 ... 0 .... @3same_fp
 diff --git a/target/arm/neon_helper.c b/target/arm/neon_helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/neon_helper.c
 +++ b/target/arm/neon_helper.c
@@ -XXX,XX +XXX,XX @@ uint64_t HELPER(neon_qneg_s64)(CPUARMState *env, uint64_t x)
  }
-+void HELPER(v7m_vlldm)(CPUARMState *env, uint32_t fptr)
+ /* NEON Float helpers.  */
-+{
+-uint32_t HELPER(neon_abd_f32)(uint32_t a, uint32_t b, void *fpstp)
-+    /* translate.c should never generate calls here in user-only mode */
+-{
-+    g_assert_not_reached();
+-    float_status *fpst = fpstp;
-+}
+-    float32 f0 = make_float32(a);
 -    float32 f1 = make_float32(b);
 -    return float32_val(float32_abs(float32_sub(f0, f1, fpst)));
 -}
  /* Floating point comparisons produce an integer result.
   * Note that EQ doesn't signal InvalidOp for QNaNs but GE and GT do.
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_3SAME_PAIR(VPADD, padd_u)
  DO_3SAME_VQDMULH(VQDMULH, qdmulh)
  DO_3SAME_VQDMULH(VQRDMULH, qrdmulh)
 +
- uint32_t HELPER(v7m_tt)(CPUARMState *env, uint32_t addr, uint32_t op)
++/*
- {
++ * For all the functions using this macro, size == 1 means fp16,
-     /* The TT instructions can be used by unprivileged code, but in
++ * which is an architecture extension we don't implement yet.
-@@ -XXX,XX +XXX,XX @@ void HELPER(v7m_vlstm)(CPUARMState *env, uint32_t fptr)
++ */
-     env->v7m.control[M_REG_S] &= ~R_V7M_CONTROL_FPCA_MASK;
++#define DO_3S_FP_GVEC(INSN,FUNC)                                        \
- }
++    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
++                                uint32_t rn_ofs, uint32_t rm_ofs,       \
-+void HELPER(v7m_vlldm)(CPUARMState *env, uint32_t fptr)
++                                uint32_t oprsz, uint32_t maxsz)         \
-+{
++    {                                                                   \
-+    /* fptr is the value of Rn, the frame pointer we load the FP regs from */
++        TCGv_ptr fpst = get_fpstatus_ptr(1);                            \
-+    assert(env->v7m.secure);
++        tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, fpst,                \
-+
++                           oprsz, maxsz, 0, FUNC);                      \
-+    if (!(env->v7m.control[M_REG_S] & R_V7M_CONTROL_SFPA_MASK)) {
++        tcg_temp_free_ptr(fpst);                                        \
-+        return;
++    }                                                                   \
 +    static bool trans_##INSN##_fp_3s(DisasContext *s, arg_3same *a)     \
 +    {                                                                   \
 +        if (a->size != 0) {                                             \
 +            /* TODO fp16 support */                                     \
 +            return false;                                               \
 +        }                                                               \
 +        return do_3same(s, a, gen_##INSN##_3s);                         \
 +    }
 +
-+    /* Check access to the coprocessor is permitted */
-+    if (!v7m_cpacr_pass(env, true, arm_current_el(env) != 0)) {
-+        raise_exception_ra(env, EXCP_NOCP, 0, 1, GETPC());
-+    }
 +
-+    if (env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_LSPACT_MASK) {
++DO_3S_FP_GVEC(VADD, gen_helper_gvec_fadd_s)
-+        /* State in FP is still valid */
++DO_3S_FP_GVEC(VSUB, gen_helper_gvec_fsub_s)
-+        env->v7m.fpccr[M_REG_S] &= ~R_V7M_FPCCR_LSPACT_MASK;
++DO_3S_FP_GVEC(VABD, gen_helper_gvec_fabd_s)
 +    } else {
 +        bool ts = env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_TS_MASK;
 +        int i;
 +        uint32_t fpscr;
 +
 +        if (fptr & 7) {
 +            raise_exception_ra(env, EXCP_UNALIGNED, 0, 1, GETPC());
 +        }
 +
 +        for (i = 0; i < (ts ? 32 : 16); i += 2) {
 +            uint32_t slo, shi;
 +            uint64_t dn;
 +            uint32_t faddr = fptr + 4 * i;
 +
 +            if (i >= 16) {
 +                faddr += 8; /* skip the slot for the FPSCR */
 +            }
 +
 +            slo = cpu_ldl_data(env, faddr);
 +            shi = cpu_ldl_data(env, faddr + 4);
 +
 +            dn = (uint64_t) shi << 32 | slo;
 +            *aa32_vfp_dreg(env, i / 2) = dn;
 +        }
 +        fpscr = cpu_ldl_data(env, fptr + 0x40);
 +        vfp_set_fpscr(env, fpscr);
 +    }
 +
 +    env->v7m.control[M_REG_S] |= R_V7M_CONTROL_FPCA_MASK;
 +}
 +
  static bool v7m_push_stack(ARMCPU *cpu)
  {
      /* Do the "set up stack frame" part of exception entry,
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-                     TCGv_i32 fptr = load_reg(s, rn);
+         switch (op) {
+         case NEON_3R_FLOAT_ARITH:
-                     if (extract32(insn, 20, 1)) {
+             pairwise = (u && size < 2); /* if VPADD (float) */
--                        /* VLLDM */
++            if (!pairwise) {
-+                        gen_helper_v7m_vlldm(cpu_env, fptr);
++                return 1; /* handled by decodetree */
-                     } else {
++            }
-                         gen_helper_v7m_vlstm(cpu_env, fptr);
+             break;
-                     }
+         case NEON_3R_FLOAT_MINMAX:
              pairwise = u; /* if VPMIN/VPMAX (float) */
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          {
              TCGv_ptr fpstatus = get_fpstatus_ptr(1);
              switch ((u << 2) | size) {
 -            case 0: /* VADD */
              case 4: /* VPADD */
                  gen_helper_vfp_adds(tmp, tmp, tmp2, fpstatus);
                  break;
 -            case 2: /* VSUB */
 -                gen_helper_vfp_subs(tmp, tmp, tmp2, fpstatus);
 -                break;
 -            case 6: /* VABD */
 -                gen_helper_neon_abd_f32(tmp, tmp, tmp2, fpstatus);
 -                break;
              default:
                  abort();
              }
 diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/vec_helper.c
 +++ b/target/arm/vec_helper.c
@@ -XXX,XX +XXX,XX @@ static float64 float64_ftsmul(float64 op1, uint64_t op2, float_status *stat)
      return result;
  }
 +static float32 float32_abd(float32 op1, float32 op2, float_status *stat)
 +{
 +    return float32_abs(float32_sub(op1, op2, stat));
 +}
 +
  #define DO_3OP(NAME, FUNC, TYPE) \
  void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
  {                                                                          \
@@ -XXX,XX +XXX,XX @@ DO_3OP(gvec_ftsmul_h, float16_ftsmul, float16)
  DO_3OP(gvec_ftsmul_s, float32_ftsmul, float32)
  DO_3OP(gvec_ftsmul_d, float64_ftsmul, float64)
 +DO_3OP(gvec_fabd_s, float32_abd, float32)
 +
  #ifdef TARGET_AARCH64
  DO_3OP(gvec_recps_h, helper_recpsf_f16, float16)
 --
 .20.1
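
The new gvec_fabd_s helper above applies "absolute difference" to every
32-bit lane. A rough sketch of that shape follows, using host float
arithmetic instead of QEMU's softfloat and float_status; the names
float32_abd_model() and gvec_fabd_model() are hypothetical and exist only
for this illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical model (not QEMU code): clear the sign bit of an IEEE
 * single, which is the bit-level operation an "abs" on float32 performs.
 */
static float float32_abs_model(float x)
{
    uint32_t bits;

    memcpy(&bits, &x, sizeof(bits));
    bits &= 0x7fffffffu;
    memcpy(&x, &bits, sizeof(x));
    return x;
}

/* Absolute difference of one lane: |op1 - op2|. */
static float float32_abd_model(float op1, float op2)
{
    return float32_abs_model(op1 - op2);
}

/* Apply the per-lane operation across 'count' lanes, as a gvec helper would. */
static void gvec_fabd_model(float *vd, const float *vn, const float *vm,
                            size_t count)
{
    for (size_t i = 0; i < count; i++) {
        vd[i] = float32_abd_model(vn[i], vm[i]);
    }
}
```

The real helper differs in that rounding mode and exception flags come from
the float_status pointer passed to the softfloat functions.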

-[Qemu-devel] [PULL 05/42] hw/intc/armv7m_nvic: Allow reading of M-profile MVFR* registers
+[PULL 40/45] target/arm: Convert Neon VPMIN/VPMAX/VPADD float 3-reg-same insns to decodetree
-For M-profile the MVFR* ID registers are memory mapped, in the
+Convert the Neon float VPMIN, VPMAX and VPADD 3-reg-same insns to
-range we implement via the NVIC. Allow them to be read.
+decodetree. These are the only remaining 'pairwise' operations,
-(If the CPU has no FPU, these registers are defined to be RAZ.)
+so we can delete the pairwise-specific bits of the old decoder's
 for-each-element loop now.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-3-peter.maydell@linaro.org
+Message-id: 20200512163904.10918-13-peter.maydell@linaro.org
 ---
- hw/intc/armv7m_nvic.c | 6 ++++++
+ target/arm/neon-dp.decode       |  5 +++
-file changed, 6 insertions(+)
+ target/arm/translate-neon.inc.c | 63 +++++++++++++++++++++++++++++++++
+ target/arm/translate.c          | 63 +++++----------------------------
-diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
+files changed, 76 insertions(+), 55 deletions(-)
 diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
---- a/hw/intc/armv7m_nvic.c
+--- a/target/arm/neon-dp.decode
-+++ b/hw/intc/armv7m_nvic.c
++++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ static uint32_t nvic_readl(NVICState *s, uint32_t offset, MemTxAttrs attrs)
+@@ -XXX,XX +XXX,XX @@
-             return 0;
+ # For FP insns the high bit of 'size' is used as part of opcode decode
-         }
+ @3same_fp        .... ... . . . . size:1 .... .... .... . q:1 . . .... \
-         return cpu->env.v7m.sfar;
+                  &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp
-+    case 0xf40: /* MVFR0 */
++@3same_fp_q0     .... ... . . . . size:1 .... .... .... . 0 . . .... \
-+        return cpu->isar.mvfr0;
++                 &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp q=0
-+    case 0xf44: /* MVFR1 */
-+        return cpu->isar.mvfr1;
+ VHADD_S_3s       1111 001 0 0 . .. .... .... 0000 . . . 0 .... @3same
-+    case 0xf48: /* MVFR2 */
+ VHADD_U_3s       1111 001 1 0 . .. .... .... 0000 . . . 0 .... @3same
-+        return cpu->isar.mvfr2;
+@@ -XXX,XX +XXX,XX @@ VQRDMLSH_3s      1111 001 1 0 . .. .... .... 1100 ... 1 .... @3same
-     default:
-     bad_offset:
+ VADD_fp_3s       1111 001 0 0 . 0 . .... .... 1101 ... 0 .... @3same_fp
-         qemu_log_mask(LOG_GUEST_ERROR, "NVIC: Bad read offset 0x%x\n", offset);
+ VSUB_fp_3s       1111 001 0 0 . 1 . .... .... 1101 ... 0 .... @3same_fp
 +VPADD_fp_3s      1111 001 1 0 . 0 . .... .... 1101 ... 0 .... @3same_fp_q0
  VABD_fp_3s       1111 001 1 0 . 1 . .... .... 1101 ... 0 .... @3same_fp
 +VPMAX_fp_3s      1111 001 1 0 . 0 . .... .... 1111 ... 0 .... @3same_fp_q0
 +VPMIN_fp_3s      1111 001 1 0 . 1 . .... .... 1111 ... 0 .... @3same_fp_q0
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_3SAME_VQDMULH(VQRDMULH, qrdmulh)
  DO_3S_FP_GVEC(VADD, gen_helper_gvec_fadd_s)
  DO_3S_FP_GVEC(VSUB, gen_helper_gvec_fsub_s)
  DO_3S_FP_GVEC(VABD, gen_helper_gvec_fabd_s)
 +
 +static bool do_3same_fp_pair(DisasContext *s, arg_3same *a, VFPGen3OpSPFn *fn)
 +{
 +    /* FP operations handled pairwise 32 bits at a time */
 +    TCGv_i32 tmp, tmp2, tmp3;
 +    TCGv_ptr fpstatus;
 +
 +    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
 +        return false;
 +    }
 +
 +    /* UNDEF accesses to D16-D31 if they don't exist. */
 +    if (!dc_isar_feature(aa32_simd_r32, s) &&
 +        ((a->vd | a->vn | a->vm) & 0x10)) {
 +        return false;
 +    }
 +
 +    if (!vfp_access_check(s)) {
 +        return true;
 +    }
 +
 +    assert(a->q == 0); /* enforced by decode patterns */
 +
 +    /*
 +     * Note that we have to be careful not to clobber the source operands
 +     * in the "vm == vd" case by storing the result of the first pass too
 +     * early. Since Q is 0 there are always just two passes, so instead
 +     * of a complicated loop over each pass we just unroll.
 +     */
 +    fpstatus = get_fpstatus_ptr(1);
 +    tmp = neon_load_reg(a->vn, 0);
 +    tmp2 = neon_load_reg(a->vn, 1);
 +    fn(tmp, tmp, tmp2, fpstatus);
 +    tcg_temp_free_i32(tmp2);
 +
 +    tmp3 = neon_load_reg(a->vm, 0);
 +    tmp2 = neon_load_reg(a->vm, 1);
 +    fn(tmp3, tmp3, tmp2, fpstatus);
 +    tcg_temp_free_i32(tmp2);
 +    tcg_temp_free_ptr(fpstatus);
 +
 +    neon_store_reg(a->vd, 0, tmp);
 +    neon_store_reg(a->vd, 1, tmp3);
 +    return true;
 +}
 +
 +/*
 + * For all the functions using this macro, size == 1 means fp16,
 + * which is an architecture extension we don't implement yet.
 + */
 +#define DO_3S_FP_PAIR(INSN,FUNC)                                    \
 +    static bool trans_##INSN##_fp_3s(DisasContext *s, arg_3same *a) \
 +    {                                                               \
 +        if (a->size != 0) {                                         \
 +            /* TODO fp16 support */                                 \
 +            return false;                                           \
 +        }                                                           \
 +        return do_3same_fp_pair(s, a, FUNC);                        \
 +    }
 +
 +DO_3S_FP_PAIR(VPADD, gen_helper_vfp_adds)
 +DO_3S_FP_PAIR(VPMAX, gen_helper_vfp_maxs)
 +DO_3S_FP_PAIR(VPMIN, gen_helper_vfp_mins)
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
      int shift;
      int pass;
      int count;
 -    int pairwise;
      int u;
      int vec_size;
      uint32_t imm;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          case NEON_3R_VPMIN:
          case NEON_3R_VPADD_VQRDMLAH:
          case NEON_3R_VQDMULH_VQRDMULH:
 +        case NEON_3R_FLOAT_ARITH:
              /* Already handled by decodetree */
              return 1;
          }
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
              /* 64-bit element instructions: handled by decodetree */
              return 1;
          }
 -        pairwise = 0;
          switch (op) {
 -        case NEON_3R_FLOAT_ARITH:
 -            pairwise = (u && size < 2); /* if VPADD (float) */
 -            if (!pairwise) {
 -                return 1; /* handled by decodetree */
 -            }
 -            break;
          case NEON_3R_FLOAT_MINMAX:
 -            pairwise = u; /* if VPMIN/VPMAX (float) */
 +            if (u) {
 +                return 1; /* VPMIN/VPMAX handled by decodetree */
 +            }
              break;
          case NEON_3R_FLOAT_CMP:
              if (!u && size) {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
              break;
          }
 -        if (pairwise && q) {
 -            /* All the pairwise insns UNDEF if Q is set */
 -            return 1;
 -        }
 -
          for (pass = 0; pass < (q ? 4 : 2); pass++) {
 -        if (pairwise) {
 -            /* Pairwise.  */
 -            if (pass < 1) {
 -                tmp = neon_load_reg(rn, 0);
 -                tmp2 = neon_load_reg(rn, 1);
 -            } else {
 -                tmp = neon_load_reg(rm, 0);
 -                tmp2 = neon_load_reg(rm, 1);
 -            }
 -        } else {
 -            /* Elementwise.  */
 -            tmp = neon_load_reg(rn, pass);
 -            tmp2 = neon_load_reg(rm, pass);
 -        }
 +        /* Elementwise.  */
 +        tmp = neon_load_reg(rn, pass);
 +        tmp2 = neon_load_reg(rm, pass);
          switch (op) {
 -        case NEON_3R_FLOAT_ARITH: /* Floating point arithmetic. */
 -        {
 -            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
 -            switch ((u << 2) | size) {
 -            case 4: /* VPADD */
 -                gen_helper_vfp_adds(tmp, tmp, tmp2, fpstatus);
 -                break;
 -            default:
 -                abort();
 -            }
 -            tcg_temp_free_ptr(fpstatus);
 -            break;
 -        }
          case NEON_3R_FLOAT_MULTIPLY:
          {
              TCGv_ptr fpstatus = get_fpstatus_ptr(1);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          }
          tcg_temp_free_i32(tmp2);
 -        /* Save the result.  For elementwise operations we can put it
 -           straight into the destination register.  For pairwise operations
 -           we have to be careful to avoid clobbering the source operands.  */
 -        if (pairwise && rd == rm) {
 -            neon_store_scratch(pass, tmp);
 -        } else {
 -            neon_store_reg(rd, pass, tmp);
 -        }
 +        neon_store_reg(rd, pass, tmp);
          } /* for pass */
 -        if (pairwise && rd == rm) {
 -            for (pass = 0; pass < (q ? 4 : 2); pass++) {
 -                tmp = neon_load_scratch(pass);
 -                neon_store_reg(rd, pass, tmp);
 -            }
 -        }
          /* End of 3 register same size operations.  */
      } else if (insn & (1 << 4)) {
          if ((insn & 0x00380080) != 0) {
 --
 .20.1
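
The comment in do_3same_fp_pair() about not clobbering the source operands
when vm == vd can be illustrated with a small model. This is only a sketch:
a D register is represented as an array of two floats, and the names
pairwise_op_model() and add_model() are assumptions, not QEMU functions.

```c
/* Hypothetical callback type: combine the two lanes of one source register. */
typedef float (*pair_fn)(float, float);

static float add_model(float a, float b)
{
    return a + b;
}

/*
 * Hypothetical model (not QEMU code) of a Q=0 pairwise operation.
 * Both results are computed before either is stored, so the operation
 * stays correct even when vd aliases vn or vm.
 */
static void pairwise_op_model(float *vd, const float *vn, const float *vm,
                              pair_fn fn)
{
    float lo = fn(vn[0], vn[1]);    /* result lane 0 comes from Vn */
    float hi = fn(vm[0], vm[1]);    /* result lane 1 comes from Vm */

    vd[0] = lo;
    vd[1] = hi;
}
```

If lane 0 were stored immediately after the first call, a vm == vd encoding
would compute lane 1 from the freshly written value instead of the original
Vm contents.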

-[Qemu-devel] [PULL 24/42] target/arm: New function armv7m_nvic_set_pending_lazyfp()
+[PULL 41/45] target/arm: Convert Neon fp VMUL, VMLA, VMLS 3-reg-same insns to decodetree
-In the v7M architecture, if an exception is generated in the process
+Convert the Neon floating-point VMUL, VMLA, and VMLS 3-reg-same insns to
-of doing the lazy stacking of FP registers, the handling of
+decodetree.
 possible escalation to HardFault is treated differently to the normal
 approach: it works based on the saved information about exception
 readiness that was stored in the FPCCR when the stack frame was
 created. Provide a new function armv7m_nvic_set_pending_lazyfp()
 which pends exceptions during lazy stacking, and implements
 this logic.
-This corresponds to the pseudocode TakePreserveFPException().
+We don't have a gvec helper for multiply-accumulate, so VMLA and VMLS
 need a loop function do_3same_fp().  The reads_vd parameter tells
 do_3same_fp() to load the old value of Vd before calling the callback
 function, in the same way that the do_vfp_3op_sp()
 and do_vfp_3op_dp() functions in translate-vfp.inc.c work. (The
 only uses in this patch pass reads_vd == true, but later commits
 will use reads_vd == false.)
 This conversion fixes in passing an underdecoding for VMUL
 (originally reported by Fredrik Strupe <fredrik@strupe.net>): bit 1
 of the 'size' field must be 0.  The old decoder didn't enforce this,
 but the decodetree pattern does.
 The gen_VMLA_fp_reg() function performs the addition operation
 with the operands in the opposite order to the old decoder:
 since Neon sets 'default NaN mode', float32_add operations are
 commutative so there is no behaviour difference, but putting
 them this way around matches the Arm ARM pseudocode and the
 required operation order for the subtraction in gen_VMLS_fp_reg().
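
 As a rough illustration of the reads_vd idea and the operand order described
 above, here is a simplified model using host floats. It is a sketch under
 stated assumptions, not the QEMU implementation: do_3same_fp_model(),
 vmla_model() and vmls_model() are hypothetical names, and when reads_vd is
 false the model simply passes 0 as the old value, whereas the real code
 reuses the Vn temporary as the destination.

```c
#include <stdbool.h>

/* Hypothetical callback type: (old Vd lane, Vn lane, Vm lane) -> new Vd lane. */
typedef float (*fp3_fn)(float vd, float vn, float vm);

/* Unfused multiply-accumulate: the product is formed first, then added to Vd. */
static float vmla_model(float vd, float vn, float vm)
{
    float prod = vn * vm;

    return vd + prod;
}

/* Same order as above, which is the order the subtraction requires. */
static float vmls_model(float vd, float vn, float vm)
{
    float prod = vn * vm;

    return vd - prod;
}

/*
 * Hypothetical model (not QEMU code) of an elementwise loop: 'passes' is
 * 2 for a D register and 4 for a Q register. When reads_vd is true the
 * old destination lane is handed to the callback.
 */
static void do_3same_fp_model(float *vd, const float *vn, const float *vm,
                              int passes, fp3_fn fn, bool reads_vd)
{
    for (int pass = 0; pass < passes; pass++) {
        float old = reads_vd ? vd[pass] : 0.0f;

        vd[pass] = fn(old, vn[pass], vm[pass]);
    }
}
```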
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-22-peter.maydell@linaro.org
+Message-id: 20200512163904.10918-14-peter.maydell@linaro.org
 ---
- target/arm/cpu.h      | 12 ++++++
+ target/arm/neon-dp.decode       |  3 ++
- hw/intc/armv7m_nvic.c | 96 +++++++++++++++++++++++++++++++++++++++++++
+ target/arm/translate-neon.inc.c | 81 +++++++++++++++++++++++++++++++++
-files changed, 108 insertions(+)
+ target/arm/translate.c          | 17 +------
 files changed, 85 insertions(+), 16 deletions(-)
-diff --git a/target/arm/cpu.h b/target/arm/cpu.h
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.h
+--- a/target/arm/neon-dp.decode
-+++ b/target/arm/cpu.h
++++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ void armv7m_nvic_set_pending(void *opaque, int irq, bool secure);
+@@ -XXX,XX +XXX,XX @@ VADD_fp_3s       1111 001 0 0 . 0 . .... .... 1101 ... 0 .... @3same_fp
-  * a different exception).
+ VSUB_fp_3s       1111 001 0 0 . 1 . .... .... 1101 ... 0 .... @3same_fp
-  */
+ VPADD_fp_3s      1111 001 1 0 . 0 . .... .... 1101 ... 0 .... @3same_fp_q0
- void armv7m_nvic_set_pending_derived(void *opaque, int irq, bool secure);
+ VABD_fp_3s       1111 001 1 0 . 1 . .... .... 1101 ... 0 .... @3same_fp
-+/**
++VMLA_fp_3s       1111 001 0 0 . 0 . .... .... 1101 ... 1 .... @3same_fp
-+ * armv7m_nvic_set_pending_lazyfp: mark this lazy FP exception as pending
++VMLS_fp_3s       1111 001 0 0 . 1 . .... .... 1101 ... 1 .... @3same_fp
-+ * @opaque: the NVIC
++VMUL_fp_3s       1111 001 1 0 . 0 . .... .... 1101 ... 1 .... @3same_fp
-+ * @irq: the exception number to mark pending
+ VPMAX_fp_3s      1111 001 1 0 . 0 . .... .... 1111 ... 0 .... @3same_fp_q0
-+ * @secure: false for non-banked exceptions or for the nonsecure
+ VPMIN_fp_3s      1111 001 1 0 . 1 . .... .... 1111 ... 0 .... @3same_fp_q0
-+ * version of a banked exception, true for the secure version of a banked
+diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 + * exception.
 + *
 + * Similar to armv7m_nvic_set_pending(), but specifically for exceptions
 + * generated in the course of lazy stacking of FP registers.
 + */
 +void armv7m_nvic_set_pending_lazyfp(void *opaque, int irq, bool secure);
  /**
   * armv7m_nvic_get_pending_irq_info: return highest priority pending
   *    exception, and whether it targets Secure state
 diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/intc/armv7m_nvic.c
+--- a/target/arm/translate-neon.inc.c
-+++ b/hw/intc/armv7m_nvic.c
++++ b/target/arm/translate-neon.inc.c
-@@ -XXX,XX +XXX,XX @@ void armv7m_nvic_set_pending_derived(void *opaque, int irq, bool secure)
+@@ -XXX,XX +XXX,XX @@ DO_3SAME_PAIR(VPADD, padd_u)
-     do_armv7m_nvic_set_pending(opaque, irq, secure, true);
+ DO_3SAME_VQDMULH(VQDMULH, qdmulh)
- }
+ DO_3SAME_VQDMULH(VQRDMULH, qrdmulh)
-+void armv7m_nvic_set_pending_lazyfp(void *opaque, int irq, bool secure)
++static bool do_3same_fp(DisasContext *s, arg_3same *a, VFPGen3OpSPFn *fn,
 +                        bool reads_vd)
 +{
 +    /*
-+     * Pend an exception during lazy FP stacking. This differs
++     * FP operations handled elementwise 32 bits at a time.
-+     * from the usual exception pending because the logic for
++     * If reads_vd is true then the old value of Vd will be
-+     * whether we should escalate depends on the saved context
++     * loaded before calling the callback function. This is
-+     * in the FPCCR register, not on the current state of the CPU/NVIC.
++     * used for multiply-accumulate type operations.
 +     */
-+    NVICState *s = (NVICState *)opaque;
++    TCGv_i32 tmp, tmp2;
-+    bool banked = exc_is_banked(irq);
++    int pass;
 +    VecInfo *vec;
 +    bool targets_secure;
 +    bool escalate = false;
 +    /*
 +     * We will only look at bits in fpccr if this is a banked exception
 +     * (in which case 'secure' tells us whether it is the S or NS version).
 +     * All the bits for the non-banked exceptions are in fpccr_s.
 +     */
 +    uint32_t fpccr_s = s->cpu->env.v7m.fpccr[M_REG_S];
 +    uint32_t fpccr = s->cpu->env.v7m.fpccr[secure];
 +
-+    assert(irq > ARMV7M_EXCP_RESET && irq < s->num_irq);
++    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
-+    assert(!secure || banked);
++        return false;
 +
 +    vec = (banked && secure) ? &s->sec_vectors[irq] : &s->vectors[irq];
 +
 +    targets_secure = banked ? secure : exc_targets_secure(s, irq);
 +
 +    switch (irq) {
 +    case ARMV7M_EXCP_DEBUG:
 +        if (!(fpccr_s & R_V7M_FPCCR_MONRDY_MASK)) {
 +            /* Ignore DebugMonitor exception */
 +            return;
 +        }
 +        break;
 +    case ARMV7M_EXCP_MEM:
 +        escalate = !(fpccr & R_V7M_FPCCR_MMRDY_MASK);
 +        break;
 +    case ARMV7M_EXCP_USAGE:
 +        escalate = !(fpccr & R_V7M_FPCCR_UFRDY_MASK);
 +        break;
 +    case ARMV7M_EXCP_BUS:
 +        escalate = !(fpccr_s & R_V7M_FPCCR_BFRDY_MASK);
 +        break;
 +    case ARMV7M_EXCP_SECURE:
 +        escalate = !(fpccr_s & R_V7M_FPCCR_SFRDY_MASK);
 +        break;
 +    default:
 +        g_assert_not_reached();
 +    }
 +
-+    if (escalate) {
++    /* UNDEF accesses to D16-D31 if they don't exist. */
-+        /*
++    if (!dc_isar_feature(aa32_simd_r32, s) &&
-+         * Escalate to HardFault: faults that initially targeted Secure
++        ((a->vd | a->vn | a->vm) & 0x10)) {
-+         * continue to do so, even if HF normally targets NonSecure.
++        return false;
 +         */
 +        irq = ARMV7M_EXCP_HARD;
 +        if (arm_feature(&s->cpu->env, ARM_FEATURE_M_SECURITY) &&
 +            (targets_secure ||
 +             !(s->cpu->env.v7m.aircr & R_V7M_AIRCR_BFHFNMINS_MASK))) {
 +            vec = &s->sec_vectors[irq];
 +        } else {
 +            vec = &s->vectors[irq];
 +        }
 +    }
 +
-+    if (!vec->enabled ||
++    if ((a->vn | a->vm | a->vd) & a->q) {
-+        nvic_exec_prio(s) <= exc_group_prio(s, vec->prio, secure)) {
++        return false;
 +        if (!(fpccr_s & R_V7M_FPCCR_HFRDY_MASK)) {
 +            /*
 +             * We want to escalate to HardFault but the context the
 +             * FP state belongs to prevents the exception pre-empting.
 +             */
 +            cpu_abort(&s->cpu->parent_obj,
 +                      "Lockup: can't escalate to HardFault during "
 +                      "lazy FP register stacking\n");
 +        }
 +    }
 +
-+    if (escalate) {
++    if (!vfp_access_check(s)) {
-+        s->cpu->env.v7m.hfsr |= R_V7M_HFSR_FORCED_MASK;
++        return true;
 +    }
-+    if (!vec->pending) {
++
-+        vec->pending = 1;
++    TCGv_ptr fpstatus = get_fpstatus_ptr(1);
-+        /*
++    for (pass = 0; pass < (a->q ? 4 : 2); pass++) {
-+         * We do not call nvic_irq_update(), because we know our caller
++        tmp = neon_load_reg(a->vn, pass);
-+         * is going to handle causing us to take the exception by
++        tmp2 = neon_load_reg(a->vm, pass);
-+         * raising EXCP_LAZYFP, so raising the IRQ line would be
++        if (reads_vd) {
-+         * pointless extra work. We just need to recompute the
++            TCGv_i32 tmp_rd = neon_load_reg(a->vd, pass);
-+         * priorities so that armv7m_nvic_can_take_pending_exception()
++            fn(tmp_rd, tmp, tmp2, fpstatus);
-+         * returns the right answer.
++            neon_store_reg(a->vd, pass, tmp_rd);
-+         */
++            tcg_temp_free_i32(tmp);
-+        nvic_recompute_state(s);
++        } else {
 +            fn(tmp, tmp, tmp2, fpstatus);
 +            neon_store_reg(a->vd, pass, tmp);
 +        }
 +        tcg_temp_free_i32(tmp2);
 +    }
++    tcg_temp_free_ptr(fpstatus);
++    return true;
 +}
 +
- /* Make pending IRQ active.  */
+ /*
- void armv7m_nvic_acknowledge_irq(void *opaque)
+  * For all the functions using this macro, size == 1 means fp16,
   * which is an architecture extension we don't implement yet.
@@ -XXX,XX +XXX,XX @@ DO_3SAME_VQDMULH(VQRDMULH, qrdmulh)
  DO_3S_FP_GVEC(VADD, gen_helper_gvec_fadd_s)
  DO_3S_FP_GVEC(VSUB, gen_helper_gvec_fsub_s)
  DO_3S_FP_GVEC(VABD, gen_helper_gvec_fabd_s)
 +DO_3S_FP_GVEC(VMUL, gen_helper_gvec_fmul_s)
 +
 +/*
 + * For all the functions using this macro, size == 1 means fp16,
 + * which is an architecture extension we don't implement yet.
 + */
 +#define DO_3S_FP(INSN,FUNC,READS_VD)                                \
 +    static bool trans_##INSN##_fp_3s(DisasContext *s, arg_3same *a) \
 +    {                                                               \
 +        if (a->size != 0) {                                         \
 +            /* TODO fp16 support */                                 \
 +            return false;                                           \
 +        }                                                           \
 +        return do_3same_fp(s, a, FUNC, READS_VD);                   \
 +    }
 +
 +static void gen_VMLA_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
 +                            TCGv_ptr fpstatus)
 +{
 +    gen_helper_vfp_muls(vn, vn, vm, fpstatus);
 +    gen_helper_vfp_adds(vd, vd, vn, fpstatus);
 +}
 +
 +static void gen_VMLS_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
 +                            TCGv_ptr fpstatus)
 +{
 +    gen_helper_vfp_muls(vn, vn, vm, fpstatus);
 +    gen_helper_vfp_subs(vd, vd, vn, fpstatus);
 +}
 +
 +DO_3S_FP(VMLA, gen_VMLA_fp_3s, true)
 +DO_3S_FP(VMLS, gen_VMLS_fp_3s, true)
  static bool do_3same_fp_pair(DisasContext *s, arg_3same *a, VFPGen3OpSPFn *fn)
  {
+diff --git a/target/arm/translate.c b/target/arm/translate.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/translate.c
++++ b/target/arm/translate.c
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+         case NEON_3R_VPADD_VQRDMLAH:
+         case NEON_3R_VQDMULH_VQRDMULH:
+         case NEON_3R_FLOAT_ARITH:
++        case NEON_3R_FLOAT_MULTIPLY:
+             /* Already handled by decodetree */
+             return 1;
+         }
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+         tmp = neon_load_reg(rn, pass);
+         tmp2 = neon_load_reg(rm, pass);
+         switch (op) {
+-        case NEON_3R_FLOAT_MULTIPLY:
+-        {
+-            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
+-            gen_helper_vfp_muls(tmp, tmp, tmp2, fpstatus);
+-            if (!u) {
+-                tcg_temp_free_i32(tmp2);
+-                tmp2 = neon_load_reg(rd, pass);
+-                if (size == 0) {
+-                    gen_helper_vfp_adds(tmp, tmp, tmp2, fpstatus);
+-                } else {
+-                    gen_helper_vfp_subs(tmp, tmp2, tmp, fpstatus);
+-                }
+-            }
+-            tcg_temp_free_ptr(fpstatus);
+-            break;
+-        }
+         case NEON_3R_FLOAT_CMP:
+         {
+             TCGv_ptr fpstatus = get_fpstatus_ptr(1);
 --
 .20.1
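The operand checks at the top of do_3same_fp() in the patch above can be illustrated outside QEMU. What follows is only a minimal sketch, assuming a hypothetical Regs3Same struct and a has_d16_d31 flag in place of QEMU's arg_3same and dc_isar_feature(aa32_simd_r32, s); it is not the translator's code. It shows why a single AND with q is enough to reject odd register numbers for Q-register operations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the decoded operand fields (not QEMU's arg_3same). */
typedef struct {
    uint8_t vd, vn, vm; /* D-register numbers, 0..31 */
    uint8_t q;          /* 1 = 128-bit Q operation, 0 = 64-bit D operation */
} Regs3Same;

/*
 * Return true if the register operands are usable, false if the
 * encoding should be treated as UNDEF.
 */
static bool regs_3same_valid(const Regs3Same *a, bool has_d16_d31)
{
    assert(a != NULL);

    /* Bit 4 set means D16..D31, which exist only with 32 D registers. */
    if (!has_d16_d31 && ((a->vd | a->vn | a->vm) & 0x10)) {
        return false;
    }

    /*
     * q is 0 or 1, so this AND looks at bit 0 of the register numbers
     * only when q is set: a Q operation needs even D-register numbers.
     */
    if ((a->vd | a->vn | a->vm) & a->q) {
        return false;
    }

    return true;
}
```

With these assumptions, { vd=2, vn=4, vm=6, q=1 } is accepted, while an odd vd with q=1, or vd=16 without D16-D31, is rejected.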

-[Qemu-devel] [PULL 20/42] target/arm: Overlap VECSTRIDE and XSCALE_CPAR TB flags
+[PULL 42/45] target/arm: Convert Neon 3-reg-same compare insns to decodetree
-We are close to running out of TB flags for AArch32; we could
+Convert the Neon floating-point 3-reg-same compare insns VCGE, VCGT,
-start using the cs_base word, but before we do that we can
+VCEQ, VACGE and VACGT to decodetree.
 economise on our usage by sharing the same bits for the VFP
 VECSTRIDE field and the XScale XSCALE_CPAR field. This
 works because no XScale CPU ever had VFP.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-18-peter.maydell@linaro.org
+Message-id: 20200512163904.10918-15-peter.maydell@linaro.org
 ---
- target/arm/cpu.h       | 10 ++++++----
+ target/arm/neon-dp.decode       |  5 +++++
- target/arm/cpu.c       |  7 +++++++
+ target/arm/translate-neon.inc.c |  6 +++++
- target/arm/helper.c    |  6 +++++-
+ target/arm/translate.c          | 39 ++-------------------------------
- target/arm/translate.c |  9 +++++++--
+files changed, 13 insertions(+), 37 deletions(-)
 files changed, 25 insertions(+), 7 deletions(-)
-diff --git a/target/arm/cpu.h b/target/arm/cpu.h
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.h
+--- a/target/arm/neon-dp.decode
-+++ b/target/arm/cpu.h
++++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ FIELD(TBFLAG_ANY, BE_DATA, 23, 1)
+@@ -XXX,XX +XXX,XX @@ VABD_fp_3s       1111 001 1 0 . 1 . .... .... 1101 ... 0 .... @3same_fp
- FIELD(TBFLAG_A32, THUMB, 0, 1)
+ VMLA_fp_3s       1111 001 0 0 . 0 . .... .... 1101 ... 1 .... @3same_fp
- FIELD(TBFLAG_A32, VECLEN, 1, 3)
+ VMLS_fp_3s       1111 001 0 0 . 1 . .... .... 1101 ... 1 .... @3same_fp
- FIELD(TBFLAG_A32, VECSTRIDE, 4, 2)
+ VMUL_fp_3s       1111 001 1 0 . 0 . .... .... 1101 ... 1 .... @3same_fp
-+/*
++VCEQ_fp_3s       1111 001 0 0 . 0 . .... .... 1110 ... 0 .... @3same_fp
-+ * We store the bottom two bits of the CPAR as TB flags and handle
++VCGE_fp_3s       1111 001 1 0 . 0 . .... .... 1110 ... 0 .... @3same_fp
-+ * checks on the other bits at runtime. This shares the same bits as
++VACGE_fp_3s      1111 001 1 0 . 0 . .... .... 1110 ... 1 .... @3same_fp
-+ * VECSTRIDE, which is OK as no XScale CPU has VFP.
++VCGT_fp_3s       1111 001 1 0 . 1 . .... .... 1110 ... 0 .... @3same_fp
-+ */
++VACGT_fp_3s      1111 001 1 0 . 1 . .... .... 1110 ... 1 .... @3same_fp
-+FIELD(TBFLAG_A32, XSCALE_CPAR, 4, 2)
+ VPMAX_fp_3s      1111 001 1 0 . 0 . .... .... 1111 ... 0 .... @3same_fp_q0
- /*
+ VPMIN_fp_3s      1111 001 1 0 . 1 . .... .... 1111 ... 0 .... @3same_fp_q0
-  * Indicates whether cp register reads and writes by guest code should access
+diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
   * the secure or nonsecure bank of banked registers; note that this is not
@@ -XXX,XX +XXX,XX @@ FIELD(TBFLAG_A32, NS, 6, 1)
  FIELD(TBFLAG_A32, VFPEN, 7, 1)
  FIELD(TBFLAG_A32, CONDEXEC, 8, 8)
  FIELD(TBFLAG_A32, SCTLR_B, 16, 1)
 -/* We store the bottom two bits of the CPAR as TB flags and handle
 - * checks on the other bits at runtime
 - */
 -FIELD(TBFLAG_A32, XSCALE_CPAR, 17, 2)
  /* For M profile only, Handler (ie not Thread) mode */
  FIELD(TBFLAG_A32, HANDLER, 21, 1)
  /* For M profile only, whether we should generate stack-limit checks */
 diff --git a/target/arm/cpu.c b/target/arm/cpu.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.c
+--- a/target/arm/translate-neon.inc.c
-+++ b/target/arm/cpu.c
++++ b/target/arm/translate-neon.inc.c
-@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
+@@ -XXX,XX +XXX,XX @@ DO_3S_FP_GVEC(VMUL, gen_helper_gvec_fmul_s)
-         set_feature(env, ARM_FEATURE_THUMB_DSP);
+         return do_3same_fp(s, a, FUNC, READS_VD);                   \
      }
-+    /*
++DO_3S_FP(VCEQ, gen_helper_neon_ceq_f32, false)
-+     * We rely on no XScale CPU having VFP so we can use the same bits in the
++DO_3S_FP(VCGE, gen_helper_neon_cge_f32, false)
-+     * TB flags field for VECSTRIDE and XSCALE_CPAR.
++DO_3S_FP(VCGT, gen_helper_neon_cgt_f32, false)
-+     */
++DO_3S_FP(VACGE, gen_helper_neon_acge_f32, false)
-+    assert(!(arm_feature(env, ARM_FEATURE_VFP) &&
++DO_3S_FP(VACGT, gen_helper_neon_acgt_f32, false)
 +             arm_feature(env, ARM_FEATURE_XSCALE)));
 +
-     if (arm_feature(env, ARM_FEATURE_V7) &&
+ static void gen_VMLA_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
-         !arm_feature(env, ARM_FEATURE_M) &&
+                             TCGv_ptr fpstatus)
-         !arm_feature(env, ARM_FEATURE_PMSA)) {
+ {
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.c
 +++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
              || arm_el_is_aa64(env, 1) || arm_feature(env, ARM_FEATURE_M)) {
              flags = FIELD_DP32(flags, TBFLAG_A32, VFPEN, 1);
          }
 -        flags = FIELD_DP32(flags, TBFLAG_A32, XSCALE_CPAR, env->cp15.c15_cpar);
 +        /* Note that XSCALE_CPAR shares bits with VECSTRIDE */
 +        if (arm_feature(env, ARM_FEATURE_XSCALE)) {
 +            flags = FIELD_DP32(flags, TBFLAG_A32,
 +                               XSCALE_CPAR, env->cp15.c15_cpar);
 +        }
      }
      flags = FIELD_DP32(flags, TBFLAG_ANY, MMUIDX, arm_to_core_mmu_idx(mmu_idx));
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static void arm_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-     dc->fp_excp_el = FIELD_EX32(tb_flags, TBFLAG_ANY, FPEXC_EL);
+         case NEON_3R_VQDMULH_VQRDMULH:
-     dc->vfp_enabled = FIELD_EX32(tb_flags, TBFLAG_A32, VFPEN);
+         case NEON_3R_FLOAT_ARITH:
-     dc->vec_len = FIELD_EX32(tb_flags, TBFLAG_A32, VECLEN);
+         case NEON_3R_FLOAT_MULTIPLY:
--    dc->vec_stride = FIELD_EX32(tb_flags, TBFLAG_A32, VECSTRIDE);
++        case NEON_3R_FLOAT_CMP:
--    dc->c15_cpar = FIELD_EX32(tb_flags, TBFLAG_A32, XSCALE_CPAR);
++        case NEON_3R_FLOAT_ACMP:
-+    if (arm_feature(env, ARM_FEATURE_XSCALE)) {
+             /* Already handled by decodetree */
-+        dc->c15_cpar = FIELD_EX32(tb_flags, TBFLAG_A32, XSCALE_CPAR);
+             return 1;
-+        dc->vec_stride = 0;
+         }
-+    } else {
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-+        dc->vec_stride = FIELD_EX32(tb_flags, TBFLAG_A32, VECSTRIDE);
+                 return 1; /* VPMIN/VPMAX handled by decodetree */
-+        dc->c15_cpar = 0;
+             }
-+    }
+             break;
-     dc->v7m_handler_mode = FIELD_EX32(tb_flags, TBFLAG_A32, HANDLER);
+-        case NEON_3R_FLOAT_CMP:
-     dc->v8m_secure = arm_feature(env, ARM_FEATURE_M_SECURITY) &&
+-            if (!u && size) {
-         regime_is_secure(env, dc->mmu_idx);
+-                /* no encoding for U=0 C=1x */
 -                return 1;
 -            }
 -            break;
 -        case NEON_3R_FLOAT_ACMP:
 -            if (!u) {
 -                return 1;
 -            }
 -            break;
          case NEON_3R_FLOAT_MISC:
              /* VMAXNM/VMINNM in ARMv8 */
              if (u && !arm_dc_feature(s, ARM_FEATURE_V8)) {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          tmp = neon_load_reg(rn, pass);
          tmp2 = neon_load_reg(rm, pass);
          switch (op) {
 -        case NEON_3R_FLOAT_CMP:
 -        {
 -            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
 -            if (!u) {
 -                gen_helper_neon_ceq_f32(tmp, tmp, tmp2, fpstatus);
 -            } else {
 -                if (size == 0) {
 -                    gen_helper_neon_cge_f32(tmp, tmp, tmp2, fpstatus);
 -                } else {
 -                    gen_helper_neon_cgt_f32(tmp, tmp, tmp2, fpstatus);
 -                }
 -            }
 -            tcg_temp_free_ptr(fpstatus);
 -            break;
 -        }
 -        case NEON_3R_FLOAT_ACMP:
 -        {
 -            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
 -            if (size == 0) {
 -                gen_helper_neon_acge_f32(tmp, tmp, tmp2, fpstatus);
 -            } else {
 -                gen_helper_neon_acgt_f32(tmp, tmp, tmp2, fpstatus);
 -            }
 -            tcg_temp_free_ptr(fpstatus);
 -            break;
 -        }
          case NEON_3R_FLOAT_MINMAX:
          {
              TCGv_ptr fpstatus = get_fpstatus_ptr(1);
 --
 .20.1
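The neon-dp.decode lines in the patch above give each insn as 32 characters of '0', '1' and '.', where '.' marks a bit that belongs to a field rather than to the fixed opcode. As a rough illustration only (this is not decodetree's generated code, and pattern_to_mask_value() and insn_matches() are hypothetical names), the fixed bits of such a line can be turned into a mask/value pair and tested against an instruction word:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Parse a pattern of '0', '1' and '.' characters (spaces ignored) into a
 * mask of fixed bits and the value those bits must have.
 * Returns false unless the pattern describes exactly 32 bits.
 */
static bool pattern_to_mask_value(const char *pat, uint32_t *mask,
                                  uint32_t *value)
{
    uint32_t m = 0, v = 0;
    int nbits = 0;

    assert(pat != NULL && mask != NULL && value != NULL);

    for (const char *p = pat; *p != '\0'; p++) {
        if (*p == ' ') {
            continue;
        }
        if ((*p != '0' && *p != '1' && *p != '.') || nbits == 32) {
            return false;
        }
        m = (m << 1) | (uint32_t)(*p != '.'); /* fixed bit -> 1 in mask */
        v = (v << 1) | (uint32_t)(*p == '1'); /* required bit value */
        nbits++;
    }
    if (nbits != 32) {
        return false;
    }
    *mask = m;
    *value = v;
    return true;
}

/* An instruction word matches when all fixed bits have the required value. */
static bool insn_matches(uint32_t insn, uint32_t mask, uint32_t value)
{
    return (insn & mask) == value;
}
```

The sketch covers only the fixed-bit match; field extraction through a format such as @3same_fp, and checks done later in the trans_ functions, are outside its scope.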

-[Qemu-devel] [PULL 04/42] target/arm: Make sure M-profile FPSCR RES0 bits are not settable
+[PULL 43/45] target/arm: Move 'env' argument of recps_f32 and rsqrts_f32 helpers to usual place
-Enforce that for M-profile various FPSCR bits which are RES0 there
+The usual location for the env argument in the argument list of a TCG helper
-but have defined meanings on A-profile are never settable. This
+is immediately after the return-value argument. recps_f32 and rsqrts_f32
-ensures that M-profile code can't enable the A-profile behaviour
+differ in that they put it at the end.
-(notably vector length/stride handling) by accident.
 Move the env argument to its usual place; this will allow us to
 more easily use these helper functions with the gvec APIs.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-2-peter.maydell@linaro.org
+Message-id: 20200512163904.10918-16-peter.maydell@linaro.org
 ---
- target/arm/vfp_helper.c | 8 ++++++++
+ target/arm/helper.h     | 4 ++--
-file changed, 8 insertions(+)
+ target/arm/translate.c  | 4 ++--
  target/arm/vfp_helper.c | 4 ++--
 files changed, 6 insertions(+), 6 deletions(-)
+diff --git a/target/arm/helper.h b/target/arm/helper.h
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/helper.h
++++ b/target/arm/helper.h
+@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(vfp_fcvt_f64_to_f16, TCG_CALL_NO_RWG, f16, f64, ptr, i32)
+ DEF_HELPER_4(vfp_muladdd, f64, f64, f64, f64, ptr)
+ DEF_HELPER_4(vfp_muladds, f32, f32, f32, f32, ptr)
+-DEF_HELPER_3(recps_f32, f32, f32, f32, env)
+-DEF_HELPER_3(rsqrts_f32, f32, f32, f32, env)
++DEF_HELPER_3(recps_f32, f32, env, f32, f32)
++DEF_HELPER_3(rsqrts_f32, f32, env, f32, f32)
+ DEF_HELPER_FLAGS_2(recpe_f16, TCG_CALL_NO_RWG, f16, f16, ptr)
+ DEF_HELPER_FLAGS_2(recpe_f32, TCG_CALL_NO_RWG, f32, f32, ptr)
+ DEF_HELPER_FLAGS_2(recpe_f64, TCG_CALL_NO_RWG, f64, f64, ptr)
+diff --git a/target/arm/translate.c b/target/arm/translate.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/translate.c
++++ b/target/arm/translate.c
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+                 tcg_temp_free_ptr(fpstatus);
+             } else {
+                 if (size == 0) {
+-                    gen_helper_recps_f32(tmp, tmp, tmp2, cpu_env);
++                    gen_helper_recps_f32(tmp, cpu_env, tmp, tmp2);
+                 } else {
+-                    gen_helper_rsqrts_f32(tmp, tmp, tmp2, cpu_env);
++                    gen_helper_rsqrts_f32(tmp, cpu_env, tmp, tmp2);
+               }
+             }
+             break;
 diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/vfp_helper.c
 +++ b/target/arm/vfp_helper.c
-@@ -XXX,XX +XXX,XX @@ void HELPER(vfp_set_fpscr)(CPUARMState *env, uint32_t val)
+@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(vfp_fcvt_f64_to_f16)(float64 a, void *fpstp, uint32_t ahp_mode)
-         val &= ~FPCR_FZ16;
+ #define float32_three make_float32(0x40400000)
-     }
+ #define float32_one_point_five make_float32(0x3fc00000)
-+    if (arm_feature(env, ARM_FEATURE_M)) {
+-float32 HELPER(recps_f32)(float32 a, float32 b, CPUARMState *env)
-+        /*
++float32 HELPER(recps_f32)(CPUARMState *env, float32 a, float32 b)
-+         * M profile FPSCR is RES0 for the QC, STRIDE, FZ16, LEN bits
+ {
-+         * and also for the trapped-exception-handling bits IxE.
+     float_status *s = &env->vfp.standard_fp_status;
-+         */
+     if ((float32_is_infinity(a) && float32_is_zero_or_denormal(b)) ||
-+        val &= 0xf7c0009f;
+@@ -XXX,XX +XXX,XX @@ float32 HELPER(recps_f32)(float32 a, float32 b, CPUARMState *env)
-+    }
+     return float32_sub(float32_two, float32_mul(a, b, s), s);
-+
+ }
-     /*
-      * We don't implement trapped exception handling, so the
+-float32 HELPER(rsqrts_f32)(float32 a, float32 b, CPUARMState *env)
-      * trap enable bits, IDE|IXE|UFE|OFE|DZE|IOE are all RAZ/WI (not RES0!)
++float32 HELPER(rsqrts_f32)(CPUARMState *env, float32 a, float32 b)
  {
      float_status *s = &env->vfp.standard_fp_status;
      float32 product;
 --
 .20.1
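The reason the env-first ordering in the patch above helps can be sketched in plain C. The example below is an illustration under stated assumptions, not QEMU code: Env, global_env, recps_like(), WRAP_ENV and apply3() are all hypothetical names, and recps_like() keeps only the 2 - a * b core of the real helper, without its special cases. With env in a fixed leading position, a one-line trampoline can adapt the helper to a generic three-operand callback type, which is the role WRAP_ENV_FN plays in the following patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for CPU state; QEMU's CPUARMState is far larger. */
typedef struct {
    int calls; /* counts helper invocations, for illustration only */
} Env;

/* Plays the role that cpu_env plays in the translator (assumption). */
static Env global_env;

/* Helper with env in the conventional position, before the operands. */
static float recps_like(Env *env, float a, float b)
{
    assert(env != NULL);
    env->calls++;
    return 2.0f - a * b;
}

/* Callback type expected by the generic loop: it has no env slot. */
typedef float (*Op3Fn)(float n, float m);

/* Generate a trampoline WRAPNAME that supplies the env argument to FUNC. */
#define WRAP_ENV(WRAPNAME, FUNC)               \
    static float WRAPNAME(float n, float m)    \
    {                                          \
        return FUNC(&global_env, n, m);        \
    }

WRAP_ENV(recps_tramp, recps_like)

/* Generic per-element loop, loosely analogous to a vector expander. */
static void apply3(float *d, const float *n, const float *m, size_t len,
                   Op3Fn fn)
{
    for (size_t i = 0; i < len; i++) {
        d[i] = fn(n[i], m[i]);
    }
}
```

A caller would pass recps_tramp to apply3() together with the destination and source arrays; the generic loop never needs to know that the underlying helper takes an env pointer.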

-[Qemu-devel] [PULL 27/42] target/arm: Implement VLSTM for v7M CPUs with an FPU
+[PULL 44/45] target/arm: Convert Neon fp VMAX/VMIN/VMAXNM/VMINNM/VRECPS/VRSQRTS to decodetree
-Implement the VLSTM instruction for v7M for the FPU present case.
+Convert the Neon fp VMAX/VMIN/VMAXNM/VMINNM/VRECPS/VRSQRTS 3-reg-same
 insns to decodetree. (These are all the remaining non-accumulation
 instructions in this group.)
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-25-peter.maydell@linaro.org
+Message-id: 20200512163904.10918-17-peter.maydell@linaro.org
 ---
- target/arm/cpu.h       |  2 +
+ target/arm/neon-dp.decode       |  6 +++
- target/arm/helper.h    |  2 +
+ target/arm/translate-neon.inc.c | 70 +++++++++++++++++++++++++++++++++
- target/arm/helper.c    | 84 ++++++++++++++++++++++++++++++++++++++++++
+ target/arm/translate.c          | 42 +-------------------
- target/arm/translate.c | 15 +++++++-
+files changed, 78 insertions(+), 40 deletions(-)
 files changed, 102 insertions(+), 1 deletion(-)
-diff --git a/target/arm/cpu.h b/target/arm/cpu.h
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.h
+--- a/target/arm/neon-dp.decode
-+++ b/target/arm/cpu.h
++++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ VCGE_fp_3s       1111 001 1 0 . 0 . .... .... 1110 ... 0 .... @3same_fp
- #define EXCP_INVSTATE       18   /* v7M INVSTATE UsageFault */
+ VACGE_fp_3s      1111 001 1 0 . 0 . .... .... 1110 ... 1 .... @3same_fp
- #define EXCP_STKOF          19   /* v8M STKOF UsageFault */
+ VCGT_fp_3s       1111 001 1 0 . 1 . .... .... 1110 ... 0 .... @3same_fp
- #define EXCP_LAZYFP         20   /* v7M fault during lazy FP stacking */
+ VACGT_fp_3s      1111 001 1 0 . 1 . .... .... 1110 ... 1 .... @3same_fp
-+#define EXCP_LSERR          21   /* v8M LSERR SecureFault */
++VMAX_fp_3s       1111 001 0 0 . 0 . .... .... 1111 ... 0 .... @3same_fp
-+#define EXCP_UNALIGNED      22   /* v7M UNALIGNED UsageFault */
++VMIN_fp_3s       1111 001 0 0 . 1 . .... .... 1111 ... 0 .... @3same_fp
- /* NB: add new EXCP_ defines to the array in arm_log_exception() too */
+ VPMAX_fp_3s      1111 001 1 0 . 0 . .... .... 1111 ... 0 .... @3same_fp_q0
+ VPMIN_fp_3s      1111 001 1 0 . 1 . .... .... 1111 ... 0 .... @3same_fp_q0
- #define ARMV7M_EXCP_RESET   1
++VRECPS_fp_3s     1111 001 0 0 . 0 . .... .... 1111 ... 1 .... @3same_fp
-diff --git a/target/arm/helper.h b/target/arm/helper.h
++VRSQRTS_fp_3s    1111 001 0 0 . 1 . .... .... 1111 ... 1 .... @3same_fp
 +VMAXNM_fp_3s     1111 001 1 0 . 0 . .... .... 1111 ... 1 .... @3same_fp
 +VMINNM_fp_3s     1111 001 1 0 . 1 . .... .... 1111 ... 1 .... @3same_fp
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.h
+--- a/target/arm/translate-neon.inc.c
-+++ b/target/arm/helper.h
++++ b/target/arm/translate-neon.inc.c
-@@ -XXX,XX +XXX,XX @@ DEF_HELPER_3(v7m_tt, i32, env, i32, i32)
+@@ -XXX,XX +XXX,XX @@ DO_3S_FP(VCGE, gen_helper_neon_cge_f32, false)
+ DO_3S_FP(VCGT, gen_helper_neon_cgt_f32, false)
- DEF_HELPER_1(v7m_preserve_fp_state, void, env)
+ DO_3S_FP(VACGE, gen_helper_neon_acge_f32, false)
+ DO_3S_FP(VACGT, gen_helper_neon_acgt_f32, false)
-+DEF_HELPER_2(v7m_vlstm, void, env, i32)
++DO_3S_FP(VMAX, gen_helper_vfp_maxs, false)
 +DO_3S_FP(VMIN, gen_helper_vfp_mins, false)
  static void gen_VMLA_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
                              TCGv_ptr fpstatus)
@@ -XXX,XX +XXX,XX @@ static void gen_VMLS_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
  DO_3S_FP(VMLA, gen_VMLA_fp_3s, true)
  DO_3S_FP(VMLS, gen_VMLS_fp_3s, true)
 +static bool trans_VMAXNM_fp_3s(DisasContext *s, arg_3same *a)
 +{
 +    if (!arm_dc_feature(s, ARM_FEATURE_V8)) {
 +        return false;
 +    }
 +
- DEF_HELPER_2(v8m_stackcheck, void, env, i32)
++    if (a->size != 0) {
++        /* TODO fp16 support */
- DEF_HELPER_4(access_check_cp_reg, void, env, ptr, i32, i32)
++        return false;
-diff --git a/target/arm/helper.c b/target/arm/helper.c
++    }
-index XXXXXXX..XXXXXXX 100644
++
---- a/target/arm/helper.c
++    return do_3same_fp(s, a, gen_helper_vfp_maxnums, false);
 +++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(v7m_preserve_fp_state)(CPUARMState *env)
      g_assert_not_reached();
  }
 +void HELPER(v7m_vlstm)(CPUARMState *env, uint32_t fptr)
 +{
 +    /* translate.c should never generate calls here in user-only mode */
 +    g_assert_not_reached();
 +}
 +
- uint32_t HELPER(v7m_tt)(CPUARMState *env, uint32_t addr, uint32_t op)
++static bool trans_VMINNM_fp_3s(DisasContext *s, arg_3same *a)
  {
      /* The TT instructions can be used by unprivileged code, but in
@@ -XXX,XX +XXX,XX @@ static void v7m_update_fpccr(CPUARMState *env, uint32_t frameptr,
      }
  }
 +void HELPER(v7m_vlstm)(CPUARMState *env, uint32_t fptr)
 +{
-+    /* fptr is the value of Rn, the frame pointer we store the FP regs to */
++    if (!arm_dc_feature(s, ARM_FEATURE_V8)) {
-+    bool s = env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_S_MASK;
++        return false;
 +    bool lspact = env->v7m.fpccr[s] & R_V7M_FPCCR_LSPACT_MASK;
 +
 +    assert(env->v7m.secure);
 +
 +    if (!(env->v7m.control[M_REG_S] & R_V7M_CONTROL_SFPA_MASK)) {
 +        return;
 +    }
 +
-+    /* Check access to the coprocessor is permitted */
++    if (a->size != 0) {
-+    if (!v7m_cpacr_pass(env, true, arm_current_el(env) != 0)) {
++        /* TODO fp16 support */
-+        raise_exception_ra(env, EXCP_NOCP, 0, 1, GETPC());
++        return false;
 +    }
 +
-+    if (lspact) {
++    return do_3same_fp(s, a, gen_helper_vfp_minnums, false);
-+        /* LSPACT should not be active when there is active FP state */
++}
-+        raise_exception_ra(env, EXCP_LSERR, 0, 1, GETPC());
++
 +WRAP_ENV_FN(gen_VRECPS_tramp, gen_helper_recps_f32)
 +
 +static void gen_VRECPS_fp_3s(unsigned vece, uint32_t rd_ofs,
 +                             uint32_t rn_ofs, uint32_t rm_ofs,
 +                             uint32_t oprsz, uint32_t maxsz)
 +{
 +    static const GVecGen3 ops = { .fni4 = gen_VRECPS_tramp };
 +    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &ops);
 +}
 +
 +static bool trans_VRECPS_fp_3s(DisasContext *s, arg_3same *a)
 +{
 +    if (a->size != 0) {
 +        /* TODO fp16 support */
 +        return false;
 +    }
 +
-+    if (fptr & 7) {
++    return do_3same(s, a, gen_VRECPS_fp_3s);
-+        raise_exception_ra(env, EXCP_UNALIGNED, 0, 1, GETPC());
++}
 +
 +WRAP_ENV_FN(gen_VRSQRTS_tramp, gen_helper_rsqrts_f32)
 +
 +static void gen_VRSQRTS_fp_3s(unsigned vece, uint32_t rd_ofs,
 +                              uint32_t rn_ofs, uint32_t rm_ofs,
 +                              uint32_t oprsz, uint32_t maxsz)
 +{
 +    static const GVecGen3 ops = { .fni4 = gen_VRSQRTS_tramp };
 +    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &ops);
 +}
 +
 +static bool trans_VRSQRTS_fp_3s(DisasContext *s, arg_3same *a)
 +{
 +    if (a->size != 0) {
 +        /* TODO fp16 support */
 +        return false;
 +    }
 +
-+    /*
++    return do_3same(s, a, gen_VRSQRTS_fp_3s);
 +     * Note that we do not use v7m_stack_write() here, because the
 +     * accesses should not set the FSR bits for stacking errors if they
 +     * fail. (In pseudocode terms, they are AccType_NORMAL, not AccType_STACK
 +     * or AccType_LAZYFP). Faults in cpu_stl_data() will throw exceptions
 +     * and longjmp out.
 +     */
 +    if (!(env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_LSPEN_MASK)) {
 +        bool ts = env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_TS_MASK;
 +        int i;
 +
 +        for (i = 0; i < (ts ? 32 : 16); i += 2) {
 +            uint64_t dn = *aa32_vfp_dreg(env, i / 2);
 +            uint32_t faddr = fptr + 4 * i;
 +            uint32_t slo = extract64(dn, 0, 32);
 +            uint32_t shi = extract64(dn, 32, 32);
 +
 +            if (i >= 16) {
 +                faddr += 8; /* skip the slot for the FPSCR */
 +            }
 +            cpu_stl_data(env, faddr, slo);
 +            cpu_stl_data(env, faddr + 4, shi);
 +        }
 +        cpu_stl_data(env, fptr + 0x40, vfp_get_fpscr(env));
 +
 +        /*
 +         * If TS is 0 then s0 to s15 and FPSCR are UNKNOWN; we choose to
 +         * leave them unchanged, matching our choice in v7m_preserve_fp_state.
 +         */
 +        if (ts) {
 +            for (i = 0; i < 32; i += 2) {
 +                *aa32_vfp_dreg(env, i / 2) = 0;
 +            }
 +            vfp_set_fpscr(env, 0);
 +        }
 +    } else {
 +        v7m_update_fpccr(env, fptr, false);
 +    }
 +
 +    env->v7m.control[M_REG_S] &= ~R_V7M_CONTROL_FPCA_MASK;
 +}
 +
- static bool v7m_push_stack(ARMCPU *cpu)
+ static bool do_3same_fp_pair(DisasContext *s, arg_3same *a, VFPGen3OpSPFn *fn)
  {
-     /* Do the "set up stack frame" part of exception entry,
+     /* FP operations handled pairwise 32 bits at a time */
@@ -XXX,XX +XXX,XX @@ static void arm_log_exception(int idx)
              [EXCP_INVSTATE] = "v7M INVSTATE UsageFault",
              [EXCP_STKOF] = "v8M STKOF UsageFault",
              [EXCP_LAZYFP] = "v7M exception during lazy FP stacking",
 +            [EXCP_LSERR] = "v8M LSERR SecureFault",
 +            [EXCP_UNALIGNED] = "v7M UNALIGNED UsageFault",
          };
          if (idx >= 0 && idx < ARRAY_SIZE(excnames)) {
@@ -XXX,XX +XXX,XX @@ void arm_v7m_cpu_do_interrupt(CPUState *cs)
          armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE, env->v7m.secure);
          env->v7m.cfsr[env->v7m.secure] |= R_V7M_CFSR_STKOF_MASK;
          break;
 +    case EXCP_LSERR:
 +        armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_SECURE, false);
 +        env->v7m.sfsr |= R_V7M_SFSR_LSERR_MASK;
 +        break;
 +    case EXCP_UNALIGNED:
 +        armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE, env->v7m.secure);
 +        env->v7m.cfsr[env->v7m.secure] |= R_V7M_CFSR_UNALIGNED_MASK;
 +        break;
      case EXCP_SWI:
          /* The PC already points to the next instruction.  */
          armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_SVC, env->v7m.secure);
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-                 if (!s->v8m_secure || (insn & 0x0040f0ff)) {
+         case NEON_3R_FLOAT_MULTIPLY:
-                     goto illegal_op;
+         case NEON_3R_FLOAT_CMP:
-                 }
+         case NEON_3R_FLOAT_ACMP:
--                /* Just NOP since FP support is not implemented */
++        case NEON_3R_FLOAT_MINMAX:
-+
++        case NEON_3R_FLOAT_MISC:
-+                if (arm_dc_feature(s, ARM_FEATURE_VFP)) {
+             /* Already handled by decodetree */
-+                    TCGv_i32 fptr = load_reg(s, rn);
+             return 1;
-+
+         }
-+                    if (extract32(insn, 20, 1)) {
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-+                        /* VLLDM */
+             return 1;
-+                    } else {
+         }
-+                        gen_helper_v7m_vlstm(cpu_env, fptr);
+         switch (op) {
-+                    }
+-        case NEON_3R_FLOAT_MINMAX:
-+                    tcg_temp_free_i32(fptr);
+-            if (u) {
-+
+-                return 1; /* VPMIN/VPMAX handled by decodetree */
-+                    /* End the TB, because we have updated FP control bits */
+-            }
-+                    s->base.is_jmp = DISAS_UPDATE;
+-            break;
-+                }
+-        case NEON_3R_FLOAT_MISC:
-                 break;
+-            /* VMAXNM/VMINNM in ARMv8 */
-             }
+-            if (u && !arm_dc_feature(s, ARM_FEATURE_V8)) {
-             if (arm_dc_feature(s, ARM_FEATURE_VFP) &&
+-                return 1;
 -            }
 -            break;
          case NEON_3R_VFM_VQRDMLSH:
              if (!dc_isar_feature(aa32_simdfmac, s)) {
                  return 1;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          tmp = neon_load_reg(rn, pass);
          tmp2 = neon_load_reg(rm, pass);
          switch (op) {
 -        case NEON_3R_FLOAT_MINMAX:
 -        {
 -            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
 -            if (size == 0) {
 -                gen_helper_vfp_maxs(tmp, tmp, tmp2, fpstatus);
 -            } else {
 -                gen_helper_vfp_mins(tmp, tmp, tmp2, fpstatus);
 -            }
 -            tcg_temp_free_ptr(fpstatus);
 -            break;
 -        }
 -        case NEON_3R_FLOAT_MISC:
 -            if (u) {
 -                /* VMAXNM/VMINNM */
 -                TCGv_ptr fpstatus = get_fpstatus_ptr(1);
 -                if (size == 0) {
 -                    gen_helper_vfp_maxnums(tmp, tmp, tmp2, fpstatus);
 -                } else {
 -                    gen_helper_vfp_minnums(tmp, tmp, tmp2, fpstatus);
 -                }
 -                tcg_temp_free_ptr(fpstatus);
 -            } else {
 -                if (size == 0) {
 -                    gen_helper_recps_f32(tmp, cpu_env, tmp, tmp2);
 -                } else {
 -                    gen_helper_rsqrts_f32(tmp, cpu_env, tmp, tmp2);
 -              }
 -            }
 -            break;
          case NEON_3R_VFM_VQRDMLSH:
          {
              /* VFMA, VFMS: fused multiply-add */
 --
 .20.1

-[Qemu-devel] [PULL 25/42] target/arm: Add lazy-FP-stacking support to v7m_stack_write()
+[PULL 45/45] target/arm: Convert NEON VFMA, VFMS 3-reg-same insns to decodetree
-Pushing registers to the stack for v7M needs to handle three cases:
+Convert the Neon floating-point VFMA and VFMS insns to decodetree.
- * the "normal" case where we pend exceptions
+These are the last insns in the 3-reg-same group so we can
- * an "ignore faults" case where we set FSR bits but
+remove all the support/loop code from the old decoder.
    do not pend exceptions (this is used when we are
    handling some kinds of derived exception on exception entry)
  * a "lazy FP stacking" case, where different FSR bits
    are set and the exception is pended differently
 Implement this by changing the existing flag argument that
 tells us whether to ignore faults or not into an enum that
 specifies which of the 3 modes we should handle.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-23-peter.maydell@linaro.org
+Message-id: 20200512163904.10918-18-peter.maydell@linaro.org
 ---
- target/arm/helper.c | 118 +++++++++++++++++++++++++++++---------------
+ target/arm/neon-dp.decode       |   3 +
-file changed, 79 insertions(+), 39 deletions(-)
+ target/arm/translate-neon.inc.c |  41 ++++++++
+ target/arm/translate.c          | 176 +-------------------------------
-diff --git a/target/arm/helper.c b/target/arm/helper.c
+files changed, 46 insertions(+), 174 deletions(-)
 diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
+--- a/target/arm/neon-dp.decode
-+++ b/target/arm/helper.c
++++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ static bool v7m_cpacr_pass(CPUARMState *env, bool is_secure, bool is_priv)
+@@ -XXX,XX +XXX,XX @@ SHA256H2_3s      1111 001 1 0 . 01 .... .... 1100 . 1 . 0 .... \
  SHA256SU1_3s     1111 001 1 0 . 10 .... .... 1100 . 1 . 0 .... \
                   vm=%vm_dp vn=%vn_dp vd=%vd_dp
 +VFMA_fp_3s       1111 001 0 0 . 0 . .... .... 1100 ... 1 .... @3same_fp
 +VFMS_fp_3s       1111 001 0 0 . 1 . .... .... 1100 ... 1 .... @3same_fp
 +
  VQRDMLSH_3s      1111 001 1 0 . .. .... .... 1100 ... 1 .... @3same
  VADD_fp_3s       1111 001 0 0 . 0 . .... .... 1101 ... 0 .... @3same_fp
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VRSQRTS_fp_3s(DisasContext *s, arg_3same *a)
      return do_3same(s, a, gen_VRSQRTS_fp_3s);
  }
 +static void gen_VFMA_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
 +                            TCGv_ptr fpstatus)
 +{
 +    gen_helper_vfp_muladds(vd, vn, vm, vd, fpstatus);
 +}
 +
 +static bool trans_VFMA_fp_3s(DisasContext *s, arg_3same *a)
 +{
 +    if (!dc_isar_feature(aa32_simdfmac, s)) {
 +        return false;
 +    }
 +
 +    if (a->size != 0) {
 +        /* TODO fp16 support */
 +        return false;
 +    }
 +
 +    return do_3same_fp(s, a, gen_VFMA_fp_3s, true);
 +}
 +
 +static void gen_VFMS_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
 +                            TCGv_ptr fpstatus)
 +{
 +    gen_helper_vfp_negs(vn, vn);
 +    gen_helper_vfp_muladds(vd, vn, vm, vd, fpstatus);
 +}
 +
 +static bool trans_VFMS_fp_3s(DisasContext *s, arg_3same *a)
 +{
 +    if (!dc_isar_feature(aa32_simdfmac, s)) {
 +        return false;
 +    }
 +
 +    if (a->size != 0) {
 +        /* TODO fp16 support */
 +        return false;
 +    }
 +
 +    return do_3same_fp(s, a, gen_VFMS_fp_3s, true);
 +}
 +
  static bool do_3same_fp_pair(DisasContext *s, arg_3same *a, VFPGen3OpSPFn *fn)
  {
      /* FP operations handled pairwise 32 bits at a time */
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_neon_narrow_op(int op, int u, int size,
      }
  }
-+/*
+-/* Symbolic constants for op fields for Neon 3-register same-length.
-+ * What kind of stack write are we doing? This affects how exceptions
+- * The values correspond to bits [11:8,4]; see the ARM ARM DDI0406B
-+ * generated during the stacking are treated.
+- * table A7-9.
-+ */
+- */
-+typedef enum StackingMode {
+-#define NEON_3R_VHADD 0
-+    STACK_NORMAL,
+-#define NEON_3R_VQADD 1
-+    STACK_IGNFAULTS,
+-#define NEON_3R_VRHADD 2
-+    STACK_LAZYFP,
+-#define NEON_3R_LOGIC 3 /* VAND,VBIC,VORR,VMOV,VORN,VEOR,VBIF,VBIT,VBSL */
-+} StackingMode;
+-#define NEON_3R_VHSUB 4
-+
+-#define NEON_3R_VQSUB 5
- static bool v7m_stack_write(ARMCPU *cpu, uint32_t addr, uint32_t value,
+-#define NEON_3R_VCGT 6
--                            ARMMMUIdx mmu_idx, bool ignfault)
+-#define NEON_3R_VCGE 7
-+                            ARMMMUIdx mmu_idx, StackingMode mode)
+-#define NEON_3R_VSHL 8
- {
+-#define NEON_3R_VQSHL 9
-     CPUState *cs = CPU(cpu);
+-#define NEON_3R_VRSHL 10
-     CPUARMState *env = &cpu->env;
+-#define NEON_3R_VQRSHL 11
-@@ -XXX,XX +XXX,XX @@ static bool v7m_stack_write(ARMCPU *cpu, uint32_t addr, uint32_t value,
+-#define NEON_3R_VMAX 12
-                       &attrs, &prot, &page_size, &fi, NULL)) {
+-#define NEON_3R_VMIN 13
-         /* MPU/SAU lookup failed */
+-#define NEON_3R_VABD 14
-         if (fi.type == ARMFault_QEMU_SFault) {
+-#define NEON_3R_VABA 15
--            qemu_log_mask(CPU_LOG_INT,
+-#define NEON_3R_VADD_VSUB 16
--                          "...SecureFault with SFSR.AUVIOL during stacking\n");
+-#define NEON_3R_VTST_VCEQ 17
--            env->v7m.sfsr |= R_V7M_SFSR_AUVIOL_MASK | R_V7M_SFSR_SFARVALID_MASK;
+-#define NEON_3R_VML 18 /* VMLA, VMLS */
-+            if (mode == STACK_LAZYFP) {
+-#define NEON_3R_VMUL 19
-+                qemu_log_mask(CPU_LOG_INT,
+-#define NEON_3R_VPMAX 20
-+                              "...SecureFault with SFSR.LSPERR "
+-#define NEON_3R_VPMIN 21
-+                              "during lazy stacking\n");
+-#define NEON_3R_VQDMULH_VQRDMULH 22
-+                env->v7m.sfsr |= R_V7M_SFSR_LSPERR_MASK;
+-#define NEON_3R_VPADD_VQRDMLAH 23
-+            } else {
+-#define NEON_3R_SHA 24 /* SHA1C,SHA1P,SHA1M,SHA1SU0,SHA256H{2},SHA256SU1 */
-+                qemu_log_mask(CPU_LOG_INT,
+-#define NEON_3R_VFM_VQRDMLSH 25 /* VFMA, VFMS, VQRDMLSH */
-+                              "...SecureFault with SFSR.AUVIOL "
+-#define NEON_3R_FLOAT_ARITH 26 /* float VADD, VSUB, VPADD, VABD */
-+                              "during stacking\n");
+-#define NEON_3R_FLOAT_MULTIPLY 27 /* float VMLA, VMLS, VMUL */
-+                env->v7m.sfsr |= R_V7M_SFSR_AUVIOL_MASK;
+-#define NEON_3R_FLOAT_CMP 28 /* float VCEQ, VCGE, VCGT */
-+            }
+-#define NEON_3R_FLOAT_ACMP 29 /* float VACGE, VACGT, VACLE, VACLT */
-+            env->v7m.sfsr |= R_V7M_SFSR_SFARVALID_MASK;
+-#define NEON_3R_FLOAT_MINMAX 30 /* float VMIN, VMAX */
-             env->v7m.sfar = addr;
+-#define NEON_3R_FLOAT_MISC 31 /* float VRECPS, VRSQRTS, VMAXNM/MINNM */
-             exc = ARMV7M_EXCP_SECURE;
+-
-             exc_secure = false;
+-static const uint8_t neon_3r_sizes[] = {
-         } else {
+-    [NEON_3R_VHADD] = 0x7,
--            qemu_log_mask(CPU_LOG_INT, "...MemManageFault with CFSR.MSTKERR\n");
+-    [NEON_3R_VQADD] = 0xf,
--            env->v7m.cfsr[secure] |= R_V7M_CFSR_MSTKERR_MASK;
+-    [NEON_3R_VRHADD] = 0x7,
-+            if (mode == STACK_LAZYFP) {
+-    [NEON_3R_LOGIC] = 0xf, /* size field encodes op type */
-+                qemu_log_mask(CPU_LOG_INT,
+-    [NEON_3R_VHSUB] = 0x7,
-+                              "...MemManageFault with CFSR.MLSPERR\n");
+-    [NEON_3R_VQSUB] = 0xf,
-+                env->v7m.cfsr[secure] |= R_V7M_CFSR_MLSPERR_MASK;
+-    [NEON_3R_VCGT] = 0x7,
-+            } else {
+-    [NEON_3R_VCGE] = 0x7,
-+                qemu_log_mask(CPU_LOG_INT,
+-    [NEON_3R_VSHL] = 0xf,
-+                              "...MemManageFault with CFSR.MSTKERR\n");
+-    [NEON_3R_VQSHL] = 0xf,
-+                env->v7m.cfsr[secure] |= R_V7M_CFSR_MSTKERR_MASK;
+-    [NEON_3R_VRSHL] = 0xf,
-+            }
+-    [NEON_3R_VQRSHL] = 0xf,
-             exc = ARMV7M_EXCP_MEM;
+-    [NEON_3R_VMAX] = 0x7,
-             exc_secure = secure;
+-    [NEON_3R_VMIN] = 0x7,
-         }
+-    [NEON_3R_VABD] = 0x7,
-@@ -XXX,XX +XXX,XX @@ static bool v7m_stack_write(ARMCPU *cpu, uint32_t addr, uint32_t value,
+-    [NEON_3R_VABA] = 0x7,
-                          attrs, &txres);
+-    [NEON_3R_VADD_VSUB] = 0xf,
-     if (txres != MEMTX_OK) {
+-    [NEON_3R_VTST_VCEQ] = 0x7,
-         /* BusFault trying to write the data */
+-    [NEON_3R_VML] = 0x7,
--        qemu_log_mask(CPU_LOG_INT, "...BusFault with BFSR.STKERR\n");
+-    [NEON_3R_VMUL] = 0x7,
--        env->v7m.cfsr[M_REG_NS] |= R_V7M_CFSR_STKERR_MASK;
+-    [NEON_3R_VPMAX] = 0x7,
-+        if (mode == STACK_LAZYFP) {
+-    [NEON_3R_VPMIN] = 0x7,
-+            qemu_log_mask(CPU_LOG_INT, "...BusFault with BFSR.LSPERR\n");
+-    [NEON_3R_VQDMULH_VQRDMULH] = 0x6,
-+            env->v7m.cfsr[M_REG_NS] |= R_V7M_CFSR_LSPERR_MASK;
+-    [NEON_3R_VPADD_VQRDMLAH] = 0x7,
-+        } else {
+-    [NEON_3R_SHA] = 0xf, /* size field encodes op type */
-+            qemu_log_mask(CPU_LOG_INT, "...BusFault with BFSR.STKERR\n");
+-    [NEON_3R_VFM_VQRDMLSH] = 0x7, /* For VFM, size bit 1 encodes op */
-+            env->v7m.cfsr[M_REG_NS] |= R_V7M_CFSR_STKERR_MASK;
+-    [NEON_3R_FLOAT_ARITH] = 0x5, /* size bit 1 encodes op */
-+        }
+-    [NEON_3R_FLOAT_MULTIPLY] = 0x5, /* size bit 1 encodes op */
-         exc = ARMV7M_EXCP_BUS;
+-    [NEON_3R_FLOAT_CMP] = 0x5, /* size bit 1 encodes op */
-         exc_secure = false;
+-    [NEON_3R_FLOAT_ACMP] = 0x5, /* size bit 1 encodes op */
-         goto pend_fault;
+-    [NEON_3R_FLOAT_MINMAX] = 0x5, /* size bit 1 encodes op */
-@@ -XXX,XX +XXX,XX @@ pend_fault:
+-    [NEON_3R_FLOAT_MISC] = 0x5, /* size bit 1 encodes op */
-      * later if we have two derived exceptions.
+-};
-      * The only case when we must not pend the exception but instead
+-
-      * throw it away is if we are doing the push of the callee registers
+ /* Symbolic constants for op fields for Neon 2-register miscellaneous.
--     * and we've already generated a derived exception. Even in this
+  * The values correspond to bits [17:16,10:7]; see the ARM ARM DDI0406B
--     * case we will still update the fault status registers.
+  * table A7-13.
-+     * and we've already generated a derived exception (this is indicated
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-+     * by the caller passing STACK_IGNFAULTS). Even in this case we will
+     rm_ofs = neon_reg_offset(rm, 0);
-+     * still update the fault status registers.
-      */
+     if ((insn & (1 << 23)) == 0) {
--    if (!ignfault) {
+-        /* Three register same length.  */
-+    switch (mode) {
+-        op = ((insn >> 7) & 0x1e) | ((insn >> 4) & 1);
-+    case STACK_NORMAL:
+-        /* Catch invalid op and bad size combinations: UNDEF */
-         armv7m_nvic_set_pending_derived(env->nvic, exc, exc_secure);
+-        if ((neon_3r_sizes[op] & (1 << size)) == 0) {
-+        break;
+-            return 1;
-+    case STACK_LAZYFP:
+-        }
-+        armv7m_nvic_set_pending_lazyfp(env->nvic, exc, exc_secure);
+-        /* All insns of this form UNDEF for either this condition or the
-+        break;
+-         * superset of cases "Q==1"; we catch the latter later.
-+    case STACK_IGNFAULTS:
+-         */
-+        break;
+-        if (q && ((rd | rn | rm) & 1)) {
-     }
+-            return 1;
-     return false;
+-        }
- }
+-        switch (op) {
-@@ -XXX,XX +XXX,XX @@ static bool v7m_push_callee_stack(ARMCPU *cpu, uint32_t lr, bool dotailchain,
+-        case NEON_3R_VFM_VQRDMLSH:
-     uint32_t limit;
+-            if (!u) {
-     bool want_psp;
+-                /* VFM, VFMS */
-     uint32_t sig;
+-                if (size == 1) {
-+    StackingMode smode = ignore_faults ? STACK_IGNFAULTS : STACK_NORMAL;
+-                    return 1;
+-                }
-     if (dotailchain) {
+-                break;
-         bool mode = lr & R_V7M_EXCRET_MODE_MASK;
+-            }
-@@ -XXX,XX +XXX,XX @@ static bool v7m_push_callee_stack(ARMCPU *cpu, uint32_t lr, bool dotailchain,
+-            /* VQRDMLSH : handled by decodetree */
-      */
+-            return 1;
-     sig = v7m_integrity_sig(env, lr);
+-
-     stacked_ok =
+-        case NEON_3R_VADD_VSUB:
--        v7m_stack_write(cpu, frameptr, sig, mmu_idx, ignore_faults) &&
+-        case NEON_3R_LOGIC:
--        v7m_stack_write(cpu, frameptr + 0x8, env->regs[4], mmu_idx,
+-        case NEON_3R_VMAX:
--                        ignore_faults) &&
+-        case NEON_3R_VMIN:
--        v7m_stack_write(cpu, frameptr + 0xc, env->regs[5], mmu_idx,
+-        case NEON_3R_VTST_VCEQ:
--                        ignore_faults) &&
+-        case NEON_3R_VCGT:
--        v7m_stack_write(cpu, frameptr + 0x10, env->regs[6], mmu_idx,
+-        case NEON_3R_VCGE:
--                        ignore_faults) &&
+-        case NEON_3R_VQADD:
--        v7m_stack_write(cpu, frameptr + 0x14, env->regs[7], mmu_idx,
+-        case NEON_3R_VQSUB:
--                        ignore_faults) &&
+-        case NEON_3R_VMUL:
--        v7m_stack_write(cpu, frameptr + 0x18, env->regs[8], mmu_idx,
+-        case NEON_3R_VML:
--                        ignore_faults) &&
+-        case NEON_3R_VSHL:
--        v7m_stack_write(cpu, frameptr + 0x1c, env->regs[9], mmu_idx,
+-        case NEON_3R_SHA:
--                        ignore_faults) &&
+-        case NEON_3R_VHADD:
--        v7m_stack_write(cpu, frameptr + 0x20, env->regs[10], mmu_idx,
+-        case NEON_3R_VRHADD:
--                        ignore_faults) &&
+-        case NEON_3R_VHSUB:
--        v7m_stack_write(cpu, frameptr + 0x24, env->regs[11], mmu_idx,
+-        case NEON_3R_VABD:
--                        ignore_faults);
+-        case NEON_3R_VABA:
-+        v7m_stack_write(cpu, frameptr, sig, mmu_idx, smode) &&
+-        case NEON_3R_VQSHL:
-+        v7m_stack_write(cpu, frameptr + 0x8, env->regs[4], mmu_idx, smode) &&
+-        case NEON_3R_VRSHL:
-+        v7m_stack_write(cpu, frameptr + 0xc, env->regs[5], mmu_idx, smode) &&
+-        case NEON_3R_VQRSHL:
-+        v7m_stack_write(cpu, frameptr + 0x10, env->regs[6], mmu_idx, smode) &&
+-        case NEON_3R_VPMAX:
-+        v7m_stack_write(cpu, frameptr + 0x14, env->regs[7], mmu_idx, smode) &&
+-        case NEON_3R_VPMIN:
-+        v7m_stack_write(cpu, frameptr + 0x18, env->regs[8], mmu_idx, smode) &&
+-        case NEON_3R_VPADD_VQRDMLAH:
-+        v7m_stack_write(cpu, frameptr + 0x1c, env->regs[9], mmu_idx, smode) &&
+-        case NEON_3R_VQDMULH_VQRDMULH:
-+        v7m_stack_write(cpu, frameptr + 0x20, env->regs[10], mmu_idx, smode) &&
+-        case NEON_3R_FLOAT_ARITH:
-+        v7m_stack_write(cpu, frameptr + 0x24, env->regs[11], mmu_idx, smode);
+-        case NEON_3R_FLOAT_MULTIPLY:
+-        case NEON_3R_FLOAT_CMP:
-     /* Update SP regardless of whether any of the stack accesses failed. */
+-        case NEON_3R_FLOAT_ACMP:
-     *frame_sp_p = frameptr;
+-        case NEON_3R_FLOAT_MINMAX:
-@@ -XXX,XX +XXX,XX @@ static bool v7m_push_stack(ARMCPU *cpu)
+-        case NEON_3R_FLOAT_MISC:
-      * if it has higher priority).
+-            /* Already handled by decodetree */
-      */
+-            return 1;
-     stacked_ok = stacked_ok &&
+-        }
--        v7m_stack_write(cpu, frameptr, env->regs[0], mmu_idx, false) &&
+-
--        v7m_stack_write(cpu, frameptr + 4, env->regs[1], mmu_idx, false) &&
+-        if (size == 3) {
--        v7m_stack_write(cpu, frameptr + 8, env->regs[2], mmu_idx, false) &&
+-            /* 64-bit element instructions: handled by decodetree */
--        v7m_stack_write(cpu, frameptr + 12, env->regs[3], mmu_idx, false) &&
+-            return 1;
--        v7m_stack_write(cpu, frameptr + 16, env->regs[12], mmu_idx, false) &&
+-        }
--        v7m_stack_write(cpu, frameptr + 20, env->regs[14], mmu_idx, false) &&
+-        switch (op) {
--        v7m_stack_write(cpu, frameptr + 24, env->regs[15], mmu_idx, false) &&
+-        case NEON_3R_VFM_VQRDMLSH:
--        v7m_stack_write(cpu, frameptr + 28, xpsr, mmu_idx, false);
+-            if (!dc_isar_feature(aa32_simdfmac, s)) {
-+        v7m_stack_write(cpu, frameptr, env->regs[0], mmu_idx, STACK_NORMAL) &&
+-                return 1;
-+        v7m_stack_write(cpu, frameptr + 4, env->regs[1],
+-            }
-+                        mmu_idx, STACK_NORMAL) &&
+-            break;
-+        v7m_stack_write(cpu, frameptr + 8, env->regs[2],
+-        default:
-+                        mmu_idx, STACK_NORMAL) &&
+-            break;
-+        v7m_stack_write(cpu, frameptr + 12, env->regs[3],
+-        }
-+                        mmu_idx, STACK_NORMAL) &&
+-
-+        v7m_stack_write(cpu, frameptr + 16, env->regs[12],
+-        for (pass = 0; pass < (q ? 4 : 2); pass++) {
-+                        mmu_idx, STACK_NORMAL) &&
+-
-+        v7m_stack_write(cpu, frameptr + 20, env->regs[14],
+-        /* Elementwise.  */
-+                        mmu_idx, STACK_NORMAL) &&
+-        tmp = neon_load_reg(rn, pass);
-+        v7m_stack_write(cpu, frameptr + 24, env->regs[15],
+-        tmp2 = neon_load_reg(rm, pass);
-+                        mmu_idx, STACK_NORMAL) &&
+-        switch (op) {
-+        v7m_stack_write(cpu, frameptr + 28, xpsr, mmu_idx, STACK_NORMAL);
+-        case NEON_3R_VFM_VQRDMLSH:
+-        {
-     if (env->v7m.control[M_REG_S] & R_V7M_CONTROL_FPCA_MASK) {
+-            /* VFMA, VFMS: fused multiply-add */
-         /* FPU is active, try to save its registers */
+-            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
-@@ -XXX,XX +XXX,XX @@ static bool v7m_push_stack(ARMCPU *cpu)
+-            TCGv_i32 tmp3 = neon_load_reg(rd, pass);
-                         faddr += 8; /* skip the slot for the FPSCR */
+-            if (size) {
-                     }
+-                /* VFMS */
-                     stacked_ok = stacked_ok &&
+-                gen_helper_vfp_negs(tmp, tmp);
--                        v7m_stack_write(cpu, faddr, slo, mmu_idx, false) &&
+-            }
--                        v7m_stack_write(cpu, faddr + 4, shi, mmu_idx, false);
+-            gen_helper_vfp_muladds(tmp, tmp, tmp2, tmp3, fpstatus);
-+                        v7m_stack_write(cpu, faddr, slo,
+-            tcg_temp_free_i32(tmp3);
-+                                        mmu_idx, STACK_NORMAL) &&
+-            tcg_temp_free_ptr(fpstatus);
-+                        v7m_stack_write(cpu, faddr + 4, shi,
+-            break;
-+                                        mmu_idx, STACK_NORMAL);
+-        }
-                 }
+-        default:
-                 stacked_ok = stacked_ok &&
+-            abort();
-                     v7m_stack_write(cpu, frameptr + 0x60,
+-        }
--                                    vfp_get_fpscr(env), mmu_idx, false);
+-        tcg_temp_free_i32(tmp2);
-+                                    vfp_get_fpscr(env), mmu_idx, STACK_NORMAL);
+-
-                 if (cpacr_pass) {
+-        neon_store_reg(rd, pass, tmp);
-                     for (i = 0; i < ((framesize == 0xa8) ? 32 : 16); i += 2) {
+-
-                         *aa32_vfp_dreg(env, i / 2) = 0;
+-        } /* for pass */
 -        /* End of 3 register same size operations.  */
 +        /* Three register same length: handled by decodetree */
 +        return 1;
      } else if (insn & (1 << 4)) {
          if ((insn & 0x00380080) != 0) {
              /* Two registers and shift.  */
 --
 2.20.1

First pullreq for arm of the 4.1 series, since I'm back from
holiday now. This is mostly my M-profile FPU series and Philippe's
devices.h cleanup. I have a pile of other patchsets to work through
in my to-review folder, but 42 patches is definitely quite
big enough to send now...

thanks
-- PMM

The following changes since commit 413a99a92c13ec408dcf2adaa87918dc81e890c8:

Add Nios II semihosting support. (2019-04-29 16:09:51 +0100)

are available in the Git repository at:

https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20190429

for you to fetch changes up to 437cc27ddfded3bbab6afd5ac1761e0e195edba7:

hw/devices: Move SMSC 91C111 declaration into a new header (2019-04-29 17:57:21 +0100)

----------------------------------------------------------------
target-arm queue:
 * remove "bag of random stuff" hw/devices.h header
 * implement FPU for Cortex-M and enable it for Cortex-M4 and -M33
 * hw/dma: Compile the bcm2835_dma device as common object
 * configure: Remove --source-path option
 * hw/ssi/xilinx_spips: Avoid variable length array
 * hw/arm/smmuv3: Remove SMMUNotifierNode

----------------------------------------------------------------
Eric Auger (1):
      hw/arm/smmuv3: Remove SMMUNotifierNode

Peter Maydell (28):
      hw/ssi/xilinx_spips: Avoid variable length array
      configure: Remove --source-path option
      target/arm: Make sure M-profile FPSCR RES0 bits are not settable
      hw/intc/armv7m_nvic: Allow reading of M-profile MVFR* registers
      target/arm: Implement dummy versions of M-profile FP-related registers
      target/arm: Disable most VFP sysregs for M-profile
      target/arm: Honour M-profile FP enable bits
      target/arm: Decode FP instructions for M profile
      target/arm: Clear CONTROL_S.SFPA in SG insn if FPU present
      target/arm: Handle SFPA and FPCA bits in reads and writes of CONTROL
      target/arm/helper: don't return early for STKOF faults during stacking
      target/arm: Handle floating point registers in exception entry
      target/arm: Implement v7m_update_fpccr()
      target/arm: Clear CONTROL.SFPA in BXNS and BLXNS
      target/arm: Clean excReturn bits when tail chaining
      target/arm: Allow for floating point in callee stack integrity check
      target/arm: Handle floating point registers in exception return
      target/arm: Move NS TBFLAG from bit 19 to bit 6
      target/arm: Overlap VECSTRIDE and XSCALE_CPAR TB flags
      target/arm: Set FPCCR.S when executing M-profile floating point insns
      target/arm: Activate M-profile floating point context when FPCCR.ASPEN is set
      target/arm: New helper function arm_v7m_mmu_idx_all()
      target/arm: New function armv7m_nvic_set_pending_lazyfp()
      target/arm: Add lazy-FP-stacking support to v7m_stack_write()
      target/arm: Implement M-profile lazy FP state preservation
      target/arm: Implement VLSTM for v7M CPUs with an FPU
      target/arm: Implement VLLDM for v7M CPUs with an FPU
      target/arm: Enable FPU for Cortex-M4 and Cortex-M33

Philippe Mathieu-Daudé (13):
      hw/dma: Compile the bcm2835_dma device as common object
      hw/arm/aspeed: Use TYPE_TMP105/TYPE_PCA9552 instead of hardcoded string
      hw/arm/nseries: Use TYPE_TMP105 instead of hardcoded string
      hw/display/tc6393xb: Remove unused functions
      hw/devices: Move TC6393XB declarations into a new header
      hw/devices: Move Blizzard declarations into a new header
      hw/devices: Move CBus declarations into a new header
      hw/devices: Move Gamepad declarations into a new header
      hw/devices: Move TI touchscreen declarations into a new header
      hw/devices: Move LAN9118 declarations into a new header
      hw/net/ne2000-isa: Add guards to the header
      hw/net/lan9118: Export TYPE_LAN9118 and use it instead of hardcoded string
      hw/devices: Move SMSC 91C111 declaration into a new header

From: Eric Auger <eric.auger@redhat.com>

The SMMUNotifierNode struct is not necessary and brings extra
complexity so let's remove it. We now directly track the SMMUDevices
which have registered IOMMU MR notifiers.

This is inspired by the same transformation on intel-iommu
done in commit b4a4ba0d68f50f218ee3957b6638dbee32a5eeef
("intel-iommu: remove IntelIOMMUNotifierNode")

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-id: 20190409160219.19026-1-eric.auger@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/arm/smmu-common.h |  8 ++------
 hw/arm/smmu-common.c         |  6 +++---
 hw/arm/smmuv3.c              | 28 +++++++---------------------
 3 files changed, 12 insertions(+), 30 deletions(-)

diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/arm/smmu-common.h
+++ b/include/hw/arm/smmu-common.h
@@ -XXX,XX +XXX,XX @@ typedef struct SMMUDevice {
     AddressSpace       as;
     uint32_t           cfg_cache_hits;
     uint32_t           cfg_cache_misses;
+    QLIST_ENTRY(SMMUDevice) next;
 } SMMUDevice;
 
-typedef struct SMMUNotifierNode {
-    SMMUDevice *sdev;
-    QLIST_ENTRY(SMMUNotifierNode) next;
-} SMMUNotifierNode;
-
 typedef struct SMMUPciBus {
     PCIBus       *bus;
     SMMUDevice   *pbdev[0]; /* Parent array is sparse, so dynamically alloc */
@@ -XXX,XX +XXX,XX @@ typedef struct SMMUState {
     GHashTable *iotlb;
     SMMUPciBus *smmu_pcibus_by_bus_num[SMMU_PCI_BUS_MAX];
     PCIBus *pci_bus;
-    QLIST_HEAD(, SMMUNotifierNode) notifiers_list;
+    QLIST_HEAD(, SMMUDevice) devices_with_notifiers;
     uint8_t bus_num;
     PCIBus *primary_bus;
 } SMMUState;
diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/smmu-common.c
+++ b/hw/arm/smmu-common.c
@@ -XXX,XX +XXX,XX @@ inline void smmu_inv_notifiers_mr(IOMMUMemoryRegion *mr)
 /* Unmap all notifiers of all mr's */
 void smmu_inv_notifiers_all(SMMUState *s)
 {
-    SMMUNotifierNode *node;
+    SMMUDevice *sdev;
 
-    QLIST_FOREACH(node, &s->notifiers_list, next) {
-        smmu_inv_notifiers_mr(&node->sdev->iommu);
+    QLIST_FOREACH(sdev, &s->devices_with_notifiers, next) {
+        smmu_inv_notifiers_mr(&sdev->iommu);
     }
 }
 
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -XXX,XX +XXX,XX @@ static void smmuv3_notify_iova(IOMMUMemoryRegion *mr,
 /* invalidate an asid/iova tuple in all mr's */
 static void smmuv3_inv_notifiers_iova(SMMUState *s, int asid, dma_addr_t iova)
 {
-    SMMUNotifierNode *node;
+    SMMUDevice *sdev;
 
-    QLIST_FOREACH(node, &s->notifiers_list, next) {
-        IOMMUMemoryRegion *mr = &node->sdev->iommu;
+    QLIST_FOREACH(sdev, &s->devices_with_notifiers, next) {
+        IOMMUMemoryRegion *mr = &sdev->iommu;
         IOMMUNotifier *n;
 
         trace_smmuv3_inv_notifiers_iova(mr->parent_obj.name, asid, iova);
@@ -XXX,XX +XXX,XX @@ static void smmuv3_notify_flag_changed(IOMMUMemoryRegion *iommu,
     SMMUDevice *sdev = container_of(iommu, SMMUDevice, iommu);
     SMMUv3State *s3 = sdev->smmu;
     SMMUState *s = &(s3->smmu_state);
-    SMMUNotifierNode *node = NULL;
-    SMMUNotifierNode *next_node = NULL;
 
     if (new & IOMMU_NOTIFIER_MAP) {
         int bus_num = pci_bus_num(sdev->bus);
@@ -XXX,XX +XXX,XX @@ static void smmuv3_notify_flag_changed(IOMMUMemoryRegion *iommu,
 
     if (old == IOMMU_NOTIFIER_NONE) {
         trace_smmuv3_notify_flag_add(iommu->parent_obj.name);
-        node = g_malloc0(sizeof(*node));
-        node->sdev = sdev;
-        QLIST_INSERT_HEAD(&s->notifiers_list, node, next);
-        return;
-    }
-
-    /* update notifier node with new flags */
-    QLIST_FOREACH_SAFE(node, &s->notifiers_list, next, next_node) {
-        if (node->sdev == sdev) {
-            if (new == IOMMU_NOTIFIER_NONE) {
-                trace_smmuv3_notify_flag_del(iommu->parent_obj.name);
-                QLIST_REMOVE(node, next);
-                g_free(node);
-            }
-            return;
-        }
+        QLIST_INSERT_HEAD(&s->devices_with_notifiers, sdev, next);
+    } else if (new == IOMMU_NOTIFIER_NONE) {
+        trace_smmuv3_notify_flag_del(iommu->parent_obj.name);
+        QLIST_REMOVE(sdev, next);
     }
 }
 
-- 
2.20.1
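
The SMMUNotifierNode removal above amounts to embedding the list
linkage in the tracked object instead of allocating a wrapper node.
A minimal sketch of that pattern follows. It assumes the BSD
<sys/queue.h> LIST_* macros as a stand-in for QEMU's QLIST_* macros;
Device, State and the function names are hypothetical and are not
part of the patch.

```c
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical stand-in for SMMUDevice: the list linkage lives in the object. */
typedef struct Device {
    int id;
    LIST_ENTRY(Device) next;
} Device;

/* Hypothetical stand-in for SMMUState: the head points at devices directly. */
typedef struct State {
    LIST_HEAD(, Device) devices_with_notifiers;
} State;

void state_init(State *s)
{
    LIST_INIT(&s->devices_with_notifiers);
}

/* First notifier registered: link the device itself, no allocation needed. */
void notifier_added(State *s, Device *d)
{
    LIST_INSERT_HEAD(&s->devices_with_notifiers, d, next);
}

/* Last notifier removed: unlink the device, nothing to free or search for. */
void notifier_removed(Device *d)
{
    LIST_REMOVE(d, next);
}

/* Walk the list the same way the invalidation loops in the patch do. */
int count_devices(State *s)
{
    Device *d;
    int n = 0;

    LIST_FOREACH(d, &s->devices_with_notifiers, next) {
        n++;
    }
    return n;
}
```

Because the entry is part of the device, removal does not need the
search loop that the old code used to find the matching node.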

In the stripe8() function we use a variable length array; however
we know that the maximum length required is MAX_NUM_BUSSES. Use
a fixed-length array and an assert instead.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20190328152635.2794-1-peter.maydell@linaro.org
---
 hw/ssi/xilinx_spips.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/hw/ssi/xilinx_spips.c b/hw/ssi/xilinx_spips.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/ssi/xilinx_spips.c
+++ b/hw/ssi/xilinx_spips.c
@@ -XXX,XX +XXX,XX @@ static void xlnx_zynqmp_qspips_reset(DeviceState *d)
 
 static inline void stripe8(uint8_t *x, int num, bool dir)
 {
-    uint8_t r[num];
-    memset(r, 0, sizeof(uint8_t) * num);
+    uint8_t r[MAX_NUM_BUSSES];
     int idx[2] = {0, 0};
     int bit[2] = {0, 7};
     int d = dir;
 
+    assert(num <= MAX_NUM_BUSSES);
+    memset(r, 0, sizeof(uint8_t) * num);
+
     for (idx[0] = 0; idx[0] < num; ++idx[0]) {
         for (bit[0] = 7; bit[0] >= 0; bit[0]--) {
             r[idx[!d]] |= x[idx[d]] & 1 << bit[d] ? 1 << bit[!d] : 0;
-- 
2.20.1
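
The variable-length-array change above follows a general pattern:
size the scratch buffer for the known maximum and assert that the
runtime length fits. A minimal sketch of that pattern is shown here
on an unrelated operation (byte reversal), not on stripe8() itself.
MAX_LEN and reverse_in_place() are hypothetical; the real bound in
the patch is MAX_NUM_BUSSES, whose value is not shown in this series.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical upper bound, standing in for MAX_NUM_BUSSES. */
#define MAX_LEN 4

/* Reverse the first num bytes of x using a fixed-size scratch buffer. */
void reverse_in_place(uint8_t *x, int num)
{
    uint8_t r[MAX_LEN];   /* fixed size instead of uint8_t r[num] */

    /* Check the bound before touching the buffer. */
    assert(num >= 0 && num <= MAX_LEN);
    memset(r, 0, sizeof(uint8_t) * (size_t)num);

    for (int i = 0; i < num; i++) {
        r[num - 1 - i] = x[i];
    }
    memcpy(x, r, (size_t)num);
}
```

As in the patch, the assert comes before the memset, so an
out-of-range length is caught before any write to the buffer.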

Normally configure identifies the source path by looking
at the location where the configure script itself exists.
We also provide a --source-path option which lets the user
manually override this.

There isn't really an obvious use case for the --source-path
option, and in commit 927128222b0a91f56c13a in 2017 we
accidentally added some logic that looks at $source_path
before the command line option that overrides it has been
processed.

The fact that nobody complained suggests that there isn't
any use of this option and we aren't testing it either;
remove it. This allows us to move the "make $source_path
absolute" logic up so that there is no window in the script
where $source_path is set but not yet absolute.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20190318134019.23729-1-peter.maydell@linaro.org
---
 configure | 10 ++--------
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/configure b/configure
index XXXXXXX..XXXXXXX 100755
--- a/configure
+++ b/configure
@@ -XXX,XX +XXX,XX @@ ld_has() {
 
 # default parameters
 source_path=$(dirname "$0")
+# make source path absolute
+source_path=$(cd "$source_path"; pwd)
 cpu=""
 iasl="iasl"
 interp_prefix="/usr/gnemul/qemu-%M"
@@ -XXX,XX +XXX,XX @@ for opt do
   ;;
   --cxx=*) CXX="$optarg"
   ;;
-  --source-path=*) source_path="$optarg"
-  ;;
   --cpu=*) cpu="$optarg"
   ;;
   --extra-cflags=*) QEMU_CFLAGS="$QEMU_CFLAGS $optarg"
@@ -XXX,XX +XXX,XX @@ if test "$debug_info" = "yes"; then
     LDFLAGS="-g $LDFLAGS"
 fi
 
-# make source path absolute
-source_path=$(cd "$source_path"; pwd)
-
 # running configure in the source tree?
 # we know that's the case if configure is there.
 if test -f "./configure"; then
@@ -XXX,XX +XXX,XX @@ for opt do
   ;;
   --interp-prefix=*) interp_prefix="$optarg"
   ;;
-  --source-path=*)
-  ;;
   --cross-prefix=*)
   ;;
   --cc=*)
@@ -XXX,XX +XXX,XX @@ $(echo Available targets: $default_target_list | \
   --target-list-exclude=LIST exclude a set of targets from the default target-list
 
 Advanced options (experts only):
-  --source-path=PATH       path of source code [$source_path]
   --cross-prefix=PREFIX    use PREFIX for compile tools [$cross_prefix]
   --cc=CC                  use C compiler CC [$cc]
   --iasl=IASL              use ACPI compiler IASL [$iasl]
-- 
2.20.1

For M-profile, enforce that the FPSCR bits which are RES0 there
but have defined meanings on A-profile are never settable. This
ensures that M-profile code can't enable the A-profile behaviour
(notably vector length/stride handling) by accident.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-2-peter.maydell@linaro.org
---
 target/arm/vfp_helper.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vfp_helper.c
+++ b/target/arm/vfp_helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(vfp_set_fpscr)(CPUARMState *env, uint32_t val)
         val &= ~FPCR_FZ16;
     }
 
+    if (arm_feature(env, ARM_FEATURE_M)) {
+        /*
+         * M profile FPSCR is RES0 for the QC, STRIDE, FZ16, LEN bits
+         * and also for the trapped-exception-handling bits IxE.
+         */
+        val &= 0xf7c0009f;
+    }
+
     /*
      * We don't implement trapped exception handling, so the
      * trap enable bits, IDE|IXE|UFE|OFE|DZE|IOE are all RAZ/WI (not RES0!)
-- 
2.20.1
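
The effect of the 0xf7c0009f mask in the patch above can be checked
in isolation. The sketch below is a standalone illustration, not the
QEMU helper: sanitize_fpscr() and the macro name are hypothetical,
and only the mask value is taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask value from the patch; the macro name is hypothetical. */
#define M_PROFILE_FPSCR_MASK 0xf7c0009fu

/*
 * Drop the FPSCR bits that are RES0 on M-profile:
 * QC (bit 27), Stride/FZ16/Len (bits 21:16) and the
 * trap-enable bits in bits 15:8.
 */
uint32_t sanitize_fpscr(uint32_t val, bool is_m_profile)
{
    if (is_m_profile) {
        val &= M_PROFILE_FPSCR_MASK;
    }
    return val;
}
```

For example, a write of all-ones on an M-profile CPU would keep only
the bits in the mask, while the same write on A-profile would pass
through this step unchanged.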

For M-profile the MVFR* ID registers are memory mapped, in the
range we implement via the NVIC. Allow them to be read.
(If the CPU has no FPU, these registers are defined to be RAZ.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-3-peter.maydell@linaro.org
---
 hw/intc/armv7m_nvic.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/intc/armv7m_nvic.c
+++ b/hw/intc/armv7m_nvic.c
@@ -XXX,XX +XXX,XX @@ static uint32_t nvic_readl(NVICState *s, uint32_t offset, MemTxAttrs attrs)
             return 0;
         }
         return cpu->env.v7m.sfar;
+    case 0xf40: /* MVFR0 */
+        return cpu->isar.mvfr0;
+    case 0xf44: /* MVFR1 */
+        return cpu->isar.mvfr1;
+    case 0xf48: /* MVFR2 */
+        return cpu->isar.mvfr2;
     default:
     bad_offset:
         qemu_log_mask(LOG_GUEST_ERROR, "NVIC: Bad read offset 0x%x\n", offset);
-- 
2.20.1

The M-profile floating point support has three associated config
registers: FPCAR, FPCCR and FPDSCR. It also makes the registers
CPACR and NSACR have behaviour other than reads-as-zero.
Add support for all of these as simple reads-as-written registers.
We will hook up actual functionality later.

The main complexity here is handling the FPCCR register, which
has a mix of banked and unbanked bits.

Note that we don't share storage with the A-profile
cpu->cp15.nsacr and cpu->cp15.cpacr_el1, though the behaviour
is quite similar, for two reasons:
 * the M profile CPACR is banked between security states
 * it preserves the invariant that M profile uses no state
   inside the cp15 substruct

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-4-peter.maydell@linaro.org
---
 target/arm/cpu.h      |  34 ++++++++++++
 hw/intc/armv7m_nvic.c | 125 ++++++++++++++++++++++++++++++++++++++++++
 target/arm/cpu.c      |   5 ++
 target/arm/machine.c  |  16 ++++++
 4 files changed, 180 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ typedef struct CPUARMState {
         uint32_t scr[M_REG_NUM_BANKS];
         uint32_t msplim[M_REG_NUM_BANKS];
         uint32_t psplim[M_REG_NUM_BANKS];
+        uint32_t fpcar[M_REG_NUM_BANKS];
+        uint32_t fpccr[M_REG_NUM_BANKS];
+        uint32_t fpdscr[M_REG_NUM_BANKS];
+        uint32_t cpacr[M_REG_NUM_BANKS];
+        uint32_t nsacr;
     } v7m;
 
     /* Information associated with an exception about to be taken:
@@ -XXX,XX +XXX,XX @@ FIELD(V7M_CSSELR, LEVEL, 1, 3)
  */
 FIELD(V7M_CSSELR, INDEX, 0, 4)
 
+/* v7M FPCCR bits */
+FIELD(V7M_FPCCR, LSPACT, 0, 1)
+FIELD(V7M_FPCCR, USER, 1, 1)
+FIELD(V7M_FPCCR, S, 2, 1)
+FIELD(V7M_FPCCR, THREAD, 3, 1)
+FIELD(V7M_FPCCR, HFRDY, 4, 1)
+FIELD(V7M_FPCCR, MMRDY, 5, 1)
+FIELD(V7M_FPCCR, BFRDY, 6, 1)
+FIELD(V7M_FPCCR, SFRDY, 7, 1)
+FIELD(V7M_FPCCR, MONRDY, 8, 1)
+FIELD(V7M_FPCCR, SPLIMVIOL, 9, 1)
+FIELD(V7M_FPCCR, UFRDY, 10, 1)
+FIELD(V7M_FPCCR, RES0, 11, 15)
+FIELD(V7M_FPCCR, TS, 26, 1)
+FIELD(V7M_FPCCR, CLRONRETS, 27, 1)
+FIELD(V7M_FPCCR, CLRONRET, 28, 1)
+FIELD(V7M_FPCCR, LSPENS, 29, 1)
+FIELD(V7M_FPCCR, LSPEN, 30, 1)
+FIELD(V7M_FPCCR, ASPEN, 31, 1)
+/* These bits are banked. Others are non-banked and live in the M_REG_S bank */
+#define R_V7M_FPCCR_BANKED_MASK                 \
+    (R_V7M_FPCCR_LSPACT_MASK |                  \
+     R_V7M_FPCCR_USER_MASK |                    \
+     R_V7M_FPCCR_THREAD_MASK |                  \
+     R_V7M_FPCCR_MMRDY_MASK |                   \
+     R_V7M_FPCCR_SPLIMVIOL_MASK |               \
+     R_V7M_FPCCR_UFRDY_MASK |                   \
+     R_V7M_FPCCR_ASPEN_MASK)
+
 /*
  * System register ID fields.
  */
diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/intc/armv7m_nvic.c
+++ b/hw/intc/armv7m_nvic.c
@@ -XXX,XX +XXX,XX @@ static uint32_t nvic_readl(NVICState *s, uint32_t offset, MemTxAttrs attrs)
     }
     case 0xd84: /* CSSELR */
         return cpu->env.v7m.csselr[attrs.secure];
+    case 0xd88: /* CPACR */
+        if (!arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
+            return 0;
+        }
+        return cpu->env.v7m.cpacr[attrs.secure];
+    case 0xd8c: /* NSACR */
+        if (!attrs.secure || !arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
+            return 0;
+        }
+        return cpu->env.v7m.nsacr;
     /* TODO: Implement debug registers.  */
     case 0xd90: /* MPU_TYPE */
         /* Unified MPU; if the MPU is not present this value is zero */
@@ -XXX,XX +XXX,XX @@ static uint32_t nvic_readl(NVICState *s, uint32_t offset, MemTxAttrs attrs)
             return 0;
         }
         return cpu->env.v7m.sfar;
+    case 0xf34: /* FPCCR */
+        if (!arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
+            return 0;
+        }
+        if (attrs.secure) {
+            return cpu->env.v7m.fpccr[M_REG_S];
+        } else {
+            /*
+             * NS can read LSPEN, CLRONRET and MONRDY. It can read
+             * BFRDY and HFRDY if AIRCR.BFHFNMINS != 0;
+             * other non-banked bits RAZ.
+             * TODO: MONRDY should RAZ/WI if DEMCR.SDME is set.
+             */
+            uint32_t value = cpu->env.v7m.fpccr[M_REG_S];
+            uint32_t mask = R_V7M_FPCCR_LSPEN_MASK |
+                R_V7M_FPCCR_CLRONRET_MASK |
+                R_V7M_FPCCR_MONRDY_MASK;
+
+            if (s->cpu->env.v7m.aircr & R_V7M_AIRCR_BFHFNMINS_MASK) {
+                mask |= R_V7M_FPCCR_BFRDY_MASK | R_V7M_FPCCR_HFRDY_MASK;
+            }
+
+            value &= mask;
+
+            value |= cpu->env.v7m.fpccr[M_REG_NS];
+            return value;
+        }
+    case 0xf38: /* FPCAR */
+        if (!arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
+            return 0;
+        }
+        return cpu->env.v7m.fpcar[attrs.secure];
+    case 0xf3c: /* FPDSCR */
+        if (!arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
+            return 0;
+        }
+        return cpu->env.v7m.fpdscr[attrs.secure];
     case 0xf40: /* MVFR0 */
         return cpu->isar.mvfr0;
     case 0xf44: /* MVFR1 */
@@ -XXX,XX +XXX,XX @@ static void nvic_writel(NVICState *s, uint32_t offset, uint32_t value,
             cpu->env.v7m.csselr[attrs.secure] = value & R_V7M_CSSELR_INDEX_MASK;
         }
         break;
+    case 0xd88: /* CPACR */
+        if (arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
+            /* We implement only the Floating Point extension's CP10/CP11 */
+            cpu->env.v7m.cpacr[attrs.secure] = value & (0xf << 20);
+        }
+        break;
+    case 0xd8c: /* NSACR */
+        if (attrs.secure && arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
+            /* We implement only the Floating Point extension's CP10/CP11 */
+            cpu->env.v7m.nsacr = value & (3 << 10);
+        }
+        break;
     case 0xd90: /* MPU_TYPE */
         return; /* RO */
     case 0xd94: /* MPU_CTRL */
@@ -XXX,XX +XXX,XX @@ static void nvic_writel(NVICState *s, uint32_t offset, uint32_t value,
         }
         break;
     }
+    case 0xf34: /* FPCCR */
+        if (arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
+            /* Not all bits here are banked. */
+            uint32_t fpccr_s;
+
+            if (!arm_feature(&cpu->env, ARM_FEATURE_V8)) {
+                /* Don't allow setting of bits not present in v7M */
+                value &= (R_V7M_FPCCR_LSPACT_MASK |
+                          R_V7M_FPCCR_USER_MASK |
+                          R_V7M_FPCCR_THREAD_MASK |
+                          R_V7M_FPCCR_HFRDY_MASK |
+                          R_V7M_FPCCR_MMRDY_MASK |
+                          R_V7M_FPCCR_BFRDY_MASK |
+                          R_V7M_FPCCR_MONRDY_MASK |
+                          R_V7M_FPCCR_LSPEN_MASK |
+                          R_V7M_FPCCR_ASPEN_MASK);
+            }
+            value &= ~R_V7M_FPCCR_RES0_MASK;
+
+            if (!attrs.secure) {
+                /* Some non-banked bits are configurably writable by NS */
+                fpccr_s = cpu->env.v7m.fpccr[M_REG_S];
+                if (!(fpccr_s & R_V7M_FPCCR_LSPENS_MASK)) {
+                    uint32_t lspen = FIELD_EX32(value, V7M_FPCCR, LSPEN);
+                    fpccr_s = FIELD_DP32(fpccr_s, V7M_FPCCR, LSPEN, lspen);
+                }
+                if (!(fpccr_s & R_V7M_FPCCR_CLRONRETS_MASK)) {
+                    uint32_t cor = FIELD_EX32(value, V7M_FPCCR, CLRONRET);
+                    fpccr_s = FIELD_DP32(fpccr_s, V7M_FPCCR, CLRONRET, cor);
+                }
+                if ((s->cpu->env.v7m.aircr & R_V7M_AIRCR_BFHFNMINS_MASK)) {
+                    uint32_t hfrdy = FIELD_EX32(value, V7M_FPCCR, HFRDY);
+                    uint32_t bfrdy = FIELD_EX32(value, V7M_FPCCR, BFRDY);
+                    fpccr_s = FIELD_DP32(fpccr_s, V7M_FPCCR, HFRDY, hfrdy);
+                    fpccr_s = FIELD_DP32(fpccr_s, V7M_FPCCR, BFRDY, bfrdy);
+                }
+                /* TODO MONRDY should RAZ/WI if DEMCR.SDME is set */
+                {
+                    uint32_t monrdy = FIELD_EX32(value, V7M_FPCCR, MONRDY);
+                    fpccr_s = FIELD_DP32(fpccr_s, V7M_FPCCR, MONRDY, monrdy);
+                }
+
+                /*
+                 * All other non-banked bits are RAZ/WI from NS; write
+                 * just the banked bits to fpccr[M_REG_NS].
+                 */
+                value &= R_V7M_FPCCR_BANKED_MASK;
+                cpu->env.v7m.fpccr[M_REG_NS] = value;
+            } else {
+                fpccr_s = value;
+            }
+            cpu->env.v7m.fpccr[M_REG_S] = fpccr_s;
+        }
+        break;
+    case 0xf38: /* FPCAR */
+        if (arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
+            value &= ~7;
+            cpu->env.v7m.fpcar[attrs.secure] = value;
+        }
+        break;
+    case 0xf3c: /* FPDSCR */
+        if (arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
+            value &= 0x07c00000;
+            cpu->env.v7m.fpdscr[attrs.secure] = value;
+        }
+        break;
     case 0xf50: /* ICIALLU */
     case 0xf58: /* ICIMVAU */
     case 0xf5c: /* DCIMVAC */
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_reset(CPUState *s)
             env->v7m.ccr[M_REG_S] |= R_V7M_CCR_UNALIGN_TRP_MASK;
         }
 
+        if (arm_feature(env, ARM_FEATURE_VFP)) {
+            env->v7m.fpccr[M_REG_NS] = R_V7M_FPCCR_ASPEN_MASK;
+            env->v7m.fpccr[M_REG_S] = R_V7M_FPCCR_ASPEN_MASK |
+                R_V7M_FPCCR_LSPEN_MASK | R_V7M_FPCCR_S_MASK;
+        }
         /* Unlike A/R profile, M profile defines the reset LR value */
         env->regs[14] = 0xffffffff;
 
diff --git a/target/arm/machine.c b/target/arm/machine.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/machine.c
+++ b/target/arm/machine.c
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_m_v8m = {
     }
 };
 
+static const VMStateDescription vmstate_m_fp = {
+    .name = "cpu/m/fp",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .needed = vfp_needed,
+    .fields = (VMStateField[]) {
+        VMSTATE_UINT32_ARRAY(env.v7m.fpcar, ARMCPU, M_REG_NUM_BANKS),
+        VMSTATE_UINT32_ARRAY(env.v7m.fpccr, ARMCPU, M_REG_NUM_BANKS),
+        VMSTATE_UINT32_ARRAY(env.v7m.fpdscr, ARMCPU, M_REG_NUM_BANKS),
+        VMSTATE_UINT32_ARRAY(env.v7m.cpacr, ARMCPU, M_REG_NUM_BANKS),
+        VMSTATE_UINT32(env.v7m.nsacr, ARMCPU),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
 static const VMStateDescription vmstate_m = {
     .name = "cpu/m",
     .version_id = 4,
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_m = {
         &vmstate_m_scr,
         &vmstate_m_other_sp,
         &vmstate_m_v8m,
+        &vmstate_m_fp,
         NULL
     }
 };
-- 
2.20.1

The only "system register" that M-profile floating point exposes
via the VMRS/VMSR instructions is FPSCR, and it does not have
the odd special case for rd==15. Add a check to ensure we only
expose FPSCR.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-5-peter.maydell@linaro.org
---
 target/arm/translate.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
                     }
                 }
             } else { /* !dp */
+                bool is_sysreg;
+
                 if ((insn & 0x6f) != 0x00)
                     return 1;
                 rn = VFP_SREG_N(insn);
+
+                is_sysreg = extract32(insn, 21, 1);
+
+                if (arm_dc_feature(s, ARM_FEATURE_M)) {
+                    /*
+                     * The only M-profile VFP vmrs/vmsr sysreg is FPSCR.
+                     * Writes to R15 are UNPREDICTABLE; we choose to undef.
+                     */
+                    if (is_sysreg && (rd == 15 || (rn >> 1) != ARM_VFP_FPSCR)) {
+                        return 1;
+                    }
+                }
+
                 if (insn & ARM_CP_RW_BIT) {
                     /* vfp->arm */
-                    if (insn & (1 << 21)) {
+                    if (is_sysreg) {
                         /* system register */
                         rn >>= 1;
 
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
                     }
                 } else {
                     /* arm->vfp */
-                    if (insn & (1 << 21)) {
+                    if (is_sysreg) {
                         rn >>= 1;
                         /* system register */
                         switch (rn) {
-- 
2.20.1

Like AArch64, M-profile floating point has no FPEXC enable
bit to gate floating point; so always set the VFPEN TB flag.

M-profile also has CPACR and NSACR similar to A-profile;
they behave slightly differently:
 * the CPACR is banked between Secure and Non-Secure
 * if the NSACR forces a trap then this is taken to
   the Secure state, not the Non-Secure state

Honour the CPACR and NSACR settings. The NSACR handling
requires us to borrow the exception.target_el field
(usually meaningless for M profile) to distinguish the
NOCP UsageFault taken to Secure state from the more
usual fault taken to the current security state.
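
The decision described above can be sketched in isolation as follows.
This is only an illustration with made-up names; the return values
mirror the convention used in this patch (0: no trap, 1: NOCP taken to
the current security state, 3: NOCP taken to Secure state), and the
cpacr argument is assumed to be the copy banked for the current
security state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum fp_trap {
    FP_OK = 0,            /* access permitted */
    FP_TRAP_CURRENT = 1,  /* NOCP to the current security state */
    FP_TRAP_SECURE = 3,   /* NOCP to Secure state (NSACR) */
};

/* CPACR.CP10 lives in bits [21:20] */
static bool cpacr_allows(uint32_t cpacr, bool privileged)
{
    switch ((cpacr >> 20) & 3) {
    case 1:
        return privileged;
    case 3:
        return true;
    default: /* 0, and 2 (UNPREDICTABLE) treated like 0 */
        return false;
    }
}

static enum fp_trap m_fp_access(uint32_t cpacr, uint32_t nsacr,
                                bool has_security_ext, bool secure,
                                bool privileged)
{
    /* Secure state only exists with the Security Extension */
    assert(!secure || has_security_ext);

    if (!cpacr_allows(cpacr, privileged)) {
        return FP_TRAP_CURRENT;
    }
    /* NSACR.CP10 (bit 10) gates Non-secure use of the FPU */
    if (has_security_ext && !secure && !(nsacr & (1u << 10))) {
        return FP_TRAP_SECURE;
    }
    return FP_OK;
}
```

Checking CPACR before NSACR follows the order used in the patch, so a
CPACR denial is reported to the current security state even when NSACR
would also have trapped.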

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-6-peter.maydell@linaro.org
---
 target/arm/helper.c    | 55 +++++++++++++++++++++++++++++++++++++++---
 target/arm/translate.c | 10 ++++++--
 2 files changed, 60 insertions(+), 5 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ uint32_t arm_phys_excp_target_el(CPUState *cs, uint32_t excp_idx,
     return target_el;
 }
 
+/*
+ * Return true if the v7M CPACR permits access to the FPU for the specified
+ * security state and privilege level.
+ */
+static bool v7m_cpacr_pass(CPUARMState *env, bool is_secure, bool is_priv)
+{
+    switch (extract32(env->v7m.cpacr[is_secure], 20, 2)) {
+    case 0:
+    case 2: /* UNPREDICTABLE: we treat like 0 */
+        return false;
+    case 1:
+        return is_priv;
+    case 3:
+        return true;
+    default:
+        g_assert_not_reached();
+    }
+}
+
 static bool v7m_stack_write(ARMCPU *cpu, uint32_t addr, uint32_t value,
                             ARMMMUIdx mmu_idx, bool ignfault)
 {
@@ -XXX,XX +XXX,XX @@ void arm_v7m_cpu_do_interrupt(CPUState *cs)
         env->v7m.cfsr[env->v7m.secure] |= R_V7M_CFSR_UNDEFINSTR_MASK;
         break;
     case EXCP_NOCP:
-        armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE, env->v7m.secure);
-        env->v7m.cfsr[env->v7m.secure] |= R_V7M_CFSR_NOCP_MASK;
+    {
+        /*
+         * NOCP might be directed to something other than the current
+         * security state if this fault is because of NSACR; we indicate
+         * the target security state using exception.target_el.
+         */
+        int target_secstate;
+
+        if (env->exception.target_el == 3) {
+            target_secstate = M_REG_S;
+        } else {
+            target_secstate = env->v7m.secure;
+        }
+        armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE, target_secstate);
+        env->v7m.cfsr[target_secstate] |= R_V7M_CFSR_NOCP_MASK;
         break;
+    }
     case EXCP_INVSTATE:
         armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE, env->v7m.secure);
         env->v7m.cfsr[env->v7m.secure] |= R_V7M_CFSR_INVSTATE_MASK;
@@ -XXX,XX +XXX,XX @@ int fp_exception_el(CPUARMState *env, int cur_el)
         return 0;
     }
 
+    if (arm_feature(env, ARM_FEATURE_M)) {
+        /* CPACR can cause a NOCP UsageFault taken to current security state */
+        if (!v7m_cpacr_pass(env, env->v7m.secure, cur_el != 0)) {
+            return 1;
+        }
+
+        if (arm_feature(env, ARM_FEATURE_M_SECURITY) && !env->v7m.secure) {
+            if (!extract32(env->v7m.nsacr, 10, 1)) {
+                /* FP insns cause a NOCP UsageFault taken to Secure */
+                return 3;
+            }
+        }
+
+        return 0;
+    }
+
     /* The CPACR controls traps to EL1, or PL1 if we're 32 bit:
      * 0, 2 : trap EL0 and EL1/PL1 accesses
      * 1    : trap only EL0 accesses
@@ -XXX,XX +XXX,XX @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
         flags = FIELD_DP32(flags, TBFLAG_A32, SCTLR_B, arm_sctlr_b(env));
         flags = FIELD_DP32(flags, TBFLAG_A32, NS, !access_secure_reg(env));
         if (env->vfp.xregs[ARM_VFP_FPEXC] & (1 << 30)
-            || arm_el_is_aa64(env, 1)) {
+            || arm_el_is_aa64(env, 1) || arm_feature(env, ARM_FEATURE_M)) {
             flags = FIELD_DP32(flags, TBFLAG_A32, VFPEN, 1);
         }
         flags = FIELD_DP32(flags, TBFLAG_A32, XSCALE_CPAR, env->cp15.c15_cpar);
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
      * for attempts to execute invalid vfp/neon encodings with FP disabled.
      */
     if (s->fp_excp_el) {
-        gen_exception_insn(s, 4, EXCP_UDEF,
-                           syn_fp_access_trap(1, 0xe, false), s->fp_excp_el);
+        if (arm_dc_feature(s, ARM_FEATURE_M)) {
+            gen_exception_insn(s, 4, EXCP_NOCP, syn_uncategorized(),
+                               s->fp_excp_el);
+        } else {
+            gen_exception_insn(s, 4, EXCP_UDEF,
+                               syn_fp_access_trap(1, 0xe, false),
+                               s->fp_excp_el);
+        }
         return 0;
     }
 
-- 
2.20.1

Correct the decode of the M-profile "coprocessor and
floating-point instructions" space:
 * op0 == 0b11 is always unallocated
 * if the CPU has an FPU then all insns with op1 == 0b101
   are floating point and go to disas_vfp_insn()

For the moment we leave VLLDM and VLSTM as NOPs; in
a later commit we will fill in the proper implementation
for the case where an FPU is present.
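
The decode order described above can be summarised with the following
standalone sketch. The enum and function are hypothetical and only
classify the instruction; the masks are the ones used in this patch,
and the caller is assumed to have already established that the insn
lies in the 0b111x_11xx coprocessor space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum m_cp_class {
    CP_UNALLOCATED,  /* op0 == 0b11 */
    CP_VLLDM_VLSTM,  /* v8M lazy FP state load/store */
    CP_FP,           /* handled by the FP decoder */
    CP_NOCP,         /* everything else: NOCP fault */
};

static enum m_cp_class m_classify_cp_insn(uint32_t insn, bool is_v8m,
                                          bool has_fpu)
{
    /* Precondition: 0b111x_11xx_xxxx_xxxx_xxxx_xxxx_xxxx_xxxx */
    assert((insn & 0xec000000u) == 0xec000000u);

    if (((insn >> 24) & 3) == 3) {
        return CP_UNALLOCATED;
    }
    /* VLLDM/VLSTM are decoded before the generic FP check */
    if (is_v8m && (insn & 0xffa00f00u) == 0xec200a00u) {
        return CP_VLLDM_VLSTM;
    }
    /* Coprocessor field 0b101x means CP10/CP11, i.e. floating point */
    if (has_fpu && ((insn >> 8) & 0xe) == 0xa) {
        return CP_FP;
    }
    return CP_NOCP;
}
```

The ordering matters: VLLDM and VLSTM also carry a coprocessor field
of 0b1010, so testing them after the generic FP check would route them
to the FP decoder on CPUs with an FPU.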

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-7-peter.maydell@linaro.org
---
 target/arm/translate.c | 26 ++++++++++++++++++++++----
 1 file changed, 22 insertions(+), 4 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
     case 6: case 7: case 14: case 15:
         /* Coprocessor.  */
         if (arm_dc_feature(s, ARM_FEATURE_M)) {
-            /* We don't currently implement M profile FP support,
-             * so this entire space should give a NOCP fault, with
-             * the exception of the v8M VLLDM and VLSTM insns, which
-             * must be NOPs in Secure state and UNDEF in Nonsecure state.
+            /* 0b111x_11xx_xxxx_xxxx_xxxx_xxxx_xxxx_xxxx */
+            if (extract32(insn, 24, 2) == 3) {
+                goto illegal_op; /* op0 = 0b11 : unallocated */
+            }
+
+            /*
+             * Decode VLLDM and VLSTM first: these are nonstandard because:
+             *  * if there is no FPU then these insns must NOP in
+             *    Secure state and UNDEF in Nonsecure state
+             *  * if there is an FPU then these insns do not have
+             *    the usual behaviour that disas_vfp_insn() provides of
+             *    being controlled by CPACR/NSACR enable bits or the
+             *    lazy-stacking logic.
              */
             if (arm_dc_feature(s, ARM_FEATURE_V8) &&
                 (insn & 0xffa00f00) == 0xec200a00) {
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
                 /* Just NOP since FP support is not implemented */
                 break;
             }
+            if (arm_dc_feature(s, ARM_FEATURE_VFP) &&
+                ((insn >> 8) & 0xe) == 10) {
+                /* FP, and the CPU supports it */
+                if (disas_vfp_insn(s, insn)) {
+                    goto illegal_op;
+                }
+                break;
+            }
+
             /* All other insns: NOCP */
             gen_exception_insn(s, 4, EXCP_NOCP, syn_uncategorized(),
                                default_exception_el(s));
-- 
2.20.1

If the floating point extension is present, then the SG instruction
must clear the CONTROL_S.SFPA bit. Implement this.

(On a no-FPU system the bit will always be zero, so we don't need
to make the clearing of the bit conditional on ARM_FEATURE_VFP.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-8-peter.maydell@linaro.org
---
 target/arm/helper.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static bool v7m_handle_execute_nsc(ARMCPU *cpu)
     qemu_log_mask(CPU_LOG_INT, "...really an SG instruction at 0x%08" PRIx32
                   ", executing it\n", env->regs[15]);
     env->regs[14] &= ~1;
+    env->v7m.control[M_REG_S] &= ~R_V7M_CONTROL_SFPA_MASK;
     switch_v7m_security_state(env, true);
     xpsr_write(env, 0, XPSR_IT);
     env->regs[15] += 4;
-- 
2.20.1

The M-profile CONTROL register has two bits, SFPA and FPCA, which
relate to floating-point support and are RES0 if the FPU is not present.
Handle them correctly in the MSR/MRS register access code.
Neither is banked between security states, so they are stored
in v7m.control[M_REG_S] regardless of current security state.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-9-peter.maydell@linaro.org
---
 target/arm/helper.c | 57 ++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 49 insertions(+), 8 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(v7m_mrs)(CPUARMState *env, uint32_t reg)
         return xpsr_read(env) & mask;
         break;
     case 20: /* CONTROL */
-        return env->v7m.control[env->v7m.secure];
+    {
+        uint32_t value = env->v7m.control[env->v7m.secure];
+        if (!env->v7m.secure) {
+            /* SFPA is RAZ/WI from NS; FPCA is stored in the M_REG_S bank */
+            value |= env->v7m.control[M_REG_S] & R_V7M_CONTROL_FPCA_MASK;
+        }
+        return value;
+    }
     case 0x94: /* CONTROL_NS */
         /* We have to handle this here because unprivileged Secure code
          * can read the NS CONTROL register.
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(v7m_mrs)(CPUARMState *env, uint32_t reg)
         if (!env->v7m.secure) {
             return 0;
         }
-        return env->v7m.control[M_REG_NS];
+        return env->v7m.control[M_REG_NS] |
+            (env->v7m.control[M_REG_S] & R_V7M_CONTROL_FPCA_MASK);
     }
 
     if (el == 0) {
@@ -XXX,XX +XXX,XX @@ void HELPER(v7m_msr)(CPUARMState *env, uint32_t maskreg, uint32_t val)
      */
     uint32_t mask = extract32(maskreg, 8, 4);
     uint32_t reg = extract32(maskreg, 0, 8);
+    int cur_el = arm_current_el(env);
 
-    if (arm_current_el(env) == 0 && reg > 7) {
-        /* only xPSR sub-fields may be written by unprivileged */
+    if (cur_el == 0 && reg > 7 && reg != 20) {
+        /*
+         * only xPSR sub-fields and CONTROL.SFPA may be written by
+         * unprivileged code
+         */
         return;
     }
 
@@ -XXX,XX +XXX,XX @@ void HELPER(v7m_msr)(CPUARMState *env, uint32_t maskreg, uint32_t val)
                 env->v7m.control[M_REG_NS] &= ~R_V7M_CONTROL_NPRIV_MASK;
                 env->v7m.control[M_REG_NS] |= val & R_V7M_CONTROL_NPRIV_MASK;
             }
+            /*
+             * SFPA is RAZ/WI from NS. FPCA is RO if NSACR.CP10 == 0,
+             * RES0 if the FPU is not present, and is stored in the S bank
+             */
+            if (arm_feature(env, ARM_FEATURE_VFP) &&
+                extract32(env->v7m.nsacr, 10, 1)) {
+                env->v7m.control[M_REG_S] &= ~R_V7M_CONTROL_FPCA_MASK;
+                env->v7m.control[M_REG_S] |= val & R_V7M_CONTROL_FPCA_MASK;
+            }
             return;
         case 0x98: /* SP_NS */
         {
@@ -XXX,XX +XXX,XX @@ void HELPER(v7m_msr)(CPUARMState *env, uint32_t maskreg, uint32_t val)
         env->v7m.faultmask[env->v7m.secure] = val & 1;
         break;
     case 20: /* CONTROL */
-        /* Writing to the SPSEL bit only has an effect if we are in
+        /*
+         * Writing to the SPSEL bit only has an effect if we are in
          * thread mode; other bits can be updated by any privileged code.
          * write_v7m_control_spsel() deals with updating the SPSEL bit in
          * env->v7m.control, so we only need update the others.
          * For v7M, we must just ignore explicit writes to SPSEL in handler
          * mode; for v8M the write is permitted but will have no effect.
+         * All these bits are writes-ignored from non-privileged code,
+         * except for SFPA.
          */
-        if (arm_feature(env, ARM_FEATURE_V8) ||
-            !arm_v7m_is_handler_mode(env)) {
+        if (cur_el > 0 && (arm_feature(env, ARM_FEATURE_V8) ||
+                           !arm_v7m_is_handler_mode(env))) {
             write_v7m_control_spsel(env, (val & R_V7M_CONTROL_SPSEL_MASK) != 0);
         }
-        if (arm_feature(env, ARM_FEATURE_M_MAIN)) {
+        if (cur_el > 0 && arm_feature(env, ARM_FEATURE_M_MAIN)) {
             env->v7m.control[env->v7m.secure] &= ~R_V7M_CONTROL_NPRIV_MASK;
             env->v7m.control[env->v7m.secure] |= val & R_V7M_CONTROL_NPRIV_MASK;
         }
+        if (arm_feature(env, ARM_FEATURE_VFP)) {
+            /*
+             * SFPA is RAZ/WI from NS or if no FPU.
+             * FPCA is RO if NSACR.CP10 == 0, RES0 if the FPU is not present.
+             * Both are stored in the S bank.
+             */
+            if (env->v7m.secure) {
+                env->v7m.control[M_REG_S] &= ~R_V7M_CONTROL_SFPA_MASK;
+                env->v7m.control[M_REG_S] |= val & R_V7M_CONTROL_SFPA_MASK;
+            }
+            if (cur_el > 0 &&
+                (env->v7m.secure || !arm_feature(env, ARM_FEATURE_M_SECURITY) ||
+                 extract32(env->v7m.nsacr, 10, 1))) {
+                env->v7m.control[M_REG_S] &= ~R_V7M_CONTROL_FPCA_MASK;
+                env->v7m.control[M_REG_S] |= val & R_V7M_CONTROL_FPCA_MASK;
+            }
+        }
         break;
     default:
     bad_reg:
-- 
2.20.1

Currently the code in v7m_push_stack() which detects a violation
of the v8M stack limit simply returns early. This is OK for the
current integer-only code, but won't work for the floating point
handling we're about to add: we need to continue executing the
rest of the function so that we check for other exceptions (such
as not having permission to use the FPU) and correctly set the
FPCCR state if we are doing lazy stacking. Refactor to avoid the
early return.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-10-peter.maydell@linaro.org
---
 target/arm/helper.c | 23 ++++++++++++++++++-----
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static bool v7m_push_stack(ARMCPU *cpu)
      * should ignore further stack faults trying to process
      * that derived exception.)
      */
-    bool stacked_ok;
+    bool stacked_ok = true, limitviol = false;
     CPUARMState *env = &cpu->env;
     uint32_t xpsr = xpsr_read(env);
     uint32_t frameptr = env->regs[13];
@@ -XXX,XX +XXX,XX @@ static bool v7m_push_stack(ARMCPU *cpu)
             armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE,
                                     env->v7m.secure);
             env->regs[13] = limit;
-            return true;
+            /*
+             * We won't try to perform any further memory accesses but
+             * we must continue through the following code to check for
+             * permission faults during FPU state preservation, and we
+             * must update FPCCR if lazy stacking is enabled.
+             */
+            limitviol = true;
+            stacked_ok = false;
         }
     }
 
@@ -XXX,XX +XXX,XX @@ static bool v7m_push_stack(ARMCPU *cpu)
      * (which may be taken in preference to the one we started with
      * if it has higher priority).
      */
-    stacked_ok =
+    stacked_ok = stacked_ok &&
         v7m_stack_write(cpu, frameptr, env->regs[0], mmu_idx, false) &&
         v7m_stack_write(cpu, frameptr + 4, env->regs[1], mmu_idx, false) &&
         v7m_stack_write(cpu, frameptr + 8, env->regs[2], mmu_idx, false) &&
@@ -XXX,XX +XXX,XX @@ static bool v7m_push_stack(ARMCPU *cpu)
         v7m_stack_write(cpu, frameptr + 24, env->regs[15], mmu_idx, false) &&
         v7m_stack_write(cpu, frameptr + 28, xpsr, mmu_idx, false);
 
-    /* Update SP regardless of whether any of the stack accesses failed. */
-    env->regs[13] = frameptr;
+    /*
+     * If we broke a stack limit then SP was already updated earlier;
+     * otherwise we update SP regardless of whether any of the stack
+     * accesses failed or we took some other kind of fault.
+     */
+    if (!limitviol) {
+        env->regs[13] = frameptr;
+    }
 
     return !stacked_ok;
 }
-- 
2.20.1

Handle floating point registers in exception entry.
This corresponds to the FP-specific parts of the pseudocode
functions ActivateException() and PushStack().

We defer the code corresponding to UpdateFPCCR() to a later patch.
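
For orientation, the frame size selection done on exception entry can
be sketched on its own as below. The function name is hypothetical; the
three sizes are the ones used in this patch, and the word counts in the
comments reflect my reading of the architectural frame layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FRAME_BASIC   0x20u  /* r0-r3, r12, lr, pc, xpsr */
#define FRAME_FP      0x68u  /* basic + s0-s15, FPSCR, reserved word */
#define FRAME_FP_FULL 0xa8u  /* FP frame + s16-s31 */

static_assert(FRAME_FP - FRAME_BASIC == 18 * 4, "18 extra words");
static_assert(FRAME_FP_FULL - FRAME_FP == 16 * 4, "16 extra words");

static uint32_t m_exc_frame_size(bool fpca, bool secure, bool nsacr_cp10,
                                 bool fpccr_ts)
{
    /* FP context is only stacked if active and accessible from here */
    if (fpca && (secure || nsacr_cp10)) {
        /* Secure code with FPCCR.TS set also stacks s16-s31 */
        return (secure && fpccr_ts) ? FRAME_FP_FULL : FRAME_FP;
    }
    return FRAME_BASIC;
}
```

Note that a Non-secure exception entry with NSACR.CP10 clear falls back
to the basic frame even when FPCA is set.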

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-11-peter.maydell@linaro.org
---
 target/arm/helper.c | 98 +++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 95 insertions(+), 3 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void v7m_exception_taken(ARMCPU *cpu, uint32_t lr, bool dotailchain,
     switch_v7m_security_state(env, targets_secure);
     write_v7m_control_spsel(env, 0);
     arm_clear_exclusive(env);
+    /* Clear SFPA and FPCA (has no effect if no FPU) */
+    env->v7m.control[M_REG_S] &=
+        ~(R_V7M_CONTROL_FPCA_MASK | R_V7M_CONTROL_SFPA_MASK);
     /* Clear IT bits */
     env->condexec_bits = 0;
     env->regs[14] = lr;
@@ -XXX,XX +XXX,XX @@ static bool v7m_push_stack(ARMCPU *cpu)
     uint32_t xpsr = xpsr_read(env);
     uint32_t frameptr = env->regs[13];
     ARMMMUIdx mmu_idx = arm_mmu_idx(env);
+    uint32_t framesize;
+    bool nsacr_cp10 = extract32(env->v7m.nsacr, 10, 1);
+
+    if ((env->v7m.control[M_REG_S] & R_V7M_CONTROL_FPCA_MASK) &&
+        (env->v7m.secure || nsacr_cp10)) {
+        if (env->v7m.secure &&
+            env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_TS_MASK) {
+            framesize = 0xa8;
+        } else {
+            framesize = 0x68;
+        }
+    } else {
+        framesize = 0x20;
+    }
 
     /* Align stack pointer if the guest wants that */
     if ((frameptr & 4) &&
@@ -XXX,XX +XXX,XX @@ static bool v7m_push_stack(ARMCPU *cpu)
         xpsr |= XPSR_SPREALIGN;
     }
 
-    frameptr -= 0x20;
+    xpsr &= ~XPSR_SFPA;
+    if (env->v7m.secure &&
+        (env->v7m.control[M_REG_S] & R_V7M_CONTROL_SFPA_MASK)) {
+        xpsr |= XPSR_SFPA;
+    }
+
+    frameptr -= framesize;
 
     if (arm_feature(env, ARM_FEATURE_V8)) {
         uint32_t limit = v7m_sp_limit(env);
@@ -XXX,XX +XXX,XX @@ static bool v7m_push_stack(ARMCPU *cpu)
         v7m_stack_write(cpu, frameptr + 24, env->regs[15], mmu_idx, false) &&
         v7m_stack_write(cpu, frameptr + 28, xpsr, mmu_idx, false);
 
+    if (env->v7m.control[M_REG_S] & R_V7M_CONTROL_FPCA_MASK) {
+        /* FPU is active, try to save its registers */
+        bool fpccr_s = env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_S_MASK;
+        bool lspact = env->v7m.fpccr[fpccr_s] & R_V7M_FPCCR_LSPACT_MASK;
+
+        if (lspact && arm_feature(env, ARM_FEATURE_M_SECURITY)) {
+            qemu_log_mask(CPU_LOG_INT,
+                          "...SecureFault because LSPACT and FPCA both set\n");
+            env->v7m.sfsr |= R_V7M_SFSR_LSERR_MASK;
+            armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_SECURE, false);
+        } else if (!env->v7m.secure && !nsacr_cp10) {
+            qemu_log_mask(CPU_LOG_INT,
+                          "...Secure UsageFault with CFSR.NOCP because "
+                          "NSACR.CP10 prevents stacking FP regs\n");
+            armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE, M_REG_S);
+            env->v7m.cfsr[M_REG_S] |= R_V7M_CFSR_NOCP_MASK;
+        } else {
+            if (!(env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_LSPEN_MASK)) {
+                /* Lazy stacking disabled, save registers now */
+                int i;
+                bool cpacr_pass = v7m_cpacr_pass(env, env->v7m.secure,
+                                                 arm_current_el(env) != 0);
+
+                if (stacked_ok && !cpacr_pass) {
+                    /*
+                     * Take UsageFault if CPACR forbids access. The pseudocode
+                     * here does a full CheckCPEnabled() but we know the NSACR
+                     * check can never fail as we have already handled that.
+                     */
+                    qemu_log_mask(CPU_LOG_INT,
+                                  "...UsageFault with CFSR.NOCP because "
+                                  "CPACR.CP10 prevents stacking FP regs\n");
+                    armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE,
+                                            env->v7m.secure);
+                    env->v7m.cfsr[env->v7m.secure] |= R_V7M_CFSR_NOCP_MASK;
+                    stacked_ok = false;
+                }
+
+                for (i = 0; i < ((framesize == 0xa8) ? 32 : 16); i += 2) {
+                    uint64_t dn = *aa32_vfp_dreg(env, i / 2);
+                    uint32_t faddr = frameptr + 0x20 + 4 * i;
+                    uint32_t slo = extract64(dn, 0, 32);
+                    uint32_t shi = extract64(dn, 32, 32);
+
+                    if (i >= 16) {
+                        faddr += 8; /* skip the slot for the FPSCR */
+                    }
+                    stacked_ok = stacked_ok &&
+                        v7m_stack_write(cpu, faddr, slo, mmu_idx, false) &&
+                        v7m_stack_write(cpu, faddr + 4, shi, mmu_idx, false);
+                }
+                stacked_ok = stacked_ok &&
+                    v7m_stack_write(cpu, frameptr + 0x60,
+                                    vfp_get_fpscr(env), mmu_idx, false);
+                if (cpacr_pass) {
+                    for (i = 0; i < ((framesize == 0xa8) ? 32 : 16); i += 2) {
+                        *aa32_vfp_dreg(env, i / 2) = 0;
+                    }
+                    vfp_set_fpscr(env, 0);
+                }
+            } else {
+                /* Lazy stacking enabled, save necessary info to stack later */
+                /* TODO : equivalent of UpdateFPCCR() pseudocode */
+            }
+        }
+    }
+
     /*
      * If we broke a stack limit then SP was already updated earlier;
      * otherwise we update SP regardless of whether any of the stack
@@ -XXX,XX +XXX,XX @@ void arm_v7m_cpu_do_interrupt(CPUState *cs)
 
     if (arm_feature(env, ARM_FEATURE_V8)) {
         lr = R_V7M_EXCRET_RES1_MASK |
-            R_V7M_EXCRET_DCRS_MASK |
-            R_V7M_EXCRET_FTYPE_MASK;
+            R_V7M_EXCRET_DCRS_MASK;
         /* The S bit indicates whether we should return to Secure
          * or NonSecure (ie our current state).
          * The ES bit indicates whether we're taking this exception
@@ -XXX,XX +XXX,XX @@ void arm_v7m_cpu_do_interrupt(CPUState *cs)
         if (env->v7m.secure) {
             lr |= R_V7M_EXCRET_S_MASK;
         }
+        if (!(env->v7m.control[M_REG_S] & R_V7M_CONTROL_FPCA_MASK)) {
+            lr |= R_V7M_EXCRET_FTYPE_MASK;
+        }
     } else {
         lr = R_V7M_EXCRET_RES1_MASK |
             R_V7M_EXCRET_S_MASK |
-- 
2.20.1

Implement the code which updates the FPCCR register on an
exception entry where we are going to use lazy FP stacking.
We have to defer to the NVIC to determine whether the
various exceptions are currently ready or not.
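
The "ready" test can be sketched in isolation as follows. This is a
simplified model under stated assumptions: the vector state is passed in
as plain parameters rather than looked up in the NVIC, the names are
hypothetical, and MODEL_EXCP_HARD simply mirrors the architectural
HardFault exception number.

```c
#include <stdbool.h>

#define MODEL_EXCP_HARD 3 /* HardFault exception number */

/*
 * Hypothetical model of the readiness check: an exception is ready if
 * it is enabled and its group priority is numerically lower (that is,
 * more urgent) than the current execution priority.
 */
bool model_ready_status(int irq, bool enabled, int group_prio,
                        int running_prio)
{
    if (irq == MODEL_EXCP_HARD) {
        /* HardFault is always compared against -1 and is never disabled */
        return running_prio > -1;
    }
    return enabled && group_prio < running_prio;
}
```

Lower numbers mean higher priority, so an exception at the same group
priority as the running code is not ready.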

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20190416125744.27770-12-peter.maydell@linaro.org
---
 target/arm/cpu.h      | 14 +++++++++
 hw/intc/armv7m_nvic.c | 34 ++++++++++++++++++++++
 target/arm/helper.c   | 67 ++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 114 insertions(+), 1 deletion(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ void armv7m_nvic_acknowledge_irq(void *opaque);
  * (Ignoring -1, this is the same as the RETTOBASE value before completion.)
  */
 int armv7m_nvic_complete_irq(void *opaque, int irq, bool secure);
+/**
+ * armv7m_nvic_get_ready_status(void *opaque, int irq, bool secure)
+ * @opaque: the NVIC
+ * @irq: the exception number to query

+ * @secure: false for non-banked exceptions or for the nonsecure
+ * version of a banked exception, true for the secure version of a banked
+ * exception.
+ *
+ * Return whether an exception is "ready", i.e. whether the exception is
+ * enabled and is configured at a priority which would allow it to
+ * interrupt the current execution priority. This controls whether the
+ * RDY bit for it in the FPCCR is set.
+ */
+bool armv7m_nvic_get_ready_status(void *opaque, int irq, bool secure);
 /**
  * armv7m_nvic_raw_execution_priority: return the raw execution priority
  * @opaque: the NVIC
diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/intc/armv7m_nvic.c
+++ b/hw/intc/armv7m_nvic.c
@@ -XXX,XX +XXX,XX @@ int armv7m_nvic_complete_irq(void *opaque, int irq, bool secure)
     return ret;
 }
 
+bool armv7m_nvic_get_ready_status(void *opaque, int irq, bool secure)
+{
+    /*
+     * Return whether an exception is "ready", i.e. it is enabled and is
+     * configured at a priority which would allow it to interrupt the
+     * current execution priority.
+     *
+     * irq and secure have the same semantics as for armv7m_nvic_set_pending():
+     * for non-banked exceptions secure is always false; for banked exceptions
+     * it indicates which of the exceptions is required.
+     */
+    NVICState *s = (NVICState *)opaque;
+    bool banked = exc_is_banked(irq);
+    VecInfo *vec;
+    int running = nvic_exec_prio(s);
+
+    assert(irq > ARMV7M_EXCP_RESET && irq < s->num_irq);
+    assert(!secure || banked);
+
+    /*
+     * HardFault is an odd special case: we always check against -1,
+     * even if we're secure and HardFault has priority -3; we never
+     * need to check for enabled state.
+     */
+    if (irq == ARMV7M_EXCP_HARD) {
+        return running > -1;
+    }
+
+    vec = (banked && secure) ? &s->sec_vectors[irq] : &s->vectors[irq];
+
+    return vec->enabled &&
+        exc_group_prio(s, vec->prio, secure) < running;
+}
+
 /* callback when external interrupt line is changed */
 static void set_irq_level(void *opaque, int n, int level)
 {
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void v7m_exception_taken(ARMCPU *cpu, uint32_t lr, bool dotailchain,
     env->thumb = addr & 1;
 }
 
+static void v7m_update_fpccr(CPUARMState *env, uint32_t frameptr,
+                             bool apply_splim)
+{
+    /*
+     * Like the pseudocode UpdateFPCCR: save state in FPCAR and FPCCR
+     * that we will need later in order to do lazy FP reg stacking.
+     */
+    bool is_secure = env->v7m.secure;
+    void *nvic = env->nvic;
+    /*
+     * Some bits are unbanked and live always in fpccr[M_REG_S]; some bits
+     * are banked and we want to update the bit in the bank for the
+     * current security state; and in one case we want to specifically
+     * update the NS banked version of a bit even if we are secure.
+     */
+    uint32_t *fpccr_s = &env->v7m.fpccr[M_REG_S];
+    uint32_t *fpccr_ns = &env->v7m.fpccr[M_REG_NS];
+    uint32_t *fpccr = &env->v7m.fpccr[is_secure];
+    bool hfrdy, bfrdy, mmrdy, ns_ufrdy, s_ufrdy, sfrdy, monrdy;
+
+    env->v7m.fpcar[is_secure] = frameptr & ~0x7;
+
+    if (apply_splim && arm_feature(env, ARM_FEATURE_V8)) {
+        bool splimviol;
+        uint32_t splim = v7m_sp_limit(env);
+        bool ign = armv7m_nvic_neg_prio_requested(nvic, is_secure) &&
+            (env->v7m.ccr[is_secure] & R_V7M_CCR_STKOFHFNMIGN_MASK);
+
+        splimviol = !ign && frameptr < splim;
+        *fpccr = FIELD_DP32(*fpccr, V7M_FPCCR, SPLIMVIOL, splimviol);
+    }
+
+    *fpccr = FIELD_DP32(*fpccr, V7M_FPCCR, LSPACT, 1);
+
+    *fpccr_s = FIELD_DP32(*fpccr_s, V7M_FPCCR, S, is_secure);
+
+    *fpccr = FIELD_DP32(*fpccr, V7M_FPCCR, USER, arm_current_el(env) == 0);
+
+    *fpccr = FIELD_DP32(*fpccr, V7M_FPCCR, THREAD,
+                        !arm_v7m_is_handler_mode(env));
+
+    hfrdy = armv7m_nvic_get_ready_status(nvic, ARMV7M_EXCP_HARD, false);
+    *fpccr_s = FIELD_DP32(*fpccr_s, V7M_FPCCR, HFRDY, hfrdy);
+
+    bfrdy = armv7m_nvic_get_ready_status(nvic, ARMV7M_EXCP_BUS, false);
+    *fpccr_s = FIELD_DP32(*fpccr_s, V7M_FPCCR, BFRDY, bfrdy);
+
+    mmrdy = armv7m_nvic_get_ready_status(nvic, ARMV7M_EXCP_MEM, is_secure);
+    *fpccr = FIELD_DP32(*fpccr, V7M_FPCCR, MMRDY, mmrdy);
+
+    ns_ufrdy = armv7m_nvic_get_ready_status(nvic, ARMV7M_EXCP_USAGE, false);
+    *fpccr_ns = FIELD_DP32(*fpccr_ns, V7M_FPCCR, UFRDY, ns_ufrdy);
+
+    monrdy = armv7m_nvic_get_ready_status(nvic, ARMV7M_EXCP_DEBUG, false);
+    *fpccr_s = FIELD_DP32(*fpccr_s, V7M_FPCCR, MONRDY, monrdy);
+
+    if (arm_feature(env, ARM_FEATURE_M_SECURITY)) {
+        s_ufrdy = armv7m_nvic_get_ready_status(nvic, ARMV7M_EXCP_USAGE, true);
+        *fpccr_s = FIELD_DP32(*fpccr_s, V7M_FPCCR, UFRDY, s_ufrdy);
+
+        sfrdy = armv7m_nvic_get_ready_status(nvic, ARMV7M_EXCP_SECURE, false);
+        *fpccr_s = FIELD_DP32(*fpccr_s, V7M_FPCCR, SFRDY, sfrdy);
+    }
+}
+
 static bool v7m_push_stack(ARMCPU *cpu)
 {
     /* Do the "set up stack frame" part of exception entry,
@@ -XXX,XX +XXX,XX @@ static bool v7m_push_stack(ARMCPU *cpu)
                 }
             } else {
                 /* Lazy stacking enabled, save necessary info to stack later */
-                /* TODO : equivalent of UpdateFPCCR() pseudocode */
+                v7m_update_fpccr(env, frameptr + 0x20, true);
             }
         }
     }
-- 
2.20.1

For v8M floating point support, transitions from Secure
to Non-secure state via BXNS and BLXNS must clear the
CONTROL.SFPA bit. (This corresponds to the pseudocode
BranchToNS() function.)
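
A minimal sketch of the BXNS case, assuming a hypothetical helper that
takes and returns the Secure CONTROL value; the SFPA bit position is the
architectural one (bit 3):

```c
#include <stdint.h>

#define MODEL_CONTROL_SFPA (1u << 3) /* CONTROL.SFPA */

/*
 * Hypothetical model: bit 0 of the branch target selects the
 * destination security state (0 = Non-secure). SFPA is cleared only
 * when we really do leave Secure state.
 */
uint32_t model_bxns_control(uint32_t control_s, uint32_t dest)
{
    if (!(dest & 1)) {
        control_s &= ~MODEL_CONTROL_SFPA;
    }
    return control_s;
}
```

In the diff below the BLXNS helper clears the bit unconditionally at the
point where it switches to Non-secure state.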

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-13-peter.maydell@linaro.org
---
 target/arm/helper.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(v7m_bxns)(CPUARMState *env, uint32_t dest)
     /* translate.c should have made BXNS UNDEF unless we're secure */
     assert(env->v7m.secure);
 
+    if (!(dest & 1)) {
+        env->v7m.control[M_REG_S] &= ~R_V7M_CONTROL_SFPA_MASK;
+    }
     switch_v7m_security_state(env, dest & 1);
     env->thumb = 1;
     env->regs[15] = dest & ~1;
@@ -XXX,XX +XXX,XX @@ void HELPER(v7m_blxns)(CPUARMState *env, uint32_t dest)
          */
         write_v7m_exception(env, 1);
     }
+    env->v7m.control[M_REG_S] &= ~R_V7M_CONTROL_SFPA_MASK;
     switch_v7m_security_state(env, 0);
     env->thumb = 1;
     env->regs[15] = dest;
-- 
2.20.1

The TailChain() pseudocode specifies that a tail chaining
exception should sanitize the excReturn all-ones bits and
(if there is no FPU) the excReturn FType bits; we weren't
doing this.
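
The sanitizing step amounts to the following, shown here as a standalone
sketch with hypothetical names; FType is bit 4 of the exception return
value and the all-ones prefix is bits [31:24]:

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_EXCRET_FTYPE (1u << 4) /* EXC_RETURN.FType */

/* Hypothetical model of the LR fixups applied when tail chaining */
uint32_t model_sanitize_lr(uint32_t lr, bool have_fpu)
{
    if (!have_fpu) {
        /* Without an FPU, FType must read as 1 (standard frame) */
        lr |= MODEL_EXCRET_FTYPE;
    }
    /* Force the top byte to all-ones */
    lr |= 0xff000000u;
    return lr;
}
```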

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-14-peter.maydell@linaro.org
---
 target/arm/helper.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void v7m_exception_taken(ARMCPU *cpu, uint32_t lr, bool dotailchain,
     qemu_log_mask(CPU_LOG_INT, "...taking pending %s exception %d\n",
                   targets_secure ? "secure" : "nonsecure", exc);
 
+    if (dotailchain) {
+        /* Sanitize LR FType and PREFIX bits */
+        if (!arm_feature(env, ARM_FEATURE_VFP)) {
+            lr |= R_V7M_EXCRET_FTYPE_MASK;
+        }
+        lr = deposit32(lr, 24, 8, 0xff);
+    }
+
     if (arm_feature(env, ARM_FEATURE_V8)) {
         if (arm_feature(env, ARM_FEATURE_M_SECURITY) &&
             (lr & R_V7M_EXCRET_S_MASK)) {
-- 
2.20.1

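The magic value pushed onto the callee stack as an integrity
check is different if floating point is present: bit 0 of the
signature is then taken from the FType bit of the exception
return value rather than always being 1.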
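
For reference, the signature computation can be modelled on its own like
this. The names are hypothetical; the constant and the FType bit position
match the diff below.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_EXCRET_FTYPE (1u << 4) /* EXC_RETURN.FType */

/*
 * Hypothetical model: with no FPU the signature is always 0xfefa125b;
 * with an FPU, bit 0 follows the FType bit of the return value.
 */
uint32_t model_integrity_sig(bool have_fpu, uint32_t excret)
{
    uint32_t sig = 0xfefa125a;

    if (!have_fpu || (excret & MODEL_EXCRET_FTYPE)) {
        sig |= 1;
    }
    return sig;
}
```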

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-15-peter.maydell@linaro.org
---
 target/arm/helper.c | 22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ load_fail:
     return false;
 }
 
+static uint32_t v7m_integrity_sig(CPUARMState *env, uint32_t lr)
+{
+    /*
+     * Return the integrity signature value for the callee-saves
+     * stack frame section. @lr is the exception return payload/LR value
+     * whose FType bit forms bit 0 of the signature if FP is present.
+     */
+    uint32_t sig = 0xfefa125a;
+
+    if (!arm_feature(env, ARM_FEATURE_VFP) || (lr & R_V7M_EXCRET_FTYPE_MASK)) {
+        sig |= 1;
+    }
+    return sig;
+}
+
 static bool v7m_push_callee_stack(ARMCPU *cpu, uint32_t lr, bool dotailchain,
                                   bool ignore_faults)
 {
@@ -XXX,XX +XXX,XX @@ static bool v7m_push_callee_stack(ARMCPU *cpu, uint32_t lr, bool dotailchain,
     bool stacked_ok;
     uint32_t limit;
     bool want_psp;
+    uint32_t sig;
 
     if (dotailchain) {
         bool mode = lr & R_V7M_EXCRET_MODE_MASK;
@@ -XXX,XX +XXX,XX @@ static bool v7m_push_callee_stack(ARMCPU *cpu, uint32_t lr, bool dotailchain,
     /* Write as much of the stack frame as we can. A write failure may
      * cause us to pend a derived exception.
      */
+    sig = v7m_integrity_sig(env, lr);
     stacked_ok =
-        v7m_stack_write(cpu, frameptr, 0xfefa125b, mmu_idx, ignore_faults) &&
+        v7m_stack_write(cpu, frameptr, sig, mmu_idx, ignore_faults) &&
         v7m_stack_write(cpu, frameptr + 0x8, env->regs[4], mmu_idx,
                         ignore_faults) &&
         v7m_stack_write(cpu, frameptr + 0xc, env->regs[5], mmu_idx,
@@ -XXX,XX +XXX,XX @@ static void do_v7m_exception_exit(ARMCPU *cpu)
         if (return_to_secure &&
             ((excret & R_V7M_EXCRET_ES_MASK) == 0 ||
              (excret & R_V7M_EXCRET_DCRS_MASK) == 0)) {
-            uint32_t expected_sig = 0xfefa125b;
             uint32_t actual_sig;
 
             pop_ok = v7m_stack_read(cpu, &actual_sig, frameptr, mmu_idx);
 
-            if (pop_ok && expected_sig != actual_sig) {
+            if (pop_ok && v7m_integrity_sig(env, excret) != actual_sig) {
                 /* Take a SecureFault on the current stack */
                 env->v7m.sfsr |= R_V7M_SFSR_INVIS_MASK;
                 armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_SECURE, false);
-- 
2.20.1

Handle floating point registers in exception return.
This corresponds to pseudocode functions ValidateExceptionReturn(),
ExceptionReturn(), PopStack() and ConsumeExcStackFrame().
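
Two small pieces of the unstacking arithmetic, sketched standalone with
hypothetical names (the increments are the ones used in the diff below):

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of consuming the exception frame: 0x20 bytes of
 * integer state, plus 0x48 for s0..s15, FPSCR and a reserved word if
 * FType is 0, plus 0x40 for s16..s31 if those were stacked too.
 */
uint32_t model_consume_frame(uint32_t frameptr, bool ftype,
                             bool restore_s16_s31)
{
    frameptr += 0x20;
    if (!ftype) {
        frameptr += 0x48;
        if (restore_s16_s31) {
            frameptr += 0x40;
        }
    }
    return frameptr;
}

/* Rebuild one D register from the two stacked 32-bit words */
uint64_t model_pack_dreg(uint32_t slo, uint32_t shi)
{
    return (uint64_t)shi << 32 | slo;
}
```

The totals (0x20, 0x68, 0xa8) match the frame sizes used on exception
entry.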

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-16-peter.maydell@linaro.org
---
 target/arm/helper.c | 142 +++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 141 insertions(+), 1 deletion(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void do_v7m_exception_exit(ARMCPU *cpu)
     bool rettobase = false;
     bool exc_secure = false;
     bool return_to_secure;
+    bool ftype;
+    bool restore_s16_s31;
 
     /* If we're not in Handler mode then jumps to magic exception-exit
      * addresses don't have magic behaviour. However for the v8M
@@ -XXX,XX +XXX,XX @@ static void do_v7m_exception_exit(ARMCPU *cpu)
                       excret);
     }
 
+    ftype = excret & R_V7M_EXCRET_FTYPE_MASK;
+
+    if (!arm_feature(env, ARM_FEATURE_VFP) && !ftype) {
+        qemu_log_mask(LOG_GUEST_ERROR, "M profile: zero FTYPE in exception "
+                      "exit PC value 0x%" PRIx32 " is UNPREDICTABLE "
+                      "if FPU not present\n",
+                      excret);
+        ftype = true;
+    }
+
     if (arm_feature(env, ARM_FEATURE_M_SECURITY)) {
         /* EXC_RETURN.ES validation check (R_SMFL). We must do this before
          * we pick which FAULTMASK to clear.
@@ -XXX,XX +XXX,XX @@ static void do_v7m_exception_exit(ARMCPU *cpu)
      */
     write_v7m_control_spsel_for_secstate(env, return_to_sp_process, exc_secure);
 
+    /*
+     * Clear scratch FP values left in caller saved registers; this
+     * must happen before any kind of tail chaining.
+     */
+    if ((env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_CLRONRET_MASK) &&
+        (env->v7m.control[M_REG_S] & R_V7M_CONTROL_FPCA_MASK)) {
+        if (env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_LSPACT_MASK) {
+            env->v7m.sfsr |= R_V7M_SFSR_LSERR_MASK;
+            armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_SECURE, false);
+            qemu_log_mask(CPU_LOG_INT, "...taking SecureFault on existing "
+                          "stackframe: error during lazy state deactivation\n");
+            v7m_exception_taken(cpu, excret, true, false);
+            return;
+        } else {
+            /* Clear s0..s15 and FPSCR */
+            int i;
+
+            for (i = 0; i < 16; i += 2) {
+                *aa32_vfp_dreg(env, i / 2) = 0;
+            }
+            vfp_set_fpscr(env, 0);
+        }
+    }
+
     if (sfault) {
         env->v7m.sfsr |= R_V7M_SFSR_INVER_MASK;
         armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_SECURE, false);
@@ -XXX,XX +XXX,XX @@ static void do_v7m_exception_exit(ARMCPU *cpu)
             }
         }
 
+        if (!ftype) {
+            /* FP present and we need to handle it */
+            if (!return_to_secure &&
+                (env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_LSPACT_MASK)) {
+                armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_SECURE, false);
+                env->v7m.sfsr |= R_V7M_SFSR_LSERR_MASK;
+                qemu_log_mask(CPU_LOG_INT,
+                              "...taking SecureFault on existing stackframe: "
+                              "Secure LSPACT set but exception return is "
+                              "not to secure state\n");
+                v7m_exception_taken(cpu, excret, true, false);
+                return;
+            }
+
+            restore_s16_s31 = return_to_secure &&
+                (env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_TS_MASK);
+
+            if (env->v7m.fpccr[return_to_secure] & R_V7M_FPCCR_LSPACT_MASK) {
+                /* State in FPU is still valid, just clear LSPACT */
+                env->v7m.fpccr[return_to_secure] &= ~R_V7M_FPCCR_LSPACT_MASK;
+            } else {
+                int i;
+                uint32_t fpscr;
+                bool cpacr_pass, nsacr_pass;
+
+                cpacr_pass = v7m_cpacr_pass(env, return_to_secure,
+                                            return_to_priv);
+                nsacr_pass = return_to_secure ||
+                    extract32(env->v7m.nsacr, 10, 1);
+
+                if (!cpacr_pass) {
+                    armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE,
+                                            return_to_secure);
+                    env->v7m.cfsr[return_to_secure] |= R_V7M_CFSR_NOCP_MASK;
+                    qemu_log_mask(CPU_LOG_INT,
+                                  "...taking UsageFault on existing "
+                                  "stackframe: CPACR.CP10 prevents unstacking "
+                                  "FP regs\n");
+                    v7m_exception_taken(cpu, excret, true, false);
+                    return;
+                } else if (!nsacr_pass) {
+                    armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE, true);
+                    env->v7m.cfsr[M_REG_S] |= R_V7M_CFSR_INVPC_MASK;
+                    qemu_log_mask(CPU_LOG_INT,
+                                  "...taking Secure UsageFault on existing "
+                                  "stackframe: NSACR.CP10 prevents unstacking "
+                                  "FP regs\n");
+                    v7m_exception_taken(cpu, excret, true, false);
+                    return;
+                }
+
+                for (i = 0; i < (restore_s16_s31 ? 32 : 16); i += 2) {
+                    uint32_t slo, shi;
+                    uint64_t dn;
+                    uint32_t faddr = frameptr + 0x20 + 4 * i;
+
+                    if (i >= 16) {
+                        faddr += 8; /* Skip the slot for the FPSCR */
+                    }
+
+                    pop_ok = pop_ok &&
+                        v7m_stack_read(cpu, &slo, faddr, mmu_idx) &&
+                        v7m_stack_read(cpu, &shi, faddr + 4, mmu_idx);
+
+                    if (!pop_ok) {
+                        break;
+                    }
+
+                    dn = (uint64_t)shi << 32 | slo;
+                    *aa32_vfp_dreg(env, i / 2) = dn;
+                }
+                pop_ok = pop_ok &&
+                    v7m_stack_read(cpu, &fpscr, frameptr + 0x60, mmu_idx);
+                if (pop_ok) {
+                    vfp_set_fpscr(env, fpscr);
+                }
+                if (!pop_ok) {
+                    /*
+                     * These regs are 0 if security extension present;
+                     * otherwise merely UNKNOWN. We zero always.
+                     */
+                    for (i = 0; i < (restore_s16_s31 ? 32 : 16); i += 2) {
+                        *aa32_vfp_dreg(env, i / 2) = 0;
+                    }
+                    vfp_set_fpscr(env, 0);
+                }
+            }
+        }
+        env->v7m.control[M_REG_S] = FIELD_DP32(env->v7m.control[M_REG_S],
+                                               V7M_CONTROL, FPCA, !ftype);
+
         /* Commit to consuming the stack frame */
         frameptr += 0x20;
+        if (!ftype) {
+            frameptr += 0x48;
+            if (restore_s16_s31) {
+                frameptr += 0x40;
+            }
+        }
         /* Undo stack alignment (the SPREALIGN bit indicates that the original
          * pre-exception SP was not 8-aligned and we added a padding word to
          * align it, so we undo this by ORing in the bit that increases it
@@ -XXX,XX +XXX,XX @@ static void do_v7m_exception_exit(ARMCPU *cpu)
         *frame_sp_p = frameptr;
     }
     /* This xpsr_write() will invalidate frame_sp_p as it may switch stack */
-    xpsr_write(env, xpsr, ~XPSR_SPREALIGN);
+    xpsr_write(env, xpsr, ~(XPSR_SPREALIGN | XPSR_SFPA));
+
+    if (env->v7m.secure) {
+        bool sfpa = xpsr & XPSR_SFPA;
+
+        env->v7m.control[M_REG_S] = FIELD_DP32(env->v7m.control[M_REG_S],
+                                               V7M_CONTROL, SFPA, sfpa);
+    }
 
     /* The restored xPSR exception field will be zero if we're
      * resuming in Thread mode. If that doesn't match what the
-- 
2.20.1

Move the NS TBFLAG down from bit 19 to bit 6, which has not
been used since commit c1e3781090b9d36c60 in 2015, when we
started passing the entire MMU index in the TB flags rather
than just a 'privilege level' bit.

This rearrangement is not strictly necessary, but means that
we can put M-profile-only bits next to each other rather
than scattered across the flag word.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-17-peter.maydell@linaro.org
---
 target/arm/cpu.h | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

We are close to running out of TB flags for AArch32; we could
start using the cs_base word, but before we do that we can
economise on our usage by sharing the same bits for the VFP
VECSTRIDE field and the XScale XSCALE_CPAR field. This
works because no XScale CPU ever had VFP.
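
The sharing scheme can be sketched as below. This is a hypothetical
standalone model, not the FIELD_DP32/FIELD_EX32 code in the patch: a
2-bit field at bit 4 holds either value, and the CPU type decides how it
is interpreted.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_SHARED_SHIFT 4
#define MODEL_SHARED_MASK  0x3u

/* Store whichever value applies to this CPU in the shared field */
uint32_t model_pack_flags(bool is_xscale, uint32_t vecstride, uint32_t cpar)
{
    uint32_t field = is_xscale ? cpar : vecstride;

    return (field & MODEL_SHARED_MASK) << MODEL_SHARED_SHIFT;
}

/* VECSTRIDE reads as 0 on XScale, which never has VFP */
uint32_t model_vec_stride(uint32_t flags, bool is_xscale)
{
    return is_xscale ? 0 : (flags >> MODEL_SHARED_SHIFT) & MODEL_SHARED_MASK;
}

/* XSCALE_CPAR reads as 0 on everything that is not XScale */
uint32_t model_cpar(uint32_t flags, bool is_xscale)
{
    return is_xscale ? (flags >> MODEL_SHARED_SHIFT) & MODEL_SHARED_MASK : 0;
}
```

The design relies on the two features being mutually exclusive, which is
why the patch adds an assertion at CPU realize time.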

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-18-peter.maydell@linaro.org
---
 target/arm/cpu.h       | 10 ++++++----
 target/arm/cpu.c       |  7 +++++++
 target/arm/helper.c    |  6 +++++-
 target/arm/translate.c |  9 +++++++--
 4 files changed, 25 insertions(+), 7 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ FIELD(TBFLAG_ANY, BE_DATA, 23, 1)
 FIELD(TBFLAG_A32, THUMB, 0, 1)
 FIELD(TBFLAG_A32, VECLEN, 1, 3)
 FIELD(TBFLAG_A32, VECSTRIDE, 4, 2)
+/*
+ * We store the bottom two bits of the CPAR as TB flags and handle
+ * checks on the other bits at runtime. This shares the same bits as
+ * VECSTRIDE, which is OK as no XScale CPU has VFP.
+ */
+FIELD(TBFLAG_A32, XSCALE_CPAR, 4, 2)
 /*
  * Indicates whether cp register reads and writes by guest code should access
  * the secure or nonsecure bank of banked registers; note that this is not
@@ -XXX,XX +XXX,XX @@ FIELD(TBFLAG_A32, NS, 6, 1)
 FIELD(TBFLAG_A32, VFPEN, 7, 1)
 FIELD(TBFLAG_A32, CONDEXEC, 8, 8)
 FIELD(TBFLAG_A32, SCTLR_B, 16, 1)
-/* We store the bottom two bits of the CPAR as TB flags and handle
- * checks on the other bits at runtime
- */
-FIELD(TBFLAG_A32, XSCALE_CPAR, 17, 2)
 /* For M profile only, Handler (ie not Thread) mode */
 FIELD(TBFLAG_A32, HANDLER, 21, 1)
 /* For M profile only, whether we should generate stack-limit checks */
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
         set_feature(env, ARM_FEATURE_THUMB_DSP);
     }
 
+    /*
+     * We rely on no XScale CPU having VFP so we can use the same bits in the
+     * TB flags field for VECSTRIDE and XSCALE_CPAR.
+     */
+    assert(!(arm_feature(env, ARM_FEATURE_VFP) &&
+             arm_feature(env, ARM_FEATURE_XSCALE)));
+
     if (arm_feature(env, ARM_FEATURE_V7) &&
         !arm_feature(env, ARM_FEATURE_M) &&
         !arm_feature(env, ARM_FEATURE_PMSA)) {
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
             || arm_el_is_aa64(env, 1) || arm_feature(env, ARM_FEATURE_M)) {
             flags = FIELD_DP32(flags, TBFLAG_A32, VFPEN, 1);
         }
-        flags = FIELD_DP32(flags, TBFLAG_A32, XSCALE_CPAR, env->cp15.c15_cpar);
+        /* Note that XSCALE_CPAR shares bits with VECSTRIDE */
+        if (arm_feature(env, ARM_FEATURE_XSCALE)) {
+            flags = FIELD_DP32(flags, TBFLAG_A32,
+                               XSCALE_CPAR, env->cp15.c15_cpar);
+        }
     }
 
     flags = FIELD_DP32(flags, TBFLAG_ANY, MMUIDX, arm_to_core_mmu_idx(mmu_idx));
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void arm_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
     dc->fp_excp_el = FIELD_EX32(tb_flags, TBFLAG_ANY, FPEXC_EL);
     dc->vfp_enabled = FIELD_EX32(tb_flags, TBFLAG_A32, VFPEN);
     dc->vec_len = FIELD_EX32(tb_flags, TBFLAG_A32, VECLEN);
-    dc->vec_stride = FIELD_EX32(tb_flags, TBFLAG_A32, VECSTRIDE);
-    dc->c15_cpar = FIELD_EX32(tb_flags, TBFLAG_A32, XSCALE_CPAR);
+    if (arm_feature(env, ARM_FEATURE_XSCALE)) {
+        dc->c15_cpar = FIELD_EX32(tb_flags, TBFLAG_A32, XSCALE_CPAR);
+        dc->vec_stride = 0;
+    } else {
+        dc->vec_stride = FIELD_EX32(tb_flags, TBFLAG_A32, VECSTRIDE);
+        dc->c15_cpar = 0;
+    }
     dc->v7m_handler_mode = FIELD_EX32(tb_flags, TBFLAG_A32, HANDLER);
     dc->v8m_secure = arm_feature(env, ARM_FEATURE_M_SECURITY) &&
         regime_is_secure(env, dc->mmu_idx);
-- 
2.20.1

The M-profile FPCCR.S bit indicates the security status of
the floating point context. In the pseudocode ExecuteFPCheck()
function it is unconditionally set to match the current
security state whenever a floating point instruction is
executed.

Implement this by adding a new TB flag which tracks whether
FPCCR.S is different from the current security state, so
that we only need to emit the code to update it in the
less-common case when it is not already set correctly.

Note that we will add the handling for the other work done
by ExecuteFPCheck() in later commits.
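
The condition for setting the new TB flag can be sketched as follows,
using hypothetical names; FPCCR.S is bit 2 of the Secure FPCCR:

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_FPCCR_S (1u << 2) /* FPCCR.S */

/*
 * Hypothetical model: the flag is set only when the Security
 * Extension is present and FPCCR.S disagrees with the current
 * security state, so translated code needs to fix it up.
 */
bool model_fpccr_s_wrong(bool have_security, uint32_t fpccr_s, bool secure)
{
    bool s = (fpccr_s & MODEL_FPCCR_S) != 0;

    return have_security && s != secure;
}
```

When the flag is clear, FP instructions need no extra generated code for
this check, which is the common case.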

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-19-peter.maydell@linaro.org
---
 target/arm/cpu.h       |  2 ++
 target/arm/translate.h |  1 +
 target/arm/helper.c    |  5 +++++
 target/arm/translate.c | 20 ++++++++++++++++++++
 4 files changed, 28 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ FIELD(TBFLAG_A32, NS, 6, 1)
 FIELD(TBFLAG_A32, VFPEN, 7, 1)
 FIELD(TBFLAG_A32, CONDEXEC, 8, 8)
 FIELD(TBFLAG_A32, SCTLR_B, 16, 1)
+/* For M profile only, set if FPCCR.S does not match current security state */
+FIELD(TBFLAG_A32, FPCCR_S_WRONG, 20, 1)
 /* For M profile only, Handler (ie not Thread) mode */
 FIELD(TBFLAG_A32, HANDLER, 21, 1)
 /* For M profile only, whether we should generate stack-limit checks */
diff --git a/target/arm/translate.h b/target/arm/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ typedef struct DisasContext {
     bool v7m_handler_mode;
     bool v8m_secure; /* true if v8M and we're in Secure mode */
     bool v8m_stackcheck; /* true if we need to perform v8M stack limit checks */
+    bool v8m_fpccr_s_wrong; /* true if v8M FPCCR.S != v8m_secure */
     /* Immediate value in AArch32 SVC insn; must be set if is_jmp == DISAS_SWI
      * so that top level loop can generate correct syndrome information.
      */
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
         flags = FIELD_DP32(flags, TBFLAG_A32, STACKCHECK, 1);
     }
 
+    if (arm_feature(env, ARM_FEATURE_M_SECURITY) &&
+        FIELD_EX32(env->v7m.fpccr[M_REG_S], V7M_FPCCR, S) != env->v7m.secure) {
+        flags = FIELD_DP32(flags, TBFLAG_A32, FPCCR_S_WRONG, 1);
+    }
+
     *pflags = flags;
     *cs_base = 0;
 }
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
         }
     }
 
+    if (arm_dc_feature(s, ARM_FEATURE_M)) {
+        /* Handle M-profile lazy FP state mechanics */
+
+        /* Update ownership of FP context: set FPCCR.S to match current state */
+        if (s->v8m_fpccr_s_wrong) {
+            TCGv_i32 tmp;
+
+            tmp = load_cpu_field(v7m.fpccr[M_REG_S]);
+            if (s->v8m_secure) {
+                tcg_gen_ori_i32(tmp, tmp, R_V7M_FPCCR_S_MASK);
+            } else {
+                tcg_gen_andi_i32(tmp, tmp, ~R_V7M_FPCCR_S_MASK);
+            }
+            store_cpu_field(tmp, v7m.fpccr[M_REG_S]);
+            /* Don't need to do this for any further FP insns in this TB */
+            s->v8m_fpccr_s_wrong = false;
+        }
+    }
+
     if (extract32(insn, 28, 4) == 0xf) {
         /*
          * Encodings with T=1 (Thumb) or unconditional (ARM):
@@ -XXX,XX +XXX,XX @@ static void arm_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
     dc->v8m_secure = arm_feature(env, ARM_FEATURE_M_SECURITY) &&
         regime_is_secure(env, dc->mmu_idx);
     dc->v8m_stackcheck = FIELD_EX32(tb_flags, TBFLAG_A32, STACKCHECK);
+    dc->v8m_fpccr_s_wrong = FIELD_EX32(tb_flags, TBFLAG_A32, FPCCR_S_WRONG);
     dc->cp_regs = cpu->cp_regs;
     dc->features = env->features;
 
-- 
2.20.1

The M-profile FPCCR.ASPEN bit indicates that automatic floating-point
context preservation is enabled. Before executing any floating-point
instruction, if FPCCR.ASPEN is set and the CONTROL FPCA/SFPA bits
indicate that there is no active floating-point context, then we
must create a new context (by initializing FPSCR and setting
FPCA/SFPA to indicate that the context is now active). In the
pseudocode this is handled by ExecuteFPCheck().

Implement this with a new TB flag which tracks whether we
need to create a new FP context.
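
For illustration only, the condition and the CONTROL update can be
written as a pair of pure functions. This is a sketch that mirrors
the logic in the diff below; the bit positions and function names
are assumptions, and the FPSCR initialization from FPDSCR is left
out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions; illustrative only. */
#define FPCCR_ASPEN_MASK   (1u << 31)
#define CONTROL_FPCA_MASK  (1u << 2)
#define CONTROL_SFPA_MASK  (1u << 3)

/*
 * Hypothetical predicate: true if ASPEN is set but FPCA (or, in
 * Secure state, SFPA) says there is no active FP context.
 */
static bool new_fp_ctxt_needed(uint32_t fpccr, uint32_t control_s,
                               bool secure)
{
    if (!(fpccr & FPCCR_ASPEN_MASK)) {
        return false;
    }
    if (!(control_s & CONTROL_FPCA_MASK)) {
        return true;
    }
    return secure && !(control_s & CONTROL_SFPA_MASK);
}

/* Hypothetical model of marking the context active in CONTROL. */
static uint32_t activate_fp_ctxt(uint32_t control_s, bool secure)
{
    uint32_t bits = CONTROL_FPCA_MASK;

    if (secure) {
        bits |= CONTROL_SFPA_MASK;
    }
    return control_s | bits;
}
```

Once activate_fp_ctxt() has been applied for the current security
state, the predicate is false again for that state.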

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-20-peter.maydell@linaro.org
---
 target/arm/cpu.h       |  2 ++
 target/arm/translate.h |  1 +
 target/arm/helper.c    | 13 +++++++++++++
 target/arm/translate.c | 29 +++++++++++++++++++++++++++++
 4 files changed, 45 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ FIELD(TBFLAG_A32, NS, 6, 1)
 FIELD(TBFLAG_A32, VFPEN, 7, 1)
 FIELD(TBFLAG_A32, CONDEXEC, 8, 8)
 FIELD(TBFLAG_A32, SCTLR_B, 16, 1)
+/* For M profile only, set if we must create a new FP context */
+FIELD(TBFLAG_A32, NEW_FP_CTXT_NEEDED, 19, 1)
 /* For M profile only, set if FPCCR.S does not match current security state */
 FIELD(TBFLAG_A32, FPCCR_S_WRONG, 20, 1)
 /* For M profile only, Handler (ie not Thread) mode */
diff --git a/target/arm/translate.h b/target/arm/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ typedef struct DisasContext {
     bool v8m_secure; /* true if v8M and we're in Secure mode */
     bool v8m_stackcheck; /* true if we need to perform v8M stack limit checks */
     bool v8m_fpccr_s_wrong; /* true if v8M FPCCR.S != v8m_secure */
+    bool v7m_new_fp_ctxt_needed; /* ASPEN set but no active FP context */
     /* Immediate value in AArch32 SVC insn; must be set if is_jmp == DISAS_SWI
      * so that top level loop can generate correct syndrome information.
      */
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
         flags = FIELD_DP32(flags, TBFLAG_A32, FPCCR_S_WRONG, 1);
     }
 
+    if (arm_feature(env, ARM_FEATURE_M) &&
+        (env->v7m.fpccr[env->v7m.secure] & R_V7M_FPCCR_ASPEN_MASK) &&
+        (!(env->v7m.control[M_REG_S] & R_V7M_CONTROL_FPCA_MASK) ||
+         (env->v7m.secure &&
+          !(env->v7m.control[M_REG_S] & R_V7M_CONTROL_SFPA_MASK)))) {
+        /*
+         * ASPEN is set, but FPCA/SFPA indicate that there is no active
+         * FP context; we must create a new FP context before executing
+         * any FP insn.
+         */
+        flags = FIELD_DP32(flags, TBFLAG_A32, NEW_FP_CTXT_NEEDED, 1);
+    }
+
     *pflags = flags;
     *cs_base = 0;
 }
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
             /* Don't need to do this for any further FP insns in this TB */
             s->v8m_fpccr_s_wrong = false;
         }
+
+        if (s->v7m_new_fp_ctxt_needed) {
+            /*
+             * Create new FP context by updating CONTROL.FPCA, CONTROL.SFPA
+             * and the FPSCR.
+             */
+            TCGv_i32 control, fpscr;
+            uint32_t bits = R_V7M_CONTROL_FPCA_MASK;
+
+            fpscr = load_cpu_field(v7m.fpdscr[s->v8m_secure]);
+            gen_helper_vfp_set_fpscr(cpu_env, fpscr);
+            tcg_temp_free_i32(fpscr);
+            /*
+             * We don't need to arrange to end the TB, because the only
+             * parts of FPSCR which we cache in the TB flags are the VECLEN
+             * and VECSTRIDE, and those don't exist for M-profile.
+             */
+
+            if (s->v8m_secure) {
+                bits |= R_V7M_CONTROL_SFPA_MASK;
+            }
+            control = load_cpu_field(v7m.control[M_REG_S]);
+            tcg_gen_ori_i32(control, control, bits);
+            store_cpu_field(control, v7m.control[M_REG_S]);
+            /* Don't need to do this for any further FP insns in this TB */
+            s->v7m_new_fp_ctxt_needed = false;
+        }
     }
 
     if (extract32(insn, 28, 4) == 0xf) {
@@ -XXX,XX +XXX,XX @@ static void arm_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
         regime_is_secure(env, dc->mmu_idx);
     dc->v8m_stackcheck = FIELD_EX32(tb_flags, TBFLAG_A32, STACKCHECK);
     dc->v8m_fpccr_s_wrong = FIELD_EX32(tb_flags, TBFLAG_A32, FPCCR_S_WRONG);
+    dc->v7m_new_fp_ctxt_needed =
+        FIELD_EX32(tb_flags, TBFLAG_A32, NEW_FP_CTXT_NEEDED);
     dc->cp_regs = cpu->cp_regs;
     dc->features = env->features;
 
-- 
2.20.1

Add a new helper function which returns the MMU index to use
for v7M, where the caller specifies all of the security
state, privilege level and whether the execution priority
is negative, and reimplement the existing
arm_v7m_mmu_idx_for_secstate_and_priv() in terms of it.

We are going to need this for the lazy-FP-stacking code.
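
The shape of the refactoring, sketched outside QEMU: the general
function takes every input explicitly, and the existing entry point
becomes a thin wrapper that looks up the one input it used to
compute internally. The index bit values, the handling of the
secure bit and the neg_prio_requested[] stand-in for the NVIC query
are all hypothetical.

```c
#include <stdbool.h>

/* Hypothetical index bits; QEMU's ARM_MMU_IDX_M_* values may differ. */
enum {
    IDX_M        = 0x40,
    IDX_M_PRIV   = 0x1,
    IDX_M_NEGPRI = 0x2,
    IDX_M_S      = 0x4,
};

/* Stand-in for the NVIC query, indexed by security state. */
static bool neg_prio_requested[2];

/* All relevant state is passed in by the caller. */
static int mmu_idx_all(bool secstate, bool priv, bool negpri)
{
    int idx = IDX_M;

    if (priv) {
        idx |= IDX_M_PRIV;
    }
    if (negpri) {
        idx |= IDX_M_NEGPRI;
    }
    if (secstate) {
        idx |= IDX_M_S; /* assumed; not visible in the hunks here */
    }
    return idx;
}

/* Existing interface, reimplemented in terms of the general one. */
static int mmu_idx_for_secstate_and_priv(bool secstate, bool priv)
{
    return mmu_idx_all(secstate, priv, neg_prio_requested[secstate]);
}
```

A caller that has its own saved notion of "negative priority", as
the lazy stacking code will, can call the general function directly
instead of consulting the current NVIC state.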

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-21-peter.maydell@linaro.org
---
 target/arm/cpu.h    |  7 +++++++
 target/arm/helper.c | 14 +++++++++++---
 2 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ static inline int arm_mmu_idx_to_el(ARMMMUIdx mmu_idx)
     }
 }
 
+/*
+ * Return the MMU index for a v7M CPU with all relevant information
+ * manually specified.
+ */
+ARMMMUIdx arm_v7m_mmu_idx_all(CPUARMState *env,
+                              bool secstate, bool priv, bool negpri);
+
 /* Return the MMU index for a v7M CPU in the specified security and
  * privilege state.
  */
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ int fp_exception_el(CPUARMState *env, int cur_el)
     return 0;
 }
 
-ARMMMUIdx arm_v7m_mmu_idx_for_secstate_and_priv(CPUARMState *env,
-                                                bool secstate, bool priv)
+ARMMMUIdx arm_v7m_mmu_idx_all(CPUARMState *env,
+                              bool secstate, bool priv, bool negpri)
 {
     ARMMMUIdx mmu_idx = ARM_MMU_IDX_M;
 
@@ -XXX,XX +XXX,XX @@ ARMMMUIdx arm_v7m_mmu_idx_for_secstate_and_priv(CPUARMState *env,
         mmu_idx |= ARM_MMU_IDX_M_PRIV;
     }
 
-    if (armv7m_nvic_neg_prio_requested(env->nvic, secstate)) {
+    if (negpri) {
         mmu_idx |= ARM_MMU_IDX_M_NEGPRI;
     }
 
@@ -XXX,XX +XXX,XX @@ ARMMMUIdx arm_v7m_mmu_idx_for_secstate_and_priv(CPUARMState *env,
     return mmu_idx;
 }
 
+ARMMMUIdx arm_v7m_mmu_idx_for_secstate_and_priv(CPUARMState *env,
+                                                bool secstate, bool priv)
+{
+    bool negpri = armv7m_nvic_neg_prio_requested(env->nvic, secstate);
+
+    return arm_v7m_mmu_idx_all(env, secstate, priv, negpri);
+}
+
 /* Return the MMU index for a v7M CPU in the specified security state */
 ARMMMUIdx arm_v7m_mmu_idx_for_secstate(CPUARMState *env, bool secstate)
 {
-- 
2.20.1

In the v7M architecture, if an exception is generated in the process
of doing the lazy stacking of FP registers, possible escalation to
HardFault is handled differently from the normal approach: the
decision is based on the saved information about exception
readiness that was stored in the FPCCR when the stack frame was
created. Provide a new function armv7m_nvic_set_pending_lazyfp()
which pends exceptions during lazy stacking and implements
this logic.

This corresponds to the pseudocode TakePreserveFPException().
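
A reduced model of the first decision the new function makes, for
illustration: which of the saved "ready" bits is consulted for each
exception, and whether that leads to pending, escalating or
ignoring. The bit positions and all names are assumptions for the
sketch; the later HFRDY lockup check and the choice of target
security state are deliberately omitted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed FPCCR bit positions; illustrative only. */
#define FPCCR_MMRDY   (1u << 5)
#define FPCCR_BFRDY   (1u << 6)
#define FPCCR_SFRDY   (1u << 7)
#define FPCCR_MONRDY  (1u << 8)
#define FPCCR_UFRDY   (1u << 10)

typedef enum { EXC_MEM, EXC_BUS, EXC_USAGE, EXC_SECURE, EXC_DEBUG } LazyExc;
typedef enum { ACT_PEND, ACT_ESCALATE, ACT_IGNORE } LazyAction;

/*
 * fpccr_banked is the FPCCR for the exception's security state;
 * fpccr_s is the Secure FPCCR, which holds the non-banked bits.
 */
static LazyAction lazyfp_action(LazyExc exc, uint32_t fpccr_banked,
                                uint32_t fpccr_s)
{
    bool ready = false;

    switch (exc) {
    case EXC_DEBUG:
        /* DebugMonitor is dropped rather than escalated. */
        return (fpccr_s & FPCCR_MONRDY) ? ACT_PEND : ACT_IGNORE;
    case EXC_MEM:
        ready = (fpccr_banked & FPCCR_MMRDY) != 0;
        break;
    case EXC_USAGE:
        ready = (fpccr_banked & FPCCR_UFRDY) != 0;
        break;
    case EXC_BUS:
        ready = (fpccr_s & FPCCR_BFRDY) != 0;
        break;
    case EXC_SECURE:
        ready = (fpccr_s & FPCCR_SFRDY) != 0;
        break;
    }
    return ready ? ACT_PEND : ACT_ESCALATE;
}
```

The point of the model is that none of the inputs is the current
execution priority: everything comes from the saved FPCCR state.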

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-22-peter.maydell@linaro.org
---
 target/arm/cpu.h      | 12 ++++++
 hw/intc/armv7m_nvic.c | 96 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 108 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ void armv7m_nvic_set_pending(void *opaque, int irq, bool secure);
  * a different exception).
  */
 void armv7m_nvic_set_pending_derived(void *opaque, int irq, bool secure);
+/**
+ * armv7m_nvic_set_pending_lazyfp: mark this lazy FP exception as pending
+ * @opaque: the NVIC
+ * @irq: the exception number to mark pending
+ * @secure: false for non-banked exceptions or for the nonsecure
+ * version of a banked exception, true for the secure version of a banked
+ * exception.
+ *
+ * Similar to armv7m_nvic_set_pending(), but specifically for exceptions
+ * generated in the course of lazy stacking of FP registers.
+ */
+void armv7m_nvic_set_pending_lazyfp(void *opaque, int irq, bool secure);
 /**
  * armv7m_nvic_get_pending_irq_info: return highest priority pending
  *    exception, and whether it targets Secure state
diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/intc/armv7m_nvic.c
+++ b/hw/intc/armv7m_nvic.c
@@ -XXX,XX +XXX,XX @@ void armv7m_nvic_set_pending_derived(void *opaque, int irq, bool secure)
     do_armv7m_nvic_set_pending(opaque, irq, secure, true);
 }
 
+void armv7m_nvic_set_pending_lazyfp(void *opaque, int irq, bool secure)
+{
+    /*
+     * Pend an exception during lazy FP stacking. This differs
+     * from the usual exception pending because the logic for
+     * whether we should escalate depends on the saved context
+     * in the FPCCR register, not on the current state of the CPU/NVIC.
+     */
+    NVICState *s = (NVICState *)opaque;
+    bool banked = exc_is_banked(irq);
+    VecInfo *vec;
+    bool targets_secure;
+    bool escalate = false;
+    /*
+     * We will only look at bits in fpccr if this is a banked exception
+     * (in which case 'secure' tells us whether it is the S or NS version).
+     * All the bits for the non-banked exceptions are in fpccr_s.
+     */
+    uint32_t fpccr_s = s->cpu->env.v7m.fpccr[M_REG_S];
+    uint32_t fpccr = s->cpu->env.v7m.fpccr[secure];
+
+    assert(irq > ARMV7M_EXCP_RESET && irq < s->num_irq);
+    assert(!secure || banked);
+
+    vec = (banked && secure) ? &s->sec_vectors[irq] : &s->vectors[irq];
+
+    targets_secure = banked ? secure : exc_targets_secure(s, irq);
+
+    switch (irq) {
+    case ARMV7M_EXCP_DEBUG:
+        if (!(fpccr_s & R_V7M_FPCCR_MONRDY_MASK)) {
+            /* Ignore DebugMonitor exception */
+            return;
+        }
+        break;
+    case ARMV7M_EXCP_MEM:
+        escalate = !(fpccr & R_V7M_FPCCR_MMRDY_MASK);
+        break;
+    case ARMV7M_EXCP_USAGE:
+        escalate = !(fpccr & R_V7M_FPCCR_UFRDY_MASK);
+        break;
+    case ARMV7M_EXCP_BUS:
+        escalate = !(fpccr_s & R_V7M_FPCCR_BFRDY_MASK);
+        break;
+    case ARMV7M_EXCP_SECURE:
+        escalate = !(fpccr_s & R_V7M_FPCCR_SFRDY_MASK);
+        break;
+    default:
+        g_assert_not_reached();
+    }
+
+    if (escalate) {
+        /*
+         * Escalate to HardFault: faults that initially targeted Secure
+         * continue to do so, even if HF normally targets NonSecure.
+         */
+        irq = ARMV7M_EXCP_HARD;
+        if (arm_feature(&s->cpu->env, ARM_FEATURE_M_SECURITY) &&
+            (targets_secure ||
+             !(s->cpu->env.v7m.aircr & R_V7M_AIRCR_BFHFNMINS_MASK))) {
+            vec = &s->sec_vectors[irq];
+        } else {
+            vec = &s->vectors[irq];
+        }
+    }
+
+    if (!vec->enabled ||
+        nvic_exec_prio(s) <= exc_group_prio(s, vec->prio, secure)) {
+        if (!(fpccr_s & R_V7M_FPCCR_HFRDY_MASK)) {
+            /*
+             * We want to escalate to HardFault but the context the
+             * FP state belongs to prevents the exception pre-empting.
+             */
+            cpu_abort(&s->cpu->parent_obj,
+                      "Lockup: can't escalate to HardFault during "
+                      "lazy FP register stacking\n");
+        }
+    }
+
+    if (escalate) {
+        s->cpu->env.v7m.hfsr |= R_V7M_HFSR_FORCED_MASK;
+    }
+    if (!vec->pending) {
+        vec->pending = 1;
+        /*
+         * We do not call nvic_irq_update(), because we know our caller
+         * is going to handle causing us to take the exception by
+         * raising EXCP_LAZYFP, so raising the IRQ line would be
+         * pointless extra work. We just need to recompute the
+         * priorities so that armv7m_nvic_can_take_pending_exception()
+         * returns the right answer.
+         */
+        nvic_recompute_state(s);
+    }
+}
+
 /* Make pending IRQ active.  */
 void armv7m_nvic_acknowledge_irq(void *opaque)
 {
-- 
2.20.1

Pushing registers to the stack for v7M needs to handle three cases:
 * the "normal" case where we pend exceptions
 * an "ignore faults" case where we set FSR bits but
   do not pend exceptions (this is used when we are
   handling some kinds of derived exception on exception entry)
 * a "lazy FP stacking" case, where different FSR bits
   are set and the exception is pended differently

Implement this by changing the existing flag argument that
tells us whether to ignore faults into an enum that
specifies which of the three modes to use.
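
Reduced to its essentials, the mode affects two things: which fault
status bit is recorded and how (or whether) the exception is pended.
The sketch below shows only that dispatch for the MemManage case;
the PendAction and MemFaultBit types and both function names are
hypothetical stand-ins, not part of the patch.

```c
/* Same three modes as the enum added by this patch. */
typedef enum StackingMode {
    STACK_NORMAL,
    STACK_IGNFAULTS,
    STACK_LAZYFP,
} StackingMode;

/* Hypothetical stand-ins for "which NVIC call" and "which CFSR bit". */
typedef enum PendAction { PEND_NONE, PEND_DERIVED, PEND_LAZYFP } PendAction;
typedef enum MemFaultBit { BIT_MSTKERR, BIT_MLSPERR } MemFaultBit;

/* The status bit is recorded in every mode, including IGNFAULTS. */
static MemFaultBit memmanage_status_bit(StackingMode mode)
{
    return mode == STACK_LAZYFP ? BIT_MLSPERR : BIT_MSTKERR;
}

/* Only the pending step is skipped when faults are being ignored. */
static PendAction pend_action(StackingMode mode)
{
    switch (mode) {
    case STACK_NORMAL:
        return PEND_DERIVED;
    case STACK_LAZYFP:
        return PEND_LAZYFP;
    case STACK_IGNFAULTS:
        break;
    }
    return PEND_NONE;
}
```

Using an enum rather than a second bool keeps the three cases
mutually exclusive at the call sites.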

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-23-peter.maydell@linaro.org
---
 target/arm/helper.c | 118 +++++++++++++++++++++++++++++---------------
 1 file changed, 79 insertions(+), 39 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static bool v7m_cpacr_pass(CPUARMState *env, bool is_secure, bool is_priv)
     }
 }
 
+/*
+ * What kind of stack write are we doing? This affects how exceptions
+ * generated during the stacking are treated.
+ */
+typedef enum StackingMode {
+    STACK_NORMAL,
+    STACK_IGNFAULTS,
+    STACK_LAZYFP,
+} StackingMode;
+
 static bool v7m_stack_write(ARMCPU *cpu, uint32_t addr, uint32_t value,
-                            ARMMMUIdx mmu_idx, bool ignfault)
+                            ARMMMUIdx mmu_idx, StackingMode mode)
 {
     CPUState *cs = CPU(cpu);
     CPUARMState *env = &cpu->env;
@@ -XXX,XX +XXX,XX @@ static bool v7m_stack_write(ARMCPU *cpu, uint32_t addr, uint32_t value,
                       &attrs, &prot, &page_size, &fi, NULL)) {
         /* MPU/SAU lookup failed */
         if (fi.type == ARMFault_QEMU_SFault) {
-            qemu_log_mask(CPU_LOG_INT,
-                          "...SecureFault with SFSR.AUVIOL during stacking\n");
-            env->v7m.sfsr |= R_V7M_SFSR_AUVIOL_MASK | R_V7M_SFSR_SFARVALID_MASK;
+            if (mode == STACK_LAZYFP) {
+                qemu_log_mask(CPU_LOG_INT,
+                              "...SecureFault with SFSR.LSPERR "
+                              "during lazy stacking\n");
+                env->v7m.sfsr |= R_V7M_SFSR_LSPERR_MASK;
+            } else {
+                qemu_log_mask(CPU_LOG_INT,
+                              "...SecureFault with SFSR.AUVIOL "
+                              "during stacking\n");
+                env->v7m.sfsr |= R_V7M_SFSR_AUVIOL_MASK;
+            }
+            env->v7m.sfsr |= R_V7M_SFSR_SFARVALID_MASK;
             env->v7m.sfar = addr;
             exc = ARMV7M_EXCP_SECURE;
             exc_secure = false;
         } else {
-            qemu_log_mask(CPU_LOG_INT, "...MemManageFault with CFSR.MSTKERR\n");
-            env->v7m.cfsr[secure] |= R_V7M_CFSR_MSTKERR_MASK;
+            if (mode == STACK_LAZYFP) {
+                qemu_log_mask(CPU_LOG_INT,
+                              "...MemManageFault with CFSR.MLSPERR\n");
+                env->v7m.cfsr[secure] |= R_V7M_CFSR_MLSPERR_MASK;
+            } else {
+                qemu_log_mask(CPU_LOG_INT,
+                              "...MemManageFault with CFSR.MSTKERR\n");
+                env->v7m.cfsr[secure] |= R_V7M_CFSR_MSTKERR_MASK;
+            }
             exc = ARMV7M_EXCP_MEM;
             exc_secure = secure;
         }
@@ -XXX,XX +XXX,XX @@ static bool v7m_stack_write(ARMCPU *cpu, uint32_t addr, uint32_t value,
                          attrs, &txres);
     if (txres != MEMTX_OK) {
         /* BusFault trying to write the data */
-        qemu_log_mask(CPU_LOG_INT, "...BusFault with BFSR.STKERR\n");
-        env->v7m.cfsr[M_REG_NS] |= R_V7M_CFSR_STKERR_MASK;
+        if (mode == STACK_LAZYFP) {
+            qemu_log_mask(CPU_LOG_INT, "...BusFault with BFSR.LSPERR\n");
+            env->v7m.cfsr[M_REG_NS] |= R_V7M_CFSR_LSPERR_MASK;
+        } else {
+            qemu_log_mask(CPU_LOG_INT, "...BusFault with BFSR.STKERR\n");
+            env->v7m.cfsr[M_REG_NS] |= R_V7M_CFSR_STKERR_MASK;
+        }
         exc = ARMV7M_EXCP_BUS;
         exc_secure = false;
         goto pend_fault;
@@ -XXX,XX +XXX,XX @@ pend_fault:
      * later if we have two derived exceptions.
      * The only case when we must not pend the exception but instead
      * throw it away is if we are doing the push of the callee registers
-     * and we've already generated a derived exception. Even in this
-     * case we will still update the fault status registers.
+     * and we've already generated a derived exception (this is indicated
+     * by the caller passing STACK_IGNFAULTS). Even in this case we will
+     * still update the fault status registers.
      */
-    if (!ignfault) {
+    switch (mode) {
+    case STACK_NORMAL:
         armv7m_nvic_set_pending_derived(env->nvic, exc, exc_secure);
+        break;
+    case STACK_LAZYFP:
+        armv7m_nvic_set_pending_lazyfp(env->nvic, exc, exc_secure);
+        break;
+    case STACK_IGNFAULTS:
+        break;
     }
     return false;
 }
@@ -XXX,XX +XXX,XX @@ static bool v7m_push_callee_stack(ARMCPU *cpu, uint32_t lr, bool dotailchain,
     uint32_t limit;
     bool want_psp;
     uint32_t sig;
+    StackingMode smode = ignore_faults ? STACK_IGNFAULTS : STACK_NORMAL;
 
     if (dotailchain) {
         bool mode = lr & R_V7M_EXCRET_MODE_MASK;
@@ -XXX,XX +XXX,XX @@ static bool v7m_push_callee_stack(ARMCPU *cpu, uint32_t lr, bool dotailchain,
      */
     sig = v7m_integrity_sig(env, lr);
     stacked_ok =
-        v7m_stack_write(cpu, frameptr, sig, mmu_idx, ignore_faults) &&
-        v7m_stack_write(cpu, frameptr + 0x8, env->regs[4], mmu_idx,
-                        ignore_faults) &&
-        v7m_stack_write(cpu, frameptr + 0xc, env->regs[5], mmu_idx,
-                        ignore_faults) &&
-        v7m_stack_write(cpu, frameptr + 0x10, env->regs[6], mmu_idx,
-                        ignore_faults) &&
-        v7m_stack_write(cpu, frameptr + 0x14, env->regs[7], mmu_idx,
-                        ignore_faults) &&
-        v7m_stack_write(cpu, frameptr + 0x18, env->regs[8], mmu_idx,
-                        ignore_faults) &&
-        v7m_stack_write(cpu, frameptr + 0x1c, env->regs[9], mmu_idx,
-                        ignore_faults) &&
-        v7m_stack_write(cpu, frameptr + 0x20, env->regs[10], mmu_idx,
-                        ignore_faults) &&
-        v7m_stack_write(cpu, frameptr + 0x24, env->regs[11], mmu_idx,
-                        ignore_faults);
+        v7m_stack_write(cpu, frameptr, sig, mmu_idx, smode) &&
+        v7m_stack_write(cpu, frameptr + 0x8, env->regs[4], mmu_idx, smode) &&
+        v7m_stack_write(cpu, frameptr + 0xc, env->regs[5], mmu_idx, smode) &&
+        v7m_stack_write(cpu, frameptr + 0x10, env->regs[6], mmu_idx, smode) &&
+        v7m_stack_write(cpu, frameptr + 0x14, env->regs[7], mmu_idx, smode) &&
+        v7m_stack_write(cpu, frameptr + 0x18, env->regs[8], mmu_idx, smode) &&
+        v7m_stack_write(cpu, frameptr + 0x1c, env->regs[9], mmu_idx, smode) &&
+        v7m_stack_write(cpu, frameptr + 0x20, env->regs[10], mmu_idx, smode) &&
+        v7m_stack_write(cpu, frameptr + 0x24, env->regs[11], mmu_idx, smode);
 
     /* Update SP regardless of whether any of the stack accesses failed. */
     *frame_sp_p = frameptr;
@@ -XXX,XX +XXX,XX @@ static bool v7m_push_stack(ARMCPU *cpu)
      * if it has higher priority).
      */
     stacked_ok = stacked_ok &&
-        v7m_stack_write(cpu, frameptr, env->regs[0], mmu_idx, false) &&
-        v7m_stack_write(cpu, frameptr + 4, env->regs[1], mmu_idx, false) &&
-        v7m_stack_write(cpu, frameptr + 8, env->regs[2], mmu_idx, false) &&
-        v7m_stack_write(cpu, frameptr + 12, env->regs[3], mmu_idx, false) &&
-        v7m_stack_write(cpu, frameptr + 16, env->regs[12], mmu_idx, false) &&
-        v7m_stack_write(cpu, frameptr + 20, env->regs[14], mmu_idx, false) &&
-        v7m_stack_write(cpu, frameptr + 24, env->regs[15], mmu_idx, false) &&
-        v7m_stack_write(cpu, frameptr + 28, xpsr, mmu_idx, false);
+        v7m_stack_write(cpu, frameptr, env->regs[0], mmu_idx, STACK_NORMAL) &&
+        v7m_stack_write(cpu, frameptr + 4, env->regs[1],
+                        mmu_idx, STACK_NORMAL) &&
+        v7m_stack_write(cpu, frameptr + 8, env->regs[2],
+                        mmu_idx, STACK_NORMAL) &&
+        v7m_stack_write(cpu, frameptr + 12, env->regs[3],
+                        mmu_idx, STACK_NORMAL) &&
+        v7m_stack_write(cpu, frameptr + 16, env->regs[12],
+                        mmu_idx, STACK_NORMAL) &&
+        v7m_stack_write(cpu, frameptr + 20, env->regs[14],
+                        mmu_idx, STACK_NORMAL) &&
+        v7m_stack_write(cpu, frameptr + 24, env->regs[15],
+                        mmu_idx, STACK_NORMAL) &&
+        v7m_stack_write(cpu, frameptr + 28, xpsr, mmu_idx, STACK_NORMAL);
 
     if (env->v7m.control[M_REG_S] & R_V7M_CONTROL_FPCA_MASK) {
         /* FPU is active, try to save its registers */
@@ -XXX,XX +XXX,XX @@ static bool v7m_push_stack(ARMCPU *cpu)
                         faddr += 8; /* skip the slot for the FPSCR */
                     }
                     stacked_ok = stacked_ok &&
-                        v7m_stack_write(cpu, faddr, slo, mmu_idx, false) &&
-                        v7m_stack_write(cpu, faddr + 4, shi, mmu_idx, false);
+                        v7m_stack_write(cpu, faddr, slo,
+                                        mmu_idx, STACK_NORMAL) &&
+                        v7m_stack_write(cpu, faddr + 4, shi,
+                                        mmu_idx, STACK_NORMAL);
                 }
                 stacked_ok = stacked_ok &&
                     v7m_stack_write(cpu, frameptr + 0x60,
-                                    vfp_get_fpscr(env), mmu_idx, false);
+                                    vfp_get_fpscr(env), mmu_idx, STACK_NORMAL);
                 if (cpacr_pass) {
                     for (i = 0; i < ((framesize == 0xa8) ? 32 : 16); i += 2) {
                         *aa32_vfp_dreg(env, i / 2) = 0;
-- 
2.20.1

The M-profile floating-point architecture supports
lazy FP state preservation, where FP registers are not
pushed to the stack when an exception occurs but are instead
saved only if and when the first FP instruction in the exception
handler is executed. Implement this in QEMU, corresponding
to the check of LSPACT in the pseudocode ExecuteFPCheck().
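
As an aid to reading the stacking loop in the helper below, the
address arithmetic can be pulled out into two small functions. This
is a sketch: the function names are made up, and only the offsets
(4 bytes per S register, 8 bytes skipped from s16 onwards, FPSCR at
0x40) are taken from the code in this patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Address at which single-precision register s<sreg> is stacked,
 * relative to the FPCAR value saved at exception entry.
 */
static uint32_t lazyfp_sreg_addr(uint32_t fpcar, unsigned sreg)
{
    uint32_t addr;

    assert(sreg < 32);
    addr = fpcar + 4u * sreg;
    if (sreg >= 16) {
        addr += 8; /* skip the slot for the FPSCR */
    }
    return addr;
}

/* Address at which the FPSCR is stacked. */
static uint32_t lazyfp_fpscr_addr(uint32_t fpcar)
{
    return fpcar + 0x40;
}
```

For example, with FPCAR at 0x1000, s15 lands at 0x103c, the FPSCR
at 0x1040 and s16 at 0x1048. The helper writes registers in pairs
(one D register at a time), which gives the same addresses.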

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-24-peter.maydell@linaro.org
---
 target/arm/cpu.h       |   3 ++
 target/arm/helper.h    |   2 +
 target/arm/translate.h |   1 +
 target/arm/helper.c    | 112 +++++++++++++++++++++++++++++++++++++++++
 target/arm/translate.c |  22 ++++++++
 5 files changed, 140 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@
 #define EXCP_NOCP           17   /* v7M NOCP UsageFault */
 #define EXCP_INVSTATE       18   /* v7M INVSTATE UsageFault */
 #define EXCP_STKOF          19   /* v8M STKOF UsageFault */
+#define EXCP_LAZYFP         20   /* v7M fault during lazy FP stacking */
 /* NB: add new EXCP_ defines to the array in arm_log_exception() too */
 
 #define ARMV7M_EXCP_RESET   1
@@ -XXX,XX +XXX,XX @@ FIELD(TBFLAG_A32, NS, 6, 1)
 FIELD(TBFLAG_A32, VFPEN, 7, 1)
 FIELD(TBFLAG_A32, CONDEXEC, 8, 8)
 FIELD(TBFLAG_A32, SCTLR_B, 16, 1)
+/* For M profile only, set if FPCCR.LSPACT is set */
+FIELD(TBFLAG_A32, LSPACT, 18, 1)
 /* For M profile only, set if we must create a new FP context */
 FIELD(TBFLAG_A32, NEW_FP_CTXT_NEEDED, 19, 1)
 /* For M profile only, set if FPCCR.S does not match current security state */
diff --git a/target/arm/helper.h b/target/arm/helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_2(v7m_blxns, void, env, i32)
 
 DEF_HELPER_3(v7m_tt, i32, env, i32, i32)
 
+DEF_HELPER_1(v7m_preserve_fp_state, void, env)
+
 DEF_HELPER_2(v8m_stackcheck, void, env, i32)
 
 DEF_HELPER_4(access_check_cp_reg, void, env, ptr, i32, i32)
diff --git a/target/arm/translate.h b/target/arm/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ typedef struct DisasContext {
     bool v8m_stackcheck; /* true if we need to perform v8M stack limit checks */
     bool v8m_fpccr_s_wrong; /* true if v8M FPCCR.S != v8m_secure */
     bool v7m_new_fp_ctxt_needed; /* ASPEN set but no active FP context */
+    bool v7m_lspact; /* FPCCR.LSPACT set */
     /* Immediate value in AArch32 SVC insn; must be set if is_jmp == DISAS_SWI
      * so that top level loop can generate correct syndrome information.
      */
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(v7m_blxns)(CPUARMState *env, uint32_t dest)
     g_assert_not_reached();
 }
 
+void HELPER(v7m_preserve_fp_state)(CPUARMState *env)
+{
+    /* translate.c should never generate calls here in user-only mode */
+    g_assert_not_reached();
+}
+
 uint32_t HELPER(v7m_tt)(CPUARMState *env, uint32_t addr, uint32_t op)
 {
     /* The TT instructions can be used by unprivileged code, but in
@@ -XXX,XX +XXX,XX @@ pend_fault:
     return false;
 }
 
+void HELPER(v7m_preserve_fp_state)(CPUARMState *env)
+{
+    /*
+     * Preserve FP state (because LSPACT was set and we are about
+     * to execute an FP instruction). This corresponds to the
+     * PreserveFPState() pseudocode.
+     * We may throw an exception if the stacking fails.
+     */
+    ARMCPU *cpu = arm_env_get_cpu(env);
+    bool is_secure = env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_S_MASK;
+    bool negpri = !(env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_HFRDY_MASK);
+    bool is_priv = !(env->v7m.fpccr[is_secure] & R_V7M_FPCCR_USER_MASK);
+    bool splimviol = env->v7m.fpccr[is_secure] & R_V7M_FPCCR_SPLIMVIOL_MASK;
+    uint32_t fpcar = env->v7m.fpcar[is_secure];
+    bool stacked_ok = true;
+    bool ts = is_secure && (env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_TS_MASK);
+    bool take_exception;
+
+    /* Take the iothread lock as we are going to touch the NVIC */
+    qemu_mutex_lock_iothread();
+
+    /* Check the background context had access to the FPU */
+    if (!v7m_cpacr_pass(env, is_secure, is_priv)) {
+        armv7m_nvic_set_pending_lazyfp(env->nvic, ARMV7M_EXCP_USAGE, is_secure);
+        env->v7m.cfsr[is_secure] |= R_V7M_CFSR_NOCP_MASK;
+        stacked_ok = false;
+    } else if (!is_secure && !extract32(env->v7m.nsacr, 10, 1)) {
+        armv7m_nvic_set_pending_lazyfp(env->nvic, ARMV7M_EXCP_USAGE, M_REG_S);
+        env->v7m.cfsr[M_REG_S] |= R_V7M_CFSR_NOCP_MASK;
+        stacked_ok = false;
+    }
+
+    if (!splimviol && stacked_ok) {
+        /* We only stack if the stack limit wasn't violated */
+        int i;
+        ARMMMUIdx mmu_idx;
+
+        mmu_idx = arm_v7m_mmu_idx_all(env, is_secure, is_priv, negpri);
+        for (i = 0; i < (ts ? 32 : 16); i += 2) {
+            uint64_t dn = *aa32_vfp_dreg(env, i / 2);
+            uint32_t faddr = fpcar + 4 * i;
+            uint32_t slo = extract64(dn, 0, 32);
+            uint32_t shi = extract64(dn, 32, 32);
+
+            if (i >= 16) {
+                faddr += 8; /* skip the slot for the FPSCR */
+            }
+            stacked_ok = stacked_ok &&
+                v7m_stack_write(cpu, faddr, slo, mmu_idx, STACK_LAZYFP) &&
+                v7m_stack_write(cpu, faddr + 4, shi, mmu_idx, STACK_LAZYFP);
+        }
+
+        stacked_ok = stacked_ok &&
+            v7m_stack_write(cpu, fpcar + 0x40,
+                            vfp_get_fpscr(env), mmu_idx, STACK_LAZYFP);
+    }
+
+    /*
+     * We definitely pended an exception, but it's possible that it
+     * might not be able to be taken now. If its priority permits us
+     * to take it now, then we must not update the LSPACT or FP regs,
+     * but instead jump out to take the exception immediately.
+     * If it's just pending and won't be taken until the current
+     * handler exits, then we do update LSPACT and the FP regs.
+     */
+    take_exception = !stacked_ok &&
+        armv7m_nvic_can_take_pending_exception(env->nvic);
+
+    qemu_mutex_unlock_iothread();
+
+    if (take_exception) {
+        raise_exception_ra(env, EXCP_LAZYFP, 0, 1, GETPC());
+    }
+
+    env->v7m.fpccr[is_secure] &= ~R_V7M_FPCCR_LSPACT_MASK;
+
+    if (ts) {
+        /* Clear s0 to s31 and the FPSCR */
+        int i;
+
+        for (i = 0; i < 32; i += 2) {
+            *aa32_vfp_dreg(env, i / 2) = 0;
+        }
+        vfp_set_fpscr(env, 0);
+    }
+    /*
+     * Otherwise s0 to s15 and FPSCR are UNKNOWN; we choose to leave them
+     * unchanged.
+     */
+}
+
 /* Write to v7M CONTROL.SPSEL bit for the specified security bank.
  * This may change the current stack pointer between Main and Process
  * stack pointers if it is done for the CONTROL register for the current
@@ -XXX,XX +XXX,XX @@ static void arm_log_exception(int idx)
             [EXCP_NOCP] = "v7M NOCP UsageFault",
             [EXCP_INVSTATE] = "v7M INVSTATE UsageFault",
             [EXCP_STKOF] = "v8M STKOF UsageFault",
+            [EXCP_LAZYFP] = "v7M exception during lazy FP stacking",
         };
 
         if (idx >= 0 && idx < ARRAY_SIZE(excnames)) {
@@ -XXX,XX +XXX,XX @@ void arm_v7m_cpu_do_interrupt(CPUState *cs)
             return;
         }
         break;
+    case EXCP_LAZYFP:
+        /*
+         * We already pended the specific exception in the NVIC in the
+         * v7m_preserve_fp_state() helper function.
+         */
+        break;
     default:
         cpu_abort(cs, "Unhandled exception 0x%x\n", cs->exception_index);
         return; /* Never happens.  Keep compiler happy.  */
@@ -XXX,XX +XXX,XX @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
         flags = FIELD_DP32(flags, TBFLAG_A32, NEW_FP_CTXT_NEEDED, 1);
     }
 
+    if (arm_feature(env, ARM_FEATURE_M)) {
+        bool is_secure = env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_S_MASK;
+
+        if (env->v7m.fpccr[is_secure] & R_V7M_FPCCR_LSPACT_MASK) {
+            flags = FIELD_DP32(flags, TBFLAG_A32, LSPACT, 1);
+        }
+    }
+
     *pflags = flags;
     *cs_base = 0;
 }
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
     if (arm_dc_feature(s, ARM_FEATURE_M)) {
         /* Handle M-profile lazy FP state mechanics */
 
+        /* Trigger lazy-state preservation if necessary */
+        if (s->v7m_lspact) {
+            /*
+             * Lazy state saving affects external memory and also the NVIC,
+             * so we must mark it as an IO operation for icount.
+             */
+            if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
+                gen_io_start();
+            }
+            gen_helper_v7m_preserve_fp_state(cpu_env);
+            if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
+                gen_io_end();
+            }
+            /*
+             * If the preserve_fp_state helper doesn't throw an exception
+             * then it will clear LSPACT; we don't need to repeat this for
+             * any further FP insns in this TB.
+             */
+            s->v7m_lspact = false;
+        }
+
         /* Update ownership of FP context: set FPCCR.S to match current state */
         if (s->v8m_fpccr_s_wrong) {
             TCGv_i32 tmp;
@@ -XXX,XX +XXX,XX @@ static void arm_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
     dc->v8m_fpccr_s_wrong = FIELD_EX32(tb_flags, TBFLAG_A32, FPCCR_S_WRONG);
     dc->v7m_new_fp_ctxt_needed =
         FIELD_EX32(tb_flags, TBFLAG_A32, NEW_FP_CTXT_NEEDED);
+    dc->v7m_lspact = FIELD_EX32(tb_flags, TBFLAG_A32, LSPACT);
     dc->cp_regs = cpu->cp_regs;
     dc->features = env->features;
 
-- 
2.20.1
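
Reviewer note (not part of the patch): the address arithmetic in the
stacking loop above can be read as a fixed frame layout relative to FPCAR.
The sketch below restates it as a standalone function. The names
fp_frame_sreg_offset and FP_FRAME_FPSCR_OFFSET are hypothetical and do not
exist in QEMU; the offsets are derived only from the "4 * i", "+= 8" and
"fpcar + 0x40" expressions in the helper.

```c
#include <assert.h>
#include <stdint.h>

/* Offset of the FPSCR slot, taken from "fpcar + 0x40" in the helper. */
#define FP_FRAME_FPSCR_OFFSET 0x40u

/*
 * Hypothetical helper: byte offset from the frame base of single-precision
 * register S<sreg>. Registers s0..s15 are packed from offset 0; s16..s31
 * sit 8 bytes further on because the loop skips the FPSCR slot at 0x40 and
 * the word after it.
 */
static uint32_t fp_frame_sreg_offset(unsigned sreg)
{
    assert(sreg < 32);
    uint32_t off = 4u * sreg;
    if (sreg >= 16) {
        off += 8;
    }
    return off;
}
```

With this layout s15 ends just below the FPSCR slot and s16 starts just
above the skipped pair of words, which is why the loop only adjusts the
address once i reaches 16.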

Implement the VLSTM instruction for v7M for the FPU present case.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-25-peter.maydell@linaro.org
---
 target/arm/cpu.h       |  2 +
 target/arm/helper.h    |  2 +
 target/arm/helper.c    | 84 ++++++++++++++++++++++++++++++++++++++++++
 target/arm/translate.c | 15 +++++++-
 4 files changed, 102 insertions(+), 1 deletion(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@
 #define EXCP_INVSTATE       18   /* v7M INVSTATE UsageFault */
 #define EXCP_STKOF          19   /* v8M STKOF UsageFault */
 #define EXCP_LAZYFP         20   /* v7M fault during lazy FP stacking */
+#define EXCP_LSERR          21   /* v8M LSERR SecureFault */
+#define EXCP_UNALIGNED      22   /* v7M UNALIGNED UsageFault */
 /* NB: add new EXCP_ defines to the array in arm_log_exception() too */
 
 #define ARMV7M_EXCP_RESET   1
diff --git a/target/arm/helper.h b/target/arm/helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_3(v7m_tt, i32, env, i32, i32)
 
 DEF_HELPER_1(v7m_preserve_fp_state, void, env)
 
+DEF_HELPER_2(v7m_vlstm, void, env, i32)
+
 DEF_HELPER_2(v8m_stackcheck, void, env, i32)
 
 DEF_HELPER_4(access_check_cp_reg, void, env, ptr, i32, i32)
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(v7m_preserve_fp_state)(CPUARMState *env)
     g_assert_not_reached();
 }
 
+void HELPER(v7m_vlstm)(CPUARMState *env, uint32_t fptr)
+{
+    /* translate.c should never generate calls here in user-only mode */
+    g_assert_not_reached();
+}
+
 uint32_t HELPER(v7m_tt)(CPUARMState *env, uint32_t addr, uint32_t op)
 {
     /* The TT instructions can be used by unprivileged code, but in
@@ -XXX,XX +XXX,XX @@ static void v7m_update_fpccr(CPUARMState *env, uint32_t frameptr,
     }
 }
 
+void HELPER(v7m_vlstm)(CPUARMState *env, uint32_t fptr)
+{
+    /* fptr is the value of Rn, the frame pointer we store the FP regs to */
+    bool s = env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_S_MASK;
+    bool lspact = env->v7m.fpccr[s] & R_V7M_FPCCR_LSPACT_MASK;
+
+    assert(env->v7m.secure);
+
+    if (!(env->v7m.control[M_REG_S] & R_V7M_CONTROL_SFPA_MASK)) {
+        return;
+    }
+
+    /* Check access to the coprocessor is permitted */
+    if (!v7m_cpacr_pass(env, true, arm_current_el(env) != 0)) {
+        raise_exception_ra(env, EXCP_NOCP, 0, 1, GETPC());
+    }
+
+    if (lspact) {
+        /* LSPACT should not be active when there is active FP state */
+        raise_exception_ra(env, EXCP_LSERR, 0, 1, GETPC());
+    }
+
+    if (fptr & 7) {
+        raise_exception_ra(env, EXCP_UNALIGNED, 0, 1, GETPC());
+    }
+
+    /*
+     * Note that we do not use v7m_stack_write() here, because the
+     * accesses should not set the FSR bits for stacking errors if they
+     * fail. (In pseudocode terms, they are AccType_NORMAL, not AccType_STACK
+     * or AccType_LAZYFP). Faults in cpu_stl_data() will throw exceptions
+     * and longjmp out.
+     */
+    if (!(env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_LSPEN_MASK)) {
+        bool ts = env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_TS_MASK;
+        int i;
+
+        for (i = 0; i < (ts ? 32 : 16); i += 2) {
+            uint64_t dn = *aa32_vfp_dreg(env, i / 2);
+            uint32_t faddr = fptr + 4 * i;
+            uint32_t slo = extract64(dn, 0, 32);
+            uint32_t shi = extract64(dn, 32, 32);
+
+            if (i >= 16) {
+                faddr += 8; /* skip the slot for the FPSCR */
+            }
+            cpu_stl_data(env, faddr, slo);
+            cpu_stl_data(env, faddr + 4, shi);
+        }
+        cpu_stl_data(env, fptr + 0x40, vfp_get_fpscr(env));
+
+        /*
+         * If TS is 0 then s0 to s15 and FPSCR are UNKNOWN; we choose to
+         * leave them unchanged, matching our choice in v7m_preserve_fp_state.
+         */
+        if (ts) {
+            for (i = 0; i < 32; i += 2) {
+                *aa32_vfp_dreg(env, i / 2) = 0;
+            }
+            vfp_set_fpscr(env, 0);
+        }
+    } else {
+        v7m_update_fpccr(env, fptr, false);
+    }
+
+    env->v7m.control[M_REG_S] &= ~R_V7M_CONTROL_FPCA_MASK;
+}
+
 static bool v7m_push_stack(ARMCPU *cpu)
 {
     /* Do the "set up stack frame" part of exception entry,
@@ -XXX,XX +XXX,XX @@ static void arm_log_exception(int idx)
             [EXCP_INVSTATE] = "v7M INVSTATE UsageFault",
             [EXCP_STKOF] = "v8M STKOF UsageFault",
             [EXCP_LAZYFP] = "v7M exception during lazy FP stacking",
+            [EXCP_LSERR] = "v8M LSERR SecureFault",
+            [EXCP_UNALIGNED] = "v7M UNALIGNED UsageFault",
         };
 
         if (idx >= 0 && idx < ARRAY_SIZE(excnames)) {
@@ -XXX,XX +XXX,XX @@ void arm_v7m_cpu_do_interrupt(CPUState *cs)
         armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE, env->v7m.secure);
         env->v7m.cfsr[env->v7m.secure] |= R_V7M_CFSR_STKOF_MASK;
         break;
+    case EXCP_LSERR:
+        armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_SECURE, false);
+        env->v7m.sfsr |= R_V7M_SFSR_LSERR_MASK;
+        break;
+    case EXCP_UNALIGNED:
+        armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE, env->v7m.secure);
+        env->v7m.cfsr[env->v7m.secure] |= R_V7M_CFSR_UNALIGNED_MASK;
+        break;
     case EXCP_SWI:
         /* The PC already points to the next instruction.  */
         armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_SVC, env->v7m.secure);
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
                 if (!s->v8m_secure || (insn & 0x0040f0ff)) {
                     goto illegal_op;
                 }
-                /* Just NOP since FP support is not implemented */
+
+                if (arm_dc_feature(s, ARM_FEATURE_VFP)) {
+                    TCGv_i32 fptr = load_reg(s, rn);
+
+                    if (extract32(insn, 20, 1)) {
+                        /* VLLDM */
+                    } else {
+                        gen_helper_v7m_vlstm(cpu_env, fptr);
+                    }
+                    tcg_temp_free_i32(fptr);
+
+                    /* End the TB, because we have updated FP control bits */
+                    s->base.is_jmp = DISAS_UPDATE;
+                }
                 break;
             }
             if (arm_dc_feature(s, ARM_FEATURE_VFP) &&
-- 
2.20.1
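
Reviewer note (not part of the patch): the order of the checks in
HELPER(v7m_vlstm) matters, because the first failing check decides which
fault is raised. The sketch below models only that ordering as a pure
function. VlstmInputs, VlstmOutcome and vlstm_classify are hypothetical
names for illustration; the real helper reads these bits from CPUARMState
and raises exceptions rather than returning a value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical outcome codes; they mirror the branches of the helper. */
typedef enum {
    VLSTM_NOP,             /* CONTROL_S.SFPA clear: return without saving */
    VLSTM_FAULT_NOCP,      /* coprocessor access check failed */
    VLSTM_FAULT_LSERR,     /* LSPACT already set */
    VLSTM_FAULT_UNALIGNED, /* frame pointer not 8-byte aligned */
    VLSTM_STORE_NOW,       /* FPCCR_S.LSPEN clear: registers stored now */
    VLSTM_DEFER_LAZY       /* FPCCR_S.LSPEN set: only FPCCR is updated */
} VlstmOutcome;

/* Hypothetical snapshot of the state bits the helper consults. */
typedef struct {
    bool sfpa;
    bool cpacr_ok;
    bool lspact;
    bool lspen;
    uint32_t fptr;
} VlstmInputs;

static VlstmOutcome vlstm_classify(const VlstmInputs *in)
{
    assert(in != NULL);
    if (!in->sfpa) {
        return VLSTM_NOP;
    }
    if (!in->cpacr_ok) {
        return VLSTM_FAULT_NOCP;
    }
    if (in->lspact) {
        return VLSTM_FAULT_LSERR;
    }
    if (in->fptr & 7) {
        return VLSTM_FAULT_UNALIGNED;
    }
    return in->lspen ? VLSTM_DEFER_LAZY : VLSTM_STORE_NOW;
}
```

In the real helper both non-faulting paths go on to clear CONTROL_S.FPCA;
the model stops at the decision and does not represent that final step.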

Implement the VLLDM instruction for v7M for the FPU present case.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-26-peter.maydell@linaro.org
---
 target/arm/helper.h    |  1 +
 target/arm/helper.c    | 54 ++++++++++++++++++++++++++++++++++++++++++
 target/arm/translate.c |  2 +-
 3 files changed, 56 insertions(+), 1 deletion(-)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_3(v7m_tt, i32, env, i32, i32)
 DEF_HELPER_1(v7m_preserve_fp_state, void, env)
 
 DEF_HELPER_2(v7m_vlstm, void, env, i32)
+DEF_HELPER_2(v7m_vlldm, void, env, i32)
 
 DEF_HELPER_2(v8m_stackcheck, void, env, i32)
 
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(v7m_vlstm)(CPUARMState *env, uint32_t fptr)
     g_assert_not_reached();
 }
 
+void HELPER(v7m_vlldm)(CPUARMState *env, uint32_t fptr)
+{
+    /* translate.c should never generate calls here in user-only mode */
+    g_assert_not_reached();
+}
+
 uint32_t HELPER(v7m_tt)(CPUARMState *env, uint32_t addr, uint32_t op)
 {
     /* The TT instructions can be used by unprivileged code, but in
@@ -XXX,XX +XXX,XX @@ void HELPER(v7m_vlstm)(CPUARMState *env, uint32_t fptr)
     env->v7m.control[M_REG_S] &= ~R_V7M_CONTROL_FPCA_MASK;
 }
 
+void HELPER(v7m_vlldm)(CPUARMState *env, uint32_t fptr)
+{
+    /* fptr is the value of Rn, the frame pointer we load the FP regs from */
+    assert(env->v7m.secure);
+
+    if (!(env->v7m.control[M_REG_S] & R_V7M_CONTROL_SFPA_MASK)) {
+        return;
+    }
+
+    /* Check access to the coprocessor is permitted */
+    if (!v7m_cpacr_pass(env, true, arm_current_el(env) != 0)) {
+        raise_exception_ra(env, EXCP_NOCP, 0, 1, GETPC());
+    }
+
+    if (env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_LSPACT_MASK) {
+        /* State in FP is still valid */
+        env->v7m.fpccr[M_REG_S] &= ~R_V7M_FPCCR_LSPACT_MASK;
+    } else {
+        bool ts = env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_TS_MASK;
+        int i;
+        uint32_t fpscr;
+
+        if (fptr & 7) {
+            raise_exception_ra(env, EXCP_UNALIGNED, 0, 1, GETPC());
+        }
+
+        for (i = 0; i < (ts ? 32 : 16); i += 2) {
+            uint32_t slo, shi;
+            uint64_t dn;
+            uint32_t faddr = fptr + 4 * i;
+
+            if (i >= 16) {
+                faddr += 8; /* skip the slot for the FPSCR */
+            }
+
+            slo = cpu_ldl_data(env, faddr);
+            shi = cpu_ldl_data(env, faddr + 4);
+
+            dn = (uint64_t) shi << 32 | slo;
+            *aa32_vfp_dreg(env, i / 2) = dn;
+        }
+        fpscr = cpu_ldl_data(env, fptr + 0x40);
+        vfp_set_fpscr(env, fpscr);
+    }
+
+    env->v7m.control[M_REG_S] |= R_V7M_CONTROL_FPCA_MASK;
+}
+
 static bool v7m_push_stack(ARMCPU *cpu)
 {
     /* Do the "set up stack frame" part of exception entry,
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
                     TCGv_i32 fptr = load_reg(s, rn);
 
                     if (extract32(insn, 20, 1)) {
-                        /* VLLDM */
+                        gen_helper_v7m_vlldm(cpu_env, fptr);
                     } else {
                         gen_helper_v7m_vlstm(cpu_env, fptr);
                     }
-- 
2.20.1
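
Reviewer note (not part of the patch): VLSTM splits each 64-bit D register
into two 32-bit words with extract64(), and VLLDM rebuilds it with
"(uint64_t) shi << 32 | slo". The sketch below shows that the two
operations are inverses. dreg_pack, dreg_lo, dreg_hi and frame_load_dreg
are hypothetical names, and the frame is modelled as a plain array of words
rather than guest memory accessed through cpu_ldl_data().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Combine two 32-bit halves: slo is the even S register, shi the odd one. */
static uint64_t dreg_pack(uint32_t slo, uint32_t shi)
{
    return ((uint64_t)shi << 32) | slo;
}

/* Low and high halves, equivalent to extract64(dn, 0, 32) and (dn, 32, 32). */
static uint32_t dreg_lo(uint64_t dn)
{
    return (uint32_t)(dn & 0xffffffffu);
}

static uint32_t dreg_hi(uint64_t dn)
{
    return (uint32_t)(dn >> 32);
}

/* Hypothetical: rebuild a D register from two adjacent words of a frame. */
static uint64_t frame_load_dreg(const uint32_t *frame, size_t n_words,
                                size_t word_idx)
{
    assert(frame != NULL && word_idx + 1 < n_words);
    return dreg_pack(frame[word_idx], frame[word_idx + 1]);
}
```

The cast to uint64_t before the shift is the important detail: shifting a
32-bit value left by 32 would be undefined behaviour in C.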

Enable the FPU by default for the Cortex-M4 and Cortex-M33.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-27-peter.maydell@linaro.org
---
 target/arm/cpu.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void cortex_m4_initfn(Object *obj)
     set_feature(&cpu->env, ARM_FEATURE_M);
     set_feature(&cpu->env, ARM_FEATURE_M_MAIN);
     set_feature(&cpu->env, ARM_FEATURE_THUMB_DSP);
+    set_feature(&cpu->env, ARM_FEATURE_VFP4);
     cpu->midr = 0x410fc240; /* r0p0 */
     cpu->pmsav7_dregion = 8;
+    cpu->isar.mvfr0 = 0x10110021;
+    cpu->isar.mvfr1 = 0x11000011;
+    cpu->isar.mvfr2 = 0x00000000;
     cpu->id_pfr0 = 0x00000030;
     cpu->id_pfr1 = 0x00000200;
     cpu->id_dfr0 = 0x00100000;
@@ -XXX,XX +XXX,XX @@ static void cortex_m33_initfn(Object *obj)
     set_feature(&cpu->env, ARM_FEATURE_M_MAIN);
     set_feature(&cpu->env, ARM_FEATURE_M_SECURITY);
     set_feature(&cpu->env, ARM_FEATURE_THUMB_DSP);
+    set_feature(&cpu->env, ARM_FEATURE_VFP4);
     cpu->midr = 0x410fd213; /* r0p3 */
     cpu->pmsav7_dregion = 16;
     cpu->sau_sregion = 8;
+    cpu->isar.mvfr0 = 0x10110021;
+    cpu->isar.mvfr1 = 0x11000011;
+    cpu->isar.mvfr2 = 0x00000040;
     cpu->id_pfr0 = 0x00000030;
     cpu->id_pfr1 = 0x00000210;
     cpu->id_dfr0 = 0x00200000;
-- 
2.20.1
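
Reviewer note (not part of the patch): the MVFR values set above are
packed 4-bit ID fields. The sketch below extracts one field at a time so
the constants are easier to check against the CPU's documentation. field4
is a hypothetical helper, not a QEMU function. The field positions named in
the comments (FPSP at bits [7:4], FPDP at bits [11:8] of MVFR0) are my
reading of the Arm architecture manuals and should be verified there.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the 4-bit field of 'reg' starting at bit
 * 'shift'. 'shift' must be a multiple of 4 in the range 0..28.
 */
static uint32_t field4(uint32_t reg, unsigned shift)
{
    assert(shift <= 28 && shift % 4 == 0);
    return (reg >> shift) & 0xfu;
}

/* Assumed positions within MVFR0; check against the architecture manual. */
#define MVFR0_FPSP_SHIFT 4u  /* single-precision support */
#define MVFR0_FPDP_SHIFT 8u  /* double-precision support */
```

For the MVFR0 value 0x10110021 used for both CPUs, the field at bits [7:4]
is 2 and the field at bits [11:8] is 0. The only difference between the two
init functions is MVFR2, where the Cortex-M33 value 0x00000040 has a
non-zero field at bits [7:4].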

From: Philippe Mathieu-Daudé <philmd@redhat.com>

Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190412165416.7977-2-philmd@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/aspeed.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/hw/arm/aspeed.c b/hw/arm/aspeed.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/aspeed.c
+++ b/hw/arm/aspeed.c
@@ -XXX,XX +XXX,XX @@
 #include "hw/arm/aspeed_soc.h"
 #include "hw/boards.h"
 #include "hw/i2c/smbus_eeprom.h"
+#include "hw/misc/pca9552.h"
+#include "hw/misc/tmp105.h"
 #include "qemu/log.h"
 #include "sysemu/block-backend.h"
 #include "hw/loader.h"
@@ -XXX,XX +XXX,XX @@ static void ast2500_evb_i2c_init(AspeedBoardState *bmc)
                           eeprom_buf);
 
     /* The AST2500 EVB expects a LM75 but a TMP105 is compatible */
-    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 7), "tmp105", 0x4d);
+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 7),
+                     TYPE_TMP105, 0x4d);
 
     /* The AST2500 EVB does not have an RTC. Let's pretend that one is
      * plugged on the I2C bus header */
@@ -XXX,XX +XXX,XX @@ static void witherspoon_bmc_i2c_init(AspeedBoardState *bmc)
     AspeedSoCState *soc = &bmc->soc;
     uint8_t *eeprom_buf = g_malloc0(8 * 1024);
 
-    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 3), "pca9552", 0x60);
+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 3), TYPE_PCA9552,
+                     0x60);
 
     i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 4), "tmp423", 0x4c);
     i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 5), "tmp423", 0x4c);
 
     /* The Witherspoon expects a TMP275 but a TMP105 is compatible */
-    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 9), "tmp105", 0x4a);
+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 9), TYPE_TMP105,
+                     0x4a);
 
     /* The witherspoon board expects Epson RX8900 I2C RTC but a ds1338 is
      * good enough */
@@ -XXX,XX +XXX,XX @@ static void witherspoon_bmc_i2c_init(AspeedBoardState *bmc)
 
     smbus_eeprom_init_one(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 11), 0x51,
                           eeprom_buf);
-    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 11), "pca9552",
+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 11), TYPE_PCA9552,
                      0x60);
 }
 
-- 
2.20.1

From: Philippe Mathieu-Daudé <philmd@redhat.com>

Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190412165416.7977-3-philmd@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/nseries.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/arm/nseries.c b/hw/arm/nseries.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/nseries.c
+++ b/hw/arm/nseries.c
@@ -XXX,XX +XXX,XX @@
 #include "hw/boards.h"
 #include "hw/i2c/i2c.h"
 #include "hw/devices.h"
+#include "hw/misc/tmp105.h"
 #include "hw/block/flash.h"
 #include "hw/hw.h"
 #include "hw/bt.h"
@@ -XXX,XX +XXX,XX @@ static void n8x0_i2c_setup(struct n800_s *s)
     qemu_register_powerdown_notifier(&n8x0_system_powerdown_notifier);
 
     /* Attach a TMP105 PM chip (A0 wired to ground) */
-    dev = i2c_create_slave(i2c, "tmp105", N8X0_TMP105_ADDR);
+    dev = i2c_create_slave(i2c, TYPE_TMP105, N8X0_TMP105_ADDR);
     qdev_connect_gpio_out(dev, 0, tmp_irq);
 }
 
-- 
2.20.1

From: Philippe Mathieu-Daudé <philmd@redhat.com>

No code used the tc6393xb_gpio_in_get() and tc6393xb_gpio_out_set()
functions since their introduction in commit 88d2c950b002. Time to
remove them.

Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190412165416.7977-4-philmd@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/devices.h  |  3 ---
 hw/display/tc6393xb.c | 16 ----------------
 2 files changed, 19 deletions(-)

diff --git a/include/hw/devices.h b/include/hw/devices.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/devices.h
+++ b/include/hw/devices.h
@@ -XXX,XX +XXX,XX @@ void retu_key_event(void *retu, int state);
 typedef struct TC6393xbState TC6393xbState;
 TC6393xbState *tc6393xb_init(struct MemoryRegion *sysmem,
                              uint32_t base, qemu_irq irq);
-void tc6393xb_gpio_out_set(TC6393xbState *s, int line,
-                    qemu_irq handler);
-qemu_irq *tc6393xb_gpio_in_get(TC6393xbState *s);
 qemu_irq tc6393xb_l3v_get(TC6393xbState *s);
 
 #endif
diff --git a/hw/display/tc6393xb.c b/hw/display/tc6393xb.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/display/tc6393xb.c
+++ b/hw/display/tc6393xb.c
@@ -XXX,XX +XXX,XX @@ struct TC6393xbState {
              blanked : 1;
 };
 
-qemu_irq *tc6393xb_gpio_in_get(TC6393xbState *s)
-{
-    return s->gpio_in;
-}
-
 static void tc6393xb_gpio_set(void *opaque, int line, int level)
 {
 //    TC6393xbState *s = opaque;
@@ -XXX,XX +XXX,XX @@ static void tc6393xb_gpio_set(void *opaque, int line, int level)
     // FIXME: how does the chip reflect the GPIO input level change?
 }
 
-void tc6393xb_gpio_out_set(TC6393xbState *s, int line,
-                    qemu_irq handler)
-{
-    if (line >= TC6393XB_GPIOS) {
-        fprintf(stderr, "TC6393xb: no GPIO pin %d\n", line);
-        return;
-    }
-
-    s->handler[line] = handler;
-}
-
 static void tc6393xb_gpio_handler_update(TC6393xbState *s)
 {
     uint32_t level, diff;
-- 
2.20.1

From: Philippe Mathieu-Daudé <philmd@redhat.com>

Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190412165416.7977-5-philmd@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/devices.h          |  6 ------
 include/hw/display/tc6393xb.h | 24 ++++++++++++++++++++++++
 hw/arm/tosa.c                 |  2 +-
 hw/display/tc6393xb.c         |  2 +-
 MAINTAINERS                   |  1 +
 5 files changed, 27 insertions(+), 8 deletions(-)
 create mode 100644 include/hw/display/tc6393xb.h

diff --git a/include/hw/devices.h b/include/hw/devices.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/devices.h
+++ b/include/hw/devices.h
@@ -XXX,XX +XXX,XX @@ void *tahvo_init(qemu_irq irq, int betty);
 
 void retu_key_event(void *retu, int state);
 
-/* tc6393xb.c */
-typedef struct TC6393xbState TC6393xbState;
-TC6393xbState *tc6393xb_init(struct MemoryRegion *sysmem,
-                             uint32_t base, qemu_irq irq);
-qemu_irq tc6393xb_l3v_get(TC6393xbState *s);
-
 #endif
diff --git a/include/hw/display/tc6393xb.h b/include/hw/display/tc6393xb.h
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/include/hw/display/tc6393xb.h
@@ -XXX,XX +XXX,XX @@
+/*
+ * Toshiba TC6393XB I/O Controller.
+ * Found in Sharp Zaurus SL-6000 (tosa) or some
+ * Toshiba e-Series PDAs.
+ *
+ * Copyright (c) 2007 Hervé Poussineau
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef HW_DISPLAY_TC6393XB_H
+#define HW_DISPLAY_TC6393XB_H
+
+#include "exec/memory.h"
+#include "hw/irq.h"
+
+typedef struct TC6393xbState TC6393xbState;
+
+TC6393xbState *tc6393xb_init(struct MemoryRegion *sysmem,
+                             uint32_t base, qemu_irq irq);
+qemu_irq tc6393xb_l3v_get(TC6393xbState *s);
+
+#endif
diff --git a/hw/arm/tosa.c b/hw/arm/tosa.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/tosa.c
+++ b/hw/arm/tosa.c
@@ -XXX,XX +XXX,XX @@
 #include "hw/hw.h"
 #include "hw/arm/pxa.h"
 #include "hw/arm/arm.h"
-#include "hw/devices.h"
 #include "hw/arm/sharpsl.h"
 #include "hw/pcmcia.h"
 #include "hw/boards.h"
+#include "hw/display/tc6393xb.h"
 #include "hw/i2c/i2c.h"
 #include "hw/ssi/ssi.h"
 #include "hw/sysbus.h"
diff --git a/hw/display/tc6393xb.c b/hw/display/tc6393xb.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/display/tc6393xb.c
+++ b/hw/display/tc6393xb.c
@@ -XXX,XX +XXX,XX @@
 #include "qapi/error.h"
 #include "qemu/host-utils.h"
 #include "hw/hw.h"
-#include "hw/devices.h"
+#include "hw/display/tc6393xb.h"
 #include "hw/block/flash.h"
 #include "ui/console.h"
 #include "ui/pixel_ops.h"
diff --git a/MAINTAINERS b/MAINTAINERS
index XXXXXXX..XXXXXXX 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -XXX,XX +XXX,XX @@ F: hw/misc/mst_fpga.c
 F: hw/misc/max111x.c
 F: include/hw/arm/pxa.h
 F: include/hw/arm/sharpsl.h
+F: include/hw/display/tc6393xb.h
 
 SABRELITE / i.MX6
 M: Peter Maydell <peter.maydell@linaro.org>
-- 
2.20.1

From: Philippe Mathieu-Daudé <philmd@redhat.com>

Add an entry for the Blizzard device in MAINTAINERS.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190412165416.7977-6-philmd@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/devices.h          |  7 -------
 include/hw/display/blizzard.h | 22 ++++++++++++++++++++++
 hw/arm/nseries.c              |  1 +
 hw/display/blizzard.c         |  2 +-
 MAINTAINERS                   |  2 ++
 5 files changed, 26 insertions(+), 8 deletions(-)
 create mode 100644 include/hw/display/blizzard.h

diff --git a/include/hw/devices.h b/include/hw/devices.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/devices.h
+++ b/include/hw/devices.h
@@ -XXX,XX +XXX,XX @@ void tsc2005_set_transform(void *opaque, MouseTransformInfo *info);
 /* stellaris_input.c */
 void stellaris_gamepad_init(int n, qemu_irq *irq, const int *keycode);
 
-/* blizzard.c */
-void *s1d13745_init(qemu_irq gpio_int);
-void s1d13745_write(void *opaque, int dc, uint16_t value);
-void s1d13745_write_block(void *opaque, int dc,
-                void *buf, size_t len, int pitch);
-uint16_t s1d13745_read(void *opaque, int dc);
-
 /* cbus.c */
 typedef struct {
     qemu_irq clk;
diff --git a/include/hw/display/blizzard.h b/include/hw/display/blizzard.h
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/include/hw/display/blizzard.h
@@ -XXX,XX +XXX,XX @@
+/*
+ * Epson S1D13744/S1D13745 (Blizzard/Hailstorm/Tornado) LCD/TV controller.
+ *
+ * Copyright (C) 2008 Nokia Corporation
+ * Written by Andrzej Zaborowski
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef HW_DISPLAY_BLIZZARD_H
+#define HW_DISPLAY_BLIZZARD_H
+
+#include "hw/irq.h"
+
+void *s1d13745_init(qemu_irq gpio_int);
+void s1d13745_write(void *opaque, int dc, uint16_t value);
+void s1d13745_write_block(void *opaque, int dc,
+                          void *buf, size_t len, int pitch);
+uint16_t s1d13745_read(void *opaque, int dc);
+
+#endif
diff --git a/hw/arm/nseries.c b/hw/arm/nseries.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/nseries.c
+++ b/hw/arm/nseries.c
@@ -XXX,XX +XXX,XX @@
 #include "hw/boards.h"
 #include "hw/i2c/i2c.h"
 #include "hw/devices.h"
+#include "hw/display/blizzard.h"
 #include "hw/misc/tmp105.h"
 #include "hw/block/flash.h"
 #include "hw/hw.h"
diff --git a/hw/display/blizzard.c b/hw/display/blizzard.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/display/blizzard.c
+++ b/hw/display/blizzard.c
@@ -XXX,XX +XXX,XX @@
 #include "qemu/osdep.h"
 #include "qemu-common.h"
 #include "ui/console.h"
-#include "hw/devices.h"
+#include "hw/display/blizzard.h"
 #include "ui/pixel_ops.h"
 
 typedef void (*blizzard_fn_t)(uint8_t *, const uint8_t *, unsigned int);
diff --git a/MAINTAINERS b/MAINTAINERS
index XXXXXXX..XXXXXXX 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -XXX,XX +XXX,XX @@ M: Peter Maydell <peter.maydell@linaro.org>
 L: qemu-arm@nongnu.org
 S: Odd Fixes
 F: hw/arm/nseries.c
+F: hw/display/blizzard.c
 F: hw/input/lm832x.c
 F: hw/input/tsc2005.c
 F: hw/misc/cbus.c
 F: hw/timer/twl92230.c
+F: include/hw/display/blizzard.h
 
 Palm
 M: Andrzej Zaborowski <balrogg@gmail.com>
-- 
2.20.1

From: Philippe Mathieu-Daudé <philmd@redhat.com>

Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190412165416.7977-7-philmd@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/devices.h   | 14 --------------
 include/hw/misc/cbus.h | 32 ++++++++++++++++++++++++++++++++
 hw/arm/nseries.c       |  1 +
 hw/misc/cbus.c         |  2 +-
 MAINTAINERS            |  1 +
 5 files changed, 35 insertions(+), 15 deletions(-)
 create mode 100644 include/hw/misc/cbus.h

diff --git a/include/hw/devices.h b/include/hw/devices.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/devices.h
+++ b/include/hw/devices.h
@@ -XXX,XX +XXX,XX @@ void tsc2005_set_transform(void *opaque, MouseTransformInfo *info);
 /* stellaris_input.c */
 void stellaris_gamepad_init(int n, qemu_irq *irq, const int *keycode);
 
-/* cbus.c */
-typedef struct {
-    qemu_irq clk;
-    qemu_irq dat;
-    qemu_irq sel;
-} CBus;
-CBus *cbus_init(qemu_irq dat_out);
-void cbus_attach(CBus *bus, void *slave_opaque);
-
-void *retu_init(qemu_irq irq, int vilma);
-void *tahvo_init(qemu_irq irq, int betty);
-
-void retu_key_event(void *retu, int state);
-
 #endif
diff --git a/include/hw/misc/cbus.h b/include/hw/misc/cbus.h
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/include/hw/misc/cbus.h
@@ -XXX,XX +XXX,XX @@
+/*
+ * CBUS three-pin bus and the Retu / Betty / Tahvo / Vilma / Avilma /
+ * Hinku / Vinku / Ahne / Pihi chips used in various Nokia platforms.
+ * Based on reverse-engineering of a linux driver.
+ *
+ * Copyright (C) 2008 Nokia Corporation
+ * Written by Andrzej Zaborowski
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef HW_MISC_CBUS_H
+#define HW_MISC_CBUS_H
+
+#include "hw/irq.h"
+
+typedef struct {
+    qemu_irq clk;
+    qemu_irq dat;
+    qemu_irq sel;
+} CBus;
+
+CBus *cbus_init(qemu_irq dat_out);
+void cbus_attach(CBus *bus, void *slave_opaque);
+
+void *retu_init(qemu_irq irq, int vilma);
+void *tahvo_init(qemu_irq irq, int betty);
+
+void retu_key_event(void *retu, int state);
+
+#endif
diff --git a/hw/arm/nseries.c b/hw/arm/nseries.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/nseries.c
+++ b/hw/arm/nseries.c
@@ -XXX,XX +XXX,XX @@
 #include "hw/i2c/i2c.h"
 #include "hw/devices.h"
 #include "hw/display/blizzard.h"
+#include "hw/misc/cbus.h"
 #include "hw/misc/tmp105.h"
 #include "hw/block/flash.h"
 #include "hw/hw.h"
diff --git a/hw/misc/cbus.c b/hw/misc/cbus.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/misc/cbus.c
+++ b/hw/misc/cbus.c
@@ -XXX,XX +XXX,XX @@
 #include "qemu/osdep.h"
 #include "hw/hw.h"
 #include "hw/irq.h"
-#include "hw/devices.h"
+#include "hw/misc/cbus.h"
 #include "sysemu/sysemu.h"
 
 //#define DEBUG
diff --git a/MAINTAINERS b/MAINTAINERS
index XXXXXXX..XXXXXXX 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -XXX,XX +XXX,XX @@ F: hw/input/tsc2005.c
 F: hw/misc/cbus.c
 F: hw/timer/twl92230.c
 F: include/hw/display/blizzard.h
+F: include/hw/misc/cbus.h
 
 Palm
 M: Andrzej Zaborowski <balrogg@gmail.com>
-- 
2.20.1

From: Philippe Mathieu-Daudé <philmd@redhat.com>

Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190412165416.7977-8-philmd@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/devices.h       |  3 ---
 include/hw/input/gamepad.h | 19 +++++++++++++++++++
 hw/arm/stellaris.c         |  2 +-
 hw/input/stellaris_input.c |  2 +-
 MAINTAINERS                |  1 +
 5 files changed, 22 insertions(+), 5 deletions(-)
 create mode 100644 include/hw/input/gamepad.h

diff --git a/include/hw/devices.h b/include/hw/devices.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/devices.h
+++ b/include/hw/devices.h
@@ -XXX,XX +XXX,XX @@ void *tsc2005_init(qemu_irq pintdav);
 uint32_t tsc2005_txrx(void *opaque, uint32_t value, int len);
 void tsc2005_set_transform(void *opaque, MouseTransformInfo *info);
 
-/* stellaris_input.c */
-void stellaris_gamepad_init(int n, qemu_irq *irq, const int *keycode);
-
 #endif
diff --git a/include/hw/input/gamepad.h b/include/hw/input/gamepad.h
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/include/hw/input/gamepad.h
@@ -XXX,XX +XXX,XX @@
+/*
+ * Gamepad style buttons connected to IRQ/GPIO lines
+ *
+ * Copyright (c) 2007 CodeSourcery.
+ * Written by Paul Brook
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef HW_INPUT_GAMEPAD_H
+#define HW_INPUT_GAMEPAD_H
+
+#include "hw/irq.h"
+
+/* stellaris_input.c */
+void stellaris_gamepad_init(int n, qemu_irq *irq, const int *keycode);
+
+#endif
diff --git a/hw/arm/stellaris.c b/hw/arm/stellaris.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/stellaris.c
+++ b/hw/arm/stellaris.c
@@ -XXX,XX +XXX,XX @@
 #include "hw/sysbus.h"
 #include "hw/ssi/ssi.h"
 #include "hw/arm/arm.h"
-#include "hw/devices.h"
 #include "qemu/timer.h"
 #include "hw/i2c/i2c.h"
 #include "net/net.h"
@@ -XXX,XX +XXX,XX @@
 #include "sysemu/sysemu.h"
 #include "hw/arm/armv7m.h"
 #include "hw/char/pl011.h"
+#include "hw/input/gamepad.h"
 #include "hw/watchdog/cmsdk-apb-watchdog.h"
 #include "hw/misc/unimp.h"
 #include "cpu.h"
diff --git a/hw/input/stellaris_input.c b/hw/input/stellaris_input.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/input/stellaris_input.c
+++ b/hw/input/stellaris_input.c
@@ -XXX,XX +XXX,XX @@
  */
 #include "qemu/osdep.h"
 #include "hw/hw.h"
-#include "hw/devices.h"
+#include "hw/input/gamepad.h"
 #include "ui/console.h"
 
 typedef struct {
diff --git a/MAINTAINERS b/MAINTAINERS
index XXXXXXX..XXXXXXX 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -XXX,XX +XXX,XX @@ M: Peter Maydell <peter.maydell@linaro.org>
 L: qemu-arm@nongnu.org
 S: Maintained
 F: hw/*/stellaris*
+F: include/hw/input/gamepad.h
 
 Versatile Express
 M: Peter Maydell <peter.maydell@linaro.org>
-- 
2.20.1

From: Philippe Mathieu-Daudé <philmd@redhat.com>

Since uWireSlave is only used in this new header, there is no
need to expose it via "qemu/typedefs.h".

Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190412165416.7977-9-philmd@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/arm/omap.h      |  6 +-----
 include/hw/devices.h       | 15 ---------------
 include/hw/input/tsc2xxx.h | 36 ++++++++++++++++++++++++++++++++++++
 include/qemu/typedefs.h    |  1 -
 hw/arm/nseries.c           |  2 +-
 hw/arm/palm.c              |  2 +-
 hw/input/tsc2005.c         |  2 +-
 hw/input/tsc210x.c         |  4 ++--
 MAINTAINERS                |  2 ++
 9 files changed, 44 insertions(+), 26 deletions(-)
 create mode 100644 include/hw/input/tsc2xxx.h

diff --git a/include/hw/arm/omap.h b/include/hw/arm/omap.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/arm/omap.h
+++ b/include/hw/arm/omap.h
@@ -XXX,XX +XXX,XX @@
 #include "exec/memory.h"
 # define hw_omap_h		"omap.h"
 #include "hw/irq.h"
+#include "hw/input/tsc2xxx.h"
 #include "target/arm/cpu-qom.h"
 #include "qemu/log.h"
 
@@ -XXX,XX +XXX,XX @@ qemu_irq *omap_mpuio_in_get(struct omap_mpuio_s *s);
 void omap_mpuio_out_set(struct omap_mpuio_s *s, int line, qemu_irq handler);
 void omap_mpuio_key(struct omap_mpuio_s *s, int row, int col, int down);
 
-struct uWireSlave {
-    uint16_t (*receive)(void *opaque);
-    void (*send)(void *opaque, uint16_t data);
-    void *opaque;
-};
 struct omap_uwire_s;
 void omap_uwire_attach(struct omap_uwire_s *s,
                 uWireSlave *slave, int chipselect);
diff --git a/include/hw/devices.h b/include/hw/devices.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/devices.h
+++ b/include/hw/devices.h
@@ -XXX,XX +XXX,XX @@
 /* Devices that have nowhere better to go.  */
 
 #include "hw/hw.h"
-#include "ui/console.h"
 
 /* smc91c111.c */
 void smc91c111_init(NICInfo *, uint32_t, qemu_irq);
@@ -XXX,XX +XXX,XX @@ void smc91c111_init(NICInfo *, uint32_t, qemu_irq);
 /* lan9118.c */
 void lan9118_init(NICInfo *, uint32_t, qemu_irq);
 
-/* tsc210x.c */
-uWireSlave *tsc2102_init(qemu_irq pint);
-uWireSlave *tsc2301_init(qemu_irq penirq, qemu_irq kbirq, qemu_irq dav);
-I2SCodec *tsc210x_codec(uWireSlave *chip);
-uint32_t tsc210x_txrx(void *opaque, uint32_t value, int len);
-void tsc210x_set_transform(uWireSlave *chip,
-                MouseTransformInfo *info);
-void tsc210x_key_event(uWireSlave *chip, int key, int down);
-
-/* tsc2005.c */
-void *tsc2005_init(qemu_irq pintdav);
-uint32_t tsc2005_txrx(void *opaque, uint32_t value, int len);
-void tsc2005_set_transform(void *opaque, MouseTransformInfo *info);
-
 #endif
diff --git a/include/hw/input/tsc2xxx.h b/include/hw/input/tsc2xxx.h
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/include/hw/input/tsc2xxx.h
@@ -XXX,XX +XXX,XX @@
+/*
+ * TI touchscreen controller
+ *
+ * Copyright (c) 2006 Andrzej Zaborowski
+ * Copyright (C) 2008 Nokia Corporation
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef HW_INPUT_TSC2XXX_H
+#define HW_INPUT_TSC2XXX_H
+
+#include "hw/irq.h"
+#include "ui/console.h"
+
+typedef struct uWireSlave {
+    uint16_t (*receive)(void *opaque);
+    void (*send)(void *opaque, uint16_t data);
+    void *opaque;
+} uWireSlave;
+
+/* tsc210x.c */
+uWireSlave *tsc2102_init(qemu_irq pint);
+uWireSlave *tsc2301_init(qemu_irq penirq, qemu_irq kbirq, qemu_irq dav);
+I2SCodec *tsc210x_codec(uWireSlave *chip);
+uint32_t tsc210x_txrx(void *opaque, uint32_t value, int len);
+void tsc210x_set_transform(uWireSlave *chip, MouseTransformInfo *info);
+void tsc210x_key_event(uWireSlave *chip, int key, int down);
+
+/* tsc2005.c */
+void *tsc2005_init(qemu_irq pintdav);
+uint32_t tsc2005_txrx(void *opaque, uint32_t value, int len);
+void tsc2005_set_transform(void *opaque, MouseTransformInfo *info);
+
+#endif
diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h
index XXXXXXX..XXXXXXX 100644
--- a/include/qemu/typedefs.h
+++ b/include/qemu/typedefs.h
@@ -XXX,XX +XXX,XX @@ typedef struct RAMBlock RAMBlock;
 typedef struct Range Range;
 typedef struct SHPCDevice SHPCDevice;
 typedef struct SSIBus SSIBus;
-typedef struct uWireSlave uWireSlave;
 typedef struct VirtIODevice VirtIODevice;
 typedef struct Visitor Visitor;
 typedef void SaveStateHandler(QEMUFile *f, void *opaque);
diff --git a/hw/arm/nseries.c b/hw/arm/nseries.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/nseries.c
+++ b/hw/arm/nseries.c
@@ -XXX,XX +XXX,XX @@
 #include "ui/console.h"
 #include "hw/boards.h"
 #include "hw/i2c/i2c.h"
-#include "hw/devices.h"
 #include "hw/display/blizzard.h"
+#include "hw/input/tsc2xxx.h"
 #include "hw/misc/cbus.h"
 #include "hw/misc/tmp105.h"
 #include "hw/block/flash.h"
diff --git a/hw/arm/palm.c b/hw/arm/palm.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/palm.c
+++ b/hw/arm/palm.c
@@ -XXX,XX +XXX,XX @@
 #include "hw/arm/omap.h"
 #include "hw/boards.h"
 #include "hw/arm/arm.h"
-#include "hw/devices.h"
+#include "hw/input/tsc2xxx.h"
 #include "hw/loader.h"
 #include "exec/address-spaces.h"
 #include "cpu.h"
diff --git a/hw/input/tsc2005.c b/hw/input/tsc2005.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/input/tsc2005.c
+++ b/hw/input/tsc2005.c
@@ -XXX,XX +XXX,XX @@
 #include "hw/hw.h"
 #include "qemu/timer.h"
 #include "ui/console.h"
-#include "hw/devices.h"
+#include "hw/input/tsc2xxx.h"
 #include "trace.h"
 
 #define TSC_CUT_RESOLUTION(value, p)	((value) >> (16 - (p ? 12 : 10)))
diff --git a/hw/input/tsc210x.c b/hw/input/tsc210x.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/input/tsc210x.c
+++ b/hw/input/tsc210x.c
@@ -XXX,XX +XXX,XX @@
 #include "audio/audio.h"
 #include "qemu/timer.h"
 #include "ui/console.h"
-#include "hw/arm/omap.h"	/* For I2SCodec and uWireSlave */
-#include "hw/devices.h"
+#include "hw/arm/omap.h"            /* For I2SCodec */
+#include "hw/input/tsc2xxx.h"
 
 #define TSC_DATA_REGISTERS_PAGE		0x0
 #define TSC_CONTROL_REGISTERS_PAGE	0x1
diff --git a/MAINTAINERS b/MAINTAINERS
index XXXXXXX..XXXXXXX 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -XXX,XX +XXX,XX @@ F: hw/input/tsc2005.c
 F: hw/misc/cbus.c
 F: hw/timer/twl92230.c
 F: include/hw/display/blizzard.h
+F: include/hw/input/tsc2xxx.h
 F: include/hw/misc/cbus.h
 
 Palm
@@ -XXX,XX +XXX,XX @@ L: qemu-arm@nongnu.org
 S: Odd Fixes
 F: hw/arm/palm.c
 F: hw/input/tsc210x.c
+F: include/hw/input/tsc2xxx.h
 
 Raspberry Pi
 M: Peter Maydell <peter.maydell@linaro.org>
-- 
2.20.1

From: Philippe Mathieu-Daudé <philmd@redhat.com>

Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190412165416.7977-10-philmd@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/devices.h     |  3 ---
 include/hw/net/lan9118.h | 19 +++++++++++++++++++
 hw/arm/kzm.c             |  2 +-
 hw/arm/mps2.c            |  2 +-
 hw/arm/realview.c        |  1 +
 hw/arm/vexpress.c        |  2 +-
 hw/net/lan9118.c         |  2 +-
 7 files changed, 24 insertions(+), 7 deletions(-)
 create mode 100644 include/hw/net/lan9118.h

diff --git a/include/hw/devices.h b/include/hw/devices.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/devices.h
+++ b/include/hw/devices.h
@@ -XXX,XX +XXX,XX @@
 /* smc91c111.c */
 void smc91c111_init(NICInfo *, uint32_t, qemu_irq);
 
-/* lan9118.c */
-void lan9118_init(NICInfo *, uint32_t, qemu_irq);
-
 #endif
diff --git a/include/hw/net/lan9118.h b/include/hw/net/lan9118.h
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/include/hw/net/lan9118.h
@@ -XXX,XX +XXX,XX @@
+/*
+ * SMSC LAN9118 Ethernet interface emulation
+ *
+ * Copyright (c) 2009 CodeSourcery, LLC.
+ * Written by Paul Brook
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef HW_NET_LAN9118_H
+#define HW_NET_LAN9118_H
+
+#include "hw/irq.h"
+#include "net/net.h"
+
+void lan9118_init(NICInfo *, uint32_t, qemu_irq);
+
+#endif
diff --git a/hw/arm/kzm.c b/hw/arm/kzm.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/kzm.c
+++ b/hw/arm/kzm.c
@@ -XXX,XX +XXX,XX @@
 #include "qemu/error-report.h"
 #include "exec/address-spaces.h"
 #include "net/net.h"
-#include "hw/devices.h"
+#include "hw/net/lan9118.h"
 #include "hw/char/serial.h"
 #include "sysemu/qtest.h"
 
diff --git a/hw/arm/mps2.c b/hw/arm/mps2.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/mps2.c
+++ b/hw/arm/mps2.c
@@ -XXX,XX +XXX,XX @@
 #include "hw/timer/cmsdk-apb-timer.h"
 #include "hw/timer/cmsdk-apb-dualtimer.h"
 #include "hw/misc/mps2-scc.h"
-#include "hw/devices.h"
+#include "hw/net/lan9118.h"
 #include "net/net.h"
 
 typedef enum MPS2FPGAType {
diff --git a/hw/arm/realview.c b/hw/arm/realview.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/realview.c
+++ b/hw/arm/realview.c
@@ -XXX,XX +XXX,XX @@
 #include "hw/arm/arm.h"
 #include "hw/arm/primecell.h"
 #include "hw/devices.h"
+#include "hw/net/lan9118.h"
 #include "hw/pci/pci.h"
 #include "net/net.h"
 #include "sysemu/sysemu.h"
diff --git a/hw/arm/vexpress.c b/hw/arm/vexpress.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/vexpress.c
+++ b/hw/arm/vexpress.c
@@ -XXX,XX +XXX,XX @@
 #include "hw/sysbus.h"
 #include "hw/arm/arm.h"
 #include "hw/arm/primecell.h"
-#include "hw/devices.h"
+#include "hw/net/lan9118.h"
 #include "hw/i2c/i2c.h"
 #include "net/net.h"
 #include "sysemu/sysemu.h"
diff --git a/hw/net/lan9118.c b/hw/net/lan9118.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/net/lan9118.c
+++ b/hw/net/lan9118.c
@@ -XXX,XX +XXX,XX @@
 #include "hw/sysbus.h"
 #include "net/net.h"
 #include "net/eth.h"
-#include "hw/devices.h"
+#include "hw/net/lan9118.h"
 #include "sysemu/sysemu.h"
 #include "hw/ptimer.h"
 #include "qemu/log.h"
-- 
2.20.1

From: Philippe Mathieu-Daudé <philmd@redhat.com>

Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190412165416.7977-12-philmd@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/net/lan9118.h | 2 ++
 hw/arm/exynos4_boards.c  | 3 ++-
 hw/arm/mps2-tz.c         | 3 ++-
 hw/net/lan9118.c         | 1 -
 4 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/include/hw/net/lan9118.h b/include/hw/net/lan9118.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/net/lan9118.h
+++ b/include/hw/net/lan9118.h
@@ -XXX,XX +XXX,XX @@
 #include "hw/irq.h"
 #include "net/net.h"
 
+#define TYPE_LAN9118 "lan9118"
+
 void lan9118_init(NICInfo *, uint32_t, qemu_irq);
 
 #endif
diff --git a/hw/arm/exynos4_boards.c b/hw/arm/exynos4_boards.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/exynos4_boards.c
+++ b/hw/arm/exynos4_boards.c
@@ -XXX,XX +XXX,XX @@
 #include "hw/arm/arm.h"
 #include "exec/address-spaces.h"
 #include "hw/arm/exynos4210.h"
+#include "hw/net/lan9118.h"
 #include "hw/boards.h"
 
 #undef DEBUG
@@ -XXX,XX +XXX,XX @@ static void lan9215_init(uint32_t base, qemu_irq irq)
     /* This should be a 9215 but the 9118 is close enough */
     if (nd_table[0].used) {
         qemu_check_nic_model(&nd_table[0], "lan9118");
-        dev = qdev_create(NULL, "lan9118");
+        dev = qdev_create(NULL, TYPE_LAN9118);
         qdev_set_nic_properties(dev, &nd_table[0]);
         qdev_prop_set_uint32(dev, "mode_16bit", 1);
         qdev_init_nofail(dev);
diff --git a/hw/arm/mps2-tz.c b/hw/arm/mps2-tz.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/mps2-tz.c
+++ b/hw/arm/mps2-tz.c
@@ -XXX,XX +XXX,XX @@
 #include "hw/arm/armsse.h"
 #include "hw/dma/pl080.h"
 #include "hw/ssi/pl022.h"
+#include "hw/net/lan9118.h"
 #include "net/net.h"
 #include "hw/core/split-irq.h"
 
@@ -XXX,XX +XXX,XX @@ static MemoryRegion *make_eth_dev(MPS2TZMachineState *mms, void *opaque,
      * except that it doesn't support the checksum-offload feature.
      */
     qemu_check_nic_model(nd, "lan9118");
-    mms->lan9118 = qdev_create(NULL, "lan9118");
+    mms->lan9118 = qdev_create(NULL, TYPE_LAN9118);
     qdev_set_nic_properties(mms->lan9118, nd);
     qdev_init_nofail(mms->lan9118);
 
diff --git a/hw/net/lan9118.c b/hw/net/lan9118.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/net/lan9118.c
+++ b/hw/net/lan9118.c
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_lan9118_packet = {
     }
 };
 
-#define TYPE_LAN9118 "lan9118"
 #define LAN9118(obj) OBJECT_CHECK(lan9118_state, (obj), TYPE_LAN9118)
 
 typedef struct {
-- 
2.20.1

From: Philippe Mathieu-Daudé <philmd@redhat.com>

This commit finally deletes "hw/devices.h".

Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190412165416.7977-13-philmd@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/devices.h       | 11 -----------
 include/hw/net/smc91c111.h | 19 +++++++++++++++++++
 hw/arm/gumstix.c           |  2 +-
 hw/arm/integratorcp.c      |  2 +-
 hw/arm/mainstone.c         |  2 +-
 hw/arm/realview.c          |  2 +-
 hw/arm/versatilepb.c       |  2 +-
 hw/net/smc91c111.c         |  2 +-
 8 files changed, 25 insertions(+), 17 deletions(-)
 delete mode 100644 include/hw/devices.h
 create mode 100644 include/hw/net/smc91c111.h

diff --git a/include/hw/devices.h b/include/hw/devices.h
deleted file mode 100644
index XXXXXXX..XXXXXXX
--- a/include/hw/devices.h
+++ /dev/null
@@ -XXX,XX +XXX,XX @@
-#ifndef QEMU_DEVICES_H
-#define QEMU_DEVICES_H
-
-/* Devices that have nowhere better to go.  */
-
-#include "hw/hw.h"
-
-/* smc91c111.c */
-void smc91c111_init(NICInfo *, uint32_t, qemu_irq);
-
-#endif
diff --git a/include/hw/net/smc91c111.h b/include/hw/net/smc91c111.h
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/include/hw/net/smc91c111.h
@@ -XXX,XX +XXX,XX @@
+/*
+ * SMSC 91C111 Ethernet interface emulation
+ *
+ * Copyright (c) 2005 CodeSourcery, LLC.
+ * Written by Paul Brook
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef HW_NET_SMC91C111_H
+#define HW_NET_SMC91C111_H
+
+#include "hw/irq.h"
+#include "net/net.h"
+
+void smc91c111_init(NICInfo *, uint32_t, qemu_irq);
+
+#endif
diff --git a/hw/arm/gumstix.c b/hw/arm/gumstix.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/gumstix.c
+++ b/hw/arm/gumstix.c
@@ -XXX,XX +XXX,XX @@
 #include "hw/arm/pxa.h"
 #include "net/net.h"
 #include "hw/block/flash.h"
-#include "hw/devices.h"
+#include "hw/net/smc91c111.h"
 #include "hw/boards.h"
 #include "exec/address-spaces.h"
 #include "sysemu/qtest.h"
diff --git a/hw/arm/integratorcp.c b/hw/arm/integratorcp.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/integratorcp.c
+++ b/hw/arm/integratorcp.c
@@ -XXX,XX +XXX,XX @@
 #include "qemu-common.h"
 #include "cpu.h"
 #include "hw/sysbus.h"
-#include "hw/devices.h"
 #include "hw/boards.h"
 #include "hw/arm/arm.h"
 #include "hw/misc/arm_integrator_debug.h"
+#include "hw/net/smc91c111.h"
 #include "net/net.h"
 #include "exec/address-spaces.h"
 #include "sysemu/sysemu.h"
diff --git a/hw/arm/mainstone.c b/hw/arm/mainstone.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/mainstone.c
+++ b/hw/arm/mainstone.c
@@ -XXX,XX +XXX,XX @@
 #include "hw/arm/pxa.h"
 #include "hw/arm/arm.h"
 #include "net/net.h"
-#include "hw/devices.h"
+#include "hw/net/smc91c111.h"
 #include "hw/boards.h"
 #include "hw/block/flash.h"
 #include "hw/sysbus.h"
diff --git a/hw/arm/realview.c b/hw/arm/realview.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/realview.c
+++ b/hw/arm/realview.c
@@ -XXX,XX +XXX,XX @@
 #include "hw/sysbus.h"
 #include "hw/arm/arm.h"
 #include "hw/arm/primecell.h"
-#include "hw/devices.h"
 #include "hw/net/lan9118.h"
+#include "hw/net/smc91c111.h"
 #include "hw/pci/pci.h"
 #include "net/net.h"
 #include "sysemu/sysemu.h"
diff --git a/hw/arm/versatilepb.c b/hw/arm/versatilepb.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/versatilepb.c
+++ b/hw/arm/versatilepb.c
@@ -XXX,XX +XXX,XX @@
 #include "cpu.h"
 #include "hw/sysbus.h"
 #include "hw/arm/arm.h"
-#include "hw/devices.h"
+#include "hw/net/smc91c111.h"
 #include "net/net.h"
 #include "sysemu/sysemu.h"
 #include "hw/pci/pci.h"
diff --git a/hw/net/smc91c111.c b/hw/net/smc91c111.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/net/smc91c111.c
+++ b/hw/net/smc91c111.c
@@ -XXX,XX +XXX,XX @@
 #include "qemu/osdep.h"
 #include "hw/sysbus.h"
 #include "net/net.h"
-#include "hw/devices.h"
+#include "hw/net/smc91c111.h"
 #include "qemu/log.h"
 /* For crc32 */
 #include <zlib.h>
-- 
2.20.1

Mostly this is patches from me and RTH cleaning up and doing
more decodetree conversion for AArch32 Neon. The major new feature
is Dongjiu Geng's patchset to report host memory errors to KVM guests;
also a new aspeed board from Patrick Williams.

thanks
-- PMM

The following changes since commit 035b448b84f3557206abc44d786c5d3db2638f7d:

Merge remote-tracking branch 'remotes/gkurz/tags/9p-next-2020-05-14' into staging (2020-05-14 10:58:30 +0100)

are available in the Git repository at:

https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20200514

for you to fetch changes up to e95485f85657be21135c17a9226e297c21e73360:

target/arm: Convert NEON VFMA, VFMS 3-reg-same insns to decodetree (2020-05-14 15:03:09 +0100)

----------------------------------------------------------------
target-arm queue:
 * target/arm: Use correct GDB XML for M-profile cores
 * target/arm: Code cleanup to use gvec APIs better
 * aspeed: Add support for the sonorapass-bmc board
 * target/arm: Support reporting KVM host memory errors
   to the guest via ACPI notifications
 * target/arm: Finish conversion of Neon 3-reg-same insns to decodetree

----------------------------------------------------------------
Dongjiu Geng (10):
      acpi: nvdimm: change NVDIMM_UUID_LE to a common macro
      hw/arm/virt: Introduce a RAS machine option
      docs: APEI GHES generation and CPER record description
      ACPI: Build related register address fields via hardware error fw_cfg blob
      ACPI: Build Hardware Error Source Table
      ACPI: Record the Generic Error Status Block address
      KVM: Move hwpoison page related functions into kvm-all.c
      ACPI: Record Generic Error Status Block(GESB) table
      target-arm: kvm64: handle SIGBUS signal from kernel or KVM
      MAINTAINERS: Add ACPI/HEST/GHES entries

Patrick Williams (1):
      aspeed: Add support for the sonorapass-bmc board

Peter Maydell (18):
      target/arm: Use correct GDB XML for M-profile cores
      target/arm: Convert Neon 3-reg-same VQRDMLAH/VQRDMLSH to decodetree
      target/arm: Convert Neon 3-reg-same SHA to decodetree
      target/arm: Convert Neon 64-bit element 3-reg-same insns
      target/arm: Convert Neon VHADD 3-reg-same insns
      target/arm: Convert Neon VABA/VABD 3-reg-same to decodetree
      target/arm: Convert Neon VRHADD, VHSUB 3-reg-same insns to decodetree
      target/arm: Convert Neon VQSHL, VRSHL, VQRSHL 3-reg-same insns to decodetree
      target/arm: Convert Neon VPMAX/VPMIN 3-reg-same insns to decodetree
      target/arm: Convert Neon VPADD 3-reg-same insns to decodetree
      target/arm: Convert Neon VQDMULH/VQRDMULH 3-reg-same to decodetree
      target/arm: Convert Neon VADD, VSUB, VABD 3-reg-same insns to decodetree
      target/arm: Convert Neon VPMIN/VPMAX/VPADD float 3-reg-same insns to decodetree
      target/arm: Convert Neon fp VMUL, VMLA, VMLS 3-reg-same insns to decodetree
      target/arm: Convert Neon 3-reg-same compare insns to decodetree
      target/arm: Move 'env' argument of recps_f32 and rsqrts_f32 helpers to usual place
      target/arm: Convert Neon fp VMAX/VMIN/VMAXNM/VMINNM/VRECPS/VRSQRTS to decodetree
      target/arm: Convert NEON VFMA, VFMS 3-reg-same insns to decodetree

Richard Henderson (16):
      target/arm: Create gen_gvec_[us]sra
      target/arm: Create gen_gvec_{u,s}{rshr,rsra}
      target/arm: Create gen_gvec_{sri,sli}
      target/arm: Remove unnecessary range check for VSHL
      target/arm: Tidy handle_vec_simd_shri
      target/arm: Create gen_gvec_{ceq,clt,cle,cgt,cge}0
      target/arm: Create gen_gvec_{mla,mls}
      target/arm: Swap argument order for VSHL during decode
      target/arm: Create gen_gvec_{cmtst,ushl,sshl}
      target/arm: Create gen_gvec_{uqadd, sqadd, uqsub, sqsub}
      target/arm: Remove fp_status from helper_{recpe, rsqrte}_u32
      target/arm: Create gen_gvec_{qrdmla,qrdmls}
      target/arm: Pass pointer to qc to qrdmla/qrdmls
      target/arm: Clear tail in gvec_fmul_idx_*, gvec_fmla_idx_*
      target/arm: Vectorize SABD/UABD
      target/arm: Vectorize SABA/UABA

GDB's remote protocol requires M-profile cores to use the feature
name 'org.gnu.gdb.arm.m-profile' instead of the 'org.gnu.gdb.arm.core'
feature used for A- and R-profile cores. We weren't doing this, which
meant GDB treated our M-profile cores like A-profile ones. This mostly
doesn't matter, but for instance means that it doesn't correctly
handle backtraces where an M-profile exception frame is involved.

Ship a copy of GDB's arm-m-profile.xml and use it on the M-profile
cores.  The integer registers have the same offsets as the
arm-core.xml, but register 25 is the M-profile XPSR rather than the
A-profile CPSR, so we need to update arm_cpu_gdb_read_register() and
arm_cpu_gdb_write_register() to handle XPSR reads and writes.
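
As a rough illustration of the write side: the gdbstub change in this patch
passes ~XPSR_EXCP as the mask to xpsr_write(), so a debugger write to
register 25 leaves the exception-number field alone. The sketch below models
that masking on a flat 32-bit word. It is a simplification, not QEMU code:
the demo_* names are hypothetical, the 0x1ff value for the exception field is
an assumption, and the real xpsr_write() updates separate CPU state fields
rather than a single word.

```c
#include <stdint.h>

/* Assumed value: the low nine bits of XPSR hold the exception number. */
#define DEMO_XPSR_EXCP 0x1ffU

/*
 * Hypothetical helper: bits set in 'mask' are taken from 'val',
 * all other bits keep their value from 'old'.
 */
static uint32_t demo_masked_write(uint32_t old, uint32_t val, uint32_t mask)
{
    return (old & ~mask) | (val & mask);
}

/*
 * Hypothetical model of a debugger write to register 25 on an
 * M-profile core: every field except the exception number is updated.
 */
static uint32_t demo_gdb_write_xpsr(uint32_t old_xpsr, uint32_t val)
{
    return demo_masked_write(old_xpsr, val, ~DEMO_XPSR_EXCP);
}
```

With these assumptions, writing a value whose low bits differ from the
current exception number changes the flag bits but keeps the old exception
number, which is the restriction the code comment in the diff describes.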

Fixes: https://bugs.launchpad.net/qemu/+bug/1877136
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20200507134755.13997-1-peter.maydell@linaro.org
---
 configure                 |  4 ++--
 target/arm/cpu_tcg.c      |  1 +
 target/arm/gdbstub.c      | 22 ++++++++++++++++++----
 gdb-xml/arm-m-profile.xml | 27 +++++++++++++++++++++++++++
 4 files changed, 48 insertions(+), 6 deletions(-)
 create mode 100644 gdb-xml/arm-m-profile.xml

diff --git a/configure b/configure
index XXXXXXX..XXXXXXX 100755
--- a/configure
+++ b/configure
@@ -XXX,XX +XXX,XX @@ case "$target_name" in
     TARGET_SYSTBL_ABI=common,oabi
     bflt="yes"
     mttcg="yes"
-    gdb_xml_files="arm-core.xml arm-vfp.xml arm-vfp3.xml arm-neon.xml"
+    gdb_xml_files="arm-core.xml arm-vfp.xml arm-vfp3.xml arm-neon.xml arm-m-profile.xml"
   ;;
   aarch64|aarch64_be)
     TARGET_ARCH=aarch64
     TARGET_BASE_ARCH=arm
     bflt="yes"
     mttcg="yes"
-    gdb_xml_files="aarch64-core.xml aarch64-fpu.xml arm-core.xml arm-vfp.xml arm-vfp3.xml arm-neon.xml"
+    gdb_xml_files="aarch64-core.xml aarch64-fpu.xml arm-core.xml arm-vfp.xml arm-vfp3.xml arm-neon.xml arm-m-profile.xml"
   ;;
   cris)
   ;;
diff --git a/target/arm/cpu_tcg.c b/target/arm/cpu_tcg.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu_tcg.c
+++ b/target/arm/cpu_tcg.c
@@ -XXX,XX +XXX,XX @@ static void arm_v7m_class_init(ObjectClass *oc, void *data)
 #endif
 
     cc->cpu_exec_interrupt = arm_v7m_cpu_exec_interrupt;
+    cc->gdb_core_xml_file = "arm-m-profile.xml";
 }
 
 static const ARMCPUInfo arm_tcg_cpus[] = {
diff --git a/target/arm/gdbstub.c b/target/arm/gdbstub.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/gdbstub.c
+++ b/target/arm/gdbstub.c
@@ -XXX,XX +XXX,XX @@ int arm_cpu_gdb_read_register(CPUState *cs, GByteArray *mem_buf, int n)
         }
         return gdb_get_reg32(mem_buf, 0);
     case 25:
-        /* CPSR */
-        return gdb_get_reg32(mem_buf, cpsr_read(env));
+        /* CPSR, or XPSR for M-profile */
+        if (arm_feature(env, ARM_FEATURE_M)) {
+            return gdb_get_reg32(mem_buf, xpsr_read(env));
+        } else {
+            return gdb_get_reg32(mem_buf, cpsr_read(env));
+        }
     }
     /* Unknown register.  */
     return 0;
@@ -XXX,XX +XXX,XX @@ int arm_cpu_gdb_write_register(CPUState *cs, uint8_t *mem_buf, int n)
         }
         return 4;
     case 25:
-        /* CPSR */
-        cpsr_write(env, tmp, 0xffffffff, CPSRWriteByGDBStub);
+        /* CPSR, or XPSR for M-profile */
+        if (arm_feature(env, ARM_FEATURE_M)) {
+            /*
+             * Don't allow writing to XPSR.Exception as it can cause
+             * a transition into or out of handler mode (it's not
+             * writeable via the MSR insn so this is a reasonable
+             * restriction). Other fields are safe to update.
+             */
+            xpsr_write(env, tmp, ~XPSR_EXCP);
+        } else {
+            cpsr_write(env, tmp, 0xffffffff, CPSRWriteByGDBStub);
+        }
         return 4;
     }
     /* Unknown register.  */
diff --git a/gdb-xml/arm-m-profile.xml b/gdb-xml/arm-m-profile.xml
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/gdb-xml/arm-m-profile.xml
@@ -XXX,XX +XXX,XX @@
+<?xml version="1.0"?>
+
+
+<!DOCTYPE feature SYSTEM "gdb-target.dtd">
+<feature name="org.gnu.gdb.arm.m-profile">
+  <reg name="r0" bitsize="32"/>
+  <reg name="r1" bitsize="32"/>
+  <reg name="r2" bitsize="32"/>
+  <reg name="r3" bitsize="32"/>
+  <reg name="r4" bitsize="32"/>
+  <reg name="r5" bitsize="32"/>
+  <reg name="r6" bitsize="32"/>
+  <reg name="r7" bitsize="32"/>
+  <reg name="r8" bitsize="32"/>
+  <reg name="r9" bitsize="32"/>
+  <reg name="r10" bitsize="32"/>
+  <reg name="r11" bitsize="32"/>
+  <reg name="r12" bitsize="32"/>
+  <reg name="sp" bitsize="32" type="data_ptr"/>
+  <reg name="lr" bitsize="32"/>
+  <reg name="pc" bitsize="32" type="code_ptr"/>
+  <reg name="xpsr" bitsize="32" regnum="25"/>
+</feature>
-- 
2.20.1