Series comparison

-[PULL 00/45] target-arm queue
+[PULL 00/23] target-arm queue
-Mostly this is patches from me and RTH cleaning up and doing
+Two small bugfixes, plus most of RTH's refactoring of cpregs
-more decodetree conversion for AArch32 Neon. The major new feature
+handling.
 is Dongjiu Geng's patchset to report host memory errors to KVM guests;
 also a new aspeed board from Patrick Williams.
-thanks
 -- PMM
-The following changes since commit 035b448b84f3557206abc44d786c5d3db2638f7d:
+The following changes since commit 1fba9dc71a170b3a05b9d3272dd8ecfe7f26e215:
-  Merge remote-tracking branch 'remotes/gkurz/tags/9p-next-2020-05-14' into staging (2020-05-14 10:58:30 +0100)
+  Merge tag 'pull-request-2022-05-04' of https://gitlab.com/thuth/qemu into staging (2022-05-04 08:07:02 -0700)
 are available in the Git repository at:
-  https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20200514
+  https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20220505
-for you to fetch changes up to e95485f85657be21135c17a9226e297c21e73360:
+for you to fetch changes up to 99a50d1a67c602126fc2b3a4812d3000eba9bf34:
-  target/arm: Convert NEON VFMA, VFMS 3-reg-same insns to decodetree (2020-05-14 15:03:09 +0100)
+  target/arm: read access to performance counters from EL0 (2022-05-05 09:36:22 +0100)
 ----------------------------------------------------------------
 target-arm queue:
- * target/arm: Use correct GDB XML for M-profile cores
+ * Enable read access to performance counters from EL0
- * target/arm: Code cleanup to use gvec APIs better
+ * Enable SCTLR_EL1.BT0 for aarch64-linux-user
- * aspeed: Add support for the sonorapass-bmc board
+ * Refactoring of cpreg handling
  * target/arm: Support reporting KVM host memory errors
    to the guest via ACPI notifications
  * target/arm: Finish conversion of Neon 3-reg-same insns to decodetree
 ----------------------------------------------------------------
-Dongjiu Geng (10):
+Alex Zuepke (1):
-      acpi: nvdimm: change NVDIMM_UUID_LE to a common macro
+      target/arm: read access to performance counters from EL0
       hw/arm/virt: Introduce a RAS machine option
       docs: APEI GHES generation and CPER record description
       ACPI: Build related register address fields via hardware error fw_cfg blob
       ACPI: Build Hardware Error Source Table
       ACPI: Record the Generic Error Status Block address
       KVM: Move hwpoison page related functions into kvm-all.c
       ACPI: Record Generic Error Status Block(GESB) table
       target-arm: kvm64: handle SIGBUS signal from kernel or KVM
       MAINTAINERS: Add ACPI/HEST/GHES entries
-Patrick Williams (1):
+Richard Henderson (22):
-      aspeed: Add support for the sonorapass-bmc board
+      target/arm: Enable SCTLR_EL1.BT0 for aarch64-linux-user
       target/arm: Split out cpregs.h
       target/arm: Reorg CPAccessResult and access_check_cp_reg
       target/arm: Replace sentinels with ARRAY_SIZE in cpregs.h
       target/arm: Make some more cpreg data static const
       target/arm: Reorg ARMCPRegInfo type field bits
       target/arm: Avoid bare abort() or assert(0)
       target/arm: Change cpreg access permissions to enum
       target/arm: Name CPState type
       target/arm: Name CPSecureState type
       target/arm: Drop always-true test in define_arm_vh_e2h_redirects_aliases
       target/arm: Store cpregs key in the hash table directly
       target/arm: Merge allocation of the cpreg and its name
       target/arm: Hoist computation of key in add_cpreg_to_hashtable
       target/arm: Consolidate cpreg updates in add_cpreg_to_hashtable
       target/arm: Use bool for is64 and ns in add_cpreg_to_hashtable
       target/arm: Hoist isbanked computation in add_cpreg_to_hashtable
       target/arm: Perform override check early in add_cpreg_to_hashtable
       target/arm: Reformat comments in add_cpreg_to_hashtable
       target/arm: Remove HOST_BIG_ENDIAN ifdef in add_cpreg_to_hashtable
       target/arm: Add isar predicates for FEAT_Debugv8p2
       target/arm: Add isar_feature_{aa64,any}_ras
-Peter Maydell (18):
+ target/arm/cpregs.h               | 453 ++++++++++++++++++++++++++++++++++++++
-      target/arm: Use correct GDB XML for M-profile cores
+ target/arm/cpu.h                  | 393 +++------------------------------
-      target/arm: Convert Neon 3-reg-same VQRDMLAH/VQRDMLSH to decodetree
+ hw/arm/pxa2xx.c                   |   2 +-
-      target/arm: Convert Neon 3-reg-same SHA to decodetree
+ hw/arm/pxa2xx_pic.c               |   2 +-
-      target/arm: Convert Neon 64-bit element 3-reg-same insns
+ hw/intc/arm_gicv3_cpuif.c         |   6 +-
-      target/arm: Convert Neon VHADD 3-reg-same insns
+ hw/intc/arm_gicv3_kvm.c           |   3 +-
-      target/arm: Convert Neon VABA/VABD 3-reg-same to decodetree
+ target/arm/cpu.c                  |  25 +--
-      target/arm: Convert Neon VRHADD, VHSUB 3-reg-same insns to decodetree
+ target/arm/cpu64.c                |   2 +-
-      target/arm: Convert Neon VQSHL, VRSHL, VQRSHL 3-reg-same insns to decodetree
+ target/arm/cpu_tcg.c              |   5 +-
-      target/arm: Convert Neon VPMAX/VPMIN 3-reg-same insns to decodetree
+ target/arm/gdbstub.c              |   5 +-
-      target/arm: Convert Neon VPADD 3-reg-same insns to decodetree
+ target/arm/helper.c               | 358 +++++++++++++-----------------
-      target/arm: Convert Neon VQDMULH/VQRDMULH 3-reg-same to decodetree
+ target/arm/hvf/hvf.c              |   2 +-
-      target/arm: Convert Neon VADD, VSUB, VABD 3-reg-same insns to decodetree
+ target/arm/kvm-stub.c             |   4 +-
-      target/arm: Convert Neon VPMIN/VPMAX/VPADD float 3-reg-same insns to decodetree
+ target/arm/kvm.c                  |   4 +-
-      target/arm: Convert Neon fp VMUL, VMLA, VMLS 3-reg-same insns to decodetree
+ target/arm/machine.c              |   4 +-
-      target/arm: Convert Neon 3-reg-same compare insns to decodetree
+ target/arm/op_helper.c            |  57 ++---
-      target/arm: Move 'env' argument of recps_f32 and rsqrts_f32 helpers to usual place
+ target/arm/translate-a64.c        |  14 +-
-      target/arm: Convert Neon fp VMAX/VMIN/VMAXNM/VMINNM/VRECPS/VRSQRTS to decodetree
+ target/arm/translate-neon.c       |   2 +-
-      target/arm: Convert NEON VFMA, VFMS 3-reg-same insns to decodetree
+ target/arm/translate.c            |  13 +-
+ tests/tcg/aarch64/bti-3.c         |  42 ++++
-Richard Henderson (16):
+ tests/tcg/aarch64/Makefile.target |   6 +-
-      target/arm: Create gen_gvec_[us]sra
+files changed, 738 insertions(+), 664 deletions(-)
-      target/arm: Create gen_gvec_{u,s}{rshr,rsra}
+ create mode 100644 target/arm/cpregs.h
-      target/arm: Create gen_gvec_{sri,sli}
+ create mode 100644 tests/tcg/aarch64/bti-3.c
       target/arm: Remove unnecessary range check for VSHL
       target/arm: Tidy handle_vec_simd_shri
       target/arm: Create gen_gvec_{ceq,clt,cle,cgt,cge}0
       target/arm: Create gen_gvec_{mla,mls}
       target/arm: Swap argument order for VSHL during decode
       target/arm: Create gen_gvec_{cmtst,ushl,sshl}
       target/arm: Create gen_gvec_{uqadd, sqadd, uqsub, sqsub}
       target/arm: Remove fp_status from helper_{recpe, rsqrte}_u32
       target/arm: Create gen_gvec_{qrdmla,qrdmls}
       target/arm: Pass pointer to qc to qrdmla/qrdmls
       target/arm: Clear tail in gvec_fmul_idx_*, gvec_fmla_idx_*
       target/arm: Vectorize SABD/UABD
       target/arm: Vectorize SABA/UABA
  docs/specs/acpi_hest_ghes.rst          |  110 ++
  docs/specs/index.rst                   |    1 +
  configure                              |    4 +-
  default-configs/arm-softmmu.mak        |    1 +
  include/hw/acpi/aml-build.h            |    1 +
  include/hw/acpi/generic_event_device.h |    2 +
  include/hw/acpi/ghes.h                 |   74 +
  include/hw/arm/virt.h                  |    1 +
  include/qemu/uuid.h                    |   27 +
  include/sysemu/kvm.h                   |    3 +-
  include/sysemu/kvm_int.h               |   12 +
  target/arm/cpu.h                       |    4 +
  target/arm/helper.h                    |   78 +-
  target/arm/internals.h                 |    5 +-
  target/arm/translate.h                 |   84 +-
  target/i386/cpu.h                      |    2 +
  target/arm/neon-dp.decode              |  119 +-
  accel/kvm/kvm-all.c                    |   36 +
  hw/acpi/aml-build.c                    |    2 +
  hw/acpi/generic_event_device.c         |   19 +
  hw/acpi/ghes.c                         |  448 ++++++
  hw/acpi/nvdimm.c                       |   10 +-
  hw/arm/aspeed.c                        |   78 ++
  hw/arm/virt-acpi-build.c               |   15 +
  hw/arm/virt.c                          |   23 +
  target/arm/cpu_tcg.c                   |    1 +
  target/arm/gdbstub.c                   |   22 +-
  target/arm/helper.c                    |    2 +-
  target/arm/kvm64.c                     |   77 ++
  target/arm/neon_helper.c               |   17 -
  target/arm/tlb_helper.c                |    2 +-
  target/arm/translate-a64.c             |  210 +--
  target/arm/translate-neon.inc.c        |  682 +++++++++-
  target/arm/translate.c                 | 2349 +++++++++++++++++---------------
  target/arm/vec_helper.c                |  240 +++-
  target/arm/vfp_helper.c                |    9 +-
  target/i386/kvm.c                      |   36 -
  MAINTAINERS                            |    9 +
  gdb-xml/arm-m-profile.xml              |   27 +
  hw/acpi/Kconfig                        |    4 +
  hw/acpi/Makefile.objs                  |    1 +
 files changed, 3402 insertions(+), 1445 deletions(-)
  create mode 100644 docs/specs/acpi_hest_ghes.rst
  create mode 100644 include/hw/acpi/ghes.h
  create mode 100644 hw/acpi/ghes.c
  create mode 100644 gdb-xml/arm-m-profile.xml

-[PULL 01/45] target/arm: Use correct GDB XML for M-profile cores
+Deleted patch
-GDB's remote protocol requires M-profile cores to use the feature
-name 'org.gnu.gdb.arm.m-profile' instead of the 'org.gnu.gdb.arm.core'
-feature used for A- and R-profile cores. We weren't doing this, which
-meant GDB treated our M-profile cores like A-profile ones. This mostly
-doesn't matter, but for instance means that it doesn't correctly
-handle backtraces where an M-profile exception frame is involved.
-Ship a copy of GDB's arm-m-profile.xml and use it on the M-profile
-cores.  The integer registers have the same offsets as the
-arm-core.xml, but register 25 is the M-profile XPSR rather than the
-A-profile CPSR, so we need to update arm_cpu_gdb_read_register() and
-arm_cpu_gdb_write_register() to handle XSPR reads and writes.
-Fixes: https://bugs.launchpad.net/qemu/+bug/1877136
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
-Message-id: 20200507134755.13997-1-peter.maydell@linaro.org
----
- configure                 |  4 ++--
- target/arm/cpu_tcg.c      |  1 +
- target/arm/gdbstub.c      | 22 ++++++++++++++++++----
- gdb-xml/arm-m-profile.xml | 27 +++++++++++++++++++++++++++
-files changed, 48 insertions(+), 6 deletions(-)
- create mode 100644 gdb-xml/arm-m-profile.xml
-diff --git a/configure b/configure
-index XXXXXXX..XXXXXXX 100755
---- a/configure
-+++ b/configure
-@@ -XXX,XX +XXX,XX @@ case "$target_name" in
-     TARGET_SYSTBL_ABI=common,oabi
-     bflt="yes"
-     mttcg="yes"
--    gdb_xml_files="arm-core.xml arm-vfp.xml arm-vfp3.xml arm-neon.xml"
-+    gdb_xml_files="arm-core.xml arm-vfp.xml arm-vfp3.xml arm-neon.xml arm-m-profile.xml"
-   ;;
-   aarch64|aarch64_be)
-     TARGET_ARCH=aarch64
-     TARGET_BASE_ARCH=arm
-     bflt="yes"
-     mttcg="yes"
--    gdb_xml_files="aarch64-core.xml aarch64-fpu.xml arm-core.xml arm-vfp.xml arm-vfp3.xml arm-neon.xml"
-+    gdb_xml_files="aarch64-core.xml aarch64-fpu.xml arm-core.xml arm-vfp.xml arm-vfp3.xml arm-neon.xml arm-m-profile.xml"
-   ;;
-   cris)
-   ;;
-diff --git a/target/arm/cpu_tcg.c b/target/arm/cpu_tcg.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu_tcg.c
-+++ b/target/arm/cpu_tcg.c
-@@ -XXX,XX +XXX,XX @@ static void arm_v7m_class_init(ObjectClass *oc, void *data)
- #endif
-     cc->cpu_exec_interrupt = arm_v7m_cpu_exec_interrupt;
-+    cc->gdb_core_xml_file = "arm-m-profile.xml";
- }
- static const ARMCPUInfo arm_tcg_cpus[] = {
-diff --git a/target/arm/gdbstub.c b/target/arm/gdbstub.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/gdbstub.c
-+++ b/target/arm/gdbstub.c
-@@ -XXX,XX +XXX,XX @@ int arm_cpu_gdb_read_register(CPUState *cs, GByteArray *mem_buf, int n)
-         }
-         return gdb_get_reg32(mem_buf, 0);
-     case 25:
--        /* CPSR */
--        return gdb_get_reg32(mem_buf, cpsr_read(env));
-+        /* CPSR, or XPSR for M-profile */
-+        if (arm_feature(env, ARM_FEATURE_M)) {
-+            return gdb_get_reg32(mem_buf, xpsr_read(env));
-+        } else {
-+            return gdb_get_reg32(mem_buf, cpsr_read(env));
-+        }
-     }
-     /* Unknown register.  */
-     return 0;
-@@ -XXX,XX +XXX,XX @@ int arm_cpu_gdb_write_register(CPUState *cs, uint8_t *mem_buf, int n)
-         }
-         return 4;
-     case 25:
--        /* CPSR */
--        cpsr_write(env, tmp, 0xffffffff, CPSRWriteByGDBStub);
-+        /* CPSR, or XPSR for M-profile */
-+        if (arm_feature(env, ARM_FEATURE_M)) {
-+            /*
-+             * Don't allow writing to XPSR.Exception as it can cause
-+             * a transition into or out of handler mode (it's not
-+             * writeable via the MSR insn so this is a reasonable
-+             * restriction). Other fields are safe to update.
-+             */
-+            xpsr_write(env, tmp, ~XPSR_EXCP);
-+        } else {
-+            cpsr_write(env, tmp, 0xffffffff, CPSRWriteByGDBStub);
-+        }
-         return 4;
-     }
-     /* Unknown register.  */
-diff --git a/gdb-xml/arm-m-profile.xml b/gdb-xml/arm-m-profile.xml
-new file mode 100644
-index XXXXXXX..XXXXXXX
---- /dev/null
-+++ b/gdb-xml/arm-m-profile.xml
-@@ -XXX,XX +XXX,XX @@
-+<?xml version="1.0"?>
-+<!-- Copyright (C) 2010-2020 Free Software Foundation, Inc.
-+
-+     Copying and distribution of this file, with or without modification,
-+     are permitted in any medium without royalty provided the copyright
-+     notice and this notice are preserved.  -->
-+
-+<!DOCTYPE feature SYSTEM "gdb-target.dtd">
-+<feature name="org.gnu.gdb.arm.m-profile">
-+  <reg name="r0" bitsize="32"/>
-+  <reg name="r1" bitsize="32"/>
-+  <reg name="r2" bitsize="32"/>
-+  <reg name="r3" bitsize="32"/>
-+  <reg name="r4" bitsize="32"/>
-+  <reg name="r5" bitsize="32"/>
-+  <reg name="r6" bitsize="32"/>
-+  <reg name="r7" bitsize="32"/>
-+  <reg name="r8" bitsize="32"/>
-+  <reg name="r9" bitsize="32"/>
-+  <reg name="r10" bitsize="32"/>
-+  <reg name="r11" bitsize="32"/>
-+  <reg name="r12" bitsize="32"/>
-+  <reg name="sp" bitsize="32" type="data_ptr"/>
-+  <reg name="lr" bitsize="32"/>
-+  <reg name="pc" bitsize="32" type="code_ptr"/>
-+  <reg name="xpsr" bitsize="32" regnum="25"/>
-+</feature>
---
-.20.1

-[PULL 22/45] ACPI: Build related register address fields via hardware error fw_cfg blob
+[PULL 01/23] target/arm: Enable SCTLR_EL1.BT0 for aarch64-linux-user
-From: Dongjiu Geng <gengdongjiu@huawei.com>
+From: Richard Henderson <richard.henderson@linaro.org>
-This patch builds error_block_address and read_ack_register fields
+This controls whether the PACI{A,B}SP instructions trap with BTYPE=3
-in hardware errors table , the error_block_address points to Generic
+(indirect branch from register other than x16/x17).  The linux kernel
-Error Status Block(GESB) via bios_linker. The max size for one GESB
+sets this in bti_enable().
 is 1kb, For more detailed information, please refer to
 document: docs/specs/acpi_hest_ghes.rst
-Now we only support one Error source, if necessary, we can extend to
+Resolves: https://gitlab.com/qemu-project/qemu/-/issues/998
-support more.
+Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Suggested-by: Laszlo Ersek <lersek@redhat.com>
+Message-id: 20220427042312.294300-1-richard.henderson@linaro.org
-Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
+[PMM: remove stray change to makefile comment]
 Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
 Reviewed-by: Igor Mammedov <imammedo@redhat.com>
 Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
 Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
 Message-id: 20200512030609.19593-5-gengdongjiu@huawei.com
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- default-configs/arm-softmmu.mak |  1 +
+ target/arm/cpu.c                  |  2 ++
- include/hw/acpi/aml-build.h     |  1 +
+ tests/tcg/aarch64/bti-3.c         | 42 +++++++++++++++++++++++++++++++
- include/hw/acpi/ghes.h          | 28 +++++++++++
+ tests/tcg/aarch64/Makefile.target |  6 ++---
- hw/acpi/aml-build.c             |  2 +
+files changed, 47 insertions(+), 3 deletions(-)
- hw/acpi/ghes.c                  | 89 +++++++++++++++++++++++++++++++++
+ create mode 100644 tests/tcg/aarch64/bti-3.c
  hw/arm/virt-acpi-build.c        |  5 ++
  hw/acpi/Kconfig                 |  4 ++
  hw/acpi/Makefile.objs           |  1 +
 files changed, 131 insertions(+)
  create mode 100644 include/hw/acpi/ghes.h
  create mode 100644 hw/acpi/ghes.c
-diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
+diff --git a/target/arm/cpu.c b/target/arm/cpu.c
 index XXXXXXX..XXXXXXX 100644
---- a/default-configs/arm-softmmu.mak
+--- a/target/arm/cpu.c
-+++ b/default-configs/arm-softmmu.mak
++++ b/target/arm/cpu.c
-@@ -XXX,XX +XXX,XX @@ CONFIG_FSL_IMX7=y
+@@ -XXX,XX +XXX,XX @@ static void arm_cpu_reset(DeviceState *dev)
- CONFIG_FSL_IMX6UL=y
+         /* Enable all PAC keys.  */
- CONFIG_SEMIHOSTING=y
+         env->cp15.sctlr_el[1] |= (SCTLR_EnIA | SCTLR_EnIB |
- CONFIG_ALLWINNER_H3=y
+                                   SCTLR_EnDA | SCTLR_EnDB);
-+CONFIG_ACPI_APEI=y
++        /* Trap on btype=3 for PACIxSP. */
-diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
++        env->cp15.sctlr_el[1] |= SCTLR_BT0;
-index XXXXXXX..XXXXXXX 100644
+         /* and to the FP/Neon instructions */
---- a/include/hw/acpi/aml-build.h
+         env->cp15.cpacr_el1 = deposit64(env->cp15.cpacr_el1, 20, 2, 3);
-+++ b/include/hw/acpi/aml-build.h
+         /* and to the SVE instructions */
-@@ -XXX,XX +XXX,XX @@ struct AcpiBuildTables {
+diff --git a/tests/tcg/aarch64/bti-3.c b/tests/tcg/aarch64/bti-3.c
      GArray *rsdp;
      GArray *tcpalog;
      GArray *vmgenid;
 +    GArray *hardware_errors;
      BIOSLinker *linker;
  } AcpiBuildTables;
 diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
 new file mode 100644
 index XXXXXXX..XXXXXXX
 --- /dev/null
-+++ b/include/hw/acpi/ghes.h
++++ b/tests/tcg/aarch64/bti-3.c
 @@ -XXX,XX +XXX,XX @@
 +/*
-+ * Support for generating APEI tables and recording CPER for Guests
++ * BTI vs PACIASP
 + *
 + * Copyright (c) 2020 HUAWEI TECHNOLOGIES CO., LTD.
 + *
 + * Author: Dongjiu Geng <gengdongjiu@huawei.com>
 + *
 + * This program is free software; you can redistribute it and/or modify
 + * it under the terms of the GNU General Public License as published by
 + * the Free Software Foundation; either version 2 of the License, or
 + * (at your option) any later version.
 +
 + * This program is distributed in the hope that it will be useful,
 + * but WITHOUT ANY WARRANTY; without even the implied warranty of
 + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 + * GNU General Public License for more details.
 +
 + * You should have received a copy of the GNU General Public License along
 + * with this program; if not, see <http://www.gnu.org/licenses/>.
 + */
 +
-+#ifndef ACPI_GHES_H
++#include "bti-crt.inc.c"
 +#define ACPI_GHES_H
 +
-+#include "hw/acpi/bios-linker-loader.h"
++static void skip2_sigill(int sig, siginfo_t *info, ucontext_t *uc)
 +{
 +    uc->uc_mcontext.pc += 8;
 +    uc->uc_mcontext.pstate = 1;
 +}
 +
-+void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker);
++#define BTYPE_1() \
-+#endif
++    asm("mov %0,#1; adr x16, 1f; br x16; 1: hint #25; mov %0,#0" \
-diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
++        : "=r"(skipped) : : "x16", "x30")
 +
 +#define BTYPE_2() \
 +    asm("mov %0,#1; adr x16, 1f; blr x16; 1: hint #25; mov %0,#0" \
 +        : "=r"(skipped) : : "x16", "x30")
 +
 +#define BTYPE_3() \
 +    asm("mov %0,#1; adr x15, 1f; br x15; 1: hint #25; mov %0,#0" \
 +        : "=r"(skipped) : : "x15", "x30")
 +
 +#define TEST(WHICH, EXPECT) \
 +    do { WHICH(); fail += skipped ^ EXPECT; } while (0)
 +
 +int main()
 +{
 +    int fail = 0;
 +    int skipped;
 +
 +    /* Signal-like with SA_SIGINFO.  */
 +    signal_info(SIGILL, skip2_sigill);
 +
 +    /* With SCTLR_EL1.BT0 set, PACIASP is not compatible with type=3. */
 +    TEST(BTYPE_1, 0);
 +    TEST(BTYPE_2, 0);
 +    TEST(BTYPE_3, 1);
 +
 +    return fail;
 +}
 diff --git a/tests/tcg/aarch64/Makefile.target b/tests/tcg/aarch64/Makefile.target
 index XXXXXXX..XXXXXXX 100644
---- a/hw/acpi/aml-build.c
+--- a/tests/tcg/aarch64/Makefile.target
-+++ b/hw/acpi/aml-build.c
++++ b/tests/tcg/aarch64/Makefile.target
-@@ -XXX,XX +XXX,XX @@ void acpi_build_tables_init(AcpiBuildTables *tables)
+@@ -XXX,XX +XXX,XX @@ endif
-     tables->table_data = g_array_new(false, true /* clear */, 1);
+ # BTI Tests
-     tables->tcpalog = g_array_new(false, true /* clear */, 1);
+ # bti-1 tests the elf notes, so we require special compiler support.
-     tables->vmgenid = g_array_new(false, true /* clear */, 1);
+ ifneq ($(CROSS_CC_HAS_ARMV8_BTI),)
-+    tables->hardware_errors = g_array_new(false, true /* clear */, 1);
+-AARCH64_TESTS += bti-1
-     tables->linker = bios_linker_loader_init();
+-bti-1: CFLAGS += -mbranch-protection=standard
- }
+-bti-1: LDFLAGS += -nostdlib
++AARCH64_TESTS += bti-1 bti-3
-@@ -XXX,XX +XXX,XX @@ void acpi_build_tables_cleanup(AcpiBuildTables *tables, bool mfre)
++bti-1 bti-3: CFLAGS += -mbranch-protection=standard
-     g_array_free(tables->table_data, true);
++bti-1 bti-3: LDFLAGS += -nostdlib
-     g_array_free(tables->tcpalog, mfre);
+ endif
-     g_array_free(tables->vmgenid, mfre);
+ # bti-2 tests PROT_BTI, so no special compiler support required.
-+    g_array_free(tables->hardware_errors, mfre);
+ AARCH64_TESTS += bti-2
  }
  /*
 diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
 new file mode 100644
 index XXXXXXX..XXXXXXX
 --- /dev/null
 +++ b/hw/acpi/ghes.c
@@ -XXX,XX +XXX,XX @@
 +/*
 + * Support for generating APEI tables and recording CPER for Guests
 + *
 + * Copyright (c) 2020 HUAWEI TECHNOLOGIES CO., LTD.
 + *
 + * Author: Dongjiu Geng <gengdongjiu@huawei.com>
 + *
 + * This program is free software; you can redistribute it and/or modify
 + * it under the terms of the GNU General Public License as published by
 + * the Free Software Foundation; either version 2 of the License, or
 + * (at your option) any later version.
 +
 + * This program is distributed in the hope that it will be useful,
 + * but WITHOUT ANY WARRANTY; without even the implied warranty of
 + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 + * GNU General Public License for more details.
 +
 + * You should have received a copy of the GNU General Public License along
 + * with this program; if not, see <http://www.gnu.org/licenses/>.
 + */
 +
 +#include "qemu/osdep.h"
 +#include "qemu/units.h"
 +#include "hw/acpi/ghes.h"
 +#include "hw/acpi/aml-build.h"
 +
 +#define ACPI_GHES_ERRORS_FW_CFG_FILE        "etc/hardware_errors"
 +#define ACPI_GHES_DATA_ADDR_FW_CFG_FILE     "etc/hardware_errors_addr"
 +
 +/* The max size in bytes for one error block */
 +#define ACPI_GHES_MAX_RAW_DATA_LENGTH   (1 * KiB)
 +
 +/* Now only support ARMv8 SEA notification type error source */
 +#define ACPI_GHES_ERROR_SOURCE_COUNT        1
 +
 +/*
 + * Build table for the hardware error fw_cfg blob.
 + * Initialize "etc/hardware_errors" and "etc/hardware_errors_addr" fw_cfg blobs.
 + * See docs/specs/acpi_hest_ghes.rst for blobs format.
 + */
 +void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker)
 +{
 +    int i, error_status_block_offset;
 +
 +    /* Build error_block_address */
 +    for (i = 0; i < ACPI_GHES_ERROR_SOURCE_COUNT; i++) {
 +        build_append_int_noprefix(hardware_errors, 0, sizeof(uint64_t));
 +    }
 +
 +    /* Build read_ack_register */
 +    for (i = 0; i < ACPI_GHES_ERROR_SOURCE_COUNT; i++) {
 +        /*
 +         * Initialize the value of read_ack_register to 1, so GHES can be
 +         * writeable after (re)boot.
 +         * ACPI 6.2: 18.3.2.8 Generic Hardware Error Source version 2
 +         * (GHESv2 - Type 10)
 +         */
 +        build_append_int_noprefix(hardware_errors, 1, sizeof(uint64_t));
 +    }
 +
 +    /* Generic Error Status Block offset in the hardware error fw_cfg blob */
 +    error_status_block_offset = hardware_errors->len;
 +
 +    /* Reserve space for Error Status Data Block */
 +    acpi_data_push(hardware_errors,
 +        ACPI_GHES_MAX_RAW_DATA_LENGTH * ACPI_GHES_ERROR_SOURCE_COUNT);
 +
 +    /* Tell guest firmware to place hardware_errors blob into RAM */
 +    bios_linker_loader_alloc(linker, ACPI_GHES_ERRORS_FW_CFG_FILE,
 +                             hardware_errors, sizeof(uint64_t), false);
 +
 +    for (i = 0; i < ACPI_GHES_ERROR_SOURCE_COUNT; i++) {
 +        /*
 +         * Tell firmware to patch error_block_address entries to point to
 +         * corresponding "Generic Error Status Block"
 +         */
 +        bios_linker_loader_add_pointer(linker,
 +            ACPI_GHES_ERRORS_FW_CFG_FILE, sizeof(uint64_t) * i,
 +            sizeof(uint64_t), ACPI_GHES_ERRORS_FW_CFG_FILE,
 +            error_status_block_offset + i * ACPI_GHES_MAX_RAW_DATA_LENGTH);
 +    }
 +
 +    /*
 +     * tell firmware to write hardware_errors GPA into
 +     * hardware_errors_addr fw_cfg, once the former has been initialized.
 +     */
 +    bios_linker_loader_write_pointer(linker, ACPI_GHES_DATA_ADDR_FW_CFG_FILE,
 +        0, sizeof(uint64_t), ACPI_GHES_ERRORS_FW_CFG_FILE, 0);
 +}
 diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/arm/virt-acpi-build.c
 +++ b/hw/arm/virt-acpi-build.c
@@ -XXX,XX +XXX,XX @@
  #include "sysemu/reset.h"
  #include "kvm_arm.h"
  #include "migration/vmstate.h"
 +#include "hw/acpi/ghes.h"
  #define ARM_SPI_BASE 32
@@ -XXX,XX +XXX,XX @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
      acpi_add_table(table_offsets, tables_blob);
      build_spcr(tables_blob, tables->linker, vms);
 +    if (vms->ras) {
 +        build_ghes_error_table(tables->hardware_errors, tables->linker);
 +    }
 +
      if (ms->numa_state->num_nodes > 0) {
          acpi_add_table(table_offsets, tables_blob);
          build_srat(tables_blob, tables->linker, vms);
 diff --git a/hw/acpi/Kconfig b/hw/acpi/Kconfig
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/acpi/Kconfig
 +++ b/hw/acpi/Kconfig
@@ -XXX,XX +XXX,XX @@ config ACPI_HMAT
      bool
      depends on ACPI
 +config ACPI_APEI
 +    bool
 +    depends on ACPI
 +
  config ACPI_PCI
      bool
      depends on ACPI && PCI
 diff --git a/hw/acpi/Makefile.objs b/hw/acpi/Makefile.objs
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/acpi/Makefile.objs
 +++ b/hw/acpi/Makefile.objs
@@ -XXX,XX +XXX,XX @@ common-obj-$(CONFIG_ACPI_NVDIMM) += nvdimm.o
  common-obj-$(CONFIG_ACPI_VMGENID) += vmgenid.o
  common-obj-$(CONFIG_ACPI_HW_REDUCED) += generic_event_device.o
  common-obj-$(CONFIG_ACPI_HMAT) += hmat.o
 +common-obj-$(CONFIG_ACPI_APEI) += ghes.o
  common-obj-$(call lnot,$(CONFIG_ACPI_X86)) += acpi-stub.o
  common-obj-$(call lnot,$(CONFIG_PC)) += acpi-x86-stub.o
 --
-.20.1
+.25.1

-[PULL 17/45] target/arm: Vectorize SABA/UABA
+[PULL 02/23] target/arm: Split out cpregs.h
 From: Richard Henderson <richard.henderson@linaro.org>
-Include 64-bit element size in preparation for SVE2.
+Move ARMCPRegInfo and all related declarations to a new
 internal header, out of the public cpu.h.
+Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200513163245.17915-17-richard.henderson@linaro.org
+Message-id: 20220501055028.646596-2-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/helper.h        |  17 +++--
+ target/arm/cpregs.h        | 413 +++++++++++++++++++++++++++++++++++++
- target/arm/translate.h     |   5 ++
+ target/arm/cpu.h           | 368 ---------------------------------
- target/arm/neon_helper.c   |  10 ---
+ hw/arm/pxa2xx.c            |   1 +
- target/arm/translate-a64.c |  17 ++---
+ hw/arm/pxa2xx_pic.c        |   1 +
- target/arm/translate.c     | 134 +++++++++++++++++++++++++++++++++++--
+ hw/intc/arm_gicv3_cpuif.c  |   1 +
- target/arm/vec_helper.c    |  24 +++++++
+ hw/intc/arm_gicv3_kvm.c    |   2 +
-files changed, 174 insertions(+), 33 deletions(-)
+ target/arm/cpu.c           |   1 +
  target/arm/cpu64.c         |   1 +
  target/arm/cpu_tcg.c       |   1 +
  target/arm/gdbstub.c       |   3 +-
  target/arm/helper.c        |   1 +
  target/arm/op_helper.c     |   1 +
  target/arm/translate-a64.c |   4 +-
  target/arm/translate.c     |   3 +-
 files changed, 427 insertions(+), 374 deletions(-)
  create mode 100644 target/arm/cpregs.h
-diff --git a/target/arm/helper.h b/target/arm/helper.h
+diff --git a/target/arm/cpregs.h b/target/arm/cpregs.h
-index XXXXXXX..XXXXXXX 100644
+new file mode 100644
---- a/target/arm/helper.h
+index XXXXXXX..XXXXXXX
-+++ b/target/arm/helper.h
+--- /dev/null
-@@ -XXX,XX +XXX,XX @@ DEF_HELPER_2(neon_pmax_s8, i32, i32, i32)
++++ b/target/arm/cpregs.h
- DEF_HELPER_2(neon_pmax_u16, i32, i32, i32)
+@@ -XXX,XX +XXX,XX @@
- DEF_HELPER_2(neon_pmax_s16, i32, i32, i32)
++/*
++ * QEMU ARM CP Register access and descriptions
--DEF_HELPER_2(neon_abd_u8, i32, i32, i32)
++ *
--DEF_HELPER_2(neon_abd_s8, i32, i32, i32)
++ * Copyright (c) 2022 Linaro Ltd
--DEF_HELPER_2(neon_abd_u16, i32, i32, i32)
++ *
--DEF_HELPER_2(neon_abd_s16, i32, i32, i32)
++ * This program is free software; you can redistribute it and/or
--DEF_HELPER_2(neon_abd_u32, i32, i32, i32)
++ * modify it under the terms of the GNU General Public License
--DEF_HELPER_2(neon_abd_s32, i32, i32, i32)
++ * as published by the Free Software Foundation; either version 2
--
++ * of the License, or (at your option) any later version.
- DEF_HELPER_2(neon_shl_u16, i32, i32, i32)
++ *
- DEF_HELPER_2(neon_shl_s16, i32, i32, i32)
++ * This program is distributed in the hope that it will be useful,
- DEF_HELPER_2(neon_rshl_u8, i32, i32, i32)
++ * but WITHOUT ANY WARRANTY; without even the implied warranty of
-@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(gvec_uabd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
++ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- DEF_HELPER_FLAGS_4(gvec_uabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
++ * GNU General Public License for more details.
- DEF_HELPER_FLAGS_4(gvec_uabd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
++ *
++ * You should have received a copy of the GNU General Public License
-+DEF_HELPER_FLAGS_4(gvec_saba_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
++ * along with this program; if not, see
-+DEF_HELPER_FLAGS_4(gvec_saba_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
++ * <http://www.gnu.org/licenses/gpl-2.0.html>
-+DEF_HELPER_FLAGS_4(gvec_saba_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
++ */
-+DEF_HELPER_FLAGS_4(gvec_saba_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
++
-+
++#ifndef TARGET_ARM_CPREGS_H
-+DEF_HELPER_FLAGS_4(gvec_uaba_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
++#define TARGET_ARM_CPREGS_H
-+DEF_HELPER_FLAGS_4(gvec_uaba_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
++
-+DEF_HELPER_FLAGS_4(gvec_uaba_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
++/*
-+DEF_HELPER_FLAGS_4(gvec_uaba_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
++ * ARMCPRegInfo type field bits. If the SPECIAL bit is set this is a
-+
++ * special-behaviour cp reg and bits [11..8] indicate what behaviour
- #ifdef TARGET_AARCH64
++ * it has. Otherwise it is a simple cp reg, where CONST indicates that
- #include "helper-a64.h"
++ * TCG can assume the value to be constant (ie load at translate time)
- #include "helper-sve.h"
++ * and 64BIT indicates a 64 bit wide coprocessor register. SUPPRESS_TB_END
-diff --git a/target/arm/translate.h b/target/arm/translate.h
++ * indicates that the TB should not be ended after a write to this register
-index XXXXXXX..XXXXXXX 100644
++ * (the default is that the TB ends after cp writes). OVERRIDE permits
---- a/target/arm/translate.h
++ * a register definition to override a previous definition for the
-+++ b/target/arm/translate.h
++ * same (cp, is64, crn, crm, opc1, opc2) tuple: either the new or the
-@@ -XXX,XX +XXX,XX @@ void gen_gvec_sabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
++ * old must have the OVERRIDE bit set.
- void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
++ * ALIAS indicates that this register is an alias view of some underlying
-                    uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
++ * state which is also visible via another register, and that the other
++ * register is handling migration and reset; registers marked ALIAS will not be
-+void gen_gvec_saba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
++ * migrated but may have their state set by syncing of register state from KVM.
-+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
++ * NO_RAW indicates that this register has no underlying state and does not
-+void gen_gvec_uaba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
++ * support raw access for state saving/loading; it will not be used for either
-+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
++ * migration or KVM state synchronization. (Typically this is for "registers"
-+
++ * which are actually used as instructions for cache maintenance and so on.)
 + * IO indicates that this register does I/O and therefore its accesses
 + * need to be marked with gen_io_start() and also end the TB. In particular,
 + * registers which implement clocks or timers require this.
 + * RAISES_EXC is for when the read or write hook might raise an exception;
 + * the generated code will synchronize the CPU state before calling the hook
 + * so that it is safe for the hook to call raise_exception().
 + * NEWEL is for writes to registers that might change the exception
 + * level - typically on older ARM chips. For those cases we need to
 + * re-read the new el when recomputing the translation flags.
 + */
 +#define ARM_CP_SPECIAL           0x0001
 +#define ARM_CP_CONST             0x0002
 +#define ARM_CP_64BIT             0x0004
 +#define ARM_CP_SUPPRESS_TB_END   0x0008
 +#define ARM_CP_OVERRIDE          0x0010
 +#define ARM_CP_ALIAS             0x0020
 +#define ARM_CP_IO                0x0040
 +#define ARM_CP_NO_RAW            0x0080
 +#define ARM_CP_NOP               (ARM_CP_SPECIAL | 0x0100)
 +#define ARM_CP_WFI               (ARM_CP_SPECIAL | 0x0200)
 +#define ARM_CP_NZCV              (ARM_CP_SPECIAL | 0x0300)
 +#define ARM_CP_CURRENTEL         (ARM_CP_SPECIAL | 0x0400)
 +#define ARM_CP_DC_ZVA            (ARM_CP_SPECIAL | 0x0500)
 +#define ARM_CP_DC_GVA            (ARM_CP_SPECIAL | 0x0600)
 +#define ARM_CP_DC_GZVA           (ARM_CP_SPECIAL | 0x0700)
 +#define ARM_LAST_SPECIAL         ARM_CP_DC_GZVA
 +#define ARM_CP_FPU               0x1000
 +#define ARM_CP_SVE               0x2000
 +#define ARM_CP_NO_GDB            0x4000
 +#define ARM_CP_RAISES_EXC        0x8000
 +#define ARM_CP_NEWEL             0x10000
 +/* Used only as a terminator for ARMCPRegInfo lists */
 +#define ARM_CP_SENTINEL          0xfffff
 +/* Mask of only the flag bits in a type field */
 +#define ARM_CP_FLAG_MASK         0x1f0ff
 +
 +/*
 + * Valid values for ARMCPRegInfo state field, indicating which of
 + * the AArch32 and AArch64 execution states this register is visible in.
 + * If the reginfo doesn't explicitly specify then it is AArch32 only.
 + * If the reginfo is declared to be visible in both states then a second
 + * reginfo is synthesised for the AArch32 view of the AArch64 register,
 + * such that the AArch32 view is the lower 32 bits of the AArch64 one.
 + * Note that we rely on the values of these enums as we iterate through
 + * the various states in some places.
 + */
 +enum {
 +    ARM_CP_STATE_AA32 = 0,
 +    ARM_CP_STATE_AA64 = 1,
 +    ARM_CP_STATE_BOTH = 2,
 +};
 +
 +/*
 + * ARM CP register secure state flags.  These flags identify security state
 + * attributes for a given CP register entry.
 + * The existence of both or neither secure and non-secure flags indicates that
 + * the register has both a secure and non-secure hash entry.  A single one of
 + * these flags causes the register to only be hashed for the specified
 + * security state.
 + * Although definitions may have any combination of the S/NS bits, each
 + * registered entry will only have one to identify whether the entry is secure
 + * or non-secure.
 + */
 +enum {
 +    ARM_CP_SECSTATE_S =   (1 << 0), /* bit[0]: Secure state register */
 +    ARM_CP_SECSTATE_NS =  (1 << 1), /* bit[1]: Non-secure state register */
 +};
 +
 +/*
 + * Return true if cptype is a valid type field. This is used to try to
 + * catch errors where the sentinel has been accidentally left off the end
 + * of a list of registers.
 + */
 +static inline bool cptype_valid(int cptype)
 +{
 +    return ((cptype & ~ARM_CP_FLAG_MASK) == 0)
 +        || ((cptype & ARM_CP_SPECIAL) &&
 +            ((cptype & ~ARM_CP_FLAG_MASK) <= ARM_LAST_SPECIAL));
 +}
 +
 +/*
 + * Access rights:
 + * We define bits for Read and Write access for what rev C of the v7-AR ARM ARM
 + * defines as PL0 (user), PL1 (fiq/irq/svc/abt/und/sys, ie privileged), and
 + * PL2 (hyp). The other level which has Read and Write bits is Secure PL1
 + * (ie any of the privileged modes in Secure state, or Monitor mode).
 + * If a register is accessible in one privilege level it's always accessible
 + * in higher privilege levels too. Since "Secure PL1" also follows this rule
 + * (ie anything visible in PL2 is visible in S-PL1, some things are only
 + * visible in S-PL1) but "Secure PL1" is a bit of a mouthful, we bend the
 + * terminology a little and call this PL3.
 + * In AArch64 things are somewhat simpler as the PLx bits line up exactly
 + * with the ELx exception levels.
 + *
 + * If access permissions for a register are more complex than can be
 + * described with these bits, then use a laxer set of restrictions, and
 + * do the more restrictive/complex check inside a helper function.
 + */
 +#define PL3_R 0x80
 +#define PL3_W 0x40
 +#define PL2_R (0x20 | PL3_R)
 +#define PL2_W (0x10 | PL3_W)
 +#define PL1_R (0x08 | PL2_R)
 +#define PL1_W (0x04 | PL2_W)
 +#define PL0_R (0x02 | PL1_R)
 +#define PL0_W (0x01 | PL1_W)
 +
 +/*
 + * For user-mode some registers are accessible to EL0 via a kernel
 + * trap-and-emulate ABI. In this case we define the read permissions
 + * as actually being PL0_R. However some bits of any given register
 + * may still be masked.
 + */
 +#ifdef CONFIG_USER_ONLY
 +#define PL0U_R PL0_R
 +#else
 +#define PL0U_R PL1_R
 +#endif
 +
 +#define PL3_RW (PL3_R | PL3_W)
 +#define PL2_RW (PL2_R | PL2_W)
 +#define PL1_RW (PL1_R | PL1_W)
 +#define PL0_RW (PL0_R | PL0_W)
 +
 +typedef enum CPAccessResult {
 +    /* Access is permitted */
 +    CP_ACCESS_OK = 0,
 +    /*
 +     * Access fails due to a configurable trap or enable which would
 +     * result in a categorized exception syndrome giving information about
 +     * the failing instruction (ie syndrome category 0x3, 0x4, 0x5, 0x6,
 +     * 0xc or 0x18). The exception is taken to the usual target EL (EL1 or
 +     * PL1 if in EL0, otherwise to the current EL).
 +     */
 +    CP_ACCESS_TRAP = 1,
 +    /*
 +     * Access fails and results in an exception syndrome 0x0 ("uncategorized").
 +     * Note that this is not a catch-all case -- the set of cases which may
 +     * result in this failure is specifically defined by the architecture.
 +     */
 +    CP_ACCESS_TRAP_UNCATEGORIZED = 2,
 +    /* As CP_ACCESS_TRAP, but for traps directly to EL2 or EL3 */
 +    CP_ACCESS_TRAP_EL2 = 3,
 +    CP_ACCESS_TRAP_EL3 = 4,
 +    /* As CP_ACCESS_UNCATEGORIZED, but for traps directly to EL2 or EL3 */
 +    CP_ACCESS_TRAP_UNCATEGORIZED_EL2 = 5,
 +    CP_ACCESS_TRAP_UNCATEGORIZED_EL3 = 6,
 +} CPAccessResult;
 +
 +typedef struct ARMCPRegInfo ARMCPRegInfo;
 +
 +/*
 + * Access functions for coprocessor registers. These cannot fail and
 + * may not raise exceptions.
 + */
 +typedef uint64_t CPReadFn(CPUARMState *env, const ARMCPRegInfo *opaque);
 +typedef void CPWriteFn(CPUARMState *env, const ARMCPRegInfo *opaque,
 +                       uint64_t value);
 +/* Access permission check functions for coprocessor registers. */
 +typedef CPAccessResult CPAccessFn(CPUARMState *env,
 +                                  const ARMCPRegInfo *opaque,
 +                                  bool isread);
 +/* Hook function for register reset */
 +typedef void CPResetFn(CPUARMState *env, const ARMCPRegInfo *opaque);
 +
 +#define CP_ANY 0xff
 +
 +/* Definition of an ARM coprocessor register */
 +struct ARMCPRegInfo {
 +    /* Name of register (useful mainly for debugging, need not be unique) */
 +    const char *name;
 +    /*
 +     * Location of register: coprocessor number and (crn,crm,opc1,opc2)
 +     * tuple. Any of crm, opc1 and opc2 may be CP_ANY to indicate a
 +     * 'wildcard' field -- any value of that field in the MRC/MCR insn
 +     * will be decoded to this register. The register read and write
 +     * callbacks will be passed an ARMCPRegInfo with the crn/crm/opc1/opc2
 +     * used by the program, so it is possible to register a wildcard and
 +     * then behave differently on read/write if necessary.
 +     * For 64 bit registers, only crm and opc1 are relevant; crn and opc2
 +     * must both be zero.
 +     * For AArch64-visible registers, opc0 is also used.
 +     * Since there are no "coprocessors" in AArch64, cp is purely used as a
 +     * way to distinguish (for KVM's benefit) guest-visible system registers
 +     * from demuxed ones provided to preserve the "no side effects on
 +     * KVM register read/write from QEMU" semantics. cp==0x13 is guest
 +     * visible (to match KVM's encoding); cp==0 will be converted to
 +     * cp==0x13 when the ARMCPRegInfo is registered, for convenience.
 +     */
 +    uint8_t cp;
 +    uint8_t crn;
 +    uint8_t crm;
 +    uint8_t opc0;
 +    uint8_t opc1;
 +    uint8_t opc2;
 +    /* Execution state in which this register is visible: ARM_CP_STATE_* */
 +    int state;
 +    /* Register type: ARM_CP_* bits/values */
 +    int type;
 +    /* Access rights: PL*_[RW] */
 +    int access;
 +    /* Security state: ARM_CP_SECSTATE_* bits/values */
 +    int secure;
 +    /*
 +     * The opaque pointer passed to define_arm_cp_regs_with_opaque() when
 +     * this register was defined: can be used to hand data through to the
 +     * register read/write functions, since they are passed the ARMCPRegInfo*.
 +     */
 +    void *opaque;
 +    /*
 +     * Value of this register, if it is ARM_CP_CONST. Otherwise, if
 +     * fieldoffset is non-zero, the reset value of the register.
 +     */
 +    uint64_t resetvalue;
 +    /*
 +     * Offset of the field in CPUARMState for this register.
 +     * This is not needed if either:
 +     *  1. type is ARM_CP_CONST or one of the ARM_CP_SPECIALs
 +     *  2. both readfn and writefn are specified
 +     */
 +    ptrdiff_t fieldoffset; /* offsetof(CPUARMState, field) */
 +
 +    /*
 +     * Offsets of the secure and non-secure fields in CPUARMState for the
 +     * register if it is banked.  These fields are only used during the static
 +     * registration of a register.  During hashing the bank associated
 +     * with a given security state is copied to fieldoffset which is used from
 +     * there on out.
 +     *
 +     * It is expected that register definitions use either fieldoffset or
 +     * bank_fieldoffsets in the definition but not both.  It is also expected
 +     * that both bank offsets are set when defining a banked register.  This
 +     * use indicates that a register is banked.
 +     */
 +    ptrdiff_t bank_fieldoffsets[2];
 +
 +    /*
 +     * Function for making any access checks for this register in addition to
 +     * those specified by the 'access' permissions bits. If NULL, no extra
 +     * checks required. The access check is performed at runtime, not at
 +     * translate time.
 +     */
 +    CPAccessFn *accessfn;
 +    /*
 +     * Function for handling reads of this register. If NULL, then reads
 +     * will be done by loading from the offset into CPUARMState specified
 +     * by fieldoffset.
 +     */
 +    CPReadFn *readfn;
 +    /*
 +     * Function for handling writes of this register. If NULL, then writes
 +     * will be done by writing to the offset into CPUARMState specified
 +     * by fieldoffset.
 +     */
 +    CPWriteFn *writefn;
 +    /*
 +     * Function for doing a "raw" read; used when we need to copy
 +     * coprocessor state to the kernel for KVM or out for
 +     * migration. This only needs to be provided if there is also a
 +     * readfn and it has side effects (for instance clear-on-read bits).
 +     */
 +    CPReadFn *raw_readfn;
 +    /*
 +     * Function for doing a "raw" write; used when we need to copy KVM
 +     * kernel coprocessor state into userspace, or for inbound
 +     * migration. This only needs to be provided if there is also a
 +     * writefn and it masks out "unwritable" bits or has write-one-to-clear
 +     * or similar behaviour.
 +     */
 +    CPWriteFn *raw_writefn;
 +    /*
 +     * Function for resetting the register. If NULL, then reset will be done
 +     * by writing resetvalue to the field specified in fieldoffset. If
 +     * fieldoffset is 0 then no reset will be done.
 +     */
 +    CPResetFn *resetfn;
 +
 +    /*
 +     * "Original" writefn and readfn.
 +     * For ARMv8.1-VHE register aliases, we overwrite the read/write
 +     * accessor functions of various EL1/EL0 to perform the runtime
 +     * check for which sysreg should actually be modified, and then
 +     * forwards the operation.  Before overwriting the accessors,
 +     * the original function is copied here, so that accesses that
 +     * really do go to the EL1/EL0 version proceed normally.
 +     * (The corresponding EL2 register is linked via opaque.)
 +     */
 +    CPReadFn *orig_readfn;
 +    CPWriteFn *orig_writefn;
 +};
 +
 +/*
 + * Macros which are lvalues for the field in CPUARMState for the
 + * ARMCPRegInfo *ri.
 + */
 +#define CPREG_FIELD32(env, ri) \
 +    (*(uint32_t *)((char *)(env) + (ri)->fieldoffset))
 +#define CPREG_FIELD64(env, ri) \
 +    (*(uint64_t *)((char *)(env) + (ri)->fieldoffset))
 +
 +#define REGINFO_SENTINEL { .type = ARM_CP_SENTINEL }
 +
 +void define_arm_cp_regs_with_opaque(ARMCPU *cpu,
 +                                    const ARMCPRegInfo *regs, void *opaque);
 +void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu,
 +                                       const ARMCPRegInfo *regs, void *opaque);
 +static inline void define_arm_cp_regs(ARMCPU *cpu, const ARMCPRegInfo *regs)
 +{
 +    define_arm_cp_regs_with_opaque(cpu, regs, 0);
 +}
 +static inline void define_one_arm_cp_reg(ARMCPU *cpu, const ARMCPRegInfo *regs)
 +{
 +    define_one_arm_cp_reg_with_opaque(cpu, regs, 0);
 +}
 +const ARMCPRegInfo *get_arm_cp_reginfo(GHashTable *cpregs, uint32_t encoded_cp);
 +
 +/*
 + * Definition of an ARM co-processor register as viewed from
 + * userspace. This is used for presenting sanitised versions of
 + * registers to userspace when emulating the Linux AArch64 CPU
 + * ID/feature ABI (advertised as HWCAP_CPUID).
 + */
 +typedef struct ARMCPRegUserSpaceInfo {
 +    /* Name of register */
 +    const char *name;
 +
 +    /* Is the name actually a glob pattern */
 +    bool is_glob;
 +
 +    /* Only some bits are exported to user space */
 +    uint64_t exported_bits;
 +
 +    /* Fixed bits are applied after the mask */
 +    uint64_t fixed_bits;
 +} ARMCPRegUserSpaceInfo;
 +
 +#define REGUSERINFO_SENTINEL { .name = NULL }
 +
 +void modify_arm_cp_regs(ARMCPRegInfo *regs, const ARMCPRegUserSpaceInfo *mods);
 +
 +/* CPWriteFn that can be used to implement writes-ignored behaviour */
 +void arm_cp_write_ignore(CPUARMState *env, const ARMCPRegInfo *ri,
 +                         uint64_t value);
 +/* CPReadFn that can be used for read-as-zero behaviour */
 +uint64_t arm_cp_read_zero(CPUARMState *env, const ARMCPRegInfo *ri);
 +
 +/*
 + * CPResetFn that does nothing, for use if no reset is required even
 + * if fieldoffset is non zero.
 + */
 +void arm_cp_reset_ignore(CPUARMState *env, const ARMCPRegInfo *opaque);
 +
 +/*
 + * Return true if this reginfo struct's field in the cpu state struct
 + * is 64 bits wide.
 + */
 +static inline bool cpreg_field_is_64bit(const ARMCPRegInfo *ri)
 +{
 +    return (ri->state == ARM_CP_STATE_AA64) || (ri->type & ARM_CP_64BIT);
 +}
 +
 +static inline bool cp_access_ok(int current_el,
 +                                const ARMCPRegInfo *ri, int isread)
 +{
 +    return (ri->access >> ((current_el * 2) + isread)) & 1;
 +}
 +
 +/* Raw read of a coprocessor register (as needed for migration, etc) */
 +uint64_t read_raw_cp_reg(CPUARMState *env, const ARMCPRegInfo *ri);
 +
 +#endif /* TARGET_ARM_CPREGS_H */
 diff --git a/target/arm/cpu.h b/target/arm/cpu.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu.h
 +++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ static inline uint64_t cpreg_to_kvm_id(uint32_t cpregid)
      return kvmid;
  }
 -/* ARMCPRegInfo type field bits. If the SPECIAL bit is set this is a
 - * special-behaviour cp reg and bits [11..8] indicate what behaviour
 - * it has. Otherwise it is a simple cp reg, where CONST indicates that
 - * TCG can assume the value to be constant (ie load at translate time)
 - * and 64BIT indicates a 64 bit wide coprocessor register. SUPPRESS_TB_END
 - * indicates that the TB should not be ended after a write to this register
 - * (the default is that the TB ends after cp writes). OVERRIDE permits
 - * a register definition to override a previous definition for the
 - * same (cp, is64, crn, crm, opc1, opc2) tuple: either the new or the
 - * old must have the OVERRIDE bit set.
 - * ALIAS indicates that this register is an alias view of some underlying
 - * state which is also visible via another register, and that the other
 - * register is handling migration and reset; registers marked ALIAS will not be
 - * migrated but may have their state set by syncing of register state from KVM.
 - * NO_RAW indicates that this register has no underlying state and does not
 - * support raw access for state saving/loading; it will not be used for either
 - * migration or KVM state synchronization. (Typically this is for "registers"
 - * which are actually used as instructions for cache maintenance and so on.)
 - * IO indicates that this register does I/O and therefore its accesses
 - * need to be marked with gen_io_start() and also end the TB. In particular,
 - * registers which implement clocks or timers require this.
 - * RAISES_EXC is for when the read or write hook might raise an exception;
 - * the generated code will synchronize the CPU state before calling the hook
 - * so that it is safe for the hook to call raise_exception().
 - * NEWEL is for writes to registers that might change the exception
 - * level - typically on older ARM chips. For those cases we need to
 - * re-read the new el when recomputing the translation flags.
 - */
 -#define ARM_CP_SPECIAL           0x0001
 -#define ARM_CP_CONST             0x0002
 -#define ARM_CP_64BIT             0x0004
 -#define ARM_CP_SUPPRESS_TB_END   0x0008
 -#define ARM_CP_OVERRIDE          0x0010
 -#define ARM_CP_ALIAS             0x0020
 -#define ARM_CP_IO                0x0040
 -#define ARM_CP_NO_RAW            0x0080
 -#define ARM_CP_NOP               (ARM_CP_SPECIAL | 0x0100)
 -#define ARM_CP_WFI               (ARM_CP_SPECIAL | 0x0200)
 -#define ARM_CP_NZCV              (ARM_CP_SPECIAL | 0x0300)
 -#define ARM_CP_CURRENTEL         (ARM_CP_SPECIAL | 0x0400)
 -#define ARM_CP_DC_ZVA            (ARM_CP_SPECIAL | 0x0500)
 -#define ARM_CP_DC_GVA            (ARM_CP_SPECIAL | 0x0600)
 -#define ARM_CP_DC_GZVA           (ARM_CP_SPECIAL | 0x0700)
 -#define ARM_LAST_SPECIAL         ARM_CP_DC_GZVA
 -#define ARM_CP_FPU               0x1000
 -#define ARM_CP_SVE               0x2000
 -#define ARM_CP_NO_GDB            0x4000
 -#define ARM_CP_RAISES_EXC        0x8000
 -#define ARM_CP_NEWEL             0x10000
 -/* Used only as a terminator for ARMCPRegInfo lists */
 -#define ARM_CP_SENTINEL          0xfffff
 -/* Mask of only the flag bits in a type field */
 -#define ARM_CP_FLAG_MASK         0x1f0ff
 -
 -/* Valid values for ARMCPRegInfo state field, indicating which of
 - * the AArch32 and AArch64 execution states this register is visible in.
 - * If the reginfo doesn't explicitly specify then it is AArch32 only.
 - * If the reginfo is declared to be visible in both states then a second
 - * reginfo is synthesised for the AArch32 view of the AArch64 register,
 - * such that the AArch32 view is the lower 32 bits of the AArch64 one.
 - * Note that we rely on the values of these enums as we iterate through
 - * the various states in some places.
 - */
 -enum {
 -    ARM_CP_STATE_AA32 = 0,
 -    ARM_CP_STATE_AA64 = 1,
 -    ARM_CP_STATE_BOTH = 2,
 -};
 -
 -/* ARM CP register secure state flags.  These flags identify security state
 - * attributes for a given CP register entry.
 - * The existence of both or neither secure and non-secure flags indicates that
 - * the register has both a secure and non-secure hash entry.  A single one of
 - * these flags causes the register to only be hashed for the specified
 - * security state.
 - * Although definitions may have any combination of the S/NS bits, each
 - * registered entry will only have one to identify whether the entry is secure
 - * or non-secure.
 - */
 -enum {
 -    ARM_CP_SECSTATE_S =   (1 << 0), /* bit[0]: Secure state register */
 -    ARM_CP_SECSTATE_NS =  (1 << 1), /* bit[1]: Non-secure state register */
 -};
 -
 -/* Return true if cptype is a valid type field. This is used to try to
 - * catch errors where the sentinel has been accidentally left off the end
 - * of a list of registers.
 - */
 -static inline bool cptype_valid(int cptype)
 -{
 -    return ((cptype & ~ARM_CP_FLAG_MASK) == 0)
 -        || ((cptype & ARM_CP_SPECIAL) &&
 -            ((cptype & ~ARM_CP_FLAG_MASK) <= ARM_LAST_SPECIAL));
 -}
 -
 -/* Access rights:
 - * We define bits for Read and Write access for what rev C of the v7-AR ARM ARM
 - * defines as PL0 (user), PL1 (fiq/irq/svc/abt/und/sys, ie privileged), and
 - * PL2 (hyp). The other level which has Read and Write bits is Secure PL1
 - * (ie any of the privileged modes in Secure state, or Monitor mode).
 - * If a register is accessible in one privilege level it's always accessible
 - * in higher privilege levels too. Since "Secure PL1" also follows this rule
 - * (ie anything visible in PL2 is visible in S-PL1, some things are only
 - * visible in S-PL1) but "Secure PL1" is a bit of a mouthful, we bend the
 - * terminology a little and call this PL3.
 - * In AArch64 things are somewhat simpler as the PLx bits line up exactly
 - * with the ELx exception levels.
 - *
 - * If access permissions for a register are more complex than can be
 - * described with these bits, then use a laxer set of restrictions, and
 - * do the more restrictive/complex check inside a helper function.
 - */
 -#define PL3_R 0x80
 -#define PL3_W 0x40
 -#define PL2_R (0x20 | PL3_R)
 -#define PL2_W (0x10 | PL3_W)
 -#define PL1_R (0x08 | PL2_R)
 -#define PL1_W (0x04 | PL2_W)
 -#define PL0_R (0x02 | PL1_R)
 -#define PL0_W (0x01 | PL1_W)
 -
 -/*
 - * For user-mode some registers are accessible to EL0 via a kernel
 - * trap-and-emulate ABI. In this case we define the read permissions
 - * as actually being PL0_R. However some bits of any given register
 - * may still be masked.
 - */
 -#ifdef CONFIG_USER_ONLY
 -#define PL0U_R PL0_R
 -#else
 -#define PL0U_R PL1_R
 -#endif
 -
 -#define PL3_RW (PL3_R | PL3_W)
 -#define PL2_RW (PL2_R | PL2_W)
 -#define PL1_RW (PL1_R | PL1_W)
 -#define PL0_RW (PL0_R | PL0_W)
 -
  /* Return the highest implemented Exception Level */
  static inline int arm_highest_el(CPUARMState *env)
  {
@@ -XXX,XX +XXX,XX @@ static inline int arm_current_el(CPUARMState *env)
      }
  }
 -typedef struct ARMCPRegInfo ARMCPRegInfo;
 -
 -typedef enum CPAccessResult {
 -    /* Access is permitted */
 -    CP_ACCESS_OK = 0,
 -    /* Access fails due to a configurable trap or enable which would
 -     * result in a categorized exception syndrome giving information about
 -     * the failing instruction (ie syndrome category 0x3, 0x4, 0x5, 0x6,
 -     * 0xc or 0x18). The exception is taken to the usual target EL (EL1 or
 -     * PL1 if in EL0, otherwise to the current EL).
 -     */
 -    CP_ACCESS_TRAP = 1,
 -    /* Access fails and results in an exception syndrome 0x0 ("uncategorized").
 -     * Note that this is not a catch-all case -- the set of cases which may
 -     * result in this failure is specifically defined by the architecture.
 -     */
 -    CP_ACCESS_TRAP_UNCATEGORIZED = 2,
 -    /* As CP_ACCESS_TRAP, but for traps directly to EL2 or EL3 */
 -    CP_ACCESS_TRAP_EL2 = 3,
 -    CP_ACCESS_TRAP_EL3 = 4,
 -    /* As CP_ACCESS_UNCATEGORIZED, but for traps directly to EL2 or EL3 */
 -    CP_ACCESS_TRAP_UNCATEGORIZED_EL2 = 5,
 -    CP_ACCESS_TRAP_UNCATEGORIZED_EL3 = 6,
 -} CPAccessResult;
 -
 -/* Access functions for coprocessor registers. These cannot fail and
 - * may not raise exceptions.
 - */
 -typedef uint64_t CPReadFn(CPUARMState *env, const ARMCPRegInfo *opaque);
 -typedef void CPWriteFn(CPUARMState *env, const ARMCPRegInfo *opaque,
 -                       uint64_t value);
 -/* Access permission check functions for coprocessor registers. */
 -typedef CPAccessResult CPAccessFn(CPUARMState *env,
 -                                  const ARMCPRegInfo *opaque,
 -                                  bool isread);
 -/* Hook function for register reset */
 -typedef void CPResetFn(CPUARMState *env, const ARMCPRegInfo *opaque);
 -
 -#define CP_ANY 0xff
 -
 -/* Definition of an ARM coprocessor register */
 -struct ARMCPRegInfo {
 -    /* Name of register (useful mainly for debugging, need not be unique) */
 -    const char *name;
 -    /* Location of register: coprocessor number and (crn,crm,opc1,opc2)
 -     * tuple. Any of crm, opc1 and opc2 may be CP_ANY to indicate a
 -     * 'wildcard' field -- any value of that field in the MRC/MCR insn
 -     * will be decoded to this register. The register read and write
 -     * callbacks will be passed an ARMCPRegInfo with the crn/crm/opc1/opc2
 -     * used by the program, so it is possible to register a wildcard and
 -     * then behave differently on read/write if necessary.
 -     * For 64 bit registers, only crm and opc1 are relevant; crn and opc2
 -     * must both be zero.
 -     * For AArch64-visible registers, opc0 is also used.
 -     * Since there are no "coprocessors" in AArch64, cp is purely used as a
 -     * way to distinguish (for KVM's benefit) guest-visible system registers
 -     * from demuxed ones provided to preserve the "no side effects on
 -     * KVM register read/write from QEMU" semantics. cp==0x13 is guest
 -     * visible (to match KVM's encoding); cp==0 will be converted to
 -     * cp==0x13 when the ARMCPRegInfo is registered, for convenience.
 -     */
 -    uint8_t cp;
 -    uint8_t crn;
 -    uint8_t crm;
 -    uint8_t opc0;
 -    uint8_t opc1;
 -    uint8_t opc2;
 -    /* Execution state in which this register is visible: ARM_CP_STATE_* */
 -    int state;
 -    /* Register type: ARM_CP_* bits/values */
 -    int type;
 -    /* Access rights: PL*_[RW] */
 -    int access;
 -    /* Security state: ARM_CP_SECSTATE_* bits/values */
 -    int secure;
 -    /* The opaque pointer passed to define_arm_cp_regs_with_opaque() when
 -     * this register was defined: can be used to hand data through to the
 -     * register read/write functions, since they are passed the ARMCPRegInfo*.
 -     */
 -    void *opaque;
 -    /* Value of this register, if it is ARM_CP_CONST. Otherwise, if
 -     * fieldoffset is non-zero, the reset value of the register.
 -     */
 -    uint64_t resetvalue;
 -    /* Offset of the field in CPUARMState for this register.
 -     *
 -     * This is not needed if either:
 -     *  1. type is ARM_CP_CONST or one of the ARM_CP_SPECIALs
 -     *  2. both readfn and writefn are specified
 -     */
 -    ptrdiff_t fieldoffset; /* offsetof(CPUARMState, field) */
 -
 -    /* Offsets of the secure and non-secure fields in CPUARMState for the
 -     * register if it is banked.  These fields are only used during the static
 -     * registration of a register.  During hashing the bank associated
 -     * with a given security state is copied to fieldoffset which is used from
 -     * there on out.
 -     *
 -     * It is expected that register definitions use either fieldoffset or
 -     * bank_fieldoffsets in the definition but not both.  It is also expected
 -     * that both bank offsets are set when defining a banked register.  This
 -     * use indicates that a register is banked.
 -     */
 -    ptrdiff_t bank_fieldoffsets[2];
 -
 -    /* Function for making any access checks for this register in addition to
 -     * those specified by the 'access' permissions bits. If NULL, no extra
 -     * checks required. The access check is performed at runtime, not at
 -     * translate time.
 -     */
 -    CPAccessFn *accessfn;
 -    /* Function for handling reads of this register. If NULL, then reads
 -     * will be done by loading from the offset into CPUARMState specified
 -     * by fieldoffset.
 -     */
 -    CPReadFn *readfn;
 -    /* Function for handling writes of this register. If NULL, then writes
 -     * will be done by writing to the offset into CPUARMState specified
 -     * by fieldoffset.
 -     */
 -    CPWriteFn *writefn;
 -    /* Function for doing a "raw" read; used when we need to copy
 -     * coprocessor state to the kernel for KVM or out for
 -     * migration. This only needs to be provided if there is also a
 -     * readfn and it has side effects (for instance clear-on-read bits).
 -     */
 -    CPReadFn *raw_readfn;
 -    /* Function for doing a "raw" write; used when we need to copy KVM
 -     * kernel coprocessor state into userspace, or for inbound
 -     * migration. This only needs to be provided if there is also a
 -     * writefn and it masks out "unwritable" bits or has write-one-to-clear
 -     * or similar behaviour.
 -     */
 -    CPWriteFn *raw_writefn;
 -    /* Function for resetting the register. If NULL, then reset will be done
 -     * by writing resetvalue to the field specified in fieldoffset. If
 -     * fieldoffset is 0 then no reset will be done.
 -     */
 -    CPResetFn *resetfn;
 -
 -    /*
 -     * "Original" writefn and readfn.
 -     * For ARMv8.1-VHE register aliases, we overwrite the read/write
 -     * accessor functions of various EL1/EL0 to perform the runtime
 -     * check for which sysreg should actually be modified, and then
 -     * forwards the operation.  Before overwriting the accessors,
 -     * the original function is copied here, so that accesses that
 -     * really do go to the EL1/EL0 version proceed normally.
 -     * (The corresponding EL2 register is linked via opaque.)
 -     */
 -    CPReadFn *orig_readfn;
 -    CPWriteFn *orig_writefn;
 -};
 -
 -/* Macros which are lvalues for the field in CPUARMState for the
 - * ARMCPRegInfo *ri.
 - */
 -#define CPREG_FIELD32(env, ri) \
 -    (*(uint32_t *)((char *)(env) + (ri)->fieldoffset))
 -#define CPREG_FIELD64(env, ri) \
 -    (*(uint64_t *)((char *)(env) + (ri)->fieldoffset))
 -
 -#define REGINFO_SENTINEL { .type = ARM_CP_SENTINEL }
 -
 -void define_arm_cp_regs_with_opaque(ARMCPU *cpu,
 -                                    const ARMCPRegInfo *regs, void *opaque);
 -void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu,
 -                                       const ARMCPRegInfo *regs, void *opaque);
 -static inline void define_arm_cp_regs(ARMCPU *cpu, const ARMCPRegInfo *regs)
 -{
 -    define_arm_cp_regs_with_opaque(cpu, regs, 0);
 -}
 -static inline void define_one_arm_cp_reg(ARMCPU *cpu, const ARMCPRegInfo *regs)
 -{
 -    define_one_arm_cp_reg_with_opaque(cpu, regs, 0);
 -}
 -const ARMCPRegInfo *get_arm_cp_reginfo(GHashTable *cpregs, uint32_t encoded_cp);
 -
 -/*
 - * Definition of an ARM co-processor register as viewed from
 - * userspace. This is used for presenting sanitised versions of
 - * registers to userspace when emulating the Linux AArch64 CPU
 - * ID/feature ABI (advertised as HWCAP_CPUID).
 - */
 -typedef struct ARMCPRegUserSpaceInfo {
 -    /* Name of register */
 -    const char *name;
 -
 -    /* Is the name actually a glob pattern */
 -    bool is_glob;
 -
 -    /* Only some bits are exported to user space */
 -    uint64_t exported_bits;
 -
 -    /* Fixed bits are applied after the mask */
 -    uint64_t fixed_bits;
 -} ARMCPRegUserSpaceInfo;
 -
 -#define REGUSERINFO_SENTINEL { .name = NULL }
 -
 -void modify_arm_cp_regs(ARMCPRegInfo *regs, const ARMCPRegUserSpaceInfo *mods);
 -
 -/* CPWriteFn that can be used to implement writes-ignored behaviour */
 -void arm_cp_write_ignore(CPUARMState *env, const ARMCPRegInfo *ri,
 -                         uint64_t value);
 -/* CPReadFn that can be used for read-as-zero behaviour */
 -uint64_t arm_cp_read_zero(CPUARMState *env, const ARMCPRegInfo *ri);
 -
 -/* CPResetFn that does nothing, for use if no reset is required even
 - * if fieldoffset is non zero.
 - */
 -void arm_cp_reset_ignore(CPUARMState *env, const ARMCPRegInfo *opaque);
 -
 -/* Return true if this reginfo struct's field in the cpu state struct
 - * is 64 bits wide.
 - */
 -static inline bool cpreg_field_is_64bit(const ARMCPRegInfo *ri)
 -{
 -    return (ri->state == ARM_CP_STATE_AA64) || (ri->type & ARM_CP_64BIT);
 -}
 -
 -static inline bool cp_access_ok(int current_el,
 -                                const ARMCPRegInfo *ri, int isread)
 -{
 -    return (ri->access >> ((current_el * 2) + isread)) & 1;
 -}
 -
 -/* Raw read of a coprocessor register (as needed for migration, etc) */
 -uint64_t read_raw_cp_reg(CPUARMState *env, const ARMCPRegInfo *ri);
 -
  /**
   * write_list_to_cpustate
   * @cpu: ARMCPU
 diff --git a/hw/arm/pxa2xx.c b/hw/arm/pxa2xx.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/arm/pxa2xx.c
 +++ b/hw/arm/pxa2xx.c
@@ -XXX,XX +XXX,XX @@
  #include "qemu/cutils.h"
  #include "qemu/log.h"
  #include "qom/object.h"
 +#include "target/arm/cpregs.h"
  static struct {
      hwaddr io_base;
 diff --git a/hw/arm/pxa2xx_pic.c b/hw/arm/pxa2xx_pic.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/arm/pxa2xx_pic.c
 +++ b/hw/arm/pxa2xx_pic.c
@@ -XXX,XX +XXX,XX @@
  #include "hw/sysbus.h"
  #include "migration/vmstate.h"
  #include "qom/object.h"
 +#include "target/arm/cpregs.h"
  #define ICIP    0x00    /* Interrupt Controller IRQ Pending register */
  #define ICMR    0x04    /* Interrupt Controller Mask register */
 diff --git a/hw/intc/arm_gicv3_cpuif.c b/hw/intc/arm_gicv3_cpuif.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/intc/arm_gicv3_cpuif.c
 +++ b/hw/intc/arm_gicv3_cpuif.c
@@ -XXX,XX +XXX,XX @@
  #include "gicv3_internal.h"
  #include "hw/irq.h"
  #include "cpu.h"
 +#include "target/arm/cpregs.h"
  /*
-  * Forward to the isar_feature_* tests given a DisasContext pointer.
+  * Special case return value from hppvi_index(); must be larger than
 diff --git a/hw/intc/arm_gicv3_kvm.c b/hw/intc/arm_gicv3_kvm.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/intc/arm_gicv3_kvm.c
 +++ b/hw/intc/arm_gicv3_kvm.c
@@ -XXX,XX +XXX,XX @@
  #include "vgic_common.h"
  #include "migration/blocker.h"
  #include "qom/object.h"
 +#include "target/arm/cpregs.h"
 +
  #ifdef DEBUG_GICV3_KVM
  #define DPRINTF(fmt, ...) \
 diff --git a/target/arm/cpu.c b/target/arm/cpu.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu.c
 +++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@
  #include "kvm_arm.h"
  #include "disas/capstone.h"
  #include "fpu/softfloat.h"
 +#include "cpregs.h"
  static void arm_cpu_set_pc(CPUState *cs, vaddr value)
  {
 diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu64.c
 +++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@
  #include "hvf_arm.h"
  #include "qapi/visitor.h"
  #include "hw/qdev-properties.h"
 +#include "cpregs.h"
  #ifndef CONFIG_USER_ONLY
 diff --git a/target/arm/cpu_tcg.c b/target/arm/cpu_tcg.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu_tcg.c
 +++ b/target/arm/cpu_tcg.c
@@ -XXX,XX +XXX,XX @@
  #if !defined(CONFIG_USER_ONLY)
  #include "hw/boards.h"
  #endif
 +#include "cpregs.h"
  /* CPU models. These are not needed for the AArch64 linux-user build. */
  #if !defined(CONFIG_USER_ONLY) || !defined(TARGET_AARCH64)
 diff --git a/target/arm/gdbstub.c b/target/arm/gdbstub.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/gdbstub.c
 +++ b/target/arm/gdbstub.c
@@ -XXX,XX +XXX,XX @@
   */
-diff --git a/target/arm/neon_helper.c b/target/arm/neon_helper.c
+ #include "qemu/osdep.h"
-index XXXXXXX..XXXXXXX 100644
+ #include "cpu.h"
---- a/target/arm/neon_helper.c
+-#include "internals.h"
-+++ b/target/arm/neon_helper.c
+ #include "exec/gdbstub.h"
-@@ -XXX,XX +XXX,XX @@ NEON_POP(pmax_s16, neon_s16, 2)
++#include "internals.h"
- NEON_POP(pmax_u16, neon_u16, 2)
++#include "cpregs.h"
- #undef NEON_FN
+ typedef struct RegisterSysregXmlParam {
--#define NEON_FN(dest, src1, src2) \
+     CPUState *cs;
--    dest = (src1 > src2) ? (src1 - src2) : (src2 - src1)
+diff --git a/target/arm/helper.c b/target/arm/helper.c
--NEON_VOP(abd_s8, neon_s8, 4)
+index XXXXXXX..XXXXXXX 100644
--NEON_VOP(abd_u8, neon_u8, 4)
+--- a/target/arm/helper.c
--NEON_VOP(abd_s16, neon_s16, 2)
++++ b/target/arm/helper.c
--NEON_VOP(abd_u16, neon_u16, 2)
+@@ -XXX,XX +XXX,XX @@
--NEON_VOP(abd_s32, neon_s32, 1)
+ #include "exec/cpu_ldst.h"
--NEON_VOP(abd_u32, neon_u32, 1)
+ #include "semihosting/common-semi.h"
--#undef NEON_FN
+ #endif
--
++#include "cpregs.h"
- #define NEON_FN(dest, src1, src2) do { \
-     int8_t tmp; \
+ #define ARM_CPU_FREQ 1000000000 /* FIXME: 1 GHz, should be configurable */
-     tmp = (int8_t)src2; \
+ #define PMCR_NUM_COUNTERS 4 /* QEMU IMPDEF choice */
 diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/op_helper.c
 +++ b/target/arm/op_helper.c
@@ -XXX,XX +XXX,XX @@
  #include "internals.h"
  #include "exec/exec-all.h"
  #include "exec/cpu_ldst.h"
 +#include "cpregs.h"
  #define SIGNBIT (uint32_t)0x80000000
  #define SIGNBIT64 ((uint64_t)1 << 63)
 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-a64.c
 +++ b/target/arm/translate-a64.c
-@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@
-             gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sabd, size);
+ #include "translate.h"
-         }
+ #include "internals.h"
-         return;
+ #include "qemu/host-utils.h"
-+    case 0xf: /* SABA, UABA */
+-
-+        if (u) {
+ #include "semihosting/semihost.h"
-+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uaba, size);
+ #include "exec/gen-icount.h"
-+        } else {
+-
-+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_saba, size);
+ #include "exec/helper-proto.h"
-+        }
+ #include "exec/helper-gen.h"
-+        return;
+ #include "exec/log.h"
-     case 0x10: /* ADD, SUB */
+-
-         if (u) {
++#include "cpregs.h"
-             gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_sub, size);
+ #include "translate-a64.h"
-@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
+ #include "qemu/atomic128.h"
-                 genenvfn = fns[size][u];
                  break;
              }
 -            case 0xf: /* SABA, UABA */
 -            {
 -                static NeonGenTwoOpFn * const fns[3][2] = {
 -                    { gen_helper_neon_abd_s8, gen_helper_neon_abd_u8 },
 -                    { gen_helper_neon_abd_s16, gen_helper_neon_abd_u16 },
 -                    { gen_helper_neon_abd_s32, gen_helper_neon_abd_u32 },
 -                };
 -                genfn = fns[size][u];
 -                break;
 -            }
              case 0x16: /* SQDMULH, SQRDMULH */
              {
                  static NeonGenTwoOpEnvFn * const fns[2][2] = {
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+@@ -XXX,XX +XXX,XX @@
-     tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+ #include "qemu/bitops.h"
- }
+ #include "arm_ldst.h"
+ #include "semihosting/semihost.h"
-+static void gen_saba_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
+-
-+{
+ #include "exec/helper-proto.h"
-+    TCGv_i32 t = tcg_temp_new_i32();
+ #include "exec/helper-gen.h"
-+    gen_sabd_i32(t, a, b);
+-
-+    tcg_gen_add_i32(d, d, t);
+ #include "exec/log.h"
-+    tcg_temp_free_i32(t);
++#include "cpregs.h"
-+}
-+
-+static void gen_saba_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+ #define ENABLE_ARCH_4T    arm_dc_feature(s, ARM_FEATURE_V4T)
 +{
 +    TCGv_i64 t = tcg_temp_new_i64();
 +    gen_sabd_i64(t, a, b);
 +    tcg_gen_add_i64(d, d, t);
 +    tcg_temp_free_i64(t);
 +}
 +
 +static void gen_saba_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
 +{
 +    TCGv_vec t = tcg_temp_new_vec_matching(d);
 +    gen_sabd_vec(vece, t, a, b);
 +    tcg_gen_add_vec(vece, d, d, t);
 +    tcg_temp_free_vec(t);
 +}
 +
 +void gen_gvec_saba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = {
 +        INDEX_op_sub_vec, INDEX_op_add_vec,
 +        INDEX_op_smin_vec, INDEX_op_smax_vec, 0
 +    };
 +    static const GVecGen3 ops[4] = {
 +        { .fniv = gen_saba_vec,
 +          .fno = gen_helper_gvec_saba_b,
 +          .opt_opc = vecop_list,
 +          .load_dest = true,
 +          .vece = MO_8 },
 +        { .fniv = gen_saba_vec,
 +          .fno = gen_helper_gvec_saba_h,
 +          .opt_opc = vecop_list,
 +          .load_dest = true,
 +          .vece = MO_16 },
 +        { .fni4 = gen_saba_i32,
 +          .fniv = gen_saba_vec,
 +          .fno = gen_helper_gvec_saba_s,
 +          .opt_opc = vecop_list,
 +          .load_dest = true,
 +          .vece = MO_32 },
 +        { .fni8 = gen_saba_i64,
 +          .fniv = gen_saba_vec,
 +          .fno = gen_helper_gvec_saba_d,
 +          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +          .opt_opc = vecop_list,
 +          .load_dest = true,
 +          .vece = MO_64 },
 +    };
 +    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 +}
 +
 +static void gen_uaba_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
 +{
 +    TCGv_i32 t = tcg_temp_new_i32();
 +    gen_uabd_i32(t, a, b);
 +    tcg_gen_add_i32(d, d, t);
 +    tcg_temp_free_i32(t);
 +}
 +
 +static void gen_uaba_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
 +{
 +    TCGv_i64 t = tcg_temp_new_i64();
 +    gen_uabd_i64(t, a, b);
 +    tcg_gen_add_i64(d, d, t);
 +    tcg_temp_free_i64(t);
 +}
 +
 +static void gen_uaba_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
 +{
 +    TCGv_vec t = tcg_temp_new_vec_matching(d);
 +    gen_uabd_vec(vece, t, a, b);
 +    tcg_gen_add_vec(vece, d, d, t);
 +    tcg_temp_free_vec(t);
 +}
 +
 +void gen_gvec_uaba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = {
 +        INDEX_op_sub_vec, INDEX_op_add_vec,
 +        INDEX_op_umin_vec, INDEX_op_umax_vec, 0
 +    };
 +    static const GVecGen3 ops[4] = {
 +        { .fniv = gen_uaba_vec,
 +          .fno = gen_helper_gvec_uaba_b,
 +          .opt_opc = vecop_list,
 +          .load_dest = true,
 +          .vece = MO_8 },
 +        { .fniv = gen_uaba_vec,
 +          .fno = gen_helper_gvec_uaba_h,
 +          .opt_opc = vecop_list,
 +          .load_dest = true,
 +          .vece = MO_16 },
 +        { .fni4 = gen_uaba_i32,
 +          .fniv = gen_uaba_vec,
 +          .fno = gen_helper_gvec_uaba_s,
 +          .opt_opc = vecop_list,
 +          .load_dest = true,
 +          .vece = MO_32 },
 +        { .fni8 = gen_uaba_i64,
 +          .fniv = gen_uaba_vec,
 +          .fno = gen_helper_gvec_uaba_d,
 +          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +          .opt_opc = vecop_list,
 +          .load_dest = true,
 +          .vece = MO_64 },
 +    };
 +    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 +}
 +
  /* Translate a NEON data processing instruction.  Return nonzero if the
     instruction is invalid.
     We process data in a mixture of 32-bit and 64-bit chunks.
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
              }
              return 0;
 +        case NEON_3R_VABA:
 +            if (u) {
 +                gen_gvec_uaba(size, rd_ofs, rn_ofs, rm_ofs,
 +                              vec_size, vec_size);
 +            } else {
 +                gen_gvec_saba(size, rd_ofs, rn_ofs, rm_ofs,
 +                              vec_size, vec_size);
 +            }
 +            return 0;
 +
          case NEON_3R_VADD_VSUB:
          case NEON_3R_LOGIC:
          case NEON_3R_VMAX:
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          case NEON_3R_VQRSHL:
              GEN_NEON_INTEGER_OP_ENV(qrshl);
              break;
 -        case NEON_3R_VABA:
 -            GEN_NEON_INTEGER_OP(abd);
 -            tcg_temp_free_i32(tmp2);
 -            tmp2 = neon_load_reg(rd, pass);
 -            gen_neon_add(size, tmp, tmp2);
 -            break;
          case NEON_3R_VPMAX:
              GEN_NEON_INTEGER_OP(pmax);
              break;
 diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/vec_helper.c
 +++ b/target/arm/vec_helper.c
@@ -XXX,XX +XXX,XX @@ DO_ABD(gvec_uabd_s, uint32_t)
  DO_ABD(gvec_uabd_d, uint64_t)
  #undef DO_ABD
 +
 +#define DO_ABA(NAME, TYPE)                                      \
 +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc)  \
 +{                                                               \
 +    intptr_t i, opr_sz = simd_oprsz(desc);                      \
 +    TYPE *d = vd, *n = vn, *m = vm;                             \
 +                                                                \
 +    for (i = 0; i < opr_sz / sizeof(TYPE); ++i) {               \
 +        d[i] += n[i] < m[i] ? m[i] - n[i] : n[i] - m[i];        \
 +    }                                                           \
 +    clear_tail(d, opr_sz, simd_maxsz(desc));                    \
 +}
 +
 +DO_ABA(gvec_saba_b, int8_t)
 +DO_ABA(gvec_saba_h, int16_t)
 +DO_ABA(gvec_saba_s, int32_t)
 +DO_ABA(gvec_saba_d, int64_t)
 +
 +DO_ABA(gvec_uaba_b, uint8_t)
 +DO_ABA(gvec_uaba_h, uint16_t)
 +DO_ABA(gvec_uaba_s, uint32_t)
 +DO_ABA(gvec_uaba_d, uint64_t)
 +
 +#undef DO_ABA
 --
-.20.1
+.25.1

-[PULL 06/45] target/arm: Tidy handle_vec_simd_shri
+[PULL 03/23] target/arm: Reorg CPAccessResult and access_check_cp_reg
 From: Richard Henderson <richard.henderson@linaro.org>
-Now that we've converted all cases to gvec, there is quite a bit
+Rearrange the values of the enumerators of CPAccessResult
-of dead code at the end of the function.  Remove it.
+so that we may directly extract the target el. For the two
 special cases in access_check_cp_reg, use CPAccessResult.
-Sink the call to gen_gvec_fn2i to the end, loading a function
+Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
 pointer within the switch statement.
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200513163245.17915-6-richard.henderson@linaro.org
+Message-id: 20220501055028.646596-3-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate-a64.c | 56 ++++++++++----------------------------
+ target/arm/cpregs.h    | 26 ++++++++++++--------
-file changed, 14 insertions(+), 42 deletions(-)
+ target/arm/op_helper.c | 56 +++++++++++++++++++++---------------------
 files changed, 44 insertions(+), 38 deletions(-)
-diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
+diff --git a/target/arm/cpregs.h b/target/arm/cpregs.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-a64.c
+--- a/target/arm/cpregs.h
-+++ b/target/arm/translate-a64.c
++++ b/target/arm/cpregs.h
-@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
+@@ -XXX,XX +XXX,XX @@ static inline bool cptype_valid(int cptype)
-     int size = 32 - clz32(immh) - 1;
+ typedef enum CPAccessResult {
-     int immhb = immh << 3 | immb;
+     /* Access is permitted */
-     int shift = 2 * (8 << size) - immhb;
+     CP_ACCESS_OK = 0,
--    bool accumulate = false;
++
--    int dsize = is_q ? 128 : 64;
++    /*
--    int esize = 8 << size;
++     * Combined with one of the following, the low 2 bits indicate the
--    int elements = dsize/esize;
++     * target exception level.  If 0, the exception is taken to the usual
--    MemOp memop = size | (is_u ? 0 : MO_SIGN);
++     * target EL (EL1 or PL1 if in EL0, otherwise to the current EL).
--    TCGv_i64 tcg_rn = new_tmp_a64(s);
++     */
--    TCGv_i64 tcg_rd = new_tmp_a64(s);
++    CP_ACCESS_EL_MASK = 3,
--    TCGv_i64 tcg_round;
++
--    uint64_t round_const;
+     /*
--    int i;
+      * Access fails due to a configurable trap or enable which would
-+    GVecGen2iFn *gvec_fn;
+      * result in a categorized exception syndrome giving information about
+      * the failing instruction (ie syndrome category 0x3, 0x4, 0x5, 0x6,
-     if (extract32(immh, 3, 1) && !is_q) {
+-     * 0xc or 0x18). The exception is taken to the usual target EL (EL1 or
-         unallocated_encoding(s);
+-     * PL1 if in EL0, otherwise to the current EL).
-@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
++     * 0xc or 0x18).
+      */
-     switch (opcode) {
+-    CP_ACCESS_TRAP = 1,
-     case 0x02: /* SSRA / USRA (accumulate) */
++    CP_ACCESS_TRAP = (1 << 2),
--        gen_gvec_fn2i(s, is_q, rd, rn, shift,
++    CP_ACCESS_TRAP_EL2 = CP_ACCESS_TRAP | 2,
--                      is_u ? gen_gvec_usra : gen_gvec_ssra, size);
++    CP_ACCESS_TRAP_EL3 = CP_ACCESS_TRAP | 3,
 +
      /*
       * Access fails and results in an exception syndrome 0x0 ("uncategorized").
       * Note that this is not a catch-all case -- the set of cases which may
       * result in this failure is specifically defined by the architecture.
       */
 -    CP_ACCESS_TRAP_UNCATEGORIZED = 2,
 -    /* As CP_ACCESS_TRAP, but for traps directly to EL2 or EL3 */
 -    CP_ACCESS_TRAP_EL2 = 3,
 -    CP_ACCESS_TRAP_EL3 = 4,
 -    /* As CP_ACCESS_UNCATEGORIZED, but for traps directly to EL2 or EL3 */
 -    CP_ACCESS_TRAP_UNCATEGORIZED_EL2 = 5,
 -    CP_ACCESS_TRAP_UNCATEGORIZED_EL3 = 6,
 +    CP_ACCESS_TRAP_UNCATEGORIZED = (2 << 2),
 +    CP_ACCESS_TRAP_UNCATEGORIZED_EL2 = CP_ACCESS_TRAP_UNCATEGORIZED | 2,
 +    CP_ACCESS_TRAP_UNCATEGORIZED_EL3 = CP_ACCESS_TRAP_UNCATEGORIZED | 3,
  } CPAccessResult;
  typedef struct ARMCPRegInfo ARMCPRegInfo;
 diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/op_helper.c
 +++ b/target/arm/op_helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(access_check_cp_reg)(CPUARMState *env, void *rip, uint32_t syndrome,
                                   uint32_t isread)
  {
      const ARMCPRegInfo *ri = rip;
 +    CPAccessResult res = CP_ACCESS_OK;
      int target_el;
      if (arm_feature(env, ARM_FEATURE_XSCALE) && ri->cp < 14
          && extract32(env->cp15.c15_cpar, ri->cp, 1) == 0) {
 -        raise_exception(env, EXCP_UDEF, syndrome, exception_target_el(env));
 +        res = CP_ACCESS_TRAP;
 +        goto fail;
      }
      /*
@@ -XXX,XX +XXX,XX @@ void HELPER(access_check_cp_reg)(CPUARMState *env, void *rip, uint32_t syndrome,
          mask &= ~((1 << 4) | (1 << 14));
          if (env->cp15.hstr_el2 & mask) {
 -            target_el = 2;
 -            goto exept;
 +            res = CP_ACCESS_TRAP_EL2;
 +            goto fail;
          }
      }
 -    if (!ri->accessfn) {
 +    if (ri->accessfn) {
 +        res = ri->accessfn(env, ri, isread);
 +    }
 +    if (likely(res == CP_ACCESS_OK)) {
          return;
      }
 -    switch (ri->accessfn(env, ri, isread)) {
 -    case CP_ACCESS_OK:
 -        return;
-+        gvec_fn = is_u ? gen_gvec_usra : gen_gvec_ssra;
++ fail:
-+        break;
++    switch (res & ~CP_ACCESS_EL_MASK) {
+     case CP_ACCESS_TRAP:
-     case 0x08: /* SRI */
+-        target_el = exception_target_el(env);
--        gen_gvec_fn2i(s, is_q, rd, rn, shift, gen_gvec_sri, size);
+-        break;
--        return;
+-    case CP_ACCESS_TRAP_EL2:
-+        gvec_fn = gen_gvec_sri;
+-        /* Requesting a trap to EL2 when we're in EL3 is
-+        break;
+-         * a bug in the access function.
+-         */
-     case 0x00: /* SSHR / USHR */
+-        assert(arm_current_el(env) != 3);
-         if (is_u) {
+-        target_el = 2;
-@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
+-        break;
-                 /* Shift count the same size as element size produces zero.  */
+-    case CP_ACCESS_TRAP_EL3:
-                 tcg_gen_gvec_dup_imm(size, vec_full_reg_offset(s, rd),
+-        target_el = 3;
-                                      is_q ? 16 : 8, vec_full_reg_size(s), 0);
+         break;
--            } else {
+     case CP_ACCESS_TRAP_UNCATEGORIZED:
--                gen_gvec_fn2i(s, is_q, rd, rn, shift, tcg_gen_gvec_shri, size);
+-        target_el = exception_target_el(env);
-+                return;
+-        syndrome = syn_uncategorized();
-             }
+-        break;
-+            gvec_fn = tcg_gen_gvec_shri;
+-    case CP_ACCESS_TRAP_UNCATEGORIZED_EL2:
-         } else {
+-        target_el = 2;
-             /* Shift count the same size as element size produces all sign.  */
+-        syndrome = syn_uncategorized();
-             if (shift == 8 << size) {
+-        break;
-                 shift -= 1;
+-    case CP_ACCESS_TRAP_UNCATEGORIZED_EL3:
-             }
+-        target_el = 3;
--            gen_gvec_fn2i(s, is_q, rd, rn, shift, tcg_gen_gvec_sari, size);
+         syndrome = syn_uncategorized();
-+            gvec_fn = tcg_gen_gvec_sari;
+         break;
          }
 -        return;
 +        break;
      case 0x04: /* SRSHR / URSHR (rounding) */
 -        gen_gvec_fn2i(s, is_q, rd, rn, shift,
 -                      is_u ? gen_gvec_urshr : gen_gvec_srshr, size);
 -        return;
 +        gvec_fn = is_u ? gen_gvec_urshr : gen_gvec_srshr;
 +        break;
      case 0x06: /* SRSRA / URSRA (accum + rounding) */
 -        gen_gvec_fn2i(s, is_q, rd, rn, shift,
 -                      is_u ? gen_gvec_ursra : gen_gvec_srsra, size);
 -        return;
 +        gvec_fn = is_u ? gen_gvec_ursra : gen_gvec_srsra;
 +        break;
      default:
          g_assert_not_reached();
      }
--    round_const = 1ULL << (shift - 1);
+-exept:
--    tcg_round = tcg_const_i64(round_const);
++    target_el = res & CP_ACCESS_EL_MASK;
--
++    switch (target_el) {
--    for (i = 0; i < elements; i++) {
++    case 0:
--        read_vec_element(s, tcg_rn, rn, i, memop);
++        target_el = exception_target_el(env);
--        if (accumulate) {
++        break;
--            read_vec_element(s, tcg_rd, rd, i, memop);
++    case 2:
--        }
++        assert(arm_current_el(env) != 3);
--
++        assert(arm_is_el2_enabled(env));
--        handle_shri_with_rndacc(tcg_rd, tcg_rn, tcg_round,
++        break;
--                                accumulate, is_u, size, shift);
++    case 3:
--
++        assert(arm_feature(env, ARM_FEATURE_EL3));
--        write_vec_element(s, tcg_rd, rd, i, size);
++        break;
--    }
++    default:
--    tcg_temp_free_i64(tcg_round);
++        /* No "direct" traps to EL1 */
--
++        g_assert_not_reached();
--    clear_vec_high(s, is_q, rd);
++    }
-+    gen_gvec_fn2i(s, is_q, rd, rn, shift, gvec_fn, size);
++
      raise_exception(env, EXCP_UDEF, syndrome, target_el);
  }
- /* SHL/SLI - Vector shift left */
 --
-.20.1
+.25.1

-[PULL 03/45] target/arm: Create gen_gvec_{u,s}{rshr,rsra}
+[PULL 04/23] target/arm: Replace sentinels with ARRAY_SIZE in cpregs.h
 From: Richard Henderson <richard.henderson@linaro.org>
-Create vectorized versions of handle_shri_with_rndacc
+Remove a possible source of error by removing REGINFO_SENTINEL
-for shift+round and shift+round+accumulate.  Add out-of-line
+and using ARRAY_SIZE (convinently hidden inside a macro) to
-helpers in preparation for longer vector lengths from SVE.
+find the end of the set of regs being registered or modified.
+The space saved by not having the extra array element reduces
+the executable's .data.rel.ro section by about 9k.
+Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200513163245.17915-3-richard.henderson@linaro.org
+Message-id: 20220501055028.646596-4-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/helper.h        |  20 ++
+ target/arm/cpregs.h       |  53 +++++++++---------
- target/arm/translate.h     |   9 +
+ hw/arm/pxa2xx.c           |   1 -
- target/arm/translate-a64.c |  11 +-
+ hw/arm/pxa2xx_pic.c       |   1 -
- target/arm/translate.c     | 463 +++++++++++++++++++++++++++++++++++--
+ hw/intc/arm_gicv3_cpuif.c |   5 --
- target/arm/vec_helper.c    |  50 ++++
+ hw/intc/arm_gicv3_kvm.c   |   1 -
-files changed, 527 insertions(+), 26 deletions(-)
+ target/arm/cpu64.c        |   1 -
  target/arm/cpu_tcg.c      |   4 --
  target/arm/helper.c       | 111 ++++++++------------------------------
 files changed, 48 insertions(+), 129 deletions(-)
-diff --git a/target/arm/helper.h b/target/arm/helper.h
+diff --git a/target/arm/cpregs.h b/target/arm/cpregs.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.h
+--- a/target/arm/cpregs.h
-+++ b/target/arm/helper.h
++++ b/target/arm/cpregs.h
-@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(gvec_usra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+@@ -XXX,XX +XXX,XX @@
- DEF_HELPER_FLAGS_3(gvec_usra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+ #define ARM_CP_NO_GDB            0x4000
- DEF_HELPER_FLAGS_3(gvec_usra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+ #define ARM_CP_RAISES_EXC        0x8000
+ #define ARM_CP_NEWEL             0x10000
-+DEF_HELPER_FLAGS_3(gvec_srshr_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+-/* Used only as a terminator for ARMCPRegInfo lists */
-+DEF_HELPER_FLAGS_3(gvec_srshr_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+-#define ARM_CP_SENTINEL          0xfffff
-+DEF_HELPER_FLAGS_3(gvec_srshr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+ /* Mask of only the flag bits in a type field */
-+DEF_HELPER_FLAGS_3(gvec_srshr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+ #define ARM_CP_FLAG_MASK         0x1f0ff
@@ -XXX,XX +XXX,XX @@ enum {
      ARM_CP_SECSTATE_NS =  (1 << 1), /* bit[1]: Non-secure state register */
  };
 -/*
 - * Return true if cptype is a valid type field. This is used to try to
 - * catch errors where the sentinel has been accidentally left off the end
 - * of a list of registers.
 - */
 -static inline bool cptype_valid(int cptype)
 -{
 -    return ((cptype & ~ARM_CP_FLAG_MASK) == 0)
 -        || ((cptype & ARM_CP_SPECIAL) &&
 -            ((cptype & ~ARM_CP_FLAG_MASK) <= ARM_LAST_SPECIAL));
 -}
 -
  /*
   * Access rights:
   * We define bits for Read and Write access for what rev C of the v7-AR ARM ARM
@@ -XXX,XX +XXX,XX @@ struct ARMCPRegInfo {
  #define CPREG_FIELD64(env, ri) \
      (*(uint64_t *)((char *)(env) + (ri)->fieldoffset))
 -#define REGINFO_SENTINEL { .type = ARM_CP_SENTINEL }
 +void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu, const ARMCPRegInfo *reg,
 +                                       void *opaque);
 -void define_arm_cp_regs_with_opaque(ARMCPU *cpu,
 -                                    const ARMCPRegInfo *regs, void *opaque);
 -void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu,
 -                                       const ARMCPRegInfo *regs, void *opaque);
 -static inline void define_arm_cp_regs(ARMCPU *cpu, const ARMCPRegInfo *regs)
 -{
 -    define_arm_cp_regs_with_opaque(cpu, regs, 0);
 -}
  static inline void define_one_arm_cp_reg(ARMCPU *cpu, const ARMCPRegInfo *regs)
  {
 -    define_one_arm_cp_reg_with_opaque(cpu, regs, 0);
 +    define_one_arm_cp_reg_with_opaque(cpu, regs, NULL);
  }
 +
-+DEF_HELPER_FLAGS_3(gvec_urshr_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
++void define_arm_cp_regs_with_opaque_len(ARMCPU *cpu, const ARMCPRegInfo *regs,
-+DEF_HELPER_FLAGS_3(gvec_urshr_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
++                                        void *opaque, size_t len);
 +DEF_HELPER_FLAGS_3(gvec_urshr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_3(gvec_urshr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +
-+DEF_HELPER_FLAGS_3(gvec_srsra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
++#define define_arm_cp_regs_with_opaque(CPU, REGS, OPAQUE)               \
-+DEF_HELPER_FLAGS_3(gvec_srsra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
++    do {                                                                \
-+DEF_HELPER_FLAGS_3(gvec_srsra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
++        QEMU_BUILD_BUG_ON(ARRAY_SIZE(REGS) == 0);                       \
-+DEF_HELPER_FLAGS_3(gvec_srsra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
++        define_arm_cp_regs_with_opaque_len(CPU, REGS, OPAQUE,           \
 +                                           ARRAY_SIZE(REGS));           \
 +    } while (0)
 +
-+DEF_HELPER_FLAGS_3(gvec_ursra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
++#define define_arm_cp_regs(CPU, REGS) \
-+DEF_HELPER_FLAGS_3(gvec_ursra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
++    define_arm_cp_regs_with_opaque(CPU, REGS, NULL)
 +DEF_HELPER_FLAGS_3(gvec_ursra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_3(gvec_ursra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +
+ const ARMCPRegInfo *get_arm_cp_reginfo(GHashTable *cpregs, uint32_t encoded_cp);
+ /*
+@@ -XXX,XX +XXX,XX @@ typedef struct ARMCPRegUserSpaceInfo {
+     uint64_t fixed_bits;
+ } ARMCPRegUserSpaceInfo;
+-#define REGUSERINFO_SENTINEL { .name = NULL }
++void modify_arm_cp_regs_with_len(ARMCPRegInfo *regs, size_t regs_len,
++                                 const ARMCPRegUserSpaceInfo *mods,
++                                 size_t mods_len);
+-void modify_arm_cp_regs(ARMCPRegInfo *regs, const ARMCPRegUserSpaceInfo *mods);
++#define modify_arm_cp_regs(REGS, MODS)                                  \
++    do {                                                                \
++        QEMU_BUILD_BUG_ON(ARRAY_SIZE(REGS) == 0);                       \
++        QEMU_BUILD_BUG_ON(ARRAY_SIZE(MODS) == 0);                       \
++        modify_arm_cp_regs_with_len(REGS, ARRAY_SIZE(REGS),             \
++                                    MODS, ARRAY_SIZE(MODS));            \
++    } while (0)
+ /* CPWriteFn that can be used to implement writes-ignored behaviour */
+ void arm_cp_write_ignore(CPUARMState *env, const ARMCPRegInfo *ri,
+diff --git a/hw/arm/pxa2xx.c b/hw/arm/pxa2xx.c
+index XXXXXXX..XXXXXXX 100644
+--- a/hw/arm/pxa2xx.c
++++ b/hw/arm/pxa2xx.c
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo pxa_cp_reginfo[] = {
+     { .name = "PWRMODE", .cp = 14, .crn = 7, .crm = 0, .opc1 = 0, .opc2 = 0,
+       .access = PL1_RW, .type = ARM_CP_IO,
+       .readfn = arm_cp_read_zero, .writefn = pxa2xx_pwrmode_write },
+-    REGINFO_SENTINEL
+ };
+ static void pxa2xx_setup_cp14(PXA2xxState *s)
+diff --git a/hw/arm/pxa2xx_pic.c b/hw/arm/pxa2xx_pic.c
+index XXXXXXX..XXXXXXX 100644
+--- a/hw/arm/pxa2xx_pic.c
++++ b/hw/arm/pxa2xx_pic.c
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo pxa_pic_cp_reginfo[] = {
+     REGINFO_FOR_PIC_CP("ICLR2", 8),
+     REGINFO_FOR_PIC_CP("ICFP2", 9),
+     REGINFO_FOR_PIC_CP("ICPR2", 0xa),
+-    REGINFO_SENTINEL
+ };
+ static const MemoryRegionOps pxa2xx_pic_ops = {
+diff --git a/hw/intc/arm_gicv3_cpuif.c b/hw/intc/arm_gicv3_cpuif.c
+index XXXXXXX..XXXXXXX 100644
+--- a/hw/intc/arm_gicv3_cpuif.c
++++ b/hw/intc/arm_gicv3_cpuif.c
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo gicv3_cpuif_reginfo[] = {
+       .readfn = icc_igrpen1_el3_read,
+       .writefn = icc_igrpen1_el3_write,
+     },
+-    REGINFO_SENTINEL
+ };
+ static uint64_t ich_ap_read(CPUARMState *env, const ARMCPRegInfo *ri)
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo gicv3_cpuif_hcr_reginfo[] = {
+       .readfn = ich_vmcr_read,
+       .writefn = ich_vmcr_write,
+     },
+-    REGINFO_SENTINEL
+ };
+ static const ARMCPRegInfo gicv3_cpuif_ich_apxr1_reginfo[] = {
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo gicv3_cpuif_ich_apxr1_reginfo[] = {
+       .readfn = ich_ap_read,
+       .writefn = ich_ap_write,
+     },
+-    REGINFO_SENTINEL
+ };
+ static const ARMCPRegInfo gicv3_cpuif_ich_apxr23_reginfo[] = {
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo gicv3_cpuif_ich_apxr23_reginfo[] = {
+       .readfn = ich_ap_read,
+       .writefn = ich_ap_write,
+     },
+-    REGINFO_SENTINEL
+ };
+ static void gicv3_cpuif_el_change_hook(ARMCPU *cpu, void *opaque)
+@@ -XXX,XX +XXX,XX @@ void gicv3_init_cpuif(GICv3State *s)
+                       .readfn = ich_lr_read,
+                       .writefn = ich_lr_write,
+                     },
+-                    REGINFO_SENTINEL
+                 };
+                 define_arm_cp_regs(cpu, lr_regset);
+             }
+diff --git a/hw/intc/arm_gicv3_kvm.c b/hw/intc/arm_gicv3_kvm.c
+index XXXXXXX..XXXXXXX 100644
+--- a/hw/intc/arm_gicv3_kvm.c
++++ b/hw/intc/arm_gicv3_kvm.c
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo gicv3_cpuif_reginfo[] = {
+        */
+       .resetfn = arm_gicv3_icc_reset,
+     },
+-    REGINFO_SENTINEL
+ };
+ /**
+diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/cpu64.c
++++ b/target/arm/cpu64.c
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo cortex_a72_a57_a53_cp_reginfo[] = {
+     { .name = "L2MERRSR",
+       .cp = 15, .opc1 = 3, .crm = 15,
+       .access = PL1_RW, .type = ARM_CP_CONST | ARM_CP_64BIT, .resetvalue = 0 },
+-    REGINFO_SENTINEL
+ };
+ static void aarch64_a57_initfn(Object *obj)
+diff --git a/target/arm/cpu_tcg.c b/target/arm/cpu_tcg.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/cpu_tcg.c
++++ b/target/arm/cpu_tcg.c
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo cortexa8_cp_reginfo[] = {
+       .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+     { .name = "L2AUXCR", .cp = 15, .crn = 9, .crm = 0, .opc1 = 1, .opc2 = 2,
+       .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+-    REGINFO_SENTINEL
+ };
+ static void cortex_a8_initfn(Object *obj)
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo cortexa9_cp_reginfo[] = {
+       .access = PL1_RW, .resetvalue = 0, .type = ARM_CP_CONST },
+     { .name = "TLB_ATTR", .cp = 15, .crn = 15, .crm = 7, .opc1 = 5, .opc2 = 2,
+       .access = PL1_RW, .resetvalue = 0, .type = ARM_CP_CONST },
+-    REGINFO_SENTINEL
+ };
+ static void cortex_a9_initfn(Object *obj)
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo cortexa15_cp_reginfo[] = {
+ #endif
+     { .name = "L2ECTLR", .cp = 15, .crn = 9, .crm = 0, .opc1 = 1, .opc2 = 3,
+       .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+-    REGINFO_SENTINEL
+ };
+ static void cortex_a7_initfn(Object *obj)
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo cortexr5_cp_reginfo[] = {
+       .access = PL1_RW, .type = ARM_CP_CONST },
+     { .name = "DCACHE_INVAL", .cp = 15, .opc1 = 0, .crn = 15, .crm = 5,
+       .opc2 = 0, .access = PL1_W, .type = ARM_CP_NOP },
+-    REGINFO_SENTINEL
+ };
+ static void cortex_r5_initfn(Object *obj)
+diff --git a/target/arm/helper.c b/target/arm/helper.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/helper.c
++++ b/target/arm/helper.c
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo cp_reginfo[] = {
+       .secure = ARM_CP_SECSTATE_S,
+       .fieldoffset = offsetof(CPUARMState, cp15.contextidr_s),
+       .resetvalue = 0, .writefn = contextidr_write, .raw_writefn = raw_write, },
+-    REGINFO_SENTINEL
+ };
+ static const ARMCPRegInfo not_v8_cp_reginfo[] = {
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo not_v8_cp_reginfo[] = {
+     { .name = "CACHEMAINT", .cp = 15, .crn = 7, .crm = CP_ANY,
+       .opc1 = 0, .opc2 = CP_ANY, .access = PL1_W,
+       .type = ARM_CP_NOP | ARM_CP_OVERRIDE },
+-    REGINFO_SENTINEL
+ };
+ static const ARMCPRegInfo not_v6_cp_reginfo[] = {
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo not_v6_cp_reginfo[] = {
+      */
+     { .name = "WFI_v5", .cp = 15, .crn = 7, .crm = 8, .opc1 = 0, .opc2 = 2,
+       .access = PL1_W, .type = ARM_CP_WFI },
+-    REGINFO_SENTINEL
+ };
+ static const ARMCPRegInfo not_v7_cp_reginfo[] = {
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo not_v7_cp_reginfo[] = {
+       .opc1 = 0, .opc2 = 0, .access = PL1_RW, .type = ARM_CP_NOP },
+     { .name = "NMRR", .cp = 15, .crn = 10, .crm = 2,
+       .opc1 = 0, .opc2 = 1, .access = PL1_RW, .type = ARM_CP_NOP },
+-    REGINFO_SENTINEL
+ };
+ static void cpacr_write(CPUARMState *env, const ARMCPRegInfo *ri,
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v6_cp_reginfo[] = {
+       .crn = 1, .crm = 0, .opc1 = 0, .opc2 = 2, .accessfn = cpacr_access,
+       .access = PL1_RW, .fieldoffset = offsetof(CPUARMState, cp15.cpacr_el1),
+       .resetfn = cpacr_reset, .writefn = cpacr_write, .readfn = cpacr_read },
+-    REGINFO_SENTINEL
+ };
+ typedef struct pm_event {
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
+     { .name = "TLBIMVAA", .cp = 15, .opc1 = 0, .crn = 8, .crm = 7, .opc2 = 3,
+       .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
+       .writefn = tlbimvaa_write },
+-    REGINFO_SENTINEL
+ };
+ static const ARMCPRegInfo v7mp_cp_reginfo[] = {
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v7mp_cp_reginfo[] = {
+     { .name = "TLBIMVAAIS", .cp = 15, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 3,
+       .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
+       .writefn = tlbimvaa_is_write },
+-    REGINFO_SENTINEL
+ };
+ static const ARMCPRegInfo pmovsset_cp_reginfo[] = {
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo pmovsset_cp_reginfo[] = {
+       .fieldoffset = offsetof(CPUARMState, cp15.c9_pmovsr),
+       .writefn = pmovsset_write,
+       .raw_writefn = raw_write },
+-    REGINFO_SENTINEL
+ };
+ static void teecr_write(CPUARMState *env, const ARMCPRegInfo *ri,
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo t2ee_cp_reginfo[] = {
+     { .name = "TEEHBR", .cp = 14, .crn = 1, .crm = 0, .opc1 = 6, .opc2 = 0,
+       .access = PL0_RW, .fieldoffset = offsetof(CPUARMState, teehbr),
+       .accessfn = teehbr_access, .resetvalue = 0 },
+-    REGINFO_SENTINEL
+ };
+ static const ARMCPRegInfo v6k_cp_reginfo[] = {
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v6k_cp_reginfo[] = {
+       .bank_fieldoffsets = { offsetoflow32(CPUARMState, cp15.tpidrprw_s),
+                              offsetoflow32(CPUARMState, cp15.tpidrprw_ns) },
+       .resetvalue = 0 },
+-    REGINFO_SENTINEL
+ };
+ #ifndef CONFIG_USER_ONLY
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo generic_timer_cp_reginfo[] = {
+       .fieldoffset = offsetof(CPUARMState, cp15.c14_timer[GTIMER_SEC].cval),
+       .writefn = gt_sec_cval_write, .raw_writefn = raw_write,
+     },
+-    REGINFO_SENTINEL
+ };
+ static CPAccessResult e2h_access(CPUARMState *env, const ARMCPRegInfo *ri,
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo generic_timer_cp_reginfo[] = {
+       .access = PL0_R, .type = ARM_CP_NO_RAW | ARM_CP_IO,
+       .readfn = gt_virt_cnt_read,
+     },
+-    REGINFO_SENTINEL
+ };
+ #endif
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo vapa_cp_reginfo[] = {
+       .access = PL1_W, .accessfn = ats_access,
+       .writefn = ats_write, .type = ARM_CP_NO_RAW | ARM_CP_RAISES_EXC },
+ #endif
+-    REGINFO_SENTINEL
+ };
+ /* Return basic MPU access permission bits.  */
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo pmsav7_cp_reginfo[] = {
+       .fieldoffset = offsetof(CPUARMState, pmsav7.rnr[M_REG_NS]),
+       .writefn = pmsav7_rgnr_write,
+       .resetfn = arm_cp_reset_ignore },
+-    REGINFO_SENTINEL
+ };
+ static const ARMCPRegInfo pmsav5_cp_reginfo[] = {
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo pmsav5_cp_reginfo[] = {
+     { .name = "946_PRBS7", .cp = 15, .crn = 6, .crm = 7, .opc1 = 0,
+       .opc2 = CP_ANY, .access = PL1_RW, .resetvalue = 0,
+       .fieldoffset = offsetof(CPUARMState, cp15.c6_region[7]) },
+-    REGINFO_SENTINEL
+ };
+ static void vmsa_ttbcr_raw_write(CPUARMState *env, const ARMCPRegInfo *ri,
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo vmsa_pmsa_cp_reginfo[] = {
+       .access = PL1_RW, .accessfn = access_tvm_trvm,
+       .fieldoffset = offsetof(CPUARMState, cp15.far_el[1]),
+       .resetvalue = 0, },
+-    REGINFO_SENTINEL
+ };
+ static const ARMCPRegInfo vmsa_cp_reginfo[] = {
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo vmsa_cp_reginfo[] = {
+       /* No offsetoflow32 -- pass the entire TCR to writefn/raw_writefn. */
+       .bank_fieldoffsets = { offsetof(CPUARMState, cp15.tcr_el[3]),
+                              offsetof(CPUARMState, cp15.tcr_el[1])} },
+-    REGINFO_SENTINEL
+ };
+ /* Note that unlike TTBCR, writing to TTBCR2 does not require flushing
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo omap_cp_reginfo[] = {
+     { .name = "C9", .cp = 15, .crn = 9,
+       .crm = CP_ANY, .opc1 = CP_ANY, .opc2 = CP_ANY, .access = PL1_RW,
+       .type = ARM_CP_CONST | ARM_CP_OVERRIDE, .resetvalue = 0 },
+-    REGINFO_SENTINEL
+ };
+ static void xscale_cpar_write(CPUARMState *env, const ARMCPRegInfo *ri,
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo xscale_cp_reginfo[] = {
+     { .name = "XSCALE_UNLOCK_DCACHE",
+       .cp = 15, .opc1 = 0, .crn = 9, .crm = 2, .opc2 = 1,
+       .access = PL1_W, .type = ARM_CP_NOP },
+-    REGINFO_SENTINEL
+ };
+ static const ARMCPRegInfo dummy_c15_cp_reginfo[] = {
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo dummy_c15_cp_reginfo[] = {
+       .access = PL1_RW,
+       .type = ARM_CP_CONST | ARM_CP_NO_RAW | ARM_CP_OVERRIDE,
+       .resetvalue = 0 },
+-    REGINFO_SENTINEL
+ };
+ static const ARMCPRegInfo cache_dirty_status_cp_reginfo[] = {
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo cache_dirty_status_cp_reginfo[] = {
+     { .name = "CDSR", .cp = 15, .crn = 7, .crm = 10, .opc1 = 0, .opc2 = 6,
+       .access = PL1_R, .type = ARM_CP_CONST | ARM_CP_NO_RAW,
+       .resetvalue = 0 },
+-    REGINFO_SENTINEL
+ };
+ static const ARMCPRegInfo cache_block_ops_cp_reginfo[] = {
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo cache_block_ops_cp_reginfo[] = {
+       .access = PL0_W, .type = ARM_CP_NOP|ARM_CP_64BIT },
+     { .name = "CIDCR", .cp = 15, .crm = 14, .opc1 = 0,
+       .access = PL1_W, .type = ARM_CP_NOP|ARM_CP_64BIT },
+-    REGINFO_SENTINEL
+ };
+ static const ARMCPRegInfo cache_test_clean_cp_reginfo[] = {
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo cache_test_clean_cp_reginfo[] = {
+     { .name = "TCI_DCACHE", .cp = 15, .crn = 7, .crm = 14, .opc1 = 0, .opc2 = 3,
+       .access = PL0_R, .type = ARM_CP_CONST | ARM_CP_NO_RAW,
+       .resetvalue = (1 << 30) },
+-    REGINFO_SENTINEL
+ };
+ static const ARMCPRegInfo strongarm_cp_reginfo[] = {
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo strongarm_cp_reginfo[] = {
+       .crm = CP_ANY, .opc1 = CP_ANY, .opc2 = CP_ANY,
+       .access = PL1_RW, .resetvalue = 0,
+       .type = ARM_CP_CONST | ARM_CP_OVERRIDE | ARM_CP_NO_RAW },
+-    REGINFO_SENTINEL
+ };
+ static uint64_t midr_read(CPUARMState *env, const ARMCPRegInfo *ri)
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo lpae_cp_reginfo[] = {
+       .bank_fieldoffsets = { offsetof(CPUARMState, cp15.ttbr1_s),
+                              offsetof(CPUARMState, cp15.ttbr1_ns) },
+       .writefn = vmsa_ttbr_write, },
+-    REGINFO_SENTINEL
+ };
+ static uint64_t aa64_fpcr_read(CPUARMState *env, const ARMCPRegInfo *ri)
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
+       .access = PL1_RW, .accessfn = access_trap_aa32s_el1,
+       .writefn = sdcr_write,
+       .fieldoffset = offsetoflow32(CPUARMState, cp15.mdcr_el3) },
+-    REGINFO_SENTINEL
+ };
+ /* Used to describe the behaviour of EL2 regs when EL2 does not exist.  */
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo el3_no_el2_cp_reginfo[] = {
+       .type = ARM_CP_CONST,
+       .cp = 15, .opc1 = 4, .crn = 6, .crm = 0, .opc2 = 2,
+       .access = PL2_RW, .resetvalue = 0 },
+-    REGINFO_SENTINEL
+ };
+ /* Ditto, but for registers which exist in ARMv8 but not v7 */
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo el3_no_el2_v8_cp_reginfo[] = {
+       .cp = 15, .opc1 = 4, .crn = 1, .crm = 1, .opc2 = 4,
+       .access = PL2_RW,
+       .type = ARM_CP_CONST, .resetvalue = 0 },
+-    REGINFO_SENTINEL
+ };
+ static void do_hcr_write(CPUARMState *env, uint64_t value, uint64_t valid_mask)
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo el2_cp_reginfo[] = {
+       .cp = 15, .opc0 = 3, .opc1 = 4, .crn = 1, .crm = 1, .opc2 = 3,
+       .access = PL2_RW,
+       .fieldoffset = offsetof(CPUARMState, cp15.hstr_el2) },
+-    REGINFO_SENTINEL
+ };
+ static const ARMCPRegInfo el2_v8_cp_reginfo[] = {
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo el2_v8_cp_reginfo[] = {
+       .access = PL2_RW,
+       .fieldoffset = offsetofhigh32(CPUARMState, cp15.hcr_el2),
+       .writefn = hcr_writehigh },
+-    REGINFO_SENTINEL
+ };
+ static CPAccessResult sel2_access(CPUARMState *env, const ARMCPRegInfo *ri,
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo el2_sec_cp_reginfo[] = {
+       .opc0 = 3, .opc1 = 4, .crn = 2, .crm = 6, .opc2 = 2,
+       .access = PL2_RW, .accessfn = sel2_access,
+       .fieldoffset = offsetof(CPUARMState, cp15.vstcr_el2) },
+-    REGINFO_SENTINEL
+ };
+ static CPAccessResult nsacr_access(CPUARMState *env, const ARMCPRegInfo *ri,
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo el3_cp_reginfo[] = {
+       .opc0 = 1, .opc1 = 6, .crn = 8, .crm = 7, .opc2 = 5,
+       .access = PL3_W, .type = ARM_CP_NO_RAW,
+       .writefn = tlbi_aa64_vae3_write },
+-    REGINFO_SENTINEL
+ };
+ #ifndef CONFIG_USER_ONLY
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo debug_cp_reginfo[] = {
+       .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 0, .crm = 2, .opc2 = 0,
+       .access = PL1_RW, .accessfn = access_tda,
+       .type = ARM_CP_NOP },
+-    REGINFO_SENTINEL
+ };
+ static const ARMCPRegInfo debug_lpae_cp_reginfo[] = {
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo debug_lpae_cp_reginfo[] = {
+       .access = PL0_R, .type = ARM_CP_CONST|ARM_CP_64BIT, .resetvalue = 0 },
+     { .name = "DBGDSAR", .cp = 14, .crm = 2, .opc1 = 0,
+       .access = PL0_R, .type = ARM_CP_CONST|ARM_CP_64BIT, .resetvalue = 0 },
+-    REGINFO_SENTINEL
+ };
+ /* Return the exception level to which exceptions should be taken
+@@ -XXX,XX +XXX,XX @@ static void define_debug_regs(ARMCPU *cpu)
+               .fieldoffset = offsetof(CPUARMState, cp15.dbgbcr[i]),
+               .writefn = dbgbcr_write, .raw_writefn = raw_write
+             },
+-            REGINFO_SENTINEL
+         };
+         define_arm_cp_regs(cpu, dbgregs);
+     }
+@@ -XXX,XX +XXX,XX @@ static void define_debug_regs(ARMCPU *cpu)
+               .fieldoffset = offsetof(CPUARMState, cp15.dbgwcr[i]),
+               .writefn = dbgwcr_write, .raw_writefn = raw_write
+             },
+-            REGINFO_SENTINEL
+         };
+         define_arm_cp_regs(cpu, dbgregs);
+     }
+@@ -XXX,XX +XXX,XX @@ static void define_pmu_regs(ARMCPU *cpu)
+               .type = ARM_CP_IO,
+               .readfn = pmevtyper_readfn, .writefn = pmevtyper_writefn,
+               .raw_writefn = pmevtyper_rawwrite },
+-            REGINFO_SENTINEL
+         };
+         define_arm_cp_regs(cpu, pmev_regs);
+         g_free(pmevcntr_name);
+@@ -XXX,XX +XXX,XX @@ static void define_pmu_regs(ARMCPU *cpu)
+               .cp = 15, .opc1 = 0, .crn = 9, .crm = 14, .opc2 = 5,
+               .access = PL0_R, .accessfn = pmreg_access, .type = ARM_CP_CONST,
+               .resetvalue = extract64(cpu->pmceid1, 32, 32) },
+-            REGINFO_SENTINEL
+         };
+         define_arm_cp_regs(cpu, v81_pmu_regs);
+     }
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo lor_reginfo[] = {
+       .opc0 = 3, .opc1 = 0, .crn = 10, .crm = 4, .opc2 = 7,
+       .access = PL1_R, .accessfn = access_lor_ns,
+       .type = ARM_CP_CONST, .resetvalue = 0 },
+-    REGINFO_SENTINEL
+ };
  #ifdef TARGET_AARCH64
- #include "helper-a64.h"
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo pauth_reginfo[] = {
- #include "helper-sve.h"
+       .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 1, .opc2 = 3,
-diff --git a/target/arm/translate.h b/target/arm/translate.h
+       .access = PL1_RW, .accessfn = access_pauth,
-index XXXXXXX..XXXXXXX 100644
+       .fieldoffset = offsetof(CPUARMState, keys.apib.hi) },
---- a/target/arm/translate.h
+-    REGINFO_SENTINEL
-+++ b/target/arm/translate.h
+ };
-@@ -XXX,XX +XXX,XX @@ void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
- void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+ static const ARMCPRegInfo tlbirange_reginfo[] = {
-                    int64_t shift, uint32_t opr_sz, uint32_t max_sz);
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo tlbirange_reginfo[] = {
+       .opc0 = 1, .opc1 = 6, .crn = 8, .crm = 6, .opc2 = 5,
-+void gen_gvec_srshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+       .access = PL3_W, .type = ARM_CP_NO_RAW,
-+                    int64_t shift, uint32_t opr_sz, uint32_t max_sz);
+       .writefn = tlbi_aa64_rvae3_write },
-+void gen_gvec_urshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+-    REGINFO_SENTINEL
-+                    int64_t shift, uint32_t opr_sz, uint32_t max_sz);
+ };
-+void gen_gvec_srsra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
-+                    int64_t shift, uint32_t opr_sz, uint32_t max_sz);
+ static const ARMCPRegInfo tlbios_reginfo[] = {
-+void gen_gvec_ursra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo tlbios_reginfo[] = {
-+                    int64_t shift, uint32_t opr_sz, uint32_t max_sz);
+       .opc0 = 1, .opc1 = 6, .crn = 8, .crm = 1, .opc2 = 5,
        .access = PL3_W, .type = ARM_CP_NO_RAW,
        .writefn = tlbi_aa64_vae3is_write },
 -    REGINFO_SENTINEL
  };
  static uint64_t rndr_readfn(CPUARMState *env, const ARMCPRegInfo *ri)
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo rndr_reginfo[] = {
        .type = ARM_CP_NO_RAW | ARM_CP_SUPPRESS_TB_END | ARM_CP_IO,
        .opc0 = 3, .opc1 = 3, .crn = 2, .crm = 4, .opc2 = 1,
        .access = PL0_R, .readfn = rndr_readfn },
 -    REGINFO_SENTINEL
  };
  #ifndef CONFIG_USER_ONLY
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo dcpop_reg[] = {
        .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 12, .opc2 = 1,
        .access = PL0_W, .type = ARM_CP_NO_RAW | ARM_CP_SUPPRESS_TB_END,
        .accessfn = aa64_cacheop_poc_access, .writefn = dccvap_writefn },
 -    REGINFO_SENTINEL
  };
  static const ARMCPRegInfo dcpodp_reg[] = {
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo dcpodp_reg[] = {
        .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 13, .opc2 = 1,
        .access = PL0_W, .type = ARM_CP_NO_RAW | ARM_CP_SUPPRESS_TB_END,
        .accessfn = aa64_cacheop_poc_access, .writefn = dccvap_writefn },
 -    REGINFO_SENTINEL
  };
  #endif /*CONFIG_USER_ONLY*/
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo mte_reginfo[] = {
      { .name = "DC_CIGDSW", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 14, .opc2 = 6,
        .type = ARM_CP_NOP, .access = PL1_W, .accessfn = access_tsw },
 -    REGINFO_SENTINEL
  };
  static const ARMCPRegInfo mte_tco_ro_reginfo[] = {
      { .name = "TCO", .state = ARM_CP_STATE_AA64,
        .opc0 = 3, .opc1 = 3, .crn = 4, .crm = 2, .opc2 = 7,
        .type = ARM_CP_CONST, .access = PL0_RW, },
 -    REGINFO_SENTINEL
  };
  static const ARMCPRegInfo mte_el0_cacheop_reginfo[] = {
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo mte_el0_cacheop_reginfo[] = {
        .accessfn = aa64_zva_access,
  #endif
      },
 -    REGINFO_SENTINEL
  };
  #endif
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo predinv_reginfo[] = {
      { .name = "CPPRCTX", .state = ARM_CP_STATE_AA32,
        .cp = 15, .opc1 = 0, .crn = 7, .crm = 3, .opc2 = 7,
        .type = ARM_CP_NOP, .access = PL0_W, .accessfn = access_predinv },
 -    REGINFO_SENTINEL
  };
  static uint64_t ccsidr2_read(CPUARMState *env, const ARMCPRegInfo *ri)
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo ccsidr2_reginfo[] = {
        .access = PL1_R,
        .accessfn = access_aa64_tid2,
        .readfn = ccsidr2_read, .type = ARM_CP_NO_RAW },
 -    REGINFO_SENTINEL
  };
  static CPAccessResult access_aa64_tid3(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo jazelle_regs[] = {
        .cp = 14, .crn = 2, .crm = 0, .opc1 = 7, .opc2 = 0,
        .accessfn = access_joscr_jmcr,
        .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
 -    REGINFO_SENTINEL
  };
  static const ARMCPRegInfo vhe_reginfo[] = {
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo vhe_reginfo[] = {
        .access = PL2_RW, .accessfn = e2h_access,
        .writefn = gt_virt_cval_write, .raw_writefn = raw_write },
  #endif
 -    REGINFO_SENTINEL
  };
  #ifndef CONFIG_USER_ONLY
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo ats1e1_reginfo[] = {
        .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 9, .opc2 = 1,
        .access = PL1_W, .type = ARM_CP_NO_RAW | ARM_CP_RAISES_EXC,
        .writefn = ats_write64 },
 -    REGINFO_SENTINEL
  };
  static const ARMCPRegInfo ats1cp_reginfo[] = {
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo ats1cp_reginfo[] = {
        .cp = 15, .opc1 = 0, .crn = 7, .crm = 9, .opc2 = 1,
        .access = PL1_W, .type = ARM_CP_NO_RAW | ARM_CP_RAISES_EXC,
        .writefn = ats_write },
 -    REGINFO_SENTINEL
  };
  #endif
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo actlr2_hactlr2_reginfo[] = {
        .cp = 15, .opc1 = 4, .crn = 1, .crm = 0, .opc2 = 3,
        .access = PL2_RW, .type = ARM_CP_CONST,
        .resetvalue = 0 },
 -    REGINFO_SENTINEL
  };
  void register_cp_regs_for_features(ARMCPU *cpu)
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
                .access = PL1_R, .type = ARM_CP_CONST,
                .accessfn = access_aa32_tid3,
                .resetvalue = cpu->isar.id_isar6 },
 -            REGINFO_SENTINEL
          };
          define_arm_cp_regs(cpu, v6_idregs);
          define_arm_cp_regs(cpu, v6_cp_reginfo);
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
                .opc0 = 3, .opc1 = 3, .crn = 9, .crm = 12, .opc2 = 7,
                .access = PL0_R, .accessfn = pmreg_access, .type = ARM_CP_CONST,
                .resetvalue = cpu->pmceid1 },
 -            REGINFO_SENTINEL
          };
  #ifdef CONFIG_USER_ONLY
          ARMCPRegUserSpaceInfo v8_user_idregs[] = {
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
                .exported_bits = 0x000000f0ffffffff },
              { .name = "ID_AA64ISAR*_EL1_RESERVED",
                .is_glob = true                     },
 -            REGUSERINFO_SENTINEL
          };
          modify_arm_cp_regs(v8_idregs, v8_user_idregs);
  #endif
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
                .access = PL2_RW,
                .resetvalue = vmpidr_def,
                .fieldoffset = offsetof(CPUARMState, cp15.vmpidr_el2) },
 -            REGINFO_SENTINEL
          };
          define_arm_cp_regs(cpu, vpidr_regs);
          define_arm_cp_regs(cpu, el2_cp_reginfo);
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
                    .access = PL2_RW, .accessfn = access_el3_aa32ns,
                    .type = ARM_CP_NO_RAW,
                    .writefn = arm_cp_write_ignore, .readfn = mpidr_read },
 -                REGINFO_SENTINEL
              };
              define_arm_cp_regs(cpu, vpidr_regs);
              define_arm_cp_regs(cpu, el3_no_el2_cp_reginfo);
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
                .raw_writefn = raw_write, .writefn = sctlr_write,
                .fieldoffset = offsetof(CPUARMState, cp15.sctlr_el[3]),
                .resetvalue = cpu->reset_sctlr },
 -            REGINFO_SENTINEL
          };
          define_arm_cp_regs(cpu, el3_regs);
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
              { .name = "DUMMY",
                .cp = 15, .crn = 0, .crm = 7, .opc1 = 0, .opc2 = CP_ANY,
                .access = PL1_R, .type = ARM_CP_CONST, .resetvalue = 0 },
 -            REGINFO_SENTINEL
          };
          ARMCPRegInfo id_v8_midr_cp_reginfo[] = {
              { .name = "MIDR_EL1", .state = ARM_CP_STATE_BOTH,
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
                .access = PL1_R,
                .accessfn = access_aa64_tid1,
                .type = ARM_CP_CONST, .resetvalue = cpu->revidr },
 -            REGINFO_SENTINEL
          };
          ARMCPRegInfo id_cp_reginfo[] = {
              /* These are common to v8 and pre-v8 */
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
                .access = PL1_R,
                .accessfn = access_aa32_tid1,
                .type = ARM_CP_CONST, .resetvalue = 0 },
 -            REGINFO_SENTINEL
          };
          /* TLBTR is specific to VMSA */
          ARMCPRegInfo id_tlbtr_reginfo = {
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
              { .name = "MIDR_EL1",
                .exported_bits = 0x00000000ffffffff },
              { .name = "REVIDR_EL1"                },
 -            REGUSERINFO_SENTINEL
          };
          modify_arm_cp_regs(id_v8_midr_cp_reginfo, id_v8_user_midr_cp_reginfo);
  #endif
          if (arm_feature(env, ARM_FEATURE_OMAPCP) ||
              arm_feature(env, ARM_FEATURE_STRONGARM)) {
 -            ARMCPRegInfo *r;
 +            size_t i;
              /* Register the blanket "writes ignored" value first to cover the
               * whole space. Then update the specific ID registers to allow write
               * access, so that they ignore writes rather than causing them to
               * UNDEF.
               */
              define_one_arm_cp_reg(cpu, &crn0_wi_reginfo);
 -            for (r = id_pre_v8_midr_cp_reginfo;
 -                 r->type != ARM_CP_SENTINEL; r++) {
 -                r->access = PL1_RW;
 +            for (i = 0; i < ARRAY_SIZE(id_pre_v8_midr_cp_reginfo); ++i) {
 +                id_pre_v8_midr_cp_reginfo[i].access = PL1_RW;
              }
 -            for (r = id_cp_reginfo; r->type != ARM_CP_SENTINEL; r++) {
 -                r->access = PL1_RW;
 +            for (i = 0; i < ARRAY_SIZE(id_cp_reginfo); ++i) {
 +                id_cp_reginfo[i].access = PL1_RW;
              }
              id_mpuir_reginfo.access = PL1_RW;
              id_tlbtr_reginfo.access = PL1_RW;
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
              { .name = "MPIDR_EL1", .state = ARM_CP_STATE_BOTH,
                .opc0 = 3, .crn = 0, .crm = 0, .opc1 = 0, .opc2 = 5,
                .access = PL1_R, .readfn = mpidr_read, .type = ARM_CP_NO_RAW },
 -            REGINFO_SENTINEL
          };
  #ifdef CONFIG_USER_ONLY
          ARMCPRegUserSpaceInfo mpidr_user_cp_reginfo[] = {
              { .name = "MPIDR_EL1",
                .fixed_bits = 0x0000000080000000 },
 -            REGUSERINFO_SENTINEL
          };
          modify_arm_cp_regs(mpidr_cp_reginfo, mpidr_user_cp_reginfo);
  #endif
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
                .opc0 = 3, .opc1 = 6, .crn = 1, .crm = 0, .opc2 = 1,
                .access = PL3_RW, .type = ARM_CP_CONST,
                .resetvalue = 0 },
 -            REGINFO_SENTINEL
          };
          define_arm_cp_regs(cpu, auxcr_reginfo);
          if (cpu_isar_feature(aa32_ac2, cpu)) {
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
                    .type = ARM_CP_CONST,
                    .opc0 = 3, .opc1 = 1, .crn = 15, .crm = 3, .opc2 = 0,
                    .access = PL1_R, .resetvalue = cpu->reset_cbar },
 -                REGINFO_SENTINEL
              };
              /* We don't implement a r/w 64 bit CBAR currently */
              assert(arm_feature(env, ARM_FEATURE_CBAR_RO));
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
                .bank_fieldoffsets = { offsetof(CPUARMState, cp15.vbar_s),
                                       offsetof(CPUARMState, cp15.vbar_ns) },
                .resetvalue = 0 },
 -            REGINFO_SENTINEL
          };
          define_arm_cp_regs(cpu, vbar_cp_reginfo);
      }
@@ -XXX,XX +XXX,XX @@ void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu,
                     r->writefn);
          }
      }
 -    /* Bad type field probably means missing sentinel at end of reg list */
 -    assert(cptype_valid(r->type));
 +
- /*
+     for (crm = crmmin; crm <= crmmax; crm++) {
-  * Forward to the isar_feature_* tests given a DisasContext pointer.
+         for (opc1 = opc1min; opc1 <= opc1max; opc1++) {
-  */
+             for (opc2 = opc2min; opc2 <= opc2max; opc2++) {
-diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
+@@ -XXX,XX +XXX,XX @@ void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu,
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-a64.c
 +++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
          return;
      case 0x04: /* SRSHR / URSHR (rounding) */
 -        break;
 +        gen_gvec_fn2i(s, is_q, rd, rn, shift,
 +                      is_u ? gen_gvec_urshr : gen_gvec_srshr, size);
 +        return;
 +
      case 0x06: /* SRSRA / URSRA (accum + rounding) */
 -        accumulate = true;
 -        break;
 +        gen_gvec_fn2i(s, is_q, rd, rn, shift,
 +                      is_u ? gen_gvec_ursra : gen_gvec_srsra, size);
 +        return;
 +
      default:
          g_assert_not_reached();
      }
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
      }
  }
-+/*
+-void define_arm_cp_regs_with_opaque(ARMCPU *cpu,
-+ * Shift one less than the requested amount, and the low bit is
+-                                    const ARMCPRegInfo *regs, void *opaque)
-+ * the rounding bit.  For the 8 and 16-bit operations, because we
++/* Define a whole list of registers */
-+ * mask the low bit, we can perform a normal integer shift instead
++void define_arm_cp_regs_with_opaque_len(ARMCPU *cpu, const ARMCPRegInfo *regs,
-+ * of a vector shift.
++                                        void *opaque, size_t len)
-+ */
+ {
-+static void gen_srshr8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+-    /* Define a whole list of registers */
-+{
+-    const ARMCPRegInfo *r;
-+    TCGv_i64 t = tcg_temp_new_i64();
+-    for (r = regs; r->type != ARM_CP_SENTINEL; r++) {
 -        define_one_arm_cp_reg_with_opaque(cpu, r, opaque);
 +    size_t i;
 +    for (i = 0; i < len; ++i) {
 +        define_one_arm_cp_reg_with_opaque(cpu, regs + i, opaque);
      }
  }
@@ -XXX,XX +XXX,XX @@ void define_arm_cp_regs_with_opaque(ARMCPU *cpu,
   * user-space cannot alter any values and dynamic values pertaining to
   * execution state are hidden from user space view anyway.
   */
 -void modify_arm_cp_regs(ARMCPRegInfo *regs, const ARMCPRegUserSpaceInfo *mods)
 +void modify_arm_cp_regs_with_len(ARMCPRegInfo *regs, size_t regs_len,
 +                                 const ARMCPRegUserSpaceInfo *mods,
 +                                 size_t mods_len)
  {
 -    const ARMCPRegUserSpaceInfo *m;
 -    ARMCPRegInfo *r;
 -
 -    for (m = mods; m->name; m++) {
 +    for (size_t mi = 0; mi < mods_len; ++mi) {
 +        const ARMCPRegUserSpaceInfo *m = mods + mi;
          GPatternSpec *pat = NULL;
 +
-+    tcg_gen_shri_i64(t, a, sh - 1);
+         if (m->is_glob) {
-+    tcg_gen_andi_i64(t, t, dup_const(MO_8, 1));
+             pat = g_pattern_spec_new(m->name);
-+    tcg_gen_vec_sar8i_i64(d, a, sh);
+         }
-+    tcg_gen_vec_add8_i64(d, d, t);
+-        for (r = regs; r->type != ARM_CP_SENTINEL; r++) {
-+    tcg_temp_free_i64(t);
++        for (size_t ri = 0; ri < regs_len; ++ri) {
-+}
++            ARMCPRegInfo *r = regs + ri;
 +
-+static void gen_srshr16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+             if (pat && g_pattern_match_string(pat, r->name)) {
-+{
+                 r->type = ARM_CP_CONST;
-+    TCGv_i64 t = tcg_temp_new_i64();
+                 r->access = PL0U_R;
 +
 +    tcg_gen_shri_i64(t, a, sh - 1);
 +    tcg_gen_andi_i64(t, t, dup_const(MO_16, 1));
 +    tcg_gen_vec_sar16i_i64(d, a, sh);
 +    tcg_gen_vec_add16_i64(d, d, t);
 +    tcg_temp_free_i64(t);
 +}
 +
 +static void gen_srshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
 +{
 +    TCGv_i32 t = tcg_temp_new_i32();
 +
 +    tcg_gen_extract_i32(t, a, sh - 1, 1);
 +    tcg_gen_sari_i32(d, a, sh);
 +    tcg_gen_add_i32(d, d, t);
 +    tcg_temp_free_i32(t);
 +}
 +
 +static void gen_srshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
 +{
 +    TCGv_i64 t = tcg_temp_new_i64();
 +
 +    tcg_gen_extract_i64(t, a, sh - 1, 1);
 +    tcg_gen_sari_i64(d, a, sh);
 +    tcg_gen_add_i64(d, d, t);
 +    tcg_temp_free_i64(t);
 +}
 +
 +static void gen_srshr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
 +{
 +    TCGv_vec t = tcg_temp_new_vec_matching(d);
 +    TCGv_vec ones = tcg_temp_new_vec_matching(d);
 +
 +    tcg_gen_shri_vec(vece, t, a, sh - 1);
 +    tcg_gen_dupi_vec(vece, ones, 1);
 +    tcg_gen_and_vec(vece, t, t, ones);
 +    tcg_gen_sari_vec(vece, d, a, sh);
 +    tcg_gen_add_vec(vece, d, d, t);
 +
 +    tcg_temp_free_vec(t);
 +    tcg_temp_free_vec(ones);
 +}
 +
 +void gen_gvec_srshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 +                    int64_t shift, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = {
 +        INDEX_op_shri_vec, INDEX_op_sari_vec, INDEX_op_add_vec, 0
 +    };
 +    static const GVecGen2i ops[4] = {
 +        { .fni8 = gen_srshr8_i64,
 +          .fniv = gen_srshr_vec,
 +          .fno = gen_helper_gvec_srshr_b,
 +          .opt_opc = vecop_list,
 +          .vece = MO_8 },
 +        { .fni8 = gen_srshr16_i64,
 +          .fniv = gen_srshr_vec,
 +          .fno = gen_helper_gvec_srshr_h,
 +          .opt_opc = vecop_list,
 +          .vece = MO_16 },
 +        { .fni4 = gen_srshr32_i32,
 +          .fniv = gen_srshr_vec,
 +          .fno = gen_helper_gvec_srshr_s,
 +          .opt_opc = vecop_list,
 +          .vece = MO_32 },
 +        { .fni8 = gen_srshr64_i64,
 +          .fniv = gen_srshr_vec,
 +          .fno = gen_helper_gvec_srshr_d,
 +          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +          .opt_opc = vecop_list,
 +          .vece = MO_64 },
 +    };
 +
 +    /* tszimm encoding produces immediates in the range [1..esize] */
 +    tcg_debug_assert(shift > 0);
 +    tcg_debug_assert(shift <= (8 << vece));
 +
 +    if (shift == (8 << vece)) {
 +        /*
 +         * Shifts larger than the element size are architecturally valid.
 +         * Signed results in all sign bits.  With rounding, this produces
 +         *   (-1 + 1) >> 1 == 0, or (0 + 1) >> 1 == 0.
 +         * I.e. always zero.
 +         */
 +        tcg_gen_gvec_dup_imm(vece, rd_ofs, opr_sz, max_sz, 0);
 +    } else {
 +        tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
 +    }
 +}
 +
 +static void gen_srsra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
 +{
 +    TCGv_i64 t = tcg_temp_new_i64();
 +
 +    gen_srshr8_i64(t, a, sh);
 +    tcg_gen_vec_add8_i64(d, d, t);
 +    tcg_temp_free_i64(t);
 +}
 +
 +static void gen_srsra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
 +{
 +    TCGv_i64 t = tcg_temp_new_i64();
 +
 +    gen_srshr16_i64(t, a, sh);
 +    tcg_gen_vec_add16_i64(d, d, t);
 +    tcg_temp_free_i64(t);
 +}
 +
 +static void gen_srsra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
 +{
 +    TCGv_i32 t = tcg_temp_new_i32();
 +
 +    gen_srshr32_i32(t, a, sh);
 +    tcg_gen_add_i32(d, d, t);
 +    tcg_temp_free_i32(t);
 +}
 +
 +static void gen_srsra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
 +{
 +    TCGv_i64 t = tcg_temp_new_i64();
 +
 +    gen_srshr64_i64(t, a, sh);
 +    tcg_gen_add_i64(d, d, t);
 +    tcg_temp_free_i64(t);
 +}
 +
 +static void gen_srsra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
 +{
 +    TCGv_vec t = tcg_temp_new_vec_matching(d);
 +
 +    gen_srshr_vec(vece, t, a, sh);
 +    tcg_gen_add_vec(vece, d, d, t);
 +    tcg_temp_free_vec(t);
 +}
 +
 +void gen_gvec_srsra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 +                    int64_t shift, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = {
 +        INDEX_op_shri_vec, INDEX_op_sari_vec, INDEX_op_add_vec, 0
 +    };
 +    static const GVecGen2i ops[4] = {
 +        { .fni8 = gen_srsra8_i64,
 +          .fniv = gen_srsra_vec,
 +          .fno = gen_helper_gvec_srsra_b,
 +          .opt_opc = vecop_list,
 +          .load_dest = true,
 +          .vece = MO_8 },
 +        { .fni8 = gen_srsra16_i64,
 +          .fniv = gen_srsra_vec,
 +          .fno = gen_helper_gvec_srsra_h,
 +          .opt_opc = vecop_list,
 +          .load_dest = true,
 +          .vece = MO_16 },
 +        { .fni4 = gen_srsra32_i32,
 +          .fniv = gen_srsra_vec,
 +          .fno = gen_helper_gvec_srsra_s,
 +          .opt_opc = vecop_list,
 +          .load_dest = true,
 +          .vece = MO_32 },
 +        { .fni8 = gen_srsra64_i64,
 +          .fniv = gen_srsra_vec,
 +          .fno = gen_helper_gvec_srsra_d,
 +          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +          .opt_opc = vecop_list,
 +          .load_dest = true,
 +          .vece = MO_64 },
 +    };
 +
 +    /* tszimm encoding produces immediates in the range [1..esize] */
 +    tcg_debug_assert(shift > 0);
 +    tcg_debug_assert(shift <= (8 << vece));
 +
 +    /*
 +     * Shifts larger than the element size are architecturally valid.
 +     * Signed results in all sign bits.  With rounding, this produces
 +     *   (-1 + 1) >> 1 == 0, or (0 + 1) >> 1 == 0.
 +     * I.e. always zero.  With accumulation, this leaves D unchanged.
 +     */
 +    if (shift == (8 << vece)) {
 +        /* Nop, but we do need to clear the tail. */
 +        tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz);
 +    } else {
 +        tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
 +    }
 +}
 +
 +static void gen_urshr8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
 +{
 +    TCGv_i64 t = tcg_temp_new_i64();
 +
 +    tcg_gen_shri_i64(t, a, sh - 1);
 +    tcg_gen_andi_i64(t, t, dup_const(MO_8, 1));
 +    tcg_gen_vec_shr8i_i64(d, a, sh);
 +    tcg_gen_vec_add8_i64(d, d, t);
 +    tcg_temp_free_i64(t);
 +}
 +
 +static void gen_urshr16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
 +{
 +    TCGv_i64 t = tcg_temp_new_i64();
 +
 +    tcg_gen_shri_i64(t, a, sh - 1);
 +    tcg_gen_andi_i64(t, t, dup_const(MO_16, 1));
 +    tcg_gen_vec_shr16i_i64(d, a, sh);
 +    tcg_gen_vec_add16_i64(d, d, t);
 +    tcg_temp_free_i64(t);
 +}
 +
 +static void gen_urshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
 +{
 +    TCGv_i32 t = tcg_temp_new_i32();
 +
 +    tcg_gen_extract_i32(t, a, sh - 1, 1);
 +    tcg_gen_shri_i32(d, a, sh);
 +    tcg_gen_add_i32(d, d, t);
 +    tcg_temp_free_i32(t);
 +}
 +
 +static void gen_urshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
 +{
 +    TCGv_i64 t = tcg_temp_new_i64();
 +
 +    tcg_gen_extract_i64(t, a, sh - 1, 1);
 +    tcg_gen_shri_i64(d, a, sh);
 +    tcg_gen_add_i64(d, d, t);
 +    tcg_temp_free_i64(t);
 +}
 +
 +static void gen_urshr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t shift)
 +{
 +    TCGv_vec t = tcg_temp_new_vec_matching(d);
 +    TCGv_vec ones = tcg_temp_new_vec_matching(d);
 +
 +    tcg_gen_shri_vec(vece, t, a, shift - 1);
 +    tcg_gen_dupi_vec(vece, ones, 1);
 +    tcg_gen_and_vec(vece, t, t, ones);
 +    tcg_gen_shri_vec(vece, d, a, shift);
 +    tcg_gen_add_vec(vece, d, d, t);
 +
 +    tcg_temp_free_vec(t);
 +    tcg_temp_free_vec(ones);
 +}
 +
 +void gen_gvec_urshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 +                    int64_t shift, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = {
 +        INDEX_op_shri_vec, INDEX_op_add_vec, 0
 +    };
 +    static const GVecGen2i ops[4] = {
 +        { .fni8 = gen_urshr8_i64,
 +          .fniv = gen_urshr_vec,
 +          .fno = gen_helper_gvec_urshr_b,
 +          .opt_opc = vecop_list,
 +          .vece = MO_8 },
 +        { .fni8 = gen_urshr16_i64,
 +          .fniv = gen_urshr_vec,
 +          .fno = gen_helper_gvec_urshr_h,
 +          .opt_opc = vecop_list,
 +          .vece = MO_16 },
 +        { .fni4 = gen_urshr32_i32,
 +          .fniv = gen_urshr_vec,
 +          .fno = gen_helper_gvec_urshr_s,
 +          .opt_opc = vecop_list,
 +          .vece = MO_32 },
 +        { .fni8 = gen_urshr64_i64,
 +          .fniv = gen_urshr_vec,
 +          .fno = gen_helper_gvec_urshr_d,
 +          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +          .opt_opc = vecop_list,
 +          .vece = MO_64 },
 +    };
 +
 +    /* tszimm encoding produces immediates in the range [1..esize] */
 +    tcg_debug_assert(shift > 0);
 +    tcg_debug_assert(shift <= (8 << vece));
 +
 +    if (shift == (8 << vece)) {
 +        /*
 +         * Shifts larger than the element size are architecturally valid.
 +         * Unsigned results in zero.  With rounding, this produces a
 +         * copy of the most significant bit.
 +         */
 +        tcg_gen_gvec_shri(vece, rd_ofs, rm_ofs, shift - 1, opr_sz, max_sz);
 +    } else {
 +        tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
 +    }
 +}
 +
 +static void gen_ursra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
 +{
 +    TCGv_i64 t = tcg_temp_new_i64();
 +
 +    if (sh == 8) {
 +        tcg_gen_vec_shr8i_i64(t, a, 7);
 +    } else {
 +        gen_urshr8_i64(t, a, sh);
 +    }
 +    tcg_gen_vec_add8_i64(d, d, t);
 +    tcg_temp_free_i64(t);
 +}
 +
 +static void gen_ursra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
 +{
 +    TCGv_i64 t = tcg_temp_new_i64();
 +
 +    if (sh == 16) {
 +        tcg_gen_vec_shr16i_i64(t, a, 15);
 +    } else {
 +        gen_urshr16_i64(t, a, sh);
 +    }
 +    tcg_gen_vec_add16_i64(d, d, t);
 +    tcg_temp_free_i64(t);
 +}
 +
 +static void gen_ursra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
 +{
 +    TCGv_i32 t = tcg_temp_new_i32();
 +
 +    if (sh == 32) {
 +        tcg_gen_shri_i32(t, a, 31);
 +    } else {
 +        gen_urshr32_i32(t, a, sh);
 +    }
 +    tcg_gen_add_i32(d, d, t);
 +    tcg_temp_free_i32(t);
 +}
 +
 +static void gen_ursra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
 +{
 +    TCGv_i64 t = tcg_temp_new_i64();
 +
 +    if (sh == 64) {
 +        tcg_gen_shri_i64(t, a, 63);
 +    } else {
 +        gen_urshr64_i64(t, a, sh);
 +    }
 +    tcg_gen_add_i64(d, d, t);
 +    tcg_temp_free_i64(t);
 +}
 +
 +static void gen_ursra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
 +{
 +    TCGv_vec t = tcg_temp_new_vec_matching(d);
 +
 +    if (sh == (8 << vece)) {
 +        tcg_gen_shri_vec(vece, t, a, sh - 1);
 +    } else {
 +        gen_urshr_vec(vece, t, a, sh);
 +    }
 +    tcg_gen_add_vec(vece, d, d, t);
 +    tcg_temp_free_vec(t);
 +}
 +
 +void gen_gvec_ursra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 +                    int64_t shift, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = {
 +        INDEX_op_shri_vec, INDEX_op_add_vec, 0
 +    };
 +    static const GVecGen2i ops[4] = {
 +        { .fni8 = gen_ursra8_i64,
 +          .fniv = gen_ursra_vec,
 +          .fno = gen_helper_gvec_ursra_b,
 +          .opt_opc = vecop_list,
 +          .load_dest = true,
 +          .vece = MO_8 },
 +        { .fni8 = gen_ursra16_i64,
 +          .fniv = gen_ursra_vec,
 +          .fno = gen_helper_gvec_ursra_h,
 +          .opt_opc = vecop_list,
 +          .load_dest = true,
 +          .vece = MO_16 },
 +        { .fni4 = gen_ursra32_i32,
 +          .fniv = gen_ursra_vec,
 +          .fno = gen_helper_gvec_ursra_s,
 +          .opt_opc = vecop_list,
 +          .load_dest = true,
 +          .vece = MO_32 },
 +        { .fni8 = gen_ursra64_i64,
 +          .fniv = gen_ursra_vec,
 +          .fno = gen_helper_gvec_ursra_d,
 +          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +          .opt_opc = vecop_list,
 +          .load_dest = true,
 +          .vece = MO_64 },
 +    };
 +
 +    /* tszimm encoding produces immediates in the range [1..esize] */
 +    tcg_debug_assert(shift > 0);
 +    tcg_debug_assert(shift <= (8 << vece));
 +
 +    tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
 +}
 +
  static void gen_shr8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
  {
      uint64_t mask = dup_const(MO_8, 0xff >> shift);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                      }
                      return 0;
 +                case 2: /* VRSHR */
 +                    /* Right shift comes here negative.  */
 +                    shift = -shift;
 +                    if (u) {
 +                        gen_gvec_urshr(size, rd_ofs, rm_ofs, shift,
 +                                       vec_size, vec_size);
 +                    } else {
 +                        gen_gvec_srshr(size, rd_ofs, rm_ofs, shift,
 +                                       vec_size, vec_size);
 +                    }
 +                    return 0;
 +
 +                case 3: /* VRSRA */
 +                    /* Right shift comes here negative.  */
 +                    shift = -shift;
 +                    if (u) {
 +                        gen_gvec_ursra(size, rd_ofs, rm_ofs, shift,
 +                                       vec_size, vec_size);
 +                    } else {
 +                        gen_gvec_srsra(size, rd_ofs, rm_ofs, shift,
 +                                       vec_size, vec_size);
 +                    }
 +                    return 0;
 +
                  case 4: /* VSRI */
                      if (!u) {
                          return 1;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                          neon_load_reg64(cpu_V0, rm + pass);
                          tcg_gen_movi_i64(cpu_V1, imm);
                          switch (op) {
 -                        case 2: /* VRSHR */
 -                        case 3: /* VRSRA */
 -                            if (u)
 -                                gen_helper_neon_rshl_u64(cpu_V0, cpu_V0, cpu_V1);
 -                            else
 -                                gen_helper_neon_rshl_s64(cpu_V0, cpu_V0, cpu_V1);
 -                            break;
                          case 6: /* VQSHLU */
                              gen_helper_neon_qshlu_s64(cpu_V0, cpu_env,
                                                        cpu_V0, cpu_V1);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                          default:
                              g_assert_not_reached();
                          }
 -                        if (op == 3) {
 -                            /* Accumulate.  */
 -                            neon_load_reg64(cpu_V1, rd + pass);
 -                            tcg_gen_add_i64(cpu_V0, cpu_V0, cpu_V1);
 -                        }
                          neon_store_reg64(cpu_V0, rd + pass);
                      } else { /* size < 3 */
                          /* Operands in T0 and T1.  */
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                          tmp2 = tcg_temp_new_i32();
                          tcg_gen_movi_i32(tmp2, imm);
                          switch (op) {
 -                        case 2: /* VRSHR */
 -                        case 3: /* VRSRA */
 -                            GEN_NEON_INTEGER_OP(rshl);
 -                            break;
                          case 6: /* VQSHLU */
                              switch (size) {
                              case 0:
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                              g_assert_not_reached();
                          }
                          tcg_temp_free_i32(tmp2);
 -
 -                        if (op == 3) {
 -                            /* Accumulate.  */
 -                            tmp2 = neon_load_reg(rd, pass);
 -                            gen_neon_add(size, tmp, tmp2);
 -                            tcg_temp_free_i32(tmp2);
 -                        }
                          neon_store_reg(rd, pass, tmp);
                      }
                  } /* for pass */
 diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/vec_helper.c
 +++ b/target/arm/vec_helper.c
@@ -XXX,XX +XXX,XX @@ DO_SRA(gvec_usra_d, uint64_t)
  #undef DO_SRA
 +#define DO_RSHR(NAME, TYPE)                             \
 +void HELPER(NAME)(void *vd, void *vn, uint32_t desc)    \
 +{                                                       \
 +    intptr_t i, oprsz = simd_oprsz(desc);               \
 +    int shift = simd_data(desc);                        \
 +    TYPE *d = vd, *n = vn;                              \
 +    for (i = 0; i < oprsz / sizeof(TYPE); i++) {        \
 +        TYPE tmp = n[i] >> (shift - 1);                 \
 +        d[i] = (tmp >> 1) + (tmp & 1);                  \
 +    }                                                   \
 +    clear_tail(d, oprsz, simd_maxsz(desc));             \
 +}
 +
 +DO_RSHR(gvec_srshr_b, int8_t)
 +DO_RSHR(gvec_srshr_h, int16_t)
 +DO_RSHR(gvec_srshr_s, int32_t)
 +DO_RSHR(gvec_srshr_d, int64_t)
 +
 +DO_RSHR(gvec_urshr_b, uint8_t)
 +DO_RSHR(gvec_urshr_h, uint16_t)
 +DO_RSHR(gvec_urshr_s, uint32_t)
 +DO_RSHR(gvec_urshr_d, uint64_t)
 +
 +#undef DO_RSHR
 +
 +#define DO_RSRA(NAME, TYPE)                             \
 +void HELPER(NAME)(void *vd, void *vn, uint32_t desc)    \
 +{                                                       \
 +    intptr_t i, oprsz = simd_oprsz(desc);               \
 +    int shift = simd_data(desc);                        \
 +    TYPE *d = vd, *n = vn;                              \
 +    for (i = 0; i < oprsz / sizeof(TYPE); i++) {        \
 +        TYPE tmp = n[i] >> (shift - 1);                 \
 +        d[i] += (tmp >> 1) + (tmp & 1);                 \
 +    }                                                   \
 +    clear_tail(d, oprsz, simd_maxsz(desc));             \
 +}
 +
 +DO_RSRA(gvec_srsra_b, int8_t)
 +DO_RSRA(gvec_srsra_h, int16_t)
 +DO_RSRA(gvec_srsra_s, int32_t)
 +DO_RSRA(gvec_srsra_d, int64_t)
 +
 +DO_RSRA(gvec_ursra_b, uint8_t)
 +DO_RSRA(gvec_ursra_h, uint16_t)
 +DO_RSRA(gvec_ursra_s, uint32_t)
 +DO_RSRA(gvec_ursra_d, uint64_t)
 +
 +#undef DO_RSRA
 +
  /*
   * Convert float16 to float32, raising no exceptions and
   * preserving exceptional values, including SNaN.
 --
-.20.1
+.25.1

-[PULL 09/45] target/arm: Swap argument order for VSHL during decode
+[PULL 05/23] target/arm: Make some more cpreg data static const
 From: Richard Henderson <richard.henderson@linaro.org>
-Rather than perform the argument swap during code generation,
+These particular data structures are not modified at runtime.
 perform it during decode.  This means it doesn't have to be
 special cased later, and we can share code with aarch64 code
 generation.  Hopefully the decode comment addresses any confusion
 that might arise in between.
+Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200513163245.17915-9-richard.henderson@linaro.org
+Message-id: 20220501055028.646596-5-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/neon-dp.decode       | 17 +++++++++++++++--
+ target/arm/helper.c | 16 ++++++++--------
- target/arm/translate-neon.inc.c |  3 +--
+file changed, 8 insertions(+), 8 deletions(-)
 files changed, 16 insertions(+), 4 deletions(-)
-diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
+diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/neon-dp.decode
+--- a/target/arm/helper.c
-+++ b/target/arm/neon-dp.decode
++++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ VCGT_U_3s        1111 001 1 0 . .. .... .... 0011 . . . 0 .... @3same
+@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
- VCGE_S_3s        1111 001 0 0 . .. .... .... 0011 . . . 1 .... @3same
+               .resetvalue = cpu->pmceid1 },
- VCGE_U_3s        1111 001 1 0 . .. .... .... 0011 . . . 1 .... @3same
+         };
+ #ifdef CONFIG_USER_ONLY
--VSHL_S_3s        1111 001 0 0 . .. .... .... 0100 . . . 0 .... @3same
+-        ARMCPRegUserSpaceInfo v8_user_idregs[] = {
--VSHL_U_3s        1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same
++        static const ARMCPRegUserSpaceInfo v8_user_idregs[] = {
-+# The _rev suffix indicates that Vn and Vm are reversed. This is
+             { .name = "ID_AA64PFR0_EL1",
-+# the case for shifts. In the Arm ARM these insns are documented
+               .exported_bits = 0x000f000f00ff0000,
-+# with the Vm and Vn fields in their usual places, but in the
+               .fixed_bits    = 0x0000000000000011 },
-+# assembly the operands are listed "backwards", ie in the order
+@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
-+# Dd, Dm, Dn where other insns use Dd, Dn, Dm. For QEMU we choose
+      */
-+# to consider Vm and Vn as being in different fields in the insn,
+     if (arm_feature(env, ARM_FEATURE_EL3)) {
-+# which allows us to avoid special-casing shifts in the trans_
+         if (arm_feature(env, ARM_FEATURE_AARCH64)) {
-+# function code. We would otherwise need to manually swap the operands
+-            ARMCPRegInfo nsacr = {
-+# over to call Neon helper functions that are shared with AArch64,
++            static const ARMCPRegInfo nsacr = {
-+# which does not have this odd reversed-operand situation.
+                 .name = "NSACR", .type = ARM_CP_CONST,
-+@3same_rev       .... ... . . . size:2 .... .... .... . q:1 . . .... \
+                 .cp = 15, .opc1 = 0, .crn = 1, .crm = 1, .opc2 = 2,
-+                 &3same vn=%vm_dp vm=%vn_dp vd=%vd_dp
+                 .access = PL1_RW, .accessfn = nsacr_access,
-+
+@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
-+VSHL_S_3s        1111 001 0 0 . .. .... .... 0100 . . . 0 .... @3same_rev
+             };
-+VSHL_U_3s        1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same_rev
+             define_one_arm_cp_reg(cpu, &nsacr);
+         } else {
- VMAX_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 0 .... @3same
+-            ARMCPRegInfo nsacr = {
- VMAX_U_3s        1111 001 1 0 . .. .... .... 0110 . . . 0 .... @3same
++            static const ARMCPRegInfo nsacr = {
-diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
+                 .name = "NSACR",
-index XXXXXXX..XXXXXXX 100644
+                 .cp = 15, .opc1 = 0, .crn = 1, .crm = 1, .opc2 = 2,
---- a/target/arm/translate-neon.inc.c
+                 .access = PL3_RW | PL1_R,
-+++ b/target/arm/translate-neon.inc.c
+@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
-@@ -XXX,XX +XXX,XX @@ static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
+         }
-                                 uint32_t rn_ofs, uint32_t rm_ofs,       \
+     } else {
-                                 uint32_t oprsz, uint32_t maxsz)         \
+         if (arm_feature(env, ARM_FEATURE_V8)) {
-     {                                                                   \
+-            ARMCPRegInfo nsacr = {
--        /* Note the operation is vshl vd,vm,vn */                       \
++            static const ARMCPRegInfo nsacr = {
--        tcg_gen_gvec_3(rd_ofs, rm_ofs, rn_ofs,                          \
+                 .name = "NSACR", .type = ARM_CP_CONST,
-+        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,                          \
+                 .cp = 15, .opc1 = 0, .crn = 1, .crm = 1, .opc2 = 2,
-                        oprsz, maxsz, &OPARRAY[vece]);                   \
+                 .access = PL1_R,
-     }                                                                   \
+@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
-     DO_3SAME(INSN, gen_##INSN##_3s)
+               .access = PL1_R, .type = ARM_CP_CONST,
                .resetvalue = cpu->pmsav7_dregion << 8
          };
 -        ARMCPRegInfo crn0_wi_reginfo = {
 +        static const ARMCPRegInfo crn0_wi_reginfo = {
              .name = "CRN0_WI", .cp = 15, .crn = 0, .crm = CP_ANY,
              .opc1 = CP_ANY, .opc2 = CP_ANY, .access = PL1_W,
              .type = ARM_CP_NOP | ARM_CP_OVERRIDE
          };
  #ifdef CONFIG_USER_ONLY
 -        ARMCPRegUserSpaceInfo id_v8_user_midr_cp_reginfo[] = {
 +        static const ARMCPRegUserSpaceInfo id_v8_user_midr_cp_reginfo[] = {
              { .name = "MIDR_EL1",
                .exported_bits = 0x00000000ffffffff },
              { .name = "REVIDR_EL1"                },
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
                .access = PL1_R, .readfn = mpidr_read, .type = ARM_CP_NO_RAW },
          };
  #ifdef CONFIG_USER_ONLY
 -        ARMCPRegUserSpaceInfo mpidr_user_cp_reginfo[] = {
 +        static const ARMCPRegUserSpaceInfo mpidr_user_cp_reginfo[] = {
              { .name = "MPIDR_EL1",
                .fixed_bits = 0x0000000080000000 },
          };
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
      }
      if (arm_feature(env, ARM_FEATURE_VBAR)) {
 -        ARMCPRegInfo vbar_cp_reginfo[] = {
 +        static const ARMCPRegInfo vbar_cp_reginfo[] = {
              { .name = "VBAR", .state = ARM_CP_STATE_BOTH,
                .opc0 = 3, .crn = 12, .crm = 0, .opc1 = 0, .opc2 = 0,
                .access = PL1_RW, .writefn = vbar_write,
 --
-.20.1
+.25.1

-[PULL 16/45] target/arm: Vectorize SABD/UABD
+[PULL 06/23] target/arm: Reorg ARMCPRegInfo type field bits
 From: Richard Henderson <richard.henderson@linaro.org>
-Include 64-bit element size in preparation for SVE2.
+Instead of defining ARM_CP_FLAG_MASK to remove flags,
+define ARM_CP_SPECIAL_MASK to isolate special cases.
 Sort the specials to the low bits. Use an enum.
 Split the large comment block so as to document each
 value separately.
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20220501055028.646596-6-richard.henderson@linaro.org
 Message-id: 20200513163245.17915-16-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/helper.h        |  10 +++
+ target/arm/cpregs.h        | 130 +++++++++++++++++++++++--------------
- target/arm/translate.h     |   5 ++
+ target/arm/cpu.c           |   4 +-
- target/arm/translate-a64.c |   8 ++-
+ target/arm/helper.c        |   4 +-
- target/arm/translate.c     | 133 ++++++++++++++++++++++++++++++++++++-
+ target/arm/translate-a64.c |   6 +-
- target/arm/vec_helper.c    |  24 +++++++
+ target/arm/translate.c     |   6 +-
-files changed, 176 insertions(+), 4 deletions(-)
+files changed, 92 insertions(+), 58 deletions(-)
-diff --git a/target/arm/helper.h b/target/arm/helper.h
+diff --git a/target/arm/cpregs.h b/target/arm/cpregs.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.h
+--- a/target/arm/cpregs.h
-+++ b/target/arm/helper.h
++++ b/target/arm/cpregs.h
-@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(gvec_sli_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+@@ -XXX,XX +XXX,XX @@
- DEF_HELPER_FLAGS_3(gvec_sli_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+ #define TARGET_ARM_CPREGS_H
- DEF_HELPER_FLAGS_3(gvec_sli_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+ /*
-+DEF_HELPER_FLAGS_4(gvec_sabd_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+- * ARMCPRegInfo type field bits. If the SPECIAL bit is set this is a
-+DEF_HELPER_FLAGS_4(gvec_sabd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+- * special-behaviour cp reg and bits [11..8] indicate what behaviour
-+DEF_HELPER_FLAGS_4(gvec_sabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+- * it has. Otherwise it is a simple cp reg, where CONST indicates that
-+DEF_HELPER_FLAGS_4(gvec_sabd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+- * TCG can assume the value to be constant (ie load at translate time)
 - * and 64BIT indicates a 64 bit wide coprocessor register. SUPPRESS_TB_END
 - * indicates that the TB should not be ended after a write to this register
 - * (the default is that the TB ends after cp writes). OVERRIDE permits
 - * a register definition to override a previous definition for the
 - * same (cp, is64, crn, crm, opc1, opc2) tuple: either the new or the
 - * old must have the OVERRIDE bit set.
 - * ALIAS indicates that this register is an alias view of some underlying
 - * state which is also visible via another register, and that the other
 - * register is handling migration and reset; registers marked ALIAS will not be
 - * migrated but may have their state set by syncing of register state from KVM.
 - * NO_RAW indicates that this register has no underlying state and does not
 - * support raw access for state saving/loading; it will not be used for either
 - * migration or KVM state synchronization. (Typically this is for "registers"
 - * which are actually used as instructions for cache maintenance and so on.)
 - * IO indicates that this register does I/O and therefore its accesses
 - * need to be marked with gen_io_start() and also end the TB. In particular,
 - * registers which implement clocks or timers require this.
 - * RAISES_EXC is for when the read or write hook might raise an exception;
 - * the generated code will synchronize the CPU state before calling the hook
 - * so that it is safe for the hook to call raise_exception().
 - * NEWEL is for writes to registers that might change the exception
 - * level - typically on older ARM chips. For those cases we need to
 - * re-read the new el when recomputing the translation flags.
 + * ARMCPRegInfo type field bits:
   */
 -#define ARM_CP_SPECIAL           0x0001
 -#define ARM_CP_CONST             0x0002
 -#define ARM_CP_64BIT             0x0004
 -#define ARM_CP_SUPPRESS_TB_END   0x0008
 -#define ARM_CP_OVERRIDE          0x0010
 -#define ARM_CP_ALIAS             0x0020
 -#define ARM_CP_IO                0x0040
 -#define ARM_CP_NO_RAW            0x0080
 -#define ARM_CP_NOP               (ARM_CP_SPECIAL | 0x0100)
 -#define ARM_CP_WFI               (ARM_CP_SPECIAL | 0x0200)
 -#define ARM_CP_NZCV              (ARM_CP_SPECIAL | 0x0300)
 -#define ARM_CP_CURRENTEL         (ARM_CP_SPECIAL | 0x0400)
 -#define ARM_CP_DC_ZVA            (ARM_CP_SPECIAL | 0x0500)
 -#define ARM_CP_DC_GVA            (ARM_CP_SPECIAL | 0x0600)
 -#define ARM_CP_DC_GZVA           (ARM_CP_SPECIAL | 0x0700)
 -#define ARM_LAST_SPECIAL         ARM_CP_DC_GZVA
 -#define ARM_CP_FPU               0x1000
 -#define ARM_CP_SVE               0x2000
 -#define ARM_CP_NO_GDB            0x4000
 -#define ARM_CP_RAISES_EXC        0x8000
 -#define ARM_CP_NEWEL             0x10000
 -/* Mask of only the flag bits in a type field */
 -#define ARM_CP_FLAG_MASK         0x1f0ff
 +enum {
 +    /*
 +     * Register must be handled specially during translation.
 +     * The method is one of the values below:
 +     */
 +    ARM_CP_SPECIAL_MASK          = 0x000f,
 +    /* Special: no change to PE state: writes ignored, reads ignored. */
 +    ARM_CP_NOP                   = 0x0001,
 +    /* Special: sysreg is WFI, for v5 and v6. */
 +    ARM_CP_WFI                   = 0x0002,
 +    /* Special: sysreg is NZCV. */
 +    ARM_CP_NZCV                  = 0x0003,
 +    /* Special: sysreg is CURRENTEL. */
 +    ARM_CP_CURRENTEL             = 0x0004,
 +    /* Special: sysreg is DC ZVA or similar. */
 +    ARM_CP_DC_ZVA                = 0x0005,
 +    ARM_CP_DC_GVA                = 0x0006,
 +    ARM_CP_DC_GZVA               = 0x0007,
 +
-+DEF_HELPER_FLAGS_4(gvec_uabd_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
++    /* Flag: reads produce resetvalue; writes ignored. */
-+DEF_HELPER_FLAGS_4(gvec_uabd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
++    ARM_CP_CONST                 = 1 << 4,
-+DEF_HELPER_FLAGS_4(gvec_uabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
++    /* Flag: For ARM_CP_STATE_AA32, sysreg is 64-bit. */
-+DEF_HELPER_FLAGS_4(gvec_uabd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
++    ARM_CP_64BIT                 = 1 << 5,
-+
++    /*
- #ifdef TARGET_AARCH64
++     * Flag: TB should not be ended after a write to this register
- #include "helper-a64.h"
++     * (the default is that the TB ends after cp writes).
- #include "helper-sve.h"
++     */
-diff --git a/target/arm/translate.h b/target/arm/translate.h
++    ARM_CP_SUPPRESS_TB_END       = 1 << 6,
-index XXXXXXX..XXXXXXX 100644
++    /*
---- a/target/arm/translate.h
++     * Flag: Permit a register definition to override a previous definition
-+++ b/target/arm/translate.h
++     * for the same (cp, is64, crn, crm, opc1, opc2) tuple: either the new
-@@ -XXX,XX +XXX,XX @@ void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
++     * or the old must have the ARM_CP_OVERRIDE bit set.
- void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
++     */
-                           uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
++    ARM_CP_OVERRIDE              = 1 << 7,
++    /*
-+void gen_gvec_sabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
++     * Flag: Register is an alias view of some underlying state which is also
-+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
++     * visible via another register, and that the other register is handling
-+void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
++     * migration and reset; registers marked ARM_CP_ALIAS will not be migrated
-+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
++     * but may have their state set by syncing of register state from KVM.
-+
++     */
 +    ARM_CP_ALIAS                 = 1 << 8,
 +    /*
 +     * Flag: Register does I/O and therefore its accesses need to be marked
 +     * with gen_io_start() and also end the TB. In particular, registers which
 +     * implement clocks or timers require this.
 +     */
 +    ARM_CP_IO                    = 1 << 9,
 +    /*
 +     * Flag: Register has no underlying state and does not support raw access
 +     * for state saving/loading; it will not be used for either migration or
 +     * KVM state synchronization. Typically this is for "registers" which are
 +     * actually used as instructions for cache maintenance and so on.
 +     */
 +    ARM_CP_NO_RAW                = 1 << 10,
 +    /*
 +     * Flag: The read or write hook might raise an exception; the generated
 +     * code will synchronize the CPU state before calling the hook so that it
 +     * is safe for the hook to call raise_exception().
 +     */
 +    ARM_CP_RAISES_EXC            = 1 << 11,
 +    /*
 +     * Flag: Writes to the sysreg might change the exception level - typically
 +     * on older ARM chips. For those cases we need to re-read the new el when
 +     * recomputing the translation flags.
 +     */
 +    ARM_CP_NEWEL                 = 1 << 12,
 +    /*
 +     * Flag: Access check for this sysreg is identical to accessing FPU state
 +     * from an instruction: use translation fp_access_check().
 +     */
 +    ARM_CP_FPU                   = 1 << 13,
 +    /*
 +     * Flag: Access check for this sysreg is identical to accessing SVE state
 +     * from an instruction: use translation sve_access_check().
 +     */
 +    ARM_CP_SVE                   = 1 << 14,
 +    /* Flag: Do not expose in gdb sysreg xml. */
 +    ARM_CP_NO_GDB                = 1 << 15,
 +};
  /*
-  * Forward to the isar_feature_* tests given a DisasContext pointer.
+  * Valid values for ARMCPRegInfo state field, indicating which of
-  */
+diff --git a/target/arm/cpu.c b/target/arm/cpu.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu.c
 +++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void cp_reg_reset(gpointer key, gpointer value, gpointer opaque)
      ARMCPRegInfo *ri = value;
      ARMCPU *cpu = opaque;
 -    if (ri->type & (ARM_CP_SPECIAL | ARM_CP_ALIAS)) {
 +    if (ri->type & (ARM_CP_SPECIAL_MASK | ARM_CP_ALIAS)) {
          return;
      }
@@ -XXX,XX +XXX,XX @@ static void cp_reg_check_reset(gpointer key, gpointer value,  gpointer opaque)
      ARMCPU *cpu = opaque;
      uint64_t oldvalue, newvalue;
 -    if (ri->type & (ARM_CP_SPECIAL | ARM_CP_ALIAS | ARM_CP_NO_RAW)) {
 +    if (ri->type & (ARM_CP_SPECIAL_MASK | ARM_CP_ALIAS | ARM_CP_NO_RAW)) {
          return;
      }
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.c
 +++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
       * multiple times. Special registers (ie NOP/WFI) are
       * never migratable and not even raw-accessible.
       */
 -    if ((r->type & ARM_CP_SPECIAL)) {
 +    if (r->type & ARM_CP_SPECIAL_MASK) {
          r2->type |= ARM_CP_NO_RAW;
      }
      if (((r->crm == CP_ANY) && crm != 0) ||
@@ -XXX,XX +XXX,XX @@ void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu,
      /* Check that the register definition has enough info to handle
       * reads and writes if they are permitted.
       */
 -    if (!(r->type & (ARM_CP_SPECIAL|ARM_CP_CONST))) {
 +    if (!(r->type & (ARM_CP_SPECIAL_MASK | ARM_CP_CONST))) {
          if (r->access & PL3_R) {
              assert((r->fieldoffset ||
                     (r->bank_fieldoffsets[0] && r->bank_fieldoffsets[1])) ||
 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-a64.c
 +++ b/target/arm/translate-a64.c
-@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
-             gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_smin, size);
+     }
      /* Handle special cases first */
 -    switch (ri->type & ~(ARM_CP_FLAG_MASK & ~ARM_CP_SPECIAL)) {
 +    switch (ri->type & ARM_CP_SPECIAL_MASK) {
 +    case 0:
 +        break;
      case ARM_CP_NOP:
          return;
      case ARM_CP_NZCV:
@@ -XXX,XX +XXX,XX @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
          }
          return;
-+    case 0xe: /* SABD, UABD */
+     default:
-+        if (u) {
+-        break;
-+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uabd, size);
++        g_assert_not_reached();
-+        } else {
+     }
-+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sabd, size);
+     if ((ri->type & ARM_CP_FPU) && !fp_access_check(s)) {
-+        }
+         return;
 +        return;
      case 0x10: /* ADD, SUB */
          if (u) {
              gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_sub, size);
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
                  genenvfn = fns[size][u];
                  break;
              }
 -            case 0xe: /* SABD, UABD */
              case 0xf: /* SABA, UABA */
              {
                  static NeonGenTwoOpFn * const fns[3][2] = {
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ void gen_gvec_sqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+@@ -XXX,XX +XXX,XX @@ static void do_coproc_insn(DisasContext *s, int cpnum, int is64,
-                    rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+         }
- }
+         /* Handle special cases first */
-+static void gen_sabd_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
+-        switch (ri->type & ~(ARM_CP_FLAG_MASK & ~ARM_CP_SPECIAL)) {
-+{
++        switch (ri->type & ARM_CP_SPECIAL_MASK) {
-+    TCGv_i32 t = tcg_temp_new_i32();
++        case 0:
-+
++            break;
-+    tcg_gen_sub_i32(t, a, b);
+         case ARM_CP_NOP:
-+    tcg_gen_sub_i32(d, b, a);
+             return;
-+    tcg_gen_movcond_i32(TCG_COND_LT, d, a, b, d, t);
+         case ARM_CP_WFI:
-+    tcg_temp_free_i32(t);
+@@ -XXX,XX +XXX,XX @@ static void do_coproc_insn(DisasContext *s, int cpnum, int is64,
-+}
+             s->base.is_jmp = DISAS_WFI;
-+
+             return;
-+static void gen_sabd_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+         default:
 +{
 +    TCGv_i64 t = tcg_temp_new_i64();
 +
 +    tcg_gen_sub_i64(t, a, b);
 +    tcg_gen_sub_i64(d, b, a);
 +    tcg_gen_movcond_i64(TCG_COND_LT, d, a, b, d, t);
 +    tcg_temp_free_i64(t);
 +}
 +
 +static void gen_sabd_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
 +{
 +    TCGv_vec t = tcg_temp_new_vec_matching(d);
 +
 +    tcg_gen_smin_vec(vece, t, a, b);
 +    tcg_gen_smax_vec(vece, d, a, b);
 +    tcg_gen_sub_vec(vece, d, d, t);
 +    tcg_temp_free_vec(t);
 +}
 +
 +void gen_gvec_sabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = {
 +        INDEX_op_sub_vec, INDEX_op_smin_vec, INDEX_op_smax_vec, 0
 +    };
 +    static const GVecGen3 ops[4] = {
 +        { .fniv = gen_sabd_vec,
 +          .fno = gen_helper_gvec_sabd_b,
 +          .opt_opc = vecop_list,
 +          .vece = MO_8 },
 +        { .fniv = gen_sabd_vec,
 +          .fno = gen_helper_gvec_sabd_h,
 +          .opt_opc = vecop_list,
 +          .vece = MO_16 },
 +        { .fni4 = gen_sabd_i32,
 +          .fniv = gen_sabd_vec,
 +          .fno = gen_helper_gvec_sabd_s,
 +          .opt_opc = vecop_list,
 +          .vece = MO_32 },
 +        { .fni8 = gen_sabd_i64,
 +          .fniv = gen_sabd_vec,
 +          .fno = gen_helper_gvec_sabd_d,
 +          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +          .opt_opc = vecop_list,
 +          .vece = MO_64 },
 +    };
 +    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 +}
 +
 +static void gen_uabd_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
 +{
 +    TCGv_i32 t = tcg_temp_new_i32();
 +
 +    tcg_gen_sub_i32(t, a, b);
 +    tcg_gen_sub_i32(d, b, a);
 +    tcg_gen_movcond_i32(TCG_COND_LTU, d, a, b, d, t);
 +    tcg_temp_free_i32(t);
 +}
 +
 +static void gen_uabd_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
 +{
 +    TCGv_i64 t = tcg_temp_new_i64();
 +
 +    tcg_gen_sub_i64(t, a, b);
 +    tcg_gen_sub_i64(d, b, a);
 +    tcg_gen_movcond_i64(TCG_COND_LTU, d, a, b, d, t);
 +    tcg_temp_free_i64(t);
 +}
 +
 +static void gen_uabd_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
 +{
 +    TCGv_vec t = tcg_temp_new_vec_matching(d);
 +
 +    tcg_gen_umin_vec(vece, t, a, b);
 +    tcg_gen_umax_vec(vece, d, a, b);
 +    tcg_gen_sub_vec(vece, d, d, t);
 +    tcg_temp_free_vec(t);
 +}
 +
 +void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = {
 +        INDEX_op_sub_vec, INDEX_op_umin_vec, INDEX_op_umax_vec, 0
 +    };
 +    static const GVecGen3 ops[4] = {
 +        { .fniv = gen_uabd_vec,
 +          .fno = gen_helper_gvec_uabd_b,
 +          .opt_opc = vecop_list,
 +          .vece = MO_8 },
 +        { .fniv = gen_uabd_vec,
 +          .fno = gen_helper_gvec_uabd_h,
 +          .opt_opc = vecop_list,
 +          .vece = MO_16 },
 +        { .fni4 = gen_uabd_i32,
 +          .fniv = gen_uabd_vec,
 +          .fno = gen_helper_gvec_uabd_s,
 +          .opt_opc = vecop_list,
 +          .vece = MO_32 },
 +        { .fni8 = gen_uabd_i64,
 +          .fniv = gen_uabd_vec,
 +          .fno = gen_helper_gvec_uabd_d,
 +          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +          .opt_opc = vecop_list,
 +          .vece = MO_64 },
 +    };
 +    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 +}
 +
  /* Translate a NEON data processing instruction.  Return nonzero if the
     instruction is invalid.
     We process data in a mixture of 32-bit and 64-bit chunks.
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
              }
              return 1;
 +        case NEON_3R_VABD:
 +            if (u) {
 +                gen_gvec_uabd(size, rd_ofs, rn_ofs, rm_ofs,
 +                              vec_size, vec_size);
 +            } else {
 +                gen_gvec_sabd(size, rd_ofs, rn_ofs, rm_ofs,
 +                              vec_size, vec_size);
 +            }
 +            return 0;
 +
          case NEON_3R_VADD_VSUB:
          case NEON_3R_LOGIC:
          case NEON_3R_VMAX:
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          case NEON_3R_VQRSHL:
              GEN_NEON_INTEGER_OP_ENV(qrshl);
              break;
 -        case NEON_3R_VABD:
 -            GEN_NEON_INTEGER_OP(abd);
 -            break;
-         case NEON_3R_VABA:
++            g_assert_not_reached();
-             GEN_NEON_INTEGER_OP(abd);
+         }
-             tcg_temp_free_i32(tmp2);
-diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
+         if ((tb_cflags(s->base.tb) & CF_USE_ICOUNT) && (ri->type & ARM_CP_IO)) {
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/vec_helper.c
 +++ b/target/arm/vec_helper.c
@@ -XXX,XX +XXX,XX @@ DO_CMP0(gvec_cgt0_h, int16_t, >)
  DO_CMP0(gvec_cge0_h, int16_t, >=)
  #undef DO_CMP0
 +
 +#define DO_ABD(NAME, TYPE)                                      \
 +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc)  \
 +{                                                               \
 +    intptr_t i, opr_sz = simd_oprsz(desc);                      \
 +    TYPE *d = vd, *n = vn, *m = vm;                             \
 +                                                                \
 +    for (i = 0; i < opr_sz / sizeof(TYPE); ++i) {               \
 +        d[i] = n[i] < m[i] ? m[i] - n[i] : n[i] - m[i];         \
 +    }                                                           \
 +    clear_tail(d, opr_sz, simd_maxsz(desc));                    \
 +}
 +
 +DO_ABD(gvec_sabd_b, int8_t)
 +DO_ABD(gvec_sabd_h, int16_t)
 +DO_ABD(gvec_sabd_s, int32_t)
 +DO_ABD(gvec_sabd_d, int64_t)
 +
 +DO_ABD(gvec_uabd_b, uint8_t)
 +DO_ABD(gvec_uabd_h, uint16_t)
 +DO_ABD(gvec_uabd_s, uint32_t)
 +DO_ABD(gvec_uabd_d, uint64_t)
 +
 +#undef DO_ABD
 --
-.20.1
+.25.1

-[PULL 07/45] target/arm: Create gen_gvec_{ceq,clt,cle,cgt,cge}0
+[PULL 07/23] target/arm: Avoid bare abort() or assert(0)
 From: Richard Henderson <richard.henderson@linaro.org>
-Provide a functional interface for the vector expansion.
+Standardize on g_assert_not_reached() for "should not happen".
-This fits better with the existing set of helpers that
+Retain abort() when preceeded by fprintf or error_report.
 we provide for other operations.
-Macro-ize the 5 nearly identical comparisons.
+Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20220501055028.646596-7-richard.henderson@linaro.org
 Message-id: 20200513163245.17915-7-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate.h     |  16 ++-
+ target/arm/helper.c         | 7 +++----
- target/arm/translate-a64.c |  22 ++--
+ target/arm/hvf/hvf.c        | 2 +-
- target/arm/translate.c     | 254 ++++++++-----------------------------
+ target/arm/kvm-stub.c       | 4 ++--
-files changed, 74 insertions(+), 218 deletions(-)
+ target/arm/kvm.c            | 4 ++--
  target/arm/machine.c        | 4 ++--
  target/arm/translate-a64.c  | 4 ++--
  target/arm/translate-neon.c | 2 +-
  target/arm/translate.c      | 4 ++--
 files changed, 15 insertions(+), 16 deletions(-)
-diff --git a/target/arm/translate.h b/target/arm/translate.h
+diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.h
+--- a/target/arm/helper.c
-+++ b/target/arm/translate.h
++++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ static inline void gen_swstep_exception(DisasContext *s, int isv, int ex)
+@@ -XXX,XX +XXX,XX @@ void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu,
- uint64_t vfp_expand_imm(int size, uint8_t imm8);
+             break;
+         default:
- /* Vector operations shared between ARM and AArch64.  */
+             /* broken reginfo with out-of-range opc1 */
--extern const GVecGen2 ceq0_op[4];
+-            assert(false);
--extern const GVecGen2 clt0_op[4];
+-            break;
--extern const GVecGen2 cgt0_op[4];
++            g_assert_not_reached();
--extern const GVecGen2 cle0_op[4];
+         }
--extern const GVecGen2 cge0_op[4];
+         /* assert our permissions are not too lax (stricter is fine) */
-+void gen_gvec_ceq0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+         assert((r->access & ~mask) == 0);
-+                   uint32_t opr_sz, uint32_t max_sz);
+@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_v5(CPUARMState *env, uint32_t address,
-+void gen_gvec_clt0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+             break;
-+                   uint32_t opr_sz, uint32_t max_sz);
+         default:
-+void gen_gvec_cgt0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+             /* Never happens, but compiler isn't smart enough to tell.  */
-+                   uint32_t opr_sz, uint32_t max_sz);
+-            abort();
-+void gen_gvec_cle0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
++            g_assert_not_reached();
-+                   uint32_t opr_sz, uint32_t max_sz);
+         }
-+void gen_gvec_cge0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+     }
-+                   uint32_t opr_sz, uint32_t max_sz);
+     *prot = ap_to_rw_prot(env, mmu_idx, ap, domain_prot);
-+
+@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_v6(CPUARMState *env, uint32_t address,
- extern const GVecGen3 mla_op[4];
+             break;
- extern const GVecGen3 mls_op[4];
+         default:
- extern const GVecGen3 cmtst_op[4];
+             /* Never happens, but compiler isn't smart enough to tell.  */
 -            abort();
 +            g_assert_not_reached();
          }
      }
      if (domain_prot == 3) {
 diff --git a/target/arm/hvf/hvf.c b/target/arm/hvf/hvf.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/hvf/hvf.c
 +++ b/target/arm/hvf/hvf.c
@@ -XXX,XX +XXX,XX @@ int hvf_vcpu_exec(CPUState *cpu)
          /* we got kicked, no exit to process */
          return 0;
      default:
 -        assert(0);
 +        g_assert_not_reached();
      }
      hvf_sync_vtimer(cpu);
 diff --git a/target/arm/kvm-stub.c b/target/arm/kvm-stub.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/kvm-stub.c
 +++ b/target/arm/kvm-stub.c
@@ -XXX,XX +XXX,XX @@
  bool write_kvmstate_to_list(ARMCPU *cpu)
  {
 -    abort();
 +    g_assert_not_reached();
  }
  bool write_list_to_kvmstate(ARMCPU *cpu, int level)
  {
 -    abort();
 +    g_assert_not_reached();
  }
 diff --git a/target/arm/kvm.c b/target/arm/kvm.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/kvm.c
 +++ b/target/arm/kvm.c
@@ -XXX,XX +XXX,XX @@ bool write_kvmstate_to_list(ARMCPU *cpu)
              ret = kvm_vcpu_ioctl(cs, KVM_GET_ONE_REG, &r);
              break;
          default:
 -            abort();
 +            g_assert_not_reached();
          }
          if (ret) {
              ok = false;
@@ -XXX,XX +XXX,XX @@ bool write_list_to_kvmstate(ARMCPU *cpu, int level)
              r.addr = (uintptr_t)(cpu->cpreg_values + i);
              break;
          default:
 -            abort();
 +            g_assert_not_reached();
          }
          ret = kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, &r);
          if (ret) {
 diff --git a/target/arm/machine.c b/target/arm/machine.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/machine.c
 +++ b/target/arm/machine.c
@@ -XXX,XX +XXX,XX @@ static int cpu_pre_save(void *opaque)
      if (kvm_enabled()) {
          if (!write_kvmstate_to_list(cpu)) {
              /* This should never fail */
 -            abort();
 +            g_assert_not_reached();
          }
          /*
@@ -XXX,XX +XXX,XX @@ static int cpu_pre_save(void *opaque)
      } else {
          if (!write_cpustate_to_list(cpu, false)) {
              /* This should never fail. */
 -            abort();
 +            g_assert_not_reached();
          }
      }
 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-a64.c
 +++ b/target/arm/translate-a64.c
-@@ -XXX,XX +XXX,XX @@ static void gen_gvec_fn4(DisasContext *s, bool is_q, int rd, int rn, int rm,
+@@ -XXX,XX +XXX,XX @@ static void handle_fp_1src_half(DisasContext *s, int opcode, int rd, int rn)
-             is_q ? 16 : 8, vec_full_reg_size(s));
+         gen_helper_advsimd_rinth(tcg_res, tcg_op, fpst);
          break;
      default:
 -        abort();
 +        g_assert_not_reached();
      }
      write_fp_sreg(s, rd, tcg_res);
@@ -XXX,XX +XXX,XX @@ static void handle_fp_fcvt(DisasContext *s, int opcode,
          break;
      }
      default:
 -        abort();
 +        g_assert_not_reached();
      }
  }
--/* Expand a 2-operand AdvSIMD vector operation using an op descriptor. */
+diff --git a/target/arm/translate-neon.c b/target/arm/translate-neon.c
--static void gen_gvec_op2(DisasContext *s, bool is_q, int rd,
+index XXXXXXX..XXXXXXX 100644
--                         int rn, const GVecGen2 *gvec_op)
+--- a/target/arm/translate-neon.c
--{
++++ b/target/arm/translate-neon.c
--    tcg_gen_gvec_2(vec_full_reg_offset(s, rd), vec_full_reg_offset(s, rn),
+@@ -XXX,XX +XXX,XX @@ static bool trans_VLDST_single(DisasContext *s, arg_VLDST_single *a)
 -                   is_q ? 16 : 8, vec_full_reg_size(s), gvec_op);
 -}
 -
  /* Expand a 3-operand AdvSIMD vector operation using an op descriptor.  */
  static void gen_gvec_op3(DisasContext *s, bool is_q, int rd,
                           int rn, int rm, const GVecGen3 *gvec_op)
@@ -XXX,XX +XXX,XX @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn)
          }
          break;
-     case 0x8: /* CMGT, CMGE */
+     default:
--        gen_gvec_op2(s, is_q, rd, rn, u ? &cge0_op[size] : &cgt0_op[size]);
+-        abort();
-+        if (u) {
++        g_assert_not_reached();
-+            gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_cge0, size);
+     }
-+        } else {
+     if ((vd + a->stride * (nregs - 1)) > 31) {
-+            gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_cgt0, size);
+         /*
 +        }
          return;
      case 0x9: /* CMEQ, CMLE */
 -        gen_gvec_op2(s, is_q, rd, rn, u ? &cle0_op[size] : &ceq0_op[size]);
 +        if (u) {
 +            gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_cle0, size);
 +        } else {
 +            gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_ceq0, size);
 +        }
          return;
      case 0xa: /* CMLT */
 -        gen_gvec_op2(s, is_q, rd, rn, &clt0_op[size]);
 +        gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_clt0, size);
          return;
      case 0xb:
          if (u) { /* ABS, NEG */
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static int do_v81_helper(DisasContext *s, gen_helper_gvec_3_ptr *fn,
+@@ -XXX,XX +XXX,XX @@ static void gen_srs(DisasContext *s,
-     return 1;
+         offset = 4;
- }
+         break;
+     default:
--static void gen_ceq0_i32(TCGv_i32 d, TCGv_i32 a)
+-        abort();
--{
++        g_assert_not_reached();
--    tcg_gen_setcondi_i32(TCG_COND_EQ, d, a, 0);
+     }
--    tcg_gen_neg_i32(d, d);
+     tcg_gen_addi_i32(addr, addr, offset);
--}
+     tmp = load_reg(s, 14);
--
+@@ -XXX,XX +XXX,XX @@ static void gen_srs(DisasContext *s,
--static void gen_ceq0_i64(TCGv_i64 d, TCGv_i64 a)
+             offset = 0;
--{
+             break;
--    tcg_gen_setcondi_i64(TCG_COND_EQ, d, a, 0);
+         default:
--    tcg_gen_neg_i64(d, d);
+-            abort();
--}
++            g_assert_not_reached();
--
+         }
--static void gen_ceq0_vec(unsigned vece, TCGv_vec d, TCGv_vec a)
+         tcg_gen_addi_i32(addr, addr, offset);
--{
+         gen_helper_set_r13_banked(cpu_env, tcg_constant_i32(mode), addr);
 -    TCGv_vec zero = tcg_const_zeros_vec_matching(d);
 -    tcg_gen_cmp_vec(TCG_COND_EQ, vece, d, a, zero);
 -    tcg_temp_free_vec(zero);
 -}
 +#define GEN_CMP0(NAME, COND)                                            \
 +    static void gen_##NAME##0_i32(TCGv_i32 d, TCGv_i32 a)               \
 +    {                                                                   \
 +        tcg_gen_setcondi_i32(COND, d, a, 0);                            \
 +        tcg_gen_neg_i32(d, d);                                          \
 +    }                                                                   \
 +    static void gen_##NAME##0_i64(TCGv_i64 d, TCGv_i64 a)               \
 +    {                                                                   \
 +        tcg_gen_setcondi_i64(COND, d, a, 0);                            \
 +        tcg_gen_neg_i64(d, d);                                          \
 +    }                                                                   \
 +    static void gen_##NAME##0_vec(unsigned vece, TCGv_vec d, TCGv_vec a) \
 +    {                                                                   \
 +        TCGv_vec zero = tcg_const_zeros_vec_matching(d);                \
 +        tcg_gen_cmp_vec(COND, vece, d, a, zero);                        \
 +        tcg_temp_free_vec(zero);                                        \
 +    }                                                                   \
 +    void gen_gvec_##NAME##0(unsigned vece, uint32_t d, uint32_t m,      \
 +                            uint32_t opr_sz, uint32_t max_sz)           \
 +    {                                                                   \
 +        const GVecGen2 op[4] = {                                        \
 +            { .fno = gen_helper_gvec_##NAME##0_b,                       \
 +              .fniv = gen_##NAME##0_vec,                                \
 +              .opt_opc = vecop_list_cmp,                                \
 +              .vece = MO_8 },                                           \
 +            { .fno = gen_helper_gvec_##NAME##0_h,                       \
 +              .fniv = gen_##NAME##0_vec,                                \
 +              .opt_opc = vecop_list_cmp,                                \
 +              .vece = MO_16 },                                          \
 +            { .fni4 = gen_##NAME##0_i32,                                \
 +              .fniv = gen_##NAME##0_vec,                                \
 +              .opt_opc = vecop_list_cmp,                                \
 +              .vece = MO_32 },                                          \
 +            { .fni8 = gen_##NAME##0_i64,                                \
 +              .fniv = gen_##NAME##0_vec,                                \
 +              .opt_opc = vecop_list_cmp,                                \
 +              .prefer_i64 = TCG_TARGET_REG_BITS == 64,                  \
 +              .vece = MO_64 },                                          \
 +        };                                                              \
 +        tcg_gen_gvec_2(d, m, opr_sz, max_sz, &op[vece]);                \
 +    }
  static const TCGOpcode vecop_list_cmp[] = {
      INDEX_op_cmp_vec, 0
  };
 -const GVecGen2 ceq0_op[4] = {
 -    { .fno = gen_helper_gvec_ceq0_b,
 -      .fniv = gen_ceq0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .vece = MO_8 },
 -    { .fno = gen_helper_gvec_ceq0_h,
 -      .fniv = gen_ceq0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .vece = MO_16 },
 -    { .fni4 = gen_ceq0_i32,
 -      .fniv = gen_ceq0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .vece = MO_32 },
 -    { .fni8 = gen_ceq0_i64,
 -      .fniv = gen_ceq0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -      .vece = MO_64 },
 -};
 +GEN_CMP0(ceq, TCG_COND_EQ)
 +GEN_CMP0(cle, TCG_COND_LE)
 +GEN_CMP0(cge, TCG_COND_GE)
 +GEN_CMP0(clt, TCG_COND_LT)
 +GEN_CMP0(cgt, TCG_COND_GT)
 -static void gen_cle0_i32(TCGv_i32 d, TCGv_i32 a)
 -{
 -    tcg_gen_setcondi_i32(TCG_COND_LE, d, a, 0);
 -    tcg_gen_neg_i32(d, d);
 -}
 -
 -static void gen_cle0_i64(TCGv_i64 d, TCGv_i64 a)
 -{
 -    tcg_gen_setcondi_i64(TCG_COND_LE, d, a, 0);
 -    tcg_gen_neg_i64(d, d);
 -}
 -
 -static void gen_cle0_vec(unsigned vece, TCGv_vec d, TCGv_vec a)
 -{
 -    TCGv_vec zero = tcg_const_zeros_vec_matching(d);
 -    tcg_gen_cmp_vec(TCG_COND_LE, vece, d, a, zero);
 -    tcg_temp_free_vec(zero);
 -}
 -
 -const GVecGen2 cle0_op[4] = {
 -    { .fno = gen_helper_gvec_cle0_b,
 -      .fniv = gen_cle0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .vece = MO_8 },
 -    { .fno = gen_helper_gvec_cle0_h,
 -      .fniv = gen_cle0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .vece = MO_16 },
 -    { .fni4 = gen_cle0_i32,
 -      .fniv = gen_cle0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .vece = MO_32 },
 -    { .fni8 = gen_cle0_i64,
 -      .fniv = gen_cle0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -      .vece = MO_64 },
 -};
 -
 -static void gen_cge0_i32(TCGv_i32 d, TCGv_i32 a)
 -{
 -    tcg_gen_setcondi_i32(TCG_COND_GE, d, a, 0);
 -    tcg_gen_neg_i32(d, d);
 -}
 -
 -static void gen_cge0_i64(TCGv_i64 d, TCGv_i64 a)
 -{
 -    tcg_gen_setcondi_i64(TCG_COND_GE, d, a, 0);
 -    tcg_gen_neg_i64(d, d);
 -}
 -
 -static void gen_cge0_vec(unsigned vece, TCGv_vec d, TCGv_vec a)
 -{
 -    TCGv_vec zero = tcg_const_zeros_vec_matching(d);
 -    tcg_gen_cmp_vec(TCG_COND_GE, vece, d, a, zero);
 -    tcg_temp_free_vec(zero);
 -}
 -
 -const GVecGen2 cge0_op[4] = {
 -    { .fno = gen_helper_gvec_cge0_b,
 -      .fniv = gen_cge0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .vece = MO_8 },
 -    { .fno = gen_helper_gvec_cge0_h,
 -      .fniv = gen_cge0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .vece = MO_16 },
 -    { .fni4 = gen_cge0_i32,
 -      .fniv = gen_cge0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .vece = MO_32 },
 -    { .fni8 = gen_cge0_i64,
 -      .fniv = gen_cge0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -      .vece = MO_64 },
 -};
 -
 -static void gen_clt0_i32(TCGv_i32 d, TCGv_i32 a)
 -{
 -    tcg_gen_setcondi_i32(TCG_COND_LT, d, a, 0);
 -    tcg_gen_neg_i32(d, d);
 -}
 -
 -static void gen_clt0_i64(TCGv_i64 d, TCGv_i64 a)
 -{
 -    tcg_gen_setcondi_i64(TCG_COND_LT, d, a, 0);
 -    tcg_gen_neg_i64(d, d);
 -}
 -
 -static void gen_clt0_vec(unsigned vece, TCGv_vec d, TCGv_vec a)
 -{
 -    TCGv_vec zero = tcg_const_zeros_vec_matching(d);
 -    tcg_gen_cmp_vec(TCG_COND_LT, vece, d, a, zero);
 -    tcg_temp_free_vec(zero);
 -}
 -
 -const GVecGen2 clt0_op[4] = {
 -    { .fno = gen_helper_gvec_clt0_b,
 -      .fniv = gen_clt0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .vece = MO_8 },
 -    { .fno = gen_helper_gvec_clt0_h,
 -      .fniv = gen_clt0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .vece = MO_16 },
 -    { .fni4 = gen_clt0_i32,
 -      .fniv = gen_clt0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .vece = MO_32 },
 -    { .fni8 = gen_clt0_i64,
 -      .fniv = gen_clt0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -      .vece = MO_64 },
 -};
 -
 -static void gen_cgt0_i32(TCGv_i32 d, TCGv_i32 a)
 -{
 -    tcg_gen_setcondi_i32(TCG_COND_GT, d, a, 0);
 -    tcg_gen_neg_i32(d, d);
 -}
 -
 -static void gen_cgt0_i64(TCGv_i64 d, TCGv_i64 a)
 -{
 -    tcg_gen_setcondi_i64(TCG_COND_GT, d, a, 0);
 -    tcg_gen_neg_i64(d, d);
 -}
 -
 -static void gen_cgt0_vec(unsigned vece, TCGv_vec d, TCGv_vec a)
 -{
 -    TCGv_vec zero = tcg_const_zeros_vec_matching(d);
 -    tcg_gen_cmp_vec(TCG_COND_GT, vece, d, a, zero);
 -    tcg_temp_free_vec(zero);
 -}
 -
 -const GVecGen2 cgt0_op[4] = {
 -    { .fno = gen_helper_gvec_cgt0_b,
 -      .fniv = gen_cgt0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .vece = MO_8 },
 -    { .fno = gen_helper_gvec_cgt0_h,
 -      .fniv = gen_cgt0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .vece = MO_16 },
 -    { .fni4 = gen_cgt0_i32,
 -      .fniv = gen_cgt0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .vece = MO_32 },
 -    { .fni8 = gen_cgt0_i64,
 -      .fniv = gen_cgt0_vec,
 -      .opt_opc = vecop_list_cmp,
 -      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -      .vece = MO_64 },
 -};
 +#undef GEN_CMP0
  static void gen_ssra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
  {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                      break;
                  case NEON_2RM_VCEQ0:
 -                    tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size,
 -                                   vec_size, &ceq0_op[size]);
 +                    gen_gvec_ceq0(size, rd_ofs, rm_ofs, vec_size, vec_size);
                      break;
                  case NEON_2RM_VCGT0:
 -                    tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size,
 -                                   vec_size, &cgt0_op[size]);
 +                    gen_gvec_cgt0(size, rd_ofs, rm_ofs, vec_size, vec_size);
                      break;
                  case NEON_2RM_VCLE0:
 -                    tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size,
 -                                   vec_size, &cle0_op[size]);
 +                    gen_gvec_cle0(size, rd_ofs, rm_ofs, vec_size, vec_size);
                      break;
                  case NEON_2RM_VCGE0:
 -                    tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size,
 -                                   vec_size, &cge0_op[size]);
 +                    gen_gvec_cge0(size, rd_ofs, rm_ofs, vec_size, vec_size);
                      break;
                  case NEON_2RM_VCLT0:
 -                    tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size,
 -                                   vec_size, &clt0_op[size]);
 +                    gen_gvec_clt0(size, rd_ofs, rm_ofs, vec_size, vec_size);
                      break;
                  default:
 --
-.20.1
+.25.1

-[PULL 05/45] target/arm: Remove unnecessary range check for VSHL
+[PULL 08/23] target/arm: Change cpreg access permissions to enum
 From: Richard Henderson <richard.henderson@linaro.org>
-In 1dc8425e551, while converting to gvec, I added an extra range check
+Create a typedef as well, and use it in ARMCPRegInfo.
-against the shift count.  This was unnecessary because the encoding of
+This won't be perfect for debugging, but it'll nicely
-the shift count produces 0 to the element size - 1.
+display the most common cases.
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200513163245.17915-5-richard.henderson@linaro.org
+Message-id: 20220501055028.646596-8-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate.c | 12 ++----------
+ target/arm/cpregs.h | 44 +++++++++++++++++++++++---------------------
-file changed, 2 insertions(+), 10 deletions(-)
+ target/arm/helper.c |  2 +-
 files changed, 24 insertions(+), 22 deletions(-)
-diff --git a/target/arm/translate.c b/target/arm/translate.c
+diff --git a/target/arm/cpregs.h b/target/arm/cpregs.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.c
+--- a/target/arm/cpregs.h
-+++ b/target/arm/translate.c
++++ b/target/arm/cpregs.h
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ enum {
-                         gen_gvec_sli(size, rd_ofs, rm_ofs, shift,
+  * described with these bits, then use a laxer set of restrictions, and
-                                      vec_size, vec_size);
+  * do the more restrictive/complex check inside a helper function.
-                     } else { /* VSHL */
+  */
--                        /* Shifts larger than the element size are
+-#define PL3_R 0x80
--                         * architecturally valid and results in zero.
+-#define PL3_W 0x40
--                         */
+-#define PL2_R (0x20 | PL3_R)
--                        if (shift >= 8 << size) {
+-#define PL2_W (0x10 | PL3_W)
--                            tcg_gen_gvec_dup_imm(size, rd_ofs,
+-#define PL1_R (0x08 | PL2_R)
--                                                 vec_size, vec_size, 0);
+-#define PL1_W (0x04 | PL2_W)
--                        } else {
+-#define PL0_R (0x02 | PL1_R)
--                            tcg_gen_gvec_shli(size, rd_ofs, rm_ofs, shift,
+-#define PL0_W (0x01 | PL1_W)
--                                              vec_size, vec_size);
++typedef enum {
--                        }
++    PL3_R = 0x80,
-+                        tcg_gen_gvec_shli(size, rd_ofs, rm_ofs, shift,
++    PL3_W = 0x40,
-+                                          vec_size, vec_size);
++    PL2_R = 0x20 | PL3_R,
-                     }
++    PL2_W = 0x10 | PL3_W,
-                     return 0;
++    PL1_R = 0x08 | PL2_R,
-                 }
++    PL1_W = 0x04 | PL2_W,
 +    PL0_R = 0x02 | PL1_R,
 +    PL0_W = 0x01 | PL1_W,
 -/*
 - * For user-mode some registers are accessible to EL0 via a kernel
 - * trap-and-emulate ABI. In this case we define the read permissions
 - * as actually being PL0_R. However some bits of any given register
 - * may still be masked.
 - */
 +    /*
 +     * For user-mode some registers are accessible to EL0 via a kernel
 +     * trap-and-emulate ABI. In this case we define the read permissions
 +     * as actually being PL0_R. However some bits of any given register
 +     * may still be masked.
 +     */
  #ifdef CONFIG_USER_ONLY
 -#define PL0U_R PL0_R
 +    PL0U_R = PL0_R,
  #else
 -#define PL0U_R PL1_R
 +    PL0U_R = PL1_R,
  #endif
 -#define PL3_RW (PL3_R | PL3_W)
 -#define PL2_RW (PL2_R | PL2_W)
 -#define PL1_RW (PL1_R | PL1_W)
 -#define PL0_RW (PL0_R | PL0_W)
 +    PL3_RW = PL3_R | PL3_W,
 +    PL2_RW = PL2_R | PL2_W,
 +    PL1_RW = PL1_R | PL1_W,
 +    PL0_RW = PL0_R | PL0_W,
 +} CPAccessRights;
  typedef enum CPAccessResult {
      /* Access is permitted */
@@ -XXX,XX +XXX,XX @@ struct ARMCPRegInfo {
      /* Register type: ARM_CP_* bits/values */
      int type;
      /* Access rights: PL*_[RW] */
 -    int access;
 +    CPAccessRights access;
      /* Security state: ARM_CP_SECSTATE_* bits/values */
      int secure;
      /*
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.c
 +++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu,
       * to encompass the generic architectural permission check.
       */
      if (r->state != ARM_CP_STATE_AA32) {
 -        int mask = 0;
 +        CPAccessRights mask;
          switch (r->opc1) {
          case 0:
              /* min_EL EL1, but some accessible to EL0 via kernel ABI */
 --
-.20.1
+.25.1

-[PULL 15/45] target/arm: Clear tail in gvec_fmul_idx_*, gvec_fmla_idx_*
+[PULL 09/23] target/arm: Name CPState type
 From: Richard Henderson <richard.henderson@linaro.org>
-Must clear the tail for AdvSIMD when SVE is enabled.
+Give this enum a name and use in ARMCPRegInfo,
 add_cpreg_to_hashtable and define_one_arm_cp_reg_with_opaque.
-Fixes: ca40a6e6e39
+Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
 Cc: qemu-stable@nongnu.org
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200513163245.17915-15-richard.henderson@linaro.org
+Message-id: 20220501055028.646596-9-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/vec_helper.c | 2 ++
+ target/arm/cpregs.h | 6 +++---
-file changed, 2 insertions(+)
+ target/arm/helper.c | 6 ++++--
 files changed, 7 insertions(+), 5 deletions(-)
-diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
+diff --git a/target/arm/cpregs.h b/target/arm/cpregs.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/vec_helper.c
+--- a/target/arm/cpregs.h
-+++ b/target/arm/vec_helper.c
++++ b/target/arm/cpregs.h
-@@ -XXX,XX +XXX,XX @@ void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
+@@ -XXX,XX +XXX,XX @@ enum {
-             d[i + j] = TYPE##_mul(n[i + j], mm, stat);                     \
+  * Note that we rely on the values of these enums as we iterate through
-         }                                                                  \
+  * the various states in some places.
-     }                                                                      \
+  */
-+    clear_tail(d, oprsz, simd_maxsz(desc));                                \
+-enum {
 +typedef enum {
      ARM_CP_STATE_AA32 = 0,
      ARM_CP_STATE_AA64 = 1,
      ARM_CP_STATE_BOTH = 2,
 -};
 +} CPState;
  /*
   * ARM CP register secure state flags.  These flags identify security state
@@ -XXX,XX +XXX,XX @@ struct ARMCPRegInfo {
      uint8_t opc1;
      uint8_t opc2;
      /* Execution state in which this register is visible: ARM_CP_STATE_* */
 -    int state;
 +    CPState state;
      /* Register type: ARM_CP_* bits/values */
      int type;
      /* Access rights: PL*_[RW] */
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.c
 +++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ CpuDefinitionInfoList *qmp_query_cpu_definitions(Error **errp)
  }
- DO_MUL_IDX(gvec_fmul_idx_h, float16, H2)
+ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
-@@ -XXX,XX +XXX,XX @@ void HELPER(NAME)(void *vd, void *vn, void *vm, void *va,                  \
+-                                   void *opaque, int state, int secstate,
-                                      mm, a[i + j], 0, stat);               \
++                                   void *opaque, CPState state, int secstate,
-         }                                                                  \
+                                    int crm, int opc1, int opc2,
-     }                                                                      \
+                                    const char *name)
-+    clear_tail(d, oprsz, simd_maxsz(desc));                                \
+ {
- }
+@@ -XXX,XX +XXX,XX @@ void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu,
+      * bits; the ARM_CP_64BIT* flag applies only to the AArch32 view of
- DO_FMLA_IDX(gvec_fmla_idx_h, float16, H2)
+      * the register, if any.
       */
 -    int crm, opc1, opc2, state;
 +    int crm, opc1, opc2;
      int crmmin = (r->crm == CP_ANY) ? 0 : r->crm;
      int crmmax = (r->crm == CP_ANY) ? 15 : r->crm;
      int opc1min = (r->opc1 == CP_ANY) ? 0 : r->opc1;
      int opc1max = (r->opc1 == CP_ANY) ? 7 : r->opc1;
      int opc2min = (r->opc2 == CP_ANY) ? 0 : r->opc2;
      int opc2max = (r->opc2 == CP_ANY) ? 7 : r->opc2;
 +    CPState state;
 +
      /* 64 bit registers have only CRm and Opc1 fields */
      assert(!((r->type & ARM_CP_64BIT) && (r->opc2 || r->crn)));
      /* op0 only exists in the AArch64 encodings */
 --
-.20.1
+.25.1

-[PULL 08/45] target/arm: Create gen_gvec_{mla,mls}
+[PULL 10/23] target/arm: Name CPSecureState type
 From: Richard Henderson <richard.henderson@linaro.org>
-Provide a functional interface for the vector expansion.
+Give this enum a name and use in ARMCPRegInfo and add_cpreg_to_hashtable.
-This fits better with the existing set of helpers that
+Add the enumerator ARM_CP_SECSTATE_BOTH to clarify how 0
-we provide for other operations.
+is handled in define_one_arm_cp_reg_with_opaque.
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200513163245.17915-8-richard.henderson@linaro.org
+Message-id: 20220501055028.646596-10-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate.h          |   7 +-
+ target/arm/cpregs.h | 7 ++++---
- target/arm/translate-a64.c      |   4 +-
+ target/arm/helper.c | 7 +++++--
- target/arm/translate-neon.inc.c |  16 +----
+files changed, 9 insertions(+), 5 deletions(-)
  target/arm/translate.c          | 117 +++++++++++++++++---------------
 files changed, 71 insertions(+), 73 deletions(-)
-diff --git a/target/arm/translate.h b/target/arm/translate.h
+diff --git a/target/arm/cpregs.h b/target/arm/cpregs.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.h
+--- a/target/arm/cpregs.h
-+++ b/target/arm/translate.h
++++ b/target/arm/cpregs.h
-@@ -XXX,XX +XXX,XX @@ void gen_gvec_cle0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+@@ -XXX,XX +XXX,XX @@ typedef enum {
- void gen_gvec_cge0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+  * registered entry will only have one to identify whether the entry is secure
-                    uint32_t opr_sz, uint32_t max_sz);
+  * or non-secure.
+  */
--extern const GVecGen3 mla_op[4];
+-enum {
--extern const GVecGen3 mls_op[4];
++typedef enum {
-+void gen_gvec_mla(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
++    ARM_CP_SECSTATE_BOTH = 0,       /* define one cpreg for each secstate */
-+                  uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+     ARM_CP_SECSTATE_S =   (1 << 0), /* bit[0]: Secure state register */
-+void gen_gvec_mls(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+     ARM_CP_SECSTATE_NS =  (1 << 1), /* bit[1]: Non-secure state register */
-+                  uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+-};
-+
++} CPSecureState;
- extern const GVecGen3 cmtst_op[4];
- extern const GVecGen3 sshl_op[4];
+ /*
- extern const GVecGen3 ushl_op[4];
+  * Access rights:
-diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
+@@ -XXX,XX +XXX,XX @@ struct ARMCPRegInfo {
      /* Access rights: PL*_[RW] */
      CPAccessRights access;
      /* Security state: ARM_CP_SECSTATE_* bits/values */
 -    int secure;
 +    CPSecureState secure;
      /*
       * The opaque pointer passed to define_arm_cp_regs_with_opaque() when
       * this register was defined: can be used to hand data through to the
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-a64.c
+--- a/target/arm/helper.c
-+++ b/target/arm/translate-a64.c
++++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ CpuDefinitionInfoList *qmp_query_cpu_definitions(Error **errp)
          return;
      case 0x12: /* MLA, MLS */
          if (u) {
 -            gen_gvec_op3(s, is_q, rd, rn, rm, &mls_op[size]);
 +            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_mls, size);
          } else {
 -            gen_gvec_op3(s, is_q, rd, rn, rm, &mla_op[size]);
 +            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_mla, size);
          }
          return;
      case 0x11:
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_3SAME_NO_SZ_3(VMAX_U, tcg_gen_gvec_umax)
  DO_3SAME_NO_SZ_3(VMIN_S, tcg_gen_gvec_smin)
  DO_3SAME_NO_SZ_3(VMIN_U, tcg_gen_gvec_umin)
  DO_3SAME_NO_SZ_3(VMUL, tcg_gen_gvec_mul)
 +DO_3SAME_NO_SZ_3(VMLA, gen_gvec_mla)
 +DO_3SAME_NO_SZ_3(VMLS, gen_gvec_mls)
  #define DO_3SAME_CMP(INSN, COND)                                        \
      static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
@@ -XXX,XX +XXX,XX @@ static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
      return do_3same(s, a, gen_VMUL_p_3s);
  }
--#define DO_3SAME_GVEC3_NO_SZ_3(INSN, OPARRAY)                           \
+ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
--    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
+-                                   void *opaque, CPState state, int secstate,
--                                uint32_t rn_ofs, uint32_t rm_ofs,       \
++                                   void *opaque, CPState state,
--                                uint32_t oprsz, uint32_t maxsz)         \
++                                   CPSecureState secstate,
--    {                                                                   \
+                                    int crm, int opc1, int opc2,
--        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,                          \
+                                    const char *name)
--                       oprsz, maxsz, &OPARRAY[vece]);                   \
+ {
--    }                                                                   \
+@@ -XXX,XX +XXX,XX @@ void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu,
--    DO_3SAME_NO_SZ_3(INSN, gen_##INSN##_3s)
+                                                    r->secure, crm, opc1, opc2,
--
+                                                    r->name);
--
+                             break;
--DO_3SAME_GVEC3_NO_SZ_3(VMLA, mla_op)
+-                        default:
--DO_3SAME_GVEC3_NO_SZ_3(VMLS, mls_op)
++                        case ARM_CP_SECSTATE_BOTH:
--
+                             name = g_strdup_printf("%s_S", r->name);
- #define DO_3SAME_GVEC3_SHIFT(INSN, OPARRAY)                             \
+                             add_cpreg_to_hashtable(cpu, r, opaque, state,
-     static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
+                                                    ARM_CP_SECSTATE_S,
-                                 uint32_t rn_ofs, uint32_t rm_ofs,       \
+@@ -XXX,XX +XXX,XX @@ void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu,
-diff --git a/target/arm/translate.c b/target/arm/translate.c
+                                                    ARM_CP_SECSTATE_NS,
-index XXXXXXX..XXXXXXX 100644
+                                                    crm, opc1, opc2, r->name);
---- a/target/arm/translate.c
+                             break;
-+++ b/target/arm/translate.c
++                        default:
-@@ -XXX,XX +XXX,XX @@ static void gen_mls_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
++                            g_assert_not_reached();
- /* Note that while NEON does not support VMLA and VMLS as 64-bit ops,
+                         }
-  * these tables are shared with AArch64 which does support them.
+                     } else {
-  */
+                         /* AArch64 registers get mapped to non-secure instance
 +void gen_gvec_mla(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                  uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = {
 +        INDEX_op_mul_vec, INDEX_op_add_vec, 0
 +    };
 +    static const GVecGen3 ops[4] = {
 +        { .fni4 = gen_mla8_i32,
 +          .fniv = gen_mla_vec,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_8 },
 +        { .fni4 = gen_mla16_i32,
 +          .fniv = gen_mla_vec,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_16 },
 +        { .fni4 = gen_mla32_i32,
 +          .fniv = gen_mla_vec,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_32 },
 +        { .fni8 = gen_mla64_i64,
 +          .fniv = gen_mla_vec,
 +          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_64 },
 +    };
 +    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 +}
 -static const TCGOpcode vecop_list_mla[] = {
 -    INDEX_op_mul_vec, INDEX_op_add_vec, 0
 -};
 -
 -static const TCGOpcode vecop_list_mls[] = {
 -    INDEX_op_mul_vec, INDEX_op_sub_vec, 0
 -};
 -
 -const GVecGen3 mla_op[4] = {
 -    { .fni4 = gen_mla8_i32,
 -      .fniv = gen_mla_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_mla,
 -      .vece = MO_8 },
 -    { .fni4 = gen_mla16_i32,
 -      .fniv = gen_mla_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_mla,
 -      .vece = MO_16 },
 -    { .fni4 = gen_mla32_i32,
 -      .fniv = gen_mla_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_mla,
 -      .vece = MO_32 },
 -    { .fni8 = gen_mla64_i64,
 -      .fniv = gen_mla_vec,
 -      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_mla,
 -      .vece = MO_64 },
 -};
 -
 -const GVecGen3 mls_op[4] = {
 -    { .fni4 = gen_mls8_i32,
 -      .fniv = gen_mls_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_mls,
 -      .vece = MO_8 },
 -    { .fni4 = gen_mls16_i32,
 -      .fniv = gen_mls_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_mls,
 -      .vece = MO_16 },
 -    { .fni4 = gen_mls32_i32,
 -      .fniv = gen_mls_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_mls,
 -      .vece = MO_32 },
 -    { .fni8 = gen_mls64_i64,
 -      .fniv = gen_mls_vec,
 -      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_mls,
 -      .vece = MO_64 },
 -};
 +void gen_gvec_mls(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                  uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = {
 +        INDEX_op_mul_vec, INDEX_op_sub_vec, 0
 +    };
 +    static const GVecGen3 ops[4] = {
 +        { .fni4 = gen_mls8_i32,
 +          .fniv = gen_mls_vec,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_8 },
 +        { .fni4 = gen_mls16_i32,
 +          .fniv = gen_mls_vec,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_16 },
 +        { .fni4 = gen_mls32_i32,
 +          .fniv = gen_mls_vec,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_32 },
 +        { .fni8 = gen_mls64_i64,
 +          .fniv = gen_mls_vec,
 +          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_64 },
 +    };
 +    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 +}
  /* CMTST : test is "if (X & Y != 0)". */
  static void gen_cmtst_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
 --
-.20.1
+.25.1

-[PULL 28/45] MAINTAINERS: Add ACPI/HEST/GHES entries
+[PULL 11/23] target/arm: Drop always-true test in define_arm_vh_e2h_redirects_aliases
-From: Dongjiu Geng <gengdongjiu@huawei.com>
+From: Richard Henderson <richard.henderson@linaro.org>
-I and Xiang are willing to review the APEI-related patches and
+The new_key field is always non-zero -- drop the if.
 volunteer as the reviewers for the HEST/GHES part.
-Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
+Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+Message-id: 20220501055028.646596-11-richard.henderson@linaro.org
-Acked-by: Michael S. Tsirkin <mst@redhat.com>
+[PMM: reinstated dropped PL3_RW mask]
 Message-id: 20200512030609.19593-11-gengdongjiu@huawei.com
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- MAINTAINERS | 9 +++++++++
+ target/arm/helper.c | 23 +++++++++++------------
-file changed, 9 insertions(+)
+file changed, 11 insertions(+), 12 deletions(-)
-diff --git a/MAINTAINERS b/MAINTAINERS
+diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
---- a/MAINTAINERS
+--- a/target/arm/helper.c
-+++ b/MAINTAINERS
++++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ F: tests/qtest/bios-tables-test.c
+@@ -XXX,XX +XXX,XX @@ static void define_arm_vh_e2h_redirects_aliases(ARMCPU *cpu)
- F: tests/qtest/acpi-utils.[hc]
- F: tests/data/acpi/
+     for (i = 0; i < ARRAY_SIZE(aliases); i++) {
+         const struct E2HAlias *a = &aliases[i];
-+ACPI/HEST/GHES
+-        ARMCPRegInfo *src_reg, *dst_reg;
-+R: Dongjiu Geng <gengdongjiu@huawei.com>
++        ARMCPRegInfo *src_reg, *dst_reg, *new_reg;
-+R: Xiang Zheng <zhengxiang9@huawei.com>
++        uint32_t *new_key;
-+L: qemu-arm@nongnu.org
++        bool ok;
-+S: Maintained
-+F: hw/acpi/ghes.c
+         if (a->feature && !a->feature(&cpu->isar)) {
-+F: include/hw/acpi/ghes.h
+             continue;
-+F: docs/specs/acpi_hest_ghes.rst
+@@ -XXX,XX +XXX,XX @@ static void define_arm_vh_e2h_redirects_aliases(ARMCPU *cpu)
-+
+         g_assert(src_reg->opaque == NULL);
- ppc4xx
- M: David Gibson <david@gibson.dropbear.id.au>
+         /* Create alias before redirection so we dup the right data. */
- L: qemu-ppc@nongnu.org
+-        if (a->new_key) {
 -            ARMCPRegInfo *new_reg = g_memdup(src_reg, sizeof(ARMCPRegInfo));
 -            uint32_t *new_key = g_memdup(&a->new_key, sizeof(uint32_t));
 -            bool ok;
 +        new_reg = g_memdup(src_reg, sizeof(ARMCPRegInfo));
 +        new_key = g_memdup(&a->new_key, sizeof(uint32_t));
 -            new_reg->name = a->new_name;
 -            new_reg->type |= ARM_CP_ALIAS;
 -            /* Remove PL1/PL0 access, leaving PL2/PL3 R/W in place.  */
 -            new_reg->access &= PL2_RW | PL3_RW;
 +        new_reg->name = a->new_name;
 +        new_reg->type |= ARM_CP_ALIAS;
 +        /* Remove PL1/PL0 access, leaving PL2/PL3 R/W in place.  */
 +        new_reg->access &= PL2_RW | PL3_RW;
 -            ok = g_hash_table_insert(cpu->cp_regs, new_key, new_reg);
 -            g_assert(ok);
 -        }
 +        ok = g_hash_table_insert(cpu->cp_regs, new_key, new_reg);
 +        g_assert(ok);
          src_reg->opaque = dst_reg;
          src_reg->orig_readfn = src_reg->readfn ?: raw_read;
 --
-.20.1
+.25.1

-[PULL 11/45] target/arm: Create gen_gvec_{uqadd, sqadd, uqsub, sqsub}
+[PULL 12/23] target/arm: Store cpregs key in the hash table directly
 From: Richard Henderson <richard.henderson@linaro.org>
-Provide a functional interface for the vector expansion.
+Cast the uint32_t key into a gpointer directly, which
-This fits better with the existing set of helpers that
+allows us to avoid allocating storage for each key.
 we provide for other operations.
+Use g_hash_table_lookup when we already have a gpointer
+(e.g. for callbacks like count_cpreg), or when using
+get_arm_cp_reginfo would require casting away const.
+Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20220501055028.646596-12-richard.henderson@linaro.org
 Message-id: 20200513163245.17915-11-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate.h          |  13 +-
+ target/arm/cpu.c     |  4 ++--
- target/arm/translate-a64.c      |  22 ++-
+ target/arm/gdbstub.c |  2 +-
- target/arm/translate-neon.inc.c |  19 +--
+ target/arm/helper.c  | 41 ++++++++++++++++++-----------------------
- target/arm/translate.c          | 228 +++++++++++++++++---------------
+files changed, 21 insertions(+), 26 deletions(-)
 files changed, 147 insertions(+), 135 deletions(-)
-diff --git a/target/arm/translate.h b/target/arm/translate.h
+diff --git a/target/arm/cpu.c b/target/arm/cpu.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.h
+--- a/target/arm/cpu.c
-+++ b/target/arm/translate.h
++++ b/target/arm/cpu.c
-@@ -XXX,XX +XXX,XX @@ void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+@@ -XXX,XX +XXX,XX @@ static void arm_cpu_initfn(Object *obj)
- void gen_gvec_ushl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+     ARMCPU *cpu = ARM_CPU(obj);
-                    uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+     cpu_set_cpustate_pointers(cpu);
--extern const GVecGen4 uqadd_op[4];
+-    cpu->cp_regs = g_hash_table_new_full(g_int_hash, g_int_equal,
--extern const GVecGen4 sqadd_op[4];
+-                                         g_free, cpreg_hashtable_data_destroy);
--extern const GVecGen4 uqsub_op[4];
++    cpu->cp_regs = g_hash_table_new_full(g_direct_hash, g_direct_equal,
--extern const GVecGen4 sqsub_op[4];
++                                         NULL, cpreg_hashtable_data_destroy);
- void gen_cmtst_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
- void gen_ushl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b);
+     QLIST_INIT(&cpu->pre_el_change_hooks);
- void gen_sshl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b);
+     QLIST_INIT(&cpu->el_change_hooks);
- void gen_ushl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
+diff --git a/target/arm/gdbstub.c b/target/arm/gdbstub.c
  void gen_sshl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
 +void gen_gvec_uqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 +void gen_gvec_sqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 +void gen_gvec_uqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 +void gen_gvec_sqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 +
  void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
                     int64_t shift, uint32_t opr_sz, uint32_t max_sz);
  void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-a64.c
+--- a/target/arm/gdbstub.c
-+++ b/target/arm/translate-a64.c
++++ b/target/arm/gdbstub.c
-@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static void arm_gen_one_xml_sysreg_tag(GString *s, DynamicGDBXMLInfo *dyn_xml,
+ static void arm_register_sysreg_for_xml(gpointer key, gpointer value,
-     switch (opcode) {
+                                         gpointer p)
-     case 0x01: /* SQADD, UQADD */
+ {
--        tcg_gen_gvec_4(vec_full_reg_offset(s, rd),
+-    uint32_t ri_key = *(uint32_t *)key;
--                       offsetof(CPUARMState, vfp.qc),
++    uint32_t ri_key = (uintptr_t)key;
--                       vec_full_reg_offset(s, rn),
+     ARMCPRegInfo *ri = value;
--                       vec_full_reg_offset(s, rm),
+     RegisterSysregXmlParam *param = (RegisterSysregXmlParam *)p;
--                       is_q ? 16 : 8, vec_full_reg_size(s),
+     GString *s = param->s;
--                       (u ? uqadd_op : sqadd_op) + size);
+diff --git a/target/arm/helper.c b/target/arm/helper.c
 +        if (u) {
 +            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uqadd_qc, size);
 +        } else {
 +            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqadd_qc, size);
 +        }
          return;
      case 0x05: /* SQSUB, UQSUB */
 -        tcg_gen_gvec_4(vec_full_reg_offset(s, rd),
 -                       offsetof(CPUARMState, vfp.qc),
 -                       vec_full_reg_offset(s, rn),
 -                       vec_full_reg_offset(s, rm),
 -                       is_q ? 16 : 8, vec_full_reg_size(s),
 -                       (u ? uqsub_op : sqsub_op) + size);
 +        if (u) {
 +            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uqsub_qc, size);
 +        } else {
 +            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqsub_qc, size);
 +        }
          return;
      case 0x08: /* SSHL, USHL */
          if (u) {
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-neon.inc.c
+--- a/target/arm/helper.c
-+++ b/target/arm/translate-neon.inc.c
++++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ DO_3SAME(VORN, tcg_gen_gvec_orc)
+@@ -XXX,XX +XXX,XX @@ bool write_list_to_cpustate(ARMCPU *cpu)
- DO_3SAME(VEOR, tcg_gen_gvec_xor)
+ static void add_cpreg_to_list(gpointer key, gpointer opaque)
- DO_3SAME(VSHL_S, gen_gvec_sshl)
+ {
- DO_3SAME(VSHL_U, gen_gvec_ushl)
+     ARMCPU *cpu = opaque;
-+DO_3SAME(VQADD_S, gen_gvec_sqadd_qc)
+-    uint64_t regidx;
-+DO_3SAME(VQADD_U, gen_gvec_uqadd_qc)
+-    const ARMCPRegInfo *ri;
 +DO_3SAME(VQSUB_S, gen_gvec_sqsub_qc)
 +DO_3SAME(VQSUB_U, gen_gvec_uqsub_qc)
  /* These insns are all gvec_bitsel but with the inputs in various orders. */
  #define DO_3SAME_BITSEL(INSN, O1, O2, O3)                               \
@@ -XXX,XX +XXX,XX @@ DO_3SAME_CMP(VCGE_S, TCG_COND_GE)
  DO_3SAME_CMP(VCGE_U, TCG_COND_GEU)
  DO_3SAME_CMP(VCEQ, TCG_COND_EQ)
 -#define DO_3SAME_GVEC4(INSN, OPARRAY)                                   \
 -    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
 -                                uint32_t rn_ofs, uint32_t rm_ofs,       \
 -                                uint32_t oprsz, uint32_t maxsz)         \
 -    {                                                                   \
 -        tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),           \
 -                       rn_ofs, rm_ofs, oprsz, maxsz, &OPARRAY[vece]);   \
 -    }                                                                   \
 -    DO_3SAME(INSN, gen_##INSN##_3s)
 -
--DO_3SAME_GVEC4(VQADD_S, sqadd_op)
+-    regidx = *(uint32_t *)key;
--DO_3SAME_GVEC4(VQADD_U, uqadd_op)
+-    ri = get_arm_cp_reginfo(cpu->cp_regs, regidx);
--DO_3SAME_GVEC4(VQSUB_S, sqsub_op)
++    uint32_t regidx = (uintptr_t)key;
--DO_3SAME_GVEC4(VQSUB_U, uqsub_op)
++    const ARMCPRegInfo *ri = get_arm_cp_reginfo(cpu->cp_regs, regidx);
--
- static void gen_VMUL_p_3s(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+     if (!(ri->type & (ARM_CP_NO_RAW|ARM_CP_ALIAS))) {
-                            uint32_t rm_ofs, uint32_t oprsz, uint32_t maxsz)
+         cpu->cpreg_indexes[cpu->cpreg_array_len] = cpreg_to_kvm_id(regidx);
@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_list(gpointer key, gpointer opaque)
  static void count_cpreg(gpointer key, gpointer opaque)
  {
-diff --git a/target/arm/translate.c b/target/arm/translate.c
+     ARMCPU *cpu = opaque;
-index XXXXXXX..XXXXXXX 100644
+-    uint64_t regidx;
---- a/target/arm/translate.c
+     const ARMCPRegInfo *ri;
-+++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static void gen_uqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
+-    regidx = *(uint32_t *)key;
-     tcg_temp_free_vec(x);
+-    ri = get_arm_cp_reginfo(cpu->cp_regs, regidx);
 +    ri = g_hash_table_lookup(cpu->cp_regs, key);
      if (!(ri->type & (ARM_CP_NO_RAW|ARM_CP_ALIAS))) {
          cpu->cpreg_array_len++;
@@ -XXX,XX +XXX,XX @@ static void count_cpreg(gpointer key, gpointer opaque)
  static gint cpreg_key_compare(gconstpointer a, gconstpointer b)
  {
 -    uint64_t aidx = cpreg_to_kvm_id(*(uint32_t *)a);
 -    uint64_t bidx = cpreg_to_kvm_id(*(uint32_t *)b);
 +    uint64_t aidx = cpreg_to_kvm_id((uintptr_t)a);
 +    uint64_t bidx = cpreg_to_kvm_id((uintptr_t)b);
      if (aidx > bidx) {
          return 1;
@@ -XXX,XX +XXX,XX @@ static void define_arm_vh_e2h_redirects_aliases(ARMCPU *cpu)
      for (i = 0; i < ARRAY_SIZE(aliases); i++) {
          const struct E2HAlias *a = &aliases[i];
          ARMCPRegInfo *src_reg, *dst_reg, *new_reg;
 -        uint32_t *new_key;
          bool ok;
          if (a->feature && !a->feature(&cpu->isar)) {
              continue;
          }
 -        src_reg = g_hash_table_lookup(cpu->cp_regs, &a->src_key);
 -        dst_reg = g_hash_table_lookup(cpu->cp_regs, &a->dst_key);
 +        src_reg = g_hash_table_lookup(cpu->cp_regs,
 +                                      (gpointer)(uintptr_t)a->src_key);
 +        dst_reg = g_hash_table_lookup(cpu->cp_regs,
 +                                      (gpointer)(uintptr_t)a->dst_key);
          g_assert(src_reg != NULL);
          g_assert(dst_reg != NULL);
@@ -XXX,XX +XXX,XX @@ static void define_arm_vh_e2h_redirects_aliases(ARMCPU *cpu)
          /* Create alias before redirection so we dup the right data. */
          new_reg = g_memdup(src_reg, sizeof(ARMCPRegInfo));
 -        new_key = g_memdup(&a->new_key, sizeof(uint32_t));
          new_reg->name = a->new_name;
          new_reg->type |= ARM_CP_ALIAS;
          /* Remove PL1/PL0 access, leaving PL2/PL3 R/W in place.  */
          new_reg->access &= PL2_RW | PL3_RW;
 -        ok = g_hash_table_insert(cpu->cp_regs, new_key, new_reg);
 +        ok = g_hash_table_insert(cpu->cp_regs,
 +                                 (gpointer)(uintptr_t)a->new_key, new_reg);
          g_assert(ok);
          src_reg->opaque = dst_reg;
@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
      /* Private utility function for define_one_arm_cp_reg_with_opaque():
       * add a single reginfo struct to the hash table.
       */
 -    uint32_t *key = g_new(uint32_t, 1);
 +    uint32_t key;
      ARMCPRegInfo *r2 = g_memdup(r, sizeof(ARMCPRegInfo));
      int is64 = (r->type & ARM_CP_64BIT) ? 1 : 0;
      int ns = (secstate & ARM_CP_SECSTATE_NS) ? 1 : 0;
@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
          if (r->cp == 0 || r->state == ARM_CP_STATE_BOTH) {
              r2->cp = CP_REG_ARM64_SYSREG_CP;
          }
 -        *key = ENCODE_AA64_CP_REG(r2->cp, r2->crn, crm,
 -                                  r2->opc0, opc1, opc2);
 +        key = ENCODE_AA64_CP_REG(r2->cp, r2->crn, crm,
 +                                 r2->opc0, opc1, opc2);
      } else {
 -        *key = ENCODE_CP_REG(r2->cp, is64, ns, r2->crn, crm, opc1, opc2);
 +        key = ENCODE_CP_REG(r2->cp, is64, ns, r2->crn, crm, opc1, opc2);
      }
      if (opaque) {
          r2->opaque = opaque;
@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
       * requested.
       */
      if (!(r->type & ARM_CP_OVERRIDE)) {
 -        ARMCPRegInfo *oldreg;
 -        oldreg = g_hash_table_lookup(cpu->cp_regs, key);
 +        const ARMCPRegInfo *oldreg = get_arm_cp_reginfo(cpu->cp_regs, key);
          if (oldreg && !(oldreg->type & ARM_CP_OVERRIDE)) {
              fprintf(stderr, "Register redefined: cp=%d %d bit "
                      "crn=%d crm=%d opc1=%d opc2=%d, "
@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
              g_assert_not_reached();
          }
      }
 -    g_hash_table_insert(cpu->cp_regs, key, r2);
 +    g_hash_table_insert(cpu->cp_regs, (gpointer)(uintptr_t)key, r2);
  }
--static const TCGOpcode vecop_list_uqadd[] = {
--    INDEX_op_usadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0
+@@ -XXX,XX +XXX,XX @@ void modify_arm_cp_regs_with_len(ARMCPRegInfo *regs, size_t regs_len,
--};
--
+ const ARMCPRegInfo *get_arm_cp_reginfo(GHashTable *cpregs, uint32_t encoded_cp)
--const GVecGen4 uqadd_op[4] = {
+ {
--    { .fniv = gen_uqadd_vec,
+-    return g_hash_table_lookup(cpregs, &encoded_cp);
--      .fno = gen_helper_gvec_uqadd_b,
++    return g_hash_table_lookup(cpregs, (gpointer)(uintptr_t)encoded_cp);
 -      .write_aofs = true,
 -      .opt_opc = vecop_list_uqadd,
 -      .vece = MO_8 },
 -    { .fniv = gen_uqadd_vec,
 -      .fno = gen_helper_gvec_uqadd_h,
 -      .write_aofs = true,
 -      .opt_opc = vecop_list_uqadd,
 -      .vece = MO_16 },
 -    { .fniv = gen_uqadd_vec,
 -      .fno = gen_helper_gvec_uqadd_s,
 -      .write_aofs = true,
 -      .opt_opc = vecop_list_uqadd,
 -      .vece = MO_32 },
 -    { .fniv = gen_uqadd_vec,
 -      .fno = gen_helper_gvec_uqadd_d,
 -      .write_aofs = true,
 -      .opt_opc = vecop_list_uqadd,
 -      .vece = MO_64 },
 -};
 +void gen_gvec_uqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = {
 +        INDEX_op_usadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0
 +    };
 +    static const GVecGen4 ops[4] = {
 +        { .fniv = gen_uqadd_vec,
 +          .fno = gen_helper_gvec_uqadd_b,
 +          .write_aofs = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_8 },
 +        { .fniv = gen_uqadd_vec,
 +          .fno = gen_helper_gvec_uqadd_h,
 +          .write_aofs = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_16 },
 +        { .fniv = gen_uqadd_vec,
 +          .fno = gen_helper_gvec_uqadd_s,
 +          .write_aofs = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_32 },
 +        { .fniv = gen_uqadd_vec,
 +          .fno = gen_helper_gvec_uqadd_d,
 +          .write_aofs = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_64 },
 +    };
 +    tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
 +                   rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 +}
  static void gen_sqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
                            TCGv_vec a, TCGv_vec b)
@@ -XXX,XX +XXX,XX @@ static void gen_sqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
      tcg_temp_free_vec(x);
  }
--static const TCGOpcode vecop_list_sqadd[] = {
+ void arm_cp_write_ignore(CPUARMState *env, const ARMCPRegInfo *ri,
 -    INDEX_op_ssadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0
 -};
 -
 -const GVecGen4 sqadd_op[4] = {
 -    { .fniv = gen_sqadd_vec,
 -      .fno = gen_helper_gvec_sqadd_b,
 -      .opt_opc = vecop_list_sqadd,
 -      .write_aofs = true,
 -      .vece = MO_8 },
 -    { .fniv = gen_sqadd_vec,
 -      .fno = gen_helper_gvec_sqadd_h,
 -      .opt_opc = vecop_list_sqadd,
 -      .write_aofs = true,
 -      .vece = MO_16 },
 -    { .fniv = gen_sqadd_vec,
 -      .fno = gen_helper_gvec_sqadd_s,
 -      .opt_opc = vecop_list_sqadd,
 -      .write_aofs = true,
 -      .vece = MO_32 },
 -    { .fniv = gen_sqadd_vec,
 -      .fno = gen_helper_gvec_sqadd_d,
 -      .opt_opc = vecop_list_sqadd,
 -      .write_aofs = true,
 -      .vece = MO_64 },
 -};
 +void gen_gvec_sqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = {
 +        INDEX_op_ssadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0
 +    };
 +    static const GVecGen4 ops[4] = {
 +        { .fniv = gen_sqadd_vec,
 +          .fno = gen_helper_gvec_sqadd_b,
 +          .opt_opc = vecop_list,
 +          .write_aofs = true,
 +          .vece = MO_8 },
 +        { .fniv = gen_sqadd_vec,
 +          .fno = gen_helper_gvec_sqadd_h,
 +          .opt_opc = vecop_list,
 +          .write_aofs = true,
 +          .vece = MO_16 },
 +        { .fniv = gen_sqadd_vec,
 +          .fno = gen_helper_gvec_sqadd_s,
 +          .opt_opc = vecop_list,
 +          .write_aofs = true,
 +          .vece = MO_32 },
 +        { .fniv = gen_sqadd_vec,
 +          .fno = gen_helper_gvec_sqadd_d,
 +          .opt_opc = vecop_list,
 +          .write_aofs = true,
 +          .vece = MO_64 },
 +    };
 +    tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
 +                   rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 +}
  static void gen_uqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
                            TCGv_vec a, TCGv_vec b)
@@ -XXX,XX +XXX,XX @@ static void gen_uqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
      tcg_temp_free_vec(x);
  }
 -static const TCGOpcode vecop_list_uqsub[] = {
 -    INDEX_op_ussub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0
 -};
 -
 -const GVecGen4 uqsub_op[4] = {
 -    { .fniv = gen_uqsub_vec,
 -      .fno = gen_helper_gvec_uqsub_b,
 -      .opt_opc = vecop_list_uqsub,
 -      .write_aofs = true,
 -      .vece = MO_8 },
 -    { .fniv = gen_uqsub_vec,
 -      .fno = gen_helper_gvec_uqsub_h,
 -      .opt_opc = vecop_list_uqsub,
 -      .write_aofs = true,
 -      .vece = MO_16 },
 -    { .fniv = gen_uqsub_vec,
 -      .fno = gen_helper_gvec_uqsub_s,
 -      .opt_opc = vecop_list_uqsub,
 -      .write_aofs = true,
 -      .vece = MO_32 },
 -    { .fniv = gen_uqsub_vec,
 -      .fno = gen_helper_gvec_uqsub_d,
 -      .opt_opc = vecop_list_uqsub,
 -      .write_aofs = true,
 -      .vece = MO_64 },
 -};
 +void gen_gvec_uqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = {
 +        INDEX_op_ussub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0
 +    };
 +    static const GVecGen4 ops[4] = {
 +        { .fniv = gen_uqsub_vec,
 +          .fno = gen_helper_gvec_uqsub_b,
 +          .opt_opc = vecop_list,
 +          .write_aofs = true,
 +          .vece = MO_8 },
 +        { .fniv = gen_uqsub_vec,
 +          .fno = gen_helper_gvec_uqsub_h,
 +          .opt_opc = vecop_list,
 +          .write_aofs = true,
 +          .vece = MO_16 },
 +        { .fniv = gen_uqsub_vec,
 +          .fno = gen_helper_gvec_uqsub_s,
 +          .opt_opc = vecop_list,
 +          .write_aofs = true,
 +          .vece = MO_32 },
 +        { .fniv = gen_uqsub_vec,
 +          .fno = gen_helper_gvec_uqsub_d,
 +          .opt_opc = vecop_list,
 +          .write_aofs = true,
 +          .vece = MO_64 },
 +    };
 +    tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
 +                   rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 +}
  static void gen_sqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
                            TCGv_vec a, TCGv_vec b)
@@ -XXX,XX +XXX,XX @@ static void gen_sqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
      tcg_temp_free_vec(x);
  }
 -static const TCGOpcode vecop_list_sqsub[] = {
 -    INDEX_op_sssub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0
 -};
 -
 -const GVecGen4 sqsub_op[4] = {
 -    { .fniv = gen_sqsub_vec,
 -      .fno = gen_helper_gvec_sqsub_b,
 -      .opt_opc = vecop_list_sqsub,
 -      .write_aofs = true,
 -      .vece = MO_8 },
 -    { .fniv = gen_sqsub_vec,
 -      .fno = gen_helper_gvec_sqsub_h,
 -      .opt_opc = vecop_list_sqsub,
 -      .write_aofs = true,
 -      .vece = MO_16 },
 -    { .fniv = gen_sqsub_vec,
 -      .fno = gen_helper_gvec_sqsub_s,
 -      .opt_opc = vecop_list_sqsub,
 -      .write_aofs = true,
 -      .vece = MO_32 },
 -    { .fniv = gen_sqsub_vec,
 -      .fno = gen_helper_gvec_sqsub_d,
 -      .opt_opc = vecop_list_sqsub,
 -      .write_aofs = true,
 -      .vece = MO_64 },
 -};
 +void gen_gvec_sqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = {
 +        INDEX_op_sssub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0
 +    };
 +    static const GVecGen4 ops[4] = {
 +        { .fniv = gen_sqsub_vec,
 +          .fno = gen_helper_gvec_sqsub_b,
 +          .opt_opc = vecop_list,
 +          .write_aofs = true,
 +          .vece = MO_8 },
 +        { .fniv = gen_sqsub_vec,
 +          .fno = gen_helper_gvec_sqsub_h,
 +          .opt_opc = vecop_list,
 +          .write_aofs = true,
 +          .vece = MO_16 },
 +        { .fniv = gen_sqsub_vec,
 +          .fno = gen_helper_gvec_sqsub_s,
 +          .opt_opc = vecop_list,
 +          .write_aofs = true,
 +          .vece = MO_32 },
 +        { .fniv = gen_sqsub_vec,
 +          .fno = gen_helper_gvec_sqsub_d,
 +          .opt_opc = vecop_list,
 +          .write_aofs = true,
 +          .vece = MO_64 },
 +    };
 +    tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
 +                   rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 +}
  /* Translate a NEON data processing instruction.  Return nonzero if the
     instruction is invalid.
 --
-.20.1
+.25.1

-[PULL 10/45] target/arm: Create gen_gvec_{cmtst,ushl,sshl}
+[PULL 13/23] target/arm: Merge allocation of the cpreg and its name
 From: Richard Henderson <richard.henderson@linaro.org>
-Provide a functional interface for the vector expansion.
+Simplify freeing cp_regs hash table entries by using a single
-This fits better with the existing set of helpers that
+allocation for the entire value.
 we provide for other operations.
+This fixes a theoretical bug if we were to ever free the entire
+hash table, because we've been installing string literal constants
+into the cpreg structure in define_arm_vh_e2h_redirects_aliases.
+However, at present we only free entries created for AArch32
+wildcard cpregs which get overwritten by more specific cpregs,
+so this bug is never exposed.
+Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20220501055028.646596-13-richard.henderson@linaro.org
 Message-id: 20200513163245.17915-10-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate.h          |  10 ++-
+ target/arm/cpu.c    | 16 +---------------
- target/arm/translate-a64.c      |  18 ++--
+ target/arm/helper.c | 10 ++++++++--
- target/arm/translate-neon.inc.c |  23 +----
+files changed, 9 insertions(+), 17 deletions(-)
  target/arm/translate.c          | 146 +++++++++++++++++---------------
 files changed, 95 insertions(+), 102 deletions(-)
-diff --git a/target/arm/translate.h b/target/arm/translate.h
+diff --git a/target/arm/cpu.c b/target/arm/cpu.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.h
+--- a/target/arm/cpu.c
-+++ b/target/arm/translate.h
++++ b/target/arm/cpu.c
-@@ -XXX,XX +XXX,XX @@ void gen_gvec_mla(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+@@ -XXX,XX +XXX,XX @@ uint64_t arm_cpu_mp_affinity(int idx, uint8_t clustersz)
- void gen_gvec_mls(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+     return (Aff1 << ARM_AFF1_SHIFT) | Aff0;
                    uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 -extern const GVecGen3 cmtst_op[4];
 -extern const GVecGen3 sshl_op[4];
 -extern const GVecGen3 ushl_op[4];
 +void gen_gvec_cmtst(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                    uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 +void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 +void gen_gvec_ushl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 +
  extern const GVecGen4 uqadd_op[4];
  extern const GVecGen4 sqadd_op[4];
  extern const GVecGen4 uqsub_op[4];
 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-a64.c
 +++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void gen_gvec_fn4(DisasContext *s, bool is_q, int rd, int rn, int rm,
              is_q ? 16 : 8, vec_full_reg_size(s));
  }
--/* Expand a 3-operand AdvSIMD vector operation using an op descriptor.  */
+-static void cpreg_hashtable_data_destroy(gpointer data)
 -static void gen_gvec_op3(DisasContext *s, bool is_q, int rd,
 -                         int rn, int rm, const GVecGen3 *gvec_op)
 -{
--    tcg_gen_gvec_3(vec_full_reg_offset(s, rd), vec_full_reg_offset(s, rn),
+-    /*
--                   vec_full_reg_offset(s, rm), is_q ? 16 : 8,
+-     * Destroy function for cpu->cp_regs hashtable data entries.
--                   vec_full_reg_size(s), gvec_op);
+-     * We must free the name string because it was g_strdup()ed in
 -     * add_cpreg_to_hashtable(). It's OK to cast away the 'const'
 -     * from r->name because we know we definitely allocated it.
 -     */
 -    ARMCPRegInfo *r = data;
 -
 -    g_free((void *)r->name);
 -    g_free(r);
 -}
 -
- /* Expand a 3-operand operation using an out-of-line helper.  */
+ static void arm_cpu_initfn(Object *obj)
- static void gen_gvec_op3_ool(DisasContext *s, bool is_q, int rd,
+ {
-                              int rn, int rm, int data, gen_helper_gvec_3 *fn)
+     ARMCPU *cpu = ARM_CPU(obj);
-@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
-                        (u ? uqsub_op : sqsub_op) + size);
+     cpu_set_cpustate_pointers(cpu);
-         return;
+     cpu->cp_regs = g_hash_table_new_full(g_direct_hash, g_direct_equal,
-     case 0x08: /* SSHL, USHL */
+-                                         NULL, cpreg_hashtable_data_destroy);
--        gen_gvec_op3(s, is_q, rd, rn, rm,
++                                         NULL, g_free);
--                     u ? &ushl_op[size] : &sshl_op[size]);
-+        if (u) {
+     QLIST_INIT(&cpu->pre_el_change_hooks);
-+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_ushl, size);
+     QLIST_INIT(&cpu->el_change_hooks);
-+        } else {
+diff --git a/target/arm/helper.c b/target/arm/helper.c
 +            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sshl, size);
 +        }
          return;
      case 0x0c: /* SMAX, UMAX */
          if (u) {
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
          return;
      case 0x11:
          if (!u) { /* CMTST */
 -            gen_gvec_op3(s, is_q, rd, rn, rm, &cmtst_op[size]);
 +            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_cmtst, size);
              return;
          }
          /* else CMEQ */
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-neon.inc.c
+--- a/target/arm/helper.c
-+++ b/target/arm/translate-neon.inc.c
++++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ DO_3SAME(VBIC, tcg_gen_gvec_andc)
+@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
- DO_3SAME(VORR, tcg_gen_gvec_or)
+      * add a single reginfo struct to the hash table.
- DO_3SAME(VORN, tcg_gen_gvec_orc)
+      */
- DO_3SAME(VEOR, tcg_gen_gvec_xor)
+     uint32_t key;
-+DO_3SAME(VSHL_S, gen_gvec_sshl)
+-    ARMCPRegInfo *r2 = g_memdup(r, sizeof(ARMCPRegInfo));
-+DO_3SAME(VSHL_U, gen_gvec_ushl)
++    ARMCPRegInfo *r2;
+     int is64 = (r->type & ARM_CP_64BIT) ? 1 : 0;
- /* These insns are all gvec_bitsel but with the inputs in various orders. */
+     int ns = (secstate & ARM_CP_SECSTATE_NS) ? 1 : 0;
- #define DO_3SAME_BITSEL(INSN, O1, O2, O3)                               \
++    size_t name_len;
-@@ -XXX,XX +XXX,XX @@ DO_3SAME_NO_SZ_3(VMIN_U, tcg_gen_gvec_umin)
++
- DO_3SAME_NO_SZ_3(VMUL, tcg_gen_gvec_mul)
++    /* Combine cpreg and name into one allocation. */
- DO_3SAME_NO_SZ_3(VMLA, gen_gvec_mla)
++    name_len = strlen(name) + 1;
- DO_3SAME_NO_SZ_3(VMLS, gen_gvec_mls)
++    r2 = g_malloc(sizeof(*r2) + name_len);
-+DO_3SAME_NO_SZ_3(VTST, gen_gvec_cmtst)
++    *r2 = *r;
++    r2->name = memcpy(r2 + 1, name, name_len);
- #define DO_3SAME_CMP(INSN, COND)                                        \
-     static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
+-    r2->name = g_strdup(name);
-@@ -XXX,XX +XXX,XX @@ DO_3SAME_CMP(VCGE_S, TCG_COND_GE)
+     /* Reset the secure state to the specific incoming state.  This is
- DO_3SAME_CMP(VCGE_U, TCG_COND_GEU)
+      * necessary as the register may have been defined with both states.
- DO_3SAME_CMP(VCEQ, TCG_COND_EQ)
+      */
 -static void gen_VTST_3s(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 -                         uint32_t rm_ofs, uint32_t oprsz, uint32_t maxsz)
 -{
 -    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &cmtst_op[vece]);
 -}
 -DO_3SAME_NO_SZ_3(VTST, gen_VTST_3s)
 -
  #define DO_3SAME_GVEC4(INSN, OPARRAY)                                   \
      static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
                                  uint32_t rn_ofs, uint32_t rm_ofs,       \
@@ -XXX,XX +XXX,XX @@ static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
      }
      return do_3same(s, a, gen_VMUL_p_3s);
  }
 -
 -#define DO_3SAME_GVEC3_SHIFT(INSN, OPARRAY)                             \
 -    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
 -                                uint32_t rn_ofs, uint32_t rm_ofs,       \
 -                                uint32_t oprsz, uint32_t maxsz)         \
 -    {                                                                   \
 -        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,                          \
 -                       oprsz, maxsz, &OPARRAY[vece]);                   \
 -    }                                                                   \
 -    DO_3SAME(INSN, gen_##INSN##_3s)
 -
 -DO_3SAME_GVEC3_SHIFT(VSHL_S, sshl_op)
 -DO_3SAME_GVEC3_SHIFT(VSHL_U, ushl_op)
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_cmtst_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
      tcg_gen_cmp_vec(TCG_COND_NE, vece, d, d, a);
  }
 -static const TCGOpcode vecop_list_cmtst[] = { INDEX_op_cmp_vec, 0 };
 -
 -const GVecGen3 cmtst_op[4] = {
 -    { .fni4 = gen_helper_neon_tst_u8,
 -      .fniv = gen_cmtst_vec,
 -      .opt_opc = vecop_list_cmtst,
 -      .vece = MO_8 },
 -    { .fni4 = gen_helper_neon_tst_u16,
 -      .fniv = gen_cmtst_vec,
 -      .opt_opc = vecop_list_cmtst,
 -      .vece = MO_16 },
 -    { .fni4 = gen_cmtst_i32,
 -      .fniv = gen_cmtst_vec,
 -      .opt_opc = vecop_list_cmtst,
 -      .vece = MO_32 },
 -    { .fni8 = gen_cmtst_i64,
 -      .fniv = gen_cmtst_vec,
 -      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -      .opt_opc = vecop_list_cmtst,
 -      .vece = MO_64 },
 -};
 +void gen_gvec_cmtst(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                    uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = { INDEX_op_cmp_vec, 0 };
 +    static const GVecGen3 ops[4] = {
 +        { .fni4 = gen_helper_neon_tst_u8,
 +          .fniv = gen_cmtst_vec,
 +          .opt_opc = vecop_list,
 +          .vece = MO_8 },
 +        { .fni4 = gen_helper_neon_tst_u16,
 +          .fniv = gen_cmtst_vec,
 +          .opt_opc = vecop_list,
 +          .vece = MO_16 },
 +        { .fni4 = gen_cmtst_i32,
 +          .fniv = gen_cmtst_vec,
 +          .opt_opc = vecop_list,
 +          .vece = MO_32 },
 +        { .fni8 = gen_cmtst_i64,
 +          .fniv = gen_cmtst_vec,
 +          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +          .opt_opc = vecop_list,
 +          .vece = MO_64 },
 +    };
 +    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 +}
  void gen_ushl_i32(TCGv_i32 dst, TCGv_i32 src, TCGv_i32 shift)
  {
@@ -XXX,XX +XXX,XX @@ static void gen_ushl_vec(unsigned vece, TCGv_vec dst,
      tcg_temp_free_vec(rsh);
  }
 -static const TCGOpcode ushl_list[] = {
 -    INDEX_op_neg_vec, INDEX_op_shlv_vec,
 -    INDEX_op_shrv_vec, INDEX_op_cmp_vec, 0
 -};
 -
 -const GVecGen3 ushl_op[4] = {
 -    { .fniv = gen_ushl_vec,
 -      .fno = gen_helper_gvec_ushl_b,
 -      .opt_opc = ushl_list,
 -      .vece = MO_8 },
 -    { .fniv = gen_ushl_vec,
 -      .fno = gen_helper_gvec_ushl_h,
 -      .opt_opc = ushl_list,
 -      .vece = MO_16 },
 -    { .fni4 = gen_ushl_i32,
 -      .fniv = gen_ushl_vec,
 -      .opt_opc = ushl_list,
 -      .vece = MO_32 },
 -    { .fni8 = gen_ushl_i64,
 -      .fniv = gen_ushl_vec,
 -      .opt_opc = ushl_list,
 -      .vece = MO_64 },
 -};
 +void gen_gvec_ushl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = {
 +        INDEX_op_neg_vec, INDEX_op_shlv_vec,
 +        INDEX_op_shrv_vec, INDEX_op_cmp_vec, 0
 +    };
 +    static const GVecGen3 ops[4] = {
 +        { .fniv = gen_ushl_vec,
 +          .fno = gen_helper_gvec_ushl_b,
 +          .opt_opc = vecop_list,
 +          .vece = MO_8 },
 +        { .fniv = gen_ushl_vec,
 +          .fno = gen_helper_gvec_ushl_h,
 +          .opt_opc = vecop_list,
 +          .vece = MO_16 },
 +        { .fni4 = gen_ushl_i32,
 +          .fniv = gen_ushl_vec,
 +          .opt_opc = vecop_list,
 +          .vece = MO_32 },
 +        { .fni8 = gen_ushl_i64,
 +          .fniv = gen_ushl_vec,
 +          .opt_opc = vecop_list,
 +          .vece = MO_64 },
 +    };
 +    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 +}
  void gen_sshl_i32(TCGv_i32 dst, TCGv_i32 src, TCGv_i32 shift)
  {
@@ -XXX,XX +XXX,XX @@ static void gen_sshl_vec(unsigned vece, TCGv_vec dst,
      tcg_temp_free_vec(tmp);
  }
 -static const TCGOpcode sshl_list[] = {
 -    INDEX_op_neg_vec, INDEX_op_umin_vec, INDEX_op_shlv_vec,
 -    INDEX_op_sarv_vec, INDEX_op_cmp_vec, INDEX_op_cmpsel_vec, 0
 -};
 -
 -const GVecGen3 sshl_op[4] = {
 -    { .fniv = gen_sshl_vec,
 -      .fno = gen_helper_gvec_sshl_b,
 -      .opt_opc = sshl_list,
 -      .vece = MO_8 },
 -    { .fniv = gen_sshl_vec,
 -      .fno = gen_helper_gvec_sshl_h,
 -      .opt_opc = sshl_list,
 -      .vece = MO_16 },
 -    { .fni4 = gen_sshl_i32,
 -      .fniv = gen_sshl_vec,
 -      .opt_opc = sshl_list,
 -      .vece = MO_32 },
 -    { .fni8 = gen_sshl_i64,
 -      .fniv = gen_sshl_vec,
 -      .opt_opc = sshl_list,
 -      .vece = MO_64 },
 -};
 +void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = {
 +        INDEX_op_neg_vec, INDEX_op_umin_vec, INDEX_op_shlv_vec,
 +        INDEX_op_sarv_vec, INDEX_op_cmp_vec, INDEX_op_cmpsel_vec, 0
 +    };
 +    static const GVecGen3 ops[4] = {
 +        { .fniv = gen_sshl_vec,
 +          .fno = gen_helper_gvec_sshl_b,
 +          .opt_opc = vecop_list,
 +          .vece = MO_8 },
 +        { .fniv = gen_sshl_vec,
 +          .fno = gen_helper_gvec_sshl_h,
 +          .opt_opc = vecop_list,
 +          .vece = MO_16 },
 +        { .fni4 = gen_sshl_i32,
 +          .fniv = gen_sshl_vec,
 +          .opt_opc = vecop_list,
 +          .vece = MO_32 },
 +        { .fni8 = gen_sshl_i64,
 +          .fniv = gen_sshl_vec,
 +          .opt_opc = vecop_list,
 +          .vece = MO_64 },
 +    };
 +    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 +}
  static void gen_uqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
                            TCGv_vec a, TCGv_vec b)
 --
-.20.1
+.25.1

-[PULL 23/45] ACPI: Build Hardware Error Source Table
+[PULL 14/23] target/arm: Hoist computation of key in add_cpreg_to_hashtable
-From: Dongjiu Geng <gengdongjiu@huawei.com>
+From: Richard Henderson <richard.henderson@linaro.org>
-This patch builds Hardware Error Source Table(HEST) via fw_cfg blobs.
+Move the computation of key to the top of the function.
-Now it only supports ARMv8 SEA, a type of Generic Hardware Error
+Hoist the resolution of cp as well, as an input to the
-Source version 2(GHESv2) error source. Afterwards, we can extend
+computation of key.
 the supported types if needed. For the CPER section, currently it
 is memory section because kernel mainly wants userspace to handle
 the memory errors.
-This patch follows the spec ACPI 6.2 to build the Hardware Error
+This will be required by a subsequent patch.
 Source table. For more detailed information, please refer to
 document: docs/specs/acpi_hest_ghes.rst
-build_ghes_hw_error_notification() helper will help to add Hardware
+Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Error Notification to ACPI tables without using packed C structures
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-and avoid endianness issues as API doesn't need explicit conversion.
+Message-id: 20220501055028.646596-14-richard.henderson@linaro.org
 Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
 Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
 Reviewed-by: Igor Mammedov <imammedo@redhat.com>
 Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
 Message-id: 20200512030609.19593-6-gengdongjiu@huawei.com
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- include/hw/acpi/ghes.h   |  39 ++++++++++++
+ target/arm/helper.c | 49 +++++++++++++++++++++++++--------------------
- hw/acpi/ghes.c           | 126 +++++++++++++++++++++++++++++++++++++++
+file changed, 27 insertions(+), 22 deletions(-)
  hw/arm/virt-acpi-build.c |   2 +
 files changed, 167 insertions(+)
-diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
+diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
---- a/include/hw/acpi/ghes.h
+--- a/target/arm/helper.c
-+++ b/include/hw/acpi/ghes.h
++++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
+     ARMCPRegInfo *r2;
- #include "hw/acpi/bios-linker-loader.h"
+     int is64 = (r->type & ARM_CP_64BIT) ? 1 : 0;
+     int ns = (secstate & ARM_CP_SECSTATE_NS) ? 1 : 0;
-+/*
++    int cp = r->cp;
-+ * Values for Hardware Error Notification Type field
+     size_t name_len;
-+ */
-+enum AcpiGhesNotifyType {
++    switch (state) {
-+    /* Polled */
++    case ARM_CP_STATE_AA32:
-+    ACPI_GHES_NOTIFY_POLLED = 0,
++        /* We assume it is a cp15 register if the .cp field is left unset. */
-+    /* External Interrupt */
++        if (cp == 0 && r->state == ARM_CP_STATE_BOTH) {
-+    ACPI_GHES_NOTIFY_EXTERNAL = 1,
++            cp = 15;
-+    /* Local Interrupt */
++        }
-+    ACPI_GHES_NOTIFY_LOCAL = 2,
++        key = ENCODE_CP_REG(cp, is64, ns, r->crn, crm, opc1, opc2);
-+    /* SCI */
++        break;
-+    ACPI_GHES_NOTIFY_SCI = 3,
++    case ARM_CP_STATE_AA64:
 +    /* NMI */
 +    ACPI_GHES_NOTIFY_NMI = 4,
 +    /* CMCI, ACPI 5.0: 18.3.2.7, Table 18-290 */
 +    ACPI_GHES_NOTIFY_CMCI = 5,
 +    /* MCE, ACPI 5.0: 18.3.2.7, Table 18-290 */
 +    ACPI_GHES_NOTIFY_MCE = 6,
 +    /* GPIO-Signal, ACPI 6.0: 18.3.2.7, Table 18-332 */
 +    ACPI_GHES_NOTIFY_GPIO = 7,
 +    /* ARMv8 SEA, ACPI 6.1: 18.3.2.9, Table 18-345 */
 +    ACPI_GHES_NOTIFY_SEA = 8,
 +    /* ARMv8 SEI, ACPI 6.1: 18.3.2.9, Table 18-345 */
 +    ACPI_GHES_NOTIFY_SEI = 9,
 +    /* External Interrupt - GSIV, ACPI 6.1: 18.3.2.9, Table 18-345 */
 +    ACPI_GHES_NOTIFY_GSIV = 10,
 +    /* Software Delegated Exception, ACPI 6.2: 18.3.2.9, Table 18-383 */
 +    ACPI_GHES_NOTIFY_SDEI = 11,
 +    /* 12 and greater are reserved */
 +    ACPI_GHES_NOTIFY_RESERVED = 12
 +};
 +
 +enum {
 +    ACPI_HEST_SRC_ID_SEA = 0,
 +    /* future ids go here */
 +    ACPI_HEST_SRC_ID_RESERVED,
 +};
 +
  void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker);
 +void acpi_build_hest(GArray *table_data, BIOSLinker *linker);
  #endif
 diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/acpi/ghes.c
 +++ b/hw/acpi/ghes.c
@@ -XXX,XX +XXX,XX @@
  #include "qemu/units.h"
  #include "hw/acpi/ghes.h"
  #include "hw/acpi/aml-build.h"
 +#include "qemu/error-report.h"
  #define ACPI_GHES_ERRORS_FW_CFG_FILE        "etc/hardware_errors"
  #define ACPI_GHES_DATA_ADDR_FW_CFG_FILE     "etc/hardware_errors_addr"
@@ -XXX,XX +XXX,XX @@
  /* Now only support ARMv8 SEA notification type error source */
  #define ACPI_GHES_ERROR_SOURCE_COUNT        1
 +/* Generic Hardware Error Source version 2 */
 +#define ACPI_GHES_SOURCE_GENERIC_ERROR_V2   10
 +
 +/* Address offset in Generic Address Structure(GAS) */
 +#define GAS_ADDR_OFFSET 4
 +
 +/*
 + * Hardware Error Notification
 + * ACPI 4.0: 17.3.2.7 Hardware Error Notification
 + * Composes dummy Hardware Error Notification descriptor of specified type
 + */
 +static void build_ghes_hw_error_notification(GArray *table, const uint8_t type)
 +{
 +    /* Type */
 +    build_append_int_noprefix(table, type, 1);
 +    /*
 +     * Length:
 +     * Total length of the structure in bytes
 +     */
 +    build_append_int_noprefix(table, 28, 1);
 +    /* Configuration Write Enable */
 +    build_append_int_noprefix(table, 0, 2);
 +    /* Poll Interval */
 +    build_append_int_noprefix(table, 0, 4);
 +    /* Vector */
 +    build_append_int_noprefix(table, 0, 4);
 +    /* Switch To Polling Threshold Value */
 +    build_append_int_noprefix(table, 0, 4);
 +    /* Switch To Polling Threshold Window */
 +    build_append_int_noprefix(table, 0, 4);
 +    /* Error Threshold Value */
 +    build_append_int_noprefix(table, 0, 4);
 +    /* Error Threshold Window */
 +    build_append_int_noprefix(table, 0, 4);
 +}
 +
  /*
   * Build table for the hardware error fw_cfg blob.
   * Initialize "etc/hardware_errors" and "etc/hardware_errors_addr" fw_cfg blobs.
@@ -XXX,XX +XXX,XX @@ void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker)
      bios_linker_loader_write_pointer(linker, ACPI_GHES_DATA_ADDR_FW_CFG_FILE,
 , sizeof(uint64_t), ACPI_GHES_ERRORS_FW_CFG_FILE, 0);
  }
 +
 +/* Build Generic Hardware Error Source version 2 (GHESv2) */
 +static void build_ghes_v2(GArray *table_data, int source_id, BIOSLinker *linker)
 +{
 +    uint64_t address_offset;
 +    /*
 +     * Type:
 +     * Generic Hardware Error Source version 2(GHESv2 - Type 10)
 +     */
 +    build_append_int_noprefix(table_data, ACPI_GHES_SOURCE_GENERIC_ERROR_V2, 2);
 +    /* Source Id */
 +    build_append_int_noprefix(table_data, source_id, 2);
 +    /* Related Source Id */
 +    build_append_int_noprefix(table_data, 0xffff, 2);
 +    /* Flags */
 +    build_append_int_noprefix(table_data, 0, 1);
 +    /* Enabled */
 +    build_append_int_noprefix(table_data, 1, 1);
 +
 +    /* Number of Records To Pre-allocate */
 +    build_append_int_noprefix(table_data, 1, 4);
 +    /* Max Sections Per Record */
 +    build_append_int_noprefix(table_data, 1, 4);
 +    /* Max Raw Data Length */
 +    build_append_int_noprefix(table_data, ACPI_GHES_MAX_RAW_DATA_LENGTH, 4);
 +
 +    address_offset = table_data->len;
 +    /* Error Status Address */
 +    build_append_gas(table_data, AML_AS_SYSTEM_MEMORY, 0x40, 0,
 +                     4 /* QWord access */, 0);
 +    bios_linker_loader_add_pointer(linker, ACPI_BUILD_TABLE_FILE,
 +        address_offset + GAS_ADDR_OFFSET, sizeof(uint64_t),
 +        ACPI_GHES_ERRORS_FW_CFG_FILE, source_id * sizeof(uint64_t));
 +
 +    switch (source_id) {
 +    case ACPI_HEST_SRC_ID_SEA:
 +        /*
-+         * Notification Structure
++         * To allow abbreviation of ARMCPRegInfo definitions, we treat
-+         * Now only enable ARMv8 SEA notification type
++         * cp == 0 as equivalent to the value for "standard guest-visible
 +         * sysreg".  STATE_BOTH definitions are also always "standard sysreg"
 +         * in their AArch64 view (the .cp value may be non-zero for the
 +         * benefit of the AArch32 view).
 +         */
-+        build_ghes_hw_error_notification(table_data, ACPI_GHES_NOTIFY_SEA);
++        if (cp == 0 || r->state == ARM_CP_STATE_BOTH) {
 +            cp = CP_REG_ARM64_SYSREG_CP;
 +        }
 +        key = ENCODE_AA64_CP_REG(cp, r->crn, crm, r->opc0, opc1, opc2);
 +        break;
 +    default:
-+        error_report("Not support this error source");
++        g_assert_not_reached();
 +        abort();
 +    }
 +
-+    /* Error Status Block Length */
+     /* Combine cpreg and name into one allocation. */
-+    build_append_int_noprefix(table_data, ACPI_GHES_MAX_RAW_DATA_LENGTH, 4);
+     name_len = strlen(name) + 1;
-+
+     r2 = g_malloc(sizeof(*r2) + name_len);
-+    /*
+@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
-+     * Read Ack Register
+         }
-+     * ACPI 6.1: 18.3.2.8 Generic Hardware Error Source
-+     * version 2 (GHESv2 - Type 10)
+         if (r->state == ARM_CP_STATE_BOTH) {
-+     */
+-            /* We assume it is a cp15 register if the .cp field is left unset.
-+    address_offset = table_data->len;
+-             */
-+    build_append_gas(table_data, AML_AS_SYSTEM_MEMORY, 0x40, 0,
+-            if (r2->cp == 0) {
-+                     4 /* QWord access */, 0);
+-                r2->cp = 15;
-+    bios_linker_loader_add_pointer(linker, ACPI_BUILD_TABLE_FILE,
+-            }
-+        address_offset + GAS_ADDR_OFFSET,
+-
-+        sizeof(uint64_t), ACPI_GHES_ERRORS_FW_CFG_FILE,
+ #if HOST_BIG_ENDIAN
-+        (ACPI_GHES_ERROR_SOURCE_COUNT + source_id) * sizeof(uint64_t));
+             if (r2->fieldoffset) {
-+
+                 r2->fieldoffset += sizeof(uint32_t);
-+    /*
+@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
-+     * Read Ack Preserve field
+ #endif
-+     * We only provide the first bit in Read Ack Register to OSPM to write
+         }
 +     * while the other bits are preserved.
 +     */
 +    build_append_int_noprefix(table_data, ~0x1ULL, 8);
 +    /* Read Ack Write */
 +    build_append_int_noprefix(table_data, 0x1, 8);
 +}
 +
 +/* Build Hardware Error Source Table */
 +void acpi_build_hest(GArray *table_data, BIOSLinker *linker)
 +{
 +    uint64_t hest_start = table_data->len;
 +
 +    /* Hardware Error Source Table header*/
 +    acpi_data_push(table_data, sizeof(AcpiTableHeader));
 +
 +    /* Error Source Count */
 +    build_append_int_noprefix(table_data, ACPI_GHES_ERROR_SOURCE_COUNT, 4);
 +
 +    build_ghes_v2(table_data, ACPI_HEST_SRC_ID_SEA, linker);
 +
 +    build_header(linker, table_data, (void *)(table_data->data + hest_start),
 +        "HEST", table_data->len - hest_start, 1, NULL, NULL);
 +}
 diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/arm/virt-acpi-build.c
 +++ b/hw/arm/virt-acpi-build.c
@@ -XXX,XX +XXX,XX @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
      if (vms->ras) {
          build_ghes_error_table(tables->hardware_errors, tables->linker);
 +        acpi_add_table(table_offsets, tables_blob);
 +        acpi_build_hest(tables_blob, tables->linker);
      }
+-    if (state == ARM_CP_STATE_AA64) {
-     if (ms->numa_state->num_nodes > 0) {
+-        /* To allow abbreviation of ARMCPRegInfo
 -         * definitions, we treat cp == 0 as equivalent to
 -         * the value for "standard guest-visible sysreg".
 -         * STATE_BOTH definitions are also always "standard
 -         * sysreg" in their AArch64 view (the .cp value may
 -         * be non-zero for the benefit of the AArch32 view).
 -         */
 -        if (r->cp == 0 || r->state == ARM_CP_STATE_BOTH) {
 -            r2->cp = CP_REG_ARM64_SYSREG_CP;
 -        }
 -        key = ENCODE_AA64_CP_REG(r2->cp, r2->crn, crm,
 -                                 r2->opc0, opc1, opc2);
 -    } else {
 -        key = ENCODE_CP_REG(r2->cp, is64, ns, r2->crn, crm, opc1, opc2);
 -    }
      if (opaque) {
          r2->opaque = opaque;
      }
@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
      /* Make sure reginfo passed to helpers for wildcarded regs
       * has the correct crm/opc1/opc2 for this reg, not CP_ANY:
       */
 +    r2->cp = cp;
      r2->crm = crm;
      r2->opc1 = opc1;
      r2->opc2 = opc2;
 --
-.20.1
+.25.1

-[PULL 26/45] ACPI: Record Generic Error Status Block(GESB) table
+[PULL 15/23] target/arm: Consolidate cpreg updates in add_cpreg_to_hashtable
-From: Dongjiu Geng <gengdongjiu@huawei.com>
+From: Richard Henderson <richard.henderson@linaro.org>
-kvm_arch_on_sigbus_vcpu() error injection uses source_id as
+Put most of the value writeback to the same place,
-index in etc/hardware_errors to find out Error Status Data
+and improve the comment that goes with them.
 Block entry corresponding to error source. So supported source_id
 values should be assigned here and not be changed afterwards to
 make sure that guest will write error into expected Error Status
 Data Block.
-Before QEMU writes a new error to ACPI table, it will check whether
+Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-previous error has been acknowledged. If not acknowledged, the new
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-errors will be ignored and not be recorded. For the errors section
+Message-id: 20220501055028.646596-15-richard.henderson@linaro.org
 type, QEMU simulate it to memory section error.
 Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
 Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
 Reviewed-by: Igor Mammedov <imammedo@redhat.com>
 Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
 Message-id: 20200512030609.19593-9-gengdongjiu@huawei.com
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- include/hw/acpi/ghes.h |   1 +
+ target/arm/helper.c | 28 ++++++++++++----------------
- hw/acpi/ghes.c         | 219 +++++++++++++++++++++++++++++++++++++++++
+file changed, 12 insertions(+), 16 deletions(-)
 files changed, 220 insertions(+)
-diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
+diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
---- a/include/hw/acpi/ghes.h
+--- a/target/arm/helper.c
-+++ b/include/hw/acpi/ghes.h
++++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker);
+@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
- void acpi_build_hest(GArray *table_data, BIOSLinker *linker);
+     *r2 = *r;
- void acpi_ghes_add_fw_cfg(AcpiGhesState *vms, FWCfgState *s,
+     r2->name = memcpy(r2 + 1, name, name_len);
-                           GArray *hardware_errors);
-+int acpi_ghes_record_errors(uint8_t notify, uint64_t error_physical_addr);
+-    /* Reset the secure state to the specific incoming state.  This is
 -     * necessary as the register may have been defined with both states.
 +    /*
 +     * Update fields to match the instantiation, overwiting wildcards
 +     * such as CP_ANY, ARM_CP_STATE_BOTH, or ARM_CP_SECSTATE_BOTH.
       */
 +    r2->cp = cp;
 +    r2->crm = crm;
 +    r2->opc1 = opc1;
 +    r2->opc2 = opc2;
 +    r2->state = state;
      r2->secure = secstate;
 +    if (opaque) {
 +        r2->opaque = opaque;
 +    }
      if (r->bank_fieldoffsets[0] && r->bank_fieldoffsets[1]) {
          /* Register is banked (using both entries in array).
@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
  #endif
-diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
+         }
-index XXXXXXX..XXXXXXX 100644
+     }
---- a/hw/acpi/ghes.c
+-    if (opaque) {
-+++ b/hw/acpi/ghes.c
+-        r2->opaque = opaque;
-@@ -XXX,XX +XXX,XX @@
+-    }
- #include "qemu/error-report.h"
+-    /* reginfo passed to helpers is correct for the actual access,
- #include "hw/acpi/generic_event_device.h"
+-     * and is never ARM_CP_STATE_BOTH:
- #include "hw/nvram/fw_cfg.h"
+-     */
-+#include "qemu/uuid.h"
+-    r2->state = state;
+-    /* Make sure reginfo passed to helpers for wildcarded regs
- #define ACPI_GHES_ERRORS_FW_CFG_FILE        "etc/hardware_errors"
+-     * has the correct crm/opc1/opc2 for this reg, not CP_ANY:
- #define ACPI_GHES_DATA_ADDR_FW_CFG_FILE     "etc/hardware_errors_addr"
+-     */
-@@ -XXX,XX +XXX,XX @@
+-    r2->cp = cp;
- /* Address offset in Generic Address Structure(GAS) */
+-    r2->crm = crm;
- #define GAS_ADDR_OFFSET 4
+-    r2->opc1 = opc1;
+-    r2->opc2 = opc2;
 +/*
 + * The total size of Generic Error Data Entry
 + * ACPI 6.1/6.2: 18.3.2.7.1 Generic Error Data,
 + * Table 18-343 Generic Error Data Entry
 + */
 +#define ACPI_GHES_DATA_LENGTH               72
 +
-+/* The memory section CPER size, UEFI 2.6: N.2.5 Memory Error Section */
+     /* By convention, for wildcarded registers only the first
-+#define ACPI_GHES_MEM_CPER_LENGTH           80
+      * entry is used for migration; the others are marked as
-+
+      * ALIAS so we don't try to transfer the register
 +/* Masks for block_status flags */
 +#define ACPI_GEBS_UNCORRECTABLE         1
 +
 +/*
 + * Total size for Generic Error Status Block except Generic Error Data Entries
 + * ACPI 6.2: 18.3.2.7.1 Generic Error Data,
 + * Table 18-380 Generic Error Status Block
 + */
 +#define ACPI_GHES_GESB_SIZE                 20
 +
 +/*
 + * Values for error_severity field
 + */
 +enum AcpiGenericErrorSeverity {
 +    ACPI_CPER_SEV_RECOVERABLE = 0,
 +    ACPI_CPER_SEV_FATAL = 1,
 +    ACPI_CPER_SEV_CORRECTED = 2,
 +    ACPI_CPER_SEV_NONE = 3,
 +};
 +
  /*
   * Hardware Error Notification
   * ACPI 4.0: 17.3.2.7 Hardware Error Notification
@@ -XXX,XX +XXX,XX @@ static void build_ghes_hw_error_notification(GArray *table, const uint8_t type)
      build_append_int_noprefix(table, 0, 4);
  }
 +/*
 + * Generic Error Data Entry
 + * ACPI 6.1: 18.3.2.7.1 Generic Error Data
 + */
 +static void acpi_ghes_generic_error_data(GArray *table,
 +                const uint8_t *section_type, uint32_t error_severity,
 +                uint8_t validation_bits, uint8_t flags,
 +                uint32_t error_data_length, QemuUUID fru_id,
 +                uint64_t time_stamp)
 +{
 +    const uint8_t fru_text[20] = {0};
 +
 +    /* Section Type */
 +    g_array_append_vals(table, section_type, 16);
 +
 +    /* Error Severity */
 +    build_append_int_noprefix(table, error_severity, 4);
 +    /* Revision */
 +    build_append_int_noprefix(table, 0x300, 2);
 +    /* Validation Bits */
 +    build_append_int_noprefix(table, validation_bits, 1);
 +    /* Flags */
 +    build_append_int_noprefix(table, flags, 1);
 +    /* Error Data Length */
 +    build_append_int_noprefix(table, error_data_length, 4);
 +
 +    /* FRU Id */
 +    g_array_append_vals(table, fru_id.data, ARRAY_SIZE(fru_id.data));
 +
 +    /* FRU Text */
 +    g_array_append_vals(table, fru_text, sizeof(fru_text));
 +
 +    /* Timestamp */
 +    build_append_int_noprefix(table, time_stamp, 8);
 +}
 +
 +/*
 + * Generic Error Status Block
 + * ACPI 6.1: 18.3.2.7.1 Generic Error Data
 + */
 +static void acpi_ghes_generic_error_status(GArray *table, uint32_t block_status,
 +                uint32_t raw_data_offset, uint32_t raw_data_length,
 +                uint32_t data_length, uint32_t error_severity)
 +{
 +    /* Block Status */
 +    build_append_int_noprefix(table, block_status, 4);
 +    /* Raw Data Offset */
 +    build_append_int_noprefix(table, raw_data_offset, 4);
 +    /* Raw Data Length */
 +    build_append_int_noprefix(table, raw_data_length, 4);
 +    /* Data Length */
 +    build_append_int_noprefix(table, data_length, 4);
 +    /* Error Severity */
 +    build_append_int_noprefix(table, error_severity, 4);
 +}
 +
 +/* UEFI 2.6: N.2.5 Memory Error Section */
 +static void acpi_ghes_build_append_mem_cper(GArray *table,
 +                                            uint64_t error_physical_addr)
 +{
 +    /*
 +     * Memory Error Record
 +     */
 +
 +    /* Validation Bits */
 +    build_append_int_noprefix(table,
 +                              (1ULL << 14) | /* Type Valid */
 +                              (1ULL << 1) /* Physical Address Valid */,
 +                              8);
 +    /* Error Status */
 +    build_append_int_noprefix(table, 0, 8);
 +    /* Physical Address */
 +    build_append_int_noprefix(table, error_physical_addr, 8);
 +    /* Skip all the detailed information normally found in such a record */
 +    build_append_int_noprefix(table, 0, 48);
 +    /* Memory Error Type */
 +    build_append_int_noprefix(table, 0 /* Unknown error */, 1);
 +    /* Skip all the detailed information normally found in such a record */
 +    build_append_int_noprefix(table, 0, 7);
 +}
 +
 +static int acpi_ghes_record_mem_error(uint64_t error_block_address,
 +                                      uint64_t error_physical_addr)
 +{
 +    GArray *block;
 +
 +    /* Memory Error Section Type */
 +    const uint8_t uefi_cper_mem_sec[] =
 +          UUID_LE(0xA5BC1114, 0x6F64, 0x4EDE, 0xB8, 0x63, 0x3E, 0x83, \
 +                  0xED, 0x7C, 0x83, 0xB1);
 +
 +    /* invalid fru id: ACPI 4.0: 17.3.2.6.1 Generic Error Data,
 +     * Table 17-13 Generic Error Data Entry
 +     */
 +    QemuUUID fru_id = {};
 +    uint32_t data_length;
 +
 +    block = g_array_new(false, true /* clear */, 1);
 +
 +    /* This is the length if adding a new generic error data entry*/
 +    data_length = ACPI_GHES_DATA_LENGTH + ACPI_GHES_MEM_CPER_LENGTH;
 +
 +    /*
 +     * Check whether it will run out of the preallocated memory if adding a new
 +     * generic error data entry
 +     */
 +    if ((data_length + ACPI_GHES_GESB_SIZE) > ACPI_GHES_MAX_RAW_DATA_LENGTH) {
 +        error_report("Not enough memory to record new CPER!!!");
 +        g_array_free(block, true);
 +        return -1;
 +    }
 +
 +    /* Build the new generic error status block header */
 +    acpi_ghes_generic_error_status(block, ACPI_GEBS_UNCORRECTABLE,
 +        0, 0, data_length, ACPI_CPER_SEV_RECOVERABLE);
 +
 +    /* Build this new generic error data entry header */
 +    acpi_ghes_generic_error_data(block, uefi_cper_mem_sec,
 +        ACPI_CPER_SEV_RECOVERABLE, 0, 0,
 +        ACPI_GHES_MEM_CPER_LENGTH, fru_id, 0);
 +
 +    /* Build the memory section CPER for above new generic error data entry */
 +    acpi_ghes_build_append_mem_cper(block, error_physical_addr);
 +
 +    /* Write the generic error data entry into guest memory */
 +    cpu_physical_memory_write(error_block_address, block->data, block->len);
 +
 +    g_array_free(block, true);
 +
 +    return 0;
 +}
 +
  /*
   * Build table for the hardware error fw_cfg blob.
   * Initialize "etc/hardware_errors" and "etc/hardware_errors_addr" fw_cfg blobs.
@@ -XXX,XX +XXX,XX @@ void acpi_ghes_add_fw_cfg(AcpiGhesState *ags, FWCfgState *s,
      fw_cfg_add_file_callback(s, ACPI_GHES_DATA_ADDR_FW_CFG_FILE, NULL, NULL,
          NULL, &(ags->ghes_addr_le), sizeof(ags->ghes_addr_le), false);
  }
 +
 +int acpi_ghes_record_errors(uint8_t source_id, uint64_t physical_address)
 +{
 +    uint64_t error_block_addr, read_ack_register_addr, read_ack_register = 0;
 +    uint64_t start_addr;
 +    bool ret = -1;
 +    AcpiGedState *acpi_ged_state;
 +    AcpiGhesState *ags;
 +
 +    assert(source_id < ACPI_HEST_SRC_ID_RESERVED);
 +
 +    acpi_ged_state = ACPI_GED(object_resolve_path_type("", TYPE_ACPI_GED,
 +                                                       NULL));
 +    g_assert(acpi_ged_state);
 +    ags = &acpi_ged_state->ghes_state;
 +
 +    start_addr = le64_to_cpu(ags->ghes_addr_le);
 +
 +    if (physical_address) {
 +
 +        if (source_id < ACPI_HEST_SRC_ID_RESERVED) {
 +            start_addr += source_id * sizeof(uint64_t);
 +        }
 +
 +        cpu_physical_memory_read(start_addr, &error_block_addr,
 +                                 sizeof(error_block_addr));
 +
 +        error_block_addr = le64_to_cpu(error_block_addr);
 +
 +        read_ack_register_addr = start_addr +
 +            ACPI_GHES_ERROR_SOURCE_COUNT * sizeof(uint64_t);
 +
 +        cpu_physical_memory_read(read_ack_register_addr,
 +                                 &read_ack_register, sizeof(read_ack_register));
 +
 +        /* zero means OSPM does not acknowledge the error */
 +        if (!read_ack_register) {
 +            error_report("OSPM does not acknowledge previous error,"
 +                " so can not record CPER for current error anymore");
 +        } else if (error_block_addr) {
 +            read_ack_register = cpu_to_le64(0);
 +            /*
 +             * Clear the Read Ack Register, OSPM will write it to 1 when
 +             * it acknowledges this error.
 +             */
 +            cpu_physical_memory_write(read_ack_register_addr,
 +                &read_ack_register, sizeof(uint64_t));
 +
 +            ret = acpi_ghes_record_mem_error(error_block_addr,
 +                                             physical_address);
 +        } else
 +            error_report("can not find Generic Error Status Block");
 +    }
 +
 +    return ret;
 +}
 --
-.20.1
+.25.1

-[PULL 13/45] target/arm: Create gen_gvec_{qrdmla,qrdmls}
+[PULL 16/23] target/arm: Use bool for is64 and ns in add_cpreg_to_hashtable
 From: Richard Henderson <richard.henderson@linaro.org>
-Provide a functional interface for the vector expansion.
+Bool is a more appropriate type for these variables.
 This fits better with the existing set of helpers that
 we provide for other operations.
+Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20220501055028.646596-16-richard.henderson@linaro.org
 Message-id: 20200513163245.17915-13-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate.h     |  5 ++++
+ target/arm/helper.c | 4 ++--
- target/arm/translate-a64.c | 34 ++----------------------
+file changed, 2 insertions(+), 2 deletions(-)
  target/arm/translate.c     | 54 +++++++++++++++++++-------------------
 files changed, 34 insertions(+), 59 deletions(-)
-diff --git a/target/arm/translate.h b/target/arm/translate.h
+diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.h
+--- a/target/arm/helper.c
-+++ b/target/arm/translate.h
++++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ void gen_gvec_sri(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
- void gen_gvec_sli(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+      */
-                   int64_t shift, uint32_t opr_sz, uint32_t max_sz);
+     uint32_t key;
+     ARMCPRegInfo *r2;
-+void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+-    int is64 = (r->type & ARM_CP_64BIT) ? 1 : 0;
-+                          uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+-    int ns = (secstate & ARM_CP_SECSTATE_NS) ? 1 : 0;
-+void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
++    bool is64 = r->type & ARM_CP_64BIT;
-+                          uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
++    bool ns = secstate & ARM_CP_SECSTATE_NS;
-+
+     int cp = r->cp;
- /*
+     size_t name_len;
   * Forward to the isar_feature_* tests given a DisasContext pointer.
   */
 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-a64.c
 +++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void gen_gvec_op3_ool(DisasContext *s, bool is_q, int rd,
                         is_q ? 16 : 8, vec_full_reg_size(s), data, fn);
  }
 -/* Expand a 3-operand + env pointer operation using
 - * an out-of-line helper.
 - */
 -static void gen_gvec_op3_env(DisasContext *s, bool is_q, int rd,
 -                             int rn, int rm, gen_helper_gvec_3_ptr *fn)
 -{
 -    tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, rd),
 -                       vec_full_reg_offset(s, rn),
 -                       vec_full_reg_offset(s, rm), cpu_env,
 -                       is_q ? 16 : 8, vec_full_reg_size(s), 0, fn);
 -}
 -
  /* Expand a 3-operand + fpstatus pointer + simd data value operation using
   * an out-of-line helper.
   */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn)
      switch (opcode) {
      case 0x0: /* SQRDMLAH (vector) */
 -        switch (size) {
 -        case 1:
 -            gen_gvec_op3_env(s, is_q, rd, rn, rm, gen_helper_gvec_qrdmlah_s16);
 -            break;
 -        case 2:
 -            gen_gvec_op3_env(s, is_q, rd, rn, rm, gen_helper_gvec_qrdmlah_s32);
 -            break;
 -        default:
 -            g_assert_not_reached();
 -        }
 +        gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqrdmlah_qc, size);
          return;
      case 0x1: /* SQRDMLSH (vector) */
 -        switch (size) {
 -        case 1:
 -            gen_gvec_op3_env(s, is_q, rd, rn, rm, gen_helper_gvec_qrdmlsh_s16);
 -            break;
 -        case 2:
 -            gen_gvec_op3_env(s, is_q, rd, rn, rm, gen_helper_gvec_qrdmlsh_s32);
 -            break;
 -        default:
 -            g_assert_not_reached();
 -        }
 +        gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqrdmlsh_qc, size);
          return;
      case 0x2: /* SDOT / UDOT */
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static const uint8_t neon_2rm_sizes[] = {
      [NEON_2RM_VCVT_UF] = 0x4,
  };
 -
 -/* Expand v8.1 simd helper.  */
 -static int do_v81_helper(DisasContext *s, gen_helper_gvec_3_ptr *fn,
 -                         int q, int rd, int rn, int rm)
 +void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                          uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
  {
 -    if (dc_isar_feature(aa32_rdm, s)) {
 -        int opr_sz = (1 + q) * 8;
 -        tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd),
 -                           vfp_reg_offset(1, rn),
 -                           vfp_reg_offset(1, rm), cpu_env,
 -                           opr_sz, opr_sz, 0, fn);
 -        return 0;
 -    }
 -    return 1;
 +    static gen_helper_gvec_3_ptr * const fns[2] = {
 +        gen_helper_gvec_qrdmlah_s16, gen_helper_gvec_qrdmlah_s32
 +    };
 +    tcg_debug_assert(vece >= 1 && vece <= 2);
 +    tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, cpu_env,
 +                       opr_sz, max_sz, 0, fns[vece - 1]);
 +}
 +
 +void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 +                          uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static gen_helper_gvec_3_ptr * const fns[2] = {
 +        gen_helper_gvec_qrdmlsh_s16, gen_helper_gvec_qrdmlsh_s32
 +    };
 +    tcg_debug_assert(vece >= 1 && vece <= 2);
 +    tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, cpu_env,
 +                       opr_sz, max_sz, 0, fns[vece - 1]);
  }
  #define GEN_CMP0(NAME, COND)                                            \
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                  break;  /* VPADD */
              }
              /* VQRDMLAH */
 -            switch (size) {
 -            case 1:
 -                return do_v81_helper(s, gen_helper_gvec_qrdmlah_s16,
 -                                     q, rd, rn, rm);
 -            case 2:
 -                return do_v81_helper(s, gen_helper_gvec_qrdmlah_s32,
 -                                     q, rd, rn, rm);
 +            if (dc_isar_feature(aa32_rdm, s) && (size == 1 || size == 2)) {
 +                gen_gvec_sqrdmlah_qc(size, rd_ofs, rn_ofs, rm_ofs,
 +                                     vec_size, vec_size);
 +                return 0;
              }
              return 1;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                  break;
              }
              /* VQRDMLSH */
 -            switch (size) {
 -            case 1:
 -                return do_v81_helper(s, gen_helper_gvec_qrdmlsh_s16,
 -                                     q, rd, rn, rm);
 -            case 2:
 -                return do_v81_helper(s, gen_helper_gvec_qrdmlsh_s32,
 -                                     q, rd, rn, rm);
 +            if (dc_isar_feature(aa32_rdm, s) && (size == 1 || size == 2)) {
 +                gen_gvec_sqrdmlsh_qc(size, rd_ofs, rn_ofs, rm_ofs,
 +                                     vec_size, vec_size);
 +                return 0;
              }
              return 1;
 --
-.20.1
+.25.1

-[PULL 42/45] target/arm: Convert Neon 3-reg-same compare insns to decodetree
+[PULL 17/23] target/arm: Hoist isbanked computation in add_cpreg_to_hashtable
-Convert the Neon integer 3-reg-same compare insns VCGE, VCGT,
+From: Richard Henderson <richard.henderson@linaro.org>
 VCEQ, VACGE and VACGT to decodetree.
+Computing isbanked only once makes the code
+a bit easier to read.
+Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Message-id: 20220501055028.646596-17-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200512163904.10918-15-peter.maydell@linaro.org
 ---
- target/arm/neon-dp.decode       |  5 +++++
+ target/arm/helper.c | 6 ++++--
- target/arm/translate-neon.inc.c |  6 +++++
+file changed, 4 insertions(+), 2 deletions(-)
  target/arm/translate.c          | 39 ++-------------------------------
 files changed, 13 insertions(+), 37 deletions(-)
-diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
+diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/neon-dp.decode
+--- a/target/arm/helper.c
-+++ b/target/arm/neon-dp.decode
++++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ VABD_fp_3s       1111 001 1 0 . 1 . .... .... 1101 ... 0 .... @3same_fp
+@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
- VMLA_fp_3s       1111 001 0 0 . 0 . .... .... 1101 ... 1 .... @3same_fp
+     bool is64 = r->type & ARM_CP_64BIT;
- VMLS_fp_3s       1111 001 0 0 . 1 . .... .... 1101 ... 1 .... @3same_fp
+     bool ns = secstate & ARM_CP_SECSTATE_NS;
- VMUL_fp_3s       1111 001 1 0 . 0 . .... .... 1101 ... 1 .... @3same_fp
+     int cp = r->cp;
-+VCEQ_fp_3s       1111 001 0 0 . 0 . .... .... 1110 ... 0 .... @3same_fp
++    bool isbanked;
-+VCGE_fp_3s       1111 001 1 0 . 0 . .... .... 1110 ... 0 .... @3same_fp
+     size_t name_len;
-+VACGE_fp_3s      1111 001 1 0 . 0 . .... .... 1110 ... 1 .... @3same_fp
-+VCGT_fp_3s       1111 001 1 0 . 1 . .... .... 1110 ... 0 .... @3same_fp
+     switch (state) {
-+VACGT_fp_3s      1111 001 1 0 . 1 . .... .... 1110 ... 1 .... @3same_fp
+@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
- VPMAX_fp_3s      1111 001 1 0 . 0 . .... .... 1111 ... 0 .... @3same_fp_q0
+         r2->opaque = opaque;
  VPMIN_fp_3s      1111 001 1 0 . 1 . .... .... 1111 ... 0 .... @3same_fp_q0
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_3S_FP_GVEC(VMUL, gen_helper_gvec_fmul_s)
          return do_3same_fp(s, a, FUNC, READS_VD);                   \
      }
-+DO_3S_FP(VCEQ, gen_helper_neon_ceq_f32, false)
+-    if (r->bank_fieldoffsets[0] && r->bank_fieldoffsets[1]) {
-+DO_3S_FP(VCGE, gen_helper_neon_cge_f32, false)
++    isbanked = r->bank_fieldoffsets[0] && r->bank_fieldoffsets[1];
-+DO_3S_FP(VCGT, gen_helper_neon_cgt_f32, false)
++    if (isbanked) {
-+DO_3S_FP(VACGE, gen_helper_neon_acge_f32, false)
+         /* Register is banked (using both entries in array).
-+DO_3S_FP(VACGT, gen_helper_neon_acgt_f32, false)
+          * Overwriting fieldoffset as the array is only used to define
-+
+          * banked registers but later only fieldoffset is used.
- static void gen_VMLA_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
+@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
-                             TCGv_ptr fpstatus)
+     }
- {
-diff --git a/target/arm/translate.c b/target/arm/translate.c
+     if (state == ARM_CP_STATE_AA32) {
-index XXXXXXX..XXXXXXX 100644
+-        if (r->bank_fieldoffsets[0] && r->bank_fieldoffsets[1]) {
---- a/target/arm/translate.c
++        if (isbanked) {
-+++ b/target/arm/translate.c
+             /* If the register is banked then we don't need to migrate or
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+              * reset the 32-bit instance in certain cases:
-         case NEON_3R_VQDMULH_VQRDMULH:
+              *
          case NEON_3R_FLOAT_ARITH:
          case NEON_3R_FLOAT_MULTIPLY:
 +        case NEON_3R_FLOAT_CMP:
 +        case NEON_3R_FLOAT_ACMP:
              /* Already handled by decodetree */
              return 1;
          }
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                  return 1; /* VPMIN/VPMAX handled by decodetree */
              }
              break;
 -        case NEON_3R_FLOAT_CMP:
 -            if (!u && size) {
 -                /* no encoding for U=0 C=1x */
 -                return 1;
 -            }
 -            break;
 -        case NEON_3R_FLOAT_ACMP:
 -            if (!u) {
 -                return 1;
 -            }
 -            break;
          case NEON_3R_FLOAT_MISC:
              /* VMAXNM/VMINNM in ARMv8 */
              if (u && !arm_dc_feature(s, ARM_FEATURE_V8)) {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          tmp = neon_load_reg(rn, pass);
          tmp2 = neon_load_reg(rm, pass);
          switch (op) {
 -        case NEON_3R_FLOAT_CMP:
 -        {
 -            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
 -            if (!u) {
 -                gen_helper_neon_ceq_f32(tmp, tmp, tmp2, fpstatus);
 -            } else {
 -                if (size == 0) {
 -                    gen_helper_neon_cge_f32(tmp, tmp, tmp2, fpstatus);
 -                } else {
 -                    gen_helper_neon_cgt_f32(tmp, tmp, tmp2, fpstatus);
 -                }
 -            }
 -            tcg_temp_free_ptr(fpstatus);
 -            break;
 -        }
 -        case NEON_3R_FLOAT_ACMP:
 -        {
 -            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
 -            if (size == 0) {
 -                gen_helper_neon_acge_f32(tmp, tmp, tmp2, fpstatus);
 -            } else {
 -                gen_helper_neon_acgt_f32(tmp, tmp, tmp2, fpstatus);
 -            }
 -            tcg_temp_free_ptr(fpstatus);
 -            break;
 -        }
          case NEON_3R_FLOAT_MINMAX:
          {
              TCGv_ptr fpstatus = get_fpstatus_ptr(1);
 --
-.20.1
+.25.1

-[PULL 25/45] KVM: Move hwpoison page related functions into kvm-all.c
+[PULL 18/23] target/arm: Perform override check early in add_cpreg_to_hashtable
-From: Dongjiu Geng <gengdongjiu@huawei.com>
+From: Richard Henderson <richard.henderson@linaro.org>
-kvm_hwpoison_page_add() and kvm_unpoison_all() will both
+Perform the override check early, so that it is still done
-be used by X86 and ARM platforms, so moving them into
+even when we decide to discard an unreachable cpreg.
 "accel/kvm/kvm-all.c" to avoid duplicate code.
-For architectures that don't use the poison-list functionality
+Use assert not printf+abort.
 the reset handler will harmlessly do nothing, so let's register
 the kvm_unpoison_all() function in the generic kvm_init() function.
+Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
+Message-id: 20220501055028.646596-18-richard.henderson@linaro.org
 Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
 Acked-by: Xiang Zheng <zhengxiang9@huawei.com>
 Message-id: 20200512030609.19593-8-gengdongjiu@huawei.com
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- include/sysemu/kvm_int.h | 12 ++++++++++++
+ target/arm/helper.c | 22 ++++++++--------------
- accel/kvm/kvm-all.c      | 36 ++++++++++++++++++++++++++++++++++++
+file changed, 8 insertions(+), 14 deletions(-)
  target/i386/kvm.c        | 36 ------------------------------------
 files changed, 48 insertions(+), 36 deletions(-)
-diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h
+diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
---- a/include/sysemu/kvm_int.h
+--- a/target/arm/helper.c
-+++ b/include/sysemu/kvm_int.h
++++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ void kvm_memory_listener_register(KVMState *s, KVMMemoryListener *kml,
+@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
-                                   AddressSpace *as, int as_id);
+         g_assert_not_reached();
+     }
- void kvm_set_max_memslot_size(hwaddr max_slot_size);
-+
++    /* Overriding of an existing definition must be explicitly requested. */
-+/**
++    if (!(r->type & ARM_CP_OVERRIDE)) {
-+ * kvm_hwpoison_page_add:
++        const ARMCPRegInfo *oldreg = get_arm_cp_reginfo(cpu->cp_regs, key);
-+ *
++        if (oldreg) {
-+ * Parameters:
++            assert(oldreg->type & ARM_CP_OVERRIDE);
 + *  @ram_addr: the address in the RAM for the poisoned page
 + *
 + * Add a poisoned page to the list
 + *
 + * Return: None.
 + */
 +void kvm_hwpoison_page_add(ram_addr_t ram_addr);
  #endif
 diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
 index XXXXXXX..XXXXXXX 100644
 --- a/accel/kvm/kvm-all.c
 +++ b/accel/kvm/kvm-all.c
@@ -XXX,XX +XXX,XX @@
  #include "qapi/visitor.h"
  #include "qapi/qapi-types-common.h"
  #include "qapi/qapi-visit-common.h"
 +#include "sysemu/reset.h"
  #include "hw/boards.h"
@@ -XXX,XX +XXX,XX @@ int kvm_vm_check_extension(KVMState *s, unsigned int extension)
      return ret;
  }
 +typedef struct HWPoisonPage {
 +    ram_addr_t ram_addr;
 +    QLIST_ENTRY(HWPoisonPage) list;
 +} HWPoisonPage;
 +
 +static QLIST_HEAD(, HWPoisonPage) hwpoison_page_list =
 +    QLIST_HEAD_INITIALIZER(hwpoison_page_list);
 +
 +static void kvm_unpoison_all(void *param)
 +{
 +    HWPoisonPage *page, *next_page;
 +
 +    QLIST_FOREACH_SAFE(page, &hwpoison_page_list, list, next_page) {
 +        QLIST_REMOVE(page, list);
 +        qemu_ram_remap(page->ram_addr, TARGET_PAGE_SIZE);
 +        g_free(page);
 +    }
 +}
 +
 +void kvm_hwpoison_page_add(ram_addr_t ram_addr)
 +{
 +    HWPoisonPage *page;
 +
 +    QLIST_FOREACH(page, &hwpoison_page_list, list) {
 +        if (page->ram_addr == ram_addr) {
 +            return;
 +        }
 +    }
-+    page = g_new(HWPoisonPage, 1);
-+    page->ram_addr = ram_addr;
-+    QLIST_INSERT_HEAD(&hwpoison_page_list, page, list);
-+}
 +
- static uint32_t adjust_ioeventfd_endianness(uint32_t val, uint32_t size)
+     /* Combine cpreg and name into one allocation. */
- {
+     name_len = strlen(name) + 1;
- #if defined(HOST_WORDS_BIGENDIAN) != defined(TARGET_WORDS_BIGENDIAN)
+     r2 = g_malloc(sizeof(*r2) + name_len);
-@@ -XXX,XX +XXX,XX @@ static int kvm_init(MachineState *ms)
+@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
-         s->kernel_irqchip_split = mc->default_kernel_irqchip_split ? ON_OFF_AUTO_ON : ON_OFF_AUTO_OFF;
+         assert(!raw_accessors_invalid(r2));
      }
-+    qemu_register_reset(kvm_unpoison_all, NULL);
+-    /* Overriding of an existing definition must be explicitly
-+
+-     * requested.
-     if (s->kernel_irqchip_allowed) {
+-     */
-         kvm_irqchip_create(s);
+-    if (!(r->type & ARM_CP_OVERRIDE)) {
-     }
+-        const ARMCPRegInfo *oldreg = get_arm_cp_reginfo(cpu->cp_regs, key);
-diff --git a/target/i386/kvm.c b/target/i386/kvm.c
+-        if (oldreg && !(oldreg->type & ARM_CP_OVERRIDE)) {
-index XXXXXXX..XXXXXXX 100644
+-            fprintf(stderr, "Register redefined: cp=%d %d bit "
---- a/target/i386/kvm.c
+-                    "crn=%d crm=%d opc1=%d opc2=%d, "
-+++ b/target/i386/kvm.c
+-                    "was %s, now %s\n", r2->cp, 32 + 32 * is64,
-@@ -XXX,XX +XXX,XX @@
+-                    r2->crn, r2->crm, r2->opc1, r2->opc2,
- #include "sysemu/sysemu.h"
+-                    oldreg->name, r2->name);
- #include "sysemu/hw_accel.h"
+-            g_assert_not_reached();
  #include "sysemu/kvm_int.h"
 -#include "sysemu/reset.h"
  #include "sysemu/runstate.h"
  #include "kvm_i386.h"
  #include "hyperv.h"
@@ -XXX,XX +XXX,XX @@ uint64_t kvm_arch_get_supported_msr_feature(KVMState *s, uint32_t index)
      }
  }
 -
 -typedef struct HWPoisonPage {
 -    ram_addr_t ram_addr;
 -    QLIST_ENTRY(HWPoisonPage) list;
 -} HWPoisonPage;
 -
 -static QLIST_HEAD(, HWPoisonPage) hwpoison_page_list =
 -    QLIST_HEAD_INITIALIZER(hwpoison_page_list);
 -
 -static void kvm_unpoison_all(void *param)
 -{
 -    HWPoisonPage *page, *next_page;
 -
 -    QLIST_FOREACH_SAFE(page, &hwpoison_page_list, list, next_page) {
 -        QLIST_REMOVE(page, list);
 -        qemu_ram_remap(page->ram_addr, TARGET_PAGE_SIZE);
 -        g_free(page);
 -    }
 -}
 -
 -static void kvm_hwpoison_page_add(ram_addr_t ram_addr)
 -{
 -    HWPoisonPage *page;
 -
 -    QLIST_FOREACH(page, &hwpoison_page_list, list) {
 -        if (page->ram_addr == ram_addr) {
 -            return;
 -        }
 -    }
--    page = g_new(HWPoisonPage, 1);
+     g_hash_table_insert(cpu->cp_regs, (gpointer)(uintptr_t)key, r2);
--    page->ram_addr = ram_addr;
+ }
--    QLIST_INSERT_HEAD(&hwpoison_page_list, page, list);
 -}
 -
  static int kvm_get_mce_cap_supported(KVMState *s, uint64_t *mce_cap,
                                       int *max_banks)
  {
@@ -XXX,XX +XXX,XX @@ int kvm_arch_init(MachineState *ms, KVMState *s)
          fprintf(stderr, "e820_add_entry() table is full\n");
          return ret;
      }
 -    qemu_register_reset(kvm_unpoison_all, NULL);
      shadow_mem = object_property_get_int(OBJECT(s), "kvm-shadow-mem", &error_abort);
      if (shadow_mem != -1) {
 --
-.20.1
+.25.1

-[PULL 04/45] target/arm: Create gen_gvec_{sri,sli}
+[PULL 19/23] target/arm: Reformat comments in add_cpreg_to_hashtable
 From: Richard Henderson <richard.henderson@linaro.org>
-The functions eliminate duplication of the special cases for
+Put the block comments into the current coding style.
 this operation.  They match up with the GVecGen2iFn typedef.
-Add out-of-line helpers.  We got away with only having inline
+Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
 expanders because the neon vector size is only 16 bytes, and
 we know that the inline expansion will always succeed.
 When we reuse this for SVE, tcg-gvec-op may decide to use an
 out-of-line helper due to longer vector lengths.
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20220501055028.646596-19-richard.henderson@linaro.org
 Message-id: 20200513163245.17915-4-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/helper.h        |  10 ++
+ target/arm/helper.c | 24 +++++++++++++++---------
- target/arm/translate.h     |   7 +-
+file changed, 15 insertions(+), 9 deletions(-)
  target/arm/translate-a64.c |  20 +---
  target/arm/translate.c     | 186 +++++++++++++++++++++----------------
  target/arm/vec_helper.c    |  38 ++++++++
 files changed, 160 insertions(+), 101 deletions(-)
-diff --git a/target/arm/helper.h b/target/arm/helper.h
+diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.h
+--- a/target/arm/helper.c
-+++ b/target/arm/helper.h
++++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(gvec_ursra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+@@ -XXX,XX +XXX,XX @@ CpuDefinitionInfoList *qmp_query_cpu_definitions(Error **errp)
- DEF_HELPER_FLAGS_3(gvec_ursra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+     return cpu_list;
  DEF_HELPER_FLAGS_3(gvec_ursra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_3(gvec_sri_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_3(gvec_sri_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_3(gvec_sri_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_3(gvec_sri_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +
 +DEF_HELPER_FLAGS_3(gvec_sli_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_3(gvec_sli_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_3(gvec_sli_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_3(gvec_sli_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +
  #ifdef TARGET_AARCH64
  #include "helper-a64.h"
  #include "helper-sve.h"
 diff --git a/target/arm/translate.h b/target/arm/translate.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.h
 +++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ extern const GVecGen3 mls_op[4];
  extern const GVecGen3 cmtst_op[4];
  extern const GVecGen3 sshl_op[4];
  extern const GVecGen3 ushl_op[4];
 -extern const GVecGen2i sri_op[4];
 -extern const GVecGen2i sli_op[4];
  extern const GVecGen4 uqadd_op[4];
  extern const GVecGen4 sqadd_op[4];
  extern const GVecGen4 uqsub_op[4];
@@ -XXX,XX +XXX,XX @@ void gen_gvec_srsra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
  void gen_gvec_ursra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
                      int64_t shift, uint32_t opr_sz, uint32_t max_sz);
 +void gen_gvec_sri(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 +                  int64_t shift, uint32_t opr_sz, uint32_t max_sz);
 +void gen_gvec_sli(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 +                  int64_t shift, uint32_t opr_sz, uint32_t max_sz);
 +
  /*
   * Forward to the isar_feature_* tests given a DisasContext pointer.
   */
 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-a64.c
 +++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void gen_gvec_op2(DisasContext *s, bool is_q, int rd,
                     is_q ? 16 : 8, vec_full_reg_size(s), gvec_op);
  }
--/* Expand a 2-operand + immediate AdvSIMD vector operation using
++/*
-- * an op descriptor.
++ * Private utility function for define_one_arm_cp_reg_with_opaque():
-- */
++ * add a single reginfo struct to the hash table.
--static void gen_gvec_op2i(DisasContext *s, bool is_q, int rd,
++ */
--                          int rn, int64_t imm, const GVecGen2i *gvec_op)
+ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
--{
+                                    void *opaque, CPState state,
--    tcg_gen_gvec_2i(vec_full_reg_offset(s, rd), vec_full_reg_offset(s, rn),
+                                    CPSecureState secstate,
--                    is_q ? 16 : 8, vec_full_reg_size(s), imm, gvec_op);
+                                    int crm, int opc1, int opc2,
--}
+                                    const char *name)
--
+ {
- /* Expand a 3-operand AdvSIMD vector operation using an op descriptor.  */
+-    /* Private utility function for define_one_arm_cp_reg_with_opaque():
- static void gen_gvec_op3(DisasContext *s, bool is_q, int rd,
+-     * add a single reginfo struct to the hash table.
-                          int rn, int rm, const GVecGen3 *gvec_op)
+-     */
-@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
+     uint32_t key;
-         gen_gvec_fn2i(s, is_q, rd, rn, shift,
+     ARMCPRegInfo *r2;
-                       is_u ? gen_gvec_usra : gen_gvec_ssra, size);
+     bool is64 = r->type & ARM_CP_64BIT;
-         return;
+@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
-+
-     case 0x08: /* SRI */
+     isbanked = r->bank_fieldoffsets[0] && r->bank_fieldoffsets[1];
--        /* Shift count same as element size is valid but does nothing.  */
+     if (isbanked) {
--        if (shift == 8 << size) {
+-        /* Register is banked (using both entries in array).
--            goto done;
++        /*
--        }
++         * Register is banked (using both entries in array).
--        gen_gvec_op2i(s, is_q, rd, rn, shift, &sri_op[size]);
+          * Overwriting fieldoffset as the array is only used to define
-+        gen_gvec_fn2i(s, is_q, rd, rn, shift, gen_gvec_sri, size);
+          * banked registers but later only fieldoffset is used.
-         return;
+          */
+@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
-     case 0x00: /* SSHR / USHR */
-@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
+     if (state == ARM_CP_STATE_AA32) {
          if (isbanked) {
 -            /* If the register is banked then we don't need to migrate or
 +            /*
 +             * If the register is banked then we don't need to migrate or
               * reset the 32-bit instance in certain cases:
               *
               * 1) If the register has both 32-bit and 64-bit instances then we
@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
                  r2->type |= ARM_CP_ALIAS;
              }
          } else if ((secstate != r->secure) && !ns) {
 -            /* The register is not banked so we only want to allow migration of
 -             * the non-secure instance.
 +            /*
 +             * The register is not banked so we only want to allow migration
 +             * of the non-secure instance.
               */
              r2->type |= ARM_CP_ALIAS;
          }
@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
          }
      }
-     tcg_temp_free_i64(tcg_round);
+-    /* By convention, for wildcarded registers only the first
-- done:
++    /*
-     clear_vec_high(s, is_q, rd);
++     * By convention, for wildcarded registers only the first
- }
+      * entry is used for migration; the others are marked as
+      * ALIAS so we don't try to transfer the register
-@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shli(DisasContext *s, bool is_q, bool insert,
+      * multiple times. Special registers (ie NOP/WFI) are
@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
          r2->type |= ARM_CP_ALIAS | ARM_CP_NO_GDB;
      }
-     if (insert) {
+-    /* Check that raw accesses are either forbidden or handled. Note that
--        gen_gvec_op2i(s, is_q, rd, rn, shift, &sli_op[size]);
++    /*
-+        gen_gvec_fn2i(s, is_q, rd, rn, shift, gen_gvec_sli, size);
++     * Check that raw accesses are either forbidden or handled. Note that
-     } else {
+      * we can't assert this earlier because the setup of fieldoffset for
-         gen_gvec_fn2i(s, is_q, rd, rn, shift, tcg_gen_gvec_shli, size);
+      * banked registers has to be done first.
-     }
+      */
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_shr64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
  static void gen_shr_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
  {
 -    if (sh == 0) {
 -        tcg_gen_mov_vec(d, a);
 -    } else {
 -        TCGv_vec t = tcg_temp_new_vec_matching(d);
 -        TCGv_vec m = tcg_temp_new_vec_matching(d);
 +    TCGv_vec t = tcg_temp_new_vec_matching(d);
 +    TCGv_vec m = tcg_temp_new_vec_matching(d);
 -        tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK((8 << vece) - sh, sh));
 -        tcg_gen_shri_vec(vece, t, a, sh);
 -        tcg_gen_and_vec(vece, d, d, m);
 -        tcg_gen_or_vec(vece, d, d, t);
 +    tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK((8 << vece) - sh, sh));
 +    tcg_gen_shri_vec(vece, t, a, sh);
 +    tcg_gen_and_vec(vece, d, d, m);
 +    tcg_gen_or_vec(vece, d, d, t);
 -        tcg_temp_free_vec(t);
 -        tcg_temp_free_vec(m);
 -    }
 +    tcg_temp_free_vec(t);
 +    tcg_temp_free_vec(m);
  }
 -static const TCGOpcode vecop_list_sri[] = { INDEX_op_shri_vec, 0 };
 +void gen_gvec_sri(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 +                  int64_t shift, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = { INDEX_op_shri_vec, 0 };
 +    const GVecGen2i ops[4] = {
 +        { .fni8 = gen_shr8_ins_i64,
 +          .fniv = gen_shr_ins_vec,
 +          .fno = gen_helper_gvec_sri_b,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_8 },
 +        { .fni8 = gen_shr16_ins_i64,
 +          .fniv = gen_shr_ins_vec,
 +          .fno = gen_helper_gvec_sri_h,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_16 },
 +        { .fni4 = gen_shr32_ins_i32,
 +          .fniv = gen_shr_ins_vec,
 +          .fno = gen_helper_gvec_sri_s,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_32 },
 +        { .fni8 = gen_shr64_ins_i64,
 +          .fniv = gen_shr_ins_vec,
 +          .fno = gen_helper_gvec_sri_d,
 +          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_64 },
 +    };
 -const GVecGen2i sri_op[4] = {
 -    { .fni8 = gen_shr8_ins_i64,
 -      .fniv = gen_shr_ins_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_sri,
 -      .vece = MO_8 },
 -    { .fni8 = gen_shr16_ins_i64,
 -      .fniv = gen_shr_ins_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_sri,
 -      .vece = MO_16 },
 -    { .fni4 = gen_shr32_ins_i32,
 -      .fniv = gen_shr_ins_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_sri,
 -      .vece = MO_32 },
 -    { .fni8 = gen_shr64_ins_i64,
 -      .fniv = gen_shr_ins_vec,
 -      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_sri,
 -      .vece = MO_64 },
 -};
 +    /* tszimm encoding produces immediates in the range [1..esize]. */
 +    tcg_debug_assert(shift > 0);
 +    tcg_debug_assert(shift <= (8 << vece));
 +
 +    /* Shift of esize leaves destination unchanged. */
 +    if (shift < (8 << vece)) {
 +        tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
 +    } else {
 +        /* Nop, but we do need to clear the tail. */
 +        tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz);
 +    }
 +}
  static void gen_shl8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
  {
@@ -XXX,XX +XXX,XX @@ static void gen_shl64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
  static void gen_shl_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
  {
 -    if (sh == 0) {
 -        tcg_gen_mov_vec(d, a);
 -    } else {
 -        TCGv_vec t = tcg_temp_new_vec_matching(d);
 -        TCGv_vec m = tcg_temp_new_vec_matching(d);
 +    TCGv_vec t = tcg_temp_new_vec_matching(d);
 +    TCGv_vec m = tcg_temp_new_vec_matching(d);
 -        tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK(0, sh));
 -        tcg_gen_shli_vec(vece, t, a, sh);
 -        tcg_gen_and_vec(vece, d, d, m);
 -        tcg_gen_or_vec(vece, d, d, t);
 +    tcg_gen_shli_vec(vece, t, a, sh);
 +    tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK(0, sh));
 +    tcg_gen_and_vec(vece, d, d, m);
 +    tcg_gen_or_vec(vece, d, d, t);
 -        tcg_temp_free_vec(t);
 -        tcg_temp_free_vec(m);
 -    }
 +    tcg_temp_free_vec(t);
 +    tcg_temp_free_vec(m);
  }
 -static const TCGOpcode vecop_list_sli[] = { INDEX_op_shli_vec, 0 };
 +void gen_gvec_sli(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 +                  int64_t shift, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = { INDEX_op_shli_vec, 0 };
 +    const GVecGen2i ops[4] = {
 +        { .fni8 = gen_shl8_ins_i64,
 +          .fniv = gen_shl_ins_vec,
 +          .fno = gen_helper_gvec_sli_b,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_8 },
 +        { .fni8 = gen_shl16_ins_i64,
 +          .fniv = gen_shl_ins_vec,
 +          .fno = gen_helper_gvec_sli_h,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_16 },
 +        { .fni4 = gen_shl32_ins_i32,
 +          .fniv = gen_shl_ins_vec,
 +          .fno = gen_helper_gvec_sli_s,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_32 },
 +        { .fni8 = gen_shl64_ins_i64,
 +          .fniv = gen_shl_ins_vec,
 +          .fno = gen_helper_gvec_sli_d,
 +          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_64 },
 +    };
 -const GVecGen2i sli_op[4] = {
 -    { .fni8 = gen_shl8_ins_i64,
 -      .fniv = gen_shl_ins_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_sli,
 -      .vece = MO_8 },
 -    { .fni8 = gen_shl16_ins_i64,
 -      .fniv = gen_shl_ins_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_sli,
 -      .vece = MO_16 },
 -    { .fni4 = gen_shl32_ins_i32,
 -      .fniv = gen_shl_ins_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_sli,
 -      .vece = MO_32 },
 -    { .fni8 = gen_shl64_ins_i64,
 -      .fniv = gen_shl_ins_vec,
 -      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_sli,
 -      .vece = MO_64 },
 -};
 +    /* tszimm encoding produces immediates in the range [0..esize-1]. */
 +    tcg_debug_assert(shift >= 0);
 +    tcg_debug_assert(shift < (8 << vece));
 +
 +    if (shift == 0) {
 +        tcg_gen_gvec_mov(vece, rd_ofs, rm_ofs, opr_sz, max_sz);
 +    } else {
 +        tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
 +    }
 +}
  static void gen_mla8_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
  {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                      }
                      /* Right shift comes here negative.  */
                      shift = -shift;
 -                    /* Shift out of range leaves destination unchanged.  */
 -                    if (shift < 8 << size) {
 -                        tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, vec_size,
 -                                        shift, &sri_op[size]);
 -                    }
 +                    gen_gvec_sri(size, rd_ofs, rm_ofs, shift,
 +                                 vec_size, vec_size);
                      return 0;
                  case 5: /* VSHL, VSLI */
                      if (u) { /* VSLI */
 -                        /* Shift out of range leaves destination unchanged.  */
 -                        if (shift < 8 << size) {
 -                            tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size,
 -                                            vec_size, shift, &sli_op[size]);
 -                        }
 +                        gen_gvec_sli(size, rd_ofs, rm_ofs, shift,
 +                                     vec_size, vec_size);
                      } else { /* VSHL */
                          /* Shifts larger than the element size are
                           * architecturally valid and results in zero.
 diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/vec_helper.c
 +++ b/target/arm/vec_helper.c
@@ -XXX,XX +XXX,XX @@ DO_RSRA(gvec_ursra_d, uint64_t)
  #undef DO_RSRA
 +#define DO_SRI(NAME, TYPE)                              \
 +void HELPER(NAME)(void *vd, void *vn, uint32_t desc)    \
 +{                                                       \
 +    intptr_t i, oprsz = simd_oprsz(desc);               \
 +    int shift = simd_data(desc);                        \
 +    TYPE *d = vd, *n = vn;                              \
 +    for (i = 0; i < oprsz / sizeof(TYPE); i++) {        \
 +        d[i] = deposit64(d[i], 0, sizeof(TYPE) * 8 - shift, n[i] >> shift); \
 +    }                                                   \
 +    clear_tail(d, oprsz, simd_maxsz(desc));             \
 +}
 +
 +DO_SRI(gvec_sri_b, uint8_t)
 +DO_SRI(gvec_sri_h, uint16_t)
 +DO_SRI(gvec_sri_s, uint32_t)
 +DO_SRI(gvec_sri_d, uint64_t)
 +
 +#undef DO_SRI
 +
 +#define DO_SLI(NAME, TYPE)                              \
 +void HELPER(NAME)(void *vd, void *vn, uint32_t desc)    \
 +{                                                       \
 +    intptr_t i, oprsz = simd_oprsz(desc);               \
 +    int shift = simd_data(desc);                        \
 +    TYPE *d = vd, *n = vn;                              \
 +    for (i = 0; i < oprsz / sizeof(TYPE); i++) {        \
 +        d[i] = deposit64(d[i], shift, sizeof(TYPE) * 8 - shift, n[i]); \
 +    }                                                   \
 +    clear_tail(d, oprsz, simd_maxsz(desc));             \
 +}
 +
 +DO_SLI(gvec_sli_b, uint8_t)
 +DO_SLI(gvec_sli_h, uint16_t)
 +DO_SLI(gvec_sli_s, uint32_t)
 +DO_SLI(gvec_sli_d, uint64_t)
 +
 +#undef DO_SLI
 +
  /*
   * Convert float16 to float32, raising no exceptions and
   * preserving exceptional values, including SNaN.
 --
-.20.1
+.25.1

-[PULL 12/45] target/arm: Remove fp_status from helper_{recpe, rsqrte}_u32
+[PULL 20/23] target/arm: Remove HOST_BIG_ENDIAN ifdef in add_cpreg_to_hashtable
 From: Richard Henderson <richard.henderson@linaro.org>
-These operations do not touch fp_status.
+Since e03b56863d2bc, our host endian indicator is unconditionally
 set, which means that we can use a normal C condition.
+Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20220501055028.646596-20-richard.henderson@linaro.org
-Message-id: 20200513163245.17915-12-richard.henderson@linaro.org
+[PMM: quote correct git hash in commit message]
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/helper.h        |  4 ++--
+ target/arm/helper.c | 9 +++------
- target/arm/translate-a64.c |  5 ++---
+file changed, 3 insertions(+), 6 deletions(-)
  target/arm/translate.c     | 12 ++----------
  target/arm/vfp_helper.c    |  5 ++---
 files changed, 8 insertions(+), 18 deletions(-)
-diff --git a/target/arm/helper.h b/target/arm/helper.h
+diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.h
+--- a/target/arm/helper.c
-+++ b/target/arm/helper.h
++++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_2(recpe_f64, TCG_CALL_NO_RWG, f64, f64, ptr)
+@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
- DEF_HELPER_FLAGS_2(rsqrte_f16, TCG_CALL_NO_RWG, f16, f16, ptr)
+             r2->type |= ARM_CP_ALIAS;
- DEF_HELPER_FLAGS_2(rsqrte_f32, TCG_CALL_NO_RWG, f32, f32, ptr)
+         }
- DEF_HELPER_FLAGS_2(rsqrte_f64, TCG_CALL_NO_RWG, f64, f64, ptr)
--DEF_HELPER_2(recpe_u32, i32, i32, ptr)
+-        if (r->state == ARM_CP_STATE_BOTH) {
--DEF_HELPER_FLAGS_2(rsqrte_u32, TCG_CALL_NO_RWG, i32, i32, ptr)
+-#if HOST_BIG_ENDIAN
-+DEF_HELPER_FLAGS_1(recpe_u32, TCG_CALL_NO_RWG, i32, i32)
+-            if (r2->fieldoffset) {
-+DEF_HELPER_FLAGS_1(rsqrte_u32, TCG_CALL_NO_RWG, i32, i32)
+-                r2->fieldoffset += sizeof(uint32_t);
- DEF_HELPER_FLAGS_4(neon_tbl, TCG_CALL_NO_RWG, i32, i32, i32, ptr, i32)
+-            }
+-#endif
- DEF_HELPER_3(shl_cc, i32, env, i32, i32)
++        if (HOST_BIG_ENDIAN &&
-diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
++            r->state == ARM_CP_STATE_BOTH && r2->fieldoffset) {
-index XXXXXXX..XXXXXXX 100644
++            r2->fieldoffset += sizeof(uint32_t);
---- a/target/arm/translate-a64.c
+         }
-+++ b/target/arm/translate-a64.c
+     }
@@ -XXX,XX +XXX,XX @@ static void handle_2misc_reciprocal(DisasContext *s, int opcode,
              switch (opcode) {
              case 0x3c: /* URECPE */
 -                gen_helper_recpe_u32(tcg_res, tcg_op, fpst);
 +                gen_helper_recpe_u32(tcg_res, tcg_op);
                  break;
              case 0x3d: /* FRECPE */
                  gen_helper_recpe_f32(tcg_res, tcg_op, fpst);
@@ -XXX,XX +XXX,XX @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn)
                  unallocated_encoding(s);
                  return;
              }
 -            need_fpstatus = true;
              break;
          case 0x1e: /* FRINT32Z */
          case 0x1f: /* FRINT64Z */
@@ -XXX,XX +XXX,XX @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn)
                      gen_helper_rints_exact(tcg_res, tcg_op, tcg_fpstatus);
                      break;
                  case 0x7c: /* URSQRTE */
 -                    gen_helper_rsqrte_u32(tcg_res, tcg_op, tcg_fpstatus);
 +                    gen_helper_rsqrte_u32(tcg_res, tcg_op);
                      break;
                  case 0x1e: /* FRINT32Z */
                  case 0x5e: /* FRINT32X */
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                              break;
                          }
                          case NEON_2RM_VRECPE:
 -                        {
 -                            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
 -                            gen_helper_recpe_u32(tmp, tmp, fpstatus);
 -                            tcg_temp_free_ptr(fpstatus);
 +                            gen_helper_recpe_u32(tmp, tmp);
                              break;
 -                        }
                          case NEON_2RM_VRSQRTE:
 -                        {
 -                            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
 -                            gen_helper_rsqrte_u32(tmp, tmp, fpstatus);
 -                            tcg_temp_free_ptr(fpstatus);
 +                            gen_helper_rsqrte_u32(tmp, tmp);
                              break;
 -                        }
                          case NEON_2RM_VRECPE_F:
                          {
                              TCGv_ptr fpstatus = get_fpstatus_ptr(1);
 diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/vfp_helper.c
 +++ b/target/arm/vfp_helper.c
@@ -XXX,XX +XXX,XX @@ float64 HELPER(rsqrte_f64)(float64 input, void *fpstp)
      return make_float64(val);
  }
 -uint32_t HELPER(recpe_u32)(uint32_t a, void *fpstp)
 +uint32_t HELPER(recpe_u32)(uint32_t a)
  {
 -    /* float_status *s = fpstp; */
      int input, estimate;
      if ((a & 0x80000000) == 0) {
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(recpe_u32)(uint32_t a, void *fpstp)
      return deposit32(0, (32 - 9), 9, estimate);
  }
 -uint32_t HELPER(rsqrte_u32)(uint32_t a, void *fpstp)
 +uint32_t HELPER(rsqrte_u32)(uint32_t a)
  {
      int estimate;
 --
-.20.1
+.25.1

-[PULL 02/45] target/arm: Create gen_gvec_[us]sra
+[PULL 21/23] target/arm: Add isar predicates for FEAT_Debugv8p2
 From: Richard Henderson <richard.henderson@linaro.org>
-The functions eliminate duplication of the special cases for
-this operation.  They match up with the GVecGen2iFn typedef.
-Add out-of-line helpers.  We got away with only having inline
-expanders because the neon vector size is only 16 bytes, and
-we know that the inline expansion will always succeed.
-When we reuse this for SVE, tcg-gvec-op may decide to use an
-out-of-line helper due to longer vector lengths.
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200513163245.17915-2-richard.henderson@linaro.org
+Message-id: 20220501055028.646596-24-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/helper.h        |  10 +++
+ target/arm/cpu.h | 15 +++++++++++++++
- target/arm/translate.h     |   7 +-
+file changed, 15 insertions(+)
  target/arm/translate-a64.c |  15 +---
  target/arm/translate.c     | 161 ++++++++++++++++++++++---------------
  target/arm/vec_helper.c    |  25 ++++++
 files changed, 139 insertions(+), 79 deletions(-)
-diff --git a/target/arm/helper.h b/target/arm/helper.h
+diff --git a/target/arm/cpu.h b/target/arm/cpu.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.h
+--- a/target/arm/cpu.h
-+++ b/target/arm/helper.h
++++ b/target/arm/cpu.h
-@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(gvec_pmull_q, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa32_ssbs(const ARMISARegisters *id)
+     return FIELD_EX32(id->id_pfr2, ID_PFR2, SSBS) != 0;
- DEF_HELPER_FLAGS_4(neon_pmull_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+ }
-+DEF_HELPER_FLAGS_3(gvec_ssra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
++static inline bool isar_feature_aa32_debugv8p2(const ARMISARegisters *id)
-+DEF_HELPER_FLAGS_3(gvec_ssra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
++{
-+DEF_HELPER_FLAGS_3(gvec_ssra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
++    return FIELD_EX32(id->id_dfr0, ID_DFR0, COPDBG) >= 8;
-+DEF_HELPER_FLAGS_3(gvec_ssra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
++}
 +
 +DEF_HELPER_FLAGS_3(gvec_usra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_3(gvec_usra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_3(gvec_usra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +DEF_HELPER_FLAGS_3(gvec_usra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 +
  #ifdef TARGET_AARCH64
  #include "helper-a64.h"
  #include "helper-sve.h"
 diff --git a/target/arm/translate.h b/target/arm/translate.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.h
 +++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ extern const GVecGen3 mls_op[4];
  extern const GVecGen3 cmtst_op[4];
  extern const GVecGen3 sshl_op[4];
  extern const GVecGen3 ushl_op[4];
 -extern const GVecGen2i ssra_op[4];
 -extern const GVecGen2i usra_op[4];
  extern const GVecGen2i sri_op[4];
  extern const GVecGen2i sli_op[4];
  extern const GVecGen4 uqadd_op[4];
@@ -XXX,XX +XXX,XX @@ void gen_sshl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b);
  void gen_ushl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
  void gen_sshl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
 +void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 +                   int64_t shift, uint32_t opr_sz, uint32_t max_sz);
 +void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 +                   int64_t shift, uint32_t opr_sz, uint32_t max_sz);
 +
  /*
-  * Forward to the isar_feature_* tests given a DisasContext pointer.
+  * 64-bit feature tests via id registers.
   */
-diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
+@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa64_ssbs(const ARMISARegisters *id)
-index XXXXXXX..XXXXXXX 100644
+     return FIELD_EX64(id->id_aa64pfr1, ID_AA64PFR1, SSBS) != 0;
 --- a/target/arm/translate-a64.c
 +++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
      switch (opcode) {
      case 0x02: /* SSRA / USRA (accumulate) */
 -        if (is_u) {
 -            /* Shift count same as element size produces zero to add.  */
 -            if (shift == 8 << size) {
 -                goto done;
 -            }
 -            gen_gvec_op2i(s, is_q, rd, rn, shift, &usra_op[size]);
 -        } else {
 -            /* Shift count same as element size produces all sign to add.  */
 -            if (shift == 8 << size) {
 -                shift -= 1;
 -            }
 -            gen_gvec_op2i(s, is_q, rd, rn, shift, &ssra_op[size]);
 -        }
 +        gen_gvec_fn2i(s, is_q, rd, rn, shift,
 +                      is_u ? gen_gvec_usra : gen_gvec_ssra, size);
          return;
      case 0x08: /* SRI */
          /* Shift count same as element size is valid but does nothing.  */
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_ssra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
      tcg_gen_add_vec(vece, d, d, a);
  }
--static const TCGOpcode vecop_list_ssra[] = {
++static inline bool isar_feature_aa64_debugv8p2(const ARMISARegisters *id)
 -    INDEX_op_sari_vec, INDEX_op_add_vec, 0
 -};
 +void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 +                   int64_t shift, uint32_t opr_sz, uint32_t max_sz)
 +{
-+    static const TCGOpcode vecop_list[] = {
++    return FIELD_EX64(id->id_aa64dfr0, ID_AA64DFR0, DEBUGVER) >= 8;
 +        INDEX_op_sari_vec, INDEX_op_add_vec, 0
 +    };
 +    static const GVecGen2i ops[4] = {
 +        { .fni8 = gen_ssra8_i64,
 +          .fniv = gen_ssra_vec,
 +          .fno = gen_helper_gvec_ssra_b,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_8 },
 +        { .fni8 = gen_ssra16_i64,
 +          .fniv = gen_ssra_vec,
 +          .fno = gen_helper_gvec_ssra_h,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_16 },
 +        { .fni4 = gen_ssra32_i32,
 +          .fniv = gen_ssra_vec,
 +          .fno = gen_helper_gvec_ssra_s,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_32 },
 +        { .fni8 = gen_ssra64_i64,
 +          .fniv = gen_ssra_vec,
 +          .fno = gen_helper_gvec_ssra_b,
 +          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +          .opt_opc = vecop_list,
 +          .load_dest = true,
 +          .vece = MO_64 },
 +    };
 -const GVecGen2i ssra_op[4] = {
 -    { .fni8 = gen_ssra8_i64,
 -      .fniv = gen_ssra_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_ssra,
 -      .vece = MO_8 },
 -    { .fni8 = gen_ssra16_i64,
 -      .fniv = gen_ssra_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_ssra,
 -      .vece = MO_16 },
 -    { .fni4 = gen_ssra32_i32,
 -      .fniv = gen_ssra_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_ssra,
 -      .vece = MO_32 },
 -    { .fni8 = gen_ssra64_i64,
 -      .fniv = gen_ssra_vec,
 -      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -      .opt_opc = vecop_list_ssra,
 -      .load_dest = true,
 -      .vece = MO_64 },
 -};
 +    /* tszimm encoding produces immediates in the range [1..esize]. */
 +    tcg_debug_assert(shift > 0);
 +    tcg_debug_assert(shift <= (8 << vece));
 +
 +    /*
 +     * Shifts larger than the element size are architecturally valid.
 +     * Signed results in all sign bits.
 +     */
 +    shift = MIN(shift, (8 << vece) - 1);
 +    tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
 +}
  static void gen_usra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
  {
@@ -XXX,XX +XXX,XX @@ static void gen_usra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
      tcg_gen_add_vec(vece, d, d, a);
  }
 -static const TCGOpcode vecop_list_usra[] = {
 -    INDEX_op_shri_vec, INDEX_op_add_vec, 0
 -};
 +void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 +                   int64_t shift, uint32_t opr_sz, uint32_t max_sz)
 +{
 +    static const TCGOpcode vecop_list[] = {
 +        INDEX_op_shri_vec, INDEX_op_add_vec, 0
 +    };
 +    static const GVecGen2i ops[4] = {
 +        { .fni8 = gen_usra8_i64,
 +          .fniv = gen_usra_vec,
 +          .fno = gen_helper_gvec_usra_b,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_8, },
 +        { .fni8 = gen_usra16_i64,
 +          .fniv = gen_usra_vec,
 +          .fno = gen_helper_gvec_usra_h,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_16, },
 +        { .fni4 = gen_usra32_i32,
 +          .fniv = gen_usra_vec,
 +          .fno = gen_helper_gvec_usra_s,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_32, },
 +        { .fni8 = gen_usra64_i64,
 +          .fniv = gen_usra_vec,
 +          .fno = gen_helper_gvec_usra_d,
 +          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 +          .load_dest = true,
 +          .opt_opc = vecop_list,
 +          .vece = MO_64, },
 +    };
 -const GVecGen2i usra_op[4] = {
 -    { .fni8 = gen_usra8_i64,
 -      .fniv = gen_usra_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_usra,
 -      .vece = MO_8, },
 -    { .fni8 = gen_usra16_i64,
 -      .fniv = gen_usra_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_usra,
 -      .vece = MO_16, },
 -    { .fni4 = gen_usra32_i32,
 -      .fniv = gen_usra_vec,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_usra,
 -      .vece = MO_32, },
 -    { .fni8 = gen_usra64_i64,
 -      .fniv = gen_usra_vec,
 -      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
 -      .load_dest = true,
 -      .opt_opc = vecop_list_usra,
 -      .vece = MO_64, },
 -};
 +    /* tszimm encoding produces immediates in the range [1..esize]. */
 +    tcg_debug_assert(shift > 0);
 +    tcg_debug_assert(shift <= (8 << vece));
 +
 +    /*
 +     * Shifts larger than the element size are architecturally valid.
 +     * Unsigned results in all zeros as input to accumulate: nop.
 +     */
 +    if (shift < (8 << vece)) {
 +        tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
 +    } else {
 +        /* Nop, but we do need to clear the tail. */
 +        tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz);
 +    }
 +}
  static void gen_shr8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
  {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                  case 1:  /* VSRA */
                      /* Right shift comes here negative.  */
                      shift = -shift;
 -                    /* Shifts larger than the element size are architecturally
 -                     * valid.  Unsigned results in all zeros; signed results
 -                     * in all sign bits.
 -                     */
 -                    if (!u) {
 -                        tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, vec_size,
 -                                        MIN(shift, (8 << size) - 1),
 -                                        &ssra_op[size]);
 -                    } else if (shift >= 8 << size) {
 -                        /* rd += 0 */
 +                    if (u) {
 +                        gen_gvec_usra(size, rd_ofs, rm_ofs, shift,
 +                                      vec_size, vec_size);
                      } else {
 -                        tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, vec_size,
 -                                        shift, &usra_op[size]);
 +                        gen_gvec_ssra(size, rd_ofs, rm_ofs, shift,
 +                                      vec_size, vec_size);
                      }
                      return 0;
 diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/vec_helper.c
 +++ b/target/arm/vec_helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(gvec_sqsub_d)(void *vd, void *vq, void *vn,
      clear_tail(d, oprsz, simd_maxsz(desc));
  }
 +
 +#define DO_SRA(NAME, TYPE)                              \
 +void HELPER(NAME)(void *vd, void *vn, uint32_t desc)    \
 +{                                                       \
 +    intptr_t i, oprsz = simd_oprsz(desc);               \
 +    int shift = simd_data(desc);                        \
 +    TYPE *d = vd, *n = vn;                              \
 +    for (i = 0; i < oprsz / sizeof(TYPE); i++) {        \
 +        d[i] += n[i] >> shift;                          \
 +    }                                                   \
 +    clear_tail(d, oprsz, simd_maxsz(desc));             \
 +}
 +
-+DO_SRA(gvec_ssra_b, int8_t)
+ static inline bool isar_feature_aa64_sve2(const ARMISARegisters *id)
-+DO_SRA(gvec_ssra_h, int16_t)
+ {
-+DO_SRA(gvec_ssra_s, int32_t)
+     return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, SVEVER) != 0;
-+DO_SRA(gvec_ssra_d, int64_t)
+@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_any_tts2uxn(const ARMISARegisters *id)
-+
+     return isar_feature_aa64_tts2uxn(id) || isar_feature_aa32_tts2uxn(id);
-+DO_SRA(gvec_usra_b, uint8_t)
+ }
-+DO_SRA(gvec_usra_h, uint16_t)
-+DO_SRA(gvec_usra_s, uint32_t)
++static inline bool isar_feature_any_debugv8p2(const ARMISARegisters *id)
-+DO_SRA(gvec_usra_d, uint64_t)
++{
-+
++    return isar_feature_aa64_debugv8p2(id) || isar_feature_aa32_debugv8p2(id);
-+#undef DO_SRA
++}
 +
  /*
-  * Convert float16 to float32, raising no exceptions and
+  * Forward to the above feature tests given an ARMCPU pointer.
-  * preserving exceptional values, including SNaN.
+  */
 --
-.20.1
+.25.1

-[PULL 14/45] target/arm: Pass pointer to qc to qrdmla/qrdmls
+[PULL 22/23] target/arm: Add isar_feature_{aa64,any}_ras
 From: Richard Henderson <richard.henderson@linaro.org>
-Pass a pointer directly to env->vfp.qc[0], rather than env.
+Add the aa64 predicate for detecting RAS support from id registers.
-This will allow SVE2, which does not modify QC, to pass a
+We already have the aa32 version from the M-profile work.
-pointer to dummy storage.
+Add the 'any' predicate for testing both aa64 and aa32.
 Change the return type of inl_qrdml.h_s16 to match the
 sense of the operation: signed.
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200513163245.17915-14-richard.henderson@linaro.org
+Message-id: 20220501055028.646596-34-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate.c  | 18 ++++++++---
+ target/arm/cpu.h | 10 ++++++++++
- target/arm/vec_helper.c | 70 +++++++++++++++++++++++------------------
+file changed, 10 insertions(+)
 files changed, 54 insertions(+), 34 deletions(-)
-diff --git a/target/arm/translate.c b/target/arm/translate.c
+diff --git a/target/arm/cpu.h b/target/arm/cpu.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.c
+--- a/target/arm/cpu.h
-+++ b/target/arm/translate.c
++++ b/target/arm/cpu.h
-@@ -XXX,XX +XXX,XX @@ static const uint8_t neon_2rm_sizes[] = {
+@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa64_aa32_el1(const ARMISARegisters *id)
-     [NEON_2RM_VCVT_UF] = 0x4,
+     return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, EL1) >= 2;
- };
+ }
-+static void gen_gvec_fn3_qc(uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs,
++static inline bool isar_feature_aa64_ras(const ARMISARegisters *id)
 +                            uint32_t opr_sz, uint32_t max_sz,
 +                            gen_helper_gvec_3_ptr *fn)
 +{
-+    TCGv_ptr qc_ptr = tcg_temp_new_ptr();
++    return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, RAS) != 0;
 +
 +    tcg_gen_addi_ptr(qc_ptr, cpu_env, offsetof(CPUARMState, vfp.qc));
 +    tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, qc_ptr,
 +                       opr_sz, max_sz, 0, fn);
 +    tcg_temp_free_ptr(qc_ptr);
 +}
 +
- void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+ static inline bool isar_feature_aa64_sve(const ARMISARegisters *id)
                            uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
  {
-@@ -XXX,XX +XXX,XX @@ void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+     return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, SVE) != 0;
-         gen_helper_gvec_qrdmlah_s16, gen_helper_gvec_qrdmlah_s32
+@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_any_debugv8p2(const ARMISARegisters *id)
-     };
+     return isar_feature_aa64_debugv8p2(id) || isar_feature_aa32_debugv8p2(id);
      tcg_debug_assert(vece >= 1 && vece <= 2);
 -    tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, cpu_env,
 -                       opr_sz, max_sz, 0, fns[vece - 1]);
 +    gen_gvec_fn3_qc(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, fns[vece - 1]);
  }
- void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
++static inline bool isar_feature_any_ras(const ARMISARegisters *id)
@@ -XXX,XX +XXX,XX @@ void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
          gen_helper_gvec_qrdmlsh_s16, gen_helper_gvec_qrdmlsh_s32
      };
      tcg_debug_assert(vece >= 1 && vece <= 2);
 -    tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, cpu_env,
 -                       opr_sz, max_sz, 0, fns[vece - 1]);
 +    gen_gvec_fn3_qc(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, fns[vece - 1]);
  }
  #define GEN_CMP0(NAME, COND)                                            \
 diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/vec_helper.c
 +++ b/target/arm/vec_helper.c
@@ -XXX,XX +XXX,XX @@
  #define H4(x)  (x)
  #endif
 -#define SET_QC() env->vfp.qc[0] = 1
 -
  static void clear_tail(void *vd, uintptr_t opr_sz, uintptr_t max_sz)
  {
      uint64_t *d = vd + opr_sz;
@@ -XXX,XX +XXX,XX @@ static void clear_tail(void *vd, uintptr_t opr_sz, uintptr_t max_sz)
  }
  /* Signed saturating rounding doubling multiply-accumulate high half, 16-bit */
 -static uint16_t inl_qrdmlah_s16(CPUARMState *env, int16_t src1,
 -                                int16_t src2, int16_t src3)
 +static int16_t inl_qrdmlah_s16(int16_t src1, int16_t src2,
 +                               int16_t src3, uint32_t *sat)
  {
      /* Simplify:
       * = ((a3 << 16) + ((e1 * e2) << 1) + (1 << 15)) >> 16
@@ -XXX,XX +XXX,XX @@ static uint16_t inl_qrdmlah_s16(CPUARMState *env, int16_t src1,
      ret = ((int32_t)src3 << 15) + ret + (1 << 14);
      ret >>= 15;
      if (ret != (int16_t)ret) {
 -        SET_QC();
 +        *sat = 1;
          ret = (ret < 0 ? -0x8000 : 0x7fff);
      }
      return ret;
@@ -XXX,XX +XXX,XX @@ static uint16_t inl_qrdmlah_s16(CPUARMState *env, int16_t src1,
  uint32_t HELPER(neon_qrdmlah_s16)(CPUARMState *env, uint32_t src1,
                                    uint32_t src2, uint32_t src3)
  {
 -    uint16_t e1 = inl_qrdmlah_s16(env, src1, src2, src3);
 -    uint16_t e2 = inl_qrdmlah_s16(env, src1 >> 16, src2 >> 16, src3 >> 16);
 +    uint32_t *sat = &env->vfp.qc[0];
 +    uint16_t e1 = inl_qrdmlah_s16(src1, src2, src3, sat);
 +    uint16_t e2 = inl_qrdmlah_s16(src1 >> 16, src2 >> 16, src3 >> 16, sat);
      return deposit32(e1, 16, 16, e2);
  }
  void HELPER(gvec_qrdmlah_s16)(void *vd, void *vn, void *vm,
 -                              void *ve, uint32_t desc)
 +                              void *vq, uint32_t desc)
  {
      uintptr_t opr_sz = simd_oprsz(desc);
      int16_t *d = vd;
      int16_t *n = vn;
      int16_t *m = vm;
 -    CPUARMState *env = ve;
      uintptr_t i;
      for (i = 0; i < opr_sz / 2; ++i) {
 -        d[i] = inl_qrdmlah_s16(env, n[i], m[i], d[i]);
 +        d[i] = inl_qrdmlah_s16(n[i], m[i], d[i], vq);
      }
      clear_tail(d, opr_sz, simd_maxsz(desc));
  }
  /* Signed saturating rounding doubling multiply-subtract high half, 16-bit */
 -static uint16_t inl_qrdmlsh_s16(CPUARMState *env, int16_t src1,
 -                                int16_t src2, int16_t src3)
 +static int16_t inl_qrdmlsh_s16(int16_t src1, int16_t src2,
 +                               int16_t src3, uint32_t *sat)
  {
      /* Similarly, using subtraction:
       * = ((a3 << 16) - ((e1 * e2) << 1) + (1 << 15)) >> 16
@@ -XXX,XX +XXX,XX @@ static uint16_t inl_qrdmlsh_s16(CPUARMState *env, int16_t src1,
      ret = ((int32_t)src3 << 15) - ret + (1 << 14);
      ret >>= 15;
      if (ret != (int16_t)ret) {
 -        SET_QC();
 +        *sat = 1;
          ret = (ret < 0 ? -0x8000 : 0x7fff);
      }
      return ret;
@@ -XXX,XX +XXX,XX @@ static uint16_t inl_qrdmlsh_s16(CPUARMState *env, int16_t src1,
  uint32_t HELPER(neon_qrdmlsh_s16)(CPUARMState *env, uint32_t src1,
                                    uint32_t src2, uint32_t src3)
  {
 -    uint16_t e1 = inl_qrdmlsh_s16(env, src1, src2, src3);
 -    uint16_t e2 = inl_qrdmlsh_s16(env, src1 >> 16, src2 >> 16, src3 >> 16);
 +    uint32_t *sat = &env->vfp.qc[0];
 +    uint16_t e1 = inl_qrdmlsh_s16(src1, src2, src3, sat);
 +    uint16_t e2 = inl_qrdmlsh_s16(src1 >> 16, src2 >> 16, src3 >> 16, sat);
      return deposit32(e1, 16, 16, e2);
  }
  void HELPER(gvec_qrdmlsh_s16)(void *vd, void *vn, void *vm,
 -                              void *ve, uint32_t desc)
 +                              void *vq, uint32_t desc)
  {
      uintptr_t opr_sz = simd_oprsz(desc);
      int16_t *d = vd;
      int16_t *n = vn;
      int16_t *m = vm;
 -    CPUARMState *env = ve;
      uintptr_t i;
      for (i = 0; i < opr_sz / 2; ++i) {
 -        d[i] = inl_qrdmlsh_s16(env, n[i], m[i], d[i]);
 +        d[i] = inl_qrdmlsh_s16(n[i], m[i], d[i], vq);
      }
      clear_tail(d, opr_sz, simd_maxsz(desc));
  }
  /* Signed saturating rounding doubling multiply-accumulate high half, 32-bit */
 -uint32_t HELPER(neon_qrdmlah_s32)(CPUARMState *env, int32_t src1,
 -                                  int32_t src2, int32_t src3)
 +static int32_t inl_qrdmlah_s32(int32_t src1, int32_t src2,
 +                               int32_t src3, uint32_t *sat)
  {
      /* Simplify similarly to int_qrdmlah_s16 above.  */
      int64_t ret = (int64_t)src1 * src2;
      ret = ((int64_t)src3 << 31) + ret + (1 << 30);
      ret >>= 31;
      if (ret != (int32_t)ret) {
 -        SET_QC();
 +        *sat = 1;
          ret = (ret < 0 ? INT32_MIN : INT32_MAX);
      }
      return ret;
  }
 +uint32_t HELPER(neon_qrdmlah_s32)(CPUARMState *env, int32_t src1,
 +                                  int32_t src2, int32_t src3)
 +{
-+    uint32_t *sat = &env->vfp.qc[0];
++    return isar_feature_aa64_ras(id) || isar_feature_aa32_ras(id);
 +    return inl_qrdmlah_s32(src1, src2, src3, sat);
 +}
 +
- void HELPER(gvec_qrdmlah_s32)(void *vd, void *vn, void *vm,
+ /*
--                              void *ve, uint32_t desc)
+  * Forward to the above feature tests given an ARMCPU pointer.
-+                              void *vq, uint32_t desc)
+  */
  {
      uintptr_t opr_sz = simd_oprsz(desc);
      int32_t *d = vd;
      int32_t *n = vn;
      int32_t *m = vm;
 -    CPUARMState *env = ve;
      uintptr_t i;
      for (i = 0; i < opr_sz / 4; ++i) {
 -        d[i] = helper_neon_qrdmlah_s32(env, n[i], m[i], d[i]);
 +        d[i] = inl_qrdmlah_s32(n[i], m[i], d[i], vq);
      }
      clear_tail(d, opr_sz, simd_maxsz(desc));
  }
  /* Signed saturating rounding doubling multiply-subtract high half, 32-bit */
 -uint32_t HELPER(neon_qrdmlsh_s32)(CPUARMState *env, int32_t src1,
 -                                  int32_t src2, int32_t src3)
 +static int32_t inl_qrdmlsh_s32(int32_t src1, int32_t src2,
 +                               int32_t src3, uint32_t *sat)
  {
      /* Simplify similarly to int_qrdmlsh_s16 above.  */
      int64_t ret = (int64_t)src1 * src2;
      ret = ((int64_t)src3 << 31) - ret + (1 << 30);
      ret >>= 31;
      if (ret != (int32_t)ret) {
 -        SET_QC();
 +        *sat = 1;
          ret = (ret < 0 ? INT32_MIN : INT32_MAX);
      }
      return ret;
  }
 +uint32_t HELPER(neon_qrdmlsh_s32)(CPUARMState *env, int32_t src1,
 +                                  int32_t src2, int32_t src3)
 +{
 +    uint32_t *sat = &env->vfp.qc[0];
 +    return inl_qrdmlsh_s32(src1, src2, src3, sat);
 +}
 +
  void HELPER(gvec_qrdmlsh_s32)(void *vd, void *vn, void *vm,
 -                              void *ve, uint32_t desc)
 +                              void *vq, uint32_t desc)
  {
      uintptr_t opr_sz = simd_oprsz(desc);
      int32_t *d = vd;
      int32_t *n = vn;
      int32_t *m = vm;
 -    CPUARMState *env = ve;
      uintptr_t i;
      for (i = 0; i < opr_sz / 4; ++i) {
 -        d[i] = helper_neon_qrdmlsh_s32(env, n[i], m[i], d[i]);
 +        d[i] = inl_qrdmlsh_s32(n[i], m[i], d[i], vq);
      }
      clear_tail(d, opr_sz, simd_maxsz(desc));
  }
 --
-.20.1
+.25.1

-[PULL 18/45] aspeed: Add support for the sonorapass-bmc board
+Deleted patch
-From: Patrick Williams <patrick@stwcx.xyz>
-Sonora Pass is a 2 socket x86 motherboard designed by Facebook
-and supported by OpenBMC.  Strapping configuration was obtained
-from hardware and i2c configuration is based on dts found at:
-https://github.com/facebook/openbmc-linux/blob/1633c87b8ba7c162095787c988979b748ba65dc8/arch/arm/boot/dts/aspeed-bmc-facebook-sonorapass.dts
-Booted a test image of http://github.com/facebook/openbmc to login
-prompt.
-Signed-off-by: Patrick Williams <patrick@stwcx.xyz>
-Reviewed-by: Amithash Prasad <amithash@fb.com>
-Reviewed-by: Cédric Le Goater <clg@kaod.org>
-[PMM: fixed block comment style nit]
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
----
- hw/arm/aspeed.c | 78 +++++++++++++++++++++++++++++++++++++++++++++++++
-file changed, 78 insertions(+)
-diff --git a/hw/arm/aspeed.c b/hw/arm/aspeed.c
-index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/aspeed.c
-+++ b/hw/arm/aspeed.c
-@@ -XXX,XX +XXX,XX @@ struct AspeedBoardState {
-         SCU_AST2500_HW_STRAP_ACPI_ENABLE |                              \
-         SCU_HW_STRAP_SPI_MODE(SCU_HW_STRAP_SPI_MASTER))
-+/* Sonorapass hardware value: 0xF100D216 */
-+#define SONORAPASS_BMC_HW_STRAP1 (                                      \
-+        SCU_AST2500_HW_STRAP_SPI_AUTOFETCH_ENABLE |                     \
-+        SCU_AST2500_HW_STRAP_GPIO_STRAP_ENABLE |                        \
-+        SCU_AST2500_HW_STRAP_UART_DEBUG |                               \
-+        SCU_AST2500_HW_STRAP_RESERVED28 |                               \
-+        SCU_AST2500_HW_STRAP_DDR4_ENABLE |                              \
-+        SCU_HW_STRAP_VGA_CLASS_CODE |                                   \
-+        SCU_HW_STRAP_LPC_RESET_PIN |                                    \
-+        SCU_HW_STRAP_SPI_MODE(SCU_HW_STRAP_SPI_MASTER) |                \
-+        SCU_AST2500_HW_STRAP_SET_AXI_AHB_RATIO(AXI_AHB_RATIO_2_1) |     \
-+        SCU_HW_STRAP_VGA_BIOS_ROM |                                     \
-+        SCU_HW_STRAP_VGA_SIZE_SET(VGA_16M_DRAM) |                       \
-+        SCU_AST2500_HW_STRAP_RESERVED1)
-+
- /* Swift hardware value: 0xF11AD206 */
- #define SWIFT_BMC_HW_STRAP1 (                                           \
-         AST2500_HW_STRAP1_DEFAULTS |                                    \
-@@ -XXX,XX +XXX,XX @@ static void swift_bmc_i2c_init(AspeedBoardState *bmc)
-     i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 12), "tmp105", 0x4a);
- }
-+static void sonorapass_bmc_i2c_init(AspeedBoardState *bmc)
-+{
-+    AspeedSoCState *soc = &bmc->soc;
-+
-+    /* bus 2 : */
-+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 2), "tmp105", 0x48);
-+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 2), "tmp105", 0x49);
-+    /* bus 2 : pca9546 @ 0x73 */
-+
-+    /* bus 3 : pca9548 @ 0x70 */
-+
-+    /* bus 4 : */
-+    uint8_t *eeprom4_54 = g_malloc0(8 * 1024);
-+    smbus_eeprom_init_one(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 4), 0x54,
-+                          eeprom4_54);
-+    /* PCA9539 @ 0x76, but PCA9552 is compatible */
-+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 4), "pca9552", 0x76);
-+    /* PCA9539 @ 0x77, but PCA9552 is compatible */
-+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 4), "pca9552", 0x77);
-+
-+    /* bus 6 : */
-+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 6), "tmp105", 0x48);
-+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 6), "tmp105", 0x49);
-+    /* bus 6 : pca9546 @ 0x73 */
-+
-+    /* bus 8 : */
-+    uint8_t *eeprom8_56 = g_malloc0(8 * 1024);
-+    smbus_eeprom_init_one(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 8), 0x56,
-+                          eeprom8_56);
-+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 8), "pca9552", 0x60);
-+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 8), "pca9552", 0x61);
-+    /* bus 8 : adc128d818 @ 0x1d */
-+    /* bus 8 : adc128d818 @ 0x1f */
-+
-+    /*
-+     * bus 13 : pca9548 @ 0x71
-+     *      - channel 3:
-+     *          - tmm421 @ 0x4c
-+     *          - tmp421 @ 0x4e
-+     *          - tmp421 @ 0x4f
-+     */
-+
-+}
-+
- static void witherspoon_bmc_i2c_init(AspeedBoardState *bmc)
- {
-     AspeedSoCState *soc = &bmc->soc;
-@@ -XXX,XX +XXX,XX @@ static void aspeed_machine_romulus_class_init(ObjectClass *oc, void *data)
-     mc->default_ram_size       = 512 * MiB;
- };
-+static void aspeed_machine_sonorapass_class_init(ObjectClass *oc, void *data)
-+{
-+    MachineClass *mc = MACHINE_CLASS(oc);
-+    AspeedMachineClass *amc = ASPEED_MACHINE_CLASS(oc);
-+
-+    mc->desc       = "OCP SonoraPass BMC (ARM1176)";
-+    amc->soc_name  = "ast2500-a1";
-+    amc->hw_strap1 = SONORAPASS_BMC_HW_STRAP1;
-+    amc->fmc_model = "mx66l1g45g";
-+    amc->spi_model = "mx66l1g45g";
-+    amc->num_cs    = 2;
-+    amc->i2c_init  = sonorapass_bmc_i2c_init;
-+    mc->default_ram_size       = 512 * MiB;
-+};
-+
- static void aspeed_machine_swift_class_init(ObjectClass *oc, void *data)
- {
-     MachineClass *mc = MACHINE_CLASS(oc);
-@@ -XXX,XX +XXX,XX @@ static const TypeInfo aspeed_machine_types[] = {
-         .name          = MACHINE_TYPE_NAME("swift-bmc"),
-         .parent        = TYPE_ASPEED_MACHINE,
-         .class_init    = aspeed_machine_swift_class_init,
-+    }, {
-+        .name          = MACHINE_TYPE_NAME("sonorapass-bmc"),
-+        .parent        = TYPE_ASPEED_MACHINE,
-+        .class_init    = aspeed_machine_sonorapass_class_init,
-     }, {
-         .name          = MACHINE_TYPE_NAME("witherspoon-bmc"),
-         .parent        = TYPE_ASPEED_MACHINE,
---
-.20.1

-[PULL 19/45] acpi: nvdimm: change NVDIMM_UUID_LE to a common macro
+Deleted patch
-From: Dongjiu Geng <gengdongjiu@huawei.com>
-The little end UUID is used in many places, so make
-NVDIMM_UUID_LE to a common macro to convert the UUID
-to a little end array.
-Reviewed-by: Xiang Zheng <zhengxiang9@huawei.com>
-Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
-Message-id: 20200512030609.19593-2-gengdongjiu@huawei.com
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
----
- include/qemu/uuid.h | 27 +++++++++++++++++++++++++++
- hw/acpi/nvdimm.c    | 10 +++-------
-files changed, 30 insertions(+), 7 deletions(-)
-diff --git a/include/qemu/uuid.h b/include/qemu/uuid.h
-index XXXXXXX..XXXXXXX 100644
---- a/include/qemu/uuid.h
-+++ b/include/qemu/uuid.h
-@@ -XXX,XX +XXX,XX @@ typedef struct {
-     };
- } QemuUUID;
-+/**
-+ * UUID_LE - converts the fields of UUID to little-endian array,
-+ * each of parameters is the filed of UUID.
-+ *
-+ * @time_low: The low field of the timestamp
-+ * @time_mid: The middle field of the timestamp
-+ * @time_hi_and_version: The high field of the timestamp
-+ *                       multiplexed with the version number
-+ * @clock_seq_hi_and_reserved: The high field of the clock
-+ *                             sequence multiplexed with the variant
-+ * @clock_seq_low: The low field of the clock sequence
-+ * @node0: The spatially unique node0 identifier
-+ * @node1: The spatially unique node1 identifier
-+ * @node2: The spatially unique node2 identifier
-+ * @node3: The spatially unique node3 identifier
-+ * @node4: The spatially unique node4 identifier
-+ * @node5: The spatially unique node5 identifier
-+ */
-+#define UUID_LE(time_low, time_mid, time_hi_and_version,                    \
-+  clock_seq_hi_and_reserved, clock_seq_low, node0, node1, node2,            \
-+  node3, node4, node5)                                                      \
-+  { (time_low) & 0xff, ((time_low) >> 8) & 0xff, ((time_low) >> 16) & 0xff, \
-+    ((time_low) >> 24) & 0xff, (time_mid) & 0xff, ((time_mid) >> 8) & 0xff, \
-+    (time_hi_and_version) & 0xff, ((time_hi_and_version) >> 8) & 0xff,      \
-+    (clock_seq_hi_and_reserved), (clock_seq_low), (node0), (node1), (node2),\
-+    (node3), (node4), (node5) }
-+
- #define UUID_FMT "%02hhx%02hhx%02hhx%02hhx-" \
-                  "%02hhx%02hhx-%02hhx%02hhx-" \
-                  "%02hhx%02hhx-" \
-diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
-index XXXXXXX..XXXXXXX 100644
---- a/hw/acpi/nvdimm.c
-+++ b/hw/acpi/nvdimm.c
-@@ -XXX,XX +XXX,XX @@
-  */
- #include "qemu/osdep.h"
-+#include "qemu/uuid.h"
- #include "hw/acpi/acpi.h"
- #include "hw/acpi/aml-build.h"
- #include "hw/acpi/bios-linker-loader.h"
-@@ -XXX,XX +XXX,XX @@
- #include "hw/mem/nvdimm.h"
- #include "qemu/nvdimm-utils.h"
--#define NVDIMM_UUID_LE(a, b, c, d0, d1, d2, d3, d4, d5, d6, d7)             \
--   { (a) & 0xff, ((a) >> 8) & 0xff, ((a) >> 16) & 0xff, ((a) >> 24) & 0xff, \
--     (b) & 0xff, ((b) >> 8) & 0xff, (c) & 0xff, ((c) >> 8) & 0xff,          \
--     (d0), (d1), (d2), (d3), (d4), (d5), (d6), (d7) }
--
- /*
-  * define Byte Addressable Persistent Memory (PM) Region according to
-  * ACPI 6.0: 5.2.25.1 System Physical Address Range Structure.
-  */
- static const uint8_t nvdimm_nfit_spa_uuid[] =
--      NVDIMM_UUID_LE(0x66f0d379, 0xb4f3, 0x4074, 0xac, 0x43, 0x0d, 0x33,
--                     0x18, 0xb7, 0x8c, 0xdb);
-+      UUID_LE(0x66f0d379, 0xb4f3, 0x4074, 0xac, 0x43, 0x0d, 0x33,
-+              0x18, 0xb7, 0x8c, 0xdb);
- /*
-  * NVDIMM Firmware Interface Table
---
-.20.1

-[PULL 20/45] hw/arm/virt: Introduce a RAS machine option
+Deleted patch
-From: Dongjiu Geng <gengdongjiu@huawei.com>
-RAS Virtualization feature is not supported now, so
-add a RAS machine option and disable it by default.
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
-Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
-Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
-Reviewed-by: Igor Mammedov <imammedo@redhat.com>
-Message-id: 20200512030609.19593-3-gengdongjiu@huawei.com
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
----
- include/hw/arm/virt.h |  1 +
- hw/arm/virt.c         | 23 +++++++++++++++++++++++
-files changed, 24 insertions(+)
-diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
-index XXXXXXX..XXXXXXX 100644
---- a/include/hw/arm/virt.h
-+++ b/include/hw/arm/virt.h
-@@ -XXX,XX +XXX,XX @@ typedef struct {
-     bool highmem_ecam;
-     bool its;
-     bool virt;
-+    bool ras;
-     OnOffAuto acpi;
-     VirtGICType gic_version;
-     VirtIOMMUType iommu;
-diff --git a/hw/arm/virt.c b/hw/arm/virt.c
-index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/virt.c
-+++ b/hw/arm/virt.c
-@@ -XXX,XX +XXX,XX @@ static void virt_set_acpi(Object *obj, Visitor *v, const char *name,
-     visit_type_OnOffAuto(v, name, &vms->acpi, errp);
- }
-+static bool virt_get_ras(Object *obj, Error **errp)
-+{
-+    VirtMachineState *vms = VIRT_MACHINE(obj);
-+
-+    return vms->ras;
-+}
-+
-+static void virt_set_ras(Object *obj, bool value, Error **errp)
-+{
-+    VirtMachineState *vms = VIRT_MACHINE(obj);
-+
-+    vms->ras = value;
-+}
-+
- static char *virt_get_gic_version(Object *obj, Error **errp)
- {
-     VirtMachineState *vms = VIRT_MACHINE(obj);
-@@ -XXX,XX +XXX,XX @@ static void virt_instance_init(Object *obj)
-                                     "Valid values are none and smmuv3",
-                                     NULL);
-+    /* Default disallows RAS instantiation */
-+    vms->ras = false;
-+    object_property_add_bool(obj, "ras", virt_get_ras,
-+                             virt_set_ras, NULL);
-+    object_property_set_description(obj, "ras",
-+                                    "Set on/off to enable/disable reporting host memory errors "
-+                                    "to a KVM guest using ACPI and guest external abort exceptions",
-+                                    NULL);
-+
-     vms->irqmap = a15irqmap;
-     virt_flash_create(vms);
---
-.20.1

-[PULL 21/45] docs: APEI GHES generation and CPER record description
+Deleted patch
-From: Dongjiu Geng <gengdongjiu@huawei.com>
-Add APEI/GHES detailed design document
-Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
-Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
-Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
-Reviewed-by: Igor Mammedov <imammedo@redhat.com>
-Message-id: 20200512030609.19593-4-gengdongjiu@huawei.com
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
----
- docs/specs/acpi_hest_ghes.rst | 110 ++++++++++++++++++++++++++++++++++
- docs/specs/index.rst          |   1 +
-files changed, 111 insertions(+)
- create mode 100644 docs/specs/acpi_hest_ghes.rst
-diff --git a/docs/specs/acpi_hest_ghes.rst b/docs/specs/acpi_hest_ghes.rst
-new file mode 100644
-index XXXXXXX..XXXXXXX
---- /dev/null
-+++ b/docs/specs/acpi_hest_ghes.rst
-@@ -XXX,XX +XXX,XX @@
-+APEI tables generating and CPER record
-+======================================
-+
-+..
-+   Copyright (c) 2020 HUAWEI TECHNOLOGIES CO., LTD.
-+
-+   This work is licensed under the terms of the GNU GPL, version 2 or later.
-+   See the COPYING file in the top-level directory.
-+
-+Design Details
-+--------------
-+
-+::
-+
-+         etc/acpi/tables                           etc/hardware_errors
-+      ====================                   ===============================
-+  + +--------------------------+            +----------------------------+
-+  | | HEST                     | +--------->|    error_block_address1    |------+
-+  | +--------------------------+ |          +----------------------------+      |
-+  | | GHES1                    | | +------->|    error_block_address2    |------+-+
-+  | +--------------------------+ | |        +----------------------------+      | |
-+  | | .................        | | |        |      ..............        |      | |
-+  | | error_status_address-----+-+ |        -----------------------------+      | |
-+  | | .................        |   |   +--->|    error_block_addressN    |------+-+---+
-+  | | read_ack_register--------+-+ |   |    +----------------------------+      | |   |
-+  | | read_ack_preserve        | +-+---+--->|     read_ack_register1     |      | |   |
-+  | | read_ack_write           |   |   |    +----------------------------+      | |   |
-+  + +--------------------------+   | +-+--->|     read_ack_register2     |      | |   |
-+  | | GHES2                    |   | | |    +----------------------------+      | |   |
-+  + +--------------------------+   | | |    |       .............        |      | |   |
-+  | | .................        |   | | |    +----------------------------+      | |   |
-+  | | error_status_address-----+---+ | | +->|     read_ack_registerN     |      | |   |
-+  | | .................        |     | | |  +----------------------------+      | |   |
-+  | | read_ack_register--------+-----+ | |  |Generic Error Status Block 1|<-----+ |   |
-+  | | read_ack_preserve        |       | |  |-+------------------------+-+        |   |
-+  | | read_ack_write           |       | |  | |          CPER          | |        |   |
-+  + +--------------------------|       | |  | |          CPER          | |        |   |
-+  | | ...............          |       | |  | |          ....          | |        |   |
-+  + +--------------------------+       | |  | |          CPER          | |        |   |
-+  | | GHESN                    |       | |  |-+------------------------+-|        |   |
-+  + +--------------------------+       | |  |Generic Error Status Block 2|<-------+   |
-+  | | .................        |       | |  |-+------------------------+-+            |
-+  | | error_status_address-----+-------+ |  | |           CPER         | |            |
-+  | | .................        |         |  | |           CPER         | |            |
-+  | | read_ack_register--------+---------+  | |           ....         | |            |
-+  | | read_ack_preserve        |            | |           CPER         | |            |
-+  | | read_ack_write           |            +-+------------------------+-+            |
-+  + +--------------------------+            |         ..........         |            |
-+                                            |----------------------------+            |
-+                                            |Generic Error Status Block N |<----------+
-+                                            |-+-------------------------+-+
-+                                            | |          CPER           | |
-+                                            | |          CPER           | |
-+                                            | |          ....           | |
-+                                            | |          CPER           | |
-+                                            +-+-------------------------+-+
-+
-+
-+(1) QEMU generates the ACPI HEST table. This table goes in the current
-+    "etc/acpi/tables" fw_cfg blob. Each error source has different
-+    notification types.
-+
-+(2) A new fw_cfg blob called "etc/hardware_errors" is introduced. QEMU
-+    also needs to populate this blob. The "etc/hardware_errors" fw_cfg blob
-+    contains an address registers table and an Error Status Data Block table.
-+
-+(3) The address registers table contains N Error Block Address entries
-+    and N Read Ack Register entries. The size for each entry is 8-byte.
-+    The Error Status Data Block table contains N Error Status Data Block
-+    entries. The size for each entry is 4096(0x1000) bytes. The total size
-+    for the "etc/hardware_errors" fw_cfg blob is (N * 8 * 2 + N * 4096) bytes.
-+    N is the number of the kinds of hardware error sources.
-+
-+(4) QEMU generates the ACPI linker/loader script for the firmware. The
-+    firmware pre-allocates memory for "etc/acpi/tables", "etc/hardware_errors"
-+    and copies blob contents there.
-+
-+(5) QEMU generates N ADD_POINTER commands, which patch addresses in the
-+    "error_status_address" fields of the HEST table with a pointer to the
-+    corresponding "address registers" in the "etc/hardware_errors" blob.
-+
-+(6) QEMU generates N ADD_POINTER commands, which patch addresses in the
-+    "read_ack_register" fields of the HEST table with a pointer to the
-+    corresponding "read_ack_register" within the "etc/hardware_errors" blob.
-+
-+(7) QEMU generates N ADD_POINTER commands for the firmware, which patch
-+    addresses in the "error_block_address" fields with a pointer to the
-+    respective "Error Status Data Block" in the "etc/hardware_errors" blob.
-+
-+(8) QEMU defines a third and write-only fw_cfg blob which is called
-+    "etc/hardware_errors_addr". Through that blob, the firmware can send back
-+    the guest-side allocation addresses to QEMU. The "etc/hardware_errors_addr"
-+    blob contains a 8-byte entry. QEMU generates a single WRITE_POINTER command
-+    for the firmware. The firmware will write back the start address of
-+    "etc/hardware_errors" blob to the fw_cfg file "etc/hardware_errors_addr".
-+
-+(9) When QEMU gets a SIGBUS from the kernel, QEMU writes CPER into corresponding
-+    "Error Status Data Block", guest memory, and then injects platform specific
-+    interrupt (in case of arm/virt machine it's Synchronous External Abort) as a
-+    notification which is necessary for notifying the guest.
-+
-+(10) This notification (in virtual hardware) will be handled by the guest
-+     kernel, on receiving notification, guest APEI driver could read the CPER error
-+     and take appropriate action.
-+
-+(11) kvm_arch_on_sigbus_vcpu() uses source_id as index in "etc/hardware_errors" to
-+     find out "Error Status Data Block" entry corresponding to error source. So supported
-+     source_id values should be assigned here and not be changed afterwards to make sure
-+     that guest will write error into expected "Error Status Data Block" even if guest was
-+     migrated to a newer QEMU.
-diff --git a/docs/specs/index.rst b/docs/specs/index.rst
-index XXXXXXX..XXXXXXX 100644
---- a/docs/specs/index.rst
-+++ b/docs/specs/index.rst
-@@ -XXX,XX +XXX,XX @@ Contents:
-    ppc-spapr-xive
-    acpi_hw_reduced_hotplug
-    tpm
-+   acpi_hest_ghes
---
-.20.1

-[PULL 24/45] ACPI: Record the Generic Error Status Block address
+Deleted patch
-From: Dongjiu Geng <gengdongjiu@huawei.com>
-Record the GHEB address via fw_cfg file, when recording
-a error to CPER, it will use this address to find out
-Generic Error Data Entries and write the error.
-In order to avoid migration failure, make hardware
-error table address to a part of GED device instead
-of global variable, then this address will be migrated
-to target QEMU.
-Acked-by: Xiang Zheng <zhengxiang9@huawei.com>
-Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
-Reviewed-by: Igor Mammedov <imammedo@redhat.com>
-Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
-Message-id: 20200512030609.19593-7-gengdongjiu@huawei.com
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
----
- include/hw/acpi/generic_event_device.h |  2 ++
- include/hw/acpi/ghes.h                 |  6 ++++++
- hw/acpi/generic_event_device.c         | 19 +++++++++++++++++++
- hw/acpi/ghes.c                         | 14 ++++++++++++++
- hw/arm/virt-acpi-build.c               |  8 ++++++++
-files changed, 49 insertions(+)
-diff --git a/include/hw/acpi/generic_event_device.h b/include/hw/acpi/generic_event_device.h
-index XXXXXXX..XXXXXXX 100644
---- a/include/hw/acpi/generic_event_device.h
-+++ b/include/hw/acpi/generic_event_device.h
-@@ -XXX,XX +XXX,XX @@
- #include "hw/sysbus.h"
- #include "hw/acpi/memory_hotplug.h"
-+#include "hw/acpi/ghes.h"
- #define ACPI_POWER_BUTTON_DEVICE "PWRB"
-@@ -XXX,XX +XXX,XX @@ typedef struct AcpiGedState {
-     GEDState ged_state;
-     uint32_t ged_event_bitmap;
-     qemu_irq irq;
-+    AcpiGhesState ghes_state;
- } AcpiGedState;
- void build_ged_aml(Aml *table, const char* name, HotplugHandler *hotplug_dev,
-diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
-index XXXXXXX..XXXXXXX 100644
---- a/include/hw/acpi/ghes.h
-+++ b/include/hw/acpi/ghes.h
-@@ -XXX,XX +XXX,XX @@ enum {
-     ACPI_HEST_SRC_ID_RESERVED,
- };
-+typedef struct AcpiGhesState {
-+    uint64_t ghes_addr_le;
-+} AcpiGhesState;
-+
- void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker);
- void acpi_build_hest(GArray *table_data, BIOSLinker *linker);
-+void acpi_ghes_add_fw_cfg(AcpiGhesState *vms, FWCfgState *s,
-+                          GArray *hardware_errors);
- #endif
-diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
-index XXXXXXX..XXXXXXX 100644
---- a/hw/acpi/generic_event_device.c
-+++ b/hw/acpi/generic_event_device.c
-@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_ged_state = {
-     }
- };
-+static bool ghes_needed(void *opaque)
-+{
-+    AcpiGedState *s = opaque;
-+    return s->ghes_state.ghes_addr_le;
-+}
-+
-+static const VMStateDescription vmstate_ghes_state = {
-+    .name = "acpi-ged/ghes",
-+    .version_id = 1,
-+    .minimum_version_id = 1,
-+    .needed = ghes_needed,
-+    .fields      = (VMStateField[]) {
-+        VMSTATE_STRUCT(ghes_state, AcpiGedState, 1,
-+                       vmstate_ghes_state, AcpiGhesState),
-+        VMSTATE_END_OF_LIST()
-+    }
-+};
-+
- static const VMStateDescription vmstate_acpi_ged = {
-     .name = "acpi-ged",
-     .version_id = 1,
-@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_acpi_ged = {
-     },
-     .subsections = (const VMStateDescription * []) {
-         &vmstate_memhp_state,
-+        &vmstate_ghes_state,
-         NULL
-     }
- };
-diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
-index XXXXXXX..XXXXXXX 100644
---- a/hw/acpi/ghes.c
-+++ b/hw/acpi/ghes.c
-@@ -XXX,XX +XXX,XX @@
- #include "hw/acpi/ghes.h"
- #include "hw/acpi/aml-build.h"
- #include "qemu/error-report.h"
-+#include "hw/acpi/generic_event_device.h"
-+#include "hw/nvram/fw_cfg.h"
- #define ACPI_GHES_ERRORS_FW_CFG_FILE        "etc/hardware_errors"
- #define ACPI_GHES_DATA_ADDR_FW_CFG_FILE     "etc/hardware_errors_addr"
-@@ -XXX,XX +XXX,XX @@ void acpi_build_hest(GArray *table_data, BIOSLinker *linker)
-     build_header(linker, table_data, (void *)(table_data->data + hest_start),
-         "HEST", table_data->len - hest_start, 1, NULL, NULL);
- }
-+
-+void acpi_ghes_add_fw_cfg(AcpiGhesState *ags, FWCfgState *s,
-+                          GArray *hardware_error)
-+{
-+    /* Create a read-only fw_cfg file for GHES */
-+    fw_cfg_add_file(s, ACPI_GHES_ERRORS_FW_CFG_FILE, hardware_error->data,
-+                    hardware_error->len);
-+
-+    /* Create a read-write fw_cfg file for Address */
-+    fw_cfg_add_file_callback(s, ACPI_GHES_DATA_ADDR_FW_CFG_FILE, NULL, NULL,
-+        NULL, &(ags->ghes_addr_le), sizeof(ags->ghes_addr_le), false);
-+}
-diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
-index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/virt-acpi-build.c
-+++ b/hw/arm/virt-acpi-build.c
-@@ -XXX,XX +XXX,XX @@ void virt_acpi_setup(VirtMachineState *vms)
- {
-     AcpiBuildTables tables;
-     AcpiBuildState *build_state;
-+    AcpiGedState *acpi_ged_state;
-     if (!vms->fw_cfg) {
-         trace_virt_acpi_setup();
-@@ -XXX,XX +XXX,XX @@ void virt_acpi_setup(VirtMachineState *vms)
-     fw_cfg_add_file(vms->fw_cfg, ACPI_BUILD_TPMLOG_FILE, tables.tcpalog->data,
-                     acpi_data_len(tables.tcpalog));
-+    if (vms->ras) {
-+        assert(vms->acpi_dev);
-+        acpi_ged_state = ACPI_GED(vms->acpi_dev);
-+        acpi_ghes_add_fw_cfg(&acpi_ged_state->ghes_state,
-+                             vms->fw_cfg, tables.hardware_errors);
-+    }
-+
-     build_state->rsdp_mr = acpi_add_rom_blob(virt_acpi_build_update,
-                                              build_state, tables.rsdp,
-                                              ACPI_BUILD_RSDP_FILE, 0);
---
-.20.1

-[PULL 27/45] target-arm: kvm64: handle SIGBUS signal from kernel or KVM
+[PULL 23/23] target/arm: read access to performance counters from EL0
-From: Dongjiu Geng <gengdongjiu@huawei.com>
+From: Alex Zuepke <alex.zuepke@tum.de>
-Add a SIGBUS signal handler. In this handler, it checks the SIGBUS type,
+The ARMv8 manual defines that PMUSERENR_EL0.ER enables read-access
-translates the host VA delivered by host to guest PA, then fills this PA
+to both PMXEVCNTR_EL0 and PMEVCNTR<n>_EL0 registers, however,
-to guest APEI GHES memory, then notifies guest according to the SIGBUS
+we only use it for PMXEVCNTR_EL0. Extend to PMEVCNTR<n>_EL0 as well.
 type.
-When guest accesses the poisoned memory, it will generate a Synchronous
+Signed-off-by: Alex Zuepke <alex.zuepke@tum.de>
-External Abort(SEA). Then host kernel gets an APEI notification and calls
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-memory_failure() to unmapped the affected page in stage 2, finally
+Message-id: 20220428132717.84190-1-alex.zuepke@tum.de
 returns to guest.
 Guest continues to access the PG_hwpoison page, it will trap to KVM as
 stage2 fault, then a SIGBUS_MCEERR_AR synchronous signal is delivered to
 Qemu, Qemu records this error address into guest APEI GHES memory and
 notifes guest using Synchronous-External-Abort(SEA).
 In order to inject a vSEA, we introduce the kvm_inject_arm_sea() function
 in which we can setup the type of exception and the syndrome information.
 When switching to guest, the target vcpu will jump to the synchronous
 external abort vector table entry.
 The ESR_ELx.DFSC is set to synchronous external abort(0x10), and the
 ESR_ELx.FnV is set to not valid(0x1), which will tell guest that FAR is
 not valid and hold an UNKNOWN value. These values will be set to KVM
 register structures through KVM_SET_ONE_REG IOCTL.
 Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
 Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
 Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
 Acked-by: Xiang Zheng <zhengxiang9@huawei.com>
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Igor Mammedov <imammedo@redhat.com>
 Message-id: 20200512030609.19593-10-gengdongjiu@huawei.com
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- include/sysemu/kvm.h    |  3 +-
+ target/arm/helper.c | 4 ++--
- target/arm/cpu.h        |  4 +++
+file changed, 2 insertions(+), 2 deletions(-)
  target/arm/internals.h  |  5 +--
  target/i386/cpu.h       |  2 ++
  target/arm/helper.c     |  2 +-
  target/arm/kvm64.c      | 77 +++++++++++++++++++++++++++++++++++++++++
  target/arm/tlb_helper.c |  2 +-
 files changed, 89 insertions(+), 6 deletions(-)
-diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
-index XXXXXXX..XXXXXXX 100644
---- a/include/sysemu/kvm.h
-+++ b/include/sysemu/kvm.h
-@@ -XXX,XX +XXX,XX @@ bool kvm_vcpu_id_is_valid(int vcpu_id);
- /* Returns VCPU ID to be used on KVM_CREATE_VCPU ioctl() */
- unsigned long kvm_arch_vcpu_id(CPUState *cpu);
--#ifdef TARGET_I386
--#define KVM_HAVE_MCE_INJECTION 1
-+#ifdef KVM_HAVE_MCE_INJECTION
- void kvm_arch_on_sigbus_vcpu(CPUState *cpu, int code, void *addr);
- #endif
-diff --git a/target/arm/cpu.h b/target/arm/cpu.h
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.h
-+++ b/target/arm/cpu.h
-@@ -XXX,XX +XXX,XX @@
- /* ARM processors have a weak memory model */
- #define TCG_GUEST_DEFAULT_MO      (0)
-+#ifdef TARGET_AARCH64
-+#define KVM_HAVE_MCE_INJECTION 1
-+#endif
-+
- #define EXCP_UDEF            1   /* undefined instruction */
- #define EXCP_SWI             2   /* software interrupt */
- #define EXCP_PREFETCH_ABORT  3
-diff --git a/target/arm/internals.h b/target/arm/internals.h
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/internals.h
-+++ b/target/arm/internals.h
-@@ -XXX,XX +XXX,XX @@ static inline uint32_t syn_insn_abort(int same_el, int ea, int s1ptw, int fsc)
-         | ARM_EL_IL | (ea << 9) | (s1ptw << 7) | fsc;
- }
--static inline uint32_t syn_data_abort_no_iss(int same_el,
-+static inline uint32_t syn_data_abort_no_iss(int same_el, int fnv,
-                                              int ea, int cm, int s1ptw,
-                                              int wnr, int fsc)
- {
-     return (EC_DATAABORT << ARM_EL_EC_SHIFT) | (same_el << ARM_EL_EC_SHIFT)
-            | ARM_EL_IL
--           | (ea << 9) | (cm << 8) | (s1ptw << 7) | (wnr << 6) | fsc;
-+           | (fnv << 10) | (ea << 9) | (cm << 8) | (s1ptw << 7)
-+           | (wnr << 6) | fsc;
- }
- static inline uint32_t syn_data_abort_with_iss(int same_el,
-diff --git a/target/i386/cpu.h b/target/i386/cpu.h
-index XXXXXXX..XXXXXXX 100644
---- a/target/i386/cpu.h
-+++ b/target/i386/cpu.h
-@@ -XXX,XX +XXX,XX @@
- /* The x86 has a strong memory model with some store-after-load re-ordering */
- #define TCG_GUEST_DEFAULT_MO      (TCG_MO_ALL & ~TCG_MO_ST_LD)
-+#define KVM_HAVE_MCE_INJECTION 1
-+
- /* Maximum instruction code size */
- #define TARGET_MAX_INSN_SIZE 16
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.c
 +++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ static uint64_t do_ats_write(CPUARMState *env, uint64_t value,
+@@ -XXX,XX +XXX,XX @@ static void define_pmu_regs(ARMCPU *cpu)
-              * Report exception with ESR indicating a fault due to a
+               .crm = 8 | (3 & (i >> 3)), .opc1 = 0, .opc2 = i & 7,
-              * translation table walk for a cache maintenance instruction.
+               .access = PL0_RW, .type = ARM_CP_IO | ARM_CP_ALIAS,
-              */
+               .readfn = pmevcntr_readfn, .writefn = pmevcntr_writefn,
--            syn = syn_data_abort_no_iss(current_el == target_el,
+-              .accessfn = pmreg_access },
-+            syn = syn_data_abort_no_iss(current_el == target_el, 0,
++              .accessfn = pmreg_access_xevcntr },
-                                         fi.ea, 1, fi.s1ptw, 1, fsc);
+             { .name = pmevcntr_el0_name, .state = ARM_CP_STATE_AA64,
-             env->exception.vaddress = value;
+               .opc0 = 3, .opc1 = 3, .crn = 14, .crm = 8 | (3 & (i >> 3)),
-             env->exception.fsr = fsr;
+-              .opc2 = i & 7, .access = PL0_RW, .accessfn = pmreg_access,
-diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
++              .opc2 = i & 7, .access = PL0_RW, .accessfn = pmreg_access_xevcntr,
-index XXXXXXX..XXXXXXX 100644
+               .type = ARM_CP_IO,
---- a/target/arm/kvm64.c
+               .readfn = pmevcntr_readfn, .writefn = pmevcntr_writefn,
-+++ b/target/arm/kvm64.c
+               .raw_readfn = pmevcntr_rawread,
@@ -XXX,XX +XXX,XX @@
  #include "sysemu/kvm_int.h"
  #include "kvm_arm.h"
  #include "internals.h"
 +#include "hw/acpi/acpi.h"
 +#include "hw/acpi/ghes.h"
 +#include "hw/arm/virt.h"
  static bool have_guest_debug;
@@ -XXX,XX +XXX,XX @@ int kvm_arm_cpreg_level(uint64_t regidx)
      return KVM_PUT_RUNTIME_STATE;
  }
 +/* Callers must hold the iothread mutex lock */
 +static void kvm_inject_arm_sea(CPUState *c)
 +{
 +    ARMCPU *cpu = ARM_CPU(c);
 +    CPUARMState *env = &cpu->env;
 +    CPUClass *cc = CPU_GET_CLASS(c);
 +    uint32_t esr;
 +    bool same_el;
 +
 +    c->exception_index = EXCP_DATA_ABORT;
 +    env->exception.target_el = 1;
 +
 +    /*
 +     * Set the DFSC to synchronous external abort and set FnV to not valid,
 +     * this will tell guest the FAR_ELx is UNKNOWN for this abort.
 +     */
 +    same_el = arm_current_el(env) == env->exception.target_el;
 +    esr = syn_data_abort_no_iss(same_el, 1, 0, 0, 0, 0, 0x10);
 +
 +    env->exception.syndrome = esr;
 +
 +    cc->do_interrupt(c);
 +}
 +
  #define AARCH64_CORE_REG(x)   (KVM_REG_ARM64 | KVM_REG_SIZE_U64 | \
                   KVM_REG_ARM_CORE | KVM_REG_ARM_CORE_REG(x))
@@ -XXX,XX +XXX,XX @@ int kvm_arch_get_registers(CPUState *cs)
      return ret;
  }
 +void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
 +{
 +    ram_addr_t ram_addr;
 +    hwaddr paddr;
 +    Object *obj = qdev_get_machine();
 +    VirtMachineState *vms = VIRT_MACHINE(obj);
 +    bool acpi_enabled = virt_is_acpi_enabled(vms);
 +
 +    assert(code == BUS_MCEERR_AR || code == BUS_MCEERR_AO);
 +
 +    if (acpi_enabled && addr &&
 +            object_property_get_bool(obj, "ras", NULL)) {
 +        ram_addr = qemu_ram_addr_from_host(addr);
 +        if (ram_addr != RAM_ADDR_INVALID &&
 +            kvm_physical_memory_addr_from_host(c->kvm_state, addr, &paddr)) {
 +            kvm_hwpoison_page_add(ram_addr);
 +            /*
 +             * If this is a BUS_MCEERR_AR, we know we have been called
 +             * synchronously from the vCPU thread, so we can easily
 +             * synchronize the state and inject an error.
 +             *
 +             * TODO: we currently don't tell the guest at all about
 +             * BUS_MCEERR_AO. In that case we might either be being
 +             * called synchronously from the vCPU thread, or a bit
 +             * later from the main thread, so doing the injection of
 +             * the error would be more complicated.
 +             */
 +            if (code == BUS_MCEERR_AR) {
 +                kvm_cpu_synchronize_state(c);
 +                if (!acpi_ghes_record_errors(ACPI_HEST_SRC_ID_SEA, paddr)) {
 +                    kvm_inject_arm_sea(c);
 +                } else {
 +                    error_report("failed to record the error");
 +                    abort();
 +                }
 +            }
 +            return;
 +        }
 +        if (code == BUS_MCEERR_AO) {
 +            error_report("Hardware memory error at addr %p for memory used by "
 +                "QEMU itself instead of guest system!", addr);
 +        }
 +    }
 +
 +    if (code == BUS_MCEERR_AR) {
 +        error_report("Hardware memory error!");
 +        exit(1);
 +    }
 +}
 +
  /* C6.6.29 BRK instruction */
  static const uint32_t brk_insn = 0xd4200000;
 diff --git a/target/arm/tlb_helper.c b/target/arm/tlb_helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/tlb_helper.c
 +++ b/target/arm/tlb_helper.c
@@ -XXX,XX +XXX,XX @@ static inline uint32_t merge_syn_data_abort(uint32_t template_syn,
       * ISV field.
       */
      if (!(template_syn & ARM_EL_ISV) || target_el != 2 || s1ptw) {
 -        syn = syn_data_abort_no_iss(same_el,
 +        syn = syn_data_abort_no_iss(same_el, 0,
                                      ea, 0, s1ptw, is_write, fsc);
      } else {
          /*
 --
-.20.1
+.25.1

-[PULL 29/45] target/arm: Convert Neon 3-reg-same VQRDMLAH/VQRDMLSH to decodetree
+Deleted patch
-Convert the Neon VQRDMLAH and VQRDMLSH insns in the 3-reg-same group
-to decodetree.  These don't use do_3same() because they want to
-operate on VFP double registers, whose offsets are different from the
-neon_reg_offset() calculations do_3same does.
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200512163904.10918-2-peter.maydell@linaro.org
----
- target/arm/neon-dp.decode       |  3 +++
- target/arm/translate-neon.inc.c | 15 +++++++++++++++
- target/arm/translate.c          | 14 ++------------
-files changed, 20 insertions(+), 12 deletions(-)
-diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/neon-dp.decode
-+++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ VMLS_3s          1111 001 1 0 . .. .... .... 1001 . . . 0 .... @3same
- VMUL_3s          1111 001 0 0 . .. .... .... 1001 . . . 1 .... @3same
- VMUL_p_3s        1111 001 1 0 . .. .... .... 1001 . . . 1 .... @3same
-+
-+VQRDMLAH_3s      1111 001 1 0 . .. .... .... 1011 ... 1 .... @3same
-+VQRDMLSH_3s      1111 001 1 0 . .. .... .... 1100 ... 1 .... @3same
-diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-neon.inc.c
-+++ b/target/arm/translate-neon.inc.c
-@@ -XXX,XX +XXX,XX @@ static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
-     }
-     return do_3same(s, a, gen_VMUL_p_3s);
- }
-+
-+#define DO_VQRDMLAH(INSN, FUNC)                                         \
-+    static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a)        \
-+    {                                                                   \
-+        if (!dc_isar_feature(aa32_rdm, s)) {                            \
-+            return false;                                               \
-+        }                                                               \
-+        if (a->size != 1 && a->size != 2) {                             \
-+            return false;                                               \
-+        }                                                               \
-+        return do_3same(s, a, FUNC);                                    \
-+    }
-+
-+DO_VQRDMLAH(VQRDMLAH, gen_gvec_sqrdmlah_qc)
-+DO_VQRDMLAH(VQRDMLSH, gen_gvec_sqrdmlsh_qc)
-diff --git a/target/arm/translate.c b/target/arm/translate.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.c
-+++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-             if (!u) {
-                 break;  /* VPADD */
-             }
--            /* VQRDMLAH */
--            if (dc_isar_feature(aa32_rdm, s) && (size == 1 || size == 2)) {
--                gen_gvec_sqrdmlah_qc(size, rd_ofs, rn_ofs, rm_ofs,
--                                     vec_size, vec_size);
--                return 0;
--            }
-+            /* VQRDMLAH : handled by decodetree */
-             return 1;
-         case NEON_3R_VFM_VQRDMLSH:
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-                 }
-                 break;
-             }
--            /* VQRDMLSH */
--            if (dc_isar_feature(aa32_rdm, s) && (size == 1 || size == 2)) {
--                gen_gvec_sqrdmlsh_qc(size, rd_ofs, rn_ofs, rm_ofs,
--                                     vec_size, vec_size);
--                return 0;
--            }
-+            /* VQRDMLSH : handled by decodetree */
-             return 1;
-         case NEON_3R_VABD:
---
-.20.1

-[PULL 30/45] target/arm: Convert Neon 3-reg-same SHA to decodetree
+Deleted patch
-Convert the Neon SHA instructions in the 3-reg-same group
-to decodetree.
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200512163904.10918-3-peter.maydell@linaro.org
----
- target/arm/neon-dp.decode       |  10 +++
- target/arm/translate-neon.inc.c | 139 ++++++++++++++++++++++++++++++++
- target/arm/translate.c          |  46 +----------
-files changed, 151 insertions(+), 44 deletions(-)
-diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/neon-dp.decode
-+++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ VMUL_3s          1111 001 0 0 . .. .... .... 1001 . . . 1 .... @3same
- VMUL_p_3s        1111 001 1 0 . .. .... .... 1001 . . . 1 .... @3same
- VQRDMLAH_3s      1111 001 1 0 . .. .... .... 1011 ... 1 .... @3same
-+
-+SHA1_3s          1111 001 0 0 . optype:2 .... .... 1100 . 1 . 0 .... \
-+                 vm=%vm_dp vn=%vn_dp vd=%vd_dp
-+SHA256H_3s       1111 001 1 0 . 00 .... .... 1100 . 1 . 0 .... \
-+                 vm=%vm_dp vn=%vn_dp vd=%vd_dp
-+SHA256H2_3s      1111 001 1 0 . 01 .... .... 1100 . 1 . 0 .... \
-+                 vm=%vm_dp vn=%vn_dp vd=%vd_dp
-+SHA256SU1_3s     1111 001 1 0 . 10 .... .... 1100 . 1 . 0 .... \
-+                 vm=%vm_dp vn=%vn_dp vd=%vd_dp
-+
- VQRDMLSH_3s      1111 001 1 0 . .. .... .... 1100 ... 1 .... @3same
-diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-neon.inc.c
-+++ b/target/arm/translate-neon.inc.c
-@@ -XXX,XX +XXX,XX @@ static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
- DO_VQRDMLAH(VQRDMLAH, gen_gvec_sqrdmlah_qc)
- DO_VQRDMLAH(VQRDMLSH, gen_gvec_sqrdmlsh_qc)
-+
-+static bool trans_SHA1_3s(DisasContext *s, arg_SHA1_3s *a)
-+{
-+    TCGv_ptr ptr1, ptr2, ptr3;
-+    TCGv_i32 tmp;
-+
-+    if (!arm_dc_feature(s, ARM_FEATURE_NEON) ||
-+        !dc_isar_feature(aa32_sha1, s)) {
-+        return false;
-+    }
-+
-+    /* UNDEF accesses to D16-D31 if they don't exist. */
-+    if (!dc_isar_feature(aa32_simd_r32, s) &&
-+        ((a->vd | a->vn | a->vm) & 0x10)) {
-+        return false;
-+    }
-+
-+    if ((a->vn | a->vm | a->vd) & 1) {
-+        return false;
-+    }
-+
-+    if (!vfp_access_check(s)) {
-+        return true;
-+    }
-+
-+    ptr1 = vfp_reg_ptr(true, a->vd);
-+    ptr2 = vfp_reg_ptr(true, a->vn);
-+    ptr3 = vfp_reg_ptr(true, a->vm);
-+    tmp = tcg_const_i32(a->optype);
-+    gen_helper_crypto_sha1_3reg(ptr1, ptr2, ptr3, tmp);
-+    tcg_temp_free_i32(tmp);
-+    tcg_temp_free_ptr(ptr1);
-+    tcg_temp_free_ptr(ptr2);
-+    tcg_temp_free_ptr(ptr3);
-+
-+    return true;
-+}
-+
-+static bool trans_SHA256H_3s(DisasContext *s, arg_SHA256H_3s *a)
-+{
-+    TCGv_ptr ptr1, ptr2, ptr3;
-+
-+    if (!arm_dc_feature(s, ARM_FEATURE_NEON) ||
-+        !dc_isar_feature(aa32_sha2, s)) {
-+        return false;
-+    }
-+
-+    /* UNDEF accesses to D16-D31 if they don't exist. */
-+    if (!dc_isar_feature(aa32_simd_r32, s) &&
-+        ((a->vd | a->vn | a->vm) & 0x10)) {
-+        return false;
-+    }
-+
-+    if ((a->vn | a->vm | a->vd) & 1) {
-+        return false;
-+    }
-+
-+    if (!vfp_access_check(s)) {
-+        return true;
-+    }
-+
-+    ptr1 = vfp_reg_ptr(true, a->vd);
-+    ptr2 = vfp_reg_ptr(true, a->vn);
-+    ptr3 = vfp_reg_ptr(true, a->vm);
-+    gen_helper_crypto_sha256h(ptr1, ptr2, ptr3);
-+    tcg_temp_free_ptr(ptr1);
-+    tcg_temp_free_ptr(ptr2);
-+    tcg_temp_free_ptr(ptr3);
-+
-+    return true;
-+}
-+
-+static bool trans_SHA256H2_3s(DisasContext *s, arg_SHA256H2_3s *a)
-+{
-+    TCGv_ptr ptr1, ptr2, ptr3;
-+
-+    if (!arm_dc_feature(s, ARM_FEATURE_NEON) ||
-+        !dc_isar_feature(aa32_sha2, s)) {
-+        return false;
-+    }
-+
-+    /* UNDEF accesses to D16-D31 if they don't exist. */
-+    if (!dc_isar_feature(aa32_simd_r32, s) &&
-+        ((a->vd | a->vn | a->vm) & 0x10)) {
-+        return false;
-+    }
-+
-+    if ((a->vn | a->vm | a->vd) & 1) {
-+        return false;
-+    }
-+
-+    if (!vfp_access_check(s)) {
-+        return true;
-+    }
-+
-+    ptr1 = vfp_reg_ptr(true, a->vd);
-+    ptr2 = vfp_reg_ptr(true, a->vn);
-+    ptr3 = vfp_reg_ptr(true, a->vm);
-+    gen_helper_crypto_sha256h2(ptr1, ptr2, ptr3);
-+    tcg_temp_free_ptr(ptr1);
-+    tcg_temp_free_ptr(ptr2);
-+    tcg_temp_free_ptr(ptr3);
-+
-+    return true;
-+}
-+
-+static bool trans_SHA256SU1_3s(DisasContext *s, arg_SHA256SU1_3s *a)
-+{
-+    TCGv_ptr ptr1, ptr2, ptr3;
-+
-+    if (!arm_dc_feature(s, ARM_FEATURE_NEON) ||
-+        !dc_isar_feature(aa32_sha2, s)) {
-+        return false;
-+    }
-+
-+    /* UNDEF accesses to D16-D31 if they don't exist. */
-+    if (!dc_isar_feature(aa32_simd_r32, s) &&
-+        ((a->vd | a->vn | a->vm) & 0x10)) {
-+        return false;
-+    }
-+
-+    if ((a->vn | a->vm | a->vd) & 1) {
-+        return false;
-+    }
-+
-+    if (!vfp_access_check(s)) {
-+        return true;
-+    }
-+
-+    ptr1 = vfp_reg_ptr(true, a->vd);
-+    ptr2 = vfp_reg_ptr(true, a->vn);
-+    ptr3 = vfp_reg_ptr(true, a->vm);
-+    gen_helper_crypto_sha256su1(ptr1, ptr2, ptr3);
-+    tcg_temp_free_ptr(ptr1);
-+    tcg_temp_free_ptr(ptr2);
-+    tcg_temp_free_ptr(ptr3);
-+
-+    return true;
-+}
-diff --git a/target/arm/translate.c b/target/arm/translate.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.c
-+++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-     int vec_size;
-     uint32_t imm;
-     TCGv_i32 tmp, tmp2, tmp3, tmp4, tmp5;
--    TCGv_ptr ptr1, ptr2, ptr3;
-+    TCGv_ptr ptr1, ptr2;
-     TCGv_i64 tmp64;
-     if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-             return 1;
-         }
-         switch (op) {
--        case NEON_3R_SHA:
--            /* The SHA-1/SHA-256 3-register instructions require special
--             * treatment here, as their size field is overloaded as an
--             * op type selector, and they all consume their input in a
--             * single pass.
--             */
--            if (!q) {
--                return 1;
--            }
--            if (!u) { /* SHA-1 */
--                if (!dc_isar_feature(aa32_sha1, s)) {
--                    return 1;
--                }
--                ptr1 = vfp_reg_ptr(true, rd);
--                ptr2 = vfp_reg_ptr(true, rn);
--                ptr3 = vfp_reg_ptr(true, rm);
--                tmp4 = tcg_const_i32(size);
--                gen_helper_crypto_sha1_3reg(ptr1, ptr2, ptr3, tmp4);
--                tcg_temp_free_i32(tmp4);
--            } else { /* SHA-256 */
--                if (!dc_isar_feature(aa32_sha2, s) || size == 3) {
--                    return 1;
--                }
--                ptr1 = vfp_reg_ptr(true, rd);
--                ptr2 = vfp_reg_ptr(true, rn);
--                ptr3 = vfp_reg_ptr(true, rm);
--                switch (size) {
--                case 0:
--                    gen_helper_crypto_sha256h(ptr1, ptr2, ptr3);
--                    break;
--                case 1:
--                    gen_helper_crypto_sha256h2(ptr1, ptr2, ptr3);
--                    break;
--                case 2:
--                    gen_helper_crypto_sha256su1(ptr1, ptr2, ptr3);
--                    break;
--                }
--            }
--            tcg_temp_free_ptr(ptr1);
--            tcg_temp_free_ptr(ptr2);
--            tcg_temp_free_ptr(ptr3);
--            return 0;
--
-         case NEON_3R_VPADD_VQRDMLAH:
-             if (!u) {
-                 break;  /* VPADD */
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-         case NEON_3R_VMUL:
-         case NEON_3R_VML:
-         case NEON_3R_VSHL:
-+        case NEON_3R_SHA:
-             /* Already handled by decodetree */
-             return 1;
-         }
---
-.20.1

-[PULL 31/45] target/arm: Convert Neon 64-bit element 3-reg-same insns
+Deleted patch
-Convert the 64-bit element insns in the 3-reg-same group
-to decodetree. This covers VQSHL, VRSHL and VQRSHL where
-size==0b11.
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200512163904.10918-4-peter.maydell@linaro.org
----
- target/arm/neon-dp.decode       | 13 +++++++++++
- target/arm/translate-neon.inc.c | 24 +++++++++++++++++++++
- target/arm/translate.c          | 38 ++-------------------------------
-files changed, 39 insertions(+), 36 deletions(-)
-diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/neon-dp.decode
-+++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ VCGE_U_3s        1111 001 1 0 . .. .... .... 0011 . . . 1 .... @3same
- VSHL_S_3s        1111 001 0 0 . .. .... .... 0100 . . . 0 .... @3same_rev
- VSHL_U_3s        1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same_rev
-+# Insns operating on 64-bit elements (size!=0b11 handled elsewhere)
-+# The _rev suffix indicates that Vn and Vm are reversed (as explained
-+# by the comment for the @3same_rev format).
-+@3same_64_rev    .... ... . . . 11 .... .... .... . q:1 . . .... \
-+                 &3same vm=%vn_dp vn=%vm_dp vd=%vd_dp size=3
-+
-+VQSHL_S64_3s     1111 001 0 0 . .. .... .... 0100 . . . 1 .... @3same_64_rev
-+VQSHL_U64_3s     1111 001 1 0 . .. .... .... 0100 . . . 1 .... @3same_64_rev
-+VRSHL_S64_3s     1111 001 0 0 . .. .... .... 0101 . . . 0 .... @3same_64_rev
-+VRSHL_U64_3s     1111 001 1 0 . .. .... .... 0101 . . . 0 .... @3same_64_rev
-+VQRSHL_S64_3s    1111 001 0 0 . .. .... .... 0101 . . . 1 .... @3same_64_rev
-+VQRSHL_U64_3s    1111 001 1 0 . .. .... .... 0101 . . . 1 .... @3same_64_rev
-+
- VMAX_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 0 .... @3same
- VMAX_U_3s        1111 001 1 0 . .. .... .... 0110 . . . 0 .... @3same
- VMIN_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 1 .... @3same
-diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-neon.inc.c
-+++ b/target/arm/translate-neon.inc.c
-@@ -XXX,XX +XXX,XX @@ static bool trans_SHA256SU1_3s(DisasContext *s, arg_SHA256SU1_3s *a)
-     return true;
- }
-+
-+#define DO_3SAME_64(INSN, FUNC)                                         \
-+    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
-+                                uint32_t rn_ofs, uint32_t rm_ofs,       \
-+                                uint32_t oprsz, uint32_t maxsz)         \
-+    {                                                                   \
-+        static const GVecGen3 op = { .fni8 = FUNC };                    \
-+        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &op);      \
-+    }                                                                   \
-+    DO_3SAME(INSN, gen_##INSN##_3s)
-+
-+#define DO_3SAME_64_ENV(INSN, FUNC)                                     \
-+    static void gen_##INSN##_elt(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m)    \
-+    {                                                                   \
-+        FUNC(d, cpu_env, n, m);                                         \
-+    }                                                                   \
-+    DO_3SAME_64(INSN, gen_##INSN##_elt)
-+
-+DO_3SAME_64(VRSHL_S64, gen_helper_neon_rshl_s64)
-+DO_3SAME_64(VRSHL_U64, gen_helper_neon_rshl_u64)
-+DO_3SAME_64_ENV(VQSHL_S64, gen_helper_neon_qshl_s64)
-+DO_3SAME_64_ENV(VQSHL_U64, gen_helper_neon_qshl_u64)
-+DO_3SAME_64_ENV(VQRSHL_S64, gen_helper_neon_qrshl_s64)
-+DO_3SAME_64_ENV(VQRSHL_U64, gen_helper_neon_qrshl_u64)
-diff --git a/target/arm/translate.c b/target/arm/translate.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.c
-+++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-         }
-         if (size == 3) {
--            /* 64-bit element instructions. */
--            for (pass = 0; pass < (q ? 2 : 1); pass++) {
--                neon_load_reg64(cpu_V0, rn + pass);
--                neon_load_reg64(cpu_V1, rm + pass);
--                switch (op) {
--                case NEON_3R_VQSHL:
--                    if (u) {
--                        gen_helper_neon_qshl_u64(cpu_V0, cpu_env,
--                                                 cpu_V1, cpu_V0);
--                    } else {
--                        gen_helper_neon_qshl_s64(cpu_V0, cpu_env,
--                                                 cpu_V1, cpu_V0);
--                    }
--                    break;
--                case NEON_3R_VRSHL:
--                    if (u) {
--                        gen_helper_neon_rshl_u64(cpu_V0, cpu_V1, cpu_V0);
--                    } else {
--                        gen_helper_neon_rshl_s64(cpu_V0, cpu_V1, cpu_V0);
--                    }
--                    break;
--                case NEON_3R_VQRSHL:
--                    if (u) {
--                        gen_helper_neon_qrshl_u64(cpu_V0, cpu_env,
--                                                  cpu_V1, cpu_V0);
--                    } else {
--                        gen_helper_neon_qrshl_s64(cpu_V0, cpu_env,
--                                                  cpu_V1, cpu_V0);
--                    }
--                    break;
--                default:
--                    abort();
--                }
--                neon_store_reg64(cpu_V0, rd + pass);
--            }
--            return 0;
-+            /* 64-bit element instructions: handled by decodetree */
-+            return 1;
-         }
-         pairwise = 0;
-         switch (op) {
---
-.20.1

-[PULL 32/45] target/arm: Convert Neon VHADD 3-reg-same insns
+Deleted patch
-Convert the Neon VHADD insns in the 3-reg-same group to decodetree.
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200512163904.10918-5-peter.maydell@linaro.org
----
- target/arm/neon-dp.decode       |  2 ++
- target/arm/translate-neon.inc.c | 24 ++++++++++++++++++++++++
- target/arm/translate.c          |  4 +---
-files changed, 27 insertions(+), 3 deletions(-)
-diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/neon-dp.decode
-+++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@
- @3same           .... ... . . . size:2 .... .... .... . q:1 . . .... \
-                  &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp
-+VHADD_S_3s       1111 001 0 0 . .. .... .... 0000 . . . 0 .... @3same
-+VHADD_U_3s       1111 001 1 0 . .. .... .... 0000 . . . 0 .... @3same
- VQADD_S_3s       1111 001 0 0 . .. .... .... 0000 . . . 1 .... @3same
- VQADD_U_3s       1111 001 1 0 . .. .... .... 0000 . . . 1 .... @3same
-diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-neon.inc.c
-+++ b/target/arm/translate-neon.inc.c
-@@ -XXX,XX +XXX,XX @@ DO_3SAME_64_ENV(VQSHL_S64, gen_helper_neon_qshl_s64)
- DO_3SAME_64_ENV(VQSHL_U64, gen_helper_neon_qshl_u64)
- DO_3SAME_64_ENV(VQRSHL_S64, gen_helper_neon_qrshl_s64)
- DO_3SAME_64_ENV(VQRSHL_U64, gen_helper_neon_qrshl_u64)
-+
-+#define DO_3SAME_32(INSN, FUNC)                                         \
-+    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
-+                                uint32_t rn_ofs, uint32_t rm_ofs,       \
-+                                uint32_t oprsz, uint32_t maxsz)         \
-+    {                                                                   \
-+        static const GVecGen3 ops[4] = {                                \
-+            { .fni4 = gen_helper_neon_##FUNC##8 },                      \
-+            { .fni4 = gen_helper_neon_##FUNC##16 },                     \
-+            { .fni4 = gen_helper_neon_##FUNC##32 },                     \
-+            { 0 },                                                      \
-+        };                                                              \
-+        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &ops[vece]); \
-+    }                                                                   \
-+    static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a)        \
-+    {                                                                   \
-+        if (a->size > 2) {                                              \
-+            return false;                                               \
-+        }                                                               \
-+        return do_3same(s, a, gen_##INSN##_3s);                         \
-+    }
-+
-+DO_3SAME_32(VHADD_S, hadd_s)
-+DO_3SAME_32(VHADD_U, hadd_u)
-diff --git a/target/arm/translate.c b/target/arm/translate.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.c
-+++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-         case NEON_3R_VML:
-         case NEON_3R_VSHL:
-         case NEON_3R_SHA:
-+        case NEON_3R_VHADD:
-             /* Already handled by decodetree */
-             return 1;
-         }
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-             tmp2 = neon_load_reg(rm, pass);
-         }
-         switch (op) {
--        case NEON_3R_VHADD:
--            GEN_NEON_INTEGER_OP(hadd);
--            break;
-         case NEON_3R_VRHADD:
-             GEN_NEON_INTEGER_OP(rhadd);
-             break;
---
-.20.1

-[PULL 33/45] target/arm: Convert Neon VABA/VABD 3-reg-same to decodetree
+Deleted patch
-Convert the Neon VABA and VABD insns in the 3-reg-same group to
-decodetree.
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200512163904.10918-6-peter.maydell@linaro.org
----
- target/arm/neon-dp.decode       |  6 ++++++
- target/arm/translate-neon.inc.c |  4 ++++
- target/arm/translate.c          | 22 ++--------------------
-files changed, 12 insertions(+), 20 deletions(-)
-diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/neon-dp.decode
-+++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ VMAX_U_3s        1111 001 1 0 . .. .... .... 0110 . . . 0 .... @3same
- VMIN_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 1 .... @3same
- VMIN_U_3s        1111 001 1 0 . .. .... .... 0110 . . . 1 .... @3same
-+VABD_S_3s        1111 001 0 0 . .. .... .... 0111 . . . 0 .... @3same
-+VABD_U_3s        1111 001 1 0 . .. .... .... 0111 . . . 0 .... @3same
-+
-+VABA_S_3s        1111 001 0 0 . .. .... .... 0111 . . . 1 .... @3same
-+VABA_U_3s        1111 001 1 0 . .. .... .... 0111 . . . 1 .... @3same
-+
- VADD_3s          1111 001 0 0 . .. .... .... 1000 . . . 0 .... @3same
- VSUB_3s          1111 001 1 0 . .. .... .... 1000 . . . 0 .... @3same
-diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-neon.inc.c
-+++ b/target/arm/translate-neon.inc.c
-@@ -XXX,XX +XXX,XX @@ DO_3SAME_NO_SZ_3(VMUL, tcg_gen_gvec_mul)
- DO_3SAME_NO_SZ_3(VMLA, gen_gvec_mla)
- DO_3SAME_NO_SZ_3(VMLS, gen_gvec_mls)
- DO_3SAME_NO_SZ_3(VTST, gen_gvec_cmtst)
-+DO_3SAME_NO_SZ_3(VABD_S, gen_gvec_sabd)
-+DO_3SAME_NO_SZ_3(VABA_S, gen_gvec_saba)
-+DO_3SAME_NO_SZ_3(VABD_U, gen_gvec_uabd)
-+DO_3SAME_NO_SZ_3(VABA_U, gen_gvec_uaba)
- #define DO_3SAME_CMP(INSN, COND)                                        \
-     static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
-diff --git a/target/arm/translate.c b/target/arm/translate.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.c
-+++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-             /* VQRDMLSH : handled by decodetree */
-             return 1;
--        case NEON_3R_VABD:
--            if (u) {
--                gen_gvec_uabd(size, rd_ofs, rn_ofs, rm_ofs,
--                              vec_size, vec_size);
--            } else {
--                gen_gvec_sabd(size, rd_ofs, rn_ofs, rm_ofs,
--                              vec_size, vec_size);
--            }
--            return 0;
--
--        case NEON_3R_VABA:
--            if (u) {
--                gen_gvec_uaba(size, rd_ofs, rn_ofs, rm_ofs,
--                              vec_size, vec_size);
--            } else {
--                gen_gvec_saba(size, rd_ofs, rn_ofs, rm_ofs,
--                              vec_size, vec_size);
--            }
--            return 0;
--
-         case NEON_3R_VADD_VSUB:
-         case NEON_3R_LOGIC:
-         case NEON_3R_VMAX:
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-         case NEON_3R_VSHL:
-         case NEON_3R_SHA:
-         case NEON_3R_VHADD:
-+        case NEON_3R_VABD:
-+        case NEON_3R_VABA:
-             /* Already handled by decodetree */
-             return 1;
-         }
---
-.20.1

-[PULL 34/45] target/arm: Convert Neon VRHADD, VHSUB 3-reg-same insns to decodetree
+Deleted patch
-Convert the Neon VRHADD and VHSUB 3-reg-same insns to decodetree.
-(These are all the other insns in 3-reg-same which were using
-GEN_NEON_INTEGER_OP() and which are not pairwise or
-reversed-operands.)
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200512163904.10918-7-peter.maydell@linaro.org
----
- target/arm/neon-dp.decode       | 6 ++++++
- target/arm/translate-neon.inc.c | 4 ++++
- target/arm/translate.c          | 8 ++------
-files changed, 12 insertions(+), 6 deletions(-)
-diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/neon-dp.decode
-+++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ VHADD_U_3s       1111 001 1 0 . .. .... .... 0000 . . . 0 .... @3same
- VQADD_S_3s       1111 001 0 0 . .. .... .... 0000 . . . 1 .... @3same
- VQADD_U_3s       1111 001 1 0 . .. .... .... 0000 . . . 1 .... @3same
-+VRHADD_S_3s      1111 001 0 0 . .. .... .... 0001 . . . 0 .... @3same
-+VRHADD_U_3s      1111 001 1 0 . .. .... .... 0001 . . . 0 .... @3same
-+
- @3same_logic     .... ... . . . .. .... .... .... . q:1 .. .... \
-                  &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp size=0
-@@ -XXX,XX +XXX,XX @@ VBSL_3s          1111 001 1 0 . 01 .... .... 0001 ... 1 .... @3same_logic
- VBIT_3s          1111 001 1 0 . 10 .... .... 0001 ... 1 .... @3same_logic
- VBIF_3s          1111 001 1 0 . 11 .... .... 0001 ... 1 .... @3same_logic
-+VHSUB_S_3s       1111 001 0 0 . .. .... .... 0010 . . . 0 .... @3same
-+VHSUB_U_3s       1111 001 1 0 . .. .... .... 0010 . . . 0 .... @3same
-+
- VQSUB_S_3s       1111 001 0 0 . .. .... .... 0010 . . . 1 .... @3same
- VQSUB_U_3s       1111 001 1 0 . .. .... .... 0010 . . . 1 .... @3same
-diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-neon.inc.c
-+++ b/target/arm/translate-neon.inc.c
-@@ -XXX,XX +XXX,XX @@ DO_3SAME_64_ENV(VQRSHL_U64, gen_helper_neon_qrshl_u64)
- DO_3SAME_32(VHADD_S, hadd_s)
- DO_3SAME_32(VHADD_U, hadd_u)
-+DO_3SAME_32(VHSUB_S, hsub_s)
-+DO_3SAME_32(VHSUB_U, hsub_u)
-+DO_3SAME_32(VRHADD_S, rhadd_s)
-+DO_3SAME_32(VRHADD_U, rhadd_u)
-diff --git a/target/arm/translate.c b/target/arm/translate.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.c
-+++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-         case NEON_3R_VSHL:
-         case NEON_3R_SHA:
-         case NEON_3R_VHADD:
-+        case NEON_3R_VRHADD:
-+        case NEON_3R_VHSUB:
-         case NEON_3R_VABD:
-         case NEON_3R_VABA:
-             /* Already handled by decodetree */
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-             tmp2 = neon_load_reg(rm, pass);
-         }
-         switch (op) {
--        case NEON_3R_VRHADD:
--            GEN_NEON_INTEGER_OP(rhadd);
--            break;
--        case NEON_3R_VHSUB:
--            GEN_NEON_INTEGER_OP(hsub);
--            break;
-         case NEON_3R_VQSHL:
-             GEN_NEON_INTEGER_OP_ENV(qshl);
-             break;
---
-.20.1

-[PULL 35/45] target/arm: Convert Neon VQSHL, VRSHL, VQRSHL 3-reg-same insns to decodetree
+Deleted patch
-Convert the VQSHL, VRSHL and VQRSHL insns in the 3-reg-same
-group to decodetree. We have already implemented the size==0b11
-case of these insns; this commit handles the remaining sizes.
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200512163904.10918-8-peter.maydell@linaro.org
----
- target/arm/neon-dp.decode       | 30 ++++++++++++++++++-----
- target/arm/translate-neon.inc.c | 43 +++++++++++++++++++++++++++++++++
- target/arm/translate.c          | 22 +++--------------
-files changed, 70 insertions(+), 25 deletions(-)
-diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/neon-dp.decode
-+++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ VSHL_U_3s        1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same_rev
- @3same_64_rev    .... ... . . . 11 .... .... .... . q:1 . . .... \
-                  &3same vm=%vn_dp vn=%vm_dp vd=%vd_dp size=3
--VQSHL_S64_3s     1111 001 0 0 . .. .... .... 0100 . . . 1 .... @3same_64_rev
--VQSHL_U64_3s     1111 001 1 0 . .. .... .... 0100 . . . 1 .... @3same_64_rev
--VRSHL_S64_3s     1111 001 0 0 . .. .... .... 0101 . . . 0 .... @3same_64_rev
--VRSHL_U64_3s     1111 001 1 0 . .. .... .... 0101 . . . 0 .... @3same_64_rev
--VQRSHL_S64_3s    1111 001 0 0 . .. .... .... 0101 . . . 1 .... @3same_64_rev
--VQRSHL_U64_3s    1111 001 1 0 . .. .... .... 0101 . . . 1 .... @3same_64_rev
-+{
-+  VQSHL_S64_3s   1111 001 0 0 . .. .... .... 0100 . . . 1 .... @3same_64_rev
-+  VQSHL_S_3s     1111 001 0 0 . .. .... .... 0100 . . . 1 .... @3same_rev
-+}
-+{
-+  VQSHL_U64_3s   1111 001 1 0 . .. .... .... 0100 . . . 1 .... @3same_64_rev
-+  VQSHL_U_3s     1111 001 1 0 . .. .... .... 0100 . . . 1 .... @3same_rev
-+}
-+{
-+  VRSHL_S64_3s   1111 001 0 0 . .. .... .... 0101 . . . 0 .... @3same_64_rev
-+  VRSHL_S_3s     1111 001 0 0 . .. .... .... 0101 . . . 0 .... @3same_rev
-+}
-+{
-+  VRSHL_U64_3s   1111 001 1 0 . .. .... .... 0101 . . . 0 .... @3same_64_rev
-+  VRSHL_U_3s     1111 001 1 0 . .. .... .... 0101 . . . 0 .... @3same_rev
-+}
-+{
-+  VQRSHL_S64_3s  1111 001 0 0 . .. .... .... 0101 . . . 1 .... @3same_64_rev
-+  VQRSHL_S_3s    1111 001 0 0 . .. .... .... 0101 . . . 1 .... @3same_rev
-+}
-+{
-+  VQRSHL_U64_3s  1111 001 1 0 . .. .... .... 0101 . . . 1 .... @3same_64_rev
-+  VQRSHL_U_3s    1111 001 1 0 . .. .... .... 0101 . . . 1 .... @3same_rev
-+}
- VMAX_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 0 .... @3same
- VMAX_U_3s        1111 001 1 0 . .. .... .... 0110 . . . 0 .... @3same
-diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-neon.inc.c
-+++ b/target/arm/translate-neon.inc.c
-@@ -XXX,XX +XXX,XX @@ DO_3SAME_64_ENV(VQRSHL_U64, gen_helper_neon_qrshl_u64)
-         return do_3same(s, a, gen_##INSN##_3s);                         \
-     }
-+/*
-+ * Some helper functions need to be passed the cpu_env. In order
-+ * to use those with the gvec APIs like tcg_gen_gvec_3() we need
-+ * to create wrapper functions whose prototype is a NeonGenTwoOpFn()
-+ * and which call a NeonGenTwoOpEnvFn().
-+ */
-+#define WRAP_ENV_FN(WRAPNAME, FUNC)                                     \
-+    static void WRAPNAME(TCGv_i32 d, TCGv_i32 n, TCGv_i32 m)            \
-+    {                                                                   \
-+        FUNC(d, cpu_env, n, m);                                         \
-+    }
-+
-+#define DO_3SAME_32_ENV(INSN, FUNC)                                     \
-+    WRAP_ENV_FN(gen_##INSN##_tramp8, gen_helper_neon_##FUNC##8);        \
-+    WRAP_ENV_FN(gen_##INSN##_tramp16, gen_helper_neon_##FUNC##16);      \
-+    WRAP_ENV_FN(gen_##INSN##_tramp32, gen_helper_neon_##FUNC##32);      \
-+    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
-+                                uint32_t rn_ofs, uint32_t rm_ofs,       \
-+                                uint32_t oprsz, uint32_t maxsz)         \
-+    {                                                                   \
-+        static const GVecGen3 ops[4] = {                                \
-+            { .fni4 = gen_##INSN##_tramp8 },                            \
-+            { .fni4 = gen_##INSN##_tramp16 },                           \
-+            { .fni4 = gen_##INSN##_tramp32 },                           \
-+            { 0 },                                                      \
-+        };                                                              \
-+        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &ops[vece]); \
-+    }                                                                   \
-+    static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a)        \
-+    {                                                                   \
-+        if (a->size > 2) {                                              \
-+            return false;                                               \
-+        }                                                               \
-+        return do_3same(s, a, gen_##INSN##_3s);                         \
-+    }
-+
- DO_3SAME_32(VHADD_S, hadd_s)
- DO_3SAME_32(VHADD_U, hadd_u)
- DO_3SAME_32(VHSUB_S, hsub_s)
- DO_3SAME_32(VHSUB_U, hsub_u)
- DO_3SAME_32(VRHADD_S, rhadd_s)
- DO_3SAME_32(VRHADD_U, rhadd_u)
-+DO_3SAME_32(VRSHL_S, rshl_s)
-+DO_3SAME_32(VRSHL_U, rshl_u)
-+
-+DO_3SAME_32_ENV(VQSHL_S, qshl_s)
-+DO_3SAME_32_ENV(VQSHL_U, qshl_u)
-+DO_3SAME_32_ENV(VQRSHL_S, qrshl_s)
-+DO_3SAME_32_ENV(VQRSHL_U, qrshl_u)
-diff --git a/target/arm/translate.c b/target/arm/translate.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.c
-+++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-         case NEON_3R_VHSUB:
-         case NEON_3R_VABD:
-         case NEON_3R_VABA:
-+        case NEON_3R_VQSHL:
-+        case NEON_3R_VRSHL:
-+        case NEON_3R_VQRSHL:
-             /* Already handled by decodetree */
-             return 1;
-         }
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-         }
-         pairwise = 0;
-         switch (op) {
--        case NEON_3R_VQSHL:
--        case NEON_3R_VRSHL:
--        case NEON_3R_VQRSHL:
--            {
--                int rtmp;
--                /* Shift instruction operands are reversed.  */
--                rtmp = rn;
--                rn = rm;
--                rm = rtmp;
--            }
--            break;
-         case NEON_3R_VPADD_VQRDMLAH:
-         case NEON_3R_VPMAX:
-         case NEON_3R_VPMIN:
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-             tmp2 = neon_load_reg(rm, pass);
-         }
-         switch (op) {
--        case NEON_3R_VQSHL:
--            GEN_NEON_INTEGER_OP_ENV(qshl);
--            break;
--        case NEON_3R_VRSHL:
--            GEN_NEON_INTEGER_OP(rshl);
--            break;
--        case NEON_3R_VQRSHL:
--            GEN_NEON_INTEGER_OP_ENV(qrshl);
-             break;
-         case NEON_3R_VPMAX:
-             GEN_NEON_INTEGER_OP(pmax);
---
-.20.1

-[PULL 36/45] target/arm: Convert Neon VPMAX/VPMIN 3-reg-same insns to decodetree
+Deleted patch
-Convert the Neon integer VPMAX and VPMIN 3-reg-same insns to
-decodetree. These are 'pairwise' operations.
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200512163904.10918-9-peter.maydell@linaro.org
----
- target/arm/neon-dp.decode       |  9 +++++
- target/arm/translate-neon.inc.c | 71 +++++++++++++++++++++++++++++++++
- target/arm/translate.c          | 17 +-------
-files changed, 82 insertions(+), 15 deletions(-)
-diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/neon-dp.decode
-+++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@
- @3same           .... ... . . . size:2 .... .... .... . q:1 . . .... \
-                  &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp
-+@3same_q0        .... ... . . . size:2 .... .... .... . 0 . . .... \
-+                 &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp q=0
-+
- VHADD_S_3s       1111 001 0 0 . .. .... .... 0000 . . . 0 .... @3same
- VHADD_U_3s       1111 001 1 0 . .. .... .... 0000 . . . 0 .... @3same
- VQADD_S_3s       1111 001 0 0 . .. .... .... 0000 . . . 1 .... @3same
-@@ -XXX,XX +XXX,XX @@ VMLS_3s          1111 001 1 0 . .. .... .... 1001 . . . 0 .... @3same
- VMUL_3s          1111 001 0 0 . .. .... .... 1001 . . . 1 .... @3same
- VMUL_p_3s        1111 001 1 0 . .. .... .... 1001 . . . 1 .... @3same
-+VPMAX_S_3s       1111 001 0 0 . .. .... .... 1010 . . . 0 .... @3same_q0
-+VPMAX_U_3s       1111 001 1 0 . .. .... .... 1010 . . . 0 .... @3same_q0
-+
-+VPMIN_S_3s       1111 001 0 0 . .. .... .... 1010 . . . 1 .... @3same_q0
-+VPMIN_U_3s       1111 001 1 0 . .. .... .... 1010 . . . 1 .... @3same_q0
-+
- VQRDMLAH_3s      1111 001 1 0 . .. .... .... 1011 ... 1 .... @3same
- SHA1_3s          1111 001 0 0 . optype:2 .... .... 1100 . 1 . 0 .... \
-diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-neon.inc.c
-+++ b/target/arm/translate-neon.inc.c
-@@ -XXX,XX +XXX,XX @@ DO_3SAME_32_ENV(VQSHL_S, qshl_s)
- DO_3SAME_32_ENV(VQSHL_U, qshl_u)
- DO_3SAME_32_ENV(VQRSHL_S, qrshl_s)
- DO_3SAME_32_ENV(VQRSHL_U, qrshl_u)
-+
-+static bool do_3same_pair(DisasContext *s, arg_3same *a, NeonGenTwoOpFn *fn)
-+{
-+    /* Operations handled pairwise 32 bits at a time */
-+    TCGv_i32 tmp, tmp2, tmp3;
-+
-+    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
-+        return false;
-+    }
-+
-+    /* UNDEF accesses to D16-D31 if they don't exist. */
-+    if (!dc_isar_feature(aa32_simd_r32, s) &&
-+        ((a->vd | a->vn | a->vm) & 0x10)) {
-+        return false;
-+    }
-+
-+    if (a->size == 3) {
-+        return false;
-+    }
-+
-+    if (!vfp_access_check(s)) {
-+        return true;
-+    }
-+
-+    assert(a->q == 0); /* enforced by decode patterns */
-+
-+    /*
-+     * Note that we have to be careful not to clobber the source operands
-+     * in the "vm == vd" case by storing the result of the first pass too
-+     * early. Since Q is 0 there are always just two passes, so instead
-+     * of a complicated loop over each pass we just unroll.
-+     */
-+    tmp = neon_load_reg(a->vn, 0);
-+    tmp2 = neon_load_reg(a->vn, 1);
-+    fn(tmp, tmp, tmp2);
-+    tcg_temp_free_i32(tmp2);
-+
-+    tmp3 = neon_load_reg(a->vm, 0);
-+    tmp2 = neon_load_reg(a->vm, 1);
-+    fn(tmp3, tmp3, tmp2);
-+    tcg_temp_free_i32(tmp2);
-+
-+    neon_store_reg(a->vd, 0, tmp);
-+    neon_store_reg(a->vd, 1, tmp3);
-+    return true;
-+}
-+
-+#define DO_3SAME_PAIR(INSN, func)                                       \
-+    static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a)        \
-+    {                                                                   \
-+        static NeonGenTwoOpFn * const fns[] = {                         \
-+            gen_helper_neon_##func##8,                                  \
-+            gen_helper_neon_##func##16,                                 \
-+            gen_helper_neon_##func##32,                                 \
-+        };                                                              \
-+        if (a->size > 2) {                                              \
-+            return false;                                               \
-+        }                                                               \
-+        return do_3same_pair(s, a, fns[a->size]);                       \
-+    }
-+
-+/* 32-bit pairwise ops end up the same as the elementwise versions.  */
-+#define gen_helper_neon_pmax_s32  tcg_gen_smax_i32
-+#define gen_helper_neon_pmax_u32  tcg_gen_umax_i32
-+#define gen_helper_neon_pmin_s32  tcg_gen_smin_i32
-+#define gen_helper_neon_pmin_u32  tcg_gen_umin_i32
-+
-+DO_3SAME_PAIR(VPMAX_S, pmax_s)
-+DO_3SAME_PAIR(VPMIN_S, pmin_s)
-+DO_3SAME_PAIR(VPMAX_U, pmax_u)
-+DO_3SAME_PAIR(VPMIN_U, pmin_u)
-diff --git a/target/arm/translate.c b/target/arm/translate.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.c
-+++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static inline void gen_neon_rsb(int size, TCGv_i32 t0, TCGv_i32 t1)
-     }
- }
--/* 32-bit pairwise ops end up the same as the elementwise versions.  */
--#define gen_helper_neon_pmax_s32  tcg_gen_smax_i32
--#define gen_helper_neon_pmax_u32  tcg_gen_umax_i32
--#define gen_helper_neon_pmin_s32  tcg_gen_smin_i32
--#define gen_helper_neon_pmin_u32  tcg_gen_umin_i32
--
- #define GEN_NEON_INTEGER_OP_ENV(name) do { \
-     switch ((size << 1) | u) { \
-     case 0: \
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-         case NEON_3R_VQSHL:
-         case NEON_3R_VRSHL:
-         case NEON_3R_VQRSHL:
-+        case NEON_3R_VPMAX:
-+        case NEON_3R_VPMIN:
-             /* Already handled by decodetree */
-             return 1;
-         }
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-         pairwise = 0;
-         switch (op) {
-         case NEON_3R_VPADD_VQRDMLAH:
--        case NEON_3R_VPMAX:
--        case NEON_3R_VPMIN:
-             pairwise = 1;
-             break;
-         case NEON_3R_FLOAT_ARITH:
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-             tmp2 = neon_load_reg(rm, pass);
-         }
-         switch (op) {
--            break;
--        case NEON_3R_VPMAX:
--            GEN_NEON_INTEGER_OP(pmax);
--            break;
--        case NEON_3R_VPMIN:
--            GEN_NEON_INTEGER_OP(pmin);
--            break;
-         case NEON_3R_VQDMULH_VQRDMULH: /* Multiply high.  */
-             if (!u) { /* VQDMULH */
-                 switch (size) {
---
-.20.1

-[PULL 37/45] target/arm: Convert Neon VPADD 3-reg-same insns to decodetree
+Deleted patch
-Convert the Neon integer VPADD 3-reg-same insns to decodetree.  These
-are 'pairwise' operations.  (Note that VQRDMLAH, which shares the
-same primary opcode but has U=1, has already been converted.)
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200512163904.10918-10-peter.maydell@linaro.org
----
- target/arm/neon-dp.decode       |  2 ++
- target/arm/translate-neon.inc.c |  2 ++
- target/arm/translate.c          | 19 +------------------
-files changed, 5 insertions(+), 18 deletions(-)
-diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/neon-dp.decode
-+++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ VPMAX_U_3s       1111 001 1 0 . .. .... .... 1010 . . . 0 .... @3same_q0
- VPMIN_S_3s       1111 001 0 0 . .. .... .... 1010 . . . 1 .... @3same_q0
- VPMIN_U_3s       1111 001 1 0 . .. .... .... 1010 . . . 1 .... @3same_q0
-+VPADD_3s         1111 001 0 0 . .. .... .... 1011 . . . 1 .... @3same_q0
-+
- VQRDMLAH_3s      1111 001 1 0 . .. .... .... 1011 ... 1 .... @3same
- SHA1_3s          1111 001 0 0 . optype:2 .... .... 1100 . 1 . 0 .... \
-diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-neon.inc.c
-+++ b/target/arm/translate-neon.inc.c
-@@ -XXX,XX +XXX,XX @@ static bool do_3same_pair(DisasContext *s, arg_3same *a, NeonGenTwoOpFn *fn)
- #define gen_helper_neon_pmax_u32  tcg_gen_umax_i32
- #define gen_helper_neon_pmin_s32  tcg_gen_smin_i32
- #define gen_helper_neon_pmin_u32  tcg_gen_umin_i32
-+#define gen_helper_neon_padd_u32  tcg_gen_add_i32
- DO_3SAME_PAIR(VPMAX_S, pmax_s)
- DO_3SAME_PAIR(VPMIN_S, pmin_s)
- DO_3SAME_PAIR(VPMAX_U, pmax_u)
- DO_3SAME_PAIR(VPMIN_U, pmin_u)
-+DO_3SAME_PAIR(VPADD, padd_u)
-diff --git a/target/arm/translate.c b/target/arm/translate.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.c
-+++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-             return 1;
-         }
-         switch (op) {
--        case NEON_3R_VPADD_VQRDMLAH:
--            if (!u) {
--                break;  /* VPADD */
--            }
--            /* VQRDMLAH : handled by decodetree */
--            return 1;
--
-         case NEON_3R_VFM_VQRDMLSH:
-             if (!u) {
-                 /* VFM, VFMS */
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-         case NEON_3R_VQRSHL:
-         case NEON_3R_VPMAX:
-         case NEON_3R_VPMIN:
-+        case NEON_3R_VPADD_VQRDMLAH:
-             /* Already handled by decodetree */
-             return 1;
-         }
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-         }
-         pairwise = 0;
-         switch (op) {
--        case NEON_3R_VPADD_VQRDMLAH:
--            pairwise = 1;
--            break;
-         case NEON_3R_FLOAT_ARITH:
-             pairwise = (u && size < 2); /* if VPADD (float) */
-             break;
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-                 }
-             }
-             break;
--        case NEON_3R_VPADD_VQRDMLAH:
--            switch (size) {
--            case 0: gen_helper_neon_padd_u8(tmp, tmp, tmp2); break;
--            case 1: gen_helper_neon_padd_u16(tmp, tmp, tmp2); break;
--            case 2: tcg_gen_add_i32(tmp, tmp, tmp2); break;
--            default: abort();
--            }
--            break;
-         case NEON_3R_FLOAT_ARITH: /* Floating point arithmetic. */
-         {
-             TCGv_ptr fpstatus = get_fpstatus_ptr(1);
---
-.20.1

-[PULL 38/45] target/arm: Convert Neon VQDMULH/VQRDMULH 3-reg-same to decodetree
+Deleted patch
-Convert the Neon VQDMULH and VQRDMULH 3-reg-same insns to
-decodetree. These are the last integer operations in the
--reg-same group.
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200512163904.10918-11-peter.maydell@linaro.org
----
- target/arm/neon-dp.decode       |  3 +++
- target/arm/translate-neon.inc.c | 24 ++++++++++++++++++++++++
- target/arm/translate.c          | 24 +-----------------------
-files changed, 28 insertions(+), 23 deletions(-)
-diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/neon-dp.decode
-+++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ VPMAX_U_3s       1111 001 1 0 . .. .... .... 1010 . . . 0 .... @3same_q0
- VPMIN_S_3s       1111 001 0 0 . .. .... .... 1010 . . . 1 .... @3same_q0
- VPMIN_U_3s       1111 001 1 0 . .. .... .... 1010 . . . 1 .... @3same_q0
-+VQDMULH_3s       1111 001 0 0 . .. .... .... 1011 . . . 0 .... @3same
-+VQRDMULH_3s      1111 001 1 0 . .. .... .... 1011 . . . 0 .... @3same
-+
- VPADD_3s         1111 001 0 0 . .. .... .... 1011 . . . 1 .... @3same_q0
- VQRDMLAH_3s      1111 001 1 0 . .. .... .... 1011 ... 1 .... @3same
-diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-neon.inc.c
-+++ b/target/arm/translate-neon.inc.c
-@@ -XXX,XX +XXX,XX @@ DO_3SAME_PAIR(VPMIN_S, pmin_s)
- DO_3SAME_PAIR(VPMAX_U, pmax_u)
- DO_3SAME_PAIR(VPMIN_U, pmin_u)
- DO_3SAME_PAIR(VPADD, padd_u)
-+
-+#define DO_3SAME_VQDMULH(INSN, FUNC)                                    \
-+    WRAP_ENV_FN(gen_##INSN##_tramp16, gen_helper_neon_##FUNC##_s16);    \
-+    WRAP_ENV_FN(gen_##INSN##_tramp32, gen_helper_neon_##FUNC##_s32);    \
-+    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
-+                                uint32_t rn_ofs, uint32_t rm_ofs,       \
-+                                uint32_t oprsz, uint32_t maxsz)         \
-+    {                                                                   \
-+        static const GVecGen3 ops[2] = {                                \
-+            { .fni4 = gen_##INSN##_tramp16 },                           \
-+            { .fni4 = gen_##INSN##_tramp32 },                           \
-+        };                                                              \
-+        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &ops[vece - 1]); \
-+    }                                                                   \
-+    static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a)        \
-+    {                                                                   \
-+        if (a->size != 1 && a->size != 2) {                             \
-+            return false;                                               \
-+        }                                                               \
-+        return do_3same(s, a, gen_##INSN##_3s);                         \
-+    }
-+
-+DO_3SAME_VQDMULH(VQDMULH, qdmulh)
-+DO_3SAME_VQDMULH(VQRDMULH, qrdmulh)
-diff --git a/target/arm/translate.c b/target/arm/translate.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.c
-+++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-         case NEON_3R_VPMAX:
-         case NEON_3R_VPMIN:
-         case NEON_3R_VPADD_VQRDMLAH:
-+        case NEON_3R_VQDMULH_VQRDMULH:
-             /* Already handled by decodetree */
-             return 1;
-         }
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-             tmp2 = neon_load_reg(rm, pass);
-         }
-         switch (op) {
--        case NEON_3R_VQDMULH_VQRDMULH: /* Multiply high.  */
--            if (!u) { /* VQDMULH */
--                switch (size) {
--                case 1:
--                    gen_helper_neon_qdmulh_s16(tmp, cpu_env, tmp, tmp2);
--                    break;
--                case 2:
--                    gen_helper_neon_qdmulh_s32(tmp, cpu_env, tmp, tmp2);
--                    break;
--                default: abort();
--                }
--            } else { /* VQRDMULH */
--                switch (size) {
--                case 1:
--                    gen_helper_neon_qrdmulh_s16(tmp, cpu_env, tmp, tmp2);
--                    break;
--                case 2:
--                    gen_helper_neon_qrdmulh_s32(tmp, cpu_env, tmp, tmp2);
--                    break;
--                default: abort();
--                }
--            }
--            break;
-         case NEON_3R_FLOAT_ARITH: /* Floating point arithmetic. */
-         {
-             TCGv_ptr fpstatus = get_fpstatus_ptr(1);
---
-.20.1

-[PULL 39/45] target/arm: Convert Neon VADD, VSUB, VABD 3-reg-same insns to decodetree
+Deleted patch
-Convert the Neon VADD, VSUB, VABD 3-reg-same insns to decodetree.
-We already have gvec helpers for addition and subtraction, but must
-add one for fabd.
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200512163904.10918-12-peter.maydell@linaro.org
----
- target/arm/helper.h             |  3 ++-
- target/arm/neon-dp.decode       |  8 ++++++++
- target/arm/neon_helper.c        |  7 -------
- target/arm/translate-neon.inc.c | 28 ++++++++++++++++++++++++++++
- target/arm/translate.c          | 10 +++-------
- target/arm/vec_helper.c         |  7 +++++++
-files changed, 48 insertions(+), 15 deletions(-)
-diff --git a/target/arm/helper.h b/target/arm/helper.h
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.h
-+++ b/target/arm/helper.h
-@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_2(neon_qneg_s16, TCG_CALL_NO_RWG, i32, env, i32)
- DEF_HELPER_FLAGS_2(neon_qneg_s32, TCG_CALL_NO_RWG, i32, env, i32)
- DEF_HELPER_FLAGS_2(neon_qneg_s64, TCG_CALL_NO_RWG, i64, env, i64)
--DEF_HELPER_3(neon_abd_f32, i32, i32, i32, ptr)
- DEF_HELPER_3(neon_ceq_f32, i32, i32, i32, ptr)
- DEF_HELPER_3(neon_cge_f32, i32, i32, i32, ptr)
- DEF_HELPER_3(neon_cgt_f32, i32, i32, i32, ptr)
-@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(gvec_fmul_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
- DEF_HELPER_FLAGS_5(gvec_fmul_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
- DEF_HELPER_FLAGS_5(gvec_fmul_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
-+DEF_HELPER_FLAGS_5(gvec_fabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
-+
- DEF_HELPER_FLAGS_5(gvec_ftsmul_h, TCG_CALL_NO_RWG,
-                    void, ptr, ptr, ptr, ptr, i32)
- DEF_HELPER_FLAGS_5(gvec_ftsmul_s, TCG_CALL_NO_RWG,
-diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/neon-dp.decode
-+++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@
- @3same_q0        .... ... . . . size:2 .... .... .... . 0 . . .... \
-                  &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp q=0
-+# For FP insns the high bit of 'size' is used as part of opcode decode
-+@3same_fp        .... ... . . . . size:1 .... .... .... . q:1 . . .... \
-+                 &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp
-+
- VHADD_S_3s       1111 001 0 0 . .. .... .... 0000 . . . 0 .... @3same
- VHADD_U_3s       1111 001 1 0 . .. .... .... 0000 . . . 0 .... @3same
- VQADD_S_3s       1111 001 0 0 . .. .... .... 0000 . . . 1 .... @3same
-@@ -XXX,XX +XXX,XX @@ SHA256SU1_3s     1111 001 1 0 . 10 .... .... 1100 . 1 . 0 .... \
-                  vm=%vm_dp vn=%vn_dp vd=%vd_dp
- VQRDMLSH_3s      1111 001 1 0 . .. .... .... 1100 ... 1 .... @3same
-+
-+VADD_fp_3s       1111 001 0 0 . 0 . .... .... 1101 ... 0 .... @3same_fp
-+VSUB_fp_3s       1111 001 0 0 . 1 . .... .... 1101 ... 0 .... @3same_fp
-+VABD_fp_3s       1111 001 1 0 . 1 . .... .... 1101 ... 0 .... @3same_fp
-diff --git a/target/arm/neon_helper.c b/target/arm/neon_helper.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/neon_helper.c
-+++ b/target/arm/neon_helper.c
-@@ -XXX,XX +XXX,XX @@ uint64_t HELPER(neon_qneg_s64)(CPUARMState *env, uint64_t x)
- }
- /* NEON Float helpers.  */
--uint32_t HELPER(neon_abd_f32)(uint32_t a, uint32_t b, void *fpstp)
--{
--    float_status *fpst = fpstp;
--    float32 f0 = make_float32(a);
--    float32 f1 = make_float32(b);
--    return float32_val(float32_abs(float32_sub(f0, f1, fpst)));
--}
- /* Floating point comparisons produce an integer result.
-  * Note that EQ doesn't signal InvalidOp for QNaNs but GE and GT do.
-diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-neon.inc.c
-+++ b/target/arm/translate-neon.inc.c
-@@ -XXX,XX +XXX,XX @@ DO_3SAME_PAIR(VPADD, padd_u)
- DO_3SAME_VQDMULH(VQDMULH, qdmulh)
- DO_3SAME_VQDMULH(VQRDMULH, qrdmulh)
-+
-+/*
-+ * For all the functions using this macro, size == 1 means fp16,
-+ * which is an architecture extension we don't implement yet.
-+ */
-+#define DO_3S_FP_GVEC(INSN,FUNC)                                        \
-+    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
-+                                uint32_t rn_ofs, uint32_t rm_ofs,       \
-+                                uint32_t oprsz, uint32_t maxsz)         \
-+    {                                                                   \
-+        TCGv_ptr fpst = get_fpstatus_ptr(1);                            \
-+        tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, fpst,                \
-+                           oprsz, maxsz, 0, FUNC);                      \
-+        tcg_temp_free_ptr(fpst);                                        \
-+    }                                                                   \
-+    static bool trans_##INSN##_fp_3s(DisasContext *s, arg_3same *a)     \
-+    {                                                                   \
-+        if (a->size != 0) {                                             \
-+            /* TODO fp16 support */                                     \
-+            return false;                                               \
-+        }                                                               \
-+        return do_3same(s, a, gen_##INSN##_3s);                         \
-+    }
-+
-+
-+DO_3S_FP_GVEC(VADD, gen_helper_gvec_fadd_s)
-+DO_3S_FP_GVEC(VSUB, gen_helper_gvec_fsub_s)
-+DO_3S_FP_GVEC(VABD, gen_helper_gvec_fabd_s)
-diff --git a/target/arm/translate.c b/target/arm/translate.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.c
-+++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-         switch (op) {
-         case NEON_3R_FLOAT_ARITH:
-             pairwise = (u && size < 2); /* if VPADD (float) */
-+            if (!pairwise) {
-+                return 1; /* handled by decodetree */
-+            }
-             break;
-         case NEON_3R_FLOAT_MINMAX:
-             pairwise = u; /* if VPMIN/VPMAX (float) */
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-         {
-             TCGv_ptr fpstatus = get_fpstatus_ptr(1);
-             switch ((u << 2) | size) {
--            case 0: /* VADD */
-             case 4: /* VPADD */
-                 gen_helper_vfp_adds(tmp, tmp, tmp2, fpstatus);
-                 break;
--            case 2: /* VSUB */
--                gen_helper_vfp_subs(tmp, tmp, tmp2, fpstatus);
--                break;
--            case 6: /* VABD */
--                gen_helper_neon_abd_f32(tmp, tmp, tmp2, fpstatus);
--                break;
-             default:
-                 abort();
-             }
-diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/vec_helper.c
-+++ b/target/arm/vec_helper.c
-@@ -XXX,XX +XXX,XX @@ static float64 float64_ftsmul(float64 op1, uint64_t op2, float_status *stat)
-     return result;
- }
-+static float32 float32_abd(float32 op1, float32 op2, float_status *stat)
-+{
-+    return float32_abs(float32_sub(op1, op2, stat));
-+}
-+
- #define DO_3OP(NAME, FUNC, TYPE) \
- void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
- {                                                                          \
-@@ -XXX,XX +XXX,XX @@ DO_3OP(gvec_ftsmul_h, float16_ftsmul, float16)
- DO_3OP(gvec_ftsmul_s, float32_ftsmul, float32)
- DO_3OP(gvec_ftsmul_d, float64_ftsmul, float64)
-+DO_3OP(gvec_fabd_s, float32_abd, float32)
-+
- #ifdef TARGET_AARCH64
- DO_3OP(gvec_recps_h, helper_recpsf_f16, float16)
---
-.20.1

-[PULL 40/45] target/arm: Convert Neon VPMIN/VPMAX/VPADD float 3-reg-same insns to decodetree
+Deleted patch
-Convert the Neon float VPMIN, VPMAX and VPADD 3-reg-same insns to
-decodetree. These are the only remaining 'pairwise' operations,
-so we can delete the pairwise-specific bits of the old decoder's
-for-each-element loop now.
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200512163904.10918-13-peter.maydell@linaro.org
----
- target/arm/neon-dp.decode       |  5 +++
- target/arm/translate-neon.inc.c | 63 +++++++++++++++++++++++++++++++++
- target/arm/translate.c          | 63 +++++----------------------------
-files changed, 76 insertions(+), 55 deletions(-)
-diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/neon-dp.decode
-+++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@
- # For FP insns the high bit of 'size' is used as part of opcode decode
- @3same_fp        .... ... . . . . size:1 .... .... .... . q:1 . . .... \
-                  &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp
-+@3same_fp_q0     .... ... . . . . size:1 .... .... .... . 0 . . .... \
-+                 &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp q=0
- VHADD_S_3s       1111 001 0 0 . .. .... .... 0000 . . . 0 .... @3same
- VHADD_U_3s       1111 001 1 0 . .. .... .... 0000 . . . 0 .... @3same
-@@ -XXX,XX +XXX,XX @@ VQRDMLSH_3s      1111 001 1 0 . .. .... .... 1100 ... 1 .... @3same
- VADD_fp_3s       1111 001 0 0 . 0 . .... .... 1101 ... 0 .... @3same_fp
- VSUB_fp_3s       1111 001 0 0 . 1 . .... .... 1101 ... 0 .... @3same_fp
-+VPADD_fp_3s      1111 001 1 0 . 0 . .... .... 1101 ... 0 .... @3same_fp_q0
- VABD_fp_3s       1111 001 1 0 . 1 . .... .... 1101 ... 0 .... @3same_fp
-+VPMAX_fp_3s      1111 001 1 0 . 0 . .... .... 1111 ... 0 .... @3same_fp_q0
-+VPMIN_fp_3s      1111 001 1 0 . 1 . .... .... 1111 ... 0 .... @3same_fp_q0
-diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-neon.inc.c
-+++ b/target/arm/translate-neon.inc.c
-@@ -XXX,XX +XXX,XX @@ DO_3SAME_VQDMULH(VQRDMULH, qrdmulh)
- DO_3S_FP_GVEC(VADD, gen_helper_gvec_fadd_s)
- DO_3S_FP_GVEC(VSUB, gen_helper_gvec_fsub_s)
- DO_3S_FP_GVEC(VABD, gen_helper_gvec_fabd_s)
-+
-+static bool do_3same_fp_pair(DisasContext *s, arg_3same *a, VFPGen3OpSPFn *fn)
-+{
-+    /* FP operations handled pairwise 32 bits at a time */
-+    TCGv_i32 tmp, tmp2, tmp3;
-+    TCGv_ptr fpstatus;
-+
-+    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
-+        return false;
-+    }
-+
-+    /* UNDEF accesses to D16-D31 if they don't exist. */
-+    if (!dc_isar_feature(aa32_simd_r32, s) &&
-+        ((a->vd | a->vn | a->vm) & 0x10)) {
-+        return false;
-+    }
-+
-+    if (!vfp_access_check(s)) {
-+        return true;
-+    }
-+
-+    assert(a->q == 0); /* enforced by decode patterns */
-+
-+    /*
-+     * Note that we have to be careful not to clobber the source operands
-+     * in the "vm == vd" case by storing the result of the first pass too
-+     * early. Since Q is 0 there are always just two passes, so instead
-+     * of a complicated loop over each pass we just unroll.
-+     */
-+    fpstatus = get_fpstatus_ptr(1);
-+    tmp = neon_load_reg(a->vn, 0);
-+    tmp2 = neon_load_reg(a->vn, 1);
-+    fn(tmp, tmp, tmp2, fpstatus);
-+    tcg_temp_free_i32(tmp2);
-+
-+    tmp3 = neon_load_reg(a->vm, 0);
-+    tmp2 = neon_load_reg(a->vm, 1);
-+    fn(tmp3, tmp3, tmp2, fpstatus);
-+    tcg_temp_free_i32(tmp2);
-+    tcg_temp_free_ptr(fpstatus);
-+
-+    neon_store_reg(a->vd, 0, tmp);
-+    neon_store_reg(a->vd, 1, tmp3);
-+    return true;
-+}
-+
-+/*
-+ * For all the functions using this macro, size == 1 means fp16,
-+ * which is an architecture extension we don't implement yet.
-+ */
-+#define DO_3S_FP_PAIR(INSN,FUNC)                                    \
-+    static bool trans_##INSN##_fp_3s(DisasContext *s, arg_3same *a) \
-+    {                                                               \
-+        if (a->size != 0) {                                         \
-+            /* TODO fp16 support */                                 \
-+            return false;                                           \
-+        }                                                           \
-+        return do_3same_fp_pair(s, a, FUNC);                        \
-+    }
-+
-+DO_3S_FP_PAIR(VPADD, gen_helper_vfp_adds)
-+DO_3S_FP_PAIR(VPMAX, gen_helper_vfp_maxs)
-+DO_3S_FP_PAIR(VPMIN, gen_helper_vfp_mins)
-diff --git a/target/arm/translate.c b/target/arm/translate.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.c
-+++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-     int shift;
-     int pass;
-     int count;
--    int pairwise;
-     int u;
-     int vec_size;
-     uint32_t imm;
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-         case NEON_3R_VPMIN:
-         case NEON_3R_VPADD_VQRDMLAH:
-         case NEON_3R_VQDMULH_VQRDMULH:
-+        case NEON_3R_FLOAT_ARITH:
-             /* Already handled by decodetree */
-             return 1;
-         }
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-             /* 64-bit element instructions: handled by decodetree */
-             return 1;
-         }
--        pairwise = 0;
-         switch (op) {
--        case NEON_3R_FLOAT_ARITH:
--            pairwise = (u && size < 2); /* if VPADD (float) */
--            if (!pairwise) {
--                return 1; /* handled by decodetree */
--            }
--            break;
-         case NEON_3R_FLOAT_MINMAX:
--            pairwise = u; /* if VPMIN/VPMAX (float) */
-+            if (u) {
-+                return 1; /* VPMIN/VPMAX handled by decodetree */
-+            }
-             break;
-         case NEON_3R_FLOAT_CMP:
-             if (!u && size) {
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-             break;
-         }
--        if (pairwise && q) {
--            /* All the pairwise insns UNDEF if Q is set */
--            return 1;
--        }
--
-         for (pass = 0; pass < (q ? 4 : 2); pass++) {
--        if (pairwise) {
--            /* Pairwise.  */
--            if (pass < 1) {
--                tmp = neon_load_reg(rn, 0);
--                tmp2 = neon_load_reg(rn, 1);
--            } else {
--                tmp = neon_load_reg(rm, 0);
--                tmp2 = neon_load_reg(rm, 1);
--            }
--        } else {
--            /* Elementwise.  */
--            tmp = neon_load_reg(rn, pass);
--            tmp2 = neon_load_reg(rm, pass);
--        }
-+        /* Elementwise.  */
-+        tmp = neon_load_reg(rn, pass);
-+        tmp2 = neon_load_reg(rm, pass);
-         switch (op) {
--        case NEON_3R_FLOAT_ARITH: /* Floating point arithmetic. */
--        {
--            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
--            switch ((u << 2) | size) {
--            case 4: /* VPADD */
--                gen_helper_vfp_adds(tmp, tmp, tmp2, fpstatus);
--                break;
--            default:
--                abort();
--            }
--            tcg_temp_free_ptr(fpstatus);
--            break;
--        }
-         case NEON_3R_FLOAT_MULTIPLY:
-         {
-             TCGv_ptr fpstatus = get_fpstatus_ptr(1);
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-         }
-         tcg_temp_free_i32(tmp2);
--        /* Save the result.  For elementwise operations we can put it
--           straight into the destination register.  For pairwise operations
--           we have to be careful to avoid clobbering the source operands.  */
--        if (pairwise && rd == rm) {
--            neon_store_scratch(pass, tmp);
--        } else {
--            neon_store_reg(rd, pass, tmp);
--        }
-+        neon_store_reg(rd, pass, tmp);
-         } /* for pass */
--        if (pairwise && rd == rm) {
--            for (pass = 0; pass < (q ? 4 : 2); pass++) {
--                tmp = neon_load_scratch(pass);
--                neon_store_reg(rd, pass, tmp);
--            }
--        }
-         /* End of 3 register same size operations.  */
-     } else if (insn & (1 << 4)) {
-         if ((insn & 0x00380080) != 0) {
---
-.20.1

-[PULL 41/45] target/arm: Convert Neon fp VMUL, VMLA, VMLS 3-reg-same insns to decodetree
+Deleted patch
-Convert the Neon integer VMUL, VMLA, and VMLS 3-reg-same inssn to
-decodetree.
-We don't have a gvec helper for multiply-accumulate, so VMLA and VMLS
-need a loop function do_3same_fp().  This takes a reads_vd parameter
-to do_3same_fp() which tells it to load the old value into vd before
-calling the callback function, in the same way that the do_vfp_3op_sp()
-and do_vfp_3op_dp() functions in translate-vfp.inc.c work. (The
-only uses in this patch pass reads_vd == true, but later commits
-will use reads_vd == false.)
-This conversion fixes in passing an underdecoding for VMUL
-(originally reported by Fredrik Strupe <fredrik@strupe.net>): bit 1
-of the 'size' field must be 0.  The old decoder didn't enforce this,
-but the decodetree pattern does.
-The gen_VMLA_fp_reg() function performs the addition operation
-with the operands in the opposite order to the old decoder:
-since Neon sets 'default NaN mode' float32_add operations are
-commutative so there is no behaviour difference, but putting
-them this way around matches the Arm ARM pseudocode and the
-required operation order for the subtraction in gen_VMLS_fp_reg().
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200512163904.10918-14-peter.maydell@linaro.org
----
- target/arm/neon-dp.decode       |  3 ++
- target/arm/translate-neon.inc.c | 81 +++++++++++++++++++++++++++++++++
- target/arm/translate.c          | 17 +------
-files changed, 85 insertions(+), 16 deletions(-)
-diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/neon-dp.decode
-+++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ VADD_fp_3s       1111 001 0 0 . 0 . .... .... 1101 ... 0 .... @3same_fp
- VSUB_fp_3s       1111 001 0 0 . 1 . .... .... 1101 ... 0 .... @3same_fp
- VPADD_fp_3s      1111 001 1 0 . 0 . .... .... 1101 ... 0 .... @3same_fp_q0
- VABD_fp_3s       1111 001 1 0 . 1 . .... .... 1101 ... 0 .... @3same_fp
-+VMLA_fp_3s       1111 001 0 0 . 0 . .... .... 1101 ... 1 .... @3same_fp
-+VMLS_fp_3s       1111 001 0 0 . 1 . .... .... 1101 ... 1 .... @3same_fp
-+VMUL_fp_3s       1111 001 1 0 . 0 . .... .... 1101 ... 1 .... @3same_fp
- VPMAX_fp_3s      1111 001 1 0 . 0 . .... .... 1111 ... 0 .... @3same_fp_q0
- VPMIN_fp_3s      1111 001 1 0 . 1 . .... .... 1111 ... 0 .... @3same_fp_q0
-diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-neon.inc.c
-+++ b/target/arm/translate-neon.inc.c
-@@ -XXX,XX +XXX,XX @@ DO_3SAME_PAIR(VPADD, padd_u)
- DO_3SAME_VQDMULH(VQDMULH, qdmulh)
- DO_3SAME_VQDMULH(VQRDMULH, qrdmulh)
-+static bool do_3same_fp(DisasContext *s, arg_3same *a, VFPGen3OpSPFn *fn,
-+                        bool reads_vd)
-+{
-+    /*
-+     * FP operations handled elementwise 32 bits at a time.
-+     * If reads_vd is true then the old value of Vd will be
-+     * loaded before calling the callback function. This is
-+     * used for multiply-accumulate type operations.
-+     */
-+    TCGv_i32 tmp, tmp2;
-+    int pass;
-+
-+    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
-+        return false;
-+    }
-+
-+    /* UNDEF accesses to D16-D31 if they don't exist. */
-+    if (!dc_isar_feature(aa32_simd_r32, s) &&
-+        ((a->vd | a->vn | a->vm) & 0x10)) {
-+        return false;
-+    }
-+
-+    if ((a->vn | a->vm | a->vd) & a->q) {
-+        return false;
-+    }
-+
-+    if (!vfp_access_check(s)) {
-+        return true;
-+    }
-+
-+    TCGv_ptr fpstatus = get_fpstatus_ptr(1);
-+    for (pass = 0; pass < (a->q ? 4 : 2); pass++) {
-+        tmp = neon_load_reg(a->vn, pass);
-+        tmp2 = neon_load_reg(a->vm, pass);
-+        if (reads_vd) {
-+            TCGv_i32 tmp_rd = neon_load_reg(a->vd, pass);
-+            fn(tmp_rd, tmp, tmp2, fpstatus);
-+            neon_store_reg(a->vd, pass, tmp_rd);
-+            tcg_temp_free_i32(tmp);
-+        } else {
-+            fn(tmp, tmp, tmp2, fpstatus);
-+            neon_store_reg(a->vd, pass, tmp);
-+        }
-+        tcg_temp_free_i32(tmp2);
-+    }
-+    tcg_temp_free_ptr(fpstatus);
-+    return true;
-+}
-+
- /*
-  * For all the functions using this macro, size == 1 means fp16,
-  * which is an architecture extension we don't implement yet.
-@@ -XXX,XX +XXX,XX @@ DO_3SAME_VQDMULH(VQRDMULH, qrdmulh)
- DO_3S_FP_GVEC(VADD, gen_helper_gvec_fadd_s)
- DO_3S_FP_GVEC(VSUB, gen_helper_gvec_fsub_s)
- DO_3S_FP_GVEC(VABD, gen_helper_gvec_fabd_s)
-+DO_3S_FP_GVEC(VMUL, gen_helper_gvec_fmul_s)
-+
-+/*
-+ * For all the functions using this macro, size == 1 means fp16,
-+ * which is an architecture extension we don't implement yet.
-+ */
-+#define DO_3S_FP(INSN,FUNC,READS_VD)                                \
-+    static bool trans_##INSN##_fp_3s(DisasContext *s, arg_3same *a) \
-+    {                                                               \
-+        if (a->size != 0) {                                         \
-+            /* TODO fp16 support */                                 \
-+            return false;                                           \
-+        }                                                           \
-+        return do_3same_fp(s, a, FUNC, READS_VD);                   \
-+    }
-+
-+static void gen_VMLA_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
-+                            TCGv_ptr fpstatus)
-+{
-+    gen_helper_vfp_muls(vn, vn, vm, fpstatus);
-+    gen_helper_vfp_adds(vd, vd, vn, fpstatus);
-+}
-+
-+static void gen_VMLS_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
-+                            TCGv_ptr fpstatus)
-+{
-+    gen_helper_vfp_muls(vn, vn, vm, fpstatus);
-+    gen_helper_vfp_subs(vd, vd, vn, fpstatus);
-+}
-+
-+DO_3S_FP(VMLA, gen_VMLA_fp_3s, true)
-+DO_3S_FP(VMLS, gen_VMLS_fp_3s, true)
- static bool do_3same_fp_pair(DisasContext *s, arg_3same *a, VFPGen3OpSPFn *fn)
- {
-diff --git a/target/arm/translate.c b/target/arm/translate.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.c
-+++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-         case NEON_3R_VPADD_VQRDMLAH:
-         case NEON_3R_VQDMULH_VQRDMULH:
-         case NEON_3R_FLOAT_ARITH:
-+        case NEON_3R_FLOAT_MULTIPLY:
-             /* Already handled by decodetree */
-             return 1;
-         }
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-         tmp = neon_load_reg(rn, pass);
-         tmp2 = neon_load_reg(rm, pass);
-         switch (op) {
--        case NEON_3R_FLOAT_MULTIPLY:
--        {
--            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
--            gen_helper_vfp_muls(tmp, tmp, tmp2, fpstatus);
--            if (!u) {
--                tcg_temp_free_i32(tmp2);
--                tmp2 = neon_load_reg(rd, pass);
--                if (size == 0) {
--                    gen_helper_vfp_adds(tmp, tmp, tmp2, fpstatus);
--                } else {
--                    gen_helper_vfp_subs(tmp, tmp2, tmp, fpstatus);
--                }
--            }
--            tcg_temp_free_ptr(fpstatus);
--            break;
--        }
-         case NEON_3R_FLOAT_CMP:
-         {
-             TCGv_ptr fpstatus = get_fpstatus_ptr(1);
---
-.20.1

-[PULL 43/45] target/arm: Move 'env' argument of recps_f32 and rsqrts_f32 helpers to usual place
+Deleted patch
-The usual location for the env argument in the argument list of a TCG helper
-is immediately after the return-value argument. recps_f32 and rsqrts_f32
-differ in that they put it at the end.
-Move the env argument to its usual place; this will allow us to
-more easily use these helper functions with the gvec APIs.
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200512163904.10918-16-peter.maydell@linaro.org
----
- target/arm/helper.h     | 4 ++--
- target/arm/translate.c  | 4 ++--
- target/arm/vfp_helper.c | 4 ++--
-files changed, 6 insertions(+), 6 deletions(-)
-diff --git a/target/arm/helper.h b/target/arm/helper.h
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.h
-+++ b/target/arm/helper.h
-@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(vfp_fcvt_f64_to_f16, TCG_CALL_NO_RWG, f16, f64, ptr, i32)
- DEF_HELPER_4(vfp_muladdd, f64, f64, f64, f64, ptr)
- DEF_HELPER_4(vfp_muladds, f32, f32, f32, f32, ptr)
--DEF_HELPER_3(recps_f32, f32, f32, f32, env)
--DEF_HELPER_3(rsqrts_f32, f32, f32, f32, env)
-+DEF_HELPER_3(recps_f32, f32, env, f32, f32)
-+DEF_HELPER_3(rsqrts_f32, f32, env, f32, f32)
- DEF_HELPER_FLAGS_2(recpe_f16, TCG_CALL_NO_RWG, f16, f16, ptr)
- DEF_HELPER_FLAGS_2(recpe_f32, TCG_CALL_NO_RWG, f32, f32, ptr)
- DEF_HELPER_FLAGS_2(recpe_f64, TCG_CALL_NO_RWG, f64, f64, ptr)
-diff --git a/target/arm/translate.c b/target/arm/translate.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.c
-+++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-                 tcg_temp_free_ptr(fpstatus);
-             } else {
-                 if (size == 0) {
--                    gen_helper_recps_f32(tmp, tmp, tmp2, cpu_env);
-+                    gen_helper_recps_f32(tmp, cpu_env, tmp, tmp2);
-                 } else {
--                    gen_helper_rsqrts_f32(tmp, tmp, tmp2, cpu_env);
-+                    gen_helper_rsqrts_f32(tmp, cpu_env, tmp, tmp2);
-               }
-             }
-             break;
-diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/vfp_helper.c
-+++ b/target/arm/vfp_helper.c
-@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(vfp_fcvt_f64_to_f16)(float64 a, void *fpstp, uint32_t ahp_mode)
- #define float32_three make_float32(0x40400000)
- #define float32_one_point_five make_float32(0x3fc00000)
--float32 HELPER(recps_f32)(float32 a, float32 b, CPUARMState *env)
-+float32 HELPER(recps_f32)(CPUARMState *env, float32 a, float32 b)
- {
-     float_status *s = &env->vfp.standard_fp_status;
-     if ((float32_is_infinity(a) && float32_is_zero_or_denormal(b)) ||
-@@ -XXX,XX +XXX,XX @@ float32 HELPER(recps_f32)(float32 a, float32 b, CPUARMState *env)
-     return float32_sub(float32_two, float32_mul(a, b, s), s);
- }
--float32 HELPER(rsqrts_f32)(float32 a, float32 b, CPUARMState *env)
-+float32 HELPER(rsqrts_f32)(CPUARMState *env, float32 a, float32 b)
- {
-     float_status *s = &env->vfp.standard_fp_status;
-     float32 product;
---
-.20.1

-[PULL 44/45] target/arm: Convert Neon fp VMAX/VMIN/VMAXNM/VMINNM/VRECPS/VRSQRTS to decodetree
+Deleted patch
-Convert the Neon fp VMAX/VMIN/VMAXNM/VMINNM/VRECPS/VRSQRTS 3-reg-same
-insns to decodetree. (These are all the remaining non-accumulation
-instructions in this group.)
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200512163904.10918-17-peter.maydell@linaro.org
----
- target/arm/neon-dp.decode       |  6 +++
- target/arm/translate-neon.inc.c | 70 +++++++++++++++++++++++++++++++++
- target/arm/translate.c          | 42 +-------------------
-files changed, 78 insertions(+), 40 deletions(-)
-diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/neon-dp.decode
-+++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ VCGE_fp_3s       1111 001 1 0 . 0 . .... .... 1110 ... 0 .... @3same_fp
- VACGE_fp_3s      1111 001 1 0 . 0 . .... .... 1110 ... 1 .... @3same_fp
- VCGT_fp_3s       1111 001 1 0 . 1 . .... .... 1110 ... 0 .... @3same_fp
- VACGT_fp_3s      1111 001 1 0 . 1 . .... .... 1110 ... 1 .... @3same_fp
-+VMAX_fp_3s       1111 001 0 0 . 0 . .... .... 1111 ... 0 .... @3same_fp
-+VMIN_fp_3s       1111 001 0 0 . 1 . .... .... 1111 ... 0 .... @3same_fp
- VPMAX_fp_3s      1111 001 1 0 . 0 . .... .... 1111 ... 0 .... @3same_fp_q0
- VPMIN_fp_3s      1111 001 1 0 . 1 . .... .... 1111 ... 0 .... @3same_fp_q0
-+VRECPS_fp_3s     1111 001 0 0 . 0 . .... .... 1111 ... 1 .... @3same_fp
-+VRSQRTS_fp_3s    1111 001 0 0 . 1 . .... .... 1111 ... 1 .... @3same_fp
-+VMAXNM_fp_3s     1111 001 1 0 . 0 . .... .... 1111 ... 1 .... @3same_fp
-+VMINNM_fp_3s     1111 001 1 0 . 1 . .... .... 1111 ... 1 .... @3same_fp
-diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-neon.inc.c
-+++ b/target/arm/translate-neon.inc.c
-@@ -XXX,XX +XXX,XX @@ DO_3S_FP(VCGE, gen_helper_neon_cge_f32, false)
- DO_3S_FP(VCGT, gen_helper_neon_cgt_f32, false)
- DO_3S_FP(VACGE, gen_helper_neon_acge_f32, false)
- DO_3S_FP(VACGT, gen_helper_neon_acgt_f32, false)
-+DO_3S_FP(VMAX, gen_helper_vfp_maxs, false)
-+DO_3S_FP(VMIN, gen_helper_vfp_mins, false)
- static void gen_VMLA_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
-                             TCGv_ptr fpstatus)
-@@ -XXX,XX +XXX,XX @@ static void gen_VMLS_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
- DO_3S_FP(VMLA, gen_VMLA_fp_3s, true)
- DO_3S_FP(VMLS, gen_VMLS_fp_3s, true)
-+static bool trans_VMAXNM_fp_3s(DisasContext *s, arg_3same *a)
-+{
-+    if (!arm_dc_feature(s, ARM_FEATURE_V8)) {
-+        return false;
-+    }
-+
-+    if (a->size != 0) {
-+        /* TODO fp16 support */
-+        return false;
-+    }
-+
-+    return do_3same_fp(s, a, gen_helper_vfp_maxnums, false);
-+}
-+
-+static bool trans_VMINNM_fp_3s(DisasContext *s, arg_3same *a)
-+{
-+    if (!arm_dc_feature(s, ARM_FEATURE_V8)) {
-+        return false;
-+    }
-+
-+    if (a->size != 0) {
-+        /* TODO fp16 support */
-+        return false;
-+    }
-+
-+    return do_3same_fp(s, a, gen_helper_vfp_minnums, false);
-+}
-+
-+WRAP_ENV_FN(gen_VRECPS_tramp, gen_helper_recps_f32)
-+
-+static void gen_VRECPS_fp_3s(unsigned vece, uint32_t rd_ofs,
-+                             uint32_t rn_ofs, uint32_t rm_ofs,
-+                             uint32_t oprsz, uint32_t maxsz)
-+{
-+    static const GVecGen3 ops = { .fni4 = gen_VRECPS_tramp };
-+    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &ops);
-+}
-+
-+static bool trans_VRECPS_fp_3s(DisasContext *s, arg_3same *a)
-+{
-+    if (a->size != 0) {
-+        /* TODO fp16 support */
-+        return false;
-+    }
-+
-+    return do_3same(s, a, gen_VRECPS_fp_3s);
-+}
-+
-+WRAP_ENV_FN(gen_VRSQRTS_tramp, gen_helper_rsqrts_f32)
-+
-+static void gen_VRSQRTS_fp_3s(unsigned vece, uint32_t rd_ofs,
-+                              uint32_t rn_ofs, uint32_t rm_ofs,
-+                              uint32_t oprsz, uint32_t maxsz)
-+{
-+    static const GVecGen3 ops = { .fni4 = gen_VRSQRTS_tramp };
-+    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &ops);
-+}
-+
-+static bool trans_VRSQRTS_fp_3s(DisasContext *s, arg_3same *a)
-+{
-+    if (a->size != 0) {
-+        /* TODO fp16 support */
-+        return false;
-+    }
-+
-+    return do_3same(s, a, gen_VRSQRTS_fp_3s);
-+}
-+
- static bool do_3same_fp_pair(DisasContext *s, arg_3same *a, VFPGen3OpSPFn *fn)
- {
-     /* FP operations handled pairwise 32 bits at a time */
-diff --git a/target/arm/translate.c b/target/arm/translate.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.c
-+++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-         case NEON_3R_FLOAT_MULTIPLY:
-         case NEON_3R_FLOAT_CMP:
-         case NEON_3R_FLOAT_ACMP:
-+        case NEON_3R_FLOAT_MINMAX:
-+        case NEON_3R_FLOAT_MISC:
-             /* Already handled by decodetree */
-             return 1;
-         }
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-             return 1;
-         }
-         switch (op) {
--        case NEON_3R_FLOAT_MINMAX:
--            if (u) {
--                return 1; /* VPMIN/VPMAX handled by decodetree */
--            }
--            break;
--        case NEON_3R_FLOAT_MISC:
--            /* VMAXNM/VMINNM in ARMv8 */
--            if (u && !arm_dc_feature(s, ARM_FEATURE_V8)) {
--                return 1;
--            }
--            break;
-         case NEON_3R_VFM_VQRDMLSH:
-             if (!dc_isar_feature(aa32_simdfmac, s)) {
-                 return 1;
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-         tmp = neon_load_reg(rn, pass);
-         tmp2 = neon_load_reg(rm, pass);
-         switch (op) {
--        case NEON_3R_FLOAT_MINMAX:
--        {
--            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
--            if (size == 0) {
--                gen_helper_vfp_maxs(tmp, tmp, tmp2, fpstatus);
--            } else {
--                gen_helper_vfp_mins(tmp, tmp, tmp2, fpstatus);
--            }
--            tcg_temp_free_ptr(fpstatus);
--            break;
--        }
--        case NEON_3R_FLOAT_MISC:
--            if (u) {
--                /* VMAXNM/VMINNM */
--                TCGv_ptr fpstatus = get_fpstatus_ptr(1);
--                if (size == 0) {
--                    gen_helper_vfp_maxnums(tmp, tmp, tmp2, fpstatus);
--                } else {
--                    gen_helper_vfp_minnums(tmp, tmp, tmp2, fpstatus);
--                }
--                tcg_temp_free_ptr(fpstatus);
--            } else {
--                if (size == 0) {
--                    gen_helper_recps_f32(tmp, cpu_env, tmp, tmp2);
--                } else {
--                    gen_helper_rsqrts_f32(tmp, cpu_env, tmp, tmp2);
--              }
--            }
--            break;
-         case NEON_3R_VFM_VQRDMLSH:
-         {
-             /* VFMA, VFMS: fused multiply-add */
---
-.20.1

-[PULL 45/45] target/arm: Convert NEON VFMA, VFMS 3-reg-same insns to decodetree
+Deleted patch
-Convert the Neon floating point VFMA and VFMS insn to decodetree.
-These are the last insns in the 3-reg-same group so we can
-remove all the support/loop code from the old decoder.
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20200512163904.10918-18-peter.maydell@linaro.org
----
- target/arm/neon-dp.decode       |   3 +
- target/arm/translate-neon.inc.c |  41 ++++++++
- target/arm/translate.c          | 176 +-------------------------------
-files changed, 46 insertions(+), 174 deletions(-)
-diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/neon-dp.decode
-+++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ SHA256H2_3s      1111 001 1 0 . 01 .... .... 1100 . 1 . 0 .... \
- SHA256SU1_3s     1111 001 1 0 . 10 .... .... 1100 . 1 . 0 .... \
-                  vm=%vm_dp vn=%vn_dp vd=%vd_dp
-+VFMA_fp_3s       1111 001 0 0 . 0 . .... .... 1100 ... 1 .... @3same_fp
-+VFMS_fp_3s       1111 001 0 0 . 1 . .... .... 1100 ... 1 .... @3same_fp
-+
- VQRDMLSH_3s      1111 001 1 0 . .. .... .... 1100 ... 1 .... @3same
- VADD_fp_3s       1111 001 0 0 . 0 . .... .... 1101 ... 0 .... @3same_fp
-diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-neon.inc.c
-+++ b/target/arm/translate-neon.inc.c
-@@ -XXX,XX +XXX,XX @@ static bool trans_VRSQRTS_fp_3s(DisasContext *s, arg_3same *a)
-     return do_3same(s, a, gen_VRSQRTS_fp_3s);
- }
-+static void gen_VFMA_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
-+                            TCGv_ptr fpstatus)
-+{
-+    gen_helper_vfp_muladds(vd, vn, vm, vd, fpstatus);
-+}
-+
-+static bool trans_VFMA_fp_3s(DisasContext *s, arg_3same *a)
-+{
-+    if (!dc_isar_feature(aa32_simdfmac, s)) {
-+        return false;
-+    }
-+
-+    if (a->size != 0) {
-+        /* TODO fp16 support */
-+        return false;
-+    }
-+
-+    return do_3same_fp(s, a, gen_VFMA_fp_3s, true);
-+}
-+
-+static void gen_VFMS_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
-+                            TCGv_ptr fpstatus)
-+{
-+    gen_helper_vfp_negs(vn, vn);
-+    gen_helper_vfp_muladds(vd, vn, vm, vd, fpstatus);
-+}
-+
-+static bool trans_VFMS_fp_3s(DisasContext *s, arg_3same *a)
-+{
-+    if (!dc_isar_feature(aa32_simdfmac, s)) {
-+        return false;
-+    }
-+
-+    if (a->size != 0) {
-+        /* TODO fp16 support */
-+        return false;
-+    }
-+
-+    return do_3same_fp(s, a, gen_VFMS_fp_3s, true);
-+}
-+
- static bool do_3same_fp_pair(DisasContext *s, arg_3same *a, VFPGen3OpSPFn *fn)
- {
-     /* FP operations handled pairwise 32 bits at a time */
-diff --git a/target/arm/translate.c b/target/arm/translate.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.c
-+++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static void gen_neon_narrow_op(int op, int u, int size,
-     }
- }
--/* Symbolic constants for op fields for Neon 3-register same-length.
-- * The values correspond to bits [11:8,4]; see the ARM ARM DDI0406B
-- * table A7-9.
-- */
--#define NEON_3R_VHADD 0
--#define NEON_3R_VQADD 1
--#define NEON_3R_VRHADD 2
--#define NEON_3R_LOGIC 3 /* VAND,VBIC,VORR,VMOV,VORN,VEOR,VBIF,VBIT,VBSL */
--#define NEON_3R_VHSUB 4
--#define NEON_3R_VQSUB 5
--#define NEON_3R_VCGT 6
--#define NEON_3R_VCGE 7
--#define NEON_3R_VSHL 8
--#define NEON_3R_VQSHL 9
--#define NEON_3R_VRSHL 10
--#define NEON_3R_VQRSHL 11
--#define NEON_3R_VMAX 12
--#define NEON_3R_VMIN 13
--#define NEON_3R_VABD 14
--#define NEON_3R_VABA 15
--#define NEON_3R_VADD_VSUB 16
--#define NEON_3R_VTST_VCEQ 17
--#define NEON_3R_VML 18 /* VMLA, VMLS */
--#define NEON_3R_VMUL 19
--#define NEON_3R_VPMAX 20
--#define NEON_3R_VPMIN 21
--#define NEON_3R_VQDMULH_VQRDMULH 22
--#define NEON_3R_VPADD_VQRDMLAH 23
--#define NEON_3R_SHA 24 /* SHA1C,SHA1P,SHA1M,SHA1SU0,SHA256H{2},SHA256SU1 */
--#define NEON_3R_VFM_VQRDMLSH 25 /* VFMA, VFMS, VQRDMLSH */
--#define NEON_3R_FLOAT_ARITH 26 /* float VADD, VSUB, VPADD, VABD */
--#define NEON_3R_FLOAT_MULTIPLY 27 /* float VMLA, VMLS, VMUL */
--#define NEON_3R_FLOAT_CMP 28 /* float VCEQ, VCGE, VCGT */
--#define NEON_3R_FLOAT_ACMP 29 /* float VACGE, VACGT, VACLE, VACLT */
--#define NEON_3R_FLOAT_MINMAX 30 /* float VMIN, VMAX */
--#define NEON_3R_FLOAT_MISC 31 /* float VRECPS, VRSQRTS, VMAXNM/MINNM */
--
--static const uint8_t neon_3r_sizes[] = {
--    [NEON_3R_VHADD] = 0x7,
--    [NEON_3R_VQADD] = 0xf,
--    [NEON_3R_VRHADD] = 0x7,
--    [NEON_3R_LOGIC] = 0xf, /* size field encodes op type */
--    [NEON_3R_VHSUB] = 0x7,
--    [NEON_3R_VQSUB] = 0xf,
--    [NEON_3R_VCGT] = 0x7,
--    [NEON_3R_VCGE] = 0x7,
--    [NEON_3R_VSHL] = 0xf,
--    [NEON_3R_VQSHL] = 0xf,
--    [NEON_3R_VRSHL] = 0xf,
--    [NEON_3R_VQRSHL] = 0xf,
--    [NEON_3R_VMAX] = 0x7,
--    [NEON_3R_VMIN] = 0x7,
--    [NEON_3R_VABD] = 0x7,
--    [NEON_3R_VABA] = 0x7,
--    [NEON_3R_VADD_VSUB] = 0xf,
--    [NEON_3R_VTST_VCEQ] = 0x7,
--    [NEON_3R_VML] = 0x7,
--    [NEON_3R_VMUL] = 0x7,
--    [NEON_3R_VPMAX] = 0x7,
--    [NEON_3R_VPMIN] = 0x7,
--    [NEON_3R_VQDMULH_VQRDMULH] = 0x6,
--    [NEON_3R_VPADD_VQRDMLAH] = 0x7,
--    [NEON_3R_SHA] = 0xf, /* size field encodes op type */
--    [NEON_3R_VFM_VQRDMLSH] = 0x7, /* For VFM, size bit 1 encodes op */
--    [NEON_3R_FLOAT_ARITH] = 0x5, /* size bit 1 encodes op */
--    [NEON_3R_FLOAT_MULTIPLY] = 0x5, /* size bit 1 encodes op */
--    [NEON_3R_FLOAT_CMP] = 0x5, /* size bit 1 encodes op */
--    [NEON_3R_FLOAT_ACMP] = 0x5, /* size bit 1 encodes op */
--    [NEON_3R_FLOAT_MINMAX] = 0x5, /* size bit 1 encodes op */
--    [NEON_3R_FLOAT_MISC] = 0x5, /* size bit 1 encodes op */
--};
--
- /* Symbolic constants for op fields for Neon 2-register miscellaneous.
-  * The values correspond to bits [17:16,10:7]; see the ARM ARM DDI0406B
-  * table A7-13.
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-     rm_ofs = neon_reg_offset(rm, 0);
-     if ((insn & (1 << 23)) == 0) {
--        /* Three register same length.  */
--        op = ((insn >> 7) & 0x1e) | ((insn >> 4) & 1);
--        /* Catch invalid op and bad size combinations: UNDEF */
--        if ((neon_3r_sizes[op] & (1 << size)) == 0) {
--            return 1;
--        }
--        /* All insns of this form UNDEF for either this condition or the
--         * superset of cases "Q==1"; we catch the latter later.
--         */
--        if (q && ((rd | rn | rm) & 1)) {
--            return 1;
--        }
--        switch (op) {
--        case NEON_3R_VFM_VQRDMLSH:
--            if (!u) {
--                /* VFM, VFMS */
--                if (size == 1) {
--                    return 1;
--                }
--                break;
--            }
--            /* VQRDMLSH : handled by decodetree */
--            return 1;
--
--        case NEON_3R_VADD_VSUB:
--        case NEON_3R_LOGIC:
--        case NEON_3R_VMAX:
--        case NEON_3R_VMIN:
--        case NEON_3R_VTST_VCEQ:
--        case NEON_3R_VCGT:
--        case NEON_3R_VCGE:
--        case NEON_3R_VQADD:
--        case NEON_3R_VQSUB:
--        case NEON_3R_VMUL:
--        case NEON_3R_VML:
--        case NEON_3R_VSHL:
--        case NEON_3R_SHA:
--        case NEON_3R_VHADD:
--        case NEON_3R_VRHADD:
--        case NEON_3R_VHSUB:
--        case NEON_3R_VABD:
--        case NEON_3R_VABA:
--        case NEON_3R_VQSHL:
--        case NEON_3R_VRSHL:
--        case NEON_3R_VQRSHL:
--        case NEON_3R_VPMAX:
--        case NEON_3R_VPMIN:
--        case NEON_3R_VPADD_VQRDMLAH:
--        case NEON_3R_VQDMULH_VQRDMULH:
--        case NEON_3R_FLOAT_ARITH:
--        case NEON_3R_FLOAT_MULTIPLY:
--        case NEON_3R_FLOAT_CMP:
--        case NEON_3R_FLOAT_ACMP:
--        case NEON_3R_FLOAT_MINMAX:
--        case NEON_3R_FLOAT_MISC:
--            /* Already handled by decodetree */
--            return 1;
--        }
--
--        if (size == 3) {
--            /* 64-bit element instructions: handled by decodetree */
--            return 1;
--        }
--        switch (op) {
--        case NEON_3R_VFM_VQRDMLSH:
--            if (!dc_isar_feature(aa32_simdfmac, s)) {
--                return 1;
--            }
--            break;
--        default:
--            break;
--        }
--
--        for (pass = 0; pass < (q ? 4 : 2); pass++) {
--
--        /* Elementwise.  */
--        tmp = neon_load_reg(rn, pass);
--        tmp2 = neon_load_reg(rm, pass);
--        switch (op) {
--        case NEON_3R_VFM_VQRDMLSH:
--        {
--            /* VFMA, VFMS: fused multiply-add */
--            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
--            TCGv_i32 tmp3 = neon_load_reg(rd, pass);
--            if (size) {
--                /* VFMS */
--                gen_helper_vfp_negs(tmp, tmp);
--            }
--            gen_helper_vfp_muladds(tmp, tmp, tmp2, tmp3, fpstatus);
--            tcg_temp_free_i32(tmp3);
--            tcg_temp_free_ptr(fpstatus);
--            break;
--        }
--        default:
--            abort();
--        }
--        tcg_temp_free_i32(tmp2);
--
--        neon_store_reg(rd, pass, tmp);
--
--        } /* for pass */
--        /* End of 3 register same size operations.  */
-+        /* Three register same length: handled by decodetree */
-+        return 1;
-     } else if (insn & (1 << 4)) {
-         if ((insn & 0x00380080) != 0) {
-             /* Two registers and shift.  */
---
-.20.1

Mostly this is patches from me and RTH cleaning up and doing
more decodetree conversion for AArch32 Neon. The major new feature
is Dongjiu Geng's patchset to report host memory errors to KVM guests;
also a new aspeed board from Patrick Williams.

thanks
-- PMM

The following changes since commit 035b448b84f3557206abc44d786c5d3db2638f7d:

Merge remote-tracking branch 'remotes/gkurz/tags/9p-next-2020-05-14' into staging (2020-05-14 10:58:30 +0100)

are available in the Git repository at:

https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20200514

for you to fetch changes up to e95485f85657be21135c17a9226e297c21e73360:

target/arm: Convert NEON VFMA, VFMS 3-reg-same insns to decodetree (2020-05-14 15:03:09 +0100)

----------------------------------------------------------------
target-arm queue:
 * target/arm: Use correct GDB XML for M-profile cores
 * target/arm: Code cleanup to use gvec APIs better
 * aspeed: Add support for the sonorapass-bmc board
 * target/arm: Support reporting KVM host memory errors
   to the guest via ACPI notifications
 * target/arm: Finish conversion of Neon 3-reg-same insns to decodetree

----------------------------------------------------------------
Dongjiu Geng (10):
      acpi: nvdimm: change NVDIMM_UUID_LE to a common macro
      hw/arm/virt: Introduce a RAS machine option
      docs: APEI GHES generation and CPER record description
      ACPI: Build related register address fields via hardware error fw_cfg blob
      ACPI: Build Hardware Error Source Table
      ACPI: Record the Generic Error Status Block address
      KVM: Move hwpoison page related functions into kvm-all.c
      ACPI: Record Generic Error Status Block(GESB) table
      target-arm: kvm64: handle SIGBUS signal from kernel or KVM
      MAINTAINERS: Add ACPI/HEST/GHES entries

Patrick Williams (1):
      aspeed: Add support for the sonorapass-bmc board

Peter Maydell (18):
      target/arm: Use correct GDB XML for M-profile cores
      target/arm: Convert Neon 3-reg-same VQRDMLAH/VQRDMLSH to decodetree
      target/arm: Convert Neon 3-reg-same SHA to decodetree
      target/arm: Convert Neon 64-bit element 3-reg-same insns
      target/arm: Convert Neon VHADD 3-reg-same insns
      target/arm: Convert Neon VABA/VABD 3-reg-same to decodetree
      target/arm: Convert Neon VRHADD, VHSUB 3-reg-same insns to decodetree
      target/arm: Convert Neon VQSHL, VRSHL, VQRSHL 3-reg-same insns to decodetree
      target/arm: Convert Neon VPMAX/VPMIN 3-reg-same insns to decodetree
      target/arm: Convert Neon VPADD 3-reg-same insns to decodetree
      target/arm: Convert Neon VQDMULH/VQRDMULH 3-reg-same to decodetree
      target/arm: Convert Neon VADD, VSUB, VABD 3-reg-same insns to decodetree
      target/arm: Convert Neon VPMIN/VPMAX/VPADD float 3-reg-same insns to decodetree
      target/arm: Convert Neon fp VMUL, VMLA, VMLS 3-reg-same insns to decodetree
      target/arm: Convert Neon 3-reg-same compare insns to decodetree
      target/arm: Move 'env' argument of recps_f32 and rsqrts_f32 helpers to usual place
      target/arm: Convert Neon fp VMAX/VMIN/VMAXNM/VMINNM/VRECPS/VRSQRTS to decodetree
      target/arm: Convert NEON VFMA, VFMS 3-reg-same insns to decodetree

Richard Henderson (16):
      target/arm: Create gen_gvec_[us]sra
      target/arm: Create gen_gvec_{u,s}{rshr,rsra}
      target/arm: Create gen_gvec_{sri,sli}
      target/arm: Remove unnecessary range check for VSHL
      target/arm: Tidy handle_vec_simd_shri
      target/arm: Create gen_gvec_{ceq,clt,cle,cgt,cge}0
      target/arm: Create gen_gvec_{mla,mls}
      target/arm: Swap argument order for VSHL during decode
      target/arm: Create gen_gvec_{cmtst,ushl,sshl}
      target/arm: Create gen_gvec_{uqadd, sqadd, uqsub, sqsub}
      target/arm: Remove fp_status from helper_{recpe, rsqrte}_u32
      target/arm: Create gen_gvec_{qrdmla,qrdmls}
      target/arm: Pass pointer to qc to qrdmla/qrdmls
      target/arm: Clear tail in gvec_fmul_idx_*, gvec_fmla_idx_*
      target/arm: Vectorize SABD/UABD
      target/arm: Vectorize SABA/UABA

GDB's remote protocol requires M-profile cores to use the feature
name 'org.gnu.gdb.arm.m-profile' instead of the 'org.gnu.gdb.arm.core'
feature used for A- and R-profile cores. We weren't doing this, which
meant GDB treated our M-profile cores like A-profile ones. This mostly
doesn't matter, but for instance means that it doesn't correctly
handle backtraces where an M-profile exception frame is involved.

Ship a copy of GDB's arm-m-profile.xml and use it on the M-profile
cores.  The integer registers have the same offsets as the
arm-core.xml, but register 25 is the M-profile XPSR rather than the
A-profile CPSR, so we need to update arm_cpu_gdb_read_register() and
arm_cpu_gdb_write_register() to handle XSPR reads and writes.

Fixes: https://bugs.launchpad.net/qemu/+bug/1877136
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20200507134755.13997-1-peter.maydell@linaro.org
---
 configure                 |  4 ++--
 target/arm/cpu_tcg.c      |  1 +
 target/arm/gdbstub.c      | 22 ++++++++++++++++++----
 gdb-xml/arm-m-profile.xml | 27 +++++++++++++++++++++++++++
 4 files changed, 48 insertions(+), 6 deletions(-)
 create mode 100644 gdb-xml/arm-m-profile.xml

diff --git a/configure b/configure
index XXXXXXX..XXXXXXX 100755
--- a/configure
+++ b/configure
@@ -XXX,XX +XXX,XX @@ case "$target_name" in
     TARGET_SYSTBL_ABI=common,oabi
     bflt="yes"
     mttcg="yes"
-    gdb_xml_files="arm-core.xml arm-vfp.xml arm-vfp3.xml arm-neon.xml"
+    gdb_xml_files="arm-core.xml arm-vfp.xml arm-vfp3.xml arm-neon.xml arm-m-profile.xml"
   ;;
   aarch64|aarch64_be)
     TARGET_ARCH=aarch64
     TARGET_BASE_ARCH=arm
     bflt="yes"
     mttcg="yes"
-    gdb_xml_files="aarch64-core.xml aarch64-fpu.xml arm-core.xml arm-vfp.xml arm-vfp3.xml arm-neon.xml"
+    gdb_xml_files="aarch64-core.xml aarch64-fpu.xml arm-core.xml arm-vfp.xml arm-vfp3.xml arm-neon.xml arm-m-profile.xml"
   ;;
   cris)
   ;;
diff --git a/target/arm/cpu_tcg.c b/target/arm/cpu_tcg.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu_tcg.c
+++ b/target/arm/cpu_tcg.c
@@ -XXX,XX +XXX,XX @@ static void arm_v7m_class_init(ObjectClass *oc, void *data)
 #endif
 
     cc->cpu_exec_interrupt = arm_v7m_cpu_exec_interrupt;
+    cc->gdb_core_xml_file = "arm-m-profile.xml";
 }
 
 static const ARMCPUInfo arm_tcg_cpus[] = {
diff --git a/target/arm/gdbstub.c b/target/arm/gdbstub.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/gdbstub.c
+++ b/target/arm/gdbstub.c
@@ -XXX,XX +XXX,XX @@ int arm_cpu_gdb_read_register(CPUState *cs, GByteArray *mem_buf, int n)
         }
         return gdb_get_reg32(mem_buf, 0);
     case 25:
-        /* CPSR */
-        return gdb_get_reg32(mem_buf, cpsr_read(env));
+        /* CPSR, or XPSR for M-profile */
+        if (arm_feature(env, ARM_FEATURE_M)) {
+            return gdb_get_reg32(mem_buf, xpsr_read(env));
+        } else {
+            return gdb_get_reg32(mem_buf, cpsr_read(env));
+        }
     }
     /* Unknown register.  */
     return 0;
@@ -XXX,XX +XXX,XX @@ int arm_cpu_gdb_write_register(CPUState *cs, uint8_t *mem_buf, int n)
         }
         return 4;
     case 25:
-        /* CPSR */
-        cpsr_write(env, tmp, 0xffffffff, CPSRWriteByGDBStub);
+        /* CPSR, or XPSR for M-profile */
+        if (arm_feature(env, ARM_FEATURE_M)) {
+            /*
+             * Don't allow writing to XPSR.Exception as it can cause
+             * a transition into or out of handler mode (it's not
+             * writeable via the MSR insn so this is a reasonable
+             * restriction). Other fields are safe to update.
+             */
+            xpsr_write(env, tmp, ~XPSR_EXCP);
+        } else {
+            cpsr_write(env, tmp, 0xffffffff, CPSRWriteByGDBStub);
+        }
         return 4;
     }
     /* Unknown register.  */
diff --git a/gdb-xml/arm-m-profile.xml b/gdb-xml/arm-m-profile.xml
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/gdb-xml/arm-m-profile.xml
@@ -XXX,XX +XXX,XX @@
+<?xml version="1.0"?>
+
+
+<!DOCTYPE feature SYSTEM "gdb-target.dtd">
+<feature name="org.gnu.gdb.arm.m-profile">
+  <reg name="r0" bitsize="32"/>
+  <reg name="r1" bitsize="32"/>
+  <reg name="r2" bitsize="32"/>
+  <reg name="r3" bitsize="32"/>
+  <reg name="r4" bitsize="32"/>
+  <reg name="r5" bitsize="32"/>
+  <reg name="r6" bitsize="32"/>
+  <reg name="r7" bitsize="32"/>
+  <reg name="r8" bitsize="32"/>
+  <reg name="r9" bitsize="32"/>
+  <reg name="r10" bitsize="32"/>
+  <reg name="r11" bitsize="32"/>
+  <reg name="r12" bitsize="32"/>
+  <reg name="sp" bitsize="32" type="data_ptr"/>
+  <reg name="lr" bitsize="32"/>
+  <reg name="pc" bitsize="32" type="code_ptr"/>
+  <reg name="xpsr" bitsize="32" regnum="25"/>
+</feature>
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

The functions eliminate duplication of the special cases for
this operation.  They match up with the GVecGen2iFn typedef.

Add out-of-line helpers.  We got away with only having inline
expanders because the neon vector size is only 16 bytes, and
we know that the inline expansion will always succeed.
When we reuse this for SVE, tcg-gvec-op may decide to use an
out-of-line helper due to longer vector lengths.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-2-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.h        |  10 +++
 target/arm/translate.h     |   7 +-
 target/arm/translate-a64.c |  15 +---
 target/arm/translate.c     | 161 ++++++++++++++++++++++---------------
 target/arm/vec_helper.c    |  25 ++++++
 5 files changed, 139 insertions(+), 79 deletions(-)

From: Richard Henderson <richard.henderson@linaro.org>

Create vectorized versions of handle_shri_with_rndacc
for shift+round and shift+round+accumulate.  Add out-of-line
helpers in preparation for longer vector lengths from SVE.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-3-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.h        |  20 ++
 target/arm/translate.h     |   9 +
 target/arm/translate-a64.c |  11 +-
 target/arm/translate.c     | 463 +++++++++++++++++++++++++++++++++++--
 target/arm/vec_helper.c    |  50 ++++
 5 files changed, 527 insertions(+), 26 deletions(-)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(gvec_usra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(gvec_usra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(gvec_usra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_3(gvec_srshr_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_srshr_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_srshr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_srshr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_3(gvec_urshr_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_urshr_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_urshr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_urshr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_3(gvec_srsra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_srsra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_srsra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_srsra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_3(gvec_ursra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_ursra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_ursra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_ursra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+
 #ifdef TARGET_AARCH64
 #include "helper-a64.h"
 #include "helper-sve.h"
diff --git a/target/arm/translate.h b/target/arm/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
                    int64_t shift, uint32_t opr_sz, uint32_t max_sz);
 
+void gen_gvec_srshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                    int64_t shift, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_urshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                    int64_t shift, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_srsra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                    int64_t shift, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_ursra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                    int64_t shift, uint32_t opr_sz, uint32_t max_sz);
+
 /*
  * Forward to the isar_feature_* tests given a DisasContext pointer.
  */
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
         return;
 
     case 0x04: /* SRSHR / URSHR (rounding) */
-        break;
+        gen_gvec_fn2i(s, is_q, rd, rn, shift,
+                      is_u ? gen_gvec_urshr : gen_gvec_srshr, size);
+        return;
+
     case 0x06: /* SRSRA / URSRA (accum + rounding) */
-        accumulate = true;
-        break;
+        gen_gvec_fn2i(s, is_q, rd, rn, shift,
+                      is_u ? gen_gvec_ursra : gen_gvec_srsra, size);
+        return;
+
     default:
         g_assert_not_reached();
     }
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
     }
 }
 
+/*
+ * Shift one less than the requested amount, and the low bit is
+ * the rounding bit.  For the 8 and 16-bit operations, because we
+ * mask the low bit, we can perform a normal integer shift instead
+ * of a vector shift.
+ */
+static void gen_srshr8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_shri_i64(t, a, sh - 1);
+    tcg_gen_andi_i64(t, t, dup_const(MO_8, 1));
+    tcg_gen_vec_sar8i_i64(d, a, sh);
+    tcg_gen_vec_add8_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_srshr16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_shri_i64(t, a, sh - 1);
+    tcg_gen_andi_i64(t, t, dup_const(MO_16, 1));
+    tcg_gen_vec_sar16i_i64(d, a, sh);
+    tcg_gen_vec_add16_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_srshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
+{
+    TCGv_i32 t = tcg_temp_new_i32();
+
+    tcg_gen_extract_i32(t, a, sh - 1, 1);
+    tcg_gen_sari_i32(d, a, sh);
+    tcg_gen_add_i32(d, d, t);
+    tcg_temp_free_i32(t);
+}
+
+static void gen_srshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_extract_i64(t, a, sh - 1, 1);
+    tcg_gen_sari_i64(d, a, sh);
+    tcg_gen_add_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_srshr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
+{
+    TCGv_vec t = tcg_temp_new_vec_matching(d);
+    TCGv_vec ones = tcg_temp_new_vec_matching(d);
+
+    tcg_gen_shri_vec(vece, t, a, sh - 1);
+    tcg_gen_dupi_vec(vece, ones, 1);
+    tcg_gen_and_vec(vece, t, t, ones);
+    tcg_gen_sari_vec(vece, d, a, sh);
+    tcg_gen_add_vec(vece, d, d, t);
+
+    tcg_temp_free_vec(t);
+    tcg_temp_free_vec(ones);
+}
+
+void gen_gvec_srshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                    int64_t shift, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_shri_vec, INDEX_op_sari_vec, INDEX_op_add_vec, 0
+    };
+    static const GVecGen2i ops[4] = {
+        { .fni8 = gen_srshr8_i64,
+          .fniv = gen_srshr_vec,
+          .fno = gen_helper_gvec_srshr_b,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fni8 = gen_srshr16_i64,
+          .fniv = gen_srshr_vec,
+          .fno = gen_helper_gvec_srshr_h,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fni4 = gen_srshr32_i32,
+          .fniv = gen_srshr_vec,
+          .fno = gen_helper_gvec_srshr_s,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+        { .fni8 = gen_srshr64_i64,
+          .fniv = gen_srshr_vec,
+          .fno = gen_helper_gvec_srshr_d,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .opt_opc = vecop_list,
+          .vece = MO_64 },
+    };
+
+    /* tszimm encoding produces immediates in the range [1..esize] */
+    tcg_debug_assert(shift > 0);
+    tcg_debug_assert(shift <= (8 << vece));
+
+    if (shift == (8 << vece)) {
+        /*
+         * Shifts larger than the element size are architecturally valid.
+         * Signed results in all sign bits.  With rounding, this produces
+         *   (-1 + 1) >> 1 == 0, or (0 + 1) >> 1 == 0.
+         * I.e. always zero.
+         */
+        tcg_gen_gvec_dup_imm(vece, rd_ofs, opr_sz, max_sz, 0);
+    } else {
+        tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
+    }
+}
+
+static void gen_srsra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    gen_srshr8_i64(t, a, sh);
+    tcg_gen_vec_add8_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_srsra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    gen_srshr16_i64(t, a, sh);
+    tcg_gen_vec_add16_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_srsra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
+{
+    TCGv_i32 t = tcg_temp_new_i32();
+
+    gen_srshr32_i32(t, a, sh);
+    tcg_gen_add_i32(d, d, t);
+    tcg_temp_free_i32(t);
+}
+
+static void gen_srsra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    gen_srshr64_i64(t, a, sh);
+    tcg_gen_add_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_srsra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
+{
+    TCGv_vec t = tcg_temp_new_vec_matching(d);
+
+    gen_srshr_vec(vece, t, a, sh);
+    tcg_gen_add_vec(vece, d, d, t);
+    tcg_temp_free_vec(t);
+}
+
+void gen_gvec_srsra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                    int64_t shift, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_shri_vec, INDEX_op_sari_vec, INDEX_op_add_vec, 0
+    };
+    static const GVecGen2i ops[4] = {
+        { .fni8 = gen_srsra8_i64,
+          .fniv = gen_srsra_vec,
+          .fno = gen_helper_gvec_srsra_b,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_8 },
+        { .fni8 = gen_srsra16_i64,
+          .fniv = gen_srsra_vec,
+          .fno = gen_helper_gvec_srsra_h,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_16 },
+        { .fni4 = gen_srsra32_i32,
+          .fniv = gen_srsra_vec,
+          .fno = gen_helper_gvec_srsra_s,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_32 },
+        { .fni8 = gen_srsra64_i64,
+          .fniv = gen_srsra_vec,
+          .fno = gen_helper_gvec_srsra_d,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_64 },
+    };
+
+    /* tszimm encoding produces immediates in the range [1..esize] */
+    tcg_debug_assert(shift > 0);
+    tcg_debug_assert(shift <= (8 << vece));
+
+    /*
+     * Shifts larger than the element size are architecturally valid.
+     * Signed results in all sign bits.  With rounding, this produces
+     *   (-1 + 1) >> 1 == 0, or (0 + 1) >> 1 == 0.
+     * I.e. always zero.  With accumulation, this leaves D unchanged.
+     */
+    if (shift == (8 << vece)) {
+        /* Nop, but we do need to clear the tail. */
+        tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz);
+    } else {
+        tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
+    }
+}
+
+static void gen_urshr8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_shri_i64(t, a, sh - 1);
+    tcg_gen_andi_i64(t, t, dup_const(MO_8, 1));
+    tcg_gen_vec_shr8i_i64(d, a, sh);
+    tcg_gen_vec_add8_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_urshr16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_shri_i64(t, a, sh - 1);
+    tcg_gen_andi_i64(t, t, dup_const(MO_16, 1));
+    tcg_gen_vec_shr16i_i64(d, a, sh);
+    tcg_gen_vec_add16_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_urshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
+{
+    TCGv_i32 t = tcg_temp_new_i32();
+
+    tcg_gen_extract_i32(t, a, sh - 1, 1);
+    tcg_gen_shri_i32(d, a, sh);
+    tcg_gen_add_i32(d, d, t);
+    tcg_temp_free_i32(t);
+}
+
+static void gen_urshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_extract_i64(t, a, sh - 1, 1);
+    tcg_gen_shri_i64(d, a, sh);
+    tcg_gen_add_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_urshr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t shift)
+{
+    TCGv_vec t = tcg_temp_new_vec_matching(d);
+    TCGv_vec ones = tcg_temp_new_vec_matching(d);
+
+    tcg_gen_shri_vec(vece, t, a, shift - 1);
+    tcg_gen_dupi_vec(vece, ones, 1);
+    tcg_gen_and_vec(vece, t, t, ones);
+    tcg_gen_shri_vec(vece, d, a, shift);
+    tcg_gen_add_vec(vece, d, d, t);
+
+    tcg_temp_free_vec(t);
+    tcg_temp_free_vec(ones);
+}
+
+void gen_gvec_urshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                    int64_t shift, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_shri_vec, INDEX_op_add_vec, 0
+    };
+    static const GVecGen2i ops[4] = {
+        { .fni8 = gen_urshr8_i64,
+          .fniv = gen_urshr_vec,
+          .fno = gen_helper_gvec_urshr_b,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fni8 = gen_urshr16_i64,
+          .fniv = gen_urshr_vec,
+          .fno = gen_helper_gvec_urshr_h,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fni4 = gen_urshr32_i32,
+          .fniv = gen_urshr_vec,
+          .fno = gen_helper_gvec_urshr_s,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+        { .fni8 = gen_urshr64_i64,
+          .fniv = gen_urshr_vec,
+          .fno = gen_helper_gvec_urshr_d,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .opt_opc = vecop_list,
+          .vece = MO_64 },
+    };
+
+    /* tszimm encoding produces immediates in the range [1..esize] */
+    tcg_debug_assert(shift > 0);
+    tcg_debug_assert(shift <= (8 << vece));
+
+    if (shift == (8 << vece)) {
+        /*
+         * Shifts larger than the element size are architecturally valid.
+         * Unsigned results in zero.  With rounding, this produces a
+         * copy of the most significant bit.
+         */
+        tcg_gen_gvec_shri(vece, rd_ofs, rm_ofs, shift - 1, opr_sz, max_sz);
+    } else {
+        tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
+    }
+}
+
+static void gen_ursra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    if (sh == 8) {
+        tcg_gen_vec_shr8i_i64(t, a, 7);
+    } else {
+        gen_urshr8_i64(t, a, sh);
+    }
+    tcg_gen_vec_add8_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_ursra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    if (sh == 16) {
+        tcg_gen_vec_shr16i_i64(t, a, 15);
+    } else {
+        gen_urshr16_i64(t, a, sh);
+    }
+    tcg_gen_vec_add16_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_ursra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
+{
+    TCGv_i32 t = tcg_temp_new_i32();
+
+    if (sh == 32) {
+        tcg_gen_shri_i32(t, a, 31);
+    } else {
+        gen_urshr32_i32(t, a, sh);
+    }
+    tcg_gen_add_i32(d, d, t);
+    tcg_temp_free_i32(t);
+}
+
+static void gen_ursra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    if (sh == 64) {
+        tcg_gen_shri_i64(t, a, 63);
+    } else {
+        gen_urshr64_i64(t, a, sh);
+    }
+    tcg_gen_add_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_ursra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
+{
+    TCGv_vec t = tcg_temp_new_vec_matching(d);
+
+    if (sh == (8 << vece)) {
+        tcg_gen_shri_vec(vece, t, a, sh - 1);
+    } else {
+        gen_urshr_vec(vece, t, a, sh);
+    }
+    tcg_gen_add_vec(vece, d, d, t);
+    tcg_temp_free_vec(t);
+}
+
+void gen_gvec_ursra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                    int64_t shift, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_shri_vec, INDEX_op_add_vec, 0
+    };
+    static const GVecGen2i ops[4] = {
+        { .fni8 = gen_ursra8_i64,
+          .fniv = gen_ursra_vec,
+          .fno = gen_helper_gvec_ursra_b,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_8 },
+        { .fni8 = gen_ursra16_i64,
+          .fniv = gen_ursra_vec,
+          .fno = gen_helper_gvec_ursra_h,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_16 },
+        { .fni4 = gen_ursra32_i32,
+          .fniv = gen_ursra_vec,
+          .fno = gen_helper_gvec_ursra_s,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_32 },
+        { .fni8 = gen_ursra64_i64,
+          .fniv = gen_ursra_vec,
+          .fno = gen_helper_gvec_ursra_d,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_64 },
+    };
+
+    /* tszimm encoding produces immediates in the range [1..esize] */
+    tcg_debug_assert(shift > 0);
+    tcg_debug_assert(shift <= (8 << vece));
+
+    tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
+}
+
 static void gen_shr8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
 {
     uint64_t mask = dup_const(MO_8, 0xff >> shift);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                     }
                     return 0;
 
+                case 2: /* VRSHR */
+                    /* Right shift comes here negative.  */
+                    shift = -shift;
+                    if (u) {
+                        gen_gvec_urshr(size, rd_ofs, rm_ofs, shift,
+                                       vec_size, vec_size);
+                    } else {
+                        gen_gvec_srshr(size, rd_ofs, rm_ofs, shift,
+                                       vec_size, vec_size);
+                    }
+                    return 0;
+
+                case 3: /* VRSRA */
+                    /* Right shift comes here negative.  */
+                    shift = -shift;
+                    if (u) {
+                        gen_gvec_ursra(size, rd_ofs, rm_ofs, shift,
+                                       vec_size, vec_size);
+                    } else {
+                        gen_gvec_srsra(size, rd_ofs, rm_ofs, shift,
+                                       vec_size, vec_size);
+                    }
+                    return 0;
+
                 case 4: /* VSRI */
                     if (!u) {
                         return 1;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                         neon_load_reg64(cpu_V0, rm + pass);
                         tcg_gen_movi_i64(cpu_V1, imm);
                         switch (op) {
-                        case 2: /* VRSHR */
-                        case 3: /* VRSRA */
-                            if (u)
-                                gen_helper_neon_rshl_u64(cpu_V0, cpu_V0, cpu_V1);
-                            else
-                                gen_helper_neon_rshl_s64(cpu_V0, cpu_V0, cpu_V1);
-                            break;
                         case 6: /* VQSHLU */
                             gen_helper_neon_qshlu_s64(cpu_V0, cpu_env,
                                                       cpu_V0, cpu_V1);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                         default:
                             g_assert_not_reached();
                         }
-                        if (op == 3) {
-                            /* Accumulate.  */
-                            neon_load_reg64(cpu_V1, rd + pass);
-                            tcg_gen_add_i64(cpu_V0, cpu_V0, cpu_V1);
-                        }
                         neon_store_reg64(cpu_V0, rd + pass);
                     } else { /* size < 3 */
                         /* Operands in T0 and T1.  */
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                         tmp2 = tcg_temp_new_i32();
                         tcg_gen_movi_i32(tmp2, imm);
                         switch (op) {
-                        case 2: /* VRSHR */
-                        case 3: /* VRSRA */
-                            GEN_NEON_INTEGER_OP(rshl);
-                            break;
                         case 6: /* VQSHLU */
                             switch (size) {
                             case 0:
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                             g_assert_not_reached();
                         }
                         tcg_temp_free_i32(tmp2);
-
-                        if (op == 3) {
-                            /* Accumulate.  */
-                            tmp2 = neon_load_reg(rd, pass);
-                            gen_neon_add(size, tmp, tmp2);
-                            tcg_temp_free_i32(tmp2);
-                        }
                         neon_store_reg(rd, pass, tmp);
                     }
                 } /* for pass */
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -XXX,XX +XXX,XX @@ DO_SRA(gvec_usra_d, uint64_t)
 
 #undef DO_SRA
 
+#define DO_RSHR(NAME, TYPE)                             \
+void HELPER(NAME)(void *vd, void *vn, uint32_t desc)    \
+{                                                       \
+    intptr_t i, oprsz = simd_oprsz(desc);               \
+    int shift = simd_data(desc);                        \
+    TYPE *d = vd, *n = vn;                              \
+    for (i = 0; i < oprsz / sizeof(TYPE); i++) {        \
+        TYPE tmp = n[i] >> (shift - 1);                 \
+        d[i] = (tmp >> 1) + (tmp & 1);                  \
+    }                                                   \
+    clear_tail(d, oprsz, simd_maxsz(desc));             \
+}
+
+DO_RSHR(gvec_srshr_b, int8_t)
+DO_RSHR(gvec_srshr_h, int16_t)
+DO_RSHR(gvec_srshr_s, int32_t)
+DO_RSHR(gvec_srshr_d, int64_t)
+
+DO_RSHR(gvec_urshr_b, uint8_t)
+DO_RSHR(gvec_urshr_h, uint16_t)
+DO_RSHR(gvec_urshr_s, uint32_t)
+DO_RSHR(gvec_urshr_d, uint64_t)
+
+#undef DO_RSHR
+
+#define DO_RSRA(NAME, TYPE)                             \
+void HELPER(NAME)(void *vd, void *vn, uint32_t desc)    \
+{                                                       \
+    intptr_t i, oprsz = simd_oprsz(desc);               \
+    int shift = simd_data(desc);                        \
+    TYPE *d = vd, *n = vn;                              \
+    for (i = 0; i < oprsz / sizeof(TYPE); i++) {        \
+        TYPE tmp = n[i] >> (shift - 1);                 \
+        d[i] += (tmp >> 1) + (tmp & 1);                 \
+    }                                                   \
+    clear_tail(d, oprsz, simd_maxsz(desc));             \
+}
+
+DO_RSRA(gvec_srsra_b, int8_t)
+DO_RSRA(gvec_srsra_h, int16_t)
+DO_RSRA(gvec_srsra_s, int32_t)
+DO_RSRA(gvec_srsra_d, int64_t)
+
+DO_RSRA(gvec_ursra_b, uint8_t)
+DO_RSRA(gvec_ursra_h, uint16_t)
+DO_RSRA(gvec_ursra_s, uint32_t)
+DO_RSRA(gvec_ursra_d, uint64_t)
+
+#undef DO_RSRA
+
 /*
  * Convert float16 to float32, raising no exceptions and
  * preserving exceptional values, including SNaN.
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

The functions eliminate duplication of the special cases for
this operation.  They match up with the GVecGen2iFn typedef.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-4-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.h        |  10 ++
 target/arm/translate.h     |   7 +-
 target/arm/translate-a64.c |  20 +---
 target/arm/translate.c     | 186 +++++++++++++++++++++----------------
 target/arm/vec_helper.c    |  38 ++++++++
 5 files changed, 160 insertions(+), 101 deletions(-)

From: Richard Henderson <richard.henderson@linaro.org>

In 1dc8425e551, while converting to gvec, I added an extra range check
against the shift count.  This was unnecessary because the encoding of
the shift count produces 0 to the element size - 1.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-5-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate.c | 12 ++----------
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                         gen_gvec_sli(size, rd_ofs, rm_ofs, shift,
                                      vec_size, vec_size);
                     } else { /* VSHL */
-                        /* Shifts larger than the element size are
-                         * architecturally valid and results in zero.
-                         */
-                        if (shift >= 8 << size) {
-                            tcg_gen_gvec_dup_imm(size, rd_ofs,
-                                                 vec_size, vec_size, 0);
-                        } else {
-                            tcg_gen_gvec_shli(size, rd_ofs, rm_ofs, shift,
-                                              vec_size, vec_size);
-                        }
+                        tcg_gen_gvec_shli(size, rd_ofs, rm_ofs, shift,
+                                          vec_size, vec_size);
                     }
                     return 0;
                 }
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

Now that we've converted all cases to gvec, there is quite a bit
of dead code at the end of the function.  Remove it.

Sink the call to gen_gvec_fn2i to the end, loading a function
pointer within the switch statement.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-6-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate-a64.c | 56 ++++++++++----------------------------
 1 file changed, 14 insertions(+), 42 deletions(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
     int size = 32 - clz32(immh) - 1;
     int immhb = immh << 3 | immb;
     int shift = 2 * (8 << size) - immhb;
-    bool accumulate = false;
-    int dsize = is_q ? 128 : 64;
-    int esize = 8 << size;
-    int elements = dsize/esize;
-    MemOp memop = size | (is_u ? 0 : MO_SIGN);
-    TCGv_i64 tcg_rn = new_tmp_a64(s);
-    TCGv_i64 tcg_rd = new_tmp_a64(s);
-    TCGv_i64 tcg_round;
-    uint64_t round_const;
-    int i;
+    GVecGen2iFn *gvec_fn;
 
     if (extract32(immh, 3, 1) && !is_q) {
         unallocated_encoding(s);
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
 
     switch (opcode) {
     case 0x02: /* SSRA / USRA (accumulate) */
-        gen_gvec_fn2i(s, is_q, rd, rn, shift,
-                      is_u ? gen_gvec_usra : gen_gvec_ssra, size);
-        return;
+        gvec_fn = is_u ? gen_gvec_usra : gen_gvec_ssra;
+        break;
 
     case 0x08: /* SRI */
-        gen_gvec_fn2i(s, is_q, rd, rn, shift, gen_gvec_sri, size);
-        return;
+        gvec_fn = gen_gvec_sri;
+        break;
 
     case 0x00: /* SSHR / USHR */
         if (is_u) {
@@ -XXX,XX +XXX,XX @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
                 /* Shift count the same size as element size produces zero.  */
                 tcg_gen_gvec_dup_imm(size, vec_full_reg_offset(s, rd),
                                      is_q ? 16 : 8, vec_full_reg_size(s), 0);
-            } else {
-                gen_gvec_fn2i(s, is_q, rd, rn, shift, tcg_gen_gvec_shri, size);
+                return;
             }
+            gvec_fn = tcg_gen_gvec_shri;
         } else {
             /* Shift count the same size as element size produces all sign.  */
             if (shift == 8 << size) {
                 shift -= 1;
             }
-            gen_gvec_fn2i(s, is_q, rd, rn, shift, tcg_gen_gvec_sari, size);
+            gvec_fn = tcg_gen_gvec_sari;
         }
-        return;
+        break;
 
     case 0x04: /* SRSHR / URSHR (rounding) */
-        gen_gvec_fn2i(s, is_q, rd, rn, shift,
-                      is_u ? gen_gvec_urshr : gen_gvec_srshr, size);
-        return;
+        gvec_fn = is_u ? gen_gvec_urshr : gen_gvec_srshr;
+        break;
 
     case 0x06: /* SRSRA / URSRA (accum + rounding) */
-        gen_gvec_fn2i(s, is_q, rd, rn, shift,
-                      is_u ? gen_gvec_ursra : gen_gvec_srsra, size);
-        return;
+        gvec_fn = is_u ? gen_gvec_ursra : gen_gvec_srsra;
+        break;
 
     default:
         g_assert_not_reached();
     }
 
-    round_const = 1ULL << (shift - 1);
-    tcg_round = tcg_const_i64(round_const);
-
-    for (i = 0; i < elements; i++) {
-        read_vec_element(s, tcg_rn, rn, i, memop);
-        if (accumulate) {
-            read_vec_element(s, tcg_rd, rd, i, memop);
-        }
-
-        handle_shri_with_rndacc(tcg_rd, tcg_rn, tcg_round,
-                                accumulate, is_u, size, shift);
-
-        write_vec_element(s, tcg_rd, rd, i, size);
-    }
-    tcg_temp_free_i64(tcg_round);
-
-    clear_vec_high(s, is_q, rd);
+    gen_gvec_fn2i(s, is_q, rd, rn, shift, gvec_fn, size);
 }
 
 /* SHL/SLI - Vector shift left */
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

Provide a functional interface for the vector expansion.
This fits better with the existing set of helpers that
we provide for other operations.

Macro-ize the 5 nearly identical comparisons.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-7-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate.h     |  16 ++-
 target/arm/translate-a64.c |  22 ++--
 target/arm/translate.c     | 254 ++++++++-----------------------------
 3 files changed, 74 insertions(+), 218 deletions(-)

diff --git a/target/arm/translate.h b/target/arm/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ static inline void gen_swstep_exception(DisasContext *s, int isv, int ex)
 uint64_t vfp_expand_imm(int size, uint8_t imm8);
 
 /* Vector operations shared between ARM and AArch64.  */
-extern const GVecGen2 ceq0_op[4];
-extern const GVecGen2 clt0_op[4];
-extern const GVecGen2 cgt0_op[4];
-extern const GVecGen2 cle0_op[4];
-extern const GVecGen2 cge0_op[4];
+void gen_gvec_ceq0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                   uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_clt0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                   uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_cgt0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                   uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_cle0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                   uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_cge0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                   uint32_t opr_sz, uint32_t max_sz);
+
 extern const GVecGen3 mla_op[4];
 extern const GVecGen3 mls_op[4];
 extern const GVecGen3 cmtst_op[4];
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void gen_gvec_fn4(DisasContext *s, bool is_q, int rd, int rn, int rm,
             is_q ? 16 : 8, vec_full_reg_size(s));
 }
 
-/* Expand a 2-operand AdvSIMD vector operation using an op descriptor. */
-static void gen_gvec_op2(DisasContext *s, bool is_q, int rd,
-                         int rn, const GVecGen2 *gvec_op)
-{
-    tcg_gen_gvec_2(vec_full_reg_offset(s, rd), vec_full_reg_offset(s, rn),
-                   is_q ? 16 : 8, vec_full_reg_size(s), gvec_op);
-}
-
 /* Expand a 3-operand AdvSIMD vector operation using an op descriptor.  */
 static void gen_gvec_op3(DisasContext *s, bool is_q, int rd,
                          int rn, int rm, const GVecGen3 *gvec_op)
@@ -XXX,XX +XXX,XX @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn)
         }
         break;
     case 0x8: /* CMGT, CMGE */
-        gen_gvec_op2(s, is_q, rd, rn, u ? &cge0_op[size] : &cgt0_op[size]);
+        if (u) {
+            gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_cge0, size);
+        } else {
+            gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_cgt0, size);
+        }
         return;
     case 0x9: /* CMEQ, CMLE */
-        gen_gvec_op2(s, is_q, rd, rn, u ? &cle0_op[size] : &ceq0_op[size]);
+        if (u) {
+            gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_cle0, size);
+        } else {
+            gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_ceq0, size);
+        }
         return;
     case 0xa: /* CMLT */
-        gen_gvec_op2(s, is_q, rd, rn, &clt0_op[size]);
+        gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_clt0, size);
         return;
     case 0xb:
         if (u) { /* ABS, NEG */
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int do_v81_helper(DisasContext *s, gen_helper_gvec_3_ptr *fn,
     return 1;
 }
 
-static void gen_ceq0_i32(TCGv_i32 d, TCGv_i32 a)
-{
-    tcg_gen_setcondi_i32(TCG_COND_EQ, d, a, 0);
-    tcg_gen_neg_i32(d, d);
-}
-
-static void gen_ceq0_i64(TCGv_i64 d, TCGv_i64 a)
-{
-    tcg_gen_setcondi_i64(TCG_COND_EQ, d, a, 0);
-    tcg_gen_neg_i64(d, d);
-}
-
-static void gen_ceq0_vec(unsigned vece, TCGv_vec d, TCGv_vec a)
-{
-    TCGv_vec zero = tcg_const_zeros_vec_matching(d);
-    tcg_gen_cmp_vec(TCG_COND_EQ, vece, d, a, zero);
-    tcg_temp_free_vec(zero);
-}
+#define GEN_CMP0(NAME, COND)                                            \
+    static void gen_##NAME##0_i32(TCGv_i32 d, TCGv_i32 a)               \
+    {                                                                   \
+        tcg_gen_setcondi_i32(COND, d, a, 0);                            \
+        tcg_gen_neg_i32(d, d);                                          \
+    }                                                                   \
+    static void gen_##NAME##0_i64(TCGv_i64 d, TCGv_i64 a)               \
+    {                                                                   \
+        tcg_gen_setcondi_i64(COND, d, a, 0);                            \
+        tcg_gen_neg_i64(d, d);                                          \
+    }                                                                   \
+    static void gen_##NAME##0_vec(unsigned vece, TCGv_vec d, TCGv_vec a) \
+    {                                                                   \
+        TCGv_vec zero = tcg_const_zeros_vec_matching(d);                \
+        tcg_gen_cmp_vec(COND, vece, d, a, zero);                        \
+        tcg_temp_free_vec(zero);                                        \
+    }                                                                   \
+    void gen_gvec_##NAME##0(unsigned vece, uint32_t d, uint32_t m,      \
+                            uint32_t opr_sz, uint32_t max_sz)           \
+    {                                                                   \
+        const GVecGen2 op[4] = {                                        \
+            { .fno = gen_helper_gvec_##NAME##0_b,                       \
+              .fniv = gen_##NAME##0_vec,                                \
+              .opt_opc = vecop_list_cmp,                                \
+              .vece = MO_8 },                                           \
+            { .fno = gen_helper_gvec_##NAME##0_h,                       \
+              .fniv = gen_##NAME##0_vec,                                \
+              .opt_opc = vecop_list_cmp,                                \
+              .vece = MO_16 },                                          \
+            { .fni4 = gen_##NAME##0_i32,                                \
+              .fniv = gen_##NAME##0_vec,                                \
+              .opt_opc = vecop_list_cmp,                                \
+              .vece = MO_32 },                                          \
+            { .fni8 = gen_##NAME##0_i64,                                \
+              .fniv = gen_##NAME##0_vec,                                \
+              .opt_opc = vecop_list_cmp,                                \
+              .prefer_i64 = TCG_TARGET_REG_BITS == 64,                  \
+              .vece = MO_64 },                                          \
+        };                                                              \
+        tcg_gen_gvec_2(d, m, opr_sz, max_sz, &op[vece]);                \
+    }
 
 static const TCGOpcode vecop_list_cmp[] = {
     INDEX_op_cmp_vec, 0
 };
 
-const GVecGen2 ceq0_op[4] = {
-    { .fno = gen_helper_gvec_ceq0_b,
-      .fniv = gen_ceq0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_8 },
-    { .fno = gen_helper_gvec_ceq0_h,
-      .fniv = gen_ceq0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_16 },
-    { .fni4 = gen_ceq0_i32,
-      .fniv = gen_ceq0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_32 },
-    { .fni8 = gen_ceq0_i64,
-      .fniv = gen_ceq0_vec,
-      .opt_opc = vecop_list_cmp,
-      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-      .vece = MO_64 },
-};
+GEN_CMP0(ceq, TCG_COND_EQ)
+GEN_CMP0(cle, TCG_COND_LE)
+GEN_CMP0(cge, TCG_COND_GE)
+GEN_CMP0(clt, TCG_COND_LT)
+GEN_CMP0(cgt, TCG_COND_GT)
 
-static void gen_cle0_i32(TCGv_i32 d, TCGv_i32 a)
-{
-    tcg_gen_setcondi_i32(TCG_COND_LE, d, a, 0);
-    tcg_gen_neg_i32(d, d);
-}
-
-static void gen_cle0_i64(TCGv_i64 d, TCGv_i64 a)
-{
-    tcg_gen_setcondi_i64(TCG_COND_LE, d, a, 0);
-    tcg_gen_neg_i64(d, d);
-}
-
-static void gen_cle0_vec(unsigned vece, TCGv_vec d, TCGv_vec a)
-{
-    TCGv_vec zero = tcg_const_zeros_vec_matching(d);
-    tcg_gen_cmp_vec(TCG_COND_LE, vece, d, a, zero);
-    tcg_temp_free_vec(zero);
-}
-
-const GVecGen2 cle0_op[4] = {
-    { .fno = gen_helper_gvec_cle0_b,
-      .fniv = gen_cle0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_8 },
-    { .fno = gen_helper_gvec_cle0_h,
-      .fniv = gen_cle0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_16 },
-    { .fni4 = gen_cle0_i32,
-      .fniv = gen_cle0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_32 },
-    { .fni8 = gen_cle0_i64,
-      .fniv = gen_cle0_vec,
-      .opt_opc = vecop_list_cmp,
-      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-      .vece = MO_64 },
-};
-
-static void gen_cge0_i32(TCGv_i32 d, TCGv_i32 a)
-{
-    tcg_gen_setcondi_i32(TCG_COND_GE, d, a, 0);
-    tcg_gen_neg_i32(d, d);
-}
-
-static void gen_cge0_i64(TCGv_i64 d, TCGv_i64 a)
-{
-    tcg_gen_setcondi_i64(TCG_COND_GE, d, a, 0);
-    tcg_gen_neg_i64(d, d);
-}
-
-static void gen_cge0_vec(unsigned vece, TCGv_vec d, TCGv_vec a)
-{
-    TCGv_vec zero = tcg_const_zeros_vec_matching(d);
-    tcg_gen_cmp_vec(TCG_COND_GE, vece, d, a, zero);
-    tcg_temp_free_vec(zero);
-}
-
-const GVecGen2 cge0_op[4] = {
-    { .fno = gen_helper_gvec_cge0_b,
-      .fniv = gen_cge0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_8 },
-    { .fno = gen_helper_gvec_cge0_h,
-      .fniv = gen_cge0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_16 },
-    { .fni4 = gen_cge0_i32,
-      .fniv = gen_cge0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_32 },
-    { .fni8 = gen_cge0_i64,
-      .fniv = gen_cge0_vec,
-      .opt_opc = vecop_list_cmp,
-      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-      .vece = MO_64 },
-};
-
-static void gen_clt0_i32(TCGv_i32 d, TCGv_i32 a)
-{
-    tcg_gen_setcondi_i32(TCG_COND_LT, d, a, 0);
-    tcg_gen_neg_i32(d, d);
-}
-
-static void gen_clt0_i64(TCGv_i64 d, TCGv_i64 a)
-{
-    tcg_gen_setcondi_i64(TCG_COND_LT, d, a, 0);
-    tcg_gen_neg_i64(d, d);
-}
-
-static void gen_clt0_vec(unsigned vece, TCGv_vec d, TCGv_vec a)
-{
-    TCGv_vec zero = tcg_const_zeros_vec_matching(d);
-    tcg_gen_cmp_vec(TCG_COND_LT, vece, d, a, zero);
-    tcg_temp_free_vec(zero);
-}
-
-const GVecGen2 clt0_op[4] = {
-    { .fno = gen_helper_gvec_clt0_b,
-      .fniv = gen_clt0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_8 },
-    { .fno = gen_helper_gvec_clt0_h,
-      .fniv = gen_clt0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_16 },
-    { .fni4 = gen_clt0_i32,
-      .fniv = gen_clt0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_32 },
-    { .fni8 = gen_clt0_i64,
-      .fniv = gen_clt0_vec,
-      .opt_opc = vecop_list_cmp,
-      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-      .vece = MO_64 },
-};
-
-static void gen_cgt0_i32(TCGv_i32 d, TCGv_i32 a)
-{
-    tcg_gen_setcondi_i32(TCG_COND_GT, d, a, 0);
-    tcg_gen_neg_i32(d, d);
-}
-
-static void gen_cgt0_i64(TCGv_i64 d, TCGv_i64 a)
-{
-    tcg_gen_setcondi_i64(TCG_COND_GT, d, a, 0);
-    tcg_gen_neg_i64(d, d);
-}
-
-static void gen_cgt0_vec(unsigned vece, TCGv_vec d, TCGv_vec a)
-{
-    TCGv_vec zero = tcg_const_zeros_vec_matching(d);
-    tcg_gen_cmp_vec(TCG_COND_GT, vece, d, a, zero);
-    tcg_temp_free_vec(zero);
-}
-
-const GVecGen2 cgt0_op[4] = {
-    { .fno = gen_helper_gvec_cgt0_b,
-      .fniv = gen_cgt0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_8 },
-    { .fno = gen_helper_gvec_cgt0_h,
-      .fniv = gen_cgt0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_16 },
-    { .fni4 = gen_cgt0_i32,
-      .fniv = gen_cgt0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_32 },
-    { .fni8 = gen_cgt0_i64,
-      .fniv = gen_cgt0_vec,
-      .opt_opc = vecop_list_cmp,
-      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-      .vece = MO_64 },
-};
+#undef GEN_CMP0
 
 static void gen_ssra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
 {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                     break;
 
                 case NEON_2RM_VCEQ0:
-                    tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size,
-                                   vec_size, &ceq0_op[size]);
+                    gen_gvec_ceq0(size, rd_ofs, rm_ofs, vec_size, vec_size);
                     break;
                 case NEON_2RM_VCGT0:
-                    tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size,
-                                   vec_size, &cgt0_op[size]);
+                    gen_gvec_cgt0(size, rd_ofs, rm_ofs, vec_size, vec_size);
                     break;
                 case NEON_2RM_VCLE0:
-                    tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size,
-                                   vec_size, &cle0_op[size]);
+                    gen_gvec_cle0(size, rd_ofs, rm_ofs, vec_size, vec_size);
                     break;
                 case NEON_2RM_VCGE0:
-                    tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size,
-                                   vec_size, &cge0_op[size]);
+                    gen_gvec_cge0(size, rd_ofs, rm_ofs, vec_size, vec_size);
                     break;
                 case NEON_2RM_VCLT0:
-                    tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size,
-                                   vec_size, &clt0_op[size]);
+                    gen_gvec_clt0(size, rd_ofs, rm_ofs, vec_size, vec_size);
                     break;
 
                 default:
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

Provide a functional interface for the vector expansion.
This fits better with the existing set of helpers that
we provide for other operations.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-8-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate.h          |   7 +-
 target/arm/translate-a64.c      |   4 +-
 target/arm/translate-neon.inc.c |  16 +----
 target/arm/translate.c          | 117 +++++++++++++++++---------------
 4 files changed, 71 insertions(+), 73 deletions(-)

From: Richard Henderson <richard.henderson@linaro.org>

Rather than perform the argument swap during code generation,
perform it during decode.  This means it doesn't have to be
special cased later, and we can share code with aarch64 code
generation.  Hopefully the decode comment addresses any confusion
that might arise in between.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-9-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/neon-dp.decode       | 17 +++++++++++++++--
 target/arm/translate-neon.inc.c |  3 +--
 2 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -XXX,XX +XXX,XX @@ VCGT_U_3s        1111 001 1 0 . .. .... .... 0011 . . . 0 .... @3same
 VCGE_S_3s        1111 001 0 0 . .. .... .... 0011 . . . 1 .... @3same
 VCGE_U_3s        1111 001 1 0 . .. .... .... 0011 . . . 1 .... @3same
 
-VSHL_S_3s        1111 001 0 0 . .. .... .... 0100 . . . 0 .... @3same
-VSHL_U_3s        1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same
+# The _rev suffix indicates that Vn and Vm are reversed. This is
+# the case for shifts. In the Arm ARM these insns are documented
+# with the Vm and Vn fields in their usual places, but in the
+# assembly the operands are listed "backwards", ie in the order
+# Dd, Dm, Dn where other insns use Dd, Dn, Dm. For QEMU we choose
+# to consider Vm and Vn as being in different fields in the insn,
+# which allows us to avoid special-casing shifts in the trans_
+# function code. We would otherwise need to manually swap the operands
+# over to call Neon helper functions that are shared with AArch64,
+# which does not have this odd reversed-operand situation.
+@3same_rev       .... ... . . . size:2 .... .... .... . q:1 . . .... \
+                 &3same vn=%vm_dp vm=%vn_dp vd=%vd_dp
+
+VSHL_S_3s        1111 001 0 0 . .. .... .... 0100 . . . 0 .... @3same_rev
+VSHL_U_3s        1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same_rev
 
 VMAX_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 0 .... @3same
 VMAX_U_3s        1111 001 1 0 . .. .... .... 0110 . . . 0 .... @3same
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
                                 uint32_t rn_ofs, uint32_t rm_ofs,       \
                                 uint32_t oprsz, uint32_t maxsz)         \
     {                                                                   \
-        /* Note the operation is vshl vd,vm,vn */                       \
-        tcg_gen_gvec_3(rd_ofs, rm_ofs, rn_ofs,                          \
+        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,                          \
                        oprsz, maxsz, &OPARRAY[vece]);                   \
     }                                                                   \
     DO_3SAME(INSN, gen_##INSN##_3s)
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

Provide a functional interface for the vector expansion.
This fits better with the existing set of helpers that
we provide for other operations.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-10-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate.h          |  10 ++-
 target/arm/translate-a64.c      |  18 ++--
 target/arm/translate-neon.inc.c |  23 +----
 target/arm/translate.c          | 146 +++++++++++++++++---------------
 4 files changed, 95 insertions(+), 102 deletions(-)

diff --git a/target/arm/translate.h b/target/arm/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ void gen_gvec_mla(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 void gen_gvec_mls(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 
-extern const GVecGen3 cmtst_op[4];
-extern const GVecGen3 sshl_op[4];
-extern const GVecGen3 ushl_op[4];
+void gen_gvec_cmtst(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                    uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_ushl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+
 extern const GVecGen4 uqadd_op[4];
 extern const GVecGen4 sqadd_op[4];
 extern const GVecGen4 uqsub_op[4];
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void gen_gvec_fn4(DisasContext *s, bool is_q, int rd, int rn, int rm,
             is_q ? 16 : 8, vec_full_reg_size(s));
 }
 
-/* Expand a 3-operand AdvSIMD vector operation using an op descriptor.  */
-static void gen_gvec_op3(DisasContext *s, bool is_q, int rd,
-                         int rn, int rm, const GVecGen3 *gvec_op)
-{
-    tcg_gen_gvec_3(vec_full_reg_offset(s, rd), vec_full_reg_offset(s, rn),
-                   vec_full_reg_offset(s, rm), is_q ? 16 : 8,
-                   vec_full_reg_size(s), gvec_op);
-}
-
 /* Expand a 3-operand operation using an out-of-line helper.  */
 static void gen_gvec_op3_ool(DisasContext *s, bool is_q, int rd,
                              int rn, int rm, int data, gen_helper_gvec_3 *fn)
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
                        (u ? uqsub_op : sqsub_op) + size);
         return;
     case 0x08: /* SSHL, USHL */
-        gen_gvec_op3(s, is_q, rd, rn, rm,
-                     u ? &ushl_op[size] : &sshl_op[size]);
+        if (u) {
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_ushl, size);
+        } else {
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sshl, size);
+        }
         return;
     case 0x0c: /* SMAX, UMAX */
         if (u) {
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
         return;
     case 0x11:
         if (!u) { /* CMTST */
-            gen_gvec_op3(s, is_q, rd, rn, rm, &cmtst_op[size]);
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_cmtst, size);
             return;
         }
         /* else CMEQ */
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_3SAME(VBIC, tcg_gen_gvec_andc)
 DO_3SAME(VORR, tcg_gen_gvec_or)
 DO_3SAME(VORN, tcg_gen_gvec_orc)
 DO_3SAME(VEOR, tcg_gen_gvec_xor)
+DO_3SAME(VSHL_S, gen_gvec_sshl)
+DO_3SAME(VSHL_U, gen_gvec_ushl)
 
 /* These insns are all gvec_bitsel but with the inputs in various orders. */
 #define DO_3SAME_BITSEL(INSN, O1, O2, O3)                               \
@@ -XXX,XX +XXX,XX @@ DO_3SAME_NO_SZ_3(VMIN_U, tcg_gen_gvec_umin)
 DO_3SAME_NO_SZ_3(VMUL, tcg_gen_gvec_mul)
 DO_3SAME_NO_SZ_3(VMLA, gen_gvec_mla)
 DO_3SAME_NO_SZ_3(VMLS, gen_gvec_mls)
+DO_3SAME_NO_SZ_3(VTST, gen_gvec_cmtst)
 
 #define DO_3SAME_CMP(INSN, COND)                                        \
     static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
@@ -XXX,XX +XXX,XX @@ DO_3SAME_CMP(VCGE_S, TCG_COND_GE)
 DO_3SAME_CMP(VCGE_U, TCG_COND_GEU)
 DO_3SAME_CMP(VCEQ, TCG_COND_EQ)
 
-static void gen_VTST_3s(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
-                         uint32_t rm_ofs, uint32_t oprsz, uint32_t maxsz)
-{
-    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &cmtst_op[vece]);
-}
-DO_3SAME_NO_SZ_3(VTST, gen_VTST_3s)
-
 #define DO_3SAME_GVEC4(INSN, OPARRAY)                                   \
     static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
                                 uint32_t rn_ofs, uint32_t rm_ofs,       \
@@ -XXX,XX +XXX,XX @@ static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
     }
     return do_3same(s, a, gen_VMUL_p_3s);
 }
-
-#define DO_3SAME_GVEC3_SHIFT(INSN, OPARRAY)                             \
-    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
-                                uint32_t rn_ofs, uint32_t rm_ofs,       \
-                                uint32_t oprsz, uint32_t maxsz)         \
-    {                                                                   \
-        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,                          \
-                       oprsz, maxsz, &OPARRAY[vece]);                   \
-    }                                                                   \
-    DO_3SAME(INSN, gen_##INSN##_3s)
-
-DO_3SAME_GVEC3_SHIFT(VSHL_S, sshl_op)
-DO_3SAME_GVEC3_SHIFT(VSHL_U, ushl_op)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_cmtst_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
     tcg_gen_cmp_vec(TCG_COND_NE, vece, d, d, a);
 }
 
-static const TCGOpcode vecop_list_cmtst[] = { INDEX_op_cmp_vec, 0 };
-
-const GVecGen3 cmtst_op[4] = {
-    { .fni4 = gen_helper_neon_tst_u8,
-      .fniv = gen_cmtst_vec,
-      .opt_opc = vecop_list_cmtst,
-      .vece = MO_8 },
-    { .fni4 = gen_helper_neon_tst_u16,
-      .fniv = gen_cmtst_vec,
-      .opt_opc = vecop_list_cmtst,
-      .vece = MO_16 },
-    { .fni4 = gen_cmtst_i32,
-      .fniv = gen_cmtst_vec,
-      .opt_opc = vecop_list_cmtst,
-      .vece = MO_32 },
-    { .fni8 = gen_cmtst_i64,
-      .fniv = gen_cmtst_vec,
-      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-      .opt_opc = vecop_list_cmtst,
-      .vece = MO_64 },
-};
+void gen_gvec_cmtst(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                    uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = { INDEX_op_cmp_vec, 0 };
+    static const GVecGen3 ops[4] = {
+        { .fni4 = gen_helper_neon_tst_u8,
+          .fniv = gen_cmtst_vec,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fni4 = gen_helper_neon_tst_u16,
+          .fniv = gen_cmtst_vec,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fni4 = gen_cmtst_i32,
+          .fniv = gen_cmtst_vec,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+        { .fni8 = gen_cmtst_i64,
+          .fniv = gen_cmtst_vec,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .opt_opc = vecop_list,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
 
 void gen_ushl_i32(TCGv_i32 dst, TCGv_i32 src, TCGv_i32 shift)
 {
@@ -XXX,XX +XXX,XX @@ static void gen_ushl_vec(unsigned vece, TCGv_vec dst,
     tcg_temp_free_vec(rsh);
 }
 
-static const TCGOpcode ushl_list[] = {
-    INDEX_op_neg_vec, INDEX_op_shlv_vec,
-    INDEX_op_shrv_vec, INDEX_op_cmp_vec, 0
-};
-
-const GVecGen3 ushl_op[4] = {
-    { .fniv = gen_ushl_vec,
-      .fno = gen_helper_gvec_ushl_b,
-      .opt_opc = ushl_list,
-      .vece = MO_8 },
-    { .fniv = gen_ushl_vec,
-      .fno = gen_helper_gvec_ushl_h,
-      .opt_opc = ushl_list,
-      .vece = MO_16 },
-    { .fni4 = gen_ushl_i32,
-      .fniv = gen_ushl_vec,
-      .opt_opc = ushl_list,
-      .vece = MO_32 },
-    { .fni8 = gen_ushl_i64,
-      .fniv = gen_ushl_vec,
-      .opt_opc = ushl_list,
-      .vece = MO_64 },
-};
+void gen_gvec_ushl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_neg_vec, INDEX_op_shlv_vec,
+        INDEX_op_shrv_vec, INDEX_op_cmp_vec, 0
+    };
+    static const GVecGen3 ops[4] = {
+        { .fniv = gen_ushl_vec,
+          .fno = gen_helper_gvec_ushl_b,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fniv = gen_ushl_vec,
+          .fno = gen_helper_gvec_ushl_h,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fni4 = gen_ushl_i32,
+          .fniv = gen_ushl_vec,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+        { .fni8 = gen_ushl_i64,
+          .fniv = gen_ushl_vec,
+          .opt_opc = vecop_list,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
 
 void gen_sshl_i32(TCGv_i32 dst, TCGv_i32 src, TCGv_i32 shift)
 {
@@ -XXX,XX +XXX,XX @@ static void gen_sshl_vec(unsigned vece, TCGv_vec dst,
     tcg_temp_free_vec(tmp);
 }
 
-static const TCGOpcode sshl_list[] = {
-    INDEX_op_neg_vec, INDEX_op_umin_vec, INDEX_op_shlv_vec,
-    INDEX_op_sarv_vec, INDEX_op_cmp_vec, INDEX_op_cmpsel_vec, 0
-};
-
-const GVecGen3 sshl_op[4] = {
-    { .fniv = gen_sshl_vec,
-      .fno = gen_helper_gvec_sshl_b,
-      .opt_opc = sshl_list,
-      .vece = MO_8 },
-    { .fniv = gen_sshl_vec,
-      .fno = gen_helper_gvec_sshl_h,
-      .opt_opc = sshl_list,
-      .vece = MO_16 },
-    { .fni4 = gen_sshl_i32,
-      .fniv = gen_sshl_vec,
-      .opt_opc = sshl_list,
-      .vece = MO_32 },
-    { .fni8 = gen_sshl_i64,
-      .fniv = gen_sshl_vec,
-      .opt_opc = sshl_list,
-      .vece = MO_64 },
-};
+void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_neg_vec, INDEX_op_umin_vec, INDEX_op_shlv_vec,
+        INDEX_op_sarv_vec, INDEX_op_cmp_vec, INDEX_op_cmpsel_vec, 0
+    };
+    static const GVecGen3 ops[4] = {
+        { .fniv = gen_sshl_vec,
+          .fno = gen_helper_gvec_sshl_b,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fniv = gen_sshl_vec,
+          .fno = gen_helper_gvec_sshl_h,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fni4 = gen_sshl_i32,
+          .fniv = gen_sshl_vec,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+        { .fni8 = gen_sshl_i64,
+          .fniv = gen_sshl_vec,
+          .opt_opc = vecop_list,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
 
 static void gen_uqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
                           TCGv_vec a, TCGv_vec b)
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

Provide a functional interface for the vector expansion.
This fits better with the existing set of helpers that
we provide for other operations.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-11-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate.h          |  13 +-
 target/arm/translate-a64.c      |  22 ++-
 target/arm/translate-neon.inc.c |  19 +--
 target/arm/translate.c          | 228 +++++++++++++++++---------------
 4 files changed, 147 insertions(+), 135 deletions(-)

diff --git a/target/arm/translate.h b/target/arm/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 void gen_gvec_ushl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
                    uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 
-extern const GVecGen4 uqadd_op[4];
-extern const GVecGen4 sqadd_op[4];
-extern const GVecGen4 uqsub_op[4];
-extern const GVecGen4 sqsub_op[4];
 void gen_cmtst_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
 void gen_ushl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b);
 void gen_sshl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b);
 void gen_ushl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
 void gen_sshl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
 
+void gen_gvec_uqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_sqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_uqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_sqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+
 void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
                    int64_t shift, uint32_t opr_sz, uint32_t max_sz);
 void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
 
     switch (opcode) {
     case 0x01: /* SQADD, UQADD */
-        tcg_gen_gvec_4(vec_full_reg_offset(s, rd),
-                       offsetof(CPUARMState, vfp.qc),
-                       vec_full_reg_offset(s, rn),
-                       vec_full_reg_offset(s, rm),
-                       is_q ? 16 : 8, vec_full_reg_size(s),
-                       (u ? uqadd_op : sqadd_op) + size);
+        if (u) {
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uqadd_qc, size);
+        } else {
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqadd_qc, size);
+        }
         return;
     case 0x05: /* SQSUB, UQSUB */
-        tcg_gen_gvec_4(vec_full_reg_offset(s, rd),
-                       offsetof(CPUARMState, vfp.qc),
-                       vec_full_reg_offset(s, rn),
-                       vec_full_reg_offset(s, rm),
-                       is_q ? 16 : 8, vec_full_reg_size(s),
-                       (u ? uqsub_op : sqsub_op) + size);
+        if (u) {
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uqsub_qc, size);
+        } else {
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqsub_qc, size);
+        }
         return;
     case 0x08: /* SSHL, USHL */
         if (u) {
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_3SAME(VORN, tcg_gen_gvec_orc)
 DO_3SAME(VEOR, tcg_gen_gvec_xor)
 DO_3SAME(VSHL_S, gen_gvec_sshl)
 DO_3SAME(VSHL_U, gen_gvec_ushl)
+DO_3SAME(VQADD_S, gen_gvec_sqadd_qc)
+DO_3SAME(VQADD_U, gen_gvec_uqadd_qc)
+DO_3SAME(VQSUB_S, gen_gvec_sqsub_qc)
+DO_3SAME(VQSUB_U, gen_gvec_uqsub_qc)
 
 /* These insns are all gvec_bitsel but with the inputs in various orders. */
 #define DO_3SAME_BITSEL(INSN, O1, O2, O3)                               \
@@ -XXX,XX +XXX,XX @@ DO_3SAME_CMP(VCGE_S, TCG_COND_GE)
 DO_3SAME_CMP(VCGE_U, TCG_COND_GEU)
 DO_3SAME_CMP(VCEQ, TCG_COND_EQ)
 
-#define DO_3SAME_GVEC4(INSN, OPARRAY)                                   \
-    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
-                                uint32_t rn_ofs, uint32_t rm_ofs,       \
-                                uint32_t oprsz, uint32_t maxsz)         \
-    {                                                                   \
-        tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),           \
-                       rn_ofs, rm_ofs, oprsz, maxsz, &OPARRAY[vece]);   \
-    }                                                                   \
-    DO_3SAME(INSN, gen_##INSN##_3s)
-
-DO_3SAME_GVEC4(VQADD_S, sqadd_op)
-DO_3SAME_GVEC4(VQADD_U, uqadd_op)
-DO_3SAME_GVEC4(VQSUB_S, sqsub_op)
-DO_3SAME_GVEC4(VQSUB_U, uqsub_op)
-
 static void gen_VMUL_p_3s(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
                            uint32_t rm_ofs, uint32_t oprsz, uint32_t maxsz)
 {
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_uqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
     tcg_temp_free_vec(x);
 }
 
-static const TCGOpcode vecop_list_uqadd[] = {
-    INDEX_op_usadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0
-};
-
-const GVecGen4 uqadd_op[4] = {
-    { .fniv = gen_uqadd_vec,
-      .fno = gen_helper_gvec_uqadd_b,
-      .write_aofs = true,
-      .opt_opc = vecop_list_uqadd,
-      .vece = MO_8 },
-    { .fniv = gen_uqadd_vec,
-      .fno = gen_helper_gvec_uqadd_h,
-      .write_aofs = true,
-      .opt_opc = vecop_list_uqadd,
-      .vece = MO_16 },
-    { .fniv = gen_uqadd_vec,
-      .fno = gen_helper_gvec_uqadd_s,
-      .write_aofs = true,
-      .opt_opc = vecop_list_uqadd,
-      .vece = MO_32 },
-    { .fniv = gen_uqadd_vec,
-      .fno = gen_helper_gvec_uqadd_d,
-      .write_aofs = true,
-      .opt_opc = vecop_list_uqadd,
-      .vece = MO_64 },
-};
+void gen_gvec_uqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_usadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0
+    };
+    static const GVecGen4 ops[4] = {
+        { .fniv = gen_uqadd_vec,
+          .fno = gen_helper_gvec_uqadd_b,
+          .write_aofs = true,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fniv = gen_uqadd_vec,
+          .fno = gen_helper_gvec_uqadd_h,
+          .write_aofs = true,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fniv = gen_uqadd_vec,
+          .fno = gen_helper_gvec_uqadd_s,
+          .write_aofs = true,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+        { .fniv = gen_uqadd_vec,
+          .fno = gen_helper_gvec_uqadd_d,
+          .write_aofs = true,
+          .opt_opc = vecop_list,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
+                   rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
 
 static void gen_sqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
                           TCGv_vec a, TCGv_vec b)
@@ -XXX,XX +XXX,XX @@ static void gen_sqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
     tcg_temp_free_vec(x);
 }
 
-static const TCGOpcode vecop_list_sqadd[] = {
-    INDEX_op_ssadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0
-};
-
-const GVecGen4 sqadd_op[4] = {
-    { .fniv = gen_sqadd_vec,
-      .fno = gen_helper_gvec_sqadd_b,
-      .opt_opc = vecop_list_sqadd,
-      .write_aofs = true,
-      .vece = MO_8 },
-    { .fniv = gen_sqadd_vec,
-      .fno = gen_helper_gvec_sqadd_h,
-      .opt_opc = vecop_list_sqadd,
-      .write_aofs = true,
-      .vece = MO_16 },
-    { .fniv = gen_sqadd_vec,
-      .fno = gen_helper_gvec_sqadd_s,
-      .opt_opc = vecop_list_sqadd,
-      .write_aofs = true,
-      .vece = MO_32 },
-    { .fniv = gen_sqadd_vec,
-      .fno = gen_helper_gvec_sqadd_d,
-      .opt_opc = vecop_list_sqadd,
-      .write_aofs = true,
-      .vece = MO_64 },
-};
+void gen_gvec_sqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_ssadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0
+    };
+    static const GVecGen4 ops[4] = {
+        { .fniv = gen_sqadd_vec,
+          .fno = gen_helper_gvec_sqadd_b,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_8 },
+        { .fniv = gen_sqadd_vec,
+          .fno = gen_helper_gvec_sqadd_h,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_16 },
+        { .fniv = gen_sqadd_vec,
+          .fno = gen_helper_gvec_sqadd_s,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_32 },
+        { .fniv = gen_sqadd_vec,
+          .fno = gen_helper_gvec_sqadd_d,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
+                   rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
 
 static void gen_uqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
                           TCGv_vec a, TCGv_vec b)
@@ -XXX,XX +XXX,XX @@ static void gen_uqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
     tcg_temp_free_vec(x);
 }
 
-static const TCGOpcode vecop_list_uqsub[] = {
-    INDEX_op_ussub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0
-};
-
-const GVecGen4 uqsub_op[4] = {
-    { .fniv = gen_uqsub_vec,
-      .fno = gen_helper_gvec_uqsub_b,
-      .opt_opc = vecop_list_uqsub,
-      .write_aofs = true,
-      .vece = MO_8 },
-    { .fniv = gen_uqsub_vec,
-      .fno = gen_helper_gvec_uqsub_h,
-      .opt_opc = vecop_list_uqsub,
-      .write_aofs = true,
-      .vece = MO_16 },
-    { .fniv = gen_uqsub_vec,
-      .fno = gen_helper_gvec_uqsub_s,
-      .opt_opc = vecop_list_uqsub,
-      .write_aofs = true,
-      .vece = MO_32 },
-    { .fniv = gen_uqsub_vec,
-      .fno = gen_helper_gvec_uqsub_d,
-      .opt_opc = vecop_list_uqsub,
-      .write_aofs = true,
-      .vece = MO_64 },
-};
+void gen_gvec_uqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_ussub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0
+    };
+    static const GVecGen4 ops[4] = {
+        { .fniv = gen_uqsub_vec,
+          .fno = gen_helper_gvec_uqsub_b,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_8 },
+        { .fniv = gen_uqsub_vec,
+          .fno = gen_helper_gvec_uqsub_h,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_16 },
+        { .fniv = gen_uqsub_vec,
+          .fno = gen_helper_gvec_uqsub_s,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_32 },
+        { .fniv = gen_uqsub_vec,
+          .fno = gen_helper_gvec_uqsub_d,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
+                   rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
 
 static void gen_sqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
                           TCGv_vec a, TCGv_vec b)
@@ -XXX,XX +XXX,XX @@ static void gen_sqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
     tcg_temp_free_vec(x);
 }
 
-static const TCGOpcode vecop_list_sqsub[] = {
-    INDEX_op_sssub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0
-};
-
-const GVecGen4 sqsub_op[4] = {
-    { .fniv = gen_sqsub_vec,
-      .fno = gen_helper_gvec_sqsub_b,
-      .opt_opc = vecop_list_sqsub,
-      .write_aofs = true,
-      .vece = MO_8 },
-    { .fniv = gen_sqsub_vec,
-      .fno = gen_helper_gvec_sqsub_h,
-      .opt_opc = vecop_list_sqsub,
-      .write_aofs = true,
-      .vece = MO_16 },
-    { .fniv = gen_sqsub_vec,
-      .fno = gen_helper_gvec_sqsub_s,
-      .opt_opc = vecop_list_sqsub,
-      .write_aofs = true,
-      .vece = MO_32 },
-    { .fniv = gen_sqsub_vec,
-      .fno = gen_helper_gvec_sqsub_d,
-      .opt_opc = vecop_list_sqsub,
-      .write_aofs = true,
-      .vece = MO_64 },
-};
+void gen_gvec_sqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_sssub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0
+    };
+    static const GVecGen4 ops[4] = {
+        { .fniv = gen_sqsub_vec,
+          .fno = gen_helper_gvec_sqsub_b,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_8 },
+        { .fniv = gen_sqsub_vec,
+          .fno = gen_helper_gvec_sqsub_h,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_16 },
+        { .fniv = gen_sqsub_vec,
+          .fno = gen_helper_gvec_sqsub_s,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_32 },
+        { .fniv = gen_sqsub_vec,
+          .fno = gen_helper_gvec_sqsub_d,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
+                   rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
 
 /* Translate a NEON data processing instruction.  Return nonzero if the
    instruction is invalid.
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

These operations do not touch fp_status.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-12-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.h        |  4 ++--
 target/arm/translate-a64.c |  5 ++---
 target/arm/translate.c     | 12 ++----------
 target/arm/vfp_helper.c    |  5 ++---
 4 files changed, 8 insertions(+), 18 deletions(-)

From: Richard Henderson <richard.henderson@linaro.org>

Provide a functional interface for the vector expansion.
This fits better with the existing set of helpers that
we provide for other operations.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-13-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate.h     |  5 ++++
 target/arm/translate-a64.c | 34 ++----------------------
 target/arm/translate.c     | 54 +++++++++++++++++++-------------------
 3 files changed, 34 insertions(+), 59 deletions(-)

From: Richard Henderson <richard.henderson@linaro.org>

Pass a pointer directly to env->vfp.qc[0], rather than env.
This will allow SVE2, which does not modify QC, to pass a
pointer to dummy storage.

Change the return type of inl_qrdml.h_s16 to match the
sense of the operation: signed.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-14-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate.c  | 18 ++++++++---
 target/arm/vec_helper.c | 70 +++++++++++++++++++++++------------------
 2 files changed, 54 insertions(+), 34 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static const uint8_t neon_2rm_sizes[] = {
     [NEON_2RM_VCVT_UF] = 0x4,
 };
 
+static void gen_gvec_fn3_qc(uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs,
+                            uint32_t opr_sz, uint32_t max_sz,
+                            gen_helper_gvec_3_ptr *fn)
+{
+    TCGv_ptr qc_ptr = tcg_temp_new_ptr();
+
+    tcg_gen_addi_ptr(qc_ptr, cpu_env, offsetof(CPUARMState, vfp.qc));
+    tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, qc_ptr,
+                       opr_sz, max_sz, 0, fn);
+    tcg_temp_free_ptr(qc_ptr);
+}
+
 void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
                           uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
 {
@@ -XXX,XX +XXX,XX @@ void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
         gen_helper_gvec_qrdmlah_s16, gen_helper_gvec_qrdmlah_s32
     };
     tcg_debug_assert(vece >= 1 && vece <= 2);
-    tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, cpu_env,
-                       opr_sz, max_sz, 0, fns[vece - 1]);
+    gen_gvec_fn3_qc(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, fns[vece - 1]);
 }
 
 void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
@@ -XXX,XX +XXX,XX @@ void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
         gen_helper_gvec_qrdmlsh_s16, gen_helper_gvec_qrdmlsh_s32
     };
     tcg_debug_assert(vece >= 1 && vece <= 2);
-    tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, cpu_env,
-                       opr_sz, max_sz, 0, fns[vece - 1]);
+    gen_gvec_fn3_qc(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, fns[vece - 1]);
 }
 
 #define GEN_CMP0(NAME, COND)                                            \
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -XXX,XX +XXX,XX @@
 #define H4(x)  (x)
 #endif
 
-#define SET_QC() env->vfp.qc[0] = 1
-
 static void clear_tail(void *vd, uintptr_t opr_sz, uintptr_t max_sz)
 {
     uint64_t *d = vd + opr_sz;
@@ -XXX,XX +XXX,XX @@ static void clear_tail(void *vd, uintptr_t opr_sz, uintptr_t max_sz)
 }
 
 /* Signed saturating rounding doubling multiply-accumulate high half, 16-bit */
-static uint16_t inl_qrdmlah_s16(CPUARMState *env, int16_t src1,
-                                int16_t src2, int16_t src3)
+static int16_t inl_qrdmlah_s16(int16_t src1, int16_t src2,
+                               int16_t src3, uint32_t *sat)
 {
     /* Simplify:
      * = ((a3 << 16) + ((e1 * e2) << 1) + (1 << 15)) >> 16
@@ -XXX,XX +XXX,XX @@ static uint16_t inl_qrdmlah_s16(CPUARMState *env, int16_t src1,
     ret = ((int32_t)src3 << 15) + ret + (1 << 14);
     ret >>= 15;
     if (ret != (int16_t)ret) {
-        SET_QC();
+        *sat = 1;
         ret = (ret < 0 ? -0x8000 : 0x7fff);
     }
     return ret;
@@ -XXX,XX +XXX,XX @@ static uint16_t inl_qrdmlah_s16(CPUARMState *env, int16_t src1,
 uint32_t HELPER(neon_qrdmlah_s16)(CPUARMState *env, uint32_t src1,
                                   uint32_t src2, uint32_t src3)
 {
-    uint16_t e1 = inl_qrdmlah_s16(env, src1, src2, src3);
-    uint16_t e2 = inl_qrdmlah_s16(env, src1 >> 16, src2 >> 16, src3 >> 16);
+    uint32_t *sat = &env->vfp.qc[0];
+    uint16_t e1 = inl_qrdmlah_s16(src1, src2, src3, sat);
+    uint16_t e2 = inl_qrdmlah_s16(src1 >> 16, src2 >> 16, src3 >> 16, sat);
     return deposit32(e1, 16, 16, e2);
 }
 
 void HELPER(gvec_qrdmlah_s16)(void *vd, void *vn, void *vm,
-                              void *ve, uint32_t desc)
+                              void *vq, uint32_t desc)
 {
     uintptr_t opr_sz = simd_oprsz(desc);
     int16_t *d = vd;
     int16_t *n = vn;
     int16_t *m = vm;
-    CPUARMState *env = ve;
     uintptr_t i;
 
     for (i = 0; i < opr_sz / 2; ++i) {
-        d[i] = inl_qrdmlah_s16(env, n[i], m[i], d[i]);
+        d[i] = inl_qrdmlah_s16(n[i], m[i], d[i], vq);
     }
     clear_tail(d, opr_sz, simd_maxsz(desc));
 }
 
 /* Signed saturating rounding doubling multiply-subtract high half, 16-bit */
-static uint16_t inl_qrdmlsh_s16(CPUARMState *env, int16_t src1,
-                                int16_t src2, int16_t src3)
+static int16_t inl_qrdmlsh_s16(int16_t src1, int16_t src2,
+                               int16_t src3, uint32_t *sat)
 {
     /* Similarly, using subtraction:
      * = ((a3 << 16) - ((e1 * e2) << 1) + (1 << 15)) >> 16
@@ -XXX,XX +XXX,XX @@ static uint16_t inl_qrdmlsh_s16(CPUARMState *env, int16_t src1,
     ret = ((int32_t)src3 << 15) - ret + (1 << 14);
     ret >>= 15;
     if (ret != (int16_t)ret) {
-        SET_QC();
+        *sat = 1;
         ret = (ret < 0 ? -0x8000 : 0x7fff);
     }
     return ret;
@@ -XXX,XX +XXX,XX @@ static uint16_t inl_qrdmlsh_s16(CPUARMState *env, int16_t src1,
 uint32_t HELPER(neon_qrdmlsh_s16)(CPUARMState *env, uint32_t src1,
                                   uint32_t src2, uint32_t src3)
 {
-    uint16_t e1 = inl_qrdmlsh_s16(env, src1, src2, src3);
-    uint16_t e2 = inl_qrdmlsh_s16(env, src1 >> 16, src2 >> 16, src3 >> 16);
+    uint32_t *sat = &env->vfp.qc[0];
+    uint16_t e1 = inl_qrdmlsh_s16(src1, src2, src3, sat);
+    uint16_t e2 = inl_qrdmlsh_s16(src1 >> 16, src2 >> 16, src3 >> 16, sat);
     return deposit32(e1, 16, 16, e2);
 }
 
 void HELPER(gvec_qrdmlsh_s16)(void *vd, void *vn, void *vm,
-                              void *ve, uint32_t desc)
+                              void *vq, uint32_t desc)
 {
     uintptr_t opr_sz = simd_oprsz(desc);
     int16_t *d = vd;
     int16_t *n = vn;
     int16_t *m = vm;
-    CPUARMState *env = ve;
     uintptr_t i;
 
     for (i = 0; i < opr_sz / 2; ++i) {
-        d[i] = inl_qrdmlsh_s16(env, n[i], m[i], d[i]);
+        d[i] = inl_qrdmlsh_s16(n[i], m[i], d[i], vq);
     }
     clear_tail(d, opr_sz, simd_maxsz(desc));
 }
 
 /* Signed saturating rounding doubling multiply-accumulate high half, 32-bit */
-uint32_t HELPER(neon_qrdmlah_s32)(CPUARMState *env, int32_t src1,
-                                  int32_t src2, int32_t src3)
+static int32_t inl_qrdmlah_s32(int32_t src1, int32_t src2,
+                               int32_t src3, uint32_t *sat)
 {
     /* Simplify similarly to int_qrdmlah_s16 above.  */
     int64_t ret = (int64_t)src1 * src2;
     ret = ((int64_t)src3 << 31) + ret + (1 << 30);
     ret >>= 31;
     if (ret != (int32_t)ret) {
-        SET_QC();
+        *sat = 1;
         ret = (ret < 0 ? INT32_MIN : INT32_MAX);
     }
     return ret;
 }
 
+uint32_t HELPER(neon_qrdmlah_s32)(CPUARMState *env, int32_t src1,
+                                  int32_t src2, int32_t src3)
+{
+    uint32_t *sat = &env->vfp.qc[0];
+    return inl_qrdmlah_s32(src1, src2, src3, sat);
+}
+
 void HELPER(gvec_qrdmlah_s32)(void *vd, void *vn, void *vm,
-                              void *ve, uint32_t desc)
+                              void *vq, uint32_t desc)
 {
     uintptr_t opr_sz = simd_oprsz(desc);
     int32_t *d = vd;
     int32_t *n = vn;
     int32_t *m = vm;
-    CPUARMState *env = ve;
     uintptr_t i;
 
     for (i = 0; i < opr_sz / 4; ++i) {
-        d[i] = helper_neon_qrdmlah_s32(env, n[i], m[i], d[i]);
+        d[i] = inl_qrdmlah_s32(n[i], m[i], d[i], vq);
     }
     clear_tail(d, opr_sz, simd_maxsz(desc));
 }
 
 /* Signed saturating rounding doubling multiply-subtract high half, 32-bit */
-uint32_t HELPER(neon_qrdmlsh_s32)(CPUARMState *env, int32_t src1,
-                                  int32_t src2, int32_t src3)
+static int32_t inl_qrdmlsh_s32(int32_t src1, int32_t src2,
+                               int32_t src3, uint32_t *sat)
 {
     /* Simplify similarly to int_qrdmlsh_s16 above.  */
     int64_t ret = (int64_t)src1 * src2;
     ret = ((int64_t)src3 << 31) - ret + (1 << 30);
     ret >>= 31;
     if (ret != (int32_t)ret) {
-        SET_QC();
+        *sat = 1;
         ret = (ret < 0 ? INT32_MIN : INT32_MAX);
     }
     return ret;
 }
 
+uint32_t HELPER(neon_qrdmlsh_s32)(CPUARMState *env, int32_t src1,
+                                  int32_t src2, int32_t src3)
+{
+    uint32_t *sat = &env->vfp.qc[0];
+    return inl_qrdmlsh_s32(src1, src2, src3, sat);
+}
+
 void HELPER(gvec_qrdmlsh_s32)(void *vd, void *vn, void *vm,
-                              void *ve, uint32_t desc)
+                              void *vq, uint32_t desc)
 {
     uintptr_t opr_sz = simd_oprsz(desc);
     int32_t *d = vd;
     int32_t *n = vn;
     int32_t *m = vm;
-    CPUARMState *env = ve;
     uintptr_t i;
 
     for (i = 0; i < opr_sz / 4; ++i) {
-        d[i] = helper_neon_qrdmlsh_s32(env, n[i], m[i], d[i]);
+        d[i] = inl_qrdmlsh_s32(n[i], m[i], d[i], vq);
     }
     clear_tail(d, opr_sz, simd_maxsz(desc));
 }
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

Must clear the tail for AdvSIMD when SVE is enabled.

Fixes: ca40a6e6e39
Cc: qemu-stable@nongnu.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-15-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/vec_helper.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
             d[i + j] = TYPE##_mul(n[i + j], mm, stat);                     \
         }                                                                  \
     }                                                                      \
+    clear_tail(d, oprsz, simd_maxsz(desc));                                \
 }
 
 DO_MUL_IDX(gvec_fmul_idx_h, float16, H2)
@@ -XXX,XX +XXX,XX @@ void HELPER(NAME)(void *vd, void *vn, void *vm, void *va,                  \
                                      mm, a[i + j], 0, stat);               \
         }                                                                  \
     }                                                                      \
+    clear_tail(d, oprsz, simd_maxsz(desc));                                \
 }
 
 DO_FMLA_IDX(gvec_fmla_idx_h, float16, H2)
-- 
2.20.1

From: Richard Henderson <richard.henderson@linaro.org>

Include 64-bit element size in preparation for SVE2.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-16-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.h        |  10 +++
 target/arm/translate.h     |   5 ++
 target/arm/translate-a64.c |   8 ++-
 target/arm/translate.c     | 133 ++++++++++++++++++++++++++++++++++++-
 target/arm/vec_helper.c    |  24 +++++++
 5 files changed, 176 insertions(+), 4 deletions(-)

From: Richard Henderson <richard.henderson@linaro.org>

Include 64-bit element size in preparation for SVE2.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-17-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.h        |  17 +++--
 target/arm/translate.h     |   5 ++
 target/arm/neon_helper.c   |  10 ---
 target/arm/translate-a64.c |  17 ++---
 target/arm/translate.c     | 134 +++++++++++++++++++++++++++++++++++--
 target/arm/vec_helper.c    |  24 +++++++
 6 files changed, 174 insertions(+), 33 deletions(-)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_2(neon_pmax_s8, i32, i32, i32)
 DEF_HELPER_2(neon_pmax_u16, i32, i32, i32)
 DEF_HELPER_2(neon_pmax_s16, i32, i32, i32)
 
-DEF_HELPER_2(neon_abd_u8, i32, i32, i32)
-DEF_HELPER_2(neon_abd_s8, i32, i32, i32)
-DEF_HELPER_2(neon_abd_u16, i32, i32, i32)
-DEF_HELPER_2(neon_abd_s16, i32, i32, i32)
-DEF_HELPER_2(neon_abd_u32, i32, i32, i32)
-DEF_HELPER_2(neon_abd_s32, i32, i32, i32)
-
 DEF_HELPER_2(neon_shl_u16, i32, i32, i32)
 DEF_HELPER_2(neon_shl_s16, i32, i32, i32)
 DEF_HELPER_2(neon_rshl_u8, i32, i32, i32)
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_4(gvec_uabd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(gvec_uabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(gvec_uabd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_4(gvec_saba_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_saba_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_saba_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_saba_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(gvec_uaba_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_uaba_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_uaba_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_uaba_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
 #ifdef TARGET_AARCH64
 #include "helper-a64.h"
 #include "helper-sve.h"
diff --git a/target/arm/translate.h b/target/arm/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ void gen_gvec_sabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
                    uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 
+void gen_gvec_saba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_uaba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+
 /*
  * Forward to the isar_feature_* tests given a DisasContext pointer.
  */
diff --git a/target/arm/neon_helper.c b/target/arm/neon_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/neon_helper.c
+++ b/target/arm/neon_helper.c
@@ -XXX,XX +XXX,XX @@ NEON_POP(pmax_s16, neon_s16, 2)
 NEON_POP(pmax_u16, neon_u16, 2)
 #undef NEON_FN
 
-#define NEON_FN(dest, src1, src2) \
-    dest = (src1 > src2) ? (src1 - src2) : (src2 - src1)
-NEON_VOP(abd_s8, neon_s8, 4)
-NEON_VOP(abd_u8, neon_u8, 4)
-NEON_VOP(abd_s16, neon_s16, 2)
-NEON_VOP(abd_u16, neon_u16, 2)
-NEON_VOP(abd_s32, neon_s32, 1)
-NEON_VOP(abd_u32, neon_u32, 1)
-#undef NEON_FN
-
 #define NEON_FN(dest, src1, src2) do { \
     int8_t tmp; \
     tmp = (int8_t)src2; \
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
             gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sabd, size);
         }
         return;
+    case 0xf: /* SABA, UABA */
+        if (u) {
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uaba, size);
+        } else {
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_saba, size);
+        }
+        return;
     case 0x10: /* ADD, SUB */
         if (u) {
             gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_sub, size);
@@ -XXX,XX +XXX,XX @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
                 genenvfn = fns[size][u];
                 break;
             }
-            case 0xf: /* SABA, UABA */
-            {
-                static NeonGenTwoOpFn * const fns[3][2] = {
-                    { gen_helper_neon_abd_s8, gen_helper_neon_abd_u8 },
-                    { gen_helper_neon_abd_s16, gen_helper_neon_abd_u16 },
-                    { gen_helper_neon_abd_s32, gen_helper_neon_abd_u32 },
-                };
-                genfn = fns[size][u];
-                break;
-            }
             case 0x16: /* SQDMULH, SQRDMULH */
             {
                 static NeonGenTwoOpEnvFn * const fns[2][2] = {
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
     tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 }
 
+static void gen_saba_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
+{
+    TCGv_i32 t = tcg_temp_new_i32();
+    gen_sabd_i32(t, a, b);
+    tcg_gen_add_i32(d, d, t);
+    tcg_temp_free_i32(t);
+}
+
+static void gen_saba_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+    gen_sabd_i64(t, a, b);
+    tcg_gen_add_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_saba_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
+{
+    TCGv_vec t = tcg_temp_new_vec_matching(d);
+    gen_sabd_vec(vece, t, a, b);
+    tcg_gen_add_vec(vece, d, d, t);
+    tcg_temp_free_vec(t);
+}
+
+void gen_gvec_saba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_sub_vec, INDEX_op_add_vec,
+        INDEX_op_smin_vec, INDEX_op_smax_vec, 0
+    };
+    static const GVecGen3 ops[4] = {
+        { .fniv = gen_saba_vec,
+          .fno = gen_helper_gvec_saba_b,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_8 },
+        { .fniv = gen_saba_vec,
+          .fno = gen_helper_gvec_saba_h,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_16 },
+        { .fni4 = gen_saba_i32,
+          .fniv = gen_saba_vec,
+          .fno = gen_helper_gvec_saba_s,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_32 },
+        { .fni8 = gen_saba_i64,
+          .fniv = gen_saba_vec,
+          .fno = gen_helper_gvec_saba_d,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
+
+static void gen_uaba_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
+{
+    TCGv_i32 t = tcg_temp_new_i32();
+    gen_uabd_i32(t, a, b);
+    tcg_gen_add_i32(d, d, t);
+    tcg_temp_free_i32(t);
+}
+
+static void gen_uaba_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+    gen_uabd_i64(t, a, b);
+    tcg_gen_add_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_uaba_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
+{
+    TCGv_vec t = tcg_temp_new_vec_matching(d);
+    gen_uabd_vec(vece, t, a, b);
+    tcg_gen_add_vec(vece, d, d, t);
+    tcg_temp_free_vec(t);
+}
+
+void gen_gvec_uaba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_sub_vec, INDEX_op_add_vec,
+        INDEX_op_umin_vec, INDEX_op_umax_vec, 0
+    };
+    static const GVecGen3 ops[4] = {
+        { .fniv = gen_uaba_vec,
+          .fno = gen_helper_gvec_uaba_b,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_8 },
+        { .fniv = gen_uaba_vec,
+          .fno = gen_helper_gvec_uaba_h,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_16 },
+        { .fni4 = gen_uaba_i32,
+          .fniv = gen_uaba_vec,
+          .fno = gen_helper_gvec_uaba_s,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_32 },
+        { .fni8 = gen_uaba_i64,
+          .fniv = gen_uaba_vec,
+          .fno = gen_helper_gvec_uaba_d,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
+
 /* Translate a NEON data processing instruction.  Return nonzero if the
    instruction is invalid.
    We process data in a mixture of 32-bit and 64-bit chunks.
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
             }
             return 0;
 
+        case NEON_3R_VABA:
+            if (u) {
+                gen_gvec_uaba(size, rd_ofs, rn_ofs, rm_ofs,
+                              vec_size, vec_size);
+            } else {
+                gen_gvec_saba(size, rd_ofs, rn_ofs, rm_ofs,
+                              vec_size, vec_size);
+            }
+            return 0;
+
         case NEON_3R_VADD_VSUB:
         case NEON_3R_LOGIC:
         case NEON_3R_VMAX:
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
         case NEON_3R_VQRSHL:
             GEN_NEON_INTEGER_OP_ENV(qrshl);
             break;
-        case NEON_3R_VABA:
-            GEN_NEON_INTEGER_OP(abd);
-            tcg_temp_free_i32(tmp2);
-            tmp2 = neon_load_reg(rd, pass);
-            gen_neon_add(size, tmp, tmp2);
-            break;
         case NEON_3R_VPMAX:
             GEN_NEON_INTEGER_OP(pmax);
             break;
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -XXX,XX +XXX,XX @@ DO_ABD(gvec_uabd_s, uint32_t)
 DO_ABD(gvec_uabd_d, uint64_t)
 
 #undef DO_ABD
+
+#define DO_ABA(NAME, TYPE)                                      \
+void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc)  \
+{                                                               \
+    intptr_t i, opr_sz = simd_oprsz(desc);                      \
+    TYPE *d = vd, *n = vn, *m = vm;                             \
+                                                                \
+    for (i = 0; i < opr_sz / sizeof(TYPE); ++i) {               \
+        d[i] += n[i] < m[i] ? m[i] - n[i] : n[i] - m[i];        \
+    }                                                           \
+    clear_tail(d, opr_sz, simd_maxsz(desc));                    \
+}
+
+DO_ABA(gvec_saba_b, int8_t)
+DO_ABA(gvec_saba_h, int16_t)
+DO_ABA(gvec_saba_s, int32_t)
+DO_ABA(gvec_saba_d, int64_t)
+
+DO_ABA(gvec_uaba_b, uint8_t)
+DO_ABA(gvec_uaba_h, uint16_t)
+DO_ABA(gvec_uaba_s, uint32_t)
+DO_ABA(gvec_uaba_d, uint64_t)
+
+#undef DO_ABA
-- 
2.20.1

From: Patrick Williams <patrick@stwcx.xyz>

Sonora Pass is a 2 socket x86 motherboard designed by Facebook
and supported by OpenBMC.  Strapping configuration was obtained
from hardware and i2c configuration is based on dts found at:

https://github.com/facebook/openbmc-linux/blob/1633c87b8ba7c162095787c988979b748ba65dc8/arch/arm/boot/dts/aspeed-bmc-facebook-sonorapass.dts

Booted a test image of http://github.com/facebook/openbmc to login
prompt.

Signed-off-by: Patrick Williams <patrick@stwcx.xyz>
Reviewed-by: Amithash Prasad <amithash@fb.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
[PMM: fixed block comment style nit]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/aspeed.c | 78 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 78 insertions(+)

diff --git a/hw/arm/aspeed.c b/hw/arm/aspeed.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/aspeed.c
+++ b/hw/arm/aspeed.c
@@ -XXX,XX +XXX,XX @@ struct AspeedBoardState {
         SCU_AST2500_HW_STRAP_ACPI_ENABLE |                              \
         SCU_HW_STRAP_SPI_MODE(SCU_HW_STRAP_SPI_MASTER))
 
+/* Sonorapass hardware value: 0xF100D216 */
+#define SONORAPASS_BMC_HW_STRAP1 (                                      \
+        SCU_AST2500_HW_STRAP_SPI_AUTOFETCH_ENABLE |                     \
+        SCU_AST2500_HW_STRAP_GPIO_STRAP_ENABLE |                        \
+        SCU_AST2500_HW_STRAP_UART_DEBUG |                               \
+        SCU_AST2500_HW_STRAP_RESERVED28 |                               \
+        SCU_AST2500_HW_STRAP_DDR4_ENABLE |                              \
+        SCU_HW_STRAP_VGA_CLASS_CODE |                                   \
+        SCU_HW_STRAP_LPC_RESET_PIN |                                    \
+        SCU_HW_STRAP_SPI_MODE(SCU_HW_STRAP_SPI_MASTER) |                \
+        SCU_AST2500_HW_STRAP_SET_AXI_AHB_RATIO(AXI_AHB_RATIO_2_1) |     \
+        SCU_HW_STRAP_VGA_BIOS_ROM |                                     \
+        SCU_HW_STRAP_VGA_SIZE_SET(VGA_16M_DRAM) |                       \
+        SCU_AST2500_HW_STRAP_RESERVED1)
+
 /* Swift hardware value: 0xF11AD206 */
 #define SWIFT_BMC_HW_STRAP1 (                                           \
         AST2500_HW_STRAP1_DEFAULTS |                                    \
@@ -XXX,XX +XXX,XX @@ static void swift_bmc_i2c_init(AspeedBoardState *bmc)
     i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 12), "tmp105", 0x4a);
 }
 
+static void sonorapass_bmc_i2c_init(AspeedBoardState *bmc)
+{
+    AspeedSoCState *soc = &bmc->soc;
+
+    /* bus 2 : */
+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 2), "tmp105", 0x48);
+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 2), "tmp105", 0x49);
+    /* bus 2 : pca9546 @ 0x73 */
+
+    /* bus 3 : pca9548 @ 0x70 */
+
+    /* bus 4 : */
+    uint8_t *eeprom4_54 = g_malloc0(8 * 1024);
+    smbus_eeprom_init_one(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 4), 0x54,
+                          eeprom4_54);
+    /* PCA9539 @ 0x76, but PCA9552 is compatible */
+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 4), "pca9552", 0x76);
+    /* PCA9539 @ 0x77, but PCA9552 is compatible */
+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 4), "pca9552", 0x77);
+
+    /* bus 6 : */
+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 6), "tmp105", 0x48);
+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 6), "tmp105", 0x49);
+    /* bus 6 : pca9546 @ 0x73 */
+
+    /* bus 8 : */
+    uint8_t *eeprom8_56 = g_malloc0(8 * 1024);
+    smbus_eeprom_init_one(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 8), 0x56,
+                          eeprom8_56);
+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 8), "pca9552", 0x60);
+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 8), "pca9552", 0x61);
+    /* bus 8 : adc128d818 @ 0x1d */
+    /* bus 8 : adc128d818 @ 0x1f */
+
+    /*
+     * bus 13 : pca9548 @ 0x71
+     *      - channel 3:
+     *          - tmm421 @ 0x4c
+     *          - tmp421 @ 0x4e
+     *          - tmp421 @ 0x4f
+     */
+
+}
+
 static void witherspoon_bmc_i2c_init(AspeedBoardState *bmc)
 {
     AspeedSoCState *soc = &bmc->soc;
@@ -XXX,XX +XXX,XX @@ static void aspeed_machine_romulus_class_init(ObjectClass *oc, void *data)
     mc->default_ram_size       = 512 * MiB;
 };
 
+static void aspeed_machine_sonorapass_class_init(ObjectClass *oc, void *data)
+{
+    MachineClass *mc = MACHINE_CLASS(oc);
+    AspeedMachineClass *amc = ASPEED_MACHINE_CLASS(oc);
+
+    mc->desc       = "OCP SonoraPass BMC (ARM1176)";
+    amc->soc_name  = "ast2500-a1";
+    amc->hw_strap1 = SONORAPASS_BMC_HW_STRAP1;
+    amc->fmc_model = "mx66l1g45g";
+    amc->spi_model = "mx66l1g45g";
+    amc->num_cs    = 2;
+    amc->i2c_init  = sonorapass_bmc_i2c_init;
+    mc->default_ram_size       = 512 * MiB;
+};
+
 static void aspeed_machine_swift_class_init(ObjectClass *oc, void *data)
 {
     MachineClass *mc = MACHINE_CLASS(oc);
@@ -XXX,XX +XXX,XX @@ static const TypeInfo aspeed_machine_types[] = {
         .name          = MACHINE_TYPE_NAME("swift-bmc"),
         .parent        = TYPE_ASPEED_MACHINE,
         .class_init    = aspeed_machine_swift_class_init,
+    }, {
+        .name          = MACHINE_TYPE_NAME("sonorapass-bmc"),
+        .parent        = TYPE_ASPEED_MACHINE,
+        .class_init    = aspeed_machine_sonorapass_class_init,
     }, {
         .name          = MACHINE_TYPE_NAME("witherspoon-bmc"),
         .parent        = TYPE_ASPEED_MACHINE,
-- 
2.20.1

From: Dongjiu Geng <gengdongjiu@huawei.com>

The little end UUID is used in many places, so make
NVDIMM_UUID_LE to a common macro to convert the UUID
to a little end array.

Reviewed-by: Xiang Zheng <zhengxiang9@huawei.com>
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
Message-id: 20200512030609.19593-2-gengdongjiu@huawei.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/qemu/uuid.h | 27 +++++++++++++++++++++++++++
 hw/acpi/nvdimm.c    | 10 +++-------
 2 files changed, 30 insertions(+), 7 deletions(-)

diff --git a/include/qemu/uuid.h b/include/qemu/uuid.h
index XXXXXXX..XXXXXXX 100644
--- a/include/qemu/uuid.h
+++ b/include/qemu/uuid.h
@@ -XXX,XX +XXX,XX @@ typedef struct {
     };
 } QemuUUID;
 
+/**
+ * UUID_LE - converts the fields of UUID to little-endian array,
+ * each of parameters is the filed of UUID.
+ *
+ * @time_low: The low field of the timestamp
+ * @time_mid: The middle field of the timestamp
+ * @time_hi_and_version: The high field of the timestamp
+ *                       multiplexed with the version number
+ * @clock_seq_hi_and_reserved: The high field of the clock
+ *                             sequence multiplexed with the variant
+ * @clock_seq_low: The low field of the clock sequence
+ * @node0: The spatially unique node0 identifier
+ * @node1: The spatially unique node1 identifier
+ * @node2: The spatially unique node2 identifier
+ * @node3: The spatially unique node3 identifier
+ * @node4: The spatially unique node4 identifier
+ * @node5: The spatially unique node5 identifier
+ */
+#define UUID_LE(time_low, time_mid, time_hi_and_version,                    \
+  clock_seq_hi_and_reserved, clock_seq_low, node0, node1, node2,            \
+  node3, node4, node5)                                                      \
+  { (time_low) & 0xff, ((time_low) >> 8) & 0xff, ((time_low) >> 16) & 0xff, \
+    ((time_low) >> 24) & 0xff, (time_mid) & 0xff, ((time_mid) >> 8) & 0xff, \
+    (time_hi_and_version) & 0xff, ((time_hi_and_version) >> 8) & 0xff,      \
+    (clock_seq_hi_and_reserved), (clock_seq_low), (node0), (node1), (node2),\
+    (node3), (node4), (node5) }
+
 #define UUID_FMT "%02hhx%02hhx%02hhx%02hhx-" \
                  "%02hhx%02hhx-%02hhx%02hhx-" \
                  "%02hhx%02hhx-" \
diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -XXX,XX +XXX,XX @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/uuid.h"
 #include "hw/acpi/acpi.h"
 #include "hw/acpi/aml-build.h"
 #include "hw/acpi/bios-linker-loader.h"
@@ -XXX,XX +XXX,XX @@
 #include "hw/mem/nvdimm.h"
 #include "qemu/nvdimm-utils.h"
 
-#define NVDIMM_UUID_LE(a, b, c, d0, d1, d2, d3, d4, d5, d6, d7)             \
-   { (a) & 0xff, ((a) >> 8) & 0xff, ((a) >> 16) & 0xff, ((a) >> 24) & 0xff, \
-     (b) & 0xff, ((b) >> 8) & 0xff, (c) & 0xff, ((c) >> 8) & 0xff,          \
-     (d0), (d1), (d2), (d3), (d4), (d5), (d6), (d7) }
-
 /*
  * define Byte Addressable Persistent Memory (PM) Region according to
  * ACPI 6.0: 5.2.25.1 System Physical Address Range Structure.
  */
 static const uint8_t nvdimm_nfit_spa_uuid[] =
-      NVDIMM_UUID_LE(0x66f0d379, 0xb4f3, 0x4074, 0xac, 0x43, 0x0d, 0x33,
-                     0x18, 0xb7, 0x8c, 0xdb);
+      UUID_LE(0x66f0d379, 0xb4f3, 0x4074, 0xac, 0x43, 0x0d, 0x33,
+              0x18, 0xb7, 0x8c, 0xdb);
 
 /*
  * NVDIMM Firmware Interface Table
-- 
2.20.1

From: Dongjiu Geng <gengdongjiu@huawei.com>

RAS Virtualization feature is not supported now, so
add a RAS machine option and disable it by default.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-id: 20200512030609.19593-3-gengdongjiu@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/arm/virt.h |  1 +
 hw/arm/virt.c         | 23 +++++++++++++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -XXX,XX +XXX,XX @@ typedef struct {
     bool highmem_ecam;
     bool its;
     bool virt;
+    bool ras;
     OnOffAuto acpi;
     VirtGICType gic_version;
     VirtIOMMUType iommu;
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -XXX,XX +XXX,XX @@ static void virt_set_acpi(Object *obj, Visitor *v, const char *name,
     visit_type_OnOffAuto(v, name, &vms->acpi, errp);
 }
 
+static bool virt_get_ras(Object *obj, Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+
+    return vms->ras;
+}
+
+static void virt_set_ras(Object *obj, bool value, Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+
+    vms->ras = value;
+}
+
 static char *virt_get_gic_version(Object *obj, Error **errp)
 {
     VirtMachineState *vms = VIRT_MACHINE(obj);
@@ -XXX,XX +XXX,XX @@ static void virt_instance_init(Object *obj)
                                     "Valid values are none and smmuv3",
                                     NULL);
 
+    /* Default disallows RAS instantiation */
+    vms->ras = false;
+    object_property_add_bool(obj, "ras", virt_get_ras,
+                             virt_set_ras, NULL);
+    object_property_set_description(obj, "ras",
+                                    "Set on/off to enable/disable reporting host memory errors "
+                                    "to a KVM guest using ACPI and guest external abort exceptions",
+                                    NULL);
+
     vms->irqmap = a15irqmap;
 
     virt_flash_create(vms);
-- 
2.20.1

From: Dongjiu Geng <gengdongjiu@huawei.com>

Add APEI/GHES detailed design document

Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-id: 20200512030609.19593-4-gengdongjiu@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 docs/specs/acpi_hest_ghes.rst | 110 ++++++++++++++++++++++++++++++++++
 docs/specs/index.rst          |   1 +
 2 files changed, 111 insertions(+)
 create mode 100644 docs/specs/acpi_hest_ghes.rst

diff --git a/docs/specs/acpi_hest_ghes.rst b/docs/specs/acpi_hest_ghes.rst
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/docs/specs/acpi_hest_ghes.rst
@@ -XXX,XX +XXX,XX @@
+APEI tables generating and CPER record
+======================================
+
+..
+   Copyright (c) 2020 HUAWEI TECHNOLOGIES CO., LTD.
+
+   This work is licensed under the terms of the GNU GPL, version 2 or later.
+   See the COPYING file in the top-level directory.
+
+Design Details
+--------------
+
+::
+
+         etc/acpi/tables                           etc/hardware_errors
+      ====================                   ===============================
+  + +--------------------------+            +----------------------------+
+  | | HEST                     | +--------->|    error_block_address1    |------+
+  | +--------------------------+ |          +----------------------------+      |
+  | | GHES1                    | | +------->|    error_block_address2    |------+-+
+  | +--------------------------+ | |        +----------------------------+      | |
+  | | .................        | | |        |      ..............        |      | |
+  | | error_status_address-----+-+ |        -----------------------------+      | |
+  | | .................        |   |   +--->|    error_block_addressN    |------+-+---+
+  | | read_ack_register--------+-+ |   |    +----------------------------+      | |   |
+  | | read_ack_preserve        | +-+---+--->|     read_ack_register1     |      | |   |
+  | | read_ack_write           |   |   |    +----------------------------+      | |   |
+  + +--------------------------+   | +-+--->|     read_ack_register2     |      | |   |
+  | | GHES2                    |   | | |    +----------------------------+      | |   |
+  + +--------------------------+   | | |    |       .............        |      | |   |
+  | | .................        |   | | |    +----------------------------+      | |   |
+  | | error_status_address-----+---+ | | +->|     read_ack_registerN     |      | |   |
+  | | .................        |     | | |  +----------------------------+      | |   |
+  | | read_ack_register--------+-----+ | |  |Generic Error Status Block 1|<-----+ |   |
+  | | read_ack_preserve        |       | |  |-+------------------------+-+        |   |
+  | | read_ack_write           |       | |  | |          CPER          | |        |   |
+  + +--------------------------|       | |  | |          CPER          | |        |   |
+  | | ...............          |       | |  | |          ....          | |        |   |
+  + +--------------------------+       | |  | |          CPER          | |        |   |
+  | | GHESN                    |       | |  |-+------------------------+-|        |   |
+  + +--------------------------+       | |  |Generic Error Status Block 2|<-------+   |
+  | | .................        |       | |  |-+------------------------+-+            |
+  | | error_status_address-----+-------+ |  | |           CPER         | |            |
+  | | .................        |         |  | |           CPER         | |            |
+  | | read_ack_register--------+---------+  | |           ....         | |            |
+  | | read_ack_preserve        |            | |           CPER         | |            |
+  | | read_ack_write           |            +-+------------------------+-+            |
+  + +--------------------------+            |         ..........         |            |
+                                            |----------------------------+            |
+                                            |Generic Error Status Block N |<----------+
+                                            |-+-------------------------+-+
+                                            | |          CPER           | |
+                                            | |          CPER           | |
+                                            | |          ....           | |
+                                            | |          CPER           | |
+                                            +-+-------------------------+-+
+
+
+(1) QEMU generates the ACPI HEST table. This table goes in the current
+    "etc/acpi/tables" fw_cfg blob. Each error source has different
+    notification types.
+
+(2) A new fw_cfg blob called "etc/hardware_errors" is introduced. QEMU
+    also needs to populate this blob. The "etc/hardware_errors" fw_cfg blob
+    contains an address registers table and an Error Status Data Block table.
+
+(3) The address registers table contains N Error Block Address entries
+    and N Read Ack Register entries. The size for each entry is 8-byte.
+    The Error Status Data Block table contains N Error Status Data Block
+    entries. The size for each entry is 4096(0x1000) bytes. The total size
+    for the "etc/hardware_errors" fw_cfg blob is (N * 8 * 2 + N * 4096) bytes.
+    N is the number of the kinds of hardware error sources.
+
+(4) QEMU generates the ACPI linker/loader script for the firmware. The
+    firmware pre-allocates memory for "etc/acpi/tables", "etc/hardware_errors"
+    and copies blob contents there.
+
+(5) QEMU generates N ADD_POINTER commands, which patch addresses in the
+    "error_status_address" fields of the HEST table with a pointer to the
+    corresponding "address registers" in the "etc/hardware_errors" blob.
+
+(6) QEMU generates N ADD_POINTER commands, which patch addresses in the
+    "read_ack_register" fields of the HEST table with a pointer to the
+    corresponding "read_ack_register" within the "etc/hardware_errors" blob.
+
+(7) QEMU generates N ADD_POINTER commands for the firmware, which patch
+    addresses in the "error_block_address" fields with a pointer to the
+    respective "Error Status Data Block" in the "etc/hardware_errors" blob.
+
+(8) QEMU defines a third and write-only fw_cfg blob which is called
+    "etc/hardware_errors_addr". Through that blob, the firmware can send back
+    the guest-side allocation addresses to QEMU. The "etc/hardware_errors_addr"
+    blob contains a 8-byte entry. QEMU generates a single WRITE_POINTER command
+    for the firmware. The firmware will write back the start address of
+    "etc/hardware_errors" blob to the fw_cfg file "etc/hardware_errors_addr".
+
+(9) When QEMU gets a SIGBUS from the kernel, QEMU writes CPER into corresponding
+    "Error Status Data Block", guest memory, and then injects platform specific
+    interrupt (in case of arm/virt machine it's Synchronous External Abort) as a
+    notification which is necessary for notifying the guest.
+
+(10) This notification (in virtual hardware) will be handled by the guest
+     kernel, on receiving notification, guest APEI driver could read the CPER error
+     and take appropriate action.
+
+(11) kvm_arch_on_sigbus_vcpu() uses source_id as index in "etc/hardware_errors" to
+     find out "Error Status Data Block" entry corresponding to error source. So supported
+     source_id values should be assigned here and not be changed afterwards to make sure
+     that guest will write error into expected "Error Status Data Block" even if guest was
+     migrated to a newer QEMU.
diff --git a/docs/specs/index.rst b/docs/specs/index.rst
index XXXXXXX..XXXXXXX 100644
--- a/docs/specs/index.rst
+++ b/docs/specs/index.rst
@@ -XXX,XX +XXX,XX @@ Contents:
    ppc-spapr-xive
    acpi_hw_reduced_hotplug
    tpm
+   acpi_hest_ghes
-- 
2.20.1

From: Dongjiu Geng <gengdongjiu@huawei.com>

This patch builds error_block_address and read_ack_register fields
in hardware errors table , the error_block_address points to Generic
Error Status Block(GESB) via bios_linker. The max size for one GESB
is 1kb, For more detailed information, please refer to
document: docs/specs/acpi_hest_ghes.rst

Now we only support one Error source, if necessary, we can extend to
support more.

Suggested-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 20200512030609.19593-5-gengdongjiu@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 default-configs/arm-softmmu.mak |  1 +
 include/hw/acpi/aml-build.h     |  1 +
 include/hw/acpi/ghes.h          | 28 +++++++++++
 hw/acpi/aml-build.c             |  2 +
 hw/acpi/ghes.c                  | 89 +++++++++++++++++++++++++++++++++
 hw/arm/virt-acpi-build.c        |  5 ++
 hw/acpi/Kconfig                 |  4 ++
 hw/acpi/Makefile.objs           |  1 +
 8 files changed, 131 insertions(+)
 create mode 100644 include/hw/acpi/ghes.h
 create mode 100644 hw/acpi/ghes.c

diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
index XXXXXXX..XXXXXXX 100644
--- a/default-configs/arm-softmmu.mak
+++ b/default-configs/arm-softmmu.mak
@@ -XXX,XX +XXX,XX @@ CONFIG_FSL_IMX7=y
 CONFIG_FSL_IMX6UL=y
 CONFIG_SEMIHOSTING=y
 CONFIG_ALLWINNER_H3=y
+CONFIG_ACPI_APEI=y
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -XXX,XX +XXX,XX @@ struct AcpiBuildTables {
     GArray *rsdp;
     GArray *tcpalog;
     GArray *vmgenid;
+    GArray *hardware_errors;
     BIOSLinker *linker;
 } AcpiBuildTables;
 
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/include/hw/acpi/ghes.h
@@ -XXX,XX +XXX,XX @@
+/*
+ * Support for generating APEI tables and recording CPER for Guests
+ *
+ * Copyright (c) 2020 HUAWEI TECHNOLOGIES CO., LTD.
+ *
+ * Author: Dongjiu Geng <gengdongjiu@huawei.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef ACPI_GHES_H
+#define ACPI_GHES_H
+
+#include "hw/acpi/bios-linker-loader.h"
+
+void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker);
+#endif
diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -XXX,XX +XXX,XX @@ void acpi_build_tables_init(AcpiBuildTables *tables)
     tables->table_data = g_array_new(false, true /* clear */, 1);
     tables->tcpalog = g_array_new(false, true /* clear */, 1);
     tables->vmgenid = g_array_new(false, true /* clear */, 1);
+    tables->hardware_errors = g_array_new(false, true /* clear */, 1);
     tables->linker = bios_linker_loader_init();
 }
 
@@ -XXX,XX +XXX,XX @@ void acpi_build_tables_cleanup(AcpiBuildTables *tables, bool mfre)
     g_array_free(tables->table_data, true);
     g_array_free(tables->tcpalog, mfre);
     g_array_free(tables->vmgenid, mfre);
+    g_array_free(tables->hardware_errors, mfre);
 }
 
 /*
diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/hw/acpi/ghes.c
@@ -XXX,XX +XXX,XX @@
+/*
+ * Support for generating APEI tables and recording CPER for Guests
+ *
+ * Copyright (c) 2020 HUAWEI TECHNOLOGIES CO., LTD.
+ *
+ * Author: Dongjiu Geng <gengdongjiu@huawei.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/units.h"
+#include "hw/acpi/ghes.h"
+#include "hw/acpi/aml-build.h"
+
+#define ACPI_GHES_ERRORS_FW_CFG_FILE        "etc/hardware_errors"
+#define ACPI_GHES_DATA_ADDR_FW_CFG_FILE     "etc/hardware_errors_addr"
+
+/* The max size in bytes for one error block */
+#define ACPI_GHES_MAX_RAW_DATA_LENGTH   (1 * KiB)
+
+/* Now only support ARMv8 SEA notification type error source */
+#define ACPI_GHES_ERROR_SOURCE_COUNT        1
+
+/*
+ * Build table for the hardware error fw_cfg blob.
+ * Initialize "etc/hardware_errors" and "etc/hardware_errors_addr" fw_cfg blobs.
+ * See docs/specs/acpi_hest_ghes.rst for blobs format.
+ */
+void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker)
+{
+    int i, error_status_block_offset;
+
+    /* Build error_block_address */
+    for (i = 0; i < ACPI_GHES_ERROR_SOURCE_COUNT; i++) {
+        build_append_int_noprefix(hardware_errors, 0, sizeof(uint64_t));
+    }
+
+    /* Build read_ack_register */
+    for (i = 0; i < ACPI_GHES_ERROR_SOURCE_COUNT; i++) {
+        /*
+         * Initialize the value of read_ack_register to 1, so GHES can be
+         * writeable after (re)boot.
+         * ACPI 6.2: 18.3.2.8 Generic Hardware Error Source version 2
+         * (GHESv2 - Type 10)
+         */
+        build_append_int_noprefix(hardware_errors, 1, sizeof(uint64_t));
+    }
+
+    /* Generic Error Status Block offset in the hardware error fw_cfg blob */
+    error_status_block_offset = hardware_errors->len;
+
+    /* Reserve space for Error Status Data Block */
+    acpi_data_push(hardware_errors,
+        ACPI_GHES_MAX_RAW_DATA_LENGTH * ACPI_GHES_ERROR_SOURCE_COUNT);
+
+    /* Tell guest firmware to place hardware_errors blob into RAM */
+    bios_linker_loader_alloc(linker, ACPI_GHES_ERRORS_FW_CFG_FILE,
+                             hardware_errors, sizeof(uint64_t), false);
+
+    for (i = 0; i < ACPI_GHES_ERROR_SOURCE_COUNT; i++) {
+        /*
+         * Tell firmware to patch error_block_address entries to point to
+         * corresponding "Generic Error Status Block"
+         */
+        bios_linker_loader_add_pointer(linker,
+            ACPI_GHES_ERRORS_FW_CFG_FILE, sizeof(uint64_t) * i,
+            sizeof(uint64_t), ACPI_GHES_ERRORS_FW_CFG_FILE,
+            error_status_block_offset + i * ACPI_GHES_MAX_RAW_DATA_LENGTH);
+    }
+
+    /*
+     * tell firmware to write hardware_errors GPA into
+     * hardware_errors_addr fw_cfg, once the former has been initialized.
+     */
+    bios_linker_loader_write_pointer(linker, ACPI_GHES_DATA_ADDR_FW_CFG_FILE,
+        0, sizeof(uint64_t), ACPI_GHES_ERRORS_FW_CFG_FILE, 0);
+}
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -XXX,XX +XXX,XX @@
 #include "sysemu/reset.h"
 #include "kvm_arm.h"
 #include "migration/vmstate.h"
+#include "hw/acpi/ghes.h"
 
 #define ARM_SPI_BASE 32
 
@@ -XXX,XX +XXX,XX @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
     acpi_add_table(table_offsets, tables_blob);
     build_spcr(tables_blob, tables->linker, vms);
 
+    if (vms->ras) {
+        build_ghes_error_table(tables->hardware_errors, tables->linker);
+    }
+
     if (ms->numa_state->num_nodes > 0) {
         acpi_add_table(table_offsets, tables_blob);
         build_srat(tables_blob, tables->linker, vms);
diff --git a/hw/acpi/Kconfig b/hw/acpi/Kconfig
index XXXXXXX..XXXXXXX 100644
--- a/hw/acpi/Kconfig
+++ b/hw/acpi/Kconfig
@@ -XXX,XX +XXX,XX @@ config ACPI_HMAT
     bool
     depends on ACPI
 
+config ACPI_APEI
+    bool
+    depends on ACPI
+
 config ACPI_PCI
     bool
     depends on ACPI && PCI
diff --git a/hw/acpi/Makefile.objs b/hw/acpi/Makefile.objs
index XXXXXXX..XXXXXXX 100644
--- a/hw/acpi/Makefile.objs
+++ b/hw/acpi/Makefile.objs
@@ -XXX,XX +XXX,XX @@ common-obj-$(CONFIG_ACPI_NVDIMM) += nvdimm.o
 common-obj-$(CONFIG_ACPI_VMGENID) += vmgenid.o
 common-obj-$(CONFIG_ACPI_HW_REDUCED) += generic_event_device.o
 common-obj-$(CONFIG_ACPI_HMAT) += hmat.o
+common-obj-$(CONFIG_ACPI_APEI) += ghes.o
 common-obj-$(call lnot,$(CONFIG_ACPI_X86)) += acpi-stub.o
 common-obj-$(call lnot,$(CONFIG_PC)) += acpi-x86-stub.o
 
-- 
2.20.1

From: Dongjiu Geng <gengdongjiu@huawei.com>

This patch builds Hardware Error Source Table(HEST) via fw_cfg blobs.
Now it only supports ARMv8 SEA, a type of Generic Hardware Error
Source version 2(GHESv2) error source. Afterwards, we can extend
the supported types if needed. For the CPER section, currently it
is memory section because kernel mainly wants userspace to handle
the memory errors.

This patch follows the spec ACPI 6.2 to build the Hardware Error
Source table. For more detailed information, please refer to
document: docs/specs/acpi_hest_ghes.rst

build_ghes_hw_error_notification() helper will help to add Hardware
Error Notification to ACPI tables without using packed C structures
and avoid endianness issues as API doesn't need explicit conversion.

Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 20200512030609.19593-6-gengdongjiu@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/acpi/ghes.h   |  39 ++++++++++++
 hw/acpi/ghes.c           | 126 +++++++++++++++++++++++++++++++++++++++
 hw/arm/virt-acpi-build.c |   2 +
 3 files changed, 167 insertions(+)

diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/acpi/ghes.h
+++ b/include/hw/acpi/ghes.h
@@ -XXX,XX +XXX,XX @@
 
 #include "hw/acpi/bios-linker-loader.h"
 
+/*
+ * Values for Hardware Error Notification Type field
+ */
+enum AcpiGhesNotifyType {
+    /* Polled */
+    ACPI_GHES_NOTIFY_POLLED = 0,
+    /* External Interrupt */
+    ACPI_GHES_NOTIFY_EXTERNAL = 1,
+    /* Local Interrupt */
+    ACPI_GHES_NOTIFY_LOCAL = 2,
+    /* SCI */
+    ACPI_GHES_NOTIFY_SCI = 3,
+    /* NMI */
+    ACPI_GHES_NOTIFY_NMI = 4,
+    /* CMCI, ACPI 5.0: 18.3.2.7, Table 18-290 */
+    ACPI_GHES_NOTIFY_CMCI = 5,
+    /* MCE, ACPI 5.0: 18.3.2.7, Table 18-290 */
+    ACPI_GHES_NOTIFY_MCE = 6,
+    /* GPIO-Signal, ACPI 6.0: 18.3.2.7, Table 18-332 */
+    ACPI_GHES_NOTIFY_GPIO = 7,
+    /* ARMv8 SEA, ACPI 6.1: 18.3.2.9, Table 18-345 */
+    ACPI_GHES_NOTIFY_SEA = 8,
+    /* ARMv8 SEI, ACPI 6.1: 18.3.2.9, Table 18-345 */
+    ACPI_GHES_NOTIFY_SEI = 9,
+    /* External Interrupt - GSIV, ACPI 6.1: 18.3.2.9, Table 18-345 */
+    ACPI_GHES_NOTIFY_GSIV = 10,
+    /* Software Delegated Exception, ACPI 6.2: 18.3.2.9, Table 18-383 */
+    ACPI_GHES_NOTIFY_SDEI = 11,
+    /* 12 and greater are reserved */
+    ACPI_GHES_NOTIFY_RESERVED = 12
+};
+
+enum {
+    ACPI_HEST_SRC_ID_SEA = 0,
+    /* future ids go here */
+    ACPI_HEST_SRC_ID_RESERVED,
+};
+
 void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker);
+void acpi_build_hest(GArray *table_data, BIOSLinker *linker);
 #endif
diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/acpi/ghes.c
+++ b/hw/acpi/ghes.c
@@ -XXX,XX +XXX,XX @@
 #include "qemu/units.h"
 #include "hw/acpi/ghes.h"
 #include "hw/acpi/aml-build.h"
+#include "qemu/error-report.h"
 
 #define ACPI_GHES_ERRORS_FW_CFG_FILE        "etc/hardware_errors"
 #define ACPI_GHES_DATA_ADDR_FW_CFG_FILE     "etc/hardware_errors_addr"
@@ -XXX,XX +XXX,XX @@
 /* Now only support ARMv8 SEA notification type error source */
 #define ACPI_GHES_ERROR_SOURCE_COUNT        1
 
+/* Generic Hardware Error Source version 2 */
+#define ACPI_GHES_SOURCE_GENERIC_ERROR_V2   10
+
+/* Address offset in Generic Address Structure(GAS) */
+#define GAS_ADDR_OFFSET 4
+
+/*
+ * Hardware Error Notification
+ * ACPI 4.0: 17.3.2.7 Hardware Error Notification
+ * Composes dummy Hardware Error Notification descriptor of specified type
+ */
+static void build_ghes_hw_error_notification(GArray *table, const uint8_t type)
+{
+    /* Type */
+    build_append_int_noprefix(table, type, 1);
+    /*
+     * Length:
+     * Total length of the structure in bytes
+     */
+    build_append_int_noprefix(table, 28, 1);
+    /* Configuration Write Enable */
+    build_append_int_noprefix(table, 0, 2);
+    /* Poll Interval */
+    build_append_int_noprefix(table, 0, 4);
+    /* Vector */
+    build_append_int_noprefix(table, 0, 4);
+    /* Switch To Polling Threshold Value */
+    build_append_int_noprefix(table, 0, 4);
+    /* Switch To Polling Threshold Window */
+    build_append_int_noprefix(table, 0, 4);
+    /* Error Threshold Value */
+    build_append_int_noprefix(table, 0, 4);
+    /* Error Threshold Window */
+    build_append_int_noprefix(table, 0, 4);
+}
+
 /*
  * Build table for the hardware error fw_cfg blob.
  * Initialize "etc/hardware_errors" and "etc/hardware_errors_addr" fw_cfg blobs.
@@ -XXX,XX +XXX,XX @@ void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker)
     bios_linker_loader_write_pointer(linker, ACPI_GHES_DATA_ADDR_FW_CFG_FILE,
         0, sizeof(uint64_t), ACPI_GHES_ERRORS_FW_CFG_FILE, 0);
 }
+
+/* Build Generic Hardware Error Source version 2 (GHESv2) */
+static void build_ghes_v2(GArray *table_data, int source_id, BIOSLinker *linker)
+{
+    uint64_t address_offset;
+    /*
+     * Type:
+     * Generic Hardware Error Source version 2(GHESv2 - Type 10)
+     */
+    build_append_int_noprefix(table_data, ACPI_GHES_SOURCE_GENERIC_ERROR_V2, 2);
+    /* Source Id */
+    build_append_int_noprefix(table_data, source_id, 2);
+    /* Related Source Id */
+    build_append_int_noprefix(table_data, 0xffff, 2);
+    /* Flags */
+    build_append_int_noprefix(table_data, 0, 1);
+    /* Enabled */
+    build_append_int_noprefix(table_data, 1, 1);
+
+    /* Number of Records To Pre-allocate */
+    build_append_int_noprefix(table_data, 1, 4);
+    /* Max Sections Per Record */
+    build_append_int_noprefix(table_data, 1, 4);
+    /* Max Raw Data Length */
+    build_append_int_noprefix(table_data, ACPI_GHES_MAX_RAW_DATA_LENGTH, 4);
+
+    address_offset = table_data->len;
+    /* Error Status Address */
+    build_append_gas(table_data, AML_AS_SYSTEM_MEMORY, 0x40, 0,
+                     4 /* QWord access */, 0);
+    bios_linker_loader_add_pointer(linker, ACPI_BUILD_TABLE_FILE,
+        address_offset + GAS_ADDR_OFFSET, sizeof(uint64_t),
+        ACPI_GHES_ERRORS_FW_CFG_FILE, source_id * sizeof(uint64_t));
+
+    switch (source_id) {
+    case ACPI_HEST_SRC_ID_SEA:
+        /*
+         * Notification Structure
+         * Now only enable ARMv8 SEA notification type
+         */
+        build_ghes_hw_error_notification(table_data, ACPI_GHES_NOTIFY_SEA);
+        break;
+    default:
+        error_report("Not support this error source");
+        abort();
+    }
+
+    /* Error Status Block Length */
+    build_append_int_noprefix(table_data, ACPI_GHES_MAX_RAW_DATA_LENGTH, 4);
+
+    /*
+     * Read Ack Register
+     * ACPI 6.1: 18.3.2.8 Generic Hardware Error Source
+     * version 2 (GHESv2 - Type 10)
+     */
+    address_offset = table_data->len;
+    build_append_gas(table_data, AML_AS_SYSTEM_MEMORY, 0x40, 0,
+                     4 /* QWord access */, 0);
+    bios_linker_loader_add_pointer(linker, ACPI_BUILD_TABLE_FILE,
+        address_offset + GAS_ADDR_OFFSET,
+        sizeof(uint64_t), ACPI_GHES_ERRORS_FW_CFG_FILE,
+        (ACPI_GHES_ERROR_SOURCE_COUNT + source_id) * sizeof(uint64_t));
+
+    /*
+     * Read Ack Preserve field
+     * We only provide the first bit in Read Ack Register to OSPM to write
+     * while the other bits are preserved.
+     */
+    build_append_int_noprefix(table_data, ~0x1ULL, 8);
+    /* Read Ack Write */
+    build_append_int_noprefix(table_data, 0x1, 8);
+}
+
+/* Build Hardware Error Source Table */
+void acpi_build_hest(GArray *table_data, BIOSLinker *linker)
+{
+    uint64_t hest_start = table_data->len;
+
+    /* Hardware Error Source Table header*/
+    acpi_data_push(table_data, sizeof(AcpiTableHeader));
+
+    /* Error Source Count */
+    build_append_int_noprefix(table_data, ACPI_GHES_ERROR_SOURCE_COUNT, 4);
+
+    build_ghes_v2(table_data, ACPI_HEST_SRC_ID_SEA, linker);
+
+    build_header(linker, table_data, (void *)(table_data->data + hest_start),
+        "HEST", table_data->len - hest_start, 1, NULL, NULL);
+}
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -XXX,XX +XXX,XX @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
 
     if (vms->ras) {
         build_ghes_error_table(tables->hardware_errors, tables->linker);
+        acpi_add_table(table_offsets, tables_blob);
+        acpi_build_hest(tables_blob, tables->linker);
     }
 
     if (ms->numa_state->num_nodes > 0) {
-- 
2.20.1

From: Dongjiu Geng <gengdongjiu@huawei.com>

Record the GHEB address via fw_cfg file, when recording
a error to CPER, it will use this address to find out
Generic Error Data Entries and write the error.

In order to avoid migration failure, make hardware
error table address to a part of GED device instead
of global variable, then this address will be migrated
to target QEMU.

Acked-by: Xiang Zheng <zhengxiang9@huawei.com>
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 20200512030609.19593-7-gengdongjiu@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/acpi/generic_event_device.h |  2 ++
 include/hw/acpi/ghes.h                 |  6 ++++++
 hw/acpi/generic_event_device.c         | 19 +++++++++++++++++++
 hw/acpi/ghes.c                         | 14 ++++++++++++++
 hw/arm/virt-acpi-build.c               |  8 ++++++++
 5 files changed, 49 insertions(+)

diff --git a/include/hw/acpi/generic_event_device.h b/include/hw/acpi/generic_event_device.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/acpi/generic_event_device.h
+++ b/include/hw/acpi/generic_event_device.h
@@ -XXX,XX +XXX,XX @@
 
 #include "hw/sysbus.h"
 #include "hw/acpi/memory_hotplug.h"
+#include "hw/acpi/ghes.h"
 
 #define ACPI_POWER_BUTTON_DEVICE "PWRB"
 
@@ -XXX,XX +XXX,XX @@ typedef struct AcpiGedState {
     GEDState ged_state;
     uint32_t ged_event_bitmap;
     qemu_irq irq;
+    AcpiGhesState ghes_state;
 } AcpiGedState;
 
 void build_ged_aml(Aml *table, const char* name, HotplugHandler *hotplug_dev,
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/acpi/ghes.h
+++ b/include/hw/acpi/ghes.h
@@ -XXX,XX +XXX,XX @@ enum {
     ACPI_HEST_SRC_ID_RESERVED,
 };
 
+typedef struct AcpiGhesState {
+    uint64_t ghes_addr_le;
+} AcpiGhesState;
+
 void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker);
 void acpi_build_hest(GArray *table_data, BIOSLinker *linker);
+void acpi_ghes_add_fw_cfg(AcpiGhesState *vms, FWCfgState *s,
+                          GArray *hardware_errors);
 #endif
diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/acpi/generic_event_device.c
+++ b/hw/acpi/generic_event_device.c
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_ged_state = {
     }
 };
 
+static bool ghes_needed(void *opaque)
+{
+    AcpiGedState *s = opaque;
+    return s->ghes_state.ghes_addr_le;
+}
+
+static const VMStateDescription vmstate_ghes_state = {
+    .name = "acpi-ged/ghes",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .needed = ghes_needed,
+    .fields      = (VMStateField[]) {
+        VMSTATE_STRUCT(ghes_state, AcpiGedState, 1,
+                       vmstate_ghes_state, AcpiGhesState),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
 static const VMStateDescription vmstate_acpi_ged = {
     .name = "acpi-ged",
     .version_id = 1,
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_acpi_ged = {
     },
     .subsections = (const VMStateDescription * []) {
         &vmstate_memhp_state,
+        &vmstate_ghes_state,
         NULL
     }
 };
diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/acpi/ghes.c
+++ b/hw/acpi/ghes.c
@@ -XXX,XX +XXX,XX @@
 #include "hw/acpi/ghes.h"
 #include "hw/acpi/aml-build.h"
 #include "qemu/error-report.h"
+#include "hw/acpi/generic_event_device.h"
+#include "hw/nvram/fw_cfg.h"
 
 #define ACPI_GHES_ERRORS_FW_CFG_FILE        "etc/hardware_errors"
 #define ACPI_GHES_DATA_ADDR_FW_CFG_FILE     "etc/hardware_errors_addr"
@@ -XXX,XX +XXX,XX @@ void acpi_build_hest(GArray *table_data, BIOSLinker *linker)
     build_header(linker, table_data, (void *)(table_data->data + hest_start),
         "HEST", table_data->len - hest_start, 1, NULL, NULL);
 }
+
+void acpi_ghes_add_fw_cfg(AcpiGhesState *ags, FWCfgState *s,
+                          GArray *hardware_error)
+{
+    /* Create a read-only fw_cfg file for GHES */
+    fw_cfg_add_file(s, ACPI_GHES_ERRORS_FW_CFG_FILE, hardware_error->data,
+                    hardware_error->len);
+
+    /* Create a read-write fw_cfg file for Address */
+    fw_cfg_add_file_callback(s, ACPI_GHES_DATA_ADDR_FW_CFG_FILE, NULL, NULL,
+        NULL, &(ags->ghes_addr_le), sizeof(ags->ghes_addr_le), false);
+}
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -XXX,XX +XXX,XX @@ void virt_acpi_setup(VirtMachineState *vms)
 {
     AcpiBuildTables tables;
     AcpiBuildState *build_state;
+    AcpiGedState *acpi_ged_state;
 
     if (!vms->fw_cfg) {
         trace_virt_acpi_setup();
@@ -XXX,XX +XXX,XX @@ void virt_acpi_setup(VirtMachineState *vms)
     fw_cfg_add_file(vms->fw_cfg, ACPI_BUILD_TPMLOG_FILE, tables.tcpalog->data,
                     acpi_data_len(tables.tcpalog));
 
+    if (vms->ras) {
+        assert(vms->acpi_dev);
+        acpi_ged_state = ACPI_GED(vms->acpi_dev);
+        acpi_ghes_add_fw_cfg(&acpi_ged_state->ghes_state,
+                             vms->fw_cfg, tables.hardware_errors);
+    }
+
     build_state->rsdp_mr = acpi_add_rom_blob(virt_acpi_build_update,
                                              build_state, tables.rsdp,
                                              ACPI_BUILD_RSDP_FILE, 0);
-- 
2.20.1

From: Dongjiu Geng <gengdongjiu@huawei.com>

kvm_hwpoison_page_add() and kvm_unpoison_all() will both
be used by X86 and ARM platforms, so moving them into
"accel/kvm/kvm-all.c" to avoid duplicate code.

For architectures that don't use the poison-list functionality
the reset handler will harmlessly do nothing, so let's register
the kvm_unpoison_all() function in the generic kvm_init() function.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
Acked-by: Xiang Zheng <zhengxiang9@huawei.com>
Message-id: 20200512030609.19593-8-gengdongjiu@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/sysemu/kvm_int.h | 12 ++++++++++++
 accel/kvm/kvm-all.c      | 36 ++++++++++++++++++++++++++++++++++++
 target/i386/kvm.c        | 36 ------------------------------------
 3 files changed, 48 insertions(+), 36 deletions(-)

diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h
index XXXXXXX..XXXXXXX 100644
--- a/include/sysemu/kvm_int.h
+++ b/include/sysemu/kvm_int.h
@@ -XXX,XX +XXX,XX @@ void kvm_memory_listener_register(KVMState *s, KVMMemoryListener *kml,
                                   AddressSpace *as, int as_id);
 
 void kvm_set_max_memslot_size(hwaddr max_slot_size);
+
+/**
+ * kvm_hwpoison_page_add:
+ *
+ * Parameters:
+ *  @ram_addr: the address in the RAM for the poisoned page
+ *
+ * Add a poisoned page to the list
+ *
+ * Return: None.
+ */
+void kvm_hwpoison_page_add(ram_addr_t ram_addr);
 #endif
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index XXXXXXX..XXXXXXX 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -XXX,XX +XXX,XX @@
 #include "qapi/visitor.h"
 #include "qapi/qapi-types-common.h"
 #include "qapi/qapi-visit-common.h"
+#include "sysemu/reset.h"
 
 #include "hw/boards.h"
 
@@ -XXX,XX +XXX,XX @@ int kvm_vm_check_extension(KVMState *s, unsigned int extension)
     return ret;
 }
 
+typedef struct HWPoisonPage {
+    ram_addr_t ram_addr;
+    QLIST_ENTRY(HWPoisonPage) list;
+} HWPoisonPage;
+
+static QLIST_HEAD(, HWPoisonPage) hwpoison_page_list =
+    QLIST_HEAD_INITIALIZER(hwpoison_page_list);
+
+static void kvm_unpoison_all(void *param)
+{
+    HWPoisonPage *page, *next_page;
+
+    QLIST_FOREACH_SAFE(page, &hwpoison_page_list, list, next_page) {
+        QLIST_REMOVE(page, list);
+        qemu_ram_remap(page->ram_addr, TARGET_PAGE_SIZE);
+        g_free(page);
+    }
+}
+
+void kvm_hwpoison_page_add(ram_addr_t ram_addr)
+{
+    HWPoisonPage *page;
+
+    QLIST_FOREACH(page, &hwpoison_page_list, list) {
+        if (page->ram_addr == ram_addr) {
+            return;
+        }
+    }
+    page = g_new(HWPoisonPage, 1);
+    page->ram_addr = ram_addr;
+    QLIST_INSERT_HEAD(&hwpoison_page_list, page, list);
+}
+
 static uint32_t adjust_ioeventfd_endianness(uint32_t val, uint32_t size)
 {
 #if defined(HOST_WORDS_BIGENDIAN) != defined(TARGET_WORDS_BIGENDIAN)
@@ -XXX,XX +XXX,XX @@ static int kvm_init(MachineState *ms)
         s->kernel_irqchip_split = mc->default_kernel_irqchip_split ? ON_OFF_AUTO_ON : ON_OFF_AUTO_OFF;
     }
 
+    qemu_register_reset(kvm_unpoison_all, NULL);
+
     if (s->kernel_irqchip_allowed) {
         kvm_irqchip_create(s);
     }
diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index XXXXXXX..XXXXXXX 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -XXX,XX +XXX,XX @@
 #include "sysemu/sysemu.h"
 #include "sysemu/hw_accel.h"
 #include "sysemu/kvm_int.h"
-#include "sysemu/reset.h"
 #include "sysemu/runstate.h"
 #include "kvm_i386.h"
 #include "hyperv.h"
@@ -XXX,XX +XXX,XX @@ uint64_t kvm_arch_get_supported_msr_feature(KVMState *s, uint32_t index)
     }
 }
 
-
-typedef struct HWPoisonPage {
-    ram_addr_t ram_addr;
-    QLIST_ENTRY(HWPoisonPage) list;
-} HWPoisonPage;
-
-static QLIST_HEAD(, HWPoisonPage) hwpoison_page_list =
-    QLIST_HEAD_INITIALIZER(hwpoison_page_list);
-
-static void kvm_unpoison_all(void *param)
-{
-    HWPoisonPage *page, *next_page;
-
-    QLIST_FOREACH_SAFE(page, &hwpoison_page_list, list, next_page) {
-        QLIST_REMOVE(page, list);
-        qemu_ram_remap(page->ram_addr, TARGET_PAGE_SIZE);
-        g_free(page);
-    }
-}
-
-static void kvm_hwpoison_page_add(ram_addr_t ram_addr)
-{
-    HWPoisonPage *page;
-
-    QLIST_FOREACH(page, &hwpoison_page_list, list) {
-        if (page->ram_addr == ram_addr) {
-            return;
-        }
-    }
-    page = g_new(HWPoisonPage, 1);
-    page->ram_addr = ram_addr;
-    QLIST_INSERT_HEAD(&hwpoison_page_list, page, list);
-}
-
 static int kvm_get_mce_cap_supported(KVMState *s, uint64_t *mce_cap,
                                      int *max_banks)
 {
@@ -XXX,XX +XXX,XX @@ int kvm_arch_init(MachineState *ms, KVMState *s)
         fprintf(stderr, "e820_add_entry() table is full\n");
         return ret;
     }
-    qemu_register_reset(kvm_unpoison_all, NULL);
 
     shadow_mem = object_property_get_int(OBJECT(s), "kvm-shadow-mem", &error_abort);
     if (shadow_mem != -1) {
-- 
2.20.1

From: Dongjiu Geng <gengdongjiu@huawei.com>

kvm_arch_on_sigbus_vcpu() error injection uses source_id as
index in etc/hardware_errors to find out Error Status Data
Block entry corresponding to error source. So supported source_id
values should be assigned here and not be changed afterwards to
make sure that guest will write error into expected Error Status
Data Block.

Before QEMU writes a new error to ACPI table, it will check whether
previous error has been acknowledged. If not acknowledged, the new
errors will be ignored and not be recorded. For the errors section
type, QEMU simulate it to memory section error.

Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 20200512030609.19593-9-gengdongjiu@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/acpi/ghes.h |   1 +
 hw/acpi/ghes.c         | 219 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 220 insertions(+)

diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/acpi/ghes.h
+++ b/include/hw/acpi/ghes.h
@@ -XXX,XX +XXX,XX @@ void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker);
 void acpi_build_hest(GArray *table_data, BIOSLinker *linker);
 void acpi_ghes_add_fw_cfg(AcpiGhesState *vms, FWCfgState *s,
                           GArray *hardware_errors);
+int acpi_ghes_record_errors(uint8_t notify, uint64_t error_physical_addr);
 #endif
diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/acpi/ghes.c
+++ b/hw/acpi/ghes.c
@@ -XXX,XX +XXX,XX @@
 #include "qemu/error-report.h"
 #include "hw/acpi/generic_event_device.h"
 #include "hw/nvram/fw_cfg.h"
+#include "qemu/uuid.h"
 
 #define ACPI_GHES_ERRORS_FW_CFG_FILE        "etc/hardware_errors"
 #define ACPI_GHES_DATA_ADDR_FW_CFG_FILE     "etc/hardware_errors_addr"
@@ -XXX,XX +XXX,XX @@
 /* Address offset in Generic Address Structure(GAS) */
 #define GAS_ADDR_OFFSET 4
 
+/*
+ * The total size of Generic Error Data Entry
+ * ACPI 6.1/6.2: 18.3.2.7.1 Generic Error Data,
+ * Table 18-343 Generic Error Data Entry
+ */
+#define ACPI_GHES_DATA_LENGTH               72
+
+/* The memory section CPER size, UEFI 2.6: N.2.5 Memory Error Section */
+#define ACPI_GHES_MEM_CPER_LENGTH           80
+
+/* Masks for block_status flags */
+#define ACPI_GEBS_UNCORRECTABLE         1
+
+/*
+ * Total size for Generic Error Status Block except Generic Error Data Entries
+ * ACPI 6.2: 18.3.2.7.1 Generic Error Data,
+ * Table 18-380 Generic Error Status Block
+ */
+#define ACPI_GHES_GESB_SIZE                 20
+
+/*
+ * Values for error_severity field
+ */
+enum AcpiGenericErrorSeverity {
+    ACPI_CPER_SEV_RECOVERABLE = 0,
+    ACPI_CPER_SEV_FATAL = 1,
+    ACPI_CPER_SEV_CORRECTED = 2,
+    ACPI_CPER_SEV_NONE = 3,
+};
+
 /*
  * Hardware Error Notification
  * ACPI 4.0: 17.3.2.7 Hardware Error Notification
@@ -XXX,XX +XXX,XX @@ static void build_ghes_hw_error_notification(GArray *table, const uint8_t type)
     build_append_int_noprefix(table, 0, 4);
 }
 
+/*
+ * Generic Error Data Entry
+ * ACPI 6.1: 18.3.2.7.1 Generic Error Data
+ */
+static void acpi_ghes_generic_error_data(GArray *table,
+                const uint8_t *section_type, uint32_t error_severity,
+                uint8_t validation_bits, uint8_t flags,
+                uint32_t error_data_length, QemuUUID fru_id,
+                uint64_t time_stamp)
+{
+    const uint8_t fru_text[20] = {0};
+
+    /* Section Type */
+    g_array_append_vals(table, section_type, 16);
+
+    /* Error Severity */
+    build_append_int_noprefix(table, error_severity, 4);
+    /* Revision */
+    build_append_int_noprefix(table, 0x300, 2);
+    /* Validation Bits */
+    build_append_int_noprefix(table, validation_bits, 1);
+    /* Flags */
+    build_append_int_noprefix(table, flags, 1);
+    /* Error Data Length */
+    build_append_int_noprefix(table, error_data_length, 4);
+
+    /* FRU Id */
+    g_array_append_vals(table, fru_id.data, ARRAY_SIZE(fru_id.data));
+
+    /* FRU Text */
+    g_array_append_vals(table, fru_text, sizeof(fru_text));
+
+    /* Timestamp */
+    build_append_int_noprefix(table, time_stamp, 8);
+}
+
+/*
+ * Generic Error Status Block
+ * ACPI 6.1: 18.3.2.7.1 Generic Error Data
+ */
+static void acpi_ghes_generic_error_status(GArray *table, uint32_t block_status,
+                uint32_t raw_data_offset, uint32_t raw_data_length,
+                uint32_t data_length, uint32_t error_severity)
+{
+    /* Block Status */
+    build_append_int_noprefix(table, block_status, 4);
+    /* Raw Data Offset */
+    build_append_int_noprefix(table, raw_data_offset, 4);
+    /* Raw Data Length */
+    build_append_int_noprefix(table, raw_data_length, 4);
+    /* Data Length */
+    build_append_int_noprefix(table, data_length, 4);
+    /* Error Severity */
+    build_append_int_noprefix(table, error_severity, 4);
+}
+
+/* UEFI 2.6: N.2.5 Memory Error Section */
+static void acpi_ghes_build_append_mem_cper(GArray *table,
+                                            uint64_t error_physical_addr)
+{
+    /*
+     * Memory Error Record
+     */
+
+    /* Validation Bits */
+    build_append_int_noprefix(table,
+                              (1ULL << 14) | /* Type Valid */
+                              (1ULL << 1) /* Physical Address Valid */,
+                              8);
+    /* Error Status */
+    build_append_int_noprefix(table, 0, 8);
+    /* Physical Address */
+    build_append_int_noprefix(table, error_physical_addr, 8);
+    /* Skip all the detailed information normally found in such a record */
+    build_append_int_noprefix(table, 0, 48);
+    /* Memory Error Type */
+    build_append_int_noprefix(table, 0 /* Unknown error */, 1);
+    /* Skip all the detailed information normally found in such a record */
+    build_append_int_noprefix(table, 0, 7);
+}
+
+static int acpi_ghes_record_mem_error(uint64_t error_block_address,
+                                      uint64_t error_physical_addr)
+{
+    GArray *block;
+
+    /* Memory Error Section Type */
+    const uint8_t uefi_cper_mem_sec[] =
+          UUID_LE(0xA5BC1114, 0x6F64, 0x4EDE, 0xB8, 0x63, 0x3E, 0x83, \
+                  0xED, 0x7C, 0x83, 0xB1);
+
+    /* invalid fru id: ACPI 4.0: 17.3.2.6.1 Generic Error Data,
+     * Table 17-13 Generic Error Data Entry
+     */
+    QemuUUID fru_id = {};
+    uint32_t data_length;
+
+    block = g_array_new(false, true /* clear */, 1);
+
+    /* This is the length if adding a new generic error data entry*/
+    data_length = ACPI_GHES_DATA_LENGTH + ACPI_GHES_MEM_CPER_LENGTH;
+
+    /*
+     * Check whether it will run out of the preallocated memory if adding a new
+     * generic error data entry
+     */
+    if ((data_length + ACPI_GHES_GESB_SIZE) > ACPI_GHES_MAX_RAW_DATA_LENGTH) {
+        error_report("Not enough memory to record new CPER!!!");
+        g_array_free(block, true);
+        return -1;
+    }
+
+    /* Build the new generic error status block header */
+    acpi_ghes_generic_error_status(block, ACPI_GEBS_UNCORRECTABLE,
+        0, 0, data_length, ACPI_CPER_SEV_RECOVERABLE);
+
+    /* Build this new generic error data entry header */
+    acpi_ghes_generic_error_data(block, uefi_cper_mem_sec,
+        ACPI_CPER_SEV_RECOVERABLE, 0, 0,
+        ACPI_GHES_MEM_CPER_LENGTH, fru_id, 0);
+
+    /* Build the memory section CPER for above new generic error data entry */
+    acpi_ghes_build_append_mem_cper(block, error_physical_addr);
+
+    /* Write the generic error data entry into guest memory */
+    cpu_physical_memory_write(error_block_address, block->data, block->len);
+
+    g_array_free(block, true);
+
+    return 0;
+}
+
 /*
  * Build table for the hardware error fw_cfg blob.
  * Initialize "etc/hardware_errors" and "etc/hardware_errors_addr" fw_cfg blobs.
@@ -XXX,XX +XXX,XX @@ void acpi_ghes_add_fw_cfg(AcpiGhesState *ags, FWCfgState *s,
     fw_cfg_add_file_callback(s, ACPI_GHES_DATA_ADDR_FW_CFG_FILE, NULL, NULL,
         NULL, &(ags->ghes_addr_le), sizeof(ags->ghes_addr_le), false);
 }
+
+int acpi_ghes_record_errors(uint8_t source_id, uint64_t physical_address)
+{
+    uint64_t error_block_addr, read_ack_register_addr, read_ack_register = 0;
+    uint64_t start_addr;
+    bool ret = -1;
+    AcpiGedState *acpi_ged_state;
+    AcpiGhesState *ags;
+
+    assert(source_id < ACPI_HEST_SRC_ID_RESERVED);
+
+    acpi_ged_state = ACPI_GED(object_resolve_path_type("", TYPE_ACPI_GED,
+                                                       NULL));
+    g_assert(acpi_ged_state);
+    ags = &acpi_ged_state->ghes_state;
+
+    start_addr = le64_to_cpu(ags->ghes_addr_le);
+
+    if (physical_address) {
+
+        if (source_id < ACPI_HEST_SRC_ID_RESERVED) {
+            start_addr += source_id * sizeof(uint64_t);
+        }
+
+        cpu_physical_memory_read(start_addr, &error_block_addr,
+                                 sizeof(error_block_addr));
+
+        error_block_addr = le64_to_cpu(error_block_addr);
+
+        read_ack_register_addr = start_addr +
+            ACPI_GHES_ERROR_SOURCE_COUNT * sizeof(uint64_t);
+
+        cpu_physical_memory_read(read_ack_register_addr,
+                                 &read_ack_register, sizeof(read_ack_register));
+
+        /* zero means OSPM does not acknowledge the error */
+        if (!read_ack_register) {
+            error_report("OSPM does not acknowledge previous error,"
+                " so can not record CPER for current error anymore");
+        } else if (error_block_addr) {
+            read_ack_register = cpu_to_le64(0);
+            /*
+             * Clear the Read Ack Register, OSPM will write it to 1 when
+             * it acknowledges this error.
+             */
+            cpu_physical_memory_write(read_ack_register_addr,
+                &read_ack_register, sizeof(uint64_t));
+
+            ret = acpi_ghes_record_mem_error(error_block_addr,
+                                             physical_address);
+        } else
+            error_report("can not find Generic Error Status Block");
+    }
+
+    return ret;
+}
-- 
2.20.1

From: Dongjiu Geng <gengdongjiu@huawei.com>

Add a SIGBUS signal handler. In this handler, it checks the SIGBUS type,
translates the host VA delivered by host to guest PA, then fills this PA
to guest APEI GHES memory, then notifies guest according to the SIGBUS
type.

When guest accesses the poisoned memory, it will generate a Synchronous
External Abort(SEA). Then host kernel gets an APEI notification and calls
memory_failure() to unmapped the affected page in stage 2, finally
returns to guest.

Guest continues to access the PG_hwpoison page, it will trap to KVM as
stage2 fault, then a SIGBUS_MCEERR_AR synchronous signal is delivered to
Qemu, Qemu records this error address into guest APEI GHES memory and
notifes guest using Synchronous-External-Abort(SEA).

In order to inject a vSEA, we introduce the kvm_inject_arm_sea() function
in which we can setup the type of exception and the syndrome information.
When switching to guest, the target vcpu will jump to the synchronous
external abort vector table entry.

The ESR_ELx.DFSC is set to synchronous external abort(0x10), and the
ESR_ELx.FnV is set to not valid(0x1), which will tell guest that FAR is
not valid and hold an UNKNOWN value. These values will be set to KVM
register structures through KVM_SET_ONE_REG IOCTL.

Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Xiang Zheng <zhengxiang9@huawei.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-id: 20200512030609.19593-10-gengdongjiu@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/sysemu/kvm.h    |  3 +-
 target/arm/cpu.h        |  4 +++
 target/arm/internals.h  |  5 +--
 target/i386/cpu.h       |  2 ++
 target/arm/helper.c     |  2 +-
 target/arm/kvm64.c      | 77 +++++++++++++++++++++++++++++++++++++++++
 target/arm/tlb_helper.c |  2 +-
 7 files changed, 89 insertions(+), 6 deletions(-)

diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index XXXXXXX..XXXXXXX 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -XXX,XX +XXX,XX @@ bool kvm_vcpu_id_is_valid(int vcpu_id);
 /* Returns VCPU ID to be used on KVM_CREATE_VCPU ioctl() */
 unsigned long kvm_arch_vcpu_id(CPUState *cpu);
 
-#ifdef TARGET_I386
-#define KVM_HAVE_MCE_INJECTION 1
+#ifdef KVM_HAVE_MCE_INJECTION
 void kvm_arch_on_sigbus_vcpu(CPUState *cpu, int code, void *addr);
 #endif
 
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@
 /* ARM processors have a weak memory model */
 #define TCG_GUEST_DEFAULT_MO      (0)
 
+#ifdef TARGET_AARCH64
+#define KVM_HAVE_MCE_INJECTION 1
+#endif
+
 #define EXCP_UDEF            1   /* undefined instruction */
 #define EXCP_SWI             2   /* software interrupt */
 #define EXCP_PREFETCH_ABORT  3
diff --git a/target/arm/internals.h b/target/arm/internals.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -XXX,XX +XXX,XX @@ static inline uint32_t syn_insn_abort(int same_el, int ea, int s1ptw, int fsc)
         | ARM_EL_IL | (ea << 9) | (s1ptw << 7) | fsc;
 }
 
-static inline uint32_t syn_data_abort_no_iss(int same_el,
+static inline uint32_t syn_data_abort_no_iss(int same_el, int fnv,
                                              int ea, int cm, int s1ptw,
                                              int wnr, int fsc)
 {
     return (EC_DATAABORT << ARM_EL_EC_SHIFT) | (same_el << ARM_EL_EC_SHIFT)
            | ARM_EL_IL
-           | (ea << 9) | (cm << 8) | (s1ptw << 7) | (wnr << 6) | fsc;
+           | (fnv << 10) | (ea << 9) | (cm << 8) | (s1ptw << 7)
+           | (wnr << 6) | fsc;
 }
 
 static inline uint32_t syn_data_abort_with_iss(int same_el,
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -XXX,XX +XXX,XX @@
 /* The x86 has a strong memory model with some store-after-load re-ordering */
 #define TCG_GUEST_DEFAULT_MO      (TCG_MO_ALL & ~TCG_MO_ST_LD)
 
+#define KVM_HAVE_MCE_INJECTION 1
+
 /* Maximum instruction code size */
 #define TARGET_MAX_INSN_SIZE 16
 
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static uint64_t do_ats_write(CPUARMState *env, uint64_t value,
              * Report exception with ESR indicating a fault due to a
              * translation table walk for a cache maintenance instruction.
              */
-            syn = syn_data_abort_no_iss(current_el == target_el,
+            syn = syn_data_abort_no_iss(current_el == target_el, 0,
                                         fi.ea, 1, fi.s1ptw, 1, fsc);
             env->exception.vaddress = value;
             env->exception.fsr = fsr;
diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/kvm64.c
+++ b/target/arm/kvm64.c
@@ -XXX,XX +XXX,XX @@
 #include "sysemu/kvm_int.h"
 #include "kvm_arm.h"
 #include "internals.h"
+#include "hw/acpi/acpi.h"
+#include "hw/acpi/ghes.h"
+#include "hw/arm/virt.h"
 
 static bool have_guest_debug;
 
@@ -XXX,XX +XXX,XX @@ int kvm_arm_cpreg_level(uint64_t regidx)
     return KVM_PUT_RUNTIME_STATE;
 }
 
+/* Callers must hold the iothread mutex lock */
+static void kvm_inject_arm_sea(CPUState *c)
+{
+    ARMCPU *cpu = ARM_CPU(c);
+    CPUARMState *env = &cpu->env;
+    CPUClass *cc = CPU_GET_CLASS(c);
+    uint32_t esr;
+    bool same_el;
+
+    c->exception_index = EXCP_DATA_ABORT;
+    env->exception.target_el = 1;
+
+    /*
+     * Set the DFSC to synchronous external abort and set FnV to not valid,
+     * this will tell guest the FAR_ELx is UNKNOWN for this abort.
+     */
+    same_el = arm_current_el(env) == env->exception.target_el;
+    esr = syn_data_abort_no_iss(same_el, 1, 0, 0, 0, 0, 0x10);
+
+    env->exception.syndrome = esr;
+
+    cc->do_interrupt(c);
+}
+
 #define AARCH64_CORE_REG(x)   (KVM_REG_ARM64 | KVM_REG_SIZE_U64 | \
                  KVM_REG_ARM_CORE | KVM_REG_ARM_CORE_REG(x))
 
@@ -XXX,XX +XXX,XX @@ int kvm_arch_get_registers(CPUState *cs)
     return ret;
 }
 
+void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
+{
+    ram_addr_t ram_addr;
+    hwaddr paddr;
+    Object *obj = qdev_get_machine();
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+    bool acpi_enabled = virt_is_acpi_enabled(vms);
+
+    assert(code == BUS_MCEERR_AR || code == BUS_MCEERR_AO);
+
+    if (acpi_enabled && addr &&
+            object_property_get_bool(obj, "ras", NULL)) {
+        ram_addr = qemu_ram_addr_from_host(addr);
+        if (ram_addr != RAM_ADDR_INVALID &&
+            kvm_physical_memory_addr_from_host(c->kvm_state, addr, &paddr)) {
+            kvm_hwpoison_page_add(ram_addr);
+            /*
+             * If this is a BUS_MCEERR_AR, we know we have been called
+             * synchronously from the vCPU thread, so we can easily
+             * synchronize the state and inject an error.
+             *
+             * TODO: we currently don't tell the guest at all about
+             * BUS_MCEERR_AO. In that case we might either be being
+             * called synchronously from the vCPU thread, or a bit
+             * later from the main thread, so doing the injection of
+             * the error would be more complicated.
+             */
+            if (code == BUS_MCEERR_AR) {
+                kvm_cpu_synchronize_state(c);
+                if (!acpi_ghes_record_errors(ACPI_HEST_SRC_ID_SEA, paddr)) {
+                    kvm_inject_arm_sea(c);
+                } else {
+                    error_report("failed to record the error");
+                    abort();
+                }
+            }
+            return;
+        }
+        if (code == BUS_MCEERR_AO) {
+            error_report("Hardware memory error at addr %p for memory used by "
+                "QEMU itself instead of guest system!", addr);
+        }
+    }
+
+    if (code == BUS_MCEERR_AR) {
+        error_report("Hardware memory error!");
+        exit(1);
+    }
+}
+
 /* C6.6.29 BRK instruction */
 static const uint32_t brk_insn = 0xd4200000;
 
diff --git a/target/arm/tlb_helper.c b/target/arm/tlb_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tlb_helper.c
+++ b/target/arm/tlb_helper.c
@@ -XXX,XX +XXX,XX @@ static inline uint32_t merge_syn_data_abort(uint32_t template_syn,
      * ISV field.
      */
     if (!(template_syn & ARM_EL_ISV) || target_el != 2 || s1ptw) {
-        syn = syn_data_abort_no_iss(same_el,
+        syn = syn_data_abort_no_iss(same_el, 0,
                                     ea, 0, s1ptw, is_write, fsc);
     } else {
         /*
-- 
2.20.1

From: Dongjiu Geng <gengdongjiu@huawei.com>

I and Xiang are willing to review the APEI-related patches and
volunteer as the reviewers for the HEST/GHES part.

Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 20200512030609.19593-11-gengdongjiu@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 MAINTAINERS | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index XXXXXXX..XXXXXXX 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -XXX,XX +XXX,XX @@ F: tests/qtest/bios-tables-test.c
 F: tests/qtest/acpi-utils.[hc]
 F: tests/data/acpi/
 
+ACPI/HEST/GHES
+R: Dongjiu Geng <gengdongjiu@huawei.com>
+R: Xiang Zheng <zhengxiang9@huawei.com>
+L: qemu-arm@nongnu.org
+S: Maintained
+F: hw/acpi/ghes.c
+F: include/hw/acpi/ghes.h
+F: docs/specs/acpi_hest_ghes.rst
+
 ppc4xx
 M: David Gibson <david@gibson.dropbear.id.au>
 L: qemu-ppc@nongnu.org
-- 
2.20.1

Convert the Neon VQRDMLAH and VQRDMLSH insns in the 3-reg-same group
to decodetree.  These don't use do_3same() because they want to
operate on VFP double registers, whose offsets are different from the
neon_reg_offset() calculations do_3same does.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200512163904.10918-2-peter.maydell@linaro.org
---
 target/arm/neon-dp.decode       |  3 +++
 target/arm/translate-neon.inc.c | 15 +++++++++++++++
 target/arm/translate.c          | 14 ++------------
 3 files changed, 20 insertions(+), 12 deletions(-)

Convert the Neon SHA instructions in the 3-reg-same group
to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200512163904.10918-3-peter.maydell@linaro.org
---
 target/arm/neon-dp.decode       |  10 +++
 target/arm/translate-neon.inc.c | 139 ++++++++++++++++++++++++++++++++
 target/arm/translate.c          |  46 +----------
 3 files changed, 151 insertions(+), 44 deletions(-)

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -XXX,XX +XXX,XX @@ VMUL_3s          1111 001 0 0 . .. .... .... 1001 . . . 1 .... @3same
 VMUL_p_3s        1111 001 1 0 . .. .... .... 1001 . . . 1 .... @3same
 
 VQRDMLAH_3s      1111 001 1 0 . .. .... .... 1011 ... 1 .... @3same
+
+SHA1_3s          1111 001 0 0 . optype:2 .... .... 1100 . 1 . 0 .... \
+                 vm=%vm_dp vn=%vn_dp vd=%vd_dp
+SHA256H_3s       1111 001 1 0 . 00 .... .... 1100 . 1 . 0 .... \
+                 vm=%vm_dp vn=%vn_dp vd=%vd_dp
+SHA256H2_3s      1111 001 1 0 . 01 .... .... 1100 . 1 . 0 .... \
+                 vm=%vm_dp vn=%vn_dp vd=%vd_dp
+SHA256SU1_3s     1111 001 1 0 . 10 .... .... 1100 . 1 . 0 .... \
+                 vm=%vm_dp vn=%vn_dp vd=%vd_dp
+
 VQRDMLSH_3s      1111 001 1 0 . .. .... .... 1100 ... 1 .... @3same
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
 
 DO_VQRDMLAH(VQRDMLAH, gen_gvec_sqrdmlah_qc)
 DO_VQRDMLAH(VQRDMLSH, gen_gvec_sqrdmlsh_qc)
+
+static bool trans_SHA1_3s(DisasContext *s, arg_SHA1_3s *a)
+{
+    TCGv_ptr ptr1, ptr2, ptr3;
+    TCGv_i32 tmp;
+
+    if (!arm_dc_feature(s, ARM_FEATURE_NEON) ||
+        !dc_isar_feature(aa32_sha1, s)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vd | a->vn | a->vm) & 0x10)) {
+        return false;
+    }
+
+    if ((a->vn | a->vm | a->vd) & 1) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    ptr1 = vfp_reg_ptr(true, a->vd);
+    ptr2 = vfp_reg_ptr(true, a->vn);
+    ptr3 = vfp_reg_ptr(true, a->vm);
+    tmp = tcg_const_i32(a->optype);
+    gen_helper_crypto_sha1_3reg(ptr1, ptr2, ptr3, tmp);
+    tcg_temp_free_i32(tmp);
+    tcg_temp_free_ptr(ptr1);
+    tcg_temp_free_ptr(ptr2);
+    tcg_temp_free_ptr(ptr3);
+
+    return true;
+}
+
+static bool trans_SHA256H_3s(DisasContext *s, arg_SHA256H_3s *a)
+{
+    TCGv_ptr ptr1, ptr2, ptr3;
+
+    if (!arm_dc_feature(s, ARM_FEATURE_NEON) ||
+        !dc_isar_feature(aa32_sha2, s)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vd | a->vn | a->vm) & 0x10)) {
+        return false;
+    }
+
+    if ((a->vn | a->vm | a->vd) & 1) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    ptr1 = vfp_reg_ptr(true, a->vd);
+    ptr2 = vfp_reg_ptr(true, a->vn);
+    ptr3 = vfp_reg_ptr(true, a->vm);
+    gen_helper_crypto_sha256h(ptr1, ptr2, ptr3);
+    tcg_temp_free_ptr(ptr1);
+    tcg_temp_free_ptr(ptr2);
+    tcg_temp_free_ptr(ptr3);
+
+    return true;
+}
+
+static bool trans_SHA256H2_3s(DisasContext *s, arg_SHA256H2_3s *a)
+{
+    TCGv_ptr ptr1, ptr2, ptr3;
+
+    if (!arm_dc_feature(s, ARM_FEATURE_NEON) ||
+        !dc_isar_feature(aa32_sha2, s)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vd | a->vn | a->vm) & 0x10)) {
+        return false;
+    }
+
+    if ((a->vn | a->vm | a->vd) & 1) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    ptr1 = vfp_reg_ptr(true, a->vd);
+    ptr2 = vfp_reg_ptr(true, a->vn);
+    ptr3 = vfp_reg_ptr(true, a->vm);
+    gen_helper_crypto_sha256h2(ptr1, ptr2, ptr3);
+    tcg_temp_free_ptr(ptr1);
+    tcg_temp_free_ptr(ptr2);
+    tcg_temp_free_ptr(ptr3);
+
+    return true;
+}
+
+static bool trans_SHA256SU1_3s(DisasContext *s, arg_SHA256SU1_3s *a)
+{
+    TCGv_ptr ptr1, ptr2, ptr3;
+
+    if (!arm_dc_feature(s, ARM_FEATURE_NEON) ||
+        !dc_isar_feature(aa32_sha2, s)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vd | a->vn | a->vm) & 0x10)) {
+        return false;
+    }
+
+    if ((a->vn | a->vm | a->vd) & 1) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    ptr1 = vfp_reg_ptr(true, a->vd);
+    ptr2 = vfp_reg_ptr(true, a->vn);
+    ptr3 = vfp_reg_ptr(true, a->vm);
+    gen_helper_crypto_sha256su1(ptr1, ptr2, ptr3);
+    tcg_temp_free_ptr(ptr1);
+    tcg_temp_free_ptr(ptr2);
+    tcg_temp_free_ptr(ptr3);
+
+    return true;
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
     int vec_size;
     uint32_t imm;
     TCGv_i32 tmp, tmp2, tmp3, tmp4, tmp5;
-    TCGv_ptr ptr1, ptr2, ptr3;
+    TCGv_ptr ptr1, ptr2;
     TCGv_i64 tmp64;
 
     if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
             return 1;
         }
         switch (op) {
-        case NEON_3R_SHA:
-            /* The SHA-1/SHA-256 3-register instructions require special
-             * treatment here, as their size field is overloaded as an
-             * op type selector, and they all consume their input in a
-             * single pass.
-             */
-            if (!q) {
-                return 1;
-            }
-            if (!u) { /* SHA-1 */
-                if (!dc_isar_feature(aa32_sha1, s)) {
-                    return 1;
-                }
-                ptr1 = vfp_reg_ptr(true, rd);
-                ptr2 = vfp_reg_ptr(true, rn);
-                ptr3 = vfp_reg_ptr(true, rm);
-                tmp4 = tcg_const_i32(size);
-                gen_helper_crypto_sha1_3reg(ptr1, ptr2, ptr3, tmp4);
-                tcg_temp_free_i32(tmp4);
-            } else { /* SHA-256 */
-                if (!dc_isar_feature(aa32_sha2, s) || size == 3) {
-                    return 1;
-                }
-                ptr1 = vfp_reg_ptr(true, rd);
-                ptr2 = vfp_reg_ptr(true, rn);
-                ptr3 = vfp_reg_ptr(true, rm);
-                switch (size) {
-                case 0:
-                    gen_helper_crypto_sha256h(ptr1, ptr2, ptr3);
-                    break;
-                case 1:
-                    gen_helper_crypto_sha256h2(ptr1, ptr2, ptr3);
-                    break;
-                case 2:
-                    gen_helper_crypto_sha256su1(ptr1, ptr2, ptr3);
-                    break;
-                }
-            }
-            tcg_temp_free_ptr(ptr1);
-            tcg_temp_free_ptr(ptr2);
-            tcg_temp_free_ptr(ptr3);
-            return 0;
-
         case NEON_3R_VPADD_VQRDMLAH:
             if (!u) {
                 break;  /* VPADD */
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
         case NEON_3R_VMUL:
         case NEON_3R_VML:
         case NEON_3R_VSHL:
+        case NEON_3R_SHA:
             /* Already handled by decodetree */
             return 1;
         }
-- 
2.20.1

Convert the 64-bit element insns in the 3-reg-same group
to decodetree. This covers VQSHL, VRSHL and VQRSHL where
size==0b11.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200512163904.10918-4-peter.maydell@linaro.org
---
 target/arm/neon-dp.decode       | 13 +++++++++++
 target/arm/translate-neon.inc.c | 24 +++++++++++++++++++++
 target/arm/translate.c          | 38 ++-------------------------------
 3 files changed, 39 insertions(+), 36 deletions(-)

Convert the Neon VHADD insns in the 3-reg-same group to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200512163904.10918-5-peter.maydell@linaro.org
---
 target/arm/neon-dp.decode       |  2 ++
 target/arm/translate-neon.inc.c | 24 ++++++++++++++++++++++++
 target/arm/translate.c          |  4 +---
 3 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -XXX,XX +XXX,XX @@
 @3same           .... ... . . . size:2 .... .... .... . q:1 . . .... \
                  &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp
 
+VHADD_S_3s       1111 001 0 0 . .. .... .... 0000 . . . 0 .... @3same
+VHADD_U_3s       1111 001 1 0 . .. .... .... 0000 . . . 0 .... @3same
 VQADD_S_3s       1111 001 0 0 . .. .... .... 0000 . . . 1 .... @3same
 VQADD_U_3s       1111 001 1 0 . .. .... .... 0000 . . . 1 .... @3same
 
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_3SAME_64_ENV(VQSHL_S64, gen_helper_neon_qshl_s64)
 DO_3SAME_64_ENV(VQSHL_U64, gen_helper_neon_qshl_u64)
 DO_3SAME_64_ENV(VQRSHL_S64, gen_helper_neon_qrshl_s64)
 DO_3SAME_64_ENV(VQRSHL_U64, gen_helper_neon_qrshl_u64)
+
+#define DO_3SAME_32(INSN, FUNC)                                         \
+    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
+                                uint32_t rn_ofs, uint32_t rm_ofs,       \
+                                uint32_t oprsz, uint32_t maxsz)         \
+    {                                                                   \
+        static const GVecGen3 ops[4] = {                                \
+            { .fni4 = gen_helper_neon_##FUNC##8 },                      \
+            { .fni4 = gen_helper_neon_##FUNC##16 },                     \
+            { .fni4 = gen_helper_neon_##FUNC##32 },                     \
+            { 0 },                                                      \
+        };                                                              \
+        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &ops[vece]); \
+    }                                                                   \
+    static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a)        \
+    {                                                                   \
+        if (a->size > 2) {                                              \
+            return false;                                               \
+        }                                                               \
+        return do_3same(s, a, gen_##INSN##_3s);                         \
+    }
+
+DO_3SAME_32(VHADD_S, hadd_s)
+DO_3SAME_32(VHADD_U, hadd_u)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
         case NEON_3R_VML:
         case NEON_3R_VSHL:
         case NEON_3R_SHA:
+        case NEON_3R_VHADD:
             /* Already handled by decodetree */
             return 1;
         }
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
             tmp2 = neon_load_reg(rm, pass);
         }
         switch (op) {
-        case NEON_3R_VHADD:
-            GEN_NEON_INTEGER_OP(hadd);
-            break;
         case NEON_3R_VRHADD:
             GEN_NEON_INTEGER_OP(rhadd);
             break;
-- 
2.20.1

Convert the Neon VABA and VABD insns in the 3-reg-same group to
decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200512163904.10918-6-peter.maydell@linaro.org
---
 target/arm/neon-dp.decode       |  6 ++++++
 target/arm/translate-neon.inc.c |  4 ++++
 target/arm/translate.c          | 22 ++--------------------
 3 files changed, 12 insertions(+), 20 deletions(-)

Convert the Neon VRHADD and VHSUB 3-reg-same insns to decodetree.
(These are all the other insns in 3-reg-same which were using
GEN_NEON_INTEGER_OP() and which are not pairwise or
reversed-operands.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200512163904.10918-7-peter.maydell@linaro.org
---
 target/arm/neon-dp.decode       | 6 ++++++
 target/arm/translate-neon.inc.c | 4 ++++
 target/arm/translate.c          | 8 ++------
 3 files changed, 12 insertions(+), 6 deletions(-)

Convert the VQSHL, VRSHL and VQRSHL insns in the 3-reg-same
group to decodetree. We have already implemented the size==0b11
case of these insns; this commit handles the remaining sizes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200512163904.10918-8-peter.maydell@linaro.org
---
 target/arm/neon-dp.decode       | 30 ++++++++++++++++++-----
 target/arm/translate-neon.inc.c | 43 +++++++++++++++++++++++++++++++++
 target/arm/translate.c          | 22 +++--------------
 3 files changed, 70 insertions(+), 25 deletions(-)

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -XXX,XX +XXX,XX @@ VSHL_U_3s        1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same_rev
 @3same_64_rev    .... ... . . . 11 .... .... .... . q:1 . . .... \
                  &3same vm=%vn_dp vn=%vm_dp vd=%vd_dp size=3
 
-VQSHL_S64_3s     1111 001 0 0 . .. .... .... 0100 . . . 1 .... @3same_64_rev
-VQSHL_U64_3s     1111 001 1 0 . .. .... .... 0100 . . . 1 .... @3same_64_rev
-VRSHL_S64_3s     1111 001 0 0 . .. .... .... 0101 . . . 0 .... @3same_64_rev
-VRSHL_U64_3s     1111 001 1 0 . .. .... .... 0101 . . . 0 .... @3same_64_rev
-VQRSHL_S64_3s    1111 001 0 0 . .. .... .... 0101 . . . 1 .... @3same_64_rev
-VQRSHL_U64_3s    1111 001 1 0 . .. .... .... 0101 . . . 1 .... @3same_64_rev
+{
+  VQSHL_S64_3s   1111 001 0 0 . .. .... .... 0100 . . . 1 .... @3same_64_rev
+  VQSHL_S_3s     1111 001 0 0 . .. .... .... 0100 . . . 1 .... @3same_rev
+}
+{
+  VQSHL_U64_3s   1111 001 1 0 . .. .... .... 0100 . . . 1 .... @3same_64_rev
+  VQSHL_U_3s     1111 001 1 0 . .. .... .... 0100 . . . 1 .... @3same_rev
+}
+{
+  VRSHL_S64_3s   1111 001 0 0 . .. .... .... 0101 . . . 0 .... @3same_64_rev
+  VRSHL_S_3s     1111 001 0 0 . .. .... .... 0101 . . . 0 .... @3same_rev
+}
+{
+  VRSHL_U64_3s   1111 001 1 0 . .. .... .... 0101 . . . 0 .... @3same_64_rev
+  VRSHL_U_3s     1111 001 1 0 . .. .... .... 0101 . . . 0 .... @3same_rev
+}
+{
+  VQRSHL_S64_3s  1111 001 0 0 . .. .... .... 0101 . . . 1 .... @3same_64_rev
+  VQRSHL_S_3s    1111 001 0 0 . .. .... .... 0101 . . . 1 .... @3same_rev
+}
+{
+  VQRSHL_U64_3s  1111 001 1 0 . .. .... .... 0101 . . . 1 .... @3same_64_rev
+  VQRSHL_U_3s    1111 001 1 0 . .. .... .... 0101 . . . 1 .... @3same_rev
+}
 
 VMAX_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 0 .... @3same
 VMAX_U_3s        1111 001 1 0 . .. .... .... 0110 . . . 0 .... @3same
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_3SAME_64_ENV(VQRSHL_U64, gen_helper_neon_qrshl_u64)
         return do_3same(s, a, gen_##INSN##_3s);                         \
     }
 
+/*
+ * Some helper functions need to be passed the cpu_env. In order
+ * to use those with the gvec APIs like tcg_gen_gvec_3() we need
+ * to create wrapper functions whose prototype is a NeonGenTwoOpFn()
+ * and which call a NeonGenTwoOpEnvFn().
+ */
+#define WRAP_ENV_FN(WRAPNAME, FUNC)                                     \
+    static void WRAPNAME(TCGv_i32 d, TCGv_i32 n, TCGv_i32 m)            \
+    {                                                                   \
+        FUNC(d, cpu_env, n, m);                                         \
+    }
+
+#define DO_3SAME_32_ENV(INSN, FUNC)                                     \
+    WRAP_ENV_FN(gen_##INSN##_tramp8, gen_helper_neon_##FUNC##8);        \
+    WRAP_ENV_FN(gen_##INSN##_tramp16, gen_helper_neon_##FUNC##16);      \
+    WRAP_ENV_FN(gen_##INSN##_tramp32, gen_helper_neon_##FUNC##32);      \
+    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
+                                uint32_t rn_ofs, uint32_t rm_ofs,       \
+                                uint32_t oprsz, uint32_t maxsz)         \
+    {                                                                   \
+        static const GVecGen3 ops[4] = {                                \
+            { .fni4 = gen_##INSN##_tramp8 },                            \
+            { .fni4 = gen_##INSN##_tramp16 },                           \
+            { .fni4 = gen_##INSN##_tramp32 },                           \
+            { 0 },                                                      \
+        };                                                              \
+        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &ops[vece]); \
+    }                                                                   \
+    static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a)        \
+    {                                                                   \
+        if (a->size > 2) {                                              \
+            return false;                                               \
+        }                                                               \
+        return do_3same(s, a, gen_##INSN##_3s);                         \
+    }
+
 DO_3SAME_32(VHADD_S, hadd_s)
 DO_3SAME_32(VHADD_U, hadd_u)
 DO_3SAME_32(VHSUB_S, hsub_s)
 DO_3SAME_32(VHSUB_U, hsub_u)
 DO_3SAME_32(VRHADD_S, rhadd_s)
 DO_3SAME_32(VRHADD_U, rhadd_u)
+DO_3SAME_32(VRSHL_S, rshl_s)
+DO_3SAME_32(VRSHL_U, rshl_u)
+
+DO_3SAME_32_ENV(VQSHL_S, qshl_s)
+DO_3SAME_32_ENV(VQSHL_U, qshl_u)
+DO_3SAME_32_ENV(VQRSHL_S, qrshl_s)
+DO_3SAME_32_ENV(VQRSHL_U, qrshl_u)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
         case NEON_3R_VHSUB:
         case NEON_3R_VABD:
         case NEON_3R_VABA:
+        case NEON_3R_VQSHL:
+        case NEON_3R_VRSHL:
+        case NEON_3R_VQRSHL:
             /* Already handled by decodetree */
             return 1;
         }
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
         }
         pairwise = 0;
         switch (op) {
-        case NEON_3R_VQSHL:
-        case NEON_3R_VRSHL:
-        case NEON_3R_VQRSHL:
-            {
-                int rtmp;
-                /* Shift instruction operands are reversed.  */
-                rtmp = rn;
-                rn = rm;
-                rm = rtmp;
-            }
-            break;
         case NEON_3R_VPADD_VQRDMLAH:
         case NEON_3R_VPMAX:
         case NEON_3R_VPMIN:
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
             tmp2 = neon_load_reg(rm, pass);
         }
         switch (op) {
-        case NEON_3R_VQSHL:
-            GEN_NEON_INTEGER_OP_ENV(qshl);
-            break;
-        case NEON_3R_VRSHL:
-            GEN_NEON_INTEGER_OP(rshl);
-            break;
-        case NEON_3R_VQRSHL:
-            GEN_NEON_INTEGER_OP_ENV(qrshl);
             break;
         case NEON_3R_VPMAX:
             GEN_NEON_INTEGER_OP(pmax);
-- 
2.20.1

Convert the Neon integer VPMAX and VPMIN 3-reg-same insns to
decodetree. These are 'pairwise' operations.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200512163904.10918-9-peter.maydell@linaro.org
---
 target/arm/neon-dp.decode       |  9 +++++
 target/arm/translate-neon.inc.c | 71 +++++++++++++++++++++++++++++++++
 target/arm/translate.c          | 17 +-------
 3 files changed, 82 insertions(+), 15 deletions(-)

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -XXX,XX +XXX,XX @@
 @3same           .... ... . . . size:2 .... .... .... . q:1 . . .... \
                  &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp
 
+@3same_q0        .... ... . . . size:2 .... .... .... . 0 . . .... \
+                 &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp q=0
+
 VHADD_S_3s       1111 001 0 0 . .. .... .... 0000 . . . 0 .... @3same
 VHADD_U_3s       1111 001 1 0 . .. .... .... 0000 . . . 0 .... @3same
 VQADD_S_3s       1111 001 0 0 . .. .... .... 0000 . . . 1 .... @3same
@@ -XXX,XX +XXX,XX @@ VMLS_3s          1111 001 1 0 . .. .... .... 1001 . . . 0 .... @3same
 VMUL_3s          1111 001 0 0 . .. .... .... 1001 . . . 1 .... @3same
 VMUL_p_3s        1111 001 1 0 . .. .... .... 1001 . . . 1 .... @3same
 
+VPMAX_S_3s       1111 001 0 0 . .. .... .... 1010 . . . 0 .... @3same_q0
+VPMAX_U_3s       1111 001 1 0 . .. .... .... 1010 . . . 0 .... @3same_q0
+
+VPMIN_S_3s       1111 001 0 0 . .. .... .... 1010 . . . 1 .... @3same_q0
+VPMIN_U_3s       1111 001 1 0 . .. .... .... 1010 . . . 1 .... @3same_q0
+
 VQRDMLAH_3s      1111 001 1 0 . .. .... .... 1011 ... 1 .... @3same
 
 SHA1_3s          1111 001 0 0 . optype:2 .... .... 1100 . 1 . 0 .... \
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_3SAME_32_ENV(VQSHL_S, qshl_s)
 DO_3SAME_32_ENV(VQSHL_U, qshl_u)
 DO_3SAME_32_ENV(VQRSHL_S, qrshl_s)
 DO_3SAME_32_ENV(VQRSHL_U, qrshl_u)
+
+static bool do_3same_pair(DisasContext *s, arg_3same *a, NeonGenTwoOpFn *fn)
+{
+    /* Operations handled pairwise 32 bits at a time */
+    TCGv_i32 tmp, tmp2, tmp3;
+
+    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vd | a->vn | a->vm) & 0x10)) {
+        return false;
+    }
+
+    if (a->size == 3) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    assert(a->q == 0); /* enforced by decode patterns */
+
+    /*
+     * Note that we have to be careful not to clobber the source operands
+     * in the "vm == vd" case by storing the result of the first pass too
+     * early. Since Q is 0 there are always just two passes, so instead
+     * of a complicated loop over each pass we just unroll.
+     */
+    tmp = neon_load_reg(a->vn, 0);
+    tmp2 = neon_load_reg(a->vn, 1);
+    fn(tmp, tmp, tmp2);
+    tcg_temp_free_i32(tmp2);
+
+    tmp3 = neon_load_reg(a->vm, 0);
+    tmp2 = neon_load_reg(a->vm, 1);
+    fn(tmp3, tmp3, tmp2);
+    tcg_temp_free_i32(tmp2);
+
+    neon_store_reg(a->vd, 0, tmp);
+    neon_store_reg(a->vd, 1, tmp3);
+    return true;
+}
+
+#define DO_3SAME_PAIR(INSN, func)                                       \
+    static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a)        \
+    {                                                                   \
+        static NeonGenTwoOpFn * const fns[] = {                         \
+            gen_helper_neon_##func##8,                                  \
+            gen_helper_neon_##func##16,                                 \
+            gen_helper_neon_##func##32,                                 \
+        };                                                              \
+        if (a->size > 2) {                                              \
+            return false;                                               \
+        }                                                               \
+        return do_3same_pair(s, a, fns[a->size]);                       \
+    }
+
+/* 32-bit pairwise ops end up the same as the elementwise versions.  */
+#define gen_helper_neon_pmax_s32  tcg_gen_smax_i32
+#define gen_helper_neon_pmax_u32  tcg_gen_umax_i32
+#define gen_helper_neon_pmin_s32  tcg_gen_smin_i32
+#define gen_helper_neon_pmin_u32  tcg_gen_umin_i32
+
+DO_3SAME_PAIR(VPMAX_S, pmax_s)
+DO_3SAME_PAIR(VPMIN_S, pmin_s)
+DO_3SAME_PAIR(VPMAX_U, pmax_u)
+DO_3SAME_PAIR(VPMIN_U, pmin_u)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static inline void gen_neon_rsb(int size, TCGv_i32 t0, TCGv_i32 t1)
     }
 }
 
-/* 32-bit pairwise ops end up the same as the elementwise versions.  */
-#define gen_helper_neon_pmax_s32  tcg_gen_smax_i32
-#define gen_helper_neon_pmax_u32  tcg_gen_umax_i32
-#define gen_helper_neon_pmin_s32  tcg_gen_smin_i32
-#define gen_helper_neon_pmin_u32  tcg_gen_umin_i32
-
 #define GEN_NEON_INTEGER_OP_ENV(name) do { \
     switch ((size << 1) | u) { \
     case 0: \
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
         case NEON_3R_VQSHL:
         case NEON_3R_VRSHL:
         case NEON_3R_VQRSHL:
+        case NEON_3R_VPMAX:
+        case NEON_3R_VPMIN:
             /* Already handled by decodetree */
             return 1;
         }
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
         pairwise = 0;
         switch (op) {
         case NEON_3R_VPADD_VQRDMLAH:
-        case NEON_3R_VPMAX:
-        case NEON_3R_VPMIN:
             pairwise = 1;
             break;
         case NEON_3R_FLOAT_ARITH:
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
             tmp2 = neon_load_reg(rm, pass);
         }
         switch (op) {
-            break;
-        case NEON_3R_VPMAX:
-            GEN_NEON_INTEGER_OP(pmax);
-            break;
-        case NEON_3R_VPMIN:
-            GEN_NEON_INTEGER_OP(pmin);
-            break;
         case NEON_3R_VQDMULH_VQRDMULH: /* Multiply high.  */
             if (!u) { /* VQDMULH */
                 switch (size) {
-- 
2.20.1

Convert the Neon integer VPADD 3-reg-same insns to decodetree.  These
are 'pairwise' operations.  (Note that VQRDMLAH, which shares the
same primary opcode but has U=1, has already been converted.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200512163904.10918-10-peter.maydell@linaro.org
---
 target/arm/neon-dp.decode       |  2 ++
 target/arm/translate-neon.inc.c |  2 ++
 target/arm/translate.c          | 19 +------------------
 3 files changed, 5 insertions(+), 18 deletions(-)

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -XXX,XX +XXX,XX @@ VPMAX_U_3s       1111 001 1 0 . .. .... .... 1010 . . . 0 .... @3same_q0
 VPMIN_S_3s       1111 001 0 0 . .. .... .... 1010 . . . 1 .... @3same_q0
 VPMIN_U_3s       1111 001 1 0 . .. .... .... 1010 . . . 1 .... @3same_q0
 
+VPADD_3s         1111 001 0 0 . .. .... .... 1011 . . . 1 .... @3same_q0
+
 VQRDMLAH_3s      1111 001 1 0 . .. .... .... 1011 ... 1 .... @3same
 
 SHA1_3s          1111 001 0 0 . optype:2 .... .... 1100 . 1 . 0 .... \
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool do_3same_pair(DisasContext *s, arg_3same *a, NeonGenTwoOpFn *fn)
 #define gen_helper_neon_pmax_u32  tcg_gen_umax_i32
 #define gen_helper_neon_pmin_s32  tcg_gen_smin_i32
 #define gen_helper_neon_pmin_u32  tcg_gen_umin_i32
+#define gen_helper_neon_padd_u32  tcg_gen_add_i32
 
 DO_3SAME_PAIR(VPMAX_S, pmax_s)
 DO_3SAME_PAIR(VPMIN_S, pmin_s)
 DO_3SAME_PAIR(VPMAX_U, pmax_u)
 DO_3SAME_PAIR(VPMIN_U, pmin_u)
+DO_3SAME_PAIR(VPADD, padd_u)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
             return 1;
         }
         switch (op) {
-        case NEON_3R_VPADD_VQRDMLAH:
-            if (!u) {
-                break;  /* VPADD */
-            }
-            /* VQRDMLAH : handled by decodetree */
-            return 1;
-
         case NEON_3R_VFM_VQRDMLSH:
             if (!u) {
                 /* VFM, VFMS */
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
         case NEON_3R_VQRSHL:
         case NEON_3R_VPMAX:
         case NEON_3R_VPMIN:
+        case NEON_3R_VPADD_VQRDMLAH:
             /* Already handled by decodetree */
             return 1;
         }
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
         }
         pairwise = 0;
         switch (op) {
-        case NEON_3R_VPADD_VQRDMLAH:
-            pairwise = 1;
-            break;
         case NEON_3R_FLOAT_ARITH:
             pairwise = (u && size < 2); /* if VPADD (float) */
             break;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                 }
             }
             break;
-        case NEON_3R_VPADD_VQRDMLAH:
-            switch (size) {
-            case 0: gen_helper_neon_padd_u8(tmp, tmp, tmp2); break;
-            case 1: gen_helper_neon_padd_u16(tmp, tmp, tmp2); break;
-            case 2: tcg_gen_add_i32(tmp, tmp, tmp2); break;
-            default: abort();
-            }
-            break;
         case NEON_3R_FLOAT_ARITH: /* Floating point arithmetic. */
         {
             TCGv_ptr fpstatus = get_fpstatus_ptr(1);
-- 
2.20.1

Convert the Neon VQDMULH and VQRDMULH 3-reg-same insns to
decodetree. These are the last integer operations in the
3-reg-same group.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200512163904.10918-11-peter.maydell@linaro.org
---
 target/arm/neon-dp.decode       |  3 +++
 target/arm/translate-neon.inc.c | 24 ++++++++++++++++++++++++
 target/arm/translate.c          | 24 +-----------------------
 3 files changed, 28 insertions(+), 23 deletions(-)

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -XXX,XX +XXX,XX @@ VPMAX_U_3s       1111 001 1 0 . .. .... .... 1010 . . . 0 .... @3same_q0
 VPMIN_S_3s       1111 001 0 0 . .. .... .... 1010 . . . 1 .... @3same_q0
 VPMIN_U_3s       1111 001 1 0 . .. .... .... 1010 . . . 1 .... @3same_q0
 
+VQDMULH_3s       1111 001 0 0 . .. .... .... 1011 . . . 0 .... @3same
+VQRDMULH_3s      1111 001 1 0 . .. .... .... 1011 . . . 0 .... @3same
+
 VPADD_3s         1111 001 0 0 . .. .... .... 1011 . . . 1 .... @3same_q0
 
 VQRDMLAH_3s      1111 001 1 0 . .. .... .... 1011 ... 1 .... @3same
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_3SAME_PAIR(VPMIN_S, pmin_s)
 DO_3SAME_PAIR(VPMAX_U, pmax_u)
 DO_3SAME_PAIR(VPMIN_U, pmin_u)
 DO_3SAME_PAIR(VPADD, padd_u)
+
+#define DO_3SAME_VQDMULH(INSN, FUNC)                                    \
+    WRAP_ENV_FN(gen_##INSN##_tramp16, gen_helper_neon_##FUNC##_s16);    \
+    WRAP_ENV_FN(gen_##INSN##_tramp32, gen_helper_neon_##FUNC##_s32);    \
+    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
+                                uint32_t rn_ofs, uint32_t rm_ofs,       \
+                                uint32_t oprsz, uint32_t maxsz)         \
+    {                                                                   \
+        static const GVecGen3 ops[2] = {                                \
+            { .fni4 = gen_##INSN##_tramp16 },                           \
+            { .fni4 = gen_##INSN##_tramp32 },                           \
+        };                                                              \
+        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &ops[vece - 1]); \
+    }                                                                   \
+    static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a)        \
+    {                                                                   \
+        if (a->size != 1 && a->size != 2) {                             \
+            return false;                                               \
+        }                                                               \
+        return do_3same(s, a, gen_##INSN##_3s);                         \
+    }
+
+DO_3SAME_VQDMULH(VQDMULH, qdmulh)
+DO_3SAME_VQDMULH(VQRDMULH, qrdmulh)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
         case NEON_3R_VPMAX:
         case NEON_3R_VPMIN:
         case NEON_3R_VPADD_VQRDMLAH:
+        case NEON_3R_VQDMULH_VQRDMULH:
             /* Already handled by decodetree */
             return 1;
         }
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
             tmp2 = neon_load_reg(rm, pass);
         }
         switch (op) {
-        case NEON_3R_VQDMULH_VQRDMULH: /* Multiply high.  */
-            if (!u) { /* VQDMULH */
-                switch (size) {
-                case 1:
-                    gen_helper_neon_qdmulh_s16(tmp, cpu_env, tmp, tmp2);
-                    break;
-                case 2:
-                    gen_helper_neon_qdmulh_s32(tmp, cpu_env, tmp, tmp2);
-                    break;
-                default: abort();
-                }
-            } else { /* VQRDMULH */
-                switch (size) {
-                case 1:
-                    gen_helper_neon_qrdmulh_s16(tmp, cpu_env, tmp, tmp2);
-                    break;
-                case 2:
-                    gen_helper_neon_qrdmulh_s32(tmp, cpu_env, tmp, tmp2);
-                    break;
-                default: abort();
-                }
-            }
-            break;
         case NEON_3R_FLOAT_ARITH: /* Floating point arithmetic. */
         {
             TCGv_ptr fpstatus = get_fpstatus_ptr(1);
-- 
2.20.1

Convert the Neon VADD, VSUB, VABD 3-reg-same insns to decodetree.
We already have gvec helpers for addition and subtraction, but must
add one for fabd.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200512163904.10918-12-peter.maydell@linaro.org
---
 target/arm/helper.h             |  3 ++-
 target/arm/neon-dp.decode       |  8 ++++++++
 target/arm/neon_helper.c        |  7 -------
 target/arm/translate-neon.inc.c | 28 ++++++++++++++++++++++++++++
 target/arm/translate.c          | 10 +++-------
 target/arm/vec_helper.c         |  7 +++++++
 6 files changed, 48 insertions(+), 15 deletions(-)

Convert the Neon float VPMIN, VPMAX and VPADD 3-reg-same insns to
decodetree. These are the only remaining 'pairwise' operations,
so we can delete the pairwise-specific bits of the old decoder's
for-each-element loop now.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200512163904.10918-13-peter.maydell@linaro.org
---
 target/arm/neon-dp.decode       |  5 +++
 target/arm/translate-neon.inc.c | 63 +++++++++++++++++++++++++++++++++
 target/arm/translate.c          | 63 +++++----------------------------
 3 files changed, 76 insertions(+), 55 deletions(-)

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -XXX,XX +XXX,XX @@
 # For FP insns the high bit of 'size' is used as part of opcode decode
 @3same_fp        .... ... . . . . size:1 .... .... .... . q:1 . . .... \
                  &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp
+@3same_fp_q0     .... ... . . . . size:1 .... .... .... . 0 . . .... \
+                 &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp q=0
 
 VHADD_S_3s       1111 001 0 0 . .. .... .... 0000 . . . 0 .... @3same
 VHADD_U_3s       1111 001 1 0 . .. .... .... 0000 . . . 0 .... @3same
@@ -XXX,XX +XXX,XX @@ VQRDMLSH_3s      1111 001 1 0 . .. .... .... 1100 ... 1 .... @3same
 
 VADD_fp_3s       1111 001 0 0 . 0 . .... .... 1101 ... 0 .... @3same_fp
 VSUB_fp_3s       1111 001 0 0 . 1 . .... .... 1101 ... 0 .... @3same_fp
+VPADD_fp_3s      1111 001 1 0 . 0 . .... .... 1101 ... 0 .... @3same_fp_q0
 VABD_fp_3s       1111 001 1 0 . 1 . .... .... 1101 ... 0 .... @3same_fp
+VPMAX_fp_3s      1111 001 1 0 . 0 . .... .... 1111 ... 0 .... @3same_fp_q0
+VPMIN_fp_3s      1111 001 1 0 . 1 . .... .... 1111 ... 0 .... @3same_fp_q0
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_3SAME_VQDMULH(VQRDMULH, qrdmulh)
 DO_3S_FP_GVEC(VADD, gen_helper_gvec_fadd_s)
 DO_3S_FP_GVEC(VSUB, gen_helper_gvec_fsub_s)
 DO_3S_FP_GVEC(VABD, gen_helper_gvec_fabd_s)
+
+static bool do_3same_fp_pair(DisasContext *s, arg_3same *a, VFPGen3OpSPFn *fn)
+{
+    /* FP operations handled pairwise 32 bits at a time */
+    TCGv_i32 tmp, tmp2, tmp3;
+    TCGv_ptr fpstatus;
+
+    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vd | a->vn | a->vm) & 0x10)) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    assert(a->q == 0); /* enforced by decode patterns */
+
+    /*
+     * Note that we have to be careful not to clobber the source operands
+     * in the "vm == vd" case by storing the result of the first pass too
+     * early. Since Q is 0 there are always just two passes, so instead
+     * of a complicated loop over each pass we just unroll.
+     */
+    fpstatus = get_fpstatus_ptr(1);
+    tmp = neon_load_reg(a->vn, 0);
+    tmp2 = neon_load_reg(a->vn, 1);
+    fn(tmp, tmp, tmp2, fpstatus);
+    tcg_temp_free_i32(tmp2);
+
+    tmp3 = neon_load_reg(a->vm, 0);
+    tmp2 = neon_load_reg(a->vm, 1);
+    fn(tmp3, tmp3, tmp2, fpstatus);
+    tcg_temp_free_i32(tmp2);
+    tcg_temp_free_ptr(fpstatus);
+
+    neon_store_reg(a->vd, 0, tmp);
+    neon_store_reg(a->vd, 1, tmp3);
+    return true;
+}
+
+/*
+ * For all the functions using this macro, size == 1 means fp16,
+ * which is an architecture extension we don't implement yet.
+ */
+#define DO_3S_FP_PAIR(INSN,FUNC)                                    \
+    static bool trans_##INSN##_fp_3s(DisasContext *s, arg_3same *a) \
+    {                                                               \
+        if (a->size != 0) {                                         \
+            /* TODO fp16 support */                                 \
+            return false;                                           \
+        }                                                           \
+        return do_3same_fp_pair(s, a, FUNC);                        \
+    }
+
+DO_3S_FP_PAIR(VPADD, gen_helper_vfp_adds)
+DO_3S_FP_PAIR(VPMAX, gen_helper_vfp_maxs)
+DO_3S_FP_PAIR(VPMIN, gen_helper_vfp_mins)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
     int shift;
     int pass;
     int count;
-    int pairwise;
     int u;
     int vec_size;
     uint32_t imm;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
         case NEON_3R_VPMIN:
         case NEON_3R_VPADD_VQRDMLAH:
         case NEON_3R_VQDMULH_VQRDMULH:
+        case NEON_3R_FLOAT_ARITH:
             /* Already handled by decodetree */
             return 1;
         }
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
             /* 64-bit element instructions: handled by decodetree */
             return 1;
         }
-        pairwise = 0;
         switch (op) {
-        case NEON_3R_FLOAT_ARITH:
-            pairwise = (u && size < 2); /* if VPADD (float) */
-            if (!pairwise) {
-                return 1; /* handled by decodetree */
-            }
-            break;
         case NEON_3R_FLOAT_MINMAX:
-            pairwise = u; /* if VPMIN/VPMAX (float) */
+            if (u) {
+                return 1; /* VPMIN/VPMAX handled by decodetree */
+            }
             break;
         case NEON_3R_FLOAT_CMP:
             if (!u && size) {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
             break;
         }
 
-        if (pairwise && q) {
-            /* All the pairwise insns UNDEF if Q is set */
-            return 1;
-        }
-
         for (pass = 0; pass < (q ? 4 : 2); pass++) {
 
-        if (pairwise) {
-            /* Pairwise.  */
-            if (pass < 1) {
-                tmp = neon_load_reg(rn, 0);
-                tmp2 = neon_load_reg(rn, 1);
-            } else {
-                tmp = neon_load_reg(rm, 0);
-                tmp2 = neon_load_reg(rm, 1);
-            }
-        } else {
-            /* Elementwise.  */
-            tmp = neon_load_reg(rn, pass);
-            tmp2 = neon_load_reg(rm, pass);
-        }
+        /* Elementwise.  */
+        tmp = neon_load_reg(rn, pass);
+        tmp2 = neon_load_reg(rm, pass);
         switch (op) {
-        case NEON_3R_FLOAT_ARITH: /* Floating point arithmetic. */
-        {
-            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
-            switch ((u << 2) | size) {
-            case 4: /* VPADD */
-                gen_helper_vfp_adds(tmp, tmp, tmp2, fpstatus);
-                break;
-            default:
-                abort();
-            }
-            tcg_temp_free_ptr(fpstatus);
-            break;
-        }
         case NEON_3R_FLOAT_MULTIPLY:
         {
             TCGv_ptr fpstatus = get_fpstatus_ptr(1);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
         }
         tcg_temp_free_i32(tmp2);
 
-        /* Save the result.  For elementwise operations we can put it
-           straight into the destination register.  For pairwise operations
-           we have to be careful to avoid clobbering the source operands.  */
-        if (pairwise && rd == rm) {
-            neon_store_scratch(pass, tmp);
-        } else {
-            neon_store_reg(rd, pass, tmp);
-        }
+        neon_store_reg(rd, pass, tmp);
 
         } /* for pass */
-        if (pairwise && rd == rm) {
-            for (pass = 0; pass < (q ? 4 : 2); pass++) {
-                tmp = neon_load_scratch(pass);
-                neon_store_reg(rd, pass, tmp);
-            }
-        }
         /* End of 3 register same size operations.  */
     } else if (insn & (1 << 4)) {
         if ((insn & 0x00380080) != 0) {
-- 
2.20.1

Convert the Neon integer VMUL, VMLA, and VMLS 3-reg-same inssn to
decodetree.

We don't have a gvec helper for multiply-accumulate, so VMLA and VMLS
need a loop function do_3same_fp().  This takes a reads_vd parameter
to do_3same_fp() which tells it to load the old value into vd before
calling the callback function, in the same way that the do_vfp_3op_sp()
and do_vfp_3op_dp() functions in translate-vfp.inc.c work. (The
only uses in this patch pass reads_vd == true, but later commits
will use reads_vd == false.)

This conversion fixes in passing an underdecoding for VMUL
(originally reported by Fredrik Strupe <fredrik@strupe.net>): bit 1
of the 'size' field must be 0.  The old decoder didn't enforce this,
but the decodetree pattern does.

The gen_VMLA_fp_reg() function performs the addition operation
with the operands in the opposite order to the old decoder:
since Neon sets 'default NaN mode' float32_add operations are
commutative so there is no behaviour difference, but putting
them this way around matches the Arm ARM pseudocode and the
required operation order for the subtraction in gen_VMLS_fp_reg().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200512163904.10918-14-peter.maydell@linaro.org
---
 target/arm/neon-dp.decode       |  3 ++
 target/arm/translate-neon.inc.c | 81 +++++++++++++++++++++++++++++++++
 target/arm/translate.c          | 17 +------
 3 files changed, 85 insertions(+), 16 deletions(-)

Convert the Neon integer 3-reg-same compare insns VCGE, VCGT,
VCEQ, VACGE and VACGT to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200512163904.10918-15-peter.maydell@linaro.org
---
 target/arm/neon-dp.decode       |  5 +++++
 target/arm/translate-neon.inc.c |  6 +++++
 target/arm/translate.c          | 39 ++-------------------------------
 3 files changed, 13 insertions(+), 37 deletions(-)

The usual location for the env argument in the argument list of a TCG helper
is immediately after the return-value argument. recps_f32 and rsqrts_f32
differ in that they put it at the end.

Move the env argument to its usual place; this will allow us to
more easily use these helper functions with the gvec APIs.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200512163904.10918-16-peter.maydell@linaro.org
---
 target/arm/helper.h     | 4 ++--
 target/arm/translate.c  | 4 ++--
 target/arm/vfp_helper.c | 4 ++--
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(vfp_fcvt_f64_to_f16, TCG_CALL_NO_RWG, f16, f64, ptr, i32)
 DEF_HELPER_4(vfp_muladdd, f64, f64, f64, f64, ptr)
 DEF_HELPER_4(vfp_muladds, f32, f32, f32, f32, ptr)
 
-DEF_HELPER_3(recps_f32, f32, f32, f32, env)
-DEF_HELPER_3(rsqrts_f32, f32, f32, f32, env)
+DEF_HELPER_3(recps_f32, f32, env, f32, f32)
+DEF_HELPER_3(rsqrts_f32, f32, env, f32, f32)
 DEF_HELPER_FLAGS_2(recpe_f16, TCG_CALL_NO_RWG, f16, f16, ptr)
 DEF_HELPER_FLAGS_2(recpe_f32, TCG_CALL_NO_RWG, f32, f32, ptr)
 DEF_HELPER_FLAGS_2(recpe_f64, TCG_CALL_NO_RWG, f64, f64, ptr)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                 tcg_temp_free_ptr(fpstatus);
             } else {
                 if (size == 0) {
-                    gen_helper_recps_f32(tmp, tmp, tmp2, cpu_env);
+                    gen_helper_recps_f32(tmp, cpu_env, tmp, tmp2);
                 } else {
-                    gen_helper_rsqrts_f32(tmp, tmp, tmp2, cpu_env);
+                    gen_helper_rsqrts_f32(tmp, cpu_env, tmp, tmp2);
               }
             }
             break;
diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vfp_helper.c
+++ b/target/arm/vfp_helper.c
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(vfp_fcvt_f64_to_f16)(float64 a, void *fpstp, uint32_t ahp_mode)
 #define float32_three make_float32(0x40400000)
 #define float32_one_point_five make_float32(0x3fc00000)
 
-float32 HELPER(recps_f32)(float32 a, float32 b, CPUARMState *env)
+float32 HELPER(recps_f32)(CPUARMState *env, float32 a, float32 b)
 {
     float_status *s = &env->vfp.standard_fp_status;
     if ((float32_is_infinity(a) && float32_is_zero_or_denormal(b)) ||
@@ -XXX,XX +XXX,XX @@ float32 HELPER(recps_f32)(float32 a, float32 b, CPUARMState *env)
     return float32_sub(float32_two, float32_mul(a, b, s), s);
 }
 
-float32 HELPER(rsqrts_f32)(float32 a, float32 b, CPUARMState *env)
+float32 HELPER(rsqrts_f32)(CPUARMState *env, float32 a, float32 b)
 {
     float_status *s = &env->vfp.standard_fp_status;
     float32 product;
-- 
2.20.1

Convert the Neon fp VMAX/VMIN/VMAXNM/VMINNM/VRECPS/VRSQRTS 3-reg-same
insns to decodetree. (These are all the remaining non-accumulation
instructions in this group.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200512163904.10918-17-peter.maydell@linaro.org
---
 target/arm/neon-dp.decode       |  6 +++
 target/arm/translate-neon.inc.c | 70 +++++++++++++++++++++++++++++++++
 target/arm/translate.c          | 42 +-------------------
 3 files changed, 78 insertions(+), 40 deletions(-)

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -XXX,XX +XXX,XX @@ VCGE_fp_3s       1111 001 1 0 . 0 . .... .... 1110 ... 0 .... @3same_fp
 VACGE_fp_3s      1111 001 1 0 . 0 . .... .... 1110 ... 1 .... @3same_fp
 VCGT_fp_3s       1111 001 1 0 . 1 . .... .... 1110 ... 0 .... @3same_fp
 VACGT_fp_3s      1111 001 1 0 . 1 . .... .... 1110 ... 1 .... @3same_fp
+VMAX_fp_3s       1111 001 0 0 . 0 . .... .... 1111 ... 0 .... @3same_fp
+VMIN_fp_3s       1111 001 0 0 . 1 . .... .... 1111 ... 0 .... @3same_fp
 VPMAX_fp_3s      1111 001 1 0 . 0 . .... .... 1111 ... 0 .... @3same_fp_q0
 VPMIN_fp_3s      1111 001 1 0 . 1 . .... .... 1111 ... 0 .... @3same_fp_q0
+VRECPS_fp_3s     1111 001 0 0 . 0 . .... .... 1111 ... 1 .... @3same_fp
+VRSQRTS_fp_3s    1111 001 0 0 . 1 . .... .... 1111 ... 1 .... @3same_fp
+VMAXNM_fp_3s     1111 001 1 0 . 0 . .... .... 1111 ... 1 .... @3same_fp
+VMINNM_fp_3s     1111 001 1 0 . 1 . .... .... 1111 ... 1 .... @3same_fp
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_3S_FP(VCGE, gen_helper_neon_cge_f32, false)
 DO_3S_FP(VCGT, gen_helper_neon_cgt_f32, false)
 DO_3S_FP(VACGE, gen_helper_neon_acge_f32, false)
 DO_3S_FP(VACGT, gen_helper_neon_acgt_f32, false)
+DO_3S_FP(VMAX, gen_helper_vfp_maxs, false)
+DO_3S_FP(VMIN, gen_helper_vfp_mins, false)
 
 static void gen_VMLA_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
                             TCGv_ptr fpstatus)
@@ -XXX,XX +XXX,XX @@ static void gen_VMLS_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
 DO_3S_FP(VMLA, gen_VMLA_fp_3s, true)
 DO_3S_FP(VMLS, gen_VMLS_fp_3s, true)
 
+static bool trans_VMAXNM_fp_3s(DisasContext *s, arg_3same *a)
+{
+    if (!arm_dc_feature(s, ARM_FEATURE_V8)) {
+        return false;
+    }
+
+    if (a->size != 0) {
+        /* TODO fp16 support */
+        return false;
+    }
+
+    return do_3same_fp(s, a, gen_helper_vfp_maxnums, false);
+}
+
+static bool trans_VMINNM_fp_3s(DisasContext *s, arg_3same *a)
+{
+    if (!arm_dc_feature(s, ARM_FEATURE_V8)) {
+        return false;
+    }
+
+    if (a->size != 0) {
+        /* TODO fp16 support */
+        return false;
+    }
+
+    return do_3same_fp(s, a, gen_helper_vfp_minnums, false);
+}
+
+WRAP_ENV_FN(gen_VRECPS_tramp, gen_helper_recps_f32)
+
+static void gen_VRECPS_fp_3s(unsigned vece, uint32_t rd_ofs,
+                             uint32_t rn_ofs, uint32_t rm_ofs,
+                             uint32_t oprsz, uint32_t maxsz)
+{
+    static const GVecGen3 ops = { .fni4 = gen_VRECPS_tramp };
+    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &ops);
+}
+
+static bool trans_VRECPS_fp_3s(DisasContext *s, arg_3same *a)
+{
+    if (a->size != 0) {
+        /* TODO fp16 support */
+        return false;
+    }
+
+    return do_3same(s, a, gen_VRECPS_fp_3s);
+}
+
+WRAP_ENV_FN(gen_VRSQRTS_tramp, gen_helper_rsqrts_f32)
+
+static void gen_VRSQRTS_fp_3s(unsigned vece, uint32_t rd_ofs,
+                              uint32_t rn_ofs, uint32_t rm_ofs,
+                              uint32_t oprsz, uint32_t maxsz)
+{
+    static const GVecGen3 ops = { .fni4 = gen_VRSQRTS_tramp };
+    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &ops);
+}
+
+static bool trans_VRSQRTS_fp_3s(DisasContext *s, arg_3same *a)
+{
+    if (a->size != 0) {
+        /* TODO fp16 support */
+        return false;
+    }
+
+    return do_3same(s, a, gen_VRSQRTS_fp_3s);
+}
+
 static bool do_3same_fp_pair(DisasContext *s, arg_3same *a, VFPGen3OpSPFn *fn)
 {
     /* FP operations handled pairwise 32 bits at a time */
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
         case NEON_3R_FLOAT_MULTIPLY:
         case NEON_3R_FLOAT_CMP:
         case NEON_3R_FLOAT_ACMP:
+        case NEON_3R_FLOAT_MINMAX:
+        case NEON_3R_FLOAT_MISC:
             /* Already handled by decodetree */
             return 1;
         }
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
             return 1;
         }
         switch (op) {
-        case NEON_3R_FLOAT_MINMAX:
-            if (u) {
-                return 1; /* VPMIN/VPMAX handled by decodetree */
-            }
-            break;
-        case NEON_3R_FLOAT_MISC:
-            /* VMAXNM/VMINNM in ARMv8 */
-            if (u && !arm_dc_feature(s, ARM_FEATURE_V8)) {
-                return 1;
-            }
-            break;
         case NEON_3R_VFM_VQRDMLSH:
             if (!dc_isar_feature(aa32_simdfmac, s)) {
                 return 1;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
         tmp = neon_load_reg(rn, pass);
         tmp2 = neon_load_reg(rm, pass);
         switch (op) {
-        case NEON_3R_FLOAT_MINMAX:
-        {
-            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
-            if (size == 0) {
-                gen_helper_vfp_maxs(tmp, tmp, tmp2, fpstatus);
-            } else {
-                gen_helper_vfp_mins(tmp, tmp, tmp2, fpstatus);
-            }
-            tcg_temp_free_ptr(fpstatus);
-            break;
-        }
-        case NEON_3R_FLOAT_MISC:
-            if (u) {
-                /* VMAXNM/VMINNM */
-                TCGv_ptr fpstatus = get_fpstatus_ptr(1);
-                if (size == 0) {
-                    gen_helper_vfp_maxnums(tmp, tmp, tmp2, fpstatus);
-                } else {
-                    gen_helper_vfp_minnums(tmp, tmp, tmp2, fpstatus);
-                }
-                tcg_temp_free_ptr(fpstatus);
-            } else {
-                if (size == 0) {
-                    gen_helper_recps_f32(tmp, cpu_env, tmp, tmp2);
-                } else {
-                    gen_helper_rsqrts_f32(tmp, cpu_env, tmp, tmp2);
-              }
-            }
-            break;
         case NEON_3R_VFM_VQRDMLSH:
         {
             /* VFMA, VFMS: fused multiply-add */
-- 
2.20.1

Convert the Neon floating point VFMA and VFMS insn to decodetree.
These are the last insns in the 3-reg-same group so we can
remove all the support/loop code from the old decoder.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200512163904.10918-18-peter.maydell@linaro.org
---
 target/arm/neon-dp.decode       |   3 +
 target/arm/translate-neon.inc.c |  41 ++++++++
 target/arm/translate.c          | 176 +-------------------------------
 3 files changed, 46 insertions(+), 174 deletions(-)

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -XXX,XX +XXX,XX @@ SHA256H2_3s      1111 001 1 0 . 01 .... .... 1100 . 1 . 0 .... \
 SHA256SU1_3s     1111 001 1 0 . 10 .... .... 1100 . 1 . 0 .... \
                  vm=%vm_dp vn=%vn_dp vd=%vd_dp
 
+VFMA_fp_3s       1111 001 0 0 . 0 . .... .... 1100 ... 1 .... @3same_fp
+VFMS_fp_3s       1111 001 0 0 . 1 . .... .... 1100 ... 1 .... @3same_fp
+
 VQRDMLSH_3s      1111 001 1 0 . .. .... .... 1100 ... 1 .... @3same
 
 VADD_fp_3s       1111 001 0 0 . 0 . .... .... 1101 ... 0 .... @3same_fp
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VRSQRTS_fp_3s(DisasContext *s, arg_3same *a)
     return do_3same(s, a, gen_VRSQRTS_fp_3s);
 }
 
+static void gen_VFMA_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
+                            TCGv_ptr fpstatus)
+{
+    gen_helper_vfp_muladds(vd, vn, vm, vd, fpstatus);
+}
+
+static bool trans_VFMA_fp_3s(DisasContext *s, arg_3same *a)
+{
+    if (!dc_isar_feature(aa32_simdfmac, s)) {
+        return false;
+    }
+
+    if (a->size != 0) {
+        /* TODO fp16 support */
+        return false;
+    }
+
+    return do_3same_fp(s, a, gen_VFMA_fp_3s, true);
+}
+
+static void gen_VFMS_fp_3s(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm,
+                            TCGv_ptr fpstatus)
+{
+    gen_helper_vfp_negs(vn, vn);
+    gen_helper_vfp_muladds(vd, vn, vm, vd, fpstatus);
+}
+
+static bool trans_VFMS_fp_3s(DisasContext *s, arg_3same *a)
+{
+    if (!dc_isar_feature(aa32_simdfmac, s)) {
+        return false;
+    }
+
+    if (a->size != 0) {
+        /* TODO fp16 support */
+        return false;
+    }
+
+    return do_3same_fp(s, a, gen_VFMS_fp_3s, true);
+}
+
 static bool do_3same_fp_pair(DisasContext *s, arg_3same *a, VFPGen3OpSPFn *fn)
 {
     /* FP operations handled pairwise 32 bits at a time */
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_neon_narrow_op(int op, int u, int size,
     }
 }
 
-/* Symbolic constants for op fields for Neon 3-register same-length.
- * The values correspond to bits [11:8,4]; see the ARM ARM DDI0406B
- * table A7-9.
- */
-#define NEON_3R_VHADD 0
-#define NEON_3R_VQADD 1
-#define NEON_3R_VRHADD 2
-#define NEON_3R_LOGIC 3 /* VAND,VBIC,VORR,VMOV,VORN,VEOR,VBIF,VBIT,VBSL */
-#define NEON_3R_VHSUB 4
-#define NEON_3R_VQSUB 5
-#define NEON_3R_VCGT 6
-#define NEON_3R_VCGE 7
-#define NEON_3R_VSHL 8
-#define NEON_3R_VQSHL 9
-#define NEON_3R_VRSHL 10
-#define NEON_3R_VQRSHL 11
-#define NEON_3R_VMAX 12
-#define NEON_3R_VMIN 13
-#define NEON_3R_VABD 14
-#define NEON_3R_VABA 15
-#define NEON_3R_VADD_VSUB 16
-#define NEON_3R_VTST_VCEQ 17
-#define NEON_3R_VML 18 /* VMLA, VMLS */
-#define NEON_3R_VMUL 19
-#define NEON_3R_VPMAX 20
-#define NEON_3R_VPMIN 21
-#define NEON_3R_VQDMULH_VQRDMULH 22
-#define NEON_3R_VPADD_VQRDMLAH 23
-#define NEON_3R_SHA 24 /* SHA1C,SHA1P,SHA1M,SHA1SU0,SHA256H{2},SHA256SU1 */
-#define NEON_3R_VFM_VQRDMLSH 25 /* VFMA, VFMS, VQRDMLSH */
-#define NEON_3R_FLOAT_ARITH 26 /* float VADD, VSUB, VPADD, VABD */
-#define NEON_3R_FLOAT_MULTIPLY 27 /* float VMLA, VMLS, VMUL */
-#define NEON_3R_FLOAT_CMP 28 /* float VCEQ, VCGE, VCGT */
-#define NEON_3R_FLOAT_ACMP 29 /* float VACGE, VACGT, VACLE, VACLT */
-#define NEON_3R_FLOAT_MINMAX 30 /* float VMIN, VMAX */
-#define NEON_3R_FLOAT_MISC 31 /* float VRECPS, VRSQRTS, VMAXNM/MINNM */
-
-static const uint8_t neon_3r_sizes[] = {
-    [NEON_3R_VHADD] = 0x7,
-    [NEON_3R_VQADD] = 0xf,
-    [NEON_3R_VRHADD] = 0x7,
-    [NEON_3R_LOGIC] = 0xf, /* size field encodes op type */
-    [NEON_3R_VHSUB] = 0x7,
-    [NEON_3R_VQSUB] = 0xf,
-    [NEON_3R_VCGT] = 0x7,
-    [NEON_3R_VCGE] = 0x7,
-    [NEON_3R_VSHL] = 0xf,
-    [NEON_3R_VQSHL] = 0xf,
-    [NEON_3R_VRSHL] = 0xf,
-    [NEON_3R_VQRSHL] = 0xf,
-    [NEON_3R_VMAX] = 0x7,
-    [NEON_3R_VMIN] = 0x7,
-    [NEON_3R_VABD] = 0x7,
-    [NEON_3R_VABA] = 0x7,
-    [NEON_3R_VADD_VSUB] = 0xf,
-    [NEON_3R_VTST_VCEQ] = 0x7,
-    [NEON_3R_VML] = 0x7,
-    [NEON_3R_VMUL] = 0x7,
-    [NEON_3R_VPMAX] = 0x7,
-    [NEON_3R_VPMIN] = 0x7,
-    [NEON_3R_VQDMULH_VQRDMULH] = 0x6,
-    [NEON_3R_VPADD_VQRDMLAH] = 0x7,
-    [NEON_3R_SHA] = 0xf, /* size field encodes op type */
-    [NEON_3R_VFM_VQRDMLSH] = 0x7, /* For VFM, size bit 1 encodes op */
-    [NEON_3R_FLOAT_ARITH] = 0x5, /* size bit 1 encodes op */
-    [NEON_3R_FLOAT_MULTIPLY] = 0x5, /* size bit 1 encodes op */
-    [NEON_3R_FLOAT_CMP] = 0x5, /* size bit 1 encodes op */
-    [NEON_3R_FLOAT_ACMP] = 0x5, /* size bit 1 encodes op */
-    [NEON_3R_FLOAT_MINMAX] = 0x5, /* size bit 1 encodes op */
-    [NEON_3R_FLOAT_MISC] = 0x5, /* size bit 1 encodes op */
-};
-
 /* Symbolic constants for op fields for Neon 2-register miscellaneous.
  * The values correspond to bits [17:16,10:7]; see the ARM ARM DDI0406B
  * table A7-13.
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
     rm_ofs = neon_reg_offset(rm, 0);
 
     if ((insn & (1 << 23)) == 0) {
-        /* Three register same length.  */
-        op = ((insn >> 7) & 0x1e) | ((insn >> 4) & 1);
-        /* Catch invalid op and bad size combinations: UNDEF */
-        if ((neon_3r_sizes[op] & (1 << size)) == 0) {
-            return 1;
-        }
-        /* All insns of this form UNDEF for either this condition or the
-         * superset of cases "Q==1"; we catch the latter later.
-         */
-        if (q && ((rd | rn | rm) & 1)) {
-            return 1;
-        }
-        switch (op) {
-        case NEON_3R_VFM_VQRDMLSH:
-            if (!u) {
-                /* VFM, VFMS */
-                if (size == 1) {
-                    return 1;
-                }
-                break;
-            }
-            /* VQRDMLSH : handled by decodetree */
-            return 1;
-
-        case NEON_3R_VADD_VSUB:
-        case NEON_3R_LOGIC:
-        case NEON_3R_VMAX:
-        case NEON_3R_VMIN:
-        case NEON_3R_VTST_VCEQ:
-        case NEON_3R_VCGT:
-        case NEON_3R_VCGE:
-        case NEON_3R_VQADD:
-        case NEON_3R_VQSUB:
-        case NEON_3R_VMUL:
-        case NEON_3R_VML:
-        case NEON_3R_VSHL:
-        case NEON_3R_SHA:
-        case NEON_3R_VHADD:
-        case NEON_3R_VRHADD:
-        case NEON_3R_VHSUB:
-        case NEON_3R_VABD:
-        case NEON_3R_VABA:
-        case NEON_3R_VQSHL:
-        case NEON_3R_VRSHL:
-        case NEON_3R_VQRSHL:
-        case NEON_3R_VPMAX:
-        case NEON_3R_VPMIN:
-        case NEON_3R_VPADD_VQRDMLAH:
-        case NEON_3R_VQDMULH_VQRDMULH:
-        case NEON_3R_FLOAT_ARITH:
-        case NEON_3R_FLOAT_MULTIPLY:
-        case NEON_3R_FLOAT_CMP:
-        case NEON_3R_FLOAT_ACMP:
-        case NEON_3R_FLOAT_MINMAX:
-        case NEON_3R_FLOAT_MISC:
-            /* Already handled by decodetree */
-            return 1;
-        }
-
-        if (size == 3) {
-            /* 64-bit element instructions: handled by decodetree */
-            return 1;
-        }
-        switch (op) {
-        case NEON_3R_VFM_VQRDMLSH:
-            if (!dc_isar_feature(aa32_simdfmac, s)) {
-                return 1;
-            }
-            break;
-        default:
-            break;
-        }
-
-        for (pass = 0; pass < (q ? 4 : 2); pass++) {
-
-        /* Elementwise.  */
-        tmp = neon_load_reg(rn, pass);
-        tmp2 = neon_load_reg(rm, pass);
-        switch (op) {
-        case NEON_3R_VFM_VQRDMLSH:
-        {
-            /* VFMA, VFMS: fused multiply-add */
-            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
-            TCGv_i32 tmp3 = neon_load_reg(rd, pass);
-            if (size) {
-                /* VFMS */
-                gen_helper_vfp_negs(tmp, tmp);
-            }
-            gen_helper_vfp_muladds(tmp, tmp, tmp2, tmp3, fpstatus);
-            tcg_temp_free_i32(tmp3);
-            tcg_temp_free_ptr(fpstatus);
-            break;
-        }
-        default:
-            abort();
-        }
-        tcg_temp_free_i32(tmp2);
-
-        neon_store_reg(rd, pass, tmp);
-
-        } /* for pass */
-        /* End of 3 register same size operations.  */
+        /* Three register same length: handled by decodetree */
+        return 1;
     } else if (insn & (1 << 4)) {
         if ((insn & 0x00380080) != 0) {
             /* Two registers and shift.  */
-- 
2.20.1

Two small bugfixes, plus most of RTH's refactoring of cpregs
handling.

-- PMM

The following changes since commit 1fba9dc71a170b3a05b9d3272dd8ecfe7f26e215:

Merge tag 'pull-request-2022-05-04' of https://gitlab.com/thuth/qemu into staging (2022-05-04 08:07:02 -0700)

are available in the Git repository at:

https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20220505

for you to fetch changes up to 99a50d1a67c602126fc2b3a4812d3000eba9bf34:

target/arm: read access to performance counters from EL0 (2022-05-05 09:36:22 +0100)

----------------------------------------------------------------
target-arm queue:
 * Enable read access to performance counters from EL0
 * Enable SCTLR_EL1.BT0 for aarch64-linux-user
 * Refactoring of cpreg handling

----------------------------------------------------------------
Alex Zuepke (1):
      target/arm: read access to performance counters from EL0

Richard Henderson (22):
      target/arm: Enable SCTLR_EL1.BT0 for aarch64-linux-user
      target/arm: Split out cpregs.h
      target/arm: Reorg CPAccessResult and access_check_cp_reg
      target/arm: Replace sentinels with ARRAY_SIZE in cpregs.h
      target/arm: Make some more cpreg data static const
      target/arm: Reorg ARMCPRegInfo type field bits
      target/arm: Avoid bare abort() or assert(0)
      target/arm: Change cpreg access permissions to enum
      target/arm: Name CPState type
      target/arm: Name CPSecureState type
      target/arm: Drop always-true test in define_arm_vh_e2h_redirects_aliases
      target/arm: Store cpregs key in the hash table directly
      target/arm: Merge allocation of the cpreg and its name
      target/arm: Hoist computation of key in add_cpreg_to_hashtable
      target/arm: Consolidate cpreg updates in add_cpreg_to_hashtable
      target/arm: Use bool for is64 and ns in add_cpreg_to_hashtable
      target/arm: Hoist isbanked computation in add_cpreg_to_hashtable
      target/arm: Perform override check early in add_cpreg_to_hashtable
      target/arm: Reformat comments in add_cpreg_to_hashtable
      target/arm: Remove HOST_BIG_ENDIAN ifdef in add_cpreg_to_hashtable
      target/arm: Add isar predicates for FEAT_Debugv8p2
      target/arm: Add isar_feature_{aa64,any}_ras

From: Richard Henderson <richard.henderson@linaro.org>

This controls whether the PACI{A,B}SP instructions trap with BTYPE=3
(indirect branch from register other than x16/x17).  The linux kernel
sets this in bti_enable().

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/998
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20220427042312.294300-1-richard.henderson@linaro.org
[PMM: remove stray change to makefile comment]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.c                  |  2 ++
 tests/tcg/aarch64/bti-3.c         | 42 +++++++++++++++++++++++++++++++
 tests/tcg/aarch64/Makefile.target |  6 ++---
 3 files changed, 47 insertions(+), 3 deletions(-)
 create mode 100644 tests/tcg/aarch64/bti-3.c

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_reset(DeviceState *dev)
         /* Enable all PAC keys.  */
         env->cp15.sctlr_el[1] |= (SCTLR_EnIA | SCTLR_EnIB |
                                   SCTLR_EnDA | SCTLR_EnDB);
+        /* Trap on btype=3 for PACIxSP. */
+        env->cp15.sctlr_el[1] |= SCTLR_BT0;
         /* and to the FP/Neon instructions */
         env->cp15.cpacr_el1 = deposit64(env->cp15.cpacr_el1, 20, 2, 3);
         /* and to the SVE instructions */
diff --git a/tests/tcg/aarch64/bti-3.c b/tests/tcg/aarch64/bti-3.c
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/tests/tcg/aarch64/bti-3.c
@@ -XXX,XX +XXX,XX @@
+/*
+ * BTI vs PACIASP
+ */
+
+#include "bti-crt.inc.c"
+
+static void skip2_sigill(int sig, siginfo_t *info, ucontext_t *uc)
+{
+    uc->uc_mcontext.pc += 8;
+    uc->uc_mcontext.pstate = 1;
+}
+
+#define BTYPE_1() \
+    asm("mov %0,#1; adr x16, 1f; br x16; 1: hint #25; mov %0,#0" \
+        : "=r"(skipped) : : "x16", "x30")
+
+#define BTYPE_2() \
+    asm("mov %0,#1; adr x16, 1f; blr x16; 1: hint #25; mov %0,#0" \
+        : "=r"(skipped) : : "x16", "x30")
+
+#define BTYPE_3() \
+    asm("mov %0,#1; adr x15, 1f; br x15; 1: hint #25; mov %0,#0" \
+        : "=r"(skipped) : : "x15", "x30")
+
+#define TEST(WHICH, EXPECT) \
+    do { WHICH(); fail += skipped ^ EXPECT; } while (0)
+
+int main()
+{
+    int fail = 0;
+    int skipped;
+
+    /* Signal-like with SA_SIGINFO.  */
+    signal_info(SIGILL, skip2_sigill);
+
+    /* With SCTLR_EL1.BT0 set, PACIASP is not compatible with type=3. */
+    TEST(BTYPE_1, 0);
+    TEST(BTYPE_2, 0);
+    TEST(BTYPE_3, 1);
+
+    return fail;
+}
diff --git a/tests/tcg/aarch64/Makefile.target b/tests/tcg/aarch64/Makefile.target
index XXXXXXX..XXXXXXX 100644
--- a/tests/tcg/aarch64/Makefile.target
+++ b/tests/tcg/aarch64/Makefile.target
@@ -XXX,XX +XXX,XX @@ endif
 # BTI Tests
 # bti-1 tests the elf notes, so we require special compiler support.
 ifneq ($(CROSS_CC_HAS_ARMV8_BTI),)
-AARCH64_TESTS += bti-1
-bti-1: CFLAGS += -mbranch-protection=standard
-bti-1: LDFLAGS += -nostdlib
+AARCH64_TESTS += bti-1 bti-3
+bti-1 bti-3: CFLAGS += -mbranch-protection=standard
+bti-1 bti-3: LDFLAGS += -nostdlib
 endif
 # bti-2 tests PROT_BTI, so no special compiler support required.
 AARCH64_TESTS += bti-2
-- 
2.25.1

From: Richard Henderson <richard.henderson@linaro.org>

Move ARMCPRegInfo and all related declarations to a new
internal header, out of the public cpu.h.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220501055028.646596-2-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpregs.h        | 413 +++++++++++++++++++++++++++++++++++++
 target/arm/cpu.h           | 368 ---------------------------------
 hw/arm/pxa2xx.c            |   1 +
 hw/arm/pxa2xx_pic.c        |   1 +
 hw/intc/arm_gicv3_cpuif.c  |   1 +
 hw/intc/arm_gicv3_kvm.c    |   2 +
 target/arm/cpu.c           |   1 +
 target/arm/cpu64.c         |   1 +
 target/arm/cpu_tcg.c       |   1 +
 target/arm/gdbstub.c       |   3 +-
 target/arm/helper.c        |   1 +
 target/arm/op_helper.c     |   1 +
 target/arm/translate-a64.c |   4 +-
 target/arm/translate.c     |   3 +-
 14 files changed, 427 insertions(+), 374 deletions(-)
 create mode 100644 target/arm/cpregs.h

diff --git a/target/arm/cpregs.h b/target/arm/cpregs.h
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/target/arm/cpregs.h
@@ -XXX,XX +XXX,XX @@
+/*
+ * QEMU ARM CP Register access and descriptions
+ *
+ * Copyright (c) 2022 Linaro Ltd
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see
+ * <http://www.gnu.org/licenses/gpl-2.0.html>
+ */
+
+#ifndef TARGET_ARM_CPREGS_H
+#define TARGET_ARM_CPREGS_H
+
+/*
+ * ARMCPRegInfo type field bits. If the SPECIAL bit is set this is a
+ * special-behaviour cp reg and bits [11..8] indicate what behaviour
+ * it has. Otherwise it is a simple cp reg, where CONST indicates that
+ * TCG can assume the value to be constant (ie load at translate time)
+ * and 64BIT indicates a 64 bit wide coprocessor register. SUPPRESS_TB_END
+ * indicates that the TB should not be ended after a write to this register
+ * (the default is that the TB ends after cp writes). OVERRIDE permits
+ * a register definition to override a previous definition for the
+ * same (cp, is64, crn, crm, opc1, opc2) tuple: either the new or the
+ * old must have the OVERRIDE bit set.
+ * ALIAS indicates that this register is an alias view of some underlying
+ * state which is also visible via another register, and that the other
+ * register is handling migration and reset; registers marked ALIAS will not be
+ * migrated but may have their state set by syncing of register state from KVM.
+ * NO_RAW indicates that this register has no underlying state and does not
+ * support raw access for state saving/loading; it will not be used for either
+ * migration or KVM state synchronization. (Typically this is for "registers"
+ * which are actually used as instructions for cache maintenance and so on.)
+ * IO indicates that this register does I/O and therefore its accesses
+ * need to be marked with gen_io_start() and also end the TB. In particular,
+ * registers which implement clocks or timers require this.
+ * RAISES_EXC is for when the read or write hook might raise an exception;
+ * the generated code will synchronize the CPU state before calling the hook
+ * so that it is safe for the hook to call raise_exception().
+ * NEWEL is for writes to registers that might change the exception
+ * level - typically on older ARM chips. For those cases we need to
+ * re-read the new el when recomputing the translation flags.
+ */
+#define ARM_CP_SPECIAL           0x0001
+#define ARM_CP_CONST             0x0002
+#define ARM_CP_64BIT             0x0004
+#define ARM_CP_SUPPRESS_TB_END   0x0008
+#define ARM_CP_OVERRIDE          0x0010
+#define ARM_CP_ALIAS             0x0020
+#define ARM_CP_IO                0x0040
+#define ARM_CP_NO_RAW            0x0080
+#define ARM_CP_NOP               (ARM_CP_SPECIAL | 0x0100)
+#define ARM_CP_WFI               (ARM_CP_SPECIAL | 0x0200)
+#define ARM_CP_NZCV              (ARM_CP_SPECIAL | 0x0300)
+#define ARM_CP_CURRENTEL         (ARM_CP_SPECIAL | 0x0400)
+#define ARM_CP_DC_ZVA            (ARM_CP_SPECIAL | 0x0500)
+#define ARM_CP_DC_GVA            (ARM_CP_SPECIAL | 0x0600)
+#define ARM_CP_DC_GZVA           (ARM_CP_SPECIAL | 0x0700)
+#define ARM_LAST_SPECIAL         ARM_CP_DC_GZVA
+#define ARM_CP_FPU               0x1000
+#define ARM_CP_SVE               0x2000
+#define ARM_CP_NO_GDB            0x4000
+#define ARM_CP_RAISES_EXC        0x8000
+#define ARM_CP_NEWEL             0x10000
+/* Used only as a terminator for ARMCPRegInfo lists */
+#define ARM_CP_SENTINEL          0xfffff
+/* Mask of only the flag bits in a type field */
+#define ARM_CP_FLAG_MASK         0x1f0ff
+
+/*
+ * Valid values for ARMCPRegInfo state field, indicating which of
+ * the AArch32 and AArch64 execution states this register is visible in.
+ * If the reginfo doesn't explicitly specify then it is AArch32 only.
+ * If the reginfo is declared to be visible in both states then a second
+ * reginfo is synthesised for the AArch32 view of the AArch64 register,
+ * such that the AArch32 view is the lower 32 bits of the AArch64 one.
+ * Note that we rely on the values of these enums as we iterate through
+ * the various states in some places.
+ */
+enum {
+    ARM_CP_STATE_AA32 = 0,
+    ARM_CP_STATE_AA64 = 1,
+    ARM_CP_STATE_BOTH = 2,
+};
+
+/*
+ * ARM CP register secure state flags.  These flags identify security state
+ * attributes for a given CP register entry.
+ * The existence of both or neither secure and non-secure flags indicates that
+ * the register has both a secure and non-secure hash entry.  A single one of
+ * these flags causes the register to only be hashed for the specified
+ * security state.
+ * Although definitions may have any combination of the S/NS bits, each
+ * registered entry will only have one to identify whether the entry is secure
+ * or non-secure.
+ */
+enum {
+    ARM_CP_SECSTATE_S =   (1 << 0), /* bit[0]: Secure state register */
+    ARM_CP_SECSTATE_NS =  (1 << 1), /* bit[1]: Non-secure state register */
+};
+
+/*
+ * Return true if cptype is a valid type field. This is used to try to
+ * catch errors where the sentinel has been accidentally left off the end
+ * of a list of registers.
+ */
+static inline bool cptype_valid(int cptype)
+{
+    return ((cptype & ~ARM_CP_FLAG_MASK) == 0)
+        || ((cptype & ARM_CP_SPECIAL) &&
+            ((cptype & ~ARM_CP_FLAG_MASK) <= ARM_LAST_SPECIAL));
+}
+
+/*
+ * Access rights:
+ * We define bits for Read and Write access for what rev C of the v7-AR ARM ARM
+ * defines as PL0 (user), PL1 (fiq/irq/svc/abt/und/sys, ie privileged), and
+ * PL2 (hyp). The other level which has Read and Write bits is Secure PL1
+ * (ie any of the privileged modes in Secure state, or Monitor mode).
+ * If a register is accessible in one privilege level it's always accessible
+ * in higher privilege levels too. Since "Secure PL1" also follows this rule
+ * (ie anything visible in PL2 is visible in S-PL1, some things are only
+ * visible in S-PL1) but "Secure PL1" is a bit of a mouthful, we bend the
+ * terminology a little and call this PL3.
+ * In AArch64 things are somewhat simpler as the PLx bits line up exactly
+ * with the ELx exception levels.
+ *
+ * If access permissions for a register are more complex than can be
+ * described with these bits, then use a laxer set of restrictions, and
+ * do the more restrictive/complex check inside a helper function.
+ */
+#define PL3_R 0x80
+#define PL3_W 0x40
+#define PL2_R (0x20 | PL3_R)
+#define PL2_W (0x10 | PL3_W)
+#define PL1_R (0x08 | PL2_R)
+#define PL1_W (0x04 | PL2_W)
+#define PL0_R (0x02 | PL1_R)
+#define PL0_W (0x01 | PL1_W)
+
+/*
+ * For user-mode some registers are accessible to EL0 via a kernel
+ * trap-and-emulate ABI. In this case we define the read permissions
+ * as actually being PL0_R. However some bits of any given register
+ * may still be masked.
+ */
+#ifdef CONFIG_USER_ONLY
+#define PL0U_R PL0_R
+#else
+#define PL0U_R PL1_R
+#endif
+
+#define PL3_RW (PL3_R | PL3_W)
+#define PL2_RW (PL2_R | PL2_W)
+#define PL1_RW (PL1_R | PL1_W)
+#define PL0_RW (PL0_R | PL0_W)
+
+typedef enum CPAccessResult {
+    /* Access is permitted */
+    CP_ACCESS_OK = 0,
+    /*
+     * Access fails due to a configurable trap or enable which would
+     * result in a categorized exception syndrome giving information about
+     * the failing instruction (ie syndrome category 0x3, 0x4, 0x5, 0x6,
+     * 0xc or 0x18). The exception is taken to the usual target EL (EL1 or
+     * PL1 if in EL0, otherwise to the current EL).
+     */
+    CP_ACCESS_TRAP = 1,
+    /*
+     * Access fails and results in an exception syndrome 0x0 ("uncategorized").
+     * Note that this is not a catch-all case -- the set of cases which may
+     * result in this failure is specifically defined by the architecture.
+     */
+    CP_ACCESS_TRAP_UNCATEGORIZED = 2,
+    /* As CP_ACCESS_TRAP, but for traps directly to EL2 or EL3 */
+    CP_ACCESS_TRAP_EL2 = 3,
+    CP_ACCESS_TRAP_EL3 = 4,
+    /* As CP_ACCESS_UNCATEGORIZED, but for traps directly to EL2 or EL3 */
+    CP_ACCESS_TRAP_UNCATEGORIZED_EL2 = 5,
+    CP_ACCESS_TRAP_UNCATEGORIZED_EL3 = 6,
+} CPAccessResult;
+
+typedef struct ARMCPRegInfo ARMCPRegInfo;
+
+/*
+ * Access functions for coprocessor registers. These cannot fail and
+ * may not raise exceptions.
+ */
+typedef uint64_t CPReadFn(CPUARMState *env, const ARMCPRegInfo *opaque);
+typedef void CPWriteFn(CPUARMState *env, const ARMCPRegInfo *opaque,
+                       uint64_t value);
+/* Access permission check functions for coprocessor registers. */
+typedef CPAccessResult CPAccessFn(CPUARMState *env,
+                                  const ARMCPRegInfo *opaque,
+                                  bool isread);
+/* Hook function for register reset */
+typedef void CPResetFn(CPUARMState *env, const ARMCPRegInfo *opaque);
+
+#define CP_ANY 0xff
+
+/* Definition of an ARM coprocessor register */
+struct ARMCPRegInfo {
+    /* Name of register (useful mainly for debugging, need not be unique) */
+    const char *name;
+    /*
+     * Location of register: coprocessor number and (crn,crm,opc1,opc2)
+     * tuple. Any of crm, opc1 and opc2 may be CP_ANY to indicate a
+     * 'wildcard' field -- any value of that field in the MRC/MCR insn
+     * will be decoded to this register. The register read and write
+     * callbacks will be passed an ARMCPRegInfo with the crn/crm/opc1/opc2
+     * used by the program, so it is possible to register a wildcard and
+     * then behave differently on read/write if necessary.
+     * For 64 bit registers, only crm and opc1 are relevant; crn and opc2
+     * must both be zero.
+     * For AArch64-visible registers, opc0 is also used.
+     * Since there are no "coprocessors" in AArch64, cp is purely used as a
+     * way to distinguish (for KVM's benefit) guest-visible system registers
+     * from demuxed ones provided to preserve the "no side effects on
+     * KVM register read/write from QEMU" semantics. cp==0x13 is guest
+     * visible (to match KVM's encoding); cp==0 will be converted to
+     * cp==0x13 when the ARMCPRegInfo is registered, for convenience.
+     */
+    uint8_t cp;
+    uint8_t crn;
+    uint8_t crm;
+    uint8_t opc0;
+    uint8_t opc1;
+    uint8_t opc2;
+    /* Execution state in which this register is visible: ARM_CP_STATE_* */
+    int state;
+    /* Register type: ARM_CP_* bits/values */
+    int type;
+    /* Access rights: PL*_[RW] */
+    int access;
+    /* Security state: ARM_CP_SECSTATE_* bits/values */
+    int secure;
+    /*
+     * The opaque pointer passed to define_arm_cp_regs_with_opaque() when
+     * this register was defined: can be used to hand data through to the
+     * register read/write functions, since they are passed the ARMCPRegInfo*.
+     */
+    void *opaque;
+    /*
+     * Value of this register, if it is ARM_CP_CONST. Otherwise, if
+     * fieldoffset is non-zero, the reset value of the register.
+     */
+    uint64_t resetvalue;
+    /*
+     * Offset of the field in CPUARMState for this register.
+     * This is not needed if either:
+     *  1. type is ARM_CP_CONST or one of the ARM_CP_SPECIALs
+     *  2. both readfn and writefn are specified
+     */
+    ptrdiff_t fieldoffset; /* offsetof(CPUARMState, field) */
+
+    /*
+     * Offsets of the secure and non-secure fields in CPUARMState for the
+     * register if it is banked.  These fields are only used during the static
+     * registration of a register.  During hashing the bank associated
+     * with a given security state is copied to fieldoffset which is used from
+     * there on out.
+     *
+     * It is expected that register definitions use either fieldoffset or
+     * bank_fieldoffsets in the definition but not both.  It is also expected
+     * that both bank offsets are set when defining a banked register.  This
+     * use indicates that a register is banked.
+     */
+    ptrdiff_t bank_fieldoffsets[2];
+
+    /*
+     * Function for making any access checks for this register in addition to
+     * those specified by the 'access' permissions bits. If NULL, no extra
+     * checks required. The access check is performed at runtime, not at
+     * translate time.
+     */
+    CPAccessFn *accessfn;
+    /*
+     * Function for handling reads of this register. If NULL, then reads
+     * will be done by loading from the offset into CPUARMState specified
+     * by fieldoffset.
+     */
+    CPReadFn *readfn;
+    /*
+     * Function for handling writes of this register. If NULL, then writes
+     * will be done by writing to the offset into CPUARMState specified
+     * by fieldoffset.
+     */
+    CPWriteFn *writefn;
+    /*
+     * Function for doing a "raw" read; used when we need to copy
+     * coprocessor state to the kernel for KVM or out for
+     * migration. This only needs to be provided if there is also a
+     * readfn and it has side effects (for instance clear-on-read bits).
+     */
+    CPReadFn *raw_readfn;
+    /*
+     * Function for doing a "raw" write; used when we need to copy KVM
+     * kernel coprocessor state into userspace, or for inbound
+     * migration. This only needs to be provided if there is also a
+     * writefn and it masks out "unwritable" bits or has write-one-to-clear
+     * or similar behaviour.
+     */
+    CPWriteFn *raw_writefn;
+    /*
+     * Function for resetting the register. If NULL, then reset will be done
+     * by writing resetvalue to the field specified in fieldoffset. If
+     * fieldoffset is 0 then no reset will be done.
+     */
+    CPResetFn *resetfn;
+
+    /*
+     * "Original" writefn and readfn.
+     * For ARMv8.1-VHE register aliases, we overwrite the read/write
+     * accessor functions of various EL1/EL0 to perform the runtime
+     * check for which sysreg should actually be modified, and then
+     * forwards the operation.  Before overwriting the accessors,
+     * the original function is copied here, so that accesses that
+     * really do go to the EL1/EL0 version proceed normally.
+     * (The corresponding EL2 register is linked via opaque.)
+     */
+    CPReadFn *orig_readfn;
+    CPWriteFn *orig_writefn;
+};
+
+/*
+ * Macros which are lvalues for the field in CPUARMState for the
+ * ARMCPRegInfo *ri.
+ */
+#define CPREG_FIELD32(env, ri) \
+    (*(uint32_t *)((char *)(env) + (ri)->fieldoffset))
+#define CPREG_FIELD64(env, ri) \
+    (*(uint64_t *)((char *)(env) + (ri)->fieldoffset))
+
+#define REGINFO_SENTINEL { .type = ARM_CP_SENTINEL }
+
+void define_arm_cp_regs_with_opaque(ARMCPU *cpu,
+                                    const ARMCPRegInfo *regs, void *opaque);
+void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu,
+                                       const ARMCPRegInfo *regs, void *opaque);
+static inline void define_arm_cp_regs(ARMCPU *cpu, const ARMCPRegInfo *regs)
+{
+    define_arm_cp_regs_with_opaque(cpu, regs, 0);
+}
+static inline void define_one_arm_cp_reg(ARMCPU *cpu, const ARMCPRegInfo *regs)
+{
+    define_one_arm_cp_reg_with_opaque(cpu, regs, 0);
+}
+const ARMCPRegInfo *get_arm_cp_reginfo(GHashTable *cpregs, uint32_t encoded_cp);
+
+/*
+ * Definition of an ARM co-processor register as viewed from
+ * userspace. This is used for presenting sanitised versions of
+ * registers to userspace when emulating the Linux AArch64 CPU
+ * ID/feature ABI (advertised as HWCAP_CPUID).
+ */
+typedef struct ARMCPRegUserSpaceInfo {
+    /* Name of register */
+    const char *name;
+
+    /* Is the name actually a glob pattern */
+    bool is_glob;
+
+    /* Only some bits are exported to user space */
+    uint64_t exported_bits;
+
+    /* Fixed bits are applied after the mask */
+    uint64_t fixed_bits;
+} ARMCPRegUserSpaceInfo;
+
+#define REGUSERINFO_SENTINEL { .name = NULL }
+
+void modify_arm_cp_regs(ARMCPRegInfo *regs, const ARMCPRegUserSpaceInfo *mods);
+
+/* CPWriteFn that can be used to implement writes-ignored behaviour */
+void arm_cp_write_ignore(CPUARMState *env, const ARMCPRegInfo *ri,
+                         uint64_t value);
+/* CPReadFn that can be used for read-as-zero behaviour */
+uint64_t arm_cp_read_zero(CPUARMState *env, const ARMCPRegInfo *ri);
+
+/*
+ * CPResetFn that does nothing, for use if no reset is required even
+ * if fieldoffset is non zero.
+ */
+void arm_cp_reset_ignore(CPUARMState *env, const ARMCPRegInfo *opaque);
+
+/*
+ * Return true if this reginfo struct's field in the cpu state struct
+ * is 64 bits wide.
+ */
+static inline bool cpreg_field_is_64bit(const ARMCPRegInfo *ri)
+{
+    return (ri->state == ARM_CP_STATE_AA64) || (ri->type & ARM_CP_64BIT);
+}
+
+static inline bool cp_access_ok(int current_el,
+                                const ARMCPRegInfo *ri, int isread)
+{
+    return (ri->access >> ((current_el * 2) + isread)) & 1;
+}
+
+/* Raw read of a coprocessor register (as needed for migration, etc) */
+uint64_t read_raw_cp_reg(CPUARMState *env, const ARMCPRegInfo *ri);
+
+#endif /* TARGET_ARM_CPREGS_H */
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ static inline uint64_t cpreg_to_kvm_id(uint32_t cpregid)
     return kvmid;
 }
 
-/* ARMCPRegInfo type field bits. If the SPECIAL bit is set this is a
- * special-behaviour cp reg and bits [11..8] indicate what behaviour
- * it has. Otherwise it is a simple cp reg, where CONST indicates that
- * TCG can assume the value to be constant (ie load at translate time)
- * and 64BIT indicates a 64 bit wide coprocessor register. SUPPRESS_TB_END
- * indicates that the TB should not be ended after a write to this register
- * (the default is that the TB ends after cp writes). OVERRIDE permits
- * a register definition to override a previous definition for the
- * same (cp, is64, crn, crm, opc1, opc2) tuple: either the new or the
- * old must have the OVERRIDE bit set.
- * ALIAS indicates that this register is an alias view of some underlying
- * state which is also visible via another register, and that the other
- * register is handling migration and reset; registers marked ALIAS will not be
- * migrated but may have their state set by syncing of register state from KVM.
- * NO_RAW indicates that this register has no underlying state and does not
- * support raw access for state saving/loading; it will not be used for either
- * migration or KVM state synchronization. (Typically this is for "registers"
- * which are actually used as instructions for cache maintenance and so on.)
- * IO indicates that this register does I/O and therefore its accesses
- * need to be marked with gen_io_start() and also end the TB. In particular,
- * registers which implement clocks or timers require this.
- * RAISES_EXC is for when the read or write hook might raise an exception;
- * the generated code will synchronize the CPU state before calling the hook
- * so that it is safe for the hook to call raise_exception().
- * NEWEL is for writes to registers that might change the exception
- * level - typically on older ARM chips. For those cases we need to
- * re-read the new el when recomputing the translation flags.
- */
-#define ARM_CP_SPECIAL           0x0001
-#define ARM_CP_CONST             0x0002
-#define ARM_CP_64BIT             0x0004
-#define ARM_CP_SUPPRESS_TB_END   0x0008
-#define ARM_CP_OVERRIDE          0x0010
-#define ARM_CP_ALIAS             0x0020
-#define ARM_CP_IO                0x0040
-#define ARM_CP_NO_RAW            0x0080
-#define ARM_CP_NOP               (ARM_CP_SPECIAL | 0x0100)
-#define ARM_CP_WFI               (ARM_CP_SPECIAL | 0x0200)
-#define ARM_CP_NZCV              (ARM_CP_SPECIAL | 0x0300)
-#define ARM_CP_CURRENTEL         (ARM_CP_SPECIAL | 0x0400)
-#define ARM_CP_DC_ZVA            (ARM_CP_SPECIAL | 0x0500)
-#define ARM_CP_DC_GVA            (ARM_CP_SPECIAL | 0x0600)
-#define ARM_CP_DC_GZVA           (ARM_CP_SPECIAL | 0x0700)
-#define ARM_LAST_SPECIAL         ARM_CP_DC_GZVA
-#define ARM_CP_FPU               0x1000
-#define ARM_CP_SVE               0x2000
-#define ARM_CP_NO_GDB            0x4000
-#define ARM_CP_RAISES_EXC        0x8000
-#define ARM_CP_NEWEL             0x10000
-/* Used only as a terminator for ARMCPRegInfo lists */
-#define ARM_CP_SENTINEL          0xfffff
-/* Mask of only the flag bits in a type field */
-#define ARM_CP_FLAG_MASK         0x1f0ff
-
-/* Valid values for ARMCPRegInfo state field, indicating which of
- * the AArch32 and AArch64 execution states this register is visible in.
- * If the reginfo doesn't explicitly specify then it is AArch32 only.
- * If the reginfo is declared to be visible in both states then a second
- * reginfo is synthesised for the AArch32 view of the AArch64 register,
- * such that the AArch32 view is the lower 32 bits of the AArch64 one.
- * Note that we rely on the values of these enums as we iterate through
- * the various states in some places.
- */
-enum {
-    ARM_CP_STATE_AA32 = 0,
-    ARM_CP_STATE_AA64 = 1,
-    ARM_CP_STATE_BOTH = 2,
-};
-
-/* ARM CP register secure state flags.  These flags identify security state
- * attributes for a given CP register entry.
- * The existence of both or neither secure and non-secure flags indicates that
- * the register has both a secure and non-secure hash entry.  A single one of
- * these flags causes the register to only be hashed for the specified
- * security state.
- * Although definitions may have any combination of the S/NS bits, each
- * registered entry will only have one to identify whether the entry is secure
- * or non-secure.
- */
-enum {
-    ARM_CP_SECSTATE_S =   (1 << 0), /* bit[0]: Secure state register */
-    ARM_CP_SECSTATE_NS =  (1 << 1), /* bit[1]: Non-secure state register */
-};
-
-/* Return true if cptype is a valid type field. This is used to try to
- * catch errors where the sentinel has been accidentally left off the end
- * of a list of registers.
- */
-static inline bool cptype_valid(int cptype)
-{
-    return ((cptype & ~ARM_CP_FLAG_MASK) == 0)
-        || ((cptype & ARM_CP_SPECIAL) &&
-            ((cptype & ~ARM_CP_FLAG_MASK) <= ARM_LAST_SPECIAL));
-}
-
-/* Access rights:
- * We define bits for Read and Write access for what rev C of the v7-AR ARM ARM
- * defines as PL0 (user), PL1 (fiq/irq/svc/abt/und/sys, ie privileged), and
- * PL2 (hyp). The other level which has Read and Write bits is Secure PL1
- * (ie any of the privileged modes in Secure state, or Monitor mode).
- * If a register is accessible in one privilege level it's always accessible
- * in higher privilege levels too. Since "Secure PL1" also follows this rule
- * (ie anything visible in PL2 is visible in S-PL1, some things are only
- * visible in S-PL1) but "Secure PL1" is a bit of a mouthful, we bend the
- * terminology a little and call this PL3.
- * In AArch64 things are somewhat simpler as the PLx bits line up exactly
- * with the ELx exception levels.
- *
- * If access permissions for a register are more complex than can be
- * described with these bits, then use a laxer set of restrictions, and
- * do the more restrictive/complex check inside a helper function.
- */
-#define PL3_R 0x80
-#define PL3_W 0x40
-#define PL2_R (0x20 | PL3_R)
-#define PL2_W (0x10 | PL3_W)
-#define PL1_R (0x08 | PL2_R)
-#define PL1_W (0x04 | PL2_W)
-#define PL0_R (0x02 | PL1_R)
-#define PL0_W (0x01 | PL1_W)
-
-/*
- * For user-mode some registers are accessible to EL0 via a kernel
- * trap-and-emulate ABI. In this case we define the read permissions
- * as actually being PL0_R. However some bits of any given register
- * may still be masked.
- */
-#ifdef CONFIG_USER_ONLY
-#define PL0U_R PL0_R
-#else
-#define PL0U_R PL1_R
-#endif
-
-#define PL3_RW (PL3_R | PL3_W)
-#define PL2_RW (PL2_R | PL2_W)
-#define PL1_RW (PL1_R | PL1_W)
-#define PL0_RW (PL0_R | PL0_W)
-
 /* Return the highest implemented Exception Level */
 static inline int arm_highest_el(CPUARMState *env)
 {
@@ -XXX,XX +XXX,XX @@ static inline int arm_current_el(CPUARMState *env)
     }
 }
 
-typedef struct ARMCPRegInfo ARMCPRegInfo;
-
-typedef enum CPAccessResult {
-    /* Access is permitted */
-    CP_ACCESS_OK = 0,
-    /* Access fails due to a configurable trap or enable which would
-     * result in a categorized exception syndrome giving information about
-     * the failing instruction (ie syndrome category 0x3, 0x4, 0x5, 0x6,
-     * 0xc or 0x18). The exception is taken to the usual target EL (EL1 or
-     * PL1 if in EL0, otherwise to the current EL).
-     */
-    CP_ACCESS_TRAP = 1,
-    /* Access fails and results in an exception syndrome 0x0 ("uncategorized").
-     * Note that this is not a catch-all case -- the set of cases which may
-     * result in this failure is specifically defined by the architecture.
-     */
-    CP_ACCESS_TRAP_UNCATEGORIZED = 2,
-    /* As CP_ACCESS_TRAP, but for traps directly to EL2 or EL3 */
-    CP_ACCESS_TRAP_EL2 = 3,
-    CP_ACCESS_TRAP_EL3 = 4,
-    /* As CP_ACCESS_UNCATEGORIZED, but for traps directly to EL2 or EL3 */
-    CP_ACCESS_TRAP_UNCATEGORIZED_EL2 = 5,
-    CP_ACCESS_TRAP_UNCATEGORIZED_EL3 = 6,
-} CPAccessResult;
-
-/* Access functions for coprocessor registers. These cannot fail and
- * may not raise exceptions.
- */
-typedef uint64_t CPReadFn(CPUARMState *env, const ARMCPRegInfo *opaque);
-typedef void CPWriteFn(CPUARMState *env, const ARMCPRegInfo *opaque,
-                       uint64_t value);
-/* Access permission check functions for coprocessor registers. */
-typedef CPAccessResult CPAccessFn(CPUARMState *env,
-                                  const ARMCPRegInfo *opaque,
-                                  bool isread);
-/* Hook function for register reset */
-typedef void CPResetFn(CPUARMState *env, const ARMCPRegInfo *opaque);
-
-#define CP_ANY 0xff
-
-/* Definition of an ARM coprocessor register */
-struct ARMCPRegInfo {
-    /* Name of register (useful mainly for debugging, need not be unique) */
-    const char *name;
-    /* Location of register: coprocessor number and (crn,crm,opc1,opc2)
-     * tuple. Any of crm, opc1 and opc2 may be CP_ANY to indicate a
-     * 'wildcard' field -- any value of that field in the MRC/MCR insn
-     * will be decoded to this register. The register read and write
-     * callbacks will be passed an ARMCPRegInfo with the crn/crm/opc1/opc2
-     * used by the program, so it is possible to register a wildcard and
-     * then behave differently on read/write if necessary.
-     * For 64 bit registers, only crm and opc1 are relevant; crn and opc2
-     * must both be zero.
-     * For AArch64-visible registers, opc0 is also used.
-     * Since there are no "coprocessors" in AArch64, cp is purely used as a
-     * way to distinguish (for KVM's benefit) guest-visible system registers
-     * from demuxed ones provided to preserve the "no side effects on
-     * KVM register read/write from QEMU" semantics. cp==0x13 is guest
-     * visible (to match KVM's encoding); cp==0 will be converted to
-     * cp==0x13 when the ARMCPRegInfo is registered, for convenience.
-     */
-    uint8_t cp;
-    uint8_t crn;
-    uint8_t crm;
-    uint8_t opc0;
-    uint8_t opc1;
-    uint8_t opc2;
-    /* Execution state in which this register is visible: ARM_CP_STATE_* */
-    int state;
-    /* Register type: ARM_CP_* bits/values */
-    int type;
-    /* Access rights: PL*_[RW] */
-    int access;
-    /* Security state: ARM_CP_SECSTATE_* bits/values */
-    int secure;
-    /* The opaque pointer passed to define_arm_cp_regs_with_opaque() when
-     * this register was defined: can be used to hand data through to the
-     * register read/write functions, since they are passed the ARMCPRegInfo*.
-     */
-    void *opaque;
-    /* Value of this register, if it is ARM_CP_CONST. Otherwise, if
-     * fieldoffset is non-zero, the reset value of the register.
-     */
-    uint64_t resetvalue;
-    /* Offset of the field in CPUARMState for this register.
-     *
-     * This is not needed if either:
-     *  1. type is ARM_CP_CONST or one of the ARM_CP_SPECIALs
-     *  2. both readfn and writefn are specified
-     */
-    ptrdiff_t fieldoffset; /* offsetof(CPUARMState, field) */
-
-    /* Offsets of the secure and non-secure fields in CPUARMState for the
-     * register if it is banked.  These fields are only used during the static
-     * registration of a register.  During hashing the bank associated
-     * with a given security state is copied to fieldoffset which is used from
-     * there on out.
-     *
-     * It is expected that register definitions use either fieldoffset or
-     * bank_fieldoffsets in the definition but not both.  It is also expected
-     * that both bank offsets are set when defining a banked register.  This
-     * use indicates that a register is banked.
-     */
-    ptrdiff_t bank_fieldoffsets[2];
-
-    /* Function for making any access checks for this register in addition to
-     * those specified by the 'access' permissions bits. If NULL, no extra
-     * checks required. The access check is performed at runtime, not at
-     * translate time.
-     */
-    CPAccessFn *accessfn;
-    /* Function for handling reads of this register. If NULL, then reads
-     * will be done by loading from the offset into CPUARMState specified
-     * by fieldoffset.
-     */
-    CPReadFn *readfn;
-    /* Function for handling writes of this register. If NULL, then writes
-     * will be done by writing to the offset into CPUARMState specified
-     * by fieldoffset.
-     */
-    CPWriteFn *writefn;
-    /* Function for doing a "raw" read; used when we need to copy
-     * coprocessor state to the kernel for KVM or out for
-     * migration. This only needs to be provided if there is also a
-     * readfn and it has side effects (for instance clear-on-read bits).
-     */
-    CPReadFn *raw_readfn;
-    /* Function for doing a "raw" write; used when we need to copy KVM
-     * kernel coprocessor state into userspace, or for inbound
-     * migration. This only needs to be provided if there is also a
-     * writefn and it masks out "unwritable" bits or has write-one-to-clear
-     * or similar behaviour.
-     */
-    CPWriteFn *raw_writefn;
-    /* Function for resetting the register. If NULL, then reset will be done
-     * by writing resetvalue to the field specified in fieldoffset. If
-     * fieldoffset is 0 then no reset will be done.
-     */
-    CPResetFn *resetfn;
-
-    /*
-     * "Original" writefn and readfn.
-     * For ARMv8.1-VHE register aliases, we overwrite the read/write
-     * accessor functions of various EL1/EL0 to perform the runtime
-     * check for which sysreg should actually be modified, and then
-     * forwards the operation.  Before overwriting the accessors,
-     * the original function is copied here, so that accesses that
-     * really do go to the EL1/EL0 version proceed normally.
-     * (The corresponding EL2 register is linked via opaque.)
-     */
-    CPReadFn *orig_readfn;
-    CPWriteFn *orig_writefn;
-};
-
-/* Macros which are lvalues for the field in CPUARMState for the
- * ARMCPRegInfo *ri.
- */
-#define CPREG_FIELD32(env, ri) \
-    (*(uint32_t *)((char *)(env) + (ri)->fieldoffset))
-#define CPREG_FIELD64(env, ri) \
-    (*(uint64_t *)((char *)(env) + (ri)->fieldoffset))
-
-#define REGINFO_SENTINEL { .type = ARM_CP_SENTINEL }
-
-void define_arm_cp_regs_with_opaque(ARMCPU *cpu,
-                                    const ARMCPRegInfo *regs, void *opaque);
-void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu,
-                                       const ARMCPRegInfo *regs, void *opaque);
-static inline void define_arm_cp_regs(ARMCPU *cpu, const ARMCPRegInfo *regs)
-{
-    define_arm_cp_regs_with_opaque(cpu, regs, 0);
-}
-static inline void define_one_arm_cp_reg(ARMCPU *cpu, const ARMCPRegInfo *regs)
-{
-    define_one_arm_cp_reg_with_opaque(cpu, regs, 0);
-}
-const ARMCPRegInfo *get_arm_cp_reginfo(GHashTable *cpregs, uint32_t encoded_cp);
-
-/*
- * Definition of an ARM co-processor register as viewed from
- * userspace. This is used for presenting sanitised versions of
- * registers to userspace when emulating the Linux AArch64 CPU
- * ID/feature ABI (advertised as HWCAP_CPUID).
- */
-typedef struct ARMCPRegUserSpaceInfo {
-    /* Name of register */
-    const char *name;
-
-    /* Is the name actually a glob pattern */
-    bool is_glob;
-
-    /* Only some bits are exported to user space */
-    uint64_t exported_bits;
-
-    /* Fixed bits are applied after the mask */
-    uint64_t fixed_bits;
-} ARMCPRegUserSpaceInfo;
-
-#define REGUSERINFO_SENTINEL { .name = NULL }
-
-void modify_arm_cp_regs(ARMCPRegInfo *regs, const ARMCPRegUserSpaceInfo *mods);
-
-/* CPWriteFn that can be used to implement writes-ignored behaviour */
-void arm_cp_write_ignore(CPUARMState *env, const ARMCPRegInfo *ri,
-                         uint64_t value);
-/* CPReadFn that can be used for read-as-zero behaviour */
-uint64_t arm_cp_read_zero(CPUARMState *env, const ARMCPRegInfo *ri);
-
-/* CPResetFn that does nothing, for use if no reset is required even
- * if fieldoffset is non zero.
- */
-void arm_cp_reset_ignore(CPUARMState *env, const ARMCPRegInfo *opaque);
-
-/* Return true if this reginfo struct's field in the cpu state struct
- * is 64 bits wide.
- */
-static inline bool cpreg_field_is_64bit(const ARMCPRegInfo *ri)
-{
-    return (ri->state == ARM_CP_STATE_AA64) || (ri->type & ARM_CP_64BIT);
-}
-
-static inline bool cp_access_ok(int current_el,
-                                const ARMCPRegInfo *ri, int isread)
-{
-    return (ri->access >> ((current_el * 2) + isread)) & 1;
-}
-
-/* Raw read of a coprocessor register (as needed for migration, etc) */
-uint64_t read_raw_cp_reg(CPUARMState *env, const ARMCPRegInfo *ri);
-
 /**
  * write_list_to_cpustate
  * @cpu: ARMCPU
diff --git a/hw/arm/pxa2xx.c b/hw/arm/pxa2xx.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/pxa2xx.c
+++ b/hw/arm/pxa2xx.c
@@ -XXX,XX +XXX,XX @@
 #include "qemu/cutils.h"
 #include "qemu/log.h"
 #include "qom/object.h"
+#include "target/arm/cpregs.h"
 
 static struct {
     hwaddr io_base;
diff --git a/hw/arm/pxa2xx_pic.c b/hw/arm/pxa2xx_pic.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/pxa2xx_pic.c
+++ b/hw/arm/pxa2xx_pic.c
@@ -XXX,XX +XXX,XX @@
 #include "hw/sysbus.h"
 #include "migration/vmstate.h"
 #include "qom/object.h"
+#include "target/arm/cpregs.h"
 
 #define ICIP	0x00	/* Interrupt Controller IRQ Pending register */
 #define ICMR	0x04	/* Interrupt Controller Mask register */
diff --git a/hw/intc/arm_gicv3_cpuif.c b/hw/intc/arm_gicv3_cpuif.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/intc/arm_gicv3_cpuif.c
+++ b/hw/intc/arm_gicv3_cpuif.c
@@ -XXX,XX +XXX,XX @@
 #include "gicv3_internal.h"
 #include "hw/irq.h"
 #include "cpu.h"
+#include "target/arm/cpregs.h"
 
 /*
  * Special case return value from hppvi_index(); must be larger than
diff --git a/hw/intc/arm_gicv3_kvm.c b/hw/intc/arm_gicv3_kvm.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/intc/arm_gicv3_kvm.c
+++ b/hw/intc/arm_gicv3_kvm.c
@@ -XXX,XX +XXX,XX @@
 #include "vgic_common.h"
 #include "migration/blocker.h"
 #include "qom/object.h"
+#include "target/arm/cpregs.h"
+
 
 #ifdef DEBUG_GICV3_KVM
 #define DPRINTF(fmt, ...) \
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@
 #include "kvm_arm.h"
 #include "disas/capstone.h"
 #include "fpu/softfloat.h"
+#include "cpregs.h"
 
 static void arm_cpu_set_pc(CPUState *cs, vaddr value)
 {
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@
 #include "hvf_arm.h"
 #include "qapi/visitor.h"
 #include "hw/qdev-properties.h"
+#include "cpregs.h"
 
 
 #ifndef CONFIG_USER_ONLY
diff --git a/target/arm/cpu_tcg.c b/target/arm/cpu_tcg.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu_tcg.c
+++ b/target/arm/cpu_tcg.c
@@ -XXX,XX +XXX,XX @@
 #if !defined(CONFIG_USER_ONLY)
 #include "hw/boards.h"
 #endif
+#include "cpregs.h"
 
 /* CPU models. These are not needed for the AArch64 linux-user build. */
 #if !defined(CONFIG_USER_ONLY) || !defined(TARGET_AARCH64)
diff --git a/target/arm/gdbstub.c b/target/arm/gdbstub.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/gdbstub.c
+++ b/target/arm/gdbstub.c
@@ -XXX,XX +XXX,XX @@
  */
 #include "qemu/osdep.h"
 #include "cpu.h"
-#include "internals.h"
 #include "exec/gdbstub.h"
+#include "internals.h"
+#include "cpregs.h"
 
 typedef struct RegisterSysregXmlParam {
     CPUState *cs;
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@
 #include "exec/cpu_ldst.h"
 #include "semihosting/common-semi.h"
 #endif
+#include "cpregs.h"
 
 #define ARM_CPU_FREQ 1000000000 /* FIXME: 1 GHz, should be configurable */
 #define PMCR_NUM_COUNTERS 4 /* QEMU IMPDEF choice */
diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/op_helper.c
+++ b/target/arm/op_helper.c
@@ -XXX,XX +XXX,XX @@
 #include "internals.h"
 #include "exec/exec-all.h"
 #include "exec/cpu_ldst.h"
+#include "cpregs.h"
 
 #define SIGNBIT (uint32_t)0x80000000
 #define SIGNBIT64 ((uint64_t)1 << 63)
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@
 #include "translate.h"
 #include "internals.h"
 #include "qemu/host-utils.h"
-
 #include "semihosting/semihost.h"
 #include "exec/gen-icount.h"
-
 #include "exec/helper-proto.h"
 #include "exec/helper-gen.h"
 #include "exec/log.h"
-
+#include "cpregs.h"
 #include "translate-a64.h"
 #include "qemu/atomic128.h"
 
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@
 #include "qemu/bitops.h"
 #include "arm_ldst.h"
 #include "semihosting/semihost.h"
-
 #include "exec/helper-proto.h"
 #include "exec/helper-gen.h"
-
 #include "exec/log.h"
+#include "cpregs.h"
 
 
 #define ENABLE_ARCH_4T    arm_dc_feature(s, ARM_FEATURE_V4T)
-- 
2.25.1

From: Richard Henderson <richard.henderson@linaro.org>

Rearrange the values of the enumerators of CPAccessResult
so that we may directly extract the target el. For the two
special cases in access_check_cp_reg, use CPAccessResult.

diff --git a/target/arm/cpregs.h b/target/arm/cpregs.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpregs.h
+++ b/target/arm/cpregs.h
@@ -XXX,XX +XXX,XX @@ static inline bool cptype_valid(int cptype)
 typedef enum CPAccessResult {
     /* Access is permitted */
     CP_ACCESS_OK = 0,
+
+    /*
+     * Combined with one of the following, the low 2 bits indicate the
+     * target exception level.  If 0, the exception is taken to the usual
+     * target EL (EL1 or PL1 if in EL0, otherwise to the current EL).
+     */
+    CP_ACCESS_EL_MASK = 3,
+
     /*
      * Access fails due to a configurable trap or enable which would
      * result in a categorized exception syndrome giving information about
      * the failing instruction (ie syndrome category 0x3, 0x4, 0x5, 0x6,
-     * 0xc or 0x18). The exception is taken to the usual target EL (EL1 or
-     * PL1 if in EL0, otherwise to the current EL).
+     * 0xc or 0x18).
      */
-    CP_ACCESS_TRAP = 1,
+    CP_ACCESS_TRAP = (1 << 2),
+    CP_ACCESS_TRAP_EL2 = CP_ACCESS_TRAP | 2,
+    CP_ACCESS_TRAP_EL3 = CP_ACCESS_TRAP | 3,
+
     /*
      * Access fails and results in an exception syndrome 0x0 ("uncategorized").
      * Note that this is not a catch-all case -- the set of cases which may
      * result in this failure is specifically defined by the architecture.
      */
-    CP_ACCESS_TRAP_UNCATEGORIZED = 2,
-    /* As CP_ACCESS_TRAP, but for traps directly to EL2 or EL3 */
-    CP_ACCESS_TRAP_EL2 = 3,
-    CP_ACCESS_TRAP_EL3 = 4,
-    /* As CP_ACCESS_UNCATEGORIZED, but for traps directly to EL2 or EL3 */
-    CP_ACCESS_TRAP_UNCATEGORIZED_EL2 = 5,
-    CP_ACCESS_TRAP_UNCATEGORIZED_EL3 = 6,
+    CP_ACCESS_TRAP_UNCATEGORIZED = (2 << 2),
+    CP_ACCESS_TRAP_UNCATEGORIZED_EL2 = CP_ACCESS_TRAP_UNCATEGORIZED | 2,
+    CP_ACCESS_TRAP_UNCATEGORIZED_EL3 = CP_ACCESS_TRAP_UNCATEGORIZED | 3,
 } CPAccessResult;
 
 typedef struct ARMCPRegInfo ARMCPRegInfo;
diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/op_helper.c
+++ b/target/arm/op_helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(access_check_cp_reg)(CPUARMState *env, void *rip, uint32_t syndrome,
                                  uint32_t isread)
 {
     const ARMCPRegInfo *ri = rip;
+    CPAccessResult res = CP_ACCESS_OK;
     int target_el;
 
     if (arm_feature(env, ARM_FEATURE_XSCALE) && ri->cp < 14
         && extract32(env->cp15.c15_cpar, ri->cp, 1) == 0) {
-        raise_exception(env, EXCP_UDEF, syndrome, exception_target_el(env));
+        res = CP_ACCESS_TRAP;
+        goto fail;
     }
 
     /*
@@ -XXX,XX +XXX,XX @@ void HELPER(access_check_cp_reg)(CPUARMState *env, void *rip, uint32_t syndrome,
         mask &= ~((1 << 4) | (1 << 14));
 
         if (env->cp15.hstr_el2 & mask) {
-            target_el = 2;
-            goto exept;
+            res = CP_ACCESS_TRAP_EL2;
+            goto fail;
         }
     }
 
-    if (!ri->accessfn) {
+    if (ri->accessfn) {
+        res = ri->accessfn(env, ri, isread);
+    }
+    if (likely(res == CP_ACCESS_OK)) {
         return;
     }
 
-    switch (ri->accessfn(env, ri, isread)) {
-    case CP_ACCESS_OK:
-        return;
+ fail:
+    switch (res & ~CP_ACCESS_EL_MASK) {
     case CP_ACCESS_TRAP:
-        target_el = exception_target_el(env);
-        break;
-    case CP_ACCESS_TRAP_EL2:
-        /* Requesting a trap to EL2 when we're in EL3 is
-         * a bug in the access function.
-         */
-        assert(arm_current_el(env) != 3);
-        target_el = 2;
-        break;
-    case CP_ACCESS_TRAP_EL3:
-        target_el = 3;
         break;
     case CP_ACCESS_TRAP_UNCATEGORIZED:
-        target_el = exception_target_el(env);
-        syndrome = syn_uncategorized();
-        break;
-    case CP_ACCESS_TRAP_UNCATEGORIZED_EL2:
-        target_el = 2;
-        syndrome = syn_uncategorized();
-        break;
-    case CP_ACCESS_TRAP_UNCATEGORIZED_EL3:
-        target_el = 3;
         syndrome = syn_uncategorized();
         break;
     default:
         g_assert_not_reached();
     }
 
-exept:
+    target_el = res & CP_ACCESS_EL_MASK;
+    switch (target_el) {
+    case 0:
+        target_el = exception_target_el(env);
+        break;
+    case 2:
+        assert(arm_current_el(env) != 3);
+        assert(arm_is_el2_enabled(env));
+        break;
+    case 3:
+        assert(arm_feature(env, ARM_FEATURE_EL3));
+        break;
+    default:
+        /* No "direct" traps to EL1 */
+        g_assert_not_reached();
+    }
+
     raise_exception(env, EXCP_UDEF, syndrome, target_el);
 }
 
-- 
2.25.1

From: Richard Henderson <richard.henderson@linaro.org>

Remove a possible source of error by removing REGINFO_SENTINEL
and using ARRAY_SIZE (convinently hidden inside a macro) to
find the end of the set of regs being registered or modified.

The space saved by not having the extra array element reduces
the executable's .data.rel.ro section by about 9k.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220501055028.646596-4-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpregs.h       |  53 +++++++++---------
 hw/arm/pxa2xx.c           |   1 -
 hw/arm/pxa2xx_pic.c       |   1 -
 hw/intc/arm_gicv3_cpuif.c |   5 --
 hw/intc/arm_gicv3_kvm.c   |   1 -
 target/arm/cpu64.c        |   1 -
 target/arm/cpu_tcg.c      |   4 --
 target/arm/helper.c       | 111 ++++++++------------------------------
 8 files changed, 48 insertions(+), 129 deletions(-)

diff --git a/target/arm/cpregs.h b/target/arm/cpregs.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpregs.h
+++ b/target/arm/cpregs.h
@@ -XXX,XX +XXX,XX @@
 #define ARM_CP_NO_GDB            0x4000
 #define ARM_CP_RAISES_EXC        0x8000
 #define ARM_CP_NEWEL             0x10000
-/* Used only as a terminator for ARMCPRegInfo lists */
-#define ARM_CP_SENTINEL          0xfffff
 /* Mask of only the flag bits in a type field */
 #define ARM_CP_FLAG_MASK         0x1f0ff
 
@@ -XXX,XX +XXX,XX @@ enum {
     ARM_CP_SECSTATE_NS =  (1 << 1), /* bit[1]: Non-secure state register */
 };
 
-/*
- * Return true if cptype is a valid type field. This is used to try to
- * catch errors where the sentinel has been accidentally left off the end
- * of a list of registers.
- */
-static inline bool cptype_valid(int cptype)
-{
-    return ((cptype & ~ARM_CP_FLAG_MASK) == 0)
-        || ((cptype & ARM_CP_SPECIAL) &&
-            ((cptype & ~ARM_CP_FLAG_MASK) <= ARM_LAST_SPECIAL));
-}
-
 /*
  * Access rights:
  * We define bits for Read and Write access for what rev C of the v7-AR ARM ARM
@@ -XXX,XX +XXX,XX @@ struct ARMCPRegInfo {
 #define CPREG_FIELD64(env, ri) \
     (*(uint64_t *)((char *)(env) + (ri)->fieldoffset))
 
-#define REGINFO_SENTINEL { .type = ARM_CP_SENTINEL }
+void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu, const ARMCPRegInfo *reg,
+                                       void *opaque);
 
-void define_arm_cp_regs_with_opaque(ARMCPU *cpu,
-                                    const ARMCPRegInfo *regs, void *opaque);
-void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu,
-                                       const ARMCPRegInfo *regs, void *opaque);
-static inline void define_arm_cp_regs(ARMCPU *cpu, const ARMCPRegInfo *regs)
-{
-    define_arm_cp_regs_with_opaque(cpu, regs, 0);
-}
 static inline void define_one_arm_cp_reg(ARMCPU *cpu, const ARMCPRegInfo *regs)
 {
-    define_one_arm_cp_reg_with_opaque(cpu, regs, 0);
+    define_one_arm_cp_reg_with_opaque(cpu, regs, NULL);
 }
+
+void define_arm_cp_regs_with_opaque_len(ARMCPU *cpu, const ARMCPRegInfo *regs,
+                                        void *opaque, size_t len);
+
+#define define_arm_cp_regs_with_opaque(CPU, REGS, OPAQUE)               \
+    do {                                                                \
+        QEMU_BUILD_BUG_ON(ARRAY_SIZE(REGS) == 0);                       \
+        define_arm_cp_regs_with_opaque_len(CPU, REGS, OPAQUE,           \
+                                           ARRAY_SIZE(REGS));           \
+    } while (0)
+
+#define define_arm_cp_regs(CPU, REGS) \
+    define_arm_cp_regs_with_opaque(CPU, REGS, NULL)
+
 const ARMCPRegInfo *get_arm_cp_reginfo(GHashTable *cpregs, uint32_t encoded_cp);
 
 /*
@@ -XXX,XX +XXX,XX @@ typedef struct ARMCPRegUserSpaceInfo {
     uint64_t fixed_bits;
 } ARMCPRegUserSpaceInfo;
 
-#define REGUSERINFO_SENTINEL { .name = NULL }
+void modify_arm_cp_regs_with_len(ARMCPRegInfo *regs, size_t regs_len,
+                                 const ARMCPRegUserSpaceInfo *mods,
+                                 size_t mods_len);
 
-void modify_arm_cp_regs(ARMCPRegInfo *regs, const ARMCPRegUserSpaceInfo *mods);
+#define modify_arm_cp_regs(REGS, MODS)                                  \
+    do {                                                                \
+        QEMU_BUILD_BUG_ON(ARRAY_SIZE(REGS) == 0);                       \
+        QEMU_BUILD_BUG_ON(ARRAY_SIZE(MODS) == 0);                       \
+        modify_arm_cp_regs_with_len(REGS, ARRAY_SIZE(REGS),             \
+                                    MODS, ARRAY_SIZE(MODS));            \
+    } while (0)
 
 /* CPWriteFn that can be used to implement writes-ignored behaviour */
 void arm_cp_write_ignore(CPUARMState *env, const ARMCPRegInfo *ri,
diff --git a/hw/arm/pxa2xx.c b/hw/arm/pxa2xx.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/pxa2xx.c
+++ b/hw/arm/pxa2xx.c
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo pxa_cp_reginfo[] = {
     { .name = "PWRMODE", .cp = 14, .crn = 7, .crm = 0, .opc1 = 0, .opc2 = 0,
       .access = PL1_RW, .type = ARM_CP_IO,
       .readfn = arm_cp_read_zero, .writefn = pxa2xx_pwrmode_write },
-    REGINFO_SENTINEL
 };
 
 static void pxa2xx_setup_cp14(PXA2xxState *s)
diff --git a/hw/arm/pxa2xx_pic.c b/hw/arm/pxa2xx_pic.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/pxa2xx_pic.c
+++ b/hw/arm/pxa2xx_pic.c
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo pxa_pic_cp_reginfo[] = {
     REGINFO_FOR_PIC_CP("ICLR2", 8),
     REGINFO_FOR_PIC_CP("ICFP2", 9),
     REGINFO_FOR_PIC_CP("ICPR2", 0xa),
-    REGINFO_SENTINEL
 };
 
 static const MemoryRegionOps pxa2xx_pic_ops = {
diff --git a/hw/intc/arm_gicv3_cpuif.c b/hw/intc/arm_gicv3_cpuif.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/intc/arm_gicv3_cpuif.c
+++ b/hw/intc/arm_gicv3_cpuif.c
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo gicv3_cpuif_reginfo[] = {
       .readfn = icc_igrpen1_el3_read,
       .writefn = icc_igrpen1_el3_write,
     },
-    REGINFO_SENTINEL
 };
 
 static uint64_t ich_ap_read(CPUARMState *env, const ARMCPRegInfo *ri)
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo gicv3_cpuif_hcr_reginfo[] = {
       .readfn = ich_vmcr_read,
       .writefn = ich_vmcr_write,
     },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo gicv3_cpuif_ich_apxr1_reginfo[] = {
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo gicv3_cpuif_ich_apxr1_reginfo[] = {
       .readfn = ich_ap_read,
       .writefn = ich_ap_write,
     },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo gicv3_cpuif_ich_apxr23_reginfo[] = {
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo gicv3_cpuif_ich_apxr23_reginfo[] = {
       .readfn = ich_ap_read,
       .writefn = ich_ap_write,
     },
-    REGINFO_SENTINEL
 };
 
 static void gicv3_cpuif_el_change_hook(ARMCPU *cpu, void *opaque)
@@ -XXX,XX +XXX,XX @@ void gicv3_init_cpuif(GICv3State *s)
                       .readfn = ich_lr_read,
                       .writefn = ich_lr_write,
                     },
-                    REGINFO_SENTINEL
                 };
                 define_arm_cp_regs(cpu, lr_regset);
             }
diff --git a/hw/intc/arm_gicv3_kvm.c b/hw/intc/arm_gicv3_kvm.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/intc/arm_gicv3_kvm.c
+++ b/hw/intc/arm_gicv3_kvm.c
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo gicv3_cpuif_reginfo[] = {
        */
       .resetfn = arm_gicv3_icc_reset,
     },
-    REGINFO_SENTINEL
 };
 
 /**
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo cortex_a72_a57_a53_cp_reginfo[] = {
     { .name = "L2MERRSR",
       .cp = 15, .opc1 = 3, .crm = 15,
       .access = PL1_RW, .type = ARM_CP_CONST | ARM_CP_64BIT, .resetvalue = 0 },
-    REGINFO_SENTINEL
 };
 
 static void aarch64_a57_initfn(Object *obj)
diff --git a/target/arm/cpu_tcg.c b/target/arm/cpu_tcg.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu_tcg.c
+++ b/target/arm/cpu_tcg.c
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo cortexa8_cp_reginfo[] = {
       .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
     { .name = "L2AUXCR", .cp = 15, .crn = 9, .crm = 0, .opc1 = 1, .opc2 = 2,
       .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
-    REGINFO_SENTINEL
 };
 
 static void cortex_a8_initfn(Object *obj)
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo cortexa9_cp_reginfo[] = {
       .access = PL1_RW, .resetvalue = 0, .type = ARM_CP_CONST },
     { .name = "TLB_ATTR", .cp = 15, .crn = 15, .crm = 7, .opc1 = 5, .opc2 = 2,
       .access = PL1_RW, .resetvalue = 0, .type = ARM_CP_CONST },
-    REGINFO_SENTINEL
 };
 
 static void cortex_a9_initfn(Object *obj)
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo cortexa15_cp_reginfo[] = {
 #endif
     { .name = "L2ECTLR", .cp = 15, .crn = 9, .crm = 0, .opc1 = 1, .opc2 = 3,
       .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
-    REGINFO_SENTINEL
 };
 
 static void cortex_a7_initfn(Object *obj)
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo cortexr5_cp_reginfo[] = {
       .access = PL1_RW, .type = ARM_CP_CONST },
     { .name = "DCACHE_INVAL", .cp = 15, .opc1 = 0, .crn = 15, .crm = 5,
       .opc2 = 0, .access = PL1_W, .type = ARM_CP_NOP },
-    REGINFO_SENTINEL
 };
 
 static void cortex_r5_initfn(Object *obj)
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo cp_reginfo[] = {
       .secure = ARM_CP_SECSTATE_S,
       .fieldoffset = offsetof(CPUARMState, cp15.contextidr_s),
       .resetvalue = 0, .writefn = contextidr_write, .raw_writefn = raw_write, },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo not_v8_cp_reginfo[] = {
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo not_v8_cp_reginfo[] = {
     { .name = "CACHEMAINT", .cp = 15, .crn = 7, .crm = CP_ANY,
       .opc1 = 0, .opc2 = CP_ANY, .access = PL1_W,
       .type = ARM_CP_NOP | ARM_CP_OVERRIDE },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo not_v6_cp_reginfo[] = {
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo not_v6_cp_reginfo[] = {
      */
     { .name = "WFI_v5", .cp = 15, .crn = 7, .crm = 8, .opc1 = 0, .opc2 = 2,
       .access = PL1_W, .type = ARM_CP_WFI },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo not_v7_cp_reginfo[] = {
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo not_v7_cp_reginfo[] = {
       .opc1 = 0, .opc2 = 0, .access = PL1_RW, .type = ARM_CP_NOP },
     { .name = "NMRR", .cp = 15, .crn = 10, .crm = 2,
       .opc1 = 0, .opc2 = 1, .access = PL1_RW, .type = ARM_CP_NOP },
-    REGINFO_SENTINEL
 };
 
 static void cpacr_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v6_cp_reginfo[] = {
       .crn = 1, .crm = 0, .opc1 = 0, .opc2 = 2, .accessfn = cpacr_access,
       .access = PL1_RW, .fieldoffset = offsetof(CPUARMState, cp15.cpacr_el1),
       .resetfn = cpacr_reset, .writefn = cpacr_write, .readfn = cpacr_read },
-    REGINFO_SENTINEL
 };
 
 typedef struct pm_event {
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
     { .name = "TLBIMVAA", .cp = 15, .opc1 = 0, .crn = 8, .crm = 7, .opc2 = 3,
       .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
       .writefn = tlbimvaa_write },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo v7mp_cp_reginfo[] = {
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v7mp_cp_reginfo[] = {
     { .name = "TLBIMVAAIS", .cp = 15, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 3,
       .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
       .writefn = tlbimvaa_is_write },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo pmovsset_cp_reginfo[] = {
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo pmovsset_cp_reginfo[] = {
       .fieldoffset = offsetof(CPUARMState, cp15.c9_pmovsr),
       .writefn = pmovsset_write,
       .raw_writefn = raw_write },
-    REGINFO_SENTINEL
 };
 
 static void teecr_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo t2ee_cp_reginfo[] = {
     { .name = "TEEHBR", .cp = 14, .crn = 1, .crm = 0, .opc1 = 6, .opc2 = 0,
       .access = PL0_RW, .fieldoffset = offsetof(CPUARMState, teehbr),
       .accessfn = teehbr_access, .resetvalue = 0 },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo v6k_cp_reginfo[] = {
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v6k_cp_reginfo[] = {
       .bank_fieldoffsets = { offsetoflow32(CPUARMState, cp15.tpidrprw_s),
                              offsetoflow32(CPUARMState, cp15.tpidrprw_ns) },
       .resetvalue = 0 },
-    REGINFO_SENTINEL
 };
 
 #ifndef CONFIG_USER_ONLY
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo generic_timer_cp_reginfo[] = {
       .fieldoffset = offsetof(CPUARMState, cp15.c14_timer[GTIMER_SEC].cval),
       .writefn = gt_sec_cval_write, .raw_writefn = raw_write,
     },
-    REGINFO_SENTINEL
 };
 
 static CPAccessResult e2h_access(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo generic_timer_cp_reginfo[] = {
       .access = PL0_R, .type = ARM_CP_NO_RAW | ARM_CP_IO,
       .readfn = gt_virt_cnt_read,
     },
-    REGINFO_SENTINEL
 };
 
 #endif
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo vapa_cp_reginfo[] = {
       .access = PL1_W, .accessfn = ats_access,
       .writefn = ats_write, .type = ARM_CP_NO_RAW | ARM_CP_RAISES_EXC },
 #endif
-    REGINFO_SENTINEL
 };
 
 /* Return basic MPU access permission bits.  */
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo pmsav7_cp_reginfo[] = {
       .fieldoffset = offsetof(CPUARMState, pmsav7.rnr[M_REG_NS]),
       .writefn = pmsav7_rgnr_write,
       .resetfn = arm_cp_reset_ignore },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo pmsav5_cp_reginfo[] = {
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo pmsav5_cp_reginfo[] = {
     { .name = "946_PRBS7", .cp = 15, .crn = 6, .crm = 7, .opc1 = 0,
       .opc2 = CP_ANY, .access = PL1_RW, .resetvalue = 0,
       .fieldoffset = offsetof(CPUARMState, cp15.c6_region[7]) },
-    REGINFO_SENTINEL
 };
 
 static void vmsa_ttbcr_raw_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo vmsa_pmsa_cp_reginfo[] = {
       .access = PL1_RW, .accessfn = access_tvm_trvm,
       .fieldoffset = offsetof(CPUARMState, cp15.far_el[1]),
       .resetvalue = 0, },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo vmsa_cp_reginfo[] = {
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo vmsa_cp_reginfo[] = {
       /* No offsetoflow32 -- pass the entire TCR to writefn/raw_writefn. */
       .bank_fieldoffsets = { offsetof(CPUARMState, cp15.tcr_el[3]),
                              offsetof(CPUARMState, cp15.tcr_el[1])} },
-    REGINFO_SENTINEL
 };
 
 /* Note that unlike TTBCR, writing to TTBCR2 does not require flushing
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo omap_cp_reginfo[] = {
     { .name = "C9", .cp = 15, .crn = 9,
       .crm = CP_ANY, .opc1 = CP_ANY, .opc2 = CP_ANY, .access = PL1_RW,
       .type = ARM_CP_CONST | ARM_CP_OVERRIDE, .resetvalue = 0 },
-    REGINFO_SENTINEL
 };
 
 static void xscale_cpar_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo xscale_cp_reginfo[] = {
     { .name = "XSCALE_UNLOCK_DCACHE",
       .cp = 15, .opc1 = 0, .crn = 9, .crm = 2, .opc2 = 1,
       .access = PL1_W, .type = ARM_CP_NOP },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo dummy_c15_cp_reginfo[] = {
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo dummy_c15_cp_reginfo[] = {
       .access = PL1_RW,
       .type = ARM_CP_CONST | ARM_CP_NO_RAW | ARM_CP_OVERRIDE,
       .resetvalue = 0 },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo cache_dirty_status_cp_reginfo[] = {
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo cache_dirty_status_cp_reginfo[] = {
     { .name = "CDSR", .cp = 15, .crn = 7, .crm = 10, .opc1 = 0, .opc2 = 6,
       .access = PL1_R, .type = ARM_CP_CONST | ARM_CP_NO_RAW,
       .resetvalue = 0 },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo cache_block_ops_cp_reginfo[] = {
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo cache_block_ops_cp_reginfo[] = {
       .access = PL0_W, .type = ARM_CP_NOP|ARM_CP_64BIT },
     { .name = "CIDCR", .cp = 15, .crm = 14, .opc1 = 0,
       .access = PL1_W, .type = ARM_CP_NOP|ARM_CP_64BIT },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo cache_test_clean_cp_reginfo[] = {
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo cache_test_clean_cp_reginfo[] = {
     { .name = "TCI_DCACHE", .cp = 15, .crn = 7, .crm = 14, .opc1 = 0, .opc2 = 3,
       .access = PL0_R, .type = ARM_CP_CONST | ARM_CP_NO_RAW,
       .resetvalue = (1 << 30) },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo strongarm_cp_reginfo[] = {
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo strongarm_cp_reginfo[] = {
       .crm = CP_ANY, .opc1 = CP_ANY, .opc2 = CP_ANY,
       .access = PL1_RW, .resetvalue = 0,
       .type = ARM_CP_CONST | ARM_CP_OVERRIDE | ARM_CP_NO_RAW },
-    REGINFO_SENTINEL
 };
 
 static uint64_t midr_read(CPUARMState *env, const ARMCPRegInfo *ri)
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo lpae_cp_reginfo[] = {
       .bank_fieldoffsets = { offsetof(CPUARMState, cp15.ttbr1_s),
                              offsetof(CPUARMState, cp15.ttbr1_ns) },
       .writefn = vmsa_ttbr_write, },
-    REGINFO_SENTINEL
 };
 
 static uint64_t aa64_fpcr_read(CPUARMState *env, const ARMCPRegInfo *ri)
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
       .access = PL1_RW, .accessfn = access_trap_aa32s_el1,
       .writefn = sdcr_write,
       .fieldoffset = offsetoflow32(CPUARMState, cp15.mdcr_el3) },
-    REGINFO_SENTINEL
 };
 
 /* Used to describe the behaviour of EL2 regs when EL2 does not exist.  */
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo el3_no_el2_cp_reginfo[] = {
       .type = ARM_CP_CONST,
       .cp = 15, .opc1 = 4, .crn = 6, .crm = 0, .opc2 = 2,
       .access = PL2_RW, .resetvalue = 0 },
-    REGINFO_SENTINEL
 };
 
 /* Ditto, but for registers which exist in ARMv8 but not v7 */
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo el3_no_el2_v8_cp_reginfo[] = {
       .cp = 15, .opc1 = 4, .crn = 1, .crm = 1, .opc2 = 4,
       .access = PL2_RW,
       .type = ARM_CP_CONST, .resetvalue = 0 },
-    REGINFO_SENTINEL
 };
 
 static void do_hcr_write(CPUARMState *env, uint64_t value, uint64_t valid_mask)
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo el2_cp_reginfo[] = {
       .cp = 15, .opc0 = 3, .opc1 = 4, .crn = 1, .crm = 1, .opc2 = 3,
       .access = PL2_RW,
       .fieldoffset = offsetof(CPUARMState, cp15.hstr_el2) },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo el2_v8_cp_reginfo[] = {
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo el2_v8_cp_reginfo[] = {
       .access = PL2_RW,
       .fieldoffset = offsetofhigh32(CPUARMState, cp15.hcr_el2),
       .writefn = hcr_writehigh },
-    REGINFO_SENTINEL
 };
 
 static CPAccessResult sel2_access(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo el2_sec_cp_reginfo[] = {
       .opc0 = 3, .opc1 = 4, .crn = 2, .crm = 6, .opc2 = 2,
       .access = PL2_RW, .accessfn = sel2_access,
       .fieldoffset = offsetof(CPUARMState, cp15.vstcr_el2) },
-    REGINFO_SENTINEL
 };
 
 static CPAccessResult nsacr_access(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo el3_cp_reginfo[] = {
       .opc0 = 1, .opc1 = 6, .crn = 8, .crm = 7, .opc2 = 5,
       .access = PL3_W, .type = ARM_CP_NO_RAW,
       .writefn = tlbi_aa64_vae3_write },
-    REGINFO_SENTINEL
 };
 
 #ifndef CONFIG_USER_ONLY
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo debug_cp_reginfo[] = {
       .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 0, .crm = 2, .opc2 = 0,
       .access = PL1_RW, .accessfn = access_tda,
       .type = ARM_CP_NOP },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo debug_lpae_cp_reginfo[] = {
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo debug_lpae_cp_reginfo[] = {
       .access = PL0_R, .type = ARM_CP_CONST|ARM_CP_64BIT, .resetvalue = 0 },
     { .name = "DBGDSAR", .cp = 14, .crm = 2, .opc1 = 0,
       .access = PL0_R, .type = ARM_CP_CONST|ARM_CP_64BIT, .resetvalue = 0 },
-    REGINFO_SENTINEL
 };
 
 /* Return the exception level to which exceptions should be taken
@@ -XXX,XX +XXX,XX @@ static void define_debug_regs(ARMCPU *cpu)
               .fieldoffset = offsetof(CPUARMState, cp15.dbgbcr[i]),
               .writefn = dbgbcr_write, .raw_writefn = raw_write
             },
-            REGINFO_SENTINEL
         };
         define_arm_cp_regs(cpu, dbgregs);
     }
@@ -XXX,XX +XXX,XX @@ static void define_debug_regs(ARMCPU *cpu)
               .fieldoffset = offsetof(CPUARMState, cp15.dbgwcr[i]),
               .writefn = dbgwcr_write, .raw_writefn = raw_write
             },
-            REGINFO_SENTINEL
         };
         define_arm_cp_regs(cpu, dbgregs);
     }
@@ -XXX,XX +XXX,XX @@ static void define_pmu_regs(ARMCPU *cpu)
               .type = ARM_CP_IO,
               .readfn = pmevtyper_readfn, .writefn = pmevtyper_writefn,
               .raw_writefn = pmevtyper_rawwrite },
-            REGINFO_SENTINEL
         };
         define_arm_cp_regs(cpu, pmev_regs);
         g_free(pmevcntr_name);
@@ -XXX,XX +XXX,XX @@ static void define_pmu_regs(ARMCPU *cpu)
               .cp = 15, .opc1 = 0, .crn = 9, .crm = 14, .opc2 = 5,
               .access = PL0_R, .accessfn = pmreg_access, .type = ARM_CP_CONST,
               .resetvalue = extract64(cpu->pmceid1, 32, 32) },
-            REGINFO_SENTINEL
         };
         define_arm_cp_regs(cpu, v81_pmu_regs);
     }
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo lor_reginfo[] = {
       .opc0 = 3, .opc1 = 0, .crn = 10, .crm = 4, .opc2 = 7,
       .access = PL1_R, .accessfn = access_lor_ns,
       .type = ARM_CP_CONST, .resetvalue = 0 },
-    REGINFO_SENTINEL
 };
 
 #ifdef TARGET_AARCH64
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo pauth_reginfo[] = {
       .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 1, .opc2 = 3,
       .access = PL1_RW, .accessfn = access_pauth,
       .fieldoffset = offsetof(CPUARMState, keys.apib.hi) },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo tlbirange_reginfo[] = {
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo tlbirange_reginfo[] = {
       .opc0 = 1, .opc1 = 6, .crn = 8, .crm = 6, .opc2 = 5,
       .access = PL3_W, .type = ARM_CP_NO_RAW,
       .writefn = tlbi_aa64_rvae3_write },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo tlbios_reginfo[] = {
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo tlbios_reginfo[] = {
       .opc0 = 1, .opc1 = 6, .crn = 8, .crm = 1, .opc2 = 5,
       .access = PL3_W, .type = ARM_CP_NO_RAW,
       .writefn = tlbi_aa64_vae3is_write },
-    REGINFO_SENTINEL
 };
 
 static uint64_t rndr_readfn(CPUARMState *env, const ARMCPRegInfo *ri)
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo rndr_reginfo[] = {
       .type = ARM_CP_NO_RAW | ARM_CP_SUPPRESS_TB_END | ARM_CP_IO,
       .opc0 = 3, .opc1 = 3, .crn = 2, .crm = 4, .opc2 = 1,
       .access = PL0_R, .readfn = rndr_readfn },
-    REGINFO_SENTINEL
 };
 
 #ifndef CONFIG_USER_ONLY
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo dcpop_reg[] = {
       .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 12, .opc2 = 1,
       .access = PL0_W, .type = ARM_CP_NO_RAW | ARM_CP_SUPPRESS_TB_END,
       .accessfn = aa64_cacheop_poc_access, .writefn = dccvap_writefn },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo dcpodp_reg[] = {
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo dcpodp_reg[] = {
       .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 13, .opc2 = 1,
       .access = PL0_W, .type = ARM_CP_NO_RAW | ARM_CP_SUPPRESS_TB_END,
       .accessfn = aa64_cacheop_poc_access, .writefn = dccvap_writefn },
-    REGINFO_SENTINEL
 };
 #endif /*CONFIG_USER_ONLY*/
 
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo mte_reginfo[] = {
     { .name = "DC_CIGDSW", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 14, .opc2 = 6,
       .type = ARM_CP_NOP, .access = PL1_W, .accessfn = access_tsw },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo mte_tco_ro_reginfo[] = {
     { .name = "TCO", .state = ARM_CP_STATE_AA64,
       .opc0 = 3, .opc1 = 3, .crn = 4, .crm = 2, .opc2 = 7,
       .type = ARM_CP_CONST, .access = PL0_RW, },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo mte_el0_cacheop_reginfo[] = {
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo mte_el0_cacheop_reginfo[] = {
       .accessfn = aa64_zva_access,
 #endif
     },
-    REGINFO_SENTINEL
 };
 
 #endif
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo predinv_reginfo[] = {
     { .name = "CPPRCTX", .state = ARM_CP_STATE_AA32,
       .cp = 15, .opc1 = 0, .crn = 7, .crm = 3, .opc2 = 7,
       .type = ARM_CP_NOP, .access = PL0_W, .accessfn = access_predinv },
-    REGINFO_SENTINEL
 };
 
 static uint64_t ccsidr2_read(CPUARMState *env, const ARMCPRegInfo *ri)
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo ccsidr2_reginfo[] = {
       .access = PL1_R,
       .accessfn = access_aa64_tid2,
       .readfn = ccsidr2_read, .type = ARM_CP_NO_RAW },
-    REGINFO_SENTINEL
 };
 
 static CPAccessResult access_aa64_tid3(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo jazelle_regs[] = {
       .cp = 14, .crn = 2, .crm = 0, .opc1 = 7, .opc2 = 0,
       .accessfn = access_joscr_jmcr,
       .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo vhe_reginfo[] = {
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo vhe_reginfo[] = {
       .access = PL2_RW, .accessfn = e2h_access,
       .writefn = gt_virt_cval_write, .raw_writefn = raw_write },
 #endif
-    REGINFO_SENTINEL
 };
 
 #ifndef CONFIG_USER_ONLY
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo ats1e1_reginfo[] = {
       .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 9, .opc2 = 1,
       .access = PL1_W, .type = ARM_CP_NO_RAW | ARM_CP_RAISES_EXC,
       .writefn = ats_write64 },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo ats1cp_reginfo[] = {
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo ats1cp_reginfo[] = {
       .cp = 15, .opc1 = 0, .crn = 7, .crm = 9, .opc2 = 1,
       .access = PL1_W, .type = ARM_CP_NO_RAW | ARM_CP_RAISES_EXC,
       .writefn = ats_write },
-    REGINFO_SENTINEL
 };
 #endif
 
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo actlr2_hactlr2_reginfo[] = {
       .cp = 15, .opc1 = 4, .crn = 1, .crm = 0, .opc2 = 3,
       .access = PL2_RW, .type = ARM_CP_CONST,
       .resetvalue = 0 },
-    REGINFO_SENTINEL
 };
 
 void register_cp_regs_for_features(ARMCPU *cpu)
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
               .access = PL1_R, .type = ARM_CP_CONST,
               .accessfn = access_aa32_tid3,
               .resetvalue = cpu->isar.id_isar6 },
-            REGINFO_SENTINEL
         };
         define_arm_cp_regs(cpu, v6_idregs);
         define_arm_cp_regs(cpu, v6_cp_reginfo);
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
               .opc0 = 3, .opc1 = 3, .crn = 9, .crm = 12, .opc2 = 7,
               .access = PL0_R, .accessfn = pmreg_access, .type = ARM_CP_CONST,
               .resetvalue = cpu->pmceid1 },
-            REGINFO_SENTINEL
         };
 #ifdef CONFIG_USER_ONLY
         ARMCPRegUserSpaceInfo v8_user_idregs[] = {
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
               .exported_bits = 0x000000f0ffffffff },
             { .name = "ID_AA64ISAR*_EL1_RESERVED",
               .is_glob = true                     },
-            REGUSERINFO_SENTINEL
         };
         modify_arm_cp_regs(v8_idregs, v8_user_idregs);
 #endif
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
               .access = PL2_RW,
               .resetvalue = vmpidr_def,
               .fieldoffset = offsetof(CPUARMState, cp15.vmpidr_el2) },
-            REGINFO_SENTINEL
         };
         define_arm_cp_regs(cpu, vpidr_regs);
         define_arm_cp_regs(cpu, el2_cp_reginfo);
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
                   .access = PL2_RW, .accessfn = access_el3_aa32ns,
                   .type = ARM_CP_NO_RAW,
                   .writefn = arm_cp_write_ignore, .readfn = mpidr_read },
-                REGINFO_SENTINEL
             };
             define_arm_cp_regs(cpu, vpidr_regs);
             define_arm_cp_regs(cpu, el3_no_el2_cp_reginfo);
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
               .raw_writefn = raw_write, .writefn = sctlr_write,
               .fieldoffset = offsetof(CPUARMState, cp15.sctlr_el[3]),
               .resetvalue = cpu->reset_sctlr },
-            REGINFO_SENTINEL
         };
 
         define_arm_cp_regs(cpu, el3_regs);
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
             { .name = "DUMMY",
               .cp = 15, .crn = 0, .crm = 7, .opc1 = 0, .opc2 = CP_ANY,
               .access = PL1_R, .type = ARM_CP_CONST, .resetvalue = 0 },
-            REGINFO_SENTINEL
         };
         ARMCPRegInfo id_v8_midr_cp_reginfo[] = {
             { .name = "MIDR_EL1", .state = ARM_CP_STATE_BOTH,
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
               .access = PL1_R,
               .accessfn = access_aa64_tid1,
               .type = ARM_CP_CONST, .resetvalue = cpu->revidr },
-            REGINFO_SENTINEL
         };
         ARMCPRegInfo id_cp_reginfo[] = {
             /* These are common to v8 and pre-v8 */
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
               .access = PL1_R,
               .accessfn = access_aa32_tid1,
               .type = ARM_CP_CONST, .resetvalue = 0 },
-            REGINFO_SENTINEL
         };
         /* TLBTR is specific to VMSA */
         ARMCPRegInfo id_tlbtr_reginfo = {
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
             { .name = "MIDR_EL1",
               .exported_bits = 0x00000000ffffffff },
             { .name = "REVIDR_EL1"                },
-            REGUSERINFO_SENTINEL
         };
         modify_arm_cp_regs(id_v8_midr_cp_reginfo, id_v8_user_midr_cp_reginfo);
 #endif
         if (arm_feature(env, ARM_FEATURE_OMAPCP) ||
             arm_feature(env, ARM_FEATURE_STRONGARM)) {
-            ARMCPRegInfo *r;
+            size_t i;
             /* Register the blanket "writes ignored" value first to cover the
              * whole space. Then update the specific ID registers to allow write
              * access, so that they ignore writes rather than causing them to
              * UNDEF.
              */
             define_one_arm_cp_reg(cpu, &crn0_wi_reginfo);
-            for (r = id_pre_v8_midr_cp_reginfo;
-                 r->type != ARM_CP_SENTINEL; r++) {
-                r->access = PL1_RW;
+            for (i = 0; i < ARRAY_SIZE(id_pre_v8_midr_cp_reginfo); ++i) {
+                id_pre_v8_midr_cp_reginfo[i].access = PL1_RW;
             }
-            for (r = id_cp_reginfo; r->type != ARM_CP_SENTINEL; r++) {
-                r->access = PL1_RW;
+            for (i = 0; i < ARRAY_SIZE(id_cp_reginfo); ++i) {
+                id_cp_reginfo[i].access = PL1_RW;
             }
             id_mpuir_reginfo.access = PL1_RW;
             id_tlbtr_reginfo.access = PL1_RW;
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
             { .name = "MPIDR_EL1", .state = ARM_CP_STATE_BOTH,
               .opc0 = 3, .crn = 0, .crm = 0, .opc1 = 0, .opc2 = 5,
               .access = PL1_R, .readfn = mpidr_read, .type = ARM_CP_NO_RAW },
-            REGINFO_SENTINEL
         };
 #ifdef CONFIG_USER_ONLY
         ARMCPRegUserSpaceInfo mpidr_user_cp_reginfo[] = {
             { .name = "MPIDR_EL1",
               .fixed_bits = 0x0000000080000000 },
-            REGUSERINFO_SENTINEL
         };
         modify_arm_cp_regs(mpidr_cp_reginfo, mpidr_user_cp_reginfo);
 #endif
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
               .opc0 = 3, .opc1 = 6, .crn = 1, .crm = 0, .opc2 = 1,
               .access = PL3_RW, .type = ARM_CP_CONST,
               .resetvalue = 0 },
-            REGINFO_SENTINEL
         };
         define_arm_cp_regs(cpu, auxcr_reginfo);
         if (cpu_isar_feature(aa32_ac2, cpu)) {
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
                   .type = ARM_CP_CONST,
                   .opc0 = 3, .opc1 = 1, .crn = 15, .crm = 3, .opc2 = 0,
                   .access = PL1_R, .resetvalue = cpu->reset_cbar },
-                REGINFO_SENTINEL
             };
             /* We don't implement a r/w 64 bit CBAR currently */
             assert(arm_feature(env, ARM_FEATURE_CBAR_RO));
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
               .bank_fieldoffsets = { offsetof(CPUARMState, cp15.vbar_s),
                                      offsetof(CPUARMState, cp15.vbar_ns) },
               .resetvalue = 0 },
-            REGINFO_SENTINEL
         };
         define_arm_cp_regs(cpu, vbar_cp_reginfo);
     }
@@ -XXX,XX +XXX,XX @@ void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu,
                    r->writefn);
         }
     }
-    /* Bad type field probably means missing sentinel at end of reg list */
-    assert(cptype_valid(r->type));
+
     for (crm = crmmin; crm <= crmmax; crm++) {
         for (opc1 = opc1min; opc1 <= opc1max; opc1++) {
             for (opc2 = opc2min; opc2 <= opc2max; opc2++) {
@@ -XXX,XX +XXX,XX @@ void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu,
     }
 }
 
-void define_arm_cp_regs_with_opaque(ARMCPU *cpu,
-                                    const ARMCPRegInfo *regs, void *opaque)
+/* Define a whole list of registers */
+void define_arm_cp_regs_with_opaque_len(ARMCPU *cpu, const ARMCPRegInfo *regs,
+                                        void *opaque, size_t len)
 {
-    /* Define a whole list of registers */
-    const ARMCPRegInfo *r;
-    for (r = regs; r->type != ARM_CP_SENTINEL; r++) {
-        define_one_arm_cp_reg_with_opaque(cpu, r, opaque);
+    size_t i;
+    for (i = 0; i < len; ++i) {
+        define_one_arm_cp_reg_with_opaque(cpu, regs + i, opaque);
     }
 }
 
@@ -XXX,XX +XXX,XX @@ void define_arm_cp_regs_with_opaque(ARMCPU *cpu,
  * user-space cannot alter any values and dynamic values pertaining to
  * execution state are hidden from user space view anyway.
  */
-void modify_arm_cp_regs(ARMCPRegInfo *regs, const ARMCPRegUserSpaceInfo *mods)
+void modify_arm_cp_regs_with_len(ARMCPRegInfo *regs, size_t regs_len,
+                                 const ARMCPRegUserSpaceInfo *mods,
+                                 size_t mods_len)
 {
-    const ARMCPRegUserSpaceInfo *m;
-    ARMCPRegInfo *r;
-
-    for (m = mods; m->name; m++) {
+    for (size_t mi = 0; mi < mods_len; ++mi) {
+        const ARMCPRegUserSpaceInfo *m = mods + mi;
         GPatternSpec *pat = NULL;
+
         if (m->is_glob) {
             pat = g_pattern_spec_new(m->name);
         }
-        for (r = regs; r->type != ARM_CP_SENTINEL; r++) {
+        for (size_t ri = 0; ri < regs_len; ++ri) {
+            ARMCPRegInfo *r = regs + ri;
+
             if (pat && g_pattern_match_string(pat, r->name)) {
                 r->type = ARM_CP_CONST;
                 r->access = PL0U_R;
-- 
2.25.1

From: Richard Henderson <richard.henderson@linaro.org>

These particular data structures are not modified at runtime.

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
               .resetvalue = cpu->pmceid1 },
         };
 #ifdef CONFIG_USER_ONLY
-        ARMCPRegUserSpaceInfo v8_user_idregs[] = {
+        static const ARMCPRegUserSpaceInfo v8_user_idregs[] = {
             { .name = "ID_AA64PFR0_EL1",
               .exported_bits = 0x000f000f00ff0000,
               .fixed_bits    = 0x0000000000000011 },
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
      */
     if (arm_feature(env, ARM_FEATURE_EL3)) {
         if (arm_feature(env, ARM_FEATURE_AARCH64)) {
-            ARMCPRegInfo nsacr = {
+            static const ARMCPRegInfo nsacr = {
                 .name = "NSACR", .type = ARM_CP_CONST,
                 .cp = 15, .opc1 = 0, .crn = 1, .crm = 1, .opc2 = 2,
                 .access = PL1_RW, .accessfn = nsacr_access,
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
             };
             define_one_arm_cp_reg(cpu, &nsacr);
         } else {
-            ARMCPRegInfo nsacr = {
+            static const ARMCPRegInfo nsacr = {
                 .name = "NSACR",
                 .cp = 15, .opc1 = 0, .crn = 1, .crm = 1, .opc2 = 2,
                 .access = PL3_RW | PL1_R,
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
         }
     } else {
         if (arm_feature(env, ARM_FEATURE_V8)) {
-            ARMCPRegInfo nsacr = {
+            static const ARMCPRegInfo nsacr = {
                 .name = "NSACR", .type = ARM_CP_CONST,
                 .cp = 15, .opc1 = 0, .crn = 1, .crm = 1, .opc2 = 2,
                 .access = PL1_R,
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
               .access = PL1_R, .type = ARM_CP_CONST,
               .resetvalue = cpu->pmsav7_dregion << 8
         };
-        ARMCPRegInfo crn0_wi_reginfo = {
+        static const ARMCPRegInfo crn0_wi_reginfo = {
             .name = "CRN0_WI", .cp = 15, .crn = 0, .crm = CP_ANY,
             .opc1 = CP_ANY, .opc2 = CP_ANY, .access = PL1_W,
             .type = ARM_CP_NOP | ARM_CP_OVERRIDE
         };
 #ifdef CONFIG_USER_ONLY
-        ARMCPRegUserSpaceInfo id_v8_user_midr_cp_reginfo[] = {
+        static const ARMCPRegUserSpaceInfo id_v8_user_midr_cp_reginfo[] = {
             { .name = "MIDR_EL1",
               .exported_bits = 0x00000000ffffffff },
             { .name = "REVIDR_EL1"                },
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
               .access = PL1_R, .readfn = mpidr_read, .type = ARM_CP_NO_RAW },
         };
 #ifdef CONFIG_USER_ONLY
-        ARMCPRegUserSpaceInfo mpidr_user_cp_reginfo[] = {
+        static const ARMCPRegUserSpaceInfo mpidr_user_cp_reginfo[] = {
             { .name = "MPIDR_EL1",
               .fixed_bits = 0x0000000080000000 },
         };
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
     }
 
     if (arm_feature(env, ARM_FEATURE_VBAR)) {
-        ARMCPRegInfo vbar_cp_reginfo[] = {
+        static const ARMCPRegInfo vbar_cp_reginfo[] = {
             { .name = "VBAR", .state = ARM_CP_STATE_BOTH,
               .opc0 = 3, .crn = 12, .crm = 0, .opc1 = 0, .opc2 = 0,
               .access = PL1_RW, .writefn = vbar_write,
-- 
2.25.1

From: Richard Henderson <richard.henderson@linaro.org>

Instead of defining ARM_CP_FLAG_MASK to remove flags,
define ARM_CP_SPECIAL_MASK to isolate special cases.
Sort the specials to the low bits. Use an enum.

Split the large comment block so as to document each
value separately.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20220501055028.646596-6-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpregs.h        | 130 +++++++++++++++++++++++--------------
 target/arm/cpu.c           |   4 +-
 target/arm/helper.c        |   4 +-
 target/arm/translate-a64.c |   6 +-
 target/arm/translate.c     |   6 +-
 5 files changed, 92 insertions(+), 58 deletions(-)

diff --git a/target/arm/cpregs.h b/target/arm/cpregs.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpregs.h
+++ b/target/arm/cpregs.h
@@ -XXX,XX +XXX,XX @@
 #define TARGET_ARM_CPREGS_H
 
 /*
- * ARMCPRegInfo type field bits. If the SPECIAL bit is set this is a
- * special-behaviour cp reg and bits [11..8] indicate what behaviour
- * it has. Otherwise it is a simple cp reg, where CONST indicates that
- * TCG can assume the value to be constant (ie load at translate time)
- * and 64BIT indicates a 64 bit wide coprocessor register. SUPPRESS_TB_END
- * indicates that the TB should not be ended after a write to this register
- * (the default is that the TB ends after cp writes). OVERRIDE permits
- * a register definition to override a previous definition for the
- * same (cp, is64, crn, crm, opc1, opc2) tuple: either the new or the
- * old must have the OVERRIDE bit set.
- * ALIAS indicates that this register is an alias view of some underlying
- * state which is also visible via another register, and that the other
- * register is handling migration and reset; registers marked ALIAS will not be
- * migrated but may have their state set by syncing of register state from KVM.
- * NO_RAW indicates that this register has no underlying state and does not
- * support raw access for state saving/loading; it will not be used for either
- * migration or KVM state synchronization. (Typically this is for "registers"
- * which are actually used as instructions for cache maintenance and so on.)
- * IO indicates that this register does I/O and therefore its accesses
- * need to be marked with gen_io_start() and also end the TB. In particular,
- * registers which implement clocks or timers require this.
- * RAISES_EXC is for when the read or write hook might raise an exception;
- * the generated code will synchronize the CPU state before calling the hook
- * so that it is safe for the hook to call raise_exception().
- * NEWEL is for writes to registers that might change the exception
- * level - typically on older ARM chips. For those cases we need to
- * re-read the new el when recomputing the translation flags.
+ * ARMCPRegInfo type field bits:
  */
-#define ARM_CP_SPECIAL           0x0001
-#define ARM_CP_CONST             0x0002
-#define ARM_CP_64BIT             0x0004
-#define ARM_CP_SUPPRESS_TB_END   0x0008
-#define ARM_CP_OVERRIDE          0x0010
-#define ARM_CP_ALIAS             0x0020
-#define ARM_CP_IO                0x0040
-#define ARM_CP_NO_RAW            0x0080
-#define ARM_CP_NOP               (ARM_CP_SPECIAL | 0x0100)
-#define ARM_CP_WFI               (ARM_CP_SPECIAL | 0x0200)
-#define ARM_CP_NZCV              (ARM_CP_SPECIAL | 0x0300)
-#define ARM_CP_CURRENTEL         (ARM_CP_SPECIAL | 0x0400)
-#define ARM_CP_DC_ZVA            (ARM_CP_SPECIAL | 0x0500)
-#define ARM_CP_DC_GVA            (ARM_CP_SPECIAL | 0x0600)
-#define ARM_CP_DC_GZVA           (ARM_CP_SPECIAL | 0x0700)
-#define ARM_LAST_SPECIAL         ARM_CP_DC_GZVA
-#define ARM_CP_FPU               0x1000
-#define ARM_CP_SVE               0x2000
-#define ARM_CP_NO_GDB            0x4000
-#define ARM_CP_RAISES_EXC        0x8000
-#define ARM_CP_NEWEL             0x10000
-/* Mask of only the flag bits in a type field */
-#define ARM_CP_FLAG_MASK         0x1f0ff
+enum {
+    /*
+     * Register must be handled specially during translation.
+     * The method is one of the values below:
+     */
+    ARM_CP_SPECIAL_MASK          = 0x000f,
+    /* Special: no change to PE state: writes ignored, reads ignored. */
+    ARM_CP_NOP                   = 0x0001,
+    /* Special: sysreg is WFI, for v5 and v6. */
+    ARM_CP_WFI                   = 0x0002,
+    /* Special: sysreg is NZCV. */
+    ARM_CP_NZCV                  = 0x0003,
+    /* Special: sysreg is CURRENTEL. */
+    ARM_CP_CURRENTEL             = 0x0004,
+    /* Special: sysreg is DC ZVA or similar. */
+    ARM_CP_DC_ZVA                = 0x0005,
+    ARM_CP_DC_GVA                = 0x0006,
+    ARM_CP_DC_GZVA               = 0x0007,
+
+    /* Flag: reads produce resetvalue; writes ignored. */
+    ARM_CP_CONST                 = 1 << 4,
+    /* Flag: For ARM_CP_STATE_AA32, sysreg is 64-bit. */
+    ARM_CP_64BIT                 = 1 << 5,
+    /*
+     * Flag: TB should not be ended after a write to this register
+     * (the default is that the TB ends after cp writes).
+     */
+    ARM_CP_SUPPRESS_TB_END       = 1 << 6,
+    /*
+     * Flag: Permit a register definition to override a previous definition
+     * for the same (cp, is64, crn, crm, opc1, opc2) tuple: either the new
+     * or the old must have the ARM_CP_OVERRIDE bit set.
+     */
+    ARM_CP_OVERRIDE              = 1 << 7,
+    /*
+     * Flag: Register is an alias view of some underlying state which is also
+     * visible via another register, and that the other register is handling
+     * migration and reset; registers marked ARM_CP_ALIAS will not be migrated
+     * but may have their state set by syncing of register state from KVM.
+     */
+    ARM_CP_ALIAS                 = 1 << 8,
+    /*
+     * Flag: Register does I/O and therefore its accesses need to be marked
+     * with gen_io_start() and also end the TB. In particular, registers which
+     * implement clocks or timers require this.
+     */
+    ARM_CP_IO                    = 1 << 9,
+    /*
+     * Flag: Register has no underlying state and does not support raw access
+     * for state saving/loading; it will not be used for either migration or
+     * KVM state synchronization. Typically this is for "registers" which are
+     * actually used as instructions for cache maintenance and so on.
+     */
+    ARM_CP_NO_RAW                = 1 << 10,
+    /*
+     * Flag: The read or write hook might raise an exception; the generated
+     * code will synchronize the CPU state before calling the hook so that it
+     * is safe for the hook to call raise_exception().
+     */
+    ARM_CP_RAISES_EXC            = 1 << 11,
+    /*
+     * Flag: Writes to the sysreg might change the exception level - typically
+     * on older ARM chips. For those cases we need to re-read the new el when
+     * recomputing the translation flags.
+     */
+    ARM_CP_NEWEL                 = 1 << 12,
+    /*
+     * Flag: Access check for this sysreg is identical to accessing FPU state
+     * from an instruction: use translation fp_access_check().
+     */
+    ARM_CP_FPU                   = 1 << 13,
+    /*
+     * Flag: Access check for this sysreg is identical to accessing SVE state
+     * from an instruction: use translation sve_access_check().
+     */
+    ARM_CP_SVE                   = 1 << 14,
+    /* Flag: Do not expose in gdb sysreg xml. */
+    ARM_CP_NO_GDB                = 1 << 15,
+};
 
 /*
  * Valid values for ARMCPRegInfo state field, indicating which of
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void cp_reg_reset(gpointer key, gpointer value, gpointer opaque)
     ARMCPRegInfo *ri = value;
     ARMCPU *cpu = opaque;
 
-    if (ri->type & (ARM_CP_SPECIAL | ARM_CP_ALIAS)) {
+    if (ri->type & (ARM_CP_SPECIAL_MASK | ARM_CP_ALIAS)) {
         return;
     }
 
@@ -XXX,XX +XXX,XX @@ static void cp_reg_check_reset(gpointer key, gpointer value,  gpointer opaque)
     ARMCPU *cpu = opaque;
     uint64_t oldvalue, newvalue;
 
-    if (ri->type & (ARM_CP_SPECIAL | ARM_CP_ALIAS | ARM_CP_NO_RAW)) {
+    if (ri->type & (ARM_CP_SPECIAL_MASK | ARM_CP_ALIAS | ARM_CP_NO_RAW)) {
         return;
     }
 
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
      * multiple times. Special registers (ie NOP/WFI) are
      * never migratable and not even raw-accessible.
      */
-    if ((r->type & ARM_CP_SPECIAL)) {
+    if (r->type & ARM_CP_SPECIAL_MASK) {
         r2->type |= ARM_CP_NO_RAW;
     }
     if (((r->crm == CP_ANY) && crm != 0) ||
@@ -XXX,XX +XXX,XX @@ void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu,
     /* Check that the register definition has enough info to handle
      * reads and writes if they are permitted.
      */
-    if (!(r->type & (ARM_CP_SPECIAL|ARM_CP_CONST))) {
+    if (!(r->type & (ARM_CP_SPECIAL_MASK | ARM_CP_CONST))) {
         if (r->access & PL3_R) {
             assert((r->fieldoffset ||
                    (r->bank_fieldoffsets[0] && r->bank_fieldoffsets[1])) ||
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
     }
 
     /* Handle special cases first */
-    switch (ri->type & ~(ARM_CP_FLAG_MASK & ~ARM_CP_SPECIAL)) {
+    switch (ri->type & ARM_CP_SPECIAL_MASK) {
+    case 0:
+        break;
     case ARM_CP_NOP:
         return;
     case ARM_CP_NZCV:
@@ -XXX,XX +XXX,XX @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
         }
         return;
     default:
-        break;
+        g_assert_not_reached();
     }
     if ((ri->type & ARM_CP_FPU) && !fp_access_check(s)) {
         return;
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void do_coproc_insn(DisasContext *s, int cpnum, int is64,
         }
 
         /* Handle special cases first */
-        switch (ri->type & ~(ARM_CP_FLAG_MASK & ~ARM_CP_SPECIAL)) {
+        switch (ri->type & ARM_CP_SPECIAL_MASK) {
+        case 0:
+            break;
         case ARM_CP_NOP:
             return;
         case ARM_CP_WFI:
@@ -XXX,XX +XXX,XX @@ static void do_coproc_insn(DisasContext *s, int cpnum, int is64,
             s->base.is_jmp = DISAS_WFI;
             return;
         default:
-            break;
+            g_assert_not_reached();
         }
 
         if ((tb_cflags(s->base.tb) & CF_USE_ICOUNT) && (ri->type & ARM_CP_IO)) {
-- 
2.25.1

From: Richard Henderson <richard.henderson@linaro.org>

Standardize on g_assert_not_reached() for "should not happen".
Retain abort() when preceeded by fprintf or error_report.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20220501055028.646596-7-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.c         | 7 +++----
 target/arm/hvf/hvf.c        | 2 +-
 target/arm/kvm-stub.c       | 4 ++--
 target/arm/kvm.c            | 4 ++--
 target/arm/machine.c        | 4 ++--
 target/arm/translate-a64.c  | 4 ++--
 target/arm/translate-neon.c | 2 +-
 target/arm/translate.c      | 4 ++--
 8 files changed, 15 insertions(+), 16 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu,
             break;
         default:
             /* broken reginfo with out-of-range opc1 */
-            assert(false);
-            break;
+            g_assert_not_reached();
         }
         /* assert our permissions are not too lax (stricter is fine) */
         assert((r->access & ~mask) == 0);
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_v5(CPUARMState *env, uint32_t address,
             break;
         default:
             /* Never happens, but compiler isn't smart enough to tell.  */
-            abort();
+            g_assert_not_reached();
         }
     }
     *prot = ap_to_rw_prot(env, mmu_idx, ap, domain_prot);
@@ -XXX,XX +XXX,XX @@ static bool get_phys_addr_v6(CPUARMState *env, uint32_t address,
             break;
         default:
             /* Never happens, but compiler isn't smart enough to tell.  */
-            abort();
+            g_assert_not_reached();
         }
     }
     if (domain_prot == 3) {
diff --git a/target/arm/hvf/hvf.c b/target/arm/hvf/hvf.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/hvf/hvf.c
+++ b/target/arm/hvf/hvf.c
@@ -XXX,XX +XXX,XX @@ int hvf_vcpu_exec(CPUState *cpu)
         /* we got kicked, no exit to process */
         return 0;
     default:
-        assert(0);
+        g_assert_not_reached();
     }
 
     hvf_sync_vtimer(cpu);
diff --git a/target/arm/kvm-stub.c b/target/arm/kvm-stub.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/kvm-stub.c
+++ b/target/arm/kvm-stub.c
@@ -XXX,XX +XXX,XX @@
 
 bool write_kvmstate_to_list(ARMCPU *cpu)
 {
-    abort();
+    g_assert_not_reached();
 }
 
 bool write_list_to_kvmstate(ARMCPU *cpu, int level)
 {
-    abort();
+    g_assert_not_reached();
 }
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -XXX,XX +XXX,XX @@ bool write_kvmstate_to_list(ARMCPU *cpu)
             ret = kvm_vcpu_ioctl(cs, KVM_GET_ONE_REG, &r);
             break;
         default:
-            abort();
+            g_assert_not_reached();
         }
         if (ret) {
             ok = false;
@@ -XXX,XX +XXX,XX @@ bool write_list_to_kvmstate(ARMCPU *cpu, int level)
             r.addr = (uintptr_t)(cpu->cpreg_values + i);
             break;
         default:
-            abort();
+            g_assert_not_reached();
         }
         ret = kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, &r);
         if (ret) {
diff --git a/target/arm/machine.c b/target/arm/machine.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/machine.c
+++ b/target/arm/machine.c
@@ -XXX,XX +XXX,XX @@ static int cpu_pre_save(void *opaque)
     if (kvm_enabled()) {
         if (!write_kvmstate_to_list(cpu)) {
             /* This should never fail */
-            abort();
+            g_assert_not_reached();
         }
 
         /*
@@ -XXX,XX +XXX,XX @@ static int cpu_pre_save(void *opaque)
     } else {
         if (!write_cpustate_to_list(cpu, false)) {
             /* This should never fail. */
-            abort();
+            g_assert_not_reached();
         }
     }
 
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void handle_fp_1src_half(DisasContext *s, int opcode, int rd, int rn)
         gen_helper_advsimd_rinth(tcg_res, tcg_op, fpst);
         break;
     default:
-        abort();
+        g_assert_not_reached();
     }
 
     write_fp_sreg(s, rd, tcg_res);
@@ -XXX,XX +XXX,XX @@ static void handle_fp_fcvt(DisasContext *s, int opcode,
         break;
     }
     default:
-        abort();
+        g_assert_not_reached();
     }
 }
 
diff --git a/target/arm/translate-neon.c b/target/arm/translate-neon.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.c
+++ b/target/arm/translate-neon.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VLDST_single(DisasContext *s, arg_VLDST_single *a)
         }
         break;
     default:
-        abort();
+        g_assert_not_reached();
     }
     if ((vd + a->stride * (nregs - 1)) > 31) {
         /*
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_srs(DisasContext *s,
         offset = 4;
         break;
     default:
-        abort();
+        g_assert_not_reached();
     }
     tcg_gen_addi_i32(addr, addr, offset);
     tmp = load_reg(s, 14);
@@ -XXX,XX +XXX,XX @@ static void gen_srs(DisasContext *s,
             offset = 0;
             break;
         default:
-            abort();
+            g_assert_not_reached();
         }
         tcg_gen_addi_i32(addr, addr, offset);
         gen_helper_set_r13_banked(cpu_env, tcg_constant_i32(mode), addr);
-- 
2.25.1

From: Richard Henderson <richard.henderson@linaro.org>

Create a typedef as well, and use it in ARMCPRegInfo.
This won't be perfect for debugging, but it'll nicely
display the most common cases.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220501055028.646596-8-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpregs.h | 44 +++++++++++++++++++++++---------------------
 target/arm/helper.c |  2 +-
 2 files changed, 24 insertions(+), 22 deletions(-)

diff --git a/target/arm/cpregs.h b/target/arm/cpregs.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpregs.h
+++ b/target/arm/cpregs.h
@@ -XXX,XX +XXX,XX @@ enum {
  * described with these bits, then use a laxer set of restrictions, and
  * do the more restrictive/complex check inside a helper function.
  */
-#define PL3_R 0x80
-#define PL3_W 0x40
-#define PL2_R (0x20 | PL3_R)
-#define PL2_W (0x10 | PL3_W)
-#define PL1_R (0x08 | PL2_R)
-#define PL1_W (0x04 | PL2_W)
-#define PL0_R (0x02 | PL1_R)
-#define PL0_W (0x01 | PL1_W)
+typedef enum {
+    PL3_R = 0x80,
+    PL3_W = 0x40,
+    PL2_R = 0x20 | PL3_R,
+    PL2_W = 0x10 | PL3_W,
+    PL1_R = 0x08 | PL2_R,
+    PL1_W = 0x04 | PL2_W,
+    PL0_R = 0x02 | PL1_R,
+    PL0_W = 0x01 | PL1_W,
 
-/*
- * For user-mode some registers are accessible to EL0 via a kernel
- * trap-and-emulate ABI. In this case we define the read permissions
- * as actually being PL0_R. However some bits of any given register
- * may still be masked.
- */
+    /*
+     * For user-mode some registers are accessible to EL0 via a kernel
+     * trap-and-emulate ABI. In this case we define the read permissions
+     * as actually being PL0_R. However some bits of any given register
+     * may still be masked.
+     */
 #ifdef CONFIG_USER_ONLY
-#define PL0U_R PL0_R
+    PL0U_R = PL0_R,
 #else
-#define PL0U_R PL1_R
+    PL0U_R = PL1_R,
 #endif
 
-#define PL3_RW (PL3_R | PL3_W)
-#define PL2_RW (PL2_R | PL2_W)
-#define PL1_RW (PL1_R | PL1_W)
-#define PL0_RW (PL0_R | PL0_W)
+    PL3_RW = PL3_R | PL3_W,
+    PL2_RW = PL2_R | PL2_W,
+    PL1_RW = PL1_R | PL1_W,
+    PL0_RW = PL0_R | PL0_W,
+} CPAccessRights;
 
 typedef enum CPAccessResult {
     /* Access is permitted */
@@ -XXX,XX +XXX,XX @@ struct ARMCPRegInfo {
     /* Register type: ARM_CP_* bits/values */
     int type;
     /* Access rights: PL*_[RW] */
-    int access;
+    CPAccessRights access;
     /* Security state: ARM_CP_SECSTATE_* bits/values */
     int secure;
     /*
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu,
      * to encompass the generic architectural permission check.
      */
     if (r->state != ARM_CP_STATE_AA32) {
-        int mask = 0;
+        CPAccessRights mask;
         switch (r->opc1) {
         case 0:
             /* min_EL EL1, but some accessible to EL0 via kernel ABI */
-- 
2.25.1

From: Richard Henderson <richard.henderson@linaro.org>

Give this enum a name and use in ARMCPRegInfo,
add_cpreg_to_hashtable and define_one_arm_cp_reg_with_opaque.

diff --git a/target/arm/cpregs.h b/target/arm/cpregs.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpregs.h
+++ b/target/arm/cpregs.h
@@ -XXX,XX +XXX,XX @@ enum {
  * Note that we rely on the values of these enums as we iterate through
  * the various states in some places.
  */
-enum {
+typedef enum {
     ARM_CP_STATE_AA32 = 0,
     ARM_CP_STATE_AA64 = 1,
     ARM_CP_STATE_BOTH = 2,
-};
+} CPState;
 
 /*
  * ARM CP register secure state flags.  These flags identify security state
@@ -XXX,XX +XXX,XX @@ struct ARMCPRegInfo {
     uint8_t opc1;
     uint8_t opc2;
     /* Execution state in which this register is visible: ARM_CP_STATE_* */
-    int state;
+    CPState state;
     /* Register type: ARM_CP_* bits/values */
     int type;
     /* Access rights: PL*_[RW] */
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ CpuDefinitionInfoList *qmp_query_cpu_definitions(Error **errp)
 }
 
 static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
-                                   void *opaque, int state, int secstate,
+                                   void *opaque, CPState state, int secstate,
                                    int crm, int opc1, int opc2,
                                    const char *name)
 {
@@ -XXX,XX +XXX,XX @@ void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu,
      * bits; the ARM_CP_64BIT* flag applies only to the AArch32 view of
      * the register, if any.
      */
-    int crm, opc1, opc2, state;
+    int crm, opc1, opc2;
     int crmmin = (r->crm == CP_ANY) ? 0 : r->crm;
     int crmmax = (r->crm == CP_ANY) ? 15 : r->crm;
     int opc1min = (r->opc1 == CP_ANY) ? 0 : r->opc1;
     int opc1max = (r->opc1 == CP_ANY) ? 7 : r->opc1;
     int opc2min = (r->opc2 == CP_ANY) ? 0 : r->opc2;
     int opc2max = (r->opc2 == CP_ANY) ? 7 : r->opc2;
+    CPState state;
+
     /* 64 bit registers have only CRm and Opc1 fields */
     assert(!((r->type & ARM_CP_64BIT) && (r->opc2 || r->crn)));
     /* op0 only exists in the AArch64 encodings */
-- 
2.25.1

From: Richard Henderson <richard.henderson@linaro.org>

Give this enum a name and use in ARMCPRegInfo and add_cpreg_to_hashtable.
Add the enumerator ARM_CP_SECSTATE_BOTH to clarify how 0
is handled in define_one_arm_cp_reg_with_opaque.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220501055028.646596-10-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpregs.h | 7 ++++---
 target/arm/helper.c | 7 +++++--
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/target/arm/cpregs.h b/target/arm/cpregs.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpregs.h
+++ b/target/arm/cpregs.h
@@ -XXX,XX +XXX,XX @@ typedef enum {
  * registered entry will only have one to identify whether the entry is secure
  * or non-secure.
  */
-enum {
+typedef enum {
+    ARM_CP_SECSTATE_BOTH = 0,       /* define one cpreg for each secstate */
     ARM_CP_SECSTATE_S =   (1 << 0), /* bit[0]: Secure state register */
     ARM_CP_SECSTATE_NS =  (1 << 1), /* bit[1]: Non-secure state register */
-};
+} CPSecureState;
 
 /*
  * Access rights:
@@ -XXX,XX +XXX,XX @@ struct ARMCPRegInfo {
     /* Access rights: PL*_[RW] */
     CPAccessRights access;
     /* Security state: ARM_CP_SECSTATE_* bits/values */
-    int secure;
+    CPSecureState secure;
     /*
      * The opaque pointer passed to define_arm_cp_regs_with_opaque() when
      * this register was defined: can be used to hand data through to the
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ CpuDefinitionInfoList *qmp_query_cpu_definitions(Error **errp)
 }
 
 static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
-                                   void *opaque, CPState state, int secstate,
+                                   void *opaque, CPState state,
+                                   CPSecureState secstate,
                                    int crm, int opc1, int opc2,
                                    const char *name)
 {
@@ -XXX,XX +XXX,XX @@ void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu,
                                                    r->secure, crm, opc1, opc2,
                                                    r->name);
                             break;
-                        default:
+                        case ARM_CP_SECSTATE_BOTH:
                             name = g_strdup_printf("%s_S", r->name);
                             add_cpreg_to_hashtable(cpu, r, opaque, state,
                                                    ARM_CP_SECSTATE_S,
@@ -XXX,XX +XXX,XX @@ void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu,
                                                    ARM_CP_SECSTATE_NS,
                                                    crm, opc1, opc2, r->name);
                             break;
+                        default:
+                            g_assert_not_reached();
                         }
                     } else {
                         /* AArch64 registers get mapped to non-secure instance
-- 
2.25.1

From: Richard Henderson <richard.henderson@linaro.org>

The new_key field is always non-zero -- drop the if.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20220501055028.646596-11-richard.henderson@linaro.org
[PMM: reinstated dropped PL3_RW mask]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.c | 23 +++++++++++------------
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void define_arm_vh_e2h_redirects_aliases(ARMCPU *cpu)
 
     for (i = 0; i < ARRAY_SIZE(aliases); i++) {
         const struct E2HAlias *a = &aliases[i];
-        ARMCPRegInfo *src_reg, *dst_reg;
+        ARMCPRegInfo *src_reg, *dst_reg, *new_reg;
+        uint32_t *new_key;
+        bool ok;
 
         if (a->feature && !a->feature(&cpu->isar)) {
             continue;
@@ -XXX,XX +XXX,XX @@ static void define_arm_vh_e2h_redirects_aliases(ARMCPU *cpu)
         g_assert(src_reg->opaque == NULL);
 
         /* Create alias before redirection so we dup the right data. */
-        if (a->new_key) {
-            ARMCPRegInfo *new_reg = g_memdup(src_reg, sizeof(ARMCPRegInfo));
-            uint32_t *new_key = g_memdup(&a->new_key, sizeof(uint32_t));
-            bool ok;
+        new_reg = g_memdup(src_reg, sizeof(ARMCPRegInfo));
+        new_key = g_memdup(&a->new_key, sizeof(uint32_t));
 
-            new_reg->name = a->new_name;
-            new_reg->type |= ARM_CP_ALIAS;
-            /* Remove PL1/PL0 access, leaving PL2/PL3 R/W in place.  */
-            new_reg->access &= PL2_RW | PL3_RW;
+        new_reg->name = a->new_name;
+        new_reg->type |= ARM_CP_ALIAS;
+        /* Remove PL1/PL0 access, leaving PL2/PL3 R/W in place.  */
+        new_reg->access &= PL2_RW | PL3_RW;
 
-            ok = g_hash_table_insert(cpu->cp_regs, new_key, new_reg);
-            g_assert(ok);
-        }
+        ok = g_hash_table_insert(cpu->cp_regs, new_key, new_reg);
+        g_assert(ok);
 
         src_reg->opaque = dst_reg;
         src_reg->orig_readfn = src_reg->readfn ?: raw_read;
-- 
2.25.1

From: Richard Henderson <richard.henderson@linaro.org>

Cast the uint32_t key into a gpointer directly, which
allows us to avoid allocating storage for each key.

Use g_hash_table_lookup when we already have a gpointer
(e.g. for callbacks like count_cpreg), or when using
get_arm_cp_reginfo would require casting away const.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20220501055028.646596-12-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.c     |  4 ++--
 target/arm/gdbstub.c |  2 +-
 target/arm/helper.c  | 41 ++++++++++++++++++-----------------------
 3 files changed, 21 insertions(+), 26 deletions(-)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_initfn(Object *obj)
     ARMCPU *cpu = ARM_CPU(obj);
 
     cpu_set_cpustate_pointers(cpu);
-    cpu->cp_regs = g_hash_table_new_full(g_int_hash, g_int_equal,
-                                         g_free, cpreg_hashtable_data_destroy);
+    cpu->cp_regs = g_hash_table_new_full(g_direct_hash, g_direct_equal,
+                                         NULL, cpreg_hashtable_data_destroy);
 
     QLIST_INIT(&cpu->pre_el_change_hooks);
     QLIST_INIT(&cpu->el_change_hooks);
diff --git a/target/arm/gdbstub.c b/target/arm/gdbstub.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/gdbstub.c
+++ b/target/arm/gdbstub.c
@@ -XXX,XX +XXX,XX @@ static void arm_gen_one_xml_sysreg_tag(GString *s, DynamicGDBXMLInfo *dyn_xml,
 static void arm_register_sysreg_for_xml(gpointer key, gpointer value,
                                         gpointer p)
 {
-    uint32_t ri_key = *(uint32_t *)key;
+    uint32_t ri_key = (uintptr_t)key;
     ARMCPRegInfo *ri = value;
     RegisterSysregXmlParam *param = (RegisterSysregXmlParam *)p;
     GString *s = param->s;
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ bool write_list_to_cpustate(ARMCPU *cpu)
 static void add_cpreg_to_list(gpointer key, gpointer opaque)
 {
     ARMCPU *cpu = opaque;
-    uint64_t regidx;
-    const ARMCPRegInfo *ri;
-
-    regidx = *(uint32_t *)key;
-    ri = get_arm_cp_reginfo(cpu->cp_regs, regidx);
+    uint32_t regidx = (uintptr_t)key;
+    const ARMCPRegInfo *ri = get_arm_cp_reginfo(cpu->cp_regs, regidx);
 
     if (!(ri->type & (ARM_CP_NO_RAW|ARM_CP_ALIAS))) {
         cpu->cpreg_indexes[cpu->cpreg_array_len] = cpreg_to_kvm_id(regidx);
@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_list(gpointer key, gpointer opaque)
 static void count_cpreg(gpointer key, gpointer opaque)
 {
     ARMCPU *cpu = opaque;
-    uint64_t regidx;
     const ARMCPRegInfo *ri;
 
-    regidx = *(uint32_t *)key;
-    ri = get_arm_cp_reginfo(cpu->cp_regs, regidx);
+    ri = g_hash_table_lookup(cpu->cp_regs, key);
 
     if (!(ri->type & (ARM_CP_NO_RAW|ARM_CP_ALIAS))) {
         cpu->cpreg_array_len++;
@@ -XXX,XX +XXX,XX @@ static void count_cpreg(gpointer key, gpointer opaque)
 
 static gint cpreg_key_compare(gconstpointer a, gconstpointer b)
 {
-    uint64_t aidx = cpreg_to_kvm_id(*(uint32_t *)a);
-    uint64_t bidx = cpreg_to_kvm_id(*(uint32_t *)b);
+    uint64_t aidx = cpreg_to_kvm_id((uintptr_t)a);
+    uint64_t bidx = cpreg_to_kvm_id((uintptr_t)b);
 
     if (aidx > bidx) {
         return 1;
@@ -XXX,XX +XXX,XX @@ static void define_arm_vh_e2h_redirects_aliases(ARMCPU *cpu)
     for (i = 0; i < ARRAY_SIZE(aliases); i++) {
         const struct E2HAlias *a = &aliases[i];
         ARMCPRegInfo *src_reg, *dst_reg, *new_reg;
-        uint32_t *new_key;
         bool ok;
 
         if (a->feature && !a->feature(&cpu->isar)) {
             continue;
         }
 
-        src_reg = g_hash_table_lookup(cpu->cp_regs, &a->src_key);
-        dst_reg = g_hash_table_lookup(cpu->cp_regs, &a->dst_key);
+        src_reg = g_hash_table_lookup(cpu->cp_regs,
+                                      (gpointer)(uintptr_t)a->src_key);
+        dst_reg = g_hash_table_lookup(cpu->cp_regs,
+                                      (gpointer)(uintptr_t)a->dst_key);
         g_assert(src_reg != NULL);
         g_assert(dst_reg != NULL);
 
@@ -XXX,XX +XXX,XX @@ static void define_arm_vh_e2h_redirects_aliases(ARMCPU *cpu)
 
         /* Create alias before redirection so we dup the right data. */
         new_reg = g_memdup(src_reg, sizeof(ARMCPRegInfo));
-        new_key = g_memdup(&a->new_key, sizeof(uint32_t));
 
         new_reg->name = a->new_name;
         new_reg->type |= ARM_CP_ALIAS;
         /* Remove PL1/PL0 access, leaving PL2/PL3 R/W in place.  */
         new_reg->access &= PL2_RW | PL3_RW;
 
-        ok = g_hash_table_insert(cpu->cp_regs, new_key, new_reg);
+        ok = g_hash_table_insert(cpu->cp_regs,
+                                 (gpointer)(uintptr_t)a->new_key, new_reg);
         g_assert(ok);
 
         src_reg->opaque = dst_reg;
@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
     /* Private utility function for define_one_arm_cp_reg_with_opaque():
      * add a single reginfo struct to the hash table.
      */
-    uint32_t *key = g_new(uint32_t, 1);
+    uint32_t key;
     ARMCPRegInfo *r2 = g_memdup(r, sizeof(ARMCPRegInfo));
     int is64 = (r->type & ARM_CP_64BIT) ? 1 : 0;
     int ns = (secstate & ARM_CP_SECSTATE_NS) ? 1 : 0;
@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
         if (r->cp == 0 || r->state == ARM_CP_STATE_BOTH) {
             r2->cp = CP_REG_ARM64_SYSREG_CP;
         }
-        *key = ENCODE_AA64_CP_REG(r2->cp, r2->crn, crm,
-                                  r2->opc0, opc1, opc2);
+        key = ENCODE_AA64_CP_REG(r2->cp, r2->crn, crm,
+                                 r2->opc0, opc1, opc2);
     } else {
-        *key = ENCODE_CP_REG(r2->cp, is64, ns, r2->crn, crm, opc1, opc2);
+        key = ENCODE_CP_REG(r2->cp, is64, ns, r2->crn, crm, opc1, opc2);
     }
     if (opaque) {
         r2->opaque = opaque;
@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
      * requested.
      */
     if (!(r->type & ARM_CP_OVERRIDE)) {
-        ARMCPRegInfo *oldreg;
-        oldreg = g_hash_table_lookup(cpu->cp_regs, key);
+        const ARMCPRegInfo *oldreg = get_arm_cp_reginfo(cpu->cp_regs, key);
         if (oldreg && !(oldreg->type & ARM_CP_OVERRIDE)) {
             fprintf(stderr, "Register redefined: cp=%d %d bit "
                     "crn=%d crm=%d opc1=%d opc2=%d, "
@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
             g_assert_not_reached();
         }
     }
-    g_hash_table_insert(cpu->cp_regs, key, r2);
+    g_hash_table_insert(cpu->cp_regs, (gpointer)(uintptr_t)key, r2);
 }
 
 
@@ -XXX,XX +XXX,XX @@ void modify_arm_cp_regs_with_len(ARMCPRegInfo *regs, size_t regs_len,
 
 const ARMCPRegInfo *get_arm_cp_reginfo(GHashTable *cpregs, uint32_t encoded_cp)
 {
-    return g_hash_table_lookup(cpregs, &encoded_cp);
+    return g_hash_table_lookup(cpregs, (gpointer)(uintptr_t)encoded_cp);
 }
 
 void arm_cp_write_ignore(CPUARMState *env, const ARMCPRegInfo *ri,
-- 
2.25.1

From: Richard Henderson <richard.henderson@linaro.org>

Simplify freeing cp_regs hash table entries by using a single
allocation for the entire value.

This fixes a theoretical bug if we were to ever free the entire
hash table, because we've been installing string literal constants
into the cpreg structure in define_arm_vh_e2h_redirects_aliases.
However, at present we only free entries created for AArch32
wildcard cpregs which get overwritten by more specific cpregs,
so this bug is never exposed.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20220501055028.646596-13-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.c    | 16 +---------------
 target/arm/helper.c | 10 ++++++++--
 2 files changed, 9 insertions(+), 17 deletions(-)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ uint64_t arm_cpu_mp_affinity(int idx, uint8_t clustersz)
     return (Aff1 << ARM_AFF1_SHIFT) | Aff0;
 }
 
-static void cpreg_hashtable_data_destroy(gpointer data)
-{
-    /*
-     * Destroy function for cpu->cp_regs hashtable data entries.
-     * We must free the name string because it was g_strdup()ed in
-     * add_cpreg_to_hashtable(). It's OK to cast away the 'const'
-     * from r->name because we know we definitely allocated it.
-     */
-    ARMCPRegInfo *r = data;
-
-    g_free((void *)r->name);
-    g_free(r);
-}
-
 static void arm_cpu_initfn(Object *obj)
 {
     ARMCPU *cpu = ARM_CPU(obj);
 
     cpu_set_cpustate_pointers(cpu);
     cpu->cp_regs = g_hash_table_new_full(g_direct_hash, g_direct_equal,
-                                         NULL, cpreg_hashtable_data_destroy);
+                                         NULL, g_free);
 
     QLIST_INIT(&cpu->pre_el_change_hooks);
     QLIST_INIT(&cpu->el_change_hooks);
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
      * add a single reginfo struct to the hash table.
      */
     uint32_t key;
-    ARMCPRegInfo *r2 = g_memdup(r, sizeof(ARMCPRegInfo));
+    ARMCPRegInfo *r2;
     int is64 = (r->type & ARM_CP_64BIT) ? 1 : 0;
     int ns = (secstate & ARM_CP_SECSTATE_NS) ? 1 : 0;
+    size_t name_len;
+
+    /* Combine cpreg and name into one allocation. */
+    name_len = strlen(name) + 1;
+    r2 = g_malloc(sizeof(*r2) + name_len);
+    *r2 = *r;
+    r2->name = memcpy(r2 + 1, name, name_len);
 
-    r2->name = g_strdup(name);
     /* Reset the secure state to the specific incoming state.  This is
      * necessary as the register may have been defined with both states.
      */
-- 
2.25.1

From: Richard Henderson <richard.henderson@linaro.org>

Move the computation of key to the top of the function.
Hoist the resolution of cp as well, as an input to the
computation of key.

This will be required by a subsequent patch.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20220501055028.646596-14-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.c | 49 +++++++++++++++++++++++++--------------------
 1 file changed, 27 insertions(+), 22 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
     ARMCPRegInfo *r2;
     int is64 = (r->type & ARM_CP_64BIT) ? 1 : 0;
     int ns = (secstate & ARM_CP_SECSTATE_NS) ? 1 : 0;
+    int cp = r->cp;
     size_t name_len;
 
+    switch (state) {
+    case ARM_CP_STATE_AA32:
+        /* We assume it is a cp15 register if the .cp field is left unset. */
+        if (cp == 0 && r->state == ARM_CP_STATE_BOTH) {
+            cp = 15;
+        }
+        key = ENCODE_CP_REG(cp, is64, ns, r->crn, crm, opc1, opc2);
+        break;
+    case ARM_CP_STATE_AA64:
+        /*
+         * To allow abbreviation of ARMCPRegInfo definitions, we treat
+         * cp == 0 as equivalent to the value for "standard guest-visible
+         * sysreg".  STATE_BOTH definitions are also always "standard sysreg"
+         * in their AArch64 view (the .cp value may be non-zero for the
+         * benefit of the AArch32 view).
+         */
+        if (cp == 0 || r->state == ARM_CP_STATE_BOTH) {
+            cp = CP_REG_ARM64_SYSREG_CP;
+        }
+        key = ENCODE_AA64_CP_REG(cp, r->crn, crm, r->opc0, opc1, opc2);
+        break;
+    default:
+        g_assert_not_reached();
+    }
+
     /* Combine cpreg and name into one allocation. */
     name_len = strlen(name) + 1;
     r2 = g_malloc(sizeof(*r2) + name_len);
@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
         }
 
         if (r->state == ARM_CP_STATE_BOTH) {
-            /* We assume it is a cp15 register if the .cp field is left unset.
-             */
-            if (r2->cp == 0) {
-                r2->cp = 15;
-            }
-
 #if HOST_BIG_ENDIAN
             if (r2->fieldoffset) {
                 r2->fieldoffset += sizeof(uint32_t);
@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
 #endif
         }
     }
-    if (state == ARM_CP_STATE_AA64) {
-        /* To allow abbreviation of ARMCPRegInfo
-         * definitions, we treat cp == 0 as equivalent to
-         * the value for "standard guest-visible sysreg".
-         * STATE_BOTH definitions are also always "standard
-         * sysreg" in their AArch64 view (the .cp value may
-         * be non-zero for the benefit of the AArch32 view).
-         */
-        if (r->cp == 0 || r->state == ARM_CP_STATE_BOTH) {
-            r2->cp = CP_REG_ARM64_SYSREG_CP;
-        }
-        key = ENCODE_AA64_CP_REG(r2->cp, r2->crn, crm,
-                                 r2->opc0, opc1, opc2);
-    } else {
-        key = ENCODE_CP_REG(r2->cp, is64, ns, r2->crn, crm, opc1, opc2);
-    }
     if (opaque) {
         r2->opaque = opaque;
     }
@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
     /* Make sure reginfo passed to helpers for wildcarded regs
      * has the correct crm/opc1/opc2 for this reg, not CP_ANY:
      */
+    r2->cp = cp;
     r2->crm = crm;
     r2->opc1 = opc1;
     r2->opc2 = opc2;
-- 
2.25.1

From: Richard Henderson <richard.henderson@linaro.org>

Put most of the value writeback to the same place,
and improve the comment that goes with them.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20220501055028.646596-15-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.c | 28 ++++++++++++----------------
 1 file changed, 12 insertions(+), 16 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
     *r2 = *r;
     r2->name = memcpy(r2 + 1, name, name_len);
 
-    /* Reset the secure state to the specific incoming state.  This is
-     * necessary as the register may have been defined with both states.
+    /*
+     * Update fields to match the instantiation, overwiting wildcards
+     * such as CP_ANY, ARM_CP_STATE_BOTH, or ARM_CP_SECSTATE_BOTH.
      */
+    r2->cp = cp;
+    r2->crm = crm;
+    r2->opc1 = opc1;
+    r2->opc2 = opc2;
+    r2->state = state;
     r2->secure = secstate;
+    if (opaque) {
+        r2->opaque = opaque;
+    }
 
     if (r->bank_fieldoffsets[0] && r->bank_fieldoffsets[1]) {
         /* Register is banked (using both entries in array).
@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
 #endif
         }
     }
-    if (opaque) {
-        r2->opaque = opaque;
-    }
-    /* reginfo passed to helpers is correct for the actual access,
-     * and is never ARM_CP_STATE_BOTH:
-     */
-    r2->state = state;
-    /* Make sure reginfo passed to helpers for wildcarded regs
-     * has the correct crm/opc1/opc2 for this reg, not CP_ANY:
-     */
-    r2->cp = cp;
-    r2->crm = crm;
-    r2->opc1 = opc1;
-    r2->opc2 = opc2;
+
     /* By convention, for wildcarded registers only the first
      * entry is used for migration; the others are marked as
      * ALIAS so we don't try to transfer the register
-- 
2.25.1

From: Richard Henderson <richard.henderson@linaro.org>

Computing isbanked only once makes the code
a bit easier to read.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20220501055028.646596-17-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

From: Richard Henderson <richard.henderson@linaro.org>

Perform the override check early, so that it is still done
even when we decide to discard an unreachable cpreg.

Use assert not printf+abort.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20220501055028.646596-18-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.c | 22 ++++++++--------------
 1 file changed, 8 insertions(+), 14 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
         g_assert_not_reached();
     }
 
+    /* Overriding of an existing definition must be explicitly requested. */
+    if (!(r->type & ARM_CP_OVERRIDE)) {
+        const ARMCPRegInfo *oldreg = get_arm_cp_reginfo(cpu->cp_regs, key);
+        if (oldreg) {
+            assert(oldreg->type & ARM_CP_OVERRIDE);
+        }
+    }
+
     /* Combine cpreg and name into one allocation. */
     name_len = strlen(name) + 1;
     r2 = g_malloc(sizeof(*r2) + name_len);
@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
         assert(!raw_accessors_invalid(r2));
     }
 
-    /* Overriding of an existing definition must be explicitly
-     * requested.
-     */
-    if (!(r->type & ARM_CP_OVERRIDE)) {
-        const ARMCPRegInfo *oldreg = get_arm_cp_reginfo(cpu->cp_regs, key);
-        if (oldreg && !(oldreg->type & ARM_CP_OVERRIDE)) {
-            fprintf(stderr, "Register redefined: cp=%d %d bit "
-                    "crn=%d crm=%d opc1=%d opc2=%d, "
-                    "was %s, now %s\n", r2->cp, 32 + 32 * is64,
-                    r2->crn, r2->crm, r2->opc1, r2->opc2,
-                    oldreg->name, r2->name);
-            g_assert_not_reached();
-        }
-    }
     g_hash_table_insert(cpu->cp_regs, (gpointer)(uintptr_t)key, r2);
 }
 
-- 
2.25.1

From: Richard Henderson <richard.henderson@linaro.org>

Put the block comments into the current coding style.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20220501055028.646596-19-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.c | 24 +++++++++++++++---------
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ CpuDefinitionInfoList *qmp_query_cpu_definitions(Error **errp)
     return cpu_list;
 }
 
+/*
+ * Private utility function for define_one_arm_cp_reg_with_opaque():
+ * add a single reginfo struct to the hash table.
+ */
 static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
                                    void *opaque, CPState state,
                                    CPSecureState secstate,
                                    int crm, int opc1, int opc2,
                                    const char *name)
 {
-    /* Private utility function for define_one_arm_cp_reg_with_opaque():
-     * add a single reginfo struct to the hash table.
-     */
     uint32_t key;
     ARMCPRegInfo *r2;
     bool is64 = r->type & ARM_CP_64BIT;
@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
 
     isbanked = r->bank_fieldoffsets[0] && r->bank_fieldoffsets[1];
     if (isbanked) {
-        /* Register is banked (using both entries in array).
+        /*
+         * Register is banked (using both entries in array).
          * Overwriting fieldoffset as the array is only used to define
          * banked registers but later only fieldoffset is used.
          */
@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
 
     if (state == ARM_CP_STATE_AA32) {
         if (isbanked) {
-            /* If the register is banked then we don't need to migrate or
+            /*
+             * If the register is banked then we don't need to migrate or
              * reset the 32-bit instance in certain cases:
              *
              * 1) If the register has both 32-bit and 64-bit instances then we
@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
                 r2->type |= ARM_CP_ALIAS;
             }
         } else if ((secstate != r->secure) && !ns) {
-            /* The register is not banked so we only want to allow migration of
-             * the non-secure instance.
+            /*
+             * The register is not banked so we only want to allow migration
+             * of the non-secure instance.
              */
             r2->type |= ARM_CP_ALIAS;
         }
@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
         }
     }
 
-    /* By convention, for wildcarded registers only the first
+    /*
+     * By convention, for wildcarded registers only the first
      * entry is used for migration; the others are marked as
      * ALIAS so we don't try to transfer the register
      * multiple times. Special registers (ie NOP/WFI) are
@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
         r2->type |= ARM_CP_ALIAS | ARM_CP_NO_GDB;
     }
 
-    /* Check that raw accesses are either forbidden or handled. Note that
+    /*
+     * Check that raw accesses are either forbidden or handled. Note that
      * we can't assert this earlier because the setup of fieldoffset for
      * banked registers has to be done first.
      */
-- 
2.25.1

From: Richard Henderson <richard.henderson@linaro.org>

Since e03b56863d2bc, our host endian indicator is unconditionally
set, which means that we can use a normal C condition.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20220501055028.646596-20-richard.henderson@linaro.org
[PMM: quote correct git hash in commit message]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
             r2->type |= ARM_CP_ALIAS;
         }
 
-        if (r->state == ARM_CP_STATE_BOTH) {
-#if HOST_BIG_ENDIAN
-            if (r2->fieldoffset) {
-                r2->fieldoffset += sizeof(uint32_t);
-            }
-#endif
+        if (HOST_BIG_ENDIAN &&
+            r->state == ARM_CP_STATE_BOTH && r2->fieldoffset) {
+            r2->fieldoffset += sizeof(uint32_t);
         }
     }
 
-- 
2.25.1

From: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220501055028.646596-24-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa32_ssbs(const ARMISARegisters *id)
     return FIELD_EX32(id->id_pfr2, ID_PFR2, SSBS) != 0;
 }
 
+static inline bool isar_feature_aa32_debugv8p2(const ARMISARegisters *id)
+{
+    return FIELD_EX32(id->id_dfr0, ID_DFR0, COPDBG) >= 8;
+}
+
 /*
  * 64-bit feature tests via id registers.
  */
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa64_ssbs(const ARMISARegisters *id)
     return FIELD_EX64(id->id_aa64pfr1, ID_AA64PFR1, SSBS) != 0;
 }
 
+static inline bool isar_feature_aa64_debugv8p2(const ARMISARegisters *id)
+{
+    return FIELD_EX64(id->id_aa64dfr0, ID_AA64DFR0, DEBUGVER) >= 8;
+}
+
 static inline bool isar_feature_aa64_sve2(const ARMISARegisters *id)
 {
     return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, SVEVER) != 0;
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_any_tts2uxn(const ARMISARegisters *id)
     return isar_feature_aa64_tts2uxn(id) || isar_feature_aa32_tts2uxn(id);
 }
 
+static inline bool isar_feature_any_debugv8p2(const ARMISARegisters *id)
+{
+    return isar_feature_aa64_debugv8p2(id) || isar_feature_aa32_debugv8p2(id);
+}
+
 /*
  * Forward to the above feature tests given an ARMCPU pointer.
  */
-- 
2.25.1

From: Richard Henderson <richard.henderson@linaro.org>

Add the aa64 predicate for detecting RAS support from id registers.
We already have the aa32 version from the M-profile work.
Add the 'any' predicate for testing both aa64 and aa32.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220501055028.646596-34-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_aa64_aa32_el1(const ARMISARegisters *id)
     return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, EL1) >= 2;
 }
 
+static inline bool isar_feature_aa64_ras(const ARMISARegisters *id)
+{
+    return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, RAS) != 0;
+}
+
 static inline bool isar_feature_aa64_sve(const ARMISARegisters *id)
 {
     return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, SVE) != 0;
@@ -XXX,XX +XXX,XX @@ static inline bool isar_feature_any_debugv8p2(const ARMISARegisters *id)
     return isar_feature_aa64_debugv8p2(id) || isar_feature_aa32_debugv8p2(id);
 }
 
+static inline bool isar_feature_any_ras(const ARMISARegisters *id)
+{
+    return isar_feature_aa64_ras(id) || isar_feature_aa32_ras(id);
+}
+
 /*
  * Forward to the above feature tests given an ARMCPU pointer.
  */
-- 
2.25.1

From: Alex Zuepke <alex.zuepke@tum.de>

The ARMv8 manual defines that PMUSERENR_EL0.ER enables read-access
to both PMXEVCNTR_EL0 and PMEVCNTR<n>_EL0 registers, however,
we only use it for PMXEVCNTR_EL0. Extend to PMEVCNTR<n>_EL0 as well.

Signed-off-by: Alex Zuepke <alex.zuepke@tum.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220428132717.84190-1-alex.zuepke@tum.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void define_pmu_regs(ARMCPU *cpu)
               .crm = 8 | (3 & (i >> 3)), .opc1 = 0, .opc2 = i & 7,
               .access = PL0_RW, .type = ARM_CP_IO | ARM_CP_ALIAS,
               .readfn = pmevcntr_readfn, .writefn = pmevcntr_writefn,
-              .accessfn = pmreg_access },
+              .accessfn = pmreg_access_xevcntr },
             { .name = pmevcntr_el0_name, .state = ARM_CP_STATE_AA64,
               .opc0 = 3, .opc1 = 3, .crn = 14, .crm = 8 | (3 & (i >> 3)),
-              .opc2 = i & 7, .access = PL0_RW, .accessfn = pmreg_access,
+              .opc2 = i & 7, .access = PL0_RW, .accessfn = pmreg_access_xevcntr,
               .type = ARM_CP_IO,
               .readfn = pmevcntr_readfn, .writefn = pmevcntr_writefn,
               .raw_readfn = pmevcntr_rawread,
-- 
2.25.1