Series comparison

-[PULL 00/25] target-arm queue
+[PULL v2 00/14] target-arm queue
-target-arm queue, mostly SME preliminaries.
+Changes v1->v2 (fixing CI failures in v1, added a couple of
 extra patches in an attempt to avoid having to do a last
 minute arm pullreq next week):
  * new patch to hopefully fix the build issue with the SVE/SME sysregs test
  * dropped the IC IVAU test case patch
  * new patch: fix over-length shift
  * new patches: define neoverse-v1
-In the unlikely event we don't land the rest of SME before freeze
+thanks
 for 7.1 we can revert the docs/property changes included here.
 -- PMM
-The following changes since commit 097ccbbbaf2681df1e65542e5b7d2b2d0c66e2bc:
+The following changes since commit 2a6ae69154542caa91dd17c40fd3f5ffbec300de:
-  Merge tag 'qemu-sparc-20220626' of https://github.com/mcayland/qemu into staging (2022-06-27 05:21:05 +0530)
+  Merge tag 'pull-maintainer-ominbus-030723-1' of https://gitlab.com/stsquad/qemu into staging (2023-07-04 08:36:44 +0200)
 are available in the Git repository at:
-  https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20220627
+  https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20230706
-for you to fetch changes up to 59e1b8a22ea9f947d038ccac784de1020f266e14:
+for you to fetch changes up to c41077235168140cdd4a34fce9bd95c3d30efe9c:
-  target/arm: Check V7VE as well as LPAE in arm_pamax (2022-06-27 11:18:17 +0100)
+  target/arm: Avoid over-length shift in arm_cpu_sve_finalize() error case (2023-07-06 13:36:51 +0100)
 ----------------------------------------------------------------
 target-arm queue:
- * sphinx: change default language to 'en'
+ * Add raw_writes ops for register whose write induce TLB maintenance
- * Diagnose attempts to emulate EL3 in hvf as well as kvm
+ * hw/arm/sbsa-ref: use XHCI to replace EHCI
- * More SME groundwork patches
+ * Avoid splitting Zregs across lines in dump
- * virt: Fix calculation of physical address space size
+ * Dump ZA[] when active
-   for v7VE CPUs (eg cortex-a15)
+ * Fix SME full tile indexing
  * Handle IC IVAU to improve compatibility with JITs
  * xlnx-canfd-test: Fix code coverity issues
  * gdbstub: Guard M-profile code with CONFIG_TCG
  * allwinner-sramc: Set class_size
  * target/xtensa: Assert that interrupt level is within bounds
  * Avoid over-length shift in arm_cpu_sve_finalize() error case
  * Define new 'neoverse-v1' CPU type
 ----------------------------------------------------------------
-Alexander Graf (2):
+Akihiko Odaki (1):
-      accel: Introduce current_accel_name()
+      hw: arm: allwinner-sramc: Set class_size
       target/arm: Catch invalid kvm state also for hvf
-Martin Liška (1):
+Eric Auger (1):
-      sphinx: change default language to 'en'
+      target/arm: Add raw_writes ops for register whose write induce TLB maintenance
-Richard Henderson (22):
+Fabiano Rosas (1):
-      target/arm: Implement TPIDR2_EL0
+      target/arm: gdbstub: Guard M-profile code with CONFIG_TCG
       target/arm: Add SMEEXC_EL to TB flags
       target/arm: Add syn_smetrap
       target/arm: Add ARM_CP_SME
       target/arm: Add SVCR
       target/arm: Add SMCR_ELx
       target/arm: Add SMIDR_EL1, SMPRI_EL1, SMPRIMAP_EL2
       target/arm: Add PSTATE.{SM,ZA} to TB flags
       target/arm: Add the SME ZA storage to CPUARMState
       target/arm: Implement SMSTART, SMSTOP
       target/arm: Move error for sve%d property to arm_cpu_sve_finalize
       target/arm: Create ARMVQMap
       target/arm: Generalize cpu_arm_{get,set}_vq
       target/arm: Generalize cpu_arm_{get, set}_default_vec_len
       target/arm: Move arm_cpu_*_finalize to internals.h
       target/arm: Unexport aarch64_add_*_properties
       target/arm: Add cpu properties for SME
       target/arm: Introduce sve_vqm1_for_el_sm
       target/arm: Add SVL to TB flags
       target/arm: Move pred_{full, gvec}_reg_{offset, size} to translate-a64.h
       target/arm: Extend arm_pamax to more than aarch64
       target/arm: Check V7VE as well as LPAE in arm_pamax
- docs/conf.py                     |   2 +-
+John Högberg (1):
- docs/system/arm/cpu-features.rst |  56 ++++++++++
+      target/arm: Handle IC IVAU to improve compatibility with JITs
  include/qemu/accel.h             |   1 +
  target/arm/cpregs.h              |   5 +
  target/arm/cpu.h                 | 103 ++++++++++++++-----
  target/arm/helper-sme.h          |  21 ++++
  target/arm/helper.h              |   1 +
  target/arm/internals.h           |   4 +
  target/arm/syndrome.h            |  14 +++
  target/arm/translate-a64.h       |  38 +++++++
  target/arm/translate.h           |   6 ++
  accel/accel-common.c             |   8 ++
  hw/arm/virt.c                    |  10 +-
  softmmu/vl.c                     |   3 +-
  target/arm/cpu.c                 |  32 ++++--
  target/arm/cpu64.c               | 205 ++++++++++++++++++++++++++++---------
  target/arm/helper.c              | 213 +++++++++++++++++++++++++++++++++++++--
  target/arm/kvm64.c               |   2 +-
  target/arm/machine.c             |  34 +++++++
  target/arm/ptw.c                 |  26 +++--
  target/arm/sme_helper.c          |  61 +++++++++++
  target/arm/translate-a64.c       |  46 +++++++++
  target/arm/translate-sve.c       |  36 -------
  target/arm/meson.build           |   1 +
 files changed, 782 insertions(+), 146 deletions(-)
  create mode 100644 target/arm/helper-sme.h
  create mode 100644 target/arm/sme_helper.c
+Peter Maydell (5):
+      tests/tcg/aarch64/sysregs.c: Use S syntax for id_aa64zfr0_el1 and id_aa64smfr0_el1
+      target/xtensa: Assert that interrupt level is within bounds
+      target/arm: Suppress more TCG unimplemented features in ID registers
+      target/arm: Define neoverse-v1
+      target/arm: Avoid over-length shift in arm_cpu_sve_finalize() error case
+Richard Henderson (3):
+      target/arm: Avoid splitting Zregs across lines in dump
+      target/arm: Dump ZA[] when active
+      target/arm: Fix SME full tile indexing
+Vikram Garhwal (1):
+      tests/qtest: xlnx-canfd-test: Fix code coverity issues
+Yuquan Wang (1):
+      hw/arm/sbsa-ref: use XHCI to replace EHCI
+ docs/system/arm/sbsa.rst          |   5 +-
+ docs/system/arm/virt.rst          |   1 +
+ hw/arm/sbsa-ref.c                 |  24 ++++---
+ hw/arm/virt.c                     |   1 +
+ hw/misc/allwinner-sramc.c         |   1 +
+ target/arm/cpu.c                  |  98 +++++++++++++++++++++--------
+ target/arm/cpu64.c                |   4 +-
+ target/arm/gdbstub.c              |   4 ++
+ target/arm/helper.c               |  70 +++++++++++++++++----
+ target/arm/tcg/cpu64.c            | 128 ++++++++++++++++++++++++++++++++++++++
+ target/arm/tcg/translate-sme.c    |  24 +++++--
+ target/xtensa/exc_helper.c        |   3 +
+ tests/qtest/xlnx-canfd-test.c     |  33 ++++------
+ tests/tcg/aarch64/sme-outprod1.c  |  83 ++++++++++++++++++++++++
+ tests/tcg/aarch64/sysregs.c       |  11 ++--
+ hw/arm/Kconfig                    |   2 +-
+ tests/tcg/aarch64/Makefile.target |  16 ++---
+files changed, 415 insertions(+), 93 deletions(-)
+ create mode 100644 tests/tcg/aarch64/sme-outprod1.c

-[PULL 04/25] target/arm: Implement TPIDR2_EL0
+[PULL 01/14] target/arm: Add raw_writes ops for register whose write induce TLB maintenance
-From: Richard Henderson <richard.henderson@linaro.org>
+From: Eric Auger <eric.auger@redhat.com>
-This register is part of SME, but isn't closely related to the
+Some registers whose 'cooked' writefns induce TLB maintenance do
-rest of the extension.
+not have raw_writefn ops defined. If only the writefn ops is set
 (ie. no raw_writefn is provided), it is assumed the cooked also
 work as the raw one. For those registers it is not obvious the
 tlb_flush works on KVM mode so better/safer setting the raw write.
+Signed-off-by: Eric Auger <eric.auger@redhat.com>
+Suggested-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20220620175235.60881-2-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
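Note on the commit message above: the fallback it describes (a register with only a writefn has that cooked function used for raw writes as well) can be illustrated with a minimal sketch. All Demo* types and demo_* functions below are hypothetical stand-ins, not QEMU's ARMCPRegInfo machinery, and the TLB flush is simulated with a counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the CPU state. */
typedef struct DemoState {
    uint64_t ttbr0;
    unsigned tlb_flushes;   /* counts simulated TLB flushes */
} DemoState;

typedef struct DemoRegInfo DemoRegInfo;
typedef void DemoWriteFn(DemoState *s, const DemoRegInfo *ri, uint64_t value);

/* Hypothetical, simplified stand-in for a register description. */
struct DemoRegInfo {
    const char *name;
    DemoWriteFn *writefn;      /* "cooked" write, may have side effects */
    DemoWriteFn *raw_writefn;  /* plain state update, may be NULL */
};

/* Raw write: only stores the value. */
void demo_raw_write(DemoState *s, const DemoRegInfo *ri, uint64_t value)
{
    (void)ri;
    s->ttbr0 = value;
}

/* Cooked write: simulates TLB maintenance, then stores the value. */
void demo_ttbr_write(DemoState *s, const DemoRegInfo *ri, uint64_t value)
{
    s->tlb_flushes++;
    demo_raw_write(s, ri, value);
}

/*
 * Raw write path (for example when syncing state with an accelerator).
 * If no raw_writefn is provided, the cooked writefn is used instead,
 * so its side effects happen even though only a state update was wanted.
 */
void demo_write_raw(DemoState *s, const DemoRegInfo *ri, uint64_t value)
{
    assert(ri->writefn != NULL || ri->raw_writefn != NULL);
    if (ri->raw_writefn) {
        ri->raw_writefn(s, ri, value);
    } else {
        ri->writefn(s, ri, value);
    }
}
```

With raw_writefn left NULL the raw path bumps tlb_flushes; once it is set to demo_raw_write, the same call only updates ttbr0, which is the effect the patch aims for on the listed registers.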
- target/arm/cpu.h    |  1 +
+ target/arm/helper.c | 23 +++++++++++++----------
- target/arm/helper.c | 32 ++++++++++++++++++++++++++++++++
+file changed, 13 insertions(+), 10 deletions(-)
 files changed, 33 insertions(+)
-diff --git a/target/arm/cpu.h b/target/arm/cpu.h
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.h
-+++ b/target/arm/cpu.h
-@@ -XXX,XX +XXX,XX @@ typedef struct CPUArchState {
-             };
-             uint64_t tpidr_el[4];
-         };
-+        uint64_t tpidr2_el0;
-         /* The secure banks of these registers don't map anywhere */
-         uint64_t tpidrurw_s;
-         uint64_t tpidrprw_s;
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.c
 +++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo zcr_reginfo[] = {
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo vmsa_cp_reginfo[] = {
-       .writefn = zcr_write, .raw_writefn = raw_write },
+       .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 0, .opc2 = 0,
        .access = PL1_RW, .accessfn = access_tvm_trvm,
        .fgt = FGT_TTBR0_EL1,
 -      .writefn = vmsa_ttbr_write, .resetvalue = 0,
 +      .writefn = vmsa_ttbr_write, .resetvalue = 0, .raw_writefn = raw_write,
        .bank_fieldoffsets = { offsetof(CPUARMState, cp15.ttbr0_s),
                               offsetof(CPUARMState, cp15.ttbr0_ns) } },
      { .name = "TTBR1_EL1", .state = ARM_CP_STATE_BOTH,
        .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 0, .opc2 = 1,
        .access = PL1_RW, .accessfn = access_tvm_trvm,
        .fgt = FGT_TTBR1_EL1,
 -      .writefn = vmsa_ttbr_write, .resetvalue = 0,
 +      .writefn = vmsa_ttbr_write, .resetvalue = 0, .raw_writefn = raw_write,
        .bank_fieldoffsets = { offsetof(CPUARMState, cp15.ttbr1_s),
                               offsetof(CPUARMState, cp15.ttbr1_ns) } },
      { .name = "TCR_EL1", .state = ARM_CP_STATE_AA64,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo lpae_cp_reginfo[] = {
        .type = ARM_CP_64BIT | ARM_CP_ALIAS,
        .bank_fieldoffsets = { offsetof(CPUARMState, cp15.ttbr0_s),
                               offsetof(CPUARMState, cp15.ttbr0_ns) },
 -      .writefn = vmsa_ttbr_write, },
 +      .writefn = vmsa_ttbr_write, .raw_writefn = raw_write },
      { .name = "TTBR1", .cp = 15, .crm = 2, .opc1 = 1,
        .access = PL1_RW, .accessfn = access_tvm_trvm,
        .type = ARM_CP_64BIT | ARM_CP_ALIAS,
        .bank_fieldoffsets = { offsetof(CPUARMState, cp15.ttbr1_s),
                               offsetof(CPUARMState, cp15.ttbr1_ns) },
 -      .writefn = vmsa_ttbr_write, },
 +      .writefn = vmsa_ttbr_write, .raw_writefn = raw_write },
  };
-+#ifdef TARGET_AARCH64
+ static uint64_t aa64_fpcr_read(CPUARMState *env, const ARMCPRegInfo *ri)
-+static CPAccessResult access_tpidr2(CPUARMState *env, const ARMCPRegInfo *ri,
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo el2_cp_reginfo[] = {
-+                                    bool isread)
+       .type = ARM_CP_IO,
-+{
+       .opc0 = 3, .opc1 = 4, .crn = 1, .crm = 1, .opc2 = 0,
-+    int el = arm_current_el(env);
+       .access = PL2_RW, .fieldoffset = offsetof(CPUARMState, cp15.hcr_el2),
-+
+-      .writefn = hcr_write },
-+    if (el == 0) {
++      .writefn = hcr_write, .raw_writefn = raw_write },
-+        uint64_t sctlr = arm_sctlr(env, el);
+     { .name = "HCR", .state = ARM_CP_STATE_AA32,
-+        if (!(sctlr & SCTLR_EnTP2)) {
+       .type = ARM_CP_ALIAS | ARM_CP_IO,
-+            return CP_ACCESS_TRAP;
+       .cp = 15, .opc1 = 4, .crn = 1, .crm = 1, .opc2 = 0,
-+        }
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo el2_cp_reginfo[] = {
-+    }
+     { .name = "TCR_EL2", .state = ARM_CP_STATE_BOTH,
-+    /* TODO: FEAT_FGT */
+       .opc0 = 3, .opc1 = 4, .crn = 2, .crm = 0, .opc2 = 2,
-+    if (el < 3
+       .access = PL2_RW, .writefn = vmsa_tcr_el12_write,
-+        && arm_feature(env, ARM_FEATURE_EL3)
++      .raw_writefn = raw_write,
-+        && !(env->cp15.scr_el3 & SCR_ENTP2)) {
+       .fieldoffset = offsetof(CPUARMState, cp15.tcr_el[2]) },
-+        return CP_ACCESS_TRAP_EL3;
+     { .name = "VTCR", .state = ARM_CP_STATE_AA32,
-+    }
+       .cp = 15, .opc1 = 4, .crn = 2, .crm = 1, .opc2 = 2,
-+    return CP_ACCESS_OK;
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo el2_cp_reginfo[] = {
-+}
+       .type = ARM_CP_64BIT | ARM_CP_ALIAS,
-+
+       .access = PL2_RW, .accessfn = access_el3_aa32ns,
-+static const ARMCPRegInfo sme_reginfo[] = {
+       .fieldoffset = offsetof(CPUARMState, cp15.vttbr_el2),
-+    { .name = "TPIDR2_EL0", .state = ARM_CP_STATE_AA64,
+-      .writefn = vttbr_write },
-+      .opc0 = 3, .opc1 = 3, .crn = 13, .crm = 0, .opc2 = 5,
++      .writefn = vttbr_write, .raw_writefn = raw_write },
-+      .access = PL0_RW, .accessfn = access_tpidr2,
+     { .name = "VTTBR_EL2", .state = ARM_CP_STATE_AA64,
-+      .fieldoffset = offsetof(CPUARMState, cp15.tpidr2_el0) },
+       .opc0 = 3, .opc1 = 4, .crn = 2, .crm = 1, .opc2 = 0,
-+};
+-      .access = PL2_RW, .writefn = vttbr_write,
-+#endif /* TARGET_AARCH64 */
++      .access = PL2_RW, .writefn = vttbr_write, .raw_writefn = raw_write,
-+
+       .fieldoffset = offsetof(CPUARMState, cp15.vttbr_el2) },
- void hw_watchpoint_update(ARMCPU *cpu, int n)
+     { .name = "SCTLR_EL2", .state = ARM_CP_STATE_BOTH,
- {
+       .opc0 = 3, .opc1 = 4, .crn = 1, .crm = 0, .opc2 = 0,
-     CPUARMState *env = &cpu->env;
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo el2_cp_reginfo[] = {
-@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
+       .fieldoffset = offsetof(CPUARMState, cp15.tpidr_el[2]) },
-     }
+     { .name = "TTBR0_EL2", .state = ARM_CP_STATE_AA64,
+       .opc0 = 3, .opc1 = 4, .crn = 2, .crm = 0, .opc2 = 0,
- #ifdef TARGET_AARCH64
+-      .access = PL2_RW, .resetvalue = 0, .writefn = vmsa_tcr_ttbr_el2_write,
-+    if (cpu_isar_feature(aa64_sme, cpu)) {
++      .access = PL2_RW, .resetvalue = 0,
-+        define_arm_cp_regs(cpu, sme_reginfo);
++      .writefn = vmsa_tcr_ttbr_el2_write, .raw_writefn = raw_write,
-+    }
+       .fieldoffset = offsetof(CPUARMState, cp15.ttbr0_el[2]) },
-     if (cpu_isar_feature(aa64_pauth, cpu)) {
+     { .name = "HTTBR", .cp = 15, .opc1 = 4, .crm = 2,
-         define_arm_cp_regs(cpu, pauth_reginfo);
+       .access = PL2_RW, .type = ARM_CP_64BIT | ARM_CP_ALIAS,
-     }
+@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo el3_cp_reginfo[] = {
      { .name = "SCR_EL3", .state = ARM_CP_STATE_AA64,
        .opc0 = 3, .opc1 = 6, .crn = 1, .crm = 1, .opc2 = 0,
        .access = PL3_RW, .fieldoffset = offsetof(CPUARMState, cp15.scr_el3),
 -      .resetfn = scr_reset, .writefn = scr_write },
 +      .resetfn = scr_reset, .writefn = scr_write, .raw_writefn = raw_write },
      { .name = "SCR",  .type = ARM_CP_ALIAS | ARM_CP_NEWEL,
        .cp = 15, .opc1 = 0, .crn = 1, .crm = 1, .opc2 = 0,
        .access = PL1_RW, .accessfn = access_trap_aa32s_el1,
        .fieldoffset = offsetoflow32(CPUARMState, cp15.scr_el3),
 -      .writefn = scr_write },
 +      .writefn = scr_write, .raw_writefn = raw_write },
      { .name = "SDER32_EL3", .state = ARM_CP_STATE_AA64,
        .opc0 = 3, .opc1 = 6, .crn = 1, .crm = 1, .opc2 = 1,
        .access = PL3_RW, .resetvalue = 0,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo vhe_reginfo[] = {
      { .name = "TTBR1_EL2", .state = ARM_CP_STATE_AA64,
        .opc0 = 3, .opc1 = 4, .crn = 2, .crm = 0, .opc2 = 1,
        .access = PL2_RW, .writefn = vmsa_tcr_ttbr_el2_write,
 +      .raw_writefn = raw_write,
        .fieldoffset = offsetof(CPUARMState, cp15.ttbr1_el[2]) },
  #ifndef CONFIG_USER_ONLY
      { .name = "CNTHV_CVAL_EL2", .state = ARM_CP_STATE_AA64,
 --
-.25.1
+.34.1

-[PULL 17/25] target/arm: Generalize cpu_arm_{get, set}_default_vec_len
+[PULL 02/14] hw/arm/sbsa-ref: use XHCI to replace EHCI
-From: Richard Henderson <richard.henderson@linaro.org>
+From: Yuquan Wang <wangyuquan1236@phytium.com.cn>
-Rename from cpu_arm_{get,set}_sve_default_vec_len,
+The current sbsa-ref cannot use EHCI controller which is only
-and take the pointer to default_vq from opaque.
+able to do 32-bit DMA, since sbsa-ref doesn't have RAM below 4GB.
 Hence, this uses XHCI to provide a usb controller with 64-bit
 DMA capability instead of EHCI.
+We bump the platform version to 0.3 with this change.  Although the
+hardware at the USB controller address changes, the firmware and
+Linux can both cope with this -- on an older non-XHCI-aware
+firmware/kernel setup the probe routine simply fails and the guest
+proceeds without any USB.  (This isn't a loss of functionality,
+because the old USB controller never worked in the first place.) So
+we can call this a backwards-compatible change and only bump the
+minor version.
+Signed-off-by: Yuquan Wang <wangyuquan1236@phytium.com.cn>
+Message-id: 20230621103847.447508-2-wangyuquan1236@phytium.com.cn
+[PMM: tweaked commit message; add line to docs about what
+ changes in platform version 0.3]
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20220620175235.60881-15-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
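Note on the commit message above: the constraint it relies on (a controller limited to 32-bit DMA cannot reach RAM that lives entirely above 4GB) can be checked with a small helper. This is a sketch only; dma_range_reachable() is a hypothetical name, not a QEMU function, and the addresses in the usage note are example values rather than the board's actual memory map.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true if the byte range [addr, addr + len)
 * can be addressed by a device whose DMA engine emits dma_bits address bits.
 */
bool dma_range_reachable(uint64_t addr, uint64_t len, unsigned dma_bits)
{
    assert(len > 0);
    assert(dma_bits >= 1 && dma_bits <= 64);

    uint64_t last = addr + (len - 1);
    if (last < addr) {
        return false;           /* range wraps around the address space */
    }
    if (dma_bits == 64) {
        return true;            /* every 64-bit address is reachable */
    }
    return last < (UINT64_C(1) << dma_bits);
}
```

For an example buffer at 0x10000000000 (well above 4GB), dma_range_reachable(addr, 4096, 32) is false while the same call with 64 address bits is true, which is the difference between the EHCI and XHCI cases described in the message.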
- target/arm/cpu64.c | 27 ++++++++++++++-------------
+ docs/system/arm/sbsa.rst |  5 ++++-
-file changed, 14 insertions(+), 13 deletions(-)
+ hw/arm/sbsa-ref.c        | 23 +++++++++++++----------
  hw/arm/Kconfig           |  2 +-
 files changed, 18 insertions(+), 12 deletions(-)
-diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
+diff --git a/docs/system/arm/sbsa.rst b/docs/system/arm/sbsa.rst
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu64.c
+--- a/docs/system/arm/sbsa.rst
-+++ b/target/arm/cpu64.c
++++ b/docs/system/arm/sbsa.rst
-@@ -XXX,XX +XXX,XX @@ static void cpu_arm_set_sve(Object *obj, bool value, Error **errp)
+@@ -XXX,XX +XXX,XX @@ The ``sbsa-ref`` board supports:
+   - A configurable number of AArch64 CPUs
- #ifdef CONFIG_USER_ONLY
+   - GIC version 3
- /* Mirror linux /proc/sys/abi/sve_default_vector_length. */
+   - System bus AHCI controller
--static void cpu_arm_set_sve_default_vec_len(Object *obj, Visitor *v,
+-  - System bus EHCI controller
--                                            const char *name, void *opaque,
++  - System bus XHCI controller
--                                            Error **errp)
+   - CDROM and hard disc on AHCI bus
-+static void cpu_arm_set_default_vec_len(Object *obj, Visitor *v,
+   - E1000E ethernet card on PCIe bus
-+                                        const char *name, void *opaque,
+   - Bochs display adapter on PCIe bus
-+                                        Error **errp)
+@@ -XXX,XX +XXX,XX @@ Platform version changes:
 .2
    GIC ITS information is present in devicetree.
 +
 +0.3
 +  The USB controller is an XHCI device, not EHCI
 diff --git a/hw/arm/sbsa-ref.c b/hw/arm/sbsa-ref.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/arm/sbsa-ref.c
 +++ b/hw/arm/sbsa-ref.c
@@ -XXX,XX +XXX,XX @@
  #include "hw/pci-host/gpex.h"
  #include "hw/qdev-properties.h"
  #include "hw/usb.h"
 +#include "hw/usb/xhci.h"
  #include "hw/char/pl011.h"
  #include "hw/watchdog/sbsa_gwdt.h"
  #include "net/net.h"
@@ -XXX,XX +XXX,XX @@ enum {
      SBSA_SECURE_UART_MM,
      SBSA_SECURE_MEM,
      SBSA_AHCI,
 -    SBSA_EHCI,
 +    SBSA_XHCI,
  };
  struct SBSAMachineState {
@@ -XXX,XX +XXX,XX @@ static const MemMapEntry sbsa_ref_memmap[] = {
      [SBSA_SMMU] =               { 0x60050000, 0x00020000 },
      /* Space here reserved for more SMMUs */
      [SBSA_AHCI] =               { 0x60100000, 0x00010000 },
 -    [SBSA_EHCI] =               { 0x60110000, 0x00010000 },
 +    [SBSA_XHCI] =               { 0x60110000, 0x00010000 },
      /* Space here reserved for other devices */
      [SBSA_PCIE_PIO] =           { 0x7fff0000, 0x00010000 },
      /* 32-bit address PCIE MMIO space */
@@ -XXX,XX +XXX,XX @@ static const int sbsa_ref_irqmap[] = {
      [SBSA_SECURE_UART] = 8,
      [SBSA_SECURE_UART_MM] = 9,
      [SBSA_AHCI] = 10,
 -    [SBSA_EHCI] = 11,
 +    [SBSA_XHCI] = 11,
      [SBSA_SMMU] = 12, /* ... to 15 */
      [SBSA_GWDT_WS0] = 16,
  };
@@ -XXX,XX +XXX,XX @@ static void create_fdt(SBSAMachineState *sms)
       *                        fw compatibility.
       */
      qemu_fdt_setprop_cell(fdt, "/", "machine-version-major", 0);
 -    qemu_fdt_setprop_cell(fdt, "/", "machine-version-minor", 2);
 +    qemu_fdt_setprop_cell(fdt, "/", "machine-version-minor", 3);
      if (ms->numa_state->have_numa_distance) {
          int size = nb_numa_nodes * nb_numa_nodes * 3 * sizeof(uint32_t);
@@ -XXX,XX +XXX,XX @@ static void create_ahci(const SBSAMachineState *sms)
      }
  }
 -static void create_ehci(const SBSAMachineState *sms)
 +static void create_xhci(const SBSAMachineState *sms)
  {
--    ARMCPU *cpu = ARM_CPU(obj);
+-    hwaddr base = sbsa_ref_memmap[SBSA_EHCI].base;
-+    uint32_t *ptr_default_vq = opaque;
+-    int irq = sbsa_ref_irqmap[SBSA_EHCI];
-     int32_t default_len, default_vq, remainder;
++    hwaddr base = sbsa_ref_memmap[SBSA_XHCI].base;
++    int irq = sbsa_ref_irqmap[SBSA_XHCI];
-     if (!visit_type_int32(v, name, &default_len, errp)) {
++    DeviceState *dev = qdev_new(TYPE_XHCI_SYSBUS);
-@@ -XXX,XX +XXX,XX @@ static void cpu_arm_set_sve_default_vec_len(Object *obj, Visitor *v,
+-    sysbus_create_simple("platform-ehci-usb", base,
-     /* Undocumented, but the kernel allows -1 to indicate "maximum". */
+-                         qdev_get_gpio_in(sms->gic, irq));
-     if (default_len == -1) {
++    sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
--        cpu->sve_default_vq = ARM_MAX_VQ;
++    sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, base);
-+        *ptr_default_vq = ARM_MAX_VQ;
++    sysbus_connect_irq(SYS_BUS_DEVICE(dev), 0, qdev_get_gpio_in(sms->gic, irq));
          return;
      }
@@ -XXX,XX +XXX,XX @@ static void cpu_arm_set_sve_default_vec_len(Object *obj, Visitor *v,
          return;
      }
 -    cpu->sve_default_vq = default_vq;
 +    *ptr_default_vq = default_vq;
  }
--static void cpu_arm_get_sve_default_vec_len(Object *obj, Visitor *v,
+ static void create_smmu(const SBSAMachineState *sms, PCIBus *bus)
--                                            const char *name, void *opaque,
+@@ -XXX,XX +XXX,XX @@ static void sbsa_ref_init(MachineState *machine)
--                                            Error **errp)
-+static void cpu_arm_get_default_vec_len(Object *obj, Visitor *v,
+     create_ahci(sms);
-+                                        const char *name, void *opaque,
-+                                        Error **errp)
+-    create_ehci(sms);
- {
++    create_xhci(sms);
--    ARMCPU *cpu = ARM_CPU(obj);
--    int32_t value = cpu->sve_default_vq * 16;
+     create_pcie(sms);
-+    uint32_t *ptr_default_vq = opaque;
-+    int32_t value = *ptr_default_vq * 16;
+diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig
+index XXXXXXX..XXXXXXX 100644
-     visit_type_int32(v, name, &value, errp);
+--- a/hw/arm/Kconfig
- }
++++ b/hw/arm/Kconfig
-@@ -XXX,XX +XXX,XX @@ void aarch64_add_sve_properties(Object *obj)
+@@ -XXX,XX +XXX,XX @@ config SBSA_REF
- #ifdef CONFIG_USER_ONLY
+     select PL011 # UART
-     /* Mirror linux /proc/sys/abi/sve_default_vector_length. */
+     select PL031 # RTC
-     object_property_add(obj, "sve-default-vector-length", "int32",
+     select PL061 # GPIO
--                        cpu_arm_get_sve_default_vec_len,
+-    select USB_EHCI_SYSBUS
--                        cpu_arm_set_sve_default_vec_len, NULL, NULL);
++    select USB_XHCI_SYSBUS
-+                        cpu_arm_get_default_vec_len,
+     select WDT_SBSA
-+                        cpu_arm_set_default_vec_len, NULL,
+     select BOCHS_DISPLAY
 +                        &cpu->sve_default_vq);
  #endif
  }
 --
-.25.1
+.34.1

-[PULL 25/25] target/arm: Check V7VE as well as LPAE in arm_pamax
+[PULL 03/14] tests/tcg/aarch64/sysregs.c: Use S syntax for id_aa64zfr0_el1 and id_aa64smfr0_el1
-From: Richard Henderson <richard.henderson@linaro.org>
+Some assemblers will complain about attempts to access
 id_aa64zfr0_el1 and id_aa64smfr0_el1 by name if the test
 binary isn't built for the right processor type:
-In machvirt_init we create a cpu but do not fully initialize it.
+ /tmp/ccASXpLo.s:782: Error: selected processor does not support system register name 'id_aa64zfr0_el1'
-Thus the propagation of V7VE to LPAE has not been done, and we
+ /tmp/ccASXpLo.s:829: Error: selected processor does not support system register name 'id_aa64smfr0_el1'
 compute the wrong value for some v7 cpus, e.g. cortex-a15.
-Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1078
+However, these registers are in the ID space and are guaranteed to
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+read-as-zero on older CPUs, so the access is both safe and sensible.
-Reported-by: He Zhe <zhe.he@windriver.com>
+Switch to using the S syntax, as we already do for ID_AA64ISAR2_EL1
-Message-id: 20220619001541.131672-3-richard.henderson@linaro.org
+and ID_AA64MMFR2_EL1.  This allows us to drop the HAS_ARMV9_SME check
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+and the makefile machinery to adjust the CFLAGS for this test, so we
 don't rely on having a sufficiently new compiler to be able to check
 these registers.
 This means we're actually testing the SME ID register: no released
 GCC yet recognizes -march=armv9-a+sme, so that was always skipped.
 It also avoids a future problem if we try to switch the "do we have
 SME support in the toolchain" check from "in the compiler" to "in the
 assembler" (at which point we would otherwise run into the above
 errors).
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
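Note on the S syntax used by this patch: the generic name is built from the five encoding fields of the register (the same fields that appear as .opc0, .opc1, .crn, .crm and .opc2 in target/arm/helper.c). The sketch below formats such a name; sysreg_s_name() and sysreg_s_name_matches() are hypothetical helpers written for illustration, and the field values in the usage note are simply read back from the S3_0_C0_C4_4 and S3_0_C0_C4_5 spellings in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write the generic "S<op0>_<op1>_C<CRn>_C<CRm>_<op2>"
 * spelling of a system register into buf. Returns the snprintf() result.
 */
int sysreg_s_name(char *buf, size_t len, unsigned op0, unsigned op1,
                  unsigned crn, unsigned crm, unsigned op2)
{
    /* Field widths: op0 is 2 bits, op1 and op2 are 3 bits, CRn/CRm are 4. */
    assert(op0 <= 3 && op1 <= 7 && crn <= 15 && crm <= 15 && op2 <= 7);
    return snprintf(buf, len, "S%u_%u_C%u_C%u_%u", op0, op1, crn, crm, op2);
}

/* Hypothetical helper: returns 1 if the generated name equals expected. */
int sysreg_s_name_matches(const char *expected, unsigned op0, unsigned op1,
                          unsigned crn, unsigned crm, unsigned op2)
{
    char buf[32];

    sysreg_s_name(buf, sizeof(buf), op0, op1, crn, crm, op2);
    return strcmp(buf, expected) == 0;
}
```

For example, fields (3, 0, 0, 4, 4) give "S3_0_C0_C4_4", the spelling the patch uses for ID_AA64ZFR0_EL1, and (3, 0, 0, 4, 5) give "S3_0_C0_C4_5" for ID_AA64SMFR0_EL1. Because the assembler only sees the numeric encoding, no -march option is needed to name the register.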
- target/arm/ptw.c | 8 +++++++-
+ tests/tcg/aarch64/sysregs.c       | 11 +++++++----
-file changed, 7 insertions(+), 1 deletion(-)
+ tests/tcg/aarch64/Makefile.target |  7 +------
 files changed, 8 insertions(+), 10 deletions(-)
-diff --git a/target/arm/ptw.c b/target/arm/ptw.c
+diff --git a/tests/tcg/aarch64/sysregs.c b/tests/tcg/aarch64/sysregs.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/ptw.c
+--- a/tests/tcg/aarch64/sysregs.c
-+++ b/target/arm/ptw.c
++++ b/tests/tcg/aarch64/sysregs.c
-@@ -XXX,XX +XXX,XX @@ unsigned int arm_pamax(ARMCPU *cpu)
+@@ -XXX,XX +XXX,XX @@
-         assert(parange < ARRAY_SIZE(pamax_map));
+ /*
-         return pamax_map[parange];
+  * Older assemblers don't recognize newer system register names,
-     }
+  * but we can still access them by the Sn_n_Cn_Cn_n syntax.
--    if (arm_feature(&cpu->env, ARM_FEATURE_LPAE)) {
++ * This also means we don't need to specifically request that the
-+
++ * assembler enables whatever architectural features the ID registers
-+    /*
++ * syntax might be gated behind.
-+     * In machvirt_init, we call arm_pamax on a cpu that is not fully
+  */
-+     * initialized, so we can't rely on the propagation done in realize.
+ #define SYS_ID_AA64ISAR2_EL1 S3_0_C0_C6_2
-+     */
+ #define SYS_ID_AA64MMFR2_EL1 S3_0_C0_C7_2
-+    if (arm_feature(&cpu->env, ARM_FEATURE_LPAE) ||
++#define SYS_ID_AA64ZFR0_EL1 S3_0_C0_C4_4
-+        arm_feature(&cpu->env, ARM_FEATURE_V7VE)) {
++#define SYS_ID_AA64SMFR0_EL1 S3_0_C0_C4_5
-         /* v7 with LPAE */
-         return 40;
+ int failed_bit_count;
-     }
@@ -XXX,XX +XXX,XX @@ int main(void)
      /* all hidden, DebugVer fixed to 0x6 (ARMv8 debug architecture) */
      get_cpu_reg_check_mask(id_aa64dfr0_el1,  _m(0000,0000,0000,0006));
      get_cpu_reg_check_zero(id_aa64dfr1_el1);
 -    get_cpu_reg_check_mask(id_aa64zfr0_el1,  _m(0ff0,ff0f,00ff,00ff));
 -#ifdef HAS_ARMV9_SME
 -    get_cpu_reg_check_mask(id_aa64smfr0_el1, _m(80f1,00fd,0000,0000));
 -#endif
 +    get_cpu_reg_check_mask(SYS_ID_AA64ZFR0_EL1,  _m(0ff0,ff0f,00ff,00ff));
 +    get_cpu_reg_check_mask(SYS_ID_AA64SMFR0_EL1, _m(80f1,00fd,0000,0000));
      get_cpu_reg_check_zero(id_aa64afr0_el1);
      get_cpu_reg_check_zero(id_aa64afr1_el1);
 diff --git a/tests/tcg/aarch64/Makefile.target b/tests/tcg/aarch64/Makefile.target
 index XXXXXXX..XXXXXXX 100644
 --- a/tests/tcg/aarch64/Makefile.target
 +++ b/tests/tcg/aarch64/Makefile.target
@@ -XXX,XX +XXX,XX @@ AARCH64_TESTS += mte-1 mte-2 mte-3 mte-4 mte-5 mte-6 mte-7
  mte-%: CFLAGS += -march=armv8.5-a+memtag
  endif
 -ifneq ($(CROSS_CC_HAS_SVE),)
  # System Registers Tests
  AARCH64_TESTS += sysregs
 -ifneq ($(CROSS_CC_HAS_ARMV9_SME),)
 -sysregs: CFLAGS+=-march=armv9-a+sme -DHAS_ARMV9_SME
 -else
 -sysregs: CFLAGS+=-march=armv8.1-a+sve
 -endif
 +ifneq ($(CROSS_CC_HAS_SVE),)
  # SVE ioctl test
  AARCH64_TESTS += sve-ioctls
  sve-ioctls: CFLAGS+=-march=armv8.1-a+sve
 --
-.25.1
+.34.1

-[PULL 20/25] target/arm: Add cpu properties for SME
+[PULL 04/14] target/arm: Avoid splitting Zregs across lines in dump
 From: Richard Henderson <richard.henderson@linaro.org>
-Mirror the properties for SVE.  The main difference is
+Allow the line length to extend to 548 columns.  While annoyingly wide,
-that any arbitrary set of powers of 2 may be supported,
+it's still less confusing than the continuations we print.  Also, the
-and not the stricter constraints that apply to SVE.
+default VL used by Linux (and max for A64FX) uses only 140 columns.
 Include a property to control FEAT_SME_FA64, as failing
 to restrict the runtime to the proper subset of insns
 could be a major point for bugs.
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20230622151201.1578522-2-richard.henderson@linaro.org
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Message-id: 20220620175235.60881-18-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
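Note on the column counts quoted in the commit message: they can be reproduced with simple arithmetic under an assumed line layout. The layout below ("Znn=" prefix, 16 hex digits per 64-bit word, ':' between words, trailing newline) is an assumption made for this sketch and is not taken from the patch text; zreg_dump_columns() is a hypothetical helper. The 64-byte figure for the Linux default vector length is likewise this note's reading of the message, not something the message states.

```c
#include <assert.h>

/*
 * Hypothetical helper: number of characters (including the newline) needed
 * to print one Z register of vl_bytes bytes on a single line, assuming the
 * layout "Znn=" + 16 hex digits per 64-bit word + ':' separators + '\n'.
 */
unsigned zreg_dump_columns(unsigned vl_bytes)
{
    /* SVE vector lengths are multiples of 16 bytes (128 bits). */
    assert(vl_bytes >= 16 && vl_bytes % 16 == 0);

    unsigned words = vl_bytes / 8;      /* 64-bit words per register */
    unsigned prefix = 4;                /* "Znn=" */
    unsigned digits = words * 16;       /* hex digits */
    unsigned separators = words - 1;    /* ':' between words */

    return prefix + digits + separators + 1;   /* +1 for the newline */
}
```

Under these assumptions a 256-byte (2048-bit) vector needs 4 + 512 + 31 + 1 = 548 columns and a 64-byte (512-bit) vector needs 4 + 128 + 7 + 1 = 140, matching the two numbers in the commit message.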
- docs/system/arm/cpu-features.rst |  56 +++++++++++++++
+ target/arm/cpu.c | 36 ++++++++++++++----------------------
- target/arm/cpu.h                 |   2 +
+file changed, 14 insertions(+), 22 deletions(-)
  target/arm/internals.h           |   1 +
  target/arm/cpu.c                 |  14 +++-
  target/arm/cpu64.c               | 114 +++++++++++++++++++++++++++++--
 files changed, 180 insertions(+), 7 deletions(-)
-diff --git a/docs/system/arm/cpu-features.rst b/docs/system/arm/cpu-features.rst
-index XXXXXXX..XXXXXXX 100644
---- a/docs/system/arm/cpu-features.rst
-+++ b/docs/system/arm/cpu-features.rst
-@@ -XXX,XX +XXX,XX @@ verbose command lines.  However, the recommended way to select vector
- lengths is to explicitly enable each desired length.  Therefore only
- example's (1), (4), and (6) exhibit recommended uses of the properties.
-+SME CPU Property Examples
-+-------------------------
-+
-+  1) Disable SME::
-+
-+     $ qemu-system-aarch64 -M virt -cpu max,sme=off
-+
-+  2) Implicitly enable all vector lengths for the ``max`` CPU type::
-+
-+     $ qemu-system-aarch64 -M virt -cpu max
-+
-+  3) Only enable the 256-bit vector length::
-+
-+     $ qemu-system-aarch64 -M virt -cpu max,sme256=on
-+
-+  3) Enable the 256-bit and 1024-bit vector lengths::
-+
-+     $ qemu-system-aarch64 -M virt -cpu max,sme256=on,sme1024=on
-+
-+  4) Disable the 512-bit vector length.  This results in all the other
-+     lengths supported by ``max`` defaulting to enabled
-+     (128, 256, 1024 and 2048)::
-+
-+     $ qemu-system-aarch64 -M virt -cpu max,sve512=off
-+
- SVE User-mode Default Vector Length Property
- --------------------------------------------
-@@ -XXX,XX +XXX,XX @@ length supported by QEMU is 256.
- If this property is set to ``-1`` then the default vector length
- is set to the maximum possible length.
-+
-+SME CPU Properties
-+==================
-+
-+The SME CPU properties are much like the SVE properties: ``sme`` is
-+used to enable or disable the entire SME feature, and ``sme<N>`` is
-+used to enable or disable specific vector lengths.  Finally,
-+``sme_fa64`` is used to enable or disable ``FEAT_SME_FA64``, which
-+allows execution of the "full a64" instruction set while Streaming
-+SVE mode is enabled.
-+
-+SME is not supported by KVM at this time.
-+
-+At least one vector length must be enabled when ``sme`` is enabled,
-+and all vector lengths must be powers of 2.  The maximum vector
-+length supported by qemu is 2048 bits.  Otherwise, there are no
-+additional constraints on the set of vector lengths supported by SME.
-+
-+SME User-mode Default Vector Length Property
-+--------------------------------------------
-+
-+For qemu-aarch64, the cpu property ``sme-default-vector-length=N`` is
-+defined to mirror the Linux kernel parameter file
-+``/proc/sys/abi/sme_default_vector_length``.  The default length, ``N``,
-+is in units of bytes and must be between 16 and 8192.
-+If not specified, the default vector length is 32.
-+
-+As with ``sve-default-vector-length``, if the default length is larger
-+than the maximum vector length enabled, the actual vector length will
-+be reduced.  If this property is set to ``-1`` then the default vector
-+length is set to the maximum possible length.
-diff --git a/target/arm/cpu.h b/target/arm/cpu.h
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.h
-+++ b/target/arm/cpu.h
-@@ -XXX,XX +XXX,XX @@ struct ArchCPU {
- #ifdef CONFIG_USER_ONLY
-     /* Used to set the default vector length at process start. */
-     uint32_t sve_default_vq;
-+    uint32_t sme_default_vq;
- #endif
-     ARMVQMap sve_vq;
-+    ARMVQMap sme_vq;
-     /* Generic timer counter frequency, in Hz */
-     uint64_t gt_cntfrq_hz;
-diff --git a/target/arm/internals.h b/target/arm/internals.h
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/internals.h
-+++ b/target/arm/internals.h
-@@ -XXX,XX +XXX,XX @@ int arm_gdb_set_svereg(CPUARMState *env, uint8_t *buf, int reg);
- int aarch64_fpu_gdb_get_reg(CPUARMState *env, GByteArray *buf, int reg);
- int aarch64_fpu_gdb_set_reg(CPUARMState *env, uint8_t *buf, int reg);
- void arm_cpu_sve_finalize(ARMCPU *cpu, Error **errp);
-+void arm_cpu_sme_finalize(ARMCPU *cpu, Error **errp);
- void arm_cpu_pauth_finalize(ARMCPU *cpu, Error **errp);
- void arm_cpu_lpa2_finalize(ARMCPU *cpu, Error **errp);
- #endif
 diff --git a/target/arm/cpu.c b/target/arm/cpu.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu.c
 +++ b/target/arm/cpu.c
-@@ -XXX,XX +XXX,XX @@ static void arm_cpu_initfn(Object *obj)
+@@ -XXX,XX +XXX,XX @@ static void aarch64_cpu_dump_state(CPUState *cs, FILE *f, int flags)
- #ifdef CONFIG_USER_ONLY
+     ARMCPU *cpu = ARM_CPU(cs);
- # ifdef TARGET_AARCH64
+     CPUARMState *env = &cpu->env;
-     /*
+     uint32_t psr = pstate_read(env);
--     * The linux kernel defaults to 512-bit vectors, when sve is supported.
+-    int i;
--     * See documentation for /proc/sys/abi/sve_default_vector_length, and
++    int i, j;
--     * our corresponding sve-default-vector-length cpu property.
+     int el = arm_current_el(env);
-+     * The linux kernel defaults to 512-bit for SVE, and 256-bit for SME.
+     const char *ns_status;
-+     * These values were chosen to fit within the default signal frame.
+     bool sve;
-+     * See documentation for /proc/sys/abi/{sve,sme}_default_vector_length,
+@@ -XXX,XX +XXX,XX @@ static void aarch64_cpu_dump_state(CPUState *cs, FILE *f, int flags)
-+     * and our corresponding cpu property.
+     }
-      */
-     cpu->sve_default_vq = 4;
+     if (sve) {
-+    cpu->sme_default_vq = 2;
+-        int j, zcr_len = sve_vqm1_for_el(env, el);
- # endif
++        int zcr_len = sve_vqm1_for_el(env, el);
- #else
-     /* Our inbound IRQ and FIQ lines */
+         for (i = 0; i <= FFR_PRED_NUM; i++) {
-@@ -XXX,XX +XXX,XX @@ void arm_cpu_finalize_features(ARMCPU *cpu, Error **errp)
+             bool eol;
-             return;
+@@ -XXX,XX +XXX,XX @@ static void aarch64_cpu_dump_state(CPUState *cs, FILE *f, int flags)
              }
          }
-+        arm_cpu_sme_finalize(cpu, &local_err);
+-        for (i = 0; i < 32; i++) {
-+        if (local_err != NULL) {
+-            if (zcr_len == 0) {
-+            error_propagate(errp, local_err);
++        if (zcr_len == 0) {
-+            return;
++            /*
-+        }
++             * With vl=16, there are only 37 columns per register,
-+
++             * so output two registers per line.
-         arm_cpu_pauth_finalize(cpu, &local_err);
++             */
-         if (local_err != NULL) {
++            for (i = 0; i < 32; i++) {
-             error_propagate(errp, local_err);
+                 qemu_fprintf(f, "Z%02d=%016" PRIx64 ":%016" PRIx64 "%s",
-diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
+                              i, env->vfp.zregs[i].d[1],
-index XXXXXXX..XXXXXXX 100644
+                              env->vfp.zregs[i].d[0], i & 1 ? "\n" : " ");
---- a/target/arm/cpu64.c
+-            } else if (zcr_len == 1) {
-+++ b/target/arm/cpu64.c
+-                qemu_fprintf(f, "Z%02d=%016" PRIx64 ":%016" PRIx64
-@@ -XXX,XX +XXX,XX @@ static void cpu_arm_get_vq(Object *obj, Visitor *v, const char *name,
+-                             ":%016" PRIx64 ":%016" PRIx64 "\n",
-     ARMCPU *cpu = ARM_CPU(obj);
+-                             i, env->vfp.zregs[i].d[3], env->vfp.zregs[i].d[2],
-     ARMVQMap *vq_map = opaque;
+-                             env->vfp.zregs[i].d[1], env->vfp.zregs[i].d[0]);
-     uint32_t vq = atoi(&name[3]) / 128;
+-            } else {
-+    bool sve = vq_map == &cpu->sve_vq;
++            }
-     bool value;
++        } else {
++            for (i = 0; i < 32; i++) {
--    /* All vector lengths are disabled when SVE is off. */
++                qemu_fprintf(f, "Z%02d=", i);
--    if (!cpu_isar_feature(aa64_sve, cpu)) {
+                 for (j = zcr_len; j >= 0; j--) {
-+    /* All vector lengths are disabled when feature is off. */
+-                    bool odd = (zcr_len - j) % 2 != 0;
-+    if (sve
+-                    if (j == zcr_len) {
-+        ? !cpu_isar_feature(aa64_sve, cpu)
+-                        qemu_fprintf(f, "Z%02d[%x-%x]=", i, j, j - 1);
-+        : !cpu_isar_feature(aa64_sme, cpu)) {
+-                    } else if (!odd) {
-         value = false;
+-                        if (j > 0) {
-     } else {
+-                            qemu_fprintf(f, "   [%x-%x]=", j, j - 1);
-         value = extract32(vq_map->map, vq - 1, 1);
+-                        } else {
-@@ -XXX,XX +XXX,XX @@ static void cpu_arm_set_sve(Object *obj, bool value, Error **errp)
+-                            qemu_fprintf(f, "     [%x]=", j);
-     cpu->isar.id_aa64pfr0 = t;
+-                        }
- }
+-                    }
+                     qemu_fprintf(f, "%016" PRIx64 ":%016" PRIx64 "%s",
-+void arm_cpu_sme_finalize(ARMCPU *cpu, Error **errp)
+                                  env->vfp.zregs[i].d[j * 2 + 1],
-+{
+-                                 env->vfp.zregs[i].d[j * 2],
-+    uint32_t vq_map = cpu->sme_vq.map;
+-                                 odd || j == 0 ? "\n" : ":");
-+    uint32_t vq_init = cpu->sme_vq.init;
++                                 env->vfp.zregs[i].d[j * 2 + 0],
-+    uint32_t vq_supported = cpu->sme_vq.supported;
++                                 j ? ":" : "\n");
-+    uint32_t vq;
+                 }
-+
+             }
-+    if (vq_map == 0) {
+         }
 +        if (!cpu_isar_feature(aa64_sme, cpu)) {
 +            cpu->isar.id_aa64smfr0 = 0;
 +            return;
 +        }
 +
 +        /* TODO: KVM will require limitations via SMCR_EL2. */
 +        vq_map = vq_supported & ~vq_init;
 +
 +        if (vq_map == 0) {
 +            vq = ctz32(vq_supported) + 1;
 +            error_setg(errp, "cannot disable sme%d", vq * 128);
 +            error_append_hint(errp, "All SME vector lengths are disabled.\n");
 +            error_append_hint(errp, "With SME enabled, at least one "
 +                              "vector length must be enabled.\n");
 +            return;
 +        }
 +    } else {
 +        if (!cpu_isar_feature(aa64_sme, cpu)) {
 +            vq = 32 - clz32(vq_map);
 +            error_setg(errp, "cannot enable sme%d", vq * 128);
 +            error_append_hint(errp, "SME must be enabled to enable "
 +                              "vector lengths.\n");
 +            error_append_hint(errp, "Add sme=on to the CPU property list.\n");
 +            return;
 +        }
 +        /* TODO: KVM will require limitations via SMCR_EL2. */
 +    }
 +
 +    cpu->sme_vq.map = vq_map;
 +}
 +
 +static bool cpu_arm_get_sme(Object *obj, Error **errp)
 +{
 +    ARMCPU *cpu = ARM_CPU(obj);
 +    return cpu_isar_feature(aa64_sme, cpu);
 +}
 +
 +static void cpu_arm_set_sme(Object *obj, bool value, Error **errp)
 +{
 +    ARMCPU *cpu = ARM_CPU(obj);
 +    uint64_t t;
 +
 +    t = cpu->isar.id_aa64pfr1;
 +    t = FIELD_DP64(t, ID_AA64PFR1, SME, value);
 +    cpu->isar.id_aa64pfr1 = t;
 +}
 +
 +static bool cpu_arm_get_sme_fa64(Object *obj, Error **errp)
 +{
 +    ARMCPU *cpu = ARM_CPU(obj);
 +    return cpu_isar_feature(aa64_sme, cpu) &&
 +           cpu_isar_feature(aa64_sme_fa64, cpu);
 +}
 +
 +static void cpu_arm_set_sme_fa64(Object *obj, bool value, Error **errp)
 +{
 +    ARMCPU *cpu = ARM_CPU(obj);
 +    uint64_t t;
 +
 +    t = cpu->isar.id_aa64smfr0;
 +    t = FIELD_DP64(t, ID_AA64SMFR0, FA64, value);
 +    cpu->isar.id_aa64smfr0 = t;
 +}
 +
  #ifdef CONFIG_USER_ONLY
 -/* Mirror linux /proc/sys/abi/sve_default_vector_length. */
 +/* Mirror linux /proc/sys/abi/{sve,sme}_default_vector_length. */
  static void cpu_arm_set_default_vec_len(Object *obj, Visitor *v,
                                          const char *name, void *opaque,
                                          Error **errp)
@@ -XXX,XX +XXX,XX @@ static void cpu_arm_set_default_vec_len(Object *obj, Visitor *v,
       * and is the maximum architectural width of ZCR_ELx.LEN.
       */
      if (remainder || default_vq < 1 || default_vq > 512) {
 -        error_setg(errp, "cannot set sve-default-vector-length");
 +        ARMCPU *cpu = ARM_CPU(obj);
 +        const char *which =
 +            (ptr_default_vq == &cpu->sve_default_vq ? "sve" : "sme");
 +
 +        error_setg(errp, "cannot set %s-default-vector-length", which);
          if (remainder) {
              error_append_hint(errp, "Vector length not a multiple of 16\n");
          } else if (default_vq < 1) {
@@ -XXX,XX +XXX,XX @@ static void aarch64_add_sve_properties(Object *obj)
  #endif
  }
 +static void aarch64_add_sme_properties(Object *obj)
 +{
 +    ARMCPU *cpu = ARM_CPU(obj);
 +    uint32_t vq;
 +
 +    object_property_add_bool(obj, "sme", cpu_arm_get_sme, cpu_arm_set_sme);
 +    object_property_add_bool(obj, "sme_fa64", cpu_arm_get_sme_fa64,
 +                             cpu_arm_set_sme_fa64);
 +
 +    for (vq = 1; vq <= ARM_MAX_VQ; vq <<= 1) {
 +        char name[8];
 +        sprintf(name, "sme%d", vq * 128);
 +        object_property_add(obj, name, "bool", cpu_arm_get_vq,
 +                            cpu_arm_set_vq, NULL, &cpu->sme_vq);
 +    }
 +
 +#ifdef CONFIG_USER_ONLY
 +    /* Mirror linux /proc/sys/abi/sme_default_vector_length. */
 +    object_property_add(obj, "sme-default-vector-length", "int32",
 +                        cpu_arm_get_default_vec_len,
 +                        cpu_arm_set_default_vec_len, NULL,
 +                        &cpu->sme_default_vq);
 +#endif
 +}
 +
  void arm_cpu_pauth_finalize(ARMCPU *cpu, Error **errp)
  {
      int arch_val = 0, impdef_val = 0;
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
  #endif
      cpu->sve_vq.supported = MAKE_64BIT_MASK(0, ARM_MAX_VQ);
 +    cpu->sme_vq.supported = SVE_VQ_POW2_MAP;
      aarch64_add_pauth_properties(obj);
      aarch64_add_sve_properties(obj);
 +    aarch64_add_sme_properties(obj);
      object_property_add(obj, "sve-max-vq", "uint32", cpu_max_get_sve_max_vq,
                          cpu_max_set_sve_max_vq, NULL, NULL);
      qdev_property_add_static(DEVICE(obj), &arm_cpu_lpa2_property);
 --
-.25.1
+.34.1
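
The range check in cpu_arm_set_default_vec_len() quoted above (a multiple of 16 bytes, between 1 and 512 quadwords) can be sketched in isolation as follows. This is a minimal illustration and not QEMU code: default_len_to_vq(), VQ_BYTES and VQ_LIMIT are hypothetical names, and the special value -1 (maximum possible length) is deliberately not modeled.

```c
#include <stdint.h>

/* Hypothetical constants for this sketch, not QEMU definitions. */
enum { VQ_BYTES = 16, VQ_LIMIT = 512 };

/*
 * Convert a default vector length given in bytes into a count of
 * 16-byte quadwords (vq).  Returns 0 when the length is rejected:
 * not a multiple of 16, smaller than 16, or larger than 512 quadwords.
 */
static uint32_t default_len_to_vq(int32_t len_bytes)
{
    uint32_t vq;

    if (len_bytes < VQ_BYTES || len_bytes % VQ_BYTES != 0) {
        return 0;
    }
    vq = (uint32_t)len_bytes / VQ_BYTES;
    return vq > VQ_LIMIT ? 0 : vq;
}
```

With this sketch, 64 bytes maps to vq 4 and 32 bytes to vq 2, which matches the sve_default_vq = 4 and sme_default_vq = 2 initial values in the patch.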

-[PULL 18/25] target/arm: Move arm_cpu_*_finalize to internals.h
+[PULL 05/14] target/arm: Dump ZA[] when active
 From: Richard Henderson <richard.henderson@linaro.org>
-Drop the aa32-only inline fallbacks,
+Always print each matrix row whole, one per line, so that we
-and just use a couple of ifdefs.
+get the entire matrix in the proper shape.
+Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20230622151201.1578522-3-richard.henderson@linaro.org
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20220620175235.60881-16-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/cpu.h       | 6 ------
+ target/arm/cpu.c | 18 ++++++++++++++++++
- target/arm/internals.h | 3 +++
+file changed, 18 insertions(+)
  target/arm/cpu.c       | 2 ++
 files changed, 5 insertions(+), 6 deletions(-)
-diff --git a/target/arm/cpu.h b/target/arm/cpu.h
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.h
-+++ b/target/arm/cpu.h
-@@ -XXX,XX +XXX,XX @@ typedef struct {
- #ifdef TARGET_AARCH64
- # define ARM_MAX_VQ    16
--void arm_cpu_sve_finalize(ARMCPU *cpu, Error **errp);
--void arm_cpu_pauth_finalize(ARMCPU *cpu, Error **errp);
--void arm_cpu_lpa2_finalize(ARMCPU *cpu, Error **errp);
- #else
- # define ARM_MAX_VQ    1
--static inline void arm_cpu_sve_finalize(ARMCPU *cpu, Error **errp) { }
--static inline void arm_cpu_pauth_finalize(ARMCPU *cpu, Error **errp) { }
--static inline void arm_cpu_lpa2_finalize(ARMCPU *cpu, Error **errp) { }
- #endif
- typedef struct ARMVectorReg {
-diff --git a/target/arm/internals.h b/target/arm/internals.h
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/internals.h
-+++ b/target/arm/internals.h
-@@ -XXX,XX +XXX,XX @@ int arm_gdb_get_svereg(CPUARMState *env, GByteArray *buf, int reg);
- int arm_gdb_set_svereg(CPUARMState *env, uint8_t *buf, int reg);
- int aarch64_fpu_gdb_get_reg(CPUARMState *env, GByteArray *buf, int reg);
- int aarch64_fpu_gdb_set_reg(CPUARMState *env, uint8_t *buf, int reg);
-+void arm_cpu_sve_finalize(ARMCPU *cpu, Error **errp);
-+void arm_cpu_pauth_finalize(ARMCPU *cpu, Error **errp);
-+void arm_cpu_lpa2_finalize(ARMCPU *cpu, Error **errp);
- #endif
- #ifdef CONFIG_USER_ONLY
 diff --git a/target/arm/cpu.c b/target/arm/cpu.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu.c
 +++ b/target/arm/cpu.c
-@@ -XXX,XX +XXX,XX @@ void arm_cpu_finalize_features(ARMCPU *cpu, Error **errp)
+@@ -XXX,XX +XXX,XX @@ static void aarch64_cpu_dump_state(CPUState *cs, FILE *f, int flags)
- {
+                          i, q[1], q[0], (i & 1 ? "\n" : " "));
      Error *local_err = NULL;
 +#ifdef TARGET_AARCH64
      if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64)) {
          arm_cpu_sve_finalize(cpu, &local_err);
          if (local_err != NULL) {
@@ -XXX,XX +XXX,XX @@ void arm_cpu_finalize_features(ARMCPU *cpu, Error **errp)
              return;
          }
      }
-+#endif
++
++    if (cpu_isar_feature(aa64_sme, cpu) &&
-     if (kvm_enabled()) {
++        FIELD_EX64(env->svcr, SVCR, ZA) &&
-         kvm_arm_steal_time_finalize(cpu, &local_err);
++        sme_exception_el(env, el) == 0) {
 +        int zcr_len = sve_vqm1_for_el_sm(env, el, true);
 +        int svl = (zcr_len + 1) * 16;
 +        int svl_lg10 = svl < 100 ? 2 : 3;
 +
 +        for (i = 0; i < svl; i++) {
 +            qemu_fprintf(f, "ZA[%0*d]=", svl_lg10, i);
 +            for (j = zcr_len; j >= 0; --j) {
 +                qemu_fprintf(f, "%016" PRIx64 ":%016" PRIx64 "%c",
 +                             env->zarray[i].d[2 * j + 1],
 +                             env->zarray[i].d[2 * j],
 +                             j ? ':' : '\n');
 +            }
 +        }
 +    }
  }
  #else
 --
-.25.1
+.34.1
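
The row layout produced by the ZA[] dump loop above can be sketched as a standalone formatter. This is only an illustration under stated assumptions: format_za_row() is a hypothetical helper that writes into a caller-supplied buffer, whereas the patch prints with qemu_fprintf() directly from env->zarray.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format one matrix row on a single line, most significant 128-bit
 * chunk first.  'vqm1' is the vector length in quadwords minus one and
 * 'd' holds 2 * (vqm1 + 1) 64-bit words of row data.
 * Returns the number of characters produced (excluding the NUL).
 */
static int format_za_row(char *buf, size_t size, int row, int vqm1,
                         const uint64_t *d)
{
    int svl = (vqm1 + 1) * 16;        /* streaming vector length in bytes */
    int width = svl < 100 ? 2 : 3;    /* digits needed for the row index */
    int n = snprintf(buf, size, "ZA[%0*d]=", width, row);

    for (int j = vqm1; j >= 0 && n >= 0 && (size_t)n < size; --j) {
        n += snprintf(buf + n, size - n, "%016" PRIx64 ":%016" PRIx64 "%c",
                      d[2 * j + 1], d[2 * j], j ? ':' : '\n');
    }
    return n;
}
```

Each row ends with a newline only after its lowest chunk, so the whole matrix comes out in its natural shape, one row per line.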

-[PULL 05/25] target/arm: Add SMEEXC_EL to TB flags
+[PULL 06/14] target/arm: Fix SME full tile indexing
 From: Richard Henderson <richard.henderson@linaro.org>
-This is CheckSMEAccess, which is the basis for a set of
+For the outer product set of insns, which take an entire matrix
-related tests for various SME cpregs and instructions.
+tile as output, the argument is not a combined tile+column.
 Therefore using get_tile_rowcol was incorrect, as we extracted
 the tile number from itself.
+The test case relies only on assembler support for SME, since
+no release of GCC recognizes -march=armv9-a+sme yet.
+Cc: qemu-stable@nongnu.org
+Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1620
+Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20230622151201.1578522-5-richard.henderson@linaro.org
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+[PMM: dropped now-unneeded changes to sysregs CFLAGS]
 Message-id: 20220620175235.60881-3-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/cpu.h           |  2 ++
+ target/arm/tcg/translate-sme.c    | 24 ++++++---
- target/arm/translate.h     |  1 +
+ tests/tcg/aarch64/sme-outprod1.c  | 83 +++++++++++++++++++++++++++++++
- target/arm/helper.c        | 52 ++++++++++++++++++++++++++++++++++++++
+ tests/tcg/aarch64/Makefile.target |  7 ++-
- target/arm/translate-a64.c |  1 +
+files changed, 107 insertions(+), 7 deletions(-)
-files changed, 56 insertions(+)
+ create mode 100644 tests/tcg/aarch64/sme-outprod1.c
-diff --git a/target/arm/cpu.h b/target/arm/cpu.h
+diff --git a/target/arm/tcg/translate-sme.c b/target/arm/tcg/translate-sme.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.h
+--- a/target/arm/tcg/translate-sme.c
-+++ b/target/arm/cpu.h
++++ b/target/arm/tcg/translate-sme.c
-@@ -XXX,XX +XXX,XX @@ void aarch64_sync_64_to_32(CPUARMState *env);
+@@ -XXX,XX +XXX,XX @@ static TCGv_ptr get_tile_rowcol(DisasContext *s, int esz, int rs,
+     return addr;
  int fp_exception_el(CPUARMState *env, int cur_el);
  int sve_exception_el(CPUARMState *env, int cur_el);
 +int sme_exception_el(CPUARMState *env, int cur_el);
  /**
   * sve_vqm1_for_el:
@@ -XXX,XX +XXX,XX @@ FIELD(TBFLAG_A64, ATA, 15, 1)
  FIELD(TBFLAG_A64, TCMA, 16, 2)
  FIELD(TBFLAG_A64, MTE_ACTIVE, 18, 1)
  FIELD(TBFLAG_A64, MTE0_ACTIVE, 19, 1)
 +FIELD(TBFLAG_A64, SMEEXC_EL, 20, 2)
  /*
   * Helpers for using the above.
 diff --git a/target/arm/translate.h b/target/arm/translate.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.h
 +++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ typedef struct DisasContext {
      bool ns;        /* Use non-secure CPREG bank on access */
      int fp_excp_el; /* FP exception EL or 0 if enabled */
      int sve_excp_el; /* SVE exception EL or 0 if enabled */
 +    int sme_excp_el; /* SME exception EL or 0 if enabled */
      int vl;          /* current vector length in bytes */
      bool vfp_enabled; /* FP enabled via FPSCR.EN */
      int vec_len;
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.c
 +++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ int sve_exception_el(CPUARMState *env, int el)
      return 0;
  }
 +/*
-+ * Return the exception level to which exceptions should be taken for SME.
++ * Resolve tile.size[0] to a host pointer.
-+ * C.f. the ARM pseudocode function CheckSMEAccess.
++ * Used by e.g. outer product insns where we require the entire tile.
 + */
-+int sme_exception_el(CPUARMState *env, int el)
++static TCGv_ptr get_tile(DisasContext *s, int esz, int tile)
 +{
-+#ifndef CONFIG_USER_ONLY
++    TCGv_ptr addr = tcg_temp_new_ptr();
-+    if (el <= 1 && !el_is_in_host(env, el)) {
++    int offset;
-+        switch (FIELD_EX64(env->cp15.cpacr_el1, CPACR_EL1, SMEN)) {
++
-+        case 1:
++    offset = tile * sizeof(ARMVectorReg) + offsetof(CPUARMState, zarray);
-+            if (el != 0) {
++
-+                break;
++    tcg_gen_addi_ptr(addr, cpu_env, offset);
-+            }
++    return addr;
-+            /* fall through */
++}
-+        case 0:
++
-+        case 2:
+ static bool trans_ZERO(DisasContext *s, arg_ZERO *a)
-+            return 1;
+ {
      if (!dc_isar_feature(aa64_sme, s)) {
@@ -XXX,XX +XXX,XX @@ static bool do_adda(DisasContext *s, arg_adda *a, MemOp esz,
          return true;
      }
 -    /* Sum XZR+zad to find ZAd. */
 -    za = get_tile_rowcol(s, esz, 31, a->zad, false);
 +    za = get_tile(s, esz, a->zad);
      zn = vec_full_reg_ptr(s, a->zn);
      pn = pred_full_reg_ptr(s, a->pn);
      pm = pred_full_reg_ptr(s, a->pm);
@@ -XXX,XX +XXX,XX @@ static bool do_outprod(DisasContext *s, arg_op *a, MemOp esz,
          return true;
      }
 -    /* Sum XZR+zad to find ZAd. */
 -    za = get_tile_rowcol(s, esz, 31, a->zad, false);
 +    za = get_tile(s, esz, a->zad);
      zn = vec_full_reg_ptr(s, a->zn);
      zm = vec_full_reg_ptr(s, a->zm);
      pn = pred_full_reg_ptr(s, a->pn);
@@ -XXX,XX +XXX,XX @@ static bool do_outprod_fpst(DisasContext *s, arg_op *a, MemOp esz,
          return true;
      }
 -    /* Sum XZR+zad to find ZAd. */
 -    za = get_tile_rowcol(s, esz, 31, a->zad, false);
 +    za = get_tile(s, esz, a->zad);
      zn = vec_full_reg_ptr(s, a->zn);
      zm = vec_full_reg_ptr(s, a->zm);
      pn = pred_full_reg_ptr(s, a->pn);
 diff --git a/tests/tcg/aarch64/sme-outprod1.c b/tests/tcg/aarch64/sme-outprod1.c
 new file mode 100644
 index XXXXXXX..XXXXXXX
 --- /dev/null
 +++ b/tests/tcg/aarch64/sme-outprod1.c
@@ -XXX,XX +XXX,XX @@
 +/*
 + * SME outer product, 1 x 1.
 + * SPDX-License-Identifier: GPL-2.0-or-later
 + */
 +
 +#include <stdio.h>
 +
 +extern void foo(float *dst);
 +
 +asm(
 +"    .arch_extension sme\n"
 +"    .type foo, @function\n"
 +"foo:\n"
 +"    stp x29, x30, [sp, -80]!\n"
 +"    mov x29, sp\n"
 +"    stp d8, d9, [sp, 16]\n"
 +"    stp d10, d11, [sp, 32]\n"
 +"    stp d12, d13, [sp, 48]\n"
 +"    stp d14, d15, [sp, 64]\n"
 +"    smstart\n"
 +"    ptrue p0.s, vl4\n"
 +"    fmov z0.s, #1.0\n"
 +/*
 + * An outer product of a vector of 1.0 by itself should be a matrix of 1.0.
 + * Note that we are using tile 1 here (za1.s) rather than tile 0.
 + */
 +"    zero {za}\n"
 +"    fmopa za1.s, p0/m, p0/m, z0.s, z0.s\n"
 +/*
 + * Read the first 4x4 sub-matrix of elements from tile 1:
 + * Note that za1h should be interchangeable here.
 + */
 +"    mov w12, #0\n"
 +"    mova z0.s, p0/m, za1v.s[w12, #0]\n"
 +"    mova z1.s, p0/m, za1v.s[w12, #1]\n"
 +"    mova z2.s, p0/m, za1v.s[w12, #2]\n"
 +"    mova z3.s, p0/m, za1v.s[w12, #3]\n"
 +/*
 + * And store them to the input pointer (dst in the C code):
 + */
 +"    st1w {z0.s}, p0, [x0]\n"
 +"    add x0, x0, #16\n"
 +"    st1w {z1.s}, p0, [x0]\n"
 +"    add x0, x0, #16\n"
 +"    st1w {z2.s}, p0, [x0]\n"
 +"    add x0, x0, #16\n"
 +"    st1w {z3.s}, p0, [x0]\n"
 +"    smstop\n"
 +"    ldp d8, d9, [sp, 16]\n"
 +"    ldp d10, d11, [sp, 32]\n"
 +"    ldp d12, d13, [sp, 48]\n"
 +"    ldp d14, d15, [sp, 64]\n"
 +"    ldp x29, x30, [sp], 80\n"
 +"    ret\n"
 +"    .size foo, . - foo"
 +);
 +
 +int main()
 +{
 +    float dst[16];
 +    int i, j;
 +
 +    foo(dst);
 +
 +    for (i = 0; i < 16; i++) {
 +        if (dst[i] != 1.0f) {
 +            break;
 +        }
 +    }
 +
-+    if (el <= 2 && arm_is_el2_enabled(env)) {
++    if (i == 16) {
-+        /* CPTR_EL2 changes format with HCR_EL2.E2H (regardless of TGE). */
++        return 0; /* success */
 +        if (env->cp15.hcr_el2 & HCR_E2H) {
 +            switch (FIELD_EX64(env->cp15.cptr_el[2], CPTR_EL2, SMEN)) {
 +            case 1:
 +                if (el != 0 || !(env->cp15.hcr_el2 & HCR_TGE)) {
 +                    break;
 +                }
 +                /* fall through */
 +            case 0:
 +            case 2:
 +                return 2;
 +            }
 +        } else {
 +            if (FIELD_EX64(env->cp15.cptr_el[2], CPTR_EL2, TSM)) {
 +                return 2;
 +            }
 +        }
 +    }
 +
-+    /* CPTR_EL3.  Since ESM is negative we must check for EL3.  */
++    /* failure */
-+    if (arm_feature(env, ARM_FEATURE_EL3)
++    for (i = 0; i < 4; ++i) {
-+        && !FIELD_EX64(env->cp15.cptr_el[3], CPTR_EL3, ESM)) {
++        for (j = 0; j < 4; ++j) {
-+        return 3;
++            printf("%f ", (double)dst[i * 4 + j]);
 +        }
 +        printf("\n");
 +    }
-+#endif
++    return 1;
 +    return 0;
 +}
+diff --git a/tests/tcg/aarch64/Makefile.target b/tests/tcg/aarch64/Makefile.target
+index XXXXXXX..XXXXXXX 100644
+--- a/tests/tcg/aarch64/Makefile.target
++++ b/tests/tcg/aarch64/Makefile.target
+@@ -XXX,XX +XXX,XX @@ config-cc.mak: Makefile
+         $(call cc-option,-march=armv8.5-a,              CROSS_CC_HAS_ARMV8_5); \
+         $(call cc-option,-mbranch-protection=standard,  CROSS_CC_HAS_ARMV8_BTI); \
+         $(call cc-option,-march=armv8.5-a+memtag,       CROSS_CC_HAS_ARMV8_MTE); \
+-        $(call cc-option,-march=armv9-a+sme,            CROSS_CC_HAS_ARMV9_SME)) 3> config-cc.mak
++        $(call cc-option,-Wa$(COMMA)-march=armv9-a+sme, CROSS_AS_HAS_ARMV9_SME)) 3> config-cc.mak
+ -include config-cc.mak
+ ifneq ($(CROSS_CC_HAS_ARMV8_2),)
+@@ -XXX,XX +XXX,XX @@ AARCH64_TESTS += mte-1 mte-2 mte-3 mte-4 mte-5 mte-6 mte-7
+ mte-%: CFLAGS += -march=armv8.5-a+memtag
+ endif
++# SME Tests
++ifneq ($(CROSS_AS_HAS_ARMV9_SME),)
++AARCH64_TESTS += sme-outprod1
++endif
 +
- /*
+ # System Registers Tests
-  * Given that SVE is enabled, return the vector length for EL.
+ AARCH64_TESTS += sysregs
-  */
@@ -XXX,XX +XXX,XX @@ static CPUARMTBFlags rebuild_hflags_a64(CPUARMState *env, int el, int fp_el,
          }
          DP_TBFLAG_A64(flags, SVEEXC_EL, sve_el);
      }
 +    if (cpu_isar_feature(aa64_sme, env_archcpu(env))) {
 +        DP_TBFLAG_A64(flags, SMEEXC_EL, sme_exception_el(env, el));
 +    }
      sctlr = regime_sctlr(env, stage1);
 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-a64.c
 +++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_init_disas_context(DisasContextBase *dcbase,
      dc->align_mem = EX_TBFLAG_ANY(tb_flags, ALIGN_MEM);
      dc->pstate_il = EX_TBFLAG_ANY(tb_flags, PSTATE__IL);
      dc->sve_excp_el = EX_TBFLAG_A64(tb_flags, SVEEXC_EL);
 +    dc->sme_excp_el = EX_TBFLAG_A64(tb_flags, SMEEXC_EL);
      dc->vl = (EX_TBFLAG_A64(tb_flags, VL) + 1) * 16;
      dc->pauth_active = EX_TBFLAG_A64(tb_flags, PAUTH_ACTIVE);
      dc->bt = EX_TBFLAG_A64(tb_flags, BT);
 --
-.25.1
+.34.1
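
The offset computed by get_tile() in the patch above can be sketched outside of TCG. The struct layouts here are hypothetical stand-ins (VectorReg and CpuState are not the QEMU types, and the sizes are assumptions chosen for illustration); only the arithmetic mirrors the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for ARMVectorReg and CPUARMState. */
typedef struct {
    uint64_t d[32];               /* assumed: one 256-byte row */
} VectorReg;

typedef struct {
    uint64_t other_state[8];      /* placeholder for unrelated fields */
    VectorReg zarray[256];        /* assumed: ZA storage, one entry per row */
} CpuState;

/*
 * Byte offset of the first row of a full tile.  The tile number is used
 * directly as the row index, with no column or slice component.
 */
static size_t tile_offset(int tile)
{
    return (size_t)tile * sizeof(VectorReg) + offsetof(CpuState, zarray);
}
```

In this sketch consecutive tile numbers are exactly one row apart, which is the property the outer product helpers rely on when they receive a pointer to the whole tile.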

-[PULL 15/25] target/arm: Create ARMVQMap
+[PULL 07/14] target/arm: Handle IC IVAU to improve compatibility with JITs
-From: Richard Henderson <richard.henderson@linaro.org>
+From: John Högberg <john.hogberg@ericsson.com>
-Pull the three sve_vq_* values into a structure.
+Unlike architectures with precise self-modifying code semantics
-This will be reused for SME.
+(e.g. x86) ARM processors do not maintain coherency for instruction
 execution and memory, requiring an instruction synchronization
 barrier on every core that will execute the new code, and on many
 models also the explicit use of cache management instructions.
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+While this is required to make JITs work on actual hardware, QEMU
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+has gotten away with not handling this since it does not emulate
-Message-id: 20220620175235.60881-13-richard.henderson@linaro.org
+caches, and unconditionally invalidates code whenever the softmmu
 or the user-mode page protection logic detects that code has been
 modified.
 Unfortunately the latter does not work in the face of dual-mapped
 code (a common W^X workaround), where one page is executable and
 the other is writable: user-mode has no way to connect one with the
 other as that is only known to the kernel and the emulated
 application.
 This commit works around the issue by telling software that
 instruction cache invalidation is required by clearing the
 CTR_EL0.DIC flag (regardless of whether the emulated processor
 needs it), and then invalidating code in IC IVAU instructions.
 Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1034
 Co-authored-by: Richard Henderson <richard.henderson@linaro.org>
 Signed-off-by: John Högberg <john.hogberg@ericsson.com>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
 Message-id: 168778890374.24232.3402138851538068785-1@git.sr.ht
 [PMM: removed unnecessary AArch64 feature check; moved
  "clear CTR_EL0.DIC" code up a bit so it's not in the middle
  of the vfp/neon related tests]
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/cpu.h    | 29 ++++++++++++++---------------
+ target/arm/cpu.c    | 11 +++++++++++
- target/arm/cpu64.c  | 22 +++++++++++-----------
+ target/arm/helper.c | 47 ++++++++++++++++++++++++++++++++++++++++++---
- target/arm/helper.c |  2 +-
+files changed, 55 insertions(+), 3 deletions(-)
  target/arm/kvm64.c  |  2 +-
 files changed, 27 insertions(+), 28 deletions(-)
-diff --git a/target/arm/cpu.h b/target/arm/cpu.h
+diff --git a/target/arm/cpu.c b/target/arm/cpu.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.h
+--- a/target/arm/cpu.c
-+++ b/target/arm/cpu.h
++++ b/target/arm/cpu.c
-@@ -XXX,XX +XXX,XX @@ typedef enum ARMPSCIState {
+@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
  typedef struct ARMISARegisters ARMISARegisters;
 +/*
 + * In map, each set bit is a supported vector length of (bit-number + 1) * 16
 + * bytes, i.e. each bit number + 1 is the vector length in quadwords.
 + *
 + * While processing properties during initialization, corresponding init bits
 + * are set for bits in sve_vq_map that have been set by properties.
 + *
 + * Bits set in supported represent valid vector lengths for the CPU type.
 + */
 +typedef struct {
 +    uint32_t map, init, supported;
 +} ARMVQMap;
 +
  /**
   * ARMCPU:
   * @env: #CPUARMState
@@ -XXX,XX +XXX,XX @@ struct ArchCPU {
      uint32_t sve_default_vq;
  #endif
 -    /*
 -     * In sve_vq_map each set bit is a supported vector length of
 -     * (bit-number + 1) * 16 bytes, i.e. each bit number + 1 is the vector
 -     * length in quadwords.
 -     *
 -     * While processing properties during initialization, corresponding
 -     * sve_vq_init bits are set for bits in sve_vq_map that have been
 -     * set by properties.
 -     *
 -     * Bits set in sve_vq_supported represent valid vector lengths for
 -     * the CPU type.
 -     */
 -    uint32_t sve_vq_map;
 -    uint32_t sve_vq_init;
 -    uint32_t sve_vq_supported;
 +    ARMVQMap sve_vq;
      /* Generic timer counter frequency, in Hz */
      uint64_t gt_cntfrq_hz;
 diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu64.c
 +++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ void arm_cpu_sve_finalize(ARMCPU *cpu, Error **errp)
       * any of the above.  Finally, if SVE is not disabled, then at least one
       * vector length must be enabled.
       */
 -    uint32_t vq_map = cpu->sve_vq_map;
 -    uint32_t vq_init = cpu->sve_vq_init;
 +    uint32_t vq_map = cpu->sve_vq.map;
 +    uint32_t vq_init = cpu->sve_vq.init;
      uint32_t vq_supported;
      uint32_t vq_mask = 0;
      uint32_t tmp, vq, max_vq = 0;
@@ -XXX,XX +XXX,XX @@ void arm_cpu_sve_finalize(ARMCPU *cpu, Error **errp)
       */
      if (kvm_enabled()) {
          if (kvm_arm_sve_supported()) {
 -            cpu->sve_vq_supported = kvm_arm_sve_get_vls(CPU(cpu));
 -            vq_supported = cpu->sve_vq_supported;
 +            cpu->sve_vq.supported = kvm_arm_sve_get_vls(CPU(cpu));
 +            vq_supported = cpu->sve_vq.supported;
          } else {
              assert(!cpu_isar_feature(aa64_sve, cpu));
              vq_supported = 0;
          }
      } else {
 -        vq_supported = cpu->sve_vq_supported;
 +        vq_supported = cpu->sve_vq.supported;
      }
      /*
@@ -XXX,XX +XXX,XX @@ void arm_cpu_sve_finalize(ARMCPU *cpu, Error **errp)
      /* From now on sve_max_vq is the actual maximum supported length. */
      cpu->sve_max_vq = max_vq;
 -    cpu->sve_vq_map = vq_map;
 +    cpu->sve_vq.map = vq_map;
  }
  static void cpu_max_get_sve_max_vq(Object *obj, Visitor *v, const char *name,
@@ -XXX,XX +XXX,XX @@ static void cpu_arm_get_sve_vq(Object *obj, Visitor *v, const char *name,
      if (!cpu_isar_feature(aa64_sve, cpu)) {
          value = false;
      } else {
 -        value = extract32(cpu->sve_vq_map, vq - 1, 1);
 +        value = extract32(cpu->sve_vq.map, vq - 1, 1);
      }
      visit_type_bool(v, name, &value, errp);
  }
@@ -XXX,XX +XXX,XX @@ static void cpu_arm_set_sve_vq(Object *obj, Visitor *v, const char *name,
          return;
      }
--    cpu->sve_vq_map = deposit32(cpu->sve_vq_map, vq - 1, 1, value);
++#ifdef CONFIG_USER_ONLY
--    cpu->sve_vq_init |= 1 << (vq - 1);
++    /*
-+    cpu->sve_vq.map = deposit32(cpu->sve_vq.map, vq - 1, 1, value);
++     * User mode relies on IC IVAU instructions to catch modification of
-+    cpu->sve_vq.init |= 1 << (vq - 1);
++     * dual-mapped code.
- }
++     *
++     * Clear CTR_EL0.DIC to ensure that software that honors these flags uses
- static bool cpu_arm_get_sve(Object *obj, Error **errp)
++     * IC IVAU even if the emulated processor does not normally require it.
-@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
++     */
-     cpu->dcz_blocksize = 7; /*  512 bytes */
++    cpu->ctr = FIELD_DP64(cpu->ctr, CTR_EL0, DIC, 0);
- #endif
++#endif
++
--    cpu->sve_vq_supported = MAKE_64BIT_MASK(0, ARM_MAX_VQ);
+     if (arm_feature(env, ARM_FEATURE_AARCH64) &&
-+    cpu->sve_vq.supported = MAKE_64BIT_MASK(0, ARM_MAX_VQ);
+         cpu->has_vfp != cpu->has_neon) {
+         /*
      aarch64_add_pauth_properties(obj);
      aarch64_add_sve_properties(obj);
@@ -XXX,XX +XXX,XX @@ static void aarch64_a64fx_initfn(Object *obj)
      /* The A64FX supports only 128, 256 and 512 bit vector lengths */
      aarch64_add_sve_properties(obj);
 -    cpu->sve_vq_supported = (1 << 0)  /* 128bit */
 +    cpu->sve_vq.supported = (1 << 0)  /* 128bit */
                            | (1 << 1)  /* 256bit */
                            | (1 << 3); /* 512bit */
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.c
 +++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ uint32_t sve_vqm1_for_el(CPUARMState *env, int el)
+@@ -XXX,XX +XXX,XX @@ static void mdcr_el2_write(CPUARMState *env, const ARMCPRegInfo *ri,
          len = MIN(len, 0xf & (uint32_t)env->vfp.zcr_el[3]);
      }
--    len = 31 - clz32(cpu->sve_vq_map & MAKE_64BIT_MASK(0, len + 1));
-+    len = 31 - clz32(cpu->sve_vq.map & MAKE_64BIT_MASK(0, len + 1));
-     return len;
  }
-diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
++#ifdef CONFIG_USER_ONLY
-index XXXXXXX..XXXXXXX 100644
++/*
---- a/target/arm/kvm64.c
++ * `IC IVAU` is handled to improve compatibility with JITs that dual-map their
-+++ b/target/arm/kvm64.c
++ * code to get around W^X restrictions, where one region is writable and the
-@@ -XXX,XX +XXX,XX @@ uint32_t kvm_arm_sve_get_vls(CPUState *cs)
++ * other is executable.
- static int kvm_arm_sve_set_vls(CPUState *cs)
++ *
- {
++ * Since the executable region is never written to we cannot detect code
-     ARMCPU *cpu = ARM_CPU(cs);
++ * changes when running in user mode, and rely on the emulated JIT telling us
--    uint64_t vls[KVM_ARM64_SVE_VLS_WORDS] = { cpu->sve_vq_map };
++ * that the code has changed by executing this instruction.
-+    uint64_t vls[KVM_ARM64_SVE_VLS_WORDS] = { cpu->sve_vq.map };
++ */
-     struct kvm_one_reg reg = {
++static void ic_ivau_write(CPUARMState *env, const ARMCPRegInfo *ri,
-         .id = KVM_REG_ARM64_SVE_VLS,
++                          uint64_t value)
-         .addr = (uint64_t)&vls[0],
++{
 +    uint64_t icache_line_mask, start_address, end_address;
 +    const ARMCPU *cpu;
 +
 +    cpu = env_archcpu(env);
 +
 +    icache_line_mask = (4 << extract32(cpu->ctr, 0, 4)) - 1;
 +    start_address = value & ~icache_line_mask;
 +    end_address = value | icache_line_mask;
 +
 +    mmap_lock();
 +
 +    tb_invalidate_phys_range(start_address, end_address);
 +
 +    mmap_unlock();
 +}
 +#endif
 +
  static const ARMCPRegInfo v8_cp_reginfo[] = {
      /*
       * Minimal set of EL0-visible registers. This will need to be expanded
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
      { .name = "CURRENTEL", .state = ARM_CP_STATE_AA64,
        .opc0 = 3, .opc1 = 0, .opc2 = 2, .crn = 4, .crm = 2,
        .access = PL1_R, .type = ARM_CP_CURRENTEL },
 -    /* Cache ops: all NOPs since we don't emulate caches */
 +    /*
 +     * Instruction cache ops. All of these except `IC IVAU` NOP because we
 +     * don't emulate caches.
 +     */
      { .name = "IC_IALLUIS", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 1, .opc2 = 0,
        .access = PL1_W, .type = ARM_CP_NOP,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
        .accessfn = access_tocu },
      { .name = "IC_IVAU", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 5, .opc2 = 1,
 -      .access = PL0_W, .type = ARM_CP_NOP,
 +      .access = PL0_W,
        .fgt = FGT_ICIVAU,
 -      .accessfn = access_tocu },
 +      .accessfn = access_tocu,
 +#ifdef CONFIG_USER_ONLY
 +      .type = ARM_CP_NO_RAW,
 +      .writefn = ic_ivau_write
 +#else
 +      .type = ARM_CP_NOP
 +#endif
 +    },
 +    /* Cache ops: all NOPs since we don't emulate caches */
      { .name = "DC_IVAC", .state = ARM_CP_STATE_AA64,
        .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 6, .opc2 = 1,
        .access = PL1_W, .accessfn = aa64_cacheop_poc_access,
 --
-.25.1
+.34.1
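The ic_ivau_write() hunk above rounds the written address out to a whole instruction-cache line before invalidating translations. Below is a minimal standalone sketch of that range computation. It assumes the low four bits of CTR_EL0 (IminLine) hold log2 of the line size in 4-byte words. The names icache_range and icache_line_range are hypothetical and are not QEMU API.

```c
#include <stdint.h>

/* Hypothetical helper type for illustration; not part of QEMU. */
struct icache_range {
    uint64_t start; /* first byte of the cache line */
    uint64_t end;   /* last byte of the cache line (inclusive) */
};

/*
 * Compute the cache line that contains addr.
 * Assumption: ctr[3:0] is IminLine, log2 of the line size in words.
 */
static struct icache_range icache_line_range(uint64_t ctr, uint64_t addr)
{
    uint64_t imin_line = ctr & 0xf;
    uint64_t mask = ((uint64_t)4 << imin_line) - 1;
    struct icache_range r = { addr & ~mask, addr | mask };

    return r;
}
```

With IminLine = 4 the line is 64 bytes, so an address of 0x1234 maps to the inclusive range 0x1200..0x123f.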

-[PULL 02/25] accel: Introduce current_accel_name()
+[PULL 08/14] tests/qtest: xlnx-canfd-test: Fix code coverity issues
-From: Alexander Graf <agraf@csgraf.de>
+From: Vikram Garhwal <vikram.garhwal@amd.com>
-We need to fetch the name of the current accelerator in flexible error
+The following changes fix the Coverity issues:
-messages more going forward. Let's create a helper that gives it to us
+. Change read_data to fix the CID 1512899: Out-of-bounds access (OVERRUN)
-without casting in the target code.
+. Fix match_rx_tx_data to fix CID 1512900: Logically dead code (DEADCODE)
 . Replace rand() in generate_random_data() with g_random_int()
-Signed-off-by: Alexander Graf <agraf@csgraf.de>
+Signed-off-by: Vikram Garhwal <vikram.garhwal@amd.com>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20230628202758.16398-1-vikram.garhwal@amd.com
-Message-id: 20220620192242.70573-1-agraf@csgraf.de
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
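The read_data() change described above lets the caller state the frame size instead of having the function infer it from the received DLC word. A minimal sketch of that pattern follows, assuming words 0 and 1 of the buffer hold the ID and DLC. The names read_word_fn, fake_fifo_read and read_payload are hypothetical stand-ins and are not the qtest API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a device register read; not the qtest API. */
typedef uint32_t (*read_word_fn)(size_t index);

/* Hypothetical fake FIFO: data word i reads back as 0x100 + i. */
static uint32_t fake_fifo_read(size_t index)
{
    return 0x100u + (uint32_t)index;
}

/*
 * Fill buf_rx[2..frame_size-1] with data words. The caller supplies
 * frame_size, so the loop bound never depends on received data.
 */
static void read_payload(read_word_fn read_word, uint32_t *buf_rx,
                         size_t frame_size)
{
    for (size_t i = 0; i + 2 < frame_size; i++) {
        buf_rx[i + 2] = read_word(i);
    }
}
```

Writing the bound as i + 2 < frame_size is a choice made for this sketch: it avoids an unsigned wrap-around if frame_size is ever smaller than 2.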
- include/qemu/accel.h | 1 +
+ tests/qtest/xlnx-canfd-test.c | 33 +++++++++++----------------------
- accel/accel-common.c | 8 ++++++++
+file changed, 11 insertions(+), 22 deletions(-)
  softmmu/vl.c         | 3 +--
 files changed, 10 insertions(+), 2 deletions(-)
-diff --git a/include/qemu/accel.h b/include/qemu/accel.h
+diff --git a/tests/qtest/xlnx-canfd-test.c b/tests/qtest/xlnx-canfd-test.c
 index XXXXXXX..XXXXXXX 100644
---- a/include/qemu/accel.h
+--- a/tests/qtest/xlnx-canfd-test.c
-+++ b/include/qemu/accel.h
++++ b/tests/qtest/xlnx-canfd-test.c
-@@ -XXX,XX +XXX,XX @@ typedef struct AccelClass {
+@@ -XXX,XX +XXX,XX @@ static void generate_random_data(uint32_t *buf_tx, bool is_canfd_frame)
+     /* Generate random TX data for CANFD frame. */
- AccelClass *accel_find(const char *opt_name);
+     if (is_canfd_frame) {
- AccelState *current_accel(void);
+         for (int i = 0; i < CANFD_FRAME_SIZE - 2; i++) {
-+const char *current_accel_name(void);
+-            buf_tx[2 + i] = rand();
++            buf_tx[2 + i] = g_random_int();
- void accel_init_interfaces(AccelClass *ac);
+         }
+     } else {
-diff --git a/accel/accel-common.c b/accel/accel-common.c
+         /* Generate random TX data for CAN frame. */
-index XXXXXXX..XXXXXXX 100644
+         for (int i = 0; i < CAN_FRAME_SIZE - 2; i++) {
---- a/accel/accel-common.c
+-            buf_tx[2 + i] = rand();
-+++ b/accel/accel-common.c
++            buf_tx[2 + i] = g_random_int();
-@@ -XXX,XX +XXX,XX @@ AccelClass *accel_find(const char *opt_name)
+         }
-     return ac;
+     }
  }
-+/* Return the name of the current accelerator */
+-static void read_data(QTestState *qts, uint64_t can_base_addr, uint32_t *buf_rx)
-+const char *current_accel_name(void)
++static void read_data(QTestState *qts, uint64_t can_base_addr, uint32_t *buf_rx,
-+{
++                      uint32_t frame_size)
 +    AccelClass *ac = ACCEL_GET_CLASS(current_accel());
 +
 +    return ac->name;
 +}
 +
  static void accel_init_cpu_int_aux(ObjectClass *klass, void *opaque)
  {
-     CPUClass *cc = CPU_CLASS(klass);
+     uint32_t int_status;
-diff --git a/softmmu/vl.c b/softmmu/vl.c
+     uint32_t fifo_status_reg_value;
-index XXXXXXX..XXXXXXX 100644
+     /* At which RX FIFO the received data is stored. */
---- a/softmmu/vl.c
+     uint8_t store_ind = 0;
-+++ b/softmmu/vl.c
+-    bool is_canfd_frame = false;
-@@ -XXX,XX +XXX,XX @@ static void configure_accelerators(const char *progname)
      /* Read the interrupt on CANFD rx. */
      int_status = qtest_readl(qts, can_base_addr + R_ISR_OFFSET) & ISR_RXOK;
@@ -XXX,XX +XXX,XX @@ static void read_data(QTestState *qts, uint64_t can_base_addr, uint32_t *buf_rx)
      buf_rx[0] = qtest_readl(qts, can_base_addr + R_RX0_ID_OFFSET);
      buf_rx[1] = qtest_readl(qts, can_base_addr + R_RX0_DLC_OFFSET);
 -    is_canfd_frame = (buf_rx[1] >> DLC_FD_BIT_SHIFT) & 1;
 -
 -    if (is_canfd_frame) {
 -        for (int i = 0; i < CANFD_FRAME_SIZE - 2; i++) {
 -            buf_rx[i + 2] = qtest_readl(qts,
 -                                    can_base_addr + R_RX0_DATA1_OFFSET + 4 * i);
 -        }
 -    } else {
 -        buf_rx[2] = qtest_readl(qts, can_base_addr + R_RX0_DATA1_OFFSET);
 -        buf_rx[3] = qtest_readl(qts, can_base_addr + R_RX0_DATA2_OFFSET);
 +    for (int i = 0; i < frame_size - 2; i++) {
 +        buf_rx[i + 2] = qtest_readl(qts,
 +                                can_base_addr + R_RX0_DATA1_OFFSET + 4 * i);
      }
-     if (init_failed && !qtest_chrdev) {
+     /* Clear the RX interrupt. */
--        AccelClass *ac = ACCEL_GET_CLASS(current_accel());
+@@ -XXX,XX +XXX,XX @@ static void match_rx_tx_data(const uint32_t *buf_tx, const uint32_t *buf_rx,
--        error_report("falling back to %s", ac->name);
+             g_assert_cmpint((buf_rx[size] & DLC_FD_BIT_MASK), ==,
-+        error_report("falling back to %s", current_accel_name());
+                             (buf_tx[size] & DLC_FD_BIT_MASK));
-     }
+         } else {
+-            if (!is_canfd_frame && size == 4) {
-     if (icount_enabled() && !tcg_enabled()) {
+-                break;
 -            }
 -
              g_assert_cmpint(buf_rx[size], ==, buf_tx[size]);
          }
@@ -XXX,XX +XXX,XX @@ static void test_can_data_transfer(void)
      write_data(qts, CANFD0_BASE_ADDR, buf_tx, false);
      send_data(qts, CANFD0_BASE_ADDR);
 -    read_data(qts, CANFD1_BASE_ADDR, buf_rx);
 +    read_data(qts, CANFD1_BASE_ADDR, buf_rx, CAN_FRAME_SIZE);
      match_rx_tx_data(buf_tx, buf_rx, false);
      qtest_quit(qts);
@@ -XXX,XX +XXX,XX @@ static void test_canfd_data_transfer(void)
      write_data(qts, CANFD0_BASE_ADDR, buf_tx, true);
      send_data(qts, CANFD0_BASE_ADDR);
 -    read_data(qts, CANFD1_BASE_ADDR, buf_rx);
 +    read_data(qts, CANFD1_BASE_ADDR, buf_rx, CANFD_FRAME_SIZE);
      match_rx_tx_data(buf_tx, buf_rx, true);
      qtest_quit(qts);
@@ -XXX,XX +XXX,XX @@ static void test_can_loopback(void)
      write_data(qts, CANFD0_BASE_ADDR, buf_tx, true);
      send_data(qts, CANFD0_BASE_ADDR);
 -    read_data(qts, CANFD0_BASE_ADDR, buf_rx);
 +    read_data(qts, CANFD0_BASE_ADDR, buf_rx, CANFD_FRAME_SIZE);
      match_rx_tx_data(buf_tx, buf_rx, true);
      generate_random_data(buf_tx, true);
@@ -XXX,XX +XXX,XX @@ static void test_can_loopback(void)
      write_data(qts, CANFD1_BASE_ADDR, buf_tx, true);
      send_data(qts, CANFD1_BASE_ADDR);
 -    read_data(qts, CANFD1_BASE_ADDR, buf_rx);
 +    read_data(qts, CANFD1_BASE_ADDR, buf_rx, CANFD_FRAME_SIZE);
      match_rx_tx_data(buf_tx, buf_rx, true);
      qtest_quit(qts);
 --
-.25.1
+.34.1

-[PULL 23/25] target/arm: Move pred_{full, gvec}_reg_{offset, size} to translate-a64.h
+[PULL 09/14] target/arm: gdbstub: Guard M-profile code with CONFIG_TCG
-From: Richard Henderson <richard.henderson@linaro.org>
+From: Fabiano Rosas <farosas@suse.de>
-We will need these functions in translate-sme.c.
+This code is only relevant when TCG is present in the build. Building
 with --disable-tcg --enable-xen on an x86 host we get:
+$ ../configure --target-list=x86_64-softmmu,aarch64-softmmu --disable-tcg --enable-xen
+$ make -j$(nproc)
+...
+libqemu-aarch64-softmmu.fa.p/target_arm_gdbstub.c.o: in function `m_sysreg_ptr':
+ ../target/arm/gdbstub.c:358: undefined reference to `arm_v7m_get_sp_ptr'
+ ../target/arm/gdbstub.c:361: undefined reference to `arm_v7m_get_sp_ptr'
+libqemu-aarch64-softmmu.fa.p/target_arm_gdbstub.c.o: in function `arm_gdb_get_m_systemreg':
+../target/arm/gdbstub.c:405: undefined reference to `arm_v7m_mrs_control'
+Signed-off-by: Fabiano Rosas <farosas@suse.de>
+Message-id: 20230628164821.16771-1-farosas@suse.de
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20220620175235.60881-21-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/translate-a64.h | 38 ++++++++++++++++++++++++++++++++++++++
+ target/arm/gdbstub.c | 4 ++++
- target/arm/translate-sve.c | 36 ------------------------------------
+file changed, 4 insertions(+)
 files changed, 38 insertions(+), 36 deletions(-)
-diff --git a/target/arm/translate-a64.h b/target/arm/translate-a64.h
+diff --git a/target/arm/gdbstub.c b/target/arm/gdbstub.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-a64.h
+--- a/target/arm/gdbstub.c
-+++ b/target/arm/translate-a64.h
++++ b/target/arm/gdbstub.c
-@@ -XXX,XX +XXX,XX @@ static inline int vec_full_reg_size(DisasContext *s)
+@@ -XXX,XX +XXX,XX @@ static int arm_gen_dynamic_sysreg_xml(CPUState *cs, int base_reg)
-     return s->vl;
+     return cpu->dyn_sysreg_xml.num;
  }
-+/*
++#ifdef CONFIG_TCG
-+ * Return the offset info CPUARMState of the predicate vector register Pn.
+ typedef enum {
-+ * Note for this purpose, FFR is P16.
+     M_SYSREG_MSP,
-+ */
+     M_SYSREG_PSP,
-+static inline int pred_full_reg_offset(DisasContext *s, int regno)
+@@ -XXX,XX +XXX,XX @@ static int arm_gen_dynamic_m_secextreg_xml(CPUState *cs, int orig_base_reg)
-+{
+     return cpu->dyn_m_secextreg_xml.num;
-+    return offsetof(CPUARMState, vfp.pregs[regno]);
+ }
-+}
+ #endif
-+
++#endif /* CONFIG_TCG */
-+/* Return the byte size of the whole predicate register, VL / 64.  */
-+static inline int pred_full_reg_size(DisasContext *s)
+ const char *arm_gdb_get_dynamic_xml(CPUState *cs, const char *xmlname)
-+{
+ {
-+    return s->vl >> 3;
+@@ -XXX,XX +XXX,XX @@ void arm_cpu_register_gdb_regs_for_features(ARMCPU *cpu)
-+}
+                              arm_gen_dynamic_sysreg_xml(cs, cs->gdb_num_regs),
-+
+                              "system-registers.xml", 0);
-+/*
-+ * Round up the size of a register to a size allowed by
++#ifdef CONFIG_TCG
-+ * the tcg vector infrastructure.  Any operation which uses this
+     if (arm_feature(env, ARM_FEATURE_M) && tcg_enabled()) {
-+ * size may assume that the bits above pred_full_reg_size are zero,
+         gdb_register_coprocessor(cs,
-+ * and must leave them the same way.
+             arm_gdb_get_m_systemreg, arm_gdb_set_m_systemreg,
-+ *
+@@ -XXX,XX +XXX,XX @@ void arm_cpu_register_gdb_regs_for_features(ARMCPU *cpu)
-+ * Note that this is not needed for the vector registers as they
+         }
-+ * are always properly sized for tcg vectors.
+ #endif
-+ */
+     }
-+static inline int size_for_gvec(int size)
++#endif /* CONFIG_TCG */
-+{
+ }
 +    if (size <= 8) {
 +        return 8;
 +    } else {
 +        return QEMU_ALIGN_UP(size, 16);
 +    }
 +}
 +
 +static inline int pred_gvec_reg_size(DisasContext *s)
 +{
 +    return size_for_gvec(pred_full_reg_size(s));
 +}
 +
  bool disas_sve(DisasContext *, uint32_t);
  void gen_gvec_rax1(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-sve.c
 +++ b/target/arm/translate-sve.c
@@ -XXX,XX +XXX,XX @@ static inline int msz_dtype(DisasContext *s, int msz)
   * Implement all of the translator functions referenced by the decoder.
   */
 -/* Return the offset info CPUARMState of the predicate vector register Pn.
 - * Note for this purpose, FFR is P16.
 - */
 -static inline int pred_full_reg_offset(DisasContext *s, int regno)
 -{
 -    return offsetof(CPUARMState, vfp.pregs[regno]);
 -}
 -
 -/* Return the byte size of the whole predicate register, VL / 64.  */
 -static inline int pred_full_reg_size(DisasContext *s)
 -{
 -    return s->vl >> 3;
 -}
 -
 -/* Round up the size of a register to a size allowed by
 - * the tcg vector infrastructure.  Any operation which uses this
 - * size may assume that the bits above pred_full_reg_size are zero,
 - * and must leave them the same way.
 - *
 - * Note that this is not needed for the vector registers as they
 - * are always properly sized for tcg vectors.
 - */
 -static int size_for_gvec(int size)
 -{
 -    if (size <= 8) {
 -        return 8;
 -    } else {
 -        return QEMU_ALIGN_UP(size, 16);
 -    }
 -}
 -
 -static int pred_gvec_reg_size(DisasContext *s)
 -{
 -    return size_for_gvec(pred_full_reg_size(s));
 -}
 -
  /* Invoke an out-of-line helper on 2 Zregs. */
  static bool gen_gvec_ool_zz(DisasContext *s, gen_helper_gvec_2 *fn,
                              int rd, int rn, int data)
 --
-.25.1
+.34.1
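The predicate-size helpers moved in the v1 patch above can be illustrated outside QEMU. This is a sketch only: ALIGN_UP is a local stand-in for QEMU_ALIGN_UP, and the functions take the vector length in bytes directly rather than a DisasContext.

```c
/* Local stand-in for QEMU_ALIGN_UP; rounds n up to a multiple of m. */
#define ALIGN_UP(n, m) ((((n) + (m) - 1) / (m)) * (m))

/* Predicate register size in bytes: one predicate bit per vector byte. */
static int pred_full_reg_size(int vl_bytes)
{
    return vl_bytes >> 3;
}

/* Round a size up to one the tcg vector infrastructure accepts. */
static int size_for_gvec(int size)
{
    if (size <= 8) {
        return 8;
    }
    return ALIGN_UP(size, 16);
}

static int pred_gvec_reg_size(int vl_bytes)
{
    return size_for_gvec(pred_full_reg_size(vl_bytes));
}
```

For a 16-byte vector the predicate is 2 bytes and is rounded up to 8. For an 80-byte vector it is 10 bytes and is rounded up to 16.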

-[PULL 01/25] sphinx: change default language to 'en'
+[PULL 10/14] hw: arm: allwinner-sramc: Set class_size
-From: Martin Liška <mliska@suse.cz>
+From: Akihiko Odaki <akihiko.odaki@daynix.com>
-Fixes the following Sphinx warning (treated as error) starting
+AwSRAMCClass is larger than SysBusDeviceClass so the class size must be
-with 5.0 release:
+advertised accordingly.
-Warning, treated as error:
+Fixes: 05def917e1 ("hw: arm: allwinner-sramc: Add SRAM Controller support for R40")
-Invalid configuration value found: 'language = None'. Update your configuration to a valid langauge code. Falling back to 'en' (English).
+Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
+Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
-Signed-off-by: Martin Liska <mliska@suse.cz>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: e91e51ee-48ac-437e-6467-98b56ee40042@suse.cz
+Message-id: 20230628110905.38125-1-akihiko.odaki@daynix.com
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
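The reason class_size matters can be shown with a simplified model. This sketch assumes that a type which leaves class_size at zero has its class structure allocated with the parent's class size. ParentClass, ChildClass and effective_class_size are hypothetical names for illustration and are not QEMU code.

```c
#include <stddef.h>

/* Hypothetical parent class structure. */
typedef struct ParentClass {
    void (*realize)(void);
} ParentClass;

/* Hypothetical subclass that adds a field after the embedded parent. */
typedef struct ChildClass {
    ParentClass parent_class;
    unsigned int extra_field;
} ChildClass;

/*
 * Simplified model of the size used to allocate a class structure:
 * a declared size of zero is assumed to fall back to the parent's size.
 */
static size_t effective_class_size(size_t declared, size_t parent)
{
    return declared ? declared : parent;
}
```

In this model, leaving the declared size at zero yields an allocation smaller than ChildClass, so a write to extra_field would land outside the allocated object.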
- docs/conf.py | 2 +-
+ hw/misc/allwinner-sramc.c | 1 +
-file changed, 1 insertion(+), 1 deletion(-)
+file changed, 1 insertion(+)
-diff --git a/docs/conf.py b/docs/conf.py
+diff --git a/hw/misc/allwinner-sramc.c b/hw/misc/allwinner-sramc.c
 index XXXXXXX..XXXXXXX 100644
---- a/docs/conf.py
+--- a/hw/misc/allwinner-sramc.c
-+++ b/docs/conf.py
++++ b/hw/misc/allwinner-sramc.c
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ static const TypeInfo allwinner_sramc_info = {
- #
+     .parent        = TYPE_SYS_BUS_DEVICE,
- # This is also used if you do content translation via gettext catalogs.
+     .instance_init = allwinner_sramc_init,
- # Usually you set "language" from the command line for these cases.
+     .instance_size = sizeof(AwSRAMCState),
--language = None
++    .class_size    = sizeof(AwSRAMCClass),
-+language = 'en'
+     .class_init    = allwinner_sramc_class_init,
+ };
- # List of patterns, relative to source directory, that match files and
  # directories to ignore when looking for source files.
 --
-.25.1
+.34.1

-[PULL 22/25] target/arm: Add SVL to TB flags
+[PULL 11/14] target/xtensa: Assert that interrupt level is within bounds
-From: Richard Henderson <richard.henderson@linaro.org>
+In handle_interrupt() we use level as an index into the interrupt_vector[]
 array. This is safe because we have checked it against env->config->nlevel,
 but Coverity can't see that (and it is only true because each CPU config
 sets its XCHAL_NUM_INTLEVELS to something less than MAX_NLEVELS), so it
 complains about a possible array overrun (CID 1507131).
-We need SVL separate from VL for RDSVL et al, as well as
+Add an assert() which will make Coverity happy and catch the unlikely
-ZA storage loads and stores, which do not require PSTATE.SM.
+case of a mis-set XCHAL_NUM_INTLEVELS in future.
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20220620175235.60881-20-richard.henderson@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Acked-by: Max Filippov <jcmvbkbc@gmail.com>
+Message-id: 20230623154135.1930261-1-peter.maydell@linaro.org
 ---
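The kind of bounds assertion described above can be sketched in isolation. All names here (ARRAY_LEN, MAX_LEVELS, struct config, vector_for_level) are hypothetical and are not taken from the xtensa code. In C, sizeof applied to an array gives its size in bytes, so this sketch divides by the element size to get an element count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of elements in a true array (not a pointer). */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

#define MAX_LEVELS 6 /* hypothetical compile-time maximum */

/* Hypothetical configuration with a fixed-size vector table. */
struct config {
    unsigned int nlevel;
    uint32_t vector[MAX_LEVELS];
};

static uint32_t vector_for_level(const struct config *cfg, unsigned int level)
{
    /* Make the bound explicit so it does not depend on nlevel being sane. */
    assert(level < ARRAY_LEN(cfg->vector));
    return cfg->vector[level];
}
```

For a six-element uint32_t array, ARRAY_LEN gives 6 while sizeof gives 24, so the two bounds differ whenever the element type is wider than one byte.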
- target/arm/cpu.h           | 12 ++++++++++++
+ target/xtensa/exc_helper.c | 3 +++
- target/arm/translate.h     |  1 +
+file changed, 3 insertions(+)
  target/arm/helper.c        |  8 +++++++-
  target/arm/translate-a64.c |  1 +
 files changed, 21 insertions(+), 1 deletion(-)
-diff --git a/target/arm/cpu.h b/target/arm/cpu.h
+diff --git a/target/xtensa/exc_helper.c b/target/xtensa/exc_helper.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.h
+--- a/target/xtensa/exc_helper.c
-+++ b/target/arm/cpu.h
++++ b/target/xtensa/exc_helper.c
-@@ -XXX,XX +XXX,XX @@ FIELD(TBFLAG_A64, MTE0_ACTIVE, 19, 1)
+@@ -XXX,XX +XXX,XX @@ static void handle_interrupt(CPUXtensaState *env)
- FIELD(TBFLAG_A64, SMEEXC_EL, 20, 2)
+         CPUState *cs = env_cpu(env);
- FIELD(TBFLAG_A64, PSTATE_SM, 22, 1)
- FIELD(TBFLAG_A64, PSTATE_ZA, 23, 1)
+         if (level > 1) {
-+FIELD(TBFLAG_A64, SVL, 24, 4)
++            /* env->config->nlevel check should have ensured this */
++            assert(level < sizeof(env->config->interrupt_vector));
  /*
   * Helpers for using the above.
@@ -XXX,XX +XXX,XX @@ static inline int sve_vq(CPUARMState *env)
      return EX_TBFLAG_A64(env->hflags, VL) + 1;
  }
 +/**
 + * sme_vq
 + * @env: the cpu context
 + *
 + * Return the SVL cached within env->hflags, in units of quadwords.
 + */
 +static inline int sme_vq(CPUARMState *env)
 +{
 +    return EX_TBFLAG_A64(env->hflags, SVL) + 1;
 +}
 +
- static inline bool bswap_code(bool sctlr_b)
+             env->sregs[EPC1 + level - 1] = env->pc;
- {
+             env->sregs[EPS2 + level - 2] = env->sregs[PS];
- #ifdef CONFIG_USER_ONLY
+             env->sregs[PS] =
 diff --git a/target/arm/translate.h b/target/arm/translate.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.h
 +++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ typedef struct DisasContext {
      int sve_excp_el; /* SVE exception EL or 0 if enabled */
      int sme_excp_el; /* SME exception EL or 0 if enabled */
      int vl;          /* current vector length in bytes */
 +    int svl;         /* current streaming vector length in bytes */
      bool vfp_enabled; /* FP enabled via FPSCR.EN */
      int vec_len;
      int vec_stride;
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.c
 +++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static CPUARMTBFlags rebuild_hflags_a64(CPUARMState *env, int el, int fp_el,
          DP_TBFLAG_A64(flags, SVEEXC_EL, sve_el);
      }
      if (cpu_isar_feature(aa64_sme, env_archcpu(env))) {
 -        DP_TBFLAG_A64(flags, SMEEXC_EL, sme_exception_el(env, el));
 +        int sme_el = sme_exception_el(env, el);
 +
 +        DP_TBFLAG_A64(flags, SMEEXC_EL, sme_el);
 +        if (sme_el == 0) {
 +            /* Similarly, do not compute SVL if SME is disabled. */
 +            DP_TBFLAG_A64(flags, SVL, sve_vqm1_for_el_sm(env, el, true));
 +        }
          if (FIELD_EX64(env->svcr, SVCR, SM)) {
              DP_TBFLAG_A64(flags, PSTATE_SM, 1);
          }
 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-a64.c
 +++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_init_disas_context(DisasContextBase *dcbase,
      dc->sve_excp_el = EX_TBFLAG_A64(tb_flags, SVEEXC_EL);
      dc->sme_excp_el = EX_TBFLAG_A64(tb_flags, SMEEXC_EL);
      dc->vl = (EX_TBFLAG_A64(tb_flags, VL) + 1) * 16;
 +    dc->svl = (EX_TBFLAG_A64(tb_flags, SVL) + 1) * 16;
      dc->pauth_active = EX_TBFLAG_A64(tb_flags, PAUTH_ACTIVE);
      dc->bt = EX_TBFLAG_A64(tb_flags, BT);
      dc->btype = EX_TBFLAG_A64(tb_flags, BTYPE);
 --
-.25.1
+.34.1

-[PULL 03/25] target/arm: Catch invalid kvm state also for hvf
+[PULL 12/14] target/arm: Suppress more TCG unimplemented features in ID registers
-From: Alexander Graf <agraf@csgraf.de>
+We already squash the ID register field for FEAT_SPE (the Statistical
 Profiling Extension) because TCG does not implement it and if we
 advertise it to the guest the guest will crash trying to look at
 non-existent system registers.  Do the same for some other features
 which a real hardware Neoverse-V1 implements but which TCG doesn't:
  * FEAT_TRF (Self-hosted Trace Extension)
  * Trace Macrocell system register access
  * Memory mapped trace
  * FEAT_AMU (Activity Monitors Extension)
  * FEAT_MPAM (Memory Partitioning and Monitoring Extension)
  * FEAT_NV (Nested Virtualization)
-Some features such as running in EL3 or running M profile code are
+Most of these, like FEAT_SPE, are "introspection/trace" type features
-incompatible with virtualization as QEMU implements it today. To prevent
+which QEMU is unlikely to ever implement.  The odd-one-out here is
-users from picking invalid configurations on other virt solutions like
+FEAT_NV -- we could implement that and at some point we probably
-Hvf, let's run the same checks there too.
+will.
-Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1073
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Alexander Graf <agraf@csgraf.de>
+Message-id: 20230704130647.2842917-2-peter.maydell@linaro.org
 Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20220620192242.70573-2-agraf@csgraf.de
-[PMM: Allow qtest accelerator too; tweak comment]
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
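Squashing an ID register field, as the commit message above describes, comes down to depositing zero into a bit field. A standalone sketch follows, with deposit_field as a hypothetical stand-in for what FIELD_DP64 does. The PMSVer position used in the usage note (bits [35:32] of ID_AA64DFR0) is an assumption to check against the Arm ARM.

```c
#include <stdint.h>

/*
 * Return reg with the field [shift + length - 1 : shift] replaced by value.
 * Hypothetical helper for illustration; not the QEMU macro itself.
 */
static uint64_t deposit_field(uint64_t reg, unsigned int shift,
                              unsigned int length, uint64_t value)
{
    uint64_t field = (length == 64) ? ~0ULL : ((1ULL << length) - 1);
    uint64_t mask = field << shift;

    return (reg & ~mask) | ((value << shift) & mask);
}
```

For example, deposit_field(0x0000000200000011, 32, 4, 0) clears an assumed PMSVer value of 2 and leaves the low bits 0x11 untouched.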
- target/arm/cpu.c | 16 ++++++++++++----
+ target/arm/cpu.c | 33 +++++++++++++++++++++++++++++----
-file changed, 12 insertions(+), 4 deletions(-)
+file changed, 29 insertions(+), 4 deletions(-)
 diff --git a/target/arm/cpu.c b/target/arm/cpu.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu.c
 +++ b/target/arm/cpu.c
-@@ -XXX,XX +XXX,XX @@
- #include "hw/boards.h"
- #endif
- #include "sysemu/tcg.h"
-+#include "sysemu/qtest.h"
- #include "sysemu/hw_accel.h"
- #include "kvm_arm.h"
- #include "disas/capstone.h"
 @@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
-         }
      if (tcg_enabled()) {
          /*
 -         * Don't report the Statistical Profiling Extension in the ID
 -         * registers, because TCG doesn't implement it yet (not even a
 -         * minimal stub version) and guests will fall over when they
 -         * try to access the non-existent system registers for it.
 +         * Don't report some architectural features in the ID registers
 +         * where TCG does not yet implement it (not even a minimal
 +         * stub version). This avoids guests falling over when they
 +         * try to access the non-existent system registers for them.
           */
 +        /* FEAT_SPE (Statistical Profiling Extension) */
          cpu->isar.id_aa64dfr0 =
              FIELD_DP64(cpu->isar.id_aa64dfr0, ID_AA64DFR0, PMSVER, 0);
 +        /* FEAT_TRF (Self-hosted Trace Extension) */
 +        cpu->isar.id_aa64dfr0 =
 +            FIELD_DP64(cpu->isar.id_aa64dfr0, ID_AA64DFR0, TRACEFILT, 0);
 +        cpu->isar.id_dfr0 =
 +            FIELD_DP32(cpu->isar.id_dfr0, ID_DFR0, TRACEFILT, 0);
 +        /* Trace Macrocell system register access */
 +        cpu->isar.id_aa64dfr0 =
 +            FIELD_DP64(cpu->isar.id_aa64dfr0, ID_AA64DFR0, TRACEVER, 0);
 +        cpu->isar.id_dfr0 =
 +            FIELD_DP32(cpu->isar.id_dfr0, ID_DFR0, COPTRC, 0);
 +        /* Memory mapped trace */
 +        cpu->isar.id_dfr0 =
 +            FIELD_DP32(cpu->isar.id_dfr0, ID_DFR0, MMAPTRC, 0);
 +        /* FEAT_AMU (Activity Monitors Extension) */
 +        cpu->isar.id_aa64pfr0 =
 +            FIELD_DP64(cpu->isar.id_aa64pfr0, ID_AA64PFR0, AMU, 0);
 +        cpu->isar.id_pfr0 =
 +            FIELD_DP32(cpu->isar.id_pfr0, ID_PFR0, AMU, 0);
 +        /* FEAT_MPAM (Memory Partitioning and Monitoring Extension) */
 +        cpu->isar.id_aa64pfr0 =
 +            FIELD_DP64(cpu->isar.id_aa64pfr0, ID_AA64PFR0, MPAM, 0);
 +        /* FEAT_NV (Nested Virtualization) */
 +        cpu->isar.id_aa64mmfr2 =
 +            FIELD_DP64(cpu->isar.id_aa64mmfr2, ID_AA64MMFR2, NV, 0);
      }
--    if (kvm_enabled()) {
+     /* MPU can be configured out of a PMSA CPU either by setting has-mpu
 +    if (!tcg_enabled() && !qtest_enabled()) {
          /*
 +         * We assume that no accelerator except TCG (and the "not really an
 +         * accelerator" qtest) can handle these features, because Arm hardware
 +         * virtualization can't virtualize them.
 +         *
           * Catch all the cases which might cause us to create more than one
           * address space for the CPU (otherwise we will assert() later in
           * cpu_address_space_init()).
           */
          if (arm_feature(env, ARM_FEATURE_M)) {
              error_setg(errp,
 -                       "Cannot enable KVM when using an M-profile guest CPU");
 +                       "Cannot enable %s when using an M-profile guest CPU",
 +                       current_accel_name());
              return;
          }
          if (cpu->has_el3) {
              error_setg(errp,
 -                       "Cannot enable KVM when guest CPU has EL3 enabled");
 +                       "Cannot enable %s when guest CPU has EL3 enabled",
 +                       current_accel_name());
              return;
          }
          if (cpu->tag_memory) {
              error_setg(errp,
 -                       "Cannot enable KVM when guest CPUs has MTE enabled");
 +                       "Cannot enable %s when guest CPUs has MTE enabled",
 +                       current_accel_name());
              return;
          }
      }
 --
-.25.1
+.34.1

-[PULL 06/25] target/arm: Add syn_smetrap
+Deleted patch
-From: Richard Henderson <richard.henderson@linaro.org>
-This will be used for raising various traps for SME.
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20220620175235.60881-4-richard.henderson@linaro.org
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
----
- target/arm/syndrome.h | 14 ++++++++++++++
-file changed, 14 insertions(+)
-diff --git a/target/arm/syndrome.h b/target/arm/syndrome.h
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/syndrome.h
-+++ b/target/arm/syndrome.h
-@@ -XXX,XX +XXX,XX @@ enum arm_exception_class {
-     EC_AA64_SMC               = 0x17,
-     EC_SYSTEMREGISTERTRAP     = 0x18,
-     EC_SVEACCESSTRAP          = 0x19,
-+    EC_SMETRAP                = 0x1d,
-     EC_INSNABORT              = 0x20,
-     EC_INSNABORT_SAME_EL      = 0x21,
-     EC_PCALIGNMENT            = 0x22,
-@@ -XXX,XX +XXX,XX @@ enum arm_exception_class {
-     EC_AA64_BKPT              = 0x3c,
- };
-+typedef enum {
-+    SME_ET_AccessTrap,
-+    SME_ET_Streaming,
-+    SME_ET_NotStreaming,
-+    SME_ET_InactiveZA,
-+} SMEExceptionType;
-+
- #define ARM_EL_EC_SHIFT 26
- #define ARM_EL_IL_SHIFT 25
- #define ARM_EL_ISV_SHIFT 24
-@@ -XXX,XX +XXX,XX @@ static inline uint32_t syn_sve_access_trap(void)
-     return EC_SVEACCESSTRAP << ARM_EL_EC_SHIFT;
- }
-+static inline uint32_t syn_smetrap(SMEExceptionType etype, bool is_16bit)
-+{
-+    return (EC_SMETRAP << ARM_EL_EC_SHIFT)
-+        | (is_16bit ? 0 : ARM_EL_IL) | etype;
-+}
-+
- static inline uint32_t syn_pactrap(void)
- {
-     return EC_PACTRAP << ARM_EL_EC_SHIFT;
---
-.25.1
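
Note on the dropped syn_smetrap patch above: the syndrome it builds is the exception class in the top bits, the IL bit, and the SME exception type in the low bits. The following is a minimal standalone sketch of that packing, not the QEMU code. The `sketch_` name is hypothetical, and `ARM_EL_IL` is assumed to be `1 << ARM_EL_IL_SHIFT`, which the quoted hunk does not show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Constants copied from the hunk above. */
#define EC_SMETRAP       0x1du
#define ARM_EL_EC_SHIFT  26
#define ARM_EL_IL_SHIFT  25
/* Assumption: the IL flag is a single bit at ARM_EL_IL_SHIFT. */
#define ARM_EL_IL        (1u << ARM_EL_IL_SHIFT)

typedef enum {
    SME_ET_AccessTrap,
    SME_ET_Streaming,
    SME_ET_NotStreaming,
    SME_ET_InactiveZA,
} SMEExceptionType;

/* Hypothetical stand-in for syn_smetrap(): EC | IL | exception type. */
static uint32_t sketch_syn_smetrap(SMEExceptionType etype, bool is_16bit)
{
    assert(etype <= SME_ET_InactiveZA);
    return (EC_SMETRAP << ARM_EL_EC_SHIFT)
        | (is_16bit ? 0 : ARM_EL_IL)
        | (uint32_t)etype;
}
```

With these values, a 32-bit instruction trapping with SME_ET_AccessTrap yields 0x76000000.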

-[PULL 07/25] target/arm: Add ARM_CP_SME
+Deleted patch
-From: Richard Henderson <richard.henderson@linaro.org>
-This will be used for controlling access to SME cpregs.
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20220620175235.60881-5-richard.henderson@linaro.org
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
----
- target/arm/cpregs.h        |  5 +++++
- target/arm/translate-a64.c | 18 ++++++++++++++++++
-files changed, 23 insertions(+)
-diff --git a/target/arm/cpregs.h b/target/arm/cpregs.h
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpregs.h
-+++ b/target/arm/cpregs.h
-@@ -XXX,XX +XXX,XX @@ enum {
-     ARM_CP_EL3_NO_EL2_UNDEF      = 1 << 16,
-     ARM_CP_EL3_NO_EL2_KEEP       = 1 << 17,
-     ARM_CP_EL3_NO_EL2_C_NZ       = 1 << 18,
-+    /*
-+     * Flag: Access check for this sysreg is constrained by the
-+     * ARM pseudocode function CheckSMEAccess().
-+     */
-+    ARM_CP_SME                   = 1 << 19,
- };
- /*
-diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-a64.c
-+++ b/target/arm/translate-a64.c
-@@ -XXX,XX +XXX,XX @@ bool sve_access_check(DisasContext *s)
-     return fp_access_check(s);
- }
-+/*
-+ * Check that SME access is enabled, raise an exception if not.
-+ * Note that this function corresponds to CheckSMEAccess and is
-+ * only used directly for cpregs.
-+ */
-+static bool sme_access_check(DisasContext *s)
-+{
-+    if (s->sme_excp_el) {
-+        gen_exception_insn_el(s, s->pc_curr, EXCP_UDEF,
-+                              syn_smetrap(SME_ET_AccessTrap, false),
-+                              s->sme_excp_el);
-+        return false;
-+    }
-+    return true;
-+}
-+
- /*
-  * This utility function is for doing register extension with an
-  * optional shift. You will likely want to pass a temporary for the
-@@ -XXX,XX +XXX,XX @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
-         return;
-     } else if ((ri->type & ARM_CP_SVE) && !sve_access_check(s)) {
-         return;
-+    } else if ((ri->type & ARM_CP_SME) && !sme_access_check(s)) {
-+        return;
-     }
-     if ((tb_cflags(s->base.tb) & CF_USE_ICOUNT) && (ri->type & ARM_CP_IO)) {
---
-.25.1

-[PULL 08/25] target/arm: Add SVCR
+Deleted patch
-From: Richard Henderson <richard.henderson@linaro.org>
-This cpreg is used to access two new bits of PSTATE
-that are not visible via any other mechanism.
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20220620175235.60881-6-richard.henderson@linaro.org
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
----
- target/arm/cpu.h    |  6 ++++++
- target/arm/helper.c | 13 +++++++++++++
-files changed, 19 insertions(+)
-diff --git a/target/arm/cpu.h b/target/arm/cpu.h
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.h
-+++ b/target/arm/cpu.h
-@@ -XXX,XX +XXX,XX @@ typedef struct CPUArchState {
-      *  nRW (also known as M[4]) is kept, inverted, in env->aarch64
-      *  DAIF (exception masks) are kept in env->daif
-      *  BTYPE is kept in env->btype
-+     *  SM and ZA are kept in env->svcr
-      *  all other bits are stored in their correct places in env->pstate
-      */
-     uint32_t pstate;
-@@ -XXX,XX +XXX,XX @@ typedef struct CPUArchState {
-     uint32_t condexec_bits; /* IT bits.  cpsr[15:10,26:25].  */
-     uint32_t btype;  /* BTI branch type.  spsr[11:10].  */
-     uint64_t daif; /* exception masks, in the bits they are in PSTATE */
-+    uint64_t svcr; /* PSTATE.{SM,ZA} in the bits they are in SVCR */
-     uint64_t elr_el[4]; /* AArch64 exception link regs  */
-     uint64_t sp_el[4]; /* AArch64 banked stack pointers */
-@@ -XXX,XX +XXX,XX @@ FIELD(CPTR_EL3, TCPAC, 31, 1)
- #define PSTATE_MODE_EL1t 4
- #define PSTATE_MODE_EL0t 0
-+/* PSTATE bits that are accessed via SVCR and not stored in SPSR_ELx. */
-+FIELD(SVCR, SM, 0, 1)
-+FIELD(SVCR, ZA, 1, 1)
-+
- /* Write a new value to v7m.exception, thus transitioning into or out
-  * of Handler mode; this may result in a change of active stack pointer.
-  */
-diff --git a/target/arm/helper.c b/target/arm/helper.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
-+++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ static CPAccessResult access_tpidr2(CPUARMState *env, const ARMCPRegInfo *ri,
-     return CP_ACCESS_OK;
- }
-+static void svcr_write(CPUARMState *env, const ARMCPRegInfo *ri,
-+                       uint64_t value)
-+{
-+    value &= R_SVCR_SM_MASK | R_SVCR_ZA_MASK;
-+    /* TODO: Side effects. */
-+    env->svcr = value;
-+}
-+
- static const ARMCPRegInfo sme_reginfo[] = {
-     { .name = "TPIDR2_EL0", .state = ARM_CP_STATE_AA64,
-       .opc0 = 3, .opc1 = 3, .crn = 13, .crm = 0, .opc2 = 5,
-       .access = PL0_RW, .accessfn = access_tpidr2,
-       .fieldoffset = offsetof(CPUARMState, cp15.tpidr2_el0) },
-+    { .name = "SVCR", .state = ARM_CP_STATE_AA64,
-+      .opc0 = 3, .opc1 = 3, .crn = 4, .crm = 2, .opc2 = 2,
-+      .access = PL0_RW, .type = ARM_CP_SME,
-+      .fieldoffset = offsetof(CPUARMState, svcr),
-+      .writefn = svcr_write, .raw_writefn = raw_write },
- };
- #endif /* TARGET_AARCH64 */
---
-.25.1

-[PULL 09/25] target/arm: Add SMCR_ELx
+Deleted patch
-From: Richard Henderson <richard.henderson@linaro.org>
-These cpregs control the streaming vector length and whether the
-full a64 instruction set is allowed while in streaming mode.
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20220620175235.60881-7-richard.henderson@linaro.org
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
----
- target/arm/cpu.h    |  8 ++++++--
- target/arm/helper.c | 41 +++++++++++++++++++++++++++++++++++++++++
-files changed, 47 insertions(+), 2 deletions(-)
-diff --git a/target/arm/cpu.h b/target/arm/cpu.h
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.h
-+++ b/target/arm/cpu.h
-@@ -XXX,XX +XXX,XX @@ typedef struct CPUArchState {
-         float_status standard_fp_status;
-         float_status standard_fp_status_f16;
--        /* ZCR_EL[1-3] */
--        uint64_t zcr_el[4];
-+        uint64_t zcr_el[4];   /* ZCR_EL[1-3] */
-+        uint64_t smcr_el[4];  /* SMCR_EL[1-3] */
-     } vfp;
-     uint64_t exclusive_addr;
-     uint64_t exclusive_val;
-@@ -XXX,XX +XXX,XX @@ FIELD(CPTR_EL3, TCPAC, 31, 1)
- FIELD(SVCR, SM, 0, 1)
- FIELD(SVCR, ZA, 1, 1)
-+/* Fields for SMCR_ELx. */
-+FIELD(SMCR, LEN, 0, 4)
-+FIELD(SMCR, FA64, 31, 1)
-+
- /* Write a new value to v7m.exception, thus transitioning into or out
-  * of Handler mode; this may result in a change of active stack pointer.
-  */
-diff --git a/target/arm/helper.c b/target/arm/helper.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
-+++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ static void define_arm_vh_e2h_redirects_aliases(ARMCPU *cpu)
-          */
-         { K(3, 0,  1, 2, 0), K(3, 4,  1, 2, 0), K(3, 5, 1, 2, 0),
-           "ZCR_EL1", "ZCR_EL2", "ZCR_EL12", isar_feature_aa64_sve },
-+        { K(3, 0,  1, 2, 6), K(3, 4,  1, 2, 6), K(3, 5, 1, 2, 6),
-+          "SMCR_EL1", "SMCR_EL2", "SMCR_EL12", isar_feature_aa64_sme },
-         { K(3, 0,  5, 6, 0), K(3, 4,  5, 6, 0), K(3, 5, 5, 6, 0),
-           "TFSR_EL1", "TFSR_EL2", "TFSR_EL12", isar_feature_aa64_mte },
-@@ -XXX,XX +XXX,XX @@ static void svcr_write(CPUARMState *env, const ARMCPRegInfo *ri,
-     env->svcr = value;
- }
-+static void smcr_write(CPUARMState *env, const ARMCPRegInfo *ri,
-+                       uint64_t value)
-+{
-+    int cur_el = arm_current_el(env);
-+    int old_len = sve_vqm1_for_el(env, cur_el);
-+    int new_len;
-+
-+    QEMU_BUILD_BUG_ON(ARM_MAX_VQ > R_SMCR_LEN_MASK + 1);
-+    value &= R_SMCR_LEN_MASK | R_SMCR_FA64_MASK;
-+    raw_write(env, ri, value);
-+
-+    /*
-+     * Note that it is CONSTRAINED UNPREDICTABLE what happens to ZA storage
-+     * when SVL is widened (old values kept, or zeros).  Choose to keep the
-+     * current values for simplicity.  But for QEMU internals, we must still
-+     * apply the narrower SVL to the Zregs and Pregs -- see the comment
-+     * above aarch64_sve_narrow_vq.
-+     */
-+    new_len = sve_vqm1_for_el(env, cur_el);
-+    if (new_len < old_len) {
-+        aarch64_sve_narrow_vq(env, new_len + 1);
-+    }
-+}
-+
- static const ARMCPRegInfo sme_reginfo[] = {
-     { .name = "TPIDR2_EL0", .state = ARM_CP_STATE_AA64,
-       .opc0 = 3, .opc1 = 3, .crn = 13, .crm = 0, .opc2 = 5,
-@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo sme_reginfo[] = {
-       .access = PL0_RW, .type = ARM_CP_SME,
-       .fieldoffset = offsetof(CPUARMState, svcr),
-       .writefn = svcr_write, .raw_writefn = raw_write },
-+    { .name = "SMCR_EL1", .state = ARM_CP_STATE_AA64,
-+      .opc0 = 3, .opc1 = 0, .crn = 1, .crm = 2, .opc2 = 6,
-+      .access = PL1_RW, .type = ARM_CP_SME,
-+      .fieldoffset = offsetof(CPUARMState, vfp.smcr_el[1]),
-+      .writefn = smcr_write, .raw_writefn = raw_write },
-+    { .name = "SMCR_EL2", .state = ARM_CP_STATE_AA64,
-+      .opc0 = 3, .opc1 = 4, .crn = 1, .crm = 2, .opc2 = 6,
-+      .access = PL2_RW, .type = ARM_CP_SME,
-+      .fieldoffset = offsetof(CPUARMState, vfp.smcr_el[2]),
-+      .writefn = smcr_write, .raw_writefn = raw_write },
-+    { .name = "SMCR_EL3", .state = ARM_CP_STATE_AA64,
-+      .opc0 = 3, .opc1 = 6, .crn = 1, .crm = 2, .opc2 = 6,
-+      .access = PL3_RW, .type = ARM_CP_SME,
-+      .fieldoffset = offsetof(CPUARMState, vfp.smcr_el[3]),
-+      .writefn = smcr_write, .raw_writefn = raw_write },
- };
- #endif /* TARGET_AARCH64 */
---
-.25.1
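
Note on the dropped SMCR_ELx patch above: the write function keeps only the LEN field (bits [3:0]) and the FA64 bit (bit 31). The sketch below shows that masking in isolation, assuming that LEN requests a streaming vector length of LEN + 1 quadwords. The `sketch_` names are hypothetical, and the effective length in QEMU also depends on the lengths the CPU supports, which is not modelled here.

```c
#include <assert.h>
#include <stdint.h>

/* Field layout from FIELD(SMCR, LEN, 0, 4) and FIELD(SMCR, FA64, 31, 1). */
#define SKETCH_SMCR_LEN_MASK   UINT64_C(0xf)
#define SKETCH_SMCR_FA64_MASK  (UINT64_C(1) << 31)

/* Drop every bit that the register does not implement. */
static uint64_t sketch_smcr_sanitize(uint64_t value)
{
    return value & (SKETCH_SMCR_LEN_MASK | SKETCH_SMCR_FA64_MASK);
}

/* Requested vector length in 128-bit quadwords (assumption: LEN + 1). */
static unsigned sketch_smcr_requested_vq(uint64_t smcr)
{
    unsigned vq = (unsigned)(smcr & SKETCH_SMCR_LEN_MASK) + 1;

    assert(vq <= 16); /* a 4-bit LEN field cannot ask for more */
    return vq;
}
```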

-[PULL 10/25] target/arm: Add SMIDR_EL1, SMPRI_EL1, SMPRIMAP_EL2
+Deleted patch
-From: Richard Henderson <richard.henderson@linaro.org>
-Implement the streaming mode identification register, and the
-two streaming priority registers.  For QEMU, they are all RES0.
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20220620175235.60881-8-richard.henderson@linaro.org
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
----
- target/arm/helper.c | 33 +++++++++++++++++++++++++++++++++
-file changed, 33 insertions(+)
-diff --git a/target/arm/helper.c b/target/arm/helper.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
-+++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ static CPAccessResult access_tpidr2(CPUARMState *env, const ARMCPRegInfo *ri,
-     return CP_ACCESS_OK;
- }
-+static CPAccessResult access_esm(CPUARMState *env, const ARMCPRegInfo *ri,
-+                                 bool isread)
-+{
-+    /* TODO: FEAT_FGT for SMPRI_EL1 but not SMPRIMAP_EL2 */
-+    if (arm_current_el(env) < 3
-+        && arm_feature(env, ARM_FEATURE_EL3)
-+        && !FIELD_EX64(env->cp15.cptr_el[3], CPTR_EL3, ESM)) {
-+        return CP_ACCESS_TRAP_EL3;
-+    }
-+    return CP_ACCESS_OK;
-+}
-+
- static void svcr_write(CPUARMState *env, const ARMCPRegInfo *ri,
-                        uint64_t value)
- {
-@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo sme_reginfo[] = {
-       .access = PL3_RW, .type = ARM_CP_SME,
-       .fieldoffset = offsetof(CPUARMState, vfp.smcr_el[3]),
-       .writefn = smcr_write, .raw_writefn = raw_write },
-+    { .name = "SMIDR_EL1", .state = ARM_CP_STATE_AA64,
-+      .opc0 = 3, .opc1 = 1, .crn = 0, .crm = 0, .opc2 = 6,
-+      .access = PL1_R, .accessfn = access_aa64_tid1,
-+      /*
-+       * IMPLEMENTOR = 0 (software)
-+       * REVISION    = 0 (implementation defined)
-+       * SMPS        = 0 (no streaming execution priority in QEMU)
-+       * AFFINITY    = 0 (streaming sve mode not shared with other PEs)
-+       */
-+      .type = ARM_CP_CONST, .resetvalue = 0, },
-+    /*
-+     * Because SMIDR_EL1.SMPS is 0, SMPRI_EL1 and SMPRIMAP_EL2 are RES 0.
-+     */
-+    { .name = "SMPRI_EL1", .state = ARM_CP_STATE_AA64,
-+      .opc0 = 3, .opc1 = 0, .crn = 1, .crm = 2, .opc2 = 4,
-+      .access = PL1_RW, .accessfn = access_esm,
-+      .type = ARM_CP_CONST, .resetvalue = 0 },
-+    { .name = "SMPRIMAP_EL2", .state = ARM_CP_STATE_AA64,
-+      .opc0 = 3, .opc1 = 4, .crn = 1, .crm = 2, .opc2 = 5,
-+      .access = PL2_RW, .accessfn = access_esm,
-+      .type = ARM_CP_CONST, .resetvalue = 0 },
- };
- #endif /* TARGET_AARCH64 */
---
-.25.1

-[PULL 11/25] target/arm: Add PSTATE.{SM,ZA} to TB flags
+Deleted patch
-From: Richard Henderson <richard.henderson@linaro.org>
-These are required to determine if various insns
-are allowed to issue.
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20220620175235.60881-9-richard.henderson@linaro.org
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
----
- target/arm/cpu.h           | 2 ++
- target/arm/translate.h     | 4 ++++
- target/arm/helper.c        | 4 ++++
- target/arm/translate-a64.c | 2 ++
-files changed, 12 insertions(+)
-diff --git a/target/arm/cpu.h b/target/arm/cpu.h
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.h
-+++ b/target/arm/cpu.h
-@@ -XXX,XX +XXX,XX @@ FIELD(TBFLAG_A64, TCMA, 16, 2)
- FIELD(TBFLAG_A64, MTE_ACTIVE, 18, 1)
- FIELD(TBFLAG_A64, MTE0_ACTIVE, 19, 1)
- FIELD(TBFLAG_A64, SMEEXC_EL, 20, 2)
-+FIELD(TBFLAG_A64, PSTATE_SM, 22, 1)
-+FIELD(TBFLAG_A64, PSTATE_ZA, 23, 1)
- /*
-  * Helpers for using the above.
-diff --git a/target/arm/translate.h b/target/arm/translate.h
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.h
-+++ b/target/arm/translate.h
-@@ -XXX,XX +XXX,XX @@ typedef struct DisasContext {
-     bool align_mem;
-     /* True if PSTATE.IL is set */
-     bool pstate_il;
-+    /* True if PSTATE.SM is set. */
-+    bool pstate_sm;
-+    /* True if PSTATE.ZA is set. */
-+    bool pstate_za;
-     /* True if MVE insns are definitely not predicated by VPR or LTPSIZE */
-     bool mve_no_pred;
-     /*
-diff --git a/target/arm/helper.c b/target/arm/helper.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
-+++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ static CPUARMTBFlags rebuild_hflags_a64(CPUARMState *env, int el, int fp_el,
-     }
-     if (cpu_isar_feature(aa64_sme, env_archcpu(env))) {
-         DP_TBFLAG_A64(flags, SMEEXC_EL, sme_exception_el(env, el));
-+        if (FIELD_EX64(env->svcr, SVCR, SM)) {
-+            DP_TBFLAG_A64(flags, PSTATE_SM, 1);
-+        }
-+        DP_TBFLAG_A64(flags, PSTATE_ZA, FIELD_EX64(env->svcr, SVCR, ZA));
-     }
-     sctlr = regime_sctlr(env, stage1);
-diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-a64.c
-+++ b/target/arm/translate-a64.c
-@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_init_disas_context(DisasContextBase *dcbase,
-     dc->ata = EX_TBFLAG_A64(tb_flags, ATA);
-     dc->mte_active[0] = EX_TBFLAG_A64(tb_flags, MTE_ACTIVE);
-     dc->mte_active[1] = EX_TBFLAG_A64(tb_flags, MTE0_ACTIVE);
-+    dc->pstate_sm = EX_TBFLAG_A64(tb_flags, PSTATE_SM);
-+    dc->pstate_za = EX_TBFLAG_A64(tb_flags, PSTATE_ZA);
-     dc->vec_len = 0;
-     dc->vec_stride = 0;
-     dc->cp_regs = arm_cpu->cp_regs;
---
-.25.1

-[PULL 12/25] target/arm: Add the SME ZA storage to CPUARMState
+Deleted patch
-From: Richard Henderson <richard.henderson@linaro.org>
-Place this late in the resettable section of the structure,
-to keep the most common element offsets from being > 64k.
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20220620175235.60881-10-richard.henderson@linaro.org
-[PMM: expanded comment on zarray[] format]
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
----
- target/arm/cpu.h     | 22 ++++++++++++++++++++++
- target/arm/machine.c | 34 ++++++++++++++++++++++++++++++++++
-files changed, 56 insertions(+)
-diff --git a/target/arm/cpu.h b/target/arm/cpu.h
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.h
-+++ b/target/arm/cpu.h
-@@ -XXX,XX +XXX,XX @@ typedef struct CPUArchState {
-     } keys;
-     uint64_t scxtnum_el[4];
-+
-+    /*
-+     * SME ZA storage -- 256 x 256 byte array, with bytes in host word order,
-+     * as we do with vfp.zregs[].  This corresponds to the architectural ZA
-+     * array, where ZA[N] is in the least-significant bytes of env->zarray[N].
-+     * When SVL is less than the architectural maximum, the accessible
-+     * storage is restricted, such that if the SVL is X bytes the guest can
-+     * see only the bottom X elements of zarray[], and only the least
-+     * significant X bytes of each element of the array. (In other words,
-+     * the observable part is always square.)
-+     *
-+     * The ZA storage can also be considered as a set of square tiles of
-+     * elements of different sizes. The mapping from tiles to the ZA array
-+     * is architecturally defined, such that for tiles of elements of esz
-+     * bytes, the Nth row (or "horizontal slice") of tile T is in
-+     * ZA[T + N * esz]. Note that this means that each tile is not contiguous
-+     * in the ZA storage, because its rows are striped through the ZA array.
-+     *
-+     * Because this is so large, keep this toward the end of the reset area,
-+     * to keep the offsets into the rest of the structure smaller.
-+     */
-+    ARMVectorReg zarray[ARM_MAX_VQ * 16];
- #endif
- #if defined(CONFIG_USER_ONLY)
-diff --git a/target/arm/machine.c b/target/arm/machine.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/machine.c
-+++ b/target/arm/machine.c
-@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_sve = {
-         VMSTATE_END_OF_LIST()
-     }
- };
-+
-+static const VMStateDescription vmstate_vreg = {
-+    .name = "vreg",
-+    .version_id = 1,
-+    .minimum_version_id = 1,
-+    .fields = (VMStateField[]) {
-+        VMSTATE_UINT64_ARRAY(d, ARMVectorReg, ARM_MAX_VQ * 2),
-+        VMSTATE_END_OF_LIST()
-+    }
-+};
-+
-+static bool za_needed(void *opaque)
-+{
-+    ARMCPU *cpu = opaque;
-+
-+    /*
-+     * When ZA storage is disabled, its contents are discarded.
-+     * It will be zeroed when ZA storage is re-enabled.
-+     */
-+    return FIELD_EX64(cpu->env.svcr, SVCR, ZA);
-+}
-+
-+static const VMStateDescription vmstate_za = {
-+    .name = "cpu/sme",
-+    .version_id = 1,
-+    .minimum_version_id = 1,
-+    .needed = za_needed,
-+    .fields = (VMStateField[]) {
-+        VMSTATE_STRUCT_ARRAY(env.zarray, ARMCPU, ARM_MAX_VQ * 16, 0,
-+                             vmstate_vreg, ARMVectorReg),
-+        VMSTATE_END_OF_LIST()
-+    }
-+};
- #endif /* AARCH64 */
- static bool serror_needed(void *opaque)
-@@ -XXX,XX +XXX,XX @@ const VMStateDescription vmstate_arm_cpu = {
-         &vmstate_m_security,
- #ifdef TARGET_AARCH64
-         &vmstate_sve,
-+        &vmstate_za,
- #endif
-         &vmstate_serror,
-         &vmstate_irq_line_state,
---
-.25.1
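
Note on the dropped ZA storage patch above: the zarray[] comment states that row N of tile T, for elements of esz bytes, lives in ZA[T + N * esz]. The sketch below works that index out. The `sketch_` names are hypothetical, and the bounds checks (esz tiles per element size, SVL / esz rows per tile) are my reading of the architecture rather than something the quoted patch states.

```c
#include <assert.h>

/* zarray[] has ARM_MAX_VQ * 16 rows; the comment above gives 256. */
#define SKETCH_ZA_MAX_ROWS 256u

/* Rows in one tile when the streaming vector length is svl_bytes. */
static unsigned sketch_za_tile_rows(unsigned svl_bytes, unsigned esz)
{
    return svl_bytes / esz;
}

/* Index into the ZA array of row 'row' of tile 'tile'. */
static unsigned sketch_za_row_index(unsigned tile, unsigned row,
                                    unsigned esz, unsigned svl_bytes)
{
    assert(esz == 1 || esz == 2 || esz == 4 || esz == 8 || esz == 16);
    assert(svl_bytes <= SKETCH_ZA_MAX_ROWS);
    assert(tile < esz);                 /* assumption: esz tiles exist */
    assert(row < sketch_za_tile_rows(svl_bytes, esz));
    return tile + row * esz;            /* rows are striped through ZA */
}
```

For example, with 4-byte elements and a 32-byte SVL, row 2 of tile 1 is ZA[9], and the last row of tile 3 is ZA[31], which is why a tile is not contiguous in the storage.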

-[PULL 13/25] target/arm: Implement SMSTART, SMSTOP
+Deleted patch
-From: Richard Henderson <richard.henderson@linaro.org>
-These two instructions are aliases of MSR (immediate).
-Use the two helpers to properly implement svcr_write.
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20220620175235.60881-11-richard.henderson@linaro.org
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
----
- target/arm/cpu.h           |  1 +
- target/arm/helper-sme.h    | 21 +++++++++++++
- target/arm/helper.h        |  1 +
- target/arm/helper.c        |  6 ++--
- target/arm/sme_helper.c    | 61 ++++++++++++++++++++++++++++++++++++++
- target/arm/translate-a64.c | 24 +++++++++++++++
- target/arm/meson.build     |  1 +
-files changed, 112 insertions(+), 3 deletions(-)
- create mode 100644 target/arm/helper-sme.h
- create mode 100644 target/arm/sme_helper.c
-diff --git a/target/arm/cpu.h b/target/arm/cpu.h
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.h
-+++ b/target/arm/cpu.h
-@@ -XXX,XX +XXX,XX @@ void aarch64_sve_change_el(CPUARMState *env, int old_el,
-                            int new_el, bool el0_a64);
- void aarch64_add_sve_properties(Object *obj);
- void aarch64_add_pauth_properties(Object *obj);
-+void arm_reset_sve_state(CPUARMState *env);
- /*
-  * SVE registers are encoded in KVM's memory in an endianness-invariant format.
-diff --git a/target/arm/helper-sme.h b/target/arm/helper-sme.h
-new file mode 100644
-index XXXXXXX..XXXXXXX
---- /dev/null
-+++ b/target/arm/helper-sme.h
-@@ -XXX,XX +XXX,XX @@
-+/*
-+ *  AArch64 SME specific helper definitions
-+ *
-+ *  Copyright (c) 2022 Linaro, Ltd
-+ *
-+ * This library is free software; you can redistribute it and/or
-+ * modify it under the terms of the GNU Lesser General Public
-+ * License as published by the Free Software Foundation; either
-+ * version 2.1 of the License, or (at your option) any later version.
-+ *
-+ * This library is distributed in the hope that it will be useful,
-+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
-+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-+ * Lesser General Public License for more details.
-+ *
-+ * You should have received a copy of the GNU Lesser General Public
-+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
-+ */
-+
-+DEF_HELPER_FLAGS_2(set_pstate_sm, TCG_CALL_NO_RWG, void, env, i32)
-+DEF_HELPER_FLAGS_2(set_pstate_za, TCG_CALL_NO_RWG, void, env, i32)
-diff --git a/target/arm/helper.h b/target/arm/helper.h
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.h
-+++ b/target/arm/helper.h
-@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_6(gvec_bfmlal_idx, TCG_CALL_NO_RWG,
- #ifdef TARGET_AARCH64
- #include "helper-a64.h"
- #include "helper-sve.h"
-+#include "helper-sme.h"
- #endif
- #include "helper-mve.h"
-diff --git a/target/arm/helper.c b/target/arm/helper.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
-+++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ static CPAccessResult access_esm(CPUARMState *env, const ARMCPRegInfo *ri,
- static void svcr_write(CPUARMState *env, const ARMCPRegInfo *ri,
-                        uint64_t value)
- {
--    value &= R_SVCR_SM_MASK | R_SVCR_ZA_MASK;
--    /* TODO: Side effects. */
--    env->svcr = value;
-+    helper_set_pstate_sm(env, FIELD_EX64(value, SVCR, SM));
-+    helper_set_pstate_za(env, FIELD_EX64(value, SVCR, ZA));
-+    arm_rebuild_hflags(env);
- }
- static void smcr_write(CPUARMState *env, const ARMCPRegInfo *ri,
-diff --git a/target/arm/sme_helper.c b/target/arm/sme_helper.c
-new file mode 100644
-index XXXXXXX..XXXXXXX
---- /dev/null
-+++ b/target/arm/sme_helper.c
-@@ -XXX,XX +XXX,XX @@
-+/*
-+ * ARM SME Operations
-+ *
-+ * Copyright (c) 2022 Linaro, Ltd.
-+ *
-+ * This library is free software; you can redistribute it and/or
-+ * modify it under the terms of the GNU Lesser General Public
-+ * License as published by the Free Software Foundation; either
-+ * version 2.1 of the License, or (at your option) any later version.
-+ *
-+ * This library is distributed in the hope that it will be useful,
-+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
-+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-+ * Lesser General Public License for more details.
-+ *
-+ * You should have received a copy of the GNU Lesser General Public
-+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
-+ */
-+
-+#include "qemu/osdep.h"
-+#include "cpu.h"
-+#include "internals.h"
-+#include "exec/helper-proto.h"
-+
-+/* ResetSVEState */
-+void arm_reset_sve_state(CPUARMState *env)
-+{
-+    memset(env->vfp.zregs, 0, sizeof(env->vfp.zregs));
-+    /* Recall that FFR is stored as pregs[16]. */
-+    memset(env->vfp.pregs, 0, sizeof(env->vfp.pregs));
-+    vfp_set_fpcr(env, 0x0800009f);
-+}
-+
-+void helper_set_pstate_sm(CPUARMState *env, uint32_t i)
-+{
-+    if (i == FIELD_EX64(env->svcr, SVCR, SM)) {
-+        return;
-+    }
-+    env->svcr ^= R_SVCR_SM_MASK;
-+    arm_reset_sve_state(env);
-+}
-+
-+void helper_set_pstate_za(CPUARMState *env, uint32_t i)
-+{
-+    if (i == FIELD_EX64(env->svcr, SVCR, ZA)) {
-+        return;
-+    }
-+    env->svcr ^= R_SVCR_ZA_MASK;
-+
-+    /*
-+     * ResetSMEState.
-+     *
-+     * SetPSTATE_ZA zeros on enable and disable.  We can zero this only
-+     * on enable: while disabled, the storage is inaccessible and the
-+     * value does not matter.  We're not saving the storage in vmstate
-+     * when disabled either.
-+     */
-+    if (i) {
-+        memset(env->zarray, 0, sizeof(env->zarray));
-+    }
-+}
-diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-a64.c
-+++ b/target/arm/translate-a64.c
-@@ -XXX,XX +XXX,XX @@ static void handle_msr_i(DisasContext *s, uint32_t insn,
-         }
-         break;
-+    case 0x1b: /* SVCR* */
-+        if (!dc_isar_feature(aa64_sme, s) || crm < 2 || crm > 7) {
-+            goto do_unallocated;
-+        }
-+        if (sme_access_check(s)) {
-+            bool i = crm & 1;
-+            bool changed = false;
-+
-+            if ((crm & 2) && i != s->pstate_sm) {
-+                gen_helper_set_pstate_sm(cpu_env, tcg_constant_i32(i));
-+                changed = true;
-+            }
-+            if ((crm & 4) && i != s->pstate_za) {
-+                gen_helper_set_pstate_za(cpu_env, tcg_constant_i32(i));
-+                changed = true;
-+            }
-+            if (changed) {
-+                gen_rebuild_hflags(s);
-+            } else {
-+                s->base.is_jmp = DISAS_NEXT;
-+            }
-+        }
-+        break;
-+
-     default:
-     do_unallocated:
-         unallocated_encoding(s);
-diff --git a/target/arm/meson.build b/target/arm/meson.build
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/meson.build
-+++ b/target/arm/meson.build
-@@ -XXX,XX +XXX,XX @@ arm_ss.add(when: 'TARGET_AARCH64', if_true: files(
-   'mte_helper.c',
-   'pauth_helper.c',
-   'sve_helper.c',
-+  'sme_helper.c',
-   'translate-a64.c',
-   'translate-sve.c',
- ))
---
-.25.1
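
Note on the dropped SMSTART/SMSTOP patch above: the handle_msr_i hunk decodes CRm so that bit 0 is the new value, bit 1 selects PSTATE.SM and bit 2 selects PSTATE.ZA, with CRm values below 2 or above 7 unallocated. The sketch below models only that decode and the resulting SVCR bits. The `Sketch`/`sketch_` names are hypothetical, and the side effects in the real helpers (resetting SVE state, zeroing ZA) are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_SVCR_SM (UINT64_C(1) << 0)
#define SKETCH_SVCR_ZA (UINT64_C(1) << 1)

typedef struct {
    bool valid;     /* false for unallocated CRm values */
    bool enable;    /* SMSTART (true) or SMSTOP (false) */
    bool touch_sm;  /* operation applies to PSTATE.SM */
    bool touch_za;  /* operation applies to PSTATE.ZA */
} SketchSvcrOp;

static SketchSvcrOp sketch_decode_svcr_crm(unsigned crm)
{
    SketchSvcrOp op = { false, false, false, false };

    assert(crm <= 15); /* CRm is a 4-bit instruction field */
    if (crm < 2 || crm > 7) {
        return op;
    }
    op.valid = true;
    op.enable = crm & 1;
    op.touch_sm = crm & 2;
    op.touch_za = crm & 4;
    return op;
}

/* New SVCR value; register and ZA resets are not modelled. */
static uint64_t sketch_apply_svcr_op(uint64_t svcr, SketchSvcrOp op)
{
    uint64_t mask = (op.touch_sm ? SKETCH_SVCR_SM : 0)
                  | (op.touch_za ? SKETCH_SVCR_ZA : 0);

    if (!op.valid) {
        return svcr;
    }
    return op.enable ? (svcr | mask) : (svcr & ~mask);
}
```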

-[PULL 24/25] target/arm: Extend arm_pamax to more than aarch64
+[PULL 13/14] target/arm: Define neoverse-v1
-From: Richard Henderson <richard.henderson@linaro.org>
+Now that we have implemented support for FEAT_LSE2, we can define
+a CPU model for the Neoverse-V1, and enable it for the virt and
-Move the code from hw/arm/virt.c that is supposed
+sbsa-ref boards.
-to handle v7 into the one function.
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
 Reported-by: He Zhe <zhe.he@windriver.com>
 Message-id: 20220619001541.131672-2-richard.henderson@linaro.org
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Message-id: 20230704130647.2842917-3-peter.maydell@linaro.org
+Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
 ---
- hw/arm/virt.c    | 10 +---------
+ docs/system/arm/virt.rst |   1 +
- target/arm/ptw.c | 24 ++++++++++++++++--------
+ hw/arm/sbsa-ref.c        |   1 +
-files changed, 17 insertions(+), 17 deletions(-)
+ hw/arm/virt.c            |   1 +
+ target/arm/tcg/cpu64.c   | 128 +++++++++++++++++++++++++++++++++++++++
 files changed, 131 insertions(+)
 diff --git a/docs/system/arm/virt.rst b/docs/system/arm/virt.rst
 index XXXXXXX..XXXXXXX 100644
 --- a/docs/system/arm/virt.rst
 +++ b/docs/system/arm/virt.rst
@@ -XXX,XX +XXX,XX @@ Supported guest CPU types:
  - ``a64fx`` (64-bit)
  - ``host`` (with KVM only)
  - ``neoverse-n1`` (64-bit)
 +- ``neoverse-v1`` (64-bit)
  - ``max`` (same as ``host`` for KVM; best possible emulation with TCG)
  Note that the default is ``cortex-a15``, so for an AArch64 guest you must
 diff --git a/hw/arm/sbsa-ref.c b/hw/arm/sbsa-ref.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/arm/sbsa-ref.c
 +++ b/hw/arm/sbsa-ref.c
@@ -XXX,XX +XXX,XX @@ static const char * const valid_cpus[] = {
      ARM_CPU_TYPE_NAME("cortex-a57"),
      ARM_CPU_TYPE_NAME("cortex-a72"),
      ARM_CPU_TYPE_NAME("neoverse-n1"),
 +    ARM_CPU_TYPE_NAME("neoverse-v1"),
      ARM_CPU_TYPE_NAME("max"),
  };
 diff --git a/hw/arm/virt.c b/hw/arm/virt.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/arm/virt.c
 +++ b/hw/arm/virt.c
-@@ -XXX,XX +XXX,XX @@ static void machvirt_init(MachineState *machine)
+@@ -XXX,XX +XXX,XX @@ static const char *valid_cpus[] = {
-         cpuobj = object_new(possible_cpus->cpus[0].type);
+     ARM_CPU_TYPE_NAME("cortex-a76"),
-         armcpu = ARM_CPU(cpuobj);
+     ARM_CPU_TYPE_NAME("a64fx"),
+     ARM_CPU_TYPE_NAME("neoverse-n1"),
--        if (object_property_get_bool(cpuobj, "aarch64", NULL)) {
++    ARM_CPU_TYPE_NAME("neoverse-v1"),
--            pa_bits = arm_pamax(armcpu);
+ #endif
--        } else if (arm_feature(&armcpu->env, ARM_FEATURE_LPAE)) {
+     ARM_CPU_TYPE_NAME("cortex-a53"),
--            /* v7 with LPAE */
+     ARM_CPU_TYPE_NAME("cortex-a57"),
--            pa_bits = 40;
+diff --git a/target/arm/tcg/cpu64.c b/target/arm/tcg/cpu64.c
--        } else {
+index XXXXXXX..XXXXXXX 100644
--            /* Anything else */
+--- a/target/arm/tcg/cpu64.c
--            pa_bits = 32;
++++ b/target/arm/tcg/cpu64.c
--        }
+@@ -XXX,XX +XXX,XX @@ static void define_neoverse_n1_cp_reginfo(ARMCPU *cpu)
-+        pa_bits = arm_pamax(armcpu);
+     define_arm_cp_regs(cpu, neoverse_n1_cp_reginfo);
+ }
-         object_unref(cpuobj);
++static const ARMCPRegInfo neoverse_v1_cp_reginfo[] = {
-diff --git a/target/arm/ptw.c b/target/arm/ptw.c
++    { .name = "CPUECTLR2_EL1", .state = ARM_CP_STATE_AA64,
-index XXXXXXX..XXXXXXX 100644
++      .opc0 = 3, .opc1 = 0, .crn = 15, .crm = 1, .opc2 = 5,
---- a/target/arm/ptw.c
++      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
-+++ b/target/arm/ptw.c
++    { .name = "CPUPPMCR_EL3", .state = ARM_CP_STATE_AA64,
-@@ -XXX,XX +XXX,XX @@ static const uint8_t pamax_map[] = {
++      .opc0 = 3, .opc1 = 6, .crn = 15, .crm = 2, .opc2 = 0,
- /* The cpu-specific constant value of PAMax; also used by hw/arm/virt. */
++      .access = PL3_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
- unsigned int arm_pamax(ARMCPU *cpu)
++    { .name = "CPUPPMCR2_EL3", .state = ARM_CP_STATE_AA64,
 +      .opc0 = 3, .opc1 = 6, .crn = 15, .crm = 2, .opc2 = 1,
 +      .access = PL3_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
 +    { .name = "CPUPPMCR3_EL3", .state = ARM_CP_STATE_AA64,
 +      .opc0 = 3, .opc1 = 6, .crn = 15, .crm = 2, .opc2 = 6,
 +      .access = PL3_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
 +};
 +
 +static void define_neoverse_v1_cp_reginfo(ARMCPU *cpu)
 +{
 +    /*
 +     * The Neoverse V1 has all of the Neoverse N1's IMPDEF
 +     * registers and a few more of its own.
 +     */
 +    define_arm_cp_regs(cpu, neoverse_n1_cp_reginfo);
 +    define_arm_cp_regs(cpu, neoverse_v1_cp_reginfo);
 +}
 +
  static void aarch64_neoverse_n1_initfn(Object *obj)
  {
--    unsigned int parange =
+     ARMCPU *cpu = ARM_CPU(obj);
--        FIELD_EX64(cpu->isar.id_aa64mmfr0, ID_AA64MMFR0, PARANGE);
+@@ -XXX,XX +XXX,XX @@ static void aarch64_neoverse_n1_initfn(Object *obj)
-+    if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64)) {
+     define_neoverse_n1_cp_reginfo(cpu);
 +        unsigned int parange =
 +            FIELD_EX64(cpu->isar.id_aa64mmfr0, ID_AA64MMFR0, PARANGE);
 -    /*
 -     * id_aa64mmfr0 is a read-only register so values outside of the
 -     * supported mappings can be considered an implementation error.
 -     */
 -    assert(parange < ARRAY_SIZE(pamax_map));
 -    return pamax_map[parange];
 +        /*
 +         * id_aa64mmfr0 is a read-only register so values outside of the
 +         * supported mappings can be considered an implementation error.
 +         */
 +        assert(parange < ARRAY_SIZE(pamax_map));
 +        return pamax_map[parange];
 +    }
 +    if (arm_feature(&cpu->env, ARM_FEATURE_LPAE)) {
 +        /* v7 with LPAE */
 +        return 40;
 +    }
 +    /* Anything else */
 +    return 32;
  }
++static void aarch64_neoverse_v1_initfn(Object *obj)
++{
++    ARMCPU *cpu = ARM_CPU(obj);
++
++    cpu->dtb_compatible = "arm,neoverse-v1";
++    set_feature(&cpu->env, ARM_FEATURE_V8);
++    set_feature(&cpu->env, ARM_FEATURE_NEON);
++    set_feature(&cpu->env, ARM_FEATURE_GENERIC_TIMER);
++    set_feature(&cpu->env, ARM_FEATURE_AARCH64);
++    set_feature(&cpu->env, ARM_FEATURE_CBAR_RO);
++    set_feature(&cpu->env, ARM_FEATURE_EL2);
++    set_feature(&cpu->env, ARM_FEATURE_EL3);
++    set_feature(&cpu->env, ARM_FEATURE_PMU);
++
++    /* Ordered by 3.2.4 AArch64 registers by functional group */
++    cpu->clidr = 0x82000023;
++    cpu->ctr = 0xb444c004; /* With DIC and IDC set */
++    cpu->dcz_blocksize = 4;
++    cpu->id_aa64afr0 = 0x00000000;
++    cpu->id_aa64afr1 = 0x00000000;
++    cpu->isar.id_aa64dfr0  = 0x000001f210305519ull;
++    cpu->isar.id_aa64dfr1 = 0x00000000;
++    cpu->isar.id_aa64isar0 = 0x1011111110212120ull; /* with FEAT_RNG */
++    cpu->isar.id_aa64isar1 = 0x0111000001211032ull;
++    cpu->isar.id_aa64mmfr0 = 0x0000000000101125ull;
++    cpu->isar.id_aa64mmfr1 = 0x0000000010212122ull;
++    cpu->isar.id_aa64mmfr2 = 0x0220011102101011ull;
++    cpu->isar.id_aa64pfr0  = 0x1101110120111112ull; /* GIC filled in later */
++    cpu->isar.id_aa64pfr1  = 0x0000000000000020ull;
++    cpu->id_afr0       = 0x00000000;
++    cpu->isar.id_dfr0  = 0x15011099;
++    cpu->isar.id_isar0 = 0x02101110;
++    cpu->isar.id_isar1 = 0x13112111;
++    cpu->isar.id_isar2 = 0x21232042;
++    cpu->isar.id_isar3 = 0x01112131;
++    cpu->isar.id_isar4 = 0x00010142;
++    cpu->isar.id_isar5 = 0x11011121;
++    cpu->isar.id_isar6 = 0x01100111;
++    cpu->isar.id_mmfr0 = 0x10201105;
++    cpu->isar.id_mmfr1 = 0x40000000;
++    cpu->isar.id_mmfr2 = 0x01260000;
++    cpu->isar.id_mmfr3 = 0x02122211;
++    cpu->isar.id_mmfr4 = 0x01021110;
++    cpu->isar.id_pfr0  = 0x21110131;
++    cpu->isar.id_pfr1  = 0x00010000; /* GIC filled in later */
++    cpu->isar.id_pfr2  = 0x00000011;
++    cpu->midr = 0x411FD402;          /* r1p2 */
++    cpu->revidr = 0;
++
++    /*
++     * The Neoverse-V1 r1p2 TRM lists 32-bit format CCSIDR_EL1 values,
++     * but also says it implements CCIDX, which means they should be
++     * 64-bit format. So here we use values which are based on the textual
++     * information in chapter 2 of the TRM (and on the fact that
++     * sets * associativity * linesize == cachesize).
++     *
++     * The 64-bit CCSIDR_EL1 format is:
++     *   [55:32] number of sets - 1
++     *   [23:3]  associativity - 1
++     *   [2:0]   log2(linesize) - 4
++     *           so 0 == 16 bytes, 1 == 32 bytes, 2 == 64 bytes, etc
++     *
++     * L1: 4-way set associative 64-byte line size, total size 64K,
++     * so sets is 256.
++     *
++     * L2: 8-way set associative, 64 byte line size, either 512K or 1MB.
++     * We pick 1MB, so this has 2048 sets.
++     *
++     * L3: No L3 (this matches the CLIDR_EL1 value).
++     */
++    cpu->ccsidr[0] = 0x000000ff0000001aull; /* 64KB L1 dcache */
++    cpu->ccsidr[1] = 0x000000ff0000001aull; /* 64KB L1 icache */
++    cpu->ccsidr[2] = 0x000007ff0000003aull; /* 1MB L2 cache */
++
++    /* From 3.2.115 SCTLR_EL3 */
++    cpu->reset_sctlr = 0x30c50838;
++
++    /* From 3.4.8 ICC_CTLR_EL3 and 3.4.23 ICH_VTR_EL2 */
++    cpu->gic_num_lrs = 4;
++    cpu->gic_vpribits = 5;
++    cpu->gic_vprebits = 5;
++    cpu->gic_pribits = 5;
++
++    /* From 3.5.1 AdvSIMD AArch64 register summary */
++    cpu->isar.mvfr0 = 0x10110222;
++    cpu->isar.mvfr1 = 0x13211111;
++    cpu->isar.mvfr2 = 0x00000043;
++
++    /* From 3.7.5 ID_AA64ZFR0_EL1 */
++    cpu->isar.id_aa64zfr0 = 0x0000100000100000;
++    cpu->sve_vq.supported = (1 << 0)  /* 128bit */
++                            | (1 << 1);  /* 256bit */
++
++    /* From 5.5.1 AArch64 PMU register summary */
++    cpu->isar.reset_pmcr_el0 = 0x41213000;
++
++    define_neoverse_v1_cp_reginfo(cpu);
++
++    aarch64_add_pauth_properties(obj);
++    aarch64_add_sve_properties(obj);
++}
++
  /*
+  * -cpu max: a CPU with as many features enabled as our emulation supports.
+  * The version of '-cpu max' for qemu-system-arm is defined in cpu32.c;
+@@ -XXX,XX +XXX,XX @@ static const ARMCPUInfo aarch64_cpus[] = {
+     { .name = "cortex-a76",         .initfn = aarch64_a76_initfn },
+     { .name = "a64fx",              .initfn = aarch64_a64fx_initfn },
+     { .name = "neoverse-n1",        .initfn = aarch64_neoverse_n1_initfn },
++    { .name = "neoverse-v1",        .initfn = aarch64_neoverse_v1_initfn },
+ };
+ static void aarch64_cpu_register_types(void)
 --
-.25.1
+.34.1
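
The CCSIDR_EL1 constants in the neoverse-v1 initfn above can be cross-checked against the 64-bit layout that its comment describes. The following is a minimal sketch, not part of the patch: make_ccsidr64() is a hypothetical helper name, and it assumes only the field positions and the "sets * associativity * linesize == cachesize" relation stated in that comment.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): pack a cache geometry
 * into the 64-bit CCSIDR_EL1 layout described in the initfn comment:
 *   [55:32] number of sets - 1
 *   [23:3]  associativity - 1
 *   [2:0]   log2(linesize) - 4
 */
static uint64_t make_ccsidr64(uint32_t assoc, uint32_t linesize,
                              uint32_t cachesize)
{
    uint32_t sets, log2_line = 0;

    /* Line size must be a power of two and at least 16 bytes. */
    assert(linesize >= 16 && (linesize & (linesize - 1)) == 0);
    assert(assoc > 0 && cachesize % (assoc * linesize) == 0);

    /* sets * associativity * linesize == cachesize */
    sets = cachesize / (assoc * linesize);

    /* Compute log2(linesize) by counting up to the matching power of two. */
    while ((1u << log2_line) < linesize) {
        log2_line++;
    }

    return ((uint64_t)(sets - 1) << 32)
         | ((uint64_t)(assoc - 1) << 3)
         | (uint64_t)(log2_line - 4);
}
```

Feeding in the L1 geometry (4-way, 64-byte lines, 64KB) and the L2 geometry (8-way, 64-byte lines, 1MB) reproduces the ccsidr[] values used in the patch.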

-[PULL 14/25] target/arm: Move error for sve%d property to arm_cpu_sve_finalize
+[PULL 14/14] target/arm: Avoid over-length shift in arm_cpu_sve_finalize() error case
-From: Richard Henderson <richard.henderson@linaro.org>
+If you build QEMU with the clang sanitizer enabled, you can see it
 fire when running the arm-cpu-features test:
-Keep all of the error messages together.  This does mean that
+$ QTEST_QEMU_BINARY=./build/arm-clang/qemu-system-aarch64 ./build/arm-clang/tests/qtest/arm-cpu-features
-when setting many sve length properties we'll only generate
+[...]
-one error, but we only really need one.
+../../target/arm/cpu64.c:125:19: runtime error: shift exponent 64 is too large for 64-bit type 'unsigned long long'
 [...]
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+This happens because the user can specify some incorrect SVE
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+properties that result in our calculating a max_vq of 0.  We catch
-Message-id: 20220620175235.60881-12-richard.henderson@linaro.org
+this and error out, but before we do that we calculate
  vq_mask = MAKE_64BIT_MASK(0, max_vq);
 and the MAKE_64BIT_MASK() call is only valid for lengths that are
 greater than zero, so we hit the undefined behaviour.
 Change the logic so that if max_vq is 0 we specifically set vq_mask
 to 0 without going via MAKE_64BIT_MASK().  This lets us drop the
 max_vq check from the error-exit logic, because if max_vq is 0 then
 vq_map must now be 0.
 The UB only happens in the case where the user passed us an incorrect
 set of SVE properties, so it's not a big problem in practice.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20230704154332.3014896-1-peter.maydell@linaro.org
 ---
- target/arm/cpu64.c | 15 +++++++--------
+ target/arm/cpu64.c | 4 ++--
-file changed, 7 insertions(+), 8 deletions(-)
+file changed, 2 insertions(+), 2 deletions(-)
 diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu64.c
 +++ b/target/arm/cpu64.c
 @@ -XXX,XX +XXX,XX @@ void arm_cpu_sve_finalize(ARMCPU *cpu, Error **errp)
-                                   "using only sve<N> properties.\n");
+         vq = ctz32(tmp) + 1;
-             } else {
-                 error_setg(errp, "cannot enable sve%d", vq * 128);
+         max_vq = vq <= ARM_MAX_VQ ? vq - 1 : ARM_MAX_VQ;
--                error_append_hint(errp, "This CPU does not support "
+-        vq_mask = MAKE_64BIT_MASK(0, max_vq);
--                                  "the vector length %d-bits.\n", vq * 128);
++        vq_mask = max_vq > 0 ? MAKE_64BIT_MASK(0, max_vq) : 0;
-+                if (vq_supported) {
+         vq_map = vq_supported & ~vq_init & vq_mask;
-+                    error_append_hint(errp, "This CPU does not support "
-+                                      "the vector length %d-bits.\n", vq * 128);
+-        if (max_vq == 0 || vq_map == 0) {
-+                } else {
++        if (vq_map == 0) {
-+                    error_append_hint(errp, "SVE not supported by KVM "
+             error_setg(errp, "cannot disable sve%d", vq * 128);
-+                                      "on this host\n");
+             error_append_hint(errp, "Disabling sve%d results in all "
-+                }
+                               "vector lengths being disabled.\n",
              }
              return;
          } else {
@@ -XXX,XX +XXX,XX @@ static void cpu_arm_set_sve_vq(Object *obj, Visitor *v, const char *name,
          return;
      }
 -    if (value && kvm_enabled() && !kvm_arm_sve_supported()) {
 -        error_setg(errp, "cannot enable %s", name);
 -        error_append_hint(errp, "SVE not supported by KVM on this host\n");
 -        return;
 -    }
 -
      cpu->sve_vq_map = deposit32(cpu->sve_vq_map, vq - 1, 1, value);
      cpu->sve_vq_init |= 1 << (vq - 1);
  }
 --
-.25.1
+.34.1
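
The over-length shift that the v2 commit message describes can be illustrated in isolation. This is a sketch under assumptions: the macro body below is believed to mirror QEMU's MAKE_64BIT_MASK() but is reproduced here only for illustration, and vq_mask_for() is a hypothetical wrapper name for the guarded expression the patch introduces.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed to mirror QEMU's MAKE_64BIT_MASK(); reproduced for illustration
 * only. With length == 0 it would shift by 64, which is undefined
 * behaviour for a 64-bit type.
 */
#define MAKE_64BIT_MASK(shift, length) \
    (((~0ULL) >> (64 - (length))) << (shift))

/* Guarded form, following the patch: a zero-length mask is simply 0. */
static uint64_t vq_mask_for(uint32_t max_vq)
{
    assert(max_vq <= 64);
    return max_vq > 0 ? MAKE_64BIT_MASK(0, max_vq) : 0;
}
```

Because the mask is 0 whenever max_vq is 0, anything ANDed with it is also 0, which is why the separate "max_vq == 0" test can be dropped from the error path.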

-[PULL 16/25] target/arm: Generalize cpu_arm_{get,set}_vq
+Deleted patch
-From: Richard Henderson <richard.henderson@linaro.org>
-Rename from cpu_arm_{get,set}_sve_vq, and take the
-ARMVQMap as the opaque parameter.
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20220620175235.60881-14-richard.henderson@linaro.org
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
----
- target/arm/cpu64.c | 29 +++++++++++++++--------------
-file changed, 15 insertions(+), 14 deletions(-)
-diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu64.c
-+++ b/target/arm/cpu64.c
-@@ -XXX,XX +XXX,XX @@ static void cpu_max_set_sve_max_vq(Object *obj, Visitor *v, const char *name,
- }
- /*
-- * Note that cpu_arm_get/set_sve_vq cannot use the simpler
-- * object_property_add_bool interface because they make use
-- * of the contents of "name" to determine which bit on which
-- * to operate.
-+ * Note that cpu_arm_{get,set}_vq cannot use the simpler
-+ * object_property_add_bool interface because they make use of the
-+ * contents of "name" to determine which bit on which to operate.
-  */
--static void cpu_arm_get_sve_vq(Object *obj, Visitor *v, const char *name,
--                               void *opaque, Error **errp)
-+static void cpu_arm_get_vq(Object *obj, Visitor *v, const char *name,
-+                           void *opaque, Error **errp)
- {
-     ARMCPU *cpu = ARM_CPU(obj);
-+    ARMVQMap *vq_map = opaque;
-     uint32_t vq = atoi(&name[3]) / 128;
-     bool value;
-@@ -XXX,XX +XXX,XX @@ static void cpu_arm_get_sve_vq(Object *obj, Visitor *v, const char *name,
-     if (!cpu_isar_feature(aa64_sve, cpu)) {
-         value = false;
-     } else {
--        value = extract32(cpu->sve_vq.map, vq - 1, 1);
-+        value = extract32(vq_map->map, vq - 1, 1);
-     }
-     visit_type_bool(v, name, &value, errp);
- }
--static void cpu_arm_set_sve_vq(Object *obj, Visitor *v, const char *name,
--                               void *opaque, Error **errp)
-+static void cpu_arm_set_vq(Object *obj, Visitor *v, const char *name,
-+                           void *opaque, Error **errp)
- {
--    ARMCPU *cpu = ARM_CPU(obj);
-+    ARMVQMap *vq_map = opaque;
-     uint32_t vq = atoi(&name[3]) / 128;
-     bool value;
-@@ -XXX,XX +XXX,XX @@ static void cpu_arm_set_sve_vq(Object *obj, Visitor *v, const char *name,
-         return;
-     }
--    cpu->sve_vq.map = deposit32(cpu->sve_vq.map, vq - 1, 1, value);
--    cpu->sve_vq.init |= 1 << (vq - 1);
-+    vq_map->map = deposit32(vq_map->map, vq - 1, 1, value);
-+    vq_map->init |= 1 << (vq - 1);
- }
- static bool cpu_arm_get_sve(Object *obj, Error **errp)
-@@ -XXX,XX +XXX,XX @@ static void cpu_arm_get_sve_default_vec_len(Object *obj, Visitor *v,
- void aarch64_add_sve_properties(Object *obj)
- {
-+    ARMCPU *cpu = ARM_CPU(obj);
-     uint32_t vq;
-     object_property_add_bool(obj, "sve", cpu_arm_get_sve, cpu_arm_set_sve);
-@@ -XXX,XX +XXX,XX @@ void aarch64_add_sve_properties(Object *obj)
-     for (vq = 1; vq <= ARM_MAX_VQ; ++vq) {
-         char name[8];
-         sprintf(name, "sve%d", vq * 128);
--        object_property_add(obj, name, "bool", cpu_arm_get_sve_vq,
--                            cpu_arm_set_sve_vq, NULL, NULL);
-+        object_property_add(obj, name, "bool", cpu_arm_get_vq,
-+                            cpu_arm_set_vq, NULL, &cpu->sve_vq);
-     }
- #ifdef CONFIG_USER_ONLY
---
-.25.1
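
The accessors in the dropped patch above derive the bit to operate on from the property name itself, which is why the comment says the simpler bool interface cannot be used. A minimal standalone sketch of that name-to-bit step follows. VQMapSketch and set_vq_by_name() are hypothetical names, the upper bound of 16 is an assumption standing in for ARM_MAX_VQ, and plain bit operations replace deposit32().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the map/init pair touched by the setter. */
typedef struct {
    uint32_t map;   /* bit (vq - 1) set => length vq * 128 is enabled */
    uint32_t init;  /* bit (vq - 1) set => the user set this property */
} VQMapSketch;

static void set_vq_by_name(VQMapSketch *m, const char *name, int value)
{
    /* "sve256" -> skip the 3-character prefix -> 256 / 128 == 2 */
    uint32_t vq = (uint32_t)atoi(&name[3]) / 128;
    uint32_t bit;

    /* 16 is assumed here as the maximum vq (ARM_MAX_VQ in the source). */
    assert(vq >= 1 && vq <= 16);
    bit = 1u << (vq - 1);

    m->map = value ? (m->map | bit) : (m->map & ~bit);
    m->init |= bit;
}
```

Recording the touched bit in init separately from map lets later validation tell "explicitly disabled" apart from "never mentioned".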

-[PULL 19/25] target/arm: Unexport aarch64_add_*_properties
+Deleted patch
-From: Richard Henderson <richard.henderson@linaro.org>
-These functions are not used outside cpu64.c,
-so make them static.
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20220620175235.60881-17-richard.henderson@linaro.org
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
----
- target/arm/cpu.h   | 3 ---
- target/arm/cpu64.c | 4 ++--
-files changed, 2 insertions(+), 5 deletions(-)
-diff --git a/target/arm/cpu.h b/target/arm/cpu.h
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.h
-+++ b/target/arm/cpu.h
-@@ -XXX,XX +XXX,XX @@ int aarch64_cpu_gdb_write_register(CPUState *cpu, uint8_t *buf, int reg);
- void aarch64_sve_narrow_vq(CPUARMState *env, unsigned vq);
- void aarch64_sve_change_el(CPUARMState *env, int old_el,
-                            int new_el, bool el0_a64);
--void aarch64_add_sve_properties(Object *obj);
--void aarch64_add_pauth_properties(Object *obj);
- void arm_reset_sve_state(CPUARMState *env);
- /*
-@@ -XXX,XX +XXX,XX @@ static inline void aarch64_sve_narrow_vq(CPUARMState *env, unsigned vq) { }
- static inline void aarch64_sve_change_el(CPUARMState *env, int o,
-                                          int n, bool a)
- { }
--static inline void aarch64_add_sve_properties(Object *obj) { }
- #endif
- void aarch64_sync_32_to_64(CPUARMState *env);
-diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu64.c
-+++ b/target/arm/cpu64.c
-@@ -XXX,XX +XXX,XX @@ static void cpu_arm_get_default_vec_len(Object *obj, Visitor *v,
- }
- #endif
--void aarch64_add_sve_properties(Object *obj)
-+static void aarch64_add_sve_properties(Object *obj)
- {
-     ARMCPU *cpu = ARM_CPU(obj);
-     uint32_t vq;
-@@ -XXX,XX +XXX,XX @@ static Property arm_cpu_pauth_property =
- static Property arm_cpu_pauth_impdef_property =
-     DEFINE_PROP_BOOL("pauth-impdef", ARMCPU, prop_pauth_impdef, false);
--void aarch64_add_pauth_properties(Object *obj)
-+static void aarch64_add_pauth_properties(Object *obj)
- {
-     ARMCPU *cpu = ARM_CPU(obj);
---
-.25.1

-[PULL 21/25] target/arm: Introduce sve_vqm1_for_el_sm
+Deleted patch
-From: Richard Henderson <richard.henderson@linaro.org>
-When Streaming SVE mode is enabled, the size is taken from
-SMCR_ELx instead of ZCR_ELx.  The format is shared, but the
-set of vector lengths is not.  Further, Streaming SVE does
-not require any particular length to be supported.
-Adjust sve_vqm1_for_el to pass the current value of PSTATE.SM
-to the new function.
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20220620175235.60881-19-richard.henderson@linaro.org
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
----
- target/arm/cpu.h    |  9 +++++++--
- target/arm/helper.c | 32 +++++++++++++++++++++++++-------
-files changed, 32 insertions(+), 9 deletions(-)
-diff --git a/target/arm/cpu.h b/target/arm/cpu.h
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.h
-+++ b/target/arm/cpu.h
-@@ -XXX,XX +XXX,XX @@ int sve_exception_el(CPUARMState *env, int cur_el);
- int sme_exception_el(CPUARMState *env, int cur_el);
- /**
-- * sve_vqm1_for_el:
-+ * sve_vqm1_for_el_sm:
-  * @env: CPUARMState
-  * @el: exception level
-+ * @sm: streaming mode
-  *
-- * Compute the current SVE vector length for @el, in units of
-+ * Compute the current vector length for @el & @sm, in units of
-  * Quadwords Minus 1 -- the same scale used for ZCR_ELx.LEN.
-+ * If @sm, compute for SVL, otherwise NVL.
-  */
-+uint32_t sve_vqm1_for_el_sm(CPUARMState *env, int el, bool sm);
-+
-+/* Likewise, but using @sm = PSTATE.SM. */
- uint32_t sve_vqm1_for_el(CPUARMState *env, int el);
- static inline bool is_a64(CPUARMState *env)
-diff --git a/target/arm/helper.c b/target/arm/helper.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
-+++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ int sme_exception_el(CPUARMState *env, int el)
- /*
-  * Given that SVE is enabled, return the vector length for EL.
-  */
--uint32_t sve_vqm1_for_el(CPUARMState *env, int el)
-+uint32_t sve_vqm1_for_el_sm(CPUARMState *env, int el, bool sm)
- {
-     ARMCPU *cpu = env_archcpu(env);
--    uint32_t len = cpu->sve_max_vq - 1;
-+    uint64_t *cr = env->vfp.zcr_el;
-+    uint32_t map = cpu->sve_vq.map;
-+    uint32_t len = ARM_MAX_VQ - 1;
-+
-+    if (sm) {
-+        cr = env->vfp.smcr_el;
-+        map = cpu->sme_vq.map;
-+    }
-     if (el <= 1 && !el_is_in_host(env, el)) {
--        len = MIN(len, 0xf & (uint32_t)env->vfp.zcr_el[1]);
-+        len = MIN(len, 0xf & (uint32_t)cr[1]);
-     }
-     if (el <= 2 && arm_feature(env, ARM_FEATURE_EL2)) {
--        len = MIN(len, 0xf & (uint32_t)env->vfp.zcr_el[2]);
-+        len = MIN(len, 0xf & (uint32_t)cr[2]);
-     }
-     if (arm_feature(env, ARM_FEATURE_EL3)) {
--        len = MIN(len, 0xf & (uint32_t)env->vfp.zcr_el[3]);
-+        len = MIN(len, 0xf & (uint32_t)cr[3]);
-     }
--    len = 31 - clz32(cpu->sve_vq.map & MAKE_64BIT_MASK(0, len + 1));
--    return len;
-+    map &= MAKE_64BIT_MASK(0, len + 1);
-+    if (map != 0) {
-+        return 31 - clz32(map);
-+    }
-+
-+    /* Bit 0 is always set for Normal SVE -- not so for Streaming SVE. */
-+    assert(sm);
-+    return ctz32(cpu->sme_vq.map);
-+}
-+
-+uint32_t sve_vqm1_for_el(CPUARMState *env, int el)
-+{
-+    return sve_vqm1_for_el_sm(env, el, FIELD_EX64(env->svcr, SVCR, SM));
- }
- static void zcr_write(CPUARMState *env, const ARMCPRegInfo *ri,
---
-.25.1
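
The last step of sve_vqm1_for_el_sm() in the dropped patch above picks a vector length from a bitmap of supported lengths. The sketch below shows that selection on its own, with loops in place of clz32()/ctz32(). pick_vqm1() is a hypothetical name, and the sketch omits the assert(sm) that restricts the fallback path to streaming mode in the original.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Sketch of the final selection step: given a bitmap of supported
 * lengths (bit n => vq == n + 1) and a limit "len" in vq-minus-1 units,
 * return the largest supported vq-minus-1 that does not exceed the
 * limit, or the smallest supported one if none fits.
 */
static uint32_t pick_vqm1(uint32_t map, uint32_t len)
{
    uint32_t n;

    assert(map != 0 && len < 16);

    /* Scan downwards from the limit: like 31 - clz32(masked map). */
    for (n = len + 1; n-- > 0; ) {
        if (map & (1u << n)) {
            return n;
        }
    }
    /* Nothing at or below the limit: like ctz32(map). */
    for (n = 0; !(map & (1u << n)); n++) {
        continue;
    }
    return n;
}
```

The fallback matters only when bit 0 may be clear, which the patch comment says can happen for Streaming SVE but not for Normal SVE.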

target-arm queue, mostly SME preliminaries.

In the unlikely event we don't land the rest of SME before freeze
for 7.1 we can revert the docs/property changes included here.

-- PMM

The following changes since commit 097ccbbbaf2681df1e65542e5b7d2b2d0c66e2bc:

Merge tag 'qemu-sparc-20220626' of https://github.com/mcayland/qemu into staging (2022-06-27 05:21:05 +0530)

are available in the Git repository at:

https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20220627

for you to fetch changes up to 59e1b8a22ea9f947d038ccac784de1020f266e14:

target/arm: Check V7VE as well as LPAE in arm_pamax (2022-06-27 11:18:17 +0100)

----------------------------------------------------------------
target-arm queue:
 * sphinx: change default language to 'en'
 * Diagnose attempts to emulate EL3 in hvf as well as kvm
 * More SME groundwork patches
 * virt: Fix calculation of physical address space size
   for v7VE CPUs (eg cortex-a15)

----------------------------------------------------------------
Alexander Graf (2):
      accel: Introduce current_accel_name()
      target/arm: Catch invalid kvm state also for hvf

Martin Liška (1):
      sphinx: change default language to 'en'

Richard Henderson (22):
      target/arm: Implement TPIDR2_EL0
      target/arm: Add SMEEXC_EL to TB flags
      target/arm: Add syn_smetrap
      target/arm: Add ARM_CP_SME
      target/arm: Add SVCR
      target/arm: Add SMCR_ELx
      target/arm: Add SMIDR_EL1, SMPRI_EL1, SMPRIMAP_EL2
      target/arm: Add PSTATE.{SM,ZA} to TB flags
      target/arm: Add the SME ZA storage to CPUARMState
      target/arm: Implement SMSTART, SMSTOP
      target/arm: Move error for sve%d property to arm_cpu_sve_finalize
      target/arm: Create ARMVQMap
      target/arm: Generalize cpu_arm_{get,set}_vq
      target/arm: Generalize cpu_arm_{get, set}_default_vec_len
      target/arm: Move arm_cpu_*_finalize to internals.h
      target/arm: Unexport aarch64_add_*_properties
      target/arm: Add cpu properties for SME
      target/arm: Introduce sve_vqm1_for_el_sm
      target/arm: Add SVL to TB flags
      target/arm: Move pred_{full, gvec}_reg_{offset, size} to translate-a64.h
      target/arm: Extend arm_pamax to more than aarch64
      target/arm: Check V7VE as well as LPAE in arm_pamax

From: Martin Liška <mliska@suse.cz>

Fixes the following Sphinx warning (treated as error) starting
with the 5.0 release:

Warning, treated as error:
Invalid configuration value found: 'language = None'. Update your configuration to a valid langauge code. Falling back to 'en' (English).

Signed-off-by: Martin Liska <mliska@suse.cz>
Message-id: e91e51ee-48ac-437e-6467-98b56ee40042@suse.cz
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 docs/conf.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/conf.py b/docs/conf.py
index XXXXXXX..XXXXXXX 100644
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -XXX,XX +XXX,XX @@
 #
 # This is also used if you do content translation via gettext catalogs.
 # Usually you set "language" from the command line for these cases.
-language = None
+language = 'en'
 
 # List of patterns, relative to source directory, that match files and
 # directories to ignore when looking for source files.
-- 
2.25.1

From: Alexander Graf <agraf@csgraf.de>

Going forward, we will need to fetch the name of the current accelerator
more often for flexible error messages. Let's create a helper that gives
it to us without casting in the target code.

Signed-off-by: Alexander Graf <agraf@csgraf.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220620192242.70573-1-agraf@csgraf.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/qemu/accel.h | 1 +
 accel/accel-common.c | 8 ++++++++
 softmmu/vl.c         | 3 +--
 3 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/include/qemu/accel.h b/include/qemu/accel.h
index XXXXXXX..XXXXXXX 100644
--- a/include/qemu/accel.h
+++ b/include/qemu/accel.h
@@ -XXX,XX +XXX,XX @@ typedef struct AccelClass {
 
 AccelClass *accel_find(const char *opt_name);
 AccelState *current_accel(void);
+const char *current_accel_name(void);
 
 void accel_init_interfaces(AccelClass *ac);
 
diff --git a/accel/accel-common.c b/accel/accel-common.c
index XXXXXXX..XXXXXXX 100644
--- a/accel/accel-common.c
+++ b/accel/accel-common.c
@@ -XXX,XX +XXX,XX @@ AccelClass *accel_find(const char *opt_name)
     return ac;
 }
 
+/* Return the name of the current accelerator */
+const char *current_accel_name(void)
+{
+    AccelClass *ac = ACCEL_GET_CLASS(current_accel());
+
+    return ac->name;
+}
+
 static void accel_init_cpu_int_aux(ObjectClass *klass, void *opaque)
 {
     CPUClass *cc = CPU_CLASS(klass);
diff --git a/softmmu/vl.c b/softmmu/vl.c
index XXXXXXX..XXXXXXX 100644
--- a/softmmu/vl.c
+++ b/softmmu/vl.c
@@ -XXX,XX +XXX,XX @@ static void configure_accelerators(const char *progname)
     }
 
     if (init_failed && !qtest_chrdev) {
-        AccelClass *ac = ACCEL_GET_CLASS(current_accel());
-        error_report("falling back to %s", ac->name);
+        error_report("falling back to %s", current_accel_name());
     }
 
     if (icount_enabled() && !tcg_enabled()) {
-- 
2.25.1

From: Alexander Graf <agraf@csgraf.de>

Some features such as running in EL3 or running M profile code are
incompatible with virtualization as QEMU implements it today. To prevent
users from picking invalid configurations on other virt solutions like
Hvf, let's run the same checks there too.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1073
Signed-off-by: Alexander Graf <agraf@csgraf.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220620192242.70573-2-agraf@csgraf.de
[PMM: Allow qtest accelerator too; tweak comment]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@
 #include "hw/boards.h"
 #endif
 #include "sysemu/tcg.h"
+#include "sysemu/qtest.h"
 #include "sysemu/hw_accel.h"
 #include "kvm_arm.h"
 #include "disas/capstone.h"
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
         }
     }
 
-    if (kvm_enabled()) {
+    if (!tcg_enabled() && !qtest_enabled()) {
         /*
+         * We assume that no accelerator except TCG (and the "not really an
+         * accelerator" qtest) can handle these features, because Arm hardware
+         * virtualization can't virtualize them.
+         *
          * Catch all the cases which might cause us to create more than one
          * address space for the CPU (otherwise we will assert() later in
          * cpu_address_space_init()).
          */
         if (arm_feature(env, ARM_FEATURE_M)) {
             error_setg(errp,
-                       "Cannot enable KVM when using an M-profile guest CPU");
+                       "Cannot enable %s when using an M-profile guest CPU",
+                       current_accel_name());
             return;
         }
         if (cpu->has_el3) {
             error_setg(errp,
-                       "Cannot enable KVM when guest CPU has EL3 enabled");
+                       "Cannot enable %s when guest CPU has EL3 enabled",
+                       current_accel_name());
             return;
         }
         if (cpu->tag_memory) {
             error_setg(errp,
-                       "Cannot enable KVM when guest CPUs has MTE enabled");
+                       "Cannot enable %s when guest CPUs has MTE enabled",
+                       current_accel_name());
             return;
         }
     }
-- 
2.25.1

From: Richard Henderson <richard.henderson@linaro.org>

This register is part of SME, but isn't closely related to the
rest of the extension.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220620175235.60881-2-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h    |  1 +
 target/arm/helper.c | 32 ++++++++++++++++++++++++++++++++
 2 files changed, 33 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ typedef struct CPUArchState {
             };
             uint64_t tpidr_el[4];
         };
+        uint64_t tpidr2_el0;
         /* The secure banks of these registers don't map anywhere */
         uint64_t tpidrurw_s;
         uint64_t tpidrprw_s;
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo zcr_reginfo[] = {
       .writefn = zcr_write, .raw_writefn = raw_write },
 };
 
+#ifdef TARGET_AARCH64
+static CPAccessResult access_tpidr2(CPUARMState *env, const ARMCPRegInfo *ri,
+                                    bool isread)
+{
+    int el = arm_current_el(env);
+
+    if (el == 0) {
+        uint64_t sctlr = arm_sctlr(env, el);
+        if (!(sctlr & SCTLR_EnTP2)) {
+            return CP_ACCESS_TRAP;
+        }
+    }
+    /* TODO: FEAT_FGT */
+    if (el < 3
+        && arm_feature(env, ARM_FEATURE_EL3)
+        && !(env->cp15.scr_el3 & SCR_ENTP2)) {
+        return CP_ACCESS_TRAP_EL3;
+    }
+    return CP_ACCESS_OK;
+}
+
+static const ARMCPRegInfo sme_reginfo[] = {
+    { .name = "TPIDR2_EL0", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 3, .crn = 13, .crm = 0, .opc2 = 5,
+      .access = PL0_RW, .accessfn = access_tpidr2,
+      .fieldoffset = offsetof(CPUARMState, cp15.tpidr2_el0) },
+};
+#endif /* TARGET_AARCH64 */
+
 void hw_watchpoint_update(ARMCPU *cpu, int n)
 {
     CPUARMState *env = &cpu->env;
@@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu)
     }
 
 #ifdef TARGET_AARCH64
+    if (cpu_isar_feature(aa64_sme, cpu)) {
+        define_arm_cp_regs(cpu, sme_reginfo);
+    }
     if (cpu_isar_feature(aa64_pauth, cpu)) {
         define_arm_cp_regs(cpu, pauth_reginfo);
     }
-- 
2.25.1

From: Richard Henderson <richard.henderson@linaro.org>

This is CheckSMEAccess, which is the basis for a set of
related tests for various SME cpregs and instructions.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220620175235.60881-3-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h           |  2 ++
 target/arm/translate.h     |  1 +
 target/arm/helper.c        | 52 ++++++++++++++++++++++++++++++++++++++
 target/arm/translate-a64.c |  1 +
 4 files changed, 56 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ void aarch64_sync_64_to_32(CPUARMState *env);
 
 int fp_exception_el(CPUARMState *env, int cur_el);
 int sve_exception_el(CPUARMState *env, int cur_el);
+int sme_exception_el(CPUARMState *env, int cur_el);
 
 /**
  * sve_vqm1_for_el:
@@ -XXX,XX +XXX,XX @@ FIELD(TBFLAG_A64, ATA, 15, 1)
 FIELD(TBFLAG_A64, TCMA, 16, 2)
 FIELD(TBFLAG_A64, MTE_ACTIVE, 18, 1)
 FIELD(TBFLAG_A64, MTE0_ACTIVE, 19, 1)
+FIELD(TBFLAG_A64, SMEEXC_EL, 20, 2)
 
 /*
  * Helpers for using the above.
diff --git a/target/arm/translate.h b/target/arm/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ typedef struct DisasContext {
     bool ns;        /* Use non-secure CPREG bank on access */
     int fp_excp_el; /* FP exception EL or 0 if enabled */
     int sve_excp_el; /* SVE exception EL or 0 if enabled */
+    int sme_excp_el; /* SME exception EL or 0 if enabled */
     int vl;          /* current vector length in bytes */
     bool vfp_enabled; /* FP enabled via FPSCR.EN */
     int vec_len;
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ int sve_exception_el(CPUARMState *env, int el)
     return 0;
 }
 
+/*
+ * Return the exception level to which exceptions should be taken for SME.
+ * C.f. the ARM pseudocode function CheckSMEAccess.
+ */
+int sme_exception_el(CPUARMState *env, int el)
+{
+#ifndef CONFIG_USER_ONLY
+    if (el <= 1 && !el_is_in_host(env, el)) {
+        switch (FIELD_EX64(env->cp15.cpacr_el1, CPACR_EL1, SMEN)) {
+        case 1:
+            if (el != 0) {
+                break;
+            }
+            /* fall through */
+        case 0:
+        case 2:
+            return 1;
+        }
+    }
+
+    if (el <= 2 && arm_is_el2_enabled(env)) {
+        /* CPTR_EL2 changes format with HCR_EL2.E2H (regardless of TGE). */
+        if (env->cp15.hcr_el2 & HCR_E2H) {
+            switch (FIELD_EX64(env->cp15.cptr_el[2], CPTR_EL2, SMEN)) {
+            case 1:
+                if (el != 0 || !(env->cp15.hcr_el2 & HCR_TGE)) {
+                    break;
+                }
+                /* fall through */
+            case 0:
+            case 2:
+                return 2;
+            }
+        } else {
+            if (FIELD_EX64(env->cp15.cptr_el[2], CPTR_EL2, TSM)) {
+                return 2;
+            }
+        }
+    }
+
+    /* CPTR_EL3.  Since ESM is negative we must check for EL3.  */
+    if (arm_feature(env, ARM_FEATURE_EL3)
+        && !FIELD_EX64(env->cp15.cptr_el[3], CPTR_EL3, ESM)) {
+        return 3;
+    }
+#endif
+    return 0;
+}
+
 /*
  * Given that SVE is enabled, return the vector length for EL.
  */
@@ -XXX,XX +XXX,XX @@ static CPUARMTBFlags rebuild_hflags_a64(CPUARMState *env, int el, int fp_el,
         }
         DP_TBFLAG_A64(flags, SVEEXC_EL, sve_el);
     }
+    if (cpu_isar_feature(aa64_sme, env_archcpu(env))) {
+        DP_TBFLAG_A64(flags, SMEEXC_EL, sme_exception_el(env, el));
+    }
 
     sctlr = regime_sctlr(env, stage1);
 
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_init_disas_context(DisasContextBase *dcbase,
     dc->align_mem = EX_TBFLAG_ANY(tb_flags, ALIGN_MEM);
     dc->pstate_il = EX_TBFLAG_ANY(tb_flags, PSTATE__IL);
     dc->sve_excp_el = EX_TBFLAG_A64(tb_flags, SVEEXC_EL);
+    dc->sme_excp_el = EX_TBFLAG_A64(tb_flags, SMEEXC_EL);
     dc->vl = (EX_TBFLAG_A64(tb_flags, VL) + 1) * 16;
     dc->pauth_active = EX_TBFLAG_A64(tb_flags, PAUTH_ACTIVE);
     dc->bt = EX_TBFLAG_A64(tb_flags, BT);
-- 
2.25.1

From: Richard Henderson <richard.henderson@linaro.org>

This will be used for raising various traps for SME.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220620175235.60881-4-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
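Note for readers: the syndrome layout used by syn_smetrap() can be checked in isolation. The sketch below copies the constants from this patch and assumes ARM_EL_IL is bit 25 (1 << ARM_EL_IL_SHIFT); it is an illustration only, not the code being merged.

```c
#include <stdbool.h>
#include <stdint.h>

/* Constants copied from the patch; ARM_EL_IL as bit 25 is an assumption. */
#define ARM_EL_EC_SHIFT 26
#define ARM_EL_IL_SHIFT 25
#define ARM_EL_IL       (1u << ARM_EL_IL_SHIFT)
#define EC_SMETRAP      0x1du

typedef enum {
    SME_ET_AccessTrap,
    SME_ET_Streaming,
    SME_ET_NotStreaming,
    SME_ET_InactiveZA,
} SMEExceptionType;

/* EC in bits [31:26], IL in bit 25, SME exception type in the low bits. */
static inline uint32_t syn_smetrap(SMEExceptionType etype, bool is_16bit)
{
    return (EC_SMETRAP << ARM_EL_EC_SHIFT)
        | (is_16bit ? 0 : ARM_EL_IL) | (uint32_t)etype;
}
```

With these values a 32-bit instruction trapping for SME_ET_AccessTrap yields 0x76000000.
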
 target/arm/syndrome.h | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/target/arm/syndrome.h b/target/arm/syndrome.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/syndrome.h
+++ b/target/arm/syndrome.h
@@ -XXX,XX +XXX,XX @@ enum arm_exception_class {
     EC_AA64_SMC               = 0x17,
     EC_SYSTEMREGISTERTRAP     = 0x18,
     EC_SVEACCESSTRAP          = 0x19,
+    EC_SMETRAP                = 0x1d,
     EC_INSNABORT              = 0x20,
     EC_INSNABORT_SAME_EL      = 0x21,
     EC_PCALIGNMENT            = 0x22,
@@ -XXX,XX +XXX,XX @@ enum arm_exception_class {
     EC_AA64_BKPT              = 0x3c,
 };
 
+typedef enum {
+    SME_ET_AccessTrap,
+    SME_ET_Streaming,
+    SME_ET_NotStreaming,
+    SME_ET_InactiveZA,
+} SMEExceptionType;
+
 #define ARM_EL_EC_SHIFT 26
 #define ARM_EL_IL_SHIFT 25
 #define ARM_EL_ISV_SHIFT 24
@@ -XXX,XX +XXX,XX @@ static inline uint32_t syn_sve_access_trap(void)
     return EC_SVEACCESSTRAP << ARM_EL_EC_SHIFT;
 }
 
+static inline uint32_t syn_smetrap(SMEExceptionType etype, bool is_16bit)
+{
+    return (EC_SMETRAP << ARM_EL_EC_SHIFT)
+        | (is_16bit ? 0 : ARM_EL_IL) | etype;
+}
+
 static inline uint32_t syn_pactrap(void)
 {
     return EC_PACTRAP << ARM_EL_EC_SHIFT;
-- 
2.25.1

From: Richard Henderson <richard.henderson@linaro.org>

This will be used for controlling access to SME cpregs.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220620175235.60881-5-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
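Note for readers: sme_access_check() only tests the precomputed sme_excp_el, which sme_exception_el() derives partly from CPACR_EL1.SMEN. A minimal sketch of that one decode follows, assuming a hypothetical helper name smen_traps() and ignoring the el_is_in_host(), EL2 and EL3 checks.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not part of the patch: does a 2-bit SMEN value
 * trap an SME access made from 'el' (0 or 1)?
 */
static bool smen_traps(unsigned smen, int el)
{
    switch (smen & 3) {
    case 1:
        return el == 0;   /* only EL0 accesses trap */
    case 0:
    case 2:
        return true;      /* EL0 and EL1 accesses trap */
    default:
        return false;     /* 3: no trap from this control */
    }
}
```
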
 target/arm/cpregs.h        |  5 +++++
 target/arm/translate-a64.c | 18 ++++++++++++++++++
 2 files changed, 23 insertions(+)

diff --git a/target/arm/cpregs.h b/target/arm/cpregs.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpregs.h
+++ b/target/arm/cpregs.h
@@ -XXX,XX +XXX,XX @@ enum {
     ARM_CP_EL3_NO_EL2_UNDEF      = 1 << 16,
     ARM_CP_EL3_NO_EL2_KEEP       = 1 << 17,
     ARM_CP_EL3_NO_EL2_C_NZ       = 1 << 18,
+    /*
+     * Flag: Access check for this sysreg is constrained by the
+     * ARM pseudocode function CheckSMEAccess().
+     */
+    ARM_CP_SME                   = 1 << 19,
 };
 
 /*
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ bool sve_access_check(DisasContext *s)
     return fp_access_check(s);
 }
 
+/*
+ * Check that SME access is enabled, raise an exception if not.
+ * Note that this function corresponds to CheckSMEAccess and is
+ * only used directly for cpregs.
+ */
+static bool sme_access_check(DisasContext *s)
+{
+    if (s->sme_excp_el) {
+        gen_exception_insn_el(s, s->pc_curr, EXCP_UDEF,
+                              syn_smetrap(SME_ET_AccessTrap, false),
+                              s->sme_excp_el);
+        return false;
+    }
+    return true;
+}
+
 /*
  * This utility function is for doing register extension with an
  * optional shift. You will likely want to pass a temporary for the
@@ -XXX,XX +XXX,XX @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
         return;
     } else if ((ri->type & ARM_CP_SVE) && !sve_access_check(s)) {
         return;
+    } else if ((ri->type & ARM_CP_SME) && !sme_access_check(s)) {
+        return;
     }
 
     if ((tb_cflags(s->base.tb) & CF_USE_ICOUNT) && (ri->type & ARM_CP_IO)) {
-- 
2.25.1

From: Richard Henderson <richard.henderson@linaro.org>

This cpreg is used to access two new bits of PSTATE
that are not visible via any other mechanism.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220620175235.60881-6-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h    |  6 ++++++
 target/arm/helper.c | 13 +++++++++++++
 2 files changed, 19 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ typedef struct CPUArchState {
      *  nRW (also known as M[4]) is kept, inverted, in env->aarch64
      *  DAIF (exception masks) are kept in env->daif
      *  BTYPE is kept in env->btype
+     *  SM and ZA are kept in env->svcr
      *  all other bits are stored in their correct places in env->pstate
      */
     uint32_t pstate;
@@ -XXX,XX +XXX,XX @@ typedef struct CPUArchState {
     uint32_t condexec_bits; /* IT bits.  cpsr[15:10,26:25].  */
     uint32_t btype;  /* BTI branch type.  spsr[11:10].  */
     uint64_t daif; /* exception masks, in the bits they are in PSTATE */
+    uint64_t svcr; /* PSTATE.{SM,ZA} in the bits they are in SVCR */
 
     uint64_t elr_el[4]; /* AArch64 exception link regs  */
     uint64_t sp_el[4]; /* AArch64 banked stack pointers */
@@ -XXX,XX +XXX,XX @@ FIELD(CPTR_EL3, TCPAC, 31, 1)
 #define PSTATE_MODE_EL1t 4
 #define PSTATE_MODE_EL0t 0
 
+/* PSTATE bits that are accessed via SVCR and not stored in SPSR_ELx. */
+FIELD(SVCR, SM, 0, 1)
+FIELD(SVCR, ZA, 1, 1)
+
 /* Write a new value to v7m.exception, thus transitioning into or out
  * of Handler mode; this may result in a change of active stack pointer.
  */
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static CPAccessResult access_tpidr2(CPUARMState *env, const ARMCPRegInfo *ri,
     return CP_ACCESS_OK;
 }
 
+static void svcr_write(CPUARMState *env, const ARMCPRegInfo *ri,
+                       uint64_t value)
+{
+    value &= R_SVCR_SM_MASK | R_SVCR_ZA_MASK;
+    /* TODO: Side effects. */
+    env->svcr = value;
+}
+
 static const ARMCPRegInfo sme_reginfo[] = {
     { .name = "TPIDR2_EL0", .state = ARM_CP_STATE_AA64,
       .opc0 = 3, .opc1 = 3, .crn = 13, .crm = 0, .opc2 = 5,
       .access = PL0_RW, .accessfn = access_tpidr2,
       .fieldoffset = offsetof(CPUARMState, cp15.tpidr2_el0) },
+    { .name = "SVCR", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 3, .crn = 4, .crm = 2, .opc2 = 2,
+      .access = PL0_RW, .type = ARM_CP_SME,
+      .fieldoffset = offsetof(CPUARMState, svcr),
+      .writefn = svcr_write, .raw_writefn = raw_write },
 };
 #endif /* TARGET_AARCH64 */
 
-- 
2.25.1

From: Richard Henderson <richard.henderson@linaro.org>

These cpregs control the streaming vector length and whether the
full A64 instruction set is allowed while in streaming mode.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220620175235.60881-7-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
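Note for readers: a sketch of the SMCR_ELx field handling, using the bit positions from the FIELD() lines in this patch. The helper names are hypothetical, and the length returned is only the requested one; the effective streaming vector length also depends on the lengths the CPU supports and on the higher-EL controls.

```c
#include <stdbool.h>
#include <stdint.h>

#define SMCR_LEN_MASK  0xfu        /* bits [3:0] */
#define SMCR_FA64_MASK (1u << 31)  /* bit 31 */

/* Keep only the writable fields, as smcr_write() does before storing. */
static uint64_t smcr_sanitize(uint64_t value)
{
    return value & (SMCR_LEN_MASK | SMCR_FA64_MASK);
}

/* Requested streaming vector length in bytes: (LEN + 1) quadwords. */
static unsigned smcr_requested_bytes(uint64_t smcr)
{
    return ((unsigned)(smcr & SMCR_LEN_MASK) + 1) * 16;
}

/* True if the full A64 instruction set is requested in streaming mode. */
static bool smcr_fa64(uint64_t smcr)
{
    return (smcr & SMCR_FA64_MASK) != 0;
}
```
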
 target/arm/cpu.h    |  8 ++++++--
 target/arm/helper.c | 41 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 47 insertions(+), 2 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ typedef struct CPUArchState {
         float_status standard_fp_status;
         float_status standard_fp_status_f16;
 
-        /* ZCR_EL[1-3] */
-        uint64_t zcr_el[4];
+        uint64_t zcr_el[4];   /* ZCR_EL[1-3] */
+        uint64_t smcr_el[4];  /* SMCR_EL[1-3] */
     } vfp;
     uint64_t exclusive_addr;
     uint64_t exclusive_val;
@@ -XXX,XX +XXX,XX @@ FIELD(CPTR_EL3, TCPAC, 31, 1)
 FIELD(SVCR, SM, 0, 1)
 FIELD(SVCR, ZA, 1, 1)
 
+/* Fields for SMCR_ELx. */
+FIELD(SMCR, LEN, 0, 4)
+FIELD(SMCR, FA64, 31, 1)
+
 /* Write a new value to v7m.exception, thus transitioning into or out
  * of Handler mode; this may result in a change of active stack pointer.
  */
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void define_arm_vh_e2h_redirects_aliases(ARMCPU *cpu)
          */
         { K(3, 0,  1, 2, 0), K(3, 4,  1, 2, 0), K(3, 5, 1, 2, 0),
           "ZCR_EL1", "ZCR_EL2", "ZCR_EL12", isar_feature_aa64_sve },
+        { K(3, 0,  1, 2, 6), K(3, 4,  1, 2, 6), K(3, 5, 1, 2, 6),
+          "SMCR_EL1", "SMCR_EL2", "SMCR_EL12", isar_feature_aa64_sme },
 
         { K(3, 0,  5, 6, 0), K(3, 4,  5, 6, 0), K(3, 5, 5, 6, 0),
           "TFSR_EL1", "TFSR_EL2", "TFSR_EL12", isar_feature_aa64_mte },
@@ -XXX,XX +XXX,XX @@ static void svcr_write(CPUARMState *env, const ARMCPRegInfo *ri,
     env->svcr = value;
 }
 
+static void smcr_write(CPUARMState *env, const ARMCPRegInfo *ri,
+                       uint64_t value)
+{
+    int cur_el = arm_current_el(env);
+    int old_len = sve_vqm1_for_el(env, cur_el);
+    int new_len;
+
+    QEMU_BUILD_BUG_ON(ARM_MAX_VQ > R_SMCR_LEN_MASK + 1);
+    value &= R_SMCR_LEN_MASK | R_SMCR_FA64_MASK;
+    raw_write(env, ri, value);
+
+    /*
+     * Note that it is CONSTRAINED UNPREDICTABLE what happens to ZA storage
+     * when SVL is widened (old values kept, or zeros).  Choose to keep the
+     * current values for simplicity.  But for QEMU internals, we must still
+     * apply the narrower SVL to the Zregs and Pregs -- see the comment
+     * above aarch64_sve_narrow_vq.
+     */
+    new_len = sve_vqm1_for_el(env, cur_el);
+    if (new_len < old_len) {
+        aarch64_sve_narrow_vq(env, new_len + 1);
+    }
+}
+
 static const ARMCPRegInfo sme_reginfo[] = {
     { .name = "TPIDR2_EL0", .state = ARM_CP_STATE_AA64,
       .opc0 = 3, .opc1 = 3, .crn = 13, .crm = 0, .opc2 = 5,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo sme_reginfo[] = {
       .access = PL0_RW, .type = ARM_CP_SME,
       .fieldoffset = offsetof(CPUARMState, svcr),
       .writefn = svcr_write, .raw_writefn = raw_write },
+    { .name = "SMCR_EL1", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 0, .crn = 1, .crm = 2, .opc2 = 6,
+      .access = PL1_RW, .type = ARM_CP_SME,
+      .fieldoffset = offsetof(CPUARMState, vfp.smcr_el[1]),
+      .writefn = smcr_write, .raw_writefn = raw_write },
+    { .name = "SMCR_EL2", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 4, .crn = 1, .crm = 2, .opc2 = 6,
+      .access = PL2_RW, .type = ARM_CP_SME,
+      .fieldoffset = offsetof(CPUARMState, vfp.smcr_el[2]),
+      .writefn = smcr_write, .raw_writefn = raw_write },
+    { .name = "SMCR_EL3", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 6, .crn = 1, .crm = 2, .opc2 = 6,
+      .access = PL3_RW, .type = ARM_CP_SME,
+      .fieldoffset = offsetof(CPUARMState, vfp.smcr_el[3]),
+      .writefn = smcr_write, .raw_writefn = raw_write },
 };
 #endif /* TARGET_AARCH64 */
 
-- 
2.25.1

From: Richard Henderson <richard.henderson@linaro.org>

Implement the streaming mode identification register, and the
two streaming priority registers.  For QEMU, they are all RES0.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220620175235.60881-8-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
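Note for readers: the access_esm() condition reduces to a three-input predicate. The sketch below restates it with plain booleans; the AccessResult type and esm_check() name are hypothetical stand-ins for CPAccessResult and the real accessfn.

```c
#include <stdbool.h>

/* Hypothetical stand-in for CPAccessResult. */
typedef enum { ACCESS_OK, ACCESS_TRAP_EL3 } AccessResult;

/*
 * Trap to EL3 only when running below EL3, EL3 exists, and
 * CPTR_EL3.ESM is clear (ESM is an enable bit, so 0 means "trap").
 */
static AccessResult esm_check(int cur_el, bool have_el3, bool esm)
{
    if (cur_el < 3 && have_el3 && !esm) {
        return ACCESS_TRAP_EL3;
    }
    return ACCESS_OK;
}
```
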
 target/arm/helper.c | 33 +++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static CPAccessResult access_tpidr2(CPUARMState *env, const ARMCPRegInfo *ri,
     return CP_ACCESS_OK;
 }
 
+static CPAccessResult access_esm(CPUARMState *env, const ARMCPRegInfo *ri,
+                                 bool isread)
+{
+    /* TODO: FEAT_FGT for SMPRI_EL1 but not SMPRIMAP_EL2 */
+    if (arm_current_el(env) < 3
+        && arm_feature(env, ARM_FEATURE_EL3)
+        && !FIELD_EX64(env->cp15.cptr_el[3], CPTR_EL3, ESM)) {
+        return CP_ACCESS_TRAP_EL3;
+    }
+    return CP_ACCESS_OK;
+}
+
 static void svcr_write(CPUARMState *env, const ARMCPRegInfo *ri,
                        uint64_t value)
 {
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo sme_reginfo[] = {
       .access = PL3_RW, .type = ARM_CP_SME,
       .fieldoffset = offsetof(CPUARMState, vfp.smcr_el[3]),
       .writefn = smcr_write, .raw_writefn = raw_write },
+    { .name = "SMIDR_EL1", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 1, .crn = 0, .crm = 0, .opc2 = 6,
+      .access = PL1_R, .accessfn = access_aa64_tid1,
+      /*
+       * IMPLEMENTOR = 0 (software)
+       * REVISION    = 0 (implementation defined)
+       * SMPS        = 0 (no streaming execution priority in QEMU)
+       * AFFINITY    = 0 (streaming SVE mode not shared with other PEs)
+       */
+      .type = ARM_CP_CONST, .resetvalue = 0, },
+    /*
+     * Because SMIDR_EL1.SMPS is 0, SMPRI_EL1 and SMPRIMAP_EL2 are RES0.
+     */
+    { .name = "SMPRI_EL1", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 0, .crn = 1, .crm = 2, .opc2 = 4,
+      .access = PL1_RW, .accessfn = access_esm,
+      .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "SMPRIMAP_EL2", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 4, .crn = 1, .crm = 2, .opc2 = 5,
+      .access = PL2_RW, .accessfn = access_esm,
+      .type = ARM_CP_CONST, .resetvalue = 0 },
 };
 #endif /* TARGET_AARCH64 */
 
-- 
2.25.1

From: Richard Henderson <richard.henderson@linaro.org>

These are required to determine if various insns
are allowed to issue.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220620175235.60881-9-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
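Note for readers: the three SME-related TB flag fields occupy bits 20 to 23. The sketch below packs and unpacks them with hypothetical deposit_bits()/extract_bits() helpers standing in for DP_TBFLAG_A64 and EX_TBFLAG_A64; the bit positions are copied from the FIELD() lines.

```c
#include <stdbool.h>
#include <stdint.h>

#define SMEEXC_EL_SHIFT 20
#define SMEEXC_EL_LEN   2
#define PSTATE_SM_SHIFT 22
#define PSTATE_ZA_SHIFT 23

/* Write 'val' into the 'len'-bit field at 'shift' (len must be < 32). */
static uint32_t deposit_bits(uint32_t word, unsigned shift, unsigned len,
                             uint32_t val)
{
    uint32_t mask = ((1u << len) - 1) << shift;
    return (word & ~mask) | ((val << shift) & mask);
}

/* Read the 'len'-bit field at 'shift' (len must be < 32). */
static uint32_t extract_bits(uint32_t word, unsigned shift, unsigned len)
{
    return (word >> shift) & ((1u << len) - 1);
}

/* Build the SME portion of the flags word. */
static uint32_t pack_sme_flags(unsigned sme_el, bool sm, bool za)
{
    uint32_t flags = 0;
    flags = deposit_bits(flags, SMEEXC_EL_SHIFT, SMEEXC_EL_LEN, sme_el);
    flags = deposit_bits(flags, PSTATE_SM_SHIFT, 1, sm);
    flags = deposit_bits(flags, PSTATE_ZA_SHIFT, 1, za);
    return flags;
}
```
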
 target/arm/cpu.h           | 2 ++
 target/arm/translate.h     | 4 ++++
 target/arm/helper.c        | 4 ++++
 target/arm/translate-a64.c | 2 ++
 4 files changed, 12 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ FIELD(TBFLAG_A64, TCMA, 16, 2)
 FIELD(TBFLAG_A64, MTE_ACTIVE, 18, 1)
 FIELD(TBFLAG_A64, MTE0_ACTIVE, 19, 1)
 FIELD(TBFLAG_A64, SMEEXC_EL, 20, 2)
+FIELD(TBFLAG_A64, PSTATE_SM, 22, 1)
+FIELD(TBFLAG_A64, PSTATE_ZA, 23, 1)
 
 /*
  * Helpers for using the above.
diff --git a/target/arm/translate.h b/target/arm/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ typedef struct DisasContext {
     bool align_mem;
     /* True if PSTATE.IL is set */
     bool pstate_il;
+    /* True if PSTATE.SM is set. */
+    bool pstate_sm;
+    /* True if PSTATE.ZA is set. */
+    bool pstate_za;
     /* True if MVE insns are definitely not predicated by VPR or LTPSIZE */
     bool mve_no_pred;
     /*
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static CPUARMTBFlags rebuild_hflags_a64(CPUARMState *env, int el, int fp_el,
     }
     if (cpu_isar_feature(aa64_sme, env_archcpu(env))) {
         DP_TBFLAG_A64(flags, SMEEXC_EL, sme_exception_el(env, el));
+        if (FIELD_EX64(env->svcr, SVCR, SM)) {
+            DP_TBFLAG_A64(flags, PSTATE_SM, 1);
+        }
+        DP_TBFLAG_A64(flags, PSTATE_ZA, FIELD_EX64(env->svcr, SVCR, ZA));
     }
 
     sctlr = regime_sctlr(env, stage1);
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_init_disas_context(DisasContextBase *dcbase,
     dc->ata = EX_TBFLAG_A64(tb_flags, ATA);
     dc->mte_active[0] = EX_TBFLAG_A64(tb_flags, MTE_ACTIVE);
     dc->mte_active[1] = EX_TBFLAG_A64(tb_flags, MTE0_ACTIVE);
+    dc->pstate_sm = EX_TBFLAG_A64(tb_flags, PSTATE_SM);
+    dc->pstate_za = EX_TBFLAG_A64(tb_flags, PSTATE_ZA);
     dc->vec_len = 0;
     dc->vec_stride = 0;
     dc->cp_regs = arm_cpu->cp_regs;
-- 
2.25.1

From: Richard Henderson <richard.henderson@linaro.org>

Place the ZA storage (zarray[]) late in the resettable section of the
structure, to keep the most common element offsets from being > 64k.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220620175235.60881-10-richard.henderson@linaro.org
[PMM: expanded comment on zarray[] format]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
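Note for readers: the tile-to-array striping described in the zarray[] comment can be written as a one-line index computation. The helper names below are hypothetical; esz is the element size in bytes and svl the streaming vector length in bytes.

```c
/* Index into the ZA array of row 'row' of tile 'tile', for esz-byte elements. */
static unsigned za_row_index(unsigned tile, unsigned row, unsigned esz)
{
    return tile + row * esz;
}

/* Number of rows in each tile: the svl rows are shared among esz tiles. */
static unsigned za_tile_rows(unsigned svl, unsigned esz)
{
    return svl / esz;
}
```

For 4-byte elements, tile 1 occupies ZA[1], ZA[5], ZA[9] and so on, which is why a tile is not contiguous in the storage.
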
 target/arm/cpu.h     | 22 ++++++++++++++++++++++
 target/arm/machine.c | 34 ++++++++++++++++++++++++++++++++++
 2 files changed, 56 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ typedef struct CPUArchState {
     } keys;
 
     uint64_t scxtnum_el[4];
+
+    /*
+     * SME ZA storage -- 256 x 256 byte array, with bytes in host word order,
+     * as we do with vfp.zregs[].  This corresponds to the architectural ZA
+     * array, where ZA[N] is in the least-significant bytes of env->zarray[N].
+     * When SVL is less than the architectural maximum, the accessible
+     * storage is restricted, such that if the SVL is X bytes the guest can
+     * see only the bottom X elements of zarray[], and only the least
+     * significant X bytes of each element of the array. (In other words,
+     * the observable part is always square.)
+     *
+     * The ZA storage can also be considered as a set of square tiles of
+     * elements of different sizes. The mapping from tiles to the ZA array
+     * is architecturally defined, such that for tiles of elements of esz
+     * bytes, the Nth row (or "horizontal slice") of tile T is in
+     * ZA[T + N * esz]. Note that this means that each tile is not contiguous
+     * in the ZA storage, because its rows are striped through the ZA array.
+     *
+     * Because this is so large, keep this toward the end of the reset area,
+     * to keep the offsets into the rest of the structure smaller.
+     */
+    ARMVectorReg zarray[ARM_MAX_VQ * 16];
 #endif
 
 #if defined(CONFIG_USER_ONLY)
diff --git a/target/arm/machine.c b/target/arm/machine.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/machine.c
+++ b/target/arm/machine.c
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_sve = {
         VMSTATE_END_OF_LIST()
     }
 };
+
+static const VMStateDescription vmstate_vreg = {
+    .name = "vreg",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .fields = (VMStateField[]) {
+        VMSTATE_UINT64_ARRAY(d, ARMVectorReg, ARM_MAX_VQ * 2),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
+static bool za_needed(void *opaque)
+{
+    ARMCPU *cpu = opaque;
+
+    /*
+     * When ZA storage is disabled, its contents are discarded.
+     * It will be zeroed when ZA storage is re-enabled.
+     */
+    return FIELD_EX64(cpu->env.svcr, SVCR, ZA);
+}
+
+static const VMStateDescription vmstate_za = {
+    .name = "cpu/sme",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .needed = za_needed,
+    .fields = (VMStateField[]) {
+        VMSTATE_STRUCT_ARRAY(env.zarray, ARMCPU, ARM_MAX_VQ * 16, 0,
+                             vmstate_vreg, ARMVectorReg),
+        VMSTATE_END_OF_LIST()
+    }
+};
 #endif /* AARCH64 */
 
 static bool serror_needed(void *opaque)
@@ -XXX,XX +XXX,XX @@ const VMStateDescription vmstate_arm_cpu = {
         &vmstate_m_security,
 #ifdef TARGET_AARCH64
         &vmstate_sve,
+        &vmstate_za,
 #endif
         &vmstate_serror,
         &vmstate_irq_line_state,
-- 
2.25.1

From: Richard Henderson <richard.henderson@linaro.org>

SMSTART and SMSTOP are aliases of MSR (immediate).
Use the two new helpers to properly implement svcr_write.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220620175235.60881-11-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
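Note for readers: the handle_msr_i() hunk decodes CRm as three bits: bit 0 is the new value, bit 1 selects PSTATE.SM and bit 2 selects PSTATE.ZA. A minimal sketch of that decode follows, with a hypothetical SvcrImm struct and decode_svcr_imm() name.

```c
#include <stdbool.h>

/* Hypothetical decoded form of the MSR (immediate) SVCR* encoding. */
typedef struct {
    bool valid;     /* false if CRm is outside 2..7 */
    bool value;     /* value to write: 1 to start, 0 to stop */
    bool touch_sm;  /* update PSTATE.SM */
    bool touch_za;  /* update PSTATE.ZA */
} SvcrImm;

static SvcrImm decode_svcr_imm(unsigned crm)
{
    SvcrImm d = { false, false, false, false };

    if (crm < 2 || crm > 7) {
        return d;   /* the patch treats these as unallocated */
    }
    d.valid = true;
    d.value = (crm & 1) != 0;
    d.touch_sm = (crm & 2) != 0;
    d.touch_za = (crm & 4) != 0;
    return d;
}
```

CRm == 7 therefore sets both bits, while CRm == 4 clears only ZA.
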
 target/arm/cpu.h           |  1 +
 target/arm/helper-sme.h    | 21 +++++++++++++
 target/arm/helper.h        |  1 +
 target/arm/helper.c        |  6 ++--
 target/arm/sme_helper.c    | 61 ++++++++++++++++++++++++++++++++++++++
 target/arm/translate-a64.c | 24 +++++++++++++++
 target/arm/meson.build     |  1 +
 7 files changed, 112 insertions(+), 3 deletions(-)
 create mode 100644 target/arm/helper-sme.h
 create mode 100644 target/arm/sme_helper.c

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ void aarch64_sve_change_el(CPUARMState *env, int old_el,
                            int new_el, bool el0_a64);
 void aarch64_add_sve_properties(Object *obj);
 void aarch64_add_pauth_properties(Object *obj);
+void arm_reset_sve_state(CPUARMState *env);
 
 /*
  * SVE registers are encoded in KVM's memory in an endianness-invariant format.
diff --git a/target/arm/helper-sme.h b/target/arm/helper-sme.h
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/target/arm/helper-sme.h
@@ -XXX,XX +XXX,XX @@
+/*
+ *  AArch64 SME specific helper definitions
+ *
+ *  Copyright (c) 2022 Linaro, Ltd
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+DEF_HELPER_FLAGS_2(set_pstate_sm, TCG_CALL_NO_RWG, void, env, i32)
+DEF_HELPER_FLAGS_2(set_pstate_za, TCG_CALL_NO_RWG, void, env, i32)
diff --git a/target/arm/helper.h b/target/arm/helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_6(gvec_bfmlal_idx, TCG_CALL_NO_RWG,
 #ifdef TARGET_AARCH64
 #include "helper-a64.h"
 #include "helper-sve.h"
+#include "helper-sme.h"
 #endif
 
 #include "helper-mve.h"
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static CPAccessResult access_esm(CPUARMState *env, const ARMCPRegInfo *ri,
 static void svcr_write(CPUARMState *env, const ARMCPRegInfo *ri,
                        uint64_t value)
 {
-    value &= R_SVCR_SM_MASK | R_SVCR_ZA_MASK;
-    /* TODO: Side effects. */
-    env->svcr = value;
+    helper_set_pstate_sm(env, FIELD_EX64(value, SVCR, SM));
+    helper_set_pstate_za(env, FIELD_EX64(value, SVCR, ZA));
+    arm_rebuild_hflags(env);
 }
 
 static void smcr_write(CPUARMState *env, const ARMCPRegInfo *ri,
diff --git a/target/arm/sme_helper.c b/target/arm/sme_helper.c
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/target/arm/sme_helper.c
@@ -XXX,XX +XXX,XX @@
+/*
+ * ARM SME Operations
+ *
+ * Copyright (c) 2022 Linaro, Ltd.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "cpu.h"
+#include "internals.h"
+#include "exec/helper-proto.h"
+
+/* ResetSVEState */
+void arm_reset_sve_state(CPUARMState *env)
+{
+    memset(env->vfp.zregs, 0, sizeof(env->vfp.zregs));
+    /* Recall that FFR is stored as pregs[16]. */
+    memset(env->vfp.pregs, 0, sizeof(env->vfp.pregs));
+    vfp_set_fpcr(env, 0x0800009f);
+}
+
+void helper_set_pstate_sm(CPUARMState *env, uint32_t i)
+{
+    if (i == FIELD_EX64(env->svcr, SVCR, SM)) {
+        return;
+    }
+    env->svcr ^= R_SVCR_SM_MASK;
+    arm_reset_sve_state(env);
+}
+
+void helper_set_pstate_za(CPUARMState *env, uint32_t i)
+{
+    if (i == FIELD_EX64(env->svcr, SVCR, ZA)) {
+        return;
+    }
+    env->svcr ^= R_SVCR_ZA_MASK;
+
+    /*
+     * ResetSMEState.
+     *
+     * SetPSTATE_ZA zeros on enable and disable.  We can zero this only
+     * on enable: while disabled, the storage is inaccessible and the
+     * value does not matter.  We're not saving the storage in vmstate
+     * when disabled either.
+     */
+    if (i) {
+        memset(env->zarray, 0, sizeof(env->zarray));
+    }
+}
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void handle_msr_i(DisasContext *s, uint32_t insn,
         }
         break;
 
+    case 0x1b: /* SVCR* */
+        if (!dc_isar_feature(aa64_sme, s) || crm < 2 || crm > 7) {
+            goto do_unallocated;
+        }
+        if (sme_access_check(s)) {
+            bool i = crm & 1;
+            bool changed = false;
+
+            if ((crm & 2) && i != s->pstate_sm) {
+                gen_helper_set_pstate_sm(cpu_env, tcg_constant_i32(i));
+                changed = true;
+            }
+            if ((crm & 4) && i != s->pstate_za) {
+                gen_helper_set_pstate_za(cpu_env, tcg_constant_i32(i));
+                changed = true;
+            }
+            if (changed) {
+                gen_rebuild_hflags(s);
+            } else {
+                s->base.is_jmp = DISAS_NEXT;
+            }
+        }
+        break;
+
     default:
     do_unallocated:
         unallocated_encoding(s);
diff --git a/target/arm/meson.build b/target/arm/meson.build
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/meson.build
+++ b/target/arm/meson.build
@@ -XXX,XX +XXX,XX @@ arm_ss.add(when: 'TARGET_AARCH64', if_true: files(
   'mte_helper.c',
   'pauth_helper.c',
   'sve_helper.c',
+  'sme_helper.c',
   'translate-a64.c',
   'translate-sve.c',
 ))
-- 
2.25.1

From: Richard Henderson <richard.henderson@linaro.org>

Keep all of the error messages together.  This does mean that
when setting many sve length properties we'll only generate
one error, but we only really need one.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220620175235.60881-12-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu64.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ void arm_cpu_sve_finalize(ARMCPU *cpu, Error **errp)
                                   "using only sve<N> properties.\n");
             } else {
                 error_setg(errp, "cannot enable sve%d", vq * 128);
-                error_append_hint(errp, "This CPU does not support "
-                                  "the vector length %d-bits.\n", vq * 128);
+                if (vq_supported) {
+                    error_append_hint(errp, "This CPU does not support "
+                                      "the vector length %d-bits.\n", vq * 128);
+                } else {
+                    error_append_hint(errp, "SVE not supported by KVM "
+                                      "on this host\n");
+                }
             }
             return;
         } else {
@@ -XXX,XX +XXX,XX @@ static void cpu_arm_set_sve_vq(Object *obj, Visitor *v, const char *name,
         return;
     }
 
-    if (value && kvm_enabled() && !kvm_arm_sve_supported()) {
-        error_setg(errp, "cannot enable %s", name);
-        error_append_hint(errp, "SVE not supported by KVM on this host\n");
-        return;
-    }
-
     cpu->sve_vq_map = deposit32(cpu->sve_vq_map, vq - 1, 1, value);
     cpu->sve_vq_init |= 1 << (vq - 1);
 }
-- 
2.25.1

From: Richard Henderson <richard.henderson@linaro.org>

Pull the three sve_vq_* values into a structure.
This will be reused for SME.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220620175235.60881-13-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
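Note for readers: a sketch of how the map and init bitmaps are meant to be used, assuming vq ranges over 1..16. The VQMap type and helper names are hypothetical; the real code uses deposit32()/extract32() on ARMVQMap.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of ARMVQMap. */
typedef struct {
    uint32_t map, init, supported;
} VQMap;

/* Record a property setting for vector length 'vq' quadwords (1..16). */
static void vq_set(VQMap *m, unsigned vq, bool on)
{
    uint32_t bit = 1u << (vq - 1);

    if (on) {
        m->map |= bit;
    } else {
        m->map &= ~bit;
    }
    m->init |= bit;   /* remember that a property touched this length */
}

/* Is vector length 'vq' quadwords enabled in the map? */
static bool vq_get(const VQMap *m, unsigned vq)
{
    return (m->map >> (vq - 1)) & 1;
}

/* Vector length in bytes for 'vq' quadwords. */
static unsigned vq_bytes(unsigned vq)
{
    return vq * 16;
}
```
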
 target/arm/cpu.h    | 29 ++++++++++++++---------------
 target/arm/cpu64.c  | 22 +++++++++++-----------
 target/arm/helper.c |  2 +-
 target/arm/kvm64.c  |  2 +-
 4 files changed, 27 insertions(+), 28 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ typedef enum ARMPSCIState {
 
 typedef struct ARMISARegisters ARMISARegisters;
 
+/*
+ * In map, each set bit is a supported vector length of (bit-number + 1) * 16
+ * bytes, i.e. each bit number + 1 is the vector length in quadwords.
+ *
+ * While processing properties during initialization, corresponding init bits
+ * are set for bits in map that have been set by properties.
+ *
+ * Bits set in supported represent valid vector lengths for the CPU type.
+ */
+typedef struct {
+    uint32_t map, init, supported;
+} ARMVQMap;
+
 /**
  * ARMCPU:
  * @env: #CPUARMState
@@ -XXX,XX +XXX,XX @@ struct ArchCPU {
     uint32_t sve_default_vq;
 #endif
 
-    /*
-     * In sve_vq_map each set bit is a supported vector length of
-     * (bit-number + 1) * 16 bytes, i.e. each bit number + 1 is the vector
-     * length in quadwords.
-     *
-     * While processing properties during initialization, corresponding
-     * sve_vq_init bits are set for bits in sve_vq_map that have been
-     * set by properties.
-     *
-     * Bits set in sve_vq_supported represent valid vector lengths for
-     * the CPU type.
-     */
-    uint32_t sve_vq_map;
-    uint32_t sve_vq_init;
-    uint32_t sve_vq_supported;
+    ARMVQMap sve_vq;
 
     /* Generic timer counter frequency, in Hz */
     uint64_t gt_cntfrq_hz;
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ void arm_cpu_sve_finalize(ARMCPU *cpu, Error **errp)
      * any of the above.  Finally, if SVE is not disabled, then at least one
      * vector length must be enabled.
      */
-    uint32_t vq_map = cpu->sve_vq_map;
-    uint32_t vq_init = cpu->sve_vq_init;
+    uint32_t vq_map = cpu->sve_vq.map;
+    uint32_t vq_init = cpu->sve_vq.init;
     uint32_t vq_supported;
     uint32_t vq_mask = 0;
     uint32_t tmp, vq, max_vq = 0;
@@ -XXX,XX +XXX,XX @@ void arm_cpu_sve_finalize(ARMCPU *cpu, Error **errp)
      */
     if (kvm_enabled()) {
         if (kvm_arm_sve_supported()) {
-            cpu->sve_vq_supported = kvm_arm_sve_get_vls(CPU(cpu));
-            vq_supported = cpu->sve_vq_supported;
+            cpu->sve_vq.supported = kvm_arm_sve_get_vls(CPU(cpu));
+            vq_supported = cpu->sve_vq.supported;
         } else {
             assert(!cpu_isar_feature(aa64_sve, cpu));
             vq_supported = 0;
         }
     } else {
-        vq_supported = cpu->sve_vq_supported;
+        vq_supported = cpu->sve_vq.supported;
     }
 
     /*
@@ -XXX,XX +XXX,XX @@ void arm_cpu_sve_finalize(ARMCPU *cpu, Error **errp)
 
     /* From now on sve_max_vq is the actual maximum supported length. */
     cpu->sve_max_vq = max_vq;
-    cpu->sve_vq_map = vq_map;
+    cpu->sve_vq.map = vq_map;
 }
 
 static void cpu_max_get_sve_max_vq(Object *obj, Visitor *v, const char *name,
@@ -XXX,XX +XXX,XX @@ static void cpu_arm_get_sve_vq(Object *obj, Visitor *v, const char *name,
     if (!cpu_isar_feature(aa64_sve, cpu)) {
         value = false;
     } else {
-        value = extract32(cpu->sve_vq_map, vq - 1, 1);
+        value = extract32(cpu->sve_vq.map, vq - 1, 1);
     }
     visit_type_bool(v, name, &value, errp);
 }
@@ -XXX,XX +XXX,XX @@ static void cpu_arm_set_sve_vq(Object *obj, Visitor *v, const char *name,
         return;
     }
 
-    cpu->sve_vq_map = deposit32(cpu->sve_vq_map, vq - 1, 1, value);
-    cpu->sve_vq_init |= 1 << (vq - 1);
+    cpu->sve_vq.map = deposit32(cpu->sve_vq.map, vq - 1, 1, value);
+    cpu->sve_vq.init |= 1 << (vq - 1);
 }
 
 static bool cpu_arm_get_sve(Object *obj, Error **errp)
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
     cpu->dcz_blocksize = 7; /*  512 bytes */
 #endif
 
-    cpu->sve_vq_supported = MAKE_64BIT_MASK(0, ARM_MAX_VQ);
+    cpu->sve_vq.supported = MAKE_64BIT_MASK(0, ARM_MAX_VQ);
 
     aarch64_add_pauth_properties(obj);
     aarch64_add_sve_properties(obj);
@@ -XXX,XX +XXX,XX @@ static void aarch64_a64fx_initfn(Object *obj)
 
     /* The A64FX supports only 128, 256 and 512 bit vector lengths */
     aarch64_add_sve_properties(obj);
-    cpu->sve_vq_supported = (1 << 0)  /* 128bit */
+    cpu->sve_vq.supported = (1 << 0)  /* 128bit */
                           | (1 << 1)  /* 256bit */
                           | (1 << 3); /* 512bit */
 
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ uint32_t sve_vqm1_for_el(CPUARMState *env, int el)
         len = MIN(len, 0xf & (uint32_t)env->vfp.zcr_el[3]);
     }
 
-    len = 31 - clz32(cpu->sve_vq_map & MAKE_64BIT_MASK(0, len + 1));
+    len = 31 - clz32(cpu->sve_vq.map & MAKE_64BIT_MASK(0, len + 1));
     return len;
 }
 
diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/kvm64.c
+++ b/target/arm/kvm64.c
@@ -XXX,XX +XXX,XX @@ uint32_t kvm_arm_sve_get_vls(CPUState *cs)
 static int kvm_arm_sve_set_vls(CPUState *cs)
 {
     ARMCPU *cpu = ARM_CPU(cs);
-    uint64_t vls[KVM_ARM64_SVE_VLS_WORDS] = { cpu->sve_vq_map };
+    uint64_t vls[KVM_ARM64_SVE_VLS_WORDS] = { cpu->sve_vq.map };
     struct kvm_one_reg reg = {
         .id = KVM_REG_ARM64_SVE_VLS,
         .addr = (uint64_t)&vls[0],
-- 
2.25.1

From: Richard Henderson <richard.henderson@linaro.org>

Rename from cpu_arm_{get,set}_sve_vq, and take the
ARMVQMap as the opaque parameter.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220620175235.60881-14-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
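Note for readers of the series: a rough illustration of why one accessor pair can serve more than one map once the map arrives through the opaque pointer. The length comes from the property name and the target comes from opaque. MapSketch, vq_from_prop_name and set_vq_sketch are assumed stand-ins, not QEMU code; only the atoi(&name[3]) / 128 expression is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Reduced stand-in for ARMVQMap; only the fields the setter touches. */
typedef struct {
    uint32_t map, init;
} MapSketch;

/*
 * Property names look like "sve256": skip the three-letter prefix,
 * parse the length in bits, and convert to quadwords (128 bits each).
 */
static unsigned vq_from_prop_name(const char *name)
{
    return (unsigned)atoi(&name[3]) / 128;
}

/* Hypothetical setter: the map to update is whatever opaque points at. */
static void set_vq_sketch(const char *name, void *opaque, bool value)
{
    MapSketch *m = opaque;
    unsigned vq = vq_from_prop_name(name);

    assert(vq >= 1 && vq <= 16);
    uint32_t bit = UINT32_C(1) << (vq - 1);
    m->map = value ? (m->map | bit) : (m->map & ~bit);
    m->init |= bit;
}
```

Registering the same setter twice with different opaque pointers is then enough to drive two independent maps.
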
 target/arm/cpu64.c | 29 +++++++++++++++--------------
 1 file changed, 15 insertions(+), 14 deletions(-)

diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ static void cpu_max_set_sve_max_vq(Object *obj, Visitor *v, const char *name,
 }
 
 /*
- * Note that cpu_arm_get/set_sve_vq cannot use the simpler
- * object_property_add_bool interface because they make use
- * of the contents of "name" to determine which bit on which
- * to operate.
+ * Note that cpu_arm_{get,set}_vq cannot use the simpler
+ * object_property_add_bool interface because they make use of the
+ * contents of "name" to determine which bit to operate on.
  */
-static void cpu_arm_get_sve_vq(Object *obj, Visitor *v, const char *name,
-                               void *opaque, Error **errp)
+static void cpu_arm_get_vq(Object *obj, Visitor *v, const char *name,
+                           void *opaque, Error **errp)
 {
     ARMCPU *cpu = ARM_CPU(obj);
+    ARMVQMap *vq_map = opaque;
     uint32_t vq = atoi(&name[3]) / 128;
     bool value;
 
@@ -XXX,XX +XXX,XX @@ static void cpu_arm_get_sve_vq(Object *obj, Visitor *v, const char *name,
     if (!cpu_isar_feature(aa64_sve, cpu)) {
         value = false;
     } else {
-        value = extract32(cpu->sve_vq.map, vq - 1, 1);
+        value = extract32(vq_map->map, vq - 1, 1);
     }
     visit_type_bool(v, name, &value, errp);
 }
 
-static void cpu_arm_set_sve_vq(Object *obj, Visitor *v, const char *name,
-                               void *opaque, Error **errp)
+static void cpu_arm_set_vq(Object *obj, Visitor *v, const char *name,
+                           void *opaque, Error **errp)
 {
-    ARMCPU *cpu = ARM_CPU(obj);
+    ARMVQMap *vq_map = opaque;
     uint32_t vq = atoi(&name[3]) / 128;
     bool value;
 
@@ -XXX,XX +XXX,XX @@ static void cpu_arm_set_sve_vq(Object *obj, Visitor *v, const char *name,
         return;
     }
 
-    cpu->sve_vq.map = deposit32(cpu->sve_vq.map, vq - 1, 1, value);
-    cpu->sve_vq.init |= 1 << (vq - 1);
+    vq_map->map = deposit32(vq_map->map, vq - 1, 1, value);
+    vq_map->init |= 1 << (vq - 1);
 }
 
 static bool cpu_arm_get_sve(Object *obj, Error **errp)
@@ -XXX,XX +XXX,XX @@ static void cpu_arm_get_sve_default_vec_len(Object *obj, Visitor *v,
 
 void aarch64_add_sve_properties(Object *obj)
 {
+    ARMCPU *cpu = ARM_CPU(obj);
     uint32_t vq;
 
     object_property_add_bool(obj, "sve", cpu_arm_get_sve, cpu_arm_set_sve);
@@ -XXX,XX +XXX,XX @@ void aarch64_add_sve_properties(Object *obj)
     for (vq = 1; vq <= ARM_MAX_VQ; ++vq) {
         char name[8];
         sprintf(name, "sve%d", vq * 128);
-        object_property_add(obj, name, "bool", cpu_arm_get_sve_vq,
-                            cpu_arm_set_sve_vq, NULL, NULL);
+        object_property_add(obj, name, "bool", cpu_arm_get_vq,
+                            cpu_arm_set_vq, NULL, &cpu->sve_vq);
     }
 
 #ifdef CONFIG_USER_ONLY
-- 
2.25.1

From: Richard Henderson <richard.henderson@linaro.org>

Rename from cpu_arm_{get,set}_sve_default_vec_len,
and take the pointer to default_vq from opaque.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220620175235.60881-15-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
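Note for readers of the series: a sketch of the validation the default-vector-length setter performs, now that it writes through a pointer instead of a named ARMCPU field. The helper names and MAX_VQ_SKETCH are hypothetical. The division and remainder by 16 are an assumption inferred from the "not a multiple of 16" hint, since that part of the function is outside the hunks shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_VQ_SKETCH 16 /* stands in for ARM_MAX_VQ on AArch64 builds */

/*
 * Hypothetical setter body: len_bytes is the property value in bytes.
 * Returns true and stores the length in quadwords on success.
 */
static bool parse_default_vec_len(int32_t len_bytes, uint32_t *out_vq)
{
    /* The kernel allows -1 to mean "maximum". */
    if (len_bytes == -1) {
        *out_vq = MAX_VQ_SKETCH;
        return true;
    }

    int32_t vq = len_bytes / 16;
    int32_t remainder = len_bytes % 16;

    /* Same three rejection cases as the check in the patch. */
    if (remainder || vq < 1 || vq > 512) {
        return false;
    }
    *out_vq = (uint32_t)vq;
    return true;
}

/* Hypothetical getter body: quadwords back to bytes. */
static int32_t default_vec_len_bytes(uint32_t vq)
{
    return (int32_t)(vq * 16);
}
```

Because both helpers only see a pointer or a value, the same pair can back any property that stores a default length in quadwords.
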
 target/arm/cpu64.c | 27 ++++++++++++++-------------
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ static void cpu_arm_set_sve(Object *obj, bool value, Error **errp)
 
 #ifdef CONFIG_USER_ONLY
 /* Mirror linux /proc/sys/abi/sve_default_vector_length. */
-static void cpu_arm_set_sve_default_vec_len(Object *obj, Visitor *v,
-                                            const char *name, void *opaque,
-                                            Error **errp)
+static void cpu_arm_set_default_vec_len(Object *obj, Visitor *v,
+                                        const char *name, void *opaque,
+                                        Error **errp)
 {
-    ARMCPU *cpu = ARM_CPU(obj);
+    uint32_t *ptr_default_vq = opaque;
     int32_t default_len, default_vq, remainder;
 
     if (!visit_type_int32(v, name, &default_len, errp)) {
@@ -XXX,XX +XXX,XX @@ static void cpu_arm_set_sve_default_vec_len(Object *obj, Visitor *v,
 
     /* Undocumented, but the kernel allows -1 to indicate "maximum". */
     if (default_len == -1) {
-        cpu->sve_default_vq = ARM_MAX_VQ;
+        *ptr_default_vq = ARM_MAX_VQ;
         return;
     }
 
@@ -XXX,XX +XXX,XX @@ static void cpu_arm_set_sve_default_vec_len(Object *obj, Visitor *v,
         return;
     }
 
-    cpu->sve_default_vq = default_vq;
+    *ptr_default_vq = default_vq;
 }
 
-static void cpu_arm_get_sve_default_vec_len(Object *obj, Visitor *v,
-                                            const char *name, void *opaque,
-                                            Error **errp)
+static void cpu_arm_get_default_vec_len(Object *obj, Visitor *v,
+                                        const char *name, void *opaque,
+                                        Error **errp)
 {
-    ARMCPU *cpu = ARM_CPU(obj);
-    int32_t value = cpu->sve_default_vq * 16;
+    uint32_t *ptr_default_vq = opaque;
+    int32_t value = *ptr_default_vq * 16;
 
     visit_type_int32(v, name, &value, errp);
 }
@@ -XXX,XX +XXX,XX @@ void aarch64_add_sve_properties(Object *obj)
 #ifdef CONFIG_USER_ONLY
     /* Mirror linux /proc/sys/abi/sve_default_vector_length. */
     object_property_add(obj, "sve-default-vector-length", "int32",
-                        cpu_arm_get_sve_default_vec_len,
-                        cpu_arm_set_sve_default_vec_len, NULL, NULL);
+                        cpu_arm_get_default_vec_len,
+                        cpu_arm_set_default_vec_len, NULL,
+                        &cpu->sve_default_vq);
 #endif
 }
 
-- 
2.25.1

From: Richard Henderson <richard.henderson@linaro.org>

Drop the aa32-only inline fallbacks,
and just use a couple of ifdefs.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220620175235.60881-16-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h       | 6 ------
 target/arm/internals.h | 3 +++
 target/arm/cpu.c       | 2 ++
 3 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ typedef struct {
 
 #ifdef TARGET_AARCH64
 # define ARM_MAX_VQ    16
-void arm_cpu_sve_finalize(ARMCPU *cpu, Error **errp);
-void arm_cpu_pauth_finalize(ARMCPU *cpu, Error **errp);
-void arm_cpu_lpa2_finalize(ARMCPU *cpu, Error **errp);
 #else
 # define ARM_MAX_VQ    1
-static inline void arm_cpu_sve_finalize(ARMCPU *cpu, Error **errp) { }
-static inline void arm_cpu_pauth_finalize(ARMCPU *cpu, Error **errp) { }
-static inline void arm_cpu_lpa2_finalize(ARMCPU *cpu, Error **errp) { }
 #endif
 
 typedef struct ARMVectorReg {
diff --git a/target/arm/internals.h b/target/arm/internals.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -XXX,XX +XXX,XX @@ int arm_gdb_get_svereg(CPUARMState *env, GByteArray *buf, int reg);
 int arm_gdb_set_svereg(CPUARMState *env, uint8_t *buf, int reg);
 int aarch64_fpu_gdb_get_reg(CPUARMState *env, GByteArray *buf, int reg);
 int aarch64_fpu_gdb_set_reg(CPUARMState *env, uint8_t *buf, int reg);
+void arm_cpu_sve_finalize(ARMCPU *cpu, Error **errp);
+void arm_cpu_pauth_finalize(ARMCPU *cpu, Error **errp);
+void arm_cpu_lpa2_finalize(ARMCPU *cpu, Error **errp);
 #endif
 
 #ifdef CONFIG_USER_ONLY
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ void arm_cpu_finalize_features(ARMCPU *cpu, Error **errp)
 {
     Error *local_err = NULL;
 
+#ifdef TARGET_AARCH64
     if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64)) {
         arm_cpu_sve_finalize(cpu, &local_err);
         if (local_err != NULL) {
@@ -XXX,XX +XXX,XX @@ void arm_cpu_finalize_features(ARMCPU *cpu, Error **errp)
             return;
         }
     }
+#endif
 
     if (kvm_enabled()) {
         kvm_arm_steal_time_finalize(cpu, &local_err);
-- 
2.25.1

From: Richard Henderson <richard.henderson@linaro.org>

These functions are not used outside cpu64.c,
so make them static.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220620175235.60881-17-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h   | 3 ---
 target/arm/cpu64.c | 4 ++--
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ int aarch64_cpu_gdb_write_register(CPUState *cpu, uint8_t *buf, int reg);
 void aarch64_sve_narrow_vq(CPUARMState *env, unsigned vq);
 void aarch64_sve_change_el(CPUARMState *env, int old_el,
                            int new_el, bool el0_a64);
-void aarch64_add_sve_properties(Object *obj);
-void aarch64_add_pauth_properties(Object *obj);
 void arm_reset_sve_state(CPUARMState *env);
 
 /*
@@ -XXX,XX +XXX,XX @@ static inline void aarch64_sve_narrow_vq(CPUARMState *env, unsigned vq) { }
 static inline void aarch64_sve_change_el(CPUARMState *env, int o,
                                          int n, bool a)
 { }
-static inline void aarch64_add_sve_properties(Object *obj) { }
 #endif
 
 void aarch64_sync_32_to_64(CPUARMState *env);
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ static void cpu_arm_get_default_vec_len(Object *obj, Visitor *v,
 }
 #endif
 
-void aarch64_add_sve_properties(Object *obj)
+static void aarch64_add_sve_properties(Object *obj)
 {
     ARMCPU *cpu = ARM_CPU(obj);
     uint32_t vq;
@@ -XXX,XX +XXX,XX @@ static Property arm_cpu_pauth_property =
 static Property arm_cpu_pauth_impdef_property =
     DEFINE_PROP_BOOL("pauth-impdef", ARMCPU, prop_pauth_impdef, false);
 
-void aarch64_add_pauth_properties(Object *obj)
+static void aarch64_add_pauth_properties(Object *obj)
 {
     ARMCPU *cpu = ARM_CPU(obj);
 
-- 
2.25.1

From: Richard Henderson <richard.henderson@linaro.org>

Mirror the properties for SVE.  The main difference is
that any arbitrary set of power-of-2 vector lengths may be
supported, without the stricter constraints that apply to SVE.

Include a property to control FEAT_SME_FA64, as failing
to restrict the runtime to the proper subset of insns
could be a major source of bugs.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20220620175235.60881-18-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
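Note for readers of the series: a sketch of the decision table in arm_cpu_sme_finalize(), reduced to plain integers so it can be tried in isolation. pow2_vq_map, sme_finalize_sketch and the FIN_* codes are hypothetical names. The power-of-2 map is computed here rather than taken from QEMU's SVE_VQ_POW2_MAP definition, which is not part of this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { FIN_OK, FIN_ERR_ALL_DISABLED, FIN_ERR_SME_OFF };

/* Bits for vq = 1, 2, 4, 8, 16, i.e. 128 to 2048 bits in powers of 2. */
static uint32_t pow2_vq_map(void)
{
    uint32_t map = 0;
    for (unsigned vq = 1; vq <= 16; vq <<= 1) {
        map |= UINT32_C(1) << (vq - 1);
    }
    return map;
}

/* Hypothetical reduction of the finalize logic to its inputs. */
static int sme_finalize_sketch(uint32_t map, uint32_t init,
                               uint32_t supported, bool sme_on,
                               uint32_t *out_map)
{
    if (map == 0) {
        if (!sme_on) {
            *out_map = 0;          /* SME off, nothing requested */
            return FIN_OK;
        }
        /* Nothing enabled explicitly: default to every supported
         * length that the user did not explicitly turn off. */
        map = supported & ~init;
        if (map == 0) {
            return FIN_ERR_ALL_DISABLED;
        }
    } else if (!sme_on) {
        return FIN_ERR_SME_OFF;    /* lengths requested with sme=off */
    }
    *out_map = map;
    return FIN_OK;
}
```

Passing init = 0x8 with an empty map models sme512=off: the result keeps 128, 256, 1024 and 2048 bits, matching the documentation example.
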
 docs/system/arm/cpu-features.rst |  56 +++++++++++++++
 target/arm/cpu.h                 |   2 +
 target/arm/internals.h           |   1 +
 target/arm/cpu.c                 |  14 +++-
 target/arm/cpu64.c               | 114 +++++++++++++++++++++++++++++--
 5 files changed, 180 insertions(+), 7 deletions(-)

diff --git a/docs/system/arm/cpu-features.rst b/docs/system/arm/cpu-features.rst
index XXXXXXX..XXXXXXX 100644
--- a/docs/system/arm/cpu-features.rst
+++ b/docs/system/arm/cpu-features.rst
@@ -XXX,XX +XXX,XX @@ verbose command lines.  However, the recommended way to select vector
 lengths is to explicitly enable each desired length.  Therefore only
 example's (1), (4), and (6) exhibit recommended uses of the properties.
 
+SME CPU Property Examples
+-------------------------
+
+  1) Disable SME::
+
+     $ qemu-system-aarch64 -M virt -cpu max,sme=off
+
+  2) Implicitly enable all vector lengths for the ``max`` CPU type::
+
+     $ qemu-system-aarch64 -M virt -cpu max
+
+  3) Only enable the 256-bit vector length::
+
+     $ qemu-system-aarch64 -M virt -cpu max,sme256=on
+
+  4) Enable the 256-bit and 1024-bit vector lengths::
+
+     $ qemu-system-aarch64 -M virt -cpu max,sme256=on,sme1024=on
+
+  5) Disable the 512-bit vector length.  This results in all the other
+     lengths supported by ``max`` defaulting to enabled
+     (128, 256, 1024 and 2048)::
+
+     $ qemu-system-aarch64 -M virt -cpu max,sme512=off
+
 SVE User-mode Default Vector Length Property
 --------------------------------------------
 
@@ -XXX,XX +XXX,XX @@ length supported by QEMU is 256.
 
 If this property is set to ``-1`` then the default vector length
 is set to the maximum possible length.
+
+SME CPU Properties
+==================
+
+The SME CPU properties are much like the SVE properties: ``sme`` is
+used to enable or disable the entire SME feature, and ``sme<N>`` is
+used to enable or disable specific vector lengths.  Finally,
+``sme_fa64`` is used to enable or disable ``FEAT_SME_FA64``, which
+allows execution of the "full a64" instruction set while Streaming
+SVE mode is enabled.
+
+SME is not supported by KVM at this time.
+
+At least one vector length must be enabled when ``sme`` is enabled,
+and all vector lengths must be powers of 2.  The maximum vector
+length supported by QEMU is 2048 bits.  Otherwise, there are no
+additional constraints on the set of vector lengths supported by SME.
+
+SME User-mode Default Vector Length Property
+--------------------------------------------
+
+For qemu-aarch64, the CPU property ``sme-default-vector-length=N`` is
+defined to mirror the Linux kernel parameter file
+``/proc/sys/abi/sme_default_vector_length``.  The default length, ``N``,
+is in units of bytes and must be between 16 and 8192.
+If not specified, the default vector length is 32.
+
+As with ``sve-default-vector-length``, if the default length is larger
+than the maximum vector length enabled, the actual vector length will
+be reduced.  If this property is set to ``-1`` then the default vector
+length is set to the maximum possible length.
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ struct ArchCPU {
 #ifdef CONFIG_USER_ONLY
     /* Used to set the default vector length at process start. */
     uint32_t sve_default_vq;
+    uint32_t sme_default_vq;
 #endif
 
     ARMVQMap sve_vq;
+    ARMVQMap sme_vq;
 
     /* Generic timer counter frequency, in Hz */
     uint64_t gt_cntfrq_hz;
diff --git a/target/arm/internals.h b/target/arm/internals.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -XXX,XX +XXX,XX @@ int arm_gdb_set_svereg(CPUARMState *env, uint8_t *buf, int reg);
 int aarch64_fpu_gdb_get_reg(CPUARMState *env, GByteArray *buf, int reg);
 int aarch64_fpu_gdb_set_reg(CPUARMState *env, uint8_t *buf, int reg);
 void arm_cpu_sve_finalize(ARMCPU *cpu, Error **errp);
+void arm_cpu_sme_finalize(ARMCPU *cpu, Error **errp);
 void arm_cpu_pauth_finalize(ARMCPU *cpu, Error **errp);
 void arm_cpu_lpa2_finalize(ARMCPU *cpu, Error **errp);
 #endif
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_initfn(Object *obj)
 #ifdef CONFIG_USER_ONLY
 # ifdef TARGET_AARCH64
     /*
-     * The linux kernel defaults to 512-bit vectors, when sve is supported.
-     * See documentation for /proc/sys/abi/sve_default_vector_length, and
-     * our corresponding sve-default-vector-length cpu property.
+     * The linux kernel defaults to 512-bit for SVE, and 256-bit for SME.
+     * These values were chosen to fit within the default signal frame.
+     * See documentation for /proc/sys/abi/{sve,sme}_default_vector_length,
+     * and our corresponding cpu property.
      */
     cpu->sve_default_vq = 4;
+    cpu->sme_default_vq = 2;
 # endif
 #else
     /* Our inbound IRQ and FIQ lines */
@@ -XXX,XX +XXX,XX @@ void arm_cpu_finalize_features(ARMCPU *cpu, Error **errp)
             return;
         }
 
+        arm_cpu_sme_finalize(cpu, &local_err);
+        if (local_err != NULL) {
+            error_propagate(errp, local_err);
+            return;
+        }
+
         arm_cpu_pauth_finalize(cpu, &local_err);
         if (local_err != NULL) {
             error_propagate(errp, local_err);
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ static void cpu_arm_get_vq(Object *obj, Visitor *v, const char *name,
     ARMCPU *cpu = ARM_CPU(obj);
     ARMVQMap *vq_map = opaque;
     uint32_t vq = atoi(&name[3]) / 128;
+    bool sve = vq_map == &cpu->sve_vq;
     bool value;
 
-    /* All vector lengths are disabled when SVE is off. */
-    if (!cpu_isar_feature(aa64_sve, cpu)) {
+    /* All vector lengths are disabled when the feature is off. */
+    if (sve
+        ? !cpu_isar_feature(aa64_sve, cpu)
+        : !cpu_isar_feature(aa64_sme, cpu)) {
         value = false;
     } else {
         value = extract32(vq_map->map, vq - 1, 1);
@@ -XXX,XX +XXX,XX @@ static void cpu_arm_set_sve(Object *obj, bool value, Error **errp)
     cpu->isar.id_aa64pfr0 = t;
 }
 
+void arm_cpu_sme_finalize(ARMCPU *cpu, Error **errp)
+{
+    uint32_t vq_map = cpu->sme_vq.map;
+    uint32_t vq_init = cpu->sme_vq.init;
+    uint32_t vq_supported = cpu->sme_vq.supported;
+    uint32_t vq;
+
+    if (vq_map == 0) {
+        if (!cpu_isar_feature(aa64_sme, cpu)) {
+            cpu->isar.id_aa64smfr0 = 0;
+            return;
+        }
+
+        /* TODO: KVM will require limitations via SMCR_EL2. */
+        vq_map = vq_supported & ~vq_init;
+
+        if (vq_map == 0) {
+            vq = ctz32(vq_supported) + 1;
+            error_setg(errp, "cannot disable sme%d", vq * 128);
+            error_append_hint(errp, "All SME vector lengths are disabled.\n");
+            error_append_hint(errp, "With SME enabled, at least one "
+                              "vector length must be enabled.\n");
+            return;
+        }
+    } else {
+        if (!cpu_isar_feature(aa64_sme, cpu)) {
+            vq = 32 - clz32(vq_map);
+            error_setg(errp, "cannot enable sme%d", vq * 128);
+            error_append_hint(errp, "SME must be enabled to enable "
+                              "vector lengths.\n");
+            error_append_hint(errp, "Add sme=on to the CPU property list.\n");
+            return;
+        }
+        /* TODO: KVM will require limitations via SMCR_EL2. */
+    }
+
+    cpu->sme_vq.map = vq_map;
+}
+
+static bool cpu_arm_get_sme(Object *obj, Error **errp)
+{
+    ARMCPU *cpu = ARM_CPU(obj);
+    return cpu_isar_feature(aa64_sme, cpu);
+}
+
+static void cpu_arm_set_sme(Object *obj, bool value, Error **errp)
+{
+    ARMCPU *cpu = ARM_CPU(obj);
+    uint64_t t;
+
+    t = cpu->isar.id_aa64pfr1;
+    t = FIELD_DP64(t, ID_AA64PFR1, SME, value);
+    cpu->isar.id_aa64pfr1 = t;
+}
+
+static bool cpu_arm_get_sme_fa64(Object *obj, Error **errp)
+{
+    ARMCPU *cpu = ARM_CPU(obj);
+    return cpu_isar_feature(aa64_sme, cpu) &&
+           cpu_isar_feature(aa64_sme_fa64, cpu);
+}
+
+static void cpu_arm_set_sme_fa64(Object *obj, bool value, Error **errp)
+{
+    ARMCPU *cpu = ARM_CPU(obj);
+    uint64_t t;
+
+    t = cpu->isar.id_aa64smfr0;
+    t = FIELD_DP64(t, ID_AA64SMFR0, FA64, value);
+    cpu->isar.id_aa64smfr0 = t;
+}
+
 #ifdef CONFIG_USER_ONLY
-/* Mirror linux /proc/sys/abi/sve_default_vector_length. */
+/* Mirror linux /proc/sys/abi/{sve,sme}_default_vector_length. */
 static void cpu_arm_set_default_vec_len(Object *obj, Visitor *v,
                                         const char *name, void *opaque,
                                         Error **errp)
@@ -XXX,XX +XXX,XX @@ static void cpu_arm_set_default_vec_len(Object *obj, Visitor *v,
      * and is the maximum architectural width of ZCR_ELx.LEN.
      */
     if (remainder || default_vq < 1 || default_vq > 512) {
-        error_setg(errp, "cannot set sve-default-vector-length");
+        ARMCPU *cpu = ARM_CPU(obj);
+        const char *which =
+            (ptr_default_vq == &cpu->sve_default_vq ? "sve" : "sme");
+
+        error_setg(errp, "cannot set %s-default-vector-length", which);
         if (remainder) {
             error_append_hint(errp, "Vector length not a multiple of 16\n");
         } else if (default_vq < 1) {
@@ -XXX,XX +XXX,XX @@ static void aarch64_add_sve_properties(Object *obj)
 #endif
 }
 
+static void aarch64_add_sme_properties(Object *obj)
+{
+    ARMCPU *cpu = ARM_CPU(obj);
+    uint32_t vq;
+
+    object_property_add_bool(obj, "sme", cpu_arm_get_sme, cpu_arm_set_sme);
+    object_property_add_bool(obj, "sme_fa64", cpu_arm_get_sme_fa64,
+                             cpu_arm_set_sme_fa64);
+
+    for (vq = 1; vq <= ARM_MAX_VQ; vq <<= 1) {
+        char name[8];
+        sprintf(name, "sme%d", vq * 128);
+        object_property_add(obj, name, "bool", cpu_arm_get_vq,
+                            cpu_arm_set_vq, NULL, &cpu->sme_vq);
+    }
+
+#ifdef CONFIG_USER_ONLY
+    /* Mirror linux /proc/sys/abi/sme_default_vector_length. */
+    object_property_add(obj, "sme-default-vector-length", "int32",
+                        cpu_arm_get_default_vec_len,
+                        cpu_arm_set_default_vec_len, NULL,
+                        &cpu->sme_default_vq);
+#endif
+}
+
 void arm_cpu_pauth_finalize(ARMCPU *cpu, Error **errp)
 {
     int arch_val = 0, impdef_val = 0;
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
 #endif
 
     cpu->sve_vq.supported = MAKE_64BIT_MASK(0, ARM_MAX_VQ);
+    cpu->sme_vq.supported = SVE_VQ_POW2_MAP;
 
     aarch64_add_pauth_properties(obj);
     aarch64_add_sve_properties(obj);
+    aarch64_add_sme_properties(obj);
     object_property_add(obj, "sve-max-vq", "uint32", cpu_max_get_sve_max_vq,
                         cpu_max_set_sve_max_vq, NULL, NULL);
     qdev_property_add_static(DEVICE(obj), &arm_cpu_lpa2_property);
-- 
2.25.1

From: Richard Henderson <richard.henderson@linaro.org>

When Streaming SVE mode is enabled, the size is taken from
SMCR_ELx instead of ZCR_ELx.  The format is shared, but the
set of vector lengths is not.  Further, Streaming SVE does
not require any particular length to be supported.

Adjust sve_vqm1_for_el to pass the current value of PSTATE.SM
to the new function.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220620175235.60881-19-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
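Note for readers of the series: a sketch of the final selection step in sve_vqm1_for_el_sm(), after the per-EL limits have been folded into len. pick_vqm1 is a hypothetical name, and portable loops stand in for QEMU's clz32()/ctz32(). The MIN() chain over the control registers is deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

/*
 * map: enabled lengths, bit n meaning (n + 1) quadwords.
 * len: requested limit in quadwords minus 1 (the ZCR/SMCR LEN scale).
 * Returns the chosen length in quadwords minus 1.
 */
static uint32_t pick_vqm1(uint32_t map, uint32_t len)
{
    assert(map != 0 && len < 16);

    /* Keep only the lengths at or below the requested limit. */
    uint32_t masked = map & ((UINT32_C(1) << (len + 1)) - 1);
    uint32_t i;

    if (masked != 0) {
        /* Largest enabled length that fits: highest set bit. */
        for (i = 31; !(masked & (UINT32_C(1) << i)); i--) {
            /* scan downwards */
        }
        return i;
    }

    /*
     * Nothing fits. This can only happen for streaming mode, where
     * bit 0 need not be set; fall back to the smallest enabled length.
     */
    for (i = 0; !(map & (UINT32_C(1) << i)); i++) {
        /* scan upwards */
    }
    return i;
}
```

For a power-of-2 map and len = 5 the result is 3 (512 bits); for a map with only the 512-bit length enabled and len = 1 the fallback also yields 3.
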
 target/arm/cpu.h    |  9 +++++++--
 target/arm/helper.c | 32 +++++++++++++++++++++++++-------
 2 files changed, 32 insertions(+), 9 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ int sve_exception_el(CPUARMState *env, int cur_el);
 int sme_exception_el(CPUARMState *env, int cur_el);
 
 /**
- * sve_vqm1_for_el:
+ * sve_vqm1_for_el_sm:
  * @env: CPUARMState
  * @el: exception level
+ * @sm: streaming mode
  *
- * Compute the current SVE vector length for @el, in units of
+ * Compute the current vector length for @el & @sm, in units of
  * Quadwords Minus 1 -- the same scale used for ZCR_ELx.LEN.
+ * If @sm, compute for SVL, otherwise NVL.
  */
+uint32_t sve_vqm1_for_el_sm(CPUARMState *env, int el, bool sm);
+
+/* Likewise, but using @sm = PSTATE.SM. */
 uint32_t sve_vqm1_for_el(CPUARMState *env, int el);
 
 static inline bool is_a64(CPUARMState *env)
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ int sme_exception_el(CPUARMState *env, int el)
 /*
  * Given that SVE is enabled, return the vector length for EL.
  */
-uint32_t sve_vqm1_for_el(CPUARMState *env, int el)
+uint32_t sve_vqm1_for_el_sm(CPUARMState *env, int el, bool sm)
 {
     ARMCPU *cpu = env_archcpu(env);
-    uint32_t len = cpu->sve_max_vq - 1;
+    uint64_t *cr = env->vfp.zcr_el;
+    uint32_t map = cpu->sve_vq.map;
+    uint32_t len = ARM_MAX_VQ - 1;
+
+    if (sm) {
+        cr = env->vfp.smcr_el;
+        map = cpu->sme_vq.map;
+    }
 
     if (el <= 1 && !el_is_in_host(env, el)) {
-        len = MIN(len, 0xf & (uint32_t)env->vfp.zcr_el[1]);
+        len = MIN(len, 0xf & (uint32_t)cr[1]);
     }
     if (el <= 2 && arm_feature(env, ARM_FEATURE_EL2)) {
-        len = MIN(len, 0xf & (uint32_t)env->vfp.zcr_el[2]);
+        len = MIN(len, 0xf & (uint32_t)cr[2]);
     }
     if (arm_feature(env, ARM_FEATURE_EL3)) {
-        len = MIN(len, 0xf & (uint32_t)env->vfp.zcr_el[3]);
+        len = MIN(len, 0xf & (uint32_t)cr[3]);
     }
 
-    len = 31 - clz32(cpu->sve_vq.map & MAKE_64BIT_MASK(0, len + 1));
-    return len;
+    map &= MAKE_64BIT_MASK(0, len + 1);
+    if (map != 0) {
+        return 31 - clz32(map);
+    }
+
+    /* Bit 0 is always set for Normal SVE -- not so for Streaming SVE. */
+    assert(sm);
+    return ctz32(cpu->sme_vq.map);
+}
+
+uint32_t sve_vqm1_for_el(CPUARMState *env, int el)
+{
+    return sve_vqm1_for_el_sm(env, el, FIELD_EX64(env->svcr, SVCR, SM));
 }
 
 static void zcr_write(CPUARMState *env, const ARMCPRegInfo *ri,
-- 
2.25.1

From: Richard Henderson <richard.henderson@linaro.org>

We need SVL separate from VL for RDSVL et al, as well as
ZA storage loads and stores, which do not require PSTATE.SM.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220620175235.60881-20-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
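Note for readers of the series: a sketch of how a 4-bit SVL field at bit 24 round-trips between the cached flags and the byte count used by the translator. flags_set_svl and flags_svl_bytes are hypothetical helpers, and a plain 32-bit word stands in for QEMU's TB flag storage and its DP_TBFLAG_A64/EX_TBFLAG_A64 macros.

```c
#include <assert.h>
#include <stdint.h>

#define SVL_SHIFT 24              /* FIELD(TBFLAG_A64, SVL, 24, 4) */
#define SVL_MASK  UINT32_C(0xF)   /* four bits: quadwords minus 1 */

/* Hypothetical deposit: store (quadwords - 1) into the SVL field. */
static uint32_t flags_set_svl(uint32_t flags, uint32_t vqm1)
{
    assert(vqm1 <= SVL_MASK);
    return (flags & ~(SVL_MASK << SVL_SHIFT)) | (vqm1 << SVL_SHIFT);
}

/* Hypothetical extract: field value plus one, times 16 bytes. */
static unsigned flags_svl_bytes(uint32_t flags)
{
    return (((flags >> SVL_SHIFT) & SVL_MASK) + 1) * 16;
}
```

Note that a field left at zero decodes as 16 bytes, which is also what an untouched field yields when the SVL is not computed.
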
 target/arm/cpu.h           | 12 ++++++++++++
 target/arm/translate.h     |  1 +
 target/arm/helper.c        |  8 +++++++-
 target/arm/translate-a64.c |  1 +
 4 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ FIELD(TBFLAG_A64, MTE0_ACTIVE, 19, 1)
 FIELD(TBFLAG_A64, SMEEXC_EL, 20, 2)
 FIELD(TBFLAG_A64, PSTATE_SM, 22, 1)
 FIELD(TBFLAG_A64, PSTATE_ZA, 23, 1)
+FIELD(TBFLAG_A64, SVL, 24, 4)
 
 /*
  * Helpers for using the above.
@@ -XXX,XX +XXX,XX @@ static inline int sve_vq(CPUARMState *env)
     return EX_TBFLAG_A64(env->hflags, VL) + 1;
 }
 
+/**
+ * sme_vq
+ * @env: the cpu context
+ *
+ * Return the SVL cached within env->hflags, in units of quadwords.
+ */
+static inline int sme_vq(CPUARMState *env)
+{
+    return EX_TBFLAG_A64(env->hflags, SVL) + 1;
+}
+
 static inline bool bswap_code(bool sctlr_b)
 {
 #ifdef CONFIG_USER_ONLY
diff --git a/target/arm/translate.h b/target/arm/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ typedef struct DisasContext {
     int sve_excp_el; /* SVE exception EL or 0 if enabled */
     int sme_excp_el; /* SME exception EL or 0 if enabled */
     int vl;          /* current vector length in bytes */
+    int svl;         /* current streaming vector length in bytes */
     bool vfp_enabled; /* FP enabled via FPSCR.EN */
     int vec_len;
     int vec_stride;
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static CPUARMTBFlags rebuild_hflags_a64(CPUARMState *env, int el, int fp_el,
         DP_TBFLAG_A64(flags, SVEEXC_EL, sve_el);
     }
     if (cpu_isar_feature(aa64_sme, env_archcpu(env))) {
-        DP_TBFLAG_A64(flags, SMEEXC_EL, sme_exception_el(env, el));
+        int sme_el = sme_exception_el(env, el);
+
+        DP_TBFLAG_A64(flags, SMEEXC_EL, sme_el);
+        if (sme_el == 0) {
+            /* Similarly, do not compute SVL if SME is disabled. */
+            DP_TBFLAG_A64(flags, SVL, sve_vqm1_for_el_sm(env, el, true));
+        }
         if (FIELD_EX64(env->svcr, SVCR, SM)) {
             DP_TBFLAG_A64(flags, PSTATE_SM, 1);
         }
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_init_disas_context(DisasContextBase *dcbase,
     dc->sve_excp_el = EX_TBFLAG_A64(tb_flags, SVEEXC_EL);
     dc->sme_excp_el = EX_TBFLAG_A64(tb_flags, SMEEXC_EL);
     dc->vl = (EX_TBFLAG_A64(tb_flags, VL) + 1) * 16;
+    dc->svl = (EX_TBFLAG_A64(tb_flags, SVL) + 1) * 16;
     dc->pauth_active = EX_TBFLAG_A64(tb_flags, PAUTH_ACTIVE);
     dc->bt = EX_TBFLAG_A64(tb_flags, BT);
     dc->btype = EX_TBFLAG_A64(tb_flags, BTYPE);
-- 
2.25.1

From: Richard Henderson <richard.henderson@linaro.org>

We will need these functions in translate-sme.c.
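
As a rough illustration of what the moved helpers compute, the sketch
below reproduces the arithmetic in standalone form. ALIGN_UP_POW2,
pred_bytes_for_vl and size_for_gvec_sketch are hypothetical names used
only here; the real helpers take a DisasContext and use QEMU_ALIGN_UP.

```c
#include <assert.h>

/*
 * Round n up to a multiple of m, where m is a power of two.
 * Local stand-in for QEMU_ALIGN_UP, valid only for power-of-two m.
 */
#define ALIGN_UP_POW2(n, m) (((n) + (m) - 1) & ~((m) - 1))

/* A predicate register has one bit per vector byte: vl bytes / 8. */
static inline int pred_bytes_for_vl(int vl_bytes)
{
    assert(vl_bytes >= 16 && vl_bytes % 16 == 0);
    return vl_bytes >> 3;
}

/* Sizes up to 8 bytes become 8; larger sizes round up to 16 bytes. */
static inline int size_for_gvec_sketch(int size)
{
    assert(size > 0);
    return size <= 8 ? 8 : ALIGN_UP_POW2(size, 16);
}
```

For example, a 192 byte vector has a 24 byte predicate, which this
rounding turns into a 32 byte operation size.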

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220620175235.60881-21-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate-a64.h | 38 ++++++++++++++++++++++++++++++++++++++
 target/arm/translate-sve.c | 36 ------------------------------------
 2 files changed, 38 insertions(+), 36 deletions(-)

diff --git a/target/arm/translate-a64.h b/target/arm/translate-a64.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-a64.h
+++ b/target/arm/translate-a64.h
@@ -XXX,XX +XXX,XX @@ static inline int vec_full_reg_size(DisasContext *s)
     return s->vl;
 }
 
+/*
+ * Return the offset info CPUARMState of the predicate vector register Pn.
+ * Note for this purpose, FFR is P16.
+ */
+static inline int pred_full_reg_offset(DisasContext *s, int regno)
+{
+    return offsetof(CPUARMState, vfp.pregs[regno]);
+}
+
+/* Return the byte size of the whole predicate register, VL / 64.  */
+static inline int pred_full_reg_size(DisasContext *s)
+{
+    return s->vl >> 3;
+}
+
+/*
+ * Round up the size of a register to a size allowed by
+ * the tcg vector infrastructure.  Any operation which uses this
+ * size may assume that the bits above pred_full_reg_size are zero,
+ * and must leave them the same way.
+ *
+ * Note that this is not needed for the vector registers as they
+ * are always properly sized for tcg vectors.
+ */
+static inline int size_for_gvec(int size)
+{
+    if (size <= 8) {
+        return 8;
+    } else {
+        return QEMU_ALIGN_UP(size, 16);
+    }
+}
+
+static inline int pred_gvec_reg_size(DisasContext *s)
+{
+    return size_for_gvec(pred_full_reg_size(s));
+}
+
 bool disas_sve(DisasContext *, uint32_t);
 
 void gen_gvec_rax1(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -XXX,XX +XXX,XX @@ static inline int msz_dtype(DisasContext *s, int msz)
  * Implement all of the translator functions referenced by the decoder.
  */
 
-/* Return the offset info CPUARMState of the predicate vector register Pn.
- * Note for this purpose, FFR is P16.
- */
-static inline int pred_full_reg_offset(DisasContext *s, int regno)
-{
-    return offsetof(CPUARMState, vfp.pregs[regno]);
-}
-
-/* Return the byte size of the whole predicate register, VL / 64.  */
-static inline int pred_full_reg_size(DisasContext *s)
-{
-    return s->vl >> 3;
-}
-
-/* Round up the size of a register to a size allowed by
- * the tcg vector infrastructure.  Any operation which uses this
- * size may assume that the bits above pred_full_reg_size are zero,
- * and must leave them the same way.
- *
- * Note that this is not needed for the vector registers as they
- * are always properly sized for tcg vectors.
- */
-static int size_for_gvec(int size)
-{
-    if (size <= 8) {
-        return 8;
-    } else {
-        return QEMU_ALIGN_UP(size, 16);
-    }
-}
-
-static int pred_gvec_reg_size(DisasContext *s)
-{
-    return size_for_gvec(pred_full_reg_size(s));
-}
-
 /* Invoke an out-of-line helper on 2 Zregs. */
 static bool gen_gvec_ool_zz(DisasContext *s, gen_helper_gvec_2 *fn,
                             int rd, int rn, int data)
-- 
2.25.1

From: Richard Henderson <richard.henderson@linaro.org>

Move the code in hw/arm/virt.c that is supposed to handle v7
into arm_pamax(), so that the computation lives in one function.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reported-by: He Zhe <zhe.he@windriver.com>
Message-id: 20220619001541.131672-2-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/virt.c    | 10 +---------
 target/arm/ptw.c | 24 ++++++++++++++++--------
 2 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -XXX,XX +XXX,XX @@ static void machvirt_init(MachineState *machine)
         cpuobj = object_new(possible_cpus->cpus[0].type);
         armcpu = ARM_CPU(cpuobj);
 
-        if (object_property_get_bool(cpuobj, "aarch64", NULL)) {
-            pa_bits = arm_pamax(armcpu);
-        } else if (arm_feature(&armcpu->env, ARM_FEATURE_LPAE)) {
-            /* v7 with LPAE */
-            pa_bits = 40;
-        } else {
-            /* Anything else */
-            pa_bits = 32;
-        }
+        pa_bits = arm_pamax(armcpu);
 
         object_unref(cpuobj);
 
diff --git a/target/arm/ptw.c b/target/arm/ptw.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/ptw.c
+++ b/target/arm/ptw.c
@@ -XXX,XX +XXX,XX @@ static const uint8_t pamax_map[] = {
 /* The cpu-specific constant value of PAMax; also used by hw/arm/virt. */
 unsigned int arm_pamax(ARMCPU *cpu)
 {
-    unsigned int parange =
-        FIELD_EX64(cpu->isar.id_aa64mmfr0, ID_AA64MMFR0, PARANGE);
+    if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64)) {
+        unsigned int parange =
+            FIELD_EX64(cpu->isar.id_aa64mmfr0, ID_AA64MMFR0, PARANGE);
 
-    /*
-     * id_aa64mmfr0 is a read-only register so values outside of the
-     * supported mappings can be considered an implementation error.
-     */
-    assert(parange < ARRAY_SIZE(pamax_map));
-    return pamax_map[parange];
+        /*
+         * id_aa64mmfr0 is a read-only register so values outside of the
+         * supported mappings can be considered an implementation error.
+         */
+        assert(parange < ARRAY_SIZE(pamax_map));
+        return pamax_map[parange];
+    }
+    if (arm_feature(&cpu->env, ARM_FEATURE_LPAE)) {
+        /* v7 with LPAE */
+        return 40;
+    }
+    /* Anything else */
+    return 32;
 }
 
 /*
-- 
2.25.1

From: Richard Henderson <richard.henderson@linaro.org>

In machvirt_init we create a cpu but do not fully initialize it.
Thus the propagation of V7VE to LPAE has not been done, and we
compute the wrong value for some v7 cpus, e.g. cortex-a15.
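
A simplified sketch of the resulting decision, for illustration only.
struct cpu_sketch and pamax_sketch are hypothetical and stand in for
ARMCPU and arm_pamax(); the table lists the architectural PARange
encodings 0..6 and is assumed here rather than copied from the source.

```c
#include <assert.h>
#include <stdbool.h>

/* Architectural ID_AA64MMFR0_EL1.PARange encodings 0..6, in bits. */
static const unsigned char pamax_map_sketch[] = { 32, 36, 40, 42, 44, 48, 52 };

/* Hypothetical, simplified CPU description; not QEMU's ARMCPU. */
struct cpu_sketch {
    bool aarch64;
    bool lpae;
    bool v7ve;
    unsigned parange;
};

static unsigned pamax_sketch(const struct cpu_sketch *cpu)
{
    if (cpu->aarch64) {
        assert(cpu->parange < sizeof(pamax_map_sketch));
        return pamax_map_sketch[cpu->parange];
    }
    /*
     * V7VE implies LPAE, but on a not yet realized cpu that
     * implication has not been propagated, so test both.
     */
    if (cpu->lpae || cpu->v7ve) {
        return 40;
    }
    return 32;
}
```

With only the V7VE flag set, as on a not yet realized cortex-a15, the
sketch returns 40 rather than 32.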

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1078
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reported-by: He Zhe <zhe.he@windriver.com>
Message-id: 20220619001541.131672-3-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/ptw.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/target/arm/ptw.c b/target/arm/ptw.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/ptw.c
+++ b/target/arm/ptw.c
@@ -XXX,XX +XXX,XX @@ unsigned int arm_pamax(ARMCPU *cpu)
         assert(parange < ARRAY_SIZE(pamax_map));
         return pamax_map[parange];
     }
-    if (arm_feature(&cpu->env, ARM_FEATURE_LPAE)) {
+
+    /*
+     * In machvirt_init, we call arm_pamax on a cpu that is not fully
+     * initialized, so we can't rely on the propagation done in realize.
+     */
+    if (arm_feature(&cpu->env, ARM_FEATURE_LPAE) ||
+        arm_feature(&cpu->env, ARM_FEATURE_V7VE)) {
         /* v7 with LPAE */
         return 40;
     }
-- 
2.25.1

Changes v1->v2 (fixing CI failures in v1, added a couple of
extra patches in an attempt to avoid having to do a last
minute arm pullreq next week):
 * new patch to hopefully fix the build issue with the SVE/SME sysregs test
 * dropped the IC IVAU test case patch
 * new patch: fix over-length shift
 * new patches: define neoverse-v1

thanks
-- PMM

The following changes since commit 2a6ae69154542caa91dd17c40fd3f5ffbec300de:

Merge tag 'pull-maintainer-ominbus-030723-1' of https://gitlab.com/stsquad/qemu into staging (2023-07-04 08:36:44 +0200)

are available in the Git repository at:

https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20230706

for you to fetch changes up to c41077235168140cdd4a34fce9bd95c3d30efe9c:

target/arm: Avoid over-length shift in arm_cpu_sve_finalize() error case (2023-07-06 13:36:51 +0100)

----------------------------------------------------------------
target-arm queue:
 * Add raw_writes ops for register whose write induce TLB maintenance
 * hw/arm/sbsa-ref: use XHCI to replace EHCI
 * Avoid splitting Zregs across lines in dump
 * Dump ZA[] when active
 * Fix SME full tile indexing
 * Handle IC IVAU to improve compatibility with JITs
 * xlnx-canfd-test: Fix code coverity issues
 * gdbstub: Guard M-profile code with CONFIG_TCG
 * allwinner-sramc: Set class_size
 * target/xtensa: Assert that interrupt level is within bounds
 * Avoid over-length shift in arm_cpu_sve_finalize() error case
 * Define new 'neoverse-v1' CPU type

----------------------------------------------------------------
Akihiko Odaki (1):
      hw: arm: allwinner-sramc: Set class_size

Eric Auger (1):
      target/arm: Add raw_writes ops for register whose write induce TLB maintenance

Fabiano Rosas (1):
      target/arm: gdbstub: Guard M-profile code with CONFIG_TCG

John Högberg (1):
      target/arm: Handle IC IVAU to improve compatibility with JITs

Peter Maydell (5):
      tests/tcg/aarch64/sysregs.c: Use S syntax for id_aa64zfr0_el1 and id_aa64smfr0_el1
      target/xtensa: Assert that interrupt level is within bounds
      target/arm: Suppress more TCG unimplemented features in ID registers
      target/arm: Define neoverse-v1
      target/arm: Avoid over-length shift in arm_cpu_sve_finalize() error case

Richard Henderson (3):
      target/arm: Avoid splitting Zregs across lines in dump
      target/arm: Dump ZA[] when active
      target/arm: Fix SME full tile indexing

Vikram Garhwal (1):
      tests/qtest: xlnx-canfd-test: Fix code coverity issues

Yuquan Wang (1):
      hw/arm/sbsa-ref: use XHCI to replace EHCI

From: Eric Auger <eric.auger@redhat.com>

Some registers whose 'cooked' writefns induce TLB maintenance do
not have raw_writefn ops defined. If only the writefn op is set
(i.e. no raw_writefn is provided), the cooked function is assumed
to also work as the raw one. For those registers it is not obvious
that tlb_flush works in KVM mode, so it is safer to set the raw
write explicitly.
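
The fallback described above can be sketched as follows. This is an
illustrative model only: struct reg_sketch, write_raw_sketch and the
two write functions are hypothetical and much simpler than QEMU's
ARMCPRegInfo machinery.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct reg_sketch;
typedef void write_fn(struct reg_sketch *r, uint64_t v);

/* Hypothetical register model: a value plus a count of TLB flushes. */
struct reg_sketch {
    uint64_t value;
    unsigned tlb_flushes;
    write_fn *writefn;      /* cooked write, may have side effects */
    write_fn *raw_writefn;  /* raw write, stores the value only */
};

static void raw_write_sketch(struct reg_sketch *r, uint64_t v)
{
    r->value = v;
}

static void cooked_write_sketch(struct reg_sketch *r, uint64_t v)
{
    r->tlb_flushes++;       /* stands in for the TLB maintenance */
    r->value = v;
}

/* Raw access: prefer raw_writefn, else fall back to the cooked one. */
static void write_raw_sketch(struct reg_sketch *r, uint64_t v)
{
    assert(r != NULL);
    if (r->raw_writefn) {
        r->raw_writefn(r, v);
    } else if (r->writefn) {
        r->writefn(r, v);
    } else {
        r->value = v;
    }
}
```

In this model a register with only writefn set still flushes on a raw
write, while one that also sets raw_writefn does not.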

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.c | 23 +++++++++++++----------
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo vmsa_cp_reginfo[] = {
       .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 0, .opc2 = 0,
       .access = PL1_RW, .accessfn = access_tvm_trvm,
       .fgt = FGT_TTBR0_EL1,
-      .writefn = vmsa_ttbr_write, .resetvalue = 0,
+      .writefn = vmsa_ttbr_write, .resetvalue = 0, .raw_writefn = raw_write,
       .bank_fieldoffsets = { offsetof(CPUARMState, cp15.ttbr0_s),
                              offsetof(CPUARMState, cp15.ttbr0_ns) } },
     { .name = "TTBR1_EL1", .state = ARM_CP_STATE_BOTH,
       .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 0, .opc2 = 1,
       .access = PL1_RW, .accessfn = access_tvm_trvm,
       .fgt = FGT_TTBR1_EL1,
-      .writefn = vmsa_ttbr_write, .resetvalue = 0,
+      .writefn = vmsa_ttbr_write, .resetvalue = 0, .raw_writefn = raw_write,
       .bank_fieldoffsets = { offsetof(CPUARMState, cp15.ttbr1_s),
                              offsetof(CPUARMState, cp15.ttbr1_ns) } },
     { .name = "TCR_EL1", .state = ARM_CP_STATE_AA64,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo lpae_cp_reginfo[] = {
       .type = ARM_CP_64BIT | ARM_CP_ALIAS,
       .bank_fieldoffsets = { offsetof(CPUARMState, cp15.ttbr0_s),
                              offsetof(CPUARMState, cp15.ttbr0_ns) },
-      .writefn = vmsa_ttbr_write, },
+      .writefn = vmsa_ttbr_write, .raw_writefn = raw_write },
     { .name = "TTBR1", .cp = 15, .crm = 2, .opc1 = 1,
       .access = PL1_RW, .accessfn = access_tvm_trvm,
       .type = ARM_CP_64BIT | ARM_CP_ALIAS,
       .bank_fieldoffsets = { offsetof(CPUARMState, cp15.ttbr1_s),
                              offsetof(CPUARMState, cp15.ttbr1_ns) },
-      .writefn = vmsa_ttbr_write, },
+      .writefn = vmsa_ttbr_write, .raw_writefn = raw_write },
 };
 
 static uint64_t aa64_fpcr_read(CPUARMState *env, const ARMCPRegInfo *ri)
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo el2_cp_reginfo[] = {
       .type = ARM_CP_IO,
       .opc0 = 3, .opc1 = 4, .crn = 1, .crm = 1, .opc2 = 0,
       .access = PL2_RW, .fieldoffset = offsetof(CPUARMState, cp15.hcr_el2),
-      .writefn = hcr_write },
+      .writefn = hcr_write, .raw_writefn = raw_write },
     { .name = "HCR", .state = ARM_CP_STATE_AA32,
       .type = ARM_CP_ALIAS | ARM_CP_IO,
       .cp = 15, .opc1 = 4, .crn = 1, .crm = 1, .opc2 = 0,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo el2_cp_reginfo[] = {
     { .name = "TCR_EL2", .state = ARM_CP_STATE_BOTH,
       .opc0 = 3, .opc1 = 4, .crn = 2, .crm = 0, .opc2 = 2,
       .access = PL2_RW, .writefn = vmsa_tcr_el12_write,
+      .raw_writefn = raw_write,
       .fieldoffset = offsetof(CPUARMState, cp15.tcr_el[2]) },
     { .name = "VTCR", .state = ARM_CP_STATE_AA32,
       .cp = 15, .opc1 = 4, .crn = 2, .crm = 1, .opc2 = 2,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo el2_cp_reginfo[] = {
       .type = ARM_CP_64BIT | ARM_CP_ALIAS,
       .access = PL2_RW, .accessfn = access_el3_aa32ns,
       .fieldoffset = offsetof(CPUARMState, cp15.vttbr_el2),
-      .writefn = vttbr_write },
+      .writefn = vttbr_write, .raw_writefn = raw_write },
     { .name = "VTTBR_EL2", .state = ARM_CP_STATE_AA64,
       .opc0 = 3, .opc1 = 4, .crn = 2, .crm = 1, .opc2 = 0,
-      .access = PL2_RW, .writefn = vttbr_write,
+      .access = PL2_RW, .writefn = vttbr_write, .raw_writefn = raw_write,
       .fieldoffset = offsetof(CPUARMState, cp15.vttbr_el2) },
     { .name = "SCTLR_EL2", .state = ARM_CP_STATE_BOTH,
       .opc0 = 3, .opc1 = 4, .crn = 1, .crm = 0, .opc2 = 0,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo el2_cp_reginfo[] = {
       .fieldoffset = offsetof(CPUARMState, cp15.tpidr_el[2]) },
     { .name = "TTBR0_EL2", .state = ARM_CP_STATE_AA64,
       .opc0 = 3, .opc1 = 4, .crn = 2, .crm = 0, .opc2 = 0,
-      .access = PL2_RW, .resetvalue = 0, .writefn = vmsa_tcr_ttbr_el2_write,
+      .access = PL2_RW, .resetvalue = 0,
+      .writefn = vmsa_tcr_ttbr_el2_write, .raw_writefn = raw_write,
       .fieldoffset = offsetof(CPUARMState, cp15.ttbr0_el[2]) },
     { .name = "HTTBR", .cp = 15, .opc1 = 4, .crm = 2,
       .access = PL2_RW, .type = ARM_CP_64BIT | ARM_CP_ALIAS,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo el3_cp_reginfo[] = {
     { .name = "SCR_EL3", .state = ARM_CP_STATE_AA64,
       .opc0 = 3, .opc1 = 6, .crn = 1, .crm = 1, .opc2 = 0,
       .access = PL3_RW, .fieldoffset = offsetof(CPUARMState, cp15.scr_el3),
-      .resetfn = scr_reset, .writefn = scr_write },
+      .resetfn = scr_reset, .writefn = scr_write, .raw_writefn = raw_write },
     { .name = "SCR",  .type = ARM_CP_ALIAS | ARM_CP_NEWEL,
       .cp = 15, .opc1 = 0, .crn = 1, .crm = 1, .opc2 = 0,
       .access = PL1_RW, .accessfn = access_trap_aa32s_el1,
       .fieldoffset = offsetoflow32(CPUARMState, cp15.scr_el3),
-      .writefn = scr_write },
+      .writefn = scr_write, .raw_writefn = raw_write },
     { .name = "SDER32_EL3", .state = ARM_CP_STATE_AA64,
       .opc0 = 3, .opc1 = 6, .crn = 1, .crm = 1, .opc2 = 1,
       .access = PL3_RW, .resetvalue = 0,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo vhe_reginfo[] = {
     { .name = "TTBR1_EL2", .state = ARM_CP_STATE_AA64,
       .opc0 = 3, .opc1 = 4, .crn = 2, .crm = 0, .opc2 = 1,
       .access = PL2_RW, .writefn = vmsa_tcr_ttbr_el2_write,
+      .raw_writefn = raw_write,
       .fieldoffset = offsetof(CPUARMState, cp15.ttbr1_el[2]) },
 #ifndef CONFIG_USER_ONLY
     { .name = "CNTHV_CVAL_EL2", .state = ARM_CP_STATE_AA64,
-- 
2.34.1

From: Yuquan Wang <wangyuquan1236@phytium.com.cn>

The current sbsa-ref cannot use the EHCI controller, which is only
able to do 32-bit DMA, since sbsa-ref doesn't have RAM below 4GB.
Hence, use XHCI instead of EHCI to provide a USB controller with
64-bit DMA capability.

We bump the platform version to 0.3 with this change.  Although the
hardware at the USB controller address changes, the firmware and
Linux can both cope with this -- on an older non-XHCI-aware
firmware/kernel setup the probe routine simply fails and the guest
proceeds without any USB.  (This isn't a loss of functionality,
because the old USB controller never worked in the first place.) So
we can call this a backwards-compatible change and only bump the
minor version.

Signed-off-by: Yuquan Wang <wangyuquan1236@phytium.com.cn>
Message-id: 20230621103847.447508-2-wangyuquan1236@phytium.com.cn
[PMM: tweaked commit message; add line to docs about what
 changes in platform version 0.3]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 docs/system/arm/sbsa.rst |  5 ++++-
 hw/arm/sbsa-ref.c        | 23 +++++++++++++----------
 hw/arm/Kconfig           |  2 +-
 3 files changed, 18 insertions(+), 12 deletions(-)

diff --git a/docs/system/arm/sbsa.rst b/docs/system/arm/sbsa.rst
index XXXXXXX..XXXXXXX 100644
--- a/docs/system/arm/sbsa.rst
+++ b/docs/system/arm/sbsa.rst
@@ -XXX,XX +XXX,XX @@ The ``sbsa-ref`` board supports:
   - A configurable number of AArch64 CPUs
   - GIC version 3
   - System bus AHCI controller
-  - System bus EHCI controller
+  - System bus XHCI controller
   - CDROM and hard disc on AHCI bus
   - E1000E ethernet card on PCIe bus
   - Bochs display adapter on PCIe bus
@@ -XXX,XX +XXX,XX @@ Platform version changes:
 
 0.2
   GIC ITS information is present in devicetree.
+
+0.3
+  The USB controller is an XHCI device, not EHCI
diff --git a/hw/arm/sbsa-ref.c b/hw/arm/sbsa-ref.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/sbsa-ref.c
+++ b/hw/arm/sbsa-ref.c
@@ -XXX,XX +XXX,XX @@
 #include "hw/pci-host/gpex.h"
 #include "hw/qdev-properties.h"
 #include "hw/usb.h"
+#include "hw/usb/xhci.h"
 #include "hw/char/pl011.h"
 #include "hw/watchdog/sbsa_gwdt.h"
 #include "net/net.h"
@@ -XXX,XX +XXX,XX @@ enum {
     SBSA_SECURE_UART_MM,
     SBSA_SECURE_MEM,
     SBSA_AHCI,
-    SBSA_EHCI,
+    SBSA_XHCI,
 };
 
 struct SBSAMachineState {
@@ -XXX,XX +XXX,XX @@ static const MemMapEntry sbsa_ref_memmap[] = {
     [SBSA_SMMU] =               { 0x60050000, 0x00020000 },
     /* Space here reserved for more SMMUs */
     [SBSA_AHCI] =               { 0x60100000, 0x00010000 },
-    [SBSA_EHCI] =               { 0x60110000, 0x00010000 },
+    [SBSA_XHCI] =               { 0x60110000, 0x00010000 },
     /* Space here reserved for other devices */
     [SBSA_PCIE_PIO] =           { 0x7fff0000, 0x00010000 },
     /* 32-bit address PCIE MMIO space */
@@ -XXX,XX +XXX,XX @@ static const int sbsa_ref_irqmap[] = {
     [SBSA_SECURE_UART] = 8,
     [SBSA_SECURE_UART_MM] = 9,
     [SBSA_AHCI] = 10,
-    [SBSA_EHCI] = 11,
+    [SBSA_XHCI] = 11,
     [SBSA_SMMU] = 12, /* ... to 15 */
     [SBSA_GWDT_WS0] = 16,
 };
@@ -XXX,XX +XXX,XX @@ static void create_fdt(SBSAMachineState *sms)
      *                        fw compatibility.
      */
     qemu_fdt_setprop_cell(fdt, "/", "machine-version-major", 0);
-    qemu_fdt_setprop_cell(fdt, "/", "machine-version-minor", 2);
+    qemu_fdt_setprop_cell(fdt, "/", "machine-version-minor", 3);
 
     if (ms->numa_state->have_numa_distance) {
         int size = nb_numa_nodes * nb_numa_nodes * 3 * sizeof(uint32_t);
@@ -XXX,XX +XXX,XX @@ static void create_ahci(const SBSAMachineState *sms)
     }
 }
 
-static void create_ehci(const SBSAMachineState *sms)
+static void create_xhci(const SBSAMachineState *sms)
 {
-    hwaddr base = sbsa_ref_memmap[SBSA_EHCI].base;
-    int irq = sbsa_ref_irqmap[SBSA_EHCI];
+    hwaddr base = sbsa_ref_memmap[SBSA_XHCI].base;
+    int irq = sbsa_ref_irqmap[SBSA_XHCI];
+    DeviceState *dev = qdev_new(TYPE_XHCI_SYSBUS);
 
-    sysbus_create_simple("platform-ehci-usb", base,
-                         qdev_get_gpio_in(sms->gic, irq));
+    sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
+    sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, base);
+    sysbus_connect_irq(SYS_BUS_DEVICE(dev), 0, qdev_get_gpio_in(sms->gic, irq));
 }
 
 static void create_smmu(const SBSAMachineState *sms, PCIBus *bus)
@@ -XXX,XX +XXX,XX @@ static void sbsa_ref_init(MachineState *machine)
 
     create_ahci(sms);
 
-    create_ehci(sms);
+    create_xhci(sms);
 
     create_pcie(sms);
 
diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/Kconfig
+++ b/hw/arm/Kconfig
@@ -XXX,XX +XXX,XX @@ config SBSA_REF
     select PL011 # UART
     select PL031 # RTC
     select PL061 # GPIO
-    select USB_EHCI_SYSBUS
+    select USB_XHCI_SYSBUS
     select WDT_SBSA
     select BOCHS_DISPLAY
 
-- 
2.34.1

Some assemblers will complain about attempts to access
id_aa64zfr0_el1 and id_aa64smfr0_el1 by name if the test
binary isn't built for the right processor type:

/tmp/ccASXpLo.s:782: Error: selected processor does not support system register name 'id_aa64zfr0_el1'
/tmp/ccASXpLo.s:829: Error: selected processor does not support system register name 'id_aa64smfr0_el1'

However, these registers are in the ID space and are guaranteed to
read-as-zero on older CPUs, so the access is both safe and sensible.
Switch to using the S syntax, as we already do for ID_AA64ISAR2_EL1
and ID_AA64MMFR2_EL1.  This allows us to drop the HAS_ARMV9_SME check
and the makefile machinery to adjust the CFLAGS for this test, so we
don't rely on having a sufficiently new compiler to be able to check
these registers.

This means we're actually testing the SME ID register: no released
GCC yet recognizes -march=armv9-a+sme, so that was always skipped.
It also avoids a future problem if we try to switch the "do we have
SME support in the toolchain" check from "in the compiler" to "in the
assembler" (at which point we would otherwise run into the above
errors).
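
For reference, the S syntax spells out the register's encoding fields
as S<op0>_<op1>_C<CRn>_C<CRm>_<op2>. The helper below is a hypothetical
sketch, not part of the test, showing how the names used in this patch
follow from those fields.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format the generic system register name for the given encoding. */
static int sysreg_generic_name(char *buf, size_t len,
                               unsigned op0, unsigned op1,
                               unsigned crn, unsigned crm, unsigned op2)
{
    assert(op0 <= 3 && op1 <= 7 && crn <= 15 && crm <= 15 && op2 <= 7);
    return snprintf(buf, len, "S%u_%u_C%u_C%u_%u", op0, op1, crn, crm, op2);
}

/* Return nonzero if the encoding formats to the expected name. */
static int sysreg_name_is(const char *expected,
                          unsigned op0, unsigned op1,
                          unsigned crn, unsigned crm, unsigned op2)
{
    char buf[32];

    sysreg_generic_name(buf, sizeof(buf), op0, op1, crn, crm, op2);
    return strcmp(buf, expected) == 0;
}
```

ID_AA64ZFR0_EL1 is op0=3, op1=0, CRn=0, CRm=4, op2=4, which gives
S3_0_C0_C4_4; ID_AA64SMFR0_EL1 differs only in op2=5.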

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 tests/tcg/aarch64/sysregs.c       | 11 +++++++----
 tests/tcg/aarch64/Makefile.target |  7 +------
 2 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/tests/tcg/aarch64/sysregs.c b/tests/tcg/aarch64/sysregs.c
index XXXXXXX..XXXXXXX 100644
--- a/tests/tcg/aarch64/sysregs.c
+++ b/tests/tcg/aarch64/sysregs.c
@@ -XXX,XX +XXX,XX @@
 /*
  * Older assemblers don't recognize newer system register names,
  * but we can still access them by the Sn_n_Cn_Cn_n syntax.
+ * This also means we don't need to specifically request that the
+ * assembler enables whatever architectural features the ID registers
+ * syntax might be gated behind.
  */
 #define SYS_ID_AA64ISAR2_EL1 S3_0_C0_C6_2
 #define SYS_ID_AA64MMFR2_EL1 S3_0_C0_C7_2
+#define SYS_ID_AA64ZFR0_EL1 S3_0_C0_C4_4
+#define SYS_ID_AA64SMFR0_EL1 S3_0_C0_C4_5
 
 int failed_bit_count;
 
@@ -XXX,XX +XXX,XX @@ int main(void)
     /* all hidden, DebugVer fixed to 0x6 (ARMv8 debug architecture) */
     get_cpu_reg_check_mask(id_aa64dfr0_el1,  _m(0000,0000,0000,0006));
     get_cpu_reg_check_zero(id_aa64dfr1_el1);
-    get_cpu_reg_check_mask(id_aa64zfr0_el1,  _m(0ff0,ff0f,00ff,00ff));
-#ifdef HAS_ARMV9_SME
-    get_cpu_reg_check_mask(id_aa64smfr0_el1, _m(80f1,00fd,0000,0000));
-#endif
+    get_cpu_reg_check_mask(SYS_ID_AA64ZFR0_EL1,  _m(0ff0,ff0f,00ff,00ff));
+    get_cpu_reg_check_mask(SYS_ID_AA64SMFR0_EL1, _m(80f1,00fd,0000,0000));
 
     get_cpu_reg_check_zero(id_aa64afr0_el1);
     get_cpu_reg_check_zero(id_aa64afr1_el1);
diff --git a/tests/tcg/aarch64/Makefile.target b/tests/tcg/aarch64/Makefile.target
index XXXXXXX..XXXXXXX 100644
--- a/tests/tcg/aarch64/Makefile.target
+++ b/tests/tcg/aarch64/Makefile.target
@@ -XXX,XX +XXX,XX @@ AARCH64_TESTS += mte-1 mte-2 mte-3 mte-4 mte-5 mte-6 mte-7
 mte-%: CFLAGS += -march=armv8.5-a+memtag
 endif
 
-ifneq ($(CROSS_CC_HAS_SVE),)
 # System Registers Tests
 AARCH64_TESTS += sysregs
-ifneq ($(CROSS_CC_HAS_ARMV9_SME),)
-sysregs: CFLAGS+=-march=armv9-a+sme -DHAS_ARMV9_SME
-else
-sysregs: CFLAGS+=-march=armv8.1-a+sve
-endif
 
+ifneq ($(CROSS_CC_HAS_SVE),)
 # SVE ioctl test
 AARCH64_TESTS += sve-ioctls
 sve-ioctls: CFLAGS+=-march=armv8.1-a+sve
-- 
2.34.1

From: Richard Henderson <richard.henderson@linaro.org>

Allow the line length to extend to 548 columns.  While annoyingly wide,
it's still less confusing than the continuations we print.  Also, the
default VL used by Linux (and max for A64FX) uses only 140 columns.
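
The column counts can be checked with a little arithmetic. The helper
below is a hypothetical sketch that assumes the format used in this
patch: a 4 character "Znn=" prefix, then 33 characters per quadword
(two 16 digit halves joined by ':'), each followed by one separator or
line terminator.

```c
#include <assert.h>

/*
 * Columns used by one Z register line for a vector of vq quadwords,
 * counting the final terminator as one column.
 */
static int zreg_line_columns(int vq)
{
    assert(vq >= 1 && vq <= 16);
    return 4          /* "Znn=" prefix */
         + 33 * vq    /* 16 hex digits, ':', 16 hex digits per quadword */
         + vq;        /* ':' between quadwords plus the terminator */
}
```

Under these assumptions vq=16 gives 548 and vq=4 (a 64 byte vector)
gives 140. For vq=1 the result is 38, i.e. the 37 visible columns
mentioned in the code comment plus the terminator.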

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230622151201.1578522-2-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.c | 36 ++++++++++++++----------------------
 1 file changed, 14 insertions(+), 22 deletions(-)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_cpu_dump_state(CPUState *cs, FILE *f, int flags)
     ARMCPU *cpu = ARM_CPU(cs);
     CPUARMState *env = &cpu->env;
     uint32_t psr = pstate_read(env);
-    int i;
+    int i, j;
     int el = arm_current_el(env);
     const char *ns_status;
     bool sve;
@@ -XXX,XX +XXX,XX @@ static void aarch64_cpu_dump_state(CPUState *cs, FILE *f, int flags)
     }
 
     if (sve) {
-        int j, zcr_len = sve_vqm1_for_el(env, el);
+        int zcr_len = sve_vqm1_for_el(env, el);
 
         for (i = 0; i <= FFR_PRED_NUM; i++) {
             bool eol;
@@ -XXX,XX +XXX,XX @@ static void aarch64_cpu_dump_state(CPUState *cs, FILE *f, int flags)
             }
         }
 
-        for (i = 0; i < 32; i++) {
-            if (zcr_len == 0) {
+        if (zcr_len == 0) {
+            /*
+             * With vl=16, there are only 37 columns per register,
+             * so output two registers per line.
+             */
+            for (i = 0; i < 32; i++) {
                 qemu_fprintf(f, "Z%02d=%016" PRIx64 ":%016" PRIx64 "%s",
                              i, env->vfp.zregs[i].d[1],
                              env->vfp.zregs[i].d[0], i & 1 ? "\n" : " ");
-            } else if (zcr_len == 1) {
-                qemu_fprintf(f, "Z%02d=%016" PRIx64 ":%016" PRIx64
-                             ":%016" PRIx64 ":%016" PRIx64 "\n",
-                             i, env->vfp.zregs[i].d[3], env->vfp.zregs[i].d[2],
-                             env->vfp.zregs[i].d[1], env->vfp.zregs[i].d[0]);
-            } else {
+            }
+        } else {
+            for (i = 0; i < 32; i++) {
+                qemu_fprintf(f, "Z%02d=", i);
                 for (j = zcr_len; j >= 0; j--) {
-                    bool odd = (zcr_len - j) % 2 != 0;
-                    if (j == zcr_len) {
-                        qemu_fprintf(f, "Z%02d[%x-%x]=", i, j, j - 1);
-                    } else if (!odd) {
-                        if (j > 0) {
-                            qemu_fprintf(f, "   [%x-%x]=", j, j - 1);
-                        } else {
-                            qemu_fprintf(f, "     [%x]=", j);
-                        }
-                    }
                     qemu_fprintf(f, "%016" PRIx64 ":%016" PRIx64 "%s",
                                  env->vfp.zregs[i].d[j * 2 + 1],
-                                 env->vfp.zregs[i].d[j * 2],
-                                 odd || j == 0 ? "\n" : ":");
+                                 env->vfp.zregs[i].d[j * 2 + 0],
+                                 j ? ":" : "\n");
                 }
             }
         }
-- 
2.34.1

From: Richard Henderson <richard.henderson@linaro.org>

Always print each matrix row whole, one per line, so that we
get the entire matrix in the proper shape.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230622151201.1578522-3-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_cpu_dump_state(CPUState *cs, FILE *f, int flags)
                          i, q[1], q[0], (i & 1 ? "\n" : " "));
         }
     }
+
+    if (cpu_isar_feature(aa64_sme, cpu) &&
+        FIELD_EX64(env->svcr, SVCR, ZA) &&
+        sme_exception_el(env, el) == 0) {
+        int zcr_len = sve_vqm1_for_el_sm(env, el, true);
+        int svl = (zcr_len + 1) * 16;
+        int svl_lg10 = svl < 100 ? 2 : 3;
+
+        for (i = 0; i < svl; i++) {
+            qemu_fprintf(f, "ZA[%0*d]=", svl_lg10, i);
+            for (j = zcr_len; j >= 0; --j) {
+                qemu_fprintf(f, "%016" PRIx64 ":%016" PRIx64 "%c",
+                             env->zarray[i].d[2 * j + 1],
+                             env->zarray[i].d[2 * j],
+                             j ? ':' : '\n');
+            }
+        }
+    }
 }
 
 #else
-- 
2.34.1

From: Richard Henderson <richard.henderson@linaro.org>

For the outer product set of insns, which take an entire matrix
tile as output, the argument is not a combined tile+column.
Therefore using get_tile_rowcol was incorrect, as we extracted
the tile number from itself.
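
A minimal sketch of the address computation the new helper performs,
for illustration only. The types vec_reg_sketch and cpu_state_sketch
are hypothetical stand-ins for ARMVectorReg and CPUARMState, with an
arbitrary amount of other state placed before the ZA array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_VQ_SKETCH 16

/* One ZA row: 16 quadwords, i.e. 256 bytes. */
typedef struct {
    uint64_t d[2 * MAX_VQ_SKETCH];
} vec_reg_sketch;

/* Hypothetical cpu state: some unrelated fields, then the ZA rows. */
typedef struct {
    uint64_t other_state[8];
    vec_reg_sketch zarray[16 * MAX_VQ_SKETCH];
} cpu_state_sketch;

/* Offset of row 0 of the given tile: the tile number selects the row. */
static size_t tile_base_offset(int tile)
{
    assert(tile >= 0 && tile < 16);
    return offsetof(cpu_state_sketch, zarray)
         + (size_t)tile * sizeof(vec_reg_sketch);
}
```

The point of the sketch is that the tile number is used directly as
the row index, with no row or column component mixed in.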

The test case relies only on assembler support for SME, since
no release of GCC recognizes -march=armv9-a+sme yet.

Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1620
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230622151201.1578522-5-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: dropped now-unneeded changes to sysregs CFLAGS]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/translate-sme.c    | 24 ++++++---
 tests/tcg/aarch64/sme-outprod1.c  | 83 +++++++++++++++++++++++++++++++
 tests/tcg/aarch64/Makefile.target |  7 ++-
 3 files changed, 107 insertions(+), 7 deletions(-)
 create mode 100644 tests/tcg/aarch64/sme-outprod1.c

diff --git a/target/arm/tcg/translate-sme.c b/target/arm/tcg/translate-sme.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/translate-sme.c
+++ b/target/arm/tcg/translate-sme.c
@@ -XXX,XX +XXX,XX @@ static TCGv_ptr get_tile_rowcol(DisasContext *s, int esz, int rs,
     return addr;
 }
 
+/*
+ * Resolve tile.size[0] to a host pointer.
+ * Used by e.g. outer product insns where we require the entire tile.
+ */
+static TCGv_ptr get_tile(DisasContext *s, int esz, int tile)
+{
+    TCGv_ptr addr = tcg_temp_new_ptr();
+    int offset;
+
+    offset = tile * sizeof(ARMVectorReg) + offsetof(CPUARMState, zarray);
+
+    tcg_gen_addi_ptr(addr, cpu_env, offset);
+    return addr;
+}
+
 static bool trans_ZERO(DisasContext *s, arg_ZERO *a)
 {
     if (!dc_isar_feature(aa64_sme, s)) {
@@ -XXX,XX +XXX,XX @@ static bool do_adda(DisasContext *s, arg_adda *a, MemOp esz,
         return true;
     }
 
-    /* Sum XZR+zad to find ZAd. */
-    za = get_tile_rowcol(s, esz, 31, a->zad, false);
+    za = get_tile(s, esz, a->zad);
     zn = vec_full_reg_ptr(s, a->zn);
     pn = pred_full_reg_ptr(s, a->pn);
     pm = pred_full_reg_ptr(s, a->pm);
@@ -XXX,XX +XXX,XX @@ static bool do_outprod(DisasContext *s, arg_op *a, MemOp esz,
         return true;
     }
 
-    /* Sum XZR+zad to find ZAd. */
-    za = get_tile_rowcol(s, esz, 31, a->zad, false);
+    za = get_tile(s, esz, a->zad);
     zn = vec_full_reg_ptr(s, a->zn);
     zm = vec_full_reg_ptr(s, a->zm);
     pn = pred_full_reg_ptr(s, a->pn);
@@ -XXX,XX +XXX,XX @@ static bool do_outprod_fpst(DisasContext *s, arg_op *a, MemOp esz,
         return true;
     }
 
-    /* Sum XZR+zad to find ZAd. */
-    za = get_tile_rowcol(s, esz, 31, a->zad, false);
+    za = get_tile(s, esz, a->zad);
     zn = vec_full_reg_ptr(s, a->zn);
     zm = vec_full_reg_ptr(s, a->zm);
     pn = pred_full_reg_ptr(s, a->pn);
diff --git a/tests/tcg/aarch64/sme-outprod1.c b/tests/tcg/aarch64/sme-outprod1.c
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/tests/tcg/aarch64/sme-outprod1.c
@@ -XXX,XX +XXX,XX @@
+/*
+ * SME outer product, 1 x 1.
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include <stdio.h>
+
+extern void foo(float *dst);
+
+asm(
+"	.arch_extension sme\n"
+"	.type foo, @function\n"
+"foo:\n"
+"	stp x29, x30, [sp, -80]!\n"
+"	mov x29, sp\n"
+"	stp d8, d9, [sp, 16]\n"
+"	stp d10, d11, [sp, 32]\n"
+"	stp d12, d13, [sp, 48]\n"
+"	stp d14, d15, [sp, 64]\n"
+"	smstart\n"
+"	ptrue p0.s, vl4\n"
+"	fmov z0.s, #1.0\n"
+/*
+ * An outer product of a vector of 1.0 by itself should be a matrix of 1.0.
+ * Note that we are using tile 1 here (za1.s) rather than tile 0.
+ */
+"	zero {za}\n"
+"	fmopa za1.s, p0/m, p0/m, z0.s, z0.s\n"
+/*
+ * Read the first 4x4 sub-matrix of elements from tile 1:
+ * Note that za1h should be interchangeable here.
+ */
+"	mov w12, #0\n"
+"	mova z0.s, p0/m, za1v.s[w12, #0]\n"
+"	mova z1.s, p0/m, za1v.s[w12, #1]\n"
+"	mova z2.s, p0/m, za1v.s[w12, #2]\n"
+"	mova z3.s, p0/m, za1v.s[w12, #3]\n"
+/*
+ * And store them to the input pointer (dst in the C code):
+ */
+"	st1w {z0.s}, p0, [x0]\n"
+"	add x0, x0, #16\n"
+"	st1w {z1.s}, p0, [x0]\n"
+"	add x0, x0, #16\n"
+"	st1w {z2.s}, p0, [x0]\n"
+"	add x0, x0, #16\n"
+"	st1w {z3.s}, p0, [x0]\n"
+"	smstop\n"
+"	ldp d8, d9, [sp, 16]\n"
+"	ldp d10, d11, [sp, 32]\n"
+"	ldp d12, d13, [sp, 48]\n"
+"	ldp d14, d15, [sp, 64]\n"
+"	ldp x29, x30, [sp], 80\n"
+"	ret\n"
+"	.size foo, . - foo"
+);
+
+int main()
+{
+    float dst[16];
+    int i, j;
+
+    foo(dst);
+
+    for (i = 0; i < 16; i++) {
+        if (dst[i] != 1.0f) {
+            break;
+        }
+    }
+
+    if (i == 16) {
+        return 0; /* success */
+    }
+
+    /* failure */
+    for (i = 0; i < 4; ++i) {
+        for (j = 0; j < 4; ++j) {
+            printf("%f ", (double)dst[i * 4 + j]);
+        }
+        printf("\n");
+    }
+    return 1;
+}
diff --git a/tests/tcg/aarch64/Makefile.target b/tests/tcg/aarch64/Makefile.target
index XXXXXXX..XXXXXXX 100644
--- a/tests/tcg/aarch64/Makefile.target
+++ b/tests/tcg/aarch64/Makefile.target
@@ -XXX,XX +XXX,XX @@ config-cc.mak: Makefile
 	    $(call cc-option,-march=armv8.5-a,              CROSS_CC_HAS_ARMV8_5); \
 	    $(call cc-option,-mbranch-protection=standard,  CROSS_CC_HAS_ARMV8_BTI); \
 	    $(call cc-option,-march=armv8.5-a+memtag,       CROSS_CC_HAS_ARMV8_MTE); \
-	    $(call cc-option,-march=armv9-a+sme,            CROSS_CC_HAS_ARMV9_SME)) 3> config-cc.mak
+	    $(call cc-option,-Wa$(COMMA)-march=armv9-a+sme, CROSS_AS_HAS_ARMV9_SME)) 3> config-cc.mak
 -include config-cc.mak
 
 ifneq ($(CROSS_CC_HAS_ARMV8_2),)
@@ -XXX,XX +XXX,XX @@ AARCH64_TESTS += mte-1 mte-2 mte-3 mte-4 mte-5 mte-6 mte-7
 mte-%: CFLAGS += -march=armv8.5-a+memtag
 endif
 
+# SME Tests
+ifneq ($(CROSS_AS_HAS_ARMV9_SME),)
+AARCH64_TESTS += sme-outprod1
+endif
+
 # System Registers Tests
 AARCH64_TESTS += sysregs
 
-- 
2.34.1

From: John Högberg <john.hogberg@ericsson.com>

Unlike architectures with precise self-modifying code semantics
(e.g. x86) ARM processors do not maintain coherency for instruction
execution and memory, requiring an instruction synchronization
barrier on every core that will execute the new code, and on many
models also the explicit use of cache management instructions.

While this is required to make JITs work on actual hardware, QEMU
has gotten away with not handling this since it does not emulate
caches, and unconditionally invalidates code whenever the softmmu
or the user-mode page protection logic detects that code has been
modified.

Unfortunately the latter does not work in the face of dual-mapped
code (a common W^X workaround), where one page is executable and
the other is writable: user-mode has no way to connect one with the
other as that is only known to the kernel and the emulated
application.

This commit works around the issue by telling software that
instruction cache invalidation is required by clearing the
CTR_EL0.DIC flag (regardless of whether the emulated processor
needs it), and then invalidating code in IC IVAU instructions.
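
As a rough sketch of the range computation behind the invalidation
(not the code from this patch): it assumes CTR_EL0.IminLine sits in
bits [3:0] and encodes log2 of the smallest instruction cache line
size in 4-byte words. The names line_range and icache_line_range are
hypothetical.

```c
#include <stdint.h>

/* Hypothetical type: inclusive address range of one icache line. */
typedef struct {
    uint64_t start;
    uint64_t end;
} line_range;

/*
 * Hypothetical helper: given a CTR_EL0 value and the address passed
 * to IC IVAU, return the bounds of the cache line containing it.
 * Assumes IminLine is CTR_EL0[3:0], counted in 4-byte words.
 */
static line_range icache_line_range(uint64_t ctr, uint64_t addr)
{
    uint64_t imin_line = ctr & 0xf;
    uint64_t mask = ((uint64_t)4 << imin_line) - 1;
    line_range r = { addr & ~mask, addr | mask };

    return r;
}
```

With IminLine == 4 the line is 64 bytes, so an address of 0x1234
covers 0x1200 through 0x123f.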

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1034

Co-authored-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: John Högberg <john.hogberg@ericsson.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 168778890374.24232.3402138851538068785-1@git.sr.ht
[PMM: removed unnecessary AArch64 feature check; moved
 "clear CTR_EL0.DIC" code up a bit so it's not in the middle
 of the vfp/neon related tests]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.c    | 11 +++++++++++
 target/arm/helper.c | 47 ++++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 55 insertions(+), 3 deletions(-)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
         return;
     }
 
+#ifdef CONFIG_USER_ONLY
+    /*
+     * User mode relies on IC IVAU instructions to catch modification of
+     * dual-mapped code.
+     *
+     * Clear CTR_EL0.DIC to ensure that software that honors these flags uses
+     * IC IVAU even if the emulated processor does not normally require it.
+     */
+    cpu->ctr = FIELD_DP64(cpu->ctr, CTR_EL0, DIC, 0);
+#endif
+
     if (arm_feature(env, ARM_FEATURE_AARCH64) &&
         cpu->has_vfp != cpu->has_neon) {
         /*
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void mdcr_el2_write(CPUARMState *env, const ARMCPRegInfo *ri,
     }
 }
 
+#ifdef CONFIG_USER_ONLY
+/*
+ * `IC IVAU` is handled to improve compatibility with JITs that dual-map their
+ * code to get around W^X restrictions, where one region is writable and the
+ * other is executable.
+ *
+ * Since the executable region is never written to we cannot detect code
+ * changes when running in user mode, and rely on the emulated JIT telling us
+ * that the code has changed by executing this instruction.
+ */
+static void ic_ivau_write(CPUARMState *env, const ARMCPRegInfo *ri,
+                          uint64_t value)
+{
+    uint64_t icache_line_mask, start_address, end_address;
+    const ARMCPU *cpu;
+
+    cpu = env_archcpu(env);
+
+    icache_line_mask = (4 << extract32(cpu->ctr, 0, 4)) - 1;
+    start_address = value & ~icache_line_mask;
+    end_address = value | icache_line_mask;
+
+    mmap_lock();
+
+    tb_invalidate_phys_range(start_address, end_address);
+
+    mmap_unlock();
+}
+#endif
+
 static const ARMCPRegInfo v8_cp_reginfo[] = {
     /*
      * Minimal set of EL0-visible registers. This will need to be expanded
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
     { .name = "CURRENTEL", .state = ARM_CP_STATE_AA64,
       .opc0 = 3, .opc1 = 0, .opc2 = 2, .crn = 4, .crm = 2,
       .access = PL1_R, .type = ARM_CP_CURRENTEL },
-    /* Cache ops: all NOPs since we don't emulate caches */
+    /*
+     * Instruction cache ops. All of these except `IC IVAU` NOP because we
+     * don't emulate caches.
+     */
     { .name = "IC_IALLUIS", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 1, .opc2 = 0,
       .access = PL1_W, .type = ARM_CP_NOP,
@@ -XXX,XX +XXX,XX @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
       .accessfn = access_tocu },
     { .name = "IC_IVAU", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 5, .opc2 = 1,
-      .access = PL0_W, .type = ARM_CP_NOP,
+      .access = PL0_W,
       .fgt = FGT_ICIVAU,
-      .accessfn = access_tocu },
+      .accessfn = access_tocu,
+#ifdef CONFIG_USER_ONLY
+      .type = ARM_CP_NO_RAW,
+      .writefn = ic_ivau_write
+#else
+      .type = ARM_CP_NOP
+#endif
+    },
+    /* Cache ops: all NOPs since we don't emulate caches */
     { .name = "DC_IVAC", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 6, .opc2 = 1,
       .access = PL1_W, .accessfn = aa64_cacheop_poc_access,
-- 
2.34.1

From: Vikram Garhwal <vikram.garhwal@amd.com>

The following changes fix the Coverity issues:
1. Change read_data() to fix CID 1512899: Out-of-bounds access (OVERRUN)
2. Fix match_rx_tx_data() to fix CID 1512900: Logically dead code (DEADCODE)
3. Replace rand() in generate_random_data() with g_random_int()

Signed-off-by: Vikram Garhwal <vikram.garhwal@amd.com>
Message-id: 20230628202758.16398-1-vikram.garhwal@amd.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 tests/qtest/xlnx-canfd-test.c | 33 +++++++++++----------------------
 1 file changed, 11 insertions(+), 22 deletions(-)

diff --git a/tests/qtest/xlnx-canfd-test.c b/tests/qtest/xlnx-canfd-test.c
index XXXXXXX..XXXXXXX 100644
--- a/tests/qtest/xlnx-canfd-test.c
+++ b/tests/qtest/xlnx-canfd-test.c
@@ -XXX,XX +XXX,XX @@ static void generate_random_data(uint32_t *buf_tx, bool is_canfd_frame)
     /* Generate random TX data for CANFD frame. */
     if (is_canfd_frame) {
         for (int i = 0; i < CANFD_FRAME_SIZE - 2; i++) {
-            buf_tx[2 + i] = rand();
+            buf_tx[2 + i] = g_random_int();
         }
     } else {
         /* Generate random TX data for CAN frame. */
         for (int i = 0; i < CAN_FRAME_SIZE - 2; i++) {
-            buf_tx[2 + i] = rand();
+            buf_tx[2 + i] = g_random_int();
         }
     }
 }
 
-static void read_data(QTestState *qts, uint64_t can_base_addr, uint32_t *buf_rx)
+static void read_data(QTestState *qts, uint64_t can_base_addr, uint32_t *buf_rx,
+                      uint32_t frame_size)
 {
     uint32_t int_status;
     uint32_t fifo_status_reg_value;
     /* At which RX FIFO the received data is stored. */
     uint8_t store_ind = 0;
-    bool is_canfd_frame = false;
 
     /* Read the interrupt on CANFD rx. */
     int_status = qtest_readl(qts, can_base_addr + R_ISR_OFFSET) & ISR_RXOK;
@@ -XXX,XX +XXX,XX @@ static void read_data(QTestState *qts, uint64_t can_base_addr, uint32_t *buf_rx)
     buf_rx[0] = qtest_readl(qts, can_base_addr + R_RX0_ID_OFFSET);
     buf_rx[1] = qtest_readl(qts, can_base_addr + R_RX0_DLC_OFFSET);
 
-    is_canfd_frame = (buf_rx[1] >> DLC_FD_BIT_SHIFT) & 1;
-
-    if (is_canfd_frame) {
-        for (int i = 0; i < CANFD_FRAME_SIZE - 2; i++) {
-            buf_rx[i + 2] = qtest_readl(qts,
-                                    can_base_addr + R_RX0_DATA1_OFFSET + 4 * i);
-        }
-    } else {
-        buf_rx[2] = qtest_readl(qts, can_base_addr + R_RX0_DATA1_OFFSET);
-        buf_rx[3] = qtest_readl(qts, can_base_addr + R_RX0_DATA2_OFFSET);
+    for (int i = 0; i < frame_size - 2; i++) {
+        buf_rx[i + 2] = qtest_readl(qts,
+                                can_base_addr + R_RX0_DATA1_OFFSET + 4 * i);
     }
 
     /* Clear the RX interrupt. */
@@ -XXX,XX +XXX,XX @@ static void match_rx_tx_data(const uint32_t *buf_tx, const uint32_t *buf_rx,
             g_assert_cmpint((buf_rx[size] & DLC_FD_BIT_MASK), ==,
                             (buf_tx[size] & DLC_FD_BIT_MASK));
         } else {
-            if (!is_canfd_frame && size == 4) {
-                break;
-            }
-
             g_assert_cmpint(buf_rx[size], ==, buf_tx[size]);
         }
 
@@ -XXX,XX +XXX,XX @@ static void test_can_data_transfer(void)
     write_data(qts, CANFD0_BASE_ADDR, buf_tx, false);
 
     send_data(qts, CANFD0_BASE_ADDR);
-    read_data(qts, CANFD1_BASE_ADDR, buf_rx);
+    read_data(qts, CANFD1_BASE_ADDR, buf_rx, CAN_FRAME_SIZE);
     match_rx_tx_data(buf_tx, buf_rx, false);
 
     qtest_quit(qts);
@@ -XXX,XX +XXX,XX @@ static void test_canfd_data_transfer(void)
     write_data(qts, CANFD0_BASE_ADDR, buf_tx, true);
 
     send_data(qts, CANFD0_BASE_ADDR);
-    read_data(qts, CANFD1_BASE_ADDR, buf_rx);
+    read_data(qts, CANFD1_BASE_ADDR, buf_rx, CANFD_FRAME_SIZE);
     match_rx_tx_data(buf_tx, buf_rx, true);
 
     qtest_quit(qts);
@@ -XXX,XX +XXX,XX @@ static void test_can_loopback(void)
     write_data(qts, CANFD0_BASE_ADDR, buf_tx, true);
 
     send_data(qts, CANFD0_BASE_ADDR);
-    read_data(qts, CANFD0_BASE_ADDR, buf_rx);
+    read_data(qts, CANFD0_BASE_ADDR, buf_rx, CANFD_FRAME_SIZE);
     match_rx_tx_data(buf_tx, buf_rx, true);
 
     generate_random_data(buf_tx, true);
@@ -XXX,XX +XXX,XX @@ static void test_can_loopback(void)
     write_data(qts, CANFD1_BASE_ADDR, buf_tx, true);
 
     send_data(qts, CANFD1_BASE_ADDR);
-    read_data(qts, CANFD1_BASE_ADDR, buf_rx);
+    read_data(qts, CANFD1_BASE_ADDR, buf_rx, CANFD_FRAME_SIZE);
     match_rx_tx_data(buf_tx, buf_rx, true);
 
     qtest_quit(qts);
-- 
2.34.1

From: Fabiano Rosas <farosas@suse.de>

This code is only relevant when TCG is present in the build. Building
with --disable-tcg --enable-xen on an x86 host we get:

$ ../configure --target-list=x86_64-softmmu,aarch64-softmmu --disable-tcg --enable-xen
$ make -j$(nproc)
...
libqemu-aarch64-softmmu.fa.p/target_arm_gdbstub.c.o: in function `m_sysreg_ptr':
 ../target/arm/gdbstub.c:358: undefined reference to `arm_v7m_get_sp_ptr'
 ../target/arm/gdbstub.c:361: undefined reference to `arm_v7m_get_sp_ptr'

libqemu-aarch64-softmmu.fa.p/target_arm_gdbstub.c.o: in function `arm_gdb_get_m_systemreg':
../target/arm/gdbstub.c:405: undefined reference to `arm_v7m_mrs_control'

Signed-off-by: Fabiano Rosas <farosas@suse.de>
Message-id: 20230628164821.16771-1-farosas@suse.de
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/gdbstub.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/target/arm/gdbstub.c b/target/arm/gdbstub.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/gdbstub.c
+++ b/target/arm/gdbstub.c
@@ -XXX,XX +XXX,XX @@ static int arm_gen_dynamic_sysreg_xml(CPUState *cs, int base_reg)
     return cpu->dyn_sysreg_xml.num;
 }
 
+#ifdef CONFIG_TCG
 typedef enum {
     M_SYSREG_MSP,
     M_SYSREG_PSP,
@@ -XXX,XX +XXX,XX @@ static int arm_gen_dynamic_m_secextreg_xml(CPUState *cs, int orig_base_reg)
     return cpu->dyn_m_secextreg_xml.num;
 }
 #endif
+#endif /* CONFIG_TCG */
 
 const char *arm_gdb_get_dynamic_xml(CPUState *cs, const char *xmlname)
 {
@@ -XXX,XX +XXX,XX @@ void arm_cpu_register_gdb_regs_for_features(ARMCPU *cpu)
                              arm_gen_dynamic_sysreg_xml(cs, cs->gdb_num_regs),
                              "system-registers.xml", 0);
 
+#ifdef CONFIG_TCG
     if (arm_feature(env, ARM_FEATURE_M) && tcg_enabled()) {
         gdb_register_coprocessor(cs,
             arm_gdb_get_m_systemreg, arm_gdb_set_m_systemreg,
@@ -XXX,XX +XXX,XX @@ void arm_cpu_register_gdb_regs_for_features(ARMCPU *cpu)
         }
 #endif
     }
+#endif /* CONFIG_TCG */
 }
-- 
2.34.1

From: Akihiko Odaki <akihiko.odaki@daynix.com>

AwSRAMCClass is larger than SysBusDeviceClass so the class size must be
advertised accordingly.

Fixes: 05def917e1 ("hw: arm: allwinner-sramc: Add SRAM Controller support for R40")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230628110905.38125-1-akihiko.odaki@daynix.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/misc/allwinner-sramc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/misc/allwinner-sramc.c b/hw/misc/allwinner-sramc.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/misc/allwinner-sramc.c
+++ b/hw/misc/allwinner-sramc.c
@@ -XXX,XX +XXX,XX @@ static const TypeInfo allwinner_sramc_info = {
     .parent        = TYPE_SYS_BUS_DEVICE,
     .instance_init = allwinner_sramc_init,
     .instance_size = sizeof(AwSRAMCState),
+    .class_size    = sizeof(AwSRAMCClass),
     .class_init    = allwinner_sramc_class_init,
 };
 
-- 
2.34.1

In handle_interrupt() we use level as an index into the interrupt_vector[]
array. This is safe because we have checked it against env->config->nlevel,
but Coverity can't see that (and it is only true because each CPU config
sets its XCHAL_NUM_INTLEVELS to something less than MAX_NLEVELS), so it
complains about a possible array overrun (CID 1507131).

Add an assert() which will make Coverity happy and catch the unlikely
case of a mis-set XCHAL_NUM_INTLEVELS in future.
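
For illustration, a bounds assertion of this kind compares the index
against the number of elements in the array (not its size in bytes).
A minimal sketch with a hypothetical struct, a hypothetical MAX_NLEVEL
value and a locally defined ARRAY_SIZE macro, none of which are taken
from the xtensa code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_NLEVEL 6 /* hypothetical value, for illustration only */
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

/* Hypothetical stand-in for a CPU config with per-level vectors. */
struct config {
    unsigned nlevel;
    uint32_t interrupt_vector[MAX_NLEVEL];
};

/* Hypothetical example data. */
static const struct config example_config = {
    .nlevel = 6,
    .interrupt_vector = { 10, 11, 12, 13, 14, 15 },
};

static uint32_t vector_for_level(const struct config *c, unsigned level)
{
    /* Element count, so that level == MAX_NLEVEL is rejected. */
    assert(level < ARRAY_SIZE(c->interrupt_vector));
    return c->interrupt_vector[level];
}
```

Here sizeof(interrupt_vector) is 24 while the element count is 6, so
only the element count gives a check that matches the array bounds.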

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Max Filippov <jcmvbkbc@gmail.com>
Message-id: 20230623154135.1930261-1-peter.maydell@linaro.org
---
 target/xtensa/exc_helper.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/target/xtensa/exc_helper.c b/target/xtensa/exc_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/xtensa/exc_helper.c
+++ b/target/xtensa/exc_helper.c
@@ -XXX,XX +XXX,XX @@ static void handle_interrupt(CPUXtensaState *env)
         CPUState *cs = env_cpu(env);
 
         if (level > 1) {
+            /* env->config->nlevel check should have ensured this */
+            assert(level < ARRAY_SIZE(env->config->interrupt_vector));
+
             env->sregs[EPC1 + level - 1] = env->pc;
             env->sregs[EPS2 + level - 2] = env->sregs[PS];
             env->sregs[PS] =
-- 
2.34.1

We already squash the ID register field for FEAT_SPE (the Statistical
Profiling Extension) because TCG does not implement it and if we
advertise it to the guest the guest will crash trying to look at
non-existent system registers.  Do the same for some other features
which a real hardware Neoverse-V1 implements but which TCG doesn't:
 * FEAT_TRF (Self-hosted Trace Extension)
 * Trace Macrocell system register access
 * Memory mapped trace
 * FEAT_AMU (Activity Monitors Extension)
 * FEAT_MPAM (Memory Partitioning and Monitoring Extension)
 * FEAT_NV (Nested Virtualization)

Most of these, like FEAT_SPE, are "introspection/trace" type features
which QEMU is unlikely to ever implement.  The odd-one-out here is
FEAT_NV -- we could implement that and at some point we probably
will.
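
To make the effect on the ID register concrete, here is a small
sketch of clearing a 4-bit field. It is not QEMU's FIELD_DP64()
implementation; deposit64_field is a hypothetical helper, and the bit
positions (PMSVer at [35:32], TraceFilt at [43:40], TraceVer at [7:4]
of ID_AA64DFR0_EL1) are my reading of the architecture and should be
treated as assumptions.

```c
#include <stdint.h>

/*
 * Hypothetical helper: return 'reg' with the 'length'-bit field at
 * 'shift' replaced by 'val'. Only valid for 1 <= length <= 64.
 */
static uint64_t deposit64_field(uint64_t reg, unsigned shift,
                                unsigned length, uint64_t val)
{
    uint64_t mask = (~0ULL >> (64 - length)) << shift;

    return (reg & ~mask) | ((val << shift) & mask);
}
```

Applied to the Neoverse-V1 ID_AA64DFR0 value 0x000001f210305519,
clearing PMSVer gives 0x000001f010305519, and also clearing TraceFilt
and TraceVer gives 0x000000f010305509.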

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20230704130647.2842917-2-peter.maydell@linaro.org
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/cpu.c | 33 +++++++++++++++++++++++++++++----
 1 file changed, 29 insertions(+), 4 deletions(-)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
 
     if (tcg_enabled()) {
         /*
-         * Don't report the Statistical Profiling Extension in the ID
-         * registers, because TCG doesn't implement it yet (not even a
-         * minimal stub version) and guests will fall over when they
-         * try to access the non-existent system registers for it.
+         * Don't report some architectural features in the ID registers
+         * where TCG does not yet implement them (not even a minimal
+         * stub version). This avoids guests falling over when they
+         * try to access the non-existent system registers for them.
          */
+        /* FEAT_SPE (Statistical Profiling Extension) */
         cpu->isar.id_aa64dfr0 =
             FIELD_DP64(cpu->isar.id_aa64dfr0, ID_AA64DFR0, PMSVER, 0);
+        /* FEAT_TRF (Self-hosted Trace Extension) */
+        cpu->isar.id_aa64dfr0 =
+            FIELD_DP64(cpu->isar.id_aa64dfr0, ID_AA64DFR0, TRACEFILT, 0);
+        cpu->isar.id_dfr0 =
+            FIELD_DP32(cpu->isar.id_dfr0, ID_DFR0, TRACEFILT, 0);
+        /* Trace Macrocell system register access */
+        cpu->isar.id_aa64dfr0 =
+            FIELD_DP64(cpu->isar.id_aa64dfr0, ID_AA64DFR0, TRACEVER, 0);
+        cpu->isar.id_dfr0 =
+            FIELD_DP32(cpu->isar.id_dfr0, ID_DFR0, COPTRC, 0);
+        /* Memory mapped trace */
+        cpu->isar.id_dfr0 =
+            FIELD_DP32(cpu->isar.id_dfr0, ID_DFR0, MMAPTRC, 0);
+        /* FEAT_AMU (Activity Monitors Extension) */
+        cpu->isar.id_aa64pfr0 =
+            FIELD_DP64(cpu->isar.id_aa64pfr0, ID_AA64PFR0, AMU, 0);
+        cpu->isar.id_pfr0 =
+            FIELD_DP32(cpu->isar.id_pfr0, ID_PFR0, AMU, 0);
+        /* FEAT_MPAM (Memory Partitioning and Monitoring Extension) */
+        cpu->isar.id_aa64pfr0 =
+            FIELD_DP64(cpu->isar.id_aa64pfr0, ID_AA64PFR0, MPAM, 0);
+        /* FEAT_NV (Nested Virtualization) */
+        cpu->isar.id_aa64mmfr2 =
+            FIELD_DP64(cpu->isar.id_aa64mmfr2, ID_AA64MMFR2, NV, 0);
     }
 
     /* MPU can be configured out of a PMSA CPU either by setting has-mpu
-- 
2.34.1

Now that we have implemented support for FEAT_LSE2, we can define
a CPU model for the Neoverse-V1, and enable it for the virt and
sbsa-ref boards.
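
The 64-bit CCSIDR_EL1 values in this patch can be cross-checked with
a small calculation. The sketch below follows the field layout given
in the code comment ([55:32] sets - 1, [23:3] associativity - 1,
[2:0] log2(linesize) - 4); make_ccsidr64 is a hypothetical helper,
not a QEMU function, and it assumes a power-of-two line size of at
least 16 bytes.

```c
#include <stdint.h>

/*
 * Hypothetical helper: build a 64-bit format CCSIDR_EL1 value from
 * the associativity, line size in bytes and total cache size in
 * bytes, using sets * associativity * linesize == cachesize.
 */
static uint64_t make_ccsidr64(unsigned assoc, unsigned linesize,
                              uint64_t cachesize)
{
    uint64_t sets = cachesize / ((uint64_t)assoc * linesize);
    unsigned lg_linesize = 0;

    /* log2 of the line size (linesize is assumed a power of two) */
    while ((1u << lg_linesize) < linesize) {
        lg_linesize++;
    }

    return ((sets - 1) << 32)
           | ((uint64_t)(assoc - 1) << 3)
           | (uint64_t)(lg_linesize - 4);
}
```

For the 64KB 4-way L1 this yields 0x000000ff0000001a and for the 1MB
8-way L2 it yields 0x000007ff0000003a.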

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20230704130647.2842917-3-peter.maydell@linaro.org
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 docs/system/arm/virt.rst |   1 +
 hw/arm/sbsa-ref.c        |   1 +
 hw/arm/virt.c            |   1 +
 target/arm/tcg/cpu64.c   | 128 +++++++++++++++++++++++++++++++++++++++
 4 files changed, 131 insertions(+)

diff --git a/docs/system/arm/virt.rst b/docs/system/arm/virt.rst
index XXXXXXX..XXXXXXX 100644
--- a/docs/system/arm/virt.rst
+++ b/docs/system/arm/virt.rst
@@ -XXX,XX +XXX,XX @@ Supported guest CPU types:
 - ``a64fx`` (64-bit)
 - ``host`` (with KVM only)
 - ``neoverse-n1`` (64-bit)
+- ``neoverse-v1`` (64-bit)
 - ``max`` (same as ``host`` for KVM; best possible emulation with TCG)
 
 Note that the default is ``cortex-a15``, so for an AArch64 guest you must
diff --git a/hw/arm/sbsa-ref.c b/hw/arm/sbsa-ref.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/sbsa-ref.c
+++ b/hw/arm/sbsa-ref.c
@@ -XXX,XX +XXX,XX @@ static const char * const valid_cpus[] = {
     ARM_CPU_TYPE_NAME("cortex-a57"),
     ARM_CPU_TYPE_NAME("cortex-a72"),
     ARM_CPU_TYPE_NAME("neoverse-n1"),
+    ARM_CPU_TYPE_NAME("neoverse-v1"),
     ARM_CPU_TYPE_NAME("max"),
 };
 
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -XXX,XX +XXX,XX @@ static const char *valid_cpus[] = {
     ARM_CPU_TYPE_NAME("cortex-a76"),
     ARM_CPU_TYPE_NAME("a64fx"),
     ARM_CPU_TYPE_NAME("neoverse-n1"),
+    ARM_CPU_TYPE_NAME("neoverse-v1"),
 #endif
     ARM_CPU_TYPE_NAME("cortex-a53"),
     ARM_CPU_TYPE_NAME("cortex-a57"),
diff --git a/target/arm/tcg/cpu64.c b/target/arm/tcg/cpu64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/cpu64.c
+++ b/target/arm/tcg/cpu64.c
@@ -XXX,XX +XXX,XX @@ static void define_neoverse_n1_cp_reginfo(ARMCPU *cpu)
     define_arm_cp_regs(cpu, neoverse_n1_cp_reginfo);
 }
 
+static const ARMCPRegInfo neoverse_v1_cp_reginfo[] = {
+    { .name = "CPUECTLR2_EL1", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 0, .crn = 15, .crm = 1, .opc2 = 5,
+      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "CPUPPMCR_EL3", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 6, .crn = 15, .crm = 2, .opc2 = 0,
+      .access = PL3_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "CPUPPMCR2_EL3", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 6, .crn = 15, .crm = 2, .opc2 = 1,
+      .access = PL3_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "CPUPPMCR3_EL3", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 6, .crn = 15, .crm = 2, .opc2 = 6,
+      .access = PL3_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+};
+
+static void define_neoverse_v1_cp_reginfo(ARMCPU *cpu)
+{
+    /*
+     * The Neoverse V1 has all of the Neoverse N1's IMPDEF
+     * registers and a few more of its own.
+     */
+    define_arm_cp_regs(cpu, neoverse_n1_cp_reginfo);
+    define_arm_cp_regs(cpu, neoverse_v1_cp_reginfo);
+}
+
 static void aarch64_neoverse_n1_initfn(Object *obj)
 {
     ARMCPU *cpu = ARM_CPU(obj);
@@ -XXX,XX +XXX,XX @@ static void aarch64_neoverse_n1_initfn(Object *obj)
     define_neoverse_n1_cp_reginfo(cpu);
 }
 
+static void aarch64_neoverse_v1_initfn(Object *obj)
+{
+    ARMCPU *cpu = ARM_CPU(obj);
+
+    cpu->dtb_compatible = "arm,neoverse-v1";
+    set_feature(&cpu->env, ARM_FEATURE_V8);
+    set_feature(&cpu->env, ARM_FEATURE_NEON);
+    set_feature(&cpu->env, ARM_FEATURE_GENERIC_TIMER);
+    set_feature(&cpu->env, ARM_FEATURE_AARCH64);
+    set_feature(&cpu->env, ARM_FEATURE_CBAR_RO);
+    set_feature(&cpu->env, ARM_FEATURE_EL2);
+    set_feature(&cpu->env, ARM_FEATURE_EL3);
+    set_feature(&cpu->env, ARM_FEATURE_PMU);
+
+    /* Ordered by 3.2.4 AArch64 registers by functional group */
+    cpu->clidr = 0x82000023;
+    cpu->ctr = 0xb444c004; /* With DIC and IDC set */
+    cpu->dcz_blocksize = 4;
+    cpu->id_aa64afr0 = 0x00000000;
+    cpu->id_aa64afr1 = 0x00000000;
+    cpu->isar.id_aa64dfr0  = 0x000001f210305519ull;
+    cpu->isar.id_aa64dfr1 = 0x00000000;
+    cpu->isar.id_aa64isar0 = 0x1011111110212120ull; /* with FEAT_RNG */
+    cpu->isar.id_aa64isar1 = 0x0111000001211032ull;
+    cpu->isar.id_aa64mmfr0 = 0x0000000000101125ull;
+    cpu->isar.id_aa64mmfr1 = 0x0000000010212122ull;
+    cpu->isar.id_aa64mmfr2 = 0x0220011102101011ull;
+    cpu->isar.id_aa64pfr0  = 0x1101110120111112ull; /* GIC filled in later */
+    cpu->isar.id_aa64pfr1  = 0x0000000000000020ull;
+    cpu->id_afr0       = 0x00000000;
+    cpu->isar.id_dfr0  = 0x15011099;
+    cpu->isar.id_isar0 = 0x02101110;
+    cpu->isar.id_isar1 = 0x13112111;
+    cpu->isar.id_isar2 = 0x21232042;
+    cpu->isar.id_isar3 = 0x01112131;
+    cpu->isar.id_isar4 = 0x00010142;
+    cpu->isar.id_isar5 = 0x11011121;
+    cpu->isar.id_isar6 = 0x01100111;
+    cpu->isar.id_mmfr0 = 0x10201105;
+    cpu->isar.id_mmfr1 = 0x40000000;
+    cpu->isar.id_mmfr2 = 0x01260000;
+    cpu->isar.id_mmfr3 = 0x02122211;
+    cpu->isar.id_mmfr4 = 0x01021110;
+    cpu->isar.id_pfr0  = 0x21110131;
+    cpu->isar.id_pfr1  = 0x00010000; /* GIC filled in later */
+    cpu->isar.id_pfr2  = 0x00000011;
+    cpu->midr = 0x411FD402;          /* r1p2 */
+    cpu->revidr = 0;
+
+    /*
+     * The Neoverse-V1 r1p2 TRM lists 32-bit format CCSIDR_EL1 values,
+     * but also says it implements CCIDX, which means they should be
+     * 64-bit format. So we here use values which are based on the textual
+     * information in chapter 2 of the TRM (and on the fact that
+     * sets * associativity * linesize == cachesize).
+     *
+     * The 64-bit CCSIDR_EL1 format is:
+     *   [55:32] number of sets - 1
+     *   [23:3]  associativity - 1
+     *   [2:0]   log2(linesize) - 4
+     *           so 0 == 16 bytes, 1 == 32 bytes, 2 == 64 bytes, etc
+     *
+     * L1: 4-way set associative 64-byte line size, total size 64K,
+     * so sets is 256.
+     *
+     * L2: 8-way set associative, 64 byte line size, either 512K or 1MB.
+     * We pick 1MB, so this has 2048 sets.
+     *
+     * L3: No L3 (this matches the CLIDR_EL1 value).
+     */
+    cpu->ccsidr[0] = 0x000000ff0000001aull; /* 64KB L1 dcache */
+    cpu->ccsidr[1] = 0x000000ff0000001aull; /* 64KB L1 icache */
+    cpu->ccsidr[2] = 0x000007ff0000003aull; /* 1MB L2 cache */
+
+    /* From 3.2.115 SCTLR_EL3 */
+    cpu->reset_sctlr = 0x30c50838;
+
+    /* From 3.4.8 ICC_CTLR_EL3 and 3.4.23 ICH_VTR_EL2 */
+    cpu->gic_num_lrs = 4;
+    cpu->gic_vpribits = 5;
+    cpu->gic_vprebits = 5;
+    cpu->gic_pribits = 5;
+
+    /* From 3.5.1 AdvSIMD AArch64 register summary */
+    cpu->isar.mvfr0 = 0x10110222;
+    cpu->isar.mvfr1 = 0x13211111;
+    cpu->isar.mvfr2 = 0x00000043;
+
+    /* From 3.7.5 ID_AA64ZFR0_EL1 */
+    cpu->isar.id_aa64zfr0 = 0x0000100000100000;
+    cpu->sve_vq.supported = (1 << 0)  /* 128bit */
+                            | (1 << 1);  /* 256bit */
+
+    /* From 5.5.1 AArch64 PMU register summary */
+    cpu->isar.reset_pmcr_el0 = 0x41213000;
+
+    define_neoverse_v1_cp_reginfo(cpu);
+
+    aarch64_add_pauth_properties(obj);
+    aarch64_add_sve_properties(obj);
+}
+
 /*
  * -cpu max: a CPU with as many features enabled as our emulation supports.
  * The version of '-cpu max' for qemu-system-arm is defined in cpu32.c;
@@ -XXX,XX +XXX,XX @@ static const ARMCPUInfo aarch64_cpus[] = {
     { .name = "cortex-a76",         .initfn = aarch64_a76_initfn },
     { .name = "a64fx",              .initfn = aarch64_a64fx_initfn },
     { .name = "neoverse-n1",        .initfn = aarch64_neoverse_n1_initfn },
+    { .name = "neoverse-v1",        .initfn = aarch64_neoverse_v1_initfn },
 };
 
 static void aarch64_cpu_register_types(void)
-- 
2.34.1

If you build QEMU with the clang sanitizer enabled, you can see it
fire when running the arm-cpu-features test:

$ QTEST_QEMU_BINARY=./build/arm-clang/qemu-system-aarch64 ./build/arm-clang/tests/qtest/arm-cpu-features
[...]
../../target/arm/cpu64.c:125:19: runtime error: shift exponent 64 is too large for 64-bit type 'unsigned long long'
[...]

This happens because the user can specify some incorrect SVE
properties that result in our calculating a max_vq of 0.  We catch
this and error out, but before we do that we calculate

vq_mask = MAKE_64BIT_MASK(0, max_vq);

and the MAKE_64BIT_MASK() call is only valid for lengths that are
greater than zero, so we hit the undefined behaviour.

Change the logic so that if max_vq is 0 we specifically set vq_mask
to 0 without going via MAKE_64BIT_MASK().  This lets us drop the
max_vq check from the error-exit logic, because if max_vq is 0 then
vq_map must now be 0.

The UB only happens in the case where the user passed us an incorrect
set of SVE properties, so it's not a big problem in practice.
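
A standalone sketch of the guarded mask computation, assuming the
macro has the shape shown here (which is consistent with the "shift
exponent 64" report above). vq_mask_for is a hypothetical wrapper
used only for illustration.

```c
#include <stdint.h>

/*
 * Assumed shape of the macro: only valid for 1 <= length <= 64,
 * because length == 0 would shift a 64-bit value right by 64.
 */
#define MAKE_64BIT_MASK(shift, length) \
    (((~0ULL) >> (64 - (length))) << (shift))

/* Hypothetical wrapper: never evaluates the macro with length 0. */
static uint64_t vq_mask_for(unsigned max_vq)
{
    return max_vq > 0 ? MAKE_64BIT_MASK(0, max_vq) : 0;
}
```

With max_vq == 0 the result is 0 without any shift being performed;
with max_vq == 4 the result is 0xf.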

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230704154332.3014896-1-peter.maydell@linaro.org
---
 target/arm/cpu64.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ void arm_cpu_sve_finalize(ARMCPU *cpu, Error **errp)
         vq = ctz32(tmp) + 1;
 
         max_vq = vq <= ARM_MAX_VQ ? vq - 1 : ARM_MAX_VQ;
-        vq_mask = MAKE_64BIT_MASK(0, max_vq);
+        vq_mask = max_vq > 0 ? MAKE_64BIT_MASK(0, max_vq) : 0;
         vq_map = vq_supported & ~vq_init & vq_mask;
 
-        if (max_vq == 0 || vq_map == 0) {
+        if (vq_map == 0) {
             error_setg(errp, "cannot disable sve%d", vq * 128);
             error_append_hint(errp, "Disabling sve%d results in all "
                               "vector lengths being disabled.\n",
-- 
2.34.1